Commit Graph

87313 Commits

Author SHA1 Message Date
Timothy Arceri
966567aa12 mesa: reset linked_stages bitmask when re-linking
34953f8907 added this bitmask but it wasn't being reset when
a program was relinked. If a stage was removed from the new
program then it could case a crash as we expect the linked shader
for that stage to not be null.

Fixes crashes in:
ESEXT-CTS.tessellation_shader.single.xfb_captures_data_from_correct_stage
ES31-CTS.core.tessellation_shader.single.xfb_captures_data_from_correct_stage

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98917
2016-12-01 10:24:16 +11:00
Rob Clark
45eef9af03 freedreno/a5xx: fix negative branches
Looks like immed branch offset size increased again.. making what we
think is a small negative number look to hw like a huge positive number.
And things go badly when shader tries to jump to hyperspace.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-30 17:32:54 -05:00
Rob Clark
ef30e91fe6 freedreno: fix android build with a5xx
Android doesn't build all the files that normal linux/autotools build
does (mainly standalond ir3_compiler).. but possibly we should pull
C_SOURCES + aNxx_SOURCES into a single variable picked up by both
Android.mk and Makefile.am?  (Suggested by Rob H.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-30 17:32:54 -05:00
Rob Clark
8b6f8f2576 freedreno/a5xx: fix discard
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-30 17:32:54 -05:00
Ville Syrjälä
676c0cf287 anv: Prefer in-tree headers to out-of-tree headers
Set the include paths to consider in-tree headers before out-of-tree
headers.

Avoids the build failing due to stale headers being present in
$prefix. Previosuly 'make -ki install' or something similar was required
to update the out-of-tree headers to allow the build to succeed.

Also avoids having to rebuild the entire thing after every 'make
install'.

Cc: Rob Clark <robdclark@gmail.com>
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2016-11-30 20:01:00 +02:00
Rob Clark
946cf4eb68 freedreno/a5xx: initial support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-30 12:35:49 -05:00
Rob Clark
fcba3046e1 freedreno: update generated headers
Pull in a5xx

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-30 12:25:48 -05:00
Rob Clark
8c56789f60 freedreno: make gmem tile size alignment configurable
a5xx seems to prefer 64 pixel alignment, in at least some cases.  Make
this configurable per generation.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-30 12:25:48 -05:00
Rob Clark
728e2c4d38 freedreno/ir3: don't offset inloc by 8
On a3xx/a4xx, the SP_VS_VPC_DST_REG.OUTLOCn is offset by 8, so we used
to add this offset into fs->inputs[n].inloc.  But a5xx drops this extra
offset-by-8.  So instead make inloc zero based and add the offset when
we emit OUTLOCn values (for the gen's that need the offset).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-30 12:25:48 -05:00
Rob Clark
7a59157287 freedreno/a3xx: use new shader linkage helper
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-30 12:25:48 -05:00
Rob Clark
98c83b5d1c freedreno/a4xx: use new shader linkage helper
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-30 12:25:48 -05:00
Rob Clark
1be5670c8d freedreno/ir3: add new helper for shader linkage
Helps simplify things on a5xx, where pos/psize get added to the vs-out
map.  And anyways, simplifies a3xx and a4xx.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-30 12:25:48 -05:00
Nicolai Hähnle
f60374aa68 st/mesa: skip lower_output_reads when possible
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-11-30 09:10:02 +01:00
Nicolai Hähnle
0a58b258ca st/glsl_to_tgsi: swizzle PROGRAM_OUTPUTs correctly in src_register translation
This is required for reading directly from fragment shader stencil and depth
outputs.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-11-30 09:09:59 +01:00
Nicolai Hähnle
611166b8ed gallium: add PIPE_CAP_TGSI_CAN_READ_OUTPUTS
Drivers that support this benefit by saving one lowering pass in the
GLSL-to-TGSI conversion.

radeonsi already supports this because all outputs are stored in temporary
variables before the export (except for TCS outputs, which have always
been readable in TGSI anyway due to their special semantics).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-11-30 09:09:50 +01:00
Bas Nieuwenhuizen
abc887faa1 ac/nir: Fix out of bounds array access.
With nir_intrinsic_ssbo_atomic_comp_swap we run out of params.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-11-30 07:09:38 +01:00
Kristian H. Kristensen
d3d7cab812 aubinator: Add support for enum types
Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-29 22:02:49 -08:00
Kristian H. Kristensen
7fc659d8d5 intel/genxml: Fix ksp for INTERFACE_DESCRIPTOR_DATA
This one was split across two dwords as "Kernel Start Pointer" and
"Kernel Start Pointer High", which looks like it works when the driver
only accesses "Kernel Start Pointer". This breaks, of course, with BO
offsets > 4G.

Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-29 22:02:49 -08:00
Kristian H. Kristensen
99e573b4e0 intel/genxml: Use enum 3D_Logic_Op_Function where applicable
Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-29 22:02:49 -08:00
Kristian H. Kristensen
374d19ac00 intel/genxml: Use blend function and factor enums where applicable
Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-29 22:02:49 -08:00
Kristian H. Kristensen
09fe8ad010 intel/genxml: Use enum 3D_Vertex_Component_Control where applicable
Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-29 22:02:49 -08:00
Kristian H. Kristensen
54e71e5851 intel/genxml: Use enum 3D_Stencil_Operation where applicable
Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-29 22:02:49 -08:00
Kristian H. Kristensen
193c1b72e0 intel/genxml: Use enum SURFACE_FORMAT where applicable
Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-29 22:02:49 -08:00
Kristian H. Kristensen
0799022bf9 intel/genxml: Use enum 3D_Prim_Topo_Type where applicable
Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-29 22:02:49 -08:00
Kristian H. Kristensen
993babc014 intel/genxml: Use 3D_Compare_Function for gen8+ test functions
When the state fields where shuffled around for gen8, the compare
function enums were downgraded to just uints. Change them to enum
3D_Compare_Function.

Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-29 22:02:49 -08:00
Kristian H. Kristensen
fc2225b1af intel/genxml: Emit genxml enums as C enums
The previous commits got rid of any clashes between #defines and enum
values and we can now emit the genxml enums as debugger friendly C
enums.

Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-29 22:02:49 -08:00
Kristian H. Kristensen
8fc74b879e intel/genxml: Remove duplicate COMPAREFUNCTION values
These values were defined both as an enum and as inline values. Remove
the inline values and reference the 3D_Compare_Function enum instead.

Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-29 22:02:49 -08:00
Kristian H. Kristensen
5814fc1bb7 intel/genxml: Allow referencing enums in type attributes
This lets us reference enums in the type attribute of a field.

Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-29 22:02:49 -08:00
Kristian H. Kristensen
3b6b6f6463 anv: Emit cherryview SF state without including gen9_pack.h
Cleaner this way and we avoid including gen9_pack.h when we compile with
gen8_pack.h. We also avoid the if (cherryview) condition for non-gen8
gens that don't need it.

Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-29 22:02:49 -08:00
Kristian H. Kristensen
908febcf21 anv: Don't include two different pack headers
The batch chain logic only needs the pre-gen8 size of
MI_BATCH_BUFFER_START, which seems like something we can make a special
case for. The other two gen7 references, MI_BATCH_BUFFER_END and
MI_NOOP, are the same on all gens.

Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-29 22:02:49 -08:00
Kristian H. Kristensen
be9c2ab23b intel/genxml: Move enums above structs
We'll need to define them before we can reference them in structs and
instructions. Enums have no dependencies, so move them first in the
file.

Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-29 22:02:49 -08:00
Kristian H. Kristensen
ce26486115 genxml: Add values for Barycentric Interpolation Mode
Signed-off-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-29 22:02:49 -08:00
Ilia Mirkin
ed0b3cbd09 anv: remove per-sample shading from TODO
This was done some time ago.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-11-30 00:17:56 -05:00
Ilia Mirkin
be92b3f49d anv: clean up VkPhysicalDeviceFeatures list
Remove duplicate .alphaToOne, add missing .shaderResourceMinLod, and
reorder a few entries to match their vulkan.h order. All the sparse
features are still left out entirely.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-11-30 00:17:56 -05:00
Michel Dänzer
550cd272b4 vulkan/wsi/x11: Destroy Present event context when destroying swapchain
Without this, the X server may accumulate stale Present event contexts
if a client creates and destroys multiple swapchains using the same
window.

v2: Based on Chris Wilson's review:
* Use xcb_present_select_input_checked so that protocol errors
  generated by old X servers can be handled gracefully
* Use xcb_discard_reply() instead of free(xcb_request_check())
v3: Rebased on top of this code having been refactored out of anv

Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-11-30 12:31:25 +09:00
Timothy Arceri
2ea021a1eb glsl: use linked_shaders bitmask to iterate stages for subroutine fields
This should be faster than looping over every stage and null checking, but
will also make the code a bit cleaner when we switch to getting more fields
from gl_program rather than from gl_linked_shader as we can just copy the
pointer and not need to worry about null checking then copying.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-11-30 14:13:52 +11:00
Timothy Arceri
6d3458cbfb mesa: optimise interleaved sso validation
Now that we have a linked_stages bitfield we can use this
to check if the program is used at a later stage.

This change is also required to be able to use gl_program
rather than gl_shader_program in the CurrentProgram array.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-11-30 14:13:52 +11:00
Timothy Arceri
34953f8907 mesa/glsl: add bitmask to track stages a program was linked against
This will be used to enable us to store the current gl_program
rather than gl_shader_program in the gl_pipline_object allowing
us to simplify handing of validation.

Also we should not be depending on _LinkedShader for this information
as it may contain shaders from a failed linking attempt rather than
the current program still in use.

We could also use this mask to iterate over the stages during linking
with _mesa_bit_scan() rather then the current method of NULL checking
each stage.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-11-30 14:13:52 +11:00
Ilia Mirkin
ddf0f097e7 swr: [rasterizer jit] use signed integer representation for logic op
Instead of (incorrectly) biasing the snorm value to make it look like a
unorm, just use signed integer math.

This fixes arb_color_buffer_float-render GL_RGBA8_SNORM

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-29 20:55:00 -05:00
Ilia Mirkin
8ed703cfa6 swr: add missing rgbx8_srgb variant
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-29 20:54:57 -05:00
Ilia Mirkin
d6a06228a6 swr: reorder renderable formats, add grouping comments
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-29 20:54:54 -05:00
Ilia Mirkin
53ca06be8f swr: use util_copy_framebuffer_state helper
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-29 20:54:50 -05:00
Ilia Mirkin
86f7932b1e swr: enable cubemap arrays
Everything is in place for these.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-29 20:54:46 -05:00
Ilia Mirkin
8dd9853516 swr: rearrange caps into limits/supported/unsupported groups
I find this a lot more readable and compact - much easier to scan
through the list and see what's on and what's off.

No functional change intended.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-29 20:54:43 -05:00
Ilia Mirkin
9f568e5db1 swr: only store up to the LOD size
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-11-29 20:54:36 -05:00
Tim Rowley
f7ab0e4b7e swr: [rasterizer common] add SwrTrace() and macros
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-11-29 19:36:46 -06:00
Marek Olšák
662b9c24d0 radeonsi: don't fetch 8 dwords for samplerBuffer and imageBuffer
The compiler doesn't shrink s_load_dwordx8, so we always wasted 4 SGPRs.
Also, the extraction of the descriptor created some really ugly asm code
with lots of VALU bitwise ops and v_readfirstlane.

Totals from *affected* shaders:
SGPRS: 13880 -> 13253 (-4.52 %)
VGPRS: 15200 -> 15088 (-0.74 %)
Code Size: 499864 -> 459816 (-8.01 %) bytes
Max Waves: 1554 -> 1564 (0.64 %)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-29 23:52:31 +01:00
Marek Olšák
dbbdc6bb5a radeonsi: disable XNACK to free 2 SGPRs on APUs
My LLVM commit disables it for dGPUs, but not APUs.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-29 23:52:31 +01:00
Marek Olšák
274fb601c2 radeonsi: count and report temp arrays in scratch separately
v2: only do this if debug output of shader dumping is enabled

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
2016-11-29 23:52:31 +01:00
Marek Olšák
a91add9369 radeonsi: don't try to eliminate trivial VS outputs for PS and CS
PS and CS don't have any param exports, so it's a no-op.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-29 23:52:31 +01:00