third_party_mesa3d

Author	SHA1	Message	Date
Marek Olšák	235ebe9163	radeonsi/gfx10: fix corruption for chips with harvested TCCs Cc: 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-30 13:36:20 -04:00
Marek Olšák	8cbe83445b	ac: add radeon_info::tcc_harvested Cc: 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-30 13:36:20 -04:00
Marek Olšák	7d97013294	ac: fix incorrect vram_size reported by the kernel Cc: 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-30 13:36:20 -04:00
Marek Olšák	3c0938bece	radeonsi/gfx10: fix L2 cache rinse programming Cc: 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-30 13:36:20 -04:00
Eric Engestrom	0efc253f02	etnaviv: fix bitmask typo Fixes: `d92689c46f` ("etnaviv: nir: add native integers (HALTI2+)") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-09-30 17:54:33 +01:00
Adam Jackson	855dc17fcf	glx: Log the filename of the drm device if we fail to open it Helps point the user to the specific device that's having issues, since you're increasingly likely to have more than one. Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/107 Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-30 15:30:16 +00:00
Alyssa Rosenzweig	7be00b2a06	pan/midgard: Allow scheduling conditions with constants Now that we have constant adjustment logic abstracted, we can do this safely. Along with the csel inversion patch, this allows many more common csel ops to inline their condition in the bundle. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	c20063aa4a	pan/midgard: Add csel invert optimization Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	f0f4b39548	pan/midgard: Add mir_flip helper Useful for various operations on both commutative and anticommutative ops. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	10037ce523	pan/midgard: Tightly pack 32-bit constants If we can reuse constant slots from other instructions, we would like to do so to include more instructions per bundle. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	a3ca283bc1	pan/midgard: Allow writeout to see into the future If an instruction could be scheduled to vmul to satisfy the writeout conditions, let's do that and save an instruction+cycle per fragment shader. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	12a70ccd9e	pan/midgard: Allow 6 instructions per bundle We never had a scheduler good enough to hit this case before! :) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	34ff50cadd	pan/midgard: Only one conditional per bundle allowed There's no r32 to save ya after you use up r31 :) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	2715bd02ee	pan/midgard: Schedule to smul/sadd Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	57bac68fff	pan/midgard: Extend choose_instruction for scalar units Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	e9edae3ecb	pan/midgard: Don't double check SCALAR units Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	d3b3daa9d3	pan/midgard: Use new scheduler We still emit in-order but we switch to using the bundles created from the new scheduler, which will allow greater flexibility and room for out-of-order optimization. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	1409af9fc7	pan/midgard: Add distance metric to choose_instruction We require chosen instructions to be "close", to avoid ballooning register pressure. This is a kludge that will go away once we have proper liveness tracking in the scheduler, but for now it prevents a lot of needless spilling. v2: Lower threshold to 6 (from 8). Schedule is hurt, but a few shaders that spilled excessively are fixed. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Derp	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	e9571b53e1	pan/midgard: Add mir_choose_alu helper Based on a given unit. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	8462e82467	pan/midgard: Implement load/store pairing We can bundle two load/store together. This eliminates the need for explicit load/store pairing in a prepass, as well. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	7cf4932410	pan/midgard: Extend csel_swizzle to branches Conditions for branches don't have a swizzle explicitly in the emitted binary, but they do implicitly get swizzled in whatever instruction wrote r31, so we need to handle that. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	c9ce5a92a0	pan/midgard: Add helpers for scheduling conditionals Conditional instructions (csel and conditional branches) require their condition to be written to a special condition pipeline register (r31.w for scalar, r31.xyzw for vector). However, pipeline registers are live only for the duration of a single bundle. As such, the logic to schedule conditionals correctly is surprisingly complex. Essentially, we see if we could stuff the conditional within the same bundle as the csel/branch without breaking anything; if we can, we do that. If we can't, we add a dummy move to make room. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	6f92288e85	pan/midgard: Implement predicate->unit This allows ALUs to select for each unit of the bundle separately. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	5a9a48b81a	pan/midgard: Add predicate->exclude A bit of a kludge but allows setting an implicit dependency of synthetic conditional moves on the actual condition, fixing code generated like: vmul.feq r0, .. sadd.imov r31, .., r0 vadd.fcsel [...] The imov runs simultaneous with feq so it gets garbage results, but it's too late to add an actual dependency practically speaking, since the new synthetic imov doesn't have a node associated. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	6284f3ec25	pan/midgard: Add constant intersection filters In the future, we will want to keep track of which components of constants of various sizes correspond to which parts of the bundle constants, like in the old scheduler. For now, let's just stub it out for a simple rule of one instruction with embedded constants per bundle. We can eventually do better, of course. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	941bdd2088	pan/midgard: Remove csel constant unit force Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	da18525b6f	pan/midgard: Add mir_schedule_texture/ldst/alu helpers We don't actually do any scheduling here yet, but add per-tag helpers to consume an instruction, print it, pop it off the worklist. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	72a03bcafa	pan/midgard: Add mir_choose_bundle helper It's not always obvious what the optimal bundle type should be. Let's break out the logic to decide. Currently set for purely in-order operation. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	b5396369d2	pan/midgard: Add mir_update_worklist helper After we've chosen an instruction, popped it off, and processed it, it's time to update the worklist, removing that instruction from the dependency graph to allow its dependents to be put onto the worklist. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	826fd7308b	pan/midgard: Add mir_choose_instruction stub In the future, this routine will implement the core scheduling logic to decide which instruction out of the worklist will be scheduled next, in a way that minimizes cycle count and register pressure. In the present, we are more interested in replicating in-order scheduling with the much-more-powerful out-of-order model. So rather than discriminating by a register pressure estimate, we simply choose the latest possible instruction in the worklist. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	f48038b588	pan/midgard: Initialize worklist This flows naturally from the dependency graph Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	a3b46c0db6	pan/midgard: Calculate dependency graph Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	adda411263	pan/midgard: Add flatten_mir helper We would like to flatten a linked list of midgard_instructions into an array of midgard_instruction pointers on the heap. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	0ecfcbf462	pan/midgard: Squeeze indices before scheduling This allows node_count to be correct while scheduling. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	ad05e8a52c	pan/midgard: Fix component count handling for ldst It's not based on the writemask and it can't be inferred; it's just intrinsic to the op itself. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	cc0544a0f5	pan/midgard: Add missing parans in SWIZZLE definition Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:11 -04:00
Daniel Schürmann	b3c1f601aa	nouveau: set lower_sub = true Subtractions are already implemented as additions anyway. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Eric Anholt	ca1aa5d225	v3d: Enable the late algebraic optimizations to get real subs. This worked better than my original v3d-local pass for just subs, and is a huge win over not producing subs. total instructions in shared programs: 6408469 -> 6167932 (-3.75%) total threads in shared programs: 153784 -> 154104 (0.21%) total uniforms in shared programs: 2157078 -> 1905823 (-11.65%) total max-temps in shared programs: 904546 -> 895796 (-0.97%) total spills in shared programs: 4959 -> 4993 (0.69%) total fills in shared programs: 6558 -> 6670 (1.71%) total sfu-stalls in shared programs: 25845 -> 25175 (-2.59%) total inst-and-stalls in shared programs: 6434314 -> 6193107 (-3.75%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Daniel Schürmann	1d29895e5b	aco: call nir_opt_algebraic_late() exhaustively 57559 shaders in 28980 tests Totals: SGPRS: 2963407 -> 2959935 (-0.12 %) VGPRS: 2014812 -> 2016328 (0.08 %) Spilled SGPRs: 1077 -> 1077 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 10348 -> 10348 (0.00 %) dwords per thread Code Size: 114545436 -> 114498084 (-0.04 %) bytes LDS: 933 -> 933 (0.00 %) blocks Max Waves: 375997 -> 375866 (-0.03 %) Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Daniel Schürmann	0fb27f1e5a	radv/aco: Don't lower subtractions 40228 shaders in 20236 tests Totals: SGPRS: 2045512 -> 2046496 (0.05 %) VGPRS: 1430856 -> 1430464 (-0.03 %) Spilled SGPRs: 1077 -> 1077 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 10348 -> 10348 (0.00 %) dwords per thread Code Size: 77202840 -> 77151832 (-0.07 %) bytes LDS: 863 -> 863 (0.00 %) blocks Max Waves: 260729 -> 260754 (0.01 %) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Daniel Schürmann	239423d234	nir: Remove unnecessary subtraction optimizations These optimizations are already covered after lowering. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Daniel Schürmann	99848a57b7	nir: recombine nir_op_*sub when lower_sub = false There are some optimizations which are only implemented for additions and some optimizations which assume that subtractions have been lowered. By lowering all subtractions first and later recombine for backends which prefer this option, we don't have to implement them twice. This patch also moves lower_negate to nir_opt_algebraic_late() to enable these optimizations for backends which make use of it. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Daniel Schürmann	10e508c815	freedreno: Enable the nir_opt_algebraic_late() pass. Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Eric Anholt	d54ae70ee7	vc4: Enable the nir_opt_algebraic_late() pass. Upcoming changes to sub optimization will make this pass required. Over the course of that series, we see uniforms +.46%, instructions -.24% (seems like a fine tradeoff -- uniforms are 1/2 the size of instructions as far as cache occupancy) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
pal1000	ffb0d3a25c	scons: Fix MSYS2 Mingw-w64 build. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> This patch is based on `28e3f85e09/mingw-w64-mesa/link-ole32.patch` but with tweaks to avoid MSVC build break when applied. v2: Create Mingw platform alias pointing to windows host platform define to avoid spurious crosscompilation; v3: Fix obviously wrong compiler flags for swr driver; v4: Update original patch URL because it has been relocated; v5: Don't bother patching autotools stuff as it's not used by MSYS2 Mingw-w64 build and its days are numbered anyway; v6: After Mingw posix flag fix in 295851eb things are far simpler as we don't need more linking of uuid, ole32, version and shell32 than what is already in place.	2019-09-29 10:57:16 +01:00
Vasily Khoruzhick	336b021d36	lima: set uniforms_address lower bits properly Looks like the blob uses the following values for the uniforms buffer: 0 for 8 bytes, 1 for 16 bytes, 2 for 24 bytes, 2 for 32 bytes, 3 for 40 bytes, 3 for 48 bytes, 3 for 56 bytes, 3 for 64 bytes, 4 for 72 bytes. It all looks like log2(size / 8) rounded up, so let's do the same. Fixes: 931fc2a7b3f9 ("lima: do not set the PP uniforms address lowest bits") Reviewed-by: Icenowy Zheng <icenowy@aosc.io> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-28 10:34:19 -07:00
Mauro Rossi	411e50a8fd	android: aco: add support for libmesa_aco Android building rules are added in src/amd/Android.compiler.mk libmesa_aco static library is built conditionally to radeonsi as done for vulkan.radv module This will prevent Android build errors for non x86 systems filter-out compiler/aco_instruction_selection_setup.cpp source, as already included by compiler/aco_instruction_selection.cpp and would cause several multiple definition linker errors NOTE: libLLVM requires AMDGPU Disassembler to build radv with aco Fixes: `93c8ebf` ("aco: Initial commit of independent AMD compiler") Fixes: `a70a998` ("radv/aco: Setup alternate path in RADV to support the experimental ACO compiler") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2019-09-28 15:56:34 +02:00
Mauro Rossi	268fb10e9c	android: compiler/nir: build nir_divergence_analysis.c Prerequisite to avoid following radv linking error happening with aco FAILED: out/target/product/x86_64/obj_x86/SHARED_LIBRARIES/vulkan.radv_intermediates/LINKED/vulkan.radv.so ... external/mesa/src/amd/compiler/aco_instruction_selection_setup.cpp:178: error: undefined reference to 'nir_divergence_analysis' clang.real: error: linker command failed with exit code 1 (use -v to see invocation) Fixes: `df86c5f` ("nir: add divergence analysis pass.") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2019-09-28 15:56:28 +02:00
Mauro Rossi	c24ad565ae	android: aco: fix undefined template 'std::__1::array' build errors Fixes a few building errors similar to the following: In file included from external/mesa/src/amd/compiler/aco_instruction_selection.cpp:26: In file included from external/libcxx/include/algorithm:639: external/libcxx/include/utility:321:9: error: implicit instantiation of undefined template 'std::__1::array<aco::Temp, 4>' _T2 second; ^ Fixes: `93c8ebf` ("aco: Initial commit of independent AMD compiler") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2019-09-28 15:56:23 +02:00
Jonathan Marek	b38fcaa221	etnaviv: nir: fix gl_FragDepth Fixes the following piglit test: fragdepth_gles2 (for ETNA_MESA_DEBUG=nir) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-09-28 00:34:44 -04:00
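
Note on 336b021d36 (lima: set uniforms_address lower bits properly): the commit message describes the low bits as log2(size / 8) rounded up. The following is a minimal sketch of that rounding rule only, not the driver's actual code. The function names `uniforms_size_field` and `check_against_commit_table` are hypothetical, and rounding a partial 8-byte unit up is an assumption, since the message only lists sizes that are multiples of 8.

```c
#include <assert.h>

/* Hypothetical helper (not taken from the lima driver): computes
 * ceil(log2(size_bytes / 8)) for a uniform buffer size in bytes.
 * Intended for small sizes; very large inputs are not handled. */
unsigned uniforms_size_field(unsigned size_bytes)
{
    /* Number of 8-byte units; a partial unit is rounded up (assumption). */
    unsigned units = (size_bytes + 7) / 8;
    unsigned field = 0;

    /* Find the smallest field such that 2^field units cover the buffer. */
    while ((1u << field) < units)
        field++;

    return field;
}

/* Checks the helper against the size/value pairs listed in the commit message. */
void check_against_commit_table(void)
{
    assert(uniforms_size_field(8) == 0);
    assert(uniforms_size_field(16) == 1);
    assert(uniforms_size_field(24) == 2);
    assert(uniforms_size_field(32) == 2);
    assert(uniforms_size_field(40) == 3);
    assert(uniforms_size_field(48) == 3);
    assert(uniforms_size_field(56) == 3);
    assert(uniforms_size_field(64) == 3);
    assert(uniforms_size_field(72) == 4);
}
```

The loop form avoids floating-point log2 and makes the rounding explicit: 24 bytes is 3 units, which needs 2 bits' worth (4 units), matching the "2 for 24 bytes" entry in the message.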


106973 Commits