third_party_mesa3d

Author	SHA1	Message	Date
Mike Blumenkrantz	87a9018ff9	zink: reorder commands more aggressively by starting resources in the unordered state in a given batch, they gain more opportunities to be promoted to the barrier cmdbuf and avoid breaking renderpasses Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20890>	2023-01-27 02:42:56 +00:00
Jesse Natalie	1a29f3dfdb	CI/windows: Apply FDO_CI_CONCURRENT to piglit too Reviewed-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20924>	2023-01-27 01:49:19 +00:00
Jesse Natalie	1c5a64296d	CI/windows: Don't limit deqp-runner to 4 jobs If FDO_CI_CONCURRENT is set, use that, otherwise let deqp-runner choose concurrency based on system CPU cores. Reviewed-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20924>	2023-01-27 01:49:19 +00:00
Marek Olšák	2ae08c3e8f	ac/llvm: remove llvm:: now that we use "using namespace llvm" Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20297>	2023-01-26 19:33:55 -05:00
Marek Olšák	a273f64f80	ac/llvm: run the IPSCCP pass AMDVLK runs it and it seems useful. https://en.wikipedia.org/wiki/Sparse_conditional_constant_propagation 58380 shaders in 35438 tests Totals: SGPRS: 2709080 -> 2709224 (0.01 %) VGPRS: 1592972 -> 1592808 (-0.01 %) Spilled SGPRs: 2420 -> 2420 (0.00 %) Spilled VGPRs: 1077 -> 1077 (0.00 %) Private memory VGPRs: 253 -> 253 (0.00 %) Scratch size: 1232 -> 1232 (0.00 %) dwords per thread Code Size: 61382088 -> 61356504 (-0.04 %) bytes Max Waves: 849293 -> 849308 (0.00 %) Outputs: 127090 -> 127090 (0.00 %) Patch Outputs: 579 -> 579 (0.00 %) Totals from affected shaders: SGPRS: 5400 -> 5544 (2.67 %) VGPRS: 6200 -> 6036 (-2.65 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 975824 -> 950240 (-2.62 %) bytes Max Waves: 1214 -> 1229 (1.24 %) Outputs: 232 -> 232 (0.00 %) Patch Outputs: 0 -> 0 (0.00 %) Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20297>	2023-01-26 19:33:43 -05:00
Marek Olšák	d05c3811cd	ac/llvm: run the LLVM sinking pass because LLVM will stop running it shader-db was run with the sinking pass disabled in LLVM. 58380 shaders in 35438 tests Totals: SGPRS: 2730768 -> 2730768 (0.00 %) VGPRS: 1592932 -> 1592928 (-0.00 %) Spilled SGPRs: 2687 -> 2687 (0.00 %) Spilled VGPRs: 551 -> 551 (0.00 %) Private memory VGPRs: 253 -> 253 (0.00 %) Scratch size: 700 -> 700 (0.00 %) dwords per thread Code Size: 61238872 -> 61238868 (-0.00 %) bytes Max Waves: 849209 -> 849209 (0.00 %) Outputs: 127090 -> 127090 (0.00 %) Patch Outputs: 579 -> 579 (0.00 %) Totals from affected shaders: SGPRS: 440 -> 440 (0.00 %) VGPRS: 396 -> 392 (-1.01 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 49880 -> 49876 (-0.01 %) bytes Max Waves: 105 -> 105 (0.00 %) Outputs: 14 -> 14 (0.00 %) Patch Outputs: 0 -> 0 (0.00 %) Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20297>	2023-01-26 19:33:17 -05:00
Brian Paul	fbd32a04da	anv: add a third memory type for LLC configuration Commit `582bf4d9` turned on write-combining for most (all?) memory allocations. This caused a fairly large performance drop in some of our VMware tests (application traces, such as Windows Metro Paint). This patch adds a third memory type configuration: DEVICE_LOCAL, HOST_VISIBLE, HOST_COHERENT. This is uncached. Then, in anv_AllocateMemory() we only use write-combining for this uncached type. This memory type is found in the Intel Windows Vulkan driver. And according to https://asawicki.info/news_1740_vulkan_memory_types_on_pc_and_how_to_use_them uncached memory correlates to write-combined memory. This fixes our performance regression (and actually produced the fastest ever results for our test suite). Signed-off-by: Brian Paul <brianp@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20770>	2023-01-26 22:45:49 +00:00
Rob Clark	15e19d04f0	freedreno/drm: Synchronize handle close and lookup Handle lookup (for example PRIME_FD_TO_HANDLE) must be synchronized with GEM_CLOSE, otherwise re-import can race with bo_del path, resulting in the handle of the newly (re)imported BO getting closed. Now that the finalize step has been decoupled, fixing this is mostly just deleting code. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20918>	2023-01-26 22:21:47 +00:00
Rob Clark	444db624df	freedreno/drm: Split out bo->finalize() The complexity around batching up handle closing is simply to allow the virtgpu to batch up ccmd's to the host (because virtio/virtgpu is pretty inefficient when it comes to lots of small msgs to the host, and it is common that when we are deleting BOs, we delete a lot of them at the same time). But that will make the locking fix in the next commit impossible (without nested locks). So let's flip this around and do the step that virtgpu wants to batch up first, before we get into closing GEM handles, etc. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20918>	2023-01-26 22:21:47 +00:00
Rob Clark	5a46e884ea	freedreno/drm: Remove bo_del_or_recycle() In prep for the next patch, where locking is swapped around to cover the whole bo_del() path, decouple handling of the recycle-to-BO-cache path. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20918>	2023-01-26 22:21:47 +00:00
Rob Clark	160137ccae	freedreno/drm: Detect zombie BOs When importing from a GEM name or dmabuf fd, we can race with the final unref of the same BO, in which case we can get a hit in the handle table for an fd_bo that another thread is about to free(). Detect and handle this case. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20918>	2023-01-26 22:21:47 +00:00
Rob Clark	547f50c244	freedreno/drm: Add some ref/unref debugging Helpful to catch common refcnt issues, like resurrecting a zombie object. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20918>	2023-01-26 22:21:47 +00:00
Emma Anholt	870beb2159	freedreno: Don't sync timestamps while perfetto isn't running. This may help with the regression in trace perf testing since enabling perfetto on the test drivers. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20916>	2023-01-26 20:46:39 +00:00
Jesse Natalie	2010b91547	dzn: Report as a software device for non-Windows Fixes: `5f1b8b3e6c` ("dzn: Use DXGI swapchains") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20939>	2023-01-26 19:00:31 +00:00
Jesse Natalie	cdd1588d55	dzn: Don't recursively lock the physical device enum mutex Fixes: `cfa260cd27` ("dzn: Use common physical device list/enumeration helpers") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20939>	2023-01-26 19:00:31 +00:00
Jesse Natalie	40a2b50599	dzn: Fix Windows WSI This was a merge conflict from the Win32 WSI DXGI swapchain changes. I missed moving a new line of code that was added when rearranging things for using the common helpers. Fixes: `cfa260cd` ("dzn: Use common physical device list/enumeration helpers") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20944>	2023-01-26 18:03:50 +00:00
Eric Engestrom	633f2428f4	docs: update calendar for 22.3.4 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20942>	2023-01-26 17:37:59 +00:00
Eric Engestrom	c8a32d21cf	docs/relnotes: add sha256sum for 22.3.4 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20942>	2023-01-26 17:37:55 +00:00
Eric Engestrom	cf58992a36	docs: add release notes for 22.3.4 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20942>	2023-01-26 17:37:44 +00:00
Konrad Dybcio	50dee85b68	freedreno/registers: Add RBBM_GPR0_CNTL for non-GMU operation Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20925>	2023-01-26 15:45:50 +00:00
Rob Clark	f9bcf19e52	freedreno/a6xx: Add a few kernel regs/etc Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20925>	2023-01-26 15:45:50 +00:00
Gert Wollny	4767ebeffc	virgl: remove unused virgl_encoder_inline_write The only user was removed with `be8eeb3b59` virgl: remove unused virgl_transfer_inline_write so drop this code too. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18543>	2023-01-26 15:26:40 +00:00
Amber	228d812a0c	ir3, isaspec: add raw instruction to assembler/disassembler. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20789>	2023-01-26 14:26:11 +00:00
Ruijing Dong	f2a4ea5300	frontends/va: revert commit `0b02db30` Revert commit `0b02db30`, as it is not a proper way to fix the AV1 10-bit decoding issue. This corresponds to the fix in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20870 Signed-off-by: Ruijing Dong <ruijing.dong@amd.com> Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20911>	2023-01-26 14:11:10 +00:00
Corentin Noël	dd3730f8bd	kopper: Do not free the given screen in initScreen implementation The given screen is already freed by the caller in case a NULL-pointer is returned by the implementation. Cc: mesa-stable Signed-off-by: Corentin Noël <corentin.noel@collabora.com> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20933>	2023-01-26 13:36:32 +00:00
Juston Li	4c03d4735e	util/tests/cache_test: Skip Cache.List if not supported FOZ_DB_UTIL_DYNAMIC_LIST depends on inotify support Fixes: `3b69b67545` ("util/fossilize_db: add runtime RO foz db loading via FOZ_DBS_DYNAMIC_LIST") Signed-off-by: Juston Li <justonli@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20865>	2023-01-26 13:06:27 +00:00
Juston Li	f18702250f	util/fossilize_db: add ifdef for inotify header FOZ_DB_UTIL_DYNAMIC_LIST is defined if the inotify header was detected. Fixes: `3b69b67545` ("util/fossilize_db: add runtime RO foz db loading via FOZ_DBS_DYNAMIC_LIST") Signed-off-by: Juston Li <justonli@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20865>	2023-01-26 13:06:27 +00:00
Timur Kristóf	65a917cb6e	nir: Add algebraic optimization for VKD3D-Proton fp32->fp16 conversion. VKD3D-Proton DXBC f32 to f16 conversion implements a float conversion using PackHalf2x16. Because the spec does not specify a rounding mode, it emits a sequence to ensure D3D-like behaviour for infinity. When we know the current backend has pack_half_2x16_rtz_split, we can eliminate the extra sequence. Fossil DB stats on GFX11: Totals from 835 (0.62% of 134913) affected shaders: VGPRs: 49368 -> 49224 (-0.29%) CodeSize: 5341956 -> 5124564 (-4.07%) Instrs: 1024062 -> 987041 (-3.62%) Latency: 6530956 -> 6465120 (-1.01%); split: -1.01%, +0.00% InvThroughput: 908189 -> 870253 (-4.18%) VClause: 18704 -> 18702 (-0.01%); split: -0.02%, +0.01% SClause: 33406 -> 33284 (-0.37%); split: -0.38%, +0.01% Copies: 67440 -> 65992 (-2.15%); split: -2.15%, +0.00% Branches: 18498 -> 18465 (-0.18%) PreSGPRs: 38409 -> 38331 (-0.20%) PreVGPRs: 44089 -> 43834 (-0.58%) Note, some fossils are from before this pattern was added to VKD3D-Proton, so the above may not reflect real-world impact. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15838>	2023-01-26 12:24:24 +00:00
Timur Kristóf	7985933a6d	nir: Lower pack_half_2x16_split to RTZ if available. Constant folding always uses RTNE for pack_half_2x16_split, but some backends implement it with RTZ. Lowering to RTZ when available ensures that the behaviour will be consistent between constant folding and the backend. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15838>	2023-01-26 12:24:24 +00:00
Timur Kristóf	c644461b71	radv, aco, ac: Implement pack_half_2x16_rtz_split. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15838>	2023-01-26 12:24:24 +00:00
Timur Kristóf	12652cc549	nir: Add pack_half_2x16_rtz_split opcode. Same as pack_half_2x16_split, but always uses RTZ mode. Note that pack_half_2x16 rounding mode is unspecified. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15838>	2023-01-26 12:24:24 +00:00
Lionel Landwerlin	13cca48920	intel/fs: drop FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GFX7 We can lower FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD into other more generic sends and drop this internal opcode. The idea behind this change is to allow bindless surfaces to be used for UBO pulls, which is why it's interesting to be able to reuse setup_surface_descriptors(). But that will come in a later change. No shader-db changes on TGL & DG2. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20416>	2023-01-26 11:26:53 +00:00
Erico Nunes	5bc91550d1	lima/ci: Add more piglit unsupported tests to skip It is not an exhaustive list but it helps by reducing the bulk of "Failed to create waffle_context for OpenGL [34].x" errors in the logs by thousands of occurrences and those are probably not going to be needed. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: David Heidelberg <david.heidelberg@collabora.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20909>	2023-01-26 10:48:47 +00:00
Jose Fonseca	9f51340b99	llvmpipe: Ensure floating point SSE state is reset regardless of the write mask. The code emitted by lp_build_fpstate_set to reset the FP state could be jumped over when the write mask was zero, leading to denormals not being flushed to zero. Spotted by Roland Scheidegger. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20901>	2023-01-26 08:55:21 +00:00
Samuel Pitoiset	b97fee432c	radv: fix ignoring graphics shader stages that don't need to be imported If a shader stage is already imported from a library it should be properly ignored. Fixes recent CTS dEQP-VK.pipeline.fast_linked_library.misc.unused_shader_stages*. Fixes: `c8765c5244` ("radv: ignore shader stages that don't need to be imported with GPL") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20899>	2023-01-26 08:34:36 +00:00
Samuel Pitoiset	6bec915919	radv: fix creating libraries with PS epilog and all CB states as dynamic It's legal to create a library with FRAGMENT_OUTPUT_INTERFACE and with all CB states as dynamic, in this case the PS epilog should be dynamic. This fixes a bunch of regressions while running Zink/RADV CTS with RADV_PERFTEST=gpl. Zink is the final boss. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20882>	2023-01-26 08:14:39 +00:00
Iago Toral Quiroga	a3ed7f3ff2	v3dv: add a cl_advance_and_end helper For the common case where we're emitting a packet, we don't need to update the cl_out pointer and then store the result in cl->next; we can directly update cl->next. This shows a small improvement in vkoverhead's scores for basic draw tests. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20897>	2023-01-26 06:21:33 +00:00
Jesse Natalie	a08d6d8b59	dzn: Support Vulkan 1.2 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20919>	2023-01-26 03:16:50 +00:00
Jesse Natalie	9d89b7e4a8	dzn: Ensure we don't mix DSV+simultaneous-access Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20919>	2023-01-26 03:16:50 +00:00
Jesse Natalie	4daeac01c5	dzn: Enhanced barriers fixes/workarounds Fix: Acquire/release should have one valid access/sync and one set to none. Workaround: D3D doesn't like simultaneous access resources leaving COMMON layout, nor does it like setting UAV/RTV access bits for the COMMON layout. Use UNDEFINED -> UNDEFINED layout transitions, where the access bits just aren't validated. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20919>	2023-01-26 03:16:50 +00:00
Jesse Natalie	c413c3dffc	dzn: Always do clears with copies on non-graphics queues Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20919>	2023-01-26 03:16:50 +00:00
Jesse Natalie	948ff5b8e2	dzn: Support float control Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20919>	2023-01-26 03:16:50 +00:00
Jesse Natalie	f391c2db62	dzn: Cache GPUVA for buffers Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20919>	2023-01-26 03:16:50 +00:00
Jesse Natalie	34f372c47c	dzn: Handle separate stencil usage Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20919>	2023-01-26 03:16:50 +00:00
Jesse Natalie	789acc2ffb	dzn: Fix dynamic rendering clear load op for non-multiview Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20919>	2023-01-26 03:16:50 +00:00
Jesse Natalie	e88070b1da	microsoft/compiler: Support float controls Float controls are emitted as function attributes on the entrypoint. These function attributes are not the standard built-in LLVM kind, but are strings, which the DXIL backend didn't know how to emit. So, this change adds string attribute support and uses it for fp32 ftz/preserve. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20919>	2023-01-26 03:16:50 +00:00
Timur Kristóf	9fc5d8d211	aco: Remove dynamic VS input loads. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20733>	2023-01-26 02:43:11 +00:00
Timur Kristóf	15b689604e	radv: Lower dynamic VS inputs in NIR. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20733>	2023-01-26 02:43:11 +00:00
Timur Kristóf	81620fc7b0	aco: Enable constant exec mask based optimization on compute shaders. We know for sure exec is initially -1 when the shader always has full subgroups. Fossil DB stats on GFX11: Totals from 3884 (2.88% of 134913) affected shaders: SpillSGPRs: 1673 -> 1697 (+1.43%); split: -1.67%, +3.11% SpillVGPRs: 2316 -> 2310 (-0.26%); split: -0.65%, +0.39% CodeSize: 19584436 -> 19567156 (-0.09%); split: -0.13%, +0.04% Scratch: 217088 -> 216832 (-0.12%) Instrs: 3784596 -> 3780303 (-0.11%); split: -0.15%, +0.03% Latency: 39971204 -> 39794967 (-0.44%); split: -0.47%, +0.03% InvThroughput: 7885552 -> 7801247 (-1.07%); split: -1.14%, +0.07% VClause: 74654 -> 74611 (-0.06%); split: -0.07%, +0.01% SClause: 103139 -> 103043 (-0.09%); split: -0.13%, +0.04% Copies: 279864 -> 281995 (+0.76%); split: -0.72%, +1.48% Branches: 92082 -> 92084 (+0.00%); split: -0.03%, +0.03% PreSGPRs: 155637 -> 149491 (-3.95%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20670>	2023-01-26 01:59:26 +00:00
Timur Kristóf	39448c8e9c	radv, aco: Add uses_full_subgroups to compute shader info. Allow the compiler to assume that the shader always has full subgroups, meaning that the initial EXEC mask is -1 in all waves (all lanes enabled). This assumption is incorrect for ray tracing and internal (meta) shaders because they can use unaligned dispatch. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20670>	2023-01-26 01:59:26 +00:00

165847 Commits