Eric Engestrom
6965aff4d1
ci: move error handling functions at the end
...
So that everything is defined by the time we use it in here.
Cc: mesa-stable
(cherry picked from commit 5cd054ebe5)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-12 09:12:12 -08:00
Iván Briano
ea9b3f928d
intel/rt: fix ray_query stack address calculation
...
While the documentation says to use NUM_SIMD_LANES_PER_DSS for the stack
address calculation, what the HW actually uses is
NUM_SYNC_STACKID_PER_DSS. The former may vary depending on the platform,
while the latter is fixed to 2048 for all current platforms.
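To illustrate why the choice of constant matters, here is a minimal sketch of a downward-growing per-DSS stack layout. The formula shape, the function name `sync_stack_addr`, and the sample numbers are assumptions for illustration only; only the value 2048 for NUM_SYNC_STACKID_PER_DSS comes from the commit message.

```python
# Hypothetical model of the ray query (synchronous) stack address calculation.
# Only the constant below is taken from the commit message.
NUM_SYNC_STACKID_PER_DSS = 2048  # fixed on all current platforms


def sync_stack_addr(rt_mem_base, dss_id, sync_stack_id, stack_size,
                    stacks_per_dss=NUM_SYNC_STACKID_PER_DSS):
    """Assumed layout: stacks are carved out going down from the base pointer."""
    slot = dss_id * stacks_per_dss + sync_stack_id + 1
    return rt_mem_base - slot * stack_size


base = 1 << 32
# Address computed with the fixed per-DSS stack count.
hw_addr = sync_stack_addr(base, dss_id=3, sync_stack_id=5, stack_size=256)
# Address computed with a platform-dependent count (1024 is a made-up value).
doc_addr = sync_stack_addr(base, dss_id=3, sync_stack_id=5, stack_size=256,
                           stacks_per_dss=1024)
print(hex(hw_addr), hex(doc_addr))
```

In this model the two results agree only for DSS 0, which is one way a mismatch between the driver and the hardware could go unnoticed on small configurations.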
Fixes: 6c84cbd8c9 ("intel/dev/xe: Set max_eus_per_subslice using topology query")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit aee04bf4fb)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-12 09:12:11 -08:00
Ian Romanick
7994534fe9
brw/cse: Don't eliminate instructions that write flags
...
With other changes in my tree, I observed this code from
dEQP-VK.subgroups.vote.compute.subgroupallequal_float have the second
cmp.z removed.
    undef(8) %69:UD
    cmp.z.f0.0(8) %69:F, %37:F, %57+0.0<0>:F
    mov(1) v58+0.0:D, 0d NoMask group0
    (+f0.0) mov(1) v58+0.0:D, -1d NoMask group0
    cmp.nz.f0.0(8) null:D, v58+0.0<0>:D, 0d
    ...
    undef(8) %72:UD
    cmp.z.f0.0(8) %72:F, %37:F, %57+0.0<0>:F
    mov(1) v63+0.0:D, 0d NoMask group0
    (+f0.0) mov(1) v63+0.0:D, -1d NoMask group0
This was also fixed by running dead-code elimination before CSE. That
seems more like avoiding the problem than fixing it, though.
I believe this affects shader-db results because leaving the second
CMP in the shader can give more opportunities for cmod propagation.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: 234c45c929 ("intel/brw: Write a new global CSE pass that works on defs")
shader-db:
All Intel platforms had similar results. (Lunar Lake shown)
total cycles in shared programs: 922097690 -> 922260862 (0.02%)
cycles in affected programs: 3178926 -> 3342098 (5.13%)
helped: 130
HURT: 88
helped stats (abs) min: 2 max: 2194 x̄: 296.71 x̃: 16
helped stats (rel) min: <.01% max: 16.56% x̄: 1.86% x̃: 0.18%
HURT stats (abs) min: 4 max: 11992 x̄: 2292.55 x̃: 47
HURT stats (rel) min: 0.04% max: 57.32% x̄: 11.82% x̃: 0.61%
95% mean confidence interval for cycles value: 320.36 1176.63
95% mean confidence interval for cycles %-change: 1.59% 5.73%
Cycles are HURT.
LOST: 2
GAINED: 1
fossil-db:
Lunar Lake, Meteor Lake, Tiger Lake had similar results. (Lunar Lake shown)
Totals:
Instrs: 142022960 -> 142022928 (-0.00%); split: -0.00%, +0.00%
Cycle count: 21995242782 -> 21995384040 (+0.00%); split: -0.00%, +0.00%
Max live registers: 48013385 -> 48013343 (-0.00%)
Totals from 507 (0.09% of 551441) affected shaders:
Instrs: 886191 -> 886159 (-0.00%); split: -0.01%, +0.01%
Cycle count: 69302492 -> 69443750 (+0.20%); split: -0.66%, +0.86%
Max live registers: 94413 -> 94371 (-0.04%)
DG2
Totals:
Instrs: 152856370 -> 152856093 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17237159885 -> 17236804052 (-0.00%); split: -0.00%, +0.00%
Fill count: 150673 -> 150631 (-0.03%)
Max live registers: 31871520 -> 31871476 (-0.00%)
Totals from 506 (0.08% of 633197) affected shaders:
Instrs: 831795 -> 831518 (-0.03%); split: -0.04%, +0.01%
Cycle count: 55578509 -> 55222676 (-0.64%); split: -1.38%, +0.74%
Fill count: 2779 -> 2737 (-1.51%)
Max live registers: 51383 -> 51339 (-0.09%)
Ice Lake and Skylake had similar results. (Ice Lake shown)
Totals:
Instrs: 152017826 -> 152017793 (-0.00%); split: -0.00%, +0.00%
Cycle count: 15180773451 -> 15180761166 (-0.00%); split: -0.00%, +0.00%
Fill count: 106610 -> 106614 (+0.00%)
Max live registers: 32195006 -> 32194966 (-0.00%)
Totals from 411 (0.06% of 637268) affected shaders:
Instrs: 705935 -> 705902 (-0.00%); split: -0.01%, +0.01%
Cycle count: 47830019 -> 47817734 (-0.03%); split: -0.05%, +0.02%
Fill count: 2865 -> 2869 (+0.14%)
Max live registers: 42883 -> 42843 (-0.09%)
(cherry picked from commit 9aba731d03)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-12 09:12:10 -08:00
Ian Romanick
1e792b0933
brw/copy: Don't copy propagate through smaller entry dest size
...
Copy propagation would incorrectly occur in this code
    mov(16) v4+2.0:UW, u0<0>:UW NoMask
    ...
    mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0
to create
    mov(16) v4+2.0:UW, u0<0>:UW NoMask
    ...
    mov(8) v6+2.0:UD, u0<0>:UD NoMask group0
This has different behavior. I think I just made a mistake when I
changed this condition in e3f502e007.
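The difference in behavior can be seen with a simplified model of the two sequences. This is only a sketch: it ignores register offsets and execution masks, and the helper `read_words_as_dwords` and the sample value of `u0` are made up for illustration.

```python
import struct


def read_words_as_dwords(words):
    """Reinterpret a list of 16-bit values as little-endian 32-bit values."""
    raw = struct.pack("<%dH" % len(words), *words)
    return list(struct.unpack("<%dI" % (len(words) // 2), raw))


u0 = 0xAABBCCDD           # hypothetical 32-bit contents at u0
u0_uw = u0 & 0xFFFF       # what u0<0>:UW reads: the low 16 bits

# mov(16) v4:UW, u0<0>:UW  -> 16 words, each a copy of the low word of u0
v4_words = [u0_uw] * 16

# mov(8) v6:UD, v4:UD      -> 8 dwords, each built from two adjacent words of v4
original = read_words_as_dwords(v4_words)

# After the incorrect propagation: mov(8) v6:UD, u0<0>:UD
propagated = [u0] * 8

print([hex(x) for x in original[:2]], [hex(x) for x in propagated[:2]])
```

In this model the original code produces the low word of `u0` replicated into both halves of each dword, while the propagated version also pulls in the upper 16 bits of `u0`, which the first MOV never read.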
It seems like this condition could be relaxed to cover cases like (note
the change of destination stride)
    mov(16) v4+2.0<2>:UW, u0<0>:UW NoMask
    ...
    mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0
I'm not sure it's worth it.
No shader-db or fossil-db changes on any Intel platform. Even the code
for the test case mentioned in the original commit did not change.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: e3f502e007 ("intel/fs: Allow copy propagation between MOVs of mixed sizes")
Closes: #12116
(cherry picked from commit 80a5d158ae)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-12 09:12:07 -08:00
Dylan Baker
08955d2ee8
.pick_status.json: Update to 5e0b81413d
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-12 09:11:37 -08:00
Ian Romanick
8f53de4a5d
brw/emit: Add correct 3-source instruction assertions for each platform
...
Specifically, allow two immediate sources for BFE on Gfx12+. I stumbled
on this while trying some stuff with !31852.
v2: Don't be lazy. Add proper assertions for all the things on all the
platforms. Based on a suggestion by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: 7bed11fbde ("intel/brw: Allow immediates in the BFE instruction on Gfx12+")
(cherry picked from commit c1c09e3c4a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-08 10:03:27 -08:00
Hans-Kristian Arntzen
baba2805ca
vulkan/wsi/wayland: Use X11-style image count strategy when using FIFO.
...
This is required, otherwise we regress latency in cases where
applications are using FIFO without explicit KHR_present_wait.
This is an unacceptable regression.
The fix is to normalize the behavior to X11 WSI.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fixes: d052b0201e ("vulkan/wsi/wayland: Use fifo protocol for FIFO")
(cherry picked from commit 5f70858ece)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-08 10:03:26 -08:00
Karol Herbst
7cef55b993
nvc0: return NULL instead of asserting in nvc0_resource_from_user_memory
...
Fixes: 212f1ab40e ("nvc0: support PIPE_CAP_RESOURCE_FROM_USER_MEMORY_COMPUTE_ONLY")
Acked-by: David Heidelberg <david@ixit.cz>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit 277925471e)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-08 10:03:25 -08:00
Karol Herbst
b856d0d3cc
nv/codegen: Do not use a zero immediate for tex instructions
...
They aren't always legal for tex instructions, specifically for TXQ when
an actual source is needed.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11999
Fixes: 85a31fa1fc ("nv50/ir/nir: fix txq emission on MS textures")
(cherry picked from commit 47a1565c3d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-08 10:03:24 -08:00
Lionel Landwerlin
1ab129ba70
anv: fix extent computation in image->image host copies
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0317c44872 ("anv: add VK_EXT_host_image_copy support")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 3ecf2a0518)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-08 10:03:23 -08:00
Eric Engestrom
7dc84d1c96
meson: bump spirv-tools version needed to v2022.1
...
Since c60a421f0c ("vtn: Add a debug flag to dump SPIR-V assembly"), we
use SPIR-V 1.6, which was added in `spirv-tools 2022.1`.
Fixes: c60a421f0c ("vtn: Add a debug flag to dump SPIR-V assembly")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11802
(cherry picked from commit 95c2496412)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-08 10:03:19 -08:00
Dylan Baker
93d5d587f5
.pick_status.json: Update to ced2404cb4
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-08 10:03:17 -08:00
Dylan Baker
85ba713d76
VERSION: bump for 24.3.0-rc1 release
2024-11-07 10:44:28 -08:00
Chia-I Wu
879ec4270d
panvk: fix dummy sampler handle for vs
...
When there is no dynamic buffer, create_copy_table early returns. Make
sure dummy_sampler_handle is still set.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32018>
2024-11-07 17:35:12 +00:00
Chia-I Wu
7e737500bd
panvk: fix missing same-subqueue wait for CmdWaitEvents2
...
CmdSetEvent2 does not call cs_wait_slots. CmdWaitEvents2 should wait
for the syncobj even on the same subqueue. To that end, update
collect_cs_deps to not clear self from wait_subqueue_mask.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31997>
2024-11-07 17:15:12 +00:00
Daniel Stone
fe50011ddb
build: Don't run wayland-protocols tests
...
There's not too much point in running tests in general, but also
specifically for wayland-protocols, which requires a newer
wayland-scanner to run the tests (for DTD validation) but not to parse
the protocol files.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: cdef622a0a ("meson: Update wayland-protocols to 1.38")
Closes: mesa#12126
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32036>
2024-11-07 15:55:58 +00:00
Eric Engestrom
f789dd42b8
ci: replace plain `meson` with explicit `meson setup`
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32030>
2024-11-07 14:48:41 +00:00
Eric Engestrom
1149d69b39
ci: drop unused extra args in build-vkd3d-proton.sh
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32030>
2024-11-07 14:48:41 +00:00
Eric Engestrom
06cca41889
meson: add dependencies needed by wsi_common_x11.c even on non-drm platforms
...
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11907
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32012>
2024-11-07 12:56:40 +00:00
Eric Engestrom
3e7be078f8
meson: drop variable initialized twice
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32012>
2024-11-07 12:56:40 +00:00
Benjamin Herrenschmidt
e1098310da
dril: Fixup order of pixel formats in drilConfigs
...
Having the RGB* formats before the BGR* formats in the table causes
problems where under some circumstances, some applications end up
with the wrong colors.
The repro case for me is: Xvnc + mutter + chromium
There was an existing comment in dri_fill_in_modes() which explained
the problem. This was lost when dril_target.c was created.
Fixes: ec7afd2c24 ("dril: rework config creation")
Fixes: 3de62b2f9a ("gallium/dril: Compatibility stub for the legacy DRI loader interface")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31950>
2024-11-07 12:12:57 +00:00
Samuel Pitoiset
9cc07bbd09
radv: mark some GFX6-7 GPUs as Vulkan 1.3 conformant
...
It's the first time RADV is Vulkan conformant on GFX6-7! Some chips
are missing because we don't have access but most of the GFX6-7 GPUs
are covered.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32022>
2024-11-07 11:50:10 +00:00
Mary Guillemard
125223b391
panvk: Ensure that render_info is not null in force_fb_preload
...
This fixes various crashes that I saw with occlusion query tests.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: ba2c7fd00a ("panvk: use force_fb_preload for unaligned preload")
Fixes: c108dfc930 ("panvk: force_fb_preload should insert a barrier")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32024>
2024-11-07 10:26:48 +00:00
Christian Gmeiner
f4e8849d79
etnaviv: Fix incorrect pipe_nn creation
...
When etna_screen_create(..) is called with gpu != NULL and npu == NULL,
screen->pipe_nn is incorrectly set up. This leads to an unintended
stream configuration for compute-only contexts, as determined by
    pipe = (compute_only && screen->pipe_nn) ? screen->pipe_nn : screen->pipe;
To address this, extend the gpu != npu condition by adding a check for
npu != NULL to ensure pipe_nn is only initialized when both gpu and npu
are provided.
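The effect of the extra check can be modelled in a few lines. This is a sketch of the logic described above, not the driver code: the function names and the string placeholders standing in for device and pipe handles are hypothetical, and `None` plays the role of NULL.

```python
def wants_nn_pipe_old(gpu, npu):
    """Condition before the fix: only compares the two handles."""
    return gpu != npu


def wants_nn_pipe_new(gpu, npu):
    """Condition after the fix: an NPU must actually be present."""
    return gpu != npu and npu is not None


def select_pipe(compute_only, pipe, pipe_nn):
    """Mirrors the pipe selection expression quoted in the commit message."""
    return pipe_nn if (compute_only and pipe_nn is not None) else pipe


gpu, npu = "gpu0", None  # GPU present, no NPU

# Before the fix, a pipe_nn would be set up even though there is no NPU,
# so compute-only contexts would select it.
old_pipe_nn = "nn-pipe" if wants_nn_pipe_old(gpu, npu) else None
new_pipe_nn = "nn-pipe" if wants_nn_pipe_new(gpu, npu) else None

print(select_pipe(True, "3d-pipe", old_pipe_nn))
print(select_pipe(True, "3d-pipe", new_pipe_nn))
```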
Fixes: a4653587cc ("etnaviv: Add a separate NPU pipe")
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32025>
2024-11-07 10:02:48 +00:00
David Heidelberg
9f5ee44986
freedreno: python fixes
...
Acked-by: Rob Clark <robclark@freedesktop.org>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29390>
2024-11-07 09:15:54 +00:00
Samuel Pitoiset
b67218645d
radv: save the trap handler report in the HOME directory
...
It's similar to where GPU hang reports are saved.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31988>
2024-11-07 09:28:16 +01:00
Derek Foreman
71cc22504f
adv+zink/ci: Add a recent flake
...
Signed-off-by: Derek Foreman <derekf@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26150>
2024-11-07 00:03:23 +00:00
Derek Foreman
c26ab1aee1
vulkan/wsi/wayland: Pace frames with commit-timing-v1
...
Instead of using frame callbacks - which may stop firing if our surface is
occluded - use the new commit-timing-v1 protocol in combination with the
presentation feedback protocol.
If the required protocols are unavailable, or the environment variable
MESA_VK_WSI_DEBUG contains "nowlts", we fall back to frame callback
based pacing behaviour.
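The fallback rule can be sketched as a small decision function. This is an illustration only: the function name is hypothetical, and treating MESA_VK_WSI_DEBUG as a comma-separated list of flags is an assumption; the commit message only says the variable "contains" the flag.

```python
import os


def use_commit_timing_pacing(protocols_available, environ=os.environ):
    """Return True for commit-timing pacing, False for frame callbacks."""
    raw = environ.get("MESA_VK_WSI_DEBUG", "")
    # Assumed format: comma-separated debug flags.
    flags = [flag.strip() for flag in raw.split(",") if flag.strip()]
    return protocols_available and "nowlts" not in flags


# Protocols present, no override: use the new pacing.
print(use_commit_timing_pacing(True, {}))
# Protocols present but the user opted out: fall back to frame callbacks.
print(use_commit_timing_pacing(True, {"MESA_VK_WSI_DEBUG": "nowlts"}))
# Protocols missing: fall back regardless of the environment.
print(use_commit_timing_pacing(False, {}))
```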
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26150>
2024-11-07 00:03:23 +00:00
Derek Foreman
d052b0201e
vulkan/wsi/wayland: Use fifo protocol for FIFO
...
The fifo protocol allows us to ensure that a compositor presents
an image that we submit to it. Use this to reliably implement FIFO
semantics.
Note: On systems where the fifo protocol is available an occluded
surface may find itself unthrottled when previously it would have
been frozen.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26150>
2024-11-07 00:03:23 +00:00
Derek Foreman
50d3fb65db
vulkan/wsi/wayland: Use presentation timing v2 when available
...
Presentation timing v2 gives us a usable value instead of a 0 when
VRR is in use. Prefer that if available.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26150>
2024-11-07 00:03:23 +00:00
Derek Foreman
cdef622a0a
meson: Update wayland-protocols to 1.38
...
Update the wrap and the dependency, as well as bumping several build tags.
I've also turned off wayland-protocols tests, as we don't want to bump the
wayland-scanner version at this time.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26150>
2024-11-07 00:03:23 +00:00
Chia-I Wu
c108dfc930
panvk: force_fb_preload should insert a barrier
...
Preloading is effectively texel fetching. When we force preloading, we
need to insert a barrier for the feedback loop.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31895>
2024-11-06 15:23:45 -08:00
Chia-I Wu
ba2c7fd00a
panvk: use force_fb_preload for unaligned preload
...
Extend force_fb_preload to take an optional VkRenderingInfo. When it is
non-NULL, this is the unaligned preload and force_fb_preload should
clear attachments.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31895>
2024-11-06 15:23:41 -08:00
Felix DeGrood
bf96702985
intel/measure: increase size of filename malloc to account for \0
...
Corrects a regression caused by a prior commit, which introduced a memory
overwrite by not allocating enough space for the filename string.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32013>
2024-11-06 22:12:29 +00:00
Sergi Blanch Torne
918978f525
Nightly full job for a630-gles-asan
...
The a630-gles-asan job only runs a fraction of the tests. That is a
trade-off for pre-merge, but it means we need a full run in the nightly
pipeline.
The a630-gles-asan-full job usually takes 40-50 minutes. Therefore, the
20-minute timeout is increased to 1h. The parallel feature is not used because
the nightly run is, with the introduction of this job, using 4 of the 6
devices available.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Reviewed-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31713>
2024-11-06 21:44:44 +00:00
Pavel Ondračka
f59f322efc
r300/ci: fails update after recent piglit uprev
...
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31745>
2024-11-06 21:10:21 +00:00
Pavel Ondračka
5480831e5e
r300: add driconf math mode override for Unigine Tropics and Oilrush
...
Fixes rendering in both apps. Specifically they want the ME_RECIP_FF
opcode. Figured out by Filip Gawin.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/332
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31745>
2024-11-06 21:10:21 +00:00
Pavel Ondračka
be595d0e52
r300: remove wrong Unigine Sanctuary driconf override
...
I used this for testing when adding r300 driconf support
and it was committed by mistake.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31745>
2024-11-06 21:10:21 +00:00
Pavel Ondračka
584ac64670
r300: add switch to support IEEE and FF math opcodes
...
Also add support for the 0*NaN = NaN IEEE-compliant multiply on R500.
All of this is disabled by default, but can be enabled with a
RADEON_DEBUG variable or alternatively with a driconf tweak.
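The two multiply behaviors can be contrasted with a short sketch. The IEEE result (0 * NaN = NaN) is what the commit message states; the non-IEEE rule modelled here, where a zero operand always yields zero, is an assumption about the default mode, and both function names are hypothetical.

```python
import math


def mul_ieee(a, b):
    """IEEE 754 multiply: 0 * NaN is NaN (and so is 0 * inf)."""
    return a * b


def mul_zero_wins(a, b):
    """Assumed non-IEEE multiply: any zero operand forces a zero result."""
    if a == 0.0 or b == 0.0:
        return 0.0
    return a * b


nan = float("nan")
print(mul_ieee(0.0, nan), mul_zero_wins(0.0, nan))
print(mul_ieee(0.0, math.inf), mul_zero_wins(0.0, math.inf))
```

Shaders that rely on a zero factor masking out an invalid intermediate value would render differently under the two rules, which is one plausible reason for keeping the switch configurable per application.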
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31745>
2024-11-06 21:10:21 +00:00
Jesse Natalie
26fc1ea9e5
dzn: Clean up dri options cache
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32011>
2024-11-06 20:53:13 +00:00
Rhys Perry
215c44c124
aco: apply extract to v_cvt_f32_ubyte0
...
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
f1a932bc29
aco: apply extract to p_extract_vector
...
fossil-db (navi21):
Totals from 46 (0.06% of 79395) affected shaders:
Instrs: 80126 -> 79944 (-0.23%); split: -0.27%, +0.04%
CodeSize: 486860 -> 485668 (-0.24%); split: -0.31%, +0.06%
Latency: 1615395 -> 1614218 (-0.07%); split: -0.07%, +0.00%
InvThroughput: 705479 -> 705013 (-0.07%); split: -0.07%, +0.00%
Copies: 18934 -> 18797 (-0.72%); split: -0.98%, +0.25%
VALU: 52452 -> 52268 (-0.35%); split: -0.41%, +0.06%
SALU: 17253 -> 17255 (+0.01%); split: -0.02%, +0.03%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
6cb9d39bc2
aco: combine extracts with sub-dword definitions
...
fossil-db (navi21):
Totals from 23 (0.03% of 79395) affected shaders:
Instrs: 55133 -> 55099 (-0.06%)
CodeSize: 335744 -> 335512 (-0.07%)
Latency: 1709146 -> 1709031 (-0.01%)
InvThroughput: 613788 -> 613713 (-0.01%)
Copies: 14405 -> 14407 (+0.01%); split: -0.03%, +0.04%
VALU: 37038 -> 37000 (-0.10%)
SALU: 11125 -> 11131 (+0.05%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
30af7ae44f
aco: add and use apply_extract_twice helper
...
This will be used in the next commit.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
05d0fa894e
aco: allow applying sign-extended sel to p_extract more often
...
In the case of v1=p_extract(v1=p_extract(src, 0, 16, 1), 0, 32, 0).
When we apply extracts with sub-dword definitions, this will also
include v2b=p_extract(v2b=p_extract(src, 0, 8, 1), 0, 16, 0).
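A small model shows why the outer extract is redundant in both cases. The semantics assumed here for p_extract(src, index, bits, sign_extend) are: take the `bits`-wide field at position `index`, then sign- or zero-extend it to the width of the definition. That reading, and the `def_bits` parameter, are assumptions made for this sketch.

```python
def p_extract(src, index, bits, sign_extend, def_bits=32):
    """Extract the bits-wide field at position `index`, extended to def_bits."""
    field = (src >> (index * bits)) & ((1 << bits) - 1)
    if sign_extend and bits < def_bits and field & (1 << (bits - 1)):
        # Fill the upper bits with ones for a negative field.
        field |= ((1 << def_bits) - 1) & ~((1 << bits) - 1)
    return field & ((1 << def_bits) - 1)


src = 0x12348785  # arbitrary sample value

# v1 = p_extract(v1 = p_extract(src, 0, 16, 1), 0, 32, 0)
inner32 = p_extract(src, 0, 16, 1)
outer32 = p_extract(inner32, 0, 32, 0)

# v2b = p_extract(v2b = p_extract(src, 0, 8, 1), 0, 16, 0)
inner16 = p_extract(src, 0, 8, 1, def_bits=16)
outer16 = p_extract(inner16, 0, 16, 0, def_bits=16)

print(hex(inner32), hex(outer32), hex(inner16), hex(outer16))
```

In both cases the outer zero-extending extract covers the full definition width, so under these assumed semantics it returns the inner sign-extended value unchanged.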
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
e47bc3e750
aco: shrink code size of some p_extract
...
fossil-db (navi21):
Totals from 37 (0.05% of 79395) affected shaders:
CodeSize: 2048204 -> 2047836 (-0.02%)
fossil-db (navi31):
Totals from 307 (0.39% of 79395) affected shaders:
CodeSize: 3075732 -> 3065236 (-0.34%); split: -0.34%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
d285333800
aco: add a bit more p_extract/p_insert validation
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
d3ac69f79b
aco: handle SGPR limitations when applying extract
...
We were already doing this, but missing it in a few places.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
07e28dad75
aco: disallow p_extract(,,32,)
...
Nothing uses these.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
f528597906
aco: check for SDWA before applying extract to lshl/cvt_f32
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00