Commit Graph

197495 Commits

Author SHA1 Message Date
Eric Engestrom
6965aff4d1 ci: move error handling functions at the end
So that everything is defined by the time we use it in here.

Cc: mesa-stable
(cherry picked from commit 5cd054ebe5)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-12 09:12:12 -08:00
Iván Briano
ea9b3f928d intel/rt: fix ray_query stack address calculation
While the documentation says to use NUM_SIMD_LANES_PER_DSS for the stack
address calculation, what the HW actually uses is
NUM_SYNC_STACKID_PER_DSS. The former may vary depending on the platform,
while the latter is fixed to 2048 for all current platforms.

Fixes: 6c84cbd8c9 ("intel/dev/xe: Set max_eus_per_subslice using topology query")

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit aee04bf4fb)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-12 09:12:11 -08:00
Ian Romanick
7994534fe9 brw/cse: Don't eliminate instructions that write flags
With other changes in my tree, I observed this code from
dEQP-VK.subgroups.vote.compute.subgroupallequal_float have the second
cmp.z removed.

    undef(8) %69:UD
    cmp.z.f0.0(8) %69:F, %37:F, %57+0.0<0>:F
    mov(1) v58+0.0:D, 0d NoMask group0
    (+f0.0) mov(1) v58+0.0:D, -1d NoMask group0
    cmp.nz.f0.0(8) null:D, v58+0.0<0>:D, 0d
    ...
    undef(8) %72:UD
    cmp.z.f0.0(8) %72:F, %37:F, %57+0.0<0>:F
    mov(1) v63+0.0:D, 0d NoMask group0
    (+f0.0) mov(1) v63+0.0:D, -1d NoMask group0

This was also fixed by running dead-code elimination before CSE. That
seems more like avoiding the problem than fixing it, though.

I believe this affects shader-db results because leaving the second
CMP in the shader can give more opportunities for cmod propagation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: 234c45c929 ("intel/brw: Write a new global CSE pass that works on defs")

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total cycles in shared programs: 922097690 -> 922260862 (0.02%)
cycles in affected programs: 3178926 -> 3342098 (5.13%)
helped: 130
HURT: 88
helped stats (abs) min: 2 max: 2194 x̄: 296.71 x̃: 16
helped stats (rel) min: <.01% max: 16.56% x̄: 1.86% x̃: 0.18%
HURT stats (abs)   min: 4 max: 11992 x̄: 2292.55 x̃: 47
HURT stats (rel)   min: 0.04% max: 57.32% x̄: 11.82% x̃: 0.61%
95% mean confidence interval for cycles value: 320.36 1176.63
95% mean confidence interval for cycles %-change: 1.59% 5.73%
Cycles are HURT.

LOST:   2
GAINED: 1

fossil-db:

Lunar Lake, Meteor Lake, Tiger Lake had similar results. (Lunar Lake shown)
Totals:
Instrs: 142022960 -> 142022928 (-0.00%); split: -0.00%, +0.00%
Cycle count: 21995242782 -> 21995384040 (+0.00%); split: -0.00%, +0.00%
Max live registers: 48013385 -> 48013343 (-0.00%)

Totals from 507 (0.09% of 551441) affected shaders:
Instrs: 886191 -> 886159 (-0.00%); split: -0.01%, +0.01%
Cycle count: 69302492 -> 69443750 (+0.20%); split: -0.66%, +0.86%
Max live registers: 94413 -> 94371 (-0.04%)

DG2
Totals:
Instrs: 152856370 -> 152856093 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17237159885 -> 17236804052 (-0.00%); split: -0.00%, +0.00%
Fill count: 150673 -> 150631 (-0.03%)
Max live registers: 31871520 -> 31871476 (-0.00%)

Totals from 506 (0.08% of 633197) affected shaders:
Instrs: 831795 -> 831518 (-0.03%); split: -0.04%, +0.01%
Cycle count: 55578509 -> 55222676 (-0.64%); split: -1.38%, +0.74%
Fill count: 2779 -> 2737 (-1.51%)
Max live registers: 51383 -> 51339 (-0.09%)

Ice Lake and Skylake had similar results. (Ice Lake shown)
Totals:
Instrs: 152017826 -> 152017793 (-0.00%); split: -0.00%, +0.00%
Cycle count: 15180773451 -> 15180761166 (-0.00%); split: -0.00%, +0.00%
Fill count: 106610 -> 106614 (+0.00%)
Max live registers: 32195006 -> 32194966 (-0.00%)

Totals from 411 (0.06% of 637268) affected shaders:
Instrs: 705935 -> 705902 (-0.00%); split: -0.01%, +0.01%
Cycle count: 47830019 -> 47817734 (-0.03%); split: -0.05%, +0.02%
Fill count: 2865 -> 2869 (+0.14%)
Max live registers: 42883 -> 42843 (-0.09%)

(cherry picked from commit 9aba731d03)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-12 09:12:10 -08:00
Ian Romanick
1e792b0933 brw/copy: Don't copy propagate through smaller entry dest size
Copy propagation would incorrectly occur in this code

    mov(16) v4+2.0:UW, u0<0>:UW NoMask
    ...
    mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0

to create

    mov(16) v4+2.0:UW, u0<0>:UW NoMask
    ...
    mov(8) v6+2.0:UD, u0<0>:UD NoMask group0

This has different behavior. I think I just made a mistake when I
changed this condition in e3f502e007.

It seems like this condition could be relaxed to cover cases like (note
the change of destination stride)

    mov(16) v4+2.0<2>:UW, u0<0>:UW NoMask
    ...
    mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0

I'm not sure it's worth it.

No shader-db or fossil-db changes on any Intel platform. Even the code
for the test case mentioned in the original commit did not change.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: e3f502e007 ("intel/fs: Allow copy propagation between MOVs of mixed sizes")
Closes: #12116
(cherry picked from commit 80a5d158ae)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-12 09:12:07 -08:00
Dylan Baker
08955d2ee8 .pick_status.json: Update to 5e0b81413d
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-12 09:11:37 -08:00
Ian Romanick
8f53de4a5d brw/emit: Add correct 3-source instruction assertions for each platform
Specifically, allow two immediate sources for BFE on Gfx12+. I stumbled
on this while trying some stuff with !31852.

v2: Don't be lazy. Add proper assertions for all the things on all the
platforms. Based on a suggestion by Ken.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: 7bed11fbde ("intel/brw: Allow immediates in the BFE instruction on Gfx12+")
(cherry picked from commit c1c09e3c4a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-08 10:03:27 -08:00
Hans-Kristian Arntzen
baba2805ca vulkan/wsi/wayland: Use X11-style image count strategy when using FIFO.
This is required, otherwise we regress latency in cases where
applications are using FIFO without explicit KHR_present_wait.
This is an unacceptable regression.

The fix is to normalize the behavior to X11 WSI.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fixes: d052b0201e ("vulkan/wsi/wayland: Use fifo protocol for FIFO")
(cherry picked from commit 5f70858ece)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-08 10:03:26 -08:00
Karol Herbst
7cef55b993 nvc0: return NULL instead of asserting in nvc0_resource_from_user_memory
Fixes: 212f1ab40e ("nvc0: support PIPE_CAP_RESOURCE_FROM_USER_MEMORY_COMPUTE_ONLY")
Acked-by: David Heidelberg <david@ixit.cz>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit 277925471e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-08 10:03:25 -08:00
Karol Herbst
b856d0d3cc nv/codegen: Do not use a zero immediate for tex instructions
They aren't always legal for tex instructions, specifically for TXQ when
an actual source is needed.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11999
Fixes: 85a31fa1fc ("nv50/ir/nir: fix txq emission on MS textures")
(cherry picked from commit 47a1565c3d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-08 10:03:24 -08:00
Lionel Landwerlin
1ab129ba70 anv: fix extent computation in image->image host copies
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0317c44872 ("anv: add VK_EXT_host_image_copy support")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 3ecf2a0518)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-08 10:03:23 -08:00
Eric Engestrom
7dc84d1c96 meson: bump spirv-tools version needed to v2022.1
Since c60a421f0c ("vtn: Add a debug flag to dump SPIR-V
assembly"), we use SPIR-V 1.6, which was added in `spirv-tools 2022.1`.

Fixes: c60a421f0c ("vtn: Add a debug flag to dump SPIR-V assembly")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11802
(cherry picked from commit 95c2496412)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-08 10:03:19 -08:00
Dylan Baker
93d5d587f5 .pick_status.json: Update to ced2404cb4
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-08 10:03:17 -08:00
Dylan Baker
85ba713d76 VERSION: bump for 24.3.0-rc1 release 2024-11-07 10:44:28 -08:00
Chia-I Wu
879ec4270d panvk: fix dummy sampler handle for vs
When there is no dynamic buffer, create_copy_table early returns.  Make
sure dummy_sampler_handle is still set.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32018>
2024-11-07 17:35:12 +00:00
Chia-I Wu
7e737500bd panvk: fix missing same-subqueue wait for CmdWaitEvents2
CmdSetEvent2 does not call cs_wait_slots.  CmdWaitEvents2 should wait
for the syncobj even on the same subqueue.  To that goal, update
collect_cs_deps to not clear self from wait_subqueue_mask.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31997>
2024-11-07 17:15:12 +00:00
Daniel Stone
fe50011ddb build: Don't run wayland-protocols tests
There's not too much point in running tests in general, but also
specifically for wayland-protocols, which requires a newer
wayland-scanner to run the tests (for DTD validation) but not to parse
the protocol files.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: cdef622a0a ("meson: Update wayland-protocols to 1.38")
Closes: mesa#12126
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32036>
2024-11-07 15:55:58 +00:00
Eric Engestrom
f789dd42b8 ci: replace plain meson with explicit meson setup
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32030>
2024-11-07 14:48:41 +00:00
Eric Engestrom
1149d69b39 ci: drop unused extra args in build-vkd3d-proton.sh
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32030>
2024-11-07 14:48:41 +00:00
Eric Engestrom
06cca41889 meson: add dependencies needed by wsi_common_x11.c even on non-drm platforms
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11907
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32012>
2024-11-07 12:56:40 +00:00
Eric Engestrom
3e7be078f8 meson: drop variable initialized twice
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32012>
2024-11-07 12:56:40 +00:00
Benjamin Herrenschmidt
e1098310da dril: Fixup order of pixel formats in drilConfigs
Having the RGB* formats before the BGR* formats in the table causes
problems where under some circumstances, some applications end up
with the wrong colors.

The repro case for me is: Xvnc + mutter + chromium

There was an existing comment in dri_fill_in_modes() which explained
the problem. This was lost when dril_target.c was created.

Fixes: ec7afd2c24 ("dril: rework config creation")
Fixes: 3de62b2f9a ("gallium/dril: Compatibility stub for the legacy DRI loader interface")

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31950>
2024-11-07 12:12:57 +00:00
Samuel Pitoiset
9cc07bbd09 radv: mark some GFX6-7 GPUs as Vulkan 1.3 conformant
It's the first time RADV is Vulkan conformant on GFX6-7! Some chips
are missing because we don't have access but most of the GFX6-7 GPUs
are covered.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32022>
2024-11-07 11:50:10 +00:00
Mary Guillemard
125223b391 panvk: Ensure that render_info is not null in force_fb_preload
This fixes various crashes that I saw with occlusion query tests.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: ba2c7fd00a ("panvk: use force_fb_preload for unaligned preload")
Fixes: c108dfc930 ("panvk: force_fb_preload should insert a barrier")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32024>
2024-11-07 10:26:48 +00:00
Christian Gmeiner
f4e8849d79 etnaviv: Fix incorrect pipe_nn creation
When etna_screen_create(..) is called with gpu != NULL and npu == NULL,
screen->pipe_nn is incorrectly set up. This leads to an unintended
stream configuration for compute-only contexts, as determined by

  pipe = (compute_only && screen->pipe_nn) ? screen->pipe_nn : screen->pipe;

To address this, extend the gpu != npu condition by adding a check for
npu != NULL to ensure pipe_nn is only initialized when both gpu and npu
are provided.

Fixes: a4653587cc ("etnaviv: Add a separate NPU pipe")
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32025>
2024-11-07 10:02:48 +00:00
David Heidelberg
9f5ee44986 freedreno: python fixes
Acked-by: Rob Clark <robclark@freedesktop.org>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29390>
2024-11-07 09:15:54 +00:00
Samuel Pitoiset
b67218645d radv: save the trap handler report in the HOME directory
It's similar to where GPU hang reports are saved.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31988>
2024-11-07 09:28:16 +01:00
Derek Foreman
71cc22504f adv+zink/ci: Add a recent flake
Signed-off-by: Derek Foreman <derekf@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26150>
2024-11-07 00:03:23 +00:00
Derek Foreman
c26ab1aee1 vulkan/wsi/wayland: Pace frames with commit-timing-v1
Instead of using frame callbacks - which may stop firing if our surface is
occluded - use the new commit-timing-v1 protocol in combination with the
presentation feedback protocol.

If the required protocols are unavailable, or the environment variable
MESA_VK_WSI_DEBUG contains "nowlts", we fall back to frame callback
based pacing behaviour.

Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26150>
2024-11-07 00:03:23 +00:00
Derek Foreman
d052b0201e vulkan/wsi/wayland: Use fifo protocol for FIFO
The fifo protocol allows us to ensure that a compositor presents
an image that we submit to it. Use this to reliably implement FIFO
semantics.

Note: On systems where the fifo protocol is available an occluded
surface may find itself unthrottled when previously it would have
been frozen.

Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26150>
2024-11-07 00:03:23 +00:00
Derek Foreman
50d3fb65db vulkan/wsi/wayland: Use presentation timing v2 when available
Presentation timing v2 gives us a usable value instead of a 0 when
VRR is in use. Prefer that if available.

Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26150>
2024-11-07 00:03:23 +00:00
Derek Foreman
cdef622a0a meson: Update wayland-protocols to 1.38
Update the wrap and the dependency, as well as bumping several build tags.

I've also turned off wayland-protocols tests, as we don't want to bump the
wayland-scanner version at this time.

Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26150>
2024-11-07 00:03:23 +00:00
Chia-I Wu
c108dfc930 panvk: force_fb_preload should insert a barrier
Preloading is effectively texel fetching.  When we force preloading, we
need to insert a barrier for the feedback loop.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31895>
2024-11-06 15:23:45 -08:00
Chia-I Wu
ba2c7fd00a panvk: use force_fb_preload for unaligned preload
Extend force_fb_preload to take an optional VkRenderingInfo.  When it is
non-NULL, this is the unaligned preload and force_fb_preload should
clear attachments.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31895>
2024-11-06 15:23:41 -08:00
Felix DeGrood
bf96702985 intel/measure: increase size of filename malloc to account for \0
Corrects regression caused by prior commit that created memory
overwrite by not mallocing enough space for filename string.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32013>
2024-11-06 22:12:29 +00:00
Sergi Blanch Torne
918978f525 Nightly full job for a630-gles-asan
The a630-gles-asan has a significant fraction, that's a trade-off for the
pre-merge, but then we need a full test in the nightly run.

The a630-gles-asan-full job usually takes 40-50 minutes. Therefore, the 20
minutes timeout is increased to 1h. The parallel feature is not used because
the nightly run is, with the introduction of this job, using 4 of the 6
devices available.

Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Reviewed-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31713>
2024-11-06 21:44:44 +00:00
Pavel Ondračka
f59f322efc r300/ci: fails update after recent piglit uprev
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31745>
2024-11-06 21:10:21 +00:00
Pavel Ondračka
5480831e5e r300: add driconf math mode override for Unigine Tropics and Oilrush
Fixes rendering in both apps. Specifically they want the ME_RECIP_FF
opcode. Figured out by Filip Gawin.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/332
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31745>
2024-11-06 21:10:21 +00:00
Pavel Ondračka
be595d0e52 r300: remove wrong Unigine Sanctuary driconf override
I used this for testing when adding r300 driconf support
and it was commited by mistake.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31745>
2024-11-06 21:10:21 +00:00
Pavel Ondračka
584ac64670 r300: add switch to support IEEE and FF math opcodes
Also add support for the 0*NaN = NaN IEEE compliant multiply on R500.
All of this is disabled by default, but can be enabled with a
RADEON_DEBUG variable or alternativelly with a driconf tweak.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31745>
2024-11-06 21:10:21 +00:00
Jesse Natalie
26fc1ea9e5 dzn: Clean up dri options cache
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32011>
2024-11-06 20:53:13 +00:00
Rhys Perry
215c44c124 aco: apply extract to v_cvt_f32_ubyte0
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
f1a932bc29 aco: apply extract to p_extract_vector
fossil-db (navi21):
Totals from 46 (0.06% of 79395) affected shaders:
Instrs: 80126 -> 79944 (-0.23%); split: -0.27%, +0.04%
CodeSize: 486860 -> 485668 (-0.24%); split: -0.31%, +0.06%
Latency: 1615395 -> 1614218 (-0.07%); split: -0.07%, +0.00%
InvThroughput: 705479 -> 705013 (-0.07%); split: -0.07%, +0.00%
Copies: 18934 -> 18797 (-0.72%); split: -0.98%, +0.25%
VALU: 52452 -> 52268 (-0.35%); split: -0.41%, +0.06%
SALU: 17253 -> 17255 (+0.01%); split: -0.02%, +0.03%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
6cb9d39bc2 aco: combine extracts with sub-dword definitions
fossil-db (navi21):
Totals from 23 (0.03% of 79395) affected shaders:
Instrs: 55133 -> 55099 (-0.06%)
CodeSize: 335744 -> 335512 (-0.07%)
Latency: 1709146 -> 1709031 (-0.01%)
InvThroughput: 613788 -> 613713 (-0.01%)
Copies: 14405 -> 14407 (+0.01%); split: -0.03%, +0.04%
VALU: 37038 -> 37000 (-0.10%)
SALU: 11125 -> 11131 (+0.05%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
30af7ae44f aco: add and use apply_extract_twice helper
This will be used in the next commit.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
05d0fa894e aco: allow applying sign-extended sel to p_extract more often
In the case of v1=p_extract(v1=p_extract(src, 0, 16, 1), 0, 32, 0).
When we apply extracts with sub-dword definitions, this will also
include v2b=p_extract(v2b=p_extract(src, 0, 8, 1), 0, 16, 0).

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
e47bc3e750 aco: shrink code size of some p_extract
fossil-db (navi21):
Totals from 37 (0.05% of 79395) affected shaders:
CodeSize: 2048204 -> 2047836 (-0.02%)

fossil-db (navi31):
Totals from 307 (0.39% of 79395) affected shaders:
CodeSize: 3075732 -> 3065236 (-0.34%); split: -0.34%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
d285333800 aco: add a bit more p_extract/p_insert validation
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
d3ac69f79b aco: handle SGPR limitations when applying extract
We were already doing this, but missing it in a few places.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
07e28dad75 aco: disallow p_extract(,,32,)
Nothing uses these.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry
f528597906 aco: check for SDWA before applying extract to lshl/cvt_f32
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00