Commit Graph

192236 Commits

Author SHA1 Message Date
Alyssa Rosenzweig
acb10043cb nvk: add instruction count exec property
useful for shader-db, this isn't as simple as dividing the code size so it's
worth reporting.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30136>
2024-07-16 15:13:40 +00:00
Faith Ekstrand
4030447dab nak: gather instr count explicitly
This isn't as simple as dividing so we want a real shader info property for nvk
to consume. Plumb one through.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30136>
2024-07-16 15:13:40 +00:00
Alyssa Rosenzweig
67e3b3fbfd nouveau/drm-shim: set ram_user
this is required for nvk to create a heap. fixes zink+nvk+drm-shim:

run: ../src/gallium/drivers/zink/zink_screen.c:3371: zink_internal_create_screen: Assertion `i == ZINK_HEAP_HOST_VISIBLE_COHERENT_CACHED || i == ZINK_HEAP_DEVICE_LOCAL_LAZY || i == ZINK_HEAP_DEVICE_LOCAL_VISIBLE' failed.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30136>
2024-07-16 15:13:40 +00:00
Daniel Schürmann
6723128e94 aco/spill: Don't add phi definitions to live-in variables
Changes are because we don't add artificial uses to the phi definitions anymore.

Totals from 13 (0.02% of 79395) affected shaders: (GFX10.3)

Instrs: 230510 -> 230285 (-0.10%); split: -0.10%, +0.00%
CodeSize: 1269916 -> 1268760 (-0.09%); split: -0.10%, +0.01%
SpillSGPRs: 2057 -> 2058 (+0.05%)
Latency: 2729731 -> 2723103 (-0.24%)
InvThroughput: 696888 -> 695286 (-0.23%)
VClause: 5795 -> 5768 (-0.47%)
SClause: 6855 -> 6858 (+0.04%)
Copies: 32336 -> 32275 (-0.19%); split: -0.22%, +0.03%
VALU: 151782 -> 151731 (-0.03%); split: -0.04%, +0.01%
SALU: 30766 -> 30758 (-0.03%); split: -0.03%, +0.01%
VMEM: 12157 -> 12078 (-0.65%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30120>
2024-07-16 14:00:49 +00:00
Daniel Schürmann
bb5af6bede aco: remove live-out variables from IR
Since we changed all passes to use the live-in variables,
these are not needed anymore.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30120>
2024-07-16 14:00:49 +00:00
Daniel Schürmann
f86816ca85 aco/print_ir: print live-in instead of live-out variables
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30120>
2024-07-16 14:00:49 +00:00
Daniel Schürmann
043ec096c1 aco/validate: use live-in variables for RA validation
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30120>
2024-07-16 14:00:49 +00:00
Daniel Schürmann
976dd71942 aco/cssa: use live-in variables instead of live-out variables
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30120>
2024-07-16 14:00:49 +00:00
Daniel Schürmann
c146d4b6b6 aco/spill: use live-in variables directly rather than computing them
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30120>
2024-07-16 14:00:49 +00:00
Daniel Schürmann
162876c875 aco/ra: use live-in variables directly rather than computing them
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30120>
2024-07-16 14:00:49 +00:00
Daniel Schürmann
29262f8cf3 aco: compute live-in variables in addition to live-out variables
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30120>
2024-07-16 14:00:49 +00:00
Mary Guillemard
9a4a03ec1f ci/panfrost: Update t760 fails
Add a new failure found while running gles job manually

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30150>
2024-07-16 13:10:56 +00:00
Mary Guillemard
32a4596d17 panfrost: Handle gracefully resource BO alloc failures
This makes panfrost_bo_alloc not assert anymore and propagate the
failure again.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30150>
2024-07-16 13:10:56 +00:00
Mary Guillemard
71a24a0c5e panfrost: Handle context_init errors correctly
This fix OpenCL CTS "multiple_device_context/test_multiples" failures.

Also improve create_context/destroy error management a bit.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30150>
2024-07-16 13:10:56 +00:00
Mary Guillemard
668bde4421 pan/kmod: Avoid deadlock on VA allocation failure on panthor
Fixes: 97f6a62f7e ("pan/kmod: Add a backend for panthor")
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30150>
2024-07-16 13:10:55 +00:00
Daniel Schürmann
ffef3d1709 nir/opt_sink: ignore loops without backedge
Loops without backedge should not be considered loops.
For RADV, 2069 (2.61% of 79395) affected shaders.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28783>
2024-07-16 12:29:08 +00:00
Daniel Schürmann
79875737cc radv: use NIR loop invariant code motion pass
Totals from 3469 (4.37% of 79395) affected shaders: (GFX11)
MaxWaves: 78690 -> 78622 (-0.09%); split: +0.03%, -0.11%
Instrs: 11093592 -> 11092346 (-0.01%); split: -0.09%, +0.07%
CodeSize: 57979444 -> 58077232 (+0.17%); split: -0.12%, +0.29%
VGPRs: 257892 -> 258336 (+0.17%); split: -0.08%, +0.25%
SpillSGPRs: 2958 -> 2521 (-14.77%); split: -32.83%, +18.05%
Latency: 135247583 -> 134446992 (-0.59%); split: -0.61%, +0.02%
InvThroughput: 25654328 -> 25478620 (-0.68%); split: -0.73%, +0.05%
VClause: 244799 -> 244499 (-0.12%); split: -0.17%, +0.05%
SClause: 313323 -> 315081 (+0.56%); split: -0.40%, +0.96%
Copies: 835953 -> 842457 (+0.78%); split: -0.38%, +1.15%
Branches: 330136 -> 330210 (+0.02%); split: -0.03%, +0.05%
PreSGPRs: 193374 -> 200277 (+3.57%); split: -0.38%, +3.95%
PreVGPRs: 223947 -> 224227 (+0.13%); split: -0.02%, +0.15%
VALU: 6312413 -> 6314841 (+0.04%); split: -0.02%, +0.06%
SALU: 1222275 -> 1227329 (+0.41%); split: -0.26%, +0.67%
VMEM: 408421 -> 408412 (-0.00%)
SMEM: 430966 -> 430399 (-0.13%)
VOPD: 2482 -> 2440 (-1.69%); split: +0.44%, -2.14%

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28783>
2024-07-16 12:29:08 +00:00
Daniel Schürmann
540ee1c81a nir: implement loop invariant code motion (LICM) pass
This simple LICM pass hoists all loop-invariant instructions
from the loops' top-level control flow, skipping any nested CF.
The hoisted instructions are placed right before the loop.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28783>
2024-07-16 12:29:08 +00:00
Alejandro Piñeiro
e18b54fa5d drm-shim: stub synobj_timeline_wait and query ioctl
Needed to avoid unhandled code DRM ioctl errors on some platforms when
using shader-db.

Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30184>
2024-07-16 11:17:59 +02:00
Dave Airlie
814a2da2f4 radv/video: advertise mutable/extended for dst video images.
This allows zink video to create planar image views if needed.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30203>
2024-07-16 07:04:15 +00:00
Samuel Pitoiset
8863704c6b radv/meta: add a helper to create descriptor set layout
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30187>
2024-07-16 06:17:07 +00:00
Samuel Pitoiset
3d322b787e radv/meta: add a helper to create pipeline layout
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30187>
2024-07-16 06:17:07 +00:00
Samuel Pitoiset
c6a626e000 radv/meta: add a helper to create compute pipeline
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30187>
2024-07-16 06:17:07 +00:00
Samuel Pitoiset
bf3b2d2912 radv/meta: remove useless checks for NULL handles before destroying
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30187>
2024-07-16 06:17:07 +00:00
Samuel Pitoiset
4deb138e7d radv/meta: remove unused number of rectangles for internal operations
It was always 1.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30187>
2024-07-16 06:17:07 +00:00
Samuel Pitoiset
ecd3bbf826 radv/meta: remove redundant check for hw resolve pipelines
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30187>
2024-07-16 06:17:07 +00:00
Samuel Pitoiset
76e4edefbf radv/meta: remove unnecessary blit2d_dst_temps struct
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30187>
2024-07-16 06:17:07 +00:00
Samuel Pitoiset
e739d0e5bb radv/meta: remove non-valuable comments
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30187>
2024-07-16 06:17:07 +00:00
Yukari Chiba
6f02ec5ed1 llvmpipe: add an implementation with llvm orcjit
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26018>
2024-07-16 12:22:29 +10:00
Yukari Chiba
0b69b8d0db llvmpipe/tests: add a new test for multiple symbols for orc jit testing
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26018>
2024-07-16 11:31:24 +10:00
Yukari Chiba
ba283c0d84 llvmpipe: add function name to gallivm_jit_function
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26018>
2024-07-16 11:31:24 +10:00
Yukari Chiba
28530c3eaa gallivm: add riscv support to the mattrs setting code
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26018>
2024-07-16 09:41:41 +10:00
Yukari Chiba
465510a211 util: detect RISC-V architecture
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26018>
2024-07-16 09:41:28 +10:00
Doug Brown
60488d6213 xa: add missing stride setup in renderer_draw_yuv
This fixes a problem observed in VMware VMs where Xv playback results in
a black screen instead of the actual video.

Signed-off-by: Doug Brown <doug@schmorgal.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11490
Fixes: 7672545223 ("gallium: move vertex stride to CSO")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30116>
2024-07-15 20:54:43 +00:00
Josh Simmons
1ced840632 radv: Add RADV_PROFILE_PSTATE envvar
Enable selecting the specific pstate to enter when using thread tracing
and when acquiring the profiling lock for performance queries.

Signed-off-by: Josh Simmons <josh@nega.tv>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30139>
2024-07-15 20:32:01 +00:00
Alyssa Rosenzweig
bda1de89db asahi: eliminate load_num_workgroups from TCS unrolled ID
honeykrisp doesn't want to implement this sysval, we don't need it here.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30051>
2024-07-15 20:09:00 +00:00
Alyssa Rosenzweig
ae769727d8 libagx: handle VS/IA pipeline stats on GPU
This was an obnoxious bit of cheating we had in the gl4.6 driver that I added
literally the morning I passed gl4.6 cts, just to fix my last gl4.6 cts test.
It had an expiration date.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30051>
2024-07-15 20:09:00 +00:00
Alyssa Rosenzweig
1fbf2002e3 asahi: handle CS pipeline stat with indirect dispatch
no more stall here.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30051>
2024-07-15 20:09:00 +00:00
Alyssa Rosenzweig
bc4d38d4ed libagx: add kernel for incrementing CS counter
for indirect dispatch

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30051>
2024-07-15 20:09:00 +00:00
Alyssa Rosenzweig
d26ae4f455 asahi,libagx: tessellate on device
Add OpenCL kernels implementing the tessellation algorithm on device. This is an
OpenCL C port of the D3D11 reference tessellator, originally written by
Microsoft in C++. There are significant differences compared to the CPU based
reference implementation:

* significant simplifications and clean up. The reference code did a lot of
  things in weird ways that would be inefficient on the GPU. I did a *lot* of
  work here to get good AGX assembly generated for the tessellation kernels ...
  the first attempts were quite bad! Notably, everything is carefully written to
  ensure that all private memory access is optimized out in NIR; the resulting
  kernels do not use scratch and do not spill on G13.

* prefix sum variants. To implement geom+tess efficiently, we need to first
  calculate the count of indices generated by the tessellator, then prefix sum
  that, then tessellate using the prefix sum results writing into 1 large index
  buffer for a single indirect draw. This isn't too bad, we already have most of
  the logic and the guts of the prefix sum kernel is shared with geometry
  shaders.

* VDM generation variant. To implement tess alone, it's fastest to generate a
  hardware Index List word for each patch, adding an appropriate 32-bit index
  bias to the dynamically allocated U16 index buffers. Then from the CPU, we
  have the illusion of a single draw to Stream Link with Return to. This
  requires packing hardware control words from the tessellator kernel.
  Fortunately, we have GenXML available so we just use agx_pack like we would in
  the driver.

Along the way, we pick up indirect tess support (this follows on naturally),
which gets rid of the other bit of tessellation-related cheating. Implementing
this requires reworking our internal agx_launch data structures, but that has
the nice side effect of speeding up GS invocations too (by fixing the workgroup
size).

Don't get me wrong. tessellator.cl is the single most unhinged file of my
career, featuring GenXML-based pack macros fed by dynamic memory allocation fed
by the inscrutable tessellation algorithm.

But it works *really* well.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30051>
2024-07-15 20:09:00 +00:00
Alyssa Rosenzweig
cc9b815efa libagx: specify heap size explicitly
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30051>
2024-07-15 20:09:00 +00:00
Alyssa Rosenzweig
a82c0211e7 asahi: tuck in null query check
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30051>
2024-07-15 20:09:00 +00:00
Alyssa Rosenzweig
bce466586e asahi: make agx_pack opencl compatible
we don't want generic pointers here to keep things happy. also rename
CONSTANT to avoid collisions

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30051>
2024-07-15 20:09:00 +00:00
Alyssa Rosenzweig
9624b86af0 asahi: drop stale comment
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30051>
2024-07-15 20:08:59 +00:00
Alyssa Rosenzweig
1d4f0d3002 asahi: drop old comment
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30051>
2024-07-15 20:08:59 +00:00
Alyssa Rosenzweig
e8b673a109 agx: do not flush denorms for fp16 fmin/fmax
total instructions in shared programs: 2164639 -> 2158940 (-0.26%)
instructions in affected programs: 319475 -> 313776 (-1.78%)
helped: 1200
HURT: 6
Instructions are helped.

total alu in shared programs: 1690198 -> 1684653 (-0.33%)
alu in affected programs: 272173 -> 266628 (-2.04%)
helped: 1181
HURT: 6
Alu are helped.

total fscib in shared programs: 1686497 -> 1680797 (-0.34%)
fscib in affected programs: 272922 -> 267222 (-2.09%)
helped: 1200
HURT: 6
Fscib are helped.

total bytes in shared programs: 14334550 -> 14300314 (-0.24%)
bytes in affected programs: 2075546 -> 2041310 (-1.65%)
helped: 1200
HURT: 6
Bytes are helped.

total regs in shared programs: 662332 -> 662302 (<.01%)
regs in affected programs: 1103 -> 1073 (-2.72%)
helped: 14
HURT: 15
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>
2024-07-15 19:29:00 +00:00
Alyssa Rosenzweig
6ac289dade agx: set lower_fminmax_signed_zero
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>
2024-07-15 19:29:00 +00:00
Alyssa Rosenzweig
d238d766c6 nir: add lower_fminmax_signed_zero
This implements IEEE-754-2019 signed zero semantics for fmin/fmax, as now
required by NIR, for hardware that has busted signed zero behaviour for
fmin/fmax. Ian expressed interest in this for Intel.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>
2024-07-15 19:29:00 +00:00
Alyssa Rosenzweig
0e46f7b39a nir/lower_alu: remove dead #define
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>
2024-07-15 19:29:00 +00:00
Alyssa Rosenzweig
4ab3d95c11 nir/lower_double_ops: handle signed zero with min/max
Ensure the following identities hold to match IEEE-754-2019 and upcoming NIR:

   min(-0, +0) = -0
   min(+0, -0) = -0
   max(-0, +0) = +0
   max(+0, -0) = +0

NVK uses this lowering. In a simple compute shader using fmin64 on an SSBO with
signed zero preserve required, testing the effect of this patch, the instruction
count goes from 47->52. Obviously I'm not thrilled by that but I also couldn't
find any obvious way of mitigating the issue. (Maybe NVIDIA has special hardware
support here. By instruction count, lowering all the way to int64 is a loss,
though I don't know how to count cycles on NVIDIA.)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>
2024-07-15 19:29:00 +00:00