Commit Graph

196666 Commits

Author SHA1 Message Date
Zan Dobersek
b44480e86a zink: fix bo_export caching
When creating and caching the bo_export object for a given zink_bo, the
screen file descriptor was used. Since no buffer object's file descriptor
would match that, bo_export objects were continuously added to the exports
list and no bo_export was effectively picked from the cache. Using the
buffer object's file descriptor avoids that.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: b0fe621459 ("zink: add back kms handling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31715>
2024-10-17 17:27:20 +00:00
Aleksi Sapon
cfdb653f1c llvmpipe: update traces for aniso filtering fix
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31562>
2024-10-17 16:47:22 +00:00
Aleksi Sapon
e7a851e6cf softpipe: Fix anisotropic sampling aliasing bug
"Backport" of the llvmpip fix.

Nearest sampling was being done using coordinates
on texel boundaries, which caused aliasing bugs.
Shift coordinates by half a texel to correct this.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31562>
2024-10-17 16:47:22 +00:00
Aleksi Sapon
5947e3d760 llvmpipe: Fix pmin calculation
Based on the original code from sp_tex_sample.c,
this was supposed to be a comparison with pmax2,
not pmin2.

This mostly seemed to result in anisotropic filtering
turning on to "maximum" at any value of max_aniso > 1.
Most apparent when runing the texfilt test in demos.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31562>
2024-10-17 16:47:22 +00:00
Aleksi Sapon
313115f98b llvmpipe: Fix anisotropic sampling aliasing bug
Nearest sampling was being done using coordinates
on texel boundaries, which caused aliasing bugs.
Shift coordinates by half a texel to correct this.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31562>
2024-10-17 16:47:21 +00:00
Connor Abbott
e9bb906a32 tu: Implement VK_EXT_pipeline_robustness
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31687>
2024-10-17 16:16:05 +00:00
Connor Abbott
c323848b0b ir3, tu: Plumb through support for per-shader robustness
We need to pass through the robust_modes flag to nir_opt_vectorize based
on a flag set when compiling the shader, not globally in the compiler,
for VK_EXT_pipeline_robustness. Refactor the ir3 compiler interface
to add an ir3_shader_nir_options struct that can be passed around to
the appropriate places, and wire it up in turnip to the shader key. The
shader key replaces the old mechanism of hashing in the compiler
options.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31687>
2024-10-17 16:16:05 +00:00
Anil Hiranniah
3d066e5ef1 panfrost: Fix a memory leak in the CSF backend
The geometry BO should be released in csf_cleanup_context().

Fixes: 447075eeee ("panfrost: Add support for the CSF job frontend")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31705>
2024-10-17 15:03:44 +00:00
Boris Brezillon
27bde761a7 panvk: Fix the hierarchy_mask selection
Always enable the level covering the whole FB, and disable the finest
levels if we don't have enough to cover everything.

This is suboptimal for small primitives, since it might force primitives
to be walked multiple times even if they don't cover the the tile being
processed. On the other hand, it's hard to guess the draw pattern, so
it's probably good enough for now.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31682>
2024-10-17 14:29:57 +00:00
Iago Toral Quiroga
ad111aed29 v3dv: fix leak during device initialization
Fixes: 188f1c6cbe ('v3dv: rewrite device identification')

Reviewed-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31703>
2024-10-17 13:27:05 +00:00
Juan A. Suarez Romero
e4301621a2 v3d/ci: add OpenCL failures
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31710>
2024-10-17 12:35:33 +00:00
Lionel Landwerlin
ea2bbe3271 anv: use stage mask to deduce cs/pb-stall requirements
When flushing the render target cache for future operations, we need a
stall at pixel scoreboard. We likely didn't see any issue until now
because a change in render target added the pb-stall.

When using a 2 compute shaders with the following pattern :
  vkCmdDispatch()
  vkCmdPipelineBarrier() ImageBarrier with (src|dst)AccessMask=0 & identical layout
  vkCmdDispatch()

we should ensure that the first dispatch is completed before executing
the second one, otherwise they can race to on resource accesses. This
fixes failures in some new CTS tests.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31676>
2024-10-17 11:55:33 +00:00
Georg Lehmann
aabadb30fc aco/print_ir: use parse_depctr_wait
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31132>
2024-10-17 11:16:16 +00:00
Georg Lehmann
ced7a01954 aco/statistics: update branch issue cycles
Foz-DB Navi31:
Totals from 14319 (18.04% of 79395) affected shaders:
Instrs: 20064495 -> 20062876 (-0.01%)
CodeSize: 105334568 -> 105327704 (-0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31132>
2024-10-17 11:16:16 +00:00
Georg Lehmann
ec11cfc69d aco/insert_delay_alu: do not delay lane mask fast forwarding
The delay actually hurts performance in this case.

Foz-DB Navi31:
Totals from 30340 (38.21% of 79395) affected shaders:
Instrs: 30778999 -> 30726605 (-0.17%); split: -0.17%, +0.00%
CodeSize: 162380180 -> 162170808 (-0.13%); split: -0.13%, +0.00%
Latency: 228185562 -> 228186976 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 39001151 -> 39000897 (-0.00%); split: -0.00%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31132>
2024-10-17 11:16:16 +00:00
Georg Lehmann
e4889fd4b5 aco/insert_delay_alu: consider more implicit waits
Foz-DB Navi31:
Totals from 37961 (47.81% of 79395) affected shaders:
Instrs: 34175286 -> 33978599 (-0.58%)
CodeSize: 180059352 -> 179190076 (-0.48%); split: -0.48%, +0.00%
Latency: 259826196 -> 259798474 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 42792700 -> 42789298 (-0.01%); split: -0.01%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31132>
2024-10-17 11:16:16 +00:00
Georg Lehmann
840b5841d3 aco: do not track ALU delay across jumps
This assumes that the best case jump latency is higher than the worst case
ALU latency.

Foz-DB Navi31:
Totals from 17720 (22.32% of 79395) affected shaders:
Instrs: 26009663 -> 25929989 (-0.31%); split: -0.31%, +0.00%
CodeSize: 136571496 -> 136254420 (-0.23%); split: -0.23%, +0.00%
Latency: 215731308 -> 215722059 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 36534197 -> 36532070 (-0.01%); split: -0.01%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31132>
2024-10-17 11:16:16 +00:00
Georg Lehmann
977f435f4c aco/ir: add function to parse depctr waits
No Foz-DB changes on Navi31.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31132>
2024-10-17 11:16:16 +00:00
Mike Blumenkrantz
a061c80629 zink: further improve image usage detection
there was some confusion over exactly where ici->usage should be set,
since the value must be set when doing all the format checks but then
also it was re-set again later to a potentially different value based
on an unchecked return

now get_image_usage() is set_image_usage() with a more consistent policy
around exactly where the usage is set

this code still sucks tho

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31686>
2024-10-17 10:38:28 +00:00
Georg Lehmann
dbf63a0788 nir: remove nir_op_is_derivative
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>
2024-10-17 09:50:19 +00:00
Georg Lehmann
f9d2aad7a3 nir: remove alu ddx/ddy
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>
2024-10-17 09:50:19 +00:00
Georg Lehmann
bf0d1a42b4 nir: remove uses_fddx_fddy
Unused and the code didn't even do what the comment said.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>
2024-10-17 09:50:19 +00:00
Georg Lehmann
cba575f4df nir: always emit ddx intrinsics
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>
2024-10-17 09:50:19 +00:00
Georg Lehmann
5205501e2f mesa/prog_to_nir: use derivative builder
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>
2024-10-17 09:50:19 +00:00
Georg Lehmann
41cce70584 spirv: remove alu fddx/fddy from comment
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>
2024-10-17 09:50:19 +00:00
Georg Lehmann
403e2393e3 ir3: remove alu fddx/fddy check
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>
2024-10-17 09:50:19 +00:00
Georg Lehmann
6cb6bc7133 elk: remove alu fddx/fddy check
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>
2024-10-17 09:50:19 +00:00
Georg Lehmann
1371a8fe2b nir/opt_move_discards_to_top: handle ddx/ddy intrinsics
Fixes: daa97bb41a ("amd: switch to derivative intrinsics")

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>
2024-10-17 09:50:19 +00:00
Marek Olšák
948f94b8c5 nir/opt_varyings: pack TCS inputs with cross-invocation access together
Unigine Heaven has a TCS that reads pos.xyz and tescoord.w from all
invocations in every invocation. By putting those two in the same vec4,
AMD hw can reduce the amount of shared memory that is allocated for those
inputs from 2 vec4s to 1 vec4.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31670>
2024-10-17 03:30:07 +00:00
Marek Olšák
8e93907b7c nir/opt_varyings: assign locations of no_varying IO for TCS outputs only
Skip the code for other shader stages because it doesn't do anything there.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31670>
2024-10-17 03:30:07 +00:00
Daniel Almeida
01586bc57e util: u_memstream: add tests
Now we have a few extra methods and things are diverging a bit between
Linux and Windows. Add a few unit tests to make sure this works.

Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30594>
2024-10-17 02:50:21 +00:00
Daniel Almeida
279f38918f nak: memstream: move into common code
Move the memstream code into common code. Other Rust code interfacing
with FILE pointers will find the memstream abstraction useful.

Most notably, pinning is actually enforced this time with PhantomPinned.

Add a .clang-format from a sibling dir (i.e.: compiler/nir) while we're
at it to keep things tidy.

Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30594>
2024-10-17 02:50:21 +00:00
Daniel Almeida
3136d1c8c6 util: memstream: add fflush support
Flush support is needed by the Rust code, which will switch from
its own memstream to u_memstream in a future patch.

Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30594>
2024-10-17 02:50:20 +00:00
Connor Abbott
e6c5dcd295 tu: Expose VK_KHR_dynamic_rendering_local_read
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
2024-10-17 00:30:45 +00:00
Connor Abbott
a99600322c tu: Track possible feedback loops for dynamic renderpasses
With DRLR, we unfortunately don't get notified about when there are
feedback loops. We also can't detect ahead of time the case where an
input attachment is read before written. This means we need to try and
guess in the driver when that happens based on which input attachments
we read.

There are three main things we have to do with feedback loops:

- Forcing flushing between primitives in sysmem to avoid problems with
  compressed textures before a7xx.
- Forcing late depth test if it's a depth feedback loop.
- Invalidating UCHE because we may read blit results.

The way this is implemented is currently a bit conservative, following
radv. However the impact of the flushing should be mitigated by
switching to GMEM whenever we can, which we already do.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
2024-10-17 00:30:44 +00:00
Connor Abbott
ef693c5785 tu: Fix flushes for feedback_invalidate case
We also have to wait for the blits to land, and WFI so that everything
finishes before any draws. Noticed when adding the equivalent thing for
dynamic renderpasses.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
2024-10-17 00:30:44 +00:00
Connor Abbott
cad2ca74d9 tu: Make input attachments always contain a real descriptor
We need this for read-only input attachments.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
2024-10-17 00:30:44 +00:00
Connor Abbott
beb513ad78 tu: Support dynamic input attachments
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
2024-10-17 00:30:44 +00:00
Connor Abbott
d50eef5b06 tu: Support color attachment remapping
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
2024-10-17 00:30:44 +00:00
Connor Abbott
5a4dff5922 ir3: Fix non-bindless s2en texture/sampler order
In gallium samplers and textures have a 1:1 correspondance, so it was
impossible to tell which is which. But we use non-bindless textures for
subpass loads, which don't have a sampler (so common code lowers to an
index of 0) and therefore the order matters. With dynamic rendering, we
can have more than 16 slots for input attachments, so the s2en wasn't
getting optimized away and suddenly it starts mattering, and it turns
out we got it wrong. By fixing this we also make the order the same as
the bindless order, which was always tested well by Vulkan.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
2024-10-17 00:30:44 +00:00
Connor Abbott
65c0846537 nir/lower_input_attachments: Handle unscaled input attachments with no index
With VK_KHR_dynamic_rendering_local_read we can have input attachments
with no index, which normally correspond to depth/stencil attachments,
and we have to handle this here when determining whether we need to emit
an unscaled fragcoord for FDM.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
2024-10-17 00:30:44 +00:00
Connor Abbott
4bd506a7f3 spirv: Make the default input attachment index ~0
This will let us know when an input attachment doesn't have an
InputAttachmentIndex, which used to be illegal but is now allowed and
meaningful with VK_KHR_dynamic_rendering_local_read.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
2024-10-17 00:30:44 +00:00
Connor Abbott
87d9127136 v3dv: Don't misuse nir_variable::data.index
The code seems to assume that it's supposed to be a base index for the
binding array, but it's not. It's entirely meaningless and happens to be
set to 0 for all descriptor types except for input attachments, which is
the one case where it's already ignored. Delete the dead code so that
when we change it to a "no index" sentinel value v3dv doesn't blow up.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
2024-10-17 00:30:44 +00:00
Connor Abbott
8e153bbd8f vulkan/state: Track the input attachment count
Tilers need to know, at the very least, which input attachments are
"real" input attachments which map to a color or depth/stencil
attachment in tile memory and which are "fake" input attachments that
should be treated as regular textures. To do this we need to know
VkRenderingInputAttachmentIndexInfoKHR::colorAttachmentCount, which
wasn't provided by the state tracking infrastructure. When the
VkRenderingInputAttachmentIndexInfoKHR isn't provided, we don't know it,
though, so we have to record a special UKNOWN value. In that case, the
driver will know thanks to
VUID-VkGraphicsPipelineCreateInfo-renderPass-09652 that all input
attachments are "real" input attachments that map directly to color
attachments if they have an InputAttachmentIndex.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
2024-10-17 00:30:44 +00:00
Connor Abbott
82169ec551 vulkan/state: Handle NULL in DS input attachment mapping correctly
NULL and VK_ATTACHMENT_UNUSED have different meanings, and we need to
preserve the difference for drivers that use the input attachment
mapping (i.e. turnip).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>
2024-10-17 00:30:44 +00:00
Daniel Stone
938e827c48 ci/deqp: Compress caselists with zstd
The Vulkan caselist is gigantic - over 400MB - and only growing over
time. Luckily zstd gets that down to about 9MB, with absolutely
negligible runtime overhead.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31595>
2024-10-16 22:52:44 +00:00
Daniel Stone
ed8441a339 ci/deqp: Flatten fraction/shard sed into a single pass
Currently, we:
  * copy the case list to a temporary file
  * have sed read it in, decimate for fraction, write it out again
  * have sed read it in, decimate for shard, write it out again

Flatten this out to have a single-pass sed covering both fraction and
shard that reads the source caselist in directly and writes the final
one.

This comes at the cost of an expression that's unreadable even for sed,
but with the Vulkan mustpass over 400MB and growing, it's a good
improvement.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31595>
2024-10-16 22:52:44 +00:00
Daniel Stone
0ec961325b ci: Strip yet more unnecessary things from the rootfs
We don't need to ship a Rust toolchain, drivers from Debian's version of
Mesa, etc. So stop doing it.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31595>
2024-10-16 22:52:44 +00:00
Daniel Stone
9295de1396 ci/vk: Strip and optimise validation layers
The validation layers being installed are somehow 206MB in size. Slim
them down to 26 MB by building the dependencies in release mode, and
stripping on install.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31595>
2024-10-16 22:52:44 +00:00
Daniel Stone
e26ea7a00e ci: Remove non-Proton Wine
We previously only had two traces (one for Raven, one for lavapipe)
running via Wine. To support this, the Vulkan container had Debian's
full Wine installation, one from DXVK, and then a third from Proton
which is used by the b2c jobs.

At 600MB each, this was pretty unsustainable. Remove everything but the
Proton ones.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31595>
2024-10-16 22:52:44 +00:00