This is necessary to allow optimizing VS inputs after nir_lower_io, which
is currently impossible because the loss of dual-slot information in NIR
would break VS inputs. With this, driver locations can be recomputed by
calling nir_recompute_io_bases. It's a prerequisite for optimizing varyings
with lowered IO.
When this is used, we will be able to eliminate unused dual-slot VS inputs
as well as unused low and high halves of dual-slot VS inputs for the first
time, which can happen due to optimizations of varyings. Without this,
st/mesa binds vertex buffers for dual-slot inputs that are fully or
partially unused in the shader.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25394>
We are about to rename mupuf/valve-infra into gfx-ci/ci-tron.
While most resources will transparently be redirected, gitlab does
not allow us to keep our containers during the migration.
To work around that, I uploaded the current containers to Eric's fork
of valve-infra. Let's use these containers until the migration is over!
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25593>
While working on cooperative matrix support, I noticed some invalid
DP4A instructions being generated.
dp4a(32) g33<1>UD g21<8,8,1>UD g1.0<0,1,0>UD g9<1,1,1>UD
This violates the constraint that the destination or a source can only
access two consecutive GRFs.
I'm a little surprised that validation didn't catch this. Perhaps
because it's a 3 source instruction? Either way, it seems like a bigger
project to fix that.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Fixes: 0f809dbf40 ("intel/compiler: Basic support for DP4A instruction")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25554>
In release mode print the ralloc tree header pointers. In debug mode,
will look at the child allocations recursively checking for the canary
values. In most cases this should work and give us extra information
about the type of the subtree (GC, Linear) and the allocation sizes.
For example, calling it at the end of spirv_to_nir() will output the
following tree with a summary at the end (lines elided to focus on the
general structure of the output)
```
0xca8d00 (472 bytes)
0xdf2c60 (32 bytes)
0xdf2c00 (32 bytes)
0xdf2ba0 (32 bytes)
0xdf2b40 (32 bytes)
(...)
0xcde8b0 (64 bytes)
0xcde760 (72 bytes)
0xd2de40 (168 bytes)
0xcce660 (64 bytes)
0xcce510 (72 bytes)
0xcce5a0 (120 bytes)
0xcce490 (64 bytes)
(...)
0xcbc450 (456 bytes)
0xdf55c0 (160 bytes)
0xdf5730 (72 bytes)
0xdf57c0 (80 bytes)
0xdf5530 (72 bytes)
0xdf56a0 (80 bytes)
(...)
0xcbe840 (128 bytes)
0xcc4310 (4 bytes)
0xcbc660 (536 bytes) (gc context)
0xde6b40 (32576 bytes) (gc slab or large block)
0xddb160 (32704 bytes) (gc slab or large block)
0xdc8d50 (32704 bytes) (gc slab or large block)
(...)
0xcde9a0 (32704 bytes) (gc slab or large block)
0xcd6720 (32704 bytes) (gc slab or large block)
0xcce6e0 (32768 bytes) (gc slab or large block)
0xcbc330 (72 bytes)
0xd680a0 (208 bytes)
0xca9010 (78560 bytes)
0xca8f20 (176 bytes)
==== RALLOC INFO ptr=0xca8d30 info=0xca8d00
ralloc allocations = 4714
- linear = 0
- gc = 23
- other = 4691
content bytes = 1055139
ralloc metadata bytes = 226272
linear metadata bytes = 0
====
```
There's a flag to pass so only the summary at the end is printed.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25482>
When the guest driver is Virgl while Xwayland is on Zink, Virgl can
request virtgpu classic 3d resource allocations for swapchain images.
Zink will import when the image is shared with xserver and will export
for fd info of all 2d images later to be forwarded.
Cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25579>
In a scenario like:
CmdBindTransformFeedbackBuffers()
BeginTransformFeedback()
CmdDraw() --> streamout descriptors emitted
EndTransformFeedback() --> streamout descriptors emitted as 0 (disabled)
CmdDraw()
BeginTransformFeedback()
CmdDraw() --> streamout descriptor not re-emitted
EndTransformFeedback()
Fix this by re-emitting streamout descriptors when streamout is
enabled/disabled because a buffer size of 0 acts like a disable bit.
This fixes dEQP-VK.transform_feedback.simple.backward_dependency_indirect*
on NAVI31.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25583>
Normally, this shouldn't fail, but it has various cases where it returns
`NULL`. Without this change, it would result in a segfault when
`wl_buffer_add_listener` is called.
This instead makes `EGLSwapBuffers` return a `EGL_BAD_ALLOC` error.
The other place `create_wl_buffer` is called already checks the return
value, and the Vulkan WSI code doesn't seem to have an issue like this.
Signed-off-by: Ian Douglas Scott <ian@iandouglasscott.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24915>
This test checks the driver's reported conformance version against the
version of the CTS we're running. This check fails every few months
and everyone has to go and bump the number in every driver.
Running this check only makes sense while preparing a conformance
submission, so skip it in the regular CI.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25519>
Sometimes updates to the farm config are needed before re-enabling it,
or updating test expectations with new failures that happened while the
farm was down, etc.
These need to be allowed, as the alternative is to update the
config/expectations/etc. blindly in one MR, merge it, and then merge
another MR with the farm re-enablement.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25560>
Gang submissions are mostly only be used for task shaders to both
submit GFX and ACE command buffers in the same submission. Though,
the gang leader (the last submitted CS) IP type should match the
queue IP type to determine which fence to signal.
But if chaining is enabled, it's possible that all GFX cmdbufs are
chained to the first GFX cmdbuf, which means the last CS is ACE and
not GFX.
Fix this by resetting chaining for GFX when a cmdbuffer has GFX+ACE.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9724
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24966>
This was completely broken, for example in the following scenario:
- submits something which needs sample positions (this creates a new
descriptor BO with sample positions)
- submits something which needs the tess rings (this creates a new
descriptor BO with tess rings but without sample positions, ie.
add_sample_positions would be FALSE)
- submits something which needs sample positions again (this won't
create a new descriptor BO because it incorrectly remembered that
sample positions were set)
Fix this by always writing the sample positions.
This should fix the following flakes:
- dEQP-VK.fragment_shading_barycentric.*.weights.pipeline_topology_dynamic.msaa_interpolate_at_sample.*
- dEQP-VK.pipeline.fast_linked_library.multisample_interpolation.sample_interpolate_at_distinct_values.*
- dEQP-VK.pipeline.fast_linked_library.multisample_interpolation.sample_interpolation_consistency.*
- dEQP-VK.draw.renderpass.linear_interpolation.*
- dEQP-VK.draw.dynamic_rendering.primary_cmd_buff.linear_interpolation.*
These flakes were extremely hard to reproduce!
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25529>
Part of the memory allocated (private) is a temporary working buffer
for the GRL kernels. Once the build operation is done, the buffer
becomes unused.
Rather than allocate a new buffer each time, reuse the current last
allocated one if its size fits the next build operation.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25570>
In the default case, there's a special case with a few conditions.
Prefer the cheapest conditions first, so we can take advantage of
short-circuiting.
Effect is a small but still significant reduce in shader compilation
times, as can be seen by:
- Fossil replay for Rise of the Tomb Raider
```
Difference at 95.0% confidence
-0.433333 +/- 0.028609
-1.42556% +/- 0.0941163%
(Student's t, pooled s = 0.0337886)
```
- Fossil replay for Batman Arkham City
```
Difference at 95.0% confidence
-8.84 +/- 0.146083
-1.65932% +/- 0.0274207%
(Student's t, pooled s = 0.125423)
```
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25549>
unflushed usage is unflushed regardless of the submit count and is
critical for detecting multi-context synchronization
fixes Wolfenstein: The New Order load screen deadlock
Fixes: db12b881c7 ("zink: track/check submit info on resource batch usage")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25572>