Even though we don't advertise the sparseResidency feature, a bunch of
CTS tests just call GetPhysicalDeviceImageFormatProperties2() with
SPARSE_RESIDENCY_BIT and see if that fails.
Fixes: d2177f4764 ("nvk: Don't advertise sparse residency on Maxwell A")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30303>
This fixes sporadic rendering corruption reported on MTL with ChromeOS
in cases where multiple processes including Chrome were utilizing the
GPU concurrently, and one of the processes happened to submit a
BLORP-only batch buffer right after a switch from a different context.
In such a scenario we would fail to add the BO that holds the pixel
hashing tables to the execbuf IOCTL for the BLORP batch, because it
was being pinned from iris_restore_render_saved_bos() which isn't
called for BLORP operations, potentially causing it to use garbage as
pixel pipe hashing tables, which led to corruption of the BLORP
rendering.
Technically this could have affected DG2 as well, but it has only been
reported on MTL so far.
Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30274>
Doxygen documentation says
> If the file name is omitted (i.e. the line after \file is left
> blank) then the documentation block that contains the \file command will
> belong to the file it is located in.
so we can omit the filename itself when using the annotation.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30168>
Fixes piglit test arb_copy_buffer-data-sync.
The test fails inconsistently because svga_buffer_surface handle is sometimes
not present at time of copy check, so the copy does not happen.
Adding a validata_host_surface() call for the surface being copied ensures the
handle is present.
Signed-off-by: Maaz Mombasawala <maaz.mombasawala@broadcom.com>
Reviewed-by: Neha Bhende <neha.bhende@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29949>
It doesn't do anything substantial yet, but it ignores enough so internal
shaders won't generate warnings.
I've also added ByVal parsing, because I need this one to actually fix a
correctness issue in a later patch.
Cc: mesa-stable
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29896>
Gfx 12.5 moved scratch to a surface and SURFTYPE_SCRATCH has this pitch
restriction:
RENDER_SURFACE_STATE::Surface Pitch
For surfaces of type SURFTYPE_SCRATCH, valid range of pitch is:
[63,262143] -> [64B, 256KB]
The pitch of the surface is the scratch size per thread and the surface
should be large enough to accommodate every physical thread.
So here adding a new field to intel_device_info, setting it in
intel_device_info_init_common() so even offline tools can have it set.
And finally adding a check to fail shader compilation if needed
scratch is larger than supported.
This issue can be reproduced in debug builds when running
dEQP-VK.protected_memory.stack.stacksize_1024 on Gfx 12.5 or newer
platforms.
Ref: BSpec 43862 (r52666)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30271>
We want to use actual sparse-capable queues as the default
trtt->queue, not copy queues that may have a companion_rcs_batch.
Before this patch, if we expose more than one queue *and* the
application creates a copy queue first, we'll end up setting
trtt->queue as the copy queue, which will GPU hang when we submit the
TR-TT batches as they don't support the pipe_control commands we
issue.
The trtt->queue queue is used for binding/unbinding buffers in code
paths where there's no specific queue coming from user space, such as
when we're creating or destroying a sparse resource.
This is not a problem yet on i915.ko since we are exposing
only a single queue, and it is not a problem for xe.ko since TR-TT is
not the default there. This is also not a problem in applications
that create the render or compute queue first. We plan to expose more
queues when using TR-TT, so this would become a problem without this
patch.
None of VK-GL-CTS seems to exercise that, and none of the Steam games
I tested exercise that as well. I was able to reproduce this issue
using our internal tracing tool.
v2: New implementation that doesn't break when we only have a compute
queue (Lionel).
Fixes: 04bfe828db ("anv/sparse: allow sparse resouces to use TR-TT as its backend")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
On Gen12 (the oldest we support on Mesa right now for TR-TT) we
started having per-engine TR-TT registers and we are supposed to make
all contexts share the same TR-TT programming.
On LNL+, this is documented in the BSpec page for the TRTT_CNTRL
register (68417), with more details in HSDs 14020454786 and
16022013154.
On Gen12 platforms this information is a little harder to find and
there's a whole trail of HSDs leading up to 1209977595, which links to
the documents that describe the programming. BSpec for TR-TT on Gen12
is very confusing as it still contains registers and other information
from Gen11 that were not removed.
Regarding the additional BLT and COMP registers, please notice that on
the BSpec pages for the TR-TT registers, the "Register Instance"
section only lists the GFX registers as non-privileged. However, the
"User Mode Privileged Commands" lists the other instances of the TR-TT
Regsiters as non-privileged, which matches what we see: there's no
need to put these addresses in the FORCE_TO_NONPRIV registers.
Notice that for now, when TR-TT is being used we only expose a single
queue, so this change effectively does nothing until we start exposing
extra queues. I left that part for later to help bisectability.
v2:
- s/trtt_init_context_state/trtt_init_queues_state/ (José)
- pass device as the argument to init_queues_state (José)
v3:
- use async_submit_end (José)
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
Having this as a separate batch was the normal behavior until
7da5b1caef ("anv: move trtt submissions over to the
anv_async_submit").
While it certainly sounds better to do everything related to TR-TT
initialization in one batch, we need to revert it back to be a
separate batch (but now using the new anv_async_submit infrastructure)
because we'll want to run this batch on every engine.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
I'm not sure if two otherwise equal texture instructions ever have sources
in different orders, but they should be considered equal.
ministat of nir_opt_cse:
N Min Max Median Avg Stddev
x 9 6.586801 6.718673 6.682875 6.6621411 0.047817119
+ 9 6.519098 6.609235 6.552997 6.5605604 0.028879587
Difference at 95.0% confidence
-0.101581 +/- 0.0394755
-1.52475% +/- 0.585928%
(Student's t, pooled s = 0.0395)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30145>
This is faster.
ministat of nir_opt_cse:
N Min Max Median Avg Stddev
x 9 6.724212 6.84511 6.788336 6.7873378 0.034363882
+ 9 6.586801 6.718673 6.682875 6.6621411 0.047817119
Difference at 95.0% confidence
-0.125197 +/- 0.0416115
-1.84456% +/- 0.609248%
(Student's t, pooled s = 0.0416374)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30145>
ministat of nir_opt_cse:
N Min Max Median Avg Stddev
x 9 7.393408 7.490593 7.434056 7.4338972 0.028150325
+ 9 6.724212 6.84511 6.788336 6.7873378 0.034363882
Difference at 95.0% confidence
-0.646559 +/- 0.0313916
-8.69745% +/- 0.407925%
(Student's t, pooled s = 0.0314111)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30145>
Otherwise AMD_DEBUG=noexporteddcc will be ignored when modifier are
used.
Similarly to AMD_DEBUG=nodcc handling, this makes the application
unable to import buffers with DCC as well - the alternative would be
to implement the filtering only in the texture creation path, so in
the si_modifier_supports_resource function.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30257>
A specific clear_surface blitter operation is required, because in this
case we need to save framebuffer information, but not in standard clear,
as we are currently doing.
This fixes a leak in depthstencil surface, which happens because we were
storing saving it as part for the framebuffer information, but the
blitter clear wasn't restoring it because it wasn't required (only it
is required in clearing a surface).
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30240>
So far, in order to know what we need to save before using the blitter
utility, we use a boolean to know if we are going to do a blit or not,
and if we need to take in account conditional rendering or not.
In other to allow to specify more operations than blit or not, use an
enum to define what operation we would like to do, and based on that
what information we need to store.
This also merge the conditional rendering as part of these operations.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30240>
The ownership of the TargetMachine object is released when LPJit
singleton is constructed, leads to a slight memory loss detectable.
Keep the ownership by saving the unique pointer as another class member
named tm_unique.
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30216>