Export instructions allow burst writes, so it makes send to try
to allocate consecutive registers, but for ring writes we don't
schedule the outputs correctly to exploit this, so for now
don't mark these instructions as export to let the RA restart
picking colors.
When the scheduler starts to emit the ring writes in the right order
to allow for bust writes we might revisit this.
This fixes
spec@glsl-1.50@execution@variable-indexing@gs-output-array-vec4-index-wr
Fixes: 79ca456b48
r600/sfn: rewrite NIR backend
Related: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6975
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18212>
(cherry picked from commit fd71cd0b6a)
When linear rendering is used together with TS, the color tiles must be fully
contained in a single row of pixels. When wrapping around to the next row
TS gets confused and records wrong tile status information, leading to visual
corruption when the surface is resolved/decompressed.
The corruption can be fixed by increasing the stride alignment for linear
render targets, but that would break some existing use-cases, as some display
engines used together with Vivante GPUs currently don't support strides that
don't match the horizontal display resolution.
For now only enable linear PE rendering when the surface is properly aligned
already. This allows to use the optimization in a lot of common use-cases, but
falls back to the proven tiled rendering with subsequent resolve into linear
for the problematic cases.
CC: mesa-stable #22.2
Fixes: 53445284a4 ("etnaviv: add linear PE support")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18232>
(cherry picked from commit ea8fc9592c)
Conflicts:
src/gallium/drivers/etnaviv/etnaviv_surface.c
The decision whether to use fast clear aka TS currently checks for two
feature bits: FAST_CEAR and MC20. We check for MC20, as TS on MC1.0 bypasses
the memory offset and we don't have any way to fixup the GPU address to
account for that. It could be done with some support of the kernel driver,
but then GPUs with MC1.0 are very rare to find these days, so not sure if we
are ever going to bother with that.
Instead of checking two separate feature bits to determine if TS can be used,
mask out the FAST_CLEAR bit from the features when MC20 isn't present. This
way we only have to check for a single feature bit.
CC: mesa-stable #22.2
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18232>
(cherry picked from commit 09953d7b75)
Sync fd fence export implies a payload reset operation, and application
can immediately do another submission with the same fence after export.
Concurrent use of the same feedback slot is incorrect. Keeping a list of
feedback slots for sync_fd external fence is a bit over designed given
those fences are usually not checked or waited by the app, but will hand
off the ownership via sync fd to an external client.
Fixes: d7f2e6c8d0 ("venus: add fence feedback")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17975>
(cherry picked from commit 5457f4c0a4)
With the following SEND instruction :
send(1) nullUD nullUD g0UD 0x4200c504 a0.1<0>UD
This instruction although valid but somewhat nonsensical (SEND message
to write at offset contained in NULL register), triggers an error in
the validator.
The restriction is that we cannot have overlapping sources. The
validator not checking the type of register incorrectly thinks that
the null register (offset 0) is the same as g0.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17555>
(cherry picked from commit 3c6fa2703d)
tcs_out_lds_offsets is not sure to be 16 byte aligned, it's
calculated like this:
num_patches * patch_vertices * lshs_vertex_stride
num_patches and patch_vertices are not sure to be any value aligned,
lshs_vertex_stride is added one extra dword, so it's only 4 byte
aligned.
This may cause problem even before we switch to nir tess output
lower when write tess factor before read tail of input. But it's
more likely to cause problem after we switch to nir tess output
lower because the main body won't eliminate the low 4bit offset
but epilog will, so they use different offset to read/write tess
factor.
Fixes: 7598bfd768 ("radeonsi: replace llvm tcs output with nir lower pass")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7083
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18174>
(cherry picked from commit ff7c59672f)
For AMD we kinda have some modifiers with a max size ... (Which is
really a compositor/kms issue, but getting them to try kinda falls
into the unsolved "how to allocate/what pitch to use" bucket, so
we solve it on the allocating side)
Cc: mesa-stable
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Joshua Ashton <joshua@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18139>
(cherry picked from commit bb2a444324)
Even with dynamic rendering, you have to bind both aspects of the image
if the image contains both depth and stencil. One day, we may see this
restriction lifted but that will require deeper driver surgery into the
way we handle depth/stencil layouts.
Fixes: 42db590006 ("radv: convert the meta blit 2d path to dynamic rendering")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18084>
(cherry picked from commit 76b8b854a5)
With VK_FORMAT_B10G11R11_UFLOAT_PACK32 in particular, we're seeing
applications create image views with swizzle = R,G,B,0
But since the format has no alpha channel, the swizzle value for it
does not matter for the equivalence we're trying to verify.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a9edc268b9 ("anv: validate image view lowered storage formats for storage")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18081>
(cherry picked from commit 4ab38112f3)
If this happens then "after" is NULL, so we can't use it to get the
block, and the instruction is never moved at the end so we have to
create the split instructions before creating the collect to make sure
they are in the right order.
This happens when reloading a complex vector value that has been
coalesced at the end of a basic block, which apparently hasn't happened
until a gfxbench5 shader on zink hit this case. This fixes it.
Closes: #7054
Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18082>
(cherry picked from commit dcfbb60392)
Clause scheduler edition of db2bdc1dc3 ("pan/bi: Require ATEST coverage mask
input in R60"). ATEST wants to read r60, which can't work if its input isn't
even in a register.
When per-sample shading isn't in use, prevents regressions in:
KHR-GLES31.core.sample_variables.mask.*
These tests previously passed because per-sample shading was forced. It's
not clear whether the bug addressed in this patch is possible to hit "in the
wild", i.e. without the optimizations in this series that allow us to use
per-pixel shading in more cases.
No shader-db changes.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
(cherry picked from commit 394e1f5862)
Fixes flaking in
dEQP-GLES31.functional.image_load_store.cube.qualifiers.volatile_r32i due to
image reads being moved past a BARRIER.
To make this more robust/optimal, we probably need scheduling information
(coherent/volatile/etc) added to instructions like ACO does. That's left for a
future extension, for now I just want the test to stop flaking.
Fixes: 569e5dc745 ("pan/bi: Schedule for pressure pre-RA")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
(cherry picked from commit e12a9ce8d6)
without glsl array lowering, this intrinsic can creep in for tg4 ops,
which complicates everything. instead, rewrite these ops as residency+iand,
and then rewrite the existing residency ops to match
v2 (idr): Add missing size parameter to nir_is_sparse_texels_resident
calls.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16547>
I think I got all the drivers that need updating. This is only
necessary in drivers that support GLSL 4.00 / GL_ARB_gpu_shader5 and
have PIPE_CAP_TEXTURE_GATHER_OFFSETS = 0.
v2: Don't (accidentally) condition tg4 offsets lowering on tex rect
lowering. Noticed by Qiang.
v3: Add missing bool() cast.
v4: don't use designated initializers
Fixes: 640f909862 ("glsl: add _texture related sparse texture builtin functions")
Closes: #6365
Tested-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16547>
If we don't recognize the model, dev->model will be NULL. In that case, we can't
dereference dev->model to get the tilebuffer size. If we do, we'll segfault,
instead of gracefully refusing to probe and loading the swrast instead.
Fixes: 96d65b47c7 ("panfrost: Use implementation-specific tile size")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18115>
(cherry picked from commit 93f69e0452)
Additionally add support for PIPE_CAP_CLEAR_SCISSORED since we are already
touching the scissor state.
Fixes various gnome-shell rendering artifacts.
v2: Remove NEW_SCISSOR as clear now updates scissor registers explicitly
v3: Reset scissor_off state since its not off now after clear
Cc: mesa-stable
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18137>
(cherry picked from commit 7908cb895e)
Alpha-to-coverage acts like discard and happens after FS ends,
so like with discard LRZ write should be disabled.
With discard we don't know at the moment of binning whether
fragment would be not discarded, so we cannot write its depth to LRZ.
Cc: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18102>
(cherry picked from commit 09676b5817)
SPIR-V shift ops unlike NIR have undefined behavior if shift count
larger than or equalt to bitwidth.
This means that true translation of NIR ishl/ishr/ushr to SPIR-V requires
masking like that done in gallivm.
This was seen in the case of soft fp64 in cts case
KHR-GL46.gpu_shader_fp64.builtin.ceil_double.
Cc: mesa-stable
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18074>
(cherry picked from commit b386df918f)