Commit Graph

171558 Commits

Author SHA1 Message Date
Juan A. Suarez Romero
6dc22d996c v3d/ci: make traces test mandatory
Similar to other drivers, let's run always the traces tests.

Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23135>
2023-05-23 07:50:49 +00:00
Juan A. Suarez Romero
496a7aedbb v3d/ci: run GPU piglit profile
Instead of running all the tests, run only the GPU related ones, which
should make the CI faster.

Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23135>
2023-05-23 07:50:49 +00:00
Samuel Pitoiset
d719e99f16 radv: apply a bug workaround for smoothing on GFX6
This fixes smooth lines on GFX6.

Fixes: 85cbdba355 ("radv: add support for smooth lines")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23168>
2023-05-23 06:52:22 +00:00
Mike Blumenkrantz
208c31b25f zink: infer types from load_const instrs to avoid more bitcasts
this walks to uses list for the ssa def to infer a type from one of the
uses to reduce the need to bitcast

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22934>
2023-05-23 01:02:56 +00:00
Mike Blumenkrantz
9f6be8effb zink: store and use alu types for ntv defs
this adds indexing for ssa/reg defs with the accompanying current
type of a given def (inaccurate for objects but whatever), enabling
that type to be used directly in order to avoid bitcasts in some places

this upends the assumption that all stored srcs are uint type

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22934>
2023-05-23 01:02:56 +00:00
Mike Blumenkrantz
096dcdbd01 zink: dynamically emit non-bool register values using local_vars spirv buffer
this will be useful in a future commit

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22934>
2023-05-23 01:02:56 +00:00
Mike Blumenkrantz
871afadfe5 zink: write out register variables to a separate spirv buffer
this will enable registers to be written more dynamically with correct
type values to cut down on bitcasts

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22934>
2023-05-23 01:02:56 +00:00
Mike Blumenkrantz
2a18d070cb zink: manually memcpy the spirv instruction buffer
no functional changes

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22934>
2023-05-23 01:02:56 +00:00
Mike Blumenkrantz
5f4a2f6cfe zink: move get_alu_type() up in file
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22934>
2023-05-23 01:02:56 +00:00
Mike Blumenkrantz
af76c23d74 zink: use void return for store_dest
not sure why this had returns, but it doesn't seem necessary

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22934>
2023-05-23 01:02:56 +00:00
Mike Blumenkrantz
e4dacc382e zink: delete unnecessary bitcast in load_shared/scratch
if the mem is loaded as uint and stored as uint, then
the loaded and stored value must be uint, so a bitcast to uint
is as pointless as this commit message

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22934>
2023-05-23 01:02:56 +00:00
Mike Blumenkrantz
5d8103b109 zink: also declare int size caps inline with signed int type usage
Fixes: 854fd242fa ("zink: declare int/float size caps inline with type usage")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22934>
2023-05-23 01:02:56 +00:00
Mike Blumenkrantz
80b8defaf3 zink: promote flushed clears to unordered cmdbuf when possible
this reuses the unordered_blitting codepath for fb clears

for #9016

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23056>
2023-05-23 00:18:29 +00:00
Mike Blumenkrantz
dfc01aea83 vk/graphics_state: handle null pipeline state structs in creation
when these members are null, the corresponding graphics states should be
initialized with sensible default values

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22236>
2023-05-22 23:51:22 +00:00
Mike Blumenkrantz
589fc441c3 anv: more correctly handle null pipeline states
it's not necessary to check whether dynamic states are set before
the null checks since any issues there would be VU errors

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22236>
2023-05-22 23:51:22 +00:00
Mike Blumenkrantz
fef493f745 lavapipe: more correctly handle null pipeline states
it's not necessary to check whether dynamic states are set before
the null checks since any issues there would be VU errors

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22236>
2023-05-22 23:51:21 +00:00
Mike Blumenkrantz
0f510040dc zink: flag 'has_work' on batch when promoting a cmd
has_work controls whether a flush can be deferred, i.e., when unset
a flush may be deferred

since a promoted cmd must still be flushed to take effect, ensure this
is always set when promoted cmds are pending

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23035>
2023-05-22 23:26:45 +00:00
Mike Blumenkrantz
b0c02f5ce9 zink: explicitly disable promotion on images that are both unflushed and non-reorderable
until #9016 is resolved, be more cautious and consider any image with unflushed
access as un-promotable to avoid layout desync

affects:
KHR-GLES3.packed_pixels.varied_rectangle.rgb

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23035>
2023-05-22 23:26:45 +00:00
Mike Blumenkrantz
3c010319bb zink: explicitly disable reordering after restricted swapchain readback blits
when needs_present_readback is set, reordering is disabled without hitting
the path that would normally disable promotion for the resource, so this
needs to be changed manually to avoid layout desync on the swapchain

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23035>
2023-05-22 23:26:45 +00:00
Mike Blumenkrantz
ab3914a17b zink: disable unordered blits when swapchain images need aqcuire
this is consistent with other cmdbuf reordering for blits

Fixes: 3a9f7d7038 ("zink: implement unordered u_blitter calls")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23035>
2023-05-22 23:26:45 +00:00
Mike Blumenkrantz
db12b881c7 zink: track/check submit info on resource batch usage
resources use a private refcount to avoid overhead from atomics on
descriptor binds, but this has the side effect of evading batch usage,
meaning that the usage may not be properly removed once the batch state
is reset, which will cause issues with detecting whether usage exists
for a given resource

to fix this, the mechanism for tc fence disambiguation can be reused,
namely adding the batch state's submit count to the usage info and
then using that to add a second set of comparisons such that it becomes
possible to check both whether the batch usage for a resource matches
a given batch AND whether the batch usage is the current state of the
batch

affects:
KHR-GLES3.copy_tex_image_conversions.required.cubemap_posy_cubemap_negz

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23035>
2023-05-22 23:26:45 +00:00
Mike Blumenkrantz
5e1943db7f zink: move batch usage to substruct on zink_bo objects
no functional changes

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23035>
2023-05-22 23:26:45 +00:00
Mike Blumenkrantz
143da5f2e4 zink: move zink_batch_state::submit_count to zink_batch_usage
no functional changes

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23035>
2023-05-22 23:26:45 +00:00
Mike Blumenkrantz
84bcdc521d zink: use batch usage function for a simple case
no functional changes

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23035>
2023-05-22 23:26:45 +00:00
Mike Blumenkrantz
9c8b6754b0 zink: add special-casing for (not) reordering certain image barriers
in a scenario where an ordered read op occurs for an image,
successive read-only barriers SHOULD be able to be promoted

...but they can't, because there isn't yet a mechanism for handling layout
transitions between the unordered cmdbuf and the ordered cmdbuf,
meaning that promoting e.g., a SHADER_READ_ONLY barrier after a TRANSFER_SRC
barrier will leave the image with the wrong layout for the transfer op:

TRANSFER_SRC(unordered) -> COPY(ordered) -> SHADER_READ_ONLY(unordered)

becomes

TRANSFER_SRC(unordered) -> SHADER_READ_ONLY(unordered) -> COPY(ordered)

ideally I'll get around to figuring this out at some point

affects:
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.r32i_r32i.texture2d_array_to_renderbuffer

Fixes: bf0af0f8ed ("zink: move all barrier-related functions to c++")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23035>
2023-05-22 23:26:44 +00:00
Nanley Chery
03b9a6fde1 iris: Use known formats for tex_cache_flush_hack
Instead of using ISL_FORMAT_UNSUPPORTED, use the known format to avoid
extra cache flushes.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23087>
2023-05-22 22:20:58 +00:00
Nanley Chery
803a569fdf intel/blorp: Add and use blorp_copy_get_formats
This is useful for iris to know what formats will be used for copy
operations.

The new function introduces a couple refactors. It makes use of the
ISL_GFX_VER() macro and it also makes more use of the
isl_surf_usage_is_depth() function.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23087>
2023-05-22 22:20:58 +00:00
Nanley Chery
f11a02c183 intel/blorp: Change condition for CCS_E copy formats
In blorp_copy, instead of checking if the surface's aux-usage is CCS_E,
check if its format supports CCS_E.

ISL won't report that a surface supports CCS_E if its format doesn't, so
this should strictly widen the scope of surfaces included in this path.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23087>
2023-05-22 22:20:58 +00:00
Nanley Chery
1ac1b17087 intel/blorp: Add depth usage check for copy format
We will soon update the CCS_E aux-usage check to a CCS_E format check.
Since depth formats support CCS_E on gfx12+, add another check for the
depth usage to prevent depth surfaces from falling into the CCS_E copy
format case.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23087>
2023-05-22 22:20:58 +00:00
Nanley Chery
85142f3fce intel/blorp: Use the depth copy format more on BDW+
Sampling with HiZ is introduced on BDW+. For BLORP copies, instead of
using the depth format when the source uses HiZ, use it for all depth
sampling on BDW+.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23087>
2023-05-22 22:20:58 +00:00
Chia-I Wu
57b85b6002 radv: do not use a pipe offset for aliased images
Fixes dEQP-VK.ycbcr.plane_view.memory_alias.* on raven2.

Fixes: 1c06565026 ("radv: expose disjoint image support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23143>
2023-05-22 20:14:22 +00:00
Chia-I Wu
4f1c43d38e ac/surface: print tile_swizzle as well
swizzle modes that are *_X or *_T depend on tile_swizzle.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23143>
2023-05-22 20:14:22 +00:00
Chia-I Wu
4f5edcd0ee amd/drm-shim: add raven2
It differs from raven in interesting ways (e.g., GB_ADDR_CONFIG).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23143>
2023-05-22 20:14:22 +00:00
Erik Faye-Lund
569d035a08 panfrost: expose PIPE_CAP_POLYGON_OFFSET_CLAMP
This gives us ARB_polygon_offset_clamp and EXT_polygon_offset_clamp, and
most of the actual state plumbing was already in place.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23169>
2023-05-22 20:00:18 +00:00
Alyssa Rosenzweig
8484fdf501 mesa/st: Set pipe_shader_image::single_layer_view
Pass it through from the API.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23142>
2023-05-22 16:41:10 +00:00
Alyssa Rosenzweig
a6a3a7a881 gallium: Add pipe_image_view::single_layer_view
OpenGL has a goofy feature that allows creating an image view of a single layer
of an array texture... in which case that image is treated as non-arrayed in
shader. If you have a 16x16x16 3D texture and bind the third layer, you get a
16x16 2D texture instead of a 16x16x1 3D texture. That distinction matters to
the hardware on AGX, since the texture dimension needs to match between the
shader and the pipe_image_view. If the shader is going to use image2D, we need
to know that the pipe_image_view should be treated as 2D (even though the
underlying resource is 3D).

"But, Alyssa, we already have first_layer and last_layer. Surely you can just
check if first_layer == last_layer?" you ask. The problem is that doesn't
distinguish a 16x16x1 3D texture (accessed as image3D in the shader) from a
16x16 slice (accessed as image2D in the shader) of a 16x16x16 3D texture. To
solve, we add a boolean flag indicating we want to create a view (with a lower
dimension than the underlying resource). This provides an unambiguous way to
communicate this case to drivers.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23142>
2023-05-22 16:41:10 +00:00
Martin Roukala (né Peres)
17fd50b817 radv/ci: switch to b2c v0.9.10
This brings a fix for the steam decks which may boot too fast sometimes,
and have the network adapter not being enumerated by the time it tries
to connect to the gateway...

Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23170>
2023-05-22 16:01:52 +00:00
Caio Oliveira
623bc176fb mesa/spirv: Provide more specific error message for glSpecializeShader()
Distinguish between the "entry point not found" and "parsing error"
cases in the error text.  For consistency, identify the unhandled
specialization index case as part of the verification function.

The verification function was renamed to make clearer its scope and
what module it belongs.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22976>
2023-05-22 15:26:40 +00:00
Alyssa Rosenzweig
eebb9377c4 pan/mdg: Use nir_lower_image_atomics_to_global
We were already lowering image atomics to lea_image + global atomic. It's a lot
nicer to make that lowering explicit in the NIR. This is much bigger win than in
the Bifrost compiler since here lea_image is used only for atomics, and here it
wasn't well abstracted in the compiler.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23120>
2023-05-22 14:33:14 +00:00
Alyssa Rosenzweig
47f5cc6ba7 pan/bi: Use nir_lower_image_atomics_to_global
We were already lowering image atomics to lea_attr_tex + global atomic, might as
well make that lowering explicit in the NIR.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23120>
2023-05-22 14:33:14 +00:00
Alyssa Rosenzweig
1ff7ec0c9e pan/bi: Fix atomic exchange on Valhall
Copypaste fail when switching to unified atomics, missed becuase I don't have
any Valhall hardware and Valhall isn't in CI. (Good news, that means it probably
didn't affect anyone in the mean time :-p)

Fixes crashes with lots of dEQP-GLES31 tests observed under drm-shim.

Fixes: e258083e07 ("pan/bi: Use unified atomics")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23120>
2023-05-22 14:33:13 +00:00
Alyssa Rosenzweig
de648020af nir: Add pass to lower image atomics
Hardware that lacks dedicated image atomics can still implement image atomics
with regular atomics on global memory, as long as there is a way to get the
address of a texel in memory. I've open-coded this lowering in my first 2
compilers, so before I add another crappy vendored version in my 3rd, let's add
a common NIR pass to do the lowering.

Thanks to unified atomics, the pass itself is fairly concise.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23120>
2023-05-22 14:33:13 +00:00
Alyssa Rosenzweig
66656822e3 nir: Add image_texel_address intrinsics
Some hardware has an instruction to load the address of a texel in a writeable
image, given the coordinates ("LEA_IMAGE"). This operation is defined only for
uncompressed images, but it is well-defined regardless of the underlying
twiddling. As such, it is not expected to be produced by APIs but is useful for
internal lowering when it is known that images will be uncompressed (e.g.
because image_store does not support compression on the hardware).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23120>
2023-05-22 14:33:13 +00:00
Alyssa Rosenzweig
c3ea2f8d20 nir: Document extra image source
I was scratching my head about this for a few minutes until I found the answer
in spirv_to_nir. Hopefully this saves someone else some head scratching in turn.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23120>
2023-05-22 14:33:13 +00:00
David Heidelberg
32b150344e docs: use meson instead invoking ninja directly
This approach is available since meson 0.47.0 which we depend on.

Reviewed-by: Sergi Blanch-Torné <sergi.blanch.torne@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23127>
2023-05-22 15:41:40 +02:00
Mike Blumenkrantz
62961b172f zink: try update fb resource refs when starting new renderpass
in the case where a draw is triggered after a flush, zink_update_descriptor_refs
will be called to set batch tracking for descriptors. this function also
handles refs for fb attachments, and everything is usually fine there

the problem with this approach is that tracking is no longer set on view
objects at renderpass begin, which makes them susceptible to early deletion
if a rp isn't started from a draw call

instead, apply batch tracking to fb attachment resources on renderpass
begin if the BATCH_CHANGED flag is set (need to rename this at some point)
in order to guarantee that the resource (object) lifetime will match the
cmdbuf runtime [since imageviews are now only freed upon batch completion]

fixes #9059

Fixes: f6bbd7875a ("zink: remove batch tracking/usage from view types"
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23132>
2023-05-22 11:15:22 +00:00
Lionel Landwerlin
cab7ba00e2 anv: fix push descriptor deferred surface state packing
Yuzu is running into a segfault because it writes the push descriptor
twice with 2 different layouts, but without a draw/dispatch in
between.

First vkCmdPushDescriptorSetKHR() writes descriptor 0 & 1 with a
uniform buffer. We toggle the 2 first bits of
anv_descriptor_set::generate_surface_states.

Second vkCmdPushDescriptorSetKHR() writes descriptor 0 with uniform
buffer and descriptor 1 with an image view. The first bit of
anv_descriptor_set::generate_surface_states stays, but the second bit
was already set before and it should now be off.

When we finally flush the push descriptor, we try to generate a
surface state for descriptor 1, but there is no valid buffer view for
it, we access an invalid pointer and segfault.

This fix resets the anv_descriptor_set::generate_surface_states when
the descriptor layout changes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b49b18f0b7 ("anv: reduce BT emissions & surface state writes with push descriptors")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23156>
2023-05-22 10:50:26 +00:00
David Heidelberg
cc0cf1762d r300: workaround GCC 12+ warning, declare NULL value as unreachable
Solution recommended in the https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109716#c3

Suggested-by: Eric Engestrom <eric@engestrom.ch>

Acked-by: Filip Gawin <filip@gawin.net>
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23148>
2023-05-22 12:32:42 +02:00
Iago Toral Quiroga
e401add741 broadcom/compiler: skip jumps in non-uniform if/then when block cost is small
We have an optimization for non-uniform if/else where if all channels meet the
jump condition we emit a branch to jump straight to the ELSE block. Similarly,
if at the end of the THEN block we don't have any channels that would execute
the ELSE block, we emit a branch to jump straight to the AFTER block.

This optimization has a cost though: we need to emit the condition for the
branch and a branch instruction (which also comes with a 3 delay slot), so for
very small blocks (just a couple of ALU for example) emitting the branch
instruction is typically worse. Futher, if the condition for the branch is not
met, we still pay the cost for no benefit at all.

Here is an example:

nop                           ; fmul.ifa rf26, 0x3e800000, rf54
xor.pushz -, rf52, 2          ; nop
bu.alla  32, r:unif (0x00000000 / 0.000000)
nop                           ; nop
nop                           ; nop
nop                           ; nop
xor.pushz -, rf52, 3          ; nop
nop                           ; mov.ifa rf52, 0
nop                           ; mov.pushz -, rf52
nop                           ; mov.ifa rf26, 0x3f800000

The bu instruction here is setup to jump over the following 4 instructions
(the last 4 instructions in there). To do this, we pay the price of the xor
to generate the condition, the bu instruction, and the 3 delay slots right
after it, so we end up paying 6 instructions to skip over 4 which we pay
always, even if the branch is not taken and we still have to execute those
4 instructions. With this change, we produce:

nop                           ; fmul.ifa rf56, 0x3e800000, rf28
xor.pushz -, rf9, 3           ; nop
nop                           ; mov.ifa rf9, 0
nop                           ; mov.pushz -, rf9
nop                           ; mov.ifa rf56, 0x3f800000

Now we don't try to skip the small block, ever. At worse, if all channels
would have met the branch condition, we only pay the cost of the 4
instructions instead of 6, at best, if any channel wouldn't take the
branch, we save ourselves 5 cycles for the branch condition, the branch
instruction and its 3 delay slots.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23161>
2023-05-22 09:23:41 +00:00
Yiwei Zhang
4c8be22c66 radv: fix radv_emit_userdata_vertex for vertex offset -1
-1 is a legit vertex offset upon vkCmdDrawIndexed and other cmds. This
change fixes to track last_vertex_offset with an additional valid bit.

Cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23157>
2023-05-22 08:31:28 +00:00