Marek Olšák
21e90d9c6e
radeonsi: clear color buffers via compute for special tiling cases
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
Marek Olšák
2a0b9839ca
radeonsi: add use_aco into CS blit shader key
It will be set in a future commit.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
Marek Olšák
fe7a4ed708
radeonsi: use shader_info::use_aco_amd to determine whether to use ACO
It's set by si_nir_scan_shader, so we need to use it after that.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
Marek Olšák
c83225cd0a
radeonsi: print the compute shader blit key for AMD_DEBUG
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
Marek Olšák
d62ad0da5f
radeonsi: use MIMG A16 (16-bit image coordinates) in compute blits
This reduces VGPR usage for MSAA blits and blitting multiple pixels per
lane.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
Marek Olšák
d6c96024a8
radeonsi: extend NIR compute helpers to allow returning 16-bit results
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
Marek Olšák
5b3e1a0532
radeonsi: change the compute blit to clear/blit multiple pixels per lane
The target is 8-16B per lane regardless of the format and number of
samples. This is needed to fully utilize the memory bandwidth instead
of only a small fraction of it. These are optimal numbers identified by
benchmarking.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
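To illustrate the 8-16B-per-lane target from the commit above, here is a minimal sketch of one way a pixel count per lane could be derived from the pixel size and sample count. The helper name `pixels_per_lane` and the doubling rule are assumptions made for illustration; the commit says its actual numbers were found by benchmarking, and this is not the driver's code.

```c
#include <assert.h>

/* Hypothetical helper (not from radeonsi): choose how many pixels one lane
 * handles so that the lane moves at least 8 bytes. Because the count is
 * doubled from a value below 8 bytes, the result stays below 16 bytes
 * unless a single pixel (all samples included) is already larger. */
static unsigned pixels_per_lane(unsigned bytes_per_pixel, unsigned num_samples)
{
   assert(bytes_per_pixel > 0 && num_samples > 0);

   /* Bytes one lane moves if it handles a single pixel with all samples. */
   unsigned bytes_per_full_pixel = bytes_per_pixel * num_samples;
   unsigned count = 1;

   /* Double the pixel count until the lane moves at least 8 bytes. */
   while (count * bytes_per_full_pixel < 8)
      count *= 2;

   return count;
}
```

Under these assumptions, an 8-bit single-sample format gets 8 pixels per lane, a 32-bit single-sample format gets 2, and a 32-bit 4x MSAA format stays at 1.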
Marek Olšák
d4c066abaf
radeonsi: adds flags parameter into si_compute_blit to replace fail_if_slow
So that we can also specify sync flags.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
Marek Olšák
30af861bff
radeonsi: restructure (rewrite) the compute blit shader
This merges the separate MSAA, downsampling, upsampling, and non-MSAA blocks.
It's not meant to change behavior, but some changes are necessary:
- disallow 16 samples
- loads only load the number of components that we need
- optimization barriers are placed optimally and include the sample index
  in the same vector as the coordinates, so that LLVM is forced to form VMEM
  clauses for loads and stores
- the shader queries the descriptor for the dst image manually and passes
  it to the image store instead of the image variable (this is needed to get
  latency hiding for scalar loads in the presence of optimization barriers)
This is a prerequisite for blitting multiple pixels per lane.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
Marek Olšák
d2ce5fc07a
radeonsi: split xy_clamp_to_edge to separate X and Y flags for the compute blit
This generates less shader code when only one of the axes needs clamping.
Use util_is_box_out_of_bounds instead of doing it manually.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
Marek Olšák
7ee936bf65
radeonsi: convert the compute blit shader hash table to u64 keys
32 bits are not enough anymore. We'll add more fields.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
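As a sketch of why the shader key had to grow past 32 bits, the following packs a few flags into a 64-bit integer that could serve as a hash table key. The function name, field names, and bit positions are hypothetical and do not reflect the driver's actual key layout; the only point is that a field placed at bit 32 or above is lost if the key is truncated to 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for illustration. */
#define EXAMPLE_KEY_X_CLAMP_SHIFT 0
#define EXAMPLE_KEY_Y_CLAMP_SHIFT 1
#define EXAMPLE_KEY_USE_ACO_SHIFT 2
#define EXAMPLE_KEY_EXTRA_SHIFT   32 /* a field that no longer fits in 32 bits */

/* Pack the flags into one 64-bit value usable as a lookup key. */
static uint64_t example_blit_key(unsigned x_clamp, unsigned y_clamp,
                                 unsigned use_aco, unsigned extra)
{
   assert(x_clamp <= 1 && y_clamp <= 1 && use_aco <= 1 && extra <= 0xff);

   return ((uint64_t)x_clamp << EXAMPLE_KEY_X_CLAMP_SHIFT) |
          ((uint64_t)y_clamp << EXAMPLE_KEY_Y_CLAMP_SHIFT) |
          ((uint64_t)use_aco << EXAMPLE_KEY_USE_ACO_SHIFT) |
          ((uint64_t)extra << EXAMPLE_KEY_EXTRA_SHIFT);
}
```

With a 32-bit key, two shaders that differ only in the hypothetical `extra` field would collide in the table, which is the kind of problem a wider key avoids.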
Marek Olšák
40bcb588dd
radeonsi: remove the old si_compute_copy_image
It's replaced by the compute blit.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
Marek Olšák
b0c0cca3a7
radeonsi: switch the old compute image copy to the new one using the blit
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
f3a59fe216
radeonsi: add a new version of si_compute_copy_image using the compute blit
It's faster and handles more stuff.
This is mostly the same code as the old version, but it calls
si_compute_blit at the end.
A later commit will remove the old version, so that there is no code
duplication.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
b7389615c6
radeonsi: rename si_compute_copy_image -> si_compute_copy_image_old
It will be replaced in several stages.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
8b030ac588
radeonsi: rename si_compute_blit "testing" parameter to "fail_if_slow"
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
a4602395d2
radeonsi: switch compute image clears to the compute blit shader
The compute blit shader is faster and handles more stuff.
This removes the old clear_render_target shader.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
9915289bdf
radeonsi: extend the compute blit to do image clears as well
The compute blit is faster and handles more stuff than
the clear_render_target shader. We can just pass a clear value to it
to replace the source image.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
e41887c6a4
radeonsi: cosmetic and robustness changes for the compute blit
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
0c5d727a5e
radeonsi: document better how X/Y flipping in the compute blit works
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
bb86366fee
radeonsi/gfx11: enable MSAA image stores in the compute blit
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
5897dde3f7
radeonsi: don't fail due to DCC when using the compute blit on compute queues
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
fcd9f0069f
radeonsi: don't use si_can_use_compute_blit in the compute blit
It makes supporting compute queues on all chips more complicated.
Other uses of si_can_use_compute_blit will be removed, so the function
will be removed too.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
1b924bad5e
radeonsi: reject unsupported parameters as the first thing in the compute blit
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
993c30af06
radeonsi: fix sample0_only for the compute blit
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
0ca93e8090
radeonsi: optimize unaligned compute blits
If a blit starts on a coordinate that is not at the beginning of a tile
(e.g. 8x8), launch extra threads before 0,0,0 to make all following blocks
start at the beginning of such tiles. This makes such blits faster.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
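The alignment idea in the commit above can be sketched along one axis as follows. This is an illustration under assumptions, not the driver's implementation: the struct and function names are hypothetical, and the real code works on 3D blocks. Extra leading threads are launched so that thread 0 lands on a tile boundary, and those leading threads do no work.

```c
#include <assert.h>

/* Hypothetical 1D launch description (names are not from radeonsi). */
struct example_launch {
   unsigned start_offset; /* leading threads that fall before the blit origin */
   unsigned num_threads;  /* total threads launched along this axis */
};

/* Pad the launch so that thread 0 maps to the start of the tile that
 * contains dst_start. */
static struct example_launch example_align_launch(unsigned dst_start,
                                                  unsigned size,
                                                  unsigned tile_size)
{
   assert(tile_size > 0);

   struct example_launch l;
   l.start_offset = dst_start % tile_size;
   l.num_threads = size + l.start_offset;
   return l;
}

/* Destination coordinate written by a given thread. */
static unsigned example_thread_coord(unsigned dst_start,
                                     const struct example_launch *l,
                                     unsigned thread_id)
{
   return dst_start - l->start_offset + thread_id;
}

/* Threads before the blit origin are launched only for alignment. */
static int example_thread_is_active(const struct example_launch *l,
                                    unsigned thread_id)
{
   return thread_id >= l->start_offset && thread_id < l->num_threads;
}
```

For a blit starting at x = 13 with width 20 and 8-wide tiles, this launches 25 threads: thread 0 maps to x = 8 and is inactive, and thread 5 is the first active one at x = 13.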
Marek Olšák
2423c5ad2f
radeonsi: use MIMG D16 (16-bit data) for image instructions in compute blits
This reduces VGPR usage.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
d3638a9f58
radeonsi: remove fp16_rtz from the compute blit
It's not useful to have precisely the same behavior as u_blitter,
and D16 image stores are not supported by gfx6-7.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
78ab033ae8
radeonsi: ignore PIPE_SWIZZLE_1 for 40% VGPR usage reduction for compute blits
It had no effect on correctness and it was very inefficient because all
formats without alpha have SWIZZLE_1 in the last channel.
util_format_get_last_component is the same, but ignores PIPE_SWIZZLE_1.
It improves MSAA compute blit performance.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
144fe156ef
radeonsi: use better workgroup sizes for compute blits to improve perf
It depends on the copy area and the tiling of the destination image.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
269ab6cc62
radeonsi: don't declare 3D coordinates in the compute blit if they aren't needed
This eliminates the 3rd coordinate VGPR.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
07fa635f11
gallium/u_blitter: add option to override fragment shader for util_blitter_blit
radeonsi will use a custom MSAA resolving shader.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:10 +00:00
Marek Olšák
9ab9644c1f
radeonsi/gfx12: fix stencil corruption
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29564>
2024-06-08 00:11:28 -04:00
Marek Olšák
1b9ce2625f
ac/nir/lower_ngg: don't use gfx12 xfb defs outside their basic block on gfx11
Move the defs after nir_pop_if and phis and inside the gfx12 branch.
Fixes: 1ea96a47cd ("ac/nir/lower_ngg: use voffset in global_atomic_add for xfb")
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29564>
2024-06-08 00:11:18 -04:00
Marek Olšák
ea99c3fcb9
amd: update addrlib
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29564>
2024-06-08 00:11:17 -04:00
Marek Olšák
2ea3cb054b
ac/surface: pass the correct addrlib handle to Addr3GetPossibleSwizzleModes
Fixes: d22564d29c ("ac/surface: add gfx12")
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29564>
2024-06-08 00:11:15 -04:00
Guilherme Gallo
41dd1c52b1
ci/lava: Fix cmdline for UART/fastboot devices
Fastboot devices need an indirection for creating a boot image via
`mkbootimg`, so we need to propagate the cmdline from LAVA and our extra
arguments to it properly.
This commit fixes it by retrieving the default cmdline from LAVA and
sending it, together with the `extra_nfsroot_args`, to the `mkbootimg`
command.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29611>
2024-06-07 22:03:22 +00:00
Roland Scheidegger
eead805919
lavapipe: add option to enable snorm blending
This is disabled by default because it fails CTS. However, it may
still be useful (it generally works), so set the LVP_SNORM_BLEND
env var to enable it.
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Brian Paul <brian.paul@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29587>
2024-06-07 21:33:12 +00:00
Jianxun Zhang
9654aa4c31
intel/isl: Allow multi-sample on depth aux usage (xe2)
On Xe2, the spec no longer has the restriction on depth aux mode.
Fixes piglit arb_post_depth_coverage-multisampling -auto -fbo:
isl_surface_state.c:723: isl_gfx20_surf_fill_state_s:
Assertion `info->surf->samples == 1' failed.
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29274>
2024-06-07 21:06:37 +00:00
Eric Engestrom
bd6ace73f3
radv/ci: document navi31 regression from !29235
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29596>
2024-06-07 20:57:01 +00:00
Eric Engestrom
89666be1b9
nvk+zink/ci: add another flake seen in nightly
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29601>
2024-06-07 20:47:01 +00:00
Eric Engestrom
46247b3827
v3d/drm-shim: emulate a rpi4 instead of a rpi3
7278 is the chip on the rpi3, while the rpi4 that made it to market has
the 2711 chip.
When this was introduced (82bf1979), the rpi4 was probably still in
flux, which is why the rpi3 chip was put there (and v3d doesn't care
about that, but v3dv does).
cc: mesa-stable
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29584>
2024-06-07 20:28:44 +00:00
Mike Blumenkrantz
2a90e16709
zink: add HKP to tiler mode switch
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29609>
2024-06-07 20:02:19 +00:00
Mike Blumenkrantz
9a28f69ee7
vulkan: Update XML and headers to 1.3.287
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29610>
2024-06-07 19:06:46 +00:00
Craig Stout
d0b3b2eb54
util: os_time: add Fuchsia support
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29539>
2024-06-07 18:29:20 +00:00
C Stout
d39faf7f3d
util: u_dl: add Fuchsia support
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29539>
2024-06-07 18:29:20 +00:00
C Stout
2a3f53bd3b
util: os_misc: add Fuchsia support
v2: cleaner detect os check (robclark@)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29539>
2024-06-07 18:29:20 +00:00
C Stout
d6096ce8c8
util: u_thread: add Fuchsia support
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29539>
2024-06-07 18:29:20 +00:00
C Stout
ebe4a8d75f
util: detect_os: add DETECT_OS_FUCHSIA and DETECT_OS_POSIX_LITE
Fuchsia is a microkernel-like OS. It strategically implements
some POSIX and Unix APIs to promote software re-use.
It considers itself POSIX lite.
"In order to reduce the amount of source modification needed to
run on Fuchsia, Fuchsia offers a POSIX compatibility layer, POSIX
Lite, that this software can target. POSIX Lite is layered on
top of the underlying Fuchsia System ABI as a client library.
However, POSIX Lite is not a complete implementation of POSIX."
In the case of Fuchsia + src/util, these heavy-weight POSIX
functions shouldn't be used:
- file descriptors
- syslog.h
- signals
- process creation
To differentiate POSIX Lite, which Fuchsia and all heavy-weight
POSIX implementations support, add DETECT_OS_POSIX_LITE.
The use case is incrementally upstreaming functionality used in
downstream drivers (lavapipe, ..). Being in-tree for obvious
patches helps until the full driver can be merged.
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29539>
2024-06-07 18:29:20 +00:00
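The relationship between the two macros described in the commit above can be sketched as follows. The macro names DETECT_OS_FUCHSIA and DETECT_OS_POSIX_LITE come from the commit message; the detection conditions, the DETECT_OS_POSIX definition, and the helper `example_os_supports` are assumptions written for illustration and are not copied from Mesa's detect_os.h.

```c
#include <assert.h>

/* Assumption: Fuchsia toolchains predefine __Fuchsia__. */
#if defined(__Fuchsia__)
#define DETECT_OS_FUCHSIA 1
#define DETECT_OS_POSIX_LITE 1
#endif

/* Assumption: a simplified stand-in for "heavy-weight" POSIX detection. */
#if defined(__unix__) || defined(__APPLE__)
#define DETECT_OS_POSIX 1
#ifndef DETECT_OS_POSIX_LITE
#define DETECT_OS_POSIX_LITE 1 /* every full POSIX system also offers POSIX Lite */
#endif
#endif

/* Default everything that was not detected to 0 so that #if checks work. */
#ifndef DETECT_OS_FUCHSIA
#define DETECT_OS_FUCHSIA 0
#endif
#ifndef DETECT_OS_POSIX
#define DETECT_OS_POSIX 0
#endif
#ifndef DETECT_OS_POSIX_LITE
#define DETECT_OS_POSIX_LITE 0
#endif

/* Hypothetical helper: code that needs file descriptors, syslog.h, signals,
 * or process creation asks for full POSIX; everything else asks for Lite. */
static int example_os_supports(int needs_full_posix)
{
   assert(needs_full_posix == 0 || needs_full_posix == 1);
   return needs_full_posix ? DETECT_OS_POSIX : DETECT_OS_POSIX_LITE;
}
```

In this sketch, code in src/util would guard the heavy-weight features with `#if DETECT_OS_POSIX` and the portable subset with `#if DETECT_OS_POSIX_LITE`.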
Mike Blumenkrantz
9cdbb099ee
gallium: stop dropping drawid_offset param with util_draw_indirect
Dropping it breaks indirect draws with offsets.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29462>
2024-06-07 17:47:53 +00:00