Commit Graph

194428 Commits

Author SHA1 Message Date
Samuel Pitoiset
2234e6d75a radv: add a helper to store data to the DGC upload space
The offset is automatically incremented when something is stored.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30924>
2024-08-30 18:02:22 +00:00
Samuel Pitoiset
330d6e0951 radv: stop passing the upload offset to dgc_emit_bind_pipeline()
This doesn't do anything.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30924>
2024-08-30 18:02:22 +00:00
Samuel Pitoiset
e96be348f2 radv: move emitting the compute pipeline with DGC
Only compute is supported.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30924>
2024-08-30 18:02:22 +00:00
Mike Blumenkrantz
ecb788624b revert part of 94e470a32d
accidentally added during rebase

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30949>
2024-08-30 17:08:33 +00:00
Louis-Francis Ratté-Boulianne
9d981a4c5b panfrost: properly lower DrawID sysval on v9 GPUs
We only use special DrawID register on v10 GPUs so we still need to
lower to sysval on any earlier generation.

Fixes commit f390835074

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30933>
2024-08-30 16:30:40 +00:00
Sai Teja
05f6e9f11e ci: Disable angle jobs for GL changes
Mesa's GL stack changes doesn't affect angle in any
way for now. Thus, drop angle jobs for GL changes from
intel and amd CI.

Signed-off-by: Sai Teja <saiteja13427@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30943>
2024-08-30 15:09:15 +00:00
Ali Homafar
3b581ed1f8 zink: Optimize descriptor buffers struct filling
Improve buffer descriptor filling for UBOs and SSBOs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30390>
2024-08-30 14:20:19 +00:00
Mike Blumenkrantz
94e470a32d zink: update profile with missing extensions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30944>
2024-08-30 13:37:25 +00:00
Daniel Stone
2e97d7b35c doc/llvmpipe: Update URL to fix linkcheck
linkcheck-docs has been failing for a little while now.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30939>
2024-08-30 13:26:41 +00:00
Christian Gmeiner
42f5b99a34 etnaviv: Switch to etna_core_has_feature(..) for has_halti2_instructions
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30806>
2024-08-30 12:45:58 +00:00
Christian Gmeiner
2978d10203 etnaviv: Switch to etna_core_has_feature(..) for npot_tex_any_wrap
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30806>
2024-08-30 12:45:58 +00:00
Christian Gmeiner
61d0ec5aec etnaviv: Switch to stream_count from etna_core_info
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30806>
2024-08-30 12:45:58 +00:00
Christian Gmeiner
8b0a409431 etnaviv: Switch to max_registers from etna_core_info
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30806>
2024-08-30 12:45:58 +00:00
Christian Gmeiner
89e286892d etnaviv: Switch to num_constants from etna_core_info
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30806>
2024-08-30 12:45:58 +00:00
Christian Gmeiner
f34bf16114 etnaviv: npu: Drop not used spec values
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30806>
2024-08-30 12:45:58 +00:00
Christian Gmeiner
92a6f697d5 etnaviv: npu: Switch to use etna_core_info
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30806>
2024-08-30 12:45:58 +00:00
Christian Gmeiner
226e7d952f etnaviv: Switch to vertex_output_buffer_size from etna_core_info
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30806>
2024-08-30 12:45:58 +00:00
Christian Gmeiner
d5b4758417 etnaviv: Switch to vertex_cache_size from etna_core_info
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30806>
2024-08-30 12:45:58 +00:00
Christian Gmeiner
0a6baea787 etnaviv: Switch to shader_core_count from etna_core_info
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30806>
2024-08-30 12:45:58 +00:00
Christian Gmeiner
f304dc57ae etnaviv: Drop has_sin_cos_sqrt and has_sign_floor_ceil
They are not read anywhere.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30806>
2024-08-30 12:45:58 +00:00
Jordan Justen
3e4f73b3a0 intel/dev: Update hwconfig => max_threads_per_psd for Xe2
Backport-to: 24.2
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30887>
2024-08-30 01:53:55 -07:00
Qiang Yu
588a65f29a ac: do not lower some ops in nir_lower_packing
AMD does not implement nir_op_pack_32_4x8_split, others
are implemented, so don't lower them.

Fixes: 0f937426cc ("radeonsi: lower subgroup ops after wave size is known")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11781
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30885>
2024-08-30 05:46:51 +00:00
Qiang Yu
d43c5003fc nir: add skip_lower_packing_ops shader compile option
Drivers like radeonsi and radv prefer to not lowering
some packing ops.

Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30885>
2024-08-30 05:46:51 +00:00
Eric Engestrom
6c1d0b82fb turnip/ci: add vkd3d job on the a750
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29845>
2024-08-30 05:10:53 +00:00
Caio Oliveira
e4f090d3a6 intel/brw: Remove special treatment for 2-src in emit() helper
For Gfx9+ no 2-src instructions need sources to fixed up.  Special
treatment remains for 3-src instructions.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30911>
2024-08-30 04:33:47 +00:00
Ian Romanick
73f365e208 intel/brw: load_offset cannot be constant on this path
Literally inside an if-statement (about 26 lines before this hunk)
that checks for !nir_src_is_const(instr->src[1]).

No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
2024-08-30 03:39:31 +00:00
Ian Romanick
fef175de09 intel/brw: Enable constant propagation for a couple more logical sends
This prevents some regressions later in the MR. Once load_const
operations are marked as is_scalar, they will cesase to get the
automatic constant propagation that occurs in try_rebuild_source.

No shader-db or fossil-db changes on any Intel platform.

v2: Slightly relax source restrictions on
SHADER_OPCODE_UNALIGNED_OWORD_BLOCK_READ_LOGICAL. Add a comment
explaining the restriction.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
2024-08-30 03:39:31 +00:00
Ian Romanick
c6a8b382fd intel/brw: Relax is_partial_write check in cmod propagation
The is_partial_write check is too strict because it tests two separate
things. It tests whether or not the instruction always writes a value
(i.e., is it predicated), and it tests whether or not the instruction
writes a complete register. This latter check is problematic as it
perevents cmod propagation in SIMD1, and it prevents cmod propagation in
SIMD8 when the destination size is 16 bits.

This check is unnecessary. Cmod propagation already checks that the
region written and region read overlap. It also already checks that the
execution sizes of the instructions match. Further restriction based on
the specific parts of the register written only generates false
negatives.

v2: Relax all of the calls to is_partial_write. Suggested by Caio.

No shader-db changes on any Intel platform.

fossil-db:

Meteor Lake
Totals:
Instrs: 151505520 -> 151502923 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17201385104 -> 17194901423 (-0.04%); split: -0.06%, +0.02%
Spill count: 80827 -> 80837 (+0.01%)
Fill count: 152693 -> 152692 (-0.00%); split: -0.01%, +0.01%

Totals from 346 (0.05% of 630198) affected shaders:
Instrs: 1257205 -> 1254608 (-0.21%); split: -0.21%, +0.00%
Cycle count: 5532845647 -> 5526361966 (-0.12%); split: -0.18%, +0.06%
Spill count: 32903 -> 32913 (+0.03%)
Fill count: 64338 -> 64337 (-0.00%); split: -0.03%, +0.03%

DG2
Totals:
Instrs: 151531440 -> 151528055 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17200238927 -> 17197996676 (-0.01%); split: -0.03%, +0.02%
Spill count: 81003 -> 80971 (-0.04%); split: -0.04%, +0.00%
Fill count: 152975 -> 152912 (-0.04%); split: -0.05%, +0.01%

Totals from 346 (0.05% of 630198) affected shaders:
Instrs: 1260363 -> 1256978 (-0.27%); split: -0.27%, +0.00%
Cycle count: 5532019670 -> 5529777419 (-0.04%); split: -0.09%, +0.05%
Spill count: 33046 -> 33014 (-0.10%); split: -0.11%, +0.01%
Fill count: 64581 -> 64518 (-0.10%); split: -0.13%, +0.03%

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
Totals:
Instrs: 149972324 -> 149972289 (-0.00%)
Cycle count: 15566495293 -> 15565151171 (-0.01%); split: -0.01%, +0.00%

Totals from 16 (0.00% of 629912) affected shaders:
Instrs: 351194 -> 351159 (-0.01%)
Cycle count: 3922227030 -> 3920882908 (-0.03%); split: -0.04%, +0.00%

Skylake
Totals:
Instrs: 140787999 -> 140787983 (-0.00%); split: -0.00%, +0.00%
Cycle count: 14665614947 -> 14665515855 (-0.00%); split: -0.00%, +0.00%
Spill count: 58500 -> 58501 (+0.00%)
Fill count: 102097 -> 102100 (+0.00%)

Totals from 16 (0.00% of 625685) affected shaders:
Instrs: 343560 -> 343544 (-0.00%); split: -0.01%, +0.01%
Cycle count: 3354997898 -> 3354898806 (-0.00%); split: -0.01%, +0.01%
Spill count: 16864 -> 16865 (+0.01%)
Fill count: 27479 -> 27482 (+0.01%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
2024-08-30 03:39:31 +00:00
Ian Romanick
13332c236b intel/brw: Unconditionally run optimizations after nir_opt_uniform_subgroup
I observed some ray tracing shaders where a resource_intel inside a
loop was non-uniform, and some code was lowered to account for
that. Eventually the loop containing the resource_intel was unrolled,
and the resource_intel became uniform.

For example, nir_opt_uniform_subgroup can transform something like

    con loop {
        con block b5:        // preds: b4 b8
        con 32    %330 = @read_first_invocation (%329)
        con 1     %331 = ieq %330, %329
                         // succs: b6 b7
        if %331 {
            con block b6:        // preds: b5
            con 32    %332 = iadd %120.b, %330
            con 32    %333 = @resource_intel (%125 (0xdeaddeed), %332, %125 (0xdeaddeed), %3 (0x0)) (desc_set=1, binding=2, resource_intel=bindless|non-uniform, resource_block_intel=-1)
            div 32x4  %334 = (float32)txl %333 (texture_handle), %130 (sampler_handle), %327 (coord), %275 (lod), 0 (texture), 0 (sampler)
                             break
                             // succs: b9
        } else {
            con block b7:  // preds: b5, succs: b8
        }
        con block b8:  // preds: b7, succs: b5
    }

into

    con loop {
        con block b5:        // preds: b4 b8
        con 1     %331 = ieq %329, %329
                         // succs: b6 b7
        if %331 {
            con block b6:        // preds: b5
            con 32    %332 = iadd %120.b, %329
            con 32    %333 = @resource_intel (%125 (0xdeaddeed), %332, %125 (0xdeaddeed), %3 (0x0)) (desc_set=1, binding=2, resource_intel=bindless|non-uniform, resource_block_intel=-1)
            div 32x4  %334 = (float32)txl %333 (texture_handle), %130 (sampler_handle), %327 (coord), %275 (lod), 0 (texture), 0 (sampler)
                             break
                             // succs: b9
        } else {
            con block b7:  // preds: b5, succs: b8
        }
        con block b8:  // preds: b7, succs: b5
    }

Notice that %331 is now a tautology. Running brw_nir_optimize again
eliminates the loop.

v2: Add a comment in the code explaining the rationale. Suggested by
Ken. Update the commit message. Suggested by Caio.

shader-db:

Meteor Lake, DG2, and Tiger Lake had similar results. (Meteor Lake shown)
total instructions in shared programs: 19733448 -> 19733330 (<.01%)
instructions in affected programs: 14120 -> 14002 (-0.84%)
helped: 32 / HURT: 3

total cycles in shared programs: 916254496 -> 916226288 (<.01%)
cycles in affected programs: 2035116 -> 2006908 (-1.39%)
helped: 19 / HURT: 13

total spills in shared programs: 5807 -> 5807 (0.00%)
spills in affected programs: 26 -> 26 (0.00%)
helped: 1 / HURT: 1

total fills in shared programs: 6794 -> 6792 (-0.03%)
fills in affected programs: 84 -> 82 (-2.38%)
helped: 1 / HURT: 1

LOST:   1
GAINED: 1

Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 20393084 -> 20392971 (<.01%)
instructions in affected programs: 21750 -> 21637 (-0.52%)
helped: 31 / HURT: 4

total cycles in shared programs: 880273065 -> 880247818 (<.01%)
cycles in affected programs: 2546748 -> 2521501 (-0.99%)
helped: 18 / HURT: 9

total spills in shared programs: 4628 -> 4630 (0.04%)
spills in affected programs: 287 -> 289 (0.70%)
helped: 1 / HURT: 2

total fills in shared programs: 5381 -> 5376 (-0.09%)
fills in affected programs: 711 -> 706 (-0.70%)
helped: 2 / HURT: 2

LOST:   1
GAINED: 1

fossil-db:

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 151513669 -> 151505520 (-0.01%); split: -0.01%, +0.00%
Send messages: 7459339 -> 7459396 (+0.00%)
Loop count: 49111 -> 47588 (-3.10%)
Cycle count: 17208178205 -> 17201385104 (-0.04%); split: -0.05%, +0.01%
Spill count: 80830 -> 80827 (-0.00%); split: -0.02%, +0.01%
Fill count: 152754 -> 152693 (-0.04%); split: -0.04%, +0.00%
Scratch Memory Size: 4136960 -> 4130816 (-0.15%)
Max live registers: 32016493 -> 32015955 (-0.00%); split: -0.00%, +0.00%

Totals from 672 (0.11% of 630198) affected shaders:
Instrs: 1352428 -> 1344279 (-0.60%); split: -0.78%, +0.17%
Send messages: 54302 -> 54359 (+0.10%)
Loop count: 6124 -> 4601 (-24.87%)
Cycle count: 1260266379 -> 1253473278 (-0.54%); split: -0.69%, +0.16%
Spill count: 15967 -> 15964 (-0.02%); split: -0.09%, +0.08%
Fill count: 36245 -> 36184 (-0.17%); split: -0.18%, +0.01%
Scratch Memory Size: 740352 -> 734208 (-0.83%)
Max live registers: 50699 -> 50161 (-1.06%); split: -1.45%, +0.39%

Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown)
Totals:
Instrs: 149976046 -> 149971100 (-0.00%); split: -0.00%, +0.00%
Subgroup size: 7685264 -> 7685256 (-0.00%)
Cycle count: 15566401168 -> 15566405478 (+0.00%); split: -0.00%, +0.00%
Spill count: 61238 -> 61240 (+0.00%)
Fill count: 107301 -> 107289 (-0.01%)
Max live registers: 31992969 -> 31993857 (+0.00%); split: -0.00%, +0.00%

Totals from 553 (0.09% of 629912) affected shaders:
Instrs: 557027 -> 552081 (-0.89%); split: -0.90%, +0.01%
Subgroup size: 8648 -> 8640 (-0.09%)
Cycle count: 150154496 -> 150158806 (+0.00%); split: -0.23%, +0.24%
Spill count: 181 -> 183 (+1.10%)
Fill count: 440 -> 428 (-2.73%)
Max live registers: 33698 -> 34586 (+2.64%); split: -0.02%, +2.65%

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
2024-08-30 03:39:31 +00:00
Ian Romanick
65eb7ed5fc intel/brw: Run intel_nir_lower_conversions only after brw_nir_optimize
Without this, the next commit tiggers assertions.

v2: Unconditionally do the lowering after brw_nir_optimize. Suggested by
Caio.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
2024-08-30 03:39:31 +00:00
Ian Romanick
572e00dd66 intel/brw: Copy prop from raw integer moves with mismatched types
The specific pattern from the unit test was observed in ray tracing
trampoline shaders.

v2: Refactor the is_raw_move tests out to a utility function. Suggested
by Ken.

v3: Fix a regression caused by being too picky about source
modifiers. This was introduced somewhere between when I did initial
shader-db runs an v2.

v4: Fix typo in comment. Noticed by Caio.

shader-db:

All Intel platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19734086 -> 19733997 (<.01%)
instructions in affected programs: 135388 -> 135299 (-0.07%)
helped: 76 / HURT: 2

total cycles in shared programs: 916290451 -> 916264968 (<.01%)
cycles in affected programs: 41046002 -> 41020519 (-0.06%)
helped: 32 / HURT: 29

fossil-db:

Meteor Lake, DG2, and Skylake had similar results. (Meteor Lake shown)
Totals:
Instrs: 151531355 -> 151513669 (-0.01%); split: -0.01%, +0.00%
Cycle count: 17209372399 -> 17208178205 (-0.01%); split: -0.01%, +0.00%
Max live registers: 32016490 -> 32016493 (+0.00%)

Totals from 17361 (2.75% of 630198) affected shaders:
Instrs: 2642048 -> 2624362 (-0.67%); split: -0.67%, +0.00%
Cycle count: 79803066 -> 78608872 (-1.50%); split: -1.75%, +0.25%
Max live registers: 421668 -> 421671 (+0.00%)

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
Totals:
Instrs: 149995644 -> 149977326 (-0.01%); split: -0.01%, +0.00%
Cycle count: 15567293770 -> 15566524840 (-0.00%); split: -0.02%, +0.01%
Spill count: 61241 -> 61238 (-0.00%)
Fill count: 107304 -> 107301 (-0.00%)
Max live registers: 31993109 -> 31993112 (+0.00%)

Totals from 17813 (2.83% of 629912) affected shaders:
Instrs: 3738236 -> 3719918 (-0.49%); split: -0.49%, +0.00%
Cycle count: 4251157049 -> 4250388119 (-0.02%); split: -0.06%, +0.04%
Spill count: 28268 -> 28265 (-0.01%)
Fill count: 50377 -> 50374 (-0.01%)
Max live registers: 470648 -> 470651 (+0.00%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
2024-08-30 03:39:31 +00:00
Ian Romanick
c160ed212e nir/divergence: resource_intel is less divergent than you thought
When the non_uniform flag is not set, the result is never divergent.

v2: Remove redundant assignment to is_divergent. Suggested by Caio.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
2024-08-30 03:39:30 +00:00
Casey Bowman
eda55c7c2f vulkan/screenshot-layer: Add Vulkan screenshot layer
This change adds a Vulkan screenshot layer that allows users to take
screenshots from a Vulkan application, but has an emphasis on
performance, decreasing the performance impact on the application
involved. This allows for automated setups to use this layer to take
screenshots for navigating various in-application menus.

This layer works by hooking into various common Vulkan setup functions, until
it enters the vkQueuePresentKHR function, and from there it copies the current
frame's image from the swapchain as an RGB image to host-cached memory, where
we will receive the information as a framebuffer pointer. From there, we copy
the framebuffer contents to a thread that will detach from the main process
so it can write the image to a PNG file without holding back the main thread.

This layer was created from using the existing overlay layer as a template,
then adding portions of LunarG's VulkanTools screenshot layer:
https://github.com/LunarG/VulkanTools/blob/main/layersvt/screenshot.cpp

More specifically, there were usages of functions, along with modifications of
various functions from screenshot.cpp in the VulkanTools project, used in
screenshot.cpp.

There are some sections of the screenshotting functionality that remain
unmodified from the original screenshot.cpp file in VulkanTools, including the
global locking structures and the writeFile() function, which takes care of
obtaining the images from the swapchain. There were various areas in which
modifications were made, including how images are written to a file (using PNG
instead of PPM, introducing threading, added fences/semaphores, etc), along
with many smaller changes.

v2: Fix segfault upon application exit

v3: Fix filename issue with concatenation, along with some leftover
memory handling that wasn't cleaned up.

v4: Fix some error handling and nits

v5: Fix output directory handling

Reviewed-by: Ivan Briano <ivan.briano@intel.com

Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30527>
2024-08-30 02:56:02 +00:00
Sil Vilerino
7530487e60 d3d12: Video Encode HEVC - Use VPS information from frontend, specifically for vps_max_dec_pic_buffering_minus1
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
e268ed0613 d3d12: Video Encode HEVC to use direct DPB from frontend
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
0249f2e652 d3d12: Video Encode H264 - Support direct mmco operations
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
da2cbfe3bf d3d12: Video Encode H264 to use direct DPB from frontend
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
bb1bbe51df d3d12: Implement get_feedback_fence
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
067f881435 frontend/va: VaSyncSurface encoder check for surface feedback
Currently VaSyncBuffer clears the feedback of the buffer's associated encode surface
to avoid a subsequent call to VaSyncSurface for the same encoded frame to retrieve feedback twice.
Add a check to VaSyncSurface to don't perform the feedback retrieval if surf->feedback already cleared.
Otherwise it results in the second call to VaSyncSurface to pass a null feedback
after waiting for the non-null fence.

Fixes: 96fe9fde3f ("frontends/va: Implement sync buffer/surface timeout for encode feedback")

Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
dad58e0cd3 d3d12: Implement pipe_video_codec.create_dpb_buffer for texture array resources
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
06a8b8c7e9 d3d12: Allow passing custom pipe_resource creation template/placed resource to d3d12_video_buffer_create_impl
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
4070d02524 d3d12: Implement pipe_video_codec.create_dpb_buffer for AOT resources
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
f8145fe691 pipe: Add PIPE_BIND_VIDEO_DECODE_DPB/PIPE_BIND_VIDEO_ENCODE_DPB
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
David Rosca
e751cb074a frontends/va: Allow drivers to allocate and use encode DPB surface buffers
Drivers can now allocate these buffers and use them directly.

(cherry picked from commit a4ef6eccb5c78c8080a944d8dd08f829902afc60)
See https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30843#note_2544865
for more details about discussion/cherry-picking.

Signed-off-by: David Rosca <david.rosca@amd.com>
Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Christian Gmeiner
bab6f2a1ec etnaviv: isa: Add conv instruction
This instruction is used to implement float type conversion. The source type
is defined via src1 immed (0: f32, 1: f16) and the dest type is defined via
the instruction type.

Blob generates such conv's for piglit's tests/cl/program/execute/mad-mix.cl

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30797>
2024-08-30 01:53:18 +00:00
Yinjie Yao
7f1c0fbe61 radeonsi/vcn: Rename transform_skip_disabled and remove hardcoded value for VCN5
This fix the HEVC encode corruption caused by mismatch between PPS
header and IB setting, the fix only apply for VCN5.
Rename from transform_skip_dicarded to transform_skip_disabled.

Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30930>
2024-08-30 01:17:22 +00:00
Valentine Burley
419885e280 tu: Simplify VK_EXT_sample_locations SampleCounts assignment
Use the existing sample_counts variable.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30730>
2024-08-30 00:34:54 +00:00
Valentine Burley
98d52cf292 tu: Fix VK_EXT_extended_dynamic_state3 feature
Don't claim to support extendedDynamicState3SampleLocationsEnable on pre-A650 GPUs,
which can't advertise VK_EXT_sample_locations.

Fixes dEQP-VK.info.device_mandatory_features on A6xx Gen 1 and Gen 2.

Fixes: 84726da2f4 ("tu: Implement extendedDynamicState3SampleLocationsEnable")
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30730>
2024-08-30 00:34:54 +00:00
Connor Abbott
630d6d1f2e tu: Add a750 flush workaround and re-enable UBWC for storage images
This is closer to what the blob does.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30896>
2024-08-29 23:52:00 +00:00
Faith Ekstrand
4442d61b16 nvk: Advertise VK_KHR_maintenance7
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30925>
2024-08-29 23:33:05 +00:00