Commit Graph

194401 Commits

Author SHA1 Message Date
Ian Romanick
c6a8b382fd intel/brw: Relax is_partial_write check in cmod propagation
The is_partial_write check is too strict because it tests two separate
things. It tests whether or not the instruction always writes a value
(i.e., is it predicated), and it tests whether or not the instruction
writes a complete register. This latter check is problematic as it
perevents cmod propagation in SIMD1, and it prevents cmod propagation in
SIMD8 when the destination size is 16 bits.

This check is unnecessary. Cmod propagation already checks that the
region written and region read overlap. It also already checks that the
execution sizes of the instructions match. Further restriction based on
the specific parts of the register written only generates false
negatives.

v2: Relax all of the calls to is_partial_write. Suggested by Caio.

No shader-db changes on any Intel platform.

fossil-db:

Meteor Lake
Totals:
Instrs: 151505520 -> 151502923 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17201385104 -> 17194901423 (-0.04%); split: -0.06%, +0.02%
Spill count: 80827 -> 80837 (+0.01%)
Fill count: 152693 -> 152692 (-0.00%); split: -0.01%, +0.01%

Totals from 346 (0.05% of 630198) affected shaders:
Instrs: 1257205 -> 1254608 (-0.21%); split: -0.21%, +0.00%
Cycle count: 5532845647 -> 5526361966 (-0.12%); split: -0.18%, +0.06%
Spill count: 32903 -> 32913 (+0.03%)
Fill count: 64338 -> 64337 (-0.00%); split: -0.03%, +0.03%

DG2
Totals:
Instrs: 151531440 -> 151528055 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17200238927 -> 17197996676 (-0.01%); split: -0.03%, +0.02%
Spill count: 81003 -> 80971 (-0.04%); split: -0.04%, +0.00%
Fill count: 152975 -> 152912 (-0.04%); split: -0.05%, +0.01%

Totals from 346 (0.05% of 630198) affected shaders:
Instrs: 1260363 -> 1256978 (-0.27%); split: -0.27%, +0.00%
Cycle count: 5532019670 -> 5529777419 (-0.04%); split: -0.09%, +0.05%
Spill count: 33046 -> 33014 (-0.10%); split: -0.11%, +0.01%
Fill count: 64581 -> 64518 (-0.10%); split: -0.13%, +0.03%

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
Totals:
Instrs: 149972324 -> 149972289 (-0.00%)
Cycle count: 15566495293 -> 15565151171 (-0.01%); split: -0.01%, +0.00%

Totals from 16 (0.00% of 629912) affected shaders:
Instrs: 351194 -> 351159 (-0.01%)
Cycle count: 3922227030 -> 3920882908 (-0.03%); split: -0.04%, +0.00%

Skylake
Totals:
Instrs: 140787999 -> 140787983 (-0.00%); split: -0.00%, +0.00%
Cycle count: 14665614947 -> 14665515855 (-0.00%); split: -0.00%, +0.00%
Spill count: 58500 -> 58501 (+0.00%)
Fill count: 102097 -> 102100 (+0.00%)

Totals from 16 (0.00% of 625685) affected shaders:
Instrs: 343560 -> 343544 (-0.00%); split: -0.01%, +0.01%
Cycle count: 3354997898 -> 3354898806 (-0.00%); split: -0.01%, +0.01%
Spill count: 16864 -> 16865 (+0.01%)
Fill count: 27479 -> 27482 (+0.01%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
2024-08-30 03:39:31 +00:00
Ian Romanick
13332c236b intel/brw: Unconditionally run optimizations after nir_opt_uniform_subgroup
I observed some ray tracing shaders where a resource_intel inside a
loop was non-uniform, and some code was lowered to account for
that. Eventually the loop containing the resource_intel was unrolled,
and the resource_intel became uniform.

For example, nir_opt_uniform_subgroup can transform something like

    con loop {
        con block b5:        // preds: b4 b8
        con 32    %330 = @read_first_invocation (%329)
        con 1     %331 = ieq %330, %329
                         // succs: b6 b7
        if %331 {
            con block b6:        // preds: b5
            con 32    %332 = iadd %120.b, %330
            con 32    %333 = @resource_intel (%125 (0xdeaddeed), %332, %125 (0xdeaddeed), %3 (0x0)) (desc_set=1, binding=2, resource_intel=bindless|non-uniform, resource_block_intel=-1)
            div 32x4  %334 = (float32)txl %333 (texture_handle), %130 (sampler_handle), %327 (coord), %275 (lod), 0 (texture), 0 (sampler)
                             break
                             // succs: b9
        } else {
            con block b7:  // preds: b5, succs: b8
        }
        con block b8:  // preds: b7, succs: b5
    }

into

    con loop {
        con block b5:        // preds: b4 b8
        con 1     %331 = ieq %329, %329
                         // succs: b6 b7
        if %331 {
            con block b6:        // preds: b5
            con 32    %332 = iadd %120.b, %329
            con 32    %333 = @resource_intel (%125 (0xdeaddeed), %332, %125 (0xdeaddeed), %3 (0x0)) (desc_set=1, binding=2, resource_intel=bindless|non-uniform, resource_block_intel=-1)
            div 32x4  %334 = (float32)txl %333 (texture_handle), %130 (sampler_handle), %327 (coord), %275 (lod), 0 (texture), 0 (sampler)
                             break
                             // succs: b9
        } else {
            con block b7:  // preds: b5, succs: b8
        }
        con block b8:  // preds: b7, succs: b5
    }

Notice that %331 is now a tautology. Running brw_nir_optimize again
eliminates the loop.

v2: Add a comment in the code explaining the rationale. Suggested by
Ken. Update the commit message. Suggested by Caio.

shader-db:

Meteor Lake, DG2, and Tiger Lake had similar results. (Meteor Lake shown)
total instructions in shared programs: 19733448 -> 19733330 (<.01%)
instructions in affected programs: 14120 -> 14002 (-0.84%)
helped: 32 / HURT: 3

total cycles in shared programs: 916254496 -> 916226288 (<.01%)
cycles in affected programs: 2035116 -> 2006908 (-1.39%)
helped: 19 / HURT: 13

total spills in shared programs: 5807 -> 5807 (0.00%)
spills in affected programs: 26 -> 26 (0.00%)
helped: 1 / HURT: 1

total fills in shared programs: 6794 -> 6792 (-0.03%)
fills in affected programs: 84 -> 82 (-2.38%)
helped: 1 / HURT: 1

LOST:   1
GAINED: 1

Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 20393084 -> 20392971 (<.01%)
instructions in affected programs: 21750 -> 21637 (-0.52%)
helped: 31 / HURT: 4

total cycles in shared programs: 880273065 -> 880247818 (<.01%)
cycles in affected programs: 2546748 -> 2521501 (-0.99%)
helped: 18 / HURT: 9

total spills in shared programs: 4628 -> 4630 (0.04%)
spills in affected programs: 287 -> 289 (0.70%)
helped: 1 / HURT: 2

total fills in shared programs: 5381 -> 5376 (-0.09%)
fills in affected programs: 711 -> 706 (-0.70%)
helped: 2 / HURT: 2

LOST:   1
GAINED: 1

fossil-db:

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 151513669 -> 151505520 (-0.01%); split: -0.01%, +0.00%
Send messages: 7459339 -> 7459396 (+0.00%)
Loop count: 49111 -> 47588 (-3.10%)
Cycle count: 17208178205 -> 17201385104 (-0.04%); split: -0.05%, +0.01%
Spill count: 80830 -> 80827 (-0.00%); split: -0.02%, +0.01%
Fill count: 152754 -> 152693 (-0.04%); split: -0.04%, +0.00%
Scratch Memory Size: 4136960 -> 4130816 (-0.15%)
Max live registers: 32016493 -> 32015955 (-0.00%); split: -0.00%, +0.00%

Totals from 672 (0.11% of 630198) affected shaders:
Instrs: 1352428 -> 1344279 (-0.60%); split: -0.78%, +0.17%
Send messages: 54302 -> 54359 (+0.10%)
Loop count: 6124 -> 4601 (-24.87%)
Cycle count: 1260266379 -> 1253473278 (-0.54%); split: -0.69%, +0.16%
Spill count: 15967 -> 15964 (-0.02%); split: -0.09%, +0.08%
Fill count: 36245 -> 36184 (-0.17%); split: -0.18%, +0.01%
Scratch Memory Size: 740352 -> 734208 (-0.83%)
Max live registers: 50699 -> 50161 (-1.06%); split: -1.45%, +0.39%

Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown)
Totals:
Instrs: 149976046 -> 149971100 (-0.00%); split: -0.00%, +0.00%
Subgroup size: 7685264 -> 7685256 (-0.00%)
Cycle count: 15566401168 -> 15566405478 (+0.00%); split: -0.00%, +0.00%
Spill count: 61238 -> 61240 (+0.00%)
Fill count: 107301 -> 107289 (-0.01%)
Max live registers: 31992969 -> 31993857 (+0.00%); split: -0.00%, +0.00%

Totals from 553 (0.09% of 629912) affected shaders:
Instrs: 557027 -> 552081 (-0.89%); split: -0.90%, +0.01%
Subgroup size: 8648 -> 8640 (-0.09%)
Cycle count: 150154496 -> 150158806 (+0.00%); split: -0.23%, +0.24%
Spill count: 181 -> 183 (+1.10%)
Fill count: 440 -> 428 (-2.73%)
Max live registers: 33698 -> 34586 (+2.64%); split: -0.02%, +2.65%

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
2024-08-30 03:39:31 +00:00
Ian Romanick
65eb7ed5fc intel/brw: Run intel_nir_lower_conversions only after brw_nir_optimize
Without this, the next commit tiggers assertions.

v2: Unconditionally do the lowering after brw_nir_optimize. Suggested by
Caio.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
2024-08-30 03:39:31 +00:00
Ian Romanick
572e00dd66 intel/brw: Copy prop from raw integer moves with mismatched types
The specific pattern from the unit test was observed in ray tracing
trampoline shaders.

v2: Refactor the is_raw_move tests out to a utility function. Suggested
by Ken.

v3: Fix a regression caused by being too picky about source
modifiers. This was introduced somewhere between when I did initial
shader-db runs an v2.

v4: Fix typo in comment. Noticed by Caio.

shader-db:

All Intel platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19734086 -> 19733997 (<.01%)
instructions in affected programs: 135388 -> 135299 (-0.07%)
helped: 76 / HURT: 2

total cycles in shared programs: 916290451 -> 916264968 (<.01%)
cycles in affected programs: 41046002 -> 41020519 (-0.06%)
helped: 32 / HURT: 29

fossil-db:

Meteor Lake, DG2, and Skylake had similar results. (Meteor Lake shown)
Totals:
Instrs: 151531355 -> 151513669 (-0.01%); split: -0.01%, +0.00%
Cycle count: 17209372399 -> 17208178205 (-0.01%); split: -0.01%, +0.00%
Max live registers: 32016490 -> 32016493 (+0.00%)

Totals from 17361 (2.75% of 630198) affected shaders:
Instrs: 2642048 -> 2624362 (-0.67%); split: -0.67%, +0.00%
Cycle count: 79803066 -> 78608872 (-1.50%); split: -1.75%, +0.25%
Max live registers: 421668 -> 421671 (+0.00%)

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
Totals:
Instrs: 149995644 -> 149977326 (-0.01%); split: -0.01%, +0.00%
Cycle count: 15567293770 -> 15566524840 (-0.00%); split: -0.02%, +0.01%
Spill count: 61241 -> 61238 (-0.00%)
Fill count: 107304 -> 107301 (-0.00%)
Max live registers: 31993109 -> 31993112 (+0.00%)

Totals from 17813 (2.83% of 629912) affected shaders:
Instrs: 3738236 -> 3719918 (-0.49%); split: -0.49%, +0.00%
Cycle count: 4251157049 -> 4250388119 (-0.02%); split: -0.06%, +0.04%
Spill count: 28268 -> 28265 (-0.01%)
Fill count: 50377 -> 50374 (-0.01%)
Max live registers: 470648 -> 470651 (+0.00%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
2024-08-30 03:39:31 +00:00
Ian Romanick
c160ed212e nir/divergence: resource_intel is less divergent than you thought
When the non_uniform flag is not set, the result is never divergent.

v2: Remove redundant assignment to is_divergent. Suggested by Caio.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
2024-08-30 03:39:30 +00:00
Casey Bowman
eda55c7c2f vulkan/screenshot-layer: Add Vulkan screenshot layer
This change adds a Vulkan screenshot layer that allows users to take
screenshots from a Vulkan application, but has an emphasis on
performance, decreasing the performance impact on the application
involved. This allows for automated setups to use this layer to take
screenshots for navigating various in-application menus.

This layer works by hooking into various common Vulkan setup functions, until
it enters the vkQueuePresentKHR function, and from there it copies the current
frame's image from the swapchain as an RGB image to host-cached memory, where
we will receive the information as a framebuffer pointer. From there, we copy
the framebuffer contents to a thread that will detach from the main process
so it can write the image to a PNG file without holding back the main thread.

This layer was created from using the existing overlay layer as a template,
then adding portions of LunarG's VulkanTools screenshot layer:
https://github.com/LunarG/VulkanTools/blob/main/layersvt/screenshot.cpp

More specifically, there were usages of functions, along with modifications of
various functions from screenshot.cpp in the VulkanTools project, used in
screenshot.cpp.

There are some sections of the screenshotting functionality that remain
unmodified from the original screenshot.cpp file in VulkanTools, including the
global locking structures and the writeFile() function, which takes care of
obtaining the images from the swapchain. There were various areas in which
modifications were made, including how images are written to a file (using PNG
instead of PPM, introducing threading, added fences/semaphores, etc), along
with many smaller changes.

v2: Fix segfault upon application exit

v3: Fix filename issue with concatenation, along with some leftover
memory handling that wasn't cleaned up.

v4: Fix some error handling and nits

v5: Fix output directory handling

Reviewed-by: Ivan Briano <ivan.briano@intel.com

Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30527>
2024-08-30 02:56:02 +00:00
Sil Vilerino
7530487e60 d3d12: Video Encode HEVC - Use VPS information from frontend, specifically for vps_max_dec_pic_buffering_minus1
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
e268ed0613 d3d12: Video Encode HEVC to use direct DPB from frontend
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
0249f2e652 d3d12: Video Encode H264 - Support direct mmco operations
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
da2cbfe3bf d3d12: Video Encode H264 to use direct DPB from frontend
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
bb1bbe51df d3d12: Implement get_feedback_fence
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
067f881435 frontend/va: VaSyncSurface encoder check for surface feedback
Currently VaSyncBuffer clears the feedback of the buffer's associated encode surface
to avoid a subsequent call to VaSyncSurface for the same encoded frame to retrieve feedback twice.
Add a check to VaSyncSurface to don't perform the feedback retrieval if surf->feedback already cleared.
Otherwise it results in the second call to VaSyncSurface to pass a null feedback
after waiting for the non-null fence.

Fixes: 96fe9fde3f ("frontends/va: Implement sync buffer/surface timeout for encode feedback")

Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
dad58e0cd3 d3d12: Implement pipe_video_codec.create_dpb_buffer for texture array resources
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
06a8b8c7e9 d3d12: Allow passing custom pipe_resource creation template/placed resource to d3d12_video_buffer_create_impl
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
4070d02524 d3d12: Implement pipe_video_codec.create_dpb_buffer for AOT resources
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Sil Vilerino
f8145fe691 pipe: Add PIPE_BIND_VIDEO_DECODE_DPB/PIPE_BIND_VIDEO_ENCODE_DPB
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
David Rosca
e751cb074a frontends/va: Allow drivers to allocate and use encode DPB surface buffers
Drivers can now allocate these buffers and use them directly.

(cherry picked from commit a4ef6eccb5c78c8080a944d8dd08f829902afc60)
See https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30843#note_2544865
for more details about discussion/cherry-picking.

Signed-off-by: David Rosca <david.rosca@amd.com>
Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30908>
2024-08-30 02:12:28 +00:00
Christian Gmeiner
bab6f2a1ec etnaviv: isa: Add conv instruction
This instruction is used to implement float type conversion. The source type
is defined via src1 immed (0: f32, 1: f16) and the dest type is defined via
the instruction type.

Blob generates such conv's for piglit's tests/cl/program/execute/mad-mix.cl

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30797>
2024-08-30 01:53:18 +00:00
Yinjie Yao
7f1c0fbe61 radeonsi/vcn: Rename transform_skip_disabled and remove hardcoded value for VCN5
This fix the HEVC encode corruption caused by mismatch between PPS
header and IB setting, the fix only apply for VCN5.
Rename from transform_skip_dicarded to transform_skip_disabled.

Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30930>
2024-08-30 01:17:22 +00:00
Valentine Burley
419885e280 tu: Simplify VK_EXT_sample_locations SampleCounts assignment
Use the existing sample_counts variable.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30730>
2024-08-30 00:34:54 +00:00
Valentine Burley
98d52cf292 tu: Fix VK_EXT_extended_dynamic_state3 feature
Don't claim to support extendedDynamicState3SampleLocationsEnable on pre-A650 GPUs,
which can't advertise VK_EXT_sample_locations.

Fixes dEQP-VK.info.device_mandatory_features on A6xx Gen 1 and Gen 2.

Fixes: 84726da2f4 ("tu: Implement extendedDynamicState3SampleLocationsEnable")
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30730>
2024-08-30 00:34:54 +00:00
Connor Abbott
630d6d1f2e tu: Add a750 flush workaround and re-enable UBWC for storage images
This is closer to what the blob does.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30896>
2024-08-29 23:52:00 +00:00
Faith Ekstrand
4442d61b16 nvk: Advertise VK_KHR_maintenance7
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30925>
2024-08-29 23:33:05 +00:00
Eric Engestrom
ace54cc998 zink+turnip/ci: fix .zink-turnip-valve-manual-rules
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30929>
2024-08-29 22:59:49 +00:00
Ian Romanick
f11a414645 nir/algebraic: Remove incorrect bfi of iand pattern
The comment says, "This expands to (b & 3) & ~0xc which is (b & 3) &
3." This is not correct. ~0xc is actually 0xfffffff3.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: #11695
Fixes: 1c7e35d4e0 ("nir/algebraic: Optimize some bit operation nonsense observed in some shaders")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30913>
2024-08-29 22:21:55 +00:00
Vignesh Raman
0b010b357d ci: use v6.11-rc5 kernel for Mali V10 testing
Set FORCE_KERNEL_TAG to v6.11-rc5 for Mali V10 testing.

Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30883>
2024-08-29 20:45:56 +00:00
Vignesh Raman
0d90f48b4f ci: enable Mali V10 testing
Panthor support was added in Linux 6.10 and Panfrost V10
in Mesa, enabling new GPUs on Rockchip's RK3588. Add CI
jobs to test Mali V10.

Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30883>
2024-08-29 20:45:56 +00:00
Lionel Landwerlin
91b3ae71d7 iris: fix utrace compute end timestamp reads on Gfx20
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30923>
2024-08-29 20:10:12 +00:00
Lionel Landwerlin
14d772d678 anv: fix utrace compute timestamp reads on Gfx20
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30923>
2024-08-29 20:10:11 +00:00
Eric Engestrom
08ecfe8fa4 zink+nvk/ci: mark a ton of tests as fixed
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30928>
2024-08-29 19:21:52 +00:00
Eric Engestrom
f668b17da6 zink+nvk/ci: bump zink-nvk-ga106-valve timeout as more tests are being run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30928>
2024-08-29 19:21:52 +00:00
Eric Engestrom
0cbd5bbb47 venus/ci: add flake and skip timing out test
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30928>
2024-08-29 19:21:52 +00:00
Eric Engestrom
71dbe29537 venus/ci: drop redundant flakes definitions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30928>
2024-08-29 19:21:52 +00:00
Eric Engestrom
66bae75d47 radeonsi/ci: mark a bunch of subgroups tests as failing
Fixes: 48a49c4e04 ("radeonsi: enable KHR_shader_subgroup")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30928>
2024-08-29 19:21:52 +00:00
Eric Engestrom
f05887a359 freedreno/ci: fix test timeout for a306_piglit
Two issues:
1. this is a baremetal/fastboot job, not a lava job, so JOB_TIMEOUT does
   nothing and TEST_PHASE_TIMEOUT_MINUTES was erroneously removed
   instead.
2. the test timeout needs to be smaller than the job timeout, otherwise
   it can't do anything. 5min is the margin almost every job uses, so
   let's use that.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30928>
2024-08-29 19:21:52 +00:00
Eric Engestrom
8bfd2c083e Revert "freedreno/ci: drop TEST_PHASE_TIMEOUT_MINUTES that match the default value"
This reverts commit 71787885e3.

The last version of the MR, the one that got merged, dropped the
bm/fastboot changes as they were causing issues; I should have dropped
this commit too.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30928>
2024-08-29 19:21:52 +00:00
Eric Engestrom
39d6251141 radeonsi/ci: bump timeout for nightly job glcts-vangogh-valve
It now takes ~45min (likely due to new failures being retried, itself
caused by KHR_shader_subgroup being added), but let's bump the timeout
a bit higher to avoid having to do this again soon.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30928>
2024-08-29 19:21:52 +00:00
Eric Engestrom
8ffb44a633 nvk/ci: mark -dEQP-VK.drm_format_modifiers.export_import* as fixed
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30928>
2024-08-29 19:21:52 +00:00
Job Noorman
b967677d4e ir3/postsched: take WAR ss-delay into account
Waiting for WAR hazards needs (ss) just like waiting for ss-producers.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30611>
2024-08-29 18:44:14 +00:00
Job Noorman
bb13f30db2 ir3: add is_war_hazard_producer helper
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30611>
2024-08-29 18:44:14 +00:00
Job Noorman
ce5c0c21c4 ir3/legalize: don't add (ss) for WAR hazards synced with (sy)
(ss) can be used to resolve all tex/sfu/mem WAR hazards. However, when
the reader is a sy-producer, they can also be resolved using (sy). Track
those cases separately and make sure we don't add (ss) when the reader
has already been synced using (sy).

For example, take a sequence like this:
sam rd, rs, ...
(sy)...
(ss)write rs

Before this commit, we would add the (ss) to resolve the WAR hazard
between the consumer (sam) and the writer of rs. However, the consumer
of rs has already been synced using (sy) so has definitely consumed rs.
This commit ensures the unnecessary (ss) for the write is not added
anymore.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30611>
2024-08-29 18:44:14 +00:00
Job Noorman
6a19274e3d ir3/legalize: add needs_ss_war helper
The condition was getting unwieldy and we will need to add more to it in
the next commit.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30611>
2024-08-29 18:44:14 +00:00
Job Noorman
6e16dc60a1 ir3: add assert to detect getting reg file of const/imm
ir3_reg_file_offset should only be called for actual registers, not for
const or immediate values. However, this did happens accidentally for
tracking WAR hazards in ir3_legalize. While that case has been fixed,
better to prevent such cases in the future.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30611>
2024-08-29 18:44:14 +00:00
Job Noorman
523a0e2e39 ir3/legalize: don't add WAR dependencies for const/imm regs
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30611>
2024-08-29 18:44:14 +00:00
Job Noorman
7cc24aa506 ir3: fix recognizing const/imm registers as a0
Fixes: 72bb4d79dc ("ir3/legalize: handle scalar ALU WAR hazards for a0.x")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30611>
2024-08-29 18:44:14 +00:00
Daniel Stone
43d65e0ff0 ci: Make per-build dependencies optional
Sometimes not all of the jobs execute. For instance, Windows build jobs
will not trigger on AMD-only MRs. Use the 'optional' keyword to ignore
jobs which don't exist in our pipeline.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: 310e3bb026 ("ci: do not start build-only jobs until the critical build-for-tests jobs are done")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30926>
2024-08-29 17:50:16 +00:00
Roland Scheidegger
9b717596b2 llvmpipe: Fix type mismatch when storing residency info
The storage allocated was always the same for both the ordinary texture
result data as well as the residency info. However, the former can be
float vector, whereas the latter is always int vector.
At least some llvm versions/builds will assert on this mismatch when
storing the data.
While here, also cut unnecessary zero initialization (lp_build_alloca()
already explicitly does this).

Fixes: 6168317b84 (lavapipe: Implement shaderResourceResidency)

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Brian Paul <brian.paul@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30878>
2024-08-29 16:53:45 +00:00
Valentine Burley
7fb7fa794c util: Remove Vulkan-only formats from get_plane_width/height
This reverts commit 3316bc3e88.

These formats were only used by RADV and are no longer needed as we can get the plane dimensions
from the YCbCr table.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30899>
2024-08-29 15:57:52 +00:00
Valentine Burley
1ae09c4e79 tu: Use vk_format_get_plane_count for tu6_plane_count
This change simplifies the code by avoiding special casing, making it easier to add support
for formats like P010 with minimal changes.

Inline it on one place where where the difference for VK_FORMAT_D32_SFLOAT_S8_UINT doesn't matter.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30899>
2024-08-29 15:57:51 +00:00
Valentine Burley
29d1cd6e8b tu: Use vk_format_get_plane_width/height to get the plane dimensions
This change simplifies the code by avoiding special casing, making it easier to add support
for formats like P010 with minimal changes.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30899>
2024-08-29 15:57:51 +00:00