Commit Graph

184338 Commits

Author SHA1 Message Date
Connor Abbott
6ad0cbafe8 ir3: Set branchstack earlier
We were relying on it in RA to tell us whether we could give more
registers to the shader mostly "for free" (because occupancy is bounded
by the branchstack), but it turns out it was actually 0 so we weren't
taking advantage of it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22072>
2024-02-02 17:39:35 +00:00
Connor Abbott
fa22b0901a ir3/ra: Add specialized shared register RA/spilling
There are two problems with shared register allocation at the moment:

1. We weren't modelling physical edges correctly, and once we do, the
   current hack in RA for handling them won't work correctly. This means
   live-range splitting doesn't work. I've tried various strategies but
   none of them seems to fix this.
2. Spilling of shared registers to non-shared registers isn't
   implemented.

Spilling of shared regs is significantly simpler than spilling
non-shared regs, because (1) spilling and unspilling is significantly
cheaper, just a single mov, and (2) we can swap "stack slots" (actually
non-shared regs) so all the complexity of parallel copy handling isn't
necessary. This means that it's much easier to integrate RA and
spilling, while still using the tree-scan framework, so that we can
spill instead of splitting live ranges. The other issue, of phi nodes
with physical edges, we can handle by spilling those phis earlier. For
this to work, we need to accurately insert physical edges based on
divergence analysis or else every phi node would involve physical edges,
which later commits will accomplish.

This commit adds a shared register allocation pass which is a
severely-cut-down version of RA and spilling. Everything to do with live
range splitting is cut from RA, as is everything to do with parallel copy
handling, and for spilling we simply always spill as soon as we
encounter a case where it's necessary. This could be improved,
especially the spilling strategy, but for now it keeps the pass simple
and cuts down on code duplication. Unfortunately there's still some
shared boilerplate with regular RA, which seems unavoidable.

The new RA requires us to redo liveness information, which is
significantly expensive, so we keep the ability of the old RA to handle
shared registers and only use the new RA when it may be required: either
something potentially requiring live-range splitting, or a too-high
shared register limit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22072>
2024-02-02 17:39:34 +00:00
Samuel Pitoiset
f977501a7c radv: do not allow to enable VK_EXT_shader_object with LLVM
This isn't expected to work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27415>
2024-02-02 17:14:56 +00:00
Konstantin Seurer
c925b6019d radv/rt: Lower ray payloads like hit attribs
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27051>
2024-02-02 16:36:15 +00:00
Erik Faye-Lund
4f0c33196c mesa: fix error-handling for ETC2/RGTC textures
It seems we missed an error-case that got introduced in OpenGL 4.4.

While this error doesn't *technically* exist as-is in OpenGL ES before
version 3, neither do 3D textures. And while OES_texture_3D introduces
them to OpenGL ES 2.0 without adding the same error for ETC2 textures,
that is likely an omission in the spec; 3D ETC2 textures were never a
thing.

This fixes a regression in the confidential Khronos CTS, specifically
GL46.gtf42.GL3Tests.texture_storage.texture_storage_compressed_texture_data

Fixes: 652a898d316 ("mesa/main: add support for EXT_texture_storage")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10545
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Tested-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27428>
2024-02-02 16:06:19 +00:00
Eric Engestrom
5d293f01cc ci_run_n_monitor: avoid spamming a ton of "new status: created" for all the jobs at the beginning
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27419>
2024-02-02 15:25:22 +00:00
Eric Engestrom
6250885640 panfrost: fix UB caused by shifting signed int too far
Fixes: 13d7ca1300 ("pan/va: Optimize add with imm to ADD_IMM")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27437>
2024-02-02 14:56:20 +00:00
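
For readers unfamiliar with this class of bug, here is a minimal sketch of the general pattern, not the actual panfrost change. In C, `1 << 31` shifts a signed `int` into its sign bit, which is undefined behaviour; using an unsigned 32-bit operand keeps every shift in the range 0..31 well defined. The helper name `bit_mask_u32` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return a 32-bit mask with only bit `n` set.
 * UINT32_C(1) makes the shifted operand unsigned, so n == 31 is fine,
 * whereas (1 << 31) on a signed int would be undefined behaviour. */
static inline uint32_t bit_mask_u32(unsigned n)
{
   assert(n < 32); /* shifting by >= the type width is also UB */
   return UINT32_C(1) << n;
}
```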
Mike Blumenkrantz
2085d60438 zink: run sparse lowering after all optimization passes
some passes (e.g., opt_shrink_vector) operate on the assumption that
sparse tex ops have a certain number of components and then remove components
and unset the sparse flag if they can optimize out the sparse usage

zink's sparse ops do not have the standard number of components, which
causes such passes to make incorrect assumptions and tag them as
not being sparse, which breaks everything

fix #10540

Fixes: 0d652c0c8d ("zink: shrink vectors during optimization")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27414>
2024-02-02 14:36:25 +00:00
Mike Blumenkrantz
6a8cd7a64f zink: move sparse lowering up in file
no functional changes

Fixes: 0d652c0c8d ("zink: shrink vectors during optimization")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27414>
2024-02-02 14:36:25 +00:00
Mike Blumenkrantz
aacc4e1c68 zink: zero allocate resident_defs array in ntv
this makes assert(def!=0) more reliable

Fixes: 73ef54e342 ("zink: handle residency return value from sparse texture instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27414>
2024-02-02 14:36:25 +00:00
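
A minimal sketch of why zero allocation helps here, assuming an id of 0 means "never assigned". The type `spv_id` and both function names are hypothetical and are not taken from zink's ntv code. With `malloc`, untouched slots hold arbitrary values and may slip past the assert; with `calloc`, they are always 0.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t spv_id; /* hypothetical: 0 means "no id assigned yet" */

/* calloc zero-fills the array, so every slot starts out as "unset". */
static spv_id *alloc_resident_defs(size_t count)
{
   return calloc(count, sizeof(spv_id));
}

/* Look up a slot; the assert now fires deterministically for any slot
 * that was never written, instead of depending on heap garbage. */
static spv_id get_resident_def(const spv_id *defs, size_t index)
{
   spv_id def = defs[index];
   assert(def != 0);
   return def;
}
```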
Mike Blumenkrantz
3b025d6b42 zink: fix sparse bo placement
the util function here takes a bitmask of memory type indices, not properties.
rename the function and correct the usage

fixes sparse on nvidia blob

Fixes: c71287e70c ("zink: correct sparse bo mem_type_idx placement")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27414>
2024-02-02 14:36:25 +00:00
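
The distinction between a bitmask of memory type indices and a bitmask of memory properties can be sketched as follows. This is an illustration only: `struct mem_type` and `mem_type_index_mask` are hypothetical stand-ins, not the zink utility function the commit renames.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a device memory type: just its property flags. */
struct mem_type {
   uint32_t property_flags;
};

/* Build a bitmask in which bit i is set when memory type i has every
 * property in `required`. The result indexes memory types; it is not
 * itself a set of property flags, so the two must not be mixed up. */
static uint32_t mem_type_index_mask(const struct mem_type *types,
                                    unsigned count, uint32_t required)
{
   uint32_t mask = 0;
   assert(count <= 32);
   for (unsigned i = 0; i < count; i++) {
      if ((types[i].property_flags & required) == required)
         mask |= UINT32_C(1) << i;
   }
   return mask;
}
```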
Konstantin Seurer
bb14ee53a5 radv/sqtt: Handle ray tracing pipelines with no traversal shader
Fixes: 0f87d40 ("radv/rt: Skip compiling a traversal shader")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27383>
2024-02-02 14:14:16 +01:00
Samuel Pitoiset
0aa9afa8e1 radv: add support for emitting VS+TCS compiled separately on GFX9+
With a VS prolog, we end up with 3 long jumps (VS prolog->VS->TCS->TCS
epilog), super annoying.

The shaders config must also be combined between VS and TCS.

This is for VK_EXT_shader_object.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27336>
2024-02-02 10:16:59 +01:00
Samuel Pitoiset
397a08b407 radv: always emit PGM_RSRC1_HS when emitting the TCS epilog state
This will simplify upcoming changes and it doesn't matter much because
this is for ESO only.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27336>
2024-02-02 10:16:59 +01:00
Samuel Pitoiset
542b9aaf18 radv: force TCS stage for VS as LS compiled separately on GFX9+
When VS as LS is compiled separately on GFX9+, the stage/previous_stage
must be VERTEX/TESS_CTRL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27336>
2024-02-02 10:16:59 +01:00
Samuel Pitoiset
3d5d163693 radv: always mark drawid/base_instance used with ESO
The user SGPR is always declared for merged shaders compiled separately
because the args must match.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27336>
2024-02-02 10:16:59 +01:00
Samuel Pitoiset
3c23ae8547 radv: rework shader arguments for separate compilation of VS+TCS on GFX9+
When VS or TCS are compiled separately on GFX9+, the shader input args
must match. This is implemented using a completely separate path. It's
duplicated, but it seems cleaner than adding a ton of checks here and
there.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27336>
2024-02-02 10:16:59 +01:00
Samuel Pitoiset
1e55d91c82 radv: only merge shader info stages if both stages exist on GFX9+
With shader objects, both stages might not exist and if the src stage
doesn't, this will copy garbage data because it's uninitialized.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27336>
2024-02-02 10:16:59 +01:00
Samuel Pitoiset
0018faf384 radv: check active NIR stages before trying to merge shaders on GFX9+
For shader object.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27336>
2024-02-02 10:16:59 +01:00
Samuel Pitoiset
1fe8770bbe radv: constify radv_device in radv_emit_shader_pointer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27336>
2024-02-02 10:16:59 +01:00
Samuel Pitoiset
3b2452da3c radv: set the default workgroup size for VS as LS
This will be optimized during shader info linking if TCS is present.
The main motivation for this change is ESO because the next stage
might not exist.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27396>
2024-02-02 08:53:20 +00:00
Samuel Pitoiset
2a58bbbed8 radv: determine the workgroup size for TCS earlier
This can be done before linking shader info pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27396>
2024-02-02 08:53:20 +00:00
Samuel Pitoiset
c6ca7fcc25 radv: remove radv_graphics_state_key::dynamic_patch_control_points
When the state isn't dynamic, the patch control points value must be
greater than 0. Having a separate field isn't necessary.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27396>
2024-02-02 08:53:20 +00:00
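
One way to read the reasoning above: if a static value is always greater than 0, then 0 can act as the "dynamic" marker and a separate boolean is redundant. The sketch below assumes exactly that convention; `struct gfx_state_key` and the helper are hypothetical and much simpler than the real radv structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the graphics state key. */
struct gfx_state_key {
   uint8_t patch_control_points; /* assumed: 0 means "set dynamically" */
};

/* The dynamic flag is derived from the value instead of stored separately. */
static bool patch_control_points_are_dynamic(const struct gfx_state_key *key)
{
   return key->patch_control_points == 0;
}
```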
Blisto
3bc6f95e3d driconf: set vk_x11_strict_image_count for Atlas Fallen Vulkan
Prevents crash with vsync turned off on xwayland.

Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27122>
2024-02-02 07:48:22 +00:00
Mike Blumenkrantz
7b7a581a52 zink: prune dmabuf export tracking when adding resource binds
this avoids invalid access for the stack resource in add_resource_bind()
when adding a new bind to an exportable resource

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27395>
2024-02-02 03:51:52 +00:00
Dave Airlie
60d2ea83e8 vulkan/video: add AV1 decode support to common code
This adds the av1 decode parameters handling.

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27423>
2024-02-02 03:18:52 +00:00
Ian Romanick
68da9e4dff intel/compiler/xe2: Set SIMD mode for sampler messages
Since SIMD8 no longer exists, the SIMD modes enums have different names
and different values.

v2 (Francisco Jerez): Rebase on 07b9bfacc7 ("intel/compiler: Move
logical-send lowering to a separate file").

v3: Update brw_disasm.c with SIMD descriptions.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>
2024-02-02 02:39:10 +00:00
Ian Romanick
84de7a88d3 intel/compiler/xe2: Emit texture instructions w/ combined LOD and array index
The extra assertions are just there to help validate
pack_lod_and_array_index (in nir_lower_tex.c).

v2: Split got_lod_or_bias into two variables. This simplifies some
changes that Sagar is working on. Suggested by Sagar.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>
2024-02-02 02:39:10 +00:00
Ian Romanick
c8ba2bc2f0 nir: Pack texture LOD and array index to a single 32-bit value
v2: Fix clamped_ai calculation in nir_lower_tex.c. Add
nir_tex_src_combined_lod_and_array_index_intel to
print_tex_instr. Suggested by Sagar.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>
2024-02-02 02:39:10 +00:00
Ian Romanick
78e7f7b377 intel/compiler/xe2: Use new sample_*_mlod messages
Note: a future commit will expand the sampler message type to the 6 bits
used on Xe2.

v2 (Francisco Jerez): Rebase on 07b9bfacc7 ("intel/compiler: Move
logical-send lowering to a separate file").

v3: Drop XE2_SAMPLER_MESSAGE_SAMPLE_BIAS_MLOD as it does not actually
exist. This resulted in some bigger changes in brw_disasm.c. Noticed
by Sagar.

v4: Now that XE2_SAMPLER_MESSAGE_SAMPLE_MLODc conflicts with
GFX7_SAMPLER_MESSAGE_SAMPLE_GATHER4_PO_C, the determination of
min_lod_is_first must include devinfo->ver or previous platforms will
break.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>
2024-02-02 02:39:09 +00:00
Sagar Ghuge
8690a6b546 intel/compiler/xe2: Handle 6-bit message type for Gfx20+
Message types are now expanded to a 6-bit encoding. 5 bits are still the
same field from the Sampler Message Descriptor. The most significant bit
is now bit 31 of the Sampler Message Descriptor. The messages that have
'1 in bit 6 exist only to support programmable offsets, and those
require a message header. If a sampler type shows only a 5-bit encoding,
bit 6 is implied to be 0 and no header is required.

v2 (idr): Trivial formatting changes.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>
2024-02-02 02:39:09 +00:00
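
The split encoding described above can be sketched as follows. Only the placement of the most significant bit at descriptor bit 31 comes from the commit message; the position of the 5-bit field (`SAMPLER_MSG_TYPE_SHIFT`) is an assumption made for illustration, and all names are hypothetical rather than taken from the Intel compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SAMPLER_MSG_TYPE_SHIFT 12   /* assumption: location of the 5-bit field */
#define SAMPLER_MSG_TYPE_MSB_BIT 31 /* from the commit message */

/* Scatter a 6-bit message type into the descriptor: low 5 bits go into
 * the existing field, the top bit goes to descriptor bit 31. */
static uint32_t encode_msg_type(uint32_t desc, unsigned msg_type)
{
   assert(msg_type < 64);
   desc |= (uint32_t)(msg_type & 0x1f) << SAMPLER_MSG_TYPE_SHIFT;
   desc |= (uint32_t)(msg_type >> 5) << SAMPLER_MSG_TYPE_MSB_BIT;
   return desc;
}

/* Gather the 6-bit message type back out of the descriptor. */
static unsigned decode_msg_type(uint32_t desc)
{
   unsigned low = (desc >> SAMPLER_MSG_TYPE_SHIFT) & 0x1f;
   unsigned msb = desc >> SAMPLER_MSG_TYPE_MSB_BIT;
   return low | (msb << 5);
}

/* Per the commit message, types with the top bit set need a header. */
static bool msg_type_needs_header(unsigned msg_type)
{
   return (msg_type & 0x20) != 0;
}
```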
Ian Romanick
a9ed9cf88b intel/fs: Move opcode modification before the switch that emits srcs
This small refactor simplifies a later commit that will optionally emit
some opcodes before the switch (as is already done with the shadow
comparator).

v2 (Francisco Jerez): Rebase on 07b9bfacc7 ("intel/compiler: Move
logical-send lowering to a separate file").

v3 (Jordan): SHADER_OPCODE_TXL => SHADER_OPCODE_TXL_LZ (was
SHADER_OPCODE_TXF_LZ).

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>
2024-02-02 02:39:09 +00:00
Ian Romanick
7441af803f intel/compiler/xe2: Update get_sampler_lowered_simd_width
The Bspec also says, "The table below describes the SIMD modes which
are supported. SIMD32 and SIMD64 are used for media-type operations
only."  Perhaps this commit should just add

    if (devinfo->ver >= 20)
        return 16;

instead.

v2: Use reg_unit in get_sampler_lowered_simd_width. Suggested by Sagar.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>
2024-02-02 02:39:09 +00:00
Mike Blumenkrantz
24a7f6cd16 zink: add a tu flake
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27422>
2024-02-02 02:23:02 +00:00
Dave Airlie
59fb425e1c vulkan: update registry/includes to 1.3.277
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27421>
2024-02-02 01:46:24 +00:00
Jesse Natalie
559f31e202 dzn: Use blits for all non-averaging resolves
Trying to do min/max resolves on depth/stencil is failing for me on
hardware, so just simplify things and always use a manual resolve for
modes that aren't average.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27348>
2024-02-02 01:19:52 +00:00
Jesse Natalie
70fa127c97 dzn: Use correct format for depth/stencil resolves
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27348>
2024-02-02 01:19:52 +00:00
Jesse Natalie
973c5bd047 dzn: Don't resolve for RESOLVE_MODE_NONE
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27348>
2024-02-02 01:19:52 +00:00
Jesse Natalie
dd7cfd5255 dzn: Add a debug flag for forcing off native view instancing
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27348>
2024-02-02 01:19:52 +00:00
Jesse Natalie
a85e8058cb dzn: Support non-static samplers for meta
Some hardware that doesn't support true static samplers emulates them
by copying all static samplers into a reserved portion of every descriptor
heap. To support Vulkan's required 4000 live sampler limit in bindless
mode, D3D is now able to create descriptor heaps which do not have a reserved
portion. Any descriptor heaps above the MaxSamplerDescriptorHeapSizeWithStaticSamplers
limit will not have that reserved portion and cannot be used with static samplers.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27348>
2024-02-02 01:19:51 +00:00
Jesse Natalie
c286c01136 dzn: Add barrier to copy source for DispatchIndirect copies
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27348>
2024-02-02 01:19:51 +00:00
Jesse Natalie
581a23c0cc dzn: Add missing handling of VK_PIPELINE_STAGE_2_DRAW_INDIRECT_BIT
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27348>
2024-02-02 01:19:51 +00:00
Jesse Natalie
60aad6ef07 spirv2dxil: Lower the Vulkan memory model and coherent loads/stores
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27348>
2024-02-02 01:19:51 +00:00
Jesse Natalie
003d2da2dc microsoft/compiler: Add a pass for promoting ACCESS_COHERENT on loads/stores
DXIL doesn't have instruction-level coherency. We have 3 options:
1. Promote the instruction to an atomic instruction. We can only do this
   for 32-bit or 64-bit ops.
2. If using bindless, declare the local resource declaration as globally-coherent.
3. If not using bindless, add globally-coherent to the global resource declaration.

This pass does all 3 of these, stopping at the intrinsic level for supported types
of atomics, otherwise assigning to the global resource declaration, which will be
unused if we're doing bindless, where instead we'll get it from the instruction.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27348>
2024-02-02 01:19:51 +00:00
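
The three options above can be read as a per-access decision. The sketch below is a simplified model of that decision under the stated rules (atomics only for 32-bit or 64-bit ops, otherwise mark a resource declaration depending on bindless). It is not the actual pass, and the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three options listed in the commit message. */
enum coherent_strategy {
   COHERENT_PROMOTE_TO_ATOMIC,    /* option 1 */
   COHERENT_MARK_LOCAL_RESOURCE,  /* option 2: bindless */
   COHERENT_MARK_GLOBAL_RESOURCE, /* option 3: not bindless */
};

/* Pick a strategy for one coherent load/store of the given bit size. */
static enum coherent_strategy
choose_coherent_strategy(unsigned bit_size, bool bindless)
{
   /* Atomic promotion is only possible for 32-bit or 64-bit ops. */
   if (bit_size == 32 || bit_size == 64)
      return COHERENT_PROMOTE_TO_ATOMIC;

   /* Otherwise fall back to marking a resource declaration coherent. */
   return bindless ? COHERENT_MARK_LOCAL_RESOURCE
                   : COHERENT_MARK_GLOBAL_RESOURCE;
}
```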
Jesse Natalie
b74cd405d3 microsoft/compiler: Respect ACCESS_COHERENT in UAV variable data
DXIL has a globally-coherent field for UAVs. When emitting UAV metadata
based on a resource variable, respect the relevant bit in the var data.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5628
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27348>
2024-02-02 01:19:51 +00:00
Ian Romanick
118e0bdc1f intel/rt: Don't directly generate umul_32x16
The optimization pass will (eventually) turn the imul into a
umul_32x16. In many cases, the multiply will be converted to something
else.

I also tried cloning a bunch of existing imul algebraic patterns for
[iu]mul_32x16. This produced the same result, but it was a lot more
churn.

All of the shaders affected were ray tracing shaders in Q2RTX. This is
the only ray tracing workload in my fossil-db.

DG2
Totals:
Instrs: 191995626 -> 191995079 (-0.00%); split: -0.00%, +0.00%
Cycles: 14003803561 -> 14003798040 (-0.00%); split: -0.00%, +0.00%
Spill count: 108320 -> 108288 (-0.03%)
Fill count: 200695 -> 200663 (-0.02%)
Scratch Memory Size: 8755200 -> 8754176 (-0.01%)

Totals from 7 (0.00% of 652118) affected shaders:
Instrs: 14998 -> 14451 (-3.65%); split: -3.94%, +0.29%
Cycles: 137222 -> 131701 (-4.02%); split: -4.10%, +0.07%
Spill count: 32 -> 0 (-inf%)
Fill count: 32 -> 0 (-inf%)
Scratch Memory Size: 19456 -> 18432 (-5.26%)

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27161>
2024-02-02 00:02:05 +00:00
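
As a reference model for why the two opcodes are interchangeable in some cases, the sketch below assumes that `umul_32x16` multiplies a 32-bit value by the low 16 bits of its second source. Under that assumption it equals a plain 32-bit multiply whenever the second operand already fits in 16 bits. The function names are hypothetical C models, not NIR builder calls.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a full 32-bit multiply (low 32 bits of the product). */
static uint32_t model_imul(uint32_t a, uint32_t b)
{
   return a * b;
}

/* Model of a 32x16 multiply: only the low 16 bits of b participate. */
static uint32_t model_umul_32x16(uint32_t a, uint32_t b)
{
   return a * (uint32_t)(uint16_t)b;
}
```

When `b` is known to be below 65536 both models agree, which is the kind of fact an optimization pass can prove before choosing the cheaper opcode.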
Timothy Arceri
bc0178af57 glsl: don't tree graft globals
As per this optimisation's description:

"Takes assignments to variables that are dereferenced only
once and pastes the RHS expression into where the variable
is dereferenced."

However the optimisation is run at compile time before multiple
shaders from the same stage could have been pasted together.
So this optimisation can incorrectly assume a global is only
referenced once since it cannot see the other pieces of the
shader stage until link time.

Here we skip the optimisation if the variable is a global. We
could change it to only run at link time. However, this
optimisation is only run at link time if we are being forced
to use GLSL IR to inline a function that glsl to nir cannot
handle, and this will also be removed in a future patchset.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10482
Fixes: d75a36a9ee ("glsl: remove do_copy_propagation_elements() optimisation pass")

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27351>
2024-02-01 23:15:24 +00:00
Eric Engestrom
98197e15cc ci: explain purpose of the word after the date in image tags
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27379>
2024-02-01 22:10:09 +00:00
Eric Engestrom
b6d70eb099 ci: reduce maximum image tags length from 30 to 20
To keep a margin in case we need to add something more in the future.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27379>
2024-02-01 22:10:09 +00:00
Eric Engestrom
b6fceeaa9f ci: enforce maximum image tag length
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27379>
2024-02-01 22:10:09 +00:00