We cannot find anything in the Vulkan spec requiring this. D3D12 [1]
says it's undefined as long as it doesn't crash the OS :
"Out of bounds indexing of any descriptor table from the shader
results in a largely undefined memory access, including the
possibility of reading arbitrary in-process memory as if it is a
hardware state descriptor and living with the consequence of what
the hardware does with that. This could produce a device reset, but
will not crash Windows."
[1] : https://learn.microsoft.com/en-us/windows/win32/direct3d12/advanced-use-of-descriptor-tables#out-of-bounds-indexing
Found 2 titles affected by this change
Some pretty good results on Cyberpunk 2077 :
Totals from 10285 (100.00% of 10285) affected shaders:
Instrs: 7638709 -> 7517360 (-1.59%); split: -1.64%, +0.05%
Cycles: 148047414 -> 148470916 (+0.29%); split: -0.83%, +1.12%
Subgroup size: 112544 -> 112576 (+0.03%); split: +0.04%, -0.01%
Spill count: 98 -> 90 (-8.16%)
Fill count: 90 -> 82 (-8.89%)
Max live registers: 495274 -> 479502 (-3.18%); split: -3.21%, +0.03%
Max dispatch width: 87824 -> 91168 (+3.81%); split: +4.10%, -0.29%
Gaining 297 shaders in SIMD16/32, loosing 16 SIMD32 shaders
Some not so good results on Strange Brigade :
Totals from 4027 (100.00% of 4027) affected shaders:
Instrs: 2080355 -> 2013880 (-3.20%); split: -3.20%, +0.01%
Cycles: 25405149 -> 25170579 (-0.92%); split: -1.37%, +0.45%
Max live registers: 167303 -> 168958 (+0.99%)
Max dispatch width: 33264 -> 32496 (-2.31%)
Loosing 96 SIMD16 shaders.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17545>
Currently we put the upload and cmd bos into GART, however in the
future this might change, but for conditional rendering we must have
a GART space to read the value from. This creates a separate buffer
allocations that are gart forced. This will be used to provide
cond render with a gart location.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24520>
Structs are not allowed to contain an image in regular glsl. The only time
they are intended to be allowed to be declared in a struct is when
they are bindless.
Unfortunately the bindless spec does not meantion this behaviour
explicitly so there is no spec quote to reference but you can see in
the original commit to allow them in mesa that spec clarification was
provided 48b7882200
The spec also states that certain uses are implicitly bindless as per
the following spec quote:
"When used as shader inputs, outputs, uniform block members,
or temporaries, the value of the sampler is a 64-bit unsigned
integer handle and never refers to a texture image unit."
Given images are not allowed in regular glsl for the above types
similair to being forbidden in structs, we can also assume
declarations in structs are implicitly bindless.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24269>
- bresenham and smooth lines
These two need to override multisample rasterization to get correct
results on CTS tests.
- stippled lines
The stipple factor needs to be remapped from [1, 256] to [0, 255].
-rectangular and strict lines
Rectangular lines need multisample rasterization rules to get correctly
rasterized even for one sample. That way we get strict lines too for
VK_LINE_RASTERIZATION_MODE_DEFAULT_EXT.
As per the DX rasterization rules:
Rasterization rules for primitives are, in general, unchanged by multisample antialiasing, except:
- For a triangle, a coverage test is performed for each sample location (not for a pixel center).
If more than one sample location is covered, a pixel shader runs once with attributes interpolated at the pixel center.
The result is stored (replicated) for each covered sample location in the pixel that passes the depth/stencil test.
- A line is treated as a rectangle made up of two triangles, with a line width of 1.4.
- For a point, a coverage test is performed for each sample location (not for a pixel center).
For single sample rasterization we get the same results for the
triangles and points, but for lines we get the rectangular form instead.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24517>
dEQP-GLES31.functional.image_load_store.buffer.image_size.writeonly_7
produces:
t7 opcode: CP_COMPUTE_CHECKPOINT (6e) (8 dwords)
{ ADDR_0_LO = 0x15000 }
{ ADDR_0_HI = 0x5 }
0x18
{ ADDR_1_LEN = 3 }
0xf
{ ADDR_1_LO = 0x2e010 }
{ ADDR_1_HI = 0x5 }
and it was asserting due to sizedwords==7. Without the assert, we were
dereffing a len past the end of the packet. This len value we were
loading is also suspiciously not the location of the ADDR_1_LEN field in
the packet's XML. But then, the command stream at ADDR_1 was clearly 0xf
long, and that puts ADDR_1_LEN at the spot we would expect compared to
SET_RENDER_MODE's ADDR_1.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24358>
A bunch of our piglit fails were due to failing to compile shaders due to
a lack of spilling support. I used a simple shader with a large local
array with tunable size to determine the MEMSIZEPERITEM increment and the
location of HWSTACKOFFSET (matching a3xx locations). Unfortunately
fibers_per_sp I had to guess by taking a big spilling shader and cranking
it up until it rendered correctly. The value I found made HWSTACKOFFSET's
shift value match a6xx's, as a bit of confirmation.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24358>
This now matches a6xx. This major border color flakiness in deqp -- when
a prior test in the caselist bound a VS and it didn't get unbound at the
gallium level, our FS border colors would be up at offset 8 instead of 0,
and the wrong padding would make FS sampler 0 get a junk border color.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24358>
When replaying a RT pipeline, RADEON_FLAG_REPLAYABLE should be set.
The idea is that for capture, RADEON_FLAG_REPLAYABLE should be passed
when allocating a BO (ie. replay_va would be 0), and then for replay
the VA would be non-zero but the flag is also required.
Fixes
dEQP-VK.ray_tracing_pipeline.pipeline_library.configurations.multithreaded_compilation.*.
Fixes: 744357477e ("radv: Add utilities to serialize and deserialize shader allocation info")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24543>