When the i-th target format is set, all previous target formats
must be non-zero to avoid hangs. In other words, without this
if a fragment shader exports mrt0, mrt2 and mrt3, the GPU hangs
because the target format of mrt1 is zero.
This fixes DXVK GPU hangs with "Seven: The Days Long Gone",
"GTA V" and probably more games.
Cc: "18.0" 18.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Otherwise on pre-GFX9, if the constant layout allows both TESS_EVAL and
GEOMETRY shaders, but the PIPELINE has only GEOMETRY, it would return the
GEOMETRY shader for the TESS_EVAL shader.
This would cause the flush_constants code to emit the GEOMETRY constants
to the TESS_EVAL registers and then conclude that it did not need to set
the GEOMETRY shader registers.
Fixes: dfff9fb6f8 "radv: Handle GFX9 merged shaders in radv_flush_constants()"
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
With GFX9 merged shaders, active_stages would be set to the original
stages specified if shaders were not cached, but to the stages still
present after merging if they were.
Be consistent and use the original stages.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "18.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Errors are not that common of a case so we can eat a slight perf
hit in having to call a function and do a runtime check.
In turn this makes debugging random errors happening for end users
easier, because they don't have to have a debug build on hand.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
When VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT is set we skip NIR
linking optimisations and only run over the NIR optimisation loop
once similar to the GLSLOptimizeConservatively constant used by
some GL drivers.
We need to run over the opts at least once to avoid errors in LLVM
(e.g. dead vars it can't handle) and also to reduce the time spent
compiling the IR in LLVM.
With this change the Blacksmith Unity demos compilation times
go from 329760 ms -> 299881 ms when using Wine and DXVK.
V2: add bit to radv_pipeline_key
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106246
Missed this on initial radeonsi port, we shouldn't use this value
on gfx9, but also in gfx8 only for when we have a geom shader.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
- remove mtypes.h from most header files
- add main/menums.h for often used definitions
- remove main/core.h
v2: fix radv build
Reviewed-by: Brian Paul <brianp@vmware.com>
Pretty straight forward, just pass the divisors through the shader
key and then do a LLVM divide.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
According to Marek, not enabling it on Stoney has a significant
negative performance impact. (And I guess this might impact
performance on Raven as well)
The register settings are pretty much copied from radeonsi. I did
not put this in the pipeline as that would make the pipeline more
dependent on the format which mean we would have to have more
pipelines for the meta shaders.
v2: Don't clear RB+ regs if not enabled as the CLEAR_STATE packet
does already.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Disabled by default for now, it can be enabled with
RADV_PERFTEST=outoforder.
No CTS regressions on Polaris, and all Vulkan games I tested
look good as well.
Expect small performance improvements for applications where
out-of-order rasterization can be enabled by the driver.
Loosely based on RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This extension removes the restrictions on minDepth/maxDepth,
minDepthBounds/maxDepthBounds and VkClearDepthStencilValue::depth.
The following CTS tests now pass:
dEQP-VK.glsl.builtin_var.fragdepth.line_list_d32_sfloat_large_depth
dEQP-VK.glsl.builtin_var.fragdepth.point_list_d32_sfloat_large_depth
dEQP-VK.glsl.builtin_var.fragdepth.triangle_list_d32_sfloat_large_depth
dEQP-VK.draw.inverted_depth_ranges.nodepthclamp_depth_range_unrestricted
dEQP-VK.draw.inverted_depth_ranges.depthclamp_depth_range_unrestricted
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The fragment shader was trying to read this, but nothing
was exporting it from the vertex shader. This handles
it like the prim id export.
Fixes:
dEQP-VK.multiview.secondary_cmd_buffer.*
dEQP-VK.multiview.index.fragment_shader.*
v1.1: updated to use 0x1 (Samuel)
Fixes: e3265c10c8 (radv: Implement multiview draws.)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This drops one of the geometry specific user sgprs,
we can work this out at compile time.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This moves the lds_size calcs into the shader so we have all
the size stuff in one file.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This drops the now unneeded scanning and results in favour
of the ones in the info.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Instead of recalculating the value, use the shader calculated value.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Until llvm handles indirects better we will need to use these
workarounds in the radeonsi backend also.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
NIR->LLVM should only be a translation pass, and all scan stuff
should be done before.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>