add a new attribute, 'disable_ccs_repack' to gen_device info, which
indicates whether repacking of components in certain pixel formats
before compression needs to be disabled to keep the compatibility
with decompression capability of display controller (gen11+)
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
correct bit fields information of CACHE_MODE_0 reg in current gen11.xml
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
The "demote" intrinsic works like "discard" but don't change the
control flow, allowing derivative operations to work. This is the
semantics of D3D discard.
The "is_helper_invocation" intrinsic will return true for helper
invocations -- both the ones that started as helpers and the ones that
where demoted. This is needed to avoid changing the behavior of
gl_HelperInvocation which is an input (so not expected to change
during shader execution).
v2: Emit the discard jump and comment why it is safe. (Jason)
Rework the is_helper_invocation() that was stomping f0.1. (Jason)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
From SPV_EXT_demote_to_helper_invocation. Demote will be implemented
as a variant of discard, so mark uses_discard if it is used.
v2: Add CAN_ELIMINATE flag to the new intrinsic. (Jason)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This is simpler than radv, since the driver_location is already assigned
for us.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
We always get gl_FragCoord as a system value, not a varying, so this is
never hit. We already set PIXEL_CENTER_INTEGER elsewhere.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This is nice to have with radeonsi, where color varyings are handled
specially to avoid recompiles.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
We have to add a few lowering to deal with things that used to be dealt
with inline when creating inputs. We also move the code that fills out
the radv_shader_variant_info struct for linking purposes to
radv_shader.c, as it's no longer tied to the NIR->LLVM lowering.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Pretty much every driver using nir_lower_io_to_temporaries followed by
nir_lower_io is going to want this. In particular, radv and radeonsi in
the next commits.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
These weren't properly supported. This does pretty much the same thing
that the radv code did.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Right now nir_copy_prop_vars is effectively undoing
nir_lower_io_to_temporaries for inputs by propagating the original
variable through the copy created in lower_io_to_temporaries. A
theoretical variable coalescing pass would have the same issue with
output variables, although that doesn't exist yet. To fix this, add a
new bit to nir_variable, and disable copy propagation when it's set.
This doesn't seem to affect any drivers now, probably since since no one
uses lower_io_to_temporaries for inputs as well as copy_prop_vars, but
it will fix radv once we flip on lower_io_to_temporaries for fs inputs.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
It was double-counting cases where multiple variables were assigned to
the same slot, and not handling the case where the last variable is a
compact variable.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
These are used in Vulkan for clip/cull distances, instead of the GLSL
lowering when the clip/cull arrays are shared.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
It isn't really doing anything Gallium-specific, and it's needed for
handling component packing, overlapping, etc.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
load_fragcoord is already handled in common code for radeonsi, so we
don't need to do anything to handle it. However, there were some passes
creating NIR with the varying, so we switch them over to the sysval. In
the case of nir_lower_input_attachments which is used by both radv and
anv, we add handling for both until intel switches to using a sysval.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
On AMD, FragCoord should be a sysval because it is handled separately
from all the other inputs. We were already doing this in radeonsi, but
we weren't doing it with radv. It'll be much more annoying to handle
VARYING_SLOT_POS in fragment shaders when we let NIR lower FS inputs for
us, so here we add an option so that radv can get it as a system value.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This was done for all previous GPUs.
This fixes Talos Principle launch hangs.
Fixes: 7e43022e8c (radv/gfx10: add gfx10_cs_emit_cache_flush)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
GFX10 has two rings, so UMR want to know which one to halt.
Select the first one by default.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
In merged shaders we put a big if around each shader, so both stages
can have a different number of threads. However, the NGG output code
still needs to run if the first shader is not executed.
This can happen when there are more gs threads than vs/es threads, or
when there are 0 es/vs threads (why? no clue).
Fixes: ee21bd7440 "radv/gfx10: implement NGG support (VS only)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
It was reported as unsupported previously. It should be importable
and is compatible with itself.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Fixes: 69cc6272fb ("anv: Implement VK_EXT_external_memory_host")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This needs to be cleaned up a bit, and it probably contains
missing stuff and/or bugs.
This doesn't fix the "half of the triangles" issue.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Each gfx10 shader engine corresponds to two gfx9 shader engines, so scale
the number of offchip buffers accordingly.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The cache flush logic on GFX10 is quite different and it's
implemented with a new function.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>