Similar to RadeonSI.
This fixes:
dEQP-VK.image.texel_view_compatible.graphic.basic.attachment_read.bc*r16g16b16a16_sfloat
dEQP-VK.image.extended_usage_bit.attachment_write.r16_sfloat
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
When allocate_user_sgprs() was called, ctx->stage was actually
unset and 0 is for the vertex shader. This doesn't change
anything for now because of the spill support thing.
Though, the number of user SGPRs has to be fixed for merged
shaders on GFX9. It was broken before anyway.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This might decrease VGPR spilling, because we no longer
have to use v4i32 for 2D fetches when level == 0. We now
use v2i32 for those cases.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Using 4, as it is the default value on mesa. See mesa/main/config.h
and the following commit that introduced the value:
15ac66e331
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
ARB_transform_feedback3 sets a minimum of 1, ARB_gpu_shader5 a minimum
of 4. It shouldn't matter too much, so choosing the later.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Used to handle how many ubo you can define on the context. Minimimum
defined as 36 on ARB_uniform_buffer_object spec, up to 84 on OpenGL
4.6 (12 per stage at each moment).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Every now and then I execute the standalone compiler, get the
non-version error, and need to remember what I'm doing wrong
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This allows us to pass the llvm param directly rather than looking
it up.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Copied from radeonsi.
Putting in the correct metadata flush commands for eventually not
flushing L2 on CB/DB switch.
Does not remove the need for V_028A90_CACHE_FLUSH_AND_INV_TS_EVENT
at the moment.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
These are just shaders reads, so we need to invalidate L1.
Fixes: 6dbb0eaccc "radv: handle subpass cache flushes"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
It makes more sense to move all scan stuff in the same place.
Also, we don't really need to duplicate the uses_primid field
for each stages.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Commit 2f421651ac ("egl: let each platform decided how to handle
LIBGL_ALWAYS_SOFTWARE") broke the build due to copy-n-paste of misnamed
function parameter.:
src/egl/drivers/dri2/platform_android.c:1183:8: error: use of undeclared identifier 'disp'
Rather than just fixing 'disp', rename the function parameter 'dpy' to
'disp' to align with the other EGL platforms' implementations.
Fixes: 2f421651ac ("egl: let each platform decided how to handle LIBGL_ALWAYS_SOFTWARE")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Some later code relies on _Layer to set first/last_layer. Make sure it's
always initialized.
Detected by valgrind's conditional jump/move with uninit value logic.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
If alpha-to-coverage is enabled, we have to compute alpha
even if color writes are disabled.
Signed-off-by: Józef Kucia <joseph.kucia@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
sync_files are in linux since 4.7, while the amdgpu fence_to_handle
ioctl is only in 4.15.
In particular we don't need it for sync_file in radv, because
everything happens via syncobjs, which got support earlier than
fence_to_handle.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
When rasterization is disabled we can have that few.
Fixes: 76603aa90b "radv: Drop the default viewport when 0 viewports are given."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Seems like users are actually hitting 0xFFFFFFFF actually making
things broken for them, and the mad max regression is fixed, so
lets put this in once more.
v2: Use 0xf for depth-only htile. (Dave)
Fixes: af2844116f "radv: Revert HTILE reset word to 0xFFFFFFFF."
Reviewed-by: Dave Airlie <airlied@redhat.com>
We were trying to load/store the logical width/height number of compressed
blocks. As long as the textures were large, single-level, and the
load/store at (0,0), it kind of worked.
I want to do the SETMSF.IFA to discard only if execute == 0 and cond, so
our dest of the PUSHZ needs to be nonzero if execute or !cond are nonzero.
Fixes dEQP-GLES3.functional.shaders.discard.dynamic_loop_dynamic.
Apparently the other funcs will have observable differences when early Z
is enabled.
Fixes (new) simulator assertion failures in
dEQP-GLES3.functional.rasterizer_discard.basic.clear_depth.