Looks like it fixes some potentially important VK test bugs. But also, it
fixes the GLES31 SSBO layout tests to not be so excessively large, so we
can run them in a reasonable time now. Note that a630 fail list is reset,
since the test list has changed and so we end up with a different subset
of tests being run. Interestingly, in the process the semaphore tests are
now reporting "NotSupported (Exporting and importing semaphore type not
supported at vktSynchronizationSignalOrderTests.cpp:513)" where they
weren't before.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5554>
For texturing and draw calls, HW expects the clear color to be in two
different color spaces after sRGB fast-clears - sRGB in the former and
linear in the latter. Up until now, iris has stored the clear color in
the sRGB color space. Limit the allowable clear colors for sRGB
fast-clears to 0/1 so that both color space requirements are satisfied.
Makes iris pass the sRGB -> sRGB subtest of the fcc-write-after-clear
piglit test on gen9+.
v2:
* Drop iris_context::blend_enables. (Ken)
* Drop some more resolve-related blend-state-tracking code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4972>
For rendering operations, avoid adding or using fast-cleared blocks if
the render format is incompatible with the clear color interpretation.
Note that the clear color is currently interpreted through the
resource's surface format.
Makes iris pass subtests of the fcc-write-after-clear piglit test:
* UNORM -> SNORM, partial block on gen8+.
* linear -> sRGB, partial block on gen9+.
* UNORM -> SNORM, full block on gen12.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4972>
Remove the CCS_D fallback logic so that iris doesn't attempt to use a
non-existent surface state for some renders. Also, add an assertion to
catch the issue.
The fallback in iris_resource_render_aux_usage can lead to this problem
because it doesn't account for the fact that surface states created from
resources with the Y_TILED_CCS modifier may only have CCS_E or NONE as
aux usages (due to iris_resource_create_with_modifiers).
Without this change, the next commit would have triggered the fallback
and regressed the following tests on gen9:
* dEQP-EGL.functional.wide_color.window_888_colorspace_srgb
* dEQP-EGL.functional.wide_color.window_8888_colorspace_srgb
* dEQP-EGL.functional.wide_color.pbuffer_888_colorspace_srgb
* dEQP-EGL.functional.wide_color.pbuffer_8888_colorspace_srgb
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4972>
There's no good reason to have the "which table do I use?" code
duplicated twice. The only advantage to the way we were doing it before
was that we could move the switch statement outside the loop. If this
is ever an actual device initialization perf problem that someone cares
about, we can optimize that when the time comes. For now, the
duplicated cases are simply a platform-enabling pit-fall.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5530>
This commit fills out VkPhysicalDeviceSubgroupProperties if present
in a VkPhysicalDeviceProperties2. The values here are simply pulled
from the blob.
Fixes some flakes in dEQP-VK.subgroups.* since dEQP was reading
uninitialized values of VkPhysicalDeviceSubgroupProperties.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5564>
For shader-cache, we want to not have anything important in `ir3_shader`.
And to have shader variants with lower const size limits (to properly
handle cross-stage limits), we also want variants to be able to have
their own const_state.
But we still need binning pass shaders to align with their draw pass
counterpart so that the same const emit can be used for both passes.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
Make it an rzalloc'd ptr instead of embedded struct, so it can serve as
the mem ctx for immediates. This gets rid of needing to explicitly free
the immediates, so one less thing to deal with when moving const_state.
(Also, after we move const_state to the shader variant, we won't need
one for binning pass variants)
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
It seems a lot of the lowerings being run the second time were
unnecessary. In addition, when const_state is moved to the variant,
then it will become impossible to know ahead of time whether a variant
needs additional optimizing, which means that ir3_key_lowers_nir() needs
to go away. The new approach should have the same effect, since it skips
running lowerings that are unnecessary and then skips the opt loop if no
optimizations made progress, but it will work better when we move
ir3_nir_analyze_ubo_ranges() to be after variant creation.
The one maybe controversial thing I did is to make
nir_opt_algebraic_late() always happen during variant lowering. I wanted
to avoid code duplication, and it seems to me that we should push the
_late variants as far back as possible so that later opt_algebraic runs
don't miss out on optimization opportunities.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
The only difference between this and `const_state->num_ubos` was that
the latter is counting # of ubos loaded via `ldg` (based on UBO addrs
in push-consts). But turns out there isn't really any reason to care.
Instead just add an early return in the one code-path that cares about
the number of `ldg` UBOs.
This gets rid of one more thing we need to move from `ir3_shader` to
`ir3_shader_variant`.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
We are going to want to move this back to the variant, and come up with
a different strategy for binning/nonbinning to share the same constant
layout, in order to implement shader-cache support. (Since then we
can have a mix of dynamically compiled variants and cache hits, so there
is no good place to serialize the const-state.)
To reduce the churn as we re-arrange things, move direct access to the
const-state to a helper fxn. This patch is the boring churny part.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
If buffer or stride is unaligned we use the same trick as on gfx6:
load 1 byte at a time and recompose the output if needed.
This change fixes lots of deqp/glcts tests:
- dEQP-GLES2.functional.draw.random.1, 10, ...
- dEQP-GLES2.functional.vertex_arrays.multiple_attributes.stride.3_float2_0_float2_0_float2_17, ...
- dEQP-GLES2.functional.vertex_arrays.single_attribute.first.byte_first24_offset1_stride2_quads256, ...
- dEQP-GLES2.functional.vertex_arrays.single_attribute.strides.buffer_0_17_byte2_vec4_dynamic_draw_quads_1, ...
- dEQP-GLES31.functional.draw_indirect.random.14, ...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5502>
The docs say that these registers should only be read with a certain
type, and I'm inclined to believe that the hardware behaves that way,
but it makes the assembler a little more confusing and also confuses the
user of the assembler that some operands don't take types or regions.
Just always requiring regions and types seems like the sensible thing.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5514>