From the GLES 3.2 spec:
"If the source formats are integer types or stencil values, a single
sample’s value is selected for each pixel. If the source formats are
floating-point or normalized types, the sample values for each pixel are
resolved in an implementation-dependent manner. If the source formats are
depth values, sample values are resolved in an implementation-dependent
manner where the result will be between the minimum and maximum depth
values in the pixel."
For Z24S8 we were doing an average, which would be invalid for the stencil
data, and apparently the CTS didn't catch that. For Z32F, our averaging
was technically legal, but given the Vulkan spec's requirement of sample 0
support but not sample average support for depth and stencil resolves, it
suggests that our averaging behavior was unusual. Let's not spend power
on producing surprising results.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9661>
max_waves is just for shader-db stats for now, but threadsize will
replace the various mechanisms used to determine threadsize across the
different gen's. Calculating these correctly entails adding a bunch of
details about the sizes of various things to ir3. In the future we will
use the guts of the max_waves calculation to inform RA decisions as
well, which is why the max_waves calculation is broken up into register
dependent/independent pieces.
Something should be said about the units of reg_size_vec4. These units
were chosen for two reasons:
1. As said in the comment, it makes some calculations easier.
2. For a4xx/a5xx, where we don't know as much because we haven't done
the same sorts of experiments to probe for the HW configuration, it
corresponds more directly to things that are known. The existing code
switches to the smaller threadsize when r24.x or higher is used,
which translates directly to a reg_size_vec4 of 48. If we chose
different units (e.g. multiplying by wave_granularity and/or
threadsize_base), then to match the same behavior we'd have to set
reg_size_vec4 based on some other parameters that aren't 100% known.
If someone comes along and updates them, they might inadvertantly
break it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9498>
We want to use the local_size when available to calculate the threadsize
in ir3, and we need it to work with e.g. computerator where we don't
have a nir shader. Add a local_size field and use that in computerator
instead of of a separate structure that's inaccessable to core ir3.
Also set a dummy local_size in the tests to avoid a divide-by-zero.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9498>
Quoting Jason's commit message (afa8f5892), that also applies here:
"The Vulkan API provides a mechanism for applications to cache their
own shaders and manage on-disk pipeline caching themselves.
Generally, this is what I would recommend to application developers
and I've resisted implementing driver-side transparent caching in the
Vulkan driver for a long time. However, not all applications do this
and, for some use-cases, it's just not practical."
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>
Also return the proper Vulkan result for this case, that is somewhat
tricky. Technically Create[Graphics/Compute]Pipeline only allow OOM
errors. So for this case, there is only the alternative of the generic
VK_ERROR_UNKNOWN, even if we known the cause of the error. From spec:
"VK_ERROR_UNKNOWN will be returned by an implementation when an
unexpected error occurs that cannot be attributed to valid behavior
of the application and implementation. Under these conditions, it
may be returned from any command returning a VkResult"
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>
Until now we were always doing a two-step cache lookup, as we were
using the NIR shaders to fill up the key to lookup for the compiled
shaders. But since we were already generating the sha1 key with the
original SPIR-V shader (or its internal NIR representation) any info
we were collecting from from NIR is already implicit in the original
shader, so we can avoid using the NIR in most cases.
Because the v3d_key that is used to compile a shader is populated with
data coming directly from the NIR shader or produced during NIR
lowerings, we can't use it directly as part of the pipeline cache
entry. We could split them, but that would be confusing, so we add a
new struct, v3dv_pipeline_key used specifically to search for the
compiled shaders on the pipeline cache. v3d_key would be still used to
compile the shaders.
As we are using the same sha1 key for all compiled shaders in a
pipeline, we can also group all of them in the same cache entry, so we
don't need a lookup for each stage. This also allows to cache pipeline
data shared by all the stages (like the descriptor maps).
While we are here, we also create a single BO to store the assembly
for all the pipeline stages.
Finally, we remove the link to the variant on the pipeline stage
struct, to avoid the confusion of having two links to the same
data. This mostly means that we stop to use the pipeline stage
structures after the pipeline is created, so we can freed them.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>
Mostly the same that main mesa gl_shader_stage, but including the
coordinate shader. This would allow to loop over all the available
stages (for example if we need to free them, compute the max spill
size, etc).
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>
If we were able to get a shader variant from the pipeline cache, we
will not have the nir shader available.
Note that this is what we were doing on the driver before the nir io
helpers were available.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>
This maps the nir shader data.location to its final
data.driver_location. In general we are using the driver location as
index (like vattr_sizes on the same struct), so having this map is
useful if what we have is the data.location, and we don't have
available the original nir shader.
v2: use memset instead of for loop, and nir_foreach_shader_in_variable
instead of nir_foreach_variable_with_modes (Iago)
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>
Right now we were not pre-generating several variants, but we decided
to let this method, just in case we need that idea back. This ended
being a bad idea. Several months have passed without that need, so
having that method just adds confusion. Also, if we need to add a
multiple-variant in the future, perhaps we would need to do it
different, so let's not template in advance.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>
We tweak a little some of the individual messages, and add a new
option to dump the stats when the pipeline destroy.
As we are here we also we also tweak the names of the global options
to make it more clear.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>