This just adds the structure with a name and its update function.
strlen and others will be added in the following commits. The idea is to
parse and analyze the name in advance to make glGetUniformLocation faster.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13507>
This expands on commit c54c42321e. See the code comment for full
justifications. At the time of the previous commit Ian wanted to
limit the relaxing of the rule to GLSL 3.30 as that was the highest
version of shaders seen in the wild that were having trouble with
the stricter rules.
However since then I've found that the long standing issue with tess
shaders failing to compile in the game 'Layers Of Fear' is due to
this same issue. The game uses 4.10 shaders and also makes use of
explicit varying locations, so here we relax the rule to 4.20 and
make sure to apply the restriction to shaders using varyings with
explicit locations also.
Fixes: c54c42321e ("glsl: relax rule on varying matching for shaders older than 4.00")
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11873>
The CAP for packed transform feedback concerns packing of unrelated
variables into the same varying slot. (On Mali, transform feedback is
implemented on a per-slot basis, so different variables need different
slots to be written to different buffers.) However, this requirement is
tangential to the packing of arrays, matrices, and structures inherent
to GLSL. These array-like values need to be packed /within/ their slot,
even though drivers using the CAP (just Panfrost) cannot pack
independent values in the slot. Transform feedback of individual
elements is not independent, after all.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10778>
We need to duplicate the subscripted members even if they happen to be
aligned, since the other elements may be passed into the consumer
shader. Fixes on Panfrost:
dEQP-GLES3.functional.transform_feedback.array_element.interleaved.lines.highp_float
Note: the test did pass on main previously due to an elaborate set of
driver hacks. I don't believe the old behaviour was correct regardless.
Only Panfrost is affected by this change and the next, as every other
driver sets PIPE_CAP_PACKED_STREAM_OUTPUT.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10778>
Per OpenGL 4.6 spec:
"If no xfb_stride qualifier is specified for a
binding point, the stride is derived by identifying the variable associated with the
binding point having the largest offset, and then adding the offset and the size of
the variable, in basic machine units. If any variable associated with the binding
point contains double-precision floating-point components, the derived stride is
aligned to the next multiple of eight basic machine units. If a binding point has no
xfb_stride qualifier and no associated output variables, its stride is zero."
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2333>
1) Per GL_ARB_enhanced_layouts if explicit location is set for varying,
each struct member, array element and matrix row will take
separate location. With GL_ARB_gpu_shader_fp64/GL_ARB_gpu_shader_int64
they may take two locations.
Examples:
| layout(location=0) dvec3[2] a; | layout(location=4) vec2[4] b; |
| | |
| 32b 32b 32b 32b | 32b 32b 32b 32b |
| 0 X X Y Y | 4 X Y 0 0 |
| 1 Z Z 0 0 | 5 X Y 0 0 |
| 2 X X Y Y | 6 X Y 0 0 |
| 3 Z Z 0 0 | 7 X Y 0 0 |
Previously it wasn't taken into account.
2) Captured double-precision variables should be aligned to
8 bytes per GL_ARB_gpu_shader_fp64:
"If any variable captured in transform feedback has double-precision
components, the practical requirements for defined behavior are:
...
(c) each double-precision variable captured must be aligned to a
multiple of eight bytes relative to the beginning of a vertex."
v2: fix `output_size` calculations
( Andrii Simiklit <andrii.simiklit@globallogic.com> )
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1667
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2333>
When packing varyings when there is only 32bit of space
left in slot 64bit type is attempted to be divided between
current and next slot. However there is neither code for
splitting the 64bit type nor for assembling it back.
Instead we add 32bit padding.
The above happens only in structs because their
members aren't sorted by packing order.
Example of the issue:
struct S {
vec3 a;
double d;
};
out flat S s;
Before, the packing went as:
slot 32b 32b 32b 32b
0 a.x a.y a.z d
1 d 0 0 0
After:
slot 32b 32b 32b 32b
0 a.x a.y a.z 0
1 d d 0 0
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2333>
- xxhash is faster than sha1.
- remove superfluous calls to strlen
Using SPECviewperf13 snx-03 first subtest and "perf -e cycles -g", perf report says:
Before | After | Function
---------|--------|---------------
47.39% | 47.36% | _mesa_CallList
5.00% | 3.03% | _mesa_program_resource_location
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7078>
Some lowering passes modify the value of built-in variables in
order for drivers to work properly. However, modifying such values
will also break transform feedback as the captured value won't
match what's expected.
For example, on some hardware, the vertex shaders are expected to
output gl_Position in screen space. However, the transform
feedback captured value is still supposed to be the world-space
coordinates (see nir_lower_viewport_transform).
To fix that, we create a new variable that contains the
pre-transformation value and use it for transform feedback instead
of the built-in one.
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2433>
When varying packing is disabled for transform feedback and a xfb
declaration points to an array element or structure member, the
element/member should be aligned to the start of a slot as well.
If that's not the case, a new varying is created and the
element/member value is copied.
There might a way to further optimize the number of slots allocated
or the number of copies necessary if the performance cost is
problematic. For example, in cases where simply padding the top
level variable might correctly align all the captured values.
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2433>
Some drivers (e.g. Panfrost) don't support packing of varyings when
used for transform feedback. This new constant ensures that any
varying used for xfb is aligned at the start of a slot and won't be
packed with other varyings.
Scenarios where transform feedback declarations are related to an
array element or a struct member will be handled in a subsequent
patch.
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> (Fix order of arguments to varying_matches())
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2433>
A few hash_table users roll their own integer hash functions which
call _mesa_hash_data to perform the hashing which ultimately calls
into XXH32 with a dynamic key length. When using small keys with a
constant size the hash rate can be greatly improved by inlining
XXH32 and providing it a constant key length, see:
https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html
Additionally, this patch removes calls to _mesa_key_hash_string and
makes them instead call _mesa_has_string directly, matching the new
integer hash functions.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
On GLES, the interface between vertex and fragment shaders doesn’t
need to have matching precision.
Section 4.3.10 of the GLSL ES 3.00 spec:
“The type of vertex outputs and fragment inputs with the same name
must match, otherwise the link command will fail. The precision does
not need to match.”
Reviewed-by: Eric Anholt <eric@anholt.net>
From page 76 (page 80 of the PDF) of the GLSL 4.60 v.5 spec:
" No aliasing in output buffers is allowed: It is a compile-time or
link-time error to specify variables with overlapping transform
feedback offsets."
Currently, this is expected to fail, but it succeeds:
"
...
layout (xfb_offset = 0) out vec2 a;
layout (xfb_offset = 0) out vec4 b;
...
"
Fixes the following piglit test:
tests/spec/arb_enhanced_layouts/compiler/transform-feedback-layout-qualifiers/xfb_offset/invalid-overlap.vert
Fixes the following test:
KHR-GL44.enhanced_layouts.xfb_output_overlapping
v2:
- Use a data structure to track the used components instead of a
nested loop (Ilia).
v3:
- Take the BITSET_WORD array out from the
gl_transform_feedback_buffer struct and make it local to the
validation process (Timothy).
- Do not use a nested scope for the validation (Timothy).
v4:
- Add reference to the fixed piglit test in the commit log.
- Add reference to the fixed VK-GL-CTS test in the commit
log (Tapani).
- Empty initialize the BITSET_WORD pointers array (Tapani).
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
fixes following warning with clang:
warning: suggest braces around initialization of subobject
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Until now, we were only doing this when linking a SSO
program. However, nothing avoids linking a non SSO program which
doesn't have both a VS and FS. In those cases, we also need to report
the usual linking errors, if happening.
v2: Use a better name for the renamed function (Timothy).
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
From the OpenGL 4.60.5 spec, section 4.4.1 Input Layout Qualifiers,
Page 67, (Location aliasing):
" Further, when location aliasing, the aliases sharing the location
must have the same underlying numerical type and bit
width (floating-point or integer, 32-bit versus 64-bit, etc.) and
the same auxiliary storage and interpolation qualification."
Additionally, we have improved the linker error descriptions.
Specifically, when taking structs into account we were producing a
linker error because we assumed that all components in each location
were used and that would cause component aliasing. This is not
accurate of the actual problem. Now, the failure specifies that the
underlying numerical type incompatibility is the cause for the
failure.
Fixes the following piglit test:
tests/spec/arb_enhanced_layouts/linker/component-layout/vs-to-fs-width-mismatch-double-float.shader_test
v2:
- Do not assert if we see invalid numerical types. These come
straight from shader code, so we should produce linker errors if
shaders attempt to do location aliasing on variables that are not
numerical such as records.
- While we are at it, improve error reporting for the case of
numerical type mismatch to include the shader stage.
v3:
- Allow location aliasing of images and samplers. If we get these
it means bindless support is active and they should be handled
as 64-bit integers (Ilia)
- Make sure we produce link errors for any non-numerical type
for which we attempt location aliasing, not just structs.
v4:
- Rebased with minor fixes (Andres).
- Added fixing tag to the commit log (Andres).
v5:
- Remove the helper function and check individually for the
underlying numerical type and bit width (Timothy).
- Implicitly, assume that any non-treated type which is checked for
its underlying numerical type is either integer or
float and has a defined bit width (Timothy).
- Implicitly, assume that structs are the only non-treated
non-numerical type (Timothy).
- Improve the linker error descriptions and commit log (Andres).
Fixes: 13652e7516 ("glsl/linker: Fix type checks for location aliasing")
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Section 7.4.1 (Shader Interface Matching) of the OpenGL 4.30 spec says:
"Variables or block members declared as structures are considered
to match in type if and only if structure members match in name,
type, qualification, and declaration order."
Fixes:
* layout-location-struct.shader_test
v2: rebased against master and small fixes
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108250
'invariant' qualifier is propagated on variables which are used
to calculate other invariant variables, however when we are matching
variable's declarations we should take into account only explicitly
declared invariance because invariance propagation is an implementation
specific detail.
Thus new flag is added to ir_variable_data which indicates 'invariant'
qualifier being explicitly set in the shader.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100316
Fixes: 89b60492 ('glsl: Add a pass to propagate the "invariant" and
"precise" qualifiers')
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
This reverts commit 1aa5738e66.
This patch incorrectly asumed that for SSOs no inner interface
matching check was needed.
From the ARB_separate_shader_objects spec v.25:
" With separable program objects, interfaces between shader stages
may involve the outputs from one program object and the inputs
from a second program object. For such interfaces, it is not
possible to detect mismatches at link time, because the programs
are linked separately. When each such program is linked, all
inputs or outputs interfacing with another program stage are
treated as active. The linker will generate an executable that
assumes the presence of a compatible program on the other side of
the interface. If a mismatch between programs occurs, no GL error
will be generated, but some or all of the inputs on the interface
will be undefined."
This completes the fix from commit:
3be05dd267 ("glsl/linker: don't fail non static used inputs without matching outputs")
Fixes: 1aa5738e66 ("glsl: relax input->output validation for SSO programs")
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Current implementation uses a complicated calculation which relies in
an implicit conversion to check the integral part of 2 division
results.
However, the calculation actually checks that the xfb_offset is
smaller or a multiplier of the xfb_stride. For example, while this is
expected to fail, it actually succeeds:
"
...
layout(xfb_buffer = 2, xfb_stride = 12) out block3 {
layout(xfb_offset = 0) vec3 c;
layout(xfb_offset = 12) vec3 d; // ERROR, requires stride of 24
};
...
"
Fixes: 2fab85aaea ("glsl: add xfb_stride link time validation")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
If there is no Static Use of an input variable, the linker shouldn't
fail whenever there is no defined matching output variable in the
previous stage.
From page 47 (page 51 of the PDF) of the GLSL 4.60 v.5 spec:
" Only the input variables that are statically read need to be
written by the previous stage; it is allowed to have superfluous
declarations of input variables."
Now, we complete this exception whenever the input variable has an
explicit location. Previously, 18004c338f ("glsl: fail when a
shader's input var has not an equivalent out var in previous") took
care of the cases in which the input variable didn't have an explicit
location.
v2: do the location based interface matching check regardless on
whether it is a separable program or not (Ilia).
Fixes: 1aa5738e66 ("glsl: relax input->output validation for SSO programs")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Outputs are always validated when having explicit locations and we
were trusting its outcome to catch similar problems with the inputs
since, in case of having undefined outputs for existing inputs, we
would be already reporting a linker error.
However, consider this case:
" Shader stage n:
---------------
...
layout(location = 0) out float a;
...
Shader stage n+1:
-----------------
...
layout(location = 0) in float b;
layout(location = 0) in float c;
...
"
Currently, this won't report a linker error even though location
aliasing is happening for the inputs.
Therefore, we also need to validate the inputs independently from the
outcome of the outputs validation.
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This is purely for conformance, since it's not actually possible to do
XFB on TCS output varyings. However we do have to make sure we record
the names correctly, and this removes an extra level of array-ness from
the names in question.
Fixes KHR-GL45.tessellation_shader.single.xfb_captures_data_from_correct_stage
v2: Add comment to the new program_resource_visitor::process function.
(Ilia Mirkin)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108457
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Avoids regression on:
KHR-GLES*.core.tessellation_shader.single.xfb_captures_data_from_correct_stage
that is uncovered by the following patch.
"glsl: fix recording of variables for XFB in TCS shaders"
v2: Rebased over glsl: fix recording of variables for XFB in TCS shaders
v3: Move this patch before "glsl: fix recording of variables for XFB in TCS
shaders" to avoid temporal regressions. (Illia Mirkin)
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
The check for location aliasing was always asuming output variables
but this validation is also called for input variables.
Fixes: e2abb75b0e ("glsl/linker: validate explicit locations for SSO programs")
Cc: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
and _mesa_bitcount_64 with util_bitcount_64. This fixes a build problem
in nir for platforms that don't have popcount or popcountll, such as
32bit msvc.
v2: - Fix additional uses of _mesa_bitcount added after this was
originally written
Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
- remove mtypes.h from most header files
- add main/menums.h for often used definitions
- remove main/core.h
v2: fix radv build
Reviewed-by: Brian Paul <brianp@vmware.com>
This fixes a varying packing issue when using transform feedback in
GL_SEPARATE_ATTRIBS mode. By time we get to linking, we already
know that the number of feedback attributes is under the
GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS limit so packing isn't
as critical. In fact, packing/splitting vec3 attributes can cause
trouble because splitting effectively creates another TFB output
which can exceed device limits. So, disable vec3 packing when it's
not needed to avoid that issue.
Fixes the Piglit ext_transform_feedback-separate test on VMware
driver.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
The mix of bitwise operators with * and + to compute the packing_class
values was a little weird. Just use bitwise ops instead.
v2: add assertion to make sure interpolation bits fit without collision,
per Timothy. Basically, rewrite function to be simpler.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This fixes a crash in:
KHR-GL45.enhanced_layouts.xfb_block_stride
Fixes: 0822517936 "glsl: add helper to process xfb qualifiers during linking"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>