We need to duplicate the subscripted members even if they happen to be
aligned, since the other elements may be passed into the consumer
shader. Fixes on Panfrost:
dEQP-GLES3.functional.transform_feedback.array_element.interleaved.lines.highp_float
Note: the test did pass on main previously due to an elaborate set of
driver hacks. I don't believe the old behaviour was correct regardless.
Only Panfrost is affected by this change and the next, as every other
driver sets PIPE_CAP_PACKED_STREAM_OUTPUT.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10778>
Per OpenGL 4.6 spec:
"If no xfb_stride qualifier is specified for a
binding point, the stride is derived by identifying the variable associated with the
binding point having the largest offset, and then adding the offset and the size of
the variable, in basic machine units. If any variable associated with the binding
point contains double-precision floating-point components, the derived stride is
aligned to the next multiple of eight basic machine units. If a binding point has no
xfb_stride qualifier and no associated output variables, its stride is zero."
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2333>
1) Per GL_ARB_enhanced_layouts if explicit location is set for varying,
each struct member, array element and matrix row will take
separate location. With GL_ARB_gpu_shader_fp64/GL_ARB_gpu_shader_int64
they may take two locations.
Examples:
| layout(location=0) dvec3[2] a; | layout(location=4) vec2[4] b; |
| | |
| 32b 32b 32b 32b | 32b 32b 32b 32b |
| 0 X X Y Y | 4 X Y 0 0 |
| 1 Z Z 0 0 | 5 X Y 0 0 |
| 2 X X Y Y | 6 X Y 0 0 |
| 3 Z Z 0 0 | 7 X Y 0 0 |
Previously it wasn't taken into account.
2) Captured double-precision variables should be aligned to
8 bytes per GL_ARB_gpu_shader_fp64:
"If any variable captured in transform feedback has double-precision
components, the practical requirements for defined behavior are:
...
(c) each double-precision variable captured must be aligned to a
multiple of eight bytes relative to the beginning of a vertex."
v2: fix `output_size` calculations
( Andrii Simiklit <andrii.simiklit@globallogic.com> )
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1667
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2333>
When varying packing is disabled for transform feedback and a xfb
declaration points to an array element or structure member, the
element/member should be aligned to the start of a slot as well.
If that's not the case, a new varying is created and the
element/member value is copied.
There might a way to further optimize the number of slots allocated
or the number of copies necessary if the performance cost is
problematic. For example, in cases where simply padding the top
level variable might correctly align all the captured values.
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2433>
From page 76 (page 80 of the PDF) of the GLSL 4.60 v.5 spec:
" No aliasing in output buffers is allowed: It is a compile-time or
link-time error to specify variables with overlapping transform
feedback offsets."
Currently, this is expected to fail, but it succeeds:
"
...
layout (xfb_offset = 0) out vec2 a;
layout (xfb_offset = 0) out vec4 b;
...
"
Fixes the following piglit test:
tests/spec/arb_enhanced_layouts/compiler/transform-feedback-layout-qualifiers/xfb_offset/invalid-overlap.vert
Fixes the following test:
KHR-GL44.enhanced_layouts.xfb_output_overlapping
v2:
- Use a data structure to track the used components instead of a
nested loop (Ilia).
v3:
- Take the BITSET_WORD array out from the
gl_transform_feedback_buffer struct and make it local to the
validation process (Timothy).
- Do not use a nested scope for the validation (Timothy).
v4:
- Add reference to the fixed piglit test in the commit log.
- Add reference to the fixed VK-GL-CTS test in the commit
log (Tapani).
- Empty initialize the BITSET_WORD pointers array (Tapani).
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Until now, we were only doing this when linking a SSO
program. However, nothing avoids linking a non SSO program which
doesn't have both a VS and FS. In those cases, we also need to report
the usual linking errors, if happening.
v2: Use a better name for the renamed function (Timothy).
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
I'd like to use this in the prog_parameter.c code, so I need to move it
into C, make it non-static, and so on. This probably isn't the ideal
place for it, but I couldn't think of a better one.
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
v2:
- we only need to validate inputs to the first stage and outputs
from the last stage, everything else has already been validated
during cross_validate_outputs_to_inputs (Timothy).
- Use MAX_VARYING instead of MAX_VARYINGS_INCL_PATCH (Illia)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
We only need to add a check to validate output locations here. For
inputs with invalid locations we will fail to link when we can't
find a matching output in the same (invalid) location.
v2: compute location slots properly depending on shader stage and
variable type / direction
Fixes:
KHR-GL45.enhanced_layouts.varying_location_limit
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Through the glsl headers we had an odd mix of guards be that
"ifndef", "pragma once" neither or both.
Simplify things by using the more common ones (ifndef) and annotating
all the sources, barring the generated builting header -
builtin_int64.h.
The final header - udivmod64.h - is [seemingly] unused and on its way
out (patch purge it is on the mailing list).
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
There are two distinctly different uses of this struct. The first
is to store GL shader objects. The second is to store information
about a shader stage thats been linked.
The two uses actually share few fields and there is clearly confusion
about their use. For example the linked shaders map one to one with
a program so can simply be destroyed along with the program. However
previously we were calling reference counting on the linked shaders.
We were also creating linked shaders with a name even though it
is always 0 and called the driver version of the _mesa_new_shader()
function unnecessarily for GL shader objects.
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Since this extension allows more than one varying to share a single
location we can't just count the number of slots a varying takes and
add it to the total.
Instead we now reuse the reserved varyings bitfield to determine how
many slots are reserved for explicit locations instead.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
From the ARB_enhanced_layous spec:
"It is a compile-time or link-time error to have any *xfb_offset*
that overflows *xfb_stride*, whether stated on declarations before
or after the *xfb_stride*, or in different compilation units.
...
When no *xfb_stride* is specified for a buffer, the stride of a
buffer will be the smallest needed to hold the variable placed at
the highest offset, including any required padding."
Reviewed-by: Dave Airlie <airlied@redhat.com>
This adds the initial infrastructure for enabling transform feedback
mode via in shader qualifiers and adds initial buffer support.
Reviewed-by: Dave Airlie <airlied@redhat.com>
This function checks for any xfb_* qualifiers which will enable
transform feedback mode and cause any API defined xfb varyings
to be ignored.
It also counts the number of varyings that have a xfb_offset
qualifier and finally it calls the create_xfb_varying_names()
helper to generate the names of varyings to be caputured.
Reviewed-by: Dave Airlie <airlied@redhat.com>