This allows mediump inputs and outputs to be trivially lowered into packed
16-bit varyings where 1 slot is occupied by 2 16-bit vec4s, without any
packing instructions in NIR and without any conflicts with 32-bit varyings.
The only thing that is changed is IO semantics in intrinsics to get packed
16-bit varyings.
This simplifies supporting 16-bit types for drivers that have 32-bit slots
everywhere except the fragment shader where they can do 16-bit interpolation
on either the low or high half of each slot.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9050>
This breaks CLOn12's handling of CL CTS test_basic vector_creation for char3 (at least).
Removing this cast causes us to try to load from a deref with no alignment info.
Fixes: 99bb2a4d ("nir/opt_deref: Don't remove casts with alignment information")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10165>
The current handling for SPIR-V memory semantics is very specific to
the wording in the SPIR-V spec, which breaks its handling of OpenCL
(compared to what we had working downstream before merging upstream).
Update/relax the logic here to support CL's barrier(CLK_GLOBAL_MEM_FENCE);
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10165>
Sources outside of src/util path should include "util/string_buffer.h"
Fixes the following building error in Android:
external/mesa/src/compiler/glsl/ast_type.cpp:25:10: fatal error: 'string_buffer.h' file not found
^~~~~~~~~~~~~~~~~
1 error generated.
Fixes: eeec9d56ad ("compiler/glsl: clean up output")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10156>
When lowering precision on integers from GLSL ES, we can end up with
16 bit integer loop counters. So let's tolerate this as well.
This was probably not caught earlier because most NIR drivers disable
GLSL-level loop-unrolling, and no non-NIR driver sets LowerPrecisionInt16
to true. This was discovered while trying to wire up int16 support for
Zink, which doesn't currently disable GLSL loop-unrolling.
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10125>
This format-string seems to have been incorrect since it's inception.
But there's also been commits that have both forgotten to add and remove
flags as appropriate as well.
Let's correct the format-list. This was done by counting by hand. A
better solution for the long-term is coming in a future commit.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9896>
HLSL doesn't support bitcasting a 64bit integer to a double. DXIL
doesn't have generic pack/unpack instructions, so we lower those to
integer bitwise ops. As a result, NIR generic double pack/unpack would
require our backend to emit a bitcast to get a double, but we want
to match HLSL semantics and emit MakeDouble/SplitDouble.
Adding a dedicated opcode for double pack/unpack allows us to add a
pass to emit that instead, which lets our backend emit the right
instruction to pack and unpack doubles.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10063>
Roundtrip to a larger float and divide there. The extra details for
mod/rem are handled directly in integer space to simplify verification
of rounding details. The one issue is that the mantissa might be
rounded down which will cause issues; adding 1 unconditionally (proposed
by Jonathan Marek) fixes this. The lowerings here were tested
exhaustively on all pairs of 16-bit integers.
v2: Update idiv lowering per Rhys Perry's comment.
v3: Rewrite lowerings.
v4: Remove useless ftrunc, fix 8-bit issue, simplify code.
v5: Remove useless ffloor
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Tested-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8339>
Helps deduplicate some code between the two lowering paths. In
particular, it ports the missing 32-bit? check to the precise pass. This
does not change anything immediately: drivers depending on this to lower
16-bit did not work before due to type mismatches and will not work now
since it'll refuse to lower. But that means sub-32-bit idiv can be
lowered more efficiently in an algebraic pass.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8339>
texture_index is meaningless when a tex_instr has deref src.
Use var->data.binding instead.
This fixes the incorrect lowering on radeonsi where the same
lowering steps was applied to all tex_instr based on the needs
of the first one (since texture_index is always 0).
CC: mesa-stable
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9931>
Per OpenGL 4.6 spec:
"If no xfb_stride qualifier is specified for a
binding point, the stride is derived by identifying the variable associated with the
binding point having the largest offset, and then adding the offset and the size of
the variable, in basic machine units. If any variable associated with the binding
point contains double-precision floating-point components, the derived stride is
aligned to the next multiple of eight basic machine units. If a binding point has no
xfb_stride qualifier and no associated output variables, its stride is zero."
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2333>
1) Per GL_ARB_enhanced_layouts if explicit location is set for varying,
each struct member, array element and matrix row will take
separate location. With GL_ARB_gpu_shader_fp64/GL_ARB_gpu_shader_int64
they may take two locations.
Examples:
| layout(location=0) dvec3[2] a; | layout(location=4) vec2[4] b; |
| | |
| 32b 32b 32b 32b | 32b 32b 32b 32b |
| 0 X X Y Y | 4 X Y 0 0 |
| 1 Z Z 0 0 | 5 X Y 0 0 |
| 2 X X Y Y | 6 X Y 0 0 |
| 3 Z Z 0 0 | 7 X Y 0 0 |
Previously it wasn't taken into account.
2) Captured double-precision variables should be aligned to
8 bytes per GL_ARB_gpu_shader_fp64:
"If any variable captured in transform feedback has double-precision
components, the practical requirements for defined behavior are:
...
(c) each double-precision variable captured must be aligned to a
multiple of eight bytes relative to the beginning of a vertex."
v2: fix `output_size` calculations
( Andrii Simiklit <andrii.simiklit@globallogic.com> )
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1667
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2333>
When packing varyings when there is only 32bit of space
left in slot 64bit type is attempted to be divided between
current and next slot. However there is neither code for
splitting the 64bit type nor for assembling it back.
Instead we add 32bit padding.
The above happens only in structs because their
members aren't sorted by packing order.
Example of the issue:
struct S {
vec3 a;
double d;
};
out flat S s;
Before, the packing went as:
slot 32b 32b 32b 32b
0 a.x a.y a.z d
1 d 0 0 0
After:
slot 32b 32b 32b 32b
0 a.x a.y a.z 0
1 d d 0 0
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2333>
Section 8.9.4 (Compatibility Profile Texture Functions) of the
GLSL 4.20 spec outlines a number of builtin texture functions that
have been moved to compatibility shaders.
This change enforces those restrictions. Note we don't worry about
enforcing restrictions on the EXT_gpu_shader4 extensions of these
functions because EXT_gpu_shader4 should only be enabled for compat
already.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9887>
Fix defects reported by Coverity Scan.
uninit_member: Non-static class member buffer_block_index is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member ubo_byte_offset is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member shader_type is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member next_sampler is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member next_bindless_sampler is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member next_image is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member next_bindless_image is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member next_subroutine is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member field_counter is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member current_var is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member explicit_location is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member record_array_count is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member record_next_sampler is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member record_next_image is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member record_next_bindless_sampler is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member record_next_bindless_image is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member targets is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member shader_samplers_used is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member shader_shadow_samplers is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member num_bindless_samplers is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member num_bindless_images is not initialized in this constructor nor in any functions that it calls.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7910>
When we encounter a bindless image here, lower_deref returns a
NULL-pointer, and calling record_images_used will try to dereference
that NULL-pointer.
So let's dig out the var from the source instruction instead of the
result of the lowering.
Fixes: 5910c938a2 ("nir/glsl: gather bitmask of images used by program")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9895>
Specifically, fix this error (which is covered in existing tests):
../src/compiler/glsl/glcpp/pp.c:198:28: runtime error: applying non-zero offset 1 to null pointer
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../src/compiler/glsl/glcpp/pp.c:198:28 in
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9669>