Allows nir drivers to either use a single or dual locations for
vs double inputs.
i965 uses dual locations for both OpenGL and Vulkan drivers, for
now gallium OpenGL drivers only use a single location.
The following patch will also make use of this option when
calling nir_shader_gather_info().
Reviewed-by: Karol Herbst <kherbst@redhat.com>
First we move double_inputs_read into a vs struct in the union,
double_inputs_read is only used for vs inputs so this will
save space and also allows us to add a new double_inputs field.
We add the new field because c2acf97fcc changed the behaviour
of double_inputs_read, and while it's no longer used to track
actual reads in i965 we do still want to track this for gallium
drivers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
nir_type_conversion enables new operations to handle rounding modes to
convert to fp16 values. Two new opcodes are enabled nir_op_f2f16_rtne
and nir_op_f2f16_rtz.
The undefined behaviour doesn't has any effect and uses the original
nir_op_f2f16 operation.
v2: Indentation fixed (Jason Ekstrand)
v3: Use explicit case for undefined rounding and assert if
rounding mode is used for non 16-bit float conversions
(Jason Ekstrand)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Adds new INT16, UINT16 and FLOAT16 base types.
The corresponding GL types for half floats were reused from the
AMD_gpu_shader_half_float extension. The int16 and uint16 types come from
NV_gpu_shader_5 extension.
This adds the builtins and the lexer support.
To avoid a bunch of warnings due to cases not handled in switch, the
new types have been added to a few places using same behavior as
their 32-bit counterparts, except for a few trivial cases where they are
already handled properly. Subsequent patches in this set will provide
correct 16-bit implementations when needed.
v2: * Use FLOAT16 instead of HALF_FLOAT as name of the base type.
* Removed float16_t from builtin types.
* Don't copy 16-bit types as if they were 32-bit values in
copy_constant_to_storage().
* Use get_scalar_type() instead of adding a new custom switch
statement.
(Jason Ekstrand)
v3: Use GL_FLOAT16_NV instead of GL_HALF_FLOAT for consistency
(Ilia Mirkin)
v4: Add missing 16-bit base types support in glsl_to_nir (Eduardo Lima).
v5: Fix coding style (Topi Poholainen).
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
This way they can return either a uvec4 or a uint64_t. At the moment,
this is a no-op since we still always return a uint64_t.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
NIR does not have these instructions. TGSI and Mesa IR both implement
them using < and >=, repsectively. Removing them deletes a bunch of
code and means I don't have to add code to the SPIR-V generator for
them.
v2: Rebase on 2+ years of change... and fix a major bug added in the
rebase.
text data bss dec hex filename
8255291 268856 294072 8818219 868e2b 32-bit i965_dri.so before
8254235 268856 294072 8817163 868a0b 32-bit i965_dri.so after
7815339 345592 420592 8581523 82f193 64-bit i965_dri.so before
7813995 345560 420592 8580147 82ec33 64-bit i965_dri.so after
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Will be used in nir link pass to decided if we can remove a varying
or not.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
There was no reason to treat array types and record types differently.
Unifying them saves a bunch of code and saves a few bytes in every
ir_constant.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
The next patch will unify ::array_elements and ::components, so the
name ::array_elements wouldn't be appropriate. A lot of things use
the names array_elements and components, so grepping for either is
pretty useless.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
We are currently copying the name for each member dereference
but we can just share a single instance of the string provided
by the type.
This change also stops us recalculating the field index
repeatedly.
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Extra validation is added to ir_validate to make sure this is
always updated to the correct numer of operands, as passes like
lower_instructions modify the instructions directly rather then
generating a new one.
The reduction in time is so small that it is not really
measurable. However callgrind was reporting this function as
being called just under 34 million times while compiling the
Deus Ex shaders (just pre-linking was profiled) with 0.20%
spent in this function.
v2:
- make num_operands a unit8_t
- fix unsigned/signed mismatches
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Check if shaders have transform feedback varyings also after the
post-link step.
This fixes:
KHR-GL45.enhanced_layouts.xfb_vertex_streams
piglit/spec/arb_enhanced_layouts/gs-stream-location-aliasing
v2: add claryfing comments (Timothy)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
These are intrinsics rather than opcodes, because they operate across
channels.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is convenient for backends that support both Vulkan and OpenGL while
lowering samplers to derefs with nir_lower_samplers_as_deref.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Commit e1af20f18a changed the shader_info
from being embedded into being just a pointer. The idea was that
sharing the shader_info between NIR and GLSL would be easier if it were
a pointer pointing to the same shader_info struct. This, however, has
caused a few problems:
1) There are many things which generate NIR without GLSL. This means
we have to support both NIR shaders which come from GLSL and ones
that don't and need to have an info elsewhere.
2) The solution to (1) raises all sorts of ownership issues which have
to be resolved with ralloc_parent checks.
3) Ever since 00620782c9, we've been
using nir_gather_info to fill out the final shader_info. Thanks to
cloning and the above ownership issues, the nir_shader::info may not
point back to the gl_shader anymore and so we have to do a copy of
the shader_info from NIR back to GLSL anyway.
All of these issues go away if we just embed the shader_info in the
nir_shader. There's a little downside of having to copy it back after
calling nir_gather_info but, as explained above, we have to do that
anyway.
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
It doesn't make sense to prefix them with 'image' because
they are called "Memory Qualifiers" and they can be applied
to members of storage buffer blocks.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
These should be lowered away in GLSL IR but if we don't get dead
code to clean them up it causes issues in glsl_to_nir.
We wan't to drop as many GLSL IR opts in future as we can so this
makes glsl_to_nir just ignore the vars if it sees them.
In future we will want to just use the nir lowering pass that
Vulkan currently uses.
Acked-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The NIR story on conversion opcodes is a mess. We've had way too many
of them, naming is inconsistent, and which ones have explicit sizes was
sort-of random. This commit re-organizes things and makes them all
consistent:
- All non-bool conversion opcodes now have the explicit size in the
destination and are named <src_type>2<dst_type><size>.
- Integer <-> integer conversion opcodes now only come in i2i and u2u
forms (i2u and u2i have been removed) since the only difference
between the different integer conversions is whether or not they
sign-extend when up-converting.
- Boolean conversion opcodes all have the explicit size on the bool and
are named <src_type>2<dst_type>.
Making things consistent also allows nir_type_conversion_op to be moved
to nir_opcodes.c and auto-generated using mako. This will make adding
int8, int16, and float16 versions much easier when the time comes.
Reviewed-by: Eric Anholt <eric@anholt.net>
NIR is a typeless IR and the two opcodes, when considered bitwise, do
exactly the same thing. There's no reason to have two versions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
There are some line wrapping violations here but those lines will get
deleted in the following patch.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
v2: Don't up-convert the shift count parameter if shift instructions.
Suggested by Connor. Add type_is_singed() function. This will make
adding 8- and 16-bit types easier.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
v2 (idr): "cut them down later" => Remove ir_unop_b2u64 and
ir_unop_u642b. Handle these with extra i2u or u2i casts just like
uint(bool) and bool(uint) conversion is done.
v3 (idr): Make the "from" type in a cast unsized. This reduces the
number of required cast operations at the expensive slightly more
complex code. However, this will be a dramatic improvement when other
sized integer types are added. Suggested by Connor.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
So far, input_reads was a bitmap tracking which vertex input locations
were being used.
In OpenGL, an attribute bigger than a vec4 (like a dvec3 or dvec4)
consumes just one location, any other small attribute. So we mark the
proper bit in inputs_read, and also the same bit in double_inputs_read
if the attribute is a dvec3/dvec4.
But in Vulkan, this is slightly different: a dvec3/dvec4 attribute
consumes two locations, not just one. And hence two bits would be marked
in inputs_read for the same vertex input attribute.
To avoid handling two different situations in NIR, we just choose the
latest one: in OpenGL, when creating NIR from GLSL/IR, any dvec3/dvec4
vertex input attribute is marked with two bits in the inputs_read bitmap
(and also in the double_inputs_read), and following attributes are
adjusted accordingly.
As example, if in our GLSL/IR shader we have three attributes:
layout(location = 0) vec3 attr0;
layout(location = 1) dvec4 attr1;
layout(location = 2) dvec3 attr2;
then in our NIR shader we put attr0 in location 0, attr1 in locations 1
and 2, and attr2 in location 3 and 4.
Checking carefully, basically we are using slots rather than locations
in NIR.
When emitting the vertices, we do a inverse map to know the
corresponding location for each slot.
v2 (Jason):
- use two slots from inputs_read for dvec3/dvec4 NIR from GLSL/IR.
v3 (Jason):
- Fix commit log error.
- Use ladder ifs and fix braces.
- elements_double is divisible by 2, don't need DIV_ROUND_UP().
- Use if ladder instead of a switch.
- Add comment about hardware restriction in 64bit vertex attributes.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Here we also remove the duplicate field in gl_linked_shader and always
get the value from shader_info instead.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This also removes the duplicate field in gl_linked_shader, and
gets num_ubos from shader_info instead.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
We rename it to nir_deref_clone, re-order the sources to match the other
clone functions, and expose nir_deref_var_clone. This past part, in
particular, lets us get rid of quite a few lines since we no longer have
to call nir_copy_deref and wrap it in deref_as_var.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
git grep -l comparitor | xargs sed -i 's/comparitor/comparator/g'
Just happened to notice this in a patch that was sent and included one
of the tokens in question.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
In 19a541f (nir: Get rid of nir_constant_data) a number of places that
operated on nir_constant::values were mechanically converted to operate
on the whole array without regard for the base type. Only
GLSL_TYPE_FLOAT and GLSL_TYPE_DOUBLE can be matrices, so only those
types can have data in the non-0 array element.
See also b870394.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
All of these are happily set from glsl_to_nir or spirv_to_nir but their
values are never used for anything.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
This has bothered me for about as long as NIR has been around. Why do we
have two different unions for constants? No good reason other than one of
them is a direct port from GLSL IR.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Certain built-in arrays, such as gl_ClipDistance[], gl_CullDistance[],
gl_TessLevelInner[], and gl_TessLevelOuter[] are specified as scalar
arrays. Normal scalar arrays are sparse - each array element usually
occupies a whole vec4 slot. However, most hardware assumes these
built-in arrays are tightly packed.
The new var->data.compact flag indicates that a scalar array should
be tightly packed, so a float[4] array would take up a single vec4
slot, and a float[8] array would take up two slots.
They are still arrays, not vec4s, however. nir_lower_io will generate
intrinsics using ARB_enhanced_layouts style component qualifiers.
v2: Add nir_validate code to enforce type restrictions.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>