OpenCL knows vector of size 8 and 16.
v2: rebased on master (nir_swizzle rework)
rework more declarations with nir_component_mask_t
adjust print_var_decl
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
There are no fixed sized array arguments in C, those are simply pointers
to unsized arrays and as the size is passed in anyway, just rely on that.
where possible calls are replaced by nir_channel and nir_channels.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
This commit completely reworks function calls in NIR. Instead of having
a set of variables for the parameters and return value, nir_call_instr
now has simply has a number of sources which get mapped to load_param
intrinsics inside the functions. It's up to the client API to build an
ABI on top of that. In SPIR-V, out parameters are handled by passing
the result of a deref through as an SSA value and storing to it.
This virtue of this approach can be seen by how much it allows us to
delete from core NIR. In particular, nir_inline_functions gets halved
and goes from a fairly difficult pass to understand in detail to almost
trivial. It also simplifies spirv_to_nir somewhat because NIR functions
never were a good fit for SPIR-V.
Unfortunately, there is no good way to do this without a mega-commit.
Core NIR and SPIR-V have to be changed at the same time. This also
requires changes to anv and radv because nir_inline_functions couldn't
handle deref instructions before this change and can't work without them
after this change.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This lets you easily build float immediates just given the bit size.
If we have this single place here to handle this then it will be
easier to add support for 16-bit floats later.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
I threatened to do this a long time ago.. I probably *should* have done
it a long time ago when there where many fewer intrinsics. But the
system of macro/#include magic for dealing with intrinsics is a bit
annoying, and python has the nice property of optional fxn params,
making it possible to define new intrinsics while ignoring parameters
that are not applicable (and naming optional params). And not having to
specify various array lengths explicitly is nice too.
I think the end result makes it easier to add new intrinsics.
v2: couple small fixes found with a test program to compare the old and
new tables
v3: misc comments, don't rely on capture=true for meson.build, get rid
of system_values table to avoid return value of intrinsic() and
*mostly* remove side-effects, add autotools build support
v4: scons build
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
I've been doing this inside of vc4, but vc5 wants it as well and it may be
useful for other drivers (Intel has a related path for pre-gen6 with MRT,
and freedreno had a TGSI path for it at one point).
This required defining a common enum for the standard comparison
functions, but other lowering passes are likely to also want that enum.
v2: Add to meson.build as well.
Acked-by: Rob Clark <robdclark@gmail.com>
Simple search for a backslash followed by two newlines.
If one of the newlines were to be removed, this would cause issues, so
let's just remove these trailing backslashes.
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The NIR story on conversion opcodes is a mess. We've had way too many
of them, naming is inconsistent, and which ones have explicit sizes was
sort-of random. This commit re-organizes things and makes them all
consistent:
- All non-bool conversion opcodes now have the explicit size in the
destination and are named <src_type>2<dst_type><size>.
- Integer <-> integer conversion opcodes now only come in i2i and u2u
forms (i2u and u2i have been removed) since the only difference
between the different integer conversions is whether or not they
sign-extend when up-converting.
- Boolean conversion opcodes all have the explicit size on the bool and
are named <src_type>2<dst_type>.
Making things consistent also allows nir_type_conversion_op to be moved
to nir_opcodes.c and auto-generated using mako. This will make adding
int8, int16, and float16 versions much easier when the time comes.
Reviewed-by: Eric Anholt <eric@anholt.net>
Each of the pop functions (and push_else) take a control flow parameter as
their second argument. If NULL, it assumes that the builder is in a block
that's a direct child of the control-flow node you want to pop off the
virtual stack. This is what 90% of consumers will want. The SPIR-V pass,
however, is a bit more "creative" about how it walks the CFG and it needs
to be able to pop multiple levels at a time, hence the argument.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
We rename it to nir_deref_clone, re-order the sources to match the other
clone functions, and expose nir_deref_var_clone. This past part, in
particular, lets us get rid of quite a few lines since we no longer have
to call nir_copy_deref and wrap it in deref_as_var.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
When restoring something from shader cache we won't have and don't
want to create a nir_shader this change detaches the two.
There are other advantages such as being able to reuse the
shader info populated by GLSL IR.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The previous nir_load_system_value(b, nir_intrinsic_load_whatever), 0) was
rather verbose, when system values should be easy to generate.
The index is left out because only one system value had an index included
in it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The first simply picks the bany_inequal[234] opcodes based on the SSA
def's number of components. The latter implicitly compares with zero
to achieve the same semantics of GLSL's any().
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Backends can normally handle shader inputs solely by looking at
load_input intrinsics, and ignore the nir_variables in nir->inputs.
One exception is fragment shader inputs. load_input doesn't capture
the necessary interpolation information - flat, smooth, noperspective
mode, and centroid, sample, or pixel for the location. This means
that backends have to interpolate based on the nir_variables, then
associate those with the load_input intrinsics (say, by storing a
map of which variables are at which locations).
With GL_ARB_enhanced_layouts, we're going to have multiple varyings
packed into a single vec4 location. The intrinsics make this easy:
simply load N components from location <loc, component>. However,
working with variables and correlating the two is very awkward; we'd
much rather have intrinsics capture all the necessary information.
Fragment shader input interpolation typically works by producing a
set of barycentric coordinates, then using those to do a linear
interpolation between the values at the triangle's corners.
We represent this by introducing five new load_barycentric_* intrinsics:
- load_barycentric_pixel (ordinary variable)
- load_barycentric_centroid (centroid qualified variable)
- load_barycentric_sample (sample qualified variable)
- load_barycentric_at_sample (ARB_gpu_shader5's interpolateAtSample())
- load_barycentric_at_offset (ARB_gpu_shader5's interpolateAtOffset())
Each of these take the interpolation mode (smooth or noperspective only)
as a const_index, and produce a vec2. The last two also take a sample
or offset source.
We then introduce a new load_interpolated_input intrinsic, which
is like a normal load_input intrinsic, but with an additional
barycentric coordinate source.
The intention is that flat inputs will still use regular load_input
intrinsics. This makes them distinguishable from normal inputs that
need fancy interpolation, while also providing all the necessary data.
This nicely unifies regular inputs and interpolateAt functions.
Qualifiers and variables become irrelevant; there are just
load_barycentric intrinsics that determine the interpolation.
v2: Document the interp_mode const_index value, define a new
BARYCENTRIC() helper rather than using SYSTEM_VALUE() for
some of them (requested by Jason Ekstrand).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This is similar to nir_channel except that it lets you grab more than one
channel by providing a mask.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
There's no reason for having a macro *and* a python generator. We can
easily just do the whole thing in python. This has the advantage that we
are no longer definining ALU# macros which conflict with the ones in
brw_fs_builder.h.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
It's what all the call-sites once, so gets rid of a bunch of inlined
glsl_get_base_type() at the call-sites.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
As they are not standard C++ and are not supported by MSVC C++ compiler.
Just have nir_imm_double match nir_imm_float above.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
v2:
- Group num_components and bit_size together (Jason)
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
No need for it not to be const, and lets caller declare it const if
desired.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
v2:
- Make the users to give the right bit_sizes as arguments (Jason).
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
v2: Squash multiple commits addressing the new parameter in different
files so we don't break the build (Iago)
v3: Fix tgsi (Samuel)
v4: Fix nir_clone.c (Samuel)
v5: Fix vc4 and freedreno (Iago)
v6 (Sam)
- Fix build errors in nir_lower_indirect_derefs
- Use helper to get type size from nir_alu_type.
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>