This pass searches for reasonably large local variables which can be
statically proven to be constant and moves them into shader constant
data. This is especially useful when large tables are baked into the
shader source code because they can be moved into a UBO by the driver to
reduce register pressure and make indirect access cheaper.
v2 (Jason Ekstrand):
- Use a size/align function to ensure we get the right alignments
- Use the newly added deref offset helpers
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This commit adds a concept to NIR of having a blob of constant data
associated with a shader. Instead of being a UBO or uniform that can be
manipulated by the client, this constant data considered part of the
shader and remains constant across all invocations of the given shader
until the end of time. To access this constant data from the shader, we
add a new load_constant intrinsic. The intention is that drivers will
eventually lower load_constant intrinsics to load_ubo, load_uniform, or
something similar. Constant data will be used by the optimization pass
in the next commit but this concept may also be useful for OpenCL.
v2 (Jason Ekstrand):
- Rename num_constants to constant_data_size (anholt)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Now that SSA values can be derefs and they have special rules, we have
to be a bit more careful about our LCSSA phis. In particular, we need
to clean up in case LCSSA ended up creating a phi node for a deref.
This fixes validation issues with some Vulkan CTS tests with the new
deref instructions.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
This commit completely reworks function calls in NIR. Instead of having
a set of variables for the parameters and return value, nir_call_instr
now has simply has a number of sources which get mapped to load_param
intrinsics inside the functions. It's up to the client API to build an
ABI on top of that. In SPIR-V, out parameters are handled by passing
the result of a deref through as an SSA value and storing to it.
This virtue of this approach can be seen by how much it allows us to
delete from core NIR. In particular, nir_inline_functions gets halved
and goes from a fairly difficult pass to understand in detail to almost
trivial. It also simplifies spirv_to_nir somewhat because NIR functions
never were a good fit for SPIR-V.
Unfortunately, there is no good way to do this without a mega-commit.
Core NIR and SPIR-V have to be changed at the same time. This also
requires changes to anv and radv because nir_inline_functions couldn't
handle deref instructions before this change and can't work without them
after this change.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This adds a concept of "members" to a variable with an interface type.
It allows you to specify the full variable data for each member of the
interface instead of once for the variable. We also add a lowering pass
to lower those variables to a sequence of variables and rewrite all the
derefs accordingly.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This will be removed at the end of the transition, but add some tracking
plus asserts to help ensure that lowering passes are called at the
correct point (pre or post deref instruction lowering) as passes are
converted and the point where lower_deref_instrs() is called is moved.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This commit adds a new instruction type to NIR for handling derefs.
Nothing uses it yet but this adds the data structure as well as all of
the code to validate, print, clone, and [de]serialize them.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is copied from the corresponding value in ir_variable. The
intention is to eventually use it in a pure-NIR linker.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Run this pass late (after opt loop) to move load_const instructions back
into the basic blocks which use the result, in cases where a load_const
is only consumed in a single block.
This helps reduce register usage in cases where the backend driver
cannot lower the load_const to a uniform.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
nir_ssa_def::parent_instr and nir_src::parent_instr have the same name,
but they mean really different things. I choose to save the next person
the hour+ that I just spent figuring that out. Even now that I know, I
doubt I'd notice in code review that someone typed foo->parent_instr
when they actually meant foo->ssa->parent_instr.
v2: Minor wording tweak in nir_ssa_def::parent_instr. Suggested by
Jason.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This is basically the same as the GLSL lowering path.
v2: Fix typo in the link
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This is based on the glsl/lower_instructions.cpp implementation, but
should be much more readable.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
ufind_msb is easily expressed in terms of clz, and we can reduce ifind_msb
to that.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
V3D doesn't have opcodes for ibfe/ubfe, so we need to lower similarly to
glsl/lower_instructions.cpp.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
If you don't have HW to do bfi, then lowering bitfieldInsert to bfi makes
things harder than keeping the "bits" argument around.
This still uses bfm, but I've added the obvious lowering of bfm if you
need it.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Rename and change the prototype for consistency regarding
nir_tex_instr_is_query(). This function will be used in the
following patch.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This pass is required by the Midgard compiler; our instruction set uses
NIR-style booleans (~0 for true) but lacks a dedicated b2f instruction.
Normally, this lowering pass would be implemented in a backend-specific
algebraic pass, but this conflicts with the existing iand->b2f pass in
nir_opt_algebraic.py, hanging the compiler. This patch thus makes the
existing pass optional (default on -- all other backends should remain
unaffected), adding an optional pass for lowering the opposite
direction.
v2: Defer lowering until late algebraic optimisations to allow
optimising the b2f instruction itself.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Not all bit-sizes may be supported natively in hardware for all operations.
This pass allows drivers to lower such operations to a bit-size that is
actually supported and then converts the result back to the original
bit-size.
Compiler backends control which operations and wich bit-sizes require
the lowering through a callback function.
v2: generalize this pass and make it available in NIR core (Rob, Jason)
v3: remove some temporaries and reduce nesting in instruction loop using
a continue statement (Jason)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
With this we should have no passes in src/compiler/nir with any
dependencies on headers from core GL Mesa.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Add helpers to get the number of src/dest components for an intrinsic,
and update spots that were open-coding this logic to use the helpers
instead.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Because nir_instr_remove is an inline wrapper around nir_instr_remove_v,
the compiler should be able to tell that the return value is unused and
not emit the extra code in most cases.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
I threatened to do this a long time ago.. I probably *should* have done
it a long time ago when there where many fewer intrinsics. But the
system of macro/#include magic for dealing with intrinsics is a bit
annoying, and python has the nice property of optional fxn params,
making it possible to define new intrinsics while ignoring parameters
that are not applicable (and naming optional params). And not having to
specify various array lengths explicitly is nice too.
I think the end result makes it easier to add new intrinsics.
v2: couple small fixes found with a test program to compare the old and
new tables
v3: misc comments, don't rely on capture=true for meson.build, get rid
of system_values table to avoid return value of intrinsic() and
*mostly* remove side-effects, add autotools build support
v4: scons build
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
i965 and gallium handle the atomic buffer index differently. It was
just by luck that the single piglit test for this was passing.
For gallium we use the atomic binding so that we match the handling
in st_bind_atomics().
On radeonsi this fixes the CTS test:
KHR-GL43.shader_storage_buffer_object.advanced-write-fragment
It also fixes tressfx hair rendering in Tomb Raider.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This will only ever be used by gallium drivers so it probably doesn't
belong in the nir toolkit. Also we want to pass it some non NIR
things in the following patch.
To avoid regressions we wrap the lowering calls that have been moved
to st_glsl_to_nir with a quick hack so that they are only called for
radeonsi, we will replace the hack with a check for uniform packing
in a following patch.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This pass moves load UBO operations just before their first use,
loosely based on nir_opt_move_comparisons.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
OpenCL kernels also have int8/uint8.
v2: remove changes in nir_search as Jason posted a patch for that
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
This is based heavily on 97f10934ed, "ac/nir: Add vote_ieq/vote_feq
lowering pass." from Bas Nieuwenhuizen. This version is a bit more
general since it's in common code. It also properly handles NaN due to
not flipping the comparison for floats.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>