Commit Graph

138 Commits

Author SHA1 Message Date
Thomas Helland
ec8423a4b1 nir: Add a LCSAA-pass
V2: Do a "depth first search" to convert to LCSSA

V3: Small comment fixup

V4: Rebase, adapt to removal of function overloads

V5: Rebase, adapt to relocation of nir to compiler/nir
    Still need to adapt to potential if-uses
    Work around nir_validate issue

V6 (Timothy):
 - tidy lcssa and stop leaking memory
 - dont rewrite the src for the lcssa phi node
 - validate lcssa phi srcs to avoid postvalidate assert
 - don't add new phi if one already exists
 - more lcssa phi validation fixes
 - Rather than marking ssa defs inside a loop just mark blocks inside
   a loop. This is simpler and fixes lcssa for intrinsics which do
   not have a destination.
 - don't create LCSSA phis for loops we won't unroll
 - require loop metadata for lcssa pass
 - handle case were the ssa defs use outside the loop is already a phi

V7: (Timothy)
- pass indirect mask to metadata call

v8: (Timothy)
- make convert to lcssa a helper function rather than a nir pass
- replace inside loop bitset with on the fly block index logic.
- remove lcssa phi validation special cases
- inline code from useless helpers, suggested by Jason.
- always do lcssa on loops, suggested by Jason.
- stop making lcssa phis special. Add as many source as the block
  has predecessors, suggested by Jason.

V9: (Timothy)
- fix regression with the is_lcssa_phi field not being initialised
  to false now that ralloc() doesn't zero out memory.

V10: (Timothy)
- remove extra braces in SSA example, pointed out by Topi

V11: (Timothy)
- add missing support for LCSSA phis in if conditions.

V12: (Timothy)
- small tidy up suggested by Jason.
- always create lcssa phi even if it just points to an lcssa
  phi from an inner loop

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-12-23 10:15:36 +11:00
Thomas Helland
6772a17acc nir: Add a loop analysis pass
This pass detects induction variables and calculates the
trip count of loops to be used for loop unrolling.

V2: Rebase, adapt to removal of function overloads

V3: (Timothy Arceri)
 - don't try to find trip count if loop terminator conditional is a phi
 - fix trip count for do-while loops
 - replace conditional type != alu assert with return
 - disable unrolling of loops with continues
 - multiple fixes to memory allocation, stop leaking and don't destroy
   structs we want to use for unrolling.
 - fix iteration count bugs when induction var not on RHS of condition
 - add FIXME for && conditions
 - calculate trip count for unsigned induction/limit vars

V4: (Timothy Arceri)
- count instructions in a loop
- set the limiting_terminator even if we can't find the trip count for
 all terminators. This is needed for complex unrolling where we handle
 2 terminators and the trip count is unknown for one of them.
- restruct structs so we don't keep information not required after
 analysis and remove dead fields.
- force unrolling in some cases as per the rules in the GLSL IR pass

V5: (Timothy Arceri)
- fix metadata mask value 0x10 vs 0x16

V6: (Timothy Arceri)
- merge loop_variable and nir_loop_variable structs and lists suggested by Jason
- remove induction var hash table and store pointer to induction information in
  the loop_variable suggested by Jason.
- use lowercase list_addtail() suggested by Jason.
- tidy up init_loop_block() as per Jasons suggestions.
- replace switch with nir_op_infos[alu->op].num_inputs == 2 in
  is_var_basic_induction_var() as suggested by Jason.
- use nir_block_last_instr() in and rename foreach_cf_node_ex_loop() as suggested
  by Jason.
- fix else check for is_trivial_loop_terminator() as per Connors suggetions.
- simplify offset for induction valiables incremented before the exit conditions is
  checked.
- replace nir_op_isub check with assert() as it should have been lowered away.

V7: (Timothy Arceri)
- use rzalloc() on nir_loop struct creation. Worked previously because ralloc()
  was broken and always zeroed the struct.
- fix cf_node_find_loop_jumps() to find jumps when loops contain
  nested if statements. Code is tidier as a result.

V8: (Timothy Arceri)
- move is_trivial_loop_terminator() to nir.h so we can use it to assert is
  the loop unroll pass
- fix analysis to not bail when looking for terminator when the break is in the else
  rather then the if
- added new loop terminator fields: break_block, continue_from_block and
  continue_from_then so we don't have to gather these when doing unrolling.
- get correct array length when forcing unrolling of variables
  indexed arrays that are the same size as the iteration count
- add support for induction variables of type float
- update trival loop terminator check to allow an if containing
  instructions as long as both branches contain only a single
  block.

V9: (Timothy)
 - bunch of tidy ups and simplifications suggested by Jason.
 - rewrote trivial terminator detection, now the only restriction is there
   must be no nested jumps, anything else goes.
 - rewrote the iteration test to use nir_eval_const_opcode().
 - count instruction properly even when forcing an unroll.
 - bunch of other tidy ups and simplifications.

V10: (Timothy)
 - some trivial tidy ups suggested by Jason.
 - conditional fix for break inside continue branch by Jason.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-12-23 10:15:36 +11:00
Jason Ekstrand
a620f66872 nir: Add a couple quick-and-dirty out-of-SSA helpers
These are designed for use within an optimization pass when SSA becomes
more pain than it's worth.  They're very naive and don't generate
anything close to optimal register-based NIR.  Also, they may result in
shaders which do not validate because of, for instance, registers in phi
sources.  However, the register-based into-SSA pass should be pretty
efficient at cleaning up the mess.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-12-23 10:15:35 +11:00
Iago Toral Quiroga
5be2e785b1 nir/lower_tex: add lowering for texture gradient on shadow samplers
This is ported from the Intel lowering pass that we use with GLSL IR.
This takes care of lowering texture gradients on shadow samplers other
than cube maps. Intel hardware requires this for gen < 8.

v2 (Ken):
 - Use the helper function to retrieve ddx/ddy
 - Swizzle away size components we are not interested in

v3:
- Get rid of the ddx/ddy helper and use nir_tex_instr_src_index
  instead (Ken, Eric)

v4:
- Add a 'continue' statement if the lowering makes progress because it
  replaces the original texture instruction

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v3)
2016-12-13 10:32:52 +01:00
Iago Toral Quiroga
a8e740c354 nir/lower_tex: add lowering for texture gradient on cube maps
This is ported from the Intel lowering pass that we use with GLSL IR.
The NIR pass only handles cube maps, not shadow samplers, which are
also lowered for gen < 8 on Intel hardware. We will add support for
that in a later patch, at which point we should be able to remove
the GLSL IR lowering pass.

v2:
- added a helper to retrieve ddx/ddy parameters (Ken)
- No need to make size.z=1.0, we are only using component x anyway (Iago)

v3:
- Get rid of the ddx/ddy helper and use nir_tex_instr_src_index
  instead (Ken, Eric)

v4:
- When emitting the textureLod operation, copy all texture parameters
  from the original textureGrad() (except for ddx/ddy) using a loop
- Add a 'continue' statement if the lowering makes progress because it
  replaces the original texture instruction

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v3)
2016-12-13 10:32:00 +01:00
Ilia Mirkin
fd249c803e treewide: s/comparitor/comparator/
git grep -l comparitor | xargs sed -i 's/comparitor/comparator/g'

Just happened to notice this in a patch that was sent and included one
of the tokens in question.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-12-12 22:13:07 -05:00
Jason Ekstrand
7db009b59e nir: Remove some unused fields from nir_variable
All of these are happily set from glsl_to_nir or spirv_to_nir but their
values are never used for anything.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-12-05 15:40:10 -08:00
Jason Ekstrand
50e0b0bee3 nir: Delete most of the constant_initializer support
Constant initializers have been a constant (ha!) pain for quite some time.
While they're useful from a language perspective, people writing passes or
backends really don't want deal with them most of the time.  This commit
removes most of the constant initializer support from NIR.  It is expected
that you call nir_lower_constant_initializers VERY EARLY to ensure that
they're gone before you do anything interesting.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-12-05 15:40:09 -08:00
Jason Ekstrand
f5232db9e5 nir: Add a pass for lowering away constant initializers
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-12-05 15:40:09 -08:00
Jason Ekstrand
19a541f496 nir: Get rid of nir_constant_data
This has bothered me for about as long as NIR has been around.  Why do we
have two different unions for constants?  No good reason other than one of
them is a direct port from GLSL IR.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-12-02 10:53:32 -08:00
Kenneth Graunke
9a179f2db0 nir: add a pass to compact clip/cull distances.
v2: Use nir_is_per_vertex_io() rather than is_arrays_of_arrays().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-22 00:29:23 -08:00
Kenneth Graunke
663b2e9a92 nir: Add a "compact array" flag and IO lowering code.
Certain built-in arrays, such as gl_ClipDistance[], gl_CullDistance[],
gl_TessLevelInner[], and gl_TessLevelOuter[] are specified as scalar
arrays.  Normal scalar arrays are sparse - each array element usually
occupies a whole vec4 slot.  However, most hardware assumes these
built-in arrays are tightly packed.

The new var->data.compact flag indicates that a scalar array should
be tightly packed, so a float[4] array would take up a single vec4
slot, and a float[8] array would take up two slots.

They are still arrays, not vec4s, however.  nir_lower_io will generate
intrinsics using ARB_enhanced_layouts style component qualifiers.

v2: Add nir_validate code to enforce type restrictions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-22 00:29:23 -08:00
Kenneth Graunke
ad9d4a4f8d nir: Generalize the "is per-vertex variable?" helpers and export them.
I want this function for nir_gather_info(), and realized it's basically
the same as the ones in nir_lower_io().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-11-11 09:17:07 +11:00
Dave Airlie
b16dff2d88 nir: add conditional discard optimisation (v4)
This is ported from GLSL and converts

if (cond)
	discard;

into
discard_if(cond);

This removes a block, but also is needed by radv
to workaround a bug in the LLVM backend.

v2: handle if (a) discard_if(b) (nha)
cleanup and drop pointless loop (Matt)
make sure there are no dependent phis (Eric)
v3: make sure only one instruction in the then block.
v4: remove sneaky tabs, add cursor init (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-11-10 05:46:33 +10:00
Timothy Arceri
2e423ca147 nir: stop adjusting driver location for varying packing
As of 59864e8e02 we just use the location assigned by the front-end and
no longer need this for i965.

Since there were some issues in the logic with assigning arrays the same
driver location if they didn't start at the same location just remove it
and let other drivers implement a solution if needed when they add
ARB_enhanced_layouts support.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-10-26 14:29:36 +11:00
Timothy Arceri
e1af20f18a nir/i965/anv/radv/gallium: make shader info a pointer
When restoring something from shader cache we won't have and don't
want to create a nir_shader this change detaches the two.

There are other advantages such as being able to reuse the
shader info populated by GLSL IR.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-26 14:29:36 +11:00
Timothy Arceri
094fe3a959 nir: move nir_shader_info to a common compiler header
This will allow use to stop copying values between structs and
will also simplify handling handling these values in the shader cache.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-26 14:29:36 +11:00
Jason Ekstrand
ae032e5ea6 nir: Remove some no longer needed asserts
Now that the NIR casting functions have type assertions, we have a bunch of
assertions that aren't needed anymore.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-10-06 09:16:39 -07:00
Jason Ekstrand
2ed17d46de nir: Make nir_foo_first/last_cf_node return a block instead
One of NIR's invariants is that control flow lists always start and end
with blocks.  There's no good reason why we should return a cf_node from
these functions since we know that it's always a block.  Making it a block
lets us remove a bunch of code.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-10-06 09:16:37 -07:00
Jason Ekstrand
7a3bcadf4e nir: Add asserts to the casting functions
This makes calling nir_foo_as_bar a bit safer because we're no longer 100%
trusting in the caller to ensure that it's safe.  The caller still needs to
do the right thing but this ensures that we catch invalid casts with an
assert rather than by reading garbage data.  The one downside is that we do
use the casts a bit in nir_validate and it's not a validate_assert.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-10-06 09:16:24 -07:00
Eric Anholt
36f0f03182 nir: Allow opt_peephole_sel to be more aggressive in flattening IFs.
VC4 was running into a major performance regression from enabling control
flow in the glmark2 conditionals test, because of short if statements
containing an ffract.

This pass seems like it was was trying to ensure that we only flattened
IFs that should be entirely a win by guaranteeing that there would be
fewer bcsels than there were MOVs otherwise.  However, if the number of
ALU ops is small, we can avoid the overhead of branching (which itself
costs cycles) and still get a win, even if it means moving real
instructions out of the THEN/ELSE blocks.

For now, just turn on aggressive flattening on vc4.  i965 will need some
tuning to avoid regressions.  It does looks like this may be useful to
replace freedreno code.

Improves glmark2 -b conditionals:fragment-steps=5:vertex-steps=0 from 47
fps to 95 fps on vc4.

vc4 shader-db:
total instructions in shared programs: 101282 -> 99543 (-1.72%)
instructions in affected programs:     17365 -> 15626 (-10.01%)
total uniforms in shared programs: 31295 -> 31172 (-0.39%)
uniforms in affected programs:     3580 -> 3457 (-3.44%)
total estimated cycles in shared programs: 225182 -> 223746 (-0.64%)
estimated cycles in affected programs:     26085 -> 24649 (-5.51%)

v2: Update shader-db output.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
2016-09-22 11:10:21 +03:00
Dave Airlie
7bf76563e2 glsl: add subpass image type (v2)
SPIR-V/Vulkan have a special image type for input attachments
called the subpass type. It has different characteristics than
other images types.

The main one being it can only be an input image to fragment
shaders and loads from it are relative to the frag coord.

This adds support for it to the GLSL types. Unfortunately
we've run out of space in the sampler dim in types, so we
need to use another bit.

v2: Fixup subpass input name (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-09-16 15:16:31 +10:00
Jason Ekstrand
ed65e6ef49 nir: Add a flag to lower_io to force "sample" interpolation
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-15 13:31:43 -07:00
Kenneth Graunke
2d8a3fa7ea nir: Report progress from nir_lower_phis_to_scalar.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-09-14 12:01:51 -07:00
Kenneth Graunke
32630e211e nir: Report progress from nir_lower_alu_to_scalar.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-09-14 12:01:49 -07:00
Rob Clark
1a8424ceba nir: move tex_instr_remove_src
I want to re-use this in a different pass, so move to nir.h

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-09-14 13:45:32 -04:00
Jason Ekstrand
88a2a2e053 nir/gcm: Add global value numbering support
Unlike the current CSE pass, global value numbering is capable of detecting
common values even if one does not dominate the other.  For instance, in
you have

if (...) {
   ssa_1 = ssa_0 + 7;
   /* use ssa_1 */
} else {
   ssa_2 = ssa_0 + 7;
   /* use ssa_2 */
}

Global value numbering doesn't care about dominance relationships so it
figures out that ssa_1 and ssa_2 are the same and converts this to

if (...) {
   ssa_1 = ssa_0 + 7;
   /* use ssa_1 */
} else {
   /* use ssa_1 */
}

Obviously, we just broke SSA form which is bad.  Global code motion,
however, will repair this for us by turning this into

ssa_1 = ssa_0 + 7;
if (...) {
   /* use ssa_1 */
} else {
   /* use ssa_1 */
}

This intended to eventually mostly replace CSE.  However, conventional CSE
may still be useful because it's less of a scorched-earth approach and
doesn't require GCM.  This makes it a bit more appropriate for use as a
clean-up in a late optimization run.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-08 20:53:01 -07:00
Connor Abbott
356d101af3 nir: remove some fields from nir_shader_compiler_options
I accidentally added these with 0dc4cab. Oops!
2016-09-03 00:49:58 -04:00
Connor Abbott
0dc4cabee2 nir: add nir_after_phis() cursor helper
And re-implement nir_after_cf_node_and_phis() using it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-09-03 00:37:48 -04:00
Kenneth Graunke
93bfa1d7a2 nir: Change nir_shader_get_entrypoint to return an impl.
Jason suggested adding an assert(function->impl) here.  All callers
of this function actually want ->impl, so I decided just to change
the API.

We also change the nir_lower_io_to_temporaries API here.  All but one
caller passed nir_shader_get_entrypoint(), and with the previous commit,
it now uses a nir_function_impl internally.  Folding this change in
avoids the need to change it and change it back.

v2: Fix one call I missed in ir3_compiler (caught by Eric).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-08-25 19:18:24 -07:00
Francisco Jerez
97ac3eba58 nir: Pass through fb_fetch_output and OutputsRead from GLSL IR.
The NIR representation of framebuffer fetch is the same as the GLSL
IR's until interface variables are lowered away, at which point it
will be translated to load output intrinsics.  The GLSL-to-NIR pass
just needs to copy the bits over to the NIR program.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-25 18:33:29 -07:00
Eric Anholt
9f1411d1ec nir: Add an IO scalarizing pass using the intrinsic's first_component.
vc4 wants to have per-scalar IO load/stores so that dead code elimination
can happen on a more granular basis, which it has been doing in the
backend using a multiplication by 4 of the intrinsic's driver_location.
We can represent it properly in the NIR using the first_component field,
though.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-19 13:11:36 -07:00
Eric Anholt
24728637e2 nir: Move the undef of nir_intrinsics.h macros to the .h.
I wanted to include this from nir_builder as well, so it also needed the
undefs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-08-19 13:11:36 -07:00
Kenneth Graunke
7603b4d3a1 nir: Make nir_alu_srcs_equal non-static.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-08-04 00:41:07 -07:00
Matt Turner
d1f6f65697 glsl: Separate overlapping sentinel nodes in exec_list.
I do appreciate the cleverness, but unfortunately it prevents a lot more
cleverness in the form of additional compiler optimizations brought on
by -fstrict-aliasing.

No difference in OglBatch7 (n=20).

Co-authored-by: Davin McCall <davmac@davmac.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-07-26 12:12:27 -07:00
Jason Ekstrand
d9156efc52 nir/lower_tex: Add support for lowering coordinate offsets
On i965, we can't support coordinate offsets for texelFetch or rectangle
textures.  Previously, we were doing this with a GLSL pass but we need to
do it in NIR if we want those workarounds for SPIR-V.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:48:53 -07:00
Jason Ekstrand
09135cd55a nir: Add a helper for determining the type of a texture source
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:27:35 -07:00
Jason Ekstrand
dc9f2436c3 nir: Add a nir_deref_foreach_leaf helper
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-07-20 15:29:55 -07:00
Kenneth Graunke
707ca00fce nir: Add nir_load_interpolated_input lowering code.
Now nir_lower_io can optionally produce load_interpolated_input
and load_barycentric_* intrinsics for fragment shader inputs.

flat inputs continue using regular load_input.

v2: Use a nir_shader_compiler_options flag rather than ad-hoc boolean
    passing (in response to review feedback from Chris Forbes).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-20 11:01:00 -07:00
Kenneth Graunke
2496462479 nir: Add new intrinsics for fragment shader input interpolation.
Backends can normally handle shader inputs solely by looking at
load_input intrinsics, and ignore the nir_variables in nir->inputs.

One exception is fragment shader inputs.  load_input doesn't capture
the necessary interpolation information - flat, smooth, noperspective
mode, and centroid, sample, or pixel for the location.  This means
that backends have to interpolate based on the nir_variables, then
associate those with the load_input intrinsics (say, by storing a
map of which variables are at which locations).

With GL_ARB_enhanced_layouts, we're going to have multiple varyings
packed into a single vec4 location.  The intrinsics make this easy:
simply load N components from location <loc, component>.  However,
working with variables and correlating the two is very awkward; we'd
much rather have intrinsics capture all the necessary information.

Fragment shader input interpolation typically works by producing a
set of barycentric coordinates, then using those to do a linear
interpolation between the values at the triangle's corners.

We represent this by introducing five new load_barycentric_* intrinsics:

- load_barycentric_pixel     (ordinary variable)
- load_barycentric_centroid  (centroid qualified variable)
- load_barycentric_sample    (sample qualified variable)
- load_barycentric_at_sample (ARB_gpu_shader5's interpolateAtSample())
- load_barycentric_at_offset (ARB_gpu_shader5's interpolateAtOffset())

Each of these take the interpolation mode (smooth or noperspective only)
as a const_index, and produce a vec2.  The last two also take a sample
or offset source.

We then introduce a new load_interpolated_input intrinsic, which
is like a normal load_input intrinsic, but with an additional
barycentric coordinate source.

The intention is that flat inputs will still use regular load_input
intrinsics.  This makes them distinguishable from normal inputs that
need fancy interpolation, while also providing all the necessary data.

This nicely unifies regular inputs and interpolateAt functions.
Qualifiers and variables become irrelevant; there are just
load_barycentric intrinsics that determine the interpolation.

v2: Document the interp_mode const_index value, define a new
    BARYCENTRIC() helper rather than using SYSTEM_VALUE() for
    some of them (requested by Jason Ekstrand).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-20 11:00:45 -07:00
Kenneth Graunke
ac1181ffbe compiler: Rename INTERP_QUALIFIER_* to INTERP_MODE_*.
Likewise, rename the enum type to glsl_interp_mode.

Beyond the GLSL front-end, talking about "interpolation modes" seems
more natural than "interpolation qualifiers" - in the IR, we're removed
from how exactly the source language specifies how to interpolate an
input.  Also, SPIR-V calls these "decorations" rather than "qualifiers".

Generated by:
$ find . -regextype egrep -regex '.*\.(c|cpp|h)' -type f -exec sed -i \
  -e 's/INTERP_QUALIFIER_/INTERP_MODE_/g' \
  -e 's/glsl_interp_qualifier/glsl_interp_mode/g' {} \;

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Dave Airlie <airlied@redhat.com>
2016-07-17 19:26:48 -07:00
Timothy Arceri
448adfbc67 nir: use the same driver location for packed varyings
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-07 10:26:43 +10:00
Timothy Arceri
0eea6b3297 nir: add new intrinsic field for storing component offset
This offset is used for packing.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-07 10:26:43 +10:00
Giuseppe Bilotta
60a27ad122 Remove wrongly repeated words in comments
Clean up misrepetitions ('if if', 'the the' etc) found throughout the
comments. This has been done manually, after grepping
case-insensitively for duplicate if, is, the, then, do, for, an,
plus a few other typos corrected in fly-by

v2:
    * proper commit message and non-joke title;
    * replace two 'as is' followed by 'is' to 'as-is'.
v3:
    * 'a integer' => 'an integer' and similar (originally spotted by
      Jason Ekstrand, I fixed a few other similar ones while at it)

Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2016-06-23 13:55:03 -07:00
Jason Ekstrand
202751fbb7 nir: Add a pass for propagating invariant decorations
This pass is similar to propagate_invariance in the GLSL compiler.  The
real "output" of this pass is that any algebraic operations which are
eventually consumed by an invariant variable get marked as "exact".

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-20 12:02:45 -07:00
Jason Ekstrand
4d3b8318a7 nir/info: Get rid of uses_interp_var_at_offset
We were using this briefly in the i965 driver to trigger recompiles but we
haven't been using it since we switched to the NIR y-transform lowering
pass.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-06-03 19:29:28 -07:00
Rob Clark
dfbae7d64f nir/algebraic: support for power-of-two optimizations
Some optimizations, like converting integer multiply/divide into left/
right shifts, have additional constraints on the search expression.
Like requiring that a variable is a constant power of two.  Support
these cases by allowing a fxn name to be appended to the search var
expression (ie. "a#32(is_power_of_two)").

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-03 16:05:03 -04:00
Jordan Justen
6f316c9d86 nir: Make lowering gl_LocalInvocationIndex optional
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-01 19:29:02 -07:00
Jason Ekstrand
15e553daf0 nir: Make nir_const_value a union
There's no good reason for it to be a struct of an anonymous union.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96221
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-26 16:03:44 -07:00
Kristian Høgsberg Kristensen
a41b57679f nir: Add a lowering pass for YUV textures
This lowers sampling from YUV textures to 1) one or more texture
instructions to sample each plane and 2) color space conversion to RGB.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-24 10:14:56 -07:00