Commit Graph

678 Commits

Author SHA1 Message Date
Connor Abbott
8a7fe634d2 nir/lower_outputs_to_temporaries: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-05 16:19:41 -07:00
Kenneth Graunke
bc0062c54a nir: Optimize out stores of undefs.
There are a couple of cycle count changes in shader-db, but it's
basically a wash.

However, with the Broadwell scalar TCS backend enabled, many
Shadow of Mordor shaders benefit from this patch.  Because we don't
batch up output writes for TCS, vec4 outputs might not have all
components defined.  Many output writes have a value of undef,
which is useless.

With scalar TCS, stats for tessellation shaders on Broadwell:

total instructions in shared programs: 1283000 -> 1280444 (-0.20%)
instructions in affected programs: 34302 -> 31746 (-7.45%)
helped: 71
HURT: 0

total cycles in shared programs: 10798768 -> 10780682 (-0.17%)
cycles in affected programs: 158004 -> 139918 (-11.45%)
helped: 71
HURT: 0

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-05 14:24:00 -07:00
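The idea behind bc0062c54a can be sketched on a toy instruction list. This is only an illustration: the tuple-based IR, the `store_output` name and `remove_undef_stores` are hypothetical and do not reflect NIR's actual data structures or the code in the patch.

```python
# Toy IR: each instruction is a tuple. This is NOT NIR's real data model.
UNDEF = ("undef",)


def remove_undef_stores(instrs):
    """Drop ('store_output', slot, value) entries whose value is UNDEF."""
    return [
        ins for ins in instrs
        if not (ins[0] == "store_output" and ins[2] == UNDEF)
    ]


program = [
    ("store_output", 0, ("const", 1.0)),
    ("store_output", 1, UNDEF),          # writes nothing meaningful
    ("store_output", 2, ("load_input", 3)),
]
optimized = remove_undef_stores(program)
```

Storing an undefined value cannot change observable behaviour, so the second store can simply be dropped.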
Kenneth Graunke
c7a8b32700 nir: Replace vecN(undef, undef, ...) with a single undef.
shader-db statistics on Broadwell:

total instructions in shared programs: 8963409 -> 8962455 (-0.01%)
instructions in affected programs: 60858 -> 59904 (-1.57%)
helped: 318
HURT: 0

total cycles in shared programs: 71408022 -> 71406276 (-0.00%)
cycles in affected programs: 398416 -> 396670 (-0.44%)
helped: 199
HURT: 51

GAINED: 1

The only shaders affected were in Dota 2 Reborn.

It also sets up for the next optimization.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-05 14:24:00 -07:00
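A minimal sketch of the folding described in c7a8b32700, again on a hypothetical tuple-based IR rather than real NIR. The names `UNDEF` and `fold_undef_vec` are made up for this example.

```python
# Toy IR: ('vec', src0, src1, ...) builds a vector from scalar sources.
UNDEF = ("undef",)


def fold_undef_vec(instr):
    """Replace a vec whose sources are all UNDEF with a single UNDEF."""
    is_vec = instr[0] == "vec" and len(instr) > 1
    if is_vec and all(src == UNDEF for src in instr[1:]):
        return UNDEF
    return instr
```

A vector with at least one defined source is left alone, since only a fully undefined vector is equivalent to a single undef.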
Kenneth Graunke
49ea7454a1 nir: Rename opt_undef_alu to opt_undef_csel; update comments.
This better reflects what it does.  I plan to add other ALU
optimizations as well, so the old name would be confusing.

In preparation for that, also move the file comments about csels
above the opt_undef_csel function, and delete the ones about there
not being other optimizations.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-05 14:24:00 -07:00
Thomas Hindoe Paaboel Andersen
8698194313 nir: fix assert for wildcard pairs
The assert was null checking dest_arr_parent twice. The intention
seems to be to check both dest_ and src_.

Added in d3636da9

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-05-05 09:33:02 +02:00
Samuel Iglesias Gonsálvez
2ab2d2e588 nir: Separate 32 and 64-bit fmod lowering
Split 32-bit and 64-bit fmod lowering as the drivers might need to
lower them separately inside NIR depending on the HW support.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-04 08:07:49 +02:00
Samuel Iglesias Gonsálvez
b902377a56 nir/lower_double_ops: lower mod()
There are rounding errors with the division in i965 that affect
the mod(x,y) result when x = N * y. Instead of returning '0' it
was returning 'y'.

This lowering pass fixes those cases.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-04 08:07:49 +02:00
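The failure mode described in b902377a56 can be reproduced with a simulated imprecise division. The sketch below assumes mod(x, y) is computed as x - y * floor(x / y); `lossy_div` is a hypothetical stand-in for the hardware division, and the guard shown is one plausible fix, not necessarily the one the pass uses.

```python
import math


def lossy_div(x, y, error=1e-9):
    """Stand-in for a division that lands slightly below the true quotient."""
    return x / y - error


def mod_naive(x, y, div=lossy_div):
    """mod(x, y) = x - y * floor(x / y), with a pluggable division."""
    return x - y * math.floor(div(x, y))


def mod_guarded(x, y, div=lossy_div):
    """Same as mod_naive, but map the impossible result y back to 0."""
    result = mod_naive(x, y, div)
    # The exact result never equals y, so result == y can only come from
    # floor() seeing a quotient that was rounded below an integer.
    return 0.0 if result == y else result
```

With x = 6.0 and y = 2.0 the simulated quotient is just under 3, floor() yields 2, and the naive formula returns 2.0 (that is, y) while the guarded version returns 0.0.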
Dave Airlie
265fe9dce8 glsl: subroutine types cannot be used in constructors.
This fixes two of the cases in
GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-04 06:44:45 +10:00
Dave Airlie
3110a0aa23 glsl: resource is a reserved keyword in GLSL 4.20 as well
resource just appears in GLSL 4.20 without any fanfare.

Fixes GL43-CTS.CommonBugs.CommonBug_ReservedNames

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-04 06:44:45 +10:00
Rob Clark
dcf8c4425a nir: make lower_clamp_color pass work after lower i/o
Kinda important to work with tgsi_to_nir, which generates nir which
already has i/o lowered.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-02 14:25:38 -04:00
Timothy Arceri
f982e2434b mesa: add LOCATION_COMPONENT support to GetProgramResourceiv
From Section 7.3.1.1 (Naming Active Resources) of the OpenGL 4.5 spec:

   "For the property LOCATION_COMPONENT, a single integer indicating the first
   component of the location assigned to an active input or output variable is
   written to params. For input and output variables with a component specified
   by a layout qualifier, the specified component is written. For all other
   input and output variables, the value zero is written."

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-05-01 23:13:36 +10:00
Timothy Arceri
b1c872a81e glsl: add component to has_layout() helper
I don't think this will do much, as it's a compiler error
to use component without location, which is already in the
table, but it's good to be consistent.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-05-01 23:13:28 +10:00
Timothy Arceri
589053dac7 glsl: validate linking of intrastage component qualifiers
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-05-01 23:13:22 +10:00
Timothy Arceri
0317dfcd9b glsl: update explicit location matching to support component qualifier
This is needed so we don't optimise away the varying when more than
one shares the same location.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-01 23:13:15 +10:00
Timothy Arceri
0d88b15f07 glsl: cross validate varyings with a component qualifier
This change checks for component overlap, including handling overlap of
locations and components by doubles. Previously there was no validation
for assigning explicit locations to a location used by the second half
of a double.

V3: simplify handling of doubles and fix double component aliasing
detection

V2: fix component matching for matrices

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-01 23:13:10 +10:00
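The overlap check described in 0d88b15f07 can be pictured with a simplified slot model. The sketch assumes each location holds four 32-bit component slots and each double component takes two of them, spilling into the next location when needed. `occupied_slots` and `overlaps` are hypothetical helpers, not functions from the GLSL linker.

```python
def occupied_slots(location, component, num_components, is_double=False):
    """Return the set of (location, component) slots a variable covers.

    Simplified model: 4 slots per location; a double-precision component
    takes 2 consecutive slots and may spill into the next location.
    """
    width = 2 if is_double else 1
    slots = set()
    for i in range(num_components * width):
        index = component + i
        slots.add((location + index // 4, index % 4))
    return slots


def overlaps(a, b):
    """True if two (location, component, num_components[, is_double]) overlap."""
    return bool(occupied_slots(*a) & occupied_slots(*b))
```

In this model a dvec3 at location 0 covers all of location 0 plus components 0 and 1 of location 1, so a float at location 1, component 0 collides with its second half, while one at component 2 does not.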
Timothy Arceri
94438578d2 glsl: validate and store component layout qualifier in GLSL IR
We make use of the existing IR field location_frac used for tracking
component locations.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-01 23:13:05 +10:00
Timothy Arceri
2d9936a686 glsl: allow component qualifier on varying inputs
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
2016-05-01 23:13:00 +10:00
Timothy Arceri
daa8df590b glsl: parse component layout qualifier
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-01 23:12:52 +10:00
Emil Velikov
cee69ccb92 spirv: automake: add missing headers to the tarball.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-05-01 08:38:06 +01:00
Thomas Hindoe Paaboel Andersen
cbcd7b60f5 nir/lower_double_ops: fix indentation
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-30 12:16:32 -07:00
Thomas Hindoe Paaboel Andersen
21424e019d nir/opt_dead_cf: fix indentation
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-30 12:16:29 -07:00
Thomas Hindoe Paaboel Andersen
6935726197 nir/opt_dead_cf: correction of side effect check
Parentheses are needed here as ! takes precedence over &. The
check had the opposite effect of what was intended.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-30 12:16:22 -07:00
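The precedence issue fixed in 6935726197 is specific to C, where `!a & b` parses as `(!a) & b`. The Python transcription below emulates C's logical NOT explicitly to show how the two forms differ. The flag values and function names are assumptions for illustration and are not taken from the patch.

```python
CAN_ELIMINATE = 1 << 0  # assumed flag values, for illustration only
CAN_REORDER = 1 << 1


def c_not(value):
    """C's logical NOT: 1 for zero, 0 for anything else."""
    return 1 if value == 0 else 0


def has_side_effects_buggy(flags):
    # C parses `!flags & CAN_ELIMINATE` as `(!flags) & CAN_ELIMINATE`.
    return bool(c_not(flags) & CAN_ELIMINATE)


def has_side_effects_fixed(flags):
    # With parentheses: `!(flags & CAN_ELIMINATE)`.
    return bool(c_not(flags & CAN_ELIMINATE))
```

For an instruction flagged only as reorderable, the unparenthesized form reports no side effects while the parenthesized form correctly reports that it cannot be eliminated.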
Rob Clark
64abf6d404 nir: clamp-color-output support
Handled by tgsi_emulate for glsl->tgsi case.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-04-30 14:56:19 -04:00
Kenneth Graunke
750c38fad1 glsl: Lower vector_extracts to swizzles after lower_vector_derefs.
lower_vector_derefs can produce new vector_extract operations.
Neither i965 nor st_glsl_to_tgsi can handle them, so we'd best
convert them to swizzles.

Together with the previous patch, this fixes assertion failures in
GLideN64, as well as a new Piglit test which reproduces the issue:
spec/glsl-1.10/compiler/vector-dereference-in-dereference.frag

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95164
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-29 16:03:36 -07:00
Kenneth Graunke
1cd600dbb9 glsl: Convert lower_vec_index_to_swizzle to a rvalue visitor.
The old visitor missed some cases.  For example, it wouldn't handle
an ir_dereference_array with a vector_extract as the index.

Rather than trying to add the missing cases, just rewrite it as an
ir_rvalue_visitor.  This makes it easy to replace any expression,
and is much less code.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95164
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-29 16:03:29 -07:00
Andres Gomez
c750029b37 glsl: Checks for interpolation into its own function.
This generalizes the validation also to be done for variables inside
interface blocks, which, for some cases, was missing.

For a discussion about the additional validation cases included see
https://lists.freedesktop.org/archives/mesa-dev/2016-March/109117.html
and Khronos bug #15671.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
2016-04-29 08:03:00 +02:00
Jason Ekstrand
6d4a426745 nir/algebraic: Support lowering for both 64 and 32-bit ldexp
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-04-28 21:36:52 -07:00
Jason Ekstrand
f0af5b87ec nir/opcodes: Make ldexp take an explicitly 32-bit int
There is no sense in having the double version of ldexp take a 64-bit
integer.  Instead, let's just take a 32-bit int all the time.  This also
matches what GLSL does where both variants of ldexp take a regular integer
for the exponent argument.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-04-28 21:36:52 -07:00
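To illustrate why a 32-bit exponent is enough for ldexp on doubles (f0af5b87ec), here is a small sketch using Python's `math.ldexp`. The wrapper `ldexp32` and its range check are hypothetical and only model the "exponent is always a regular integer" rule.

```python
import math

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def ldexp32(x, exp):
    """Compute x * 2**exp, requiring exp to fit in a signed 32-bit integer."""
    if not INT32_MIN <= exp <= INT32_MAX:
        raise ValueError("exponent does not fit in 32 bits")
    return math.ldexp(x, exp)
```

The exponent of a finite double only spans roughly -1074 to 1023, so even the smallest subnormal is reachable with an exponent far inside the 32-bit range.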
Jason Ekstrand
bee40dd730 nir/opcodes: Simplify the expressions for [un]pack_double
The new expressions are more explicit in terms of where the bits go so it's
a little easier to tell what's going on.  This is the way GLSL specifies
things so it's a bit easier to verify too.  It also has the benefit that
the new expressions easily vectorize so we can constant-fold vector forms
of the _split versions correctly.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-04-28 21:36:52 -07:00
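The "where the bits go" view of [un]pack_double in bee40dd730 can be sketched with Python's `struct` module. This follows the GLSL packDouble2x32 convention that the first component holds the 32 least significant bits. The function names are chosen for this example and are not the NIR opcode expressions.

```python
import struct


def unpack_double_2x32(value):
    """Split a double into (low 32 bits, high 32 bits)."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    return bits & 0xFFFFFFFF, bits >> 32


def pack_double_2x32(lo, hi):
    """Rebuild a double from its low and high 32-bit halves."""
    (value,) = struct.unpack("<d", struct.pack("<Q", (hi << 32) | lo))
    return value
```

For example, 1.0 has the bit pattern 0x3FF0000000000000, so its low half is 0 and its high half is 0x3FF00000.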
Jason Ekstrand
70f89dd75e nir: Switch the arguments to nir_foreach_def
This matches the "foreach x in container" pattern found in many other
programming languages.  Generated by the following regular expression:

s/nir_foreach_def(\([^,]*\),\s*\([^,]*\))/nir_foreach_def(\2, \1)/

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-28 15:54:48 -07:00
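The sed substitution quoted in 70f89dd75e can be tried out with Python's `re` module. Python groups use plain parentheses where sed's basic syntax uses `\(` and `\)`. The sample call and its argument names (`reg`, `dest`) are made up for illustration.

```python
import re

# Python equivalent of:
#   s/nir_foreach_def(\([^,]*\),\s*\([^,]*\))/nir_foreach_def(\2, \1)/
PATTERN = re.compile(r"nir_foreach_def\(([^,]*),\s*([^,]*)\)")


def swap_foreach_args(source):
    """Swap the two arguments of every nir_foreach_def(a, b) call."""
    return PATTERN.sub(r"nir_foreach_def(\2, \1)", source)


before = "nir_foreach_def(reg, dest) {"
after = swap_foreach_args(before)
```

Because both groups are `[^,]*`, the pattern only fits calls with exactly two comma-free arguments, which is presumably why the other macros needed their own similar expressions.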
Jason Ekstrand
5015260a05 nir: Switch the arguments to nir_foreach_use and friends
This matches the "foreach x in container" pattern found in many other
programming languages.  Generated by the following regular expression:

s/nir_foreach_use(\([^,]*\),\s*\([^,]*\))/nir_foreach_use(\2, \1)/

and similar expressions for nir_foreach_use_safe, etc.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-28 15:54:48 -07:00
Jason Ekstrand
9464d8c498 nir: Switch the arguments to nir_foreach_function
This matches the "foreach x in container" pattern found in many other
programming languages.  Generated by the following regular expression:

s/nir_foreach_function(\([^,]*\),\s*\([^,]*\))/nir_foreach_function(\2, \1)/

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-28 15:54:48 -07:00
Jason Ekstrand
e63766fb4b nir: Switch the arguments to nir_foreach_parallel_copy_entry
This matches the "foreach x in container" pattern found in many other
programming languages.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-28 15:54:48 -07:00
Jason Ekstrand
8564916d01 nir: Switch the arguments to nir_foreach_phi_src
This matches the "foreach x in container" pattern found in many other
programming languages.  Generated by the following regular expression:

s/nir_foreach_phi_src(\([^,]*\),\s*\([^,]*\))/nir_foreach_phi_src(\2, \1)/

and a similar expression for nir_foreach_phi_src_safe.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-04-28 15:54:48 -07:00
Jason Ekstrand
707e72f13b nir: Switch the arguments to nir_foreach_instr
This matches the "foreach x in container" pattern found in many other
programming languages.  Generated by the following regular expression:

s/nir_foreach_instr(\([^,]*\),\s*\([^,]*\))/nir_foreach_instr(\2, \1)/

and similar expressions for nir_foreach_instr_safe etc.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-28 15:54:48 -07:00
Connor Abbott
3a8688fb41 nir/algebraic: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
1f8c100614 nir/validate: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
a471c161b1 nir/nir_worklist: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
db35177772 nir/remove_dead_variables: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
b3aaae398e nir/split_var_copies: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
9d41a1ffeb nir/repair_ssa: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
480a182ccd nir/opt_peephole_select: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
e5f37701ab nir/phi_builder: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
1ba40d834b nir/opt_cp: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
8dd7d78925 nir/opt_remove_phis: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
1a8c17a59e nir/opt_undef: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
52affdd2e6 nir/opt_dead_cf: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
ddc6639f85 nir/opt_dce: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
3afb3be674 nir/opt_gcm: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00
Connor Abbott
eecf96f530 nir/opt_constant_folding: fixup for new foreach_block()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 15:52:17 -07:00