Commit Graph

46 Commits

Author SHA1 Message Date
Caio Marcelo de Oliveira Filho
430d2206da compiler: Rename local_size to workgroup_size
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>
2021-06-07 22:34:42 +00:00
Rhys Perry
49add985ff nir/unsigned_upper_bound: don't require dominance metadata
Instead, determine if it's a merge or loop exit phi.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9808>
2021-06-04 14:14:00 +00:00
Timur Kristóf
e905e0938a nir: Support upper bound of unsigned bit size conversions.
These allow us to generate slightly better code in some cases,
eg. multiplications in ACO.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Timur Kristóf
9a2ffe1abb nir: Support upper bound of subgroup_id/num_subgroups for non-compute.
These intrinsics will be used when lowering NGG shaders, including
currently supported stages like VS, TES, GS and also by mesh shaders
in the future.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Timur Kristóf
38df949f98 nir: Add tessellation related AMD-specific intrinsics.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>
2021-03-17 12:42:23 +00:00
Timur Kristóf
c2a81ebe19 nir: Add default unsigned upper bound configuration.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>
2021-03-17 12:42:23 +00:00
Timur Kristóf
8ebb8d31af nir: Add unsigned upper bound for TCS load_invocation_id.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>
2021-03-17 12:42:23 +00:00
Timur Kristóf
084863bb5d nir: Fix unsigned upper bound of local_invocation_index for non-CS stages.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>
2021-03-17 12:42:23 +00:00
Ian Romanick
da7389eced nir/range_analysis: Simplify analysis of bcsel
union_ranges was previously guarded by 'ifndef NDEBUG'.  After removing
that, I noticed that the two tables were identical.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>
2021-03-11 22:00:30 +00:00
Ian Romanick
f4a7dbc58f nir/range_analysis: Fix analysis of fmin, fmax, or fsat with NaN source
Recall that when either value is NaN, fmax will pick the other value.
This means the result range of the fmax will either be the "ideal"
result range (calculated above) or the range of the non-NaN value.

Previously, something like fmax({gt_zero}, {lt_zero, is_a_number}) would
return a range of gt_zero.  However, if the "gt_zero" parameter is NaN,
the actual result will be the "lt_zero" parameter.

This analysis depends on the is_a_number analysis also added in this MR.
Assuming this doesn't cause any unforeseen problems, I believe we should
wait a bit, then nominate a subset of the series for the stable
branches.

This fixes the piglit tests

    tests/spec/glsl-1.30/execution/range_analysis_fmax_of_nan.shader_test
    tests/spec/glsl-1.30/execution/range_analysis_fmin_of_nan.shader_test

from https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/463.

Even with the added fsat fixes, range_analysis_fsat_of_nan.shader_test
still fails.  There are some other issues there that will be addressed
in later commits (in another MR).

v2: Add fsat fixes.  Suggested by Rhys.

Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Shader-db results:

All Intel platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 21049290 -> 21049314 (<.01%)
instructions in affected programs: 3175 -> 3199 (0.76%)
helped: 0
HURT: 17
HURT stats (abs)   min: 1 max: 3 x̄: 1.41 x̃: 1
HURT stats (rel)   min: 0.20% max: 1.89% x̄: 0.97% x̃: 0.92%
95% mean confidence interval for instructions value: 1.09 1.73
95% mean confidence interval for instructions %-change: 0.75% 1.19%
Instructions are HURT.

total cycles in shared programs: 855136176 -> 855136406 (<.01%)
cycles in affected programs: 37579 -> 37809 (0.61%)
helped: 0
HURT: 17
HURT stats (abs)   min: 12 max: 20 x̄: 13.53 x̃: 14
HURT stats (rel)   min: 0.17% max: 1.13% x̄: 0.79% x̃: 0.91%
95% mean confidence interval for cycles value: 12.53 14.53
95% mean confidence interval for cycles %-change: 0.63% 0.94%
Cycles are HURT.

Fossil-db results:

Tiger Lake
Instructions in all programs: 160901033 -> 160902591 (+0.0%)
SENDs in all programs: 6812270 -> 6812270 (+0.0%)
Loops in all programs: 38225 -> 38225 (+0.0%)
Cycles in all programs: 7430016795 -> 7429003266 (-0.0%)
Spills in all programs: 192582 -> 192582 (+0.0%)
Fills in all programs: 304539 -> 304539 (+0.0%)

Ice Lake
Instructions in all programs: 145299102 -> 145301634 (+0.0%)
SENDs in all programs: 6863890 -> 6863890 (+0.0%)
Loops in all programs: 38219 -> 38219 (+0.0%)
Cycles in all programs: 8798390846 -> 8798589772 (+0.0%)
Spills in all programs: 216880 -> 216880 (+0.0%)
Fills in all programs: 334250 -> 334250 (+0.0%)

Skylake
Instructions in all programs: 135889478 -> 135892010 (+0.0%)
SENDs in all programs: 6802916 -> 6802916 (+0.0%)
Loops in all programs: 38216 -> 38216 (+0.0%)
Cycles in all programs: 8442624166 -> 8442597324 (-0.0%)
Spills in all programs: 194839 -> 194839 (+0.0%)
Fills in all programs: 301116 -> 301116 (+0.0%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>
2021-03-11 22:00:30 +00:00
Ian Romanick
aa5d38decd nir/range_analysis: Add "is a number" range analysis tracking
This commit is necessary to support "nir/range_analysis: Fix analysis of
fmin and fmax with NaN".

No shader-db or fossil-db changes on any Intel platform.

v2: Pack and unpack is_a_number.

v3: Don't set is_a_number of integer constants.  The bit pattern might
be NaN.

v4: Update handling of b2i32.  intBitsToFloat(int(true)) is
1.401298464324817e-45.  Return a value consistent with that.

Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>
2021-03-11 22:00:30 +00:00
Ian Romanick
d4f21b53f2 nir/range_analysis: Add "is finite" range analysis tracking
The obvious changes to nir_search_helpers.h are in a separate commit to
limit the scope of this change.  These additions are really only needed
to support the next commit "nir/range_analysis: Add "is a number" range
analysis tracking".  This reduction in scope is intended to increase the
suitability for stable branches.

No shader-db or fossil-db changes on any Intel platform.

v2: Pack and unpack is_finite.

v3: Split nir_search_helpers.h changes into a separate commit.

v4: Remove assertion intended for the next commit.  Update is_finite
comment for fsign.  Both noticed by Rhys.  Fix is_finite handling for
load_const vectors.  If any element is not finite, set the flag to
false.  This is the same way is_integral is already handled.

v5: Update handling of b2i32.  intBitsToFloat(int(true)) is
1.401298464324817e-45.  Return a value consistent with that.

Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>
2021-03-11 22:00:30 +00:00
Ian Romanick
86fb53b1be nir/range_analysis: Refactor fsat handling
This will greatly simplify a later commit.  The assert(r.is_integral) in
the eq_zero case is dropped because I don't think it's useful anymore.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>
2021-03-11 22:00:30 +00:00
Ian Romanick
f2656569c6 nir/range_analysis: Handle vectors better in ssa_def_bits_used
If a query is made of a vector ssa_def (possibly from an intermediate
result), return all_bits.  If a constant source is a vector, swizzle
the correct component.

Unit tests were added for the constant vector cases.  I don't see a
great way to make unit tests for the other cases.

v2: Add a FINIHSME comment about u16vec2 hardware.

Fixes: 96303a59ea ("nir: Add some range analysis for used bits")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9123>
2021-02-22 22:37:17 +00:00
Jason Ekstrand
96303a59ea nir: Add some range analysis for used bits
This isn't 100% accurate, of course, but it should be good enough for
what we're about to do with it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8872>
2021-02-16 16:36:31 +00:00
Samuel Pitoiset
0b503d8de9 nir: fix determining if an addition might overflow for phi sources
nir_addition_might_overflow() expects the parent instruction to be
an alu instr but it might be a phi instr. Fix it by assuming that
the addition might overflow.

This fixes compiler crashes with Horizon Zero Dawn.

No fossils-db changes.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8268>
2020-12-31 16:17:08 +00:00
Pierre-Eric Pelloux-Prayer
805b6b426e nir: update fallthrough comments
clang doesn't support /* fallthrough */ so switch to fallthrough
attribute.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7747>
2020-12-01 10:04:41 +01:00
Rhys Perry
2a1238f3a3 nir/unsigned_upper_bound: decrement num_sources_left before recursing
Otherwise, search_phi_bcsel() will be called with a buf_size that is
slightly lower than it has to be.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7748>
2020-11-25 11:26:42 +00:00
Rhys Perry
65fbae16e3 nir/unsigned_upper_bound: fix buffer overflow in search_phi_bcsel
It should only recurse if there's enough space to add the phi sources.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 72ac3f6026 ("nir: add nir_unsigned_upper_bound and nir_addition_might_overflow")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7748>
2020-11-25 11:26:42 +00:00
Daniel Schürmann
a79dad950b nir,amd: remove trinary_minmax opcodes
These consist of the variations nir_op_{i|u|f}{min|max|med}3 which are either
lowered in the backend (LLVM) anyway or can be recombined by the backend (ACO).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6421>
2020-08-24 20:56:11 +00:00
Karol Herbst
e5899c1e88 nir: rename nir_op_fne to nir_op_fneu
It was always fneu but naming it fne causes confusion from time to time. So
lets rename it. Later we also want to add other unordered and fne, this is
a smaller preparation for that.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6377>
2020-08-21 17:26:21 +00:00
Jesse Natalie
678cb6d248 nir: nir_range_analysis needs to be updated for vec16
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6275>
2020-08-11 14:38:34 +00:00
Jason Ekstrand
5c5555a862 nir: Add a find_variable_with_[driver_]location helper
We've hand-rolled this loop 10 places and those are just the ones I
found easily.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5966>
2020-07-29 17:38:58 +00:00
Jason Ekstrand
2956d53400 nir: Add nir_foreach_shader_in/out_variable helpers
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5966>
2020-07-29 17:38:57 +00:00
Rhys Perry
72ac3f6026 nir: add nir_unsigned_upper_bound and nir_addition_might_overflow
This adds a nir_unsigned_upper_bound() helper which does something similar
to nir_analyze_range() except it tries to obtain the largest possible
value instead of it's relation to zero.

It also adds nir_addition_might_overflow(), which uses this helper to try
to prove that an unsigned addition does not wrap around.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2720>
2020-07-21 18:25:35 +00:00
Marek Olšák
d18d07c9d7 nir: replace GCC unroll with an option that works on GCC < 8.0
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3970>
2020-02-27 22:53:12 -05:00
Kristian H. Kristensen
2be81a3bfa nir: Make unroll pragma work on clang
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3686>
2020-02-04 06:03:52 +00:00
Brian Paul
a2689ebcd6 nir: no-op C99 _Pragma() with MSVC
This fixes a build failure on MSVC.

BTW, it looks like clang supports _Pragma() but I don't know if it
understands the "gcc unroll N" directive.

Signed-off-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-11-23 10:34:24 -07:00
Ian Romanick
ca353285cb nir/range_analysis: Make sure the table validation only occurs once
All of the tables are static const, so they only need to be validated
once.  As noted in the previous commit, the compiler should be able to
eliminate all of this code when the assertions would pass.  Even with
the help of the previous commit, this does not always occur.

-Og: -95.688 +/- 3.91935 (-24.9562% +/- 1.0222%) N=5
-O1: No difference proven at 95.0% confidence. N=5
-O2: -1.962 +/- 0.85001 (-0.860013% +/- 0.372589%) N=5

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-22 08:16:06 -08:00
Ian Romanick
ccefce46cb nir/range-analysis: Add pragmas to help loop unrolling
I was pretty liberal with these assertions when I wrote this code
because I had assumed that GCC would unroll the loops, inline the look ups
of static const arrays with now constant indices, and then elmininate
all the actuall assertions.  It seems none of this happens even at -O3.

Adding the pragmas helps encourage loop unrolling at some optimization
levels.  I tested by running shader-db with NIR_VALIDATE=false on a Core
i7 Haswell desktop system.

-Og: No difference proven at 95.0% confidence. N=5
-O1: -48.304 +/- 1.221 (-16.3343% +/- 0.412888%) N=5
-O2: -49.94 +/- 1.23521 (-17.9634% +/- 0.444303%) N=5

v2: Add a _Pragma to an inner loop that was accidentally dropped during
a rebase.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-22 08:16:06 -08:00
Eric Anholt
c23db0df18 nir: Keep the range analysis HT around intra-pass until we make a change.
This lets us memoize range analysis work across instructions.  Reduces
runtime of shader-db on Intel by -30.0288% +/- 2.1693% (n=3).

Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-04 19:15:01 +00:00
Ian Romanick
7e53bebcb5 nir/range-analysis: Use types to provide better ranges from bcsel and mov
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16328255 -> 16315391 (-0.08%)
instructions in affected programs: 218318 -> 205454 (-5.89%)
helped: 988
HURT: 0
helped stats (abs) min: 1 max: 72 x̄: 13.02 x̃: 10
helped stats (rel) min: 0.33% max: 16.04% x̄: 6.27% x̃: 4.88%
95% mean confidence interval for instructions value: -13.69 -12.35
95% mean confidence interval for instructions %-change: -6.55% -5.99%
Instructions are helped.

total cycles in shared programs: 363683977 -> 363615417 (-0.02%)
cycles in affected programs: 1475193 -> 1406633 (-4.65%)
helped: 923
HURT: 36
helped stats (abs) min: 1 max: 624 x̄: 75.78 x̃: 48
helped stats (rel) min: 0.08% max: 13.89% x̄: 5.20% x̃: 5.08%
HURT stats (abs)   min: 1 max: 179 x̄: 38.58 x̃: 4
HURT stats (rel)   min: 0.06% max: 16.56% x̄: 3.33% x̃: 0.29%
95% mean confidence interval for cycles value: -75.88 -67.10
95% mean confidence interval for cycles %-change: -5.10% -4.66%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10785779 -> 10785654 (<.01%)
instructions in affected programs: 13855 -> 13730 (-0.90%)
helped: 67
HURT: 0
helped stats (abs) min: 1 max: 15 x̄: 1.87 x̃: 1
helped stats (rel) min: 0.20% max: 3.45% x̄: 0.97% x̃: 0.78%
95% mean confidence interval for instructions value: -2.47 -1.26
95% mean confidence interval for instructions %-change: -1.13% -0.81%
Instructions are helped.

total cycles in shared programs: 153704799 -> 153704481 (<.01%)
cycles in affected programs: 101509 -> 101191 (-0.31%)
helped: 38
HURT: 13
helped stats (abs) min: 1 max: 38 x̄: 12.53 x̃: 16
helped stats (rel) min: 0.07% max: 2.69% x̄: 0.87% x̃: 0.53%
HURT stats (abs)   min: 1 max: 36 x̄: 12.15 x̃: 7
HURT stats (rel)   min: 0.06% max: 2.53% x̄: 0.73% x̃: 0.44%
95% mean confidence interval for cycles value: -10.24 -2.24
95% mean confidence interval for cycles %-change: -0.75% -0.17%
Cycles are helped.

LOST:   2
GAINED: 0

No shader-db change on Iron Lake or GM45.
2019-09-25 15:37:05 -07:00
Ian Romanick
99ddb41e2d nir/range-analysis: Use types in the hash key
This allows the reslut of mov and bcsel to be separately interpreted as
float or int depending on the use.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-25 15:37:01 -07:00
Ian Romanick
018d2b524a nir/range-analysis: Bail if the types don't match
Some shaders are hurt by this change because now a
load_const(0x00000000) is not recognized as eq_zero when loaded as a
float.  This behavior is restored in a later patch (nir/range-analysis:
Use types to provide better ranges from bcsel and mov).

v2: Add a comment about reinterpretation of int/uint/bool.  Suggested by
Caio.  Rewrite condition the check for types being float versus checking
for types not being all the things that aren't float.

Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16327543 -> 16328255 (<.01%)
instructions in affected programs: 55928 -> 56640 (1.27%)
helped: 0
HURT: 208
HURT stats (abs)   min: 1 max: 16 x̄: 3.42 x̃: 3
HURT stats (rel)   min: 0.33% max: 6.74% x̄: 1.31% x̃: 1.12%
95% mean confidence interval for instructions value: 3.06 3.79
95% mean confidence interval for instructions %-change: 1.17% 1.46%
Instructions are HURT.

total cycles in shared programs: 363682759 -> 363683977 (<.01%)
cycles in affected programs: 325758 -> 326976 (0.37%)
helped: 44
HURT: 133
helped stats (abs) min: 1 max: 179 x̄: 33.61 x̃: 5
helped stats (rel) min: 0.06% max: 14.21% x̄: 2.47% x̃: 0.29%
HURT stats (abs)   min: 1 max: 157 x̄: 20.28 x̃: 14
HURT stats (rel)   min: 0.07% max: 14.44% x̄: 1.42% x̃: 0.73%
95% mean confidence interval for cycles value: 0.38 13.39
95% mean confidence interval for cycles %-change: -0.06% 0.96%
Inconclusive result (%-change mean confidence interval includes 0).

Sandy Bridge
total instructions in shared programs: 10787433 -> 10787443 (<.01%)
instructions in affected programs: 1842 -> 1852 (0.54%)
helped: 0
HURT: 10
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.33% max: 1.85% x̄: 0.73% x̃: 0.49%
95% mean confidence interval for instructions value: 1.00 1.00
95% mean confidence interval for instructions %-change: 0.36% 1.10%
Instructions are HURT.

total cycles in shared programs: 153724543 -> 153724563 (<.01%)
cycles in affected programs: 8407 -> 8427 (0.24%)
helped: 1
HURT: 3
helped stats (abs) min: 18 max: 18 x̄: 18.00 x̃: 18
helped stats (rel) min: 0.98% max: 0.98% x̄: 0.98% x̃: 0.98%
HURT stats (abs)   min: 4 max: 18 x̄: 12.67 x̃: 16
HURT stats (rel)   min: 0.21% max: 0.75% x̄: 0.56% x̃: 0.72%
95% mean confidence interval for cycles value: -21.31 31.31
95% mean confidence interval for cycles %-change: -1.11% 1.46%
Inconclusive result (value mean confidence interval includes 0).

No shader-db changes on Iron Lake or GM45.
2019-09-25 15:37:01 -07:00
Samuel Pitoiset
966a455bb9 nir: do not assume that the result of fexp2(a) is always an integral
It's only correct when 'a' is an integral greater or equal to 0.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111493
Fixes: 5544b2cbbd ("nir/algebraic: Use value range analysis to eliminate useless unary ops")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-09-02 09:00:37 +02:00
Ian Romanick
9ad4a2eac5 nir/range-analysis: Add a lot more assertions about the contents of tables
v2: Update several of the comments.  Drop some redundant uses of
ASSERT_UNION_OF_OTHERS_MATCHES_UNKNOWN_*_SOURCE source.  Suggested by
Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-29 13:15:53 -07:00
Ian Romanick
636da12433 nir/range-analysis: Range tracking for fpow
One shader from Metro Last Light and the rest from Rochard.  In the
Rochard cases, something like:

    min(1.0, max(pow(saturate(x), y), z))

was transformed to

    saturate(max(pow(saturate(x), y), z))

because the result of the pow must be >= 0.

The Metro Last Light case was similar.  An instance of

    min(pow(abs(x), y), 1.0)

became

    saturate(pow(abs(x), y))

v2: Fix some comments.  Suggested by Caio.

v3: Fix setting is_intgral when the exponent might be negative.  See
also Mesa MR !1778.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16280670 -> 16280659 (<.01%)
instructions in affected programs: 1130 -> 1119 (-0.97%)
helped: 11
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.72% max: 1.43% x̄: 1.03% x̃: 0.97%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -1.19% -0.86%
Instructions are helped.

total cycles in shared programs: 367168430 -> 367168270 (<.01%)
cycles in affected programs: 10281 -> 10121 (-1.56%)
helped: 10
HURT: 1
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 1.31% max: 2.43% x̄: 1.79% x̃: 1.70%
HURT stats (abs)   min: 10 max: 10 x̄: 10.00 x̃: 10
HURT stats (rel)   min: 3.10% max: 3.10% x̄: 3.10% x̃: 3.10%
95% mean confidence interval for cycles value: -20.06 -9.04
95% mean confidence interval for cycles %-change: -2.36% -0.32%
Cycles are helped.
2019-08-29 13:15:53 -07:00
Ian Romanick
7dba7df5e5 nir/range-analysis: Handle constants in nir_op_mov just like nir_op_bcsel
I discovered this while looking at a shader that was hurt by some other
work I'm doing.  When I examined the changes, I was confused that one
instance of a comparison that was used in a discard_if was (incorrectly)
eliminated, while another instance used by a bcsel was (correctly) not
eliminated.  I had to use NIR_PRINT=true to see exactly where things
when wrong.

A bunch of shaders in Goat Simulator, Dungeon Defenders, Sanctum 2, and
Strike Suit Zero were impacted.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16280659 -> 16281075 (<.01%)
instructions in affected programs: 21042 -> 21458 (1.98%)
helped: 0
HURT: 136
HURT stats (abs)   min: 1 max: 9 x̄: 3.06 x̃: 3
HURT stats (rel)   min: 1.16% max: 6.12% x̄: 2.23% x̃: 2.03%
95% mean confidence interval for instructions value: 2.93 3.19
95% mean confidence interval for instructions %-change: 2.08% 2.37%
Instructions are HURT.

total cycles in shared programs: 367168270 -> 367170313 (<.01%)
cycles in affected programs: 172020 -> 174063 (1.19%)
helped: 14
HURT: 111
helped stats (abs) min: 2 max: 80 x̄: 21.21 x̃: 9
helped stats (rel) min: 0.10% max: 4.47% x̄: 1.35% x̃: 0.79%
HURT stats (abs)   min: 2 max: 584 x̄: 21.08 x̃: 5
HURT stats (rel)   min: 0.12% max: 17.28% x̄: 1.55% x̃: 0.40%
95% mean confidence interval for cycles value: 5.41 27.28
95% mean confidence interval for cycles %-change: 0.64% 1.81%
Cycles are HURT.
2019-08-29 13:15:53 -07:00
Ian Romanick
0b4782fccd nir/range-analysis: Fix incorrect fadd range result for (ne_zero, ne_zero)
Found by inspection.  I tried really, really hard to make a test case
that would trigger this problem, but I was unsuccesful.  It's very hard
to get an instruction to produce a ne_zero result without ne_zero
sources.  The most plausible way is using bcsel.  That proves
problematic because bcsel interprets its sources as integers, so it
cannot currently be used to "clean" values for floating point
instructions.

No shader-db changes on any Intel platform.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
2019-08-29 13:15:53 -07:00
Ian Romanick
ef2e235252 nir/range-analysis: Adjust result range of multiplication to account for flush-to-zero
Fixes piglit tests (new in piglit!110):

    - fs-underflow-fma-compare-zero.shader_test
    - fs-underflow-mul-compare-zero.shader_test

v2: Add back part of comment accidentally deleted.  Noticed by
Caio. Remove is_not_zero function as it is no longer used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: fa116ce357 ("nir/range-analysis: Range tracking for ffma and flrp")
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

All Gen7+ platforms** had similar results. (Ice Lake shown)
total instructions in shared programs: 16278465 -> 16279492 (<.01%)
instructions in affected programs: 16765 -> 17792 (6.13%)
helped: 0
HURT: 23
HURT stats (abs)   min: 7 max: 275 x̄: 44.65 x̃: 8
HURT stats (rel)   min: 1.15% max: 17.51% x̄: 4.23% x̃: 1.62%
95% mean confidence interval for instructions value: 9.57 79.74
95% mean confidence interval for instructions %-change: 1.85% 6.61%
Instructions are HURT.

total cycles in shared programs: 367135159 -> 367154270 (<.01%)
cycles in affected programs: 279306 -> 298417 (6.84%)
helped: 0
HURT: 23
HURT stats (abs)   min: 13 max: 6029 x̄: 830.91 x̃: 54
HURT stats (rel)   min: 0.17% max: 45.67% x̄: 7.33% x̃: 0.49%
95% mean confidence interval for cycles value: 100.89 1560.94
95% mean confidence interval for cycles %-change: 0.94% 13.71%
Cycles are HURT.

total spills in shared programs: 8870 -> 8869 (-0.01%)
spills in affected programs: 19 -> 18 (-5.26%)
helped: 1
HURT: 0

total fills in shared programs: 21904 -> 21901 (-0.01%)
fills in affected programs: 81 -> 78 (-3.70%)
helped: 1
HURT: 0

LOST:   0
GAINED: 1

** On Broadwell, a shader was hurt for spills / fills instead of
   helped.

No changes on any earlier platforms.
2019-08-29 13:15:53 -07:00
Ian Romanick
33ad2bab4b nir/range-analysis: Adjust result range of exp2 to account for flush-to-zero
Fixes piglit tests (new in piglit!110):

    - fs-underflow-exp2-compare-zero.shader_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

Most of the shaders affected are, unsurprisingly, in Unigine Heaven.

All Gen6+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16278207 -> 16278465 (<.01%)
instructions in affected programs: 11374 -> 11632 (2.27%)
helped: 0
HURT: 58
HURT stats (abs)   min: 2 max: 13 x̄: 4.45 x̃: 4
HURT stats (rel)   min: 0.54% max: 4.11% x̄: 2.42% x̃: 2.82%
95% mean confidence interval for instructions value: 3.77 5.13
95% mean confidence interval for instructions %-change: 2.19% 2.64%
Instructions are HURT.

total cycles in shared programs: 367134284 -> 367135159 (<.01%)
cycles in affected programs: 81207 -> 82082 (1.08%)
helped: 17
HURT: 36
helped stats (abs) min: 6 max: 356 x̄: 90.35 x̃: 6
helped stats (rel) min: 0.69% max: 21.45% x̄: 5.71% x̃: 0.78%
HURT stats (abs)   min: 4 max: 235 x̄: 66.97 x̃: 16
HURT stats (rel)   min: 0.35% max: 27.58% x̄: 5.34% x̃: 1.09%
95% mean confidence interval for cycles value: -20.36 53.38
95% mean confidence interval for cycles %-change: -1.08% 4.67%
Inconclusive result (value mean confidence interval includes 0).

No changes on any earlier platforms.
2019-08-29 13:15:53 -07:00
Ian Romanick
f2965fde9b nir/range-analysis: Fail gracefully on non-SSA sources
Tested-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-14 09:02:38 -07:00
Ian Romanick
fa116ce357 nir/range-analysis: Range tracking for ffma and flrp
A similar technique could be used for fmin3, fmax3, and fmid3.

This could be squashed with the previous commit.  I kept it separate to
ease review.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-05 20:14:13 -07:00
Ian Romanick
586602c5d9 nir/range-analysis: Range tracking for bcsel
This could be squashed with the previous commit.  I kept it separate to
ease review.

v2: Add some missing cases.  Use nir_src_is_const helper.  Both
suggested by Caio.  Use a table for mapping source ranges to a result
range.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-05 20:14:13 -07:00
Ian Romanick
3009cbed50 nir/range-analysis: Tighten the range of fsat based on the range of its source
This could be squashed with the previous commit.  I kept it separate to
ease review.

v2: Use a switch statement and add more comments.  Both suggested by
Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-05 20:14:13 -07:00
Ian Romanick
405de7ccb6 nir/range-analysis: Rudimentary value range analysis pass
Most integer operations are omitted because dealing with integer
overflow is hard.  There are a few things that could be smarter if there
was a small amount more tracking of ranges of integer types (i.e.,
operands are Boolean, operand values fit in 16 bits, etc.).

The changes to nir_search_helpers.h are included in this patch to
simplify reordering the changes to nir_opt_algebraic.py.

v2: Memoize range analysis results.  Without this, some shaders appear
to get stuck in infinite loops.

v3: Rebase on many months of Mesa changes, including 1-bit Boolean
changes.

v4: Rebase on "nir: Drop imov/fmov in favor of one mov instruction".

v5: Use nir_alu_srcs_equal for detecting (a*a).  Previously just the SSA
value was compared, and this incorrectly matched (a.x*a.y).

v6: Many code improvements including (but not limited to) better names,
more comments, and better use of helper functions.  All suggested by
Caio.  Rework the handling of several opcodes to use a table for mapping
source ranges to a result range.  This change fixed a bug that caused
fmax(gt_zero, ge_zero) to be incorrectly recognized as ge_zero.
Slightly tighten the range of fmul by recognizing that x*x is gt_zero if
x is gt_zero.  Add similar handling for -x*x.

v7: Use _______ in the tables as an alias for unknown.  Suggested by
Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-05 20:14:13 -07:00