Commit Graph

89 Commits

Author SHA1 Message Date
Alyssa Rosenzweig
c39896b17b nir: Use getters for nir_src::parent_*
First, we need to give the parent_instr field a unique name to be able to
replace with a helper.  We have parent_instr fields for both nir_src and
nir_def, so let's rename nir_src::parent_instr in preparation for rework.

This was done with a combination of sed and manual fix-ups.

Then we use semantic patches plus manual fixups:

    @@
    expression s;
    @@

    -s->renamed_parent_instr
    +nir_src_parent_instr(s)

    @@
    expression s;
    @@

    -s.renamed_parent_instr
    +nir_src_parent_instr(&s)

    @@
    expression s;
    @@

    -s->parent_if
    +nir_src_parent_if(s)

    @@
    expression s;
    @@

    -s.renamed_parent_if
    +nir_src_parent_if(&s)

    @@
    expression s;
    @@

    -s->is_if
    +nir_src_is_if(s)

    @@
    expression s;
    @@

    -s.is_if
    +nir_src_is_if(&s)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>
2023-10-10 04:58:05 -04:00
Timothy Arceri
84e0f5ce75 nir: remove unused param from nir_alu_src_copy()
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24986>
2023-09-08 03:01:39 +00:00
Faith Ekstrand
b64da56b1a nir: s/nir_instr_ssa_def/nir_instr_def/
Generated by sed:

    sed -i -e 's/nir_instr_ssa_def/nir_instr_def/g' src/**/*.h src/**/*.c src/**/*.cpp

Suggested-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24703>
2023-08-15 17:44:27 +00:00
Faith Ekstrand
6c1d32581a nir: Drop nir_alu_dest
Instead, we replace it directly with nir_def.  We could replace it with
nir_dest but the next commit gets rid of that so this avoids unnecessary
churn.  Most of this commit was generated by sed:

   sed -i -e 's/dest.dest.ssa/def/g' src/**/*.h src/**/*.c src/**/*.cpp

There were a few manual fixups required in the nir_legacy.c and
nir_from_ssa.c as nir_legacy_reg and nir_parallel_copy_entry both have a
similar pattern.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>
2023-08-14 21:22:53 +00:00
Faith Ekstrand
ed9affa02f nir: Drop most instances of nir_ssa_dest_init()
Generated using the following two semantic patches:

    @@
    expression I, J, NC, BS;
    @@

    -nir_ssa_dest_init(I, &J->dest, NC, BS);
    +nir_def_init(I, &J->dest.ssa, NC, BS);

    @@
    expression I, J, NC, BS;
    @@

    -nir_ssa_dest_init(I, &J->dest.dest, NC, BS);
    +nir_def_init(I, &J->dest.dest.ssa, NC, BS);

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24658>
2023-08-13 17:12:52 +00:00
Alyssa Rosenzweig
09d31922de nir: Drop "SSA" from NIR language
Everything is SSA now.

   sed -e 's/nir_ssa_def/nir_def/g' \
       -e 's/nir_ssa_undef/nir_undef/g' \
       -e 's/nir_ssa_scalar/nir_scalar/g' \
       -e 's/nir_src_rewrite_ssa/nir_src_rewrite/g' \
       -e 's/nir_gather_ssa_types/nir_gather_types/g' \
       -i $(git grep -l nir | grep -v relnotes)

   git mv src/compiler/nir/nir_gather_ssa_types.c \
          src/compiler/nir/nir_gather_types.c

   ninja -C build/ clang-format
   cd src/compiler/nir && find *.c *.h -type f -exec clang-format -i \{} \;

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24585>
2023-08-12 16:44:41 -04:00
Faith Ekstrand
777d336b1f nir: clang-format src/compiler/nir/*.[ch]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24382>
2023-08-12 19:27:28 +00:00
Alyssa Rosenzweig
42ee8a55dd nir: Remove nir_alu_dest::write_mask
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>
2023-08-03 22:40:30 +00:00
Alyssa Rosenzweig
579bc1e72e treewide: Drop some is_ssa if's
Via Coccinelle patch:

    @@
    expression x;
    @@

    -if (!x.is_ssa) {
    -...
    -}

and likewise with x->is_ssa, with invalid hunks manually filtered out.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>
2023-08-03 22:40:29 +00:00
Alyssa Rosenzweig
a8013644a1 nir: Drop nir_alu_src::{negate,abs}
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>
2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig
5fead24365 treewide: Drop is_ssa asserts
We only see SSA now.

Via Coccinelle patch:

    @@
    expression x;
    @@

    -assert(x.is_ssa);

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>
2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig
d559764e7c nir: Remove nir_alu_dest::saturate
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>
2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig
815efcdf7e nir: Use nir_builder_create
perl -p0e 's/nir_builder ([^;]*);\s*nir_builder_init\(&\1, /nir_builder \1 = nir_builder_create(/g' -i $(git grep -l nir_builder_init)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23860>
2023-06-27 18:13:02 +00:00
Alyssa Rosenzweig
01e9ee79f7 nir: Drop unused name from nir_ssa_dest_init
Since 624e799cc3 ("nir: Drop nir_ssa_def::name and nir_register::name"), SSA
defs don't have names, making the name argument unused. Drop it from the
signature and fix the call sites. This was done with the help of the following
Coccinelle semantic patch:

    @@
    expression A, B, C, D, E;
    @@

    -nir_ssa_dest_init(A, B, C, D, E);
    +nir_ssa_dest_init(A, B, C, D);

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23078>
2023-05-17 23:46:16 +00:00
Faith Ekstrand
01275a1a95 nir: Drop a bunch of Authors tags
This is what git blame is for.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22120>
2023-03-26 00:16:25 +00:00
Ian Romanick
ea413e826b nir: Eliminate nir_op_f2b
Builds on the work of !15121.  This gets to delete even more code
because many drivers shared a lot of code for i2b and f2b.

No shader-db or fossil-db changes on any Intel platform.

v2: Rebase on 1a35acd8d9.

v3: Update a comment in nir_opcodes_c.py. Suggested by Konstantin.

v4: Another rebase. Remove f2b stuff from Midgard.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20509>
2023-02-03 22:39:57 +00:00
Ian Romanick
eb76cee9f8 nir: Eliminate nir_op_i2b
There are a lot of optimizations in opt_algebraic that match ('ine', a,
0), but there are almost none that match i2b.  Instead of adding a huge
pile of additional patterns (including variations that include both ine
and i2b), always lower i2b to a != 0.

At this point in the series, it should be impossible for anything to
generate i2b, so there /should not/ be any changes.

The failing test on d3d12 is a pre-existing bug that is triggered by
this change.  I talked to Jesse about it, and, after some analysis, he
suggested just adding it to the list of known failures.

v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b.

v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py.

v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after
nir_lower_doubles makes progress.  The latter can generate b2i
instructions, but nir_lower_int64 can't handle them (anymore).

v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I
had accidentally removed the f2b(bf2(x)) optimization.

v6: Just eliminate the i2b instruction.

v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused)
emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this
function was still used. 🤷

No shader-db changes on any Intel platform.

All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141165875 -> 141165873 (-0.0%)
Instructions helped: 2

Cycles in all programs: 9098956382 -> 9098956350 (-0.0%)
Cycles helped: 2

The two Vulkan shaders are helped because of the "new" (('b2i32',
('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern.

Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version]
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
2022-12-14 06:23:21 +00:00
Rhys Perry
69ba1c4d59 nir: adjust nir_src_copy signature to take a nir_instr *
This is almost always a nir_instr and updating the src of a nir_if will
have to work slightly differently in the future.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12910>
2022-08-30 18:21:44 +00:00
Rhys Perry
aa2d6e020b Revert "nir: Drop the unused instr arg for src/dest copy functions."
This reverts commit c3a01841184ee8303c0c5ebe58491301622c5ad6.

Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12910>
2022-08-30 18:21:44 +00:00
Connor Abbott
9d9b891f94 nir: Free instructions more often
Soon we'll be allocating instructions out of a per-shader pool, which
means that if we don't free too many instructions during the main
optimization loop, the final nir_sweep() call will create holes which
can't be filled. By freeing instructions more aggressively, we can
allocate more instructions from the freelist which will reduce the final
memory usage.

Modified from Connor Abbott's original patch to rebase on top of
refactored DCE and so that the use-after-free in nir_algebraic_impl() is
fixed.

Co-authored-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12910>
2022-08-30 18:21:44 +00:00
Rhys Perry
312a284980 nir/algebraic: add ignore_exact() wrapper
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436>
2022-01-20 22:54:42 +00:00
Emma Anholt
de33205f88 nir/algebraic: Move all the individual transforms to a common table.
Cuts 28% of the remaining relocations in libvulkan_intel.so, shrinks
binary size by 290kb.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13987>
2021-12-07 07:09:00 +00:00
Emma Anholt
a29b54f014 nir/algebraic: Mark the automaton's filter tables as const.
Moves it to .rodata instead of .data.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13987>
2021-12-07 07:09:00 +00:00
Emma Anholt
53f49b7066 nir/algebraic: Move relocations for variable conds to a table.
This helps concentrate the dirty pages from the relocations, reduces how
many relocations there are, and reduces the size of each variable assuming
variables mostly don't have conditions or the conditions are mostly
reused).  Reduces libvulkan_intel.so size by 49kb.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13987>
2021-12-07 07:09:00 +00:00
Emma Anholt
8485a78977 nir/algebraic: Move relocations for expression conds to a table.
This helps concentrate the dirty pages from the relocations, reduces how
many relocations there are, and reduces the size of each expression
(assuming expressions mostly don't have conditions or the conditions are
mostly reused).  Reduces libvulkan_intel.so size by 8.7kb.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13987>
2021-12-07 07:09:00 +00:00
Emma Anholt
5d82c61a30 nir/algebraic: Replace relocations for nir_search values with a table.
Even with packing all 3 types into a 40-byte union (nir_search_constant
being 24 bytes and nir_search_expression having formerly been 32), and
having a single array of them, this cuts 1.7MB from each of
libvulkan_intel.so and libgallium_dri.so.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13987>
2021-12-07 07:09:00 +00:00
Emma Anholt
a263474d3b nir/algebraic: Move some generated-code algebraic opt args into a struct.
I'm going to be adding some more tables to reduce relocations in the
generated code, so move the current tables to a struct for arg-passing
sanity.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13987>
2021-12-07 07:08:59 +00:00
Emma Anholt
aed4c0b5a9 nir: Drop the unused instr arg for src/dest copy functions.
Now that we don't use ralloc, we don't need this arg to get at the right
ralloc ctx.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11776>
2021-09-14 17:53:06 +00:00
Ian Romanick
72259a870f util: Add and use functions to calculate min and max int for a size
Many places need to know the maximum or minimum possible value for a
given size integer... so everyone just open-codes their favorite
version.  There is some potential to hit either undefined or
implementation-defined behavior, so having one version that Just Works
seems beneficial.

v2: Fix copy-and-pasted bug (INT64_MAX instead of INT64_MIN) in
u_intmin.  Noticed by CI.  Lol.  Rename functions
`s/u_(uint|int)(min|max)/u_\1N_\2/g`.  Suggested by Jason.  Add some
unit tests that would have caught the copy-and-paste bug before wasting
CI time.  Change the implementation of u_intN_min to use the same
pattern as stdint.h.  This avoids the integer division.  Noticed by
Jason.

v3: Add changes to convert_clear_color
(src/gallium/drivers/iris/iris_clear.c).  Suggested by Nanley.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12177>
2021-08-03 12:55:02 -07:00
Jason Ekstrand
117668b811 nir: Make nir_ssa_def_rewrite_uses take an SSA value
This commit replaces the new_src parameter of nir_ssa_def_rewrite_uses()
with an SSA def, removes nir_ssa_def_rewrite_uses_ssa(), and rewrites
all the users as needed.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
2021-03-08 16:59:55 +00:00
Jason Ekstrand
f064b7a42c nir: Add some ssa-only fast-paths for nir_src rewrite
Basically every pass in NIR uses nir_ssa_def_rewrite_uses which calls
nir_instr_rewrite_src which is fairly complex because it handles all
sorts of non-SSA cases.  Since we already know a priori that every
source written by nir_ssa_def_rewrite_uses is SSA, we can check new_src
once at the top of the function and cut out all that complexity.

While we're at it, we expose a new SSA-only nir_ssa_def_rewrite_uses_ssa
helper which takes an SSA def which avoids the one SSA check.  It's also
more convenient 90% of the time.

Compile time as tested by Rhys Perry <pendingchaos02@gmail.com>

    Difference at 95.0% confidence
        -797.166 +/- 418.649
        -0.566174% +/- 0.296441%
        (Student's t, pooled s = 325.459)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8790>
2021-02-02 15:35:55 +00:00
Rhys Perry
7d8c06d484 nir/search: check for changes before adding uses to worklist
So it doesn't uselessly add instructions to the worklist.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7511>
2020-11-20 13:57:34 +00:00
Rhys Perry
556a20afe6 nir/search: check instr type before adding to worklist
nir_algebraic_instr() ignores non-ALU instructions, so there's no point.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7511>
2020-11-20 13:57:34 +00:00
Jason Ekstrand
2b676b2ce8 nir: Properly preserve metadata in more cases
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5171>
2020-06-11 05:08:12 +00:00
Jason Ekstrand
5e1c42d85f nir: Call nir_metadata_preserve on !progress
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5171>
2020-06-11 05:08:12 +00:00
Ian Romanick
c0bdf37c91 nir/algebraic: Change the default cursor location when replacing a unary op
If the expression tree that is being replaced has a unary operation at
its root, set the cursor (location where new instructions are inserted)
at the source instruction instead.

This doesn't do much now because there are very few patterns that have a
unary operation as the root.  Almost all of the patterns that do have a
unary operation as the root have inot.  All of the shaders that are
affected by this commit have expression trees with an inot at the root.

This change prevents some significant, spurious caused by the next
commit.  There is further explanation in the large comment added in
the code.

I also considered a couple other options that may still be worth exploring.

1. Add some mark-up to the search pattern to denote where new
   instructions should be added.  I considered using "@" to denote the
   cursor location.  For example,

    (('fneg', ('fadd@', a, b)), ...)

2. To prevent other kinds of unintended code motion, add the ability to
   name expressions in the search pattern so that they can be reused in
   the replacement.  For example,

   (('bcsel', ('ige', ('find_lsb=b', a), 0), ('find_lsb', a), -1), b),

   An alternative would be to add some kind of CSE at the time of
   inserting the replacements.  Create a new instruction, then check to
   see if it already exists.  That option might be better overall.

Over the years I know Matt has heard me complain, "I added a pattern
that just deleted an instruction, but it added a bunch of spills!"  This
was always in large, complex shaders that are very hard to analyze.  I
always blamed these cases on the scheduler being dumb.  I am now very
suspicious that unintended code motion was the real problem.

All Gen4+ Intel platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611405 -> 17611333 (<.01%)
instructions in affected programs: 18613 -> 18541 (-0.39%)
helped: 41
HURT: 13
helped stats (abs) min: 1 max: 18 x̄: 4.46 x̃: 4
helped stats (rel) min: 0.27% max: 5.68% x̄: 1.29% x̃: 1.34%
HURT stats (abs)   min: 1 max: 20 x̄: 8.54 x̃: 7
HURT stats (rel)   min: 0.30% max: 4.20% x̄: 2.15% x̃: 2.38%
95% mean confidence interval for instructions value: -3.29 0.63
95% mean confidence interval for instructions %-change: -0.95% 0.02%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 338366118 -> 338365223 (<.01%)
cycles in affected programs: 257889 -> 256994 (-0.35%)
helped: 42
HURT: 15
helped stats (abs) min: 2 max: 120 x̄: 39.38 x̃: 34
helped stats (rel) min: 0.04% max: 2.55% x̄: 0.86% x̃: 0.76%
HURT stats (abs)   min: 6 max: 204 x̄: 50.60 x̃: 34
HURT stats (rel)   min: 0.11% max: 4.75% x̄: 1.12% x̃: 0.56%
95% mean confidence interval for cycles value: -30.39 -1.02
95% mean confidence interval for cycles %-change: -0.66% -0.02%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1359>
2020-04-01 00:28:38 +00:00
Ian Romanick
58bdc1c748 nir/search: Use larger type to hold linearized index
"index" is an offset into a linearized 3-dimensional array.  Starting
with fbd5359a0a, the 3-dimensional array can have 43 elements in each
dimension.  43**3 = 79507, and that will overflow the uint16_t.

See also the discussion in MR !3765.

Fixes: fbd5359a0a ("nir/algebraic: Rearrange bcsel sequences generated by nir_opt_peephole_select")
Suggested-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3871>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3871>
2020-02-19 19:07:34 +00:00
Rob Clark
a8ec4082a4 nir+vtn: vec8+vec16 support
This introduces new vec8 and vec16 instructions (which are the only
instructions taking more than 4 sources), in order to construct 8 and 16
component vectors.

In order to avoid fixing up the non-autogenerated nir_build_alu() sites
and making them pass 16 src args for the benefit of the two instructions
that take more than 4 srcs (ie vec8 and vec16), nir_build_alu() is has
nir_build_alu_tail() split out and re-used by nir_build_alu2() (which is
used for the > 4 src args case).

v2 (Karol Herbst):
  use nir_build_alu2 for vec8 and vec16
  use python's array multiplication syntax
  add nir_op_vec helper
  simplify nir_vec
  nir_build_alu_tail -> nir_builder_alu_instr_finish_and_insert
  use nir_build_alu for opcodes with <= 4 sources
v3 (Karol Herbst):
  fix nir_serialize
v4 (Dave Airlie):
  fix serialization of glsl_type
  handle vec8/16 in lowering of bools
v5 (Karol Herbst):
  fix load store vectorizer

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-21 11:00:17 +00:00
Eric Anholt
d845dca0f5 nir: Make algebraic backtrack and reprocess after a replacement.
The algebraic pass was exhibiting O(n^2) behavior in
dEQP-GLES2.functional.uniform_api.random.3 and
dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13 (along with
other code-generated tests, and likely real-world loop-unroll cases).
In the process of using fmul(b2f(x), b2f(x)) -> b2f(iand(x, y)) to
transform:

result = b2f(a == b);
result *= b2f(c == d);
...
result *= b2f(z == w);

->

temp = (a == b)
temp = temp && (c == d)
...
temp = temp && (z == w)
result = b2f(temp);

nir_opt_algebraic, proceeding bottom-to-top, would match and convert
the top-most fmul(b2f(), b2f()) case each time, leaving the new b2f to
be matched by the next fmul down on the next time algebraic got run by
the optimization loop.

Back in 2016 in 7be8d07732 ("nir: Do opt_algebraic in reverse
order."), Matt changed algebraic to go bottom-to-top so that we would
match the biggest patterns first.  This helped his cases, but I
believe introduced this failure mode.  Instead of reverting that, now
that we've got the automaton, we can update the automaton's state
recursively and just re-process any instructions whose state has
changed (indicating that they might match new things).  There's a
small chance that the state will hash to the same value and miss out
on this round of algebraic, but this seems to be good enough to fix
dEQP.

Effects with NIR_VALIDATE=0 (improvement is better with validation enabled):

Intel shader-db runtime -0.954712% +/- 0.333844% (n=44/46, obvious throttling
  outliers removed)
dEQP-GLES2.functional.uniform_api.random.3 runtime
  -65.3512% +/- 4.22369% (n=21, was 1.4s)
dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13 runtime
  -68.8066% +/- 6.49523% (was 4.8s)

v2: Use two worklists, suggested by @cwabbott, to cut out a bunch of
    tricky code.  Runtime of uniform_api.random.3 down -0.790299% +/-
    0.244213% compred to v1.
v3: Re-add the nir_instr_remove() that I accidentally dropped in v2,
    fixing infinite loops.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-26 10:13:46 -08:00
Eric Anholt
90ad6304bf nir: Refactor algebraic's block walk
My motivation was to clarify the changes in the following commit, but
incidentally, it reduces runtime of
dEQP-GLES2.functional.uniform_api.random.3 (an algebraic-heavy
testcase) by -5.39524% +/- 2.21179% (n=15)

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-26 10:13:40 -08:00
Connor Abbott
305d1300f9 nir: Maintain the algebraic automaton's state as we work.
In order to have nir_opt_algebraic be able to do further algebraic
work on the output of a replacement, we need to maintain the
automaton's state.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-26 10:13:19 -08:00
Ian Romanick
ea19f2fb68 nir/algebraic: Add the ability to mark a replacement as exact
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-04 14:05:49 -08:00
Eric Anholt
3cc914921e nir: Factor out most of the algebraic passes C code to .c/.h.
Working on the algebraic implementation, I was being driven nuts by my
editor not highlighting and handling indentation for the C code.  It turns
out that it's basically not pass-specific code, and we can move it over to
the relevant .c file.  Replaces 30KB of code with 34KB of data on my i965
build.  No perf diff on shader-db (n=3)

Reviewed-by: Ian Romanick <ian.d.romainck@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-04 19:15:01 +00:00
Eric Anholt
c23db0df18 nir: Keep the range analysis HT around intra-pass until we make a change.
This lets us memoize range analysis work across instructions.  Reduces
runtime of shader-db on Intel by -30.0288% +/- 2.1693% (n=3).

Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-04 19:15:01 +00:00
Eric Anholt
7025dbe794 nir: Skip emitting no-op movs from the builder.
Having passes generate these is just making more work for copy
propagation (and thus probably calling more optimization passes)
later.  Noticed while trying to debug nir_opt_algebraic()
top-to-bottom having O(n^2) behavior due to not finding new matches in
replacement code.

Reviewed-by: Ian Romanick <ian.d.romainck@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-04 19:15:01 +00:00
Eric Anholt
e7b754a05c nir: Make nir_search's dumping go to stderr.
Reviewed-by: Ian Romanick <ian.d.romainck@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-10-04 19:15:01 +00:00
Eric Engestrom
5febd4d575 compiler: replace MAYBE_UNUSED with UNUSED
MAYBE_UNUSED is going away, so let's replace legitimate uses of it with
UNUSED, which the former aliased to so far anyway.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-31 09:41:05 +01:00
Jonathan Marek
5a4e71c082 nir/algebraic: allow swizzle in nir_algebraic replace expression
This is to allow optimizations in nir_opt_algebraic not otherwise possible

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2019-07-24 17:36:21 -04:00
Ian Romanick
02c6cd8481 nir/serach: Increase maximum commutative expressions from 4 to 8
No shader-db change on any Intel platform.  No shader-db run-time
difference on a certain 36-core / 72-thread system at 95% confidence
(n=20).

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-28 18:56:19 -07:00
Ian Romanick
cae1af4339 nir/search: Log Boolean constants instead of asserting
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-28 18:56:19 -07:00