Isabella Basso
59fea8af3a
nir/algebraic: remove duplicate bool conversion lowerings
...
While [1] added some boolean conversion lowering patterns, those were
already dealt with on [2].
[1] - b86305bb
("nir/algebraic: collapse conversion opcodes (many patterns)")
[2] - d7e0d47b
("nir/algebraic: nir: Add a bunch of b2[if] optimizations")
Fixes: b86305bb
("nir/algebraic: collapse conversion opcodes (many patterns)")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Signed-off-by: Isabella Basso <isabellabdoamaral@usp.br >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20965 >
2023-03-11 17:21:38 +00:00
Isabella Basso
a553d3cd29
nir/algebraic: make patterns for float conversion lowerings imprecise
...
As noted on [1], lowering patterns of the form
floatS -> floatB -> floatS ==> floatS
cannot require precision since this may cause flush denorming.
[1] 3f779013
("nir: Add an algebraic optimization for float->double->float")
Fixes: b86305bb
("nir/algebraic: collapse conversion opcodes (many patterns)")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Signed-off-by: Isabella Basso <isabellabdoamaral@usp.br >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20965 >
2023-03-11 17:21:37 +00:00
Isabella Basso
79c94ef52e
nir/algebraic: extend lowering patterns for conversions on smaller bit sizes
...
Conversions on smaller bit sizes should also be collapsed when composed.
This also adds more patterns on the
intS -> intB -> floatB ==> intS -> floatB
lowering so as to deal with any int size C > B instead of a fixed intB.
Closes : #7776
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Signed-off-by: Isabella Basso <isabellabdoamaral@usp.br >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20965 >
2023-03-11 17:21:37 +00:00
Isabella Basso
a27bcd63d0
nir/algebraic: extend mediump patterns
...
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Suggested-by: Italo Nicola <italonicola@collabora.com >
Signed-off-by: Isabella Basso <isabellabdoamaral@usp.br >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20965 >
2023-03-11 17:21:37 +00:00
Isabella Basso
b3685f3ba7
nir/algebraic: insert patterns inside optimizations list
...
Some patterns were outside the list of optimizations.
Fixes: b86305bb
("nir/algebraic: collapse conversion opcodes (many patterns)")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Signed-off-by: Isabella Basso <isabellabdoamaral@usp.br >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20965 >
2023-03-11 17:21:37 +00:00
Alyssa Rosenzweig
2ba48eea88
nir/lower_point_size: Use shader_instructions_pass
...
Sleepy code deletion mood.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21750 >
2023-03-11 16:42:36 +00:00
Ian Romanick
0cadc3830f
nir/lower_int64: Optionally lower ufind_msb using uadd_sat
...
v2: Fix inverted condition for applying the optimization. Noticed by
Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042 >
2023-03-10 15:27:17 +00:00
Ian Romanick
831f9d3f61
nir/algebraic: Optimize some ifind_msb to ufind_msb
...
On Intel platforms, the uclz lowering if ufind_msb is either one
instruction better (Gfx7 and newer) or two instructions better (all
older platforms) than the ifind_msb implementations.
On platforms that use lower_find_msb_to_reverse, there should be no
difference.
All Haswell and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19938662 -> 19938634 (<.01%)
instructions in affected programs: 850 -> 822 (-3.29%)
helped: 2 / HURT: 0
total cycles in shared programs: 858467067 -> 858465538 (<.01%)
cycles in affected programs: 10080 -> 8551 (-15.17%)
helped: 2 / HURT: 0
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042 >
2023-03-10 15:27:17 +00:00
Ian Romanick
db6d1edc1b
nir: Restrict ufind_msb and ufind_msb_rev to 32- or 64-bit sources
...
4d802df3aa
loosened the type restrictions
on these opcodes to enable support for 64-bit ballot operations. In
doing so, it enabled 8-bit and 16-bit sizes as well.
It's impossible to get these sizes through GLSL or SPIR-V. None of the
lowering in nir_opt_algebraic can handle non-32-bit sizes. Almost no
drivers can handle non-32-bit sizes.
It doesn't seem possible to enforce anything other than "one bit size"
or "all bit sizes" in nir_opcodes.py. The only way it seems possible to
enforce this is in nir_validate. This is not ideal, but it be what it
be.
v2: Remove restriction on find_lsb. It is acutally possible to get this
via GLSL by doing findLSB() on a lowp value. findMSB declares its
parameter as highp, so that path is still impossible.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042 >
2023-03-10 15:27:17 +00:00
Ian Romanick
2d6f48f6ef
nir/algebraic: Do not generate 8- or 16-bit find_msb
...
The next commit will add validation to restrict this instruction (and
others) to only 32-bit or 64-bit sources.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042 >
2023-03-10 15:27:17 +00:00
Ian Romanick
2119ab7319
nir/builder: Do not generate 8- or 16-bit find_msb
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042 >
2023-03-10 15:27:17 +00:00
Ian Romanick
28311f9d02
nir: intel/compiler: Move ufind_msb lowering to NIR
...
Fossil-db results:
All Intel platforms had similar results. (Ice Lake shown)
Cycles in all programs: 9098346105 -> 9098333765 (-0.0%)
Cycles helped: 6
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042 >
2023-03-10 15:27:17 +00:00
Ian Romanick
a4052e70ea
nir/algebraic: Only lower ufind_msb with 32-bit sources
...
The 31-ufind_msb_rev(x) lowering only produces the correct result for
32-bit sources. ufind_msb_rev can also have 64-bit sources, and most
platforms are expected to lower this to 32-bit instructions with extra
logic operations.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042 >
2023-03-10 15:27:17 +00:00
Ian Romanick
0cc7bf63b7
nir: intel/compiler: Move ifind_msb lowering to NIR
...
Unlike ufind_msb, ifind_msb is only defined in NIR for 32-bit values, so
no @32 annotation is required.
No shader-db or fossil-db changes on any Intel platform.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042 >
2023-03-10 15:27:17 +00:00
Ian Romanick
66840b98e4
nir: ifind_msb_rev can only have int32 sources
...
Just like ifind_msb.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042 >
2023-03-10 15:27:17 +00:00
Eric Engestrom
f5d3d1e7ed
meson: inline gtest_test_protocol now that it's always 'gtest'
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21485 >
2023-03-10 07:20:29 +00:00
antonino
3a59b2a670
nir: handle output beeing written to deref in nir_lower_point_smooth
...
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21731 >
2023-03-09 04:38:24 +00:00
Daniel Schürmann
3073810397
nir/gather_info: allow terminate() in non-PS
...
RADV will use terminate() to end ray-tracing shaders.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21736 >
2023-03-08 16:59:41 +00:00
Rhys Perry
98cb7e0108
nir: add nir_lower_alu_width_test.fdot_order
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20812 >
2023-03-08 14:38:26 +00:00
Rhys Perry
50f7e21481
nir: make fdph lowering match fdot
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20812 >
2023-03-08 14:38:26 +00:00
Rhys Perry
3668da7c83
nir: use xyzw order for precise fdot
...
Fixes flickering grass in Immortals Fenyx Rising.
fossil-db (gfx1100):
Totals from 13969 (10.38% of 134574) affected shaders:
MaxWaves: 442794 -> 442878 (+0.02%)
Instrs: 4861105 -> 4901408 (+0.83%); split: -0.02%, +0.85%
CodeSize: 24316100 -> 24396272 (+0.33%); split: -0.03%, +0.35%
VGPRs: 446256 -> 445572 (-0.15%); split: -0.20%, +0.05%
Latency: 28122456 -> 28162233 (+0.14%); split: -0.10%, +0.24%
InvThroughput: 2899673 -> 2904323 (+0.16%); split: -0.07%, +0.23%
VClause: 119599 -> 119631 (+0.03%); split: -0.07%, +0.09%
SClause: 186636 -> 186265 (-0.20%); split: -0.23%, +0.03%
Copies: 301370 -> 300386 (-0.33%); split: -0.75%, +0.42%
Branches: 85066 -> 85047 (-0.02%); split: -0.02%, +0.00%
PreSGPRs: 436167 -> 436137 (-0.01%)
PreVGPRs: 329715 -> 329809 (+0.03%); split: -0.01%, +0.04%
fossil-db (gfx1100, RADV_DEBUG=invariantgeom):
Totals from 43116 (32.04% of 134574) affected shaders:
MaxWaves: 1332938 -> 1333012 (+0.01%); split: +0.01%, -0.00%
Instrs: 16424513 -> 16658021 (+1.42%); split: -0.06%, +1.48%
CodeSize: 81258868 -> 81827860 (+0.70%); split: -0.07%, +0.77%
VGPRs: 1720368 -> 1719648 (-0.04%); split: -0.19%, +0.15%
SpillSGPRs: 1670 -> 1600 (-4.19%); split: -5.27%, +1.08%
Latency: 82063766 -> 82425418 (+0.44%); split: -0.23%, +0.67%
InvThroughput: 9665803 -> 9727810 (+0.64%); split: -0.09%, +0.73%
VClause: 449662 -> 451099 (+0.32%); split: -0.32%, +0.64%
SClause: 498841 -> 498639 (-0.04%); split: -0.24%, +0.20%
Copies: 1001020 -> 1000770 (-0.02%); split: -1.20%, +1.17%
Branches: 237580 -> 239637 (+0.87%); split: -0.01%, +0.88%
PreSGPRs: 1198167 -> 1198024 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 1225202 -> 1225035 (-0.01%); split: -0.06%, +0.05%
fossil-db (navi10):
Totals from 13969 (10.38% of 134563) affected shaders:
MaxWaves: 474386 -> 474508 (+0.03%); split: +0.05%, -0.03%
Instrs: 3740895 -> 3771566 (+0.82%); split: -0.00%, +0.82%
CodeSize: 19426592 -> 19459916 (+0.17%); split: -0.00%, +0.18%
VGPRs: 389916 -> 389852 (-0.02%); split: -0.09%, +0.07%
Latency: 25452927 -> 25502482 (+0.19%); split: -0.14%, +0.34%
InvThroughput: 3880807 -> 3923144 (+1.09%); split: -0.07%, +1.16%
VClause: 66835 -> 66712 (-0.18%); split: -0.38%, +0.20%
SClause: 178805 -> 178802 (-0.00%); split: -0.01%, +0.01%
Copies: 167601 -> 167625 (+0.01%); split: -0.54%, +0.56%
Branches: 83788 -> 83784 (-0.00%)
PreSGPRs: 388229 -> 388216 (-0.00%)
PreVGPRs: 342984 -> 343062 (+0.02%); split: -0.01%, +0.03%
fossil-db (navi10, RADV_DEBUG=invariantgeom):
Totals from 43116 (32.04% of 134563) affected shaders:
MaxWaves: 1260184 -> 1256414 (-0.30%); split: +0.10%, -0.40%
Instrs: 12804951 -> 12983628 (+1.40%); split: -0.01%, +1.41%
CodeSize: 65813224 -> 66137852 (+0.49%); split: -0.03%, +0.52%
VGPRs: 1556396 -> 1561340 (+0.32%); split: -0.09%, +0.41%
SpillSGPRs: 1377 -> 1395 (+1.31%)
Latency: 76095867 -> 76355111 (+0.34%); split: -0.32%, +0.66%
InvThroughput: 13546863 -> 13788789 (+1.79%); split: -0.05%, +1.84%
VClause: 310910 -> 311283 (+0.12%); split: -0.63%, +0.75%
SClause: 474878 -> 474941 (+0.01%); split: -0.09%, +0.10%
Copies: 639367 -> 637610 (-0.27%); split: -1.03%, +0.76%
Branches: 240178 -> 240185 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1056594 -> 1056590 (-0.00%); split: -0.00%, +0.00%
PreVGPRs: 1247950 -> 1247798 (-0.01%); split: -0.05%, +0.04%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7920
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20812 >
2023-03-08 14:38:26 +00:00
Marek Olšák
f7076d129d
amd: add nir_intrinsic_xfb_counter_sub_amd and fix overflowed streamout offsets
...
Fixes: 5ec79f9899
- ac/nir/ngg: nogs support streamout
Reviewed-by: Qiang Yu <yuq825@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21584 >
2023-03-07 22:08:47 +00:00
Lionel Landwerlin
a278eeb719
nir: fix nir_ishl_imm
...
Both GLSL & SPIRV have undefined values for shift > bitsize. But SM5
says :
"This instruction performs a component-wise shift of each 32-bit
value in src0 left by an unsigned integer bit count provided by
the LSB 5 bits (0-31 range) in src1, inserting 0."
Better to not hard code the wrong behavior in NIR.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Fixes: e227bb9fd5
("nir/builder: add ishl_imm helper")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@colllabora.com >
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21720 >
2023-03-07 08:14:34 +00:00
Alyssa Rosenzweig
f47ea3f992
glsl/nir: Use scoped_barrier for control barrier
...
Rather than control_barrier. This avoids the need to handle control_barrier at
all for backends that set use_scoped_barrier. This effectively matches what
spirv_to_nir emits, so Vulkan-capable compilers should be ok.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Reviewed-by: Emma Anholt <emma@anholt.net >
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21634 >
2023-03-07 00:41:13 +00:00
Alyssa Rosenzweig
952bd63d6d
nir/opt_barrier: Generalize to control barriers
...
For GLSL, we want to optimize code like
memoryBarrierBuffer();
controlBarrier();
into a single scoped_barrier intrinsic for the backend to consume. Now that
backends can get scoped_barriers everywhere, what's left is enabling backends to
combine these barriers together. We already have an Intel-specific pass for
combining memory barriers; it just needs a teensy bit of generalization to allow
combining all sorts of barriers together.
This avoids code quality regression on Asahi when switching to purely scoped
barriers. It's probably useful for other backends too.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21661 >
2023-03-06 22:09:27 +00:00
Alyssa Rosenzweig
282aeb9b9c
nir/lower_tex: Add lower_index_to_offset
...
Some backends can handle a constant texture index or a dynamic texture index but
not a constant texture index plus a dynamic texture offset. Add a nir_lower_tex
option to lower to one of these options.
v2: Use more straightforward code proposed by Faith.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Reviewed-by: Emma Anholt <emma@anholt.net > [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21546 >
2023-03-06 21:38:32 +00:00
Erik Faye-Lund
c305f97257
nir: add a print_internal debug-flag
...
It can sometimes be useful to also print the shaders that are marked as
internal, so let's add a flag that lets us do that.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21681 >
2023-03-06 09:13:52 +00:00
Alyssa Rosenzweig
586da7b329
nir: Add nir_lower_helper_writes pass
...
This NIR pass lowers stores in fragment shaders to:
if (!gl_HelperInvocaton) {
store();
}
This implements the API requirement that helper invocations do not have visible
side effects, and the lowering is required on any hardware that cannot directly
mask helper invocation's side effects. The pass was originally written for
Midgard (which has this issue) but is also needed for Asahi. Let's share the
code, and fix it while we're at it.
Changes from the Midgard pass:
1. Add an option to only lower atomics.
AGX hardware can mask helper invocations for "plain" stores but not for
atomics. Accordingly, the AGX compiler wants this lowering for atomics but
not store_global. By contrast, Midgard cannot mask any stores and needs the
lowering for all store intrinsics. Add an option to the common pass to
accommodate both cases.
This is an optimization for AGX. It is not required for correctness, this
lowering is always legal.
2. Fix dominance issues.
It's invalid to have NIR like
if ... {
ssa_1 = ...
}
foo ssa_1
Instead we need to rewrite as
if ... {
ssa_1 = ...
} else {
ssa_2 = undef
}
ssa_3 = phi ssa_1, ssa_2
foo ssa_3
By default, neither nir_validate nor the backends check this, so this doesn't
currently fix a (known) real bug. But it's still invalid and fails validation
with NIR_DEBUG=validate_ssa_dominance.
Fix this in lower_helper_writes for intrinsics that return data (atomics).
3. Assert that the pass is run only for fragment shaders. This encourages
backends to be judicious about which passes they call instead of just
throwing everything in a giant lower everything spaghetti.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Reviewed-by: Italo Nicola <italonicola@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21413 >
2023-03-04 13:31:05 -05:00
Rhys Perry
aa32dc704f
nir/range_analysis: fix vectorized phis and intrinsics
...
Found by inspection.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-By: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21288 >
2023-03-04 12:58:38 +00:00
Marek Olšák
b80bd58265
nir: skip nir_op_unpack_32_4x8 in nir_lower_alu_width
...
The pass can't handle it just like the other unpack opcodes and generates
invalid NIR.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19399 >
2023-03-03 03:27:40 +00:00
Marek Olšák
ec38758e86
nir: return progress from nir_lower_io_to_scalar
...
oversight?
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19399 >
2023-03-03 03:27:40 +00:00
Faith Ekstrand
c11ac5e446
nir: Handle wider unaligned loads in lower_mem_access_bit_size
...
Reviewed-by: M Henning <drawoc@darkrefraction.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524 >
2023-03-03 02:00:39 +00:00
Faith Ekstrand
7e8a10be67
nir: Make chunk_align_offset const in lower_mem_load()
...
This should make things more clear than changing the value from earlier
in the loop. Also, rename chunk_offset to load_offset so they match.
Reviewed-by: M Henning <drawoc@darkrefraction.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524 >
2023-03-03 02:00:39 +00:00
Faith Ekstrand
eb9a56b6ca
nir: Rename nir_mem_access_size_align::align_mul to align
...
It's a simple alignment so calling it align_mul is a bit misleading.
Suggested-by: M Henning <drawoc@darkrefraction.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524 >
2023-03-03 02:00:39 +00:00
Faith Ekstrand
802bf1d9a6
nir: Rename align to whole_align in lower_mem_load
...
Reviewed-by: M Henning <drawoc@darkrefraction.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524 >
2023-03-03 02:00:39 +00:00
Faith Ekstrand
ca4d73ba36
nir: Add a combined alignment helper
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@colllabora.com >
Reviewed-by: M Henning <drawoc@darkrefraction.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524 >
2023-03-03 02:00:39 +00:00
Faith Ekstrand
e433a7c4fa
nir: Add UBO support to nir_lower_mem_access_bit_sizes
...
Reviewed-by: Jesse Natalie <jenatali@microsoft.com >
Reviewed-by: M Henning <drawoc@darkrefraction.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524 >
2023-03-03 02:00:39 +00:00
Faith Ekstrand
116a851264
nir: Add mode filtering to lower_mem_access_bit_sizes
...
Reviewed-by: Jesse Natalie <jenatali@microsoft.com >
Reviewed-by: M Henning <drawoc@darkrefraction.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524 >
2023-03-03 02:00:39 +00:00
Faith Ekstrand
4b06b1a7c5
nir: Check against combined alignment in nir_lower_mem_access_bit_sizes
...
Checking against align_mul is insufficient if align_offset > 0. We need
to check against the combined alignment instead.
Fixes: 2e2d7803c7
("nir: Add a load/store bit size lowering pass")
Reviewed-by: M Henning <drawoc@darkrefraction.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524 >
2023-03-03 02:00:39 +00:00
Georg Lehmann
0a3387a190
nir/lower_mediump: don't use fp16 for constants if the result is denormal
...
Image stores are not required to preserve denorms.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21622 >
2023-03-02 11:42:10 +00:00
Timothy Arceri
d75a36a9ee
glsl: remove do_copy_propagation_elements() optimisation pass
...
Since 13b859de
do_copy_propagation_elements() has a flaw where
the time it takes to complete grows exponentially slowers as the number
of nested loops increases. It can also hurt rather than help verses
just letting NIR optimise the code. So if the NIR linker is enabled we
let it handle it instead.
shader-db results Iris (BDW):
total instructions in shared programs: 11177181 -> 11199739 (0.20%)
instructions in affected programs: 119424 -> 141982 (18.89%)
helped: 109
HURT: 65
total cycles in shared programs: 368946819 -> 372277173 (0.90%)
cycles in affected programs: 116539428 -> 119869782 (2.86%)
total spills in shared programs: 3983 -> 8785 (120.56%)
spills in affected programs: 2072 -> 6874 (231.76%)
helped: 0
HURT: 6
total fills in shared programs: 2016 -> 6068 (200.99%)
fills in affected programs: 230 -> 4282 (1761.74%)
helped: 0
HURT: 6
LOST: 85
GAINED: 77
freedreno results:
total instructions in shared programs: 11011122 -> 11011620 (<.01%)
instructions in affected programs: 939829 -> 940327 (0.05%)
total full in shared programs: 762725 -> 762674 (<.01%)
full in affected programs: 1096 -> 1045 (-4.65%)
total constlen in shared programs: 1772092 -> 1771596 (-0.03%)
constlen in affected programs: 2780 -> 2284 (-17.84%)
total stp in shared programs: 4040 -> 4058 (0.45%)
stp in affected programs: 3656 -> 3674 (0.49%)
total ldp in shared programs: 2160 -> 2178 (0.83%)
ldp in affected programs: 1748 -> 1766 (1.03%)
stp HURT: shaders/robclark-shaders/gfxbench5/gl_5_high_off/13.shader_test CL: 1231 -> 1234 (0.24%)
stp HURT: shaders/robclark-shaders/gfxbench5/gl_5_normal_off/13.shader_test CL: 1231 -> 1234 (0.24%)
stp HURT: shaders/robclark-shaders/gfxbench5/gl_5_high_off/15.shader_test CL: 453 -> 456 (0.66%)
stp HURT: shaders/robclark-shaders/gfxbench5/gl_5_normal_off/15.shader_test CL: 453 -> 456 (0.66%)
stp HURT: shaders/robclark-shaders/gfxbench5/gl_5_high_off/17.shader_test CL: 144 -> 147 (2.08%)
stp HURT: shaders/robclark-shaders/gfxbench5/gl_5_normal_off/17.shader_test CL: 144 -> 147 (2.08%)
however, those stp counts are misleading -- gfxbench gl-5-normal actually
gets its scratch (ldp/stp) stored as 16 bits instead of 32 thanks to
better NIR copy prop, and the result is 2.64398% +/- 0.0991923% perf
improvement!
i915 results:
total instructions in shared programs: 510528 -> 510489 (<.01%)
instructions in affected programs: 3303 -> 3264 (-1.18%)
total tex_indirect in shared programs: 16708 -> 16717 (0.05%)
tex_indirect in affected programs: 134 -> 143 (6.72%)
total temps in shared programs: 30181 -> 30169 (-0.04%)
temps in affected programs: 1268 -> 1256 (-0.95%)
LOST: 0
GAINED: 1
i915 highlights:
instructions HURT: shaders/closed/steam/legend-of-grimrock/47.shader_test FS: 141 -> 144 (2.13%)
instructions HURT: shaders/closed/steam/steamworld-dig/22.shader_test FS: 84 -> 108 (28.57%)
temps HURT: shaders/closed/steam/left-4-dead-2/medium/3682.shader_test FS: 7 -> 13 (85.71%)
r300 results:
total instructions in shared programs: 1340439 -> 1340845 (0.03%)
instructions in affected programs: 32354 -> 32760 (1.25%)
total temps in shared programs: 179394 -> 179329 (-0.04%)
temps in affected programs: 1505 -> 1440 (-4.32%)
total consts in shared programs: 1177742 -> 1177885 (0.01%)
consts in affected programs: 1107 -> 1250 (12.92%)
total lits in shared programs: 24992 -> 25019 (0.11%)
lits in affected programs: 138 -> 165 (19.57%)
instructions HURT: shaders/closed/steam/legend-of-grimrock/26.shader_test FS: 47 -> 52 (10.64%)
instructions HURT: shaders/closed/steam/sanctum-2/6072.shader_test FS: 43 -> 48 (11.63%)
instructions HURT: shaders/closed/steam/champions-of-regnum/2378.shader_test VS: 35 -> 40 (14.29%)
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13288 >
2023-03-01 16:09:25 +00:00
Emma Anholt
106019a5d8
nir/split_64bit_vec3_and_vec4: Handle 64-bit matrix types.
...
The offset handling should already work for flattening to our split vars,
just need to make sure we have enough (or any!) array elements.
Fixes : #7154
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13288 >
2023-03-01 16:09:25 +00:00
Caio Oliveira
5f79e78911
spirv: Add skip_os_break_in_debug_build option to use in unit tests
...
When running in the CI environment, instead of crashing the test
binary, it is preferable to just fail gracefully (in this case return
a NULL shader) like is done in release mode, so other tests continue
to be executed.
For convenience add a variable break_on_failure to the test so the
breaking behavior can be re-enable in individual tests when debugging.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21512 >
2023-03-01 13:47:57 +00:00
Caio Oliveira
8a91a33b7c
spirv/tests: Add some basic control flow tests
...
The DISABLED test currently fails parsing.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21512 >
2023-03-01 13:47:57 +00:00
Caio Oliveira
4e5b520286
spirv/tests: Parametrize stage in get_nir() helper
...
Reviewed-by: Jesse Natalie <jenatali@microsoft.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21512 >
2023-03-01 13:47:57 +00:00
Caio Oliveira
131f328a18
spirv/tests: Add script to generate C array from SPIR-V source
...
This is useful for generating the C code to embed the SPIR-V
when adding a new test.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21512 >
2023-03-01 13:47:57 +00:00
Caio Oliveira
17e0c75441
spirv/tests: Subclass spirv_test helper to namespace the tests
...
Reviewed-by: Jesse Natalie <jenatali@microsoft.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21512 >
2023-03-01 13:47:57 +00:00
Georg Lehmann
aeb68c29b4
nir/opt_algebraic: add patterns for iand/ior of feq/fneu with 0
...
Foz-DB Navi21:
Totals from 1245 (0.92% of 134913) affected shaders:
VGPRs: 66232 -> 66248 (+0.02%); split: -0.01%, +0.04%
CodeSize: 5874976 -> 5868168 (-0.12%); split: -0.17%, +0.05%
MaxWaves: 25278 -> 25274 (-0.02%); split: +0.01%, -0.02%
Instrs: 1087502 -> 1085267 (-0.21%); split: -0.21%, +0.00%
Latency: 6531489 -> 6531672 (+0.00%); split: -0.04%, +0.05%
InvThroughput: 1531774 -> 1532327 (+0.04%); split: -0.02%, +0.05%
VClause: 22218 -> 22202 (-0.07%); split: -0.08%, +0.00%
SClause: 45906 -> 45873 (-0.07%); split: -0.08%, +0.01%
Copies: 64004 -> 64102 (+0.15%); split: -0.24%, +0.39%
Branches: 21529 -> 21534 (+0.02%); split: -0.00%, +0.03%
PreSGPRs: 51936 -> 51850 (-0.17%)
PreVGPRs: 55393 -> 55398 (+0.01%); split: -0.02%, +0.03%
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21576 >
2023-03-01 11:24:43 +00:00
Caio Oliveira
863cbb3e02
spirv: Don't specify nir_var_uniform or nir_var_mem_ubo in barriers
...
These are constant read-only data and don't need to be synchronized.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21517 >
2023-03-01 09:53:29 +00:00
Eric Engestrom
78c95b2865
glsl: align definition of _mesa_problem with the one in main/error.h
...
The ctx pointer not used by that function anyway, so const'ing it makes
no difference.
Signed-off-by: Eric Engestrom <eric@igalia.com >
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com >
Reviewed-by: Marek Olšák <maraeo@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21557 >
2023-02-28 09:04:47 +00:00