nir: intel/compiler: Add and use nir_op_pack_32_4x8_split

A lot of CTS tests write a u8vec4 or an i8vec4 to an SSBO.  This results
in a lot of shifts and MOVs.  When that pattern can be recognized, the
individual 8-bit components can be packed much more efficiently.

v2: Rebase on b4369de27f ("nir/lower_packing: use
shader_instructions_pass")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
This commit is contained in:
Ian Romanick
2021-01-25 16:31:17 -08:00
committed by Marge Bot
parent 89f639c0ca
commit f0a8a9816a
6 changed files with 31 additions and 2 deletions

View File

@@ -68,6 +68,7 @@
.lower_usub_sat64 = true, \
.lower_hadd64 = true, \
.avoid_ternary_with_two_constants = true, \
.has_pack_32_4x8 = true, \
.max_unroll_iterations = 32, \
.force_indirect_unrolling = nir_var_function_temp