nir: intel/compiler: Add and use nir_op_pack_32_4x8_split
A lot of CTS tests write a u8vec4 or an i8vec4 to an SSBO. This results
in a lot of shifts and MOVs. When that pattern can be recognized, the
individual 8-bit components can be packed much more efficiently.
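
For illustration only, a scalar C model of what packing four 8-bit
components into one 32-bit value computes. It is a sketch, not the NIR or
backend implementation. The function name simply mirrors the opcode, and
the byte order (first component in the least significant byte) is an
assumption made here.

```c
#include <stdint.h>

/* Hypothetical scalar model of the packing described above.
 * Assumption: component x occupies bits 0..7, y bits 8..15,
 * z bits 16..23 and w bits 24..31 of the result. */
static uint32_t
pack_32_4x8_split(uint8_t x, uint8_t y, uint8_t z, uint8_t w)
{
   /* Widen each byte before shifting so no bits are lost. */
   return (uint32_t)x |
          ((uint32_t)y << 8) |
          ((uint32_t)z << 16) |
          ((uint32_t)w << 24);
}
```

Under that assumption, packing the bytes 0x11, 0x22, 0x33, 0x44 gives
0x44332211.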
v2: Rebase on b4369de27f ("nir/lower_packing: use
shader_instructions_pass")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
@@ -68,6 +68,7 @@
    .lower_usub_sat64 = true,                                                  \
    .lower_hadd64 = true,                                                      \
    .avoid_ternary_with_two_constants = true,                                  \
+   .has_pack_32_4x8 = true,                                                   \
    .max_unroll_iterations = 32,                                               \
    .force_indirect_unrolling = nir_var_function_temp