aco, nir, ac: Simplify sequence of getting initial NGG VS edge flags.

Instead of v_bfe + v_lshl_or for each vertex, get all 3 edge flags
at once of every vertex. This takes fewer VALU instructions than
previously.

Fossil DB results on Sienna Cichlid (with NGGC on):

Totals from 56917 (44.24% of 128647) affected shaders:
CodeSize: 161028288 -> 158751628 (-1.41%)
Instrs: 30917985 -> 30519571 (-1.29%)
Latency: 130617204 -> 129975532 (-0.49%); split: -0.50%, +0.01%
InvThroughput: 21280238 -> 20927401 (-1.66%)
Copies: 3011120 -> 3011125 (+0.00%); split: -0.00%, +0.00%

No Fossil DB changed with NGGC off.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11908>
This commit is contained in:
Timur Kristóf
2021-07-15 13:56:18 +02:00
committed by Marge Bot
parent b157a5d0d6
commit 1bbea90f50
5 changed files with 15 additions and 18 deletions

View File

@@ -513,7 +513,7 @@ visit_intrinsic(nir_shader *shader, nir_intrinsic_instr *instr)
case nir_intrinsic_has_input_vertex_amd:
case nir_intrinsic_has_input_primitive_amd:
case nir_intrinsic_load_packed_passthrough_primitive_amd:
case nir_intrinsic_load_initial_edgeflag_amd:
case nir_intrinsic_load_initial_edgeflags_amd:
case nir_intrinsic_gds_atomic_add_amd:
is_divergent = true;
break;