agx: Insert jmp_exec_none instructions

With the exception of the backwards branch for loops, all the control flow we
insert during instruction selection just predicates instructions rather than
actually jumping around. That means, for example, we execute both sides of the
if even for a uniform condition! That's inefficient. The solution is to insert
jmp_exec_none instructions after control flow in order to skip unexecuted
regions, which is much faster than predicating them out. However, jmp_exec_none
is costly in itself, so we need to use a heuristic to determine when it's
actually beneficial.
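The trade-off can be sketched with a toy cost model. This is an illustration only, not the heuristic this commit implements: every name below (`toy_region_cost`, `toy_should_insert_jump`, `toy_uniform_if_cost`) is hypothetical, the jump cost of 4 instruction slots is an assumed number, and real hardware costs are not this simple.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed fixed cost of one jmp_exec_none, in instruction slots.
 * Hypothetical value for illustration; not measured on hardware. */
#define TOY_JMP_COST 4u

/* Cost of one predicated region of n_instrs instructions.
 * Without a jump, every instruction issues even if all lanes are off.
 * With a jump, an all-lanes-off region costs only the jump itself. */
static unsigned
toy_region_cost(unsigned n_instrs, bool any_lane_active, bool has_jump)
{
   /* A jump around an empty region would be pointless. */
   assert(!(has_jump && n_instrs == 0));

   unsigned jump = has_jump ? TOY_JMP_COST : 0u;

   if (!any_lane_active && has_jump)
      return jump; /* whole region skipped */

   return n_instrs + jump; /* region runs, the jump is pure overhead */
}

/* Toy heuristic: only pay for a jump when the region it could skip is
 * longer than the jump itself. */
static bool
toy_should_insert_jump(unsigned n_instrs)
{
   return n_instrs > TOY_JMP_COST;
}

/* Total cost of a uniform if/else where every lane takes the "then"
 * side, so the "else" side runs with no lanes active. */
static unsigned
toy_uniform_if_cost(unsigned n_then, unsigned n_else, bool use_heuristic)
{
   bool jump_then = use_heuristic && toy_should_insert_jump(n_then);
   bool jump_else = use_heuristic && toy_should_insert_jump(n_else);

   return toy_region_cost(n_then, true, jump_then) +
          toy_region_cost(n_else, false, jump_else);
}
```

Under these assumptions, a uniform if with a 2-instruction then-side and a 64-instruction else-side costs 66 slots when purely predicated and 6 slots with the jump. Short regions get no jump at all, so they never pay the overhead.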

This commit uses a very simple heuristic. Even so, it is a massive speed-up for
Dolphin uber shaders: 39fps -> 67fps at 2x resolution, roughly a 1.7x
improvement.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig
2023-08-30 16:03:46 -04:00
parent 79c4d4213c
commit d83d24e96a
4 changed files with 186 additions and 0 deletions

@@ -2621,6 +2621,7 @@ agx_compile_function_nir(nir_shader *nir, nir_function_impl *impl,
    agx_insert_waits(ctx);
    agx_opt_empty_else(ctx);
    agx_opt_break_if(ctx);
+   agx_opt_jmp_none(ctx);
    agx_lower_pseudo(ctx);
    if (agx_should_dump(nir, AGX_DBG_SHADERS))