agx: Insert jmp_exec_none instructions

With the exception of the backwards branch for loops, all the control flow we insert during instruction selection just predicates instructions rather than actually jumping around. That means, for example, that we execute both sides of an if even for a uniform condition! That's inefficient.

The solution is to insert jmp_exec_none instructions after control flow in order to skip unexecuted regions, which is much faster than predicating them out. However, jmp_exec_none is costly in itself, so we need a heuristic to determine when it's actually beneficial.

This pass uses a very simple heuristic. Even so, it is a massive speed-up for Dolphin uber shaders: 39fps -> 67fps at 2x resolution, roughly a 1.7x improvement!

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
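
To illustrate the trade-off described above, here is a minimal sketch of a size-based heuristic in a toy model. It is not the code added by this commit: the `toy_region` struct, the `TOY_JMP_COST` constant, and every function name below are hypothetical, and the real pass operates on the AGX IR rather than on bare instruction counts.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model: a predicated region is reduced to the number of
 * instructions that still issue when no thread is active. */
struct toy_region {
   unsigned num_instrs;
   bool jmp_inserted;
};

/* Assumed cost of the jump itself, in "instruction equivalents". This is an
 * illustrative tuning constant, not a value taken from the driver. */
#define TOY_JMP_COST 4

/* Jump only when skipping the region saves more than the jump costs. */
static bool
toy_should_insert_jmp(const struct toy_region *r)
{
   return r->num_instrs > TOY_JMP_COST;
}

/* Mark every region where a jump is expected to pay off and return how many
 * jumps were inserted. */
static unsigned
toy_opt_jmp_none(struct toy_region *regions, size_t count)
{
   unsigned inserted = 0;

   for (size_t i = 0; i < count; ++i) {
      regions[i].jmp_inserted = toy_should_insert_jmp(&regions[i]);
      inserted += regions[i].jmp_inserted;
   }

   return inserted;
}

/* Modelled cost of a region when no thread is active: either we pay for the
 * jump and skip the body, or we issue every predicated-off instruction. */
static unsigned
toy_cost_when_inactive(const struct toy_region *r)
{
   return r->jmp_inserted ? TOY_JMP_COST : r->num_instrs;
}
```

In this model a short region keeps plain predication because the jump would cost more than it saves, while a long region gets a jump and its inactive cost drops to the cost of the jump alone.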
@@ -2621,6 +2621,7 @@ agx_compile_function_nir(nir_shader *nir, nir_function_impl *impl,
    agx_insert_waits(ctx);
    agx_opt_empty_else(ctx);
    agx_opt_break_if(ctx);
+   agx_opt_jmp_none(ctx);
    agx_lower_pseudo(ctx);
 
    if (agx_should_dump(nir, AGX_DBG_SHADERS))