radeonsi: pass VS->TCS IO via VGPRs if VS and TCS have the same thread count

This is only possible when a TCS input is read without indirect indexing and
with gl_InvocationID as the vertex index, and when the VS and TCS run the same
number of threads.

This eliminates LDS stores and loads for VS->TCS IO, reducing shader lifetime
and LDS traffic.
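
The eligibility conditions above can be sketched as a single predicate. This is
an illustration only, not the code from this commit: the struct, its fields, and
the function name are hypothetical, and the real driver derives this information
from the shader IR and pipeline state.

```c
#include <stdbool.h>

/* Hypothetical description of one TCS input read; not a Mesa type. */
struct tcs_input_load {
   bool indirect_index;          /* input is selected by a non-constant index */
   bool vertex_is_invocation_id; /* vertex index is exactly gl_InvocationID */
};

/* Returns true when this read could take the VS output straight from VGPRs
 * instead of going through LDS. All three conditions from the commit message
 * must hold. */
static bool can_pass_via_vgprs(const struct tcs_input_load *load,
                               unsigned vs_threads, unsigned tcs_threads)
{
   return !load->indirect_index &&
          load->vertex_is_invocation_id &&
          vs_threads == tcs_threads;
}
```

The intuition, as far as the commit message implies, is that each TCS thread then
only needs the value produced by the VS thread in the same position, so nothing
has to be exchanged between threads through LDS.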

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
Marek Olšák
2020-11-14 17:24:11 -05:00
committed by Marge Bot
parent 6f13034265
commit 61fe66a2e4
5 changed files with 47 additions and 2 deletions

@@ -46,6 +46,7 @@ struct si_shader_output_values {
struct si_shader_context {
struct ac_llvm_context ac;
struct si_shader *shader;
struct si_shader_selector *next_shader_sel;
struct si_screen *screen;
gl_shader_stage stage;