lima/gpir: Rework the scheduler

Scheduling now happens at the same time as value register allocation.
The ready list acts similarly to the array of registers in
value_regalloc, keeping us from running out of slots. Before this, the
value register allocator wasn't aware of the scheduling constraints of
the actual machine, which meant that it sometimes chose the wrong false
dependencies to insert. Because value registers are now assigned while
instructions are actually being scheduled, those choices reflect reality
much better. The old allocator was also conservative in some cases where
the new scheme doesn't have to be. For example, in something like:

1 = ld_att
2 = ld_uni
3 = add 1, 2

It's possible that one of 1 and 2 can't be scheduled in the same
instruction as 3, meaning that a move needs to be inserted, so the old
value register allocator had to assume that this sequence requires two
registers. But when actually scheduling, we could discover that 1, 2,
and 3 can all be scheduled together, so that they only require one
register. The new scheduler speculatively inserts the instruction under
consideration, as well as all of its child load instructions, and then
counts the number of live value registers after all is said and done.
This lets us be more aggressive with scheduling when we're close to the
limit.
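
The difference between the two estimates can be sketched with a toy
model. This is not the code from this commit: the structs, the function
names, the slot counts, and the limit of 11 live values are all
hypothetical, chosen only to show why a trial placement can report fewer
live value registers than the pessimistic assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical limit on simultaneously live value registers. */
#define MAX_LIVE_VALUES 11

/* Toy model of the instruction currently being filled. */
struct trial_instr {
    int free_alu_slots;
    int free_load_slots;
};

/* Toy model of a ready-list node and its not-yet-scheduled child loads. */
struct candidate {
    int num_child_loads;
};

/* Pessimistic estimate: assume no child load can share the instruction,
 * so every child load ends up holding a value register. */
int conservative_live_count(int live_before, const struct candidate *c)
{
    assert(live_before >= 1); /* the candidate itself is currently live */
    return live_before - 1 + c->num_child_loads;
}

/* Trial placement: put the node and as many child loads as fit into a
 * copy of the instruction, then count what is still live. The struct is
 * passed by value, so the caller's real state is left untouched.
 * Returns -1 if the node itself does not fit. */
int speculative_live_count(struct trial_instr instr, int live_before,
                           const struct candidate *c)
{
    assert(live_before >= 1);
    if (instr.free_alu_slots == 0)
        return -1;
    instr.free_alu_slots--;

    int live = live_before - 1; /* the node no longer waits on the ready list */
    for (int i = 0; i < c->num_child_loads; i++) {
        if (instr.free_load_slots > 0)
            instr.free_load_slots--; /* load issues alongside its user */
        else
            live++;                  /* load stays live and needs a register */
    }
    return live;
}

/* Accept the candidate only if the trial stays within the limit. */
bool can_schedule(struct trial_instr instr, int live_before,
                  const struct candidate *c)
{
    int live = speculative_live_count(instr, live_before, c);
    return live >= 0 && live <= MAX_LIVE_VALUES;
}
```

In this model, a node with two child loads that is considered while 11
values are live is rejected by the pessimistic estimate (12 live
values), but accepted by the trial placement when both loads fit into
the same instruction.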

With the new scheduler, the kmscube vertex shader is now scheduled in 40
instructions, versus 66 before.

Acked-by: Qiang Yu <yuq825@gmail.com>
Connor Abbott
2018-01-11 18:35:58 -05:00
parent 12645e8714
commit 54434fe670
9 changed files with 1215 additions and 588 deletions

@@ -422,10 +422,7 @@ bool gpir_compile_nir(struct lima_vs_shader_state *prog, struct nir_shader *nir)
    if (!gpir_post_rsched_lower_prog(comp))
       goto err_out0;
-   if (!gpir_value_regalloc_prog(comp))
-      goto err_out0;
-   if (!gpir_physical_regalloc_prog(comp))
+   if (!gpir_regalloc_prog(comp))
       goto err_out0;
    if (!gpir_schedule_prog(comp))