v3d/compiler: Lower geometry output store base into offset src

When generating the VPM write instruction for geometry shader outputs,
emit_store_output_gs adds the base and offset arguments together with
an ADD instruction. The addition is done at the VIR level after
scheduling, so it always ends up right next to the corresponding stvpm
instruction. Most of the time the offset is constant, but nothing does
any constant folding at the VIR level, so the ADD is never eliminated.

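Instead, fold the base into the offset source at the NIR level in
v3d_nir_lower_io so that NIR-level constant folding can get rid of the
addition most of the time. When an offset is present the intrinsic's
base becomes zero, and emit_store_output_gs only emits the ADD if a
non-zero base remains.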
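
As a rough illustration of why doing the add before constant folding
helps, here is a minimal standalone sketch. It does not use Mesa's API:
struct toy_val, toy_imm, toy_runtime and toy_iadd_imm are hypothetical
stand-ins, and num_adds simply counts the ADD instructions that would
still have to be emitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an SSA value: either a known constant or an
 * opaque runtime value. This is not Mesa's nir_ssa_def. */
struct toy_val {
        bool is_const;
        uint32_t imm;   /* only meaningful when is_const is true */
        int num_adds;   /* ADD instructions needed to compute the value */
};

static struct toy_val toy_imm(uint32_t v)
{
        return (struct toy_val){ .is_const = true, .imm = v, .num_adds = 0 };
}

static struct toy_val toy_runtime(void)
{
        return (struct toy_val){ .is_const = false, .imm = 0, .num_adds = 0 };
}

/* Add an immediate to a value, folding whenever that is possible. This
 * models "add at the NIR level, then constant-fold" in a single step. */
static struct toy_val toy_iadd_imm(struct toy_val x, uint32_t imm)
{
        if (imm == 0)
                return x;                    /* x + 0 needs no instruction */
        if (x.is_const)
                return toy_imm(x.imm + imm); /* folded at compile time */
        x.num_adds++;                        /* a real ADD must be emitted */
        return x;
}
```

With a constant offset, toy_iadd_imm(toy_imm(8), 4) yields the constant
12 with num_adds still 0, whereas a runtime offset still costs one ADD.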

v2: Use nir_iadd_imm to simplify the code. (Eric Anholt)

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5825>
Neil Roberts
2020-07-07 15:49:27 +02:00
parent 081691b5ae
commit 97f8ec321b
2 changed files with 15 additions and 6 deletions

@@ -2014,10 +2014,12 @@ emit_store_output_gs(struct v3d_compile *c, nir_intrinsic_instr *instr)
 {
         assert(instr->num_components == 1);
 
+        struct qreg offset = ntq_get_src(c, instr->src[1], 0);
+
         uint32_t base_offset = nir_intrinsic_base(instr);
-        struct qreg src_offset = ntq_get_src(c, instr->src[1], 0);
-        struct qreg offset =
-                vir_ADD(c, vir_uniform_ui(c, base_offset), src_offset);
+
+        if (base_offset)
+                offset = vir_ADD(c, vir_uniform_ui(c, base_offset), offset);
 
         /* Usually, for VS or FS, we only emit outputs once at program end so
          * our VPM writes are never in non-uniform control flow, but this

@@ -81,10 +81,17 @@ v3d_nir_store_output(nir_builder *b, int base, nir_ssa_def *offset,
         intr->num_components = 1;
 
         intr->src[0] = nir_src_for_ssa(chan);
-        if (offset)
-                intr->src[1] = nir_src_for_ssa(offset);
-        else
+        if (offset) {
+                /* When generating the VIR instruction, the base and the offset
+                 * are just going to get added together with an ADD instruction
+                 * so we might as well do the add here at the NIR level instead
+                 * and let the constant folding do its magic.
+                 */
+                intr->src[1] = nir_src_for_ssa(nir_iadd_imm(b, offset, base));
+                base = 0;
+        } else {
                 intr->src[1] = nir_src_for_ssa(nir_imm_int(b, 0));
+        }
 
         nir_intrinsic_set_base(intr, base);
         nir_intrinsic_set_write_mask(intr, 0x1);