intel/vec4: Don't spill fp64 registers more than once

The way we handle spilling for fp64 in vec4 is to emit a series of MOVs
which swizzles the data around and then a pair of 32-bit spills.  This
works great except that the next time we go to pick a spill reg, the
compiler isn't smart enough to figure out that the register has already
been spilled.  Normally we detect an already-spilled register by looking
at the sources of spill instructions (or the destinations of fills) but,
because the spilled value is separated from the actual scratch access by
a MOV, the check misses it.  This commit adds a new opcode,
VEC4_OPCODE_MOV_FOR_SCRATCH, which is identical to MOV in semantics
except that it lets RA know not to spill the register again.

Fixes: 82c69426a5 ("i965/vec4: support basic spilling of 64-bit registers")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10571>
Author: Jason Ekstrand, 2021-05-04 15:30:27 -05:00
Committed-by: Marge Bot
Parent: 7138249675
Commit: 2db8867943
8 changed files with 19 additions and 9 deletions

@@ -2138,6 +2138,7 @@ vec4_visitor::nir_emit_undef(nir_ssa_undef_instr *instr)
  */
 vec4_instruction *
 vec4_visitor::shuffle_64bit_data(dst_reg dst, src_reg src, bool for_write,
+                                 bool for_scratch,
                                  bblock_t *block, vec4_instruction *ref)
 {
    assert(type_sz(src.type) == 8);
@@ -2145,6 +2146,8 @@ vec4_visitor::shuffle_64bit_data(dst_reg dst, src_reg src, bool for_write,
    assert(!regions_overlap(dst, 2 * REG_SIZE, src, 2 * REG_SIZE));
    assert(!ref == !block);
 
+   opcode mov_op = for_scratch ? VEC4_OPCODE_MOV_FOR_SCRATCH : BRW_OPCODE_MOV;
+
    const vec4_builder bld = !ref ? vec4_builder(this).at_end() :
                                    vec4_builder(this).at(block, ref->next);
@@ -2156,22 +2159,22 @@ vec4_visitor::shuffle_64bit_data(dst_reg dst, src_reg src, bool for_write,
    }
 
    /* dst+0.XY = src+0.XY */
-   bld.group(4, 0).MOV(writemask(dst, WRITEMASK_XY), src);
+   bld.group(4, 0).emit(mov_op, writemask(dst, WRITEMASK_XY), src);
 
    /* dst+0.ZW = src+1.XY */
    bld.group(4, for_write ? 1 : 0)
-      .MOV(writemask(dst, WRITEMASK_ZW),
+      .emit(mov_op, writemask(dst, WRITEMASK_ZW),
            swizzle(byte_offset(src, REG_SIZE), BRW_SWIZZLE_XYXY));
 
    /* dst+1.XY = src+0.ZW */
    bld.group(4, for_write ? 0 : 1)
-      .MOV(writemask(byte_offset(dst, REG_SIZE), WRITEMASK_XY),
-           swizzle(src, BRW_SWIZZLE_ZWZW));
+      .emit(mov_op, writemask(byte_offset(dst, REG_SIZE), WRITEMASK_XY),
+            swizzle(src, BRW_SWIZZLE_ZWZW));
 
    /* dst+1.ZW = src+1.ZW */
    return bld.group(4, 1)
-             .MOV(writemask(byte_offset(dst, REG_SIZE), WRITEMASK_ZW),
-                  byte_offset(src, REG_SIZE));
+             .emit(mov_op, writemask(byte_offset(dst, REG_SIZE), WRITEMASK_ZW),
+                   byte_offset(src, REG_SIZE));
 }