intel/fs: Fix register coalesce in presence of force_writemask_all copy source writes.

This fixes the behavior of register coalesce in cases where the source
of a copy is written elsewhere in the program by a force_writemask_all
instruction, which could cause the overwrite to be executed for an
inactive channel under non-uniform control flow, causing
can_coalesce_vars() to give incorrect results.  This has been reported
in cases like:

> while (true) {
>    x = imageSize(img);
>    if (non_uniform_condition()) {
>       y = x;
>       break;
>    }
> }
> use(y);

Currently the register coalesce pass would coalesce x and y in the
example above, which is invalid since in the example above imageSize()
is implemented as a force_writemask_all SEND message, whose result is
broadcast to all channels, so when a given channel executes 'y = x'
and breaks out of the loop, another divergent channel can execute a
subsequent iteration of the loop overwriting 'x' with a different
value, hence coalescing y and x into the same register changes the
behavior of the program.

Note that this is a regression introduced by commit a4b36cd3dd.  In
order to avoid the problem without reverting that patch, we prevent
register coalesce if there is an overwrite of the source with
force_writemask_all behavior inconsistent with the copy and this
occurs anywhere in the intersection of the live ranges of source and
destination, even if it occurs lexically before the copy, since it
might be physically executed after the copy under divergent loop
control flow.

Fixes: a4b36cd3dd ("intel/fs: Coalesce when the src live range is contained in the dst")
Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21351>
This commit is contained in:
Francisco Jerez
2023-02-10 19:35:35 -08:00
committed by Kenneth Graunke
parent d4015bcb38
commit 76b4255cd8

View File

@@ -177,7 +177,8 @@ can_coalesce_vars(const fs_live_variables &live, const cfg_t *cfg,
/* See the big comment above */
if (regions_overlap(scan_inst->dst, scan_inst->size_written,
inst->src[0], inst->size_read(0))) {
if (seen_copy || scan_block != block)
if (seen_copy || scan_block != block ||
(scan_inst->force_writemask_all && !inst->force_writemask_all))
return false;
seen_src_write = true;
}