intel/fs: Fix register coalesce in presence of force_writemask_all copy source writes.

This fixes the behavior of register coalesce in cases where the source of a copy is written elsewhere in the program by a force_writemask_all instruction. Such an overwrite can be executed for an inactive channel under non-uniform control flow, causing can_coalesce_vars() to give incorrect results. This has been reported in cases like:

> while (true) {
>    x = imageSize(img);
>    if (non_uniform_condition()) {
>       y = x;
>       break;
>    }
> }
> use(y);

Currently the register coalesce pass would coalesce x and y in the example above. This is invalid because imageSize() is implemented as a force_writemask_all SEND message whose result is broadcast to all channels: when a given channel executes 'y = x' and breaks out of the loop, another divergent channel can execute a subsequent iteration of the loop and overwrite 'x' with a different value. Coalescing y and x into the same register therefore changes the behavior of the program.

Note that this is a regression introduced by commit a4b36cd3dd.

In order to avoid the problem without reverting that patch, we prevent register coalesce if there is an overwrite of the source whose force_writemask_all behavior is inconsistent with the copy and the overwrite occurs anywhere in the intersection of the live ranges of source and destination. This applies even if the overwrite occurs lexically before the copy, since it might be physically executed after the copy under divergent loop control flow.

Fixes: a4b36cd3dd ("intel/fs: Coalesce when the src live range is contained in the dst")
Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21351>
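
The failure mode described in the message can be illustrated with a small lockstep simulation. This is a minimal sketch and not Mesa code: `run_loop`, `kChannels`, and `break_iter` are hypothetical names, and imageSize() is modeled as returning `100 + iter` purely so that a later overwrite is observable.

```cpp
#include <array>

// Hypothetical two-channel SIMD model, for illustration only.
constexpr int kChannels = 2;

// Executes the loop from the commit message for all channels in lockstep.
// break_iter[c] is the (non-negative) iteration at which channel c takes the
// 'y = x; break;' path. If `coalesced` is true, x and y share one register,
// so the copy disappears and use(y) reads x.
std::array<int, kChannels>
run_loop(bool coalesced, const std::array<int, kChannels> &break_iter)
{
   std::array<int, kChannels> x{}, y{};
   std::array<bool, kChannels> active;
   active.fill(true);

   for (int iter = 0;; iter++) {
      bool any_active = false;
      for (bool a : active)
         any_active = any_active || a;
      if (!any_active)
         break;

      // x = imageSize(img): modeled as a force_writemask_all write, so it
      // lands in every channel, including ones that already left the loop.
      for (int c = 0; c < kChannels; c++)
         x[c] = 100 + iter;

      for (int c = 0; c < kChannels; c++) {
         if (active[c] && iter == break_iter[c]) {
            if (!coalesced)
               y[c] = x[c];      // y = x, executed only by active channels
            active[c] = false;   // break
         }
      }
   }

   // use(y): with coalescing, y lives in the same storage as x.
   return coalesced ? x : y;
}
```

With `break_iter = {0, 2}`, channel 0 observes 100 when x and y are kept separate, but 102 when they are coalesced, because channel 1 keeps iterating and the broadcast write clobbers the shared register.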
Committed by: Kenneth Graunke
Parent: d4015bcb38
Commit: 76b4255cd8
@@ -177,7 +177,8 @@ can_coalesce_vars(const fs_live_variables &live, const cfg_t *cfg,
          /* See the big comment above */
          if (regions_overlap(scan_inst->dst, scan_inst->size_written,
                              inst->src[0], inst->size_read(0))) {
-            if (seen_copy || scan_block != block)
+            if (seen_copy || scan_block != block ||
+                (scan_inst->force_writemask_all && !inst->force_writemask_all))
                return false;
             seen_src_write = true;
          }
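
The condition changed by this hunk can be isolated as a standalone predicate to make its truth table easier to check. This is a hedged sketch: `InstFlags` and `src_write_blocks_coalesce` are hypothetical names introduced here, and only the `force_writemask_all` flag and the `seen_copy` / same-block state from the hunk are modeled.

```cpp
// Hypothetical stand-in for the one instruction flag the check looks at.
struct InstFlags {
   bool force_writemask_all;
};

// Returns true when an overlapping write to the copy source should make
// coalescing bail out, mirroring the condition in the hunk above.
bool
src_write_blocks_coalesce(bool seen_copy, bool same_block,
                          const InstFlags &scan_inst, const InstFlags &copy)
{
   return seen_copy || !same_block ||
          // New in this commit: a write that ignores the execution mask
          // feeding a copy that respects it is never safe to coalesce.
          (scan_inst.force_writemask_all && !copy.force_writemask_all);
}
```

Under this model, a write that precedes the copy in the same block still permits coalescing when both instructions agree on `force_writemask_all`, and is rejected only when the write is `force_writemask_all` and the copy is not.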