intel/compiler/gfx12.5+: Lower 64-bit cluster_broadcast with 32-bit ops

For MTL (verx10 == 125), float64 is supported, but int64 is not.
Therefore we need to lower cluster broadcast using 32-bit int ops.

For gfx12.5+ platforms that support int64, the register regions
used by cluster broadcast aren't supported by the 64-bit pipeline.

On MTL, dEQP-VK.subgroups.clustered.*_double* and
dEQP-VK.subgroups.clustered.*_dvec* were failing to validate the
compiled shader in debug mode, and reportedly gpu-hanging in release
mode. With this change dEQP-VK.subgroups.clustered.*_double* passed
all 48 tests and dEQP-VK.subgroups.clustered.*_dvec* passed all 140
tests on MTL.

Rework:
 * Move from generator to brw_fs_lower_regioning.cpp. (Suggested by Francisco)
 * Apply to verx10 >= 125. (Suggested by Francisco)

Cc: 23.1 <mesa-stable>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> (v1)
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22569>
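
The idea of lowering a 64-bit cluster broadcast to 32-bit integer ops can be illustrated on the CPU. The sketch below is not the Mesa implementation: the lane count, the `Lanes64` type and the function name are assumptions made up for the illustration. It only shows that copying the low and high 32-bit words separately gives the same result as a 64-bit broadcast, so no 64-bit integer or 64-bit regioning support is needed.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model of a SIMD register: one 64-bit value per lane.
constexpr std::size_t kLanes = 8;
using Lanes64 = std::array<double, kLanes>;
static_assert(sizeof(Lanes64) == kLanes * 8, "expected tightly packed lanes");

// Within each cluster of `cluster_size` lanes, broadcast the value held by
// lane `index` of that cluster, using only 32-bit integer moves.
Lanes64 cluster_broadcast_32bit(const Lanes64 &src, std::size_t cluster_size,
                                std::size_t index)
{
    // View every 64-bit lane as a pair of 32-bit words.
    std::array<std::uint32_t, 2 * kLanes> in{}, out{};
    std::memcpy(in.data(), src.data(), sizeof(src));

    for (std::size_t lane = 0; lane < kLanes; ++lane) {
        // Source lane: the selected lane of the cluster this lane belongs to.
        const std::size_t from = (lane / cluster_size) * cluster_size + index;
        out[2 * lane + 0] = in[2 * from + 0]; // first 32-bit word
        out[2 * lane + 1] = in[2 * from + 1]; // second 32-bit word
    }

    Lanes64 dst{};
    std::memcpy(dst.data(), out.data(), sizeof(dst));
    return dst;
}
```

Because both words of each lane are moved together, the sketch does not depend on byte order, and it works unchanged for any 64-bit payload (double or integer).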
@@ -174,10 +174,17 @@ namespace {
  * integer DWord multiply, indirect addressing must not be
  * used."
  *
+ * For MTL (verx10 == 125), float64 is supported, but int64 is not.
+ * Therefore we need to lower cluster broadcast using 32-bit int ops.
+ *
+ * For gfx12.5+ platforms that support int64, the register regions
+ * used by cluster broadcast aren't supported by the 64-bit pipeline.
+ *
  * Work around the above and handle platforms that don't
  * support 64-bit types at all.
  */
-if ((!has_64bit || devinfo->platform == INTEL_PLATFORM_CHV ||
+if ((!has_64bit || devinfo->verx10 >= 125 ||
+    devinfo->platform == INTEL_PLATFORM_CHV ||
     intel_device_info_is_9lp(devinfo)) && type_sz(t) > 4)
    return BRW_REGISTER_TYPE_UD;
 else
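
The effect of the changed condition can be checked in isolation with a small standalone predicate. This is a sketch under stated assumptions, not Mesa code: `DeviceInfoSketch`, `ExecType` and `cluster_broadcast_exec_type` are hypothetical names, the boolean fields stand in for the `has_64bit` local, the `INTEL_PLATFORM_CHV` comparison and `intel_device_info_is_9lp()`, and `Native` is a placeholder for whatever the `else` branch returns, which the hunk does not show.

```cpp
#include <cstddef>

// Hypothetical stand-in for the device properties the condition reads.
struct DeviceInfoSketch {
    int verx10;     // 125 for MTL, per the commit message
    bool has_64bit; // stands in for the has_64bit local in the original
    bool is_chv;    // stands in for platform == INTEL_PLATFORM_CHV
    bool is_9lp;    // stands in for intel_device_info_is_9lp(devinfo)
};

// UD mirrors BRW_REGISTER_TYPE_UD; Native is a placeholder for the
// else branch, which is outside the hunk.
enum class ExecType { UD, Native };

ExecType cluster_broadcast_exec_type(const DeviceInfoSketch &d,
                                     std::size_t type_size_bytes)
{
    // After this commit, every verx10 >= 125 device takes the 32-bit path,
    // in addition to CHV, gfx9 LP and devices without 64-bit types.
    const bool restricted =
        !d.has_64bit || d.verx10 >= 125 || d.is_chv || d.is_9lp;

    // Only types wider than 4 bytes need to be lowered.
    return (restricted && type_size_bytes > 4) ? ExecType::UD
                                               : ExecType::Native;
}
```

With these assumptions, an MTL-like device (`verx10 == 125`, 64-bit types available) selects `UD` for 8-byte types and is unaffected for 4-byte types.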