intel/compiler: Fix 64-bit ufind_msb, find_lsb, and bit_count

We only support 32-bit versions of ufind_msb, find_lsb, and bit_count,
so we need to lower them via nir_lower_int64.

Previously, on platforms older than Icelake, we failed to do so and let
those operations fall through to nir_lower_bit_size, whose callback
requested lowering whenever bit_size != 32.  However, that pass only
emulates small bit-size operations by promoting them to supported,
larger bit-sizes (e.g. 16-bit using 32-bit).  It doesn't support
emulating larger operations (e.g. 64-bit using 32-bit).

So nir_lower_bit_size would just u2u32 the 64-bit source, causing us to
flat out ignore the upper half of the bits.
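
To illustrate the failure mode, here is a minimal C sketch contrasting
what a truncated source computes with a split 64-bit lowering.  The
helper names are hypothetical and this is not the code nir_lower_int64
emits; it only assumes the usual ufind_msb convention (index of the
highest set bit, -1 for zero).

```c
#include <stdint.h>

/* 32-bit building block: index of the highest set bit, or -1 for zero. */
static int ufind_msb32(uint32_t v)
{
   int i;
   for (i = 31; i >= 0; i--) {
      if (v & (1u << i))
         return i;
   }
   return -1;
}

/* What converting the 64-bit source with u2u32 effectively computes:
 * the upper 32 bits never reach the 32-bit operation. */
static int ufind_msb64_truncated(uint64_t v)
{
   return ufind_msb32((uint32_t)v);
}

/* Hypothetical split lowering: consult the high half first and only
 * fall back to the low half when the high half is zero. */
static int ufind_msb64_split(uint64_t v)
{
   uint32_t hi = (uint32_t)(v >> 32);
   uint32_t lo = (uint32_t)v;

   if (hi != 0)
      return 32 + ufind_msb32(hi);
   return ufind_msb32(lo);
}
```

For an input with only bit 40 set, the truncated variant sees a zero
source and reports -1, while the split variant reports 40.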

Commit 78a195f252 (intel/compiler: Postpone most int64 lowering to
brw_postprocess_nir) provoked this bug on Icelake and later as well,
by deferring the nir_lower_int64 handling for ufind_msb until late in
compilation, so the opcode reached nir_lower_bit_size first and was
broken there.

To fix this, we always set int64 lowering for these opcodes, and also
correct the nir_lower_bit_size callback to ignore 64-bit operations.
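
The callback change can be sketched in isolation as follows.  The two
functions are hypothetical stand-ins for the callback's return
expression before and after this commit; they assume the
nir_lower_bit_size convention that a return value of 0 means "leave the
instruction alone" and a non-zero value is the bit size to lower to.

```c
/* Before: anything that is not 32-bit is sent to the 32-bit path,
 * including 64-bit sources, which the pass cannot emulate. */
static unsigned old_target_bit_size(unsigned src_bit_size)
{
   return src_bit_size == 32 ? 0 : 32;
}

/* After: only sources narrower than 32 bits are promoted; 64-bit
 * sources are left for nir_lower_int64 to handle. */
static unsigned new_target_bit_size(unsigned src_bit_size)
{
   return src_bit_size >= 32 ? 0 : 32;
}
```

With a 64-bit source the old expression asks for 32-bit lowering, while
the new one returns 0; 8-bit and 16-bit sources are promoted to 32-bit
in both versions.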

Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23123>
Kenneth Graunke
2023-05-18 17:19:42 -07:00
committed by Marge Bot
parent 9293d8e64b
commit a2d384a5c0
2 changed files with 5 additions and 2 deletions


@@ -135,7 +135,10 @@ brw_compiler_create(void *mem_ctx, const struct intel_device_info *devinfo)
       nir_lower_imul64 |
       nir_lower_isign64 |
       nir_lower_divmod64 |
-      nir_lower_imul_high64;
+      nir_lower_imul_high64 |
+      nir_lower_find_lsb64 |
+      nir_lower_ufind_msb64 |
+      nir_lower_bit_count64;
    nir_lower_doubles_options fp64_options =
       nir_lower_drcp |
       nir_lower_dsqrt |


@@ -772,7 +772,7 @@ lower_bit_size_callback(const nir_instr *instr, UNUSED void *data)
           * source.
           */
          assert(alu->src[0].src.is_ssa);
-         return alu->src[0].src.ssa->bit_size == 32 ? 0 : 32;
+         return alu->src[0].src.ssa->bit_size >= 32 ? 0 : 32;
       default:
          break;
       }