From af4c9789bb3677597b938f54ade7796256d9c71c Mon Sep 17 00:00:00 2001
From: Qiang Yu
Date: Mon, 14 Nov 2022 15:01:51 +0800
Subject: [PATCH] ac/nir/ngg: fix nogs culling with nuw add
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

We should not use "nuw" here, because adding a positive value to a
negative one may wrap around (the negative value is 0xffffff??).

This problem can be observed with LLVM15 (I can't reproduce it with
LLVM14):
  %.neg = mul nsw i32 %31, -4
  %163 = add nuw nsw i32 %.neg, 16
  %164 = lshr i32 257, %.neg
  %165 = lshr i32 %164, %163

LLVM just assumes %.neg is positive, so it pre-shifts 0x01010101 by 16.
This gives the wrong value because we can't get back the shifted-out
bits with a negative right shift.

Fixes: 75dbb404393 ("ac/nir: Remove byte permute from prefix sum of the repack sequence.")
Reviewed-by: Timur Kristóf
Signed-off-by: Qiang Yu
Part-of:
(cherry picked from commit 982b523769a75c99039deac7f832a1e10260e916)
---
 .pick_status.json                 | 2 +-
 src/amd/common/ac_nir_lower_ngg.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/.pick_status.json b/.pick_status.json
index e85c18a7fbc..28ab55f363c 100644
--- a/.pick_status.json
+++ b/.pick_status.json
@@ -661,7 +661,7 @@
         "description": "ac/nir/ngg: fix nogs culling with nuw add",
         "nominated": true,
         "nomination_type": 1,
-        "resolution": 0,
+        "resolution": 1,
         "main_sha": null,
         "because_sha": "75dbb404393a5ae99adb90a156fa5a084aa79c4d"
     },
diff --git a/src/amd/common/ac_nir_lower_ngg.c b/src/amd/common/ac_nir_lower_ngg.c
index addf2e010d7..f03f65b45e8 100644
--- a/src/amd/common/ac_nir_lower_ngg.c
+++ b/src/amd/common/ac_nir_lower_ngg.c
@@ -234,7 +234,7 @@ summarize_repack(nir_builder *b, nir_ssa_def *packed_counts, unsigned num_lds_dw
     */
 
    nir_ssa_def *lane_id = nir_load_subgroup_invocation(b);
-   nir_ssa_def *shift = nir_iadd_imm_nuw(b, nir_imul_imm(b, lane_id, -4u), num_lds_dwords * 16);
+   nir_ssa_def *shift = nir_iadd_imm(b, nir_imul_imm(b, lane_id, -4u), num_lds_dwords * 16);
    bool use_dot = b->shader->options->has_udot_4x8;
 
    if (num_lds_dwords == 1) {