intel/compiler: Fix uncompaction of signed word immediates on Tigerlake

This expression accidentally performs a 32-bit sign-extension when
processing the second half of the expression (the low 16 bits).

Consider -7W, which is represented as 0xfff9fff9 in our encoding
(the 16-bit word is replicated to both halves of the 32-bit dword).

Tigerlake's compaction stores the low 11-bits of an immediate as-is,
and replicates the 12th bit.  So here, compacted_imm will be 0xff9.

   (  (int)(0xff9 << 20) >> 4) |
   ((short)(0xff9 <<  4) >> 4))

   0xfff90000 | (0xff90 >> 4)

   0xfff90000 | 0xfffffff9  ...oops...

   0xfffffff9

By casting the second line of the expression to unsigned short, we
prevent the sign-extension when it combines both parts, so we get:

   0xfff90000 | 0x0000fff9

   0xfff9fff9

Fixes: 12d3b11908 ("intel/compiler: Add instruction compaction support on Gen12")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16833>
2022-06-02 02:11:15 -07:00
parent 9eee4c79db
commit 26bb81f3f6
1 changed files with 2 additions and 2 deletions
--- a/src/intel/compiler/brw_eu_compact.c
+++ b/src/intel/compiler/brw_eu_compact.c
@@ -1658,8 +1658,8 @@ uncompact_immediate(const struct intel_device_info *devinfo,
         return (int)(compact_imm << 20) >> 20;
      case BRW_REGISTER_TYPE_W:
         /* Extend the 12th bit into the high 4 bits and replicate */
-         return (  (int)(compact_imm << 20) >> 4) |
-                ((short)(compact_imm <<  4) >> 4);
+         return ((int)(compact_imm << 20) >> 4) |
+                ((unsigned short)((short)(compact_imm << 4) >> 4));
      case BRW_REGISTER_TYPE_NF:
      case BRW_REGISTER_TYPE_DF:
      case BRW_REGISTER_TYPE_Q: