agx: do not flush denorms for fp16 fmin/fmax

total instructions in shared programs: 2164639 -> 2158940 (-0.26%)
instructions in affected programs: 319475 -> 313776 (-1.78%)
helped: 1200
HURT: 6
Instructions are helped.

total alu in shared programs: 1690198 -> 1684653 (-0.33%)
alu in affected programs: 272173 -> 266628 (-2.04%)
helped: 1181
HURT: 6
Alu are helped.

total fscib in shared programs: 1686497 -> 1680797 (-0.34%)
fscib in affected programs: 272922 -> 267222 (-2.09%)
helped: 1200
HURT: 6
Fscib are helped.

total bytes in shared programs: 14334550 -> 14300314 (-0.24%)
bytes in affected programs: 2075546 -> 2041310 (-1.65%)
helped: 1200
HURT: 6
Bytes are helped.

total regs in shared programs: 662332 -> 662302 (<.01%)
regs in affected programs: 1103 -> 1073 (-2.72%)
helped: 14
HURT: 15
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>
2024-07-09 17:07:41 -04:00
parent 6ac289dade
commit e8b673a109
1 changed files with 8 additions and 2 deletions
--- a/src/asahi/compiler/agx_compile.c
+++ b/src/asahi/compiler/agx_compile.c
@@ -1680,8 +1680,14 @@ agx_fminmax_to(agx_builder *b, agx_index dst, agx_index s0, agx_index s1,
   /* Calculate min/max with the appropriate hardware instruction */
   agx_index tmp = agx_fcmpsel(b, s0, s1, s0, s1, fcond);

-   /* Flush denorms, as cmpsel will not. */
-   return agx_fadd_to(b, dst, tmp, agx_negzero());
+   /* G13 flushes fp32 denorms and preserves fp16 denorms. Since cmpsel
+    * preserves denorms, we need to canonicalize for fp32. Canonicalizing fp16
+    * would be harmless but wastes an instruction.
+    */
+   if (alu->def.bit_size == 32)
+      return agx_fadd_to(b, dst, tmp, agx_negzero());
+   else
+      return agx_mov_to(b, dst, tmp);
 }

 static agx_instr *
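
Note on the fp32 path: the new comment says cmpsel preserves denorms, so the result is canonicalized by adding negative zero. The host-side sketch below illustrates why that works and why the constant is -0.0 rather than +0.0. It is only a model under stated assumptions, not the hardware or the Mesa code: `flush_denorm`, `model_fadd_ftz`, `model_cmpsel_min` and `model_fmin32` are hypothetical helpers, the flush is assumed to produce a zero of the same sign, and NaN handling is not modelled.

```c
#include <math.h>

/* Hypothetical model: flush a denormal to a zero of the same sign.
 * Assumption for illustration; the hardware's exact behaviour may differ.
 */
static float
flush_denorm(float x)
{
   if (fpclassify(x) == FP_SUBNORMAL)
      return copysignf(0.0f, x);

   return x;
}

/* Hypothetical model of an fp32 add that flushes inputs and the result. */
static float
model_fadd_ftz(float a, float b)
{
   return flush_denorm(flush_denorm(a) + flush_denorm(b));
}

/* Hypothetical model of compare-select: returns one operand unchanged,
 * so a denormal input passes straight through.
 */
static float
model_cmpsel_min(float a, float b)
{
   return (a < b) ? a : b;
}

/* fp32 min as in the patched code path: select, then add -0.0.
 * x + (-0.0) == x for every x, including both signed zeros, so the add
 * changes nothing except flushing denormals. Adding +0.0 instead would
 * turn -0.0 into +0.0 under round-to-nearest.
 */
static float
model_fmin32(float a, float b)
{
   float tmp = model_cmpsel_min(a, b);
   return model_fadd_ftz(tmp, -0.0f);
}
```

In this model, `model_cmpsel_min(1e-40f, 1.0f)` returns the denormal unchanged, while `model_fmin32(1e-40f, 1.0f)` returns zero. For fp16 the commit skips the add and emits a plain move, since denorms are preserved there anyway.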