pan/mdg: Use special NIR ops for trig scaling

Otherwise the lowering is fundamentally unsound due to incorrect constant
folding, even though it worked by chance with the old pass ordering. We're
about to slightly change the way we handle fsin/fcos, which was enough to
trigger this unsoundness.

shader-db results are mostly a toss-up.

total instructions in shared programs: 1520675 -> 1520220 (-0.03%)
instructions in affected programs: 96841 -> 96386 (-0.47%)
helped: 397
HURT: 3
helped stats (abs) min: 1.0 max: 4.0 x̄: 1.15 x̃: 1
helped stats (rel) min: 0.22% max: 6.25% x̄: 1.15% x̃: 0.40%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.58% max: 2.08% x̄: 1.08% x̃: 0.58%
95% mean confidence interval for instructions value: -1.19 -1.08
95% mean confidence interval for instructions %-change: -1.26% -1.01%
Instructions are helped.

total bundles in shared programs: 650088 -> 649844 (-0.04%)
bundles in affected programs: 31132 -> 30888 (-0.78%)
helped: 229
HURT: 23
helped stats (abs) min: 1.0 max: 4.0 x̄: 1.21 x̃: 1
helped stats (rel) min: 0.49% max: 7.14% x̄: 1.28% x̃: 0.71%
HURT stats (abs) min: 1.0 max: 3.0 x̄: 1.48 x̃: 1
HURT stats (rel) min: 0.83% max: 8.33% x̄: 2.38% x̃: 1.85%
95% mean confidence interval for bundles value: -1.08 -0.86
95% mean confidence interval for bundles %-change: -1.15% -0.74%
Bundles are helped.

total quadwords in shared programs: 1137388 -> 1136767 (-0.05%)
quadwords in affected programs: 71826 -> 71205 (-0.86%)
helped: 367
HURT: 17
helped stats (abs) min: 1.0 max: 8.0 x̄: 1.80 x̃: 1
helped stats (rel) min: 0.31% max: 17.24% x̄: 2.27% x̃: 0.96%
HURT stats (abs) min: 1.0 max: 6.0 x̄: 2.29 x̃: 2
HURT stats (rel) min: 0.44% max: 11.11% x̄: 2.18% x̃: 1.47%
95% mean confidence interval for quadwords value: -1.76 -1.47
95% mean confidence interval for quadwords %-change: -2.36% -1.78%
Quadwords are helped.

total registers in shared programs: 90483 -> 90461 (-0.02%)
registers in affected programs: 890 -> 868 (-2.47%)
helped: 67
HURT: 44
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 8.33% max: 25.00% x̄: 10.52% x̃: 9.09%
HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.02 x̃: 1
HURT stats (rel) min: 9.09% max: 50.00% x̄: 31.15% x̃: 33.33%
95% mean confidence interval for registers value: -0.39 -0.01
95% mean confidence interval for registers %-change: 1.75% 10.25%
Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree).

total threads in shared programs: 55694 -> 55685 (-0.02%)
threads in affected programs: 21 -> 12 (-42.86%)
helped: 1
HURT: 5
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs) min: 2.0 max: 2.0 x̄: 2.00 x̃: 2
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: -2.79 -0.21
95% mean confidence interval for threads %-change: -89.26% 39.26%
Inconclusive result (%-change mean confidence interval includes 0).

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19350>
2023-01-13 12:30:00 -05:00
parent c3839bd540
commit 10759d1708
3 changed files with 6 additions and 17 deletions
--- a/src/panfrost/midgard/midgard_compile.c
+++ b/src/panfrost/midgard/midgard_compile.c
@@ -419,9 +419,6 @@ optimise_nir(nir_shader *nir, unsigned quirks, bool is_blend, bool is_blit)
   if (!is_blend)
      NIR_PASS(progress, nir, nir_fuse_io_16);

-   /* Must be run at the end to prevent creation of fsin/fcos ops */
-   NIR_PASS(progress, nir, midgard_nir_scale_trig);
-
   do {
      progress = false;

@@ -865,8 +862,8 @@ emit_alu(compiler_context *ctx, nir_alu_instr *instr)
      ALU_CASE_RTZ(i2f16, i2f_rte);
      ALU_CASE_RTZ(u2f16, u2f_rte);

-      ALU_CASE(fsin, fsinpi);
-      ALU_CASE(fcos, fcospi);
+      ALU_CASE(fsin_mdg, fsinpi);
+      ALU_CASE(fcos_mdg, fcospi);

      /* We'll get 0 in the second arg, so:
       * ~a = ~(a | 0) = nor(a, 0) */
--- a/src/panfrost/midgard/midgard_nir.h
+++ b/src/panfrost/midgard/midgard_nir.h
@@ -3,7 +3,6 @@

 bool midgard_nir_lower_algebraic_early(nir_shader *shader);
 bool midgard_nir_lower_algebraic_late(nir_shader *shader);
-bool midgard_nir_scale_trig(nir_shader *shader);
 bool midgard_nir_cancel_inot(nir_shader *shader);
 bool midgard_nir_lower_image_bitsize(nir_shader *shader);
 bool midgard_nir_lower_helper_writes(nir_shader *shader);
--- a/src/panfrost/midgard/midgard_nir_algebraic.py
+++ b/src/panfrost/midgard/midgard_nir_algebraic.py
@@ -34,6 +34,10 @@ c = 'c'
 algebraic = [
   # Allows us to schedule as a multiply by 2
   (('~fadd', ('fadd', a, b), a), ('fadd', ('fadd', a, a), b)),
+
+   # Midgard scales fsin/fcos arguments by pi.
+   (('fsin', a), ('fsin_mdg', ('fdiv', a, math.pi))),
+   (('fcos', a), ('fcos_mdg', ('fdiv', a, math.pi))),
 ]

 algebraic_late = [
@@ -135,14 +139,6 @@ cancel_inot = [
        (('inot', ('inot', a)), a)
 ]

-# Midgard scales fsin/fcos arguments by pi.
-# Pass must be run only once, after the main loop
-
-scale_trig = [
-        (('fsin', a), ('fsin', ('fdiv', a, math.pi))),
-        (('fcos', a), ('fcos', ('fdiv', a, math.pi))),
-]
-
 def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('-p', '--import-path', required=True)
@@ -162,9 +158,6 @@ def run():
    print(nir_algebraic.AlgebraicPass("midgard_nir_lower_algebraic_late",
                                      algebraic_late + converts + constant_switch).render())

-    print(nir_algebraic.AlgebraicPass("midgard_nir_scale_trig",
-                                      scale_trig).render())
-
    print(nir_algebraic.AlgebraicPass("midgard_nir_cancel_inot",
                                      cancel_inot).render())
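
Note: the unsoundness described in the commit message can be illustrated with a small toy model. This is only a sketch and not Mesa code: `fold`, `lower_old` and `lower_new` are hypothetical helpers, and the `sin(pi * x)` semantics given to `fsin_mdg` is an assumption based on the "Midgard scales fsin/fcos arguments by pi" comment in the diff. It shows two things: a constant folder that sees a rescaled `fsin` still applies plain-radian semantics, and a rule that rewrites `fsin` to `fsin` matches its own output if the pass runs again.

```python
import math

# Toy model only: these helpers are hypothetical and are not Mesa/NIR APIs.

def fold(op, x):
    """Constant-fold a unary op according to the semantics of its opcode."""
    if op == "fsin":
        return math.sin(x)            # generic op: argument in radians
    if op == "fsin_mdg":
        return math.sin(math.pi * x)  # assumed hardware semantics: sin(pi * x)
    raise ValueError(f"unknown op {op!r}")

def lower_old(op, x):
    """Old rule: fsin(a) -> fsin(a / pi). The result still matches the rule."""
    return ("fsin", x / math.pi) if op == "fsin" else (op, x)

def lower_new(op, x):
    """New rule: fsin(a) -> fsin_mdg(a / pi). The result no longer matches."""
    return ("fsin_mdg", x / math.pi) if op == "fsin" else (op, x)

x = math.pi / 2  # sin(x) should be 1.0

old_folded = fold(*lower_old("fsin", x))  # folded as sin(0.5), not sin(pi / 2)
new_folded = fold(*lower_new("fsin", x))  # folded as sin(pi * 0.5)

# Re-running the rewrite: the old rule scales a second time, the new one is stable.
old_twice = lower_old(*lower_old("fsin", x))
new_twice = lower_new(*lower_new("fsin", x))
```

With a distinct opcode, the scaled value can only be interpreted one way, so the rule can live in the main algebraic list instead of a run-once pass placed after the optimisation loop.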