third_party_mesa3d

Author	SHA1	Message	Date
Bas Nieuwenhuizen	efa4e9568b	radv: Use correct watermark for early loop exit. The previous check assumed the stack starts at offset=0, which isn't necessarily true for ray queries. Note that this didn't cause correctness issues, just made an optimization not apply. Found when I accidentally made this load-bearing in a refactor. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20265>	2022-12-11 18:51:29 +00:00
Bas Nieuwenhuizen	f0d6a1a685	radv: Rename stack_base to stack_low_watermark. Better covers the purpose. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20265>	2022-12-11 18:51:29 +00:00
Bas Nieuwenhuizen	3884210902	radv: Skip and for node_to_addr with bvh_base. Because the bvh base is always 64-byte aligned. Totals from 7 (0.01% of 134913) affected shaders: CodeSize: 209216 -> 209076 (-0.07%) Instrs: 38402 -> 38374 (-0.07%) Latency: 804537 -> 803899 (-0.08%) InvThroughput: 165663 -> 165530 (-0.08%) Copies: 4919 -> 4912 (-0.14%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19706>	2022-11-19 14:24:36 +00:00
Bas Nieuwenhuizen	0a26975840	radv: Move ray flag compares out of the loop. To save on and+cmp combos with VALU instructions. Totals from 7 (0.01% of 134913) affected shaders: CodeSize: 208476 -> 209216 (+0.35%) Instrs: 38384 -> 38402 (+0.05%) Latency: 805725 -> 804537 (-0.15%) InvThroughput: 165906 -> 165663 (-0.15%) Copies: 4936 -> 4919 (-0.34%) PreSGPRs: 393 -> 430 (+9.41%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19706>	2022-11-19 14:24:36 +00:00
Konstantin Seurer	ac01f09d57	radv/rt: Load instance id and custom index on demand. Stats for Quake II RTX: 57fps -> 57fps Totals from 7 (14.00% of 50) affected shaders: VGPRs: 800 -> 784 (-2.00%) CodeSize: 217868 -> 218308 (+0.20%) MaxWaves: 62 -> 63 (+1.61%) Instrs: 40384 -> 40420 (+0.09%); split: -0.01%, +0.10% Latency: 866315 -> 870692 (+0.51%) InvThroughput: 199189 -> 196595 (-1.30%); split: -1.75%, +0.45% VClause: 1058 -> 1077 (+1.80%) SClause: 1126 -> 1130 (+0.36%) Copies: 5787 -> 5772 (-0.26%); split: -0.40%, +0.14% PreVGPRs: 764 -> 750 (-1.83%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19159>	2022-10-24 14:39:25 +00:00
Bas Nieuwenhuizen	a8abdc0d89	radv: Add traversal backtracking with a short stack. So we can now work with arbitrarily deep BVHs. Reviewed-By: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18799>	2022-10-10 23:35:25 +00:00
Bas Nieuwenhuizen	f1e1509c92	radv: Add a field for the offset of the bvh in the blas. So that we can put some metadata in front. Reviewed-By: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18799>	2022-10-10 23:35:25 +00:00
Konstantin Seurer	ac45935345	radv: Add a common traversal build helper Adds a helper for building the ray traversal loop to radv_rt_common. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18650>	2022-09-24 13:23:40 +00:00
Bas Nieuwenhuizen	3c09681edd	radv: Use proper matrices for instance nodes. Converts both wto and otw matrices to be full row-major 4x3 matrices. Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18692>	2022-09-23 22:52:23 +00:00
Bas Nieuwenhuizen	b2972cf410	radv: Add scratch stack to reduce LDS stack in RT traversal.

The current stack size is a significant limiter for occupancy, and hence we need smaller stacks in LDS.

Rhys earlier had a patch that just put the N entries closest to the root in LDS and the rest in scratch. However, this is not ideal for performance as most of the activity is happening away from the root, near the leaves.

Of course we can't just switch it around, as the leaf activity likely isn't happening all the way at the end of the stack. So what we do is make the LDS stack kinda a ringbuffer by always accessing it using the stack index modulo the buffer size (always a power of two so we can efficiently mask). If we then do not have free space in this buffer we evict the entries closest to the root to scratch and if we hit the "bottom" of the LDS space we load from scratch.

Some rough perf numbers for indication with Q2RTX:

| evicting | LDS entries | perf |
|----------|-------------|------|
| no | 76 | 55% |
| no | 32 | 100% |
| no | 24 | 105% |
| yes | 32 | 95% |
| yes | 16 | 100% |
| yes | 8 | 90% |
| yes | 4 | 75% |

(For the case with 4 entries we need to do some extra accounting as a full batch may not be available to evict)

So an obvious choice is to use a stack of 16 entries.

One might wonder if Q2RTX perf is mainly good due to BVHs with very little geometry and hence low depth, so I also did some profiling with control. This is done with RGP instruction timing, so this is instructions executed not weighted for enabled masks, i.e. divergence effects included.

| game | LDS entries | scratch action | fraction of iterations |
|---------|-------------|----------------|------------------------|
| Control | 8 | store | 10.3% |
| Control | 8 | load | 34.8% |
| Control | 16 | store | 0.58% |
| Control | 16 | load | 2.62% |
| Q2RTX | 16 | store | 1.00% |
| Q2RTX | 16 | load | 3.07% |

So Q2RTX doesn't seem like an unreasonably good case for this algorithm.

On the implementation side, we can always place the scratch stack at address 0 by just reserving the scratch space, and in the case of fixed callstack size moving that up. In the dynamic case the dynamic stack base already takes any reserved scratch space into account.

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18541>	2022-09-20 01:39:20 +00:00
Konstantin Seurer	c4650cbdb0	radv: Replace magic constants with enum values Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15722>	2022-04-03 12:43:00 +00:00
Konstantin Seurer	6d2e95db7b	radv: Move common code to separate file Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14565>	2022-03-13 12:02:05 +01:00

12 Commits
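
The ring-buffer stack described in commit b2972cf410 can be illustrated with a small CPU-side sketch. This is not the RADV implementation (which is emitted as NIR and runs per lane on the GPU); it only models the idea from the commit message: index the fast buffer with the stack index masked by a power-of-two size, evict the entries closest to the root when the buffer is full, and reload them when a pop reaches the bottom of the resident range. All names here (`short_stack`, `stack_push`, `stack_pop`, `EVICT_BATCH`, `SCRATCH_ENTRIES`) and the batch size of 8 are assumptions made for illustration; only the 16-entry LDS size comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define LDS_ENTRIES     16u                 /* power of two, so modulo is a mask */
#define LDS_MASK        (LDS_ENTRIES - 1u)
#define EVICT_BATCH     8u                  /* hypothetical batch size */
#define SCRATCH_ENTRIES 256u                /* hypothetical scratch capacity */

struct short_stack {
   uint32_t lds[LDS_ENTRIES];         /* stands in for the LDS ring buffer */
   uint32_t scratch[SCRATCH_ENTRIES]; /* stands in for scratch memory, based at 0 */
   uint32_t top;                      /* logical index of the next free slot */
   uint32_t lds_low;                  /* lowest logical index still held in lds[] */
};

static void stack_push(struct short_stack *s, uint32_t value)
{
   if (s->top - s->lds_low == LDS_ENTRIES) {
      /* Ring buffer is full: spill the batch closest to the root. */
      assert(s->lds_low + EVICT_BATCH <= SCRATCH_ENTRIES);
      for (uint32_t i = 0; i < EVICT_BATCH; i++) {
         uint32_t idx = s->lds_low + i;
         s->scratch[idx] = s->lds[idx & LDS_MASK];
      }
      s->lds_low += EVICT_BATCH;
   }
   s->lds[s->top & LDS_MASK] = value;
   s->top++;
}

static uint32_t stack_pop(struct short_stack *s)
{
   assert(s->top > 0);
   if (s->top == s->lds_low) {
      /* Hit the bottom of the resident range: reload a batch from scratch. */
      uint32_t n = s->lds_low < EVICT_BATCH ? s->lds_low : EVICT_BATCH;
      for (uint32_t i = 0; i < n; i++) {
         uint32_t idx = s->lds_low - n + i;
         s->lds[idx & LDS_MASK] = s->scratch[idx];
      }
      s->lds_low -= n;
   }
   s->top--;
   return s->lds[s->top & LDS_MASK];
}
```

Because entries near the top of the stack stay in the ring buffer, a traversal that mostly pushes and pops near the leaves touches scratch only when it crosses a batch boundary, which matches the low scratch load/store fractions reported for 16 entries in the commit message.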