Rhys Perry
941739619e
Revert "radv,aco: allow unaligned LDS access on GFX9+"
...
This reverts commit 1a0b0e8460
.
The bounds checking behaviour of ds_read_b64, ds_read_b96 and ds_read_b128
make this feature very difficult to use safely.
This fixes a blocking artifact in Hitman 2. Previously, it contained:
ds_read_b64(local_invocation_index() * 4 - 4)
For local_invocation_index()=0, the second dword would be considered
out-of-bounds, even though it's at offset 0.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9332 >
2021-03-02 13:13:59 +00:00
Rhys Perry
1a0b0e8460
radv,aco: allow unaligned LDS access on GFX9+
...
fossil-db (GFX10.3):
Totals from 223 (0.16% of 139391) affected shaders:
SGPRs: 10032 -> 10096 (+0.64%)
VGPRs: 7480 -> 7592 (+1.50%)
CodeSize: 853960 -> 821920 (-3.75%); split: -3.76%, +0.01%
MaxWaves: 5916 -> 5908 (-0.14%)
Instrs: 154935 -> 150281 (-3.00%); split: -3.01%, +0.01%
Cycles: 3202496 -> 3080680 (-3.80%); split: -3.81%, +0.00%
VMEM: 48187 -> 46671 (-3.15%); split: +0.29%, -3.44%
SMEM: 13869 -> 13850 (-0.14%); split: +1.52%, -1.66%
VClause: 3110 -> 3085 (-0.80%); split: -1.03%, +0.23%
SClause: 4376 -> 4381 (+0.11%)
Copies: 12132 -> 12065 (-0.55%); split: -2.61%, +2.06%
Branches: 5204 -> 5203 (-0.02%)
PreVGPRs: 6304 -> 6359 (+0.87%); split: -0.10%, +0.97%
See https://reviews.llvm.org/D82788
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8762 >
2021-02-17 12:57:12 +00:00
Rhys Perry
3d4c13f3b8
aco: add DeviceInfo
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761 >
2021-02-15 13:44:22 +00:00
Rhys Perry
b759557cac
aco: consider that GFX10.3 allocates LDS in 1024 byte blocks
...
fossil-db (GFX10.3):
Totals from 3 (0.00% of 139391) affected shaders:
VMEM: 513 -> 511 (-0.39%)
SMEM: 94 -> 92 (-2.13%)
VClause: 31 -> 30 (-3.23%)
fossil-db (GFX10.3, wave32):
Totals from 4 (0.00% of 139391) affected shaders:
VClause: 82 -> 81 (-1.22%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761 >
2021-02-15 13:35:38 +00:00
Rhys Perry
7ff805a19d
radv,aco: add radv_nir_compiler_options::wgp_mode
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761 >
2021-02-15 13:35:36 +00:00
Rhys Perry
f520f4c299
aco: add Program::wgp_mode
...
Instead of assuming WGP mode on GFX10+ in different places, add a member
to Program that can be used instead.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761 >
2021-02-15 13:35:14 +00:00
Rhys Perry
592d64611c
aco: fix waves calculation for wave32
...
fossil-db (GFX10.3, wave32):
Totals from 176 (0.13% of 139391) affected shaders:
SGPRs: 16648 -> 16640 (-0.05%)
VGPRs: 18920 -> 19076 (+0.82%); split: -0.30%, +1.12%
CodeSize: 2354172 -> 2354288
(+0.00%); split: -0.01%, +0.01%
MaxWaves: 1618 -> 1627 (+0.56%); split: +0.68%, -0.12%
Instrs: 435756 -> 435761 (+0.00%); split: -0.02%, +0.02%
Cycles: 8858360 -> 8869960 (+0.13%); split: -0.01%, +0.14%
VMEM: 55899 -> 57220 (+2.36%); split: +2.53%, -0.17%
SMEM: 10323 -> 10374 (+0.49%); split: +0.73%, -0.23%
VClause: 8307 -> 8290 (-0.20%); split: -0.24%, +0.04%
SClause: 16573 -> 16577 (+0.02%); split: -0.01%, +0.03%
Copies: 24641 -> 24652 (+0.04%); split: -0.24%, +0.28%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761 >
2021-02-15 13:35:13 +00:00
Daniel Schürmann
44a76ba16d
aco: use VCC as regular SGPR pair on GFX10
...
There is no need to reserve it for special purposes, only.
Totals from 139391 (100.00% of 139391) affected shaders (Navi10):
VGPRs: 4738296 -> 4738156 (-0.00%); split: -0.01%, +0.00%
SpillSGPRs: 16188 -> 14968 (-7.54%); split: -7.60%, +0.06%
CodeSize: 294204472 -> 294118048 (-0.03%); split: -0.04%, +0.01%
MaxWaves: 2119584 -> 2119619 (+0.00%); split: +0.00%, -0.00%
Instrs: 56075079 -> 56056235 (-0.03%); split: -0.05%, +0.01%
Cycles: 1757781564 -> 1755354032 (-0.14%); split: -0.16%, +0.02%
VMEM: 52995887 -> 52996319 (+0.00%); split: +0.07%, -0.07%
SMEM: 9005338 -> 9004858 (-0.01%); split: +0.16%, -0.17%
VClause: 1178436 -> 1178331 (-0.01%); split: -0.02%, +0.01%
SClause: 2403649 -> 2404542 (+0.04%); split: -0.14%, +0.18%
Copies: 3447073 -> 3432417 (-0.43%); split: -0.66%, +0.23%
Branches: 1166542 -> 1166422 (-0.01%); split: -0.11%, +0.10%
PreSGPRs: 4229322 -> 4235538 (+0.15%)
PreVGPRs: 3817111 -> 3817040 (-0.00%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921 >
2021-02-12 19:00:18 +00:00
Daniel Schürmann
b98a4d4dd7
aco: refactor GPR limit calculation
...
This patch delays the calculation of GPR limits in order to
precisely incorporate extra registers (VCC etc.) and shared VGPRs.
Additionally, the allocation granularity is used to set the config.
This has some effect on the reported SGPR stats.
Totals (Navi10):
SGPRs: 6971787 -> 17753642 (+154.65%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921 >
2021-02-12 19:00:18 +00:00
Daniel Schürmann
eaf681724e
aco: change gpr_alloc_granule to full alignment
...
This also switches the alloc_granule of Tonga and Iceland
to 96, so that the calculation is consistent.
Also changes the granularity for RDNA to 16 to keep
better stats with the upcoming patch.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921 >
2021-02-12 19:00:18 +00:00
Rhys Perry
e115b01948
aco: return references in instruction cast methods
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595 >
2021-01-22 14:12:33 +00:00
Rhys Perry
1d245cd18b
aco: use format-check methods
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595 >
2021-01-22 14:12:32 +00:00
Rhys Perry
70dbcfa1c9
aco: use instruction cast methods
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595 >
2021-01-22 14:12:32 +00:00
Rhys Perry
441ead5fb3
aco: remove Format::{VOP3A,VOP3B}
...
These are really the same as Format::VOP3.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595 >
2021-01-22 14:12:32 +00:00
Rhys Perry
489aa8c7cb
aco: fix num_waves on GFX10+
...
There are half the SIMDs per CU and physical_vgprs should be 512 instead
of 256.
fossil-db (GFX10.3):
Totals from 3622 (2.60% of 139391) affected shaders:
VGPRs: 298192 -> 289732 (-2.84%); split: -3.43%, +0.59%
CodeSize: 29443432 -> 29458388 (+0.05%); split: -0.00%, +0.06%
MaxWaves: 21703 -> 23395 (+7.80%); split: +7.84%, -0.05%
Instrs: 5677920 -> 5681438 (+0.06%); split: -0.01%, +0.07%
Cycles: 280715524 -> 280895676 (+0.06%); split: -0.00%, +0.07%
VMEM: 981142 -> 981894 (+0.08%); split: +0.18%, -0.10%
SMEM: 243315 -> 243454 (+0.06%); split: +0.07%, -0.02%
VClause: 88991 -> 89767 (+0.87%); split: -0.02%, +0.89%
SClause: 200660 -> 200659 (-0.00%); split: -0.00%, +0.00%
Copies: 430729 -> 434160 (+0.80%); split: -0.07%, +0.86%
Branches: 158004 -> 158021 (+0.01%); split: -0.01%, +0.02%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523 >
2021-01-20 16:46:54 +00:00
Daniel Schürmann
4a70c4d383
aco: make pred_by_exec_mask() accessible in other files
...
and rename to needs_exec_mask().
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7903 >
2020-12-22 15:08:40 +01:00
Rhys Perry
ecc5b59a70
aco: don't allow destination opsel for v_cvt_pknorm
...
It doesn't make sense to do this.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7349 >
2020-10-29 18:08:31 +00:00
Rhys Perry
bf77f539ee
aco: optimize more uniform reductions/scans
...
Uniform atomic optimization will create these.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558 >
2020-10-13 12:47:20 +00:00
Samuel Pitoiset
502b9daa7a
aco: add ACO_DEBUG=novn,noopt,nosched for debugging purposes
...
To disable value numbering, optimizations and scheduling.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6470 >
2020-08-27 10:23:51 +00:00
Timur Kristóf
f820dde201
aco: Fix convert_to_SDWA when instruction has 3 operands.
...
Previously, when the instruction had 3 operands, this would cause
possible corruption because of writing to sdwa->sel[2].
This was noticed thanks to GCC 10's -Wstringop-overflow warning.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6436 >
2020-08-24 15:55:14 +02:00
Samuel Pitoiset
f153151730
aco: add ACO_DEBUG=force-waitcnt to emit wait-states
...
Sounds useful for debugging missing wait-states and for improving
detection of the faulty instruction in case of memory violations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6386 >
2020-08-21 13:22:58 +02:00
Samuel Pitoiset
bc723dfda7
aco: rename DEBUG_VALIDATE to DEBUG_VALIDATE_IR
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6318 >
2020-08-20 08:15:04 +02:00
Rhys Perry
a5303a3cbe
aco: update vgpr_alloc_granule for GFX10.3
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5546 >
2020-08-04 20:39:33 +01:00
Rhys Perry
d1f992f3c2
aco: rework barriers and replace can_reorder
...
fossil-db (Navi):
Totals from 273 (0.21% of 132058) affected shaders:
CodeSize: 937472 -> 936556 (-0.10%)
Instrs: 158874 -> 158648 (-0.14%)
Cycles: 13563516 -> 13562612 (-0.01%)
VMEM: 85246 -> 85244 (-0.00%)
SMEM: 21407 -> 21310 (-0.45%); split: +0.05%, -0.50%
VClause: 9321 -> 9317 (-0.04%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905 >
2020-07-28 16:56:34 +00:00
Rhys Perry
12b99d2581
aco: fix includes in aco_ir.cpp
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3300
Fixes: e75946cfef
('aco: move some setup code into helpers')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6015 >
2020-07-21 22:15:55 +00:00
Rhys Perry
e75946cfef
aco: move some setup code into helpers
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6013 >
2020-07-21 19:38:43 +00:00
Rhys Perry
56345b8c61
aco: allow reading/writing upper halves/bytes when possible
...
Use SDWA, opsel or a different opcode to achieve this.
shader-db (Navi, fp16 enabled):
Totals from 42 (0.03% of 127638) affected shaders:
VGPRs: 3424 -> 3416 (-0.23%)
CodeSize: 811124 -> 811984 (+0.11%); split: -0.12%, +0.23%
Instrs: 156638 -> 155733 (-0.58%)
Cycles: 1994180 -> 1982568 (-0.58%); split: -0.59%, +0.00%
VMEM: 7019 -> 7187 (+2.39%); split: +3.45%, -1.05%
SMEM: 1771 -> 1770 (-0.06%); split: +0.06%, -0.11%
VClause: 1477 -> 1475 (-0.14%)
Copies: 13216 -> 12406 (-6.13%)
Branches: 5942 -> 5901 (-0.69%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040 >
2020-06-10 15:05:11 +00:00
Rhys Perry
d9cfb8ad48
aco: validate instructions reading/writing upper halves/bytes
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040 >
2020-06-10 15:05:11 +00:00