Commit Graph

79 Commits

Author SHA1 Message Date
Rhys Perry
d23f8b0dcf aco: rename opcode->instruction
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27913>
2024-03-22 10:29:43 +00:00
Georg Lehmann
0c57340c23 aco/builder: use 24bit mul if low bits of imm are zero
Foz-DB Navi31:
Totals from 39 (0.05% of 79395) affected shaders:
Instrs: 62712 -> 62696 (-0.03%)
CodeSize: 330096 -> 329896 (-0.06%)
Latency: 192747 -> 192561 (-0.10%)
InvThroughput: 34078 -> 33889 (-0.55%)
VALU: 38979 -> 38963 (-0.04%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28223>
2024-03-18 13:37:28 +00:00
Georg Lehmann
b48a101d8f aco/builder: improve v_mul_imm for negative imm
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28223>
2024-03-18 13:37:27 +00:00
Rhys Perry
6547e17e60 aco: add VOPD format
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23367>
2024-02-05 10:51:14 +00:00
Georg Lehmann
4c74077b62 aco: implement rotate
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27118>
2024-01-24 16:38:40 +00:00
Georg Lehmann
18f6c2328f aco: use lm for carry out in vsub32
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26163>
2023-11-15 12:35:32 +00:00
Rhys Perry
15a3515d0b aco/tests: test that hazards are resolved at the end of shader parts
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25374>
2023-10-11 15:14:04 +00:00
Rhys Perry
94958e637d aco: improve printing of s_delay_alu
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23213>
2023-05-30 12:42:00 +00:00
Georg Lehmann
984bdc0fb1 aco/builder: support VOP3(P) with dpp
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22698>
2023-05-12 13:31:15 +00:00
Timur Kristóf
9b6945bb65 amd: Cleanup old GS intrinsics code.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
2023-05-04 19:08:59 +00:00
Daniel Schürmann
7d35bf24f6 aco: create hw_init_scratch() function for p_init_scratch lowering
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:30 +00:00
Georg Lehmann
097a97cc42 aco: remove VOP[123C]P? structs
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023>
2023-03-07 11:53:23 +00:00
Georg Lehmann
8fabde3be4 aco/gfx11: use dpp_row_xmask and dpp_row_share
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21412>
2023-02-27 11:09:42 +00:00
Rhys Perry
b4383821e7 aco: don't modify exec in p_interp_gfx11
The RDNA3 ISA docs say that lds_param_load write the entire quad
regardless of exec, so this isn't needed.

fossil-db (gfx1100):
Totals from 5291 (3.93% of 134574) affected shaders:
Instrs: 4891396 -> 4789628 (-2.08%)
CodeSize: 25519032 -> 25111960 (-1.60%)
Latency: 36122982 -> 36074300 (-0.13%); split: -0.14%, +0.00%
InvThroughput: 4162436 -> 4161424 (-0.02%); split: -0.02%, +0.00%
Copies: 263862 -> 263838 (-0.01%)
PreSGPRs: 225012 -> 224179 (-0.37%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21171>
2023-02-08 19:35:54 +00:00
Rhys Perry
850d945baf aco/tests: add setup_reduce_temp.divergent_if_phi
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20621>
2023-02-01 15:45:22 +00:00
Rhys Perry
c3dd1931d9 aco: allow Builder::Result to be dereferenced
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20251>
2023-01-10 16:01:38 +00:00
Samuel Pitoiset
369c9b6425 aco: fix p_interp_gfx11 to not overwrite SCC
s_wqm_b64 clobbers SCC.
Found this while working on dual source blending.

Fixes: 6113ee650a ("aco/gfx11: fix FS input loads in quad-divergent control flow")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19747>
2022-11-15 15:57:31 +00:00
Rhys Perry
6113ee650a aco/gfx11: fix FS input loads in quad-divergent control flow
This is not ideal and it would be great to somehow make it better some
day.

fossil-db (gfx1100):
Totals from 5208 (3.86% of 135032) affected shaders:
MaxWaves: 127058 -> 126962 (-0.08%); split: +0.01%, -0.09%
Instrs: 3983440 -> 4072736 (+2.24%); split: -0.00%, +2.24%
CodeSize: 21872468 -> 22230852 (+1.64%); split: -0.00%, +1.64%
VGPRs: 206688 -> 206984 (+0.14%); split: -0.05%, +0.20%
Latency: 37447383 -> 37491197 (+0.12%); split: -0.05%, +0.17%
InvThroughput: 6421955 -> 6422348 (+0.01%); split: -0.03%, +0.03%
VClause: 71579 -> 71545 (-0.05%); split: -0.09%, +0.04%
SClause: 148289 -> 147146 (-0.77%); split: -0.84%, +0.07%
Copies: 259011 -> 258084 (-0.36%); split: -0.61%, +0.25%
Branches: 101366 -> 101314 (-0.05%); split: -0.10%, +0.05%
PreSGPRs: 223482 -> 223460 (-0.01%); split: -0.21%, +0.20%
PreVGPRs: 184448 -> 184744 (+0.16%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19370>
2022-11-01 12:42:43 +00:00
Samuel Pitoiset
fdc212bd7b aco: create a new builder variant for ds_add_rtn
This instruction can use 1 definition and 3 operands.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19317>
2022-10-31 13:48:39 +00:00
Samuel Pitoiset
c481978ac2 aco: split the sendmsg enumeration into sendmsg_rtn
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19267>
2022-10-25 20:23:07 +02:00
Rhys Perry
6407d783ea aco: update sendmsg enum from LLVM
Add GFX11 enums and some new ones that apparently existed before.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17710>
2022-09-30 20:57:02 +00:00
Rhys Perry
826ed52174 aco/tests: add GFX11 assembly tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17333>
2022-09-26 14:49:57 +00:00
Rhys Perry
aadb7aef01 aco: add VINTERP instruction format
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17333>
2022-09-26 14:49:56 +00:00
Rhys Perry
55cd74d468 aco: add LDSDIR instruction format
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17333>
2022-09-26 14:49:56 +00:00
Rhys Perry
7a1b522148 aco: rename Interp_instruction to VINTRP_instruction
These is clearer since GFX11 adds another interpolation format.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17333>
2022-09-26 14:49:56 +00:00
Rhys Perry
931a456db1 aco: improve support for scratch_* instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
dae1629778 aco: disable sdwa on gfx11
Instead of SDWA v_mov_b32/v_xor_b32, we can use a combination of
v_add_u16/v_sub_u16 (add/sub swap, similar to xor swap) and v_perm_b32
with a literal.

I don't know yet if GFX11 adds any new instructions which makes this
easier, but this approach should have full functionality.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16595>
2022-05-31 18:07:34 +00:00
Marek Olšák
39800f0fa3 amd: change chip_class naming to "enum amd_gfx_level gfx_level"
This aligns the naming with PAL.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pellou-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16469>
2022-05-13 14:56:22 -04:00
Daniel Schürmann
d703a0e808 aco: remove register hints entirely
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408>
2022-04-13 21:52:43 +00:00
Daniel Schürmann
2fe005a3fe aco: remove occurences of VCC hint
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408>
2022-04-13 21:52:43 +00:00
Tatsuyuki Ishi
da0412e55b aco: support DPP8
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13971>
2021-12-31 20:56:39 +00:00
Rhys Perry
1988a78430 aco: fix vadd32() when b is neither a constant nor temporary
This will be useful for compiling vertex shader prologs, where we
basically use ACO as an assembler.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11717>
2021-10-13 05:13:10 +00:00
Tony Wasserka
76554419b3 aco: Remove use of deprecated Operand constructors in aco_builder.h
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>
2021-07-13 17:43:26 +00:00
Daniel Schürmann
59fdaa1985 aco: reorder and cleanup #includes
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271>
2021-07-12 12:09:31 +00:00
Rhys Perry
c768d7d8f2 aco/tests: add SDWA tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:43 +00:00
Rhys Perry
298d400e5c aco/tests: add test for NSAToVMEMBug
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9187>
2021-03-17 12:31:05 +00:00
Rhys Perry
441ead5fb3 aco: remove Format::{VOP3A,VOP3B}
These are really the same as Format::VOP3.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:32 +00:00
Daniel Schürmann
5ad52ac906 aco: create helpers to emit vop3p instructions
Also make get_alu_src() capable to return
unswizzled multi-component SGPR sources.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:56 +00:00
Rhys Perry
382f50ad2c aco: implement sparse texture fetches
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775>
2021-01-08 14:27:07 +00:00
Samuel Pitoiset
be600b009a aco: add a new Operand flag to indicate that is 24-bit
To indicate that the upper 8-bits are always 0 to optimize more MADs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673>
2020-11-23 18:34:40 +00:00
Rhys Perry
02c5519e6c aco: try harder to not create v_mul_lo_u32
fossil-db (Vega):
Totals from 4 (0.00% of 137413) affected shaders:
CodeSize: 13708 -> 13716 (+0.06%)
Instrs: 2742 -> 2744 (+0.07%)
Cycles: 24348 -> 24236 (-0.46%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390>
2020-11-20 19:50:31 +00:00
Rhys Perry
8ca23bcf39 aco: copy constant to sgpr in Builder::v_mul_imm()
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390>
2020-11-20 19:50:31 +00:00
Rhys Perry
70d665d981 aco: don't create v_mov_b32 in v_mul_imm()
We switched to p_parallelcopy for everything else.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390>
2020-11-20 19:50:31 +00:00
Tony Wasserka
2bb8874320 aco: Fix -Wshadow warnings
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7430>
2020-11-20 09:29:19 +00:00
Samuel Pitoiset
0ea763a727 aco: add a new Operand flag to indicate that is 16-bit
To indicate that the upper 16-bits are always 0 and that optimizing
v_mad_u32_u16 to v_mul_u32_u24 is valid.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7425>
2020-11-12 12:32:26 +00:00
Rhys Perry
ef95ba8cdd aco: implement some 16-bit arithmetic instead of lowering
fossil-db (parallel-rdp, Navi):
Totals from 210 (30.75% of 683) affected shaders:
SGPRs: 9704 -> 10248 (+5.61%)
VGPRs: 5884 -> 5368 (-8.77%)
CodeSize: 1155564 -> 1098752 (-4.92%)
Instrs: 199927 -> 189940 (-5.00%)
Cycles: 20438392 -> 19860124 (-2.83%)

v2: use divergence analysis to determine which instructions to lower.

Co-Authored-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791>
2020-11-04 11:50:37 +00:00
Rhys Perry
e54c111c45 aco: always use p_parallelcopy for pre-RA copies
Most fossil-db changes are because literals are applied earlier
(in label_instruction), so use counts are more accurate and more literals
are applied.

fossil-db (Navi):
Totals from 79551 (57.89% of 137413) affected shaders:
SGPRs: 4549610 -> 4542802 (-0.15%); split: -0.19%, +0.04%
VGPRs: 3326764 -> 3324172 (-0.08%); split: -0.10%, +0.03%
SpillSGPRs: 38886 -> 34562 (-11.12%); split: -11.14%, +0.02%
CodeSize: 240143456 -> 240001008 (-0.06%); split: -0.11%, +0.05%
MaxWaves: 1078919 -> 1079281 (+0.03%); split: +0.04%, -0.01%
Instrs: 46627073 -> 46528490 (-0.21%); split: -0.22%, +0.01%

fossil-db (Polaris):
Totals from 98463 (70.90% of 138881) affected shaders:
SGPRs: 5164689 -> 5164353 (-0.01%); split: -0.02%, +0.01%
VGPRs: 3920936 -> 3921856 (+0.02%); split: -0.00%, +0.03%
SpillSGPRs: 56298 -> 52259 (-7.17%); split: -7.22%, +0.04%
CodeSize: 258680092 -> 258692712 (+0.00%); split: -0.02%, +0.03%
MaxWaves: 620863 -> 620823 (-0.01%); split: +0.00%, -0.01%
Instrs: 50776289 -> 50757577 (-0.04%); split: -0.04%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>
2020-10-27 15:24:38 +00:00
Rhys Perry
74e2e9b682 aco: don't use bld.copy() in handle_operands()
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>
2020-10-27 15:24:38 +00:00
Rhys Perry
1a652244e4 aco: implement 16-bit literals
We can copy any value into a 16-bit subregister with a 3 dword
v_pack_b32_f16 on GFX10 or a v_and_b32+v_or_b32 on GFX9.

Because the generated code can depend on the register assignment and to
improve constant propagation, Builder::copy creates a p_create_vector in
the case of sub-dword literals.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7111>
2020-10-15 11:33:42 +00:00
Rhys Perry
bf77f539ee aco: optimize more uniform reductions/scans
Uniform atomic optimization will create these.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>
2020-10-13 12:47:20 +00:00