Commit Graph

11 Commits

Author SHA1 Message Date
Rhys Perry
525b107347 aco: rework vertex fetching a bit
This will make it easier to skip unused channels at the start and to split
unaligned loads on GFX10.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
2020-01-28 11:39:57 +00:00
Rhys Perry
70f63c1988 aco: improve support for s_sendmsg
In particular, the messages needed for GS.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>
2020-01-24 13:35:07 +00:00
Daniel Schürmann
da7ff58835 aco: make 1/2*PI a literal constant on SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
6a586a6006 aco: split read/writelane opcode into VOP2/VOP3 version for SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Timur Kristóf
e0bcefc3a0 aco/wave32: Use lane mask regclass for exec/vcc.
Currently all usages of exec and vcc are hardcoded to use s2 regclass.
This commit makes it possible to use s1 in wave32 mode and
s2 in wave64 mode.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Timur Kristóf
b4efe179ed aco/wave32: Add wave size specific opcodes to aco_builder.
Several places in ACO we use SOP1 or SOP2 instructions to operate over the
exec mask or VCC, and these need to be adapted to the new size in wave32
mode.

This commit adds a way to deal with this problem in aco_builder: the caller
can specify a wave size specific opcode and the builder will translate that
to the correct opcode based on the current wave size.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Rhys Perry
56c06c79fc aco: implement 64-bit integer reductions
The multiplication reduction is larger than it could be, but it should be
easier to implement this way.

No failures with dEQP-VK.subgroups.*int64* except those caused by LLVM
being used for other stages.

v2: don't call setFixed() for v_add carry-out, since setHint sets physReg
v3: add and use emit_vadd32() helper
v4: use num_opcodes instead of last_opcode

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v3)
2019-11-19 18:58:04 +00:00
Daniel Schürmann
0b8216b2cd aco: Lower to CSSA
Converting to 'Conventional SSA Form' ensures correctness w.r.t. spilling of phi nodes.
Previously, it was possible that phi operands have intersecting live-ranges, and thus,
couldn't get spilled to the same spilling slot. For this reason, ACO tried to avoid to
spill phis, even if it was beneficial.
This patch implements a conversion pass which is currently only called if spilling is necessary.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:32 +00:00
Timur Kristóf
d59f702e26 aco: Implement subgroup shuffle in GFX10 wave64 mode.
Previously subgroup shuffle was implemented using the bpermute
instruction, which only works accross half-waves, so by itself it's
not suitable for implementing subgroup shuffle when the shader is
running in wave64 mode.

This commit adds a trick using shared VGPRs that allows to implement
subgroup shuffle still relatively effectively in this mode.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-28 23:52:50 +00:00
Rhys Perry
619f0a71cc aco: Use the VOP3-only add/sub GFX10 instructions if needed.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Daniel Schürmann
93c8ebfa78 aco: Initial commit of independent AMD compiler
ACO (short for AMD Compiler) is a new compiler backend with the goal to replace
LLVM for Radeon hardware for the RADV driver.

ACO currently supports only VS, PS and CS on VI and Vega.
There are some optimizations missing because of unmerged NIR changes
which may decrease performance.

Full commit history can be found at
https://github.com/daniel-schuermann/mesa/commits/backend

Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Co-authored-by: Rhys Perry <pendingchaos02@gmail.com>
Co-authored-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Co-authored-by: Connor Abbott <cwabbott0@gmail.com>
Co-authored-by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com>
Co-authored-by: Timur Kristóf <timur.kristof@gmail.com>

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 12:10:00 +02:00