Marek Olšák
3475c79328
ac/llvm: add support for 16-bit source operands for samplers
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9395 >
2021-03-03 20:06:09 +00:00
Rhys Perry
6d5e26752c
ac/nir: implement sparse image/texture loads
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775 >
2021-01-08 14:27:07 +00:00
Marek Olšák
fc212dcaa5
amd/llvm: fix C++ compile failures
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7807 >
2020-12-09 16:01:21 -05:00
James Park
ee72cd0757
amd: Remove bitfield sizes from enum values
...
Fixes negative indexing on MSVC.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7791 >
2020-11-27 20:49:00 -08:00
Marek Olšák
d425d765bf
ac: add build_alloca with an initializer
...
combining alloca_undef + BuildStore.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7542 >
2020-11-18 06:19:59 +00:00
Marek Olšák
aa757f4f8c
ac/llvm: fix demote inside conditional branches
...
The big comment explains it.
v2: don't kill if subgroup ops are used
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7586 >
2020-11-12 21:02:05 +00:00
Marek Olšák
e690a1b78b
ac/llvm: don't lower bool to int32, switch to native i1 bool
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7077 >
2020-10-20 10:21:39 +00:00
Samuel Pitoiset
5000c344cc
ac/llvm: move AC_FETCH_FORMAT to non-LLVM code
...
While we are it, give it a name.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7065 >
2020-10-12 13:13:40 +00:00
Samuel Pitoiset
31a0574b96
ac/nir: implement nir_op_fsat
...
With fmed3 if available, otherwise fallback to fmin/fmax.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6932 >
2020-10-08 12:38:04 +00:00
Marek Olšák
98a52fecda
radeonsi: implement 16-bit FS color outputs
...
This removes type conversions from 16 bits to 32 bits in the main function
and then back to 16 bits in the epilog.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6622 >
2020-09-22 02:44:53 +00:00
Pierre-Eric Pelloux-Prayer
82d2d73e03
amd/llvm: switch to 3-spaces style
...
Follow-up of !4319 using the same clang-format config.
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5310 >
2020-09-07 10:00:20 +02:00
Marek Olšák
e8d55e6db3
ac/llvm: fix b2f for v2f16
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6284 >
2020-09-06 14:36:21 +00:00
Marek Olšák
d9a77f9ca3
ac/llvm: add better code for fsign
...
There are 2 improvements:
- better code for 16, 32, and 64 bits
- vector support for 16 and 32 bits
Totals:
SGPRS: 2639738 -> 2625882 (-0.52 %)
VGPRS: 1534120 -> 1533916 (-0.01 %)
Spilled SGPRs: 3541 -> 3557 (0.45 %)
Spilled VGPRs: 33 -> 33 (0.00 %)
Private memory VGPRs: 256 -> 256 (0.00 %)
Scratch size: 292 -> 292 (0.00 %) dwords per thread
Code Size: 55640332 -> 55384892 (-0.46 %) bytes
Max Waves: 964785 -> 964857 (0.01 %)
Totals from affected shaders:
SGPRS: 377352 -> 363496 (-3.67 %)
VGPRS: 209800 -> 209596 (-0.10 %)
Spilled SGPRs: 1979 -> 1995 (0.81 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 256 -> 256 (0.00 %)
Scratch size: 256 -> 256 (0.00 %) dwords per thread
Code Size: 12549300 -> 12293860 (-2.04 %) bytes
Max Waves: 105762 -> 105834 (0.07 %)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6284 >
2020-09-06 14:36:21 +00:00
Marek Olšák
ca74603b4f
ac/llvm: add better code for isign
...
There are 2 improvements:
- select v_med3_i32
- support vectors
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6284 >
2020-09-06 14:36:21 +00:00
Marek Olšák
84500eebd7
ac/llvm: remove stub prototype for fmed3
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6284 >
2020-09-06 14:36:20 +00:00
Marek Olšák
70b6d54011
ac/nir: support 16-bit data in image opcodes
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003 >
2020-06-02 16:29:25 -04:00
Marek Olšák
c3e0ba52a0
ac/nir: support 16-bit data in buffer_load_format opcodes
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003 >
2020-06-02 16:29:25 -04:00
Marek Olšák
b819ba949b
ac/nir: remove type and num_channels args from ac_build_buffer_store_common
...
They were only used for type overloading where we can just use
the type of data.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003 >
2020-06-02 16:29:25 -04:00
Marek Olšák
e5ea87cde8
ac/nir: use more types from ac_llvm_context
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003 >
2020-06-02 16:29:25 -04:00
Samuel Pitoiset
14292310d9
ac/nir: implement nir_intrinsic_shader_clock with device scope
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5117 >
2020-05-24 20:37:58 +02:00
Samuel Pitoiset
0d63a1a84d
ac/llvm: add support for texturing with clamped LOD
...
This is a requirement for the shaderResourceMinLod feature which
allows to clamp LOD. This uses all image_sample_*_cl variants.
All dEQP-VK.glsl.texture_functions.texture*clamp.* pass.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4989 >
2020-05-14 10:05:44 +00:00
Pierre-Eric Pelloux-Prayer
17acff01a0
radeonsi: skip vs output optimizations for some outputs
...
If PT_SPRITE_TEX is enabled, PS inputs are overriden at runtime so
we can't apply the vs output optim.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2747
Fixes: 3ec9975555
("radeonsi: eliminate trivial constant VS outputs")
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4559 >
2020-04-20 08:45:16 +02:00
Samuel Pitoiset
ba2ec1f369
ac/nir: use llvm.amdgcn.rcp in ac_build_fdiv()
...
Instead of emitting 1.0 / x which includes a slow division that
LLVM doesn't always optimize even if the metadata is correctly set.
No pipeline-db changes with VEGA10/LLVM 9.
pipeline-db (VEGA10/LLVM 10):
Totals from affected shaders:
SGPRS: 6672 -> 6672 (0.00 %)
VGPRS: 6652 -> 6652 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 561780 -> 561692 (-0.02 %) bytes
Max Waves: 1043 -> 1043 (0.00 %)
pipeline-db (VEGA10/LLVM 11 - 92744f62478):
Totals from affected shaders:
SGPRS: 84608 -> 83768 (-0.99 %)
VGPRS: 106768 -> 106636 (-0.12 %)
Spilled SGPRs: 1625 -> 1713 (5.42 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 10850936 -> 10726712 (-1.14 %) bytes
Max Waves: 3152 -> 3180 (0.89 %)
LLVM 11 (master) is more affected than previous versions, but
based on the small impact with LLVM 9/10, I decided to emit it
unconditionally.
Cc: 20.0 <mesa-stable@lists.freedesktop.org >
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326 >
2020-03-27 08:05:43 +01:00
Daniel Schürmann
de57ea2a3d
amd/llvm: implement nir_intrinsic_demote(_if) and nir_intrinsic_is_helper_invocation
...
The current implementation uses a temporary helper variable
to ensure correct behavior until LLVM provides an intrinsic.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4047 >
2020-03-09 12:29:32 +00:00
Marek Olšák
4e4b2d13f0
ac: add helper ac_build_triangle_strip_indices_to_triangle
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
2020-01-20 16:16:11 -05:00
Marek Olšák
0f45d4dc2b
ac: add ac_build_readlane without optimization barrier
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
2020-01-20 16:16:11 -05:00
Marek Olšák
77393cf39b
ac: add prefix bitcount functions
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
2020-01-20 16:16:11 -05:00
Marek Olšák
9b71041627
ac: add ac_build_s_endpgm
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
2020-01-08 16:03:48 -05:00
Marek Olšák
1c44480538
ac: add 128-bit bitcount
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
2020-01-08 16:00:41 -05:00
Marek Olšák
d1c8aeb24f
ac: unify primitive export code
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
2020-01-08 16:00:38 -05:00
Marek Olšák
1c77a18cc2
ac: unify build_sendmsg_gs_alloc_req
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
2020-01-08 16:00:36 -05:00
Marek Olšák
f671cc4d95
ac: set swizzled bit in cache policy as a hint not to merge loads/stores
...
LLVM now merges loads and stores for all opcodes, so this must be set.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
2019-11-25 16:48:27 -05:00
Connor Abbott
9885af3bdf
ac: Add a shared interface between radv, radeonsi, LLVM and ACO
...
ac_shader_args will be similar to ac_shader_abi, except for being free
from LLVM-specific concepts and therefore capable of being shared
between LLVM and ACO. This will help us accomplish a few different
things:
- Decouple setting up SGPR and VGPR arguments from translating to LLVM,
so that we can reference these arguments in NIR lowering passes, which
will let us lower e.g. descriptor sets in NIR.
- Stop using radv-specific structures for things like determining the
chip generation in ACO.
In the end, we should replace ac_shader_abi with this structure +
driver-specific lowering passes.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
2019-11-25 14:12:46 +01:00
Samuel Pitoiset
7dfb15fff1
ac/llvm: add AC_FLOAT_MODE_ROUND_TO_ZERO
...
Because some instructions will be optimized by the backend compiler,
the driver has to manually flush to zero to keep the result exact.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-10-18 16:55:51 +02:00
Samuel Pitoiset
d94bd4e512
ac/llvm: add ac_build_canonicalize() helper
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-10-18 16:55:48 +02:00
Timur Kristóf
3a08110d43
amd: Move all amd/common code that depends on LLVM to amd/llvm.
...
This commit is a step towards the goal of being able to build RADV
without LLVM. In the future we would like to offer the option to
use RADV solely with ACO. There is still a need for the common AMD
code located in amd/common but the LLVM specific parts need to be
separated.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
2019-10-08 00:44:08 +00:00