Commit Graph

128 Commits

Author SHA1 Message Date
Marek Olšák
c4daf2b485 radeonsi: merge si_compile_llvm and si_llvm_compile functions
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
68586bdd21 radeonsi: remove useless #includes
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
30b14ba67e radeonsi: move code for shader resources into si_shader_llvm_resources.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
da2c12af4b radeonsi: move geometry shader code into si_shader_llvm_gs.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
57bd73e229 radeonsi: remove llvm_type_is_64bit
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
194449a405 radeonsi: move tessellation shader code into si_shader_llvm_tess.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
cf65c6f0d2 radeonsi: move VS_STATE.LS_OUT_PATCH_SIZE a few bits higher to make space there
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-15 15:06:31 -05:00
Marek Olšák
34ef0c5083 radeonsi: make si_insert_input_* functions non-static
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-15 15:06:29 -05:00
Marek Olšák
8832a88434 radeonsi: move PS LLVM code into si_shader_llvm_ps.c
This is an attempt to clean up si_shader.c.

v2: don't move code that is not specific to LLVM

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1)
2020-01-14 18:46:07 -05:00
Marek Olšák
9b60b3ce93 radeonsi: remove always constant ballot_mask_bits from si_llvm_context_init
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
37916a66b1 radeonsi: fold si_create_function into si_llvm_create_func
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
363b4027fc radeonsi: put up to 5 VBO descriptors into user SGPRs
gfx6-8: 1 VBO descriptor in user SGPRs
gfx9-10: 5 VBO descriptors in user SGPRs

We no longer pull up to 5 VBO descriptors from GTT when SDMA is disabled.

Totals from affected shaders:
SGPRS: 1110528 -> 1170528 (5.40 %)
VGPRS: 952896 -> 951936 (-0.10 %)
Spilled SGPRs: 83 -> 61 (-26.51 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 23766296 -> 22843920 (-3.88 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 179344 -> 179344 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-13 15:57:07 -05:00
Marek Olšák
420fe1e7f9 radeonsi: remove TGSI
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-06 15:57:20 -05:00
Marek Olšák
43f05e0421 radeonsi/gfx10: fix ngg_get_ordered_id
This could have caused issues with NGG streamout.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Marek Olšák
1a07df840e radeonsi: deduplicate ES and GS thread enablement code
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Connor Abbott
3b143369a5 ac/nir, radv, radeonsi: Switch to using ac_shader_args
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2019-11-25 14:17:10 +01:00
Marek Olšák
442ef8c3e3 radeonsi: keep serialized NIR instead of nir_shader in si_shader_selector
This decreases memory usage, because serialized NIR is more compact.

The main shader part is compiled from nir_shader.
Monolithic shader variants are compiled from nir_binary.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-05 23:28:45 -05:00
Marek Olšák
2f42d4cacc radeonsi/gfx10: use fma for TGSI_OPCODE_FMA
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
223b3174bd radeonsi/nir: always lower ballot masks as 64-bit, codegen handles it
This fixes KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks.

This solution is better, because the IR isn't dependent on wave32.
2019-08-19 17:23:38 -04:00
Marek Olšák
1f8a661748 radeonsi: clean up si_llvm_context_set_tgsi
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Marek Olšák
0ef4c1c04d radeonsi: don't use lp_build_if for the wrapping if block in merged shaders 2019-07-30 22:06:23 -04:00
Marek Olšák
9234275320 radeonsi/nir: implement FBFETCH for KHR_blend_equation_advanced 2019-07-30 22:06:23 -04:00
Marek Olšák
88efb63caf radeonsi/gfx10: implement Wave32
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
1d3bffaf9c radeonsi/gfx10: enable image stores with DCC
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Nicolai Hähnle
792a638b03 radeonsi/gfx10: implement streamout-related queries
The NGG hardware pipeline doesn't track these statistics automatically,
and in fact *cannot* track them automatically when API geometry shaders
are involved, so we accumulate statistics in the shader using atomic
adds.

This implementation accumulates statistics via the memory system and
the RW buffer descriptor setup. We could use GDS, but since these
atomics aren't latency-sensitive, that basically just trades off
L2$ bandwidth vs. export bus bandwidth. One single memory transaction
per shader workgroup doesn't seem too bad. The result ring buffer in
memory is needed either way to avoid pipeline stalls.

The shader code contains the atomic unconditionally, though the
GFX10_GS_QUERY_BUF is a null buffer when no queries are active. The
atomic is simply discarded by the shader hardware in that case.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
4ecc39e1aa radeonsi/gfx10: NGG geometry shader PM4 and upload
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
a04aa4be2b radeonsi/gfx10: generate geometry shaders for NGG
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
612489bd5d radeonsi/gfx10: generate VS and TES as NGG merged ESGS shaders
This does not support geometry shading yet. Also missing are streamout
and NGG-specific optimizations.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
112bf7f900 radeonsi: make emit_streamout_output externally accessible
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
04e27ec136 radeonsi: make get_primitive_id externally visible
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
5059a4df8a radeonsi: make si_llvm_export_vs externally available
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
bf8a1ca902 radeonsi: use the new run-time linker for shaders
v2:
- fix a memory leak

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 20:28:23 -04:00
Marek Olšák
43aa2f4f7c radeonsi: make functions for creating LLVM functions non-static
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
be0bd95abf radeonsi: fix GPU hangs with bindless textures and LLVM 7.0
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
c5442c1165 radeonsi: add TGSI_SEMANTIC_CS_USER_DATA for reading up to 4 SGPRs with TGSI 2018-08-29 15:31:42 -04:00
Marek Olšák
e80e8d7adc ac: fix WAITCNT flags for GFX9
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-22 14:34:43 -04:00
Marek Olšák
a2c18bfbe3 radeonsi: don't use emit_data->args in atomic_emit
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:21:03 -04:00
Marek Olšák
cb6b241c30 ac,radeonsi: reduce optimizations for complex compute shaders on older APUs (v2)
To make dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23
finish sooner on the older CPUs. (otherwise it gets killed and we fail
the test)

Acked-by: Dave Airlie <airlied@gmail.com>
2018-08-01 15:25:18 -04:00
Dave Airlie
0eb65b4944 radeonsi: rename si_compiler -> ac_llvm_compiler
As precursor to moving init to common code, just rename the struct
and move it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 05:31:32 +10:00
Marek Olšák
7bd40dc2f2 radeonsi: clean up some #includes
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
f154555733 radeonsi: clean up passing the is_monolithic flag for compilation
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
87eb597758 radeonsi: add struct si_compiler containing LLVMTargetMachineRef
It will contain more variables.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
4c5efc40f4 radeonsi: update copyrights
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
2be6143032 radeonsi: implement GL_KHR_blend_equation_advanced
MSAA is supported using sample shading. Layered rendering and all texture
targets are also supported.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:25 -04:00
Marek Olšák
e04631b0f2 radeonsi: rename unpack_param -> si_unpack_param
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:23 -04:00
Timothy Arceri
50cc97d98a radeonsi: add si_llvm_emit_kill() helper
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 11:28:37 +11:00
Timothy Arceri
6e1a142863 radeonsi: make use of if/loop build helpers in ac
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 10:12:34 +11:00
Marek Olšák
9779f34326 radeonsi: remove si_llvm_add_attribute 2018-03-07 13:55:49 -05:00
Timothy Arceri
2a68c6c6c8 radeonsi: move si_nir_load_input_gs() to si_shader.c
All the tess shader and tgsi equivalents are here and it allows
use to use llvm_type_is_64bit() in the following patch without
exposing it externally.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 11:44:06 +11:00
Marek Olšák
63ea0a00a3 radeonsi: preload the tess offchip ring in TES
so that it's not done multiple times in branches

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:29 +01:00