Marek Olšák
c4daf2b485
radeonsi: merge si_compile_llvm and si_llvm_compile functions
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
68586bdd21
radeonsi: remove useless #includes
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
30b14ba67e
radeonsi: move code for shader resources into si_shader_llvm_resources.c
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
da2c12af4b
radeonsi: move geometry shader code into si_shader_llvm_gs.c
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
57bd73e229
radeonsi: remove llvm_type_is_64bit
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
194449a405
radeonsi: move tessellation shader code into si_shader_llvm_tess.c
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
cf65c6f0d2
radeonsi: move VS_STATE.LS_OUT_PATCH_SIZE a few bits higher to make space there
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-15 15:06:31 -05:00
Marek Olšák
34ef0c5083
radeonsi: make si_insert_input_* functions non-static
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-15 15:06:29 -05:00
Marek Olšák
8832a88434
radeonsi: move PS LLVM code into si_shader_llvm_ps.c
...
This is an attempt to clean up si_shader.c.
v2: don't move code that is not specific to LLVM
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1)
2020-01-14 18:46:07 -05:00
Marek Olšák
9b60b3ce93
radeonsi: remove always constant ballot_mask_bits from si_llvm_context_init
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
37916a66b1
radeonsi: fold si_create_function into si_llvm_create_func
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
363b4027fc
radeonsi: put up to 5 VBO descriptors into user SGPRs
...
gfx6-8: 1 VBO descriptor in user SGPRs
gfx9-10: 5 VBO descriptors in user SGPRs
We no longer pull up to 5 VBO descriptors from GTT when SDMA is disabled.
Totals from affected shaders:
SGPRS: 1110528 -> 1170528 (5.40 %)
VGPRS: 952896 -> 951936 (-0.10 %)
Spilled SGPRs: 83 -> 61 (-26.51 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 23766296 -> 22843920 (-3.88 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 179344 -> 179344 (0.00 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-13 15:57:07 -05:00
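The percentage column in the totals of 363b4027fc can be reproduced from the before/after values. The following is a minimal sketch; the helper name `percent_change` and the handling of a zero baseline are assumptions for illustration and are not taken from the tool that produced the report.

```python
def percent_change(before: int, after: int) -> str:
    """Format the relative change in the same style as the totals above."""
    if before == 0:
        # Assumption: a "0 -> 0" row is printed as 0.00 %, as in the log;
        # a change from zero to a non-zero value has no defined percentage.
        return "0.00 %" if after == 0 else "n/a"
    return f"{(after - before) / before * 100:.2f} %"


# Values copied from the commit message.
totals = {
    "SGPRS": (1110528, 1170528),
    "VGPRS": (952896, 951936),
    "Spilled SGPRs": (83, 61),
    "Code Size": (23766296, 22843920),
}

for name, (before, after) in totals.items():
    print(f"{name}: {before} -> {after} ({percent_change(before, after)})")
```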
Marek Olšák
420fe1e7f9
radeonsi: remove TGSI
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-06 15:57:20 -05:00
Marek Olšák
43f05e0421
radeonsi/gfx10: fix ngg_get_ordered_id
...
This could have caused issues with NGG streamout.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Marek Olšák
1a07df840e
radeonsi: deduplicate ES and GS thread enablement code
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Connor Abbott
3b143369a5
ac/nir, radv, radeonsi: Switch to using ac_shader_args
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2019-11-25 14:17:10 +01:00
Marek Olšák
442ef8c3e3
radeonsi: keep serialized NIR instead of nir_shader in si_shader_selector
...
This decreases memory usage, because serialized NIR is more compact.
The main shader part is compiled from nir_shader.
Monolithic shader variants are compiled from nir_binary.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-05 23:28:45 -05:00
Marek Olšák
2f42d4cacc
radeonsi/gfx10: use fma for TGSI_OPCODE_FMA
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
223b3174bd
radeonsi/nir: always lower ballot masks as 64-bit, codegen handles it
...
This fixes KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks.
This solution is better, because the IR isn't dependent on wave32.
2019-08-19 17:23:38 -04:00
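The idea in 223b3174bd, that ballot masks are always treated as 64-bit values regardless of the wave size, can be modelled as below. This is only an illustrative sketch: the function `ballot` and its parameters are hypothetical and do not correspond to driver or NIR code.

```python
def ballot(active_lanes, predicate, wave_size):
    """Return a ballot result as a 64-bit integer for either wave size.

    Hypothetical model: bit N is set when lane N is active and the
    predicate holds for it. With wave_size == 32 the upper 32 bits are
    always zero, so consumers never need to know the wave size.
    """
    if wave_size not in (32, 64):
        raise ValueError("wave_size must be 32 or 64")
    mask = 0
    for lane in range(wave_size):
        if lane in active_lanes and predicate(lane):
            mask |= 1 << lane
    return mask & 0xFFFFFFFFFFFFFFFF


# The same consumer code handles both wave sizes.
wave32 = ballot(range(32), lambda lane: lane % 2 == 0, 32)
wave64 = ballot(range(64), lambda lane: True, 64)
print(hex(wave32), hex(wave64))
```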
Marek Olšák
1f8a661748
radeonsi: clean up si_llvm_context_set_tgsi
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Marek Olšák
0ef4c1c04d
radeonsi: don't use lp_build_if for the wrapping if block in merged shaders
2019-07-30 22:06:23 -04:00
Marek Olšák
9234275320
radeonsi/nir: implement FBFETCH for KHR_blend_equation_advanced
2019-07-30 22:06:23 -04:00
Marek Olšák
88efb63caf
radeonsi/gfx10: implement Wave32
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
1d3bffaf9c
radeonsi/gfx10: enable image stores with DCC
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Nicolai Hähnle
792a638b03
radeonsi/gfx10: implement streamout-related queries
...
The NGG hardware pipeline doesn't track these statistics automatically,
and in fact *cannot* track them automatically when API geometry shaders
are involved, so we accumulate statistics in the shader using atomic
adds.
This implementation accumulates statistics via the memory system and
the RW buffer descriptor setup. We could use GDS, but since these
atomics aren't latency-sensitive, that basically just trades off
L2$ bandwidth vs. export bus bandwidth. One single memory transaction
per shader workgroup doesn't seem too bad. The result ring buffer in
memory is needed either way to avoid pipeline stalls.
The shader code contains the atomic unconditionally, though the
GFX10_GS_QUERY_BUF is a null buffer when no queries are active. The
atomic is simply discarded by the shader hardware in that case.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
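The accumulation scheme described in 792a638b03 (one atomic add per shader workgroup, issued unconditionally and dropped when the query buffer is null) can be sketched as a small host-side model. Everything named here (`QueryBuffer`, `issue_atomic_add`, `run_workgroup`) is hypothetical and only mirrors the description in the commit message, not the actual shader code.

```python
import threading


class QueryBuffer:
    """Hypothetical stand-in for the query result buffer in memory."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def atomic_add(self, amount):
        # The lock models the atomicity of the memory operation.
        with self._lock:
            self.value += amount


def issue_atomic_add(query_buf, amount):
    """Issue the add unconditionally; a null buffer discards it."""
    if query_buf is None:
        return  # models the access to a null buffer being dropped
    query_buf.atomic_add(amount)


def run_workgroup(per_thread_counts, query_buf):
    """Reduce inside the workgroup, then issue one memory transaction."""
    workgroup_total = sum(per_thread_counts)
    issue_atomic_add(query_buf, workgroup_total)


buf = QueryBuffer()
run_workgroup([1, 2, 3], buf)
run_workgroup([4], buf)
run_workgroup([5], None)  # no query active: nothing is recorded
print(buf.value)
```

Keeping the add unconditional in the model reflects the commit's point that the shader does not branch on whether a query is active.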
Nicolai Hähnle
4ecc39e1aa
radeonsi/gfx10: NGG geometry shader PM4 and upload
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
a04aa4be2b
radeonsi/gfx10: generate geometry shaders for NGG
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
612489bd5d
radeonsi/gfx10: generate VS and TES as NGG merged ESGS shaders
...
This does not support geometry shading yet. Also missing are streamout
and NGG-specific optimizations.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
112bf7f900
radeonsi: make emit_streamout_output externally accessible
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
04e27ec136
radeonsi: make get_primitive_id externally visible
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
5059a4df8a
radeonsi: make si_llvm_export_vs externally available
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
bf8a1ca902
radeonsi: use the new run-time linker for shaders
...
v2:
- fix a memory leak
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 20:28:23 -04:00
Marek Olšák
43aa2f4f7c
radeonsi: make functions for creating LLVM functions non-static
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
be0bd95abf
radeonsi: fix GPU hangs with bindless textures and LLVM 7.0
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
c5442c1165
radeonsi: add TGSI_SEMANTIC_CS_USER_DATA for reading up to 4 SGPRs with TGSI
2018-08-29 15:31:42 -04:00
Marek Olšák
e80e8d7adc
ac: fix WAITCNT flags for GFX9
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-22 14:34:43 -04:00
Marek Olšák
a2c18bfbe3
radeonsi: don't use emit_data->args in atomic_emit
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:21:03 -04:00
Marek Olšák
cb6b241c30
ac,radeonsi: reduce optimizations for complex compute shaders on older APUs (v2)
...
This makes dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23
finish sooner on the older CPUs (otherwise it gets killed and we fail
the test).
Acked-by: Dave Airlie <airlied@gmail.com>
2018-08-01 15:25:18 -04:00
Dave Airlie
0eb65b4944
radeonsi: rename si_compiler -> ac_llvm_compiler
...
As a precursor to moving init to common code, just rename the struct
and move it.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-07-04 05:31:32 +10:00
Marek Olšák
7bd40dc2f2
radeonsi: clean up some #includes
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
f154555733
radeonsi: clean up passing the is_monolithic flag for compilation
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák
87eb597758
radeonsi: add struct si_compiler containing LLVMTargetMachineRef
...
It will contain more variables.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák
4c5efc40f4
radeonsi: update copyrights
...
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-04-05 15:34:58 -04:00
Marek Olšák
2be6143032
radeonsi: implement GL_KHR_blend_equation_advanced
...
MSAA is supported using sample shading. Layered rendering and all texture
targets are also supported.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:25 -04:00
Marek Olšák
e04631b0f2
radeonsi: rename unpack_param -> si_unpack_param
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-02 13:55:23 -04:00
Timothy Arceri
50cc97d98a
radeonsi: add si_llvm_emit_kill() helper
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 11:28:37 +11:00
Timothy Arceri
6e1a142863
radeonsi: make use of if/loop build helpers in ac
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 10:12:34 +11:00
Marek Olšák
9779f34326
radeonsi: remove si_llvm_add_attribute
2018-03-07 13:55:49 -05:00
Timothy Arceri
2a68c6c6c8
radeonsi: move si_nir_load_input_gs() to si_shader.c
...
All the tess shader and TGSI equivalents are here, and it allows
us to use llvm_type_is_64bit() in the following patch without
exposing it externally.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 11:44:06 +11:00
Marek Olšák
63ea0a00a3
radeonsi: preload the tess offchip ring in TES
...
so that it's not done multiple times in branches
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-02-24 23:08:29 +01:00