Pierre-Eric Pelloux-Prayer
d7008fe46a
radeonsi: switch to 3-spaces style
...
Generated automatically using clang-format and the following config:
AlignAfterOpenBracket: true
AlignConsecutiveMacros: true
AllowAllArgumentsOnNextLine: false
AllowShortCaseLabelsOnASingleLine: false
AllowShortFunctionsOnASingleLine: false
AlwaysBreakAfterReturnType: None
BasedOnStyle: LLVM
BraceWrapping:
AfterControlStatement: false
AfterEnum: true
AfterFunction: true
AfterStruct: false
BeforeElse: false
SplitEmptyFunction: true
BinPackArguments: true
BinPackParameters: true
BreakBeforeBraces: Custom
ColumnLimit: 100
ContinuationIndentWidth: 3
Cpp11BracedListStyle: false
Cpp11BracedListStyle: true
ForEachMacros:
- LIST_FOR_EACH_ENTRY
- LIST_FOR_EACH_ENTRY_SAFE
- util_dynarray_foreach
- nir_foreach_variable
- nir_foreach_variable_safe
- nir_foreach_register
- nir_foreach_register_safe
- nir_foreach_use
- nir_foreach_use_safe
- nir_foreach_if_use
- nir_foreach_if_use_safe
- nir_foreach_def
- nir_foreach_def_safe
- nir_foreach_phi_src
- nir_foreach_phi_src_safe
- nir_foreach_parallel_copy_entry
- nir_foreach_instr
- nir_foreach_instr_reverse
- nir_foreach_instr_safe
- nir_foreach_instr_reverse_safe
- nir_foreach_function
- nir_foreach_block
- nir_foreach_block_safe
- nir_foreach_block_reverse
- nir_foreach_block_reverse_safe
- nir_foreach_block_in_cf_node
IncludeBlocks: Regroup
IncludeCategories:
- Regex: '<[[:alnum:].]+>'
Priority: 2
- Regex: '.*'
Priority: 1
IndentWidth: 3
PenaltyBreakBeforeFirstCallParameter: 1
PenaltyExcessCharacter: 100
SpaceAfterCStyleCast: false
SpaceBeforeCpp11BracedList: false
SpaceBeforeCtorInitializerColon: false
SpacesInContainerLiterals: false
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4319>
2020-03-30 11:05:52 +00:00
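As a rough illustration of the layout the configuration above is expected to produce (3-space indent, opening brace on its own line after functions and enums, but on the same line after structs and control statements), here is a minimal sketch. The names `sketch_stage`, `sketch_counts`, and `sketch_count_stages` are hypothetical and do not come from radeonsi; actual clang-format output may differ in details.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types, used only to show the formatting. */
enum sketch_stage
{
   SKETCH_STAGE_VS,
   SKETCH_STAGE_PS,
};

struct sketch_counts {
   unsigned vs;
   unsigned ps;
};

/* AfterFunction: true puts the brace on its own line;
 * AfterControlStatement: false and BeforeElse: false keep
 * "if (...) {" and "} else {" on one line each. */
void sketch_count_stages(const enum sketch_stage *stages, size_t num, struct sketch_counts *out)
{
   assert(stages && out);

   out->vs = 0;
   out->ps = 0;

   for (size_t i = 0; i < num; i++) {
      if (stages[i] == SKETCH_STAGE_VS) {
         out->vs++;
      } else {
         out->ps++;
      }
   }
}
```

The function signature stays on one line because it fits within the 100-column limit and AlwaysBreakAfterReturnType is None.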
Marek Olšák
1a0890dcf3
radeonsi: change prototypes of si_is_multi_part_shader & si_is_merged_shader
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
be772182e0
radeonsi: make si_compile_llvm return bool
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
bd19d144a1
radeonsi: move more LLVM functions into si_shader_llvm.c
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
1c73d598eb
radeonsi: minor cleanup in si_shader_internal.h
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
ab33ba987a
radeonsi: move si_shader_llvm_build.c content into si_shader_llvm.c
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
cd5b99c541
radeonsi: move VS shader code into si_shader_llvm_vs.c
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
d1c42e2c6a
radeonsi: move non-LLVM code out of si_shader_llvm.c
...
There was also some redundant code in si_shader_nir.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
594f085cfa
radeonsi: use ctx->ac. for types and integer constants
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
8db00a51f8
radeonsi/gfx10: implement NGG culling for 4x wave32 subgroups
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
a966729c84
radeonsi/gfx10: export primitives at the beginning of VS/TES
...
This decreases VGPR usage and will allow us to merge some IF blocks
in shaders.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
5a0fcf11f0
radeonsi/gfx10: move s_sendmsg gs_alloc_req to the beginning of shaders
...
This will allow us to merge some IF blocks in shaders.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
c4daf2b485
radeonsi: merge si_compile_llvm and si_llvm_compile functions
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
68586bdd21
radeonsi: remove useless #includes
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
30b14ba67e
radeonsi: move code for shader resources into si_shader_llvm_resources.c
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
da2c12af4b
radeonsi: move geometry shader code into si_shader_llvm_gs.c
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
57bd73e229
radeonsi: remove llvm_type_is_64bit
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
194449a405
radeonsi: move tessellation shader code into si_shader_llvm_tess.c
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
cf65c6f0d2
radeonsi: move VS_STATE.LS_OUT_PATCH_SIZE a few bits higher to make space there
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-15 15:06:31 -05:00
Marek Olšák
34ef0c5083
radeonsi: make si_insert_input_* functions non-static
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-15 15:06:29 -05:00
Marek Olšák
8832a88434
radeonsi: move PS LLVM code into si_shader_llvm_ps.c
...
This is an attempt to clean up si_shader.c.
v2: don't move code that is not specific to LLVM
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1)
2020-01-14 18:46:07 -05:00
Marek Olšák
9b60b3ce93
radeonsi: remove always constant ballot_mask_bits from si_llvm_context_init
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
37916a66b1
radeonsi: fold si_create_function into si_llvm_create_func
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
363b4027fc
radeonsi: put up to 5 VBO descriptors into user SGPRs
...
gfx6-8: 1 VBO descriptor in user SGPRs
gfx9-10: 5 VBO descriptors in user SGPRs
We no longer pull up to 5 VBO descriptors from GTT when SDMA is disabled.
Totals from affected shaders:
SGPRS: 1110528 -> 1170528 (5.40 %)
VGPRS: 952896 -> 951936 (-0.10 %)
Spilled SGPRs: 83 -> 61 (-26.51 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 23766296 -> 22843920 (-3.88 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 179344 -> 179344 (0.00 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-13 15:57:07 -05:00
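The percentages in the shader totals above follow from the usual relative-change formula, (after - before) / before * 100. A minimal sketch for re-deriving them is shown here; `sketch_percent_change` is a hypothetical helper and is not part of any shader statistics tool.

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent.
 * Rows whose baseline is 0 (e.g. "Spilled VGPRs: 0 -> 0") cannot be
 * computed this way and are reported as 0.00 % in the totals. */
double sketch_percent_change(double before, double after)
{
   assert(before != 0.0);
   return (after - before) * 100.0 / before;
}
```

For example, `sketch_percent_change(83, 61)` gives roughly -26.5, matching the spilled SGPR row, and `sketch_percent_change(23766296, 22843920)` gives roughly -3.88, matching the code size row.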
Marek Olšák
420fe1e7f9
radeonsi: remove TGSI
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-06 15:57:20 -05:00
Marek Olšák
43f05e0421
radeonsi/gfx10: fix ngg_get_ordered_id
...
This could have caused issues with NGG streamout.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Marek Olšák
1a07df840e
radeonsi: deduplicate ES and GS thread enablement code
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Connor Abbott
3b143369a5
ac/nir, radv, radeonsi: Switch to using ac_shader_args
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2019-11-25 14:17:10 +01:00
Marek Olšák
442ef8c3e3
radeonsi: keep serialized NIR instead of nir_shader in si_shader_selector
...
This decreases memory usage, because serialized NIR is more compact.
The main shader part is compiled from nir_shader.
Monolithic shader variants are compiled from nir_binary.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-05 23:28:45 -05:00
Marek Olšák
2f42d4cacc
radeonsi/gfx10: use fma for TGSI_OPCODE_FMA
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
223b3174bd
radeonsi/nir: always lower ballot masks as 64-bit, codegen handles it
...
This fixes KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks.
This solution is better, because the IR isn't dependent on wave32.
2019-08-19 17:23:38 -04:00
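The idea of keeping ballot masks 64-bit regardless of wave size can be sketched on the host as follows. This is only an illustration under stated assumptions: `sketch_ballot` and `sketch_ballot_bit_count` are hypothetical helpers, not NIR or radeonsi functions, and the real lowering operates on shader IR rather than on arrays of booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a ballot mask that is always 64 bits wide. With wave_size == 32
 * only lanes 0..31 are read, so the upper 32 bits stay zero. */
uint64_t sketch_ballot(const bool *lane_pred, unsigned wave_size)
{
   assert(lane_pred);
   assert(wave_size == 32 || wave_size == 64);

   uint64_t mask = 0;
   for (unsigned lane = 0; lane < wave_size; lane++) {
      if (lane_pred[lane])
         mask |= UINT64_C(1) << lane;
   }
   return mask;
}

/* A consumer written against the 64-bit mask needs no wave-size check. */
unsigned sketch_ballot_bit_count(uint64_t mask)
{
   unsigned count = 0;
   for (; mask; mask &= mask - 1)
      count++;
   return count;
}
```

Because consumers only ever see a 64-bit value, the same code path serves both wave sizes, which is the property the commit message describes as the IR not depending on wave32.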
Marek Olšák
1f8a661748
radeonsi: clean up si_llvm_context_set_tgsi
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Marek Olšák
0ef4c1c04d
radeonsi: don't use lp_build_if for the wrapping if block in merged shaders
2019-07-30 22:06:23 -04:00
Marek Olšák
9234275320
radeonsi/nir: implement FBFETCH for KHR_blend_equation_advanced
2019-07-30 22:06:23 -04:00
Marek Olšák
88efb63caf
radeonsi/gfx10: implement Wave32
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
1d3bffaf9c
radeonsi/gfx10: enable image stores with DCC
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Nicolai Hähnle
792a638b03
radeonsi/gfx10: implement streamout-related queries
...
The NGG hardware pipeline doesn't track these statistics automatically,
and in fact *cannot* track them automatically when API geometry shaders
are involved, so we accumulate statistics in the shader using atomic
adds.
This implementation accumulates statistics via the memory system and
the RW buffer descriptor setup. We could use GDS, but since these
atomics aren't latency-sensitive, that basically just trades off
L2$ bandwidth vs. export bus bandwidth. One single memory transaction
per shader workgroup doesn't seem too bad. The result ring buffer in
memory is needed either way to avoid pipeline stalls.
The shader code contains the atomic unconditionally, though the
GFX10_GS_QUERY_BUF is a null buffer when no queries are active. The
atomic is simply discarded by the shader hardware in that case.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
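The accumulation scheme described above (one atomic add per shader workgroup into a result buffer, with the atomic having no effect when a null buffer is bound) can be modeled on the host roughly as follows. This is a sketch under assumptions: `sketch_workgroup_accumulate` is a hypothetical name, C11 atomics stand in for the shader's memory atomics, and a NULL pointer stands in for the null GFX10_GS_QUERY_BUF descriptor.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Sum the per-thread primitive counts of one workgroup, then publish the
 * total with a single atomic add. */
void sketch_workgroup_accumulate(atomic_uint *query_counter, const unsigned *prims_per_thread,
                                 size_t num_threads)
{
   assert(prims_per_thread || num_threads == 0);

   unsigned total = 0;
   for (size_t i = 0; i < num_threads; i++)
      total += prims_per_thread[i];

   /* In the shader the atomic is emitted unconditionally and the hardware
    * discards it for a null buffer. A host model has to check explicitly. */
   if (!query_counter)
      return;

   atomic_fetch_add(query_counter, total);
}
```

Issuing one memory transaction per workgroup rather than one per thread is what keeps the cost of the statistics low in this model.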
Nicolai Hähnle
4ecc39e1aa
radeonsi/gfx10: NGG geometry shader PM4 and upload
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
a04aa4be2b
radeonsi/gfx10: generate geometry shaders for NGG
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
612489bd5d
radeonsi/gfx10: generate VS and TES as NGG merged ESGS shaders
...
This does not support geometry shading yet. Also missing are streamout
and NGG-specific optimizations.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
112bf7f900
radeonsi: make emit_streamout_output externally accessible
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
04e27ec136
radeonsi: make get_primitive_id externally visible
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
5059a4df8a
radeonsi: make si_llvm_export_vs externally available
...
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
bf8a1ca902
radeonsi: use the new run-time linker for shaders
...
v2:
- fix a memory leak
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 20:28:23 -04:00
Marek Olšák
43aa2f4f7c
radeonsi: make functions for creating LLVM functions non-static
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
be0bd95abf
radeonsi: fix GPU hangs with bindless textures and LLVM 7.0
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
c5442c1165
radeonsi: add TGSI_SEMANTIC_CS_USER_DATA for reading up to 4 SGPRs with TGSI
2018-08-29 15:31:42 -04:00
Marek Olšák
e80e8d7adc
ac: fix WAITCNT flags for GFX9
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-22 14:34:43 -04:00
Marek Olšák
a2c18bfbe3
radeonsi: don't use emit_data->args in atomic_emit
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:21:03 -04:00
Marek Olšák
cb6b241c30
ac,radeonsi: reduce optimizations for complex compute shaders on older APUs (v2)
...
To make dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23
finish sooner on the older CPUs. (otherwise it gets killed and we fail
the test)
Acked-by: Dave Airlie <airlied@gmail.com>
2018-08-01 15:25:18 -04:00