Commit Graph

140 Commits

Author SHA1 Message Date
Pierre-Eric Pelloux-Prayer
d7008fe46a radeonsi: switch to 3-spaces style
Generated automatically using clang-format and the following config:

AlignAfterOpenBracket: true
AlignConsecutiveMacros: true
AllowAllArgumentsOnNextLine: false
AllowShortCaseLabelsOnASingleLine: false
AllowShortFunctionsOnASingleLine: false
AlwaysBreakAfterReturnType: None
BasedOnStyle: LLVM
BraceWrapping:
  AfterControlStatement: false
  AfterEnum: true
  AfterFunction: true
  AfterStruct: false
  BeforeElse: false
  SplitEmptyFunction: true
BinPackArguments: true
BinPackParameters: true
BreakBeforeBraces: Custom
ColumnLimit: 100
ContinuationIndentWidth: 3
Cpp11BracedListStyle: false
Cpp11BracedListStyle: true
ForEachMacros:
  - LIST_FOR_EACH_ENTRY
  - LIST_FOR_EACH_ENTRY_SAFE
  - util_dynarray_foreach
  - nir_foreach_variable
  - nir_foreach_variable_safe
  - nir_foreach_register
  - nir_foreach_register_safe
  - nir_foreach_use
  - nir_foreach_use_safe
  - nir_foreach_if_use
  - nir_foreach_if_use_safe
  - nir_foreach_def
  - nir_foreach_def_safe
  - nir_foreach_phi_src
  - nir_foreach_phi_src_safe
  - nir_foreach_parallel_copy_entry
  - nir_foreach_instr
  - nir_foreach_instr_reverse
  - nir_foreach_instr_safe
  - nir_foreach_instr_reverse_safe
  - nir_foreach_function
  - nir_foreach_block
  - nir_foreach_block_safe
  - nir_foreach_block_reverse
  - nir_foreach_block_reverse_safe
  - nir_foreach_block_in_cf_node
IncludeBlocks: Regroup
IncludeCategories:
  - Regex:           '<[[:alnum:].]+>'
    Priority:        2
  - Regex:           '.*'
    Priority:        1
IndentWidth: 3
PenaltyBreakBeforeFirstCallParameter: 1
PenaltyExcessCharacter: 100
SpaceAfterCStyleCast: false
SpaceBeforeCpp11BracedList: false
SpaceBeforeCtorInitializerColon: false
SpacesInContainerLiterals: false

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4319>
2020-03-30 11:05:52 +00:00
Marek Olšák
1a0890dcf3 radeonsi: change prototypes of si_is_multi_part_shader & si_is_merged_shader
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
be772182e0 radeonsi: make si_compile_llvm return bool
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
bd19d144a1 radeonsi: move more LLVM functions into si_shader_llvm.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
1c73d598eb radeonsi: minor cleanup in si_shader_internal.h
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
ab33ba987a radeonsi: move si_shader_llvm_build.c content into si_shader_llvm.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
cd5b99c541 radeonsi: move VS shader code into si_shader_llvm_vs.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
d1c42e2c6a radeonsi: move non-LLVM code out of si_shader_llvm.c
There was also some redundant code in si_shader_nir.c

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
594f085cfa radeonsi: use ctx->ac. for types and integer constants
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>
2020-01-23 19:10:21 +00:00
Marek Olšák
8db00a51f8 radeonsi/gfx10: implement NGG culling for 4x wave32 subgroups
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
a966729c84 radeonsi/gfx10: export primitives at the beginning of VS/TES
This decreases VGPR usage and will allow us to merge some IF blocks
in shaders.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
5a0fcf11f0 radeonsi/gfx10: move s_sendmsg gs_alloc_req to the beginning of shaders
This will allow us to merge some IF blocks in shaders.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
c4daf2b485 radeonsi: merge si_compile_llvm and si_llvm_compile functions
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
68586bdd21 radeonsi: remove useless #includes
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
30b14ba67e radeonsi: move code for shader resources into si_shader_llvm_resources.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
da2c12af4b radeonsi: move geometry shader code into si_shader_llvm_gs.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
57bd73e229 radeonsi: remove llvm_type_is_64bit
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
194449a405 radeonsi: move tessellation shader code into si_shader_llvm_tess.c
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
2020-01-15 21:54:55 +00:00
Marek Olšák
cf65c6f0d2 radeonsi: move VS_STATE.LS_OUT_PATCH_SIZE a few bits higher to make space there
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-15 15:06:31 -05:00
Marek Olšák
34ef0c5083 radeonsi: make si_insert_input_* functions non-static
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-15 15:06:29 -05:00
Marek Olšák
8832a88434 radeonsi: move PS LLVM code into si_shader_llvm_ps.c
This is an attempt to clean up si_shader.c.

v2: don't move code that is not specific to LLVM

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1)
2020-01-14 18:46:07 -05:00
Marek Olšák
9b60b3ce93 radeonsi: remove always constant ballot_mask_bits from si_llvm_context_init
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
37916a66b1 radeonsi: fold si_create_function into si_llvm_create_func
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2020-01-14 18:46:07 -05:00
Marek Olšák
363b4027fc radeonsi: put up to 5 VBO descriptors into user SGPRs
gfx6-8: 1 VBO descriptor in user SGPRs
gfx9-10: 5 VBO descriptors in user SGPRs

We no longer pull up to 5 VBO descriptors from GTT when SDMA is disabled.

Totals from affected shaders:
SGPRS: 1110528 -> 1170528 (5.40 %)
VGPRS: 952896 -> 951936 (-0.10 %)
Spilled SGPRs: 83 -> 61 (-26.51 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 23766296 -> 22843920 (-3.88 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 179344 -> 179344 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-13 15:57:07 -05:00
Marek Olšák
420fe1e7f9 radeonsi: remove TGSI
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-06 15:57:20 -05:00
Marek Olšák
43f05e0421 radeonsi/gfx10: fix ngg_get_ordered_id
This could have caused issues with NGG streamout.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Marek Olšák
1a07df840e radeonsi: deduplicate ES and GS thread enablement code
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Connor Abbott
3b143369a5 ac/nir, radv, radeonsi: Switch to using ac_shader_args
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2019-11-25 14:17:10 +01:00
Marek Olšák
442ef8c3e3 radeonsi: keep serialized NIR instead of nir_shader in si_shader_selector
This decreases memory usage, because serialized NIR is more compact.

The main shader part is compiled from nir_shader.
Monolithic shader variants are compiled from nir_binary.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-05 23:28:45 -05:00
Marek Olšák
2f42d4cacc radeonsi/gfx10: use fma for TGSI_OPCODE_FMA
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
223b3174bd radeonsi/nir: always lower ballot masks as 64-bit, codegen handles it
This fixes KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks.

This solution is better, because the IR isn't dependent on wave32.
2019-08-19 17:23:38 -04:00
Marek Olšák
1f8a661748 radeonsi: clean up si_llvm_context_set_tgsi
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-19 17:23:38 -04:00
Marek Olšák
0ef4c1c04d radeonsi: don't use lp_build_if for the wrapping if block in merged shaders 2019-07-30 22:06:23 -04:00
Marek Olšák
9234275320 radeonsi/nir: implement FBFETCH for KHR_blend_equation_advanced 2019-07-30 22:06:23 -04:00
Marek Olšák
88efb63caf radeonsi/gfx10: implement Wave32
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:19 -04:00
Marek Olšák
1d3bffaf9c radeonsi/gfx10: enable image stores with DCC
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2019-07-09 17:24:16 -04:00
Nicolai Hähnle
792a638b03 radeonsi/gfx10: implement streamout-related queries
The NGG hardware pipeline doesn't track these statistics automatically,
and in fact *cannot* track them automatically when API geometry shaders
are involved, so we accumulate statistics in the shader using atomic
adds.

This implementation accumulates statistics via the memory system and
the RW buffer descriptor setup. We could use GDS, but since these
atomics aren't latency-sensitive, that basically just trades off
L2$ bandwidth vs. export bus bandwidth. One single memory transaction
per shader workgroup doesn't seem too bad. The result ring buffer in
memory is needed either way to avoid pipeline stalls.

The shader code contains the atomic unconditionally, though the
GFX10_GS_QUERY_BUF is a null buffer when no queries are active. The
atomic is simply discarded by the shader hardware in that case.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle
4ecc39e1aa radeonsi/gfx10: NGG geometry shader PM4 and upload
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
a04aa4be2b radeonsi/gfx10: generate geometry shaders for NGG
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
612489bd5d radeonsi/gfx10: generate VS and TES as NGG merged ESGS shaders
This does not support geometry shading yet. Also missing are streamout
and NGG-specific optimizations.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
112bf7f900 radeonsi: make emit_streamout_output externally accessible
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
04e27ec136 radeonsi: make get_primitive_id externally visible
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
5059a4df8a radeonsi: make si_llvm_export_vs externally available
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:12 -04:00
Nicolai Hähnle
bf8a1ca902 radeonsi: use the new run-time linker for shaders
v2:
- fix a memory leak

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-06-12 20:28:23 -04:00
Marek Olšák
43aa2f4f7c radeonsi: make functions for creating LLVM functions non-static
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-05-16 13:10:07 -04:00
Marek Olšák
be0bd95abf radeonsi: fix GPU hangs with bindless textures and LLVM 7.0
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák
c5442c1165 radeonsi: add TGSI_SEMANTIC_CS_USER_DATA for reading up to 4 SGPRs with TGSI 2018-08-29 15:31:42 -04:00
Marek Olšák
e80e8d7adc ac: fix WAITCNT flags for GFX9
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-22 14:34:43 -04:00
Marek Olšák
a2c18bfbe3 radeonsi: don't use emit_data->args in atomic_emit
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:21:03 -04:00
Marek Olšák
cb6b241c30 ac,radeonsi: reduce optimizations for complex compute shaders on older APUs (v2)
To make dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23
finish sooner on the older CPUs. (otherwise it gets killed and we fail
the test)

Acked-by: Dave Airlie <airlied@gmail.com>
2018-08-01 15:25:18 -04:00