Samuel Pitoiset
61a91ca3f5
ac/nir: move unpack_param() to ac_llvm_build.c
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
28bb6873ec
ac/nir: move trim_vector to ac_llvm_build.c
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
895632baef
ac/nir: move cast_ptr() to ac_llvm_build.c
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Samuel Pitoiset
bf6368297b
ac/nir: move ac_build_alloca() to ac_llvm_build.c
As well as si_build_alloca_undef() and drop the si prefix.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 14:05:06 +01:00
Timothy Arceri
42627dabb4
ac: add if/loop build helpers
These have been ported over from radeonsi.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 10:12:34 +11:00
Samuel Pitoiset
675dde13b2
ac: update enabled channels mask when optimizing PARAM exports
When the mask is not 0xf we need to update the number of
enabled channels, otherwise the hardware won't emit the
components that are combined.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-03-06 10:37:52 +01:00
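The relation between a 4-bit channel mask and the number of enabled channels mentioned in the commit above can be sketched as follows. This is only an illustration, not the Mesa code: the helper name `count_enabled_channels` is hypothetical, and the sketch assumes bit N of the mask enables channel N (x, y, z, w).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of Mesa): count how many of the four
 * channels (x, y, z, w) are enabled in a 4-bit export mask. */
static unsigned count_enabled_channels(uint8_t mask)
{
    unsigned count = 0;

    /* Only the low four bits are meaningful for a 4-channel export. */
    assert((mask & ~0xfu) == 0);

    for (unsigned chan = 0; chan < 4; chan++) {
        if (mask & (1u << chan))
            count++;
    }
    return count;
}
```

Under these assumptions, a full mask of 0xf yields four channels, while a partial mask such as 0x3 yields two, which is the case the commit message says needs an updated channel count.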
Samuel Pitoiset
322a51b549
ac: add ac_build_fsign()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-05 11:04:36 +01:00
Samuel Pitoiset
e8bdde2289
ac: add ac_build_isign()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-05 11:04:32 +01:00
Samuel Pitoiset
459e33900f
ac: add ac_build_fract()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-05 11:04:30 +01:00
Marek Olšák
931ec80eeb
radeonsi: implement 32-bit pointers in user data SGPRs (v2)
User SGPRs changes:
VS: 14 -> 9
TCS: 14 -> 10
TES: 10 -> 6
GS: 8 -> 4
GSCOPY: 2 -> 1
PS: 9 -> 5
Merged VS-TCS: 24 -> 16
Merged VS-GS: 18 -> 11
Merged TES-GS: 18 -> 11
SGPRS: 2170102 -> 2158430 (-0.54 %)
VGPRS: 1645656 -> 1641516 (-0.25 %)
Spilled SGPRs: 9078 -> 8810 (-2.95 %)
Spilled VGPRs: 130 -> 114 (-12.31 %)
Scratch size: 1508 -> 1492 (-1.06 %) dwords per thread
Code Size: 52094872 -> 52692540 (1.15 %) bytes
Max Waves: 371848 -> 372723 (0.24 %)
v2: - the shader cache needs to take address32_hi into account
- set amdgpu-32bit-address-high-bits
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)
2018-02-17 04:52:17 +01:00
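As a rough illustration of the 32-bit pointer idea in the commit above: if only the low 32 bits of an address are passed in a user SGPR, the full 64-bit address can be rebuilt from a fixed high half. The sketch below assumes `address32_hi` holds those shared upper 32 bits; the function names are hypothetical and this is not the driver's actual code path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rebuild a 64-bit address from a 32-bit pointer,
 * assuming all such pointers share the same upper 32 bits. */
static uint64_t expand_ptr32(uint32_t lo, uint32_t address32_hi)
{
    return ((uint64_t)address32_hi << 32) | lo;
}

/* Hypothetical helper: shrink a 64-bit address to its low 32 bits.
 * This is only valid when the upper half matches address32_hi. */
static uint32_t truncate_ptr32(uint64_t addr, uint32_t address32_hi)
{
    assert((uint32_t)(addr >> 32) == address32_hi);
    return (uint32_t)addr;
}
```

Because the rebuilt address depends on the high half, any cached shader that bakes that value in is only valid for the same `address32_hi`, which is consistent with the v2 note about the shader cache.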
Timothy Arceri
12a2350e6d
ac: add 64bit support to ac_find_lsb()
v2: use LLVMBuildTrunc()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 09:42:59 +11:00
Timothy Arceri
a9f6b392c7
ac: move get_elem_bits() to ac_llvm_build.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 09:42:59 +11:00
Samuel Pitoiset
bd9f7b7635
ac: add ac_build_export_null() helper
Imported from RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-08 22:11:42 +01:00
Timothy Arceri
b7b89bbddb
ac/radeonsi: create ac_build_shader_clock() helper
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Marek Olšák
3bf1e036e8
amd: remove support for LLVM 3.9
Only these are supported:
- LLVM 4.0
- LLVM 5.0
- LLVM 6.0
- master (7.0)
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-02 23:47:40 +01:00
Marek Olšák
847d0a393d
radeonsi: use pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-02 16:46:22 +01:00
Marek Olšák
bac9fa9f17
ac: add glc parameter to ac_build_buffer_load_format
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
be973ed21f
radeonsi: load the right number of components for VS inputs and TBOs
The supported counts are 1, 2, 4. (3=4)
The following snippet loads float, vec2, vec3, and vec4:
Before:
buffer_load_format_x v9, v4, s[0:3], 0 idxen ; E0002000 80000904
buffer_load_format_xyzw v[0:3], v5, s[8:11], 0 idxen ; E00C2000 80020005
s_waitcnt vmcnt(0) ; BF8C0F70
buffer_load_format_xyzw v[2:5], v6, s[12:15], 0 idxen ; E00C2000 80030206
s_waitcnt vmcnt(0) ; BF8C0F70
buffer_load_format_xyzw v[5:8], v7, s[4:7], 0 idxen ; E00C2000 80010507
After:
buffer_load_format_x v10, v4, s[0:3], 0 idxen ; E0002000 80000A04
buffer_load_format_xy v[8:9], v5, s[8:11], 0 idxen ; E0042000 80020805
buffer_load_format_xyzw v[0:3], v6, s[12:15], 0 idxen ; E00C2000 80030006
s_waitcnt vmcnt(0) ; BF8C0F70
buffer_load_format_xyzw v[3:6], v7, s[4:7], 0 idxen ; E00C2000 80010307
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
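The rule "supported counts are 1, 2, 4 (3=4)" from the commit above can be written as a tiny mapping from the number of components a shader needs to the number of channels actually loaded. This is a minimal sketch with a hypothetical function name, not the radeonsi implementation.

```c
#include <assert.h>

/* Hypothetical helper: map the number of components needed (1..4) to
 * the number of channels loaded. A 3-component load is widened to 4,
 * matching the "(3=4)" note in the commit message. */
static unsigned load_channel_count(unsigned num_components)
{
    assert(num_components >= 1 && num_components <= 4);
    return num_components == 3 ? 4 : num_components;
}
```

This matches the "After" listing in the commit: float uses `buffer_load_format_x`, vec2 uses `buffer_load_format_xy`, and both vec3 and vec4 use `buffer_load_format_xyzw`.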
Dave Airlie
16dd0eb517
ac/llvm: bump the number of results to 8.
This function can get access for a 64-bit dvec4, which means we
have to load 8 components.
This fixes:
R600_DEBUG=nir ./bin/shader_runner generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-abs-dvec4.shader_test -auto
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 05:37:16 +10:00
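The arithmetic behind "a 64-bit dvec4 means 8 components" in the commit above is simply that each 64-bit component occupies two 32-bit slots. A minimal sketch, with a hypothetical helper name and assuming results are counted in 32-bit units:

```c
#include <assert.h>

/* Hypothetical helper: number of 32-bit result slots needed for a
 * vector with the given component count and per-component bit size. */
static unsigned num_32bit_results(unsigned num_components, unsigned bit_size)
{
    assert(bit_size == 32 || bit_size == 64);
    assert(num_components >= 1 && num_components <= 4);
    return num_components * (bit_size / 32);
}
```

Under that assumption a vec4 needs 4 slots and a dvec4 needs 8, so a results array sized for 4 entries would be too small for the dvec4 case.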
Marek Olšák
b633999a4e
ac: rename and move si_const_array into common code
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Samuel Pitoiset
51e14bc3c0
ac: pass the number of channels to ac_build_buffer_load_format()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 12:14:27 +01:00
Samuel Pitoiset
d7c93b558a
ac: add ac_build_buffer_load_common() helper
For both versions of llvm.amdgcn.buffer.load.{format}.*.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 12:14:27 +01:00
Timothy Arceri
5b9362c248
ac: fix ac_build_varying_gather_values() for packed layouts
This fixes a segfault for varyings not starting at component 0.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-23 10:00:52 +11:00
Timothy Arceri
38876c88d1
ac: add i64_0 and i64_1 to llvm build context
These will be used in the following patch.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-14 11:40:03 +11:00
Timothy Arceri
d7b6b8ba52
ac: add f64_0 to the llvm build context
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-12 09:29:18 +11:00
Timothy Arceri
c0eb304acd
ac: add f64_1 to the llvm build context
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-12 09:29:17 +11:00
Samuel Pitoiset
7239e265eb
amd/common: import get_{load,store}_intr_attribs() from RadeonSI
v2: move those helpers to the header and use static inline
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
2018-01-10 19:02:23 +01:00
Marek Olšák
a140aeb619
ac: add ac_build_fmin/fmax helpers
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-06 09:51:43 +01:00
Timothy Arceri
4a0c24f2dd
ac: rework ac_llvm_extract_elem()
Simplifies the logic a little and asserts index is 0.
Suggested-by: Nicolai Hähnle <nhaehnle@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 12:20:38 +11:00
Timothy Arceri
b99ebaa4fd
ac: move some helpers to ac_llvm_build.c
We will call these from the radeonsi NIR backend.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Samuel Pitoiset
03ef264146
amd/common: pass the family to ac_llvm_context_init()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-22 10:38:44 +01:00
Samuel Pitoiset
225b198802
amd/common: add ac_build_waitcnt()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:24:44 +01:00
Samuel Pitoiset
d43e72fd8c
radeonsi: make use of ac_build_fdiv()
And move the comment to amd/common.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:24:38 +01:00
Timothy Arceri
caf15ce670
ac: move build_varying_gather_values() to ac_llvm_build.h and expose
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:19 +11:00
Timothy Arceri
7f4966731f
ac: add v2f32 to the common code and make use of it
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
ee376ac6f4
ac: add v3i32 to the common code and make use of it
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
309a51411d
ac: add v2i32 to the common code and use it
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Dave Airlie
82d47b9d38
ac/llvm: consolidate find lsb function.
This was the same between si and ac.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:31 +10:00
Dave Airlie
a76b6c2192
ac/llvm: add i1false/i1true to common code.
These get used in a fair few places.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:18 +10:00
Dave Airlie
f925f5b074
ac/nir: move lds declaration/load/store into shared code.
This was duplicated between both drivers, share here.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:11 +10:00
Marek Olšák
2a414c3961
radeonsi: postponed KILL isn't postponed anymore, but maintains WQM
This restores performance for the drirc workaround, i.e.
KILL_IF does:
visible = src0 >= 0;
kill_flag &= visible; // accumulate kills
amdgcn_kill(wqm_vote(visible)); // kill fully dead quads only
And all helper pixels are killed at the end of the shader:
amdgcn_kill(kill_flag);
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
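The KILL_IF scheme described in the commit above can be modelled on the CPU for a single 2x2 quad. The sketch below is an assumption-laden illustration, not driver code: `quad_model`, `quad_init`, `kill_if` and `lane_survives` are hypothetical names, `wqm_vote` is modelled as "true for the whole quad if any lane is true", and `amdgcn_kill(x)` is modelled as "stop lanes where x is false".

```c
#include <assert.h>
#include <stdbool.h>

#define QUAD_SIZE 4

/* Hypothetical CPU-side model of one 2x2 pixel quad; not Mesa code. */
struct quad_model {
    bool kill_flag[QUAD_SIZE]; /* accumulated visibility, starts all true */
    bool running;              /* false once the whole quad was killed */
};

static void quad_init(struct quad_model *q)
{
    for (int i = 0; i < QUAD_SIZE; i++)
        q->kill_flag[i] = true;
    q->running = true;
}

/* Models wqm_vote(): true for the whole quad if any lane votes true. */
static bool wqm_vote(const bool visible[QUAD_SIZE])
{
    for (int i = 0; i < QUAD_SIZE; i++) {
        if (visible[i])
            return true;
    }
    return false;
}

/* Models KILL_IF: accumulate per-lane kills, but stop execution only
 * when every lane in the quad is dead. */
static void kill_if(struct quad_model *q, const float src0[QUAD_SIZE])
{
    bool visible[QUAD_SIZE];

    assert(q);
    for (int i = 0; i < QUAD_SIZE; i++) {
        visible[i] = src0[i] >= 0.0f;
        q->kill_flag[i] = q->kill_flag[i] && visible[i];
    }
    if (!wqm_vote(visible))
        q->running = false;
}

/* Models the final amdgcn_kill(kill_flag) at the end of the shader. */
static bool lane_survives(const struct quad_model *q, int lane)
{
    assert(lane >= 0 && lane < QUAD_SIZE);
    return q->running && q->kill_flag[lane];
}
```

In this model a partially dead quad keeps all four lanes executing until the end of the shader, and only then are the individually killed lanes discarded, which is how the commit message describes maintaining WQM.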
Marek Olšák
478afbe525
ac: use llvm.amdgcn.kill with LLVM 6.0
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
Marek Olšák
1ff9e27cbd
ac: replace ac_build_kill with ac_build_kill_if_false
This will be a new LLVM intrinsic and will also work nicely with
llvm.amdgcn.wqm.vote.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
Eric Anholt
34c04c734f
ac: Fix a compiler warning for possibly undefined "name"
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-23 10:14:40 -07:00
Dave Airlie
1dda214d9c
ac/nir: init full exec mask for merged shaders.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 01:50:40 +02:00
Marek Olšák
854593b8eb
ac: clean up ac_build_indexed_load function interfaces
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák
bcd3e761a3
ac: properly document a buffer.store LLVM workaround
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
94d800bfa3
ac: silence a warning
2017-10-04 17:00:05 +02:00
Nicolai Hähnle
052b974fed
amd/common: move ac_build_phi from radeonsi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 12:17:15 +02:00
Nicolai Hähnle
a6ea4c1b93
amd/common: save an instruction in the build_cube_select sequence
Avoid a v_cndmask: the absolute value is free due to input modifiers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 11:43:07 +02:00