Timothy Arceri
42627dabb4
ac: add if/loop build helpers
...
These have been ported over from radeonsi.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-08 10:12:34 +11:00
Samuel Pitoiset
322a51b549
ac: add ac_build_fsign()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-05 11:04:36 +01:00
Samuel Pitoiset
e8bdde2289
ac: add ac_build_isign()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-05 11:04:32 +01:00
Samuel Pitoiset
459e33900f
ac: add ac_build_fract()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-05 11:04:30 +01:00
Marek Olšák
931ec80eeb
radeonsi: implement 32-bit pointers in user data SGPRs (v2)
...
User SGPRs changes:
VS: 14 -> 9
TCS: 14 -> 10
TES: 10 -> 6
GS: 8 -> 4
GSCOPY: 2 -> 1
PS: 9 -> 5
Merged VS-TCS: 24 -> 16
Merged VS-GS: 18 -> 11
Merged TES-GS: 18 -> 11
SGPRS: 2170102 -> 2158430 (-0.54 %)
VGPRS: 1645656 -> 1641516 (-0.25 %)
Spilled SGPRs: 9078 -> 8810 (-2.95 %)
Spilled VGPRs: 130 -> 114 (-12.31 %)
Scratch size: 1508 -> 1492 (-1.06 %) dwords per thread
Code Size: 52094872 -> 52692540 (1.15 %) bytes
Max Waves: 371848 -> 372723 (0.24 %)
v2: - the shader cache needs to take address32_hi into account
    - set amdgpu-32bit-address-high-bits
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)
2018-02-17 04:52:17 +01:00
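The percentages in the shader statistics of the commit above are relative changes against the old value. A minimal sketch of that arithmetic, assuming a hypothetical helper named percent_change (it is illustrative only and not part of Mesa):

```c
/* Hypothetical helper, not a Mesa function: relative change from
 * 'before' to 'after', expressed in percent of 'before'. */
double percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

For example, percent_change(2170102, 2158430) gives roughly -0.54, which matches the SGPRS line, and percent_change(130, 114) gives roughly -12.31, which matches the Spilled VGPRs line.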
Bas Nieuwenhuizen
7461bd5b8f
ac: Use the renumbered const address space for LLVM 7.
...
The LLVM AMDGPU backend decided to renumber the constant address
space ....
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-14 01:05:03 +01:00
Timothy Arceri
a9f6b392c7
ac: move get_elem_bits() to ac_llvm_build.c
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-09 09:42:59 +11:00
Samuel Pitoiset
bd9f7b7635
ac: add ac_build_export_null() helper
...
Imported from RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-08 22:11:42 +01:00
Timothy Arceri
b7b89bbddb
ac/radeonsi: create ac_build_shader_clock() helper
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-02-07 08:43:08 +11:00
Marek Olšák
847d0a393d
radeonsi: use pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-02 16:46:22 +01:00
Marek Olšák
bac9fa9f17
ac: add glc parameter to ac_build_buffer_load_format
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
be973ed21f
radeonsi: load the right number of components for VS inputs and TBOs
...
The supported counts are 1, 2, 4. (3=4)
The following snippet loads float, vec2, vec3, and vec4:
Before:
buffer_load_format_x v9, v4, s[0:3], 0 idxen ; E0002000 80000904
buffer_load_format_xyzw v[0:3], v5, s[8:11], 0 idxen ; E00C2000 80020005
s_waitcnt vmcnt(0) ; BF8C0F70
buffer_load_format_xyzw v[2:5], v6, s[12:15], 0 idxen ; E00C2000 80030206
s_waitcnt vmcnt(0) ; BF8C0F70
buffer_load_format_xyzw v[5:8], v7, s[4:7], 0 idxen ; E00C2000 80010507
After:
buffer_load_format_x v10, v4, s[0:3], 0 idxen ; E0002000 80000A04
buffer_load_format_xy v[8:9], v5, s[8:11], 0 idxen ; E0042000 80020805
buffer_load_format_xyzw v[0:3], v6, s[12:15], 0 idxen ; E00C2000 80030006
s_waitcnt vmcnt(0) ; BF8C0F70
buffer_load_format_xyzw v[3:6], v7, s[4:7], 0 idxen ; E00C2000 80010307
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
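The rule stated in the commit above (supported counts are 1, 2 and 4, with 3 treated as 4) can be sketched as a small mapping. The function name supported_load_components is hypothetical and only illustrates the rounding; it is not the code from the commit:

```c
/* Hypothetical helper, not a Mesa function: map a requested component
 * count (1..4) to a count the load supports. Per the commit message the
 * supported counts are 1, 2 and 4, so a 3-component request loads 4. */
unsigned supported_load_components(unsigned requested)
{
    return requested == 3 ? 4 : requested;
}
```

This agrees with the "After" listing: the float input uses buffer_load_format_x, the vec2 uses buffer_load_format_xy, and both the vec3 and the vec4 use buffer_load_format_xyzw.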
Marek Olšák
b633999a4e
ac: rename and move si_const_array into common code
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Marek Olšák
e17eb8800f
ac: move address space definitions to common code
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-27 02:09:09 +01:00
Samuel Pitoiset
51e14bc3c0
ac: pass the number of channels to ac_build_buffer_load_format()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-26 12:14:27 +01:00
Timothy Arceri
38876c88d1
ac: add i64_0 and i64_1 to llvm build context
...
These will be used in the following patch.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-14 11:40:03 +11:00
Timothy Arceri
d7b6b8ba52
ac: add f64_0 to the llvm build context
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-12 09:29:18 +11:00
Timothy Arceri
c0eb304acd
ac: add f64_1 to the llvm build context
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-12 09:29:17 +11:00
Marek Olšák
a140aeb619
ac: add ac_build_fmin/fmax helpers
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-01-06 09:51:43 +01:00
Timothy Arceri
b99ebaa4fd
ac: move some helpers to ac_llvm_build.c
...
We will call these from the radeonsi NIR backend.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-05 11:58:55 +11:00
Samuel Pitoiset
03ef264146
amd/common: pass the family to ac_llvm_context_init()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-22 10:38:44 +01:00
Samuel Pitoiset
225b198802
amd/common: add ac_build_waitcnt()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:24:44 +01:00
Timothy Arceri
caf15ce670
ac: move build_varying_gather_values() to ac_llvm_build.h and expose
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-04 12:52:19 +11:00
Timothy Arceri
7f4966731f
ac: add v2f32 to the common code and make use of it
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:46 +11:00
Timothy Arceri
ee376ac6f4
ac: add v3i32 to the common code and make use of it
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Timothy Arceri
309a51411d
ac: add v2i32 to the common code and use it
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-03 14:54:45 +11:00
Dave Airlie
82d47b9d38
ac/llvm: consolidate find lsb function.
...
This was the same between si and ac.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:31 +10:00
Dave Airlie
a76b6c2192
ac/llvm: add i1false/i1true to common code.
...
These get used in a fair few places.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:18 +10:00
Dave Airlie
f925f5b074
ac/nir: move lds declaration/load/store into shared code.
...
This was duplicated between both drivers; share it here.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 15:59:11 +10:00
Marek Olšák
2a414c3961
radeonsi: postponed KILL isn't postponed anymore, but maintains WQM
...
This restores performance for the drirc workaround, i.e.
KILL_IF does:
visible = src0 >= 0;
kill_flag &= visible; // accumulate kills
amdgcn_kill(wqm_vote(visible)); // kill fully dead quads only
And all helper pixels are killed at the end of the shader:
amdgcn_kill(kill_flag);
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
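The KILL_IF scheme in the commit above can be modelled on the CPU for a single 2x2 quad. This is only a sketch under stated assumptions: the struct and function names are hypothetical, wqm_vote is modelled as "true for the whole quad if any lane is true", and a kill is modelled as clearing a lane's alive bit when its condition is false. It is not the radeonsi implementation.

```c
#include <stdbool.h>

#define QUAD_SIZE 4

/* Hypothetical CPU model of one quad; not a Mesa data structure. */
struct quad {
    bool alive[QUAD_SIZE];     /* lane is still executing */
    bool kill_flag[QUAD_SIZE]; /* stays true while the lane was never killed */
};

void quad_init(struct quad *q)
{
    for (int i = 0; i < QUAD_SIZE; i++) {
        q->alive[i] = true;
        q->kill_flag[i] = true;
    }
}

/* Assumed model of wqm_vote: true if any lane in the quad is true. */
static bool wqm_vote(const bool v[QUAD_SIZE])
{
    for (int i = 0; i < QUAD_SIZE; i++)
        if (v[i])
            return true;
    return false;
}

/* KILL_IF: accumulate per-lane kills, but only stop fully dead quads. */
void kill_if(struct quad *q, const float src0[QUAD_SIZE])
{
    bool visible[QUAD_SIZE];

    for (int i = 0; i < QUAD_SIZE; i++) {
        visible[i] = src0[i] >= 0;
        q->kill_flag[i] = q->kill_flag[i] && visible[i];
    }
    if (!wqm_vote(visible)) {
        for (int i = 0; i < QUAD_SIZE; i++)
            q->alive[i] = false;
    }
}

/* End of shader: lanes whose kill was accumulated are killed now. */
void end_of_shader(struct quad *q)
{
    for (int i = 0; i < QUAD_SIZE; i++)
        q->alive[i] = q->alive[i] && q->kill_flag[i];
}
```

In this model a lane with a negative src0 keeps running as a helper while any neighbour in its quad is visible, which is how WQM is maintained, and it is only dropped by end_of_shader.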
Marek Olšák
1ff9e27cbd
ac: replace ac_build_kill with ac_build_kill_if_false
...
This will be a new LLVM intrinsic and will also work nicely with
llvm.amdgcn.wqm.vote.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-24 14:56:34 +02:00
Dave Airlie
1dda214d9c
ac/nir: init full exec mask for merged shaders.
...
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 01:50:40 +02:00
Marek Olšák
854593b8eb
ac: clean up ac_build_indexed_load function interfaces
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák
bcd3e761a3
ac: properly document a buffer.store LLVM workaround
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Nicolai Hähnle
052b974fed
amd/common: move ac_build_phi from radeonsi
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 12:17:15 +02:00
Nicolai Hähnle
6772452e4c
amd/common: remove has_ds_bpermute argument from ac_build_ddxy
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-18 11:25:18 +02:00
Nicolai Hähnle
3db86d86ed
amd/common: add chip_class to ac_llvm_context
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-18 11:25:18 +02:00
Nicolai Hähnle
e0af3bed2c
amd/common: round cube array slice in ac_prepare_cube_coords
...
The NIR-to-LLVM pass already does this; now the same fix covers
radeonsi as well.
Fixes various tests of
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.*
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-18 11:25:18 +02:00
Connor Abbott
50967cd0b0
ac: move ac_to_integer() and ac_to_float() to ac_llvm_build.c
...
We'll need to use ac_to_integer() for other stuff in ac_llvm_build.c.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-08 04:24:02 +01:00
Connor Abbott
b8a51c8c4b
radeonsi: move the guts of ARB_shader_group_vote emission to ac
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-08 04:12:49 +01:00
Connor Abbott
bd73b89792
radeonsi: move si_emit_ballot() to ac
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-08 04:12:42 +01:00
Connor Abbott
ac27fa7294
radeonsi: move emit_optimization_barrier() to ac
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-08 04:06:47 +01:00
Connor Abbott
c181d4f2b7
radeonsi: move llvm_get_type_size() to ac
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-08 04:04:16 +01:00
Dave Airlie
cb6f16dce9
radeon/ac: use ds_swizzle for derivs on si/cik.
...
This looks like it has been supported since at least llvm 3.9,
so switch radeonsi and radv over to using it; -pro also
uses this. We can now drop creating lds for these operations,
as the ds_swizzle operation doesn't actually write to lds at all.
Acked-by: Marek Olšák <marek.olsak@amd.com>
(stable requested due to fixing radv CIK conformance tests)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-02 00:12:01 +01:00
Nicolai Hähnle
a69afb68c9
radeonsi: use new function ac_build_umin for edgeflag clamping
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:42 +02:00
Nicolai Hähnle
ac2ab5acad
ac/nir: add always_vector argument to ac_build_gather_values_extended
...
This simplifies a bunch of places that no longer need special treatment
of value_count == 1. We rely on LLVM to optimize away the 1-element vector
types.
This fixes a bunch of bugs where 1-element arrays are indexed indirectly.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:42 +02:00
Dave Airlie
ff422500cc
ac/nir: remove last remnants of v16i8
...
llvm doesn't need this workaround anymore.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-28 20:22:30 +01:00
Nicolai Hähnle
edfd3be77e
ac: add ac_llvm_context::v8i32
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-27 10:28:29 +10:00
Nicolai Hähnle
331a574732
ac: add ac_llvm_context::{i,f}32_{0,1}
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-27 10:28:29 +10:00
Nicolai Hähnle
7bf8c944dc
ac: add ac_llvm_context::{i16, i64, f16, f64}
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-27 10:28:29 +10:00