Lionel Landwerlin
5a9f8d21d0
nir/lower_shader_calls: lower scratch access to format internally
...
For a follow up optimization, we would like to track scratch loads.
This isn't possible with global load/store intrinsics. So use a couple
of special intrinsic in the pass and only lower it to global
intrinsics at the end.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16556 >
2022-10-26 12:53:25 +00:00
Rhys Perry
382831c986
radv,nir: add intrinsics for streamout and GS copy shaders
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19302 >
2022-10-25 17:35:08 +00:00
Qiang Yu
7fb506d068
nir: add nir_load_prim_xfb_query_enabled_amd
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17457 >
2022-10-25 12:58:43 +00:00
Qiang Yu
83643e4dc8
nir,ac/nir/ngg,radv: split shader_query_enabled_amd
...
For used by different counter.
Vulkan:
1. VK_QUERY_PIPELINE_STATISTIC_GEOMETRY_SHADER_PRIMITIVES_BIT,
sum generated primitives of all 4 streams when GS.
2. VK_QUERY_TYPE_PRIMITIVES_GENERATED_EXT, count generated
primitives for all 4 streams when VS/TES/GS.
3. VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT, count generated
and streamout primitives for all 4 streams when VS/TES/GS.
OpenGL:
1. GL_GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB, sum generated
primitives for all 4 streams when GS.
2. GL_PRIMITIVES_GENERATED, count generated primitives for all 4
streams when VS/TES/GS.
3. GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN, count streamout
primitives for all 4 streams when VS/TES/GS.
pipeline_stat_query_enabled_amd is for Vulkan 1 and OpenGL 1.
xfb_query_enabled_amd is for Vulkan 2/3 and OpenGL 2/3.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19015 >
2022-10-25 02:42:52 +00:00
Samuel Pitoiset
09033c7b22
nir: add nir_intrinsic_load_ring_attr_{offset}_amd
...
These intrinsics will be used to lower NGG attributes to memory on
GFX11.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19173 >
2022-10-20 15:59:44 +00:00
Qiang Yu
58e006b174
nir,ac/llvm,radv: add nir_intrinsic_load_provoking_vtx_in_prim_amd
...
For radeonsi which load this from arg.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19166 >
2022-10-20 06:53:56 +00:00
Samuel Pitoiset
dd30e7bfa0
nir: add nir_load_rasterization_samples_amd
...
This will be used to load the number of rasterization samples when a
fragment shader is compiled inside a library without the MSAA state.
RADV needs to know the number of samples for loading sample positions
with interpolateAtSample().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18677 >
2022-09-21 10:30:33 +00:00
Samuel Pitoiset
7f444fc72c
nir: add nir_intrinsic_load_sample_positions_amd
...
This will be used to lower barycentric_at_sample in NIR for RADV.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18615 >
2022-09-20 09:52:37 +00:00
Rhys Perry
7d26fafacf
radv: fix dynamic RT stack size with VGPR spilling
...
VGPR spilling might cause VGPRs to be spilled at scratch offset 0, so we
can't use that.
fossil-db (Sienna Cichlid, Q2RTX and Control):
Totals from 4 (0.26% of 1524) affected shaders:
Instrs: 8734 -> 8737 (+0.03%)
CodeSize: 48492 -> 48504 (+0.02%)
Latency: 384375 -> 384369 (-0.00%)
InvThroughput: 256250 -> 256246 (-0.00%)
Copies: 1312 -> 1313 (+0.08%)
Branches: 256 -> 258 (+0.78%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18541 >
2022-09-20 01:39:20 +00:00
Qiang Yu
4e06a8f15e
nir: add nir_intrinsic_ordered_xfb_counter_add_amd
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654 >
2022-09-16 08:51:28 +00:00
Qiang Yu
1119e06a45
nir,ac/llvm: add nir_intrinsic_load_ordered_id_amd
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654 >
2022-09-16 08:51:28 +00:00
Qiang Yu
5c2d710064
nir: add nir_intrinsic_load_streamout_buffer_amd
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654 >
2022-09-16 08:51:28 +00:00
Qiang Yu
2ae357aa23
nir: add nir_intrinsic_load_num_vertices_per_primitive_amd
...
This is used in streamout as radeonsi pass this value for VS
by arg.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654 >
2022-09-16 08:51:28 +00:00
Qiang Yu
a19dcdf9d5
nir,ac/llvm: add nir_intrinsic_load_viewport_xy_scale_and_offset
...
Used by RADV/Radeonsi NGG culling. Pack them into a single vec4
load for radeonsi to reduce const buffer load.
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651 >
2022-08-26 05:50:30 +00:00
Qiang Yu
1aef9c8318
nir,ac/llvm: add nir_intrinsic_load_half_line_width_amd
...
Used by AMD GPU NGG line culling. We could use nir load
line width and viewport scale to calculate this in shader,
but this way needs expensive divide ops.
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17651 >
2022-08-26 05:50:30 +00:00
Marek Olšák
6483fd394e
nir: add nir_intrinsic_image_descriptor_amd
...
This returns the AMD shader resource descriptor.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com >
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17693 >
2022-08-03 17:44:15 +00:00
Marek Olšák
ea6993f9c7
nir: add nir_intrinsic_image_samples_identical
...
radeonsi will use it
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com >
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17693 >
2022-08-03 17:44:15 +00:00
Iago Toral Quiroga
b18cecbfb6
nir: add nir_address_format_2x32bit_global
...
This adds support for global 64-bit GPU addresses as a pair of
32-bit values. This is useful for platforms with 32-bit GPUs
that want to support VK_KHR_buffer_device_address, which makes
GPU addresses explicitly 64-bit.
With the new format we also add new global intrinsics with 2x32
suffix that consume the new address format.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17275 >
2022-07-19 09:47:34 +02:00
Arvind Yadav
8adbd2a964
ac/llvm: Implement nir_intrinsic_load_point_coord_maybe_flipped opcodes
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15117 >
2022-07-16 07:08:10 -04:00
Qiang Yu
fdf589321c
ac/nir: add nir_intrinsic_load_hs_out_patch_data_offset_amd
...
Also add radv and radeonsi implementation. Will be used in tess lowering.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705 >
2022-06-27 02:38:21 +00:00
Konstantin Seurer
16585664cd
radv: vkCmdTraceRaysIndirect2KHR
...
This changes the trace rays logic to always use
VkTraceRaysIndirectCommand2KHR and implements
vkCmdTraceRaysIndirect2KHR. I renamed the
load_sbt_amd to sbt_base_amd and moved the SBT
load lowering from ACO to NIR.
Note that we can not just upload one pointer to
all the trace parameters because that would
be incompatible with traceRaysIndirect.
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16430 >
2022-06-08 20:20:21 +00:00
Timur Kristóf
02c87e66e9
nir: Introduce new intrinsics for AMD specific mesh shader task ring.
...
The mesh shader task ring is a buffer in VRAM which we will use to
store some mesh shader outputs that don't fit into LDS.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16737 >
2022-06-08 08:43:51 +00:00
Qiang Yu
33b4b923ee
nir: add nir_intrinsic_load_lshs_vertex_stride_amd
...
For loading LS-HS vertex stride by shader argument in radeonsi.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16418 >
2022-06-07 01:40:14 +00:00
Alyssa Rosenzweig
5c79d649af
nir: Add transform feedback system values
...
These will be used to facilitate transform feedback lowering for Panfrost,
although other backends could use the sysvals in the future.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720 >
2022-06-04 14:35:56 +00:00
Lionel Landwerlin
5078b4fff1
nir/divergence: handle load_ray_num_dss_rt_stacks_intel
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16797 >
2022-06-01 04:58:50 +00:00
Lionel Landwerlin
d3c1b0ac28
nir/divergence: handle load_scratch_base_ptr
...
v2: divergent (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16797 >
2022-06-01 04:58:50 +00:00
Marcin Ślusarz
b95d9bca1d
nir: add load_task_payload intrinsic to nir_divergence_analysis
...
It's divergent depending on sources.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16668 >
2022-05-24 17:53:29 +00:00
Marcin Ślusarz
95dbdbf063
nir: add load_mesh_inline_data_intel intrinsic to nir_divergence_analysis
...
It's not divergent.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16668 >
2022-05-24 17:53:29 +00:00
Timur Kristóf
47da245ff2
nir: Add explicit task payload atomic intrinsics.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16693 >
2022-05-24 17:21:22 +00:00
Konstantin Seurer
938c9d9615
nir: Add a ray launch size addr intrinsic
...
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15712 >
2022-05-12 15:04:31 +00:00
Timur Kristóf
7de3034897
ac/nir: Add I/O lowering for task and mesh shaders.
...
Task shaders store their output payload to VRAM where mesh
shaders read from. There are two ring buffers:
1. Draw ring: this is where mesh dispatch sizes and
the ready bit are stored.
2. Payload ring: this is where the optional payload
is stored (up to 16K per task workgroup).
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14929 >
2022-05-12 00:29:51 +00:00
Jordan Justen
1c3e584dfa
nir/divergence: handle more *_intel intrinsics
...
v2: fix topo/btd (Lionel)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16421 >
2022-05-10 08:49:58 +00:00
Lionel Landwerlin
9f44a26462
nir/divergence: handle load_global_block_intel
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Fixes: dd39e311b3
("nir: Add nir_intrinsic_{load,store}_deref_block_intel")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16075 >
2022-04-21 17:41:04 +00:00
Rhys Perry
8ff122f8b8
nir: add load_shared2_amd and store_shared2_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778 >
2022-04-13 23:08:07 +00:00
Rhys Perry
5c038b3f02
nir: add _amd global access intrinsics
...
These are the same as the normal ones, but they take an unsigned 32-bit
offset in BASE and another unsigned 32-bit offset in the last source.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Kenneth Graunke
af529b545a
nir: Teach nir_divergence_analysis about Intel-specific intrinsics
...
- load_reloc_const is just an immediate constant load, it's convergent.
- nir_intrinsic_load_global_const_block_intel should be convergent,
it says the address must be uniform, and we uniformize the predicate
- Lowered image intrinsics: image_deref_load_param_intel just reads
information about an image, as long as the image variable is
convergent it should be too. load_raw_intel...if the address we
come up with is convergent, it ought to be as well.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484 >
2022-03-26 00:28:19 +00:00
Rhys Perry
ff52a724d4
nir: add load_{scalar,vector}_arg_amd and load_smem_amd intrinsics
...
load_smem_gcn is similar to load_global/load_global_constant, but it's
guaranteed to use SMEM and it's much easier to utilize the format's 32-bit
offset source.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12773 >
2022-03-22 16:33:27 +00:00
Kenneth Graunke
85d30846db
nir: Print divergence status of SSA values if analysis was ever run.
...
After running divergence analysis, we include "div" or "con" for each
SSA def's divergence/convergence status:
vec1 32 div ssa_35 = fddy ssa_34
vec1 32 con ssa_36 = fddy ssa_6.x
We omit this before the first time divergence analysis has been run.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15445 >
2022-03-21 22:46:34 -07:00
Timur Kristóf
4b99b528f5
nir: Introduce workgroup_index and ability to lower workgroup_id to it.
...
The workgroup_index is intended for situations when a 3 dimensional
workgroup_id is not available on the HW, but a 1 dimensional index is.
In this case, we can use lower the 3D ID to use this.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15103 >
2022-03-08 17:36:31 +00:00
Samuel Pitoiset
74b932f8d3
nir: add nir_intrinsic_load_vrs_rates_amd
...
This intrinsic specific to RADV will be used to load VRS rates from
an user SGPR when RADV_FORCE_VRS is enabled by the application.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14713 >
2022-02-16 08:11:12 +01:00
Jason Ekstrand
e8acc5a7ea
nir: Add a new sample_pos_or_center system value
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14198 >
2021-12-17 16:02:16 +00:00
Marek Olšák
2785141c16
nir: add nir_has_divergent_loop function
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13966 >
2021-12-11 20:07:35 +00:00
Marek Olšák
2ab310b78b
nir: handle more intrinsics in divergence analysis
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13966 >
2021-12-11 20:07:35 +00:00
Jason Ekstrand
956199e870
nir: s/nir_var_mem_image/nir_var_image/g
...
We typically use nir_var_mem_* for stuff that has an explicit byte-based
memory layout. Images are opaque.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13386 >
2021-10-16 03:47:10 +00:00
Caio Marcelo de Oliveira Filho
de3705edb0
nir: Add nir_var_mem_image
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4743 >
2021-10-15 14:58:55 +00:00
Bas Nieuwenhuizen
0d8bd8518d
nir: Support ray launch size in divergence analysis.
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12592 >
2021-09-21 01:53:39 +00:00
Bas Nieuwenhuizen
56b06c09b4
nir: Add AMD rt intrinsics.
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12592 >
2021-09-21 01:53:39 +00:00
Caio Marcelo de Oliveira Filho
27697d5eb8
nir/divergence_analysis: Handle Task/Mesh shaders
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600 >
2021-08-28 03:56:42 +00:00
Timur Kristóf
1bbea90f50
aco, nir, ac: Simplify sequence of getting initial NGG VS edge flags.
...
Instead of v_bfe + v_lshl_or for each vertex, get all 3 edge flags
at once of every vertex. This takes fewer VALU instructions than
previously.
Fossil DB results on Sienna Cichlid (with NGGC on):
Totals from 56917 (44.24% of 128647) affected shaders:
CodeSize: 161028288 -> 158751628 (-1.41%)
Instrs: 30917985 -> 30519571 (-1.29%)
Latency: 130617204 -> 129975532 (-0.49%); split: -0.50%, +0.01%
InvThroughput: 21280238 -> 20927401 (-1.66%)
Copies: 3011120 -> 3011125 (+0.00%); split: -0.00%, +0.00%
No Fossil DB changed with NGGC off.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11908 >
2021-08-02 11:38:25 +00:00
Timur Kristóf
48e638ab29
nir: Add AMD specific intrinsics for NGG shader based culling.
...
The new intrinsics fall into the following categories:
1. New viewport intrinsics:
For missing components that we need.
RADV will emit new SGPR arguments which will contain the
viewport information for culling shaders. These are used to
compute the screen space coordinates for small primitive culling.
2. load_cull_xxx:
Load the culling settings in runtime.
These will be a new SGPR argument in RADV.
3. overwrite_xxx:
These are needed because system values such as vertex and
instance ID are not writeable, but we need to change them
after repacking shader invocations of VS and TES.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525 >
2021-07-13 23:56:33 +00:00