Also moved everything in a struct and then return the struct from
the helper function, so it is clear in the caller what part of the
pipeline gets modified.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
This gives about 2% performance improvement on dota2 for me.
This is mostly a mechanical copy and replacement, but at bind time
we still do:
1) Some stuff that is only based on num_samples changes.
2) Some command buffer state setting.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
This was just found while reading for other stuff,
src/core/hw/gfxip/gfx6/gfx6DepthStencilView.cpp.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We need to enable the pos float location 2 mode anytime we have
persample not just when forced by the frag shader.
This fixes:
dEQP-VK.pipeline.multisample.min_sample_shading*
Fixes: 58c97a079 (radv: enable location at sample when persample is forced.)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is ported from radeonsi and fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_mask.bit_*
v2: don't call this path for radeonsi, it does it in the epilog.
use the radeonsi code path.
v3: handle NULL pCreateInfo->pMultisampleState properly (Samuel)
v3.1: set ps_iter_samples default to 1 (Bas)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: bdcbe7c76 (radv: add sample mask input support)
Signed-off-by: Dave Airlie <airlied@redhat.com>
This can't work for two reasons:
- TESSINNER/TESSOUTER are shader input values, so never translated
to the intrinsic ops
- the shader info pass scans the current stage but we want to know
in TCS, if TES reads the tess factors.
This fixes 6 regressions related to
deqp-vk/tessellation/shader_input_output/tess_level_{inner,outer}_XXX_tes
This reverts commit 5ba1a61648.
Tested with a modified deferred demo and no regressions in a 1.0.2
mustpass run.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
The EXT values are really large, e.g.
VK_DYNAMIC_STATE_DISCARD_RECTANGLE_EXT = 1000099000, so 1 << value
is not going to fit into a 32-bit mask.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
It makes more sense to move all scan stuff in the same place.
Also, we don't really need to duplicate the uses_primid field
for each stages.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Overall it does not really help or hurt. The deferred demo gets 1%
improvement and some games a 3% decrease, so I don't think this
should be enabled by default.
But with the code upstream it is easier to experiment with it.
v2: Remove initializing the registers from si_emit_config.
Reviewed-by: Dave Airlie <airlied@redhat.com>
anv merges the tess info correctly, but radv wasn't doing this.
This fixes hangs in
dEQP-VK.tessellation.winding.default_domain.hlsl_triangles_ccw
Fixes: 60fc0544e0 (radv/pipeline: handle tessellation shader compilation)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Use 16_ABGR instead of 32_ABGR if Z isn't written.
Ported from RadeonSI.
No CTS regressions on Polaris.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
It's really annoying and this pollutes the output especially
when a bunch of non-meta shaders are compiled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
I don't think we will need a 64-bit unsigned integer for the
dirty flags in the future, and there is still 20 bits left.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This register is the same on all gpus so far, so emit it in one
place and also for the pre-gfx9 gpus set the value in the pipeline
creation.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This moves some calculations of register values into the pipeline
construction, it saves looking at outinfo in the cmd buffer emit.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
There's no point recalculating these the whole time on descriptor
emission, just store them at pipeline creation.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This allows an app to query shader statistics and get a disassembly of
a shader. RenderDoc git has support for it, so this allows you to view
shader disassembly from a capture.
When this extension is enabled on a device (or when tracing), we now
disable pipeline caching, since we don't get the shader debug info when
we retrieve cached shaders.
v2: Improvements to resource usage reporting
v3: Disassembly string must be null terminated (string_buffer's length
does not include the terminator)
v4: Fixed LDS reporting. (Bas)
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The beginning of the end for the shader keys. Not entirely sure
what I'm going to replace them with for the compiler though, so this
is the first step.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>