Flushing and invalidating caches isn't necessary for workgroup scope
fences. In fact, the DP_FLUSH_TYPE docs (BSpec 54041) say:
"If the fence scope is Local or Threadgroup, HW ignores the flush
type and operates as if it was set to None(no flush)"
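A minimal sketch of the resulting selection logic, for illustration
only. The enum names and the pick_flush_type() helper are hypothetical
stand-ins, not the identifiers used in the Intel backend:

```c
/* Hypothetical stand-ins for the fence scope and flush type encodings. */
enum fence_scope { SCOPE_THREADGROUP, SCOPE_LOCAL, SCOPE_TILE, SCOPE_GPU };
enum flush_type { FLUSH_NONE, FLUSH_EVICT, FLUSH_INVALIDATE };

/* Request no flush for Local/Threadgroup scope, since the quoted docs say
 * the HW would ignore the flush type there anyway. */
static enum flush_type
pick_flush_type(enum fence_scope scope, enum flush_type requested)
{
   if (scope == SCOPE_THREADGROUP || scope == SCOPE_LOCAL)
      return FLUSH_NONE;

   return requested;
}
```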
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842>
With the new nir_opt_barrier_modes() pass, we may encounter control
barriers with no memory modes set, such as:
@barrier () (execution_scope=WORKGROUP, memory_scope=WORKGROUP, mem_semantics=ACQ|REL, mem_modes=0)
The DXIL validator documentation [1] mentions an
INSTR.BARRIERMODENOMEMORY validation rule:
"sync must include some form of memory barrier - _u (UAV) and/or
_g (Thread Group Shared Memory). Only _t (thread group sync) is
optional."
We were generating a dx.op.barrier instruction with only one flag,
DXIL_BARRIER_MODE_SYNC_THREAD_GROUP. This seems to run afoul of the
above validator rule. So, this patch adjusts the code generator to
set DXIL_BARRIER_MODE_UAV_FENCE_THREAD_GROUP too, whenever
UAV_FENCE_GLOBAL isn't required.
[1] https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/DXIL.rst
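As a rough sketch of the flag selection described above (not the
actual code generator): the enum below uses local stand-ins named after
the flags in this message, with illustrative values, and barrier_flags()
is a hypothetical helper:

```c
#include <stdbool.h>

/* Local stand-ins for the DXIL_BARRIER_MODE_* flags; values are
 * illustrative and not taken from the DXIL headers. */
enum {
   SYNC_THREAD_GROUP      = 1 << 0,
   UAV_FENCE_GLOBAL       = 1 << 1,
   UAV_FENCE_THREAD_GROUP = 1 << 2,
};

/* Always emit some UAV fence: the global one when it is required,
 * otherwise the thread-group one, so a sync is never emitted alone. */
static unsigned
barrier_flags(bool sync_thread_group, bool needs_global_uav_fence)
{
   unsigned flags = 0;

   if (sync_thread_group)
      flags |= SYNC_THREAD_GROUP;

   if (needs_global_uav_fence)
      flags |= UAV_FENCE_GLOBAL;
   else
      flags |= UAV_FENCE_THREAD_GROUP;

   return flags;
}
```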
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842>
Most drivers will want nir_opt_barrier_modes() to optimize out
unnecessary memory barrier modes. However, virgl has to translate
back to GLSL, which means it can really only handle partial memory
barriers in compute shaders today, because there isn't a proper
way to express them otherwise. Just ask nir_to_tgsi to promote
these back to full barriers as a workaround.
See KHR-GL43.shader_storage_buffer_object.advanced-readWrite-case1
on virpipe-on-gl as a case where this hack is needed.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842>
Originally written by Ian Romanick for the Intel backend, but ported
to the new nir_opt_barrier_modes() common optimization pass. Ian's
original explanation and commit message follow:
Shared memory only exists within a workgroup, so synchronizing it beyond
workgroup scope is nonsense.
Basically every SPIR-V compiler generates operations like
OpMemoryBarrier(/*Memory*/Device,
/*Semantics*/AcquireRelease | WorkgroupMemory)
This is suggested in numerous places, including
https://github.com/KhronosGroup/GLSL/blob/master/extensions/khr/GL_KHR_vulkan_glsl.txt.
Even Mesa's glsl_to_nir pass does this. This advice, which has been
copy-and-pasted everywhere, is contrary to issue 13 in the original
GL_ARB_compute_shader spec:
"Since shared memory is only accessible to threads within a single
work group, memoryBarrierShared() also only requires synchronization
with other threads in the same work group."
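The idea can be sketched as follows. This is a simplified illustration
and not the code in the pass; the enums and the
clamp_shared_only_scope() helper are hypothetical:

```c
/* Hypothetical scopes, ordered from narrowest to widest. */
enum mem_scope {
   SCOPE_INVOCATION,
   SCOPE_SUBGROUP,
   SCOPE_WORKGROUP,
   SCOPE_QUEUE_FAMILY,
   SCOPE_DEVICE,
};

/* Hypothetical memory mode bits. */
enum {
   MODE_SHARED = 1 << 0,
   MODE_SSBO   = 1 << 1,
   MODE_IMAGE  = 1 << 2,
};

/* A barrier that only covers shared memory never needs a scope wider
 * than the workgroup, so clamp it. Other barriers are left alone. */
static enum mem_scope
clamp_shared_only_scope(enum mem_scope scope, unsigned modes)
{
   if (modes == MODE_SHARED && scope > SCOPE_WORKGROUP)
      return SCOPE_WORKGROUP;

   return scope;
}
```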
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842>
Many shaders issue full memory barriers, which may need to synchronize
access to images, SSBOs, shared local memory, or global memory.
However, many of them only use a subset of those memory types - say,
only SSBOs.
Shaders may also have patterns such as:
1. shared local memory access
2. barrier with full variable modes
3. more shared local memory access
4. image access
In this case, the barrier is needed to ensure synchronization between
the various shared memory operations. Image reads and writes do also
exist, but they are all on one side of the barrier, so it is a no-op for
image access. We can drop the image mode from the barrier here too.
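A simplified sketch of this rule on a straight-line list of operations.
It assumes no control flow or loops, which the real pass has to handle,
and the struct, mode bits, and prune_barrier_modes() are hypothetical
names, not NIR API:

```c
#include <stddef.h>

/* Hypothetical memory mode bits. */
enum {
   MODE_SHARED = 1 << 0,
   MODE_SSBO   = 1 << 1,
   MODE_IMAGE  = 1 << 2,
   MODE_GLOBAL = 1 << 3,
};

struct op {
   int is_barrier;   /* nonzero for a barrier, zero for a memory access */
   unsigned modes;   /* barrier modes, or the modes the access touches */
};

/* For each barrier, keep only the modes that are accessed both before
 * and after it. A mode seen on one side only (or never) is dropped. */
static void
prune_barrier_modes(struct op *ops, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      if (!ops[i].is_barrier)
         continue;

      unsigned before = 0, after = 0;

      for (size_t j = 0; j < i; j++) {
         if (!ops[j].is_barrier)
            before |= ops[j].modes;
      }

      for (size_t j = i + 1; j < count; j++) {
         if (!ops[j].is_barrier)
            after |= ops[j].modes;
      }

      ops[i].modes &= before & after;
   }
}
```

Applied to the four-step pattern above, the barrier would be left with
only the shared memory mode.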
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24842>
this is the inverted version of rewrite_read_as_0 which tests for mismatched
component i/o on a given location and rewrites the inputs to zero if the
producer shader didn't write to the component
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24950>
Re-purpose renderer has_external_sync to cover explicit sync emulation
in venus, so that we don't have to add a new flag to distinguish the
emulation path enablement for virtgpu and vtest.
This is to unblock zink implicit sync handling against venus for now,
and soon we should migrate to virtgpu fence passing.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25127>
Don't rely on the HW to set values correctly; just emit
STATE_COMPUTE_MODE with default values set to zero.
This change also includes the following workarounds:
- 14015808183 (Parent HSD 14015782607) - Need to emit pipe control
with HDC flush and untyped cache flush set to 1 when CCS has
non-pipelined state update with STATE_COMPUTE_MODE.
- 14014427904 (Parent HSD 22013045878) - We need additional
invalidate/flush when emitting non-pipelined state commands with
multiple CCS enabled.
v2: (Tapani)
- Use lineage HSD numbers for check
- Don't use poisoned WA directly
- Use intel_needs_workaround helper
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24508>
When applying barriers for image transitions, we're currently
considering all possible usages of an image. But when running on a
compute-only queue, for example, the usage of an image will never be
one of these:
- VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT
- VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT
- VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT
- VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT
- VK_IMAGE_USAGE_FRAGMENT_SHADING_RATE_ATTACHMENT_BIT_KHR
Removing unused usages for the compute queue allows us to reduce the
scope of VK_IMAGE_LAYOUT_GENERAL, for example. This avoids a bunch of
transition operations that are completely useless when dealing with
barriers on the compute queue.
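A minimal sketch of the filtering, for illustration only. The USAGE_*
constants are local stand-ins for the VK_IMAGE_USAGE_* bits listed
above (values illustrative), and usage_for_queue() is a hypothetical
helper rather than the driver's function:

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for VK_IMAGE_USAGE_* bits; values are illustrative. */
enum {
   USAGE_SAMPLED                  = 1 << 0,
   USAGE_STORAGE                  = 1 << 1,
   USAGE_COLOR_ATTACHMENT         = 1 << 2,
   USAGE_DEPTH_STENCIL_ATTACHMENT = 1 << 3,
   USAGE_TRANSIENT_ATTACHMENT     = 1 << 4,
   USAGE_INPUT_ATTACHMENT         = 1 << 5,
   USAGE_FSR_ATTACHMENT           = 1 << 6,
};

/* Usages that can never apply on a compute-only queue. */
#define ATTACHMENT_USAGES                                      \
   (USAGE_COLOR_ATTACHMENT | USAGE_DEPTH_STENCIL_ATTACHMENT |  \
    USAGE_TRANSIENT_ATTACHMENT | USAGE_INPUT_ATTACHMENT |      \
    USAGE_FSR_ATTACHMENT)

/* Mask out attachment usages before deciding which transitions a
 * barrier needs when the command buffer targets a compute-only queue. */
static uint32_t
usage_for_queue(uint32_t image_usage, bool compute_only_queue)
{
   if (!compute_only_queue)
      return image_usage;

   return image_usage & ~(uint32_t)ATTACHMENT_USAGES;
}
```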
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25092>