If a loop is followed by a barrier, the ordering between a load inside
the loop and other memory operations after the barrier may have to be
preserved depending on the type of memory involved. This is relevant
when the memory is writeable by other invocations. In such case, it
is not valid to completely eliminate the loop.
This commit doesn't attempt to precisely catch the barrier case, as
analysis could become too complex. It simply assumes it can't drop
the loops that contain certain types of loads unless those are known
to be safe to reorder (via the access flag).
Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4475
Acked-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9938>
mmap requires its offset is page aligned, but the current code only
guarantees 4k alignment, causing drm-shim to break badly on kernels with
>4k page sizes. This fixes drm-shim on my Apple M1, running bare metal
Linux with 16k pages. It probably also fixes exotic PowerPC systems with
64k pages.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Zoltan Boszormenyi <zboszor@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12347>
Without this stitch_blocks complains about ending in a jump with a
non-empty block after the inserted body.
I hit this with CTS raytracing tests where we tried to inline a
function that basically ended up being something like
{
ignore_ray_intersection
halt
}
I kept the nop path when possible as that does not leave a mess
for the optimization loop to optimize.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12163>
Need to avoid lowering temps when they are used by other instructions,
like the rt instructions (some of the shader call parameters get
converted to temp variables and we will lower them later with
the explicit io lowering pass as we need to guarantee they will
end up in scratch).
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12162>
When creating an image out of a swapchain on Android, the android
layer call will detect a VkBindImageMemorySwapchainInfoKHR in the
pNext chain of the vkBindImageMemory2() call and add a
VkNativeBufferANDROID in the chain. This is what we should use as
backing memory for that image.
v2: Fix a couple of obvious mistakes (Tapani)
v3: Silence build warning (Lionel)
Fix invalid object argument to vk_error() (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: bc3c71b87a ("anv: don't try to access Android swapchains")
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5180
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12244>
We're just using this as a minor optimization to allocate pages for
buffers up front outside of some kernel locking. It's not essential.
The SET_DOMAIN ioctl may be going away in the future, so let's be a
bit cautious and try it but not fail.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12044>
We need to raise an error when importing a user pointer as a BO if the
supplied pointer can't actually be used on the GPU. Previously, we were
(ab)using the SET_DOMAIN ioctl for this, but it's not really intended
for that purpose, and is going away on discrete.
Fortunately, there's a new kernel API for this: the I915_USERPTR_PROBE
flag tries to perform basic sanity checking of the supplied pointer so
that we can at least reject obvious misuse of this API up front.
Use the new API where available. Fall back to SET_DOMAIN if it isn't.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12044>
Previously, when importing a resource with modifiers that include
clear color as auxiliary data, we were mapping the clear color BO
on the CPU in order to set res->aux.clear_color to the value stored
there.
We are generally trying to avoid CPU mapping imported buffers, because
in hybrid setups, they could be from a different device, and may not be
mappable. So we'd like to avoid that here.
This CPU-side knowledge of the clear color is only used in a few cases:
1. Avoiding a resolve due to a partial clear with a changing clear color
2. Avoiding disabling CCS when rendering to a texture view of a resource
where the only format difference is sRGB vs. linear and the clear
color is 0/1.
Instead of mapping the clear color BO on the CPU, we instead set a flag
indicating that we don't know the clear color, and reset it to known on
our first clear.
In the first case, the first partial fast clear of a resource would eat
an extra resolve (there is no penalty if we clear the whole resource).
The second case doesn't seem that critical, as it's someone rendering
to an imported BO - when the common case is to texture from it.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11888>
When incrementing unifa address in DCE optimization, ensure that we
setup correctly the current block, so the ldfunif optimization is also
executed correctly.
This fixes
dEQP-VK.graphicsfuzz.cov-struct-float-array-mix-uniform-vectors
heap-buffer overflow with address sanitizer enabled.
v2 (Iago):
- Save and restore current block
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12339>
While this feature is optional in Vulkan 1.1 and we don't currently
expose it, the CTS still requires that the entry points exist.
From the Vulkan 1.1 spec:
"If the VK_KHR_sampler_ycbcr_conversion extension is not supported,
support for the samplerYcbcrConversion feature is optional."
(...)
"samplerYcbcrConversion specifies whether the implementation supports
sampler YCBCR conversion. If samplerYcbcrConversion is VK_FALSE,
sampler YCBCR conversion is not supported, and samplers using sampler
YCBCR conversion must not be used."
Fixes (with Vulkan 1.1 exposed):
dEQP-VK.api.version_check.entry_points
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12338>
This allows this pattern:
$ radeonsi-run-tests.py /tmp/foo
... reports that some piglit tests regressed ...
$ radeonsi-run-tests.py -t /tmp/foo/new_baseline/sienna_cichlid-piglit-quick-fail.csv
... this only runs the test that regressed ...
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12306>