if we're creating a block containing multiple variables, we want to create
the whole block at once, not just each individual variable, as we have no
way to reference individual variables in vulkan due to the requirement
for VkDescriptorSetLayoutBinding members to have different binding values
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6981>
this extension allows for array nesting with no clear limitations, so we need
to ensure that we use the "unrolled" array size in a couple places in order to
correctly bind and access these types of arrays in shaders
fixes spec@arb_separate_shader_objects@active sampler conflict and others
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6981>
LLVM loves take advantage of the fact that vec3s in OpenCL are 16B
aligned and so it can just read/write them as vec4s. This results in a
LOT of vec4->vec3 casts on loads and stores. One solution to this
problem is to get rid of all vec3 variables.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6871>
LLVM loves take advantage of the fact that vec3s in OpenCL are 16B
aligned so it can just read/write them as vec4s. This is questionably
legal except that it uses a xyz write-mask when it does it. The result
is a LOT of vec4->vec3 casts on loads and stores. This optimization
detects this case as well as other bit-cast cases and rewrites them to
get rid of the cast.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6871>
These are based on the ones which already existed in the load/store
vectorization pass but I made some improvements while moving them. In
particular,
1. They're both faster if the bit sizes are equal
2. The check is faster if old_bit_size > new_bit_size
3. The check now fails if it would use more than NIR_MAX_VEC_COMPONENTS
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6871>
This pass attempts to optimize three broad categories of memcpy:
1. Self-copies: These we can discard out-of-hand.
2. Vector copies: It doesn't matter what the vector size is or if the
source and destination have different vector types, it's still easy
enough to emit a load/store pair.
3. Tightly packed copies: In the case where a type is tightly packed
(no padding bits), we can replace the memcpy with a copy_deref
instruction which the optimizer is far better at handling.
This has proven capable of getting rid of many of the memcpy instances
in some rather gnarly OpenCL C kernels I've been looking at, even after
coming out of LLVM's optimizer.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6871>
In 9f3c595dfc, we attempted to handle casts in opt_find_array_copies
but missed a critical case. In particular, in the case where we begin
finding a copy but then encounter a cast, we need to discard everything
which might alias that cast.
Fixes: 9f3c595dfc "nir/find_array_copies: Handle cast derefs"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6871>
When I originally wrote this code, I forgot to release the views the
code creates, leaking a bit of memory that never gets cleaned up. That's
not great, so let's plug it.
Fixes: e8a40715a8 ("gallium/util: add blitter-support for stencil-fallback")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6960>
v2:
* Use sub_cs when creating the IB in tu6_build_lrz(). (Jonathan Marek)
* Emit tu6_build_lrz() only when pipeline state changes or there is a
clear. (Jonathan Marek)
v3:
* Don't modify tu_pipeline object, track the changes in command buffer
state.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5146>
v2:
* Don't emit tu6_clear_lrz() using a IB but in the command stream
provided. (Jonathan Marek)
* Valid_clear_ib is always false if TU_DEBUG_NOLRZ is set. Remove the
useless condition. (Jonathan Marek)
* Added more comments.
* Use r2d function for blitting LRZ. (Jonathan Marek)
v3:
* Do LRZ tracking in the command buffer state (Connor).
v4:
* Simplify the emission of source setup (Jonathan Marek)
v5:
* Separate LRZ setup in a different function.
* Not hide LRZ setup inside GMEM path (Jonathan Marek)
* Fix iova address emission in tu6_clear_lrz() (Jonathan Marek)
* Add CCU sysmem flushes (Jonathan Marek)
v6:
* Fixed bug related to storing a VkClearValue pointer that could be
out-of-scope when we access to it for emitting LRZ clear.
v7:
* Merge tu6_clear_lrz() and tu6_clear_lrz_setup() into the same
function and emit LRZ clear at the beginning of the renderpass.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5146>
There are depth compare op modes that are not supported by LRZ in the
HW. Also, it is not supported when blend or stencil are enabled.
v2:
* Set pipeline->lrz.write to the same value than depthWriteEnable.
* Improve comment on disabling LRZ write on blend.
* Remove pipeline's lrz invalidation when there is no clear mask in
render pass. It is confusing. (Jonathan Marek)
* Mark the pipeline state as changed.
* Add comment on not using GREATER flag.
v3:
* Replace {rb,gras}_lrz_cntl by flags in struct tu_pipeline.
* Added z_test_enable flag.
v4:
* Created struct tu_lrz_pipeline to avoid modifying immutable objects.
v5:
* Fixed crashes when pDepthStencilState pointer is NULL.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5146>
v2:
- Add missing vulkan subpass support. (Jonathan Marek)
- When creating the BO, mark it as not valid until it is cleared.
- Move LRZ struct to tu_image. (Jonathan Marek)
- Destroy BO when we destroy the image. (Jonathan Marek)
v3:
- Allocate the buffer as part of the image's BO (Connor)
- Moved image's LRZ values to its layout.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5146>
The poweron failure happens before we get to the bootloader
("load_archive: loading locale_en.bin") not after we're trying to boot the
kernel and we're waiting for the deqp run to complete.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6970>
The linker was adding all state vars as uniforms, doubling the storage size
for shaders using only builtin uniforms, which increased CPU overhead for
constant buffer uploads.
When this code was originally ported from the GLSL IR linker we forgot
to exclude builtins because the check was not done in the
add_uniform_to_shader class but rather a check was done when passing
variables to this class for processing.
Fixes: 664e4a610d ("glsl/nir: Fill in the Parameters in NIR linker")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Tested-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6958>
With blob resources, stride doesn't necesarily have to
equal width * bpp. The use case for this a minigbm blob
resource with blob mem BLOB_MEM_HOST3D_GUEST imported into
guest Mesa. In addition, for BLOB_MEM_HOST we can repurpose
the transfer ioctls to also flush caches if need be, so this
seems a good time to fix this issue.
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4821>
We should have GL4.5 with this. Piglit tests should now pass.
In terms of performance, we're between 70% to 80% of host
performance on Iris, based on a apitrace of a 2013 GL4.5
game:
11.204 FPS (guest)
15.947 FPS (host)
This is still better than the status quo, when said game was unplayable
with Virgl due to an inefficient GL4.3 fallback.
TEST=piglit -t arb_buffer_storage all results/ passes
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4821>
A blob resource is a container for:
- VIRTGPU_BLOB_MEM_GUEST: a guest memory allocation
(referred to as a "guest-only blob resource")
- VIRTGPU_BLOB_MEM_HOST3D: a host3d memory allocation
(referred to as a "host-only blob resource")
- VIRTGPU_BLOB_MEM_HOST3D_GUEST: a guest + host3d memory allocation
(referred to as a "default blob resource").
Blob resources can be used to implement new features and fix shortcomings
with the current resource create path. The subsequent patches how
blob resources may be leveraged to implement GL_ARB_buffer_storage
and get GL4.5.
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4821>
This reverts commit 4fb2eddfdf.
This reverts commit 7a1deb16f8.
This reverts commit 2b6a172343.
This reverts commit 5af81393e4.
This reverts commit 87900afe5b.
A couple of problems were discovered after this series was merged that
cause breakage in different configurations:
(1) It seems that using -mf16c also enables AVX, leading to SIGILL on
platforms that do not support AVX.
(2) Since clang only warns about unknown flags, and as I understand
it Meson's handling in cc.has_argument() is broken, the F16C code is
wrongly enabled when clang is used, even for example on ARM, leading
to a compilation error.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3583
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6969>
Android.mk and Makefile.sources are still defining virgl_driinfo.h target
This patch removes the remaining gen rules
Fixes the following building error:
FAILED: out/target/product/x86_64/obj/STATIC_LIBRARIES/libmesa_pipe_virgl_intermediates/virgl/virgl_driinfo.h
...
cp: bad 'out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_pipe_virgl_intermediates/virgl/virgl_driinfo.h': No such file or directory
Fixes: 974981c4e6 ("gallium/drm: Make the pipe loader handle the driconf merging.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6880>
Android.mk and Makefile.sources are still defining si_driinfo.h target
This patch removes the remaining gen rules
Fixes the following building error:
FAILED: out/target/product/x86_64/obj/STATIC_LIBRARIES/libmesa_pipe_radeonsi_intermediates/radeonsi/si_driinfo.h
...
cp: bad 'out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_pipe_radeonsi_intermediates/radeonsi/si_driinfo.h': No such file or directory
Fixes: 974981c4e6 ("gallium/drm: Make the pipe loader handle the driconf merging.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6880>
Android.mk and Makefile.sources are still defining iris_driinfo.h target
This patch removes the remaining gen rules
Fixes the following building error:
FAILED: out/target/product/x86_64/obj/STATIC_LIBRARIES/libmesa_pipe_iris_intermediates/iris/iris_driinfo.h
...
cp: bad 'out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_pipe_iris_intermediates/iris/iris_driinfo.h': No such file or directory
Fixes: 974981c4e6 ("gallium/drm: Make the pipe loader handle the driconf merging.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6880>