LLVM_LIB_DIR is a variable used for runtime compilations.
When cross compiling, LLVM_LIB_DIR must be set to the
libclang path on the target. So, this path should not
be retrieved during compilation but at runtime.
dladdr uses an address to search for a loaded library.
If a library is found, it returns information about it.
The path to the libclang library can therefore be
retrieved using one of its functions. This is useful
because we don't know the name of the libclang library
(libclang.so.X or libclang-cpp.so.X)
v2 (Karol): use clang::CompilerInvocation::CreateFromArgs for dladdr
v3 (Karol): follow symlinks to fix errors on debian
Fixes: e22491c832 ("clc: fetch clang resource dir at runtime")
Signed-off-by: Antoine Coutant <antoine.coutant@smile.fr>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by (v1): Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25568>
Setting opencl-external-clang-headers to enabled while using shared LLVM
was broken and this option was mostly used for windows to force static
inclusion of opencl base headers.
Simply relying on the shared-llvm option here is enough to get what we
want.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25568>
In fragment shaders these instructions consider a lane active when
any lane in the same quad is active, which is not what we want, so
we need to include the current sample mask in the condition mask
used with these instructions to limit lane selection to those that
are really active.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
If the lane from which the hardware writes the unifa address
is disabled, then we may end up with a bogus address and invalid
memory accesses from follow-up ldunifa.
Instead of always disabling unifa loads in non-uniform control
flow we can try to see if the address is prouced from a nir
register (which is the only case where we do conditional writes
under non-uniform control flow in ntq_store_def), and only
disable it in that case.
When enabling subgroups for graphics pipelines, this fixes a
GMP violation in the simulator with the following test
(which has non-uniform control flow writing unifa with lane 0
disabled, which is the lane from which the unifa takes the
address):
dEQP-VK.subgroups.ballot_broadcast.graphics.subgroupbroadcastfirst_int
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
We don't rely in any lowerings for these (other than
scalarization). The only noteworthy aspect is that these
instructions, like ballot, use the condition mask to
filter out valid invocations that are inactive because of
control flow.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
This adds support in our compiler for the subgroup ballot
feature. To this end we start using the NIR lowering for
subgroups which can lowers some of these intrinsics into
things more amenable to our hardware and takes care of
scalarization.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
c->execute is 0 (not the block index) for lanes currently active
under non-uniform control flow.
Also this simplifies a bit the instructions we emit for flag
generation, both for uniform and non-uniform control flow.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
If the ELSE block is cheap then we don't emit the branch instruction
but we still want to generate the flags, since these are setting
the flags for the THEN block too.
Fixes: e401add741 ("broadcom/compiler: skip jumps in non-uniform if/then when block cost is small")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
Provide fd_rd_output facilities which enable constructing RD dumps that are
stored into gzip-compressed output. This matches the default behavior of
libwrap. Enabling and adjusting the behavior of functionality is done
through the FD_RD_DUMP environment variable.
Integration into Turnip's MSM backend is covered, replacing the previous
RD dump that was enabled through TU_DEBUG=rd. That debug option still
works and is the same as using FD_RD_DUMP=enable.
By default the dumps are created for each submission, using the provided
submit index. FD_RD_DUMP=combine enables gathering dumps for submissions
for the given logical device into a single file.
In the Turnip integration, FD_RD_DUMP=full will force dumping contents of
any buffer object. Additionally, with that option enabled any previous
submit will be waited on.
Specifying FD_RD_DUMP=trigger sets up a trigger file that can be used to
activate dumping manually. Writing zero or some non-integer value to the
file will disable dumping. Writing a positive integer value to it will
enable dumps for that many future submissions. Writing -1 to it will enable
dumps until disabled.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27230>
descriptor buffer uses mapped buffers. mapping/unmapping buffers
uses a ctx in the function params, but at this time there is no ctx.
since the ctx is not actually used for unmapping descriptor buffers,
this can instead use a special buffer unmap function to avoid invalid access
Fixes: b06f6e00fb ("zink: fix heap-use-after-free on batch_state with sub-allocated pipe_resources")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27344>
this is already implied since the buffers must be BAR-allocated,
but it ensures the context isn't accessed during unmap
Fixes: b06f6e00fb ("zink: fix heap-use-after-free on batch_state with sub-allocated pipe_resources")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27344>
This works around some Unity engine behaivor with ANGLE-on-Venus, when
cmd pools are created on main thread once while the render thread only
does descriptor pool creation for set allocations during recording time.
This change also explicitly forces async pipeline create for threads
creating the device instead of implicitly via feedback cmd pool create.
This ensures intended behavior when feedback is disabled.
Fixes: d17ddcc847 ("venus: dispatch background shader tasks to secondary ring")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27347>
../src/compiler/nir/nir_builder.h: In function ‘nir_build_deref_follower’:
../src/compiler/nir/nir_builder.h:1607:1: error: control reaches end of non-void function [-Werror=return-type]
1607 | }
Fixes: 4a4e175738
nir: Support deref instructions in lower_var_copies
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27345>
../src/compiler/nir/nir_lower_int64.c: In function ‘lower_int64_intrinsic’:
../src/compiler/nir/nir_lower_int64.c:1347:1: error: control reaches end of non-void function [-Werror=return-type]
1347 | }
Fixes: bf7a114246
nir/lower_int64: Add lowering for some 64-bit subgroup ops
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27345>
../src/amd/vulkan/radv_sampler.c: In function ‘radv_tex_wrap’:
../src/amd/vulkan/radv_sampler.c:50:1: error: control reaches end of non-void function [-Werror=return-type]
50 | }
| ^
../src/amd/vulkan/radv_sampler.c: In function ‘radv_tex_compare’:
../src/amd/vulkan/radv_sampler.c:76:1: error: control reaches end of non-void function [-Werror=return-type]
76 | }
| ^
Fixes: 4de305cb8a
radv: move sampler related code to radv_sampler.c
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27345>