This commit consolidates the queue submit code paths into one.
Now we always allocate BOs for every CS, but when IBs aren't
allowed, we simply submit every BO to the kernel.
A microbenchmark done by Bas indicated that submitting more IBs to
the kernel only adds a negligible overhead. Additionally, this
allows us to stop copying the command buffer contents in system
memory and get rid of a lot of legacy code.
In order to be able to submit every BO, we make sure to add the
last BO to the old_ib_buffers array on cs_finalize. This also
necessitates some changes in cs_execute_secondary and other
functions.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22354>
This will still make it so that RADV_DEBUG=hang will only submit
one command buffer at a time, but otherwise let's pass all CS
objects into one submission and let the winsys split them if
necessary.
The winsys can do a better job at splitting them because
radv_queue has no knowledge of IBs and ignores chaining in the
splitting logic.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22354>
In a gang submit, the follower (typically ACE) and leader
(typically GFX) can have synchronization between each other.
We must ensure that these end up in the same submission,
otherwise we can deadlock the GPU.
We rely on radv_queue here to order follower before the leader
in the submitted CS array.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22354>
Currently, radv_queue already splits submissions but we want to
change this and be able to split them in the winsys code as well.
Necessary because we want to split based on number of actual
IBs not number of command buffers, but radv_queue is not
aware of IBs.
Note that this commit does not actually take this new split into
use yet, that will be done in a following commit when it is ready,
this is why we set the max IB count higher than radv_queue here.
This commit is the first step in making "fallback" the default and
only submission code path.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22354>
Skipping the continue preamble can allow other processes to mess
up some registers set by the current process.
Originally, we could omit generating the continue preamble when
no shader rings were used, because the register initialization
happened at the beginning of every main cmdbuf. However, this
isn't the case anymore.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22354>
Indeed, the unnecessary drmDevice objects were not freed.
For instance, this issue could be triggered with: "piglit/bin/egl_ext_platform_device -auto -fbo":
SUMMARY: AddressSanitizer: 2796 byte(s) leaked in 12 allocation(s).
Fixes: e39d72aec2 ("egl: only take render nodes into account when listing DRM devices")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22408>
The base SGPR and the number of SGPRs can be equal but it was incorrect
because one VS can have draw_id and one can have base_instance. Fix
this by invalidating the vertex user SGPRs unconditionally.
Though they should also be invalidated after executing secondaries,
otherwise nothing is invalidated if the same pipeline is bind to the
primary again.
This fixes dEQP-VK.dynamic_rendering.primary_cmd_buff.random.seed*.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21652>
Replaces the RADV pipeline cache with an implementation
based on the common vk_pipeline_cache.
We use a dual-layer approach with two types of cache entries.
1. radv_shader:
- serialized as radv_shader_binary
- uses SHA1 of the binary as key
2. radv_pipeline_cache_object:
- contains pointers to associated radv_shaders
- serialized as list of SHA1
- uses the pipeline hash as key
In combination with single-file disk-cache, this reduces the cache size by ~60%.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22030>
This function takes a radv_shader_binary and writes it to the
disk cache before creating and returning a radv_shader cache entry.
The key of the cache entry is the full SHA1 hash of the binary.
This way, we will be able to deduplicate identical shaders.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22030>
The instructions manipulation cr0 use the default mask on lane0. So if
for some reason that lane is disabled in some of the dispatchs, we can
end up not executing the instructions.
Fixes flakyness in dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.uniform_float_32_to_16.uniform_matrix_float_rtz_frag
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22314>