The current implementation has many different code paths which get very
messy to reason about and maintain.
- FIFO mode worked well enough.
- IMMEDIATE did not need a thread at all, but present wait
implementation complicated a lot of things since we had to handle
concurrent special event reads.
- MAILBOX (and Xwayland) adds even more jank on top of this where
have present thread, but no acquire thread, so there are tons of
forward progress issues to consider.
In the new model, we have two threads:
- Queue thread is only responsible for receiving presents, waiting for
them if necessary, and submitting them to X.
- Event thread pumps the special event queue and notifies
other threads about frame completions.
- Application thread does not interact with X directly, only through
acquire/present queues and present wait condvar.
Two threads are required to implement IMMEDIATE and MAILBOX well.
IDLE events can come back at any time and the queue thread might be
waiting for a new presentation request to come through.
This new model has the advantage that we will be able to implement
VK_EXT_swapchain_maintenance1 in a more reasonable way, since we can
just toggle the present mode per present request as all presentation
go through the same system.
Some cleanups were done as well:
- We no longer need the busy bool. Since everything goes through thread,
we just rely on acquire/present queues.
- SW/non-MITSHM path is also moved to thread. Move acquire-specific
logic to the thread as well.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26954>
To handle coverity warning:
4. thread2_modifies_field: Thread2 sets cache_size to a new value. Note that this write can be reordered at runtime to occur before instructions that do not access this field within this locked region. After Thread2 leaves the critical section, control is switched back to Thread1.
CID 1559509 (#1 of 1): Check of thread-shared field evades lock acquisition (LOCK_EVASION)6. thread1_overwrites_value_in_field: Thread1 sets cache_size to a new value. Now the two threads have an inconsistent view of cache_size and updates to fields correlated with cache_size may be lost.
521 cache->cache_size += bo->size;
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26951>
Full coverity warning:
CID 1558604: Uninitialized pointer read (UNINIT)12. uninit_use_in_call: Using uninitialized value *results when calling nir_vec.
236 return nir_vec(b, results, DIV_ROUND_UP(num_components, 2));
To fix it we initialize the variables, provide a unreachable on the
switch that sets the results values. As we are here we also move a
comment to make things more clear.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26951>
No driver supports urol/uror on all bit sizes. Intel gen11+ only for 16
and 32 bit, Nvidia GV100+ only for 32 bit. Etnaviv can support it on 8,
16 and 32 bit.
Also turn the `lower` into a `has` option as only two drivers actually
support `uror` and `urol` at this momemt.
Fixes crashes with CL integer_rotate on iris and nouveau since we emit
urol for `rotate`.
v2: always lower 64 bit
Fixes: fe0965afa6 ("spirv: Don't use libclc for rotate")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by (Intel and nir): Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27090>
The panfrost driver now makes an ioctl to retrieve some new memory
parameters, and DRM_PANFROST_PARAM_MEM_FEATURES is required (does not
default in the caller). This caused drm-shim to stop working. This
patch adds some defaults to get drm-shim working again.
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 91fe8a0d28 ("panfrost: Back panfrost_device with pan_kmod_dev object")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27162>
GFX10 has a hw bug and it can't handle 0-sized index buffer. The
non-indirect draw path was fine but not the indirect path where RADV
emits the index buffer.
This fixes flakes with dEQP-VK.*maintenance6* on NAVI14, and possibly
GPU hangs if there is an indirect draw with a valid index buffer right
before because it would re-use the same index buffer.
Fixes: db9816fd66 ("radv: add support for NULL index buffer")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27142>
Similar to headergen2, the output matches as closely as is reasonable.
The time format and file listing ends up being slightly different but
those would be part of the diffstat when we next update kernel headers
regardless.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27124>
Track the files we've parsed, and skip ones we have already seen, if
(for example) we see the same paths imported from imported files.
Additionally having the list of files we have parsed will be useful to
generate a headergen-like top-of-file license comment.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27124>
On chordal graphs, a greedy coloring can be done in a way that never uses
more colors than are required for the largest clique. However, since we
have vector values and force phi resources into the same spill slots, the
interference graphs are not chordal, and thus, this assumption doesn't hold.
Use twice as many spill slots as upper bound.
Totals from 10 (0.01% of 79242) affected shaders: (GFX11)
MaxWaves: 52 -> 54 (+3.85%)
Instrs: 271386 -> 271779 (+0.14%)
CodeSize: 1362544 -> 1365432 (+0.21%)
VGPRs: 2536 -> 2532 (-0.16%)
SpillVGPRs: 778 -> 818 (+5.14%)
Scratch: 73472 -> 76800 (+4.53%)
Latency: 3331718 -> 3328798 (-0.09%); split: -0.14%, +0.05%
InvThroughput: 1665860 -> 1643350 (-1.35%); split: -1.40%, +0.05%
VClause: 3292 -> 3329 (+1.12%); split: -0.06%, +1.18%
Copies: 46082 -> 46257 (+0.38%)
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27011>
We can combine the previously introduced mechanisms to implement
line-smoothing, combined with some bespoke late RSD-patching to force
the correct multisampling and rasterization states.
The reason we need to do this patching late, is that we need to know the
primitive-type before we can determine this state, and we don't know
this until draw-time. Luckily, we have a convenient spot to do this.
This bespoke patching only happens for gen7 and earlier. Luckily, the
bits here remain the same for all of these gens. On later gens, we can
just emit this normally, without any bespoke patching.
We also need to use a fresh batch when changing the effective smoothing
state, because that's framebuffer-state. This applies to all gens.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10281
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26904>