11 instructions, but now processes up to 4 channels at once (since TGSI
splits to scalar for these math ops) while being higher accuracy.
Previously we used 6 instructions per channel, but it didn't look like a
sine wave. i915c managed it in 9 instructions per scalar channel, thanks
to avoiding an extra mov we do for the fabs (should be fixable), and
avoiding an extra MUL (maybe just needs reassociation of our immediates?).
But, the ALU count win from doing 4 channels at once will be way more
important for making sure that programs compile than those 2 ALU ops, plus
now we do it in NIR instead of assembly.
Closes: #4981
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12250>
The flock is per-fd, not per thread, and we do it outside of the main mutex. This was
done to avoid having to wait in the mutex, but we can get a case where one ends up running
the body with the flock unlocked.
Fix this by adding a mutex that doesn't need to be locked for reads.
Fixes: 4f0f8133a3 "util/fossilize_db: Do not lock the fossilize db permanently."
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12266>
Don't anticipate seeing any partial written headers but just in case we
should probably wait on the lock to make sure whatever header was being
written is finished being written.
Fixes: 4f0f8133a3 "util/fossilize_db: Do not lock the fossilize db permanently."
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12204>
Building toward scheduled nightly runs, add a button to do a full VK run
when you think you're changing test expectations.
Be gentle with the play button on this, 4 people doing this at once
would block marge for everyone else for a while.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12150>
Otherwise, another thread might reuse their object ids for other
objects. For example,
T1: free queue with object id X
T2: reuse id X
T2: emit vkCreateFoo with id X
T1: emit vkDestroyDevice
virglrenderer happily accepts that which leads to double frees of the
queue: once when X is updated to point to another object and once when
vkDestroyDevice is executed. virglrenderer should be fixed to catch
such invalid object id reuse as well.
Fixes
dEQP-VK.api.object_management.multithreaded_shared_resources.device_group.
Fixes: ddd7533055 ("venus: initial support for queue/fence/semaphore")
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12252>
In the UABI it is already 64b, but userspace ignored the upper 32b. But
it looks like we will start needing the upper 32b. So before we start
actually *using* chip_id, lets make sure everything is treating it as
64b.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12159>
Newer GPU's are moving away from using gpu_id, including the code
landing upstream for "7c Gen 3". But most of the places in the gallium
driver where we were looking at gpu_id, we only cared about the major
generation. So convert those to use screen->gen instead.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12159>
... and reduce maxDescriptorSetUpdateAfterBindStorageBuffersDynamic from 12 to
8.
MAX_DYNAMIC_BUFFERS is MAX_DYNAMIC_UNIFORM_BUFFERS +
MAX_DYNAMIC_STORAGE_BUFFERS. We set
maxDescriptorSetUniformBuffersDynamic = MAX_DYNAMIC_UNIFORM_BUFFERS
maxDescriptorSetStorageBuffersDynamic = MAX_DYNAMIC_STORAGE_BUFFERS
maxDescriptorSetUpdateAfterBindUniformBuffersDynamic = MAX_DYNAMIC_BUFFERS / 2
maxDescriptorSetUpdateAfterBindStorageBuffersDynamic = MAX_DYNAMIC_BUFFERS / 2
The CTS test checks that
maxDescriptorSetUpdateAfterBindUniformBuffersDynamic
- is at least 8; and
- is at least maxDescriptorSetUniformBuffersDynamic
maxDescriptorSetUpdateAfterBindStorageBuffersDynamic
- is at least 4; and
- and is at least maxDescriptorSetStorageBuffersDynamic
Prior to this patch maxDescriptorSetUpdateAfterBindUniformBuffersDynamic was 12
but maxDescriptorSetUniformBuffersDynamic was 16, thus causing the CTS failure
in
dEQP-VK.api.info.vulkan1p2_limits_validation.ext_descriptor_indexing
By raising maxDescriptorSetUpdateAfterBindUniformBuffersDynamic to the same
value as maxDescriptorSetUniformBuffersDynamic, we bring the limits into the
appropriate ranges. We do the same thing for
maxDescriptorSetUpdateAfterBindStorageBuffersDynamic by assigning it the same
value as maxDescriptorSetStorageBuffersDynamic.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12193>
The radv_emit_ngg_culling_state function won't write the
SPI_SHADER_PGM_RSRC2_GS register when it knows in advance that
radv_emit_graphics_pipeline will overwrite it anyway.
However, there is an unhandled case:
radv_emit_graphics_pipeline will not emit anything (including this
register) when the pipeline is already emitted. Hence, improve
the check in radv_emit_ngg_culling_state to consider this.
Fixes: 9a95f5487f
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12237>