It is observed that in display resolutions where width is not equal to
stride, vulkan rendering is being distorted. This is happening due to
stride calculation mismatch between minigbm and mesa.
This fix makes sure that the stride calculated in minigbm is passed to
anv and isl.
The issue was found while debugging the following android cts tests and
thus fixes them as well.
android.graphics.cts.VulkanPreTransformTest#testVulkanPreTransformNotSetToMatchCurrentTransform
android.graphics.cts.VulkanPreTransformTest#testVulkanPreTransformSetToMatchCurrentTransform
Signed-off-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22163>
Xe KMD does a special caching handling for buffers that will be
scanout to display, so that is why it needs a flag set during
allocation.
Checking if VK_STRUCTURE_TYPE_WSI_MEMORY_ALLOCATE_INFO_MESA
is available in AllocateMemory() and marking the buffer as scanout.
All WSI code paths but one sets
VK_STRUCTURE_TYPE_WSI_MEMORY_ALLOCATE_INFO_MESA.
The only one that doesn't requires that WSI is initialize with
wsi_device_options.sw_device = true to be executed, what is not the
case for ANV.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21885>
The HW can technically execute 16-bit operations, but the restrictions on
16-bit ALU ops are so great that it ends up not being a win for
GLES-on-Vulkan to lower mediump to 16-bit operations, at least with the
current state of the Intel compiler. This brings zink-on-anv in line with
iris and angle-on-anv for mediump behavior (ANGLE uses RelaxedPrecision,
which we ignore).
Perf on some angle traces on my brya (ADL) and i9-9900K (CFL):
ADL zink pubg_mobile_battle_royale: +13.4574% +/- 5.2046% (n=5)
CFL zink pubg_mobile_battle_royale: +29.5332% +/- 0.646585% (n=6)
ADL zink aztec_ruins_high: +5.78027% +/- 4.80645% (n=4)
CFL zink aztec_ruins_high: -1.10641% +/- 0.140562% (n=12)
ADL zink trex_200: +5.86956% +/- 2.09633% (n=10)
CFL zink trex_200: +9.72136% +/- 0.749261% (n=10)
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21775>
Now that all color blend bits are dynamic, emit_cb_state() is doing
almost nothing and half of that is wrong.
In the case that color write enable is dynamic, at the time the pipeline
state is emitted, it sees all the color attachments as having write
disabled and stores the WriteDisabled bit for each channel.
When all dynamic state is flushed, we have the right values already but
the values recorded into the command buffer get ORed with the ones
stored in the pipeline, and so WriteDisabled tag along when they
shouldn't.
Since all disabled color attachments are handled already when dynamic
state is flushed, there's no point in doing so at pipeline creation
time too. And since the only other thing done by emit_cb_state() is
writing three hardcoded values, they might as well be taken care of in
the same place as everything else.
Fixes CTS from the future:
dEQP-VK.pipeline.*.extended_dynamic_state.*.color_blend_equation_*dynamic*
dEQP-VK.pipeline.*.extended_dynamic_state.*.color_blend_all_*
Fixes: fc3fd7c69e (anv: dynamic color write mask)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21509>
Since the dirty range started out as 0..0, you would have 0..VBend as the
new dirty range on the first draw, and if your VB was >32b then you'd
flush every time you used it. Instead, if there's no existing dirty range
then just set it to our new VB's range.
Perf results with zink+anv on my CFL:
sauerbraten: +24.8182% +/- 0.602077% (n=5)
portal-2-v2.trace: +4.64289% +/- 0.285285% (n=5)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21370>
Functions that are in hot paths will have a different treatment to
support i915 and Xe KMD.
Each KMD will have an anv_kmd_backend that will have the hot path
functions set, this way we can avoid branch prediction misses.
Other functions will gradually be moved to anv_kmd_backend.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20948>
As we continue to refactor the code base to support Xe KMD here I'm
dropping anv_gem_create() and unifying all graphics memory allocation
calls to anv_gem_create_regions().
anv_gem_create_regions() will call DRM_IOCTL_I915_GEM_CREATE_EXT
for integrated platforms too only leaving DRM_IOCTL_I915_GEM_CREATE
calls to kernel versions that do not support
DRM_IOCTL_I915_GEM_CREATE_EXT.
This can be detected by devinfo->mem.use_class_instance as
DRM_I915_QUERY_MEMORY_REGIONS uAPI landed in the same kernel version
as DRM_IOCTL_I915_GEM_CREATE_EXT.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20948>
The decoder context needs to know what engine it's associated with.
Nowadays, we have render, compute, blitter, even video engines being
used from the same driver. Rather than trying to have a single decoder
and thwacking the engine field back and forth between calls, we make
one per queue family, and stash a pointer in anv_queue for easy access.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21149>
u_vector_add() don't keep the returned pointers valid.
After the initial size allocated in u_vector_init() is reached it will
allocate a bigger buffer and copy data from older buffer to the new
one and free the old buffer, making all the previous pointers returned
by u_vector_add() invalid and crashing the application when trying to
access it.
This is reproduced when running
dEQP-VK.synchronization.signal_order.timeline_semaphore.* in DG2 SKUs
that has 4 CCS engines, INTEL_COMPUTE_CLASS=1 is set and of course
perfetto build is enabled.
To fix this issue here I'm moving the storage/allocation of
struct intel_ds_queue to struct anv_queue/iris_batch and using
struct list_head to maintain a chain of intel_ds_queue of the
intel_ds_device.
This allows us to append or remove queues dynamically in future if
necessary.
Fixes: e760c5b37b ("anv: add perfetto source")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20977>