This commit moves the allocation and filling out of surface state from
CreateImageView time to BeginRenderPass time. Instead of allocating the
render target surface state as part of the image view, we allocate it in
the command buffer state at the same time that we set up clears. For
secondary command buffers, we allocate memory for the surface states in
BeginCommandBuffer but don't fill them out; instead, we use our new
SOL-based memcpy function to copy the surface states from the primary
command buffer. This allows us to handle secondary command buffers without
the user specifying the framebuffer ahead-of-time.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
This method of doing copies has the advantage of touching very little of
the GPU state. While it does disable all the shader stages, it doesn't
have to blow away binding tables, viewports, scissors, or any other bits of
dynamic state other than VBO 32 which is already reserved. All of the
state that it does touch is contained within a pipeline anyway so that's
the only thing that has to be dirtied.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
There are a few dynamic bits, namely binding table and sampler addresses,
but most of it is static and really belongs in the pipeline. It certainly
doesn't belong in flush_compute_descriptor_set. We'll use the same state
merging trick we use for gen7 DEPTH_STENCIL.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
This commit makes both gen7 and gen8 pipeline setup emit state packets
in exactly the same order.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
With this commit, a few fields are now specified on gen7 which weren't
before. However, the values specified are zero which is the default so the
final hardware packet remains the same.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
From the Vulkan spec version 1.0.32 docs for vkCreateGraphicsPipelines:
The stage member of one element of pStages must be
VK_SHADER_STAGE_VERTEX_BIT
Since a vertex shader is always required, this hasn't been used since we
deleted meta. Let's get rid of the complexity.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Now that we have anv_shader_bin, they're completely redundant with other
information we have in the pipeline. For vertex shaders, we also go
through way too much work to put the offset in one or the other field and
then look at which one we put it in later.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
A commit from the CTS suite on the 1.0-dev branch started using
VK_REMAINING_MIP_LEVELS, we're not dealing with it properly for clears.
Fixes:
dEQP-VK.api.image_clearing.clear_color_image.*
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
According to the spec for vkGetPhysicalDeviceImageFormatProperties:
"If format is not a supported image format, or if the combination of format,
type, tiling, usage, and flags is not supported for images, then
vkGetPhysicalDeviceImageFormatProperties returns VK_ERROR_FORMAT_NOT_SUPPORTED."
Makes the following Vulkan CTS tests report 'Not Supported' instead of crashing:
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_unorm
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_snorm
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_uscaled
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_sscaled
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_uint
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_sint
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_srgb
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_unorm
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_snorm
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_uscaled
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_sscaled
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_uint
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_sint
dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_srgb
dEQP-VK.api.image_clearing.clear_color_image.1d_r4g4_unorm_pack8
dEQP-VK.api.image_clearing.clear_color_image.1d_r8_srgb
dEQP-VK.api.image_clearing.clear_color_image.1d_r8g8_srgb
dEQP-VK.api.image_clearing.clear_color_image.1d_r8g8b8_srgb
dEQP-VK.api.image_clearing.clear_color_image.1d_b5g5r5a1_unorm_pack16
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
fixes following compilation warnings on Android build:
"warning: implicit declaration of function 'static_assert' is invalid in
C99 [-Wimplicit-function-declaration]"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
I just noticed the new vulkan headers changed a prototype,
so I've decided to import them and fix the drivers to use the
new API.
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
From the Vulkan spec version 1.0.32 docs for vkFreeMemory:
"If a memory object is mapped at the time it is freed, it is implicitly
unmapped."
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Our previous fence implementation was very simple. Fences had two states:
signaled and unsignaled. However, this didn't properly handle all of the
edge-cases that we need to handle. In order to handle the case where the
client calls vkGetFenceStatus on a fence that has not yet been submitted
via vkQueueSubmit, we need a three-status system. In order to handle the
case where the client calls vkWaitForFences on fences which have not yet
been submitted, we need more complex logic and a condition variable. It's
rather annoying but, so long as the client doesn't do that, we should still
hit the fast path and use i915_gem_wait to do all our waiting.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Vulkan has introduced the consept of .specVersion which can be used to
attribute changes of the said extension.
The current loader does not check the value, thus it have gone unnoticed
that the driver exposes an old version of the following extensions:
VK_KHR_xcb_surface (Rev 6)
VK_KHR_xlib_surface (Rev 6)
VK_KHR_wayland_surface (Rev 5)
- Updated the surface create function to take a pCreateInfo structure
VK_KHR_swapchain (Rev 68)
- Moved the "validity" include for vkAcquireNextImage to be in its proper
place, after the prototype and list of parameters.
...
According to the documentation:
* pname:specVersion is the version of this extension.
It is an integer, incremented with backward compatible changes.
Based on the history of vk.xml the above (latest) revision has been
available since Vulkan 1.0 so even if they were any backwards
incompatible change(s) [as hinted by the revision log] those should be
safe.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Since our surface state buffer is shared by all batches, the kernel does a
full stall and sync with the CPU between batches every time we call
execbuf2 because it refuses to do relocations on an active buffer. Doing
them in userspace and passing the NO_RELOC flag to the kernel allows us to
perform the relocations without stalling.
This improves the performance of Dota 2 by around 30% on a Sky Lake GT2.
v2 (Jason Ekstrand):
- Better comments (Chris Wilson)
- Fixed write_reloc for correct canonical form (Chris Wilson)
v3 (Jason Ekstrand):
- Skip relocations which aren't needed
- Provide an environment variable to always use the kernel
- More comments about correctness (Chris Wilson)
v4 (Jason Ekstrand):
- More comments (Chris Wilson)
v5 (Jason Ekstrand):
- Rebase on top of moving execbuf2 setup go QueueSubmit
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Ever since the early days of the Vulkan driver, we've been setting up the
lists of relocations at EndCommandBuffer time. The idea behind this was to
move some of the CPU load out of QueueSubmit which the client is required
to lock around and into command buffer building which could be done in
parallel. Then QueueSubmit basically just becomes a bunch of execbuf2
calls.
Technically, this works. However, when you start to do more in QueueSubmit
than just execbuf2, you start to run into problems. In particular, if a
block pool is resized between EndCommandBuffer and QueueSubmit, the list of
anv_bo's and the execbuf2 object list can get out of sync. This can cause
problems if, for instance, you wanted to do relocations in userspace.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
The original reason for putting it in the batch_bo was to allow primaries
to share it across secondaries or something like that. However, the
relocation lists in secondary command buffers are are always left alone and
copied into the primary command buffer's relocation list. This means that
the offset really applies at the command buffer level and putting it in the
batch_bo doesn't make sense. This fixes a couple of potential bugs around
re-submission of command buffers that are not likely to be hit but are bugs
none the less.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
This commit adds a little helper struct for storing everything we use to
build an execbuf2 call. Since the add_bo function really has nothing to do
with a command buffer, it makes sense to break it out a bit. This also
reduces some of the churn in the next commit.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
The old version wasn't properly handling large addresses where we have to
sign-extend to get it into the "canonical form" expected by the hardware.
Also, the new version is capable of doing a clflush of the newly written
reloc if requested.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>