Commit Graph

976 Commits

Author SHA1 Message Date
Iago Toral Quiroga
c04dbd6b3e anv/cmd_buffer: handle out of memory during vkCmdPushConstants
Fixes:
dEQP-VK.api.out_of_host_memory.cmd_push_constants

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
94a4f0c255 anv/cmd_buffer: handle allocation errors during vkCmdBeginRenderPass()
Fixes:
dEQP-VK.api.out_of_host_memory.cmd_begin_render_pass

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
d823f381a5 anv/cmd_buffer: skip vkCmdEndRenderPass() for broken command buffers
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
6743456699 anv/cmd_buffer: skip vkCmdNextSubpass() for broken command buffers
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
8174e63869 anv/cmd_buffer: report tracked errors in vkEndCommandBuffer()
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
68d88f0237 anv: handle failures when growing reloc lists
Growing the reloc list happens through calling anv_reloc_list_add() or
anv_reloc_list_append(). Make sure that we call these through helpers
that check the result and set the batch error status if needed.

v2:
  - Handling the crashes is not good enough, we need to keep track of
    the error, for that, keep track of the errors in the batch instead (Jason).
  - Make reloc list growth go through helpers so we can have a central
    place where we can do error tracking (Jason).

v3:
  - Callers that need the offset returned by anv_reloc_list_add() can
    compute it themselves since it is extracted from the inputs to the
    function, so change the function to return a VkResult, make
    anv_batch_emit_reloc() also return a VkResult and let their callers
    do the error management (Topi)

v4:
  - Let anv_batch_emit_reloc() return an uint64_t as it originally did,
    there is no real benefit in having it return a VkResult.
  - Do not add an is_aux parameter to add_surface_state_reloc(), instead
    do error checking for aux in add_image_view_relocs() separately.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
d4bdd871dc anv: avoid crashes when failing to allocate batches
Most of the time we use macros that handle this situation transparently,
but there are some cases were we need to handle this explicitly.

This patch makes sure we don't crash, notice that error handling takes
place in the function that actually failed the allocation,
anv_batch_emit_dwords(), which will set the status field of the batch
so it can be used at a later moment to report the error to the user.

v2:
  - Not crashing is not good enough, we need to keep track of the error
    (Topi, Jason). Iago: now that we track errors in the batch, this
    is being handled.
  - Added guards in a few more places that needed it (Iago)

v3:
  - Check result of anv_batch_emitn() for NULL before calling memset()
    in emit_vertex_input() (Topi)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
31f5049ff1 anv: handle allocation failure in anv_batch_emit_dwords()
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
9e69409fcf anv: handle allocation failure in anv_batch_emit_batch()
v2:
 - Call the error handler (Topi)

Fixes:
dEQP-VK.api.out_of_host_memory.cmd_execute_commands

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
a8ce8e3542 anv: add anv_batch_set_error() and anv_batch_has_error() helpers
The anv_batch_set_error() helper will track the first error that happened
while recording a command buffer. The helper returns the currently tracked
error to help the job of internal functions that may generate errors that
need to be tracked and return a VkResult to the caller.

We will use the anv_batch_has_error() helper to guard parts of the driver
that are not safe to execute if an error has been generated while recording
a particular command buffer.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
d0195bd067 anv/cmd_buffer: add a status field to anv_batch
The vkCmd*() functions do not report errors, instead, any errors should be
reported by the time we call vkEndCommandBuffer(). This means that we
need to make the driver robust against incosistent and/or imcomplete
command  buffer states through the command recording process, particularly,
avoid crashes due to access to memory that we failed to allocate previously.

The strategy used to do this is to track the first error ocurred while
recording a command buffer in the batch associated with it. We use the
batch to track this information because the command buffer may not be
visible to all parts of the driver that can produce errors we need to be
aware of (such as allocation failures during batch emissions).

Later patches will use this error information to guard parts of the driver
that may not be safe to execute.

v2: Move the field from the command buffer to the batch so we can track
    errors from batch emissions (Jason)

v3: Registering errors in the command buffer's batch during
    anv_create_cmd_buffer() is unnecessary, since the command buffer
    is freed at the end of the function in that case (Topi)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
6dd06f54eb anv/cmd_buffer: report errors in vkBeginCommandBuffer()
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
88b539c4a0 anv: do not try to ref/unref NULL shaders
This situation can happen if we failed to allocate memory for the shader.

v2:
 - We shouldn't see NULL shaders in anv_shader_bin_ref so we should not check
   for that (Jason). Make sure that callers don't attempt to call this
   function with a NULL shader and assert that this never happens (Iago).

v3:
 - All callers to anv_shader_bin_unref seem to check for NULL before calling,
   so just assert that it is not NULL (Topi)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
bad3a2e911 anv/blorp: return early if we failed to create the shader binary
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
e2f707ce5b intel/blorp: make upload_shader() return a bool indicating success or failure
For now we always return true, follow-up patches will handle fail scenarios.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
808503b8f8 anv: remove unnecessary function prototype.
The function is defined right after the prototype declaration. Also, the
protoype for it is included in anv_genX.h which is included via anv_private.h.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Emil Velikov
b1fb6e8d8c anv: do not open random render node(s)
drmGetDevices2() provides us with enough flexibility to build heuristics
upon. Opening a random node on the other hand will wake up the device,
regardless if it's the one we're interested or not.

v2: Rebase, explicitly require/check for libdrm
v3: Return VK_ERROR_INCOMPATIBLE_DRIVER for no devices (Ilia)
v4: Rebase

Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-15 11:38:05 +00:00
Emil Velikov
a9a4028fd7 util/sha1: rework _mesa_sha1_{init,final}
Rather than having an extra memory allocation [that we currently do not
and act accordingly] just make the API take an pointer to a stack
allocated instance.

This and follow-up steps will effectively make the _mesa_sha1_foo simple
define/inlines around their SHA1 counterparts.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:18:43 +00:00
Jason Ekstrand
d142c7436c intel/debug: Add a common INTEL_DEBUG=nohiz option
The GL driver had a driconf option (which doesn't make much sense) and
the Vulkan driver had a hand-rolled environment variable.  Instead,
let's tie both into the INTEL_DEBUG mechanism and unify things.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-14 21:00:09 -07:00
Jason Ekstrand
c09bb956ca anv/image: Move handling of INTEL_VK_HIZ
This makes it so that you don't get an "Implement gen7 HiZ" perf warning
when you manually disable HiZ on gen8.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-14 21:00:09 -07:00
Jason Ekstrand
aed2714145 anv: Properly enumerate physical devices when none are present 2017-03-14 09:08:07 -07:00
Jason Ekstrand
762a6333f2 nir: Rework conversion opcodes
The NIR story on conversion opcodes is a mess.  We've had way too many
of them, naming is inconsistent, and which ones have explicit sizes was
sort-of random.  This commit re-organizes things and makes them all
consistent:

 - All non-bool conversion opcodes now have the explicit size in the
   destination and are named <src_type>2<dst_type><size>.

 - Integer <-> integer conversion opcodes now only come in i2i and u2u
   forms (i2u and u2i have been removed) since the only difference
   between the different integer conversions is whether or not they
   sign-extend when up-converting.

 - Boolean conversion opcodes all have the explicit size on the bool and
   are named <src_type>2<dst_type>.

Making things consistent also allows nir_type_conversion_op to be moved
to nir_opcodes.c and auto-generated using mako.  This will make adding
int8, int16, and float16 versions much easier when the time comes.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
678fd00f2f anv/blorp: Only set a clear color for resolves if fast-cleared
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-03-14 07:36:20 -07:00
Jason Ekstrand
273b720310 anv/blorp: Turn off AUX after doing a CCS_D resolve
For render passes with multiple subpasses on gen7, we only fast-clear at
the top but an input attachment use can cause us to do a resolve in the
middle of the render pass.  Once we've done so, we are no longer have a
fast-cleared surface so we can just set aux_usage to NONE.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-03-14 07:36:20 -07:00
Gwan-gyeong Mun
8f22552a4f anv: Add missing error-checking to anv_CreateDevice (v3)
This patch adds missing error-checking and fixes resource leak in
allocation failure path on anv_CreateDevice()

v2: Fixes from Jason Ekstrand's review
  a) Add missing destructors for all of the state pools on allocation
     failure path
  b) Add missing destructor for batch bo pools on allocation failure path

v3: Fixes from Emil Velikov's review
  Add missing destructor for queue and scratch_pool on allocation failure
  path

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 21:29:43 -07:00
Chad Versace
c5a0829e1f anv: Use vk_outarray in vkGetPhysicalDeviceQueueFamilyProperties
No intended change in behavior. Just a refactor.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 15:08:15 -07:00
Chad Versace
876f0ecd2f anv: Use vk_outarray in vkEnumeratePhysicalDevices (v2)
No intended change in behavior. Just a refactor.

v2: Replace vk_outarray_is_incomplete() with vk_outarray_status(). For
    Jason.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 15:08:15 -07:00
Jason Ekstrand
dd4db84640 anv: Use on-the-fly surface states for dynamic buffer descriptors
We have a performance problem with dynamic buffer descriptors.  Because
we are currently implementing them by pushing an offset into the shader
and adding that offset onto the already existing offset for the UBO/SSBO
operation, all UBO/SSBO operations on dynamic descriptors are indirect.
The back-end compiler implements indirect pull constant loads using what
basically amounts to a texelFetch instruction.  For pull constant loads
with constant offsets, however, we use an oword block read message which
goes through the constant cache and reads a whole cache line at a time.
Because of these two things, direct pull constant loads are much faster
than indirect pull constant loads.  Because all loads from dynamically
bound buffers are indirect, the user takes a substantial performance
penalty when using this "performance" feature.

There are two potential solutions I have seen for this problem.  The
alternate solution is to continue pushing offsets into the shader but
wire things up in the back-end compiler so that we use the oword block
read messages anyway.  The only reason we can do this because we know a
priori that the dynamic offsets are uniform and 16-byte aligned.
Unfortunately, thanks to the 16-byte alignment requirement of the oword
messages, we can't do some general "if the indirect offset is uniform,
use an oword message" sort of thing.

This solution, however, is recommended for a few of reasons:

 1. Surface states are relatively cheap.  We've been using on-the-fly
    surface state setup for some time in GL and it works well.  Also,
    dynamic offsets with on-the-fly surface state should still be
    cheaper than allocating new descriptor sets every time you want to
    change a buffer offset which is really the only requirement of the
    dynamic offsets feature.

 2. This requires substantially less compiler plumbing.  Not only can we
    delete the entire apply_dynamic_offsets pass but we can also avoid
    having to add architecture for passing dynamic offsets to the back-
    end compiler in such a way that it can continue using oword messages.

 3. We get robust buffer access range-checking for free.  Because the
    offset and range are baked into the surface state, we no longer need
    to pass ranges around and do bounds-checking in the shader.

 4. Once we finally get UBO pushing implemented, it will be much easier
    to handle pushing chunks of dynamic descriptors if the compiler
    remains blissfully unaware of dynamic descriptors.

This commit improves performance of The Talos Principle on ULTRA
settings by around 50% and brings it nicely into line with OpenGL
performance.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-13 07:58:00 -07:00
Jason Ekstrand
6b644e571e anv: Stall before fast-clear operations
During initial CCS bring-up, I discovered that you have to do a full CS
stall prior to doing a CCS resolve as well as afterwards.  It appears
that the same is needed for fast-clears as well.  This fixes rendering
corruptions on The Talos Principle on Sky Lake GT4.  The issue hasn't
been demonstrated on any other hardware however, given that this appears
to be a "too many things in the pipe" problem, having it be easier to
reproduce on a system with more EUs makes sense.  The issues with
resolves is demonstrable on a GT3 or GT2 so this is probably also a
problem on all GTs.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-03-13 07:57:03 -07:00
Jason Ekstrand
5e44ef4a76 anv: Accurately advertise dynamic descriptor limits
The number of dynamic descriptors is limited by both the number of
descriptors and the total number of dynamic things.  Because there isn't
a single "maximum dynamic things" limit, we need to divide by two so
that they can create the maximum of both UBOs and SSBOs.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-13 07:57:03 -07:00
Jason Ekstrand
d36b463817 anv: Add a helper for working with VK_WHOLE_SIZE for buffers
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2017-03-13 07:57:03 -07:00
Jason Ekstrand
ee8044fd33 intel/vulkan: Get rid of recursive make
v2 [Emil Velikov]
 - Various fixes and initial stab at the Android build.
 - Keep the generation rules/EXTRA_DIST outside the conditional

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:35 +00:00
Jason Ekstrand
700bebb958 i965: Move the back-end compiler to src/intel/compiler
Mostly a dummy git mv with a couple of noticable parts:
 - With the earlier header cleanups, nothing in src/intel depends
files from src/mesa/drivers/dri/i965/
 - Both Autoconf and Android builds are addressed. Thanks to Mauro and
Tapani for the fixups in the latter
 - brw_util.[ch] is not really compiler specific, so it's moved to i965.

v2:
 - move brw_eu_defines.h instead of brw_defines.h
 - remove no-longer applicable includes
 - add missing vulkan/ prefix in the Android build (thanks Tapani)

v3:
 - don't list brw_defines.h in src/intel/Makefile.sources (Jason)
 - rebase on top of the oa patches

[Emil Velikov: commit message, various small fixes througout]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:34 +00:00
Jason Ekstrand
e042f5fcbc anv: Stop including brw_context.h
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:33 +00:00
Jason Ekstrand
12f348bc98 vulkan/wsi: Generate wayland protocol headers separately from EGL
Previously, we were depending on EGL for generating the headers and
providing the protocol symbols. However, since neither Vulkan driver
actually wants to link against EGL, this is kind of pointless. It also
creates a weird build dependency.

v2 [Jason]
 - Add missing wsi/ prefix, MKDIR_GEN

v3 [Emil Velikov]
 - include BUILT_SOURCES/generation rules outside of conditional

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:33 +00:00
Jason Ekstrand
4ea9bbe1f6 anv/wsi: Don't include wayland headers
Unused and we'll rework the way wayland-drm-client-protocol.h is
generated with later commit.

v2 [Emil]
 - Also remove wayland-client.h

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:30 +00:00
Tapani Pälli
db5f9c3177 anv: change BLOCK_POOL_MEMFD_SIZE to exactly 2GB
This is what comment above definition says and change fixes issue with
32bit build where BLOCK_POOL_MEMFD_SIZE is used as ftruncate parameter
and constant currently gets converted from 4294967296 to 0.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-08 07:57:55 +02:00
Jason Ekstrand
0421813588 anv: Make the framebuffer-renderpass format assert non-fatal
This should let Dota 2 run on debug builds though it will spew errors
like mad.  Hopefully, Valve will get this fixed sooner rather than
later.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-07 15:22:16 -08:00
Jason Ekstrand
33301d949f anv: Drop the anv_validate block helper
Over the course of driver development, we've come up with a number of
different schemes for adding giant blocks of asserts inside the driver.
This one is only being used once in anv_pipeline.c and the way it's
being used actually generates compiler warnings in release builds.  This
commit drops the anv_validate macro and just puts the contents of the
one validation function in side of a "#ifdef DEBUG" guard.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-07 15:22:16 -08:00
Jason Ekstrand
a316d8f406 anv: Get rid of the stub() macros
Except for a few unimplemented things on gen7, we don't really have
stubs anymore so we should drop this.  This commit replaces the few gen7
stub() calls with explicitly labeled finishme's and makes the sparse
binding stuff silently no-op or return a FEATURE_NOT_PRESENT error.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-07 15:22:16 -08:00
Jason Ekstrand
1488d079cb anv: Remove a pointless finishme
We've been supporting multiple shaders per module for some time now.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-07 15:22:16 -08:00
Jason Ekstrand
1a43792783 anv: Convert the HiZ finishme's to perf_warn
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-07 15:22:16 -08:00
Jason Ekstrand
201fc83df7 anv: Add a performance warning helper
This acts identically to anv_finishme except that it only dumps out
these nice log messages if you run with INTEL_DEBUG=perf.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-07 15:22:16 -08:00
Jason Ekstrand
b3135c3cf3 anv: Advertise shaderInt64 on Broadwell and above
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-03 13:59:29 -08:00
Nanley Chery
d7d64f1091 anv/image: Allow HiZ on input attachment-capable depth/stencil images
While an input attachment may only take on one of those two layouts,
other depth/stencil attachments that use the same image may have
HiZ-enabled layouts. Improves the average frame rate on a release
candidate of a proprietary Vulkan benchmark by 9.94% over 3 runs on my
SKL GT4.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00
Nanley Chery
76b8cc2a1c anv/cmd_buffer: Centralize automatic layout transitions
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00
Nanley Chery
0a72b5f3cb anv/cmd_buffer: Add attachment transitioning functions
This is needed to transition input attachments.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00
Nanley Chery
9950774f8b anv/blorp: Encapsulate subpass id querying
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00
Nanley Chery
c78a959bcf anv/cmd_buffer: Enable render pass awareness
v2: Update cmd_state_reset (Jason Ekstrand)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00
Nanley Chery
c0223d052b anv/pass: Store subpass attachment reference list
We'll loop through this array when performing automatic layout
transitions.

v2: Adjust formatting of an assignment (Jason Ekstrand)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00