28 Commits

Author SHA1 Message Date
Jason Ekstrand
c0420a62c9 anv/query: Increment an index while writing results
Instead of computing an index at the end which we hope maps to the
number of things written, just count the number of things as we go.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-09-17 02:57:21 -05:00
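
A minimal sketch of the idea described in c0420a62c9, not the driver's actual code: the writer bumps an index each time it stores a value, so the final index is by construction the number of values written. The names `write_result` and `write_selected` and the bitmask-driven loop are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: store one value into slot idx of a 32- or 64-bit
 * result array. */
static void write_result(void *dst, int use_64bit, uint32_t idx, uint64_t value)
{
    if (use_64bit)
        ((uint64_t *)dst)[idx] = value;
    else
        ((uint32_t *)dst)[idx] = (uint32_t)value;
}

/* Writes values[bit] for every bit set in mask and returns how many values
 * were written. values must have 32 entries. The count comes from the
 * running index, not from a separate calculation done afterwards. */
static uint32_t write_selected(void *dst, int use_64bit, uint32_t mask,
                               const uint64_t *values)
{
    uint32_t idx = 0;
    for (uint32_t bit = 0; bit < 32; bit++) {
        if (mask & (1u << bit))
            write_result(dst, use_64bit, idx++, values[bit]);
    }
    return idx;
}
```

A caller that needs to append something after the results (an availability value, for example) can store it at the returned index.
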
Dylan Baker
8396043f30 Replace uses of _mesa_bitcount with util_bitcount
and _mesa_bitcount_64 with util_bitcount_64. This fixes a build problem
in nir for platforms that don't have popcount or popcountll, such as
32-bit MSVC.

v2: - Fix additional uses of _mesa_bitcount added after this was
      originally written

Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-09-07 10:21:26 -07:00
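
For context on 8396043f30, a bit count can be written without any compiler builtin. The sketch below is a generic fallback, not Mesa's `util_bitcount` implementation; the names `bitcount32` and `bitcount64` are hypothetical.

```c
#include <stdint.h>

/* Counts set bits by repeatedly clearing the lowest set bit. */
static unsigned bitcount32(uint32_t n)
{
    unsigned count = 0;
    while (n) {
        n &= n - 1; /* clears the lowest set bit */
        count++;
    }
    return count;
}

/* 64-bit variant built from two 32-bit halves, so it needs no 64-bit
 * popcount support from the platform. */
static unsigned bitcount64(uint64_t n)
{
    return bitcount32((uint32_t)n) + bitcount32((uint32_t)(n >> 32));
}
```
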
Scott D Phillips
4affeba1e9 anv: Soft-pin everything else
v2 (Jason Ekstrand):
 - Break up Scott's mega-patch

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:13 -07:00
Jason Ekstrand
f270a09737 anv: Use an anv_address in anv_buffer
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
d6c9a89d13 anv/cmd_buffer: Get rid of the meta query workaround
Meta has been gone for a long time.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:20 -08:00
Iago Toral Quiroga
7ec6e4e689 anv/query: implement multiview interactions
From the Vulkan spec with KHX extensions:

  "If queries are used while executing a render pass instance that has
   multiview enabled, the query uses N consecutive query indices
   in the query pool (starting at query) where N is the number of bits
   set in the view mask in the subpass the query is used in.

   How the numerical results of the query are distributed among the
   queries is implementation-dependent. For example, some implementations
   may write each view's results to a distinct query, while other
   implementations may write the total result to the first query and write
   zero to the other queries. However, the sum of the results in all the
   queries must accurately reflect the total result of the query summed
   over all views. Applications can sum the results from all the queries to
   compute the total result."

In our case we only really emit a single query (in the first query index)
that stores the aggregated result for all views, but we still need to manage
availability for all the other query indices involved, even if we don't
actually use them.

This is relevant when clients call vkGetQueryPoolResults and pass all N
queries to retrieve the results. In that scenario, without this patch,
we will never see queries other than the first being available since we
never emit them.

v2: we need the same treatment for timestamp queries.

v3 (Jason):
 - Better an if instead of an early return.
 - We can't write to this memory in the CPU, we should use
   MI_STORE_DATA_IMM and emit_query_availability (Jason).

v4 (Jason):
 - No need to take the value to write as parameter, just hard code it to 0.

Fixes test failures in some work-in-progress CTS multiview+query tests.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-18 16:37:06 +01:00
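
The bookkeeping described in 7ec6e4e689 can be modelled on the CPU as follows. This is only an illustration: per the v3 note, the driver does this with GPU commands rather than CPU writes, and the `mv_slot` layout and function name here are assumptions.

```c
#include <stdint.h>

/* Hypothetical slot layout used only for this model. */
struct mv_slot {
    uint64_t available;
    uint64_t result;
};

/* Ends a query under a given view mask. The aggregated result goes into
 * the first query index; the remaining N - 1 indices get a zero result
 * and are marked available. Returns N, the number of indices consumed. */
static uint32_t end_multiview_query(struct mv_slot *pool, uint32_t query,
                                    uint32_t view_mask, uint64_t total)
{
    uint32_t n = 0;
    for (uint32_t m = view_mask; m; m &= m - 1)
        n++;            /* N = number of bits set in the view mask */
    if (n == 0)
        n = 1;          /* multiview disabled: a single query index */

    pool[query].result = total;
    pool[query].available = 1;
    for (uint32_t i = 1; i < n; i++) {
        pool[query + i].result = 0;
        pool[query + i].available = 1;
    }
    return n;
}
```

Summing the results over the N indices yields the total, which is what the quoted spec text requires.
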
Tapani Pälli
d083bc1c4b anv: wire up vk_errorf macro to do debug reporting
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-12 09:42:00 +03:00
Matt Turner
6cfc49287d anv: Remove 'inline' keywords
Unless you have data, the compiler knows better than you whether a
function should be inlined.

No difference in the resulting binary with gcc-6.3.0 or clang-4.0.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Jason Ekstrand
00df1cd9d6 anv: Stop setting BO flags in bo_init_new
The idea behind doing this was to make it easier to set various flags.
However, we have enough custom flag settings floating around the driver
that this is more of a nuisance than a help.  This commit has the
following functional changes:

 1) The workaround_bo created in anv_CreateDevice loses both flags.
    This shouldn't matter because it's very small and entirely internal
    to the driver.

 2) The bo created in anv_CreateDmaBufImageINTEL loses the
    EXEC_OBJECT_ASYNC flag.  In retrospect, it never should have gotten
    EXEC_OBJECT_ASYNC in the first place.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:38 -07:00
Iago Toral Quiroga
7761cf6d01 anv/query: handle more cases of 'out of host memory'
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-05 08:53:33 +02:00
Jason Ekstrand
1e21d4227e anv/query: Use genxml for MI_MATH
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-04-20 15:24:06 -07:00
Jason Ekstrand
4e17b59f6c anv/query: Use snooping on !LLC platforms
Commit b2c97bc789 which made us start
using a busy-wait for individual query results also messed up cache
flushing on !LLC platforms.  For one thing, I forgot the mfence after
the clflush, so memory access wasn't properly getting fenced.  More
importantly, we were clflushing the whole query range, then waiting
for individual queries, and then trying to read the results without
clflushing again.  Getting the clflushing both correct and efficient
is very subtle and painful.  Instead, let's side-step the problem by
just snooping.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-07 12:17:20 -07:00
Jason Ekstrand
b2c97bc789 anv/query: Busy-wait for available query entries
Before, we were just looking at whether or not the user wanted us to
wait and waiting on the BO.  Some clients, such as the Serious engine,
use a single query pool for hundreds of individual query results where
the writes for those queries may be split across several command
buffers.  In this scenario, the individual query we're looking for may
become available long before the BO is idle so waiting on the query pool
BO to be finished is wasteful. This commit makes us instead busy-loop on
each query until it's available.

This significantly reduces pipeline bubbles and improves performance of
The Talos Principle on medium settings (where the GPU isn't overloaded
with drawing) by around 20% on my SkyLake gt4.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
2017-04-05 21:17:11 -07:00
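
The per-query wait in b2c97bc789 can be pictured as polling a single availability word instead of waiting for the whole buffer to go idle. The sketch below is an assumption-laden model: the function name and the spin cap are hypothetical, and a real driver would need its own exit condition for a hung GPU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Polls one query's availability word. The pointer is volatile because
 * another agent (the GPU, in the driver's case) is expected to write it.
 * Returns true once the word is non-zero, false if max_spins polls pass
 * without that happening. */
static bool wait_for_available(const volatile uint64_t *available,
                               uint64_t max_spins)
{
    for (uint64_t i = 0; i < max_spins; i++) {
        if (*available)
            return true;
    }
    return *available != 0;
}
```
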
Jason Ekstrand
c964f0e485 anv: Query the kernel for reset status
When a client causes a GPU hang (or experiences issues due to a hang in
another client) we want to let it know as soon as possible.  In
particular, if it submits work with a fence and calls vkWaitForFences or
vkQueueWaitIdle and it returns VK_SUCCESS, then the client should be
able to trust the results of that rendering.  In order to provide this
guarantee, we have to ask the kernel for context status in a few key
locations.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-04 18:33:52 -07:00
Iago Toral Quiroga
129fd58131 anv/query: handle out of host memory without crashing in compute_query_result()
We don't need to make the caller (CmdCopyQueryPoolResults) aware of the
problem since compute_query_result() only emits state. The caller is also
expected to hit OOM in this scenario right after calling this function, but
it is already handling it safely.

Fixes:
dEQP-VK.api.out_of_host_memory.cmd_copy_query_pool_results

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-24 09:39:44 +01:00
Iago Toral Quiroga
4da1832c00 anv: return VK_ERROR_DEVICE_LOST immediately when device is known to be lost
If we know the device has been lost we should return this error code for
any command that can report it before we attempt to do anything with the
device.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-24 08:11:53 +01:00
Jason Ekstrand
1d5f4f46da genxml: Make MI_STORE_DATA_IMM have a single 64-bit data field
This is way more convenient than having two separate dword fields.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 15:31:19 -07:00
Ilia Mirkin
e675f57d4f anv: Implement pipeline statistics queries
In the end, pipeline statistics queries look a lot like occlusion
queries only with between 1 and 11 begin/end pairs being generated
instead of just the one.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
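
To illustrate the begin/end pairs mentioned in e675f57d4f: each statistic can be derived as the difference between a counter snapshot taken at the end of the query and one taken at the beginning. The interleaved array layout and the function name below are assumptions, not the driver's data structures.

```c
#include <stdint.h>

/* pairs holds n (begin, end) snapshots laid out as
 * begin0, end0, begin1, end1, ...; out receives n differences.
 * Unsigned subtraction is modular, so a counter that wrapped once
 * between the two snapshots still yields the right delta. */
static void compute_pair_results(const uint64_t *pairs, uint32_t n,
                                 uint64_t *out)
{
    for (uint32_t i = 0; i < n; i++)
        out[i] = pairs[2 * i + 1] - pairs[2 * i];
}
```
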
Jason Ekstrand
149d10d38a anv/query: Rework store_query_result
The new version is a nice GPU parallel to cpu_write_query_result and it
nicely handles things like dealing with 32 vs. 64-bit offsets in the
destination buffer.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand
c773ae88df anv/query: Break GPU query calculation into a helper
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand
0557dfdb4a anv/query: Add a helper for writing a query pool result
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand
bce4a935c6 anv/query: Use a variable-length slot size
Not all queries are the same.  Even the two queries we support today
require a different amount of data per slot.  Once we introduce pipeline
statistics queries, the size will vary wildly.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:49 -07:00
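
A rough model of the slot layout that bce4a935c6 and 1c797af2c6 (below) point at: an availability word at the front of every slot, followed by a number of data words that depends on the query type. The per-type word counts used here are assumptions chosen for illustration, as are all the names.

```c
#include <stdint.h>

enum query_kind {
    QUERY_KIND_OCCLUSION,
    QUERY_KIND_TIMESTAMP
};

/* Slot size in 64-bit words: one availability word plus the data words.
 * The data word counts are illustrative assumptions. */
static uint32_t slot_qwords(enum query_kind kind)
{
    switch (kind) {
    case QUERY_KIND_OCCLUSION:
        return 1 + 2; /* available, begin, end */
    case QUERY_KIND_TIMESTAMP:
        return 1 + 1; /* available, timestamp */
    }
    return 0;
}

/* Byte offset of a query's slot, which is also where its availability
 * word lives because that word comes first. */
static uint64_t slot_offset(enum query_kind kind, uint32_t query)
{
    return (uint64_t)query * slot_qwords(kind) * 8;
}

/* Byte offset of the first data word, right after the availability word. */
static uint64_t slot_data_offset(enum query_kind kind, uint32_t query)
{
    return slot_offset(kind, query) + 8;
}
```

With the availability word at a fixed position, code that only checks availability never needs to know how long the rest of the slot is.
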
Jason Ekstrand
1c797af2c6 anv/query: Move the available bits to the front
We're about to make slots variable-length and always having the
available bits at the front makes certain operations substantially
easier once we do that.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:47 -07:00
Jason Ekstrand
9d43afa3dc anv/query: Let 32-bit values wrap
From the Vulkan 1.0.39 Specification:

   "If VK_QUERY_RESULT_64_BIT is not set and the result overflows a
   32-bit value, the value may either wrap or saturate."

So we can either clamp or wrap.  Wrapping is both easier and what the
user gets if they use vkCmdCopyQueryPoolResults and we should be
consistent.  We could make vkCmdCopyQueryPoolResults clamp but it's
annoying and ends up burning extra batch for something the spec clearly
doesn't require.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:11:35 -07:00
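
The two behaviours that the spec text quoted in 9d43afa3dc allows can be compared directly. This is a plain C illustration with hypothetical function names; the commit chooses wrapping.

```c
#include <stdint.h>

/* Wrapping: keep the low 32 bits of the 64-bit result. */
static uint32_t result_wrap32(uint64_t value)
{
    return (uint32_t)value;
}

/* Saturating: clamp to the largest 32-bit value, which costs an extra
 * comparison. */
static uint32_t result_saturate32(uint64_t value)
{
    return value > UINT32_MAX ? UINT32_MAX : (uint32_t)value;
}
```
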
Jason Ekstrand
08df015b9d anv/GetQueryPoolResults: Actually implement the spec
The Vulkan spec is fairly clear about when we should and should not
write query pool results.  We're also supposed to return VK_NOT_READY if
VK_QUERY_RESULT_PARTIAL_BIT is not set and we come across any queries
which are not yet finished.  This fixes rendering corruptions on The
Talos Principle where geometry flickers in and out due to bogus query
results being returned by the driver.  These issues are most noticeable
on Sky Lake GT4 when running on "ultra" settings.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100182
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-16 15:08:18 -07:00
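
The rule stated in 08df015b9d can be sketched as a small CPU-side loop. The flag and status constants below are local stand-ins, not the Vulkan enums, and the slot structure, the 64-bit-only output, and the availability handling are simplifying assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the relevant result flags and return codes. */
enum { FLAG_PARTIAL = 1u << 0, FLAG_WITH_AVAILABILITY = 1u << 1 };
enum status { STATUS_SUCCESS = 0, STATUS_NOT_READY = 1 };

struct query_slot {
    uint64_t available;
    uint64_t value;
};

/* Copies one 64-bit value per query into out. A value is written only if
 * the query is available or partial results were requested; otherwise the
 * output is left untouched and the call reports "not ready". */
static enum status get_results(const struct query_slot *slots, uint32_t count,
                               uint32_t flags, uint64_t *out)
{
    enum status status = STATUS_SUCCESS;
    size_t stride = (flags & FLAG_WITH_AVAILABILITY) ? 2 : 1;

    for (uint32_t q = 0; q < count; q++) {
        bool available = slots[q].available != 0;
        uint64_t *dst = out + (size_t)q * stride;

        if (available || (flags & FLAG_PARTIAL))
            dst[0] = slots[q].value;
        else
            status = STATUS_NOT_READY;

        if (flags & FLAG_WITH_AVAILABILITY)
            dst[1] = available ? 1 : 0;
    }
    return status;
}
```
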
Jason Ekstrand
81840130c0 anv/query: Invalidate the correct range
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-16 15:08:17 -07:00
Jason Ekstrand
4bbb4b95b8 anv/query: Fix the location of timestamp availability
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0 13.0" <mesa-dev@lists.freedesktop.org>
2017-03-16 15:08:17 -07:00
Jason Ekstrand
b6b03329af anv: Put everything about queries in genX_query.c
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-21 12:26:35 -08:00