Commit Graph

81 Commits

Author SHA1 Message Date
Rafael Antognolli
e3dc56d731 anv: Update usage of block_pool->bo.
Change block_pool->bo to be a pointer, and update its usage everywhere.
This makes it simpler to switch it later to a list of BOs.

v3:
 - Use a static "bos" field in the struct, instead of malloc'ing it.
 This will be later changed to a fixed length array of BOs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:02 -08:00
Rafael Antognolli
fc3f588320 anv/allocator: Remove pool->map.
After switching to using anv_state_table, there are very few places left
still using pool->map directly. We want to avoid that because it won't
be always the right map once we split it into multiple BOs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:00 -08:00
Rafael Antognolli
54e21e145e anv/allocator: Rename anv_free_list2 to anv_free_list.
Now that we removed the original anv_free_list, we can now use its name.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:58 -08:00
Rafael Antognolli
234c9d8a40 anv/allocator: Remove anv_free_list.
The next commit already renames anv_free_list2 -> anv_free_list since
the old one is gone.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:56 -08:00
Rafael Antognolli
e2179aceaf anv/allocator: Use anv_state_table on back_alloc too.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:52 -08:00
Rafael Antognolli
d18267fb48 anv/allocator: Use anv_state_table on anv_state_pool_alloc.
Use anv_state_pool_return_blocks() to return blocks to the pool, instead
of manually pushing them.

v3:
 - return blocks from the end of the chunk (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:50 -08:00
Rafael Antognolli
6a1dcfe73d anv/allocator: Add helper to push states back to the state table.
The use of anv_state_table_add() combined with anv_state_table_push(),
specially when adding a bunch of states to the table, is very verbose.
So we add this helper that makes things easier to digest.

We also already add the anv_state_table member in this commit, so things
can compile properly, even though it's not used.

v2: assert that the states are always aligned to their size (Jason)
v3: Add "table" member to anv_state_pool in this commit.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:47 -08:00
Rafael Antognolli
e8b6e0a5ba anv/allocator: Add getter for anv_block_pool.
We will need the anv_block_pool_map to find the map relative to some BO
that is not at the start of the block pool.

v2: just return a pointer instead of a struct (Jason)
v4: Update comment (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:43 -08:00
Rafael Antognolli
6a2d5ae305 anv/allocator: Add anv_state_table.
Add a structure to hold anv_states. This table will initially be used to
recycle anv_states, instead of relying on a linked list implemented in
GPU memory. Later it could be used so that all anv_states just point to
the content of this struct, instead of making copies of anv_states
everywhere.

One has to call anv_state_table_add(), which returns an index for the
state in the table, and then get a pointer to such index, and finally
fill in the rest of the struct.

TODO:
   1) There's a lot of common code between this table backing store
   memory and the anv_block_pool buffer, due to how we grow it. I think
   it's possible to refactory this and reuse code on both places.

   2) Add unit tests.

v3:
 - Rename state table memfd (Jason)
 - Return VK_ERROR_OUT_OF_HOST_MEMORY on more places (Jason)
 - anv_state_table_grow returns VkResult (Jason)
 - Rename variables to be more informative (Jason)
 - Return errors on state table grow.
 - Rename anv_state_table_push/pop to anv_free_list_push2/pop2
   This will be renamed again to remove the trailing "2" later.

v4:
 - Remove exit(-1) from anv_state_table (Jason).
 - Use uint32_t "next" field in anv_free_entry (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:34 -08:00
Caio Marcelo de Oliveira Filho
09c3ff01df src/intel: use new hash table and set creation helpers
Replace calls to create hash tables and sets that use
_mesa_hash_pointer/_mesa_key_pointer_equal with the helpers
_mesa_pointer_hash_table_create() and _mesa_pointer_set_create().

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-01-14 10:49:33 -08:00
Eric Engestrom
4f5a526789 anv: drop unneeded KHR suffix
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-08 18:47:56 +00:00
Dave Airlie
29a7631986 anv: add missing unlock in error path.
Not going to matter, but be consistent.

Found by coverity

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: caf41c78c (anv/allocator: Support softpin in the BO cache)
2018-10-11 09:50:27 +10:00
Jason Ekstrand
7a89a0d9ed anv: Use separate MOCS settings for external BOs
On Broadwell and above, we have to use different MOCS settings to allow
the kernel to take over and disable caching when needed for external
buffers.  On Broadwell, this is especially important because the kernel
can't disable eLLC so we have to do it in userspace.  We very badly
don't want to do that on everything so we need separate MOCS for
external and internal BOs.

In order to do this, we add an anv-specific BO flag for "external" and
use that to distinguish between buffers which may be shared with other
processes and/or display and those which are entirely internal.  That,
together with an anv_mocs_for_bo helper lets us choose the right MOCS
settings for each BO use.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99507
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-03 09:03:03 -05:00
Scott D Phillips
4affeba1e9 anv: Soft-pin everything else
v2 (Jason Ekstrand):
 - Break up Scott's mega-patch

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:13 -07:00
Scott D Phillips
f3dbe0419d anv: Soft-pin batch buffers
Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:12 -07:00
Jason Ekstrand
caf41c78ca anv/allocator: Support softpin in the BO cache
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:11 -07:00
Jason Ekstrand
b0d50247a7 anv/allocator: Set the BO flags in bo_cache_alloc/import
It's safer to set them there because we have the opportunity to properly
handle combining flags if a BO is imported more than once.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-06-01 14:27:10 -07:00
Scott D Phillips
e662bdb820 anv: Soft-pin state pools
The state_pools reserve virtual address space of the full
BLOCK_POOL_MEMFD_SIZE, but maintain the current behavior of
growing from the middle.

v2: - rename block_pool::offset to block_pool::start_address (Jason)
    - assign state pool start_address statically (Jason)
v3: - remove unnecessary bo_flags tampering for the dynamic pool (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-06-01 13:49:22 -07:00
Jason Ekstrand
3db93f9128 anv/allocator: Don't shrink either end of the block pool
Previously, we only tried to ensure that we didn't shrink either end
below what was already handed out.  However, due to the way we handle
relocations with block pools, we can't shrink the back end at all.  It's
probably best to not shrink in either direction.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105374
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106147
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2018-04-26 13:17:14 -07:00
Ian Romanick
d76c204d05 util: Move util_is_power_of_two to bitscan.h and rename to util_is_power_of_two_or_zero
The new name make the zero-input behavior more obvious.  The next
patch adds a new function with different zero-input behavior.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2018-03-29 14:09:23 -07:00
Jordan Justen
24b415270f intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview
Ken suggested that we might be underallocating scratch space on HD
400. Allocating scratch space as though there was actually 8 EUs
seems to help with a GPU hang seen on synmark CSDof.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-09 16:15:58 -08:00
Alex Smith
00a81e9909 anv: Add missing unlock in anv_scratch_pool_alloc
Fixes hangs seen due to the lock not being released here.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-01-04 14:54:02 +00:00
Vinson Lee
8c1e4b1afc anv: Check if memfd_create is already defined.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103909
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-30 01:36:46 -08:00
Lionel Landwerlin
118a8c7587 anv: setup BO flags at state_pool/block_pool creation
This will allow to set the flags on any anv_bo created/filled from a
state pool or block pool later.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-22 22:53:27 +00:00
Nicolai Hähnle
ffc2060616 anv: fix build failure
Fixes: e3a8013de8 ("util/u_queue: add util_queue_fence_wait_timeout")
2017-11-09 14:49:19 +01:00
Timothy Arceri
f98a2768ca mesa: Add new fast mtx_t mutex type for basic use cases
While modern pthread mutexes are very fast, they still incur a call to an
external DSO and overhead of the generality and features of pthread mutexes.
Most mutexes in mesa only needs lock/unlock, and the idea here is that we can
inline the atomic operation and make the fast case just two intructions.
Mutexes are subtle and finicky to implement, so we carefully copy the
implementation from Ulrich Dreppers well-written and well-reviewed paper:

  "Futexes Are Tricky"
  http://www.akkadia.org/drepper/futex.pdf

We implement "mutex3", which gives us a mutex that has no syscalls on
uncontended lock or unlock.  Further, the uncontended case boils down to a
cmpxchg and an untaken branch and the uncontended unlock is just a locked decr
and an untaken branch.  We use __builtin_expect() to indicate that contention
is unlikely so that gcc will put the contention code out of the main code
flow.

A fast mutex only supports lock/unlock, can't be recursive or used with
condition variables.  We keep the pthread mutex implementation around as
for the few places where we use condition variables or recursive locking.
For platforms or compilers where futex and atomics aren't available,
simple_mtx_t falls back to the pthread mutex.

The pthread mutex lock/unlock overhead shows up on benchmarks for CPU bound
applications.  Most CPU bound cases are helped and some of our internal
bind_buffer_object heavy benchmarks gain up to 10%.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 12:07:48 +11:00
Chad Versace
9775894f10 anv: Move size check from anv_bo_cache_import() to caller (v2)
This change prepares for VK_ANDROID_native_buffer. When the user imports
a gralloc hande into a VkImage using VK_ANDROID_native_buffer, the user
provides no size. The driver must infer the size from the internals of
the gralloc buffer.

The patch is essentially a refactor patch, but it does change behavior
in some edge cases, described below. In what follows, the "nominal size"
of the bo refers to anv_bo::size, which may not match the bo's "actual
size" according to the kernel.

Post-patch, the nominal size of the bo returned from
anv_bo_cache_import() is always the size of imported dma-buf according
to lseek(). Pre-patch, the bo's nominal size was difficult to predict.
If the imported dma-buf's gem handle was not resident in the cache, then
the bo's nominal size was align(VkMemoryAllocateInfo::allocationSize,
4096).  If it *was* resident, then the bo's nominal size was whatever
the cache returned. As a consequence, the first cache insert decided the
bo's nominal size, which could be significantly smaller compared to the
dma-buf's actual size, as the nominal size was determined by
VkMemoryAllocationInfo::allocationSize and not lseek().

I believe this patch cleans up that messy behavior. For an imported or
exported VkDeviceMemory, anv_bo::size should now be the true size of the
bo, if I correctly understand the problem (which I possibly don't).

v2:
  - Preserve behavior of aligning size to 4096 before checking. [for
    jekstrand]
  - Check size with < instead of <=, to match behavior of commit c0a4f56
    "anv: bo_cache: allow importing a BO larger than needed". [for
    chadv]
2017-10-17 23:46:06 -07:00
Chad Versace
eb69a61806 anv: Move close(fd) from anv_bo_cache_import to its callers (v2)
This will allow us to implement VK_ANDROID_native_buffer without dup'ing
the fd. We must close the fd in VK_KHR_external_memory_fd, but we should
not in VK_ANDROID_native_buffer.

v2:
  - Add missing close(fd) for case
    VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR, subcase
    ANV_SEMAPHORE_TYPE_BO.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-17 11:08:26 -07:00
Lionel Landwerlin
c0a4f56fb9 anv: bo_cache: allow importing a BO larger than needed
It's not a problem if a BO has been allocated larger than we need it
to be.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102940
Fixes: 818b857914 ("anv: Use the BO cache for DeviceMemory allocations")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-11 22:29:55 +01:00
Tapani Pälli
d083bc1c4b anv: wire up vk_errorf macro to do debug reporting
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-12 09:42:00 +03:00
Matt Turner
cdbaa8a12f anv: Mark functions used conditionally as UNUSED
The functions we're marking as UNUSED in genX_pipeline.c are used only
when compiling for particular generations.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Jason Ekstrand
227debdc92 vulkan: Update to the new 1.0.54 spec XML and headers
There is one small ANV change here because we used the
VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX enum in the BO cache and that had
to be updated to have the _KHR suffix.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand
00df1cd9d6 anv: Stop setting BO flags in bo_init_new
The idea behind doing this was to make it easier to set various flags.
However, we have enough custom flag settings floating around the driver
that this is more of a nuisance than a help.  This commit has the
following functional changes:

 1) The workaround_bo created in anv_CreateDevice loses both flags.
    This shouldn't matter because it's very small and entirely internal
    to the driver.

 2) The bo created in anv_CreateDmaBufImageINTEL loses the
    EXEC_OBJECT_ASYNC flag.  In retrospect, it never should have gotten
    EXEC_OBJECT_ASYNC in the first place.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:38 -07:00
Jason Ekstrand
e05e3e07ab anv/allocator: Only write to _vg_ptr if we have valgrind
This fixes the build when not building against valgrind headers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100945
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-05 12:49:51 -07:00
Jason Ekstrand
98cd512089 anv/allocator: Improve block pool growing asserts
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
955127db93 anv/allocator: Add support for large stream allocations
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
f82d3d38b6 anv/allocator: Allow state pools to allocate large states
Previously, the maximum size of a state that could be allocated from a
state pool was a block.  However, this has caused us various issues
particularly with shaders which are potentially very large.  We've also
hit issues with render passes with a large number of attachments when we
go to allocate the block of surface state.  This effectively removes the
restriction on the maximum size of a single state.  (There's still a
limit of 1MB imposed by a fixed-length bucket array.)

For states larger than the block size, we just grab a large block off of
the block pool rather than sub-allocating.  When we go to allocate some
chunk of state and the current bucket does not have state, we try to
pull a chunk from some larger bucket and split it up.  This should
improve memory usage if a client occasionally allocates a large block of
state.

This commit is inspired by some similar work done by Juan A. Suarez
Romero <jasuarez@igalia.com>.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
8c079b566e anv/allocator: Support pushing multiple blocks onto a free list at once
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
8769fb48fb anv/allocator: Add helpers for dealing with bucket sizes
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
12043ca696 anv/allocator: Add the capability to allocate blocks of different sizes
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
01170df262 anv/allocator: Rework a comment
This commit just fixes up the English a bit and re-flows the comment.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
bcc5d0defb anv/allocator: Tweak the block pool growing algorithm
The old algorithm worked fine assuming a constant block size.  We're
about to break that assumption so we need an algorithm that's a bit more
robust against suddenly growing by a huge amount compared to the
currently allocated quantity of memory.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
d3ed72e2c2 anv/allocator: Embed the block_pool in the state_pool
Now that the state stream is allocating off of the state pool, there's
no reason why we need the block pool to be separate.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
bb2a3f0df8 anv/allocator: Get rid of the ability to free blocks
Now that everything is going through the state pools, the block pool no
longer needs to be able to handle re-use.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
55f49e6b7e anv/allocator: Add support for "back" allocations to state_pool
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
49ecaf88d1 anv/allocator: Drop the block_size field from block_pool
Since the state_stream is now pulling from a state_pool, the only thing
pulling directly off the block pool is the state pool so we can just
move the block_size there.  The one exception is when we allocate
binding tables but we can just reference the state pool there as well.

The only functional change here is that we no longer grow the block pool
immediately upon creation so no BO gets allocated until our first state
allocation.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
30d63ffe26 anv/allocator: Pull the userptr part of block_pool_grow into a helper
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
c73ce41a48 anv/allocator: Roll fixed_size_state_pool into state_pool
The helper functions aren't really gaining us as much as they claim and
are actually about to be in the way.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
6d02ef011e anv/allocator: Remove the state_size field from fixed_size_state_pool
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
367031a5c8 anv: Get rid of a bunch of uses of size_t
We should only use size_t when referring to sizes of bits of CPU memory.
Anything on the GPU or just a regular array length should be a type that
has the same size on both 32 and 64-bit architectures.  For state
objects, we use a uint32_t because we'll never allocate a piece of
driver-internal GPU state larger than 2GB (more like 16KB).

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00