Commit Graph

47 Commits

Author SHA1 Message Date
Jason Ekstrand
853d3b59fd anv: Allocate misc BOs from the cache
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:09 +00:00
Jason Ekstrand
c4be72934e anv: Fix a relocation race condition
Previously, we would read the offset from the BO in anv_reloc_list_add
to generate the presumed offset and then again in the caller to compute
the 64-bit address to write into the buffer.  However, if the offset
somehow changed between these two points, the presumed offset would no
longer match the written offset.  This is unlikely to actually ever be a
problem in practice because the presumed offset gets recorded first and
so if the written address is wrong then the presumed offset is almost
certainly wrong and the relocation will trigger.  However, it's much
safer to simply have anv_reloc_list_add return the 64-bit address.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-31 13:46:08 +00:00
Jordan Justen
7737f56544 anv/gen12: Write GFX_AUX_TABLE base address register
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-28 00:09:14 -07:00
Francisco Jerez
c2fe7a0fb8 anv/gen9: Optimize slice and subslice load balancing behavior.
See "i965/gen9: Optimize slice and subslice load balancing behavior."
for the rationale.  According to Jason, improves Aztec Ruins
performance by 2.7%.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)

v2: Undo CPU performance micro-optimization done in i965 and iris due
    to lack of data justifying it on anv.  Use
    cmd_buffer_apply_pipe_flushes wrapper instead of emitting pipe
    control command directly.  (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-12 14:40:21 -07:00
Lionel Landwerlin
9628631a38 Revert "anv: limit URB reconfigurations when using blorp"
In commit 0d46e404 ("anv: limit URB reconfigurations when using
blorp") we tried to limit the number of URB reconfiguration by
checking if the last allocation is large enough to fit the blorp
dispatch.

We used the last bound pipeline to compare the allocation. The problem
with this is that the pipeline is bound but its commands might not
have been emitted into the command buffer yet.

Let's just revert commit 0d46e40467
since it didn't seem to yield any performance improvement.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0d46e404 ("anv: limit URB reconfigurations when using blorp")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110535
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-04-29 11:41:27 +00:00
Lionel Landwerlin
0d46e40467 anv: limit URB reconfigurations when using blorp
If the last graphics pipeline bound to the command buffer has enough
space in its VS URB entries for Blorp then avoid reconfiguring the URB
partitions.

v2: s/0/MESA_SHADER_VERTEX/ (Caio)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 16:58:06 +01:00
Kenneth Graunke
8bf9b7b5b6 intel: Emit 3DSTATE_VF_STATISTICS dynamically
Pipeline statistics queries should not count BLORP's rectangles.

    (23) How do operations like Clear, TexSubImage, etc. affect the
         results of the newly introduced queries?

      DISCUSSION: Implementations might require "helper" rendering
      commands be issued to implement certain operations like Clear,
      TexSubImage, etc.

      RESOLVED: They don't. Only application submitted rendering
      commands should have an effect on the results of the queries.

Piglit's arb_pipeline_statistics_query-vert_adj exposes this bug when
the driver is hacked to always perform glBufferData via a GPU staging
copy (for debugging purposes).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-14 19:58:04 -07:00
Lionel Landwerlin
3c4c18341a anv: narrow flushing of the render target to buffer writes
In commit 9a7b319903 ("anv/query: flush render target before
copying results") we tracked all the render target writes to apply a
flushes in the vkCopyQueryResults(). But we can narrow this down to
only when we write a buffer (which is the only input of
vkCopyQueryResults).

v2: Drop newer render target write flags introduce by 1952fd8d2c
    ("anv: Implement VK_EXT_conditional_rendering for gen 7.5+")

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
2019-01-19 15:45:41 +00:00
Rafael Antognolli
643248b66a anv: Remove state flush.
We have all the state buffers snooped, so we don't need to clflush
everything anymore.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:22 -08:00
Rafael Antognolli
e3dc56d731 anv: Update usage of block_pool->bo.
Change block_pool->bo to be a pointer, and update its usage everywhere.
This makes it simpler to switch it later to a list of BOs.

v3:
 - Use a static "bos" field in the struct, instead of malloc'ing it.
 This will be later changed to a fixed length array of BOs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:02 -08:00
Rafael Antognolli
e8b6e0a5ba anv/allocator: Add getter for anv_block_pool.
We will need the anv_block_pool_map to find the map relative to some BO
that is not at the start of the block pool.

v2: just return a pointer instead of a struct (Jason)
v4: Update comment (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:07:43 -08:00
Kenneth Graunke
084a1cdbb7 blorp: Add blorp_get_surface_address to the driver interface.
Currently, BLORP expects drivers to provide two functions for dealing
with buffers: blorp_emit_reloc and blorp_surface_reloc.  Both record a
relocation and combine the BO address and offset into a full 64-bit
address.  Traditionally, blorp_surface_reloc has written that combined
address to an implicitly-known buffer where surface states are stored.
(In contrast, blorp_emit_reloc returns the value.)

The upcoming Iris driver stores surface states in multiple buffers,
which makes it impossible for blorp_surface_reloc to write the combined
address - it only takes an offset, not the actual buffer to write to.

This commit adds a third function, blorp_get_surface_address, which
combines and returns an address, which is then passed to ISL's surface
state fill functions.  Softpin-only drivers can return a real address
here and skip writing it in blorp_surface_reloc.  Relocation-based
drivers are have options.  They can simply return 0 from the new
function, and continue writing the address from blorp_surface_reloc.
Or, they can return a presumed address from blorp_get_surface_address,
and have other relocation processing write the real value later.

For now, i965 and anv simply return 0.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-10 20:51:53 -08:00
Lionel Landwerlin
9a7b319903 anv/query: flush render target before copying results
This change tracks render target writes in the pipeline and applies a
render target flush before copying the query results to make sure the
preceding operations have landed in memory before the command streamer
initiates the copy.

v2: Simplify logic in CopyQueryResults (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108909
Fixes: 37f9788e9a ("anv: flush pipeline before query result copies")
Cc: mesa-stable@lists.freedesktop.org
2018-12-05 11:43:34 +00:00
Kenneth Graunke
74259b98aa intel/blorp: Emit VF cache invalidates for 48-bit bugs with softpin.
commit 92f01fc5f9 made i965 start emitting
VF cache invalidates when the high bits of vertex buffers change.  But
we were not tracking vertex buffers emitted by BLORP.  This was papered
over by a mistake where I emitted VF cache invalidates all the time,
which Chris fixed in commit 3ac5fbadfd.

This patch adds a new hook which allows the driver to track addresses
and request a VF cache invalidate as appropriate.

v2: Make the driver do the PIPE_CONTROL so it can apply workarounds
    (caught by Jason Ekstrand).  Rebase on anv bug fix.
v3: Don't screw up the boolean (caught by Jason Ekstrand).

Fixes: 92f01fc5f9 ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-06-06 19:45:09 -07:00
Scott D Phillips
29a139b308 anv/blorp: Write relocated values into surface states
v2 (Jason Ekstrand):
 - Split the blorp bit into it's own patch and re-order a bit
 - Use anv_address helpers

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-31 16:51:47 -07:00
Jason Ekstrand
185630c6bc anv/blorp: Do the gen11 BTI flush
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-04-20 16:30:14 -07:00
Nanley Chery
377da9eb78 blorp: Silence unused function warnings
vulkan/genX_blorp_exec.c:69:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from vulkan/genX_blorp_exec.c:35:0:
./blorp/blorp_genX_exec.h:1249:1: warning: ‘blorp_emit_memcpy’ defined but not used [-Wunused-function]
 blorp_emit_memcpy(struct blorp_batch *batch,
 ^~~~~~~~~~~~~~~~~
genX_blorp_exec.c:99:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from genX_blorp_exec.c:33:0:
../../../../../src/intel/blorp/blorp_genX_exec.h:1249:1: warning: ‘blorp_emit_memcpy’ defined but not used [-Wunused-function]
 blorp_emit_memcpy(struct blorp_batch *batch,
 ^~~~~~~~~~~~~~~~~

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-04-11 13:04:49 -07:00
Lionel Landwerlin
0f544a3c51 anv: silence unused function warning on gen11
[84/227] Compiling C object 'src/intel/vulkan/libanv_gen110@sta/genX_blorp_exec.c.o'.
../src/intel/vulkan/genX_blorp_exec.c:68:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function]
 blorp_get_surface_base_address(struct blorp_batch *batch)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-03-15 18:55:42 +00:00
Jason Ekstrand
f4f95496cb anv/blorp: Allow indirect clear colors on blorp sources on gen7
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-01 14:07:58 -08:00
Jason Ekstrand
116e818ef1 anv/gpu_memcpy: CS Stall before a MI memcpy on gen7
This fixes a pile of hangs caused by the recent shuffling of resolves
and transitions.  The particularly problematic case is when you have at
least three attachments with load ops of CLEAR, LOAD, CLEAR.  In this
case, we execute the first CLEAR followed by a MI memcpy to copy the
clear values over for the LOAD followed by a second CLEAR.  The MI
commands cause the first CLEAR to hang which causes us to get stuck on
the 3DSTATE_MULTISAMPLE in the second CLEAR.

We also add guards for BLORP to fix the same issue.  These shouldn't
actually do anything right now because the only use of indirect clears
in BLORP today is for resolves which are already guarded by a render
cache flush and CS stall.  However, this will guard us against potential
issues in the future.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:19 -08:00
Jason Ekstrand
8bd5ec5b86 anv/cmd_buffer: Move vb_dirty bits into anv_cmd_graphics_state
Vertex buffers are entirely a graphics pipeline thing.

Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:39 -08:00
Jason Ekstrand
e85aaec148 anv/cmd_buffer: Move dirty bits into anv_cmd_*_state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:36 -08:00
Jason Ekstrand
67b676f0c5 intel/blorp: Add initial support for indirect clear colors
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-11-27 16:22:12 -08:00
Jason Ekstrand
bc933d0e84 intel/blorp: Make the MOCS setting part of blorp_address
This makes our MOCS settings significantly more flexible.

Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Tested-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-13 19:40:10 -08:00
Kenneth Graunke
3e50607a40 intel: Move clflush helpers from anv to common/gen_clflush.h.
I want to use these in the OpenGL driver as well.

v2: Add to COMMON_FILES in Makefile.sources (caught by Emil)

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-10 15:55:19 -07:00
Jason Ekstrand
9cb6ac62fb intel/blorp: Plumb through access to the workaround BO
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101283
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 08:54:54 -07:00
Jason Ekstrand
0ed6f196fc intel/blorp: Add support for gen4-5 SF programs
As part of enabling support for SF programs, we plumb the SF URB size
through to emit_urb_config.  For now, it's always zero but, on gen4, it
may be something larger.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
d3ed72e2c2 anv/allocator: Embed the block_pool in the state_pool
Now that the state stream is allocating off of the state pool, there's
no reason why we need the block pool to be separate.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
5d1ba2cb04 anv/blorp: Align vertex buffers to 64B
This fixes issues seen when adding support for full 48-bit addresses.
The 48-bit addresses themselves have nothing to do with it other than
that it caused the kernel to place buffers slightly differently so they
interacted differently with the caches.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-04-04 18:33:52 -07:00
Kenneth Graunke
0c3fbf8028 i965: Drop AUB_TRACE_* stuff.
This was used for aubdumping (deleted a while ago) and INTEL_DEBUG=bat
decoding (deleted recently).

While we're changing parameters, delete the wrapper macro and make the
actual function brw_state_batch instead of __brw_state_batch.

This subsumes a patch by Emil Velikov to drop this from BLORP.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-21 13:49:18 -07:00
Jason Ekstrand
dda54890f3 anv: Disable VF statistics for blorp and SOL memcpy
In order to get accurate statistics, we need to disable statistics for
blits, clears, and the surface state memcpy at the top of each secondary
command buffer.  There are two possible approaches to this:

 1) Disable before the blit/memcpy and re-enable afterwards

 2) Move emitting 3DSTATE_VF_STATISTICS from initialization and make it
    part of pipeline state and then just disabale statistics before
    blits and memcpy operations.

Emitting 3DSTATE_VF_STATISTICS should be fairly cheap so it doesn't
really matter which path we take.  We choose the second option as it's
more consistent with the way the rest of the statistics are enabled and
disabled.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Iago Toral Quiroga
be52f9693a anv/blorp: make anv_cmd_buffer_alloc_blorp_binding_table() return a VkResult
Instead of asserting inside the function, and then use use that information
to return early from its callers upon failure.

v2:
  - Make sure that clear_color_attachment() and
    clear_depth_stencil_attachment() get the VkResult as well so they
    avoid executing the batch if an error happened. (Topi)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
68d88f0237 anv: handle failures when growing reloc lists
Growing the reloc list happens through calling anv_reloc_list_add() or
anv_reloc_list_append(). Make sure that we call these through helpers
that check the result and set the batch error status if needed.

v2:
  - Handling the crashes is not good enough, we need to keep track of
    the error, for that, keep track of the errors in the batch instead (Jason).
  - Make reloc list growth go through helpers so we can have a central
    place where we can do error tracking (Jason).

v3:
  - Callers that need the offset returned by anv_reloc_list_add() can
    compute it themselves since it is extracted from the inputs to the
    function, so change the function to return a VkResult, make
    anv_batch_emit_reloc() also return a VkResult and let their callers
    do the error management (Topi)

v4:
  - Let anv_batch_emit_reloc() return an uint64_t as it originally did,
    there is no real benefit in having it return a VkResult.
  - Do not add an is_aux parameter to add_surface_state_reloc(), instead
    do error checking for aux in add_image_view_relocs() separately.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
808503b8f8 anv: remove unnecessary function prototype.
The function is defined right after the prototype declaration. Also, the
protoype for it is included in anv_genX.h which is included via anv_private.h.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Jason Ekstrand
f31ed6d0cd anv: Take a device parameter in anv_state_flush
This allows the helper to check for llc instead of having to do it
manually at all the call sites.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-21 12:26:35 -08:00
Jason Ekstrand
f9d7d27d6d anv: Rename clflush_range and state_clflush
It's a bit shorter and easier to work with.  Also, we're about to add a
helper called clflush which does the clflush but without any memory
fencing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-21 12:26:35 -08:00
Jason Ekstrand
075ed20614 intel/blorp: Explicitly flush all allocated state
Found by inspection.  However, I expect it fixes real bugs when using
blorp from Vulkan on little-core platforms.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-02-21 12:26:35 -08:00
Jason Ekstrand
e8d52dab48 anv: Add support for the PMA fix on Broadwell
This helps Dota 2 on Broadwell by 8-9%.  I also hacked up the driver and
used the Sascha "shadowmapping" demo to get some results.  Setting
uses_kill to true dropped the framerate on the demo by 25-30%.  Enabling
the PMA fix brought it back up to around 90% of the original framerate.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-02-14 14:18:55 -08:00
Jason Ekstrand
a8b85f1f77 anv: Implement a depth stall restriction on gen7
Fixes around 60 Vulkan CTS tests on Haswell

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
2016-11-20 20:40:40 -08:00
Kenneth Graunke
9ef2b9277d intel: Share URB configuration code between GL and Vulkan.
This code is far too complicated to cut and paste.

v2: Update the newly added genX_gpu_memcpy.c; const a few things.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-19 11:40:01 -08:00
Jason Ekstrand
e371850d94 anv/blorp: Break the guts of alloc_binding_table into a shared helper
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-16 10:11:29 -08:00
Jason Ekstrand
52904ba85c anv: Get rid of anv_cmd_buffer_emit_state_base_address
All code that would have once called this can now call the gen-specific
version.  The switching version is no longer needed.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-10-17 17:41:35 -07:00
Jason Ekstrand
146ee31159 anv/blorp: Don't hand-roll flush_pipeline_select_3d
When I initially brought up Vulkan blorp, I completely missed that this
was already factored out.  There's no good reason for us to hand-roll it.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-10-14 15:39:41 -07:00
Jason Ekstrand
a814e18c96 intel/blorp: Stop setting 3DSTATE_DRAWING_RECTANGLE
The Vulkan driver sets 3DSTATE_DRAWING_RECTANGLE once to MAX_INT x MAX_INT
at the GPU initialization time and never sets it again.  The GL driver sets
it every time the framebuffer changes.  Originally, blorp set it to the
size of the drawing area but meant we had to set it back in the Vulkan
driver.  Instead, we can easily just do that in the GL driver's blorp_exec
implementation and not set it in blorp core.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-09-14 17:51:16 -07:00
Jason Ekstrand
b56f509ee0 intel/blorp: Emit 3DSTATE_MULTISAMPLE directly
Previously, we relied on a driver hook for 3DSTATE_MULTISAMPLE.  However,
now that Vulkan and GL use the same sample positions, we can set up
3DSTATE_MULTISAMPLE directly in blorp and delete the driver hook.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-09-14 17:51:16 -07:00
Jason Ekstrand
c779ad3e66 intel: Move Vulkan sample positions to common code
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-09-14 17:51:16 -07:00
Jason Ekstrand
8f780af968 anv: Add initial blorp support
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-09-13 12:40:12 -07:00