Commit Graph

17 Commits

Author SHA1 Message Date
Lionel Landwerlin
9a7b319903 anv/query: flush render target before copying results
This change tracks render target writes in the pipeline and applies a
render target flush before copying the query results to make sure the
preceding operations have landed in memory before the command streamer
initiates the copy.

v2: Simplify logic in CopyQueryResults (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108909
Fixes: 37f9788e9a ("anv: flush pipeline before query result copies")
Cc: mesa-stable@lists.freedesktop.org
2018-12-05 11:43:34 +00:00
Jason Ekstrand
7a89a0d9ed anv: Use separate MOCS settings for external BOs
On Broadwell and above, we have to use different MOCS settings to allow
the kernel to take over and disable caching when needed for external
buffers.  On Broadwell, this is especially important because the kernel
can't disable eLLC so we have to do it in userspace.  We very badly
don't want to do that on everything so we need separate MOCS for
external and internal BOs.

In order to do this, we add an anv-specific BO flag for "external" and
use that to distinguish between buffers which may be shared with other
processes and/or display and those which are entirely internal.  That,
together with an anv_mocs_for_bo helper lets us choose the right MOCS
settings for each BO use.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99507
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-03 09:03:03 -05:00
Jason Ekstrand
c811af767e anv/so_memcpy: Don't consider src/dst_offset when computing block size
The only thing that matters is the size since we never specify any
offsets in terms of blocks.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-09-19 09:38:04 -05:00
Jason Ekstrand
40149441b8 anv: Add a mi_memset and use it for zeroing queries
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-09-17 02:57:21 -05:00
Caio Marcelo de Oliveira Filho
f9d25f630c anv/memcpy: fix build after starting to use addresses
The offsets now come from the anv_address, these references were not
updated and using the old variable.

Fixes: e1ab834557 "anv/memcpy: Use addresses instead of bo+offset"
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
2018-09-14 21:45:50 -07:00
Jason Ekstrand
e1ab834557 anv/memcpy: Use addresses instead of bo+offset
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-09-14 22:12:11 -05:00
Jason Ekstrand
90b46f6c17 anv/so_memcpy: Use the correct SO_BUFFER size on gen8+
This shouldn't matter as we'll never write OOB anyway but we may as well
get it right.  It's supposed to be in dwords - 1.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-09-14 22:12:11 -05:00
Kenneth Graunke
0472aa3efe intel: Drop SURFACE_FORMAT enum from genxml.
We want people to be using ISL_FORMAT_*, rather than the genxml format
enumerations. This patch drops 10 separate copies, and drops a bunch
of ugly casting.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
[jordan.l.justen@intel.com: Minor changes for rebase]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-05 09:51:08 -08:00
Jason Ekstrand
116e818ef1 anv/gpu_memcpy: CS Stall before a MI memcpy on gen7
This fixes a pile of hangs caused by the recent shuffling of resolves
and transitions.  The particularly problematic case is when you have at
least three attachments with load ops of CLEAR, LOAD, CLEAR.  In this
case, we execute the first CLEAR followed by a MI memcpy to copy the
clear values over for the LOAD followed by a second CLEAR.  The MI
commands cause the first CLEAR to hang which causes us to get stuck on
the 3DSTATE_MULTISAMPLE in the second CLEAR.

We also add guards for BLORP to fix the same issue.  These shouldn't
actually do anything right now because the only use of indirect clears
in BLORP today is for resolves which are already guarded by a render
cache flush and CS stall.  However, this will guard us against potential
issues in the future.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
2018-02-20 13:49:19 -08:00
Jason Ekstrand
e85aaec148 anv/cmd_buffer: Move dirty bits into anv_cmd_*_state
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
2018-01-23 21:10:36 -08:00
Matt Turner
5d4afef459 anv: Explicitly cast between different enums
Fixes warnings like

warning: implicit conversion from enumeration type 'enum isl_format' to
different enumeration type 'enum GEN10_SURFACE_FORMAT'
[-Wenum-conversion]
         .SourceElementFormat = ISL_FORMAT_R32_UINT,
                                ^~~~~~~~~~~~~~~~~~~

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Matt Turner
6cfc49287d anv: Remove 'inline' keywords
Unless you have data, the compiler knows better than you whether a
function should be inlined.

No difference in the resulting binary with gcc-6.3.0 or clang-4.0.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Nanley Chery
0b16600056 anv/gpu_memcpy: Add a lighter-weight GPU memcpy function
We'll be performing a GPU memcpy in more places to copy small amounts of
data. Add an alternate function that thrashes less state.

v2:
- Make a new function (Jason Ekstrand).
- Move the #define into the function.
v3:
- Update the function name (Jason).
- Update comments.
v4: Use an indirect drawing register as TEMP_REG (Jason Ekstrand).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery
d6748f1fc4 anv/gpu_memcpy: Rename the gpu_memcpy function
A GPU memcpy function could alternatively be implemented using MI_*
commands. Provide more detail into how this one operates in case another
memcpy function is created.

v2:
- Update the commit message.
v3:
- Use 'memcpy' instead of 'cpy' (Jason Ekstrand)
- Shorten 'streamout' to 'so'

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Jason Ekstrand
dda54890f3 anv: Disable VF statistics for blorp and SOL memcpy
In order to get accurate statistics, we need to disable statistics for
blits, clears, and the surface state memcpy at the top of each secondary
command buffer.  There are two possible approaches to this:

 1) Disable before the blit/memcpy and re-enable afterwards

 2) Move emitting 3DSTATE_VF_STATISTICS from initialization and make it
    part of pipeline state and then just disabale statistics before
    blits and memcpy operations.

Emitting 3DSTATE_VF_STATISTICS should be fairly cheap so it doesn't
really matter which path we take.  We choose the second option as it's
more consistent with the way the rest of the statistics are enabled and
disabled.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Kenneth Graunke
9ef2b9277d intel: Share URB configuration code between GL and Vulkan.
This code is far too complicated to cut and paste.

v2: Update the newly added genX_gpu_memcpy.c; const a few things.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-19 11:40:01 -08:00
Jason Ekstrand
3d9747780b anv: Add a helper for doing buffer copies with nothing but VF and SOL.
This method of doing copies has the advantage of touching very little of
the GPU state.  While it does disable all the shader stages, it doesn't
have to blow away binding tables, viewports, scissors, or any other bits of
dynamic state other than VBO 32 which is already reserved.  All of the
state that it does touch is contained within a pipeline anyway so that's
the only thing that has to be dirtied.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-11-16 10:11:07 -08:00