Commit Graph

303 Commits

Author SHA1 Message Date
Lionel Landwerlin
3c4c18341a anv: narrow flushing of the render target to buffer writes
In commit 9a7b319903 ("anv/query: flush render target before
copying results") we tracked all the render target writes to apply a
flushes in the vkCopyQueryResults(). But we can narrow this down to
only when we write a buffer (which is the only input of
vkCopyQueryResults).

v2: Drop newer render target write flags introduce by 1952fd8d2c
    ("anv: Implement VK_EXT_conditional_rendering for gen 7.5+")

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
2019-01-19 15:45:41 +00:00
Danylo Piliaiev
1952fd8d2c anv: Implement VK_EXT_conditional_rendering for gen 7.5+
Conditional rendering affects next functions:
- vkCmdDraw, vkCmdDrawIndexed, vkCmdDrawIndirect, vkCmdDrawIndexedIndirect
- vkCmdDrawIndirectCountKHR, vkCmdDrawIndexedIndirectCountKHR
- vkCmdDispatch, vkCmdDispatchIndirect, vkCmdDispatchBase
- vkCmdClearAttachments

Value from conditional buffer is cached into designated register,
MI_PREDICATE is emitted every time conditional rendering is enabled
and command requires it.

v2: by Jason Ekstrand
  - Use vk_find_struct_const instead of manually looping
  - Move draw count loading to prepare function
  - Zero the top 32-bits of MI_ALU_REG15

v3: Apply pipeline flush before accessing conditional buffer
 (The issue was found by Samuel Iglesias)

v4: - Remove support of Haswell due to possible hardware bug
    - Made TMP_REG_PREDICATE and TMP_REG_DRAW_COUNT defines to
       define registers in one place.

v5: thanks to Jason Ekstrand and Lionel Landwerlin
    - Workaround the fact that MI_PREDICATE_RESULT is not
      accessible on Haswell by manually calculating
      MI_PREDICATE_RESULT and re-emitting MI_PREDICATE
      when necessary.

v6: suggested by Lionel Landwerlin
    - Instead of calculating the result of predicate once - re-emit
      MI_PREDICATE to make it easier to investigate error states.

v7: suggested by Jason
    - Make anv_pipe_invalidate_bits_for_access_flag add CS_STALL
      if VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT is set.

v8: suggested by Lionel
    - Precompute conditional predicate's result to
      support secondary command buffers.
    - Make prepare_for_draw_count_predicate more readable.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-18 18:31:44 +00:00
Danylo Piliaiev
ed6e2bf263 anv: Implement VK_KHR_draw_indirect_count for gen 7+
v2: by Jason Ekstrand
  - Move out of the draw loop population of registers
    which aren't changed in it.
  - Remove dependency on ALU registers.
  - Clarify usage of PIPE_CONTROL
  - Without usage of ALU registers patch works for gen7+

v3: set pending_pipe_bits |= ANV_PIPE_RENDER_TARGET_WRITES

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-18 18:31:44 +00:00
Rafael Antognolli
643248b66a anv: Remove state flush.
We have all the state buffers snooped, so we don't need to clflush
everything anymore.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:22 -08:00
Rafael Antognolli
e3dc56d731 anv: Update usage of block_pool->bo.
Change block_pool->bo to be a pointer, and update its usage everywhere.
This makes it simpler to switch it later to a list of BOs.

v3:
 - Use a static "bos" field in the struct, instead of malloc'ing it.
 This will be later changed to a fixed length array of BOs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:02 -08:00
Lionel Landwerlin
4149d41f2e anv: fix invalid binding table index computation
The ++ operator strikes again.

Fixes: f92c5bc8f3 ("anv/device: fix maximum number of images supported")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 11:49:10 -08:00
Iago Toral Quiroga
f92c5bc8f3 anv/device: fix maximum number of images supported
We had defined MAX_IMAGES as 8, which we used to size the array for
image push constant data. The comment there stated that this was for
gen8, but anv_nir_apply_pipeline_layout runs for all gens and writes
that array, asserting that we don't exceed that number of images,
which imposes a limit of MAX_IMAGES on all gens.

Furthermore, despite this, we are exposing up to 64 images per shader
stage on all gens, gen8 included.

This patch lowers the number of images we expose in gen8 to 8 and
keeps 64 images for gen9+ while making sure that only pre-SKL gens
use push constant space to handle images.

v2:
 - <= instead of < in the assert (Eric, Lionel)
 - Change the way the assertion is written (Eric)

v3:
 - Revert the way the assertion is written to the form it had in v1,
   the version in v2 was not equivalent and was incorrect. (Lionel)

v4:
 - gen9+ doesn't need push constants for images at all (Jason)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v3)
2019-01-17 07:59:00 +01:00
Jason Ekstrand
5e4f9ea363 anv: Implement VK_KHR_depth_stencil_resolve 2019-01-14 10:16:52 -06:00
Jason Ekstrand
9f44088468 anv: Move resolve_subpass to genX_cmd_buffer.c
We may have to do transitions around certain kinds of resolves so it
helps to have it genX code.
2019-01-14 10:16:52 -06:00
Lionel Landwerlin
add5a2ec92 anv: flush fast clear colors into compressed surfaces
In the following scenario :

   1. Create image format R8G8B8A8_UNORM
   2. Create image view format R8G8B8A8_SRGB
   3. Clear the view through a sub pass to a particular color
   4. Barrier on the image to from color attachment to source transfer
   5. Copy the image into a linear buffer to check the content

The step 4 resolving the clear color is unaware of the SRGB format of
the view, because the blorp resolve operations operate on images the
color associated with the resolve will not operate on SRGB format but
UNORM. Leading to the wrong color being written into surfaces.

This change forces a clear color resolve at the end of the render pass
so following resolves won't have to deal with the clear color with a
format that doesn't match the image's format.

On gfxbench vulkan_5_normal 1280x720, this appear to cost us ~0.5fps,
from 49.316 down to 48.949.

v2: Only fast clear resolve when image & view have different formats
    (Lionel)

v3: Update warning (Jason)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108911
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2019-01-08 16:37:00 +00:00
Lionel Landwerlin
366eb656ac anv: explictly specify format for blorp ccs/mcs op
Resolve operations can happen when dealing with view (begin/end
subpasses) in which case the view's format needs to apply, not the
image's format.

v2: Relayout arguments of a ccs_op() call (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108911
Cc: mesa-stable@lists.freedesktop.org
2019-01-08 16:36:56 +00:00
Lionel Landwerlin
e2ae5f2f0a anv: don't do partial resolve on layer > 0
We've made the choice not to use fast clears on layer > 0 with
multilayer images. This is partly because we would need to store
multiple clear colors for each layer, making the existing memory
layout, already including aux surfaces, fast clear color, image state,
etc... even more complex.

Partial resolves are the operations transfering the clear colors into
the auxiliary buffers. This operation is currently implemented in
Blorp by loading the clear color from the image's BO, into a shader
that then samples from the auxiliary buffer and writes the color only
if it isn't there already.

The problem here is that because we store only one clear color for all
layers and it is used for partial resolves. If you trigger a partial
clear on a layer > 0, then you're likely to deal with a color that is
not what you actually want. In the particular issues below, we have
multiple layers, each cleared with a different color but the partial
resolve just writes the wrong color into the auxiliary buffers for
layers > 0.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108910
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108911
Cc: mesa-stable@lists.freedesktop.org
2018-12-24 09:42:46 +00:00
Kenneth Graunke
0b44644ca6 genxml: Consistently use a numeric "MOCS" field
When we first started using genxml, we decided to represent MOCS as an
actual structure, and pack values.  However, in many places, it was more
convenient to use a numeric value rather than treating it as a struct,
so we added secondary setters in a bunch of places as well.

We were not entirely consistent, either.  Some places only had one.
Gen6 had both kinds of setters for STATE_BASE_ADDRESS, but newer gens
only had the struct-based setters.  The names were sometimes "Constant
Buffer Object Control State" instead of "Memory", making it harder to
find.  Many had prefixes like "Vertex Buffer MOCS"...in a vertex buffer
packet...which is a bit redundant.

On modern hardware, MOCS is simply an index into a table, but we were
still carrying around the structure with an "Index to MOCS Table" field,
in addition to the direct numeric setters.  This is clunky - we really
just want a number on new hardware.

This patch eliminates the struct-based setters, and makes the numeric
setters be consistently called "MOCS".  We leave the struct definition
around on Gen7-8 for reference purposes, but it is unused.

v2: Drop bonus "Depth Buffer MOCS" fields on Gen7.5 and Gen9

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2018-12-14 00:44:54 -08:00
Lionel Landwerlin
9a7b319903 anv/query: flush render target before copying results
This change tracks render target writes in the pipeline and applies a
render target flush before copying the query results to make sure the
preceding operations have landed in memory before the command streamer
initiates the copy.

v2: Simplify logic in CopyQueryResults (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108909
Fixes: 37f9788e9a ("anv: flush pipeline before query result copies")
Cc: mesa-stable@lists.freedesktop.org
2018-12-05 11:43:34 +00:00
Anuj Phogat
16e4911972 anv/icl: Set use full ways in L3CNTLREG
L3 allocation table in h/w specification recommends using 4 KB
granularity for programming allocation fields in L3CNTLREG.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2018-11-26 15:11:36 -08:00
Anuj Phogat
13c955182f anv/icl: Set Error Detection Behavior Control Bit in L3CNTLREG
The default setting of this bit is not the desirable behavior.
WA_1406697149

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-11-01 12:00:23 -07:00
Jordan Justen
d18a0d955e anv/gen9+: Initialize new fields in STATE_BASE_ADDRESS
Ref: 263b584d5e "i965/skl: Emit extra zeros in STATE_BASE_ADDRESS on Skylake."
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2018-10-11 15:16:00 -07:00
Jason Ekstrand
7a89a0d9ed anv: Use separate MOCS settings for external BOs
On Broadwell and above, we have to use different MOCS settings to allow
the kernel to take over and disable caching when needed for external
buffers.  On Broadwell, this is especially important because the kernel
can't disable eLLC so we have to do it in userspace.  We very badly
don't want to do that on everything so we need separate MOCS for
external and internal BOs.

In order to do this, we add an anv-specific BO flag for "external" and
use that to distinguish between buffers which may be shared with other
processes and/or display and those which are entirely internal.  That,
together with an anv_mocs_for_bo helper lets us choose the right MOCS
settings for each BO use.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99507
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-10-03 09:03:03 -05:00
Jason Ekstrand
b3f477ef7a intel/isl: Add a unit suffixes to some struct fields and variables
I was about to make the claim to someone that every field in isl_surf
is either an enum or has explicit units.  Then I looked at isl_surf and
discovered this claim was wrong.  We should fix that.  This commit does
a few refactors:

 * Add _B suffixes to some struct fields
 * Add _B to some variables and parameters
 * Rename row_pitch_tiles -> row_pitch_tl

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-09-26 08:52:26 -05:00
Jason Ekstrand
d6a73824bd anv/cmd_buffer: Take an address in emit_lrm
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-09-14 22:12:11 -05:00
Jason Ekstrand
e1ab834557 anv/memcpy: Use addresses instead of bo+offset
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-09-14 22:12:11 -05:00
Jason Ekstrand
465e5a868c anv: Clamp scissors to the framebuffer boundary
The Vulkan 1.1.81 spec says:

    "It is legal for offset.x + extent.width or offset.y + extent.height
    to exceed the dimensions of the framebuffer - the scissor test still
    applies as defined above. Rasterization does not produce fragments
    outside of the framebuffer, so such fragments never have the scissor
    test performed on them."

Elsewhere, the Vulkan 1.1.81 spec says:

    "The application must ensure (using scissor if necessary) that all
    rendering is contained within the render area, otherwise the pixels
    outside of the render area become undefined and shader side effects
    may occur for fragments outside the render area. The render area
    must be contained within the framebuffer dimensions."

Unfortunately, there's some room for interpretation here as to what the
consequences are of having the render area set to exactly the
framebuffer dimensions and having a scissor that is larger than the
framebuffer.  Given that GL and other APIs provide automatic clipping to
the framebuffer, it makes sense that applications would assume that
Vulkan does this as well.  It costs us very little to play it safe and
just clamp client-provided scissors to the framebuffer dimensions.
Fortunately, the user is required to provide us with at least one
scissor so we don't need to handle the case where they don't.

Fixes: fb2a5ceb32 "anv: Emit DRAWING_RECTANGLE once at driver..."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-09-07 15:19:02 -05:00
Jason Ekstrand
5dee89438a anv: Implement a VF cache invalidate workaround
Known to fix nothing whatsoever but it's in the docs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-09-07 15:19:02 -05:00
Jason Ekstrand
c643c5e18d anv: Re-emit vertex buffers when the pipeline changes
Some of the bits of VERTEX_BUFFER_STATE such as access type, instance
data step rate, and pitch come from the pipeline.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-09-07 15:19:02 -05:00
Dylan Baker
8396043f30 Replace uses of _mesa_bitcount with util_bitcount
and _mesa_bitcount_64 with util_bitcount_64. This fixes a build problem
in nir for platforms that don't have popcount or popcountll, such as
32bit msvc.

v2: - Fix additional uses of _mesa_bitcount added after this was
      originally written

Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-09-07 10:21:26 -07:00
Jason Ekstrand
d8033d4083 intel/compiler: Remove surface_idx from brw_image_param
Now that the drivers are lowering to surface indices themselves, we no
longer need to push the surface index into the shader.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:03 -05:00
Jason Ekstrand
2caf6c0392 anv/pipeline: Add a per-VB instance divisor
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-07-09 15:37:51 -07:00
Jason Ekstrand
32f4feb5a0 anv/pipeline: Use a per-VB struct instead of separate arrays
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-07-09 15:37:51 -07:00
Jason Ekstrand
a695de5845 anv: Add support for VK_KHR_create_renderpass2
The implementation of CreateRenderPass2 uses the helpers we broke out in
previous commits.  The implementations of the new vkCmd functions just
call the old versions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-09 10:11:53 -07:00
Jason Ekstrand
208be8eafa anv: Make subpass::depth_stencil_attachment a pointer
This makes certain checks a bit easier and means that we don't have
the attachment information duplicated in the attachment list and in
depth_stencil_attachment.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-07-09 10:11:53 -07:00
Jason Ekstrand
70ce880434 anv: Add state setup support for shader constants
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-07-02 12:09:49 -07:00
Iago Toral Quiroga
1b54824687 anv/cmd_buffer: make descriptors dirty when emitting base state address
Every time we emit a new state base address we will need to re-emit our
binding tables, since they might have been emitted with a different base
state adress.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
CC: <mesa-stable@lists.freedesktop.org>
2018-07-02 08:31:20 +02:00
Iago Toral Quiroga
6a1d8350c9 anv/cmd_buffer: clean dirty push constants flag after emitting push constants
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
CC: <mesa-stable@lists.freedesktop.org>
2018-07-02 08:31:02 +02:00
Jason Ekstrand
bf34ef16ac anv: Use an address for each anv_image plane
This is better than having BO and offset fields.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
1f2328c3b7 anv/cmd_buffer: Rework surface relocation helpers
This commit renames add_surface_state_reloc to add_surface_reloc and
makes it takes an address.  We also rename add_image_view_relocs to
add_surface_state_relocs because it takes an anv_surface_state and
doesn't really care about the image view anymore.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
f270a09737 anv: Use an anv_address in anv_buffer
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
8a8bd39d5e anv/cmd_buffer: Use anv_address for handling indirect parameters
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
1029458ee3 anv: Use an anv_address in anv_buffer_view
Instead of storing a BO and offset separately, use an anv_address.  This
changes anv_fill_buffer_surface_state to use anv_address and we now call
anv_address_physical and pass that into ISL.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
de1c5c1b50 anv: Use full anv_addresses in anv_surface_state
This refactors surface state filling to work entirely in terms of
anv_addresses instead of offsets.  This should make things simpler for
when we go to soft-pin image buffers.  Among other things,
add_image_view_relocs now only cares about the addresses in the surface
state and doesn't really need the image view anymore.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2018-05-31 16:51:46 -07:00
Jason Ekstrand
a8a740f272 i965,anv: Set the CS stall bit on the ISP disable PIPE_CONTROL
From the bspec docs for "Indirect State Pointers Disable":

    "At the completion of the post-sync operation associated with this
    pipe control packet, the indirect state pointers in the hardware are
    considered invalid"

So the ISP disable is a post-sync type of operation which means that it
should be combined with a CS stall.  Without this, the simulator throws
an error.

Fixes: 766d801ca "anv: emit pixel scoreboard stall before ISP disable"
Fixes: f536097f6 "i965: require pixel scoreboard stall prior to ISP disable"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-05-09 18:03:28 -07:00
Lionel Landwerlin
766d801ca3 anv: emit pixel scoreboard stall before ISP disable
We want to make sure that all indirect state data has been loaded into
the EUs before disable the pointers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Fixes: 78c125af39 ("anv/gen10: Ignore push constant packets during context restore.")
2018-05-09 20:11:57 +01:00
Neil Roberts
c366f422f0 nir: Offset vertex_id by first_vertex instead of base_vertex
base_vertex will be zero for non-indexed calls and in that case we
need vertex_id to be offset by the ‘first’ parameter instead. That is
what we get with first_vertex. This is true for both GL and Vulkan.

The freedreno driver is also setting vertex_id_zero_based on
nir_options. In order to avoid breakage this patch switches the
relevant code to handle SYSTEM_VALUE_FIRST_VERTEX so that it can
retain the same behavior.

v2: change a3xx/fd3_emit.c and a4xx/fd4_emit.c from
SYSTEM_VALUE_BASE_VERTEX to SYSTEM_VALUE_FIRST_VERTEX (Kenneth).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: Rob Clark <robdclark@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-04-19 15:57:45 -07:00
Neil Roberts
c4f30a9100 spirv: Lower BaseVertex to FIRST_VERTEX instead of BASE_VERTEX
The base vertex in Vulkan is different from GL in that for non-indexed
primitives the value is taken from the firstVertex parameter instead
of being set to zero. This coincides with the new SYSTEM_VALUE_FIRST_VERTEX
instead of BASE_VERTEX.

v2 (idr): Add comment describing why SYSTEM_VALUE_FIRST_VERTEX is used
for SpvBuiltInBaseVertex.  Suggested by Jason.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-19 15:57:45 -07:00
Lionel Landwerlin
0a6547014f anv: fix number of planes for depth & stencil
We're not counting correctly with depth & stencil images.

Additionally we need to move an assert that is meant just for color
attachments.

v2: Move an assert() (Reported by Craig)
    Change aspect mask checks (Francesco)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a62a979335 ("anv: enable multiple planes per image/imageView")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105994
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2018-04-13 11:44:53 -07:00
Rafael Antognolli
7728720f07 anv: Make blorp update the clear color.
Instead of updating the clear color in anv before a resolve, just let
blorp handle that for us during fast clears.

v5: Update comment about HiZ clear color (Jordan).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
021e1885d0 anv: Emit the fast clear color address, instead of value.
On Gen10+, instead of copying the clear color from the state buffer to
the surface state, just use the address of the state buffer in the
surface state directly. This way we can avoid the copy from state buffer
to surface state.

v4:
 - Remove use_clear_address from anv code. (Jason)
 - Use the helper to extract clear color from attachment (Jason)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
3f96b459f4 anv: Add a helper to extract clear color from the attachment.
Extract the code from color_attachment_compute_aux_usage, so we can
later reuse it to update the clear color state buffer.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2018-04-05 07:42:45 -07:00
Rafael Antognolli
94675edcfd intel: Use Clear Color struct size.
The size of the clear color struct (expected by the hardware) is 8
dwords (isl_dev.ss.clear_value_state_size here). But we still need to
track the size of the clear color, used when memcopying it to/from the
state buffer. For that we keep isl_dev.ss.clear_value_size.

v4:
 - Add struct to gen11 too (Jason, Jordan)
 - Add field for Converted Clear Color to gen11 (Jason)
 - Add clear_color_state_offset to differentiate from
   clear_value_offset.
 - Fix all the places where clear_value_size was used.

v5 (Jason):
 - Split genxml changes to another commit.
 - Remove unnecessary gen checks.
 - Bring back missing offset increment to init_fast_clear_color().

v6 (Jason):
 - On init_fast_clear_color, change:
   addr.offset += 4 => sdi.Address.offset += i * 4
 - Use GEN_GEN instead of GEN_VERSIONx10.

[jordan.l.justen@intel.com: isl_device_init changes]
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-05 07:42:45 -07:00
Iago Toral Quiroga
31881079af anv/cmd_buffer: honor pending clear views for depth/stencil attachments
v2: rebased on top of subpass rework.

v3: rebased

v4:
 - rebased
 - reset pending clear views in one go rather one bit at a time (Caio)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-02 09:53:24 +02:00
Iago Toral Quiroga
f60c5fc17e anv/cmd_buffer: consider multiview masks for tracking pending clear aspects
When multiview is active a subpass clear may only clear a subset of the
attachment layers. Other subpasses in the same render pass may also
clear too and we want to honor those clears as well, however, we need to
ensure that we only clear a layer once, on the first subpass that uses
a particular layer (view) of a given attachment.

This means that when we check if a subpass attachment needs to be cleared
we need to check if all the layers used by that subpass (as indicated by
its view_mask) have already been cleared in previous subpasses or not, in
which case, we must clear any pending layers used by the subpass, and only
those pending.

v2:
  - track pending clear views in the attachment state (Jason)
  - rebased on top of fast-clear rework.

v3:
  - rebased on top of subpass rework.

v4: rebased.

v5 (Caio):
 - Rebased.
 - Initialize pending clear views to only have bits set for layers
   that exist.
 - Reset pending clear views in one go rather one bit at a time.
 - Put "last subpass for this attachment" condition in a separate
   function to simplify the conditional that resets pending_clear_aspects.

Fixes:
dEQP-VK.multiview.readback_implicit_clear.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-04-02 09:53:15 +02:00