Commit Graph

62 Commits

Author SHA1 Message Date
Kenneth Graunke
c0c899cf78 iris: Allow HiZ for copy_region sources
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2020-01-04 12:25:55 -08:00
Kenneth Graunke
645b195312 iris: Delete remnants of the unimplemented ASTC 5x5 workaround
I copy and pasted some of the boilerplate but never the implementation.
For now, ASTC 5x5 is disabled and faked via uncompressed RGBA; let's
delete these remnants until such a time when we implement it properly.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2020-01-03 18:06:38 -08:00
Eric Anholt
882ca6dfb0 util: Move gallium's PIPE_FORMAT utils to /util/format/
To make PIPE_FORMATs usable from non-gallium parts of Mesa, I want to
move their helpers out of gallium.  Since u_format used
util_copy_rect(), I moved that in there, too.

I've put it in a separate directory in util/ because it's a big chunk
of related code, and it's not clear to me whether we might want it as
a separate library from libmesa_util at some point.

Closes: #1905
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-14 10:47:20 -08:00
Rafael Antognolli
a4da6008b6 iris: Use mocs from isl_dev.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-12 20:41:52 +00:00
Sagar Ghuge
b22b349443 iris: Resolve stencil resource prior to copy or used by CPU
v2: Decide aux usage in get_copy_region_aux_settings (Nanley Chery)

v3: Use isl_surf_usage_is_stencil function (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-29 14:46:15 -07:00
Sagar Ghuge
5d331251cf iris: Prepare resources before stencil blit operation
We have to resolve destination surfaces if we are bliting to and from
the same surface.

v2: Revert unrelated change (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-29 14:46:15 -07:00
Sagar Ghuge
758a6a3a00 iris: Get correct resource aux usage for copy
Add case for MCS_CCS so that we get the correct aux usage while copy
operation.

v2: Fix commit subject (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-28 14:02:01 -07:00
Nanley Chery
7a619b5c75 iris: Enable HIZ_CCS sampling
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
af6ff48894 iris: Define initial HIZ_CCS state and transitions
Make it match those of HIZ.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Marek Olšák
732ea0b213 gallium: add PIPE_RESOURCE_FLAG_SINGLE_THREAD_USE to skip util_range lock
u_upload_mgr sets it, so that util_range_add can skip the lock.

The time spent in tc_transfer_flush_region decreases from 0.8% to 0.2%
in torcs on radeonsi.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-07 20:05:00 -04:00
Kenneth Graunke
87fa8d9ebc iris: Lessen texture cache hack flush for blits/copies on Icelake.
Lionel found actual documentation for this at long last.  Apparently
it actually is a sampler cache limitation that was mostly fixed on
Icelake.  Unfortunately, it seems there are still issues with ASTC
and non-ASTC sampler views.  Still, we can lessen the flush condition
from "format mismatch" to "ASTC mismatch", which eliminates most of
the flushing here.

We also update the documentation to refer to the workaround name.
2019-08-31 20:17:55 -07:00
Jordan Justen
246eebba4a iris: Export and import surfaces with modifiers that have aux data
The DRI interface for modifiers with aux data treats the aux data as a
separate plane of the main surface.

When the dri layer requests the plane associated with the aux data, we
save the required information into the dri aux plane image.

Later when the image is used, the dri plane image will be available in
the pipe_resource structure's `next` field. Therefore in iris, we
reconstruct the aux setup from this separate dri plane image when the
image is used.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-13 15:20:47 -07:00
Kenneth Graunke
9b1b971491 iris: Use MI_COPY_MEM_MEM for tiny resource_copy_region calls.
If our resource_copy_region size is a small number of DWords, then
instead of firing up BLORP, we can simply use MI_COPY_MEM_MEM (after
a CS stall).  We also try and select the optimal batch.

Improves performance in Shadow of Mordor on Low settings at 1920x1080
on Skylake GT4e by 0.689096% +/- 0.473968% (n=4).  It tries to copy
4 bytes of data to a buffer which was most recently used as a writable
compute shader SSBO.  Previously we were switching from compute to the
render pipeline, then firing up all of blorp_buffer_copy...for 4 bytes.

I arbitrarily decided to support 4/8/12/16 bytes.  Jason thinks this
is about the right threshold where it's cheaper to use MI_COPY_MEM_MEM.
2019-07-01 13:59:49 -07:00
Kenneth Graunke
ecc500398f iris: Drop RT flushes from depth stencil clearing flushes.
These write depth and stencil, not color writes, so there's no need
to flush the render target.
2019-06-20 13:32:16 -05:00
Kenneth Graunke
6890340c31 iris: Avoid double flushing in iris_transfer_flush_region when copying.
My intention was to have iris_copy_region not do flushing, and leave
that up to the callers.  iris_resource_copy_region needs to do this,
but iris_transfer_flush_region was already doing it.  The net result
was that we were doing it twice for transfers.

So, move the flushing from iris_copy_region to iris_resource_copy_region
so that it only happens in the callers as I intended.
2019-06-20 13:32:15 -05:00
Kenneth Graunke
d4a4384b31 iris: Implement INTEL_DEBUG=pc for pipe control logging.
This prints a log of every PIPE_CONTROL flush we emit, noting which bits
were set, and also the reason for the flush.  That way we can see which
are caused by hardware workarounds, render-to-texture, buffer updates,
and so on.  It should make it easier to determine whether we're doing
too many flushes and why.
2019-06-20 13:32:15 -05:00
Kenneth Graunke
659d4f613e iris: Make resource_copy_region handle packed depth-stencil resources.
Also copy along the separate stencil buffer if needed.

Fixes Piglit's arb_copy_image-formats.
2019-06-17 17:29:09 -05:00
Kenneth Graunke
a36f1542ae iris: Order CS stall and TC invalidate for format reinterpretation hacks
This should ensure the TC invalidate happens after the stall.

Fixes KHR-GL43.copy_image.functional which does a CopyImage (blorp_copy)
from a buffer (using R8G8B8A8_UINT), then GetTexImage to read back the
original image (using R10G10B10A2_UNORM).
2019-06-17 16:38:08 -05:00
Kenneth Graunke
94b9f50e63 iris: Be more aggressive at post-format-reintepret TC invalidate hack
When copying/blitting with format reinterpretation, we invalidate the
texture cache before/after.  Before is so the source of the copy works,
and after is to get rid of our new data in the "wrong" format to protect
future attempts to sample.

When I ported these hacks to iris, I tried to be cautious by only
bothering with the hacks if the batch referenced the BO.  This makes
some sense for the before case.  If it isn't referenced, the texture
cache can't really have any data for the BO (since it's also invalidated
between batches).  But we still need to do the after case regardless,
as we've just polluted the cache with hazardous entries.
2019-06-17 16:38:08 -05:00
Mike Blumenkrantz
ddd716e746 iris: support dmabuf imports with offsets
this adds support for imports where the image data begins at an offset
from the start of the buffer, as used in h/x264

fixes kwg/mesa#47

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-07 13:36:08 -07:00
Kenneth Graunke
77449d7c41 iris: Track valid data range and infer unsynchronized mappings.
Applications frequently call glBufferSubData() to consecutive regions
of a VBO to append new vertex data.  If no data exists there yet, we
can promote these to unsynchronized writes, even if the buffer is busy,
since the GPU can't be doing anything useful with undefined content.
This can avoid a bunch of unnecessary blitting on the GPU.

u_threaded_context would do this for us, and in fact prohibits us from
doing so (see TC_TRANSFER_MAP_NO_INFER_UNSYNCHRONIZED).  But we haven't
hooked that up yet, and it may be useful to disable u_threaded_context
when debugging...at which point we'd still want this optimization.  At
the very least, it would let us measure the benefit of threading
independently from this optimization.  And it's not a lot of code.

Removes most stall avoidance blits in "Total War: WARHAMMER."

On my Skylake GT4e at 1920x1080, this appears to improve performance
in games by the following (but I did not do many runs for proper
statistics gathering):

   ----------------------------------------------
   | DiRT Rally        | +2% (avg) | + 2% (max) |
   | Bioshock Infinite | +3% (avg) | + 9% (max) |
   | Shadow of Mordor  | +7% (avg) | +20% (max) |
   ----------------------------------------------
2019-04-23 00:24:08 -07:00
Kenneth Graunke
c4478889b7 iris: Add texture cache flushing hacks for blit and resource_copy_region
This is a port of Jason's 8379bff6c4
from i965 to iris.  We can't find anything relevant in the documentation
and no one we've talked to has been able to help us pin down a solution.

Unfortunately, we have to put the hack in both iris_blit() and
iris_copy_region().  st/mesa's CopyImage() implementation sometimes
chooses to use pipe->blit() instead of pipe->resource_copy_region().
For blits, we only do the hack if the blit source format doesn't match
the underlying resource (i.e. it's reinterpreting the bits).  Hopefully
this should not be too common.
2019-04-16 13:04:22 -07:00
Kenneth Graunke
9c52dce6a9 iris: Actually mark blorp_copy_buffer destinations as written. 2019-04-15 14:51:01 -07:00
Sergii Romantsov
72a921e12a i965,iris/blorp: do not blit 0-sizes
Seems there is no sense in blitting 0-sized sources
or destinations.
Additionaly it may cause segfaults for i965.

v2: Function call replaced with inline check

v3: Added check to avoid devision by zero (L. Landwerlin)

v4: Added simillar check for Iris (L. Landwerlin)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110239
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-03-30 11:50:40 +00:00
Kenneth Graunke
ee8370c766 iris: Fix blits with S8_UINT destination
For depth and stencil blits, we always want the main mask to be Z, and
the secondary pass mask to be S.  If asked to blit Z+S to S, we should
handle the blit in the second pass which properly gets the stencil
resources.

Before, we were trying to handle S as the main mask, and accidentally
blitting a Z source to a S destination, which doesn't work out well.

Fixes Piglit's "framebuffer-blit-levels {draw,read} stencil" tests.
2019-03-28 10:47:26 -07:00
Rafael Antognolli
e7c8402163 iris: Let blorp update the clear color for us.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:26 -07:00
Rafael Antognolli
34d00b4410 iris: Use the clear depth when emitting 3DSTATE_CLEAR_PARAMS.
Take the clear depth into account when IRIS_DIRTY_DEPTH_BUFFER is marked
as dirty.

Also update the blorp surface clear color.

v2: Use a single if (zres && zres->aux.bo) (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Kenneth Graunke
2993088500 iris: Export a copy_region helper that doesn't flush
I'll want to use this for transfer maps, which already do their own
flushing.  This lets us avoid a double flush, and also gives us more
control over the batch which is selected.
2019-03-07 17:08:19 -08:00
Kenneth Graunke
809a81ec3a iris: Properly support alpha and luminance-alpha formats
For texturing, we map alpha formats to the corresponding red format,
as many alpha formats are outright missing, and red is more efficient
when sampling anyway.

When rendering to A8_UNORM, we use that format directly, so the image
gets the shader output's .a/.w channel, rather than the .r/.x channel.

All other A* formats are non-renderable, so we can't do much and just
mark them as unsupported for rendering.  Fortunately, GL only requires
rendering to A8_UNORM, so that works out.

According to Andre Heider and Timur Kristóf, this fixes font rendering
in Witcher 1 (via nine).  Andre also reported that it fixes Unigine
Heaven (presumably via nine).

v2: Use the same swizzle for both sampler views and "render targets".
    BLORP expects the read swizzle, and will take the inverse when
    setting up the destination swizzle (and actually applying it in
    the shaders).  We ignore the format swizzle when setting up normal
    rendering SURFACE_STATEs, which is necessary because it would be
    an illegal shader channel select combination.  Thanks to Jason
    Ekstrand for pointing out that BLORP took an inverse swizzle.

Tested-by: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-07 11:39:27 -08:00
Kenneth Graunke
744b8e1c12 iris: Fix MOCS for blits and clears
I915_MOCS_CACHED is the wrong value.  Expose mocs() and use that.
2019-03-06 18:04:53 -08:00
Kenneth Graunke
2cddc953cd iris: some initial HiZ bits 2019-02-21 10:26:12 -08:00
Kenneth Graunke
b77d2dc71b iris: Make blit code use actual aux usages 2019-02-21 10:26:12 -08:00
Kenneth Graunke
5eb75345b8 iris: try to fix copyimage vs copybuffers 2019-02-21 10:26:12 -08:00
Kenneth Graunke
3c979b0e6d iris: add some draw resolve hooks 2019-02-21 10:26:12 -08:00
Kenneth Graunke
53c484ba8a iris: blorp using resolve hooks 2019-02-21 10:26:12 -08:00
Kenneth Graunke
77a1070d36 iris: Initial import of resolve code 2019-02-21 10:26:12 -08:00
Kenneth Graunke
c81941f1e7 iris: Pay attention to blit masks
For combined depth/stencil formats, we may want to only blit one half.
If PIPE_BLIT_Z is set, blit depth; if PIPE_BLIT_S is set, blit stencil.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
7837fec740 iris: Assert about blits with color masking
st/mesa never asks for this today, but in theory someone might, and we
don't support it.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
7a9e87c224 iris: Implement multi-slice copy_region
I don't know if this is required - surprisingly, I haven't seen it
matter - but I'd like to use it for multi-slice transfer maps.  We may
as well do the right thing.
2019-02-21 10:26:11 -08:00
Kenneth Graunke
761a5fb36a iris: fix conditional compute, don't stomp predicate for pipelined queries 2019-02-21 10:26:10 -08:00
Kenneth Graunke
0c3ea03e4b iris: for BLORP, only use the predicate enable bit when USE_BIT 2019-02-21 10:26:10 -08:00
Dave Airlie
7bbf3ff4a9 iris: add conditional render support 2019-02-21 10:26:10 -08:00
Kenneth Graunke
415ede346d iris: Flush for history at various moments
When we blit, transfer, or copy_resource to a buffer, we need to flush
to ensure any stale data for that buffer is invalidated in the caches.

bind_history will inform us which caches need to be flushed.

Also, for any push constant buffers, we need to flag those dirty so
that we re-emit 3DSTATE_CONSTANT_*, causing the data to be re-pushed.
2019-02-21 10:26:10 -08:00
Kenneth Graunke
c5b22441f1 iris: Fix buffer -> buffer copy_region
Size can be too large for a surf, blorp_buffer_copy chops things up
into segments we can actually handle

Fixes map_buffer_range_test and copy_buffer_coherency
2019-02-21 10:26:10 -08:00
Kenneth Graunke
f1a7392be1 iris: Put batches in an array
We keep re-making this array all over the place
2019-02-21 10:26:10 -08:00
Kenneth Graunke
9878ea842f iris: scissored and mirrored blits 2019-02-21 10:26:10 -08:00
Kenneth Graunke
94569a6458 iris: rework format translation apis 2019-02-21 10:26:09 -08:00
Kenneth Graunke
42dccb1233 iris: use consistent copyright formatting
some of them had typos, didn't say 'authors or copyright holders',
or other mistakes.  This is now https://opensource.org/licenses/MIT
text, formatted consistently.
2019-02-21 10:26:08 -08:00
Kenneth Graunke
84b30a2900 iris: call maybe_flush for each blorp operation
otherwise with high layer counts we may exceed two batches worth of
commands... (!)
2019-02-21 10:26:08 -08:00
Kenneth Graunke
0e059e4829 iris: assert depth is 1 in resource_copy_region
given the dstz parameter I don't think it does multiple slices..
2019-02-21 10:26:08 -08:00