Reorder its members to fill the current padding hole, reducing the
struct size from 32 to 24.
This struct appears multiple times inside struct anv_image and its
members, so this change brings down sizeof(struct anv_image) from
1744 to 1600.
We went from:
struct anv_image_memory_range {
enum anv_image_memory_binding binding; /* 0 4 */
/* XXX 4 bytes hole, try to pack */
uint64_t offset; /* 8 8 */
uint64_t size; /* 16 8 */
uint32_t alignment; /* 24 4 */
/* size: 32, cachelines: 1, members: 4 */
/* sum members: 24, holes: 1, sum holes: 4 */
/* padding: 4 */
/* last cacheline: 32 bytes */
};
to:
struct anv_image_memory_range {
enum anv_image_memory_binding binding; /* 0 4 */
uint32_t alignment; /* 4 4 */
uint64_t size; /* 8 8 */
uint64_t offset; /* 16 8 */
/* size: 24, cachelines: 1, members: 4 */
/* last cacheline: 24 bytes */
};
Considering we can have tens of thousands of anv_image structs
allocated at the same time on gaming workloads, this can save us a few
MB of memory. It ain't much but it's honest work.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28700>
Remove a premature optimization. When PIPE_MAP_DISCARD_WHOLE_RESOURCE
is set we were setting create_new_bo, and then if that was set we skipped
a set of tests which if passed would cause a panfrost_flush_writer.
In fact we need that flush in some cases (e.g. when any batch is
reading the resource). Moreover, we should sometimes copy the resource
(set the copy_resource flag) and that again was being skipped if
create_new_bo was initially true due to PIPE_MAP_DISCARD_WHOLE_RESOURCE
being set.
Cc: mesa-stable
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28406>
X8Z24 surfaces have a don't care stencil channel, which is okay to be
cleared together with the depth channel. Set the depth clear bits
accordingly to allow those clears to use the fast-clear path when
only depth is to be cleared. This change aligns the RS with the BLT
ZS clear path.
Fixes: df63f188e8 ("etnaviv: fix separate depth/stencil clears")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28696>
Currently both TS and non-TS paths use the same place to store the compiled
RS commands to clear the surface. In the TS case the commands only initialize
the TS buffer, while the non-TS commands clear the whole buffer. The
assumption here is that a TS enabled surface will only ever be fast cleared,
which doesn't hold anymore, now that we can fall back to slow clears on TS
enabled depth/stencil buffers.
The fallback to a slow clear will overwrite the stored RS commands with a
full buffer clear. If we can transition to a fast clear later, the commands
to initialize the TS buffer will not be regenerated and a full buffer clear
will be submitted instead. In addition to the performance degradation, it
will also leave TS in an inconsistent state, as the TS buffer will not be
initialized, but the TS state still gets marked as valid.
To avoid this confusion and not introduce any more state tracking to remember
the target of the clear commands and regenerate TS clears if needed, simply
split the storage for compiled TS and non-TS clear commands.
Fixes: df63f188e8 ("etnaviv: fix separate depth/stencil clears")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28696>
Now that we switch dynamically between fast (TS) and slow (regular)
clears on TS enabled surfaces, we must trigger reevaluation of the
current TS state also after a slow clear, as otherwise the PE might
continue to use the invalidated TS state.
Fixes: df63f188e8 ("etnaviv: fix separate depth/stencil clears")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28696>
mem_access_base_pointer loads memory (the descriptor) and therefore
needs to be guarded. Fixes
dEQP-VK.spirv_assembly.instruction.terminate_invocation.terminate.no_null_pointer_load.
Fixes: fc8a83c ("gallivm/ssbo: mask offset with exec_mask instead of building the 'if'")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28614>
Since DGC preprocessing for IBO is supported, the driver generates
an indexed indirect draw but SQTT markers were missing and this
introduced complete non-sense in RGP captures.
Fixes: e59a16bbb8 ("radv: use an indirect draw when IBO isn't updated as part of DGC")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28710>
constantQp will be 0 according to spec for any rate control method
other than NONE, so it should only be used with NONE rate control and
not when default rate control (which is internally NONE) is used.
Also it shouldn't override min/max QP.
Fixes: 54d499818c ("radv/video: add initial support for encoding with h264.")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28734>