On Windows, when external resources are imported, no
information about them is available, so the templ argument
of resource_from_handle is NULL. To support this case on
virgl, the virgl winsys can now fill in a template for the
resource, which is used when templ is NULL. Additionally,
helper functions were added to convert virgl-encoded enums
to pipe enums.
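
The fallback described above can be pictured roughly as follows. This
is only a sketch, not the virgl code: struct res_template,
struct winsys_sketch, winsys_fill_template() and
resource_from_handle_sketch() are hypothetical names standing in for
the pipe resource template, the winsys and the screen entry point.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the resource template. */
struct res_template {
   unsigned width, height;
   unsigned format;
};

/* Hypothetical winsys: may or may not know how the imported
 * resource is laid out. */
struct winsys_sketch {
   bool has_info;
   struct res_template info;
};

/* Hypothetical winsys hook: fill the template from what the winsys
 * knows. Returns false if nothing is known. */
static bool
winsys_fill_template(const struct winsys_sketch *ws,
                     struct res_template *out)
{
   if (!ws->has_info)
      return false;
   *out = ws->info;
   return true;
}

/* If the caller passed templ == NULL, fall back to the template
 * provided by the winsys; otherwise use the caller's template. */
static bool
resource_from_handle_sketch(const struct winsys_sketch *ws,
                            const struct res_template *templ,
                            struct res_template *result)
{
   struct res_template filled;

   if (!templ) {
      if (!winsys_fill_template(ws, &filled))
         return false;
      templ = &filled;
   }

   *result = *templ;
   return true;
}
```

A caller that passes a non-NULL templ is unaffected; only the
templ == NULL path consults the winsys.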
Reviewed-by: Feng Jiang <jiangfeng@kylinos.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27308>
While OpenGL requires that a VE be bound, other Mesa
frontends, e.g. d3d10umd, can emit a draw without any VE
bound. This left vctx->vertex_elements NULL, which led to
a NULL dereference. Add a check that ve is not NULL to
avoid that.
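
The shape of such a guard, as a sketch only: the struct and function
names below are hypothetical, and encoding handle 0 for "no VE bound"
is an assumption, not necessarily what the driver does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the vertex elements state and context. */
struct vertex_elements_sketch {
   uint32_t handle;
};

struct ctx_sketch {
   const struct vertex_elements_sketch *vertex_elements;
};

/* Return the handle to use for a draw. When no VE is bound, return 0
 * (assumed "no object" value) instead of dereferencing NULL. */
static uint32_t
bound_ve_handle(const struct ctx_sketch *ctx)
{
   const struct vertex_elements_sketch *ve = ctx->vertex_elements;

   return ve ? ve->handle : 0;
}
```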
Supported by virglrenderer@b8ac10db
Reviewed-by: Feng Jiang <jiangfeng@kylinos.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27308>
The CTS image allocation sometimes doesn't try to allocate a complete
DPB, but the amdgpu kernel module checks for this, so always make
the DPB max sized on uvd instances.
Fixes part of video decode on Fiji/Polaris
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27186>
This only dumps the begin tokens. Tokens are written to a buffer
containing a 12 byte header at the beginning.
We use an intermediate format for the ray history tokens because the RRA
format is very inefficient.
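
As an illustration of a token buffer with a header in front, here is
a small sketch. Only the 12 byte header size comes from the text
above; the header fields (write_offset, capacity, dropped) and the
function names are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout; only its size (12 bytes) is given. */
struct token_header_sketch {
   uint32_t write_offset; /* bytes of token data written so far */
   uint32_t capacity;     /* bytes available for token data */
   uint32_t dropped;      /* tokens that did not fit */
};

_Static_assert(sizeof(struct token_header_sketch) == 12,
               "header must be 12 bytes");

/* Initialize a buffer of `size` bytes (size must be >= 12). */
static void
token_buffer_init(void *buf, uint32_t size)
{
   struct token_header_sketch h = {
      .write_offset = 0,
      .capacity = size - (uint32_t)sizeof(h),
      .dropped = 0,
   };
   memcpy(buf, &h, sizeof(h));
}

/* Append one token after the header. Returns false and counts the
 * token as dropped if it does not fit. */
static bool
token_buffer_push(void *buf, const void *token, uint32_t token_size)
{
   struct token_header_sketch h;
   bool fits;

   memcpy(&h, buf, sizeof(h));
   fits = token_size <= h.capacity - h.write_offset;
   if (fits) {
      memcpy((uint8_t *)buf + sizeof(h) + h.write_offset, token,
             token_size);
      h.write_offset += token_size;
   } else {
      h.dropped++;
   }
   memcpy(buf, &h, sizeof(h));
   return fits;
}
```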
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25548>
DRM modifiers are a BSD/Linux phenomenon.
We could remove a bunch of these checks too: no Linux-specific
symbol or header is **actually** used, and the DRM modifier is
just represented as uint64_t. However, the style of the file is
kept as is.
Reviewed-by: Serdar Kocdemir <kocdemir@google.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27425>
Follow the blob and optimize subgroup operations using brcst.active
and getlast when supported.
The transformation consists of two parts. First, a NIR transform
replaces subgroup operations with a sequence of new brcst_active_ir3
intrinsics followed by a new [type]_clusters_ir3 intrinsic (where type
can be reduce, inclusive_scan, or exclusive_scan).
The brcst_active_ir3 intrinsic is lowered directly to a brcst.active
instruction. The other intrinsics get lowered to a new macro
(OPC_SCAN_CLUSTERS_MACRO) which later gets emitted as a loop (using
getlast/getone) that iterates all clusters and produces the requested
scan result.
OPC_SCAN_CLUSTERS_MACRO has a number of optional arguments. First, since
the exclusive scan result is not a natural by-product of the loop but
has to be calculated explicitly, its destination is optional. This is
necessary since adding it unconditionally will produce unused
instructions that won't be DCE'd anymore at this point. Second, when
performing 32b MUL_U reductions (that expand to multiple instructions),
an extra scratch register is necessary.
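
To make the three result kinds concrete, here is a scalar CPU model
of a loop over per-cluster values. It is a sketch and not ir3 code:
scan_clusters_add() is a hypothetical name, the operation is fixed to
integer add, and the optional `exclusive` pointer merely mirrors the
idea of an optional destination.

```c
#include <stddef.h>
#include <stdint.h>

/* Walk all clusters in order, accumulating with iadd.
 *  - return value: the reduction over all clusters
 *  - inclusive[i]: accumulator after cluster i (may be NULL)
 *  - exclusive[i]: accumulator before cluster i (may be NULL, in
 *                  which case no work is spent on it)
 */
static uint32_t
scan_clusters_add(const uint32_t *cluster_vals, size_t num_clusters,
                  uint32_t *inclusive, uint32_t *exclusive)
{
   uint32_t acc = 0; /* identity value for add */

   for (size_t i = 0; i < num_clusters; i++) {
      if (exclusive)
         exclusive[i] = acc;
      acc += cluster_vals[i];
      if (inclusive)
         inclusive[i] = acc;
   }

   return acc;
}
```

For the per-cluster values {1, 2, 3, 4} this gives a reduction of 10,
an inclusive scan of {1, 3, 6, 10} and an exclusive scan of
{0, 1, 3, 6}.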
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6387
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26950>
This pass will later also serve as a way to accurately insert physical
edges, which is the original motivation. However, it also lets us put
branchstack handling on a more solid footing.
There was an off-by-one in the old branchstack handling because it
didn't consider that a single if-else actually has two reconvergence
points active at the same time, so it undercounted the branchstack by 1
for pretty much every shader. We change the HW formula to produce the
same result, which now makes it much more sensible.
We can also delete the physical predecessor handling in ir3_legalize,
because it was only needed to handle (jp) which is now handled earlier.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22072>
We were relying on it in RA to tell us whether we could give more
registers to the shader mostly "for free" (because occupancy is bounded
by the branchstack), but it turns out it was actually 0 so we weren't
taking advantage of it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22072>
There are two problems with shared register allocation at the moment:
1. We weren't modelling physical edges correctly, and once we do, the
current hack in RA for handling them won't work correctly. This means
live-range splitting doesn't work. I've tried various strategies but
none of them seems to fix this.
2. Spilling of shared registers to non-shared registers isn't
implemented.
Spilling of shared regs is significantly simpler than spilling
non-shared regs, because (1) spilling and unspilling is significantly
cheaper, just a single mov, and (2) we can swap "stack slots" (actually
non-shared regs) so all the complexity of parallel copy handling isn't
necessary. This means that it's much easier to integrate RA and
spilling, while still using the tree-scan framework, so that we can
spill instead of splitting live ranges. The other issue, of phi nodes
with physical edges, we can handle by spilling those phis earlier. For
this to work, we need to accurately insert physical edges based on
divergence analysis or else every phi node would involve physical edges,
which later commits will accomplish.
This commit adds a shared register allocation pass which is a
severely-cut-down version of RA and spilling. Everything to do with live
range splitting is cut from RA, as is everything to do with parallel
copy handling, and for spilling we simply always spill as soon as we
encounter a case where it's necessary. This could be improved,
especially the spilling strategy, but for now it keeps the pass simple
and cuts down on code duplication. Unfortunately, there's still some
boilerplate shared with regular RA, which seems unavoidable.
The new RA requires us to redo liveness information, which is
quite expensive, so we keep the ability of the old RA to handle
shared registers and only use the new RA when it may be required: either
something potentially requiring live-range splitting, or a too-high
shared register limit.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22072>