Hidden behind `RUSTICL_ENABLE=fp16` for now as the OpenCL CTS doesn't have
enough fp16 tests at the moment. There has been a lot of work on it though,
so hopefully we can enable and verify it soon.
Additionally libclc also misses a bunch of fp16 functionality, so most of
the tests would also just crash.
However this flag is useful for development as it already wires up most of
the code needed.
Signed-off-by: Karol Herbst <git@karolherbst.de>
Reviewed-by: Nora Allen <blackcatgames@protonmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23788>
Since there are concerns that the VK_EXT_GPL implementation may have
issues with mesh shading, disable it by default but give users a knob to
turn it on to experiment.
This doesn't automatically enable GPL use in zink, because we lack
extendedDynamicState2PatchControlPoints, but it means that you only need
to set ZINK_DEBUG=gpl and not both env vars.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15637>
When this debug flag is set, the driver sets the umd metadata for
all color textures and enables the use of extended metadata.
Extended metadata allows umr to import textures and setting these
on all color texture allows to import non-exported textures
(eg: dGPU draw surface when DRI_PRIME=1 is used).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21984>
Following PAL's implementation, this patch avoids allocating shader code
buffers in BAR and use SDMA to upload them to invisible VRAM
directly.
For some games like HZD, shaders can take as much as 400MB, which exceeds
the non-resizable BAR size (256MB) and cause inconsistent spilling
behavior. The kernel will normally move these to invisible VRAM on its own,
but there are a few cases that it does not reliably happen. This patch does
the moving explicitly in the driver to ensure predictable results.
In this patch, we upload the shaders synchronously; so the shader will be
ready as soon as vkCreate*Pipeline returns. A following patch will make
this asynchronous and don't block until we see a use of the pipeline.
As a side effect, when SQTT is used we now store the shaders on a cacheable
buffer which would speed up writing the trace to the disk.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16271>
XE architecture enables many more metrics, perhaps too many for
the average user. Reduce reported metrics to smaller subset,
known as non-extended metrics, by default. Can re-enable extended
metrics with env var INTEL_EXTENDED_METRICS=1
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21841>
When enabled, on gfx12 plus, we will add the sync nop instruction after
each instruction to make sure that current instruction depends on the
previous instruction explicitly.
This option will help us to get a hint if something is missing or broken
in software scoreboard pass.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21797>
The graphics pipeline library implementation in RADV has been
improved considerably lately.
There is still a bit of work for caching individual libraries
and optimized (LTO) pipelines but I think overall it seems good
enough to stop reporting it as experimental and suboptimal.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21213>
Whenever a single file mesa-db cache hits max size limit, a half of cache
is evicted and the cache file is defragmented. The downside of this eviction
strategy is that it causes high disk IO usage during eviction if mesa-db
cache file size is large.
In order to mitigate this downside, we will split mesa-db into multiple
part such that only one part will be evicted at a time. Each part will be
an individual single file mesa-db cache, like a DB shard. The new multipart
mesa-db cache will merge the parts into a single virtual cache.
This patch introduces two new environment variables:
1. MESA_DISK_CACHE_DATABASE_NUM_PARTS:
Controls number of mesa-db cache file parts. By default 50 parts will be
created. The old pre-multipart mesa-db cache files will be auto-removed
if they exist, i.e. Mesa will switch to the new DB version automatically.
2. MESA_DISK_CACHE_DATABASE_EVICTION_SCORE_2X_PERIOD:
Controls the eviction score doubling time period. The evicted DB part
selection is based on cache entries size weighted by 'last_access_time' of
the entries. By default the cache eviction score is doubled for each month
of cache entry age, i.e. for two equally sized entries where one entry is
older by one month than the other, the older entry will have x2 eviction
score than the other entry. Database part with a highest total eviction
score is selected for eviction.
This patch brings x40 performance improvement of cache eviction time using
multipart cache vs a single file cache due to a smaller eviction portions
and more optimized eviction algorithm.
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20256>
Allow the loading process to affect driconf option matching without
changing the behavior throughout mesa common code or leaking the name of
the loading process to logs, artifact storage, or in sub-thread naming,
as can be the case with the broader MESA_PROCESS_NAME override.
This new MESA_DRICONF_EXECUTABLE_OVERRIDE takes higher precedence over
MESA_PROCESS_NAME in the case where both are set.
Signed-off-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20779>