SMEM does the addition with 64-bits, not 32. So if the original code
relied on wrapping around (for example, for subtraction), it would break.
Apparently swizzled MUBUF accesses also have issues with combining
additions that could overflow. Normal MUBUF accesses seem fine.
fossil-db (Navi):
Totals from 27219 (20.02% of 135946) affected shaders:
CodeSize: 128303256 -> 131062756 (+2.15%); split: -0.00%, +2.15%
Instrs: 24818911 -> 25280558 (+1.86%); split: -0.01%, +1.87%
VMEM: 162311926 -> 177226874 (+9.19%); split: +9.36%, -0.17%
SMEM: 18182559 -> 20218734 (+11.20%); split: +11.53%, -0.34%
VClause: 423635 -> 424398 (+0.18%); split: -0.02%, +0.20%
SClause: 865384 -> 1104986 (+27.69%); split: -0.00%, +27.69%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2748
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2720>
We don't want these files shared between builds (it'll get blown away by
the next rsync), and NFS will just increase our latency for hitting the
cache.
Drops a630 gles31 run from 11-17 minutes to 5.5. Maximum cache size on a
run I've seen is 153M, which it seems we can easily spare.
Fixes: f97acb4bb4 ("freedreno/ir3: disk-cache support")
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5998>
fd.o has retuned the x86 runners on packet for -j8. Rather than having to
tweak our CI every time fd.o decides to rebalance job concurrency, respect
what the runner admin has chosen for their builds (this will also be
convenient for people with large local runners).
Reviewed-by: Michel Dänzer <michel@daenzer.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5669>
If the SPIR-V had a shared+image memory barrier, we would emit two NIR
barriers: a shared barrier and an image barrier.
Unlike a single barrier, two barriers allows transformations such as:
intrinsic image_deref_store (ssa_27, ssa_33, ssa_34, ssa_32, ssa_25) (1)
intrinsic memory_barrier_shared () ()
intrinsic memory_barrier_image () ()
intrinsic store_shared (ssa_35, ssa_24) (0, 1, 4, 0)
->
intrinsic memory_barrier_shared () ()
intrinsic store_shared (ssa_35, ssa_24) (0, 1, 4, 0)
intrinsic image_deref_store (ssa_27, ssa_33, ssa_34, ssa_32, ssa_25) (1)
intrinsic memory_barrier_image () ()
This commit fixes two dEQP-VK.memory_model.* CTS tests with ACO.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5951>
legalize_block() can get run multiple times, which I didn't notice when
adding fine derivs support. Other instruction clones change things such
that the legalization won't trigger again, but that didn't apply to the
DS.PP legalization. To keep someone else from tripping over this, split
the one-shot legalization out of the iterative sync flag application.
Fixes failures in dEQP-VK.glsl.derivate.dfdxfine.*
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3198
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5699>
v2: Be more explicit about sampler types. Prefer the term "load" to
"resolve" to match VK convention. Generate shaders for MRT 8x. Blit
shader generation adds about 6ms to startup cost. We could cache thes.
shaders to disk if we needed to (or indeed, ship binaries).
v3: Fallback on u_blitter on Bifrost so Bifrost continues to work.
KHR_partial_update support is mostly no-oped on Bifrost now, but that's
okay for now - compositors are still functional.
v4: Specialize on multisample state as well to enable reloads of MSAA
textures. This requires 2x the shader variants, so I assume we're up to
12ms startup cost for generation. Annoying. Also fix interactions with
depth- or stencil-only clears of combined depth-stencil surfaces.
v5: Cache to the device (screen) instead of the context, reducing
duplicated work in apps that create many contexts (e.g. Chromium)
v6: Squash in KHR_partial_update cleanup to fix intermediate
regressions on a few tests.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5824>
The fast log2 works better if rho is squared, i.e.
fast_log2(sqrt(2)) == 0.4
0.5 * fast_log2(2) == 0.5
so just square rho, and always divide by 2 afterwards.
Fixes:
GTF-GL45.gtf30.GL3Tests.sgis_texture_lod.sgis_texture_lod_basic_lod_selection
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5820>
Map all formats to a valid ifmt. FMT6_10_10_10_2_UNORM_DEST still
doesn't work on the blitter so keep that one on the u_blitter path.
Fixes
dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.*
with FD_MESA_DEBUG=nogmem.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5717>
It's something that was added to ease development, but that was supposed
to be removed before merging.
It also causes problems when arm-related jobs aren't enabled, as
arm_build is needed by these jobs but in that case isn't there.
Also extend from .ci-run-policy.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5802>
I want the docs to be discoverable next to the code, and sphinx insists
that all docs are under the top-level docs dir (sigh). We can't symlink
from that dir to .gitlab-ci because windows builds can't do symlinks, so
link back the other direction.
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5510>
Place the kernel and ramdisk into a place in the file server so the URL
will only change when the contents also change.
Also put the Mesa build into a separate tarball so the ramdisk's
contents don't change every build.
With proper caching in place, all devices in the same farm need only to
download the mesa tarball once, saving time.
As we switch to MinIO for making kernels and rootfs available to LAVA
devices, we can stop using Docker to distribute them.
Instead, build when needed in separate jobs that push directly to MinIO,
from where LAVA devices can download them.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5515>