Dylan Baker
6a83d94e62
VERSION: bump to 22.2-devel for next cycle
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15935 >
2022-04-13 23:40:25 +00:00
Rhys Perry
2036a2c5c5
radv: use load_shared2_amd/store_shared2_amd
...
fossil-db (Sienna Cichlid):
Totals from 376 (0.23% of 162293) affected shaders:
MaxWaves: 9620 -> 9596 (-0.25%); split: +0.08%, -0.33%
Instrs: 207533 -> 203901 (-1.75%); split: -1.76%, +0.01%
CodeSize: 1130904 -> 1106420 (-2.16%); split: -2.17%, +0.01%
VGPRs: 14016 -> 14120 (+0.74%); split: -0.34%, +1.08%
Latency: 2143281 -> 2132212 (-0.52%); split: -0.56%, +0.05%
InvThroughput: 389116 -> 387990 (-0.29%); split: -0.34%, +0.05%
VClause: 4483 -> 4485 (+0.04%); split: -0.11%, +0.16%
SClause: 5780 -> 5778 (-0.03%); split: -0.17%, +0.14%
Copies: 15319 -> 15331 (+0.08%); split: -0.53%, +0.61%
Branches: 5561 -> 5563 (+0.04%)
PreSGPRs: 11776 -> 11775 (-0.01%)
PreVGPRs: 11393 -> 11497 (+0.91%); split: -0.13%, +1.04%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778 >
2022-04-13 23:08:07 +00:00
Rhys Perry
67fc0e3655
ac/llvm: implement load_shared2_amd/store_shared2_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778 >
2022-04-13 23:08:07 +00:00
Rhys Perry
c883abda76
aco: implement load_shared2_amd/store_shared2_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778 >
2022-04-13 23:08:07 +00:00
Rhys Perry
5aa5af7776
aco: handle read2st64/write2st64 in optimizer
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778 >
2022-04-13 23:08:07 +00:00
Rhys Perry
2135c88d9c
aco: fix signedness of DS_instruction::offset0/1
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778 >
2022-04-13 23:08:07 +00:00
Rhys Perry
778fc176b1
nir/opt_load_store_vectorize: create load_shared2_amd/store_shared2_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778 >
2022-04-13 23:08:07 +00:00
Rhys Perry
dc835626b3
nir/opt_load_store_vectorize: fix broken indentation
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778 >
2022-04-13 23:08:07 +00:00
Rhys Perry
8ff122f8b8
nir: add load_shared2_amd and store_shared2_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778 >
2022-04-13 23:08:07 +00:00
Konstantin Seurer
bbdf22ce13
radv: Fix barriers with cp dma
...
We need to wait for cp dma if VK_PIPELINE_STAGE_2_ALL_TRANSFER_BIT or
VK_PIPELINE_STAGE_2_ALL_COMMANDS_BIT are set.
Closes : #5911
Fixes: 4b9bc4791b
("radv: only sync CP DMA for transfer operations or bottom pipe")
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15933 >
2022-04-13 22:16:43 +00:00
Daniel Schürmann
d703a0e808
aco: remove register hints entirely
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408 >
2022-04-13 21:52:43 +00:00
Daniel Schürmann
2fe005a3fe
aco: remove occurences of VCC hint
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408 >
2022-04-13 21:52:43 +00:00
Daniel Schürmann
b10c4d7dee
aco: make program->needs_vcc independent of VCC hints
...
Totals from 5 (0.00% of 135048) affected shaders: (GFX9)
SGPRs: 208 -> 160 (-23.08%)
CodeSize: 2700 -> 2692 (-0.30%)
Instrs: 533 -> 531 (-0.38%)
Latency: 41688 -> 41680 (-0.02%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408 >
2022-04-13 21:52:43 +00:00
Daniel Schürmann
415a3820fc
aco/ra: omit VCC affinity on VOPC_SDWA for GFX9+
...
VOPC_SDWA can also use arbitrary SGPR pairs on GFX9+.
Totals from 5607 (4.16% of 134913) affected shaders: (GFX10.3)
CodeSize: 42470760 -> 42452988 (-0.04%)
Instrs: 7943174 -> 7942883 (-0.00%)
Latency: 102887029 -> 102886305 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 20454456 -> 20454338 (-0.00%); split: -0.00%, +0.00%
Copies: 376818 -> 376865 (+0.01%); split: -0.00%, +0.01%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408 >
2022-04-13 21:52:43 +00:00
Daniel Schürmann
6ebc61d71b
aco/ra: create VCC-affinities during RA
...
instead of using register hints.
Totals from 88367 (65.50% of 134913) affected shaders: (GFX10.3)
CodeSize: 322492184 -> 322252912 (-0.07%); split: -0.08%, +0.01%
Instrs: 60615809 -> 60541260 (-0.12%); split: -0.12%, +0.00%
Latency: 557067980 -> 557009210 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 109676757 -> 109674804 (-0.00%); split: -0.00%, +0.00%
SClause: 1939703 -> 1939924 (+0.01%); split: -0.01%, +0.02%
Copies: 4557567 -> 4487530
(-1.54%); split: -1.54%, +0.00%
Branches: 1941123 -> 1937453 (-0.19%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408 >
2022-04-13 21:52:43 +00:00
Daniel Schürmann
44fb9ba84a
aco/ra: only use VCC if program->needs_vcc == true
...
A future commit will make VCC register assignment independent
from register hints. Up to GFX9, VCC can alternatively be used
as regular SGPR, so prevent overlap.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15408 >
2022-04-13 21:52:43 +00:00
Lionel Landwerlin
08f3950d6b
anv: stop using old entrypoint/struct/enum names for 1.3
...
v2: More replacements
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com >
Reviewed-by: Tapani Pälli <tapani.palli@intel.com > (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15920 >
2022-04-13 21:13:56 +00:00
Emma Anholt
5fad6bca72
nir_to_tgsi: Do the required cleanup for nir_opt_find_array_copies().
...
If we made a copy deref, then we need to do dead-write elimination for the
pervious writes or we'll just emit the same copy deref again next time
around. And, at the end of the opt loop, we need to lower copy derefs
because later passes (locals_to_regs, notably) depend on it.
Fixes infinite opt loop on fs-function-inout-array with virgl on NTT.
Reviewed-by: Gert Wollny <gert.wollny@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15899 >
2022-04-13 19:44:39 +00:00
Jason Ekstrand
c8df09ebd4
iris: More gracefully fail in resource_from_user_memory
...
rusticl (and clover) would like to get a graceful fail here so they can
fall back to a shadow copy instead of us asserting. We also start
rejecting arrayed surface because isl doesn't allow selecting a QPitch
yet. Even if it did, QPitch is horribly restrictive, even for linear
surfaces, that it likely wouldn't be that useful.
Fixes: e81f3edf76
("iris: Allow userptr on 1D and 2D images")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15903 >
2022-04-13 19:18:54 +00:00
Mike Blumenkrantz
8501661332
zink: set optimal tiling on swapchain images
...
this otherwise breaks kopper
fixes #6294
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15928 >
2022-04-13 19:01:29 +00:00
Louis-Francis Ratté-Boulianne
3017522e74
dzn: Add CI target for vulkan driver
...
A custom branch of `deqp` is used to have proper results when
crashing. See:
https://github.com/KhronosGroup/VK-GL-CTS/issues/311
A custom branch of `deqp-runner` with Windows support is also
used until the changes are merged into the main repository.
The `api`, `info`, `draw`, `query-pool` and `memory` test cases are
executed for now.
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com >
Acked-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15742 >
2022-04-13 18:05:44 +00:00
Louis-Francis Ratté-Boulianne
fb24f34fc3
dzn: Add a debug flag to enable D3D12 debug layer
...
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15742 >
2022-04-13 18:05:44 +00:00
Karmjit Mahil
f7ddd584ab
pvr: Implement vkCreateQueryPool() and vkDestroyQueryPool().
...
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com >
Reviewed-by: Frank Binns <frank.binns@imgtec.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15880 >
2022-04-13 17:58:03 +00:00
Karmjit Mahil
1250e30929
pvr: Add pvrsrvkm visibility test heap.
...
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com >
Reviewed-by: Frank Binns <frank.binns@imgtec.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15880 >
2022-04-13 17:58:03 +00:00
Karmjit Mahil
76ee1671f6
pvr: Add core count info and pvr_device_runtime_info.
...
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com >
Reviewed-by: Frank Binns <frank.binns@imgtec.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15880 >
2022-04-13 17:58:03 +00:00
Jason Ekstrand
93fbaae7d5
v3dv: Add emulated timeline semaphore support
...
This is trivial thanks to the emulated timelines provided in common
code. "Real" timeline semaphores which can be shared across processes
will require kernel support.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704 >
2022-04-13 17:22:14 +00:00
Jason Ekstrand
1cc917bc68
v3dv: Use the core version property helpers
...
vulkaninfo is the same before and after.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704 >
2022-04-13 17:22:14 +00:00
Jason Ekstrand
1973b2da9d
v3dv: Use the core version feature helpers
...
vulkaninfo is the same before and after.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704 >
2022-04-13 17:22:14 +00:00
Jason Ekstrand
316728a55b
v3dv: Switch to the common submit framework
...
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704 >
2022-04-13 17:22:14 +00:00
Jason Ekstrand
321f0b85f2
v3dv: Always wait on last_job_syncs if job->serialize
...
Even if we're the first job on some queue, there may be no wait
semaphores but we still need to ensure things happen in-order. (See
the "Implicit Synchronization Guarantees" section of the Vulkan spec.)
The client can submit back-to-back command buffers with no semaphores
between them and it needs to adt the same as if there were a semaphore.
If job->serialize is set because of a barrier or something, we still
need to synchronize across HW queues by waiting on last_job_syncs.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704 >
2022-04-13 17:22:14 +00:00
Jason Ekstrand
00b84fae2d
v3dv: Add a condition variable for queries
...
In order to properly wait for a query to be complete, we need to first
wait for the end query job to flush through on the queue. Since query
end is always handled on the CPU, we can do this with a condition
variable. The 2s timeout is taken from ANV.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704 >
2022-04-13 17:22:14 +00:00
Jason Ekstrand
e5a0e2122f
v3dv: Use util/os_time helpers
...
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704 >
2022-04-13 17:22:14 +00:00
Jason Ekstrand
8bd7bd9577
v3dv: Switch to the common device lost tracking
...
Vulkan requires that, once the device has been lost, you keep returning
VK_ERROR_DEVICE_LOST. We've got tracking for this in common code; it
just needs to be wired up.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704 >
2022-04-13 17:22:14 +00:00
Jason Ekstrand
32527f3ccc
v3dv: Destroy the device mutex on the teardown path
...
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704 >
2022-04-13 17:22:14 +00:00
Jason Ekstrand
30191fd9df
v3dv: Don't use pthread functions on c11 mutexes
...
This only works because c11/threads.h is typedeffing the c11 stuff to
ptrheads.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704 >
2022-04-13 17:22:14 +00:00
Jason Ekstrand
25441b5e5c
v3dv: Put indirect compute CSD jobs in the job list
...
Instead of having the CPU job execute the CSD job, put both jobs on the
list with the CPU job first which modifies the GPU job which gets kicked
off next. This gives the queue code more visibility into what types of
jobs are actually in the list. In particular, if an indirect compute
job is the last job in a batch buffer, it currently appears as if the
batch ends with CPU work which isn't true because it kicks off GPU work.
In that case, the last job on the list is now a GPU job, which better
matches reality.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704 >
2022-04-13 17:22:14 +00:00
Jason Ekstrand
0208bb2d58
v3dv: Stop directly setting vk_device::alloc
...
vk_device_init() will do this.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704 >
2022-04-13 17:22:14 +00:00
Jason Ekstrand
b284c512e6
vulkan/drm_syncobj: Implement WAIT_PENDING with a sync_file lookup
...
The v3dv kernel driver doesn't support timelines yet but we want
threaded submit and that requires WAIT_PENDING. Fortunately, it should
never sit in this loop for long in practice. The primary use-case is
sorting out dependencies and these checks will always trivially succeed
for non-shared semaphores because v3dv only has a single queue.
Acked-by: Alejandro Piñeiro <apinheiro@igalia.com >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704 >
2022-04-13 17:22:14 +00:00
Rhys Perry
7478b00c7c
aco: remove old global access intrinsics
...
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Rhys Perry
61ac5acca3
radv,ac/nir: lower global access to _amd global access intrinsics
...
fossil-db (Sienna Cichlid):
Totals from 400 (0.30% of 134621) affected shaders:
VGPRs: 18696 -> 18688 (-0.04%)
CodeSize: 2031348 -> 1946640 (-4.17%)
Instrs: 374703 -> 360226 (-3.86%)
Latency: 4200727 -> 4108628 (-2.19%); split: -2.20%, +0.01%
InvThroughput: 1059935 -> 1029441 (-2.88%); split: -2.88%, +0.00%
VClause: 5777 -> 5771 (-0.10%)
SClause: 11890 -> 10891 (-8.40%); split: -8.57%, +0.17%
Copies: 34035 -> 33259 (-2.28%); split: -2.98%, +0.70%
Branches: 11108 -> 11100 (-0.07%); split: -0.08%, +0.01%
PreSGPRs: 15999 -> 15942 (-0.36%); split: -0.44%, +0.08%
PreVGPRs: 16994 -> 16970 (-0.14%)
fossil-db (Polaris10):
Totals from 400 (0.29% of 135668) affected shaders:
SGPRs: 23799 -> 22919 (-3.70%); split: -4.30%, +0.61%
VGPRs: 18480 -> 18472 (-0.04%)
CodeSize: 2090316 -> 2041592 (-2.33%)
Instrs: 395461 -> 385747 (-2.46%); split: -2.46%, +0.00%
Latency: 5045768 -> 5020196 (-0.51%); split: -0.53%, +0.02%
InvThroughput: 2694320 -> 2689886 (-0.16%); split: -0.23%, +0.07%
VClause: 5982 -> 5968 (-0.23%)
SClause: 12064 -> 10823 (-10.29%); split: -10.33%, +0.04%
Copies: 48233 -> 48322 (+0.18%); split: -0.47%, +0.65%
PreSGPRs: 16409 -> 16358 (-0.31%); split: -0.39%, +0.08%
fossil-db (Pitcairn):
Totals from 400 (0.29% of 135668) affected shaders:
SGPRs: 22431 -> 22215 (-0.96%); split: -2.60%, +1.64%
VGPRs: 18776 -> 18560 (-1.15%); split: -1.21%, +0.06%
CodeSize: 2104440 -> 2017708 (-4.12%)
MaxWaves: 2363 -> 2367 (+0.17%)
Instrs: 413099 -> 397446 (-3.79%)
Latency: 5507707 -> 5450251 (-1.04%); split: -1.12%, +0.07%
InvThroughput: 2838867 -> 2786903 (-1.83%); split: -1.83%, +0.00%
VClause: 10334 -> 10097 (-2.29%)
SClause: 12346 -> 11005 (-10.86%); split: -10.89%, +0.02%
Copies: 54034 -> 52065 (-3.64%); split: -3.99%, +0.35%
PreSGPRs: 17916 -> 17857 (-0.33%); split: -0.40%, +0.07%
PreVGPRs: 16917 -> 16893 (-0.14%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Rhys Perry
9d1bab3615
aco: increase global_load_params.max_const_offset_plus_one
...
The callback now supports this. This shouldn't have any effect yet except
on GFX6 with 12 byte loads.
fossil-db (Pitcairn):
Totals from 246 (0.18% of 135668) affected shaders:
VGPRs: 14684 -> 14768 (+0.57%); split: -0.44%, +1.01%
CodeSize: 1765792 -> 1738040 (-1.57%)
Instrs: 344605 -> 340055 (-1.32%)
Latency: 4892904 -> 4861942 (-0.63%)
InvThroughput: 2479599 -> 2446070 (-1.35%)
VClause: 8782 -> 8735 (-0.54%)
SClause: 9854 -> 9853 (-0.01%)
Copies: 47327 -> 45401 (-4.07%); split: -4.08%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Rhys Perry
3e9517c757
aco: implement _amd global access intrinsics
...
fossil-db (Sienna Cichlid):
Totals from 7 (0.01% of 134621) affected shaders:
VGPRs: 760 -> 776 (+2.11%)
CodeSize: 222000 -> 222044 (+0.02%); split: -0.01%, +0.03%
Instrs: 40959 -> 40987 (+0.07%); split: -0.01%, +0.08%
Latency: 874811 -> 886609 (+1.35%); split: -0.00%, +1.35%
InvThroughput: 437405 -> 443303 (+1.35%); split: -0.00%, +1.35%
VClause: 1242 -> 1240 (-0.16%)
SClause: 1050 -> 1049 (-0.10%); split: -0.19%, +0.10%
Copies: 4953 -> 4973 (+0.40%); split: -0.04%, +0.44%
Branches: 1947 -> 1957 (+0.51%); split: -0.05%, +0.56%
PreVGPRs: 741 -> 747 (+0.81%)
fossil-db changes seem to be noise.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Rhys Perry
eeefe31014
ac/llvm: implement _amd global access intrinsics
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Rhys Perry
5c038b3f02
nir: add _amd global access intrinsics
...
These are the same as the normal ones, but they take an unsigned 32-bit
offset in BASE and another unsigned 32-bit offset in the last source.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Rhys Perry
391bf3ea30
aco: don't expand smem/mubuf global loads
...
For example, dwordx3->dwordx4 or ubyte3->dwordx2.
Global loads don't have the bounds checking that buffer loads have that
makes this safe.
The alignment checks are added to global_load_callback() in case
byte_align_loads=false, align=1 and bytes_needed=3. Without them, the
callback will create a dword load.
fossil-db (Sienna Cichlid):
Totals from 267 (0.20% of 134621) affected shaders:
CodeSize: 1603352 -> 1606568 (+0.20%)
Instrs: 294946 -> 295482 (+0.18%); split: -0.00%, +0.18%
Latency: 2997003 -> 2997052 (+0.00%); split: -0.02%, +0.02%
InvThroughput: 526645 -> 526659 (+0.00%)
SClause: 9179 -> 9185 (+0.07%); split: -0.02%, +0.09%
Copies: 25363 -> 25375 (+0.05%); split: -0.08%, +0.13%
Branches: 8298 -> 8299 (+0.01%)
fossil-db (Polaris10):
Totals from 267 (0.20% of 135668) affected shaders:
CodeSize: 1636672 -> 1638756 (+0.13%); split: -0.00%, +0.13%
Instrs: 308484 -> 308733 (+0.08%); split: -0.01%, +0.09%
Latency: 3446045 -> 3446904 (+0.02%); split: -0.00%, +0.03%
InvThroughput: 1206722 -> 1206828 (+0.01%); split: -0.00%, +0.01%
SClause: 9308 -> 9311 (+0.03%); split: -0.08%, +0.11%
Copies: 36933 -> 36921 (-0.03%); split: -0.08%, +0.05%
fossil-db (Pitcairn):
Totals from 275 (0.20% of 135668) affected shaders:
SGPRs: 17616 -> 17520 (-0.54%); split: -0.64%, +0.09%
VGPRs: 15428 -> 15540 (+0.73%); split: -0.23%, +0.96%
CodeSize: 1885792 -> 1929120 (+2.30%); split: -0.00%, +2.30%
MaxWaves: 1284 -> 1285 (+0.08%)
Instrs: 368963 -> 376095 (+1.93%); split: -0.00%, +1.94%
Latency: 5122922 -> 5168398 (+0.89%); split: -0.01%, +0.90%
InvThroughput: 2562866 -> 2604279 (+1.62%)
VClause: 9268 -> 9296 (+0.30%); split: -0.13%, +0.43%
SClause: 10702 -> 10705 (+0.03%); split: -0.05%, +0.07%
Copies: 48620 -> 50629 (+4.13%); split: -0.08%, +4.21%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Rhys Perry
6baad09711
aco: use saddr for global access with sgpr address
...
fossil-db (Sienna Cichlid):
Totals from 38 (0.03% of 134621) affected shaders:
CodeSize: 237196 -> 237060 (-0.06%); split: -0.09%, +0.03%
Instrs: 43895 -> 43894 (-0.00%); split: -0.02%, +0.01%
Latency: 914633 -> 916263 (+0.18%); split: -0.01%, +0.19%
InvThroughput: 468215 -> 468971 (+0.16%); split: -0.02%, +0.18%
SClause: 1239 -> 1242 (+0.24%)
PreSGPRs: 997 -> 1003 (+0.60%)
PreVGPRs: 936 -> 923 (-1.39%); split: -1.50%, +0.11%
Regression seems to be RA noise, creating a waitcnt.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Rhys Perry
d957730b9b
aco: use vcc for 64-bit vgpr addition
...
fossil-db (Sienna Cichlid):
Totals from 229 (0.17% of 134621) affected shaders:
CodeSize: 1520192 -> 1517644 (-0.17%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Rhys Perry
0360e12ebf
radv: don't require robust vectorization for nir_var_mem_global
...
Robust vectorization is to prevent vectorization of loads using the near
maximum offset with loads of offset 0. Global loads can't read from offset
0 (NULL) anyways, so this isn't necessary.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14124 >
2022-04-13 16:23:35 +00:00
Jason Ekstrand
6ca328988f
iris: Don't leak scratch BOs
...
Fixes: 4d219b0eb3
("iris: implement scratch space!")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15897 >
2022-04-13 15:56:50 +00:00
Timur Kristóf
394b93bfeb
radv: Only use TES vertex offset 2 for triangles and quads.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15837 >
2022-04-13 15:04:26 +00:00