Georg Lehmann
8fabde3be4
aco/gfx11: use dpp_row_xmask and dpp_row_share
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21412 >
2023-02-27 11:09:42 +00:00
Georg Lehmann
b7cd0eb439
aco: use v_permlane(x)16_b32 for masked swizzle
...
Should be cheaper than ds_swizzle.
Totals from 8 (0.01% of 134913) affected shaders:
CodeSize: 16316 -> 16388 (+0.44%)
Instrs: 3088 -> 3086 (-0.06%)
Latency: 49558 -> 49508 (-0.10%)
InvThroughput: 9180 -> 9198 (+0.20%)
Copies: 376 -> 384 (+2.13%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21412 >
2023-02-27 11:09:42 +00:00
Rhys Perry
94abccf3ce
aco: fix pathological case in LdsDirectVALUHazard
...
Similar to bfd4ac4581
.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Fixes: 296b4d95a3
("aco/gfx11: workaround LdsDirectVALUHazard")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21423 >
2023-02-22 20:46:12 +00:00
Georg Lehmann
ee47cc8256
amd,nir: remove byte_permute_amd intrinsic
...
It's unused and if we ever want to use it again we should make it an alu
opcode instead.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21445 >
2023-02-22 20:13:52 +00:00
Rhys Perry
ab3184c0a2
aco: don't apply modifiers through DPP to unsupported instructions
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21201 >
2023-02-21 14:59:38 +00:00
Georg Lehmann
3bd5b583f9
aco: combine a ^ ~b and ~(a ^ b) to v_xnor_b32
...
Foz-DB Navi21:
Totals from 13 (0.01% of 134913) affected shaders:
CodeSize: 225432 -> 225180 (-0.11%)
Instrs: 41973 -> 41908 (-0.15%)
Latency: 297464 -> 297326 (-0.05%)
InvThroughput: 82536 -> 82467 (-0.08%)
Copies: 2452 -> 2440 (-0.49%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21410 >
2023-02-21 13:35:31 +00:00
Daniel Schürmann
2bb369dd8d
nir: add assertions that loops don't have a Continue Construct
...
Hoping that I didn't miss any, this *should* add assertions
to all functions and passes which explicitly handle 'nir_loop'.
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962 >
2023-02-21 10:41:11 +00:00
Timur Kristóf
2c40215ab9
aco/optimizer: Change v_cmp with subgroup invocation to constant.
...
When a shader has a comparison with the subgroup invocation id,
we can use a constant instead, saving a VALU instruction.
When the constant can't be represented as a 64-bit literal,
use the s_bfm_b64 instruction to generate it instead, which
is still a win.
Fossil DB stats on GFX11:
Totals from 300 (0.22% of 134913) affected shaders:
CodeSize: 2223052 -> 2214336 (-0.39%); split: -0.43%, +0.04%
Instrs: 430216 -> 429882 (-0.08%); split: -0.14%, +0.06%
Latency: 5881180 -> 5878181 (-0.05%); split: -0.05%, +0.00%
InvThroughput: 731846 -> 729293 (-0.35%)
Copies: 31662 -> 31847 (+0.58%); split: -0.03%, +0.61%
Branches: 8241 -> 8100 (-1.71%)
PreVGPRs: 15788 -> 15786 (-0.01%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20843 >
2023-02-18 21:16:58 +01:00
Daniel Schürmann
b338d59047
radv: unconditionally enable scratch for RT shaders
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21159 >
2023-02-16 19:37:25 +00:00
Timur Kristóf
084d10a702
aco: Remove MTBUF zero operand.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21363 >
2023-02-16 17:16:34 +00:00
Timur Kristóf
afdacf4dcc
aco: Don't set scalar offset on buffer load instructions when it's zero.
...
This helps generate slightly more optimal instructions.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21363 >
2023-02-16 17:16:34 +00:00
Timur Kristóf
4621ffdec1
aco: Get rid of redundant load_vmem_mubuf function.
...
Call emit_load directly from visit_load_buffer instead.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358 >
2023-02-16 15:29:37 +00:00
Timur Kristóf
881c52ba19
ac: Port ACO's get_fetch_format to ac_get_safe_fetch_size.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358 >
2023-02-16 15:29:36 +00:00
Georg Lehmann
4fbcd046ce
aco: Don't use vcmpx with DPP.
...
V_CMPX+DPP returns 0 with reads from disabled lanes, unlike V_CMP+DPP (RDNA3 ISA doc, 7.7)
Fixes: baab6f18c9
("aco: Optimize branching sequence during SSA elimination.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20537 >
2023-02-14 19:15:17 +00:00
Georg Lehmann
281a505ef0
aco: new 16bit VOP3 opcodes can use opsel
...
No Foz-DB changes on gfx11.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20705 >
2023-02-14 16:14:55 +00:00
Georg Lehmann
533d0008c7
aco: remove stale TODOs about v_interp opsel
...
These are already handled correctly according to the ISA docs.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21096 >
2023-02-10 12:01:56 +00:00
Rhys Perry
b4383821e7
aco: don't modify exec in p_interp_gfx11
...
The RDNA3 ISA docs say that lds_param_load write the entire quad
regardless of exec, so this isn't needed.
fossil-db (gfx1100):
Totals from 5291 (3.93% of 134574) affected shaders:
Instrs: 4891396 -> 4789628 (-2.08%)
CodeSize: 25519032 -> 25111960 (-1.60%)
Latency: 36122982 -> 36074300 (-0.13%); split: -0.14%, +0.00%
InvThroughput: 4162436 -> 4161424 (-0.02%); split: -0.02%, +0.00%
Copies: 263862 -> 263838 (-0.01%)
PreSGPRs: 225012 -> 224179 (-0.37%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21171 >
2023-02-08 19:35:54 +00:00
Georg Lehmann
6e4598f7b9
aco: support omod/imod for v_fmac_f16
...
Only matters for post-RA DPP16.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21174 >
2023-02-08 18:52:28 +00:00
Georg Lehmann
2deda5c0be
aco: don't list imod/omod support v_fmaak_f32/v_fmamk_f32
...
We can never use them anyway because these opcodes don't support VOP3/DPP16/SDWA
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21174 >
2023-02-08 18:52:28 +00:00
Georg Lehmann
4c9ac73064
aco: allow output modifiers for ldexp_f16
...
It also supports imod for the first operand, but we cannot express that at
moment.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21174 >
2023-02-08 18:52:28 +00:00
Georg Lehmann
b63aa2bb8e
aco: don't allow output modifiers for v_cvt_pkrtz_f16_f32
...
Cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21174 >
2023-02-08 18:52:28 +00:00
Georg Lehmann
5038a049f1
aco: add mov/cndmask opcodes to does_fp_op_flush_denorms
...
For completeness sake also add v_mov_b32, even if we don't use imod for it
because it's only supported since gfx10.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21170 >
2023-02-08 13:07:46 +00:00
Georg Lehmann
c8adf16278
aco: fix imod/omod for gfx11 VOP3 opcodes
...
Fixes: d8d99c3c4f
("aco: add GFX11 opcode numbers")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21170 >
2023-02-08 13:07:46 +00:00
Rhys Perry
fad1f716dd
aco: fix out-of-bounds access when moving s_mem(real)time across SMEM
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8224
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21138 >
2023-02-07 14:50:43 +00:00
Rhys Perry
24618721d3
ac: move ring_offsets to ac_shader_args
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19202 >
2023-02-06 14:25:15 +00:00
Qiang Yu
ed419f46aa
aco: remove early_rast wait insert
...
It's done in nir position export.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691 >
2023-02-03 12:27:44 +00:00
Qiang Yu
f6b194b648
nir,ac/llvm,aco,radv,radeonsi: remove nir_export_vertex_amd
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691 >
2023-02-03 12:27:44 +00:00
Qiang Yu
f44872c7b6
nir,ac/llvm,aco: remove nir_export_primitive_amd
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691 >
2023-02-03 12:27:44 +00:00
Qiang Yu
601ad9e0a9
amd,radeonsi: implement nir_load_force_vrs_rates_amd in driver abi
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691 >
2023-02-03 12:27:43 +00:00
Qiang Yu
8331842258
aco: implement nir_export_amd
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691 >
2023-02-03 12:27:43 +00:00
Georg Lehmann
3c25edfdb7
aco: Improve wave64 cycle estimates.
...
Reviewed-By: Tatsuyuki Ishi <ishitatsuyuki@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20507 >
2023-02-02 17:59:23 +01:00
Rhys Perry
bfd4ac4581
aco: limit VALUPartialForwardingHazard search
...
Complicated CFG and lots of SALU can cause this to take an extremely long
time to finish.
Fixes
dEQP-VK.graphicsfuzz.cov-value-tracking-selection-dag-negation-clamp-loop
and Monster Hunter Rise demo compile times.
fossil-db (gfx1100):
Totals from 57 (0.04% of 134574) affected shaders:
Instrs: 170919 -> 171165 (+0.14%)
CodeSize: 860144 -> 861128 (+0.11%)
Latency: 961466 -> 961505 (+0.00%)
InvThroughput: 127598 -> 127608 (+0.01%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8153
Fixes: 5806f0246f
("aco/gfx11: workaround VALUPartialForwardingHazard")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20941 >
2023-02-01 18:52:40 +00:00
Georg Lehmann
2b264455b5
aco: use s_pack_ll_b32_b16 for constant copies
...
Totals from 2 (0.00% of 134913) affected shaders:
CodeSize: 28636 -> 28628 (-0.03%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20970 >
2023-02-01 17:07:25 +00:00
Georg Lehmann
9ee9b0859b
aco: use s_bfm_64 for constant copies
...
Foz-DB Navi21:
Totals from 1025 (0.76% of 134913) affected shaders:
CodeSize: 1436752 -> 1432412 (-0.30%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20970 >
2023-02-01 17:07:25 +00:00
Rhys Perry
bbc5247bf7
aco/spill: always end spill vgpr after control flow
...
To fix a hypothetical issue:
v0 = start_linear_vgpr
if (...) {
} else {
use_linear_vgpr(v0)
}
v0 = phi
We need a p_end_linear_vgpr to ensure that the phi does not use the same
VGPR as the linear VGPR.
This is also much simpler.
fossil-db (gfx1100):
Totals from 1195 (0.89% of 134574) affected shaders:
Instrs: 4123856 -> 4123826 (-0.00%); split: -0.00%, +0.00%
CodeSize: 21461256 -> 21461100 (-0.00%); split: -0.00%, +0.00%
Latency: 62816001 -> 62812999 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 9339049 -> 9338564 (-0.01%); split: -0.01%, +0.00%
Copies: 304028 -> 304005 (-0.01%); split: -0.02%, +0.01%
PreVGPRs: 115761 -> 115762 (+0.00%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20621 >
2023-02-01 15:45:22 +00:00
Rhys Perry
850d945baf
aco/tests: add setup_reduce_temp.divergent_if_phi
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20621 >
2023-02-01 15:45:22 +00:00
Rhys Perry
44fdd2ebcb
aco: end reduce tmp after control flow, when used within control flow
...
In the case of:
v0 = start_linear_vgpr
if (...) {
} else {
use_linear_vgpr(v0)
}
v0 = phi
We need a p_end_linear_vgpr to ensure that the phi does not use the same
VGPR as the linear VGPR.
fossil-db (gfx1100):
Totals from 3763 (2.80% of 134574) affected shaders:
MaxWaves: 90296 -> 90164 (-0.15%)
Instrs: 6857726 -> 6856608 (-0.02%); split: -0.03%, +0.01%
CodeSize: 35382188 -> 35377688 (-0.01%); split: -0.02%, +0.01%
VGPRs: 234864 -> 235692 (+0.35%); split: -0.01%, +0.36%
Latency: 47471923 -> 47474965 (+0.01%); split: -0.03%, +0.04%
InvThroughput: 5640320 -> 5639736 (-0.01%); split: -0.04%, +0.03%
VClause: 93098 -> 93107 (+0.01%); split: -0.01%, +0.02%
SClause: 214137 -> 214130 (-0.00%); split: -0.00%, +0.00%
Copies: 369895 -> 369305 (-0.16%); split: -0.31%, +0.15%
Branches: 164996 -> 164504 (-0.30%); split: -0.30%, +0.00%
PreVGPRs: 210655 -> 211438 (+0.37%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20621 >
2023-02-01 15:45:22 +00:00
Rhys Perry
695cf75266
aco: set has_color_exports with GPL
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Fixes: 192486b7aa
("aco/gfx11: export mrtz in discard early exit for non-color shaders")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20937 >
2023-01-27 16:51:56 +00:00
Erik Faye-Lund
d54c8a47c6
meson: avoid using deprecated build_root() method
...
The meson.build_root() method has been deprecated, so let's switch to
meson.project_build_root(), which usually means the same thing. The case
where it doesn't do the same thing is if Mesa is a subproject to some
other project, but in that case I believe we want the build root of Mesa,
not of the parent project anyway.
Reviewed-by: Eric Engestrom <eric@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20907 >
2023-01-27 11:35:50 +00:00
Timur Kristóf
c644461b71
radv, aco, ac: Implement pack_half_2x16_rtz_split.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15838 >
2023-01-26 12:24:24 +00:00
Timur Kristóf
9fc5d8d211
aco: Remove dynamic VS input loads.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20733 >
2023-01-26 02:43:11 +00:00
Timur Kristóf
81620fc7b0
aco: Enable constant exec mask based optimization on compute shaders.
...
We know for sure exec is initially -1 when the shader always has full subgroups.
Fossil DB stats on GFX11:
Totals from 3884 (2.88% of 134913) affected shaders:
SpillSGPRs: 1673 -> 1697 (+1.43%); split: -1.67%, +3.11%
SpillVGPRs: 2316 -> 2310 (-0.26%); split: -0.65%, +0.39%
CodeSize: 19584436 -> 19567156 (-0.09%); split: -0.13%, +0.04%
Scratch: 217088 -> 216832 (-0.12%)
Instrs: 3784596 -> 3780303 (-0.11%); split: -0.15%, +0.03%
Latency: 39971204 -> 39794967 (-0.44%); split: -0.47%, +0.03%
InvThroughput: 7885552 -> 7801247 (-1.07%); split: -1.14%, +0.07%
VClause: 74654 -> 74611 (-0.06%); split: -0.07%, +0.01%
SClause: 103139 -> 103043 (-0.09%); split: -0.13%, +0.04%
Copies: 279864 -> 281995 (+0.76%); split: -0.72%, +1.48%
Branches: 92082 -> 92084 (+0.00%); split: -0.03%, +0.03%
PreSGPRs: 155637 -> 149491 (-3.95%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20670 >
2023-01-26 01:59:26 +00:00
Timur Kristóf
39448c8e9c
radv, aco: Add uses_full_subgroups to compute shader info.
...
Allow the compiler to assume that the shader always has full subgroups,
meaning that the initial EXEC mask is -1 in all waves (all lanes enabled).
This assumption is incorrect for ray tracing and internal (meta) shaders
because they can use unaligned dispatch.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20670 >
2023-01-26 01:59:26 +00:00
Georg Lehmann
e527f686ca
Revert "aco: Combine v_cvt_u32_f32 with insert to v_cvt_pk_u8_f32."
...
This reverts commit 6d02054047
.
v_cvt_pk_u8_f32 returns 0xff instead of v_cvt_u32_f32 & 0xff if the input is
larger than 255.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8128
Cc: mesa-stable
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20829 >
2023-01-23 16:22:55 +00:00
Rhys Perry
26e4621fa2
aco/tests: update assembler tests for latest LLVM 16
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20747 >
2023-01-23 12:30:28 +00:00
Rhys Perry
b0fa106dc6
aco/tests: fix assembler.gfx11.vop12c_v128 with LLVM 15
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8089
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20747 >
2023-01-23 12:30:28 +00:00
Dylan Baker
3c5e969144
meson: replace uses of ExternalProgram.path with .full_path
...
The former is deprecated
Reviewed-by: Jesse Natalie <jenatali@microsoft.com >
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20409 >
2023-01-19 16:29:03 +00:00
Rhys Perry
068c84f275
aco: add support for fp32 addition atomics
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19810 >
2023-01-17 17:39:15 +00:00
Timur Kristóf
faba30a8f3
aco/optimizer: Optimize p_extract + v_mul_u32_u24 to v_mad_u32_u16.
...
This should perform the same but removes SDWA from the address
calculations in NGG culling shaders for example.
This is done because SDWA is no longer available on GFX11.
Fossil DB stats on GFX1100:
Totals from 36 (0.03% of 134913) affected shaders:
CodeSize: 300968 -> 300884 (-0.03%); split: -0.04%, +0.01%
Instrs: 60955 -> 60863 (-0.15%); split: -0.15%, +0.00%
Latency: 426809 -> 426819 (+0.00%); split: -0.06%, +0.06%
InvThroughput: 39076 -> 39025 (-0.13%); split: -0.14%, +0.01%
VClause: 1440 -> 1443 (+0.21%)
Copies: 5714 -> 5725 (+0.19%)
Fossil DB stats on GFX1100 with NGG culling enabled:
Totals from 60953 (45.18% of 134913) affected shaders:
VGPRs: 2273172 -> 2273160 (-0.00%)
CodeSize: 186401864 -> 186403036 (+0.00%); split: -0.00%, +0.00%
Instrs: 37038048 -> 36977353 (-0.16%); split: -0.16%, +0.00%
Latency: 146466770 -> 146350172 (-0.08%); split: -0.08%, +0.00%
InvThroughput: 15342790 -> 15228585 (-0.74%); split: -0.74%, +0.00%
VClause: 669662 -> 669665 (+0.00%)
Copies: 2972380 -> 2972482 (+0.00%); split: -0.01%, +0.01%
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17924 >
2023-01-16 19:27:39 +00:00
Timur Kristóf
171d76ded1
aco/optimizer: Add missing v_lshlrev condition to can_apply_extract.
...
This was already handled by apply_extract but missing from
can_apply_extract, therefore may not be properly applied everywhere.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17924 >
2023-01-16 19:27:39 +00:00