When loading a TCS or GS input, we generate some code to read the URB
handle for a particular input control point (ICP handle), which often
involves indirect addressing due to a non-constant vertex.
For example:
mov(8) vgrf148+0.0:UW, 76543210V
shl(8) vgrf149:UD, vgrf148+0.0:UW, 2u
shl(8) vgrf150:UD, vgrf145:UD, 5u
add(8) vgrf151:UD, vgrf150:UD, vgrf149:UD
mov_indirect(8) vgrf147:UD, g2:UD, vgrf151:UD, 96u
Unfortunately, the first load with 76543210V is considered a partial
write because the 8 channels of 16-bit UW data doesn't fill an entire
register, and we can't allocate VGRFs at sub-register granularity.
This causes none of the above math to be CSE'd, even though the first
two instructions are common to *all* input loads, and the rest may be
reused sometimes as well.
To work around this, we stop emitting 76543210V to a temporary, and
instead use nir_system_values[SYSTEM_VALUE_SUBGROUP_INVOCATION], which
already contains this value, and is unconditionally set up for us.
With all input loads using the same register for the sequence, our
CSE pass is able to eliminate the rest of the common math.
shader-db results on Tigerlake:
total instructions in shared programs: 20748243 -> 20744844 (-0.02%)
instructions in affected programs: 73410 -> 70011 (-4.63%)
helped: 242 / HURT: 21
helped stats (abs) min: 1 max: 37 x̄: 14.17 x̃: 15
helped stats (rel) min: 0.17% max: 19.58% x̄: 6.13% x̃: 6.32%
HURT stats (abs) min: 1 max: 4 x̄: 1.38 x̃: 1
HURT stats (rel) min: 0.18% max: 1.31% x̄: 0.58% x̃: 0.58%
95% mean confidence interval for instructions value: -13.73 -12.12
95% mean confidence interval for instructions %-change: -6.00% -5.19%
Instructions are helped.
total cycles in shared programs: 785828951 -> 785788480 (<.01%)
cycles in affected programs: 597593 -> 557122 (-6.77%)
helped: 227 / HURT: 13
helped stats (abs) min: 6 max: 624 x̄: 182.19 x̃: 185
helped stats (rel) min: 0.24% max: 18.22% x̄: 7.85% x̃: 7.80%
HURT stats (abs) min: 2 max: 153 x̄: 68.08 x̃: 36
HURT stats (rel) min: 0.03% max: 7.79% x̄: 2.97% x̃: 1.25%
95% mean confidence interval for cycles value: -182.55 -154.71
95% mean confidence interval for cycles %-change: -7.84% -6.69%
Cycles are helped.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18455>
With the following SEND instruction :
send(1) nullUD nullUD g0UD 0x4200c504 a0.1<0>UD
This instruction although valid but somewhat nonsensical (SEND message
to write at offset contained in NULL register), triggers an error in
the validator.
The restriction is that we cannot have overlapping sources. The
validator not checking the type of register incorrectly thinks that
the null register (offset 0) is the same as g0.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17555>
For Gen11 and prior, the dispatch mode for TCS was SINGLE_PATCH, and
this debug setting could be used to change it to 8_PATCH (falling back
to SINGLE_PATCH when shader couldn't be in the multi dispatch mode).
However after talking to Ken, seems this debug setting is not really
worth keeping around, so removing it.
For Gen12+ the only option is 8_PATCH, so it was always using that
dispatch mode as before.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18151>
Warning messages:
../src/intel/compiler/test_eu_compact.cpp:238:1: warning: 'InstantiateTestCase_P_IsDeprecated' is deprecated: INSTANTIATE_TEST_CASE_P is deprecated, please use INSTANTIATE_TEST_SUITE_P [-Wdeprecated-declarations]
../src/intel/compiler/test_eu_compact.cpp:256:1: warning: 'InstantiateTestCase_P_IsDeprecated' is deprecated: INSTANTIATE_TEST_CASE_P is deprecated, please use INSTANTIATE_TEST_SUITE_P [-Wdeprecated-declarations]
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18203>
Right now even the simplest mesh test (func.mesh.basic.mesh from crucible) fails like this:
ASSERT: Scalar MESH validation failed!
load_payload(16) vgrf11+0.0:F, vgrf8:D
../../src/intel/compiler/brw_fs_validate.cpp:61: inst->dst.offset / REG_SIZE + regs_written(inst) <= alloc.sizes[inst->dst.nr]
Because we try to load 8 regs with LOAD_PAYLOAD in SIMD16 mode.
Fixes: 349a040f68 ("intel/fs: Make logical URB write instructions more like other logical instructions")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18075>
This backend lowering code has been dead since the removal of i965 -
nothing in the current source tree ever sets the flag.
This is handled by iris_setup_uniforms() and crocus_setup_uniforms().
Variable group size does not appear to be a feature in anv.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18055>
In the early days of NIR, you had to prod at inst->const_index[]
directly, but a long while back, we added handy accessor functions
that let you use the actual name of the thing you want instead of
memorizing the exact order of parameters.
Also rewrite a comment I had a hard time parsing.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18067>
source_root function is deprecated in Meson version 0.56.0, so let's use
instead a current_source_dir() function, available in all Meson
versions. This also allows to deduplicate some code by declaring
commonly used string at the top meson.build file.
Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17974>