aco_instruction_selection_setup.cpp (previously used as a header) has
been split into a header and an implementation file. The latter "only"
implements init_context and setup_isel_context, but since these files
carry a long trail of helper functions, this cleans up the isel header
a lot.
Reduces library size by 3.1% due to more functions being compiled with
static linkage. Makes aco_instruction_selection.cpp compile 3% faster.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6504>
Statically known values were encoded using template parameters previously,
causing specializations for each of the 5 sets of template arguments to be
generated. Since emit_load is not performance critical (the inner loop
never runs more often than twice), it's better for build time to use
runtime arguments everywhere.
Reduces build time of this file by 9% (17.3s -> 15.7s on my machine) and
reduces libaco's size by 2.6%.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6504>
SPIR-V's OpKill is a control-flow instruction but NIR's discard is not.
Therefore, it can be valid SPIR-V to have
if (...) {
foo = /* something */
} else {
discard;
}
use(foo);
without any phi between the definition of foo and its use. This is not
true in NIR, however, because NIR's discard isn't considered
control-flow. Arguably, this is a NIR bug but making discard control-
flow is a very deep change that can have serious ans subtle
side-effects. The easier thing to do is just fix up the SSA in case we
have an OpKill which might have gotten us into the above case.
Fixes dEQP-VK.graphicsfuzz.vectors-and-discard-in-function with the new
NIR dominance validation pass enabled.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5288>
We don't do full dominance validation of SSA values in nir_validate
because it requires generating valid dominance information and, while
that's not extremely expensive, it's probably more than we want to do on
every pass. Also, dominance information is generated through the
metadata system so if we ran it by default in nir_validate, we would get
different beavior of the metadata system based on whether or not you
have a debug build and metadata bugs would be very hard to find.
However, having a pass for it that can be run occasionally, should help
detect and expose bugs. For ease of use, we add a NIR_VALIDATE_SSA_DOMINANCE
environment variable which can be set to manually enable dominance
validation as a standard part of nir_validate.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5288>
freedreno results (note that cat6 is loads from memory as opposed to
pushed constants from the constant file):
total instructions in shared programs: 8044344 -> 8022085 (-0.28%)
total constlen in shared programs: 1411384 -> 1461964 (3.58%)
total cat6 in shared programs: 89983 -> 87065 (-3.24%)
Over the last 3 commits, we increased Manhattan31 performance by ~10%
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6359>
There's no sense in planning out an upload that we won't be able to
actually upload due to the limit. This means that we can keep making
other loads pushable, even after we find one that wouldn't be, and we
don't fill the const file with UBO data for a load we couldn't promote.
total instructions in shared programs: 8096655 -> 8044344 (-0.65%)
total constlen in shared programs: 1447824 -> 1411384 (-2.52%)
total cat6 in shared programs: 97824 -> 89983 (-8.02%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6359>
Now that NIR doesn't lose the original base/range on the
nir_lower_uniforms_to_ubo() path, we get a lot more indirect arrays
uploaded in shader-db.
total instructions in shared programs: 8125988 -> 8103788 (-0.27%)
total constlen in shared programs: 1313096 -> 1448864 (10.34%)
total cat6 in shared programs: 104089 -> 97824 (-6.02%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6359>
For UBO accesses to be the same performance as classic GL default uniform
block uniforms, we need to be able to push them through the same path. On
freedreno, we haven't been uploading UBOs as push constants when they're
used for indirect array access, because we don't know what range of the
UBO is needed for an access.
I believe we won't be able to calculate the range in general in spirv
given casts that can happen, so we define a [0, ~0] range to be "We don't
know anything". We use that at the moment for all UBO loads except for
nir_lower_uniforms_to_ubo, where we now avoid losing the range information
that default uniform block loads come with.
In a departure from other NIR intrinsics with a "base", I didn't make the
base an be something you have to add to the src[1] offset. This keeps us
from needing to modify all drivers (particularly since the base+offset
thing can mean needing to do addition in the backend), makes backend
tracking of ranges easy, and makes the range calculations in
load_store_vectorizer reasonable. However, this could definitely cause
some confusion for people used to the normal NIR base.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6359>