- Update and add csbgen definitions to make the content of the
geom and frag km stream more obvious.
- Replace some of the hard coded constants with defines.
- Adds some static assert to make the provenance of definitions
more clear as well as making sure things fit properly.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24138>
Use all the nir_legacy.h features to transition away from the deprecated
structures. shader-db is quite happy. I assume that's a mix of more aggressive
source mod usage and better coalescing (nir-to-tgsi doesn't copyprop).
total instructions in shared programs: 900179 -> 887156 (-1.45%)
instructions in affected programs: 562077 -> 549054 (-2.32%)
helped: 5198
HURT: 470
Instructions are helped.
total temps in shared programs: 91165 -> 91162 (<.01%)
temps in affected programs: 398 -> 395 (-0.75%)
helped: 21
HURT: 18
Inconclusive result (value mean confidence interval includes 0).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Tested-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24116>
Consider code like:
32x4 %2 = @load_interpolated_input (%1, %0 (0x0)) (base=0, component=0, dest_type=float32, io location=VARYING_SLOT_VAR0 slots=1 mediump) // Color
32x4 %3 = fabs %2
32x4 %4 = fsat %3
32x4 %5 = fsin %4
The existing logic would incorrectly tell the backend that both fabs and fsat
could be folded, and then half the shader disappears. Whoops. Fix by stopping
the folding in this case. I choose to do this check in the fsat rather than the
fabs because it's more straightforward (1 source vs N uses) but it's somewhat
arbitrary.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24116>
Consider the IR:
%0 = load_reg
%1 = fneg %0
%2 = ffloor %1
%3 = bcsel .., .., %1
Because the fneg has both foldable and non-foldable users, nir/legacy does not
fold the fneg into the load_reg. This ensures that the backend correctly emits a
dedicated fneg instruction (with the load_reg folded in) for the bcsel to use.
However, because the chasing helpers did not previously take other uses of a
modifier into account, the helpers would fuse in the fneg to the ffloor. Except
that doesn't work, because the load_reg instruction is supposed to be
eliminated. So we end up with broken chased IR:
1 = fneg r0
2 = ffloor -NULL
3 = bcsel, ..., 1
The fix is easy: only fold modifiers into ALU instructions if the modifiers can
be folded away. If we can't eliminate the modifier instruction altogether, it's
not necessarily beneficial to fold it anyway from a register pressure
perspective. So this is probably ok. With that check in place we get correct IR
1 = fneg r0
2 = ffloor 1
3 = bcsel, ..., 1
Fixes carchase/230.shader_test under softpipe.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24116>
This avoids the silly compiler versions. Some bits are slightly more
complicated, because they have to account for inverted enum values (rather than
a separate invert bit), but this is a LOT friendlier to drivers using the pass
and it makes the pass itself more readable.
The conversion functions in panfrost/panvk will go away momentarily.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24076>
Vulkan drivers that use nir_lower_blend need to translate Vulkan enums to the
common (non-Vulkan) versions used in nir_lower_blend. We don't need to duplicate
that boilerplate in every VK driver that uses nir_lower_blend, move panvk's
versions to common code so we can use them in agxv. I suspect powervr wants this
too.
It might be useful to also share the logic to translate vk_color_blend_state
to nir_lower_blend_options wholesale, but panvk wouldn't use it and agxv is
downstream so it wouldn't have any in-tree users. So I'll keep that part
vendored (for now). For now, let's share the easy win of the enum translation.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24076>
This hasn't been updated since the a5xx days, and we've learned much
more since then. I've tried to expand it from a random collection of
notes to a more complete guide to explaining how to read the firmware
and understand the various tricks it uses to make code more compact.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24125>
This is because we still may want to be able to expose vulkan 1.3
on some devices that technically do not support it, for instance
the adreno 610 has everything required except multiview, which is
not essential for most games.
Signed-off-by: Amber Amber <amber@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20991>
a610/a608 has less pipes, so we need to make it configurable.
In particular we need to program all of the VSC_PIPE_CONFIG_REG[n]
rather than leaving garbage values for the unused pipes. Pointing
multiple VSC pipes at the same bin makes the hw angry.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20991>
There is no .gmem field, there is a ccu color cache size field
which tells the size as a fraction of depth cache used in direct
rendering.
There is also GMEM_FAST_CLEAR_DISABLE flag which is set on a608/a610.
Since these values will stop being the same between models,
make them configurable.
Credits to Connor Abbott for deciphering color cache size meaning.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20991>
ARM64EC is a new build target for Windows ARM64 devices for x64 support.
Currently that build flavor fails due to attempting to use x64 intrinsics.
This commit fixes it by changing the auto-detection to be aarch64
instead of x64 for arm64ec.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24119>
This was missed in 0bf6dcb785
There is a loop which iterates over a temp array. NIR optimization
moves the real work out of the loop and what remains are just ALU ops
with undefs. So after converting undefs to zero, the ALU ops are
optimized out and DCE kills the loop. This is a good thing in
general and we don't fail the linking due to the loop presence.
However than we hit the shader constants and ALU limits later :-(
So from dEQP POW we go from NotSupported to Fail.
Fixes: 0bf6dcb785
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24134>