NIR metadata validation verifies that the debug bit was unset (by a call
to nir_metadata_preserve) if a NIR optimization pass made progress on
the shader. With the expectation that the NIR shader consists of only a
single main function, it has been safe to call nir_metadata_preserve()
iff progress was made.
However, most optimization passes calculate progress per-function and
then return the union of those calculations. In the case that an
optimization pass makes progress only on a subset of the functions in
the shader metadata validation will detect the debug bit is still set on
any unchanged functions resulting in a failed assertion.
This patch offers a quick solution (short of a larger scale refactoring
which I do not wish to undertake as part of this series) that simply
unsets the debug bit on unchanged functions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Instead of a single i2b and b2i, we now have i2b32 and b2iN where N is
one if 8, 16, 32, or 64. This leads to having a few more opcodes but
now everything is consistent and booleans aren't a weird special case
anymore.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
v2: use nir_metadata_preserve
preserve metadata in case of !progress
Fixes: 074f5ba0b5
"nir: Add a simple int64 lowering pass"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
The previous implementation was fine for GLSL which doesn't really have
a signed modulus/remainder. They just leave the behavior undefined
whenever either source is negative. However, in SPIR-V, there is a
defined behavior for negative arguments. This commit beefs up the pass
so that it handles both correctly. Tested using a hacked up version of
the Vulkan CTS test to get 64-bit support.
Reviewed-by: Matt Turner <mattst88@gmail.com>
The algorithms used by this pass, especially for division, are heavily
based on the work Ian Romanick did for the similar int64 lowering pass
in the GLSL compiler.
v2: Properly handle vectors
v3: Get rid of log2_denom stuff. Since we're using bcsel, we do all the
calculations anyway and this is just extra instructions.
v4:
- Add back in the log2_denom stuff since it's needed for ensuring that
the shifts don't overflow.
- Rework the looping part of the pass to be easier to expand.
Reviewed-by: Matt Turner <mattst88@gmail.com>