Since we now require C11 and C++14, we can use the standard
static_asserts from the standard library instead of rolling our own
compiler-specific versions.
To avoid needing scopes around usage in switch cases, keep the
while-wrapping from before. This means it still can't be used outside of
functions, but that should be fine; we should probably just use
static_assert directly in those cases anyway.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16670>
This macro kinda complements util_is_power_of_two_*, but is implemented
as a macro. This means that it can expand to a constant integral
expression, and thus be used in static_assert.
Because we don't really need the added complexity, this doesn't handle
zero correctly. But that's OK, because the call-sites will.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16670>
dirty is a variable, and C standard doesn't guarantee that the compiler
can see that it's always passed a constant. So this isn't appropriate to
use for static_assert, which we're about to convert STATIC_ASSERT to
expand to.
If we really want to do this compile-time, we need to make
fd_context_dirty a macro instead. But it seems reaonable to just use a
run-time assert instead here.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16670>
Improve: 91fe0b5629 ("radv: Delete lots of sync code.")
As cnd_monotonic.h are include `util/os_time.h`, radv_debug.c and radv_debug.c needs `util/os_time.h`
So include in these files directly.
The compiling errors are:
```
../src/amd/vulkan/radv_debug.c:707:12: error: implicit declaration of function 'os_localtime' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
timep = os_localtime(&raw_time, &result);
../src/amd/vulkan/radv_device.c:97:11: error: implicit declaration of function 'os_time_get_nano' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
return os_time_get_nano();
^
../../src/amd/vulkan/radv_pipeline.c: In function 'radv_create_shaders':
../../src/amd/vulkan/radv_pipeline.c:4119:29: error: implicit declaration of function 'os_time_get_nano' [-Werror=implicit-function-declaration]
4119 | int64_t pipeline_start = os_time_get_nano();
```
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16536>
this one's a little different from other dynamic states in that it
isn't expected to change much, so I've kept it outside of draw handling
since it can be trivially applied elsewhere with no chance of impacting perf
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16838>
Don't compute it based on devinfo->has_64bit_float. Othwerwise we may
end up emitting 64bit-int (Q) instructions on platforms with 64bit
floats but not 64bit integers.
Right now, the only platforms where has_64bit_int is different from
has_64bit_float are the platforms that use GFX7_FEATURES.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15835>
This expression accidentally performs a 32-bit sign-extension when
processing the second half of the expression (the low 16 bits).
Consider -7W, which is represented as 0xfff9fff9 in our encoding (the
16-bit word is replicated to both halves of the 32-bit dword).
Tigerlake's compaction stores the low 11-bits of an immediate as-is,
and replicates the 12th bit. So here, compacted_imm will be 0xff9.
( (int)(0xff9 << 20) >> 4) |
((short)(0xff9 << 4) >> 4))
0xfff90000 | (0xff90 >> 4)
0xfff90000 | 0xfffffff9 ...oops...
0xfffffff9
By casting the second line of the expression to unsigned short, we
prevent the sign-extension when it combines both parts, so we get:
0xfff90000 | 0x0000fff9
0xfff9fff9
Fixes: 12d3b11908 ("intel/compiler: Add instruction compaction support on Gen12")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16833>
Because we lower SPLIT and COLLECT before RA, we need to consider offsets when
determining the dimensions of vectors, in order to align properly. Lowering
COLLECT post-RA would avoid this special case.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>
Speeds up compiling shaders/skia/781.shader_test in shader-db by 8x
(Icecream95).
...At least it did before I extended to support register allocation of vec8. On
Valhall, texture instructions require up to 8 consecutive registers. To handle
this, provide for vec8 register allocation. Liveness was already (accidentally?)
vec8. The increased memory requirement is acceptable given that the interference
matrix is now stored sparsely (Alyssa).
Icecream95 reports the vec8 changes hurt RA performance by about 1% on average.
I consider this acceptable for now.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>
This is an array which can either be sparse or dense, and was designed
to be used to track liveness and interference information.
Either a sparse array with sorted indices or dense array is used.
Other data structures were tried, such as red-black trees or hash
tables, but they were slower. When used for storing constraints, the
indices do not have to be sorted as duplicating elements is okay, but
the speedup from that was not enough to justify the extra complexity.
v2: Add a comment about how to potentially speed it up. But it seems
fast enough even without this change.
v3: Use a custom struct rather than relying on util_dynarray.
v4: Split out functions only used for liveness analysis, rather than the simpler
data structure needed for the register interference matrix. If we need to
optimize liveness, that can follow on after. Also make it for vec8 (Alyssa).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>