Properly handle the difference between split and merged register file
when determining where arrays can fit without conflicting with other
arrays or pre-colored instructions.
1) if not mergedregs, only consider other things with same precision
as potentially conflicting
2) if mergedregs, calculate everything in therms of half-regs and
convert back to fullregs in the end
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
We shouldn't divide-by-two for half-reg arrays. We set the proper node
interference class, based on `arr->half`.
Fixes a RA fail with 16b arrays:
src/freedreno/ir3/ir3_ra.c:633: name_to_array: Assertion `!"invalid array name"' failed.
Caused by use/def iterators returning `arr->length` vreg namess, but
only assigning the array half that many names.
Also, since we are assigning unique vreg names to each array element,
there is no need to try and convert from half-reg to it's conflicting
full reg when pre-coloring the array elements. Getting us closer to
having half-arrays work sanely with split-register-file (a5xx and
earlier).
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
Fixes the following build error:
In file included from external/mesa/src/panfrost/encoder/pan_blit.c:34:
In file included from external/mesa/src/panfrost/encoder/../midgard/midgard_compile.h:27:
external/mesa/src/compiler/nir/nir.h:52:10: fatal error: 'nir_opcodes.h' file not found
^~~~~~~~~~~~~~~
1 error generated.
Fixes: 293f251871 ("panfrost: Use Midgard-specific reloads")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5961>
legalize_block() can get run multiple times, which I didn't notice when
adding fine derivs support. Other instruction clones change things such
that the legalization won't trigger again, but that didn't apply to the
DS.PP legalization. To keep someone else from tripping over this, split
the one-shot legalization out of the iterative sync flag application.
Fixes failures in dEQP-VK.glsl.derivate.dfdxfine.*
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3198
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5699>
However complicated DCC addressing is it is still based on tiles.
If we have the intra-tile offsets + tile dimensions we can expand
that to the full image ourselves.
Behavior around ~1080p on a 2500U:
old:
30-60 ms on every miss
new:
5 ms initally (miss in the tile cache)
<0.5 ms afterwards
The most common case is that the tile cache only contains data for
2 tiles, which for Raven/Renoir/Navi14 will be 4 KiB each, so the
size increase is fairly modest.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5865>
Normally, the linker will pull in any compilation unit (aka .c file) from
a static lib (such as our shared util code) that is depended on by the
code linking against it. Since that code is already compiled, the .text
section is allowed to jump anywhere in .text, and the compiler can't
garbage collect unused functions inside of a compile unit.
Teasing callgraphs apart so that normal compilation-unit-level GCing can
reduce driver size hurts the logical organization of the code and is
difficult. As an example, once I'd split the format pack/unpack tables, I
had to split out util_format_read/write() from util_format.c to avoid
pulling in pack/unpack. But even then it didn't help, because it turns
out turnip's pack calls pull in util_format_bptc.c for bptc packing, but
that file also includes the unpack impls, and those internally call
util_format_unpack, and thus we pulled in all of unpack. Splitting all of
this to separate files makes code harder to find and maintain, and is a
waste of dev time.
By setting these compiler flags, the compiler puts each function and data
symbol in a separate ELF section and the linker can then safely GC unused
text and data sections from a compile unit that gets pulled in. There's a
bit of a space cost due to having those separate sections, but it ends up
being a huge win in disk space on my personal release driver builds:
- i965_dri.so -213k
- x86 gallium dri.so -430k
- libvulkan_intel.so -272k
- aarch64 gallium dri.so -330k
- libvulkan_freedreno.so -783k
No difference on iris drawoverhead -compat -test 1 on my skylake (n=60)
Effect on debugoptimized build times (n=5)
touch nir_lower_io.c build time (bfd) +15.999% +/- 3.80377%
touch freedreno fd6_gmem.c build time (bfd) +13.5294% +/- 4.86363%
touch nir_lower_io.c build time (lld) no change
touch freedreno fd6_gmem.c build time (lld) +2.45375% +/- 2.2383%
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5739>
I see no reason to hide this. The small hit in cycle count is offset in
practice by the increase in thread count. So let's ship it and get some
testing.
If this regresses a workload:
1. Open an issue on the tracker and attach an apitrace.
2. In the meantime set PAN_MESA_DEBUG=nofp16 to override.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5960>
I haven't noticed tftp boot issues in last few days, not sure if they
where just a fluke on Mon or if it is somehow related to # of jobs we
run (ie. having more of the c630 runners powered up and running more
of the time).
Let's turn them back on and see what happens.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5952>
warning: converting a packed ‘instr_cf_t’ {aka ‘union <anonymous>’}
pointer (alignment 1) to a ‘uint16_t’ {aka ‘short unsigned int’} pointer
(alignment 2) may result in an unaligned pointer value
[-Waddress-of-packed-member]
We may know that we'll only ever have aligned instr_cf_ts, but gcc
doesn't.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5955>