On some malloc implementation, malloc doesn't always align to 16
bytes even on 64 bits system. To make sure ralloc_header always
starts at the wanted alignment, just force the size to be aligned at
the alignment of ralloc_header. This fixes crashed on instruction
like "movaps %xmm0,0x10(%rax)" which requires aligned memory access.
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6314>
Unfortunately, we can't quite follow the standard C conventions for
these because ralloc doesn't know the sizes of pointers.
Reviewed-by: Eric Anholt <eric@anholt.net>
Prior to this patch sizeof(linear_header) was 20 bytes in a
non-debug build on 32-bit platforms. We do some pointer arithmetic to
calculate the next available location with
ptr = (linear_size_chunk *)((char *)&latest[1] + latest->offset);
in linear_alloc_child(). The &latest[1] adds 20 bytes, so an allocation
would only be 4-byte aligned.
On 32-bit SPARC a 'sttw' instruction (which stores a consecutive pair of
4-byte registers to memory) requires an 8-byte aligned address. Such an
instruction is used to store to an 8-byte integer type, like intmax_t
which is used in glcpp's expression_value_t struct.
As a result of the 4-byte alignment returned by linear_alloc_child() we
would generate a SIGBUS (unaligned exception) on SPARC.
According to the GNU libc manual malloc() always returns memory that has
at least an alignment of 8-bytes [1]. I think our allocator should do
the same.
So, simple fix with two parts:
(1) Increase SUBALLOC_ALIGNMENT to 8 unconditionally.
(2) Mark linear_header with an aligned attribute, which will cause
its sizeof to be rounded up to that alignment. (We already do
this for ralloc_header)
With this done, all Mesa's unit tests now pass on SPARC.
[1] https://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html
Fixes: 47e1758692 ("glcpp: use the linear allocator for most objects")
Bug: https://bugs.gentoo.org/636326
Reviewed-by: Eric Anholt <eric@anholt.net>
v2 (Jason Ekstrand):
- Rename y to pot_align (Brian)
- Also use ALIGN_POT in build_id.c and slab.c (Brian)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
It is possible to have DEBUG disabled but asserts on (NDEBUG), which
cannot build because these asserts work on members that are only present
when DEBUG is on.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Found by inspection.
I'm not aware of any actual failures caused by this, but a precise
sequence of ralloc_adopt and ralloc_free should be able to cause
problems.
v2: make the code slightly clearer (Eric)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
This function differs from ralloc_strcat() and ralloc_strncat()
in that it does not do any strlen() calls which can become
costly on large strings.
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
If the str is long or isn't null-terminated, strlen() could take a lot
of time or even crash. I don't know why was it used in the first place,
maybe for platforms without strnlen(), but strnlen() is already used
inside of ralloc_strndup(), so this change should not additionally
break anything.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Experimentation shows that without alignment factor gcc and clang choose
a factor of 16 even on IA-32, which doesn't match what malloc() uses (8).
The problem is it makes gcc assume the pointer is 16 byte aligned, so
with -O3 it starts using aligned SSE instructions that later fault,
so always specify a suitable alignment factor.
Cc: Jonas Pfeil <pfeiljonas@gmx.de>
Fixes: cd2b55e5 "ralloc: Make sure ralloc() allocations match malloc()'s alignment."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100049
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Tested by: Mike Lothian <mike@fireburn.co.uk>
Tested by: Jonas Pfeil <pfeiljonas@gmx.de>
The header of ralloc needs to be aligned, because the compiler assumes
that malloc returns will be aligned to 8/16 bytes depending on the
platform, leading to degraded performance or alignment faults with ralloc.
Fixes SIGBUS on Raspberry Pi at high optimization levels.
This patch is not perfect for MSVC, as maybe in the future the alignment
for the most demanding data type might change to more than 8.
v2: Commit message reword/typo fix, and add a bigger explanation in the
code (by anholt)
Signed-off-by: Jonas Pfeil <pfeiljonas@gmx.de>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
There was exactly one user of this, and I just removed it.
It also accessed an implicit global context, with no locking. This
meant that it was only safe if all callers of ralloc_autofree_context()
held the same lock...which is a pretty terrible thing for a utility
library to impose.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
time GALLIUM_NOOP=1 ./run shaders/private/alien_isolation/ >/dev/null
Before (2 takes):
real 0m8.734s 0m8.773s
user 0m34.232s 0m34.348s
sys 0m0.084s 0m0.056s
After (2 takes):
real 0m8.448s 0m8.463s
user 0m33.104s 0m33.160s
sys 0m0.088s 0m0.076s
Average change in "real" time spent: -3.4%
calloc should only do 2 things compared to malloc:
- check for overflow of "n * size"
- call memset
I'm not sure if that explains the difference.
v2: clear "parent" and "next" in the caller of add_child.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Juha-Pekka found this back in May 2015:
<1430915727-28677-1-git-send-email-juhapekka.heikkila@gmail.com>
From the discussion, obviously it would be preferable to make
ralloc_size no longer return zeroed memory, but Juha-Pekka found that
it would break Mesa.
In <56AF1C57.2030904@gmail.com>, Juha-Pekka mentioned that patches
exist to fix i965 when ralloc_size is fixed to not zero memory, but
the patches have not made their way to mesa-dev yet.
For now, let's stop doing the double zeroing of rzalloc buffers.
v2:
* Move ralloc_size code to rzalloc_size, and add a comment as
suggested by Ken.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
I was cleverly using one iteration to obtain a pointer to the last item
in ralloc's singly list child list, while also setting parents.
Unfortunately, I forgot to set the parent on that last item.
Cc: "11.1 11.0 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
If the string being copied is not NULL-terminated the result of
strlen() is undefined.
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
ralloc_adopt() reparents all children from one context to another.
Conceptually, ralloc_adopt(new_ctx, old_ctx) behaves like this
pseudocode:
foreach child of old_ctx:
ralloc_steal(new_ctx, child)
However, ralloc provides no way to iterate over a memory context's
children, and ralloc_adopt does this task more efficiently anyway.
One potential use of this is to implement a memory-sweeper pass: first,
steal all of a context's memory to a temporary context. Then, walk over
anything that should be kept, and ralloc_steal it back to the original
context. Finally, free the temporary context. This works when the
context is something that can't be freed (i.e. an important structure).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
For a long time, we've wanted a place to put utility code which isn't
directly tied to Mesa or Gallium internals. This patch creates a new
src/util directory for exactly that purpose, and builds the contents as
libmesautil.la.
ralloc seemed like a good first candidate. These days, it's directly
used by mesa/main, i965, i915, and r300g, so keeping it in src/glsl
didn't make much sense.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
v2 (Jason Ekstrand): More realloc uses and some scons fixes
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>