docs/panfrost: use math-role to denote powers of two

We do this elsewhere in the article, so let's be consistent here.

Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24921>
Erik Faye-Lund
2023-08-21 12:04:01 +02:00
committed by Marge Bot
parent 80e8c78fed
commit 9c2212f9b3

@@ -279,7 +279,8 @@ following the exact same algorithm that the hardware uses, then multiply it
by the GL-level divisor to get the hardware-level divisor. This case is
further divided into two more cases. If the hardware-level divisor is a
power of two, then we just need to shift. The shift amount is specified by
-the shift field, so that the hardware-level divisor is just 2^shift.
+the shift field, so that the hardware-level divisor is just
+:math:`2^\text{shift}`.
If it isn't a power of two, then we have to divide by an arbitrary integer.
For that, we use the well-known technique of multiplying by an approximation
@@ -287,21 +288,21 @@ of the inverse. The driver must compute the magic multiplier and shift
amount, and then the hardware does the multiplication and shift. The
hardware and driver also use the "round-down" optimization as described in
http://ridiculousfish.com/files/faster_unsigned_division_by_constants.pdf.
-The hardware further assumes the multiplier is between 2^31 and 2^32, so the
-high bit is implicitly set to 1 even though it is set to 0 by the driver --
-presumably this simplifies the hardware multiplier a little. The hardware
-first multiplies linear_id by the multiplier and takes the high 32 bits,
-then applies the round-down correction if extra_flags = 1, then finally
-shifts right by the shift field.
+The hardware further assumes the multiplier is between :math:`2^{31}` and
+:math:`2^{32}`, so the high bit is implicitly set to 1 even though it is set
+to 0 by the driver -- presumably this simplifies the hardware multiplier a
+little. The hardware first multiplies linear_id by the multiplier and
+takes the high 32 bits, then applies the round-down correction if
+extra_flags = 1, then finally shifts right by the shift field.
There are some differences between ridiculousfish's algorithm and the Mali
hardware algorithm, which means that the reference code from ridiculousfish
doesn't always produce the right constants. Mali does not use the pre-shift
optimization, since that would make a hardware implementation slower (it
would have to always do the pre-shift, multiply, and post-shift operations).
-It also forces the multiplier to be at least 2^31, which means that the
-exponent is entirely fixed, so there is no trial-and-error. Altogether,
-given the divisor d, the algorithm the driver must follow is:
+It also forces the multiplier to be at least :math:`2^{31}`, which means
+that the exponent is entirely fixed, so there is no trial-and-error.
+Altogether, given the divisor d, the algorithm the driver must follow is:
1. Set shift = :math:`\lfloor \log_2(d) \rfloor`.
2. Compute :math:`m = \lceil 2^{shift + 32} / d \rceil` and :math:`e = 2^{shift + 32} % d`.
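The two steps visible in this hunk can be sketched in a few lines of Python. This is only an illustration, not the driver's code: the function name ``magic_constants`` is hypothetical, it assumes ``d`` is a positive integer that is not a power of two (the power-of-two case is handled by a plain shift), and it stops at step 2 because the rest of the algorithm is not shown here.

```python
def magic_constants(d: int) -> tuple[int, int, int]:
    """Hypothetical helper: steps 1 and 2 for a non-power-of-two divisor d."""
    if d <= 0 or (d & (d - 1)) == 0:
        raise ValueError("d must be a positive integer that is not a power of two")

    # Step 1: shift = floor(log2(d)).
    shift = d.bit_length() - 1

    # Step 2: m = ceil(2^(shift + 32) / d) and e = 2^(shift + 32) % d,
    # computed with exact integer arithmetic.
    t = 1 << (shift + 32)
    m = -(-t // d)  # ceiling division without going through floats
    e = t % d
    return shift, m, e


shift, m, e = magic_constants(3)
print(shift, hex(m), e)  # prints "1 0xaaaaaaab 2"
```

For a divisor that fits in 32 bits and is not a power of two, ``m`` computed this way lies strictly between :math:`2^{31}` and :math:`2^{32}`, which is consistent with the text's remark that the high bit of the multiplier is always set.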