docs/panfrost: use math-role to denote powers of two

We do this elsewhere in the article, so let's be consistent here.

Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24921>
Committed by: Marge Bot
Parent: 80e8c78fed
Commit: 9c2212f9b3

@@ -279,7 +279,8 @@ following the exact same algorithm that the hardware uses, then multiply it
 by the GL-level divisor to get the hardware-level divisor. This case is
 further divided into two more cases. If the hardware-level divisor is a
 power of two, then we just need to shift. The shift amount is specified by
-the shift field, so that the hardware-level divisor is just 2^shift.
+the shift field, so that the hardware-level divisor is just
+:math:`2^\text{shift}`.
 
 If it isn't a power of two, then we have to divide by an arbitrary integer.
 For that, we use the well-known technique of multiplying by an approximation
||||
@@ -287,21 +288,21 @@ of the inverse. The driver must compute the magic multiplier and shift
 amount, and then the hardware does the multiplication and shift. The
 hardware and driver also use the "round-down" optimization as described in
 http://ridiculousfish.com/files/faster_unsigned_division_by_constants.pdf.
-The hardware further assumes the multiplier is between 2^31 and 2^32, so the
-high bit is implicitly set to 1 even though it is set to 0 by the driver --
-presumably this simplifies the hardware multiplier a little. The hardware
-first multiplies linear_id by the multiplier and takes the high 32 bits,
-then applies the round-down correction if extra_flags = 1, then finally
-shifts right by the shift field.
+The hardware further assumes the multiplier is between :math:`2^{31}` and
+:math:`2^{32}`, so the high bit is implicitly set to 1 even though it is set
+to 0 by the driver -- presumably this simplifies the hardware multiplier a
+little. The hardware first multiplies linear_id by the multiplier and
+takes the high 32 bits, then applies the round-down correction if
+extra_flags = 1, then finally shifts right by the shift field.
 
 There are some differences between ridiculousfish's algorithm and the Mali
 hardware algorithm, which means that the reference code from ridiculousfish
 doesn't always produce the right constants. Mali does not use the pre-shift
 optimization, since that would make a hardware implementation slower (it
 would have to always do the pre-shift, multiply, and post-shift operations).
-It also forces the multiplier to be at least 2^31, which means that the
-exponent is entirely fixed, so there is no trial-and-error. Altogether,
-given the divisor d, the algorithm the driver must follow is:
+It also forces the multiplier to be at least :math:`2^{31}`, which means
+that the exponent is entirely fixed, so there is no trial-and-error.
+Altogether, given the divisor d, the algorithm the driver must follow is:
 
 1. Set shift = :math:`\lfloor \log_2(d) \rfloor`.
 2. Compute :math:`m = \lceil 2^{shift + 32} / d \rceil` and :math:`e = 2^{shift + 32} % d`.
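
For readers following the arithmetic in the documentation text touched by this diff, here is a minimal Python sketch of the power-of-two check and of steps 1 and 2 only. The later steps of the algorithm fall outside the diff context and are not reproduced. The function names `is_power_of_two` and `magic_constants` are hypothetical and are not taken from the Mesa source, and the sketch assumes `d` is a positive integer that fits in 32 bits.

```python
def is_power_of_two(d: int) -> bool:
    """True when d == 2**shift, i.e. the divide reduces to a plain shift."""
    return d > 0 and (d & (d - 1)) == 0


def magic_constants(d: int) -> tuple[int, int, int]:
    """Return (shift, m, e) as described in steps 1 and 2 of the text.

    Assumption: d is a positive integer below 2**32.
    """
    if not 0 < d < (1 << 32):
        raise ValueError("d must be a positive 32-bit integer")

    # Step 1: shift = floor(log2(d)), computed exactly with integer ops.
    shift = d.bit_length() - 1

    # Step 2: m = ceil(2**(shift + 32) / d) and e = 2**(shift + 32) mod d.
    numerator = 1 << (shift + 32)
    m = -(-numerator // d)  # ceiling division without floating point
    e = numerator % d
    return shift, m, e


if __name__ == "__main__":
    for d in (3, 7, 8):
        shift, m, e = magic_constants(d)
        print(d, is_power_of_two(d), shift, hex(m), e)
```

Because :math:`2^\text{shift} \le d < 2^{\text{shift}+1}`, any `d` that is not a power of two yields an `m` strictly between :math:`2^{31}` and :math:`2^{32}`, which is consistent with the text's remark that the high bit of the multiplier can be left implicit. For `d = 3` the sketch gives `m = 0xAAAAAAAB`, the familiar divide-by-three constant.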