soft-fp64/fsat: Micro-optimize x >= 1 test
Results on the 308 shaders extracted from the fp64 portion of the OpenGL CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 841590 -> 841332 (-0.03%)
instructions in affected programs: 121957 -> 121699 (-0.21%)
helped: 7
HURT: 0
helped stats (abs) min: 15 max: 54 x̄: 36.86 x̃: 41
helped stats (rel) min: 0.16% max: 0.33% x̄: 0.23% x̃: 0.18%
95% mean confidence interval for instructions value: -49.73 -23.98
95% mean confidence interval for instructions %-change: -0.29% -0.16%
Instructions are helped.

total cycles in shared programs: 6926828 -> 6923967 (-0.04%)
cycles in affected programs: 1038569 -> 1035708 (-0.28%)
helped: 7
HURT: 0
helped stats (abs) min: 128 max: 616 x̄: 408.71 x̃: 446
helped stats (rel) min: 0.18% max: 0.44% x̄: 0.29% x̃: 0.22%
95% mean confidence interval for cycles value: -571.72 -245.70
95% mean confidence interval for cycles %-change: -0.38% -0.19%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
@@ -264,7 +264,25 @@ __fsat64(uint64_t __a)
    if (__is_nan(__a) || int(a.y) < 0)
       return 0ul;

-   if (!__flt64_nonnan(__a, 0x3FF0000000000000ul /* 1.0 */))
+   /* IEEE 754 floating point numbers are specifically designed so that, with
+    * two exceptions, values can be compared by bit-casting to signed integers
+    * with the same number of bits.
+    *
+    * From https://en.wikipedia.org/wiki/IEEE_754-1985#Comparing_floating-point_numbers:
+    *
+    *    When comparing as 2's-complement integers: If the sign bits differ,
+    *    the negative number precedes the positive number, so 2's complement
+    *    gives the correct result (except that negative zero and positive zero
+    *    should be considered equal). If both values are positive, the 2's
+    *    complement comparison again gives the correct result. Otherwise (two
+    *    negative numbers), the correct FP ordering is the opposite of the 2's
+    *    complement ordering.
+    *
+    * We know that both values are not negative, and we know that at least one
+    * value is not zero. Therefore, we can just use the 2's complement
+    * comparison ordering.
+    */
+   if (ilt64(0x3FF00000, 0x00000000, a.y, a.x))
       return 0x3FF0000000000000ul;

    return __a;
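
For readers who want to try the comparison trick outside the shader, the following is a minimal host-side sketch in plain C rather than GLSL. The names bits_of, ilt64_sketch and fsat_sketch are hypothetical and do not exist in Mesa. The sketch assumes, based only on the call site in the hunk, that ilt64(a_hi, a_lo, b_hi, b_lo) is a signed 64-bit "a < b" on (high word, low word) pairs. It is not the soft-fp64 implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit-cast a double to its IEEE 754 binary64 bit pattern. */
static uint64_t bits_of(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Hypothetical stand-in for a signed 64-bit "a < b" on (hi, lo) pairs.
 * Flipping the top bit of the high words turns the signed comparison
 * into an unsigned one; the low words always compare as unsigned. */
static int ilt64_sketch(uint32_t a_hi, uint32_t a_lo,
                        uint32_t b_hi, uint32_t b_lo)
{
    uint32_t ah = a_hi ^ 0x80000000u;
    uint32_t bh = b_hi ^ 0x80000000u;
    return ah < bh || (ah == bh && a_lo < b_lo);
}

/* Clamp x to [0, 1], mirroring the structure of the patched hunk. */
static double fsat_sketch(double x)
{
    uint64_t u = bits_of(x);
    uint32_t hi = (uint32_t)(u >> 32);
    uint32_t lo = (uint32_t)u;

    /* NaN: all exponent bits set and a non-zero mantissa. */
    int is_nan = (u & 0x7FFFFFFFFFFFFFFFull) > 0x7FF0000000000000ull;

    /* NaN or sign bit set (including -0.0): result is +0.0. */
    if (is_nan || (hi >> 31))
        return 0.0;

    /* x is now known to be non-negative, so comparing the raw bits as
     * integers matches the floating-point ordering: 1.0 < x. */
    if (ilt64_sketch(0x3FF00000u, 0x00000000u, hi, lo))
        return 1.0;

    return x;
}
```

Note that the integer test is a strict 1.0 < x, while the removed line tested !(x < 1.0). The two agree on the result because an input of exactly 1.0 falls through to the final return and is returned unchanged.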