Commit Graph

37 Commits

Author SHA1 Message Date
Ian Romanick
d1e0227ef1 soft-fp64/b2f: Reimplement using bitwise logic ops
This doesn't help a lot of shaders, but it helps those few a LOT.

This could also be implemented using bcsel.  That version is very
slightly worse because the generated SEL instruction wants to have two
immediate sources, so one of them usually needs an extra MOV instruction
to load.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 929619 -> 928859 (-0.08%)
instructions in affected programs: 1651 -> 891 (-46.03%)
helped: 8
HURT: 0
helped stats (abs) min: 38 max: 152 x̄: 95.00 x̃: 95
helped stats (rel) min: 42.70% max: 86.36% x̄: 49.88% x̃: 44.66%
95% mean confidence interval for instructions value: -132.97 -57.03
95% mean confidence interval for instructions %-change: -62.28% -37.49%
Instructions are helped.

total cycles in shared programs: 7280180 -> 7272912 (-0.10%)
cycles in affected programs: 12960 -> 5692 (-56.08%)
helped: 8
HURT: 0
helped stats (abs) min: 352 max: 1456 x̄: 908.50 x̃: 910
helped stats (rel) min: 52.45% max: 91.19% x̄: 59.24% x̃: 55.15%
95% mean confidence interval for cycles value: -1274.03 -542.97
95% mean confidence interval for cycles %-change: -70.06% -48.41%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Francisco Jerez
a30bb25a7a glsl: Fix software 64-bit integer to 32-bit float conversions.
The current implementation was broken for any integers between 2^24
and 2^30 (it would return zero for me on ICL).  The reason is that for
such integers we wouldn't take the 'if (0 <= shiftCount)' early return
path, however 'shiftCount + 7' would be positive, leading to a
negative 'count' argument passed to __shift64RightJamming(), which
would give undefined results.

This reworks the affected conversion functions to use either
__shortShift64Left() or __shift64RightJamming() based on the sign of
the final shift count, which should avoid the problem.  In addition
this should qualify as a clean-up/optimization -- This implementation
of the conversion functions translates to 7 instructions less than the
original on Intel hardware.

This fixes the 'KHR-GL46.shader_ballot_tests.ShaderBallotFunctionBallot'
conformance tests on soft fp64 hardware with large enough subgroup
size (>16).

Fixes: d5cf6e92b4 "glsl: Add built-in functions to do uint64_to_fp32(uint64_t)"
Fixes: c9d333a6b7 "glsl: Add built-in functions to do int64_to_fp32(int64_t)"
Cc: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
2020-01-10 10:51:58 -08:00
Sagar Ghuge
06807e1948 glsl: Fix round64 conversion function
Fix round64 function to handle round to nearest even cases specially
with positive and negative numbers with fraction part 0.5.

v2: 1) Simplify unused bits (Elie Tournier)

Fixes:
   KHR-GL45.gpu_shader_fp64.builtin.round_dvec2
   KHR-GL45.gpu_shader_fp64.builtin.round_dvec3
   KHR-GL45.gpu_shader_fp64.builtin.round_dvec4
   KHR-GL45.gpu_shader_fp64.builtin.roundeven_double
   KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec2
   KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec3
   KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec4

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
2019-06-25 15:19:10 -07:00
Anuj Phogat
a42163cbbc compiler: Add lowering support for 64-bit saturate operations to software
Fixes 7 Khronos GL CTS tests:
KHR-GL45.gpu_shader_fp64.builtin.smoothstep_dvec{double, 2, 3, 4}
KHR-GL45.gpu_shader_fp64.builtin.smoothstep_against_scalar_dvec{2, 3, 4}

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-05-15 23:30:30 +00:00
Sagar Ghuge
f998ce4111 glsl: Add "built-in" functions to do fp32_to_int64(fp32)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
2632c12477 glsl: Add "built-in" functions to do fp32_to_uint64(fp32)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
876a4b85fe glsl: Add "built-in" functions to do fp64_to_int64(fp64)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
21e9bb2b3f glsl: Add utility function to round and pack int64_t value
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
5a674fd789 glsl: Add "built-in" functions to do fp64_to_uint64(fp64)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
5a87441807 glsl: Add utility function to round and pack uint64_t value
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
c9d333a6b7 glsl: Add "built-in" functions to do int64_to_fp32(int64_t)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
d5cf6e92b4 glsl: Add "built-in" functions to do uint64_to_fp32(uint64_t)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
b830efb191 glsl: Add "built-in" functions to do int64_to_fp64(int64_t)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Sagar Ghuge
7c5b982b89 glsl: Add "built-in" functions to do uint64_to_fp64(uint64_t)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-01-09 16:42:40 -08:00
Matt Turner
15757bc80b glsl: Add "built-in" functions to convert bool to double
And vice versa.

Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
Matt Turner
e213f3871f glsl: Add "built-in" functions to do ffract(fp64)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
Matt Turner
5c9a659f50 glsl: Add "built-in" function to do ffloor(fp64)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
Matt Turner
83762afa66 glsl: Add "built-in" functions to do fmin/fmax(fp64)
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
Matt Turner
92ac2169fb glsl: Add "built-in" functions to do ffma(fp64)
Definitely not actually a fused-multiply add.

Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
3db81b5d9f glsl: Add "built-in" functions to do round(fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
48891ab441 glsl: Add "built-in" functions to do trunc(fp64)
v2: use mix.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
2119094b1d glsl: Add "built-in" functions to do sqrt(fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
cad58fc5e7 glsl: Add "built-in" functions to do fp32_to_fp64(fp32)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
407bd1bbf9 glsl: Add "built-in" functions to do fp64_to_fp32(fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
f499942b31 glsl: Add "built-in" functions to do int_to_fp64(int)
v2: use mix
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
773190f281 glsl: Add "built-in" functions to do fp64_to_int(fp64)
v2: use mix

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
cbf090b809 glsl: Add "built-in" functions to do uint_to_fp64(uint)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
a3551ee61f glsl: Add "built-in" functions to do fp64_to_uint(fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
4a93401546 glsl: Add "built-in" functions to do mul(fp64, fp64)
v2: use mix
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
f111d72596 glsl: Add "built-in" functions to do add(fp64, fp64)
v2: use mix and findMSB to optimise.
v3: [Sagar] Fix zFrac0 == 0u case in __normalizeRoundAndPackFloat64

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
c036fc97a2 glsl: Add "built-in" functions to do lt(fp64, fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
3e4d5ea7b8 glsl: Add utility function to extract 64-bit sign
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:40 -08:00
Elie Tournier
ec6e823a99 glsl: Add "built-in" functions to do eq/ne(fp64, fp64) 2019-01-09 16:42:40 -08:00
Elie Tournier
c802cdde9d glsl: Add "built-in" function to do sign(fp64)
v2: use mix.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
eac66f0248 glsl: Add "built-in" functions to do neg(fp64)
v2: use mix.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Elie Tournier
0428951b9d glsl: Add "built-in" function to do abs(fp64)
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
2019-01-09 16:42:40 -08:00
Matt Turner
b63a1f8e40 glsl: Create file to contain software fp64 functions
The following patches will add implementations of various
double-precision operations to this file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:40 -08:00