docs: Initial version of INTEL_shader_atomic_float_minmax spec

v2: Describe interactions with the capabilities added by
    SPV_INTEL_shader_atomic_float_minmax

v3: Remove 64-bit float support.

v4: Explain NaN issues. Explain issues with atomicMin(-0, +0) and
    atomicMax(-0, +0).

v5: Fix whitespace issues noticed by Caio.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

docs/specs/INTEL_shader_atomic_float_minmax.txt
@@ -0,0 +1,200 @@
Name

    INTEL_shader_atomic_float_minmax

Name Strings

    GL_INTEL_shader_atomic_float_minmax

Contact

    Ian Romanick (ian . d . romanick 'at' intel . com)

Contributors


Status

    In progress

Version

    Last Modified Date: 06/22/2018
    Revision: 4

Number

    TBD

Dependencies

    OpenGL 4.2, OpenGL ES 3.1, ARB_shader_storage_buffer_object, or
    ARB_compute_shader is required.

    This extension is written against version 4.60 of the OpenGL Shading
    Language Specification.

Overview

    This extension provides GLSL built-in functions allowing shaders to
    perform atomic read-modify-write operations to floating-point buffer
    variables and shared variables. Minimum, maximum, exchange, and
    compare-and-swap are enabled.


New Procedures and Functions

    None.

New Tokens

    None.

IP Status

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.60

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_INTEL_shader_atomic_float_minmax : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_INTEL_shader_atomic_float_minmax   1

Additions to Chapter 8 of the OpenGL Shading Language Specification
(Built-in Functions)

    Modify Section 8.11, "Atomic Memory Functions"

    (add a new row after the existing "atomicMin" table row, p. 179)

        float atomicMin(inout float mem, float data)


        Computes a new value by taking the minimum of the value of data and
        the contents of mem. If one of these is an IEEE signaling NaN (i.e.,
        a NaN with the most-significant bit of the mantissa cleared), it is
        always considered smaller. If one of these is an IEEE quiet NaN
        (i.e., a NaN with the most-significant bit of the mantissa set), it is
        always considered larger. If both are IEEE quiet NaNs or both are
        IEEE signaling NaNs, the result of the comparison is undefined.

    (add a new row after the existing "atomicMax" table row, p. 179)

        float atomicMax(inout float mem, float data)

        Computes a new value by taking the maximum of the value of data and
        the contents of mem. If one of these is an IEEE signaling NaN (i.e.,
        a NaN with the most-significant bit of the mantissa cleared), it is
        always considered larger. If one of these is an IEEE quiet NaN (i.e.,
        a NaN with the most-significant bit of the mantissa set), it is always
        considered smaller. If both are IEEE quiet NaNs or both are IEEE
        signaling NaNs, the result of the comparison is undefined.

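The NaN ordering rules in the two rows above can be sketched as a CPU-side reference model in C. This is only an illustration of the value-selection rules as worded here, not the hardware implementation, and it says nothing about atomicity. All function names are hypothetical, and the model works on raw 32-bit patterns so that signaling NaNs are never altered by floating-point loads. Where both operands are NaNs of the same kind the text leaves the result undefined; the model arbitrarily picks one operand. Signed zeros are not treated specially here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper names; none of these come from the extension. */

static float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* NaN: exponent all ones and a non-zero mantissa. */
static bool is_nan_bits(uint32_t u)
{
    return (u & 0x7f800000u) == 0x7f800000u && (u & 0x007fffffu) != 0;
}

/* Quiet NaN: most-significant mantissa bit (bit 22) set. */
static bool is_qnan_bits(uint32_t u)
{
    return is_nan_bits(u) && (u & 0x00400000u) != 0;
}

/* Signaling NaN: most-significant mantissa bit cleared. */
static bool is_snan_bits(uint32_t u)
{
    return is_nan_bits(u) && (u & 0x00400000u) == 0;
}

/* Shared selection logic; want_max selects atomicMax rather than atomicMin. */
static uint32_t model_minmax(uint32_t mem, uint32_t data, bool want_max)
{
    /* sNaN is "smaller" for min and "larger" for max, so it is selected. */
    if (is_snan_bits(mem))
        return mem;
    if (is_snan_bits(data))
        return data;

    /* qNaN is "larger" for min and "smaller" for max, so the other
     * operand is selected.  Two qNaNs: undefined, this model picks data. */
    if (is_qnan_bits(mem))
        return data;
    if (is_qnan_bits(data))
        return mem;

    float m = bits_to_float(mem);
    float d = bits_to_float(data);
    if (want_max)
        return d > m ? data : mem;
    return d < m ? data : mem;
}

static uint32_t model_atomic_min(uint32_t mem, uint32_t data)
{
    return model_minmax(mem, data, false);
}

static uint32_t model_atomic_max(uint32_t mem, uint32_t data)
{
    return model_minmax(mem, data, true);
}
```

In this model 0x7fc00000 is a quiet NaN and 0x7fa00000 is a signaling NaN; passing either together with an ordinary value such as 0x3f800000 (1.0) exercises the two rules.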
    (add to "atomicExchange" table cell, p. 180)

        float atomicExchange(inout float mem, float data)

    (add to "atomicCompSwap" table cell, p. 180)

        float atomicCompSwap(inout float mem, float compare, float data)

Interactions with OpenGL 4.6 and ARB_gl_spirv

    If OpenGL 4.6 or ARB_gl_spirv is supported, then
    SPV_INTEL_shader_atomic_float_minmax must also be supported.

    The AtomicFloatMinmaxINTEL capability is available whenever the OpenGL or
    OpenGL ES implementation supports INTEL_shader_atomic_float_minmax.

Issues

    1) Why call this extension INTEL_shader_atomic_float_minmax?

    RESOLVED: Several other extensions already set the precedent of
    VENDOR_shader_atomic_float and VENDOR_shader_atomic_float64 for extensions
    that enable floating-point atomic operations. Using that as a base for
    the name seems logical.

    There already exists NV_shader_atomic_float, but the two extensions have
    nearly zero overlap in functionality. NV_shader_atomic_float adds
    atomicAdd and image atomic operations that currently shipping Intel GPUs
    do not support. Calling this extension INTEL_shader_atomic_float would
    likely have been confusing.

    Adding something to describe the actual functions added by this extension
    seemed reasonable. INTEL_shader_atomic_float_compare was considered, but
    that name was deemed to be not properly descriptive. Calling this
    extension INTEL_shader_atomic_float_min_max_exchange_compswap is right
    out.

    2) What atomic operations should we support for floating-point targets?

    RESOLVED. Exchange, min, max, and compare-swap make sense, and these are
    all supported by the hardware. Future extensions may add other functions.

    For buffer variables and shared variables it is not possible to bit-cast
    the memory location in GLSL, so existing integer operations, such as
    atomicOr, cannot be used. However, the underlying hardware implementation
    can do this by treating the memory as an integer. It would be possible to
    implement atomicNegate using this technique with atomicXor. It is unclear
    whether this provides any actual utility.

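The bit-cast idea in issue 2 can be sketched on the CPU with C11 atomics: the float lives in memory as a raw 32-bit word, and flipping its sign bit with an atomic XOR negates it. This is an illustration only. The function name atomic_negate_model is hypothetical, and nothing here is an entry point defined by this extension or a description of how any driver implements it.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of "atomicNegate": the memory cell holds the raw
 * IEEE-754 bits of a float.  XOR with the sign-bit mask negates the
 * stored value in a single atomic read-modify-write.  Returns the value
 * that was stored before the operation, as the GLSL atomics do. */
static float atomic_negate_model(_Atomic uint32_t *mem)
{
    uint32_t old_bits = atomic_fetch_xor(mem, UINT32_C(0x80000000));
    float old;
    memcpy(&old, &old_bits, sizeof old);
    return old;
}
```

For example, a cell initialised to 0x3f800000 (1.0) holds 0xbf800000 (-1.0) after one call, and the call returns 1.0.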
    3) What should be said about the NaN behavior?

    RESOLVED. There are several aspects of NaN behavior that should be
    documented in this extension. However, some of this behavior varies based
    on NaN concepts that do not exist in the GLSL specification.

    * atomicCompSwap performs the comparison as the floating-point equality
      operator (==). That is, if either 'mem' or 'compare' is NaN, the
      comparison result is always false.

    * atomicMin and atomicMax implement the IEEE specification with respect to
      NaN. IEEE considers two different kinds of NaN: signaling NaN and quiet
      NaN. A quiet NaN has the most significant bit of the mantissa set, and
      a signaling NaN does not. This concept does not exist in SPIR-V,
      Vulkan, or OpenGL. Let qNaN denote a quiet NaN and sNaN denote a
      signaling NaN. atomicMin and atomicMax specifically implement

      - fmin(qNaN, x) = fmin(x, qNaN) = fmax(qNaN, x) = fmax(x, qNaN) = x
      - fmin(sNaN, x) = fmin(x, sNaN) = fmax(sNaN, x) = fmax(x, sNaN) = sNaN
      - fmin(sNaN, qNaN) = fmin(qNaN, sNaN) = fmax(sNaN, qNaN) =
        fmax(qNaN, sNaN) = sNaN
      - fmin(sNaN, sNaN) = sNaN. This specification does not define which of
        the two arguments is stored.
      - fmax(sNaN, sNaN) = sNaN. This specification does not define which of
        the two arguments is stored.
      - fmin(qNaN, qNaN) = qNaN. This specification does not define which of
        the two arguments is stored.
      - fmax(qNaN, qNaN) = qNaN. This specification does not define which of
        the two arguments is stored.

    Further details are available in the Skylake Programmer's Reference
    Manuals available at
    https://01.org/linuxgraphics/documentation/hardware-specification-prms.

    4) What about atomicMin and atomicMax with (+0.0, -0.0) or (-0.0, +0.0)
    arguments?

    RESOLVED. atomicMin should store -0.0, and atomicMax should store +0.0.
    Due to a known issue in shipping Skylake GPUs, the incorrectly signed 0 is
    stored. This behavior may change in later GPUs.

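The intended signed-zero results from issue 4 can be written down as a short C sketch. It models the behavior the resolution says atomicMin and atomicMax should have, which, per the text above, is not what shipping Skylake GPUs store. The function names are hypothetical, the model is not atomic, and NaN inputs are deliberately ignored.

```c
#include <math.h>

/* Hypothetical model of the intended atomicMin result: when both
 * operands are zeros, -0.0 is selected if either one is negative.
 * A plain '<' cannot express this because -0.0 < +0.0 is false. */
static float min_signed_zero_model(float mem, float data)
{
    if (mem == 0.0f && data == 0.0f)
        return (signbit(mem) || signbit(data)) ? -0.0f : 0.0f;

    return data < mem ? data : mem;
}

/* Hypothetical model of the intended atomicMax result: when both
 * operands are zeros, +0.0 is selected unless both are negative. */
static float max_signed_zero_model(float mem, float data)
{
    if (mem == 0.0f && data == 0.0f)
        return (signbit(mem) && signbit(data)) ? -0.0f : 0.0f;

    return data > mem ? data : mem;
}
```

Because +0.0 and -0.0 compare equal, a test of these results has to inspect the sign bit (for example with signbit) rather than compare values.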
Revision History

    Rev  Date        Author    Changes
    ---  ----------  --------  ---------------------------------------------
      1  04/19/2018  idr       Initial version
      2  05/05/2018  idr       Describe interactions with the capabilities
                               added by SPV_INTEL_shader_atomic_float_minmax.
      3  05/29/2018  idr       Remove mention of 64-bit float support.
      4  06/22/2018  idr       Resolve issue #2.
                               Add issue #3 (regarding NaN behavior).
                               Add issue #4 (regarding atomicMin(-0, +0)).