Need to do a sqrt().
FWIW, the html that Sphinx 1.1.3 generates for the math expressions
looks completely broken.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Add documentation for TEX2/TXL2/TXB2 tgsi opcodes. Also, the texture opcode
documentation wasn't very accurate so fix this up a bit.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The new location field can be either center, centroid, or sample, which
indicates the location that the shader should interpolate at.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
The extension is always supported if GLSL 1.30 is supported.
Softpipe and llvmpipe support is also added (trivial).
Radeon and nouveau support is already done.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
This opcode provide support for GL_ARB_texture_query_lod,
Signed-off-by: Dave Airlie <airlied@redhat.com>
[imirkin: rebase, docs update]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
To emphasize that the result is floating point 1.0 or 0.0, to match
other opcodes like SLE and SEQ.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
This adds support to gallium for a TG4 instruction,
and two CAPs. The first CAP is required for GL_ARB_texture_gather.
The second CAP is required to expose GL_ARB_gpu_shader5.
However so far we haven't found any hardware that natively
exposes the textureGatherOffsets feature from GL, so just
lower it for now. If hardware appears for this we can add
another CAP to allow TG4 to take 4 offsets.
v2: add component selection src and a cap to say
hw can do it. (st can use to help control
GL_ARB_gpu_shader5/GLSL 4.00). Add docs.
v3: rename to SM5, add docs.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
In the specification text of NV_vertex_program1_1, the upper
limit of the RCC instruction is written as 1.884467e+19 in
scientific notation, but as 0x5F800000 in binary. But the binary
version translates to 1.84467e+19 rather than 1.884467e+19 in
scientific notation.
Since the lower-limit equals 2^-64 and the binary version equals
2^+64, let's assume the value in scientific notation is a typo
and implement this using the value from the binary version
instead.
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
D3D9 Shader Model 2 restricted the fog register to one component,
http://msdn.microsoft.com/en-us/library/windows/desktop/bb172945.aspx ,
but that restriction no longer exists in Shader Model 3, and several
WHCK tests enforce that.
So this change:
- lifts the single-component restriction TGSI_SEMANTIC_FOG
from Gallium interface
- updates the Mesa state tracker to enforce output fog has (f, 0, 0, 1)
- draw module was updated to leave TGSI_SEMANTIC_FOG output registers
alone
Several gallium drivers that are going out of their way to clear
TGSI_SEMANTIC_FOG components could be simplified in the future.
Thanks to Si Chen and Michal Krol for identifying the problem.
Testing done: piglit fogcoord-*.vpfp tests
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
The code introduces two new 32bit integer multiplication opcodes which
can be used to produce correct 64 bit results. GLSL, OpenCL and D3D10+
require them. We use two seperate opcodes, because they match the
behavior of GLSL and OpenCL, are a lot easier to add than a single
opcode with multiple destinations and because there's not much (any)
difference wrt code-generation.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Newer graphic languages don't want messy float mask results but instead true
"boolean" mask results for float comparisons. Otherwise just need to convert
the floats back to integers. Need to keep the old opcodes however due to both
legacy (gl and d3d9) needing them and because older hw can't really deal with
integers. These new FSEQ/FSGE/FSLT/FSNE opcodes are part of integer API and
hence must be supported if a driver claims to support glsl 1.30 (or
PIPE_SHADER_CAP_INTEGERS).
Reviewed-by: Zack Rusin <zackr@vmware.com>
This opcode is quite problematic in tgsi, while it tries to mirror
d3d10 resinfo it can't really do what's stated there due to missing
the crazy return type modifiers. Hence specify this is ignored along
with the swizzle.
(Other options would be to have multiple opcodes or specify the ret
type modifier maybe in dst_reg as there's padding bits left there but
it is the only instruction allowing this.)
Reviewed-by: Zack Rusin <zackr@vmware.com>
Previously, nothing was said what happens with shift counts exceeding
bit width of the values to shift. In theory 3 behaviors are possible:
1) undefined (classic c definition)
2) just shift out all bits (so result is zero, or -1 potentially for ashr)
3) mask the shift count to bit width - 1
API's either require 3) or are ok with 1). In particular, GLSL (as well as a
couple uninteresting legacy GL extensions) is happy with undefined, whereas
both OpenCL and d3d10 require 3). Consequently, most hw also implements 3).
So, for simplicity we just specify that 3) is required rather than saying
undefined and then needing state trackers to work around it.
Also while here specify shift count as a vector, not scalar. As far as I
can tell this was a doc bug, neither state trackers nor drivers used scalar
shift count.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
unlike OpenGL, the texel swizzle is embedded in the instruction, so honor
that.
(Technically we now execute both the sampler_view swizzle and the
per-instruction swizzle but this should be quite ok.)
v2: add documentation note as it's not obvious.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
GLSL spec says that rsq is undefined for src<=0, but the D3D10
spec says it needs to be a NaN, so lets stop taking an absolute
value of the source which completely breaks that behavior. For
the gl program we can simply insert an extra abs instrunction
which produces the desired behavior there.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
TGSI_OPCODE_KIL and KILP had confusing names. The former was conditional
kill (if any src component < 0). The later was unconditional kill.
At one time KILP was supposed to work with NV-style condition
codes/predicates but we never had that in TGSI.
This patch renames both opcodes:
TGSI_OPCODE_KIL -> KILL_IF (kill if src.xyzw < 0)
TGSI_OPCODE_KILP -> KILL (unconditional kill)
Note: I didn't just transpose the opcode names to help ensure that I
didn't miss updating any code anywhere.
I believe I've updated all the relevant code and comments but I'm
not 100% sure that some drivers had this right in the first place.
For example, the radeon driver might have llvm.AMDGPU.kill and
llvm.AMDGPU.kilp mixed up. Driver authors should review their code.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
KILP is really unconditional fragment kill.
We've had KIL and KILP transposed forever. I'll fix that next.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
There are strict limits on those registers. Define the maximums
and use them instead of magic numbers. Also allows us to add
some extra sanity checks.
Suggested by Brian.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
cull distance is analogous to clip distance. If a register is
given this semantic, then the values in it are assumed to be a
float32 distance to a plane. Primitives will be completely
discarded if the plane distance for all of the vertices in
the primitive are < 0.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Since pipe_surface already has all the necessary fields no interface
changes are necessary except adding a new shader semantic value
(TGSI_SEMANTIC_LAYER).
(Note that what GL knows as "gl_Layer" variable d3d10 is naming
"RENDER_TARGET_ARRAY_INDEX".)
v2: drop cap bit (just tied to geometry shader), add docs.
Adds the remaining integer opcodes, and some opcodes are moved to more
appropriate places, along with getting rid of the (already nearly empty)
ps_2_x section. Though the CAP bits for some of these are still a bit in
the air so the documentation isn't quite as watertight as is desirable.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
A lot of them were missing. Others were moved from the Compute ISA
to a new Integer ISA section as that seemed more appropriate.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
It's valid because we reuse certain arithmetic operations
for both signed and unsigned types (e.g. uadd, umad, which
have a bit unfortunate naming)
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Squashed commit of the following:
commit 04c5fa2cbb8e89d6f2fa5a75af1cca03b1f6b852
Author: José Fonseca <jfonseca@vmware.com>
Date: Tue Apr 23 17:37:18 2013 +0100
gallium: s/lower_left_origin/bottom_edge_rule/
commit 4dff4f64fa83b9737def136fffd161d55e4f1722
Author: José Fonseca <jfonseca@vmware.com>
Date: Tue Apr 23 17:35:04 2013 +0100
gallium: Move diagram to docs.
commit 442a63012c8c3c3797f45e03f2ca20ad5f399832
Author: James Benton <jbenton@vmware.com>
Date: Fri May 11 17:50:55 2012 +0100
gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center.
This change is necessary to achieve correct results when using OpenGL
FBOs.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
TGSI_OPCODE_IF condition had two possible interpretations:
- src.x != 0.0f
- Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was false either for
vertex and fragment shaders
- gallivm/llvmpipe
- postprocess
- vl state tracker
- vega state tracker
- most old drivers
- old internal state trackers
- many graw examples
- src.x != 0U
- Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was true for both
vertex and fragment shaders
- tgsi_exec/softpipe
- r600
- radeonsi
- nv50
And drivers that use draw module also were a mess (because Mesa would
emit float IFs, but draw module supports native integers so it would
interpret IF arg as integers...)
This sort of works if the source argument is limited to float +0.0f or
+1.0f, integer 0, but would fail if source is float -0.0f, or integer in
the float NaN range. It could also fail if source is integer 1, and
hardware flushes denormalized numbers to zero.
But with this change there are now two opcodes, IF and UIF, with clear
meaning.
Drivers that do not support native integers do not need to worry about
UIF. However, for backwards compatibility with old state trackers and
examples, it is advisable that native integer capable drivers also
support the float IF opcode.
I tried to implement this for r600 and radeonsi based on the surrounding
code. I couldn't do this for nouveau, so I just shunted IF/UIF
together, which matches the current behavior.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
v2:
- Incorporate Roland's feedback.
- Fix r600_shader.c merge conflict.
- Fix typo in radeon, spotted by Michel Dänzer.
- Incorporte Christoph Bumiller's patch to handle TGSI_OPCODE_IF(float)
properly in nv50/ir.
This makes it possible to identify gl_TexCoord and gl_PointCoord
for drivers where sprite coordinate replacement is restricted.
The new PIPE_CAP_TGSI_TEXCOORD decides whether these varyings
should be hidden behind the GENERIC semantic or not.
With this patch only nvc0 and nv30 will request that they be used.
v2: introduce a CAP so other drivers don't have to bother with
the new semantic
v3: adapt to introduction gl_varying_slot enum