Don't call set_uniform_initializers if the link failed; otherwise it
would trigger a GL_INVALID_OPERATION error, which is not expected
behavior for glLinkProgram.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
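A minimal sketch of the link-failure guard described above. The struct
and both function names are hypothetical stand-ins, not the actual Mesa
code; the only point is that uniform initializers are applied solely
after a successful link.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for gl_shader_program; only what the sketch needs. */
struct shader_program {
    bool link_status;          /* true when linking succeeded */
    int  initializers_applied; /* call counter, for illustration only */
};

/* Hypothetical stand-in for set_uniform_initializers(). */
void apply_uniform_initializers(struct shader_program *prog)
{
    /* Only meaningful on a successfully linked program. */
    assert(prog->link_status);
    prog->initializers_applied++;
}

/* Final step of linking: touch uniforms only when the link succeeded. */
void finish_link(struct shader_program *prog)
{
    if (!prog->link_status)
        return; /* leave the failed program alone */
    apply_uniform_initializers(prog);
}
```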
Switch all of the code in ir_to_mesa, st_glsl_to_tgsi, glUniform*,
glGetUniform, glGetUniformLocation, and glGetActiveUniforms to use the
gl_uniform_storage structures in the gl_shader_program.
A couple of notes:
* Like most rewrite-the-world patches, this should be reviewed by
applying the patch and examining the modified functions.
* This leaves a lot of dead code around in linker.cpp and
uniform_query.cpp. It will be deleted in the following patches.
v2: Update the comment block (previously a FINISHME) in _mesa_uniform
about generating GL_INVALID_VALUE when an out-of-range sampler index
is specified.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Connects all of the gl_program_parameter structures with the correct
gl_uniform_storage structures.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
This is an OpenGL ES specific extension. External textures are textures that
may be sampled from, but not updated (no glTexSubImage*, etc.). The
image data are taken from an EGLImage.
Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Jakob Bornecrantz <jakob@vmware.com>
Previously check_resources could fail, but we'd still try to optimize
the shader, do device-specific code generation, etc. In some cases,
this could explode (especially in the device-specific code
generation). I have not been able to trigger this with the current
code, but with the new uniform handling code, using too many samplers
caused several crashes deep down in the driver.
NOTE: This is a candidate for the 7.11 branch.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41609
Cc: Eric Anholt <eric@anholt.net>
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
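A rough sketch of the early exit described above: once the resource
check fails, later stages are skipped. The struct, the sampler limit,
and the function names are assumptions made for illustration and do not
mirror the real linker code.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SAMPLERS 16u /* assumed limit, for illustration only */

/* Hypothetical stand-in for gl_shader_program. */
struct program {
    unsigned samplers_used;
    bool link_status;
    bool codegen_ran; /* records whether later stages were reached */
};

/* Hypothetical resource check: fails when too many samplers are used. */
bool check_limits(const struct program *p)
{
    assert(p != 0);
    return p->samplers_used <= MAX_SAMPLERS;
}

void link_program(struct program *p)
{
    p->link_status = check_limits(p);
    if (!p->link_status)
        return; /* do not optimize or generate code for a failed link */

    /* Optimization and device-specific code generation would go here. */
    p->codegen_ran = true;
}
```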
This patch makes GLSL interpolation qualifiers visible to drivers via
the array InterpQualifier[] in gl_fragment_program, so that they can
easily be used by driver back-ends to select the correct interpolation
mode.
Prior to this patch, the GLSL compiler was using the enum
ir_variable_interpolation to represent interpolation types. Rather
than make a duplicate enum in core mesa to represent the same thing, I
moved the enum into mtypes.h and renamed it to be more consistent with
the other enums defined there.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Generate the program parameters list by walking the IR instead of by
walking the list of linked uniforms. This simplifies the code quite a
bit, and is probably a bit more correct. The list of linked uniforms
should really only be used by the GL API to interact with the
application.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Bryan Cain <bryancain3@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
Having a few of these includes or forward declarations inside the
'extern "C"' block can cause problems later. Specifically, it
prevents C++ linkage functions from being added to ir_to_mesa.h and
makes G++ angry if 'struct foo' is seen both inside and outside an
'extern "C"'.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Drivers implementing GLSL 1.30 want to do integer modulus, and until we
can stop generating code via ir_to_mesa, it's easier to make it silently
generate rubbish code. Multiply will do.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
For hardware drivers, we only have ir_to_mesa called for the purposes
of potential swrast fallbacks (basically never on a 1.30 driver),
which we don't really care about. This will allow 1.30 to be
implemented without rewriting swrast for it.
Reviewed-by: Chad Versace <chad@chad-versace.us>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
GLSL 1.30 requires us to use gl_ClipDistance for clipping if the
vertex shader contains a static write to it, and otherwise use
user-defined clipping planes. Since the driver needs to behave
differently in these two cases, we need a flag to record whether the
shader has written to gl_ClipDistance.
The new flag is called UsesClipDistance. We initially store it in
gl_shader_program (since that is the data structure that is available
when we check to see whether gl_ClipDistance was written to), and we
later copy it to a flag with the same name in gl_vertex_program, since
that is a more convenient place for the driver to access it (in i965,
at least).
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
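A small sketch of the flag flow described above. The two structs are
hypothetical reductions of gl_shader_program and gl_vertex_program, and
the enum and function names are assumptions; only the UsesClipDistance
field name comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reductions of the real structures. */
struct shader_program_sketch { bool UsesClipDistance; };
struct vertex_program_sketch { bool UsesClipDistance; };

enum clip_source { CLIP_USER_PLANES, CLIP_DISTANCE };

/* Copy the flag from the linked program to the vertex program. */
void copy_clip_flag(const struct shader_program_sketch *sh,
                    struct vertex_program_sketch *vp)
{
    assert(sh != NULL && vp != NULL);
    vp->UsesClipDistance = sh->UsesClipDistance;
}

/* Driver-side decision: gl_ClipDistance if written, else user planes. */
enum clip_source select_clip_source(const struct vertex_program_sketch *vp)
{
    assert(vp != NULL);
    return vp->UsesClipDistance ? CLIP_DISTANCE : CLIP_USER_PLANES;
}
```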
This is a better, more fine-grained way of lowering if statements. Fixes the
game And Yet It Moves on nv50.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Using multiply and reciprocal for integer division involves potentially
lossy floating point conversions. This is okay for older GPUs that
represent integers as floating point, but undesirable for GPUs with
native integer division instructions.
TGSI, for example, has UDIV/IDIV instructions for integer division,
so it makes sense to handle this directly. Likewise for i965.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Bryan Cain <bryancain3@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
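To illustrate why the multiply-and-reciprocal path described above is
lossy, here is a sketch comparing it with native integer division. The
function names are hypothetical. The value 16777217 (2^24 + 1) is not
representable as a 32-bit float, so the conversion alone already changes
the result.

```c
#include <assert.h>
#include <stdint.h>

/* Division as a float-only GPU would do it: convert, multiply by the
 * reciprocal, truncate. Overflow on the way back to int is ignored. */
int32_t div_via_reciprocal(int32_t a, int32_t b)
{
    assert(b != 0);
    volatile float fa = (float)a;         /* may round for |a| > 2^24 */
    volatile float rcp = 1.0f / (float)b; /* reciprocal is usually inexact */
    volatile float q = fa * rcp;
    return (int32_t)q;
}

/* Division as hardware with a native integer divide would do it. */
int32_t div_native(int32_t a, int32_t b)
{
    assert(b != 0);
    return a / b;
}
```

With these definitions, div_native(16777217, 1) returns 16777217 while
div_via_reciprocal(16777217, 1) returns 16777216.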
!a && b occurs frequently when nested if-statements have been
flattened. It should also be possible to use a MAD for (a && b) || c,
though that would require a MAD_SAT.
Reviewed-by: Eric Anholt <eric@anholt.net>
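A sketch of how a single multiply-add can evaluate these expressions
when booleans are stored as 0.0 and 1.0. The operand order b * (-a) + b
is an assumption chosen for illustration, and the helper names are
hypothetical.

```c
#include <assert.h>

/* MAD: multiply-add, dst = a * b + c. */
float mad_f(float a, float b, float c) { return a * b + c; }

/* Clamp to [0,1], as a _SAT instruction modifier would. */
float saturate(float x) { return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x); }

/* !a && b as b * (-a) + b, which equals b * (1 - a). */
float not_a_and_b(float a, float b)
{
    assert((a == 0.0f || a == 1.0f) && (b == 0.0f || b == 1.0f));
    return mad_f(b, -a, b);
}

/* (a && b) || c as saturate(a * b + c); the sum can reach 2.0. */
float a_and_b_or_c(float a, float b, float c)
{
    return saturate(mad_f(a, b, c));
}
```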
The operation ir_binop_all_equal is !(a.x != b.x || a.y != b.y || a.z
!= b.z || a.w != b.w). Logical-or is implemented using addition
(followed by clamping to [0,1]) on values of 0.0 and 1.0. Replacing
the logical-or operators with addition gives !bool(int(a.x != b.x) +
int(a.y != b.y) + int(a.z != b.z) + int(a.w != b.w)). This can be
implemented using a dot-product with a vector of all 1.0. After the
dot-product, the value will be an integer in the range [0,4].
Previously a SEQ instruction was used to clamp the resulting logic
value to [0,1] and invert the result. Using an SGE instruction on the
negation of the dot-product result has the same effect. Many older
shader architectures do not support the SEQ instruction. It must be
emulated using two SGE instructions and a MUL. On these
architectures, the single SGE saves two instructions.
Reviewed-by: Eric Anholt <eric@anholt.net>
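The sequence described above can be sketched on the CPU as follows. The
helpers model SNE, SGE and DP4 on plain floats; all function names are
hypothetical, and this is an illustration rather than the emitted
instruction stream.

```c
#include <assert.h>

/* Set-on-not-equal and set-on-greater-or-equal, yielding 0.0 or 1.0. */
float sne(float a, float b) { return a != b ? 1.0f : 0.0f; }
float sge(float a, float b) { return a >= b ? 1.0f : 0.0f; }

/* Four-component dot product. */
float dp4(const float a[4], const float b[4])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

/* all_equal: count mismatches with DP4, then SGE on the negated count. */
float all_equal4(const float a[4], const float b[4])
{
    const float ones[4] = { 1.0f, 1.0f, 1.0f, 1.0f };
    float ne[4];
    for (int i = 0; i < 4; i++)
        ne[i] = sne(a[i], b[i]);

    float count = dp4(ne, ones); /* number of mismatches, 0..4 */
    assert(count >= 0.0f && count <= 4.0f);

    /* -count >= 0 holds only when count == 0, i.e. no mismatches. */
    return sge(-count, 0.0f);
}
```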
The operation ir_binop_any_nequal is (a.x != b.x) || (a.y != b.y) ||
(a.z != b.z) || (a.w != b.w), and that is the same as any(bvec4(a.x !=
b.x, a.y != b.y, a.z != b.z, a.w != b.w)). Implement the any() part
the same way the regular ir_unop_any is implemented.
Reviewed-by: Eric Anholt <eric@anholt.net>
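A sketch of the composition described above: build the per-component
"not equal" vector, then reduce it with any(). The any() here uses the
sum followed by SLT on the negation; function names are hypothetical.

```c
#include <assert.h>

/* Set-on-not-equal and set-on-less-than, yielding 0.0 or 1.0. */
float sne(float a, float b) { return a != b ? 1.0f : 0.0f; }
float slt(float a, float b) { return a < b ? 1.0f : 0.0f; }

/* any() over four 0.0/1.0 flags: dot with all-ones, then SLT(-sum, 0). */
float any4(const float v[4])
{
    float sum = 0.0f;
    for (int i = 0; i < 4; i++) {
        assert(v[i] == 0.0f || v[i] == 1.0f);
        sum += v[i] * 1.0f;
    }
    return slt(-sum, 0.0f); /* 1.0 when at least one flag was set */
}

/* any_nequal: any(bvec4(a.x != b.x, ..., a.w != b.w)). */
float any_nequal4(const float a[4], const float b[4])
{
    float ne[4];
    for (int i = 0; i < 4; i++)
        ne[i] = sne(a[i], b[i]);
    return any4(ne);
}
```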
This is just like the ir_binop_logic_or case. The operation
ir_unop_any is (a.x || a.y || a.z || a.w). Logical-or is implemented
using addition (followed by clamping to [0,1]) on values of 0.0 and
1.0. Replacing the logical-or operators with addition gives (a.x +
a.y + a.z + a.w). This can be implemented using a dot-product with a
vector of all 1.0.
Previously a SNE instruction was used to clamp the resulting logic
value to [0,1]. In a fragment shader, using a saturate on the
dot-product has the same effect. Adding the saturate to the
dot-product is free, so (at least) one instruction is saved.
In a vertex shader, using an SLT on the negation of the dot-product
result has the same effect. Many older shader architectures do not
support the SNE instruction. It must be emulated using two SLT
instructions and an ADD. On these architectures, the single SLT saves
two instructions.
Reviewed-by: Eric Anholt <eric@anholt.net>
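Both variants described above, sketched on plain floats. The fragment
path saturates the dot-product; the vertex path applies SLT to its
negation. Function names are hypothetical.

```c
#include <assert.h>

/* Clamp to [0,1], as a saturate modifier would. */
float saturate(float x) { return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x); }

/* Set-on-less-than, yielding 0.0 or 1.0. */
float slt(float a, float b) { return a < b ? 1.0f : 0.0f; }

/* Dot product of four 0.0/1.0 flags with a vector of all 1.0. */
float dp4_ones(const float v[4])
{
    float sum = 0.0f;
    for (int i = 0; i < 4; i++) {
        assert(v[i] == 0.0f || v[i] == 1.0f);
        sum += v[i] * 1.0f;
    }
    return sum; /* in the range [0,4] */
}

/* Fragment shader variant: saturate folded into the dot-product. */
float any4_fragment(const float v[4]) { return saturate(dp4_ones(v)); }

/* Vertex shader variant: -sum < 0 holds only when some flag is set. */
float any4_vertex(const float v[4]) { return slt(-dp4_ones(v), 0.0f); }
```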
Logical-or is implemented using addition (followed by clamping to
[0,1]) on values of 0.0 and 1.0. Replacing the logical-or operators
with addition gives a + b, which has a result in the range [0,2].
Previously a SNE instruction was used to clamp the resulting logic
value to [0,1]. In a fragment shader, using a saturate on the add has
the same effect. Adding the saturate to the add is free, so (at
least) one instruction is saved.
In a vertex shader, using an SLT on the negation of the add result has
the same effect. Many older shader architectures do not support the
SNE instruction. It must be emulated using two SLT instructions and
an ADD. On these architectures, the single SLT saves two
instructions.
Reviewed-by: Eric Anholt <eric@anholt.net>
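The two-operand case above, sketched on plain floats with hypothetical
function names. The sum a + b is either saturated (fragment shader) or
negated and compared with SLT (vertex shader).

```c
#include <assert.h>

/* Clamp to [0,1], as a saturate modifier would. */
float saturate(float x) { return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x); }

/* Set-on-less-than, yielding 0.0 or 1.0. */
float slt(float a, float b) { return a < b ? 1.0f : 0.0f; }

/* Fragment shader variant: ADD with saturate. */
float logic_or_fragment(float a, float b)
{
    assert((a == 0.0f || a == 1.0f) && (b == 0.0f || b == 1.0f));
    return saturate(a + b);
}

/* Vertex shader variant: SLT on the negated sum. */
float logic_or_vertex(float a, float b)
{
    assert((a == 0.0f || a == 1.0f) && (b == 0.0f || b == 1.0f));
    return slt(-(a + b), 0.0f);
}
```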
Rely on the driver to do the right thing. This probably means falling
back to software. Page 88 of the OpenGL 2.1 spec specifically says:
"A shader should not fail to compile, and a program object should
not fail to link due to lack of instruction space or lack of
temporary variables. Implementations should ensure that all valid
shaders and program objects may be successfully compiled, linked
and executed."
There is no provision for saying "No" to a valid shader that is
difficult for the hardware to handle, so stop doing that.
On i915 this causes a large number of piglit tests to change from FAIL
to WARN. The warning is because the driver still emits messages to
stderr like "i915_program_error: Unsupported opcode: BGNLOOP".
It also fixes ES2 conformance CorrectFull_frag and CorrectParse1_frag
on i915 (and probably other hardware that can't handle loops).
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
The functionality is not used by anything yet, and the glUniform functions will
need to be reworked before this can reach its full usefulness. It is
nonetheless a step towards integer support in the state tracker and classic drivers.
This fixes many cases of accessing arrays of matrices using
non-constant indices at each level.
Fixes i965 piglit:
vs-temp-array-mat[234]-index-col-rd
vs-temp-array-mat[234]-index-col-row-rd
vs-temp-array-mat[234]-index-col-wr
vs-uniform-array-mat[234]-index-col-rd
Fixes swrast piglit:
fs-temp-array-mat[234]-index-col-rd
fs-temp-array-mat[234]-index-col-row-rd
fs-temp-array-mat[234]-index-col-wr
fs-uniform-array-mat[234]-index-col-rd
fs-uniform-array-mat[234]-index-col-row-rd
fs-varying-array-mat[234]-index-col-rd
fs-varying-array-mat[234]-index-col-row-rd
vs-temp-array-mat[234]-index-col-rd
vs-temp-array-mat[234]-index-col-row-rd
vs-temp-array-mat[234]-index-col-wr
vs-uniform-array-mat[234]-index-col-rd
vs-uniform-array-mat[234]-index-col-row-rd
vs-varying-array-mat[234]-index-col-rd
vs-varying-array-mat[234]-index-col-row-rd
vs-varying-array-mat[234]-index-col-wr
Reviewed-by: Eric Anholt <eric@anholt.net>
And don't delete them. Let ralloc clean them up. Deleting the
temporary IR leaves dangling references in the prog_instruction. That
results in a bad dereference when printing the IR with MESA_GLSL=dump.
NOTE: This is a candidate for the 7.10 and 7.11 branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38584
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Mesa IR actually stores all numbers as floating point, so this is
totally a farce, but we may as well keep it going.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Mesa already supports this because of NV_fragment_program.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marek Olšák <maraeo@gmail.com>