Make gl_program::InputsRead a 64 bits bitfield.
Adapt the intel and radeon driver to handle a 64 bits
InputsRead value.
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
I initially produced the patch using this bash command:
for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i
's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i
's/GL_FALSE/false/g' $file; done
Then I manually added #include <stdbool.h> to fix compilation errors,
and converted a few functions back to GLboolean that were used in core
Mesa's function pointer table to avoid "incompatible pointer" warnings.
Finally, I cleaned up some whitespace issues introduced by the change.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chad Versace <chad@chad-versace.us>
Acked-by: Paul Berry <stereotype441@gmail.com>
This can only happen in GLSL shaders because assembly shaders that use
too many temps are rejected by core Mesa. It is easiest to make this
happen with shaders that contain flow-control that could not be lowered.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Squashed commit of the following:
Author: Marek Olšák <maraeo@gmail.com>
mesa: fix getteximage so that it doesn't clamp values
mesa: update the compute_version function
mesa: add display list support for ARB_color_buffer_float
mesa: fix glGet query with GL_ALPHA_TEST_REF and ARB_color_buffer_float
commit b2f6ddf907935b2594d2831ddab38cf57a1729ce
Author: Luca Barbieri <luca@luca-barbieri.com>
Date: Tue Aug 31 16:50:57 2010 +0200
mesa: document known possible deviations from ARB_color_buffer_float
commit 5458935be800c1b19d1c9d1569dc4fa30a97e8b8
Author: Luca Barbieri <luca@luca-barbieri.com>
Date: Tue Aug 24 21:54:56 2010 +0200
mesa: expose GL_ARB_color_buffer_float
commit aef5c3c6be6edd076e955e37c80905bc447f8a82
Author: Luca Barbieri <luca@luca-barbieri.com>
Date: Thu Aug 26 18:12:34 2010 +0200
mesa, mesa/st: handle read color clamping properly
(I'll squash the st/mesa part to a separate commit. -Marek)
We set IMAGE_CLAMP_BIT in the caller based on _ClampReadColor, where
the operation mandates it.
TODO: did I get the set of operations mandating it right?
commit 3a9cb5e59b676b6148c50907ce6eef5441677e36
Author: Luca Barbieri <luca@luca-barbieri.com>
Date: Thu Aug 26 18:09:41 2010 +0200
mesa: respect color clamping in texenv programs (v2)
Changes in v2:
- Fix attributes other than vertex color sometimes getting clamped
commit de26f9e47e886e176aab6e5a2c3d4481efb64362
Author: Luca Barbieri <luca@luca-barbieri.com>
Date: Thu Aug 26 18:05:53 2010 +0200
mesa: restore color clamps on glPopAttrib
commit a55ac3c300c189616627c05d924c40a8b55bfafa
Author: Luca Barbieri <luca@luca-barbieri.com>
Date: Thu Aug 26 18:04:26 2010 +0200
mesa: clamp color queries if and only if fragment clamping is enabled
commit 9940a3e31c2fb76cc3d28b15ea78dde369825107
Author: Luca Barbieri <luca@luca-barbieri.com>
Date: Wed Aug 25 00:00:16 2010 +0200
mesa: introduce derived _ClampXxxColor state resolving FIXED_ONLY
To do this, we make ClampColor call FLUSH_VERTICES with the appropriate
_NEW flag.
We introduce _NEW_FRAG_CLAMP since fragment clamping has wide-ranging
effects, despite being in the Color attrib group.
This may be easily changed by s/_NEW_FRAG_CLAMP/_NEW_COLOR/g
commit 6244c446e3beed5473b4e811d10787e4019f59d6
Author: Luca Barbieri <luca@luca-barbieri.com>
Date: Thu Aug 26 17:58:24 2010 +0200
mesa: add unclamped color parameters
Instead of using the current gl_fragment_program. These aren't necessarily
the same, for example when translate_program() is called by
i915ValidateFragmentProgram().
Reducing the number of relocations has lots of nice knock-on effects,
not least including reducing batch buffer size, auxilliary array sizes
(vmalloced and copied into the kernel), processing of uncached
relocations etc.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Previously the SNE and SEQ instructions would calculate the partial
result to the destination register. This would cause problems if the
destination register was also one of the source registers.
Fixes piglit tests glsl-fs-any, glsl-fs-struct-equal,
glsl-fs-struct-notequal, glsl-fs-vec4-operator-equal,
glsl-fs-vec4-operator-notequal.
NOTE: This is a candidate for the 7.9 and 7.10 branches.
Previously a register would be marked as available if any component
was written. This caused shaders such as this:
0: TEX TEMP[0].xyz, INPUT[14].xyyy, texture[0], 2D;
1: MUL TEMP[1], UNIFORM[0], TEMP[0].xxxx;
2: MAD TEMP[2], UNIFORM[1], TEMP[0].yyyy, TEMP[1];
3: MAD TEMP[1], UNIFORM[2], TEMP[0].zzzz, TEMP[2];
4: ADD TEMP[0].xyz, TEMP[1].xyzx, UNIFORM[3].xyzx;
5: TEX TEMP[1].w, INPUT[14].xyyy, texture[0], 2D;
6: MOV TEMP[0].w, TEMP[1].wwww;
7: MOV OUTPUT[2], TEMP[0];
8: END
to produce incorrect code such as this:
BEGIN
DCL S[0]
DCL T_TEX0
R[0] = MOV T_TEX0.xyyy
U[0] = TEXLD S[0],R[0]
R[0].xyz = MOV U[0]
R[1] = MUL CONST[0], R[0].xxxx
R[2] = MAD CONST[1], R[0].yyyy, R[1]
R[1] = MAD CONST[2], R[0].zzzz, R[2]
R[0].xyz = ADD R[1].xyzx, CONST[3].xyzx
R[0] = MOV T_TEX0.xyyy
U[0] = TEXLD S[0],R[0]
R[1].w = MOV U[0]
R[0].w = MOV R[1].wwww
oC = MOV R[0]
END
Note that T_TEX0 is copied to R[0], but the xyz components of R[0] are
still expected to hold a calculated value.
Fixes piglit tests draw-elements-vs-inputs, fp-kill, and
glsl-fs-color-matrix. It also fixes Meego bugzilla #13005.
NOTE: This is a candidate for the 7.9 and 7.10 branches.
Fixes hang with "gst-launch-0.10 videotestsrc ! video/x-raw-rgb !
glupload ! gleffects effect=heat ! glimagesink" which uses 2 samplers
pointing at GL_TEXTURE1 and GL_TEXTURE2, and piglit
glsl-fs-sampler-numbering.
GL_TRUE indicates that the driver accepts the program.
GL_FALSE indicates the program can't be compiled/translated by the
driver for some reason (too many resources used, etc).
Propogate this result up to the GL API: set GL_INVALID_OPERATION
error if glProgramString() was called. Set shader program link
status to GL_FALSE if glLinkProgram() was called.
At this point, drivers still don't do any program checking and
always return GL_TRUE.
Other vendors have enabled ARB_fragment_shader as part of OpenGL 2.0
enablement even on hardware like the 915 with no dynamic branching or
dFdx/dFdy support. But for now we'll leave it disabled because we don't
do any flattening of ifs or loops, which is rather restrictive.
This support is not complete, and may be unstable depending on your shaders.
It passes 10/15 of the piglit glsl tests, but hangs on glean glsl1.
Previously, we were doing it in the midst of the pipeline run, which gave
an opportunity to enable/disable fallbacks, which is certainly the wrong
time to be doing so. This manifested itself in a NULL dereference for PutRow
after transitioning out of a fallback during a run_pipeline in glean glsl1.
It's misleading to report things like the program having too many native
instructions as a Mesa implementation error, when the program may just be
too big for the hardware.
There's really no need for two negation fields. This came from the
GL_NV_fragment_program extension. The new, unified Negate bitfield applies
after the absolute value step.
s/FRAG_RESULT_DEPR/FRAG_RESULT_DEPTH/
s/FRAG_RESULT_COLR/FRAG_RESULT/COLOR/
Remove FRAG_RESULT_COLH (NV half-precision) output since we never used it.
Next, we might merge the COLOR and DATA outputs (COLOR0, COLOR1, etc).
Previously fog parameter and specular color are packed into the
same dword. Note specular color should be packed in BGRA for device,
so if fog parameter and specular color all are present, fog parameter
will dirty the alpha term of specular color. This fixes rendering
issue when playing 'Yo Frankie' on 915/945.
The Taylor series notably fails at producing sin(pi) == 0, which leads to
discontinuity every 2*pi. The quadratic gets us sin(pi) == 0 behavior, at the
expense of going from 2.4% THD with working Taylor series to 3.8% THD (easily
seen on comparative graphs of the two). However, our previous implementation
was producing sin(pi) < -1 and worse, so any reasonable approximation is an
improvement. This also fixes the repeating behavior, where the previous
implementation would repeat sin(x) for x>pi as sin(x % pi) and the opposite
for x < -pi.