Currently everything is padded to 4 components. Making the list
more flexible will allow us to do uniform packing.
V2 (suggestions from Nicolai):
- always pass existing calls to _mesa_add_parameter() true for padd_and_align
- fix bindless param value offsets
- remove left over wip logic from pad and align code
- zero out param value padding
- whitespace fix
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
gl_FragCoord contains the window coordinates so it seems to me that
we should not use perspective correct interpolation for it. At least
now I get similar output as i965/swrast/llvmpipe produce.
This fixes dEQP-GLES2.functional.shaders.builtin_variable.fragcoord_w.
dEQP-GLES2.functional.shaders.builtin_variable.fragcoord_xyz was already
passing, though I'm not quite sure how it managed to do that.
v2: Add definitons for the S3 "wrap shortest" bits as well (Ian)
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Unlike the older gen2 hardware, gen3 performs perspective
correct interpolation even for the primary/secondary colors.
To do that it naturally needs us to emit W for the vertices.
Currently we emit W only when at least one texture coordinate
set gets emitted. This means the interpolation of color will
change depending on whether texcoords/varyings are used or not.
That's probably not what anyone would expect, so let's just
always emit W to get consistent behaviour. Trying to avoid
emitting W seems like more hassle than it's worth, especially
as bspec seems to suggest that the hardware will perform the
perspective division anyway.
This used to be broken until it was accidentally fixed it in
commit c349031c27 ("i915: Fix texcoord vs. varying collision
in fragment programs") by introducing a bug that made the driver
always emit W. After fixing that bug in commit c1eedb43f3
("i915: Fix wpos_tex vs. -1 comparison") we went back to the
old behaviour and caused an apparent regression.
Fixes: c1eedb43f3 ("i915: Fix wpos_tex vs. -1 comparison")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101451
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
wpos_tex used to be a GLuint so assigning -1 to it and
later comparing with -1 worked correctly, but commit
c349031c27 ("i915: Fix texcoord vs. varying collision in
fragment programs") changed wpos_tex to uint8_t and hence
broke the comparison. To fix this define a more explicit
invalid value for wpos_tex.
gcc warns us:
i915_fragprog.c:1255:57: warning: comparison is always true due to limited range of data type [-Wtype-limits]
if (inputsRead & VARYING_BITS_TEX_ANY || p->wpos_tex != -1) {
^
And clang says:
i915_fragprog.c:1255:57: warning: comparison of constant -1 with expression of type 'uint8_t' (aka 'unsigned char') is always true [-Wtautological-constant-out-of-range-compare]
if (inputsRead & VARYING_BITS_TEX_ANY || p->wpos_tex != -1) {
~~~~~~~~~~~ ^ ~~
Cc: Chih-Wei Huang <cwhuang@android-x86.org>
Cc: Eric Anholt <eric@anholt.net>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Fixes: c349031c27 ("i915: Fix texcoord vs. varying collision in fragment programs")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Set the flag via the _mesa_init_gl_program() and NewProgram()
helpers.
In i965 we currently check for the existance of gl_shader_program
to decide if this is an ARB assembly style program or not.
Adding a flag makes the code clearer and will help removes a
dependency on gl_shader_program in the i965 codegen functions.
Also this will allow use to skip initialising sampler units for
linked shaders, we currently memset it to zero again during linking.
Reviewed-by: Eric Anholt <eric@anholt.net>
It's common for games to compile 2000 programs or more so at
32bits x 2000 programs x 22 fields x 2 (at least) stages
This should give us something like 352 kilobytes in savings
once we add some more glsl only fields.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
This allows us to use ralloc_parent() to see which data structure owns
shader_info which allows us to fix a regression in nir_sweep().
This will also allow us to move some fields from gl_linked_shader to
gl_program, which will allow us to do some clean-ups like storing
gl_program directly in the CurrentProgram array in gl_pipeline_object
enabling some small validation optimisations at draw time.
Also it is error prone to depend on the gl_linked_shader for
programs in current use because a failed linking attempt will free
infomation about the current program. In i965 we could be trying
to recompile a shader variant but may have lost some required fields.
Reviewed-by: Eric Anholt <eric@anholt.net>
And set set inputs_read directly in shader_info.
To avoid regressions between changes this change is a squashed
version of the following patches.
st/mesa changes where:
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Here we move OriginUpperLeft and PixelCenterInteger into gl_program
all other fields have been replace by shader_info.
V2: Don't use anonymous union/structs to hold vertex/fragment fields
suggested by Ian.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
There is nothing left that can generate them. These used to be
generated by ir_to_mesa or by the assembler for various NV extensions
that have been removed.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Rather than accepting a void pointer, only to down and up cast around
it, convert the function to take the base (struct gl_program) pointer.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
i915 fragment programs utilize the texture coordinate registers
for both texture coordinates and varyings. Unfortunately the
code doesn't check if the same index might be in use for both.
It just naively uses the index to pick a texture unit, which
could lead to collisions.
Add an extra mapping step to allocate non conflicting texture
units for both uses.
The issue can be reproduced with a pair of simple shaders like
these:
attribute vec4 in_mod;
varying vec4 mod;
void main() {
mod = in_mod;
gl_TexCoord[0] = gl_MultiTexCoord0;
gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
}
varying vec4 mod;
uniform sampler2D tex;
void main() {
gl_FragColor = texture2D(tex, vec2(gl_TexCoord[0])) * mod;
}
Fixes many piglit tests on i915:
glsl-link-varyings-2
glsl-orangebook-ch06-bump
interpolation-none-gl_frontcolor-smooth-fixed
interpolation-none-gl_frontcolor-smooth-none
interpolation-none-gl_frontcolor-smooth-vertex
interpolation-none-gl_frontsecondarycolor-smooth-fixed
interpolation-none-gl_frontsecondarycolor-smooth-vertex
interpolation-none-gl_frontsecondarycolor-smooth-none
interpolation-none-other-flat-fixed
interpolation-none-other-flat-none
interpolation-none-other-flat-vertex
interpolation-none-other-smooth-fixed
interpolation-none-other-smooth-none
interpolation-none-other-smooth-vertex
v2 [idr]: Minor formatting tweaks.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
It was 2 bits to accommodate SATURATE_PLUS_MINUS_ONE (removed by commit
09b566e1). A similar change was made to TGSI recently in commit
e1c4e8aa.
Reducing the size from 2 bits to 1 reduces the size of the bit fields
from 17 bits to 16, which is a much nicer number.
Reviewed-by: Brian Paul <brianp@vmware.com>
i915_fragprog.c: In function ‘i915ValidateFragmentProgram’:
i915_fragprog.c:1453:11: warning: variable ‘k’ set but not used [-Wunused-but-set-variable]
int k;
^
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Matt Turner noticed that the hardware has always had a MIN
instruction, but the driver always used MAX+MOV for no
apparent reason.
This should cut an instruction, and a temporary, allowing
more programs to run in hardware.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the
old copyright name is creating unnecessary confusion, hence this change.
This was the sed script I used:
$ cat tg2vmw.sed
# Run as:
#
# git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed
#
# Rename copyrights
s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g
/Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./
s/TUNGSTEN GRAPHICS/VMWARE/g
# Rename emails
s/alanh@tungstengraphics.com/alanh@vmware.com/
s/jens@tungstengraphics.com/jowen@vmware.com/g
s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/
s/jrfonseca\?@tungstengraphics.com/jfonseca@vmware.com/g
s/keithw\?@tungstengraphics.com/keithw@vmware.com/g
s/michel@tungstengraphics.com/daenzer@vmware.com/g
s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/
s/zack@tungstengraphics.com/zackr@vmware.com/
# Remove dead links
s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g
# C string src/gallium/state_trackers/vega/api_misc.c
s/"Tungsten Graphics, Inc"/"VMware, Inc"/
Reviewed-by: Brian Paul <brianp@vmware.com>
This has been replaced with referring to env parameters using
PROGRAM_STATE_VAR and _mesa_load_state_parameters.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This has been replaced with referring to local parameters using
PROGRAM_STATE_VAR and _mesa_load_state_parameters.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
SEQ and SNE are not native i915 instructions, so they each generate at
least 3 instructions. If both operands are uniforms or constants, we
get 5 instructions like:
U[1] = MOV CONST[1]
U[0].xyz = SGE CONST[0].xxxx, U[1]
U[1] = MOV CONST[1].-x-y-z-w
R[0].xyz = SGE CONST[0].-x-x-x-x, U[1]
R[0].xyz = MUL R[0], U[0]
This code is stupid. Instead of having the individual calls to
i915_emit_arith generate the moves to utemps, do it in the caller. This
results in code like:
U[1] = MOV CONST[1]
U[0].xyz = SGE CONST[0].xxxx, U[1]
R[0].xyz = SGE CONST[0].-x-x-x-x, U[1].-x-y-z-w
R[0].xyz = MUL R[0], U[0]
This allows fs-temp-array-mat2-index-col-wr and
fs-temp-array-mat2-index-row-wr to fit in hardware limits (instead of
falling back to software rasterization).
NOTE: Without pending patches to the piglit tests, these tests will now
fail. This is an unrelated, pre-existing issue.
v2: Copy most of the body of the commit message into comments in the
code. Suggested by Eric.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This patch makes the following search-and-replace changes:
gl_frag_attrib -> gl_varying_slot
FRAG_ATTRIB_* -> VARYING_SLOT_*
FRAG_BIT_* -> VARYING_BIT_*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
These were only part of NV_fragment_program, so we can kill them.
The fact that PROGRAM_NAMED_PARAM appears in r200_vertprog.c is rather
comedic, but also demonstrates that people just spam the various types
of parameters everywhere because they're confusing.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Gallium drivers and i965 don't require special notification when
sampler uniforms change. They simply see the _NEW_TEXTURE and adjust
their indirection tables. These drivers don't want ProgramStringNotify:
it simply causes pointless recompiles.
Unfortunately, i915 still requires shader recompiles and needs
ProgramStringNotify. Rather than trying to fix that, simply change the
hook to a new, more specific one: ShaderUniformChange. On i915, this
translates to ProgramStringNotify; others simply ignore it.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
The current code would ignore the point size specified by gl_PointSize
builtin variable in vertex shader on Pineview. This patch servers as
fixing that.
This patch fixes the following issues on Pineview:
webglc: https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/sdk/tests/conformance/rendering/point-size.html
piglit: glsl-vs-point-size
NOTE: This is a candidate for stable release branches.
v2: pick Eric's nice tip for fixing this issue in hardware rendering.
v3: the last arg of EMIT_ATTR specify the size in _byte_. (Eric)
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Make gl_program::InputsRead a 64 bits bitfield.
Adapt the intel and radeon driver to handle a 64 bits
InputsRead value.
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
I initially produced the patch using this bash command:
for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i
's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i
's/GL_FALSE/false/g' $file; done
Then I manually added #include <stdbool.h> to fix compilation errors,
and converted a few functions back to GLboolean that were used in core
Mesa's function pointer table to avoid "incompatible pointer" warnings.
Finally, I cleaned up some whitespace issues introduced by the change.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chad Versace <chad@chad-versace.us>
Acked-by: Paul Berry <stereotype441@gmail.com>
This can only happen in GLSL shaders because assembly shaders that use
too many temps are rejected by core Mesa. It is easiest to make this
happen with shaders that contain flow-control that could not be lowered.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Squashed commit of the following:
Author: Marek Olšák <maraeo@gmail.com>
mesa: fix getteximage so that it doesn't clamp values
mesa: update the compute_version function
mesa: add display list support for ARB_color_buffer_float
mesa: fix glGet query with GL_ALPHA_TEST_REF and ARB_color_buffer_float
commit b2f6ddf907935b2594d2831ddab38cf57a1729ce
Author: Luca Barbieri <luca@luca-barbieri.com>
Date: Tue Aug 31 16:50:57 2010 +0200
mesa: document known possible deviations from ARB_color_buffer_float
commit 5458935be800c1b19d1c9d1569dc4fa30a97e8b8
Author: Luca Barbieri <luca@luca-barbieri.com>
Date: Tue Aug 24 21:54:56 2010 +0200
mesa: expose GL_ARB_color_buffer_float
commit aef5c3c6be6edd076e955e37c80905bc447f8a82
Author: Luca Barbieri <luca@luca-barbieri.com>
Date: Thu Aug 26 18:12:34 2010 +0200
mesa, mesa/st: handle read color clamping properly
(I'll squash the st/mesa part to a separate commit. -Marek)
We set IMAGE_CLAMP_BIT in the caller based on _ClampReadColor, where
the operation mandates it.
TODO: did I get the set of operations mandating it right?
commit 3a9cb5e59b676b6148c50907ce6eef5441677e36
Author: Luca Barbieri <luca@luca-barbieri.com>
Date: Thu Aug 26 18:09:41 2010 +0200
mesa: respect color clamping in texenv programs (v2)
Changes in v2:
- Fix attributes other than vertex color sometimes getting clamped
commit de26f9e47e886e176aab6e5a2c3d4481efb64362
Author: Luca Barbieri <luca@luca-barbieri.com>
Date: Thu Aug 26 18:05:53 2010 +0200
mesa: restore color clamps on glPopAttrib
commit a55ac3c300c189616627c05d924c40a8b55bfafa
Author: Luca Barbieri <luca@luca-barbieri.com>
Date: Thu Aug 26 18:04:26 2010 +0200
mesa: clamp color queries if and only if fragment clamping is enabled
commit 9940a3e31c2fb76cc3d28b15ea78dde369825107
Author: Luca Barbieri <luca@luca-barbieri.com>
Date: Wed Aug 25 00:00:16 2010 +0200
mesa: introduce derived _ClampXxxColor state resolving FIXED_ONLY
To do this, we make ClampColor call FLUSH_VERTICES with the appropriate
_NEW flag.
We introduce _NEW_FRAG_CLAMP since fragment clamping has wide-ranging
effects, despite being in the Color attrib group.
This may be easily changed by s/_NEW_FRAG_CLAMP/_NEW_COLOR/g
commit 6244c446e3beed5473b4e811d10787e4019f59d6
Author: Luca Barbieri <luca@luca-barbieri.com>
Date: Thu Aug 26 17:58:24 2010 +0200
mesa: add unclamped color parameters
Instead of using the current gl_fragment_program. These aren't necessarily
the same, for example when translate_program() is called by
i915ValidateFragmentProgram().
Reducing the number of relocations has lots of nice knock-on effects,
not least including reducing batch buffer size, auxilliary array sizes
(vmalloced and copied into the kernel), processing of uncached
relocations etc.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Previously the SNE and SEQ instructions would calculate the partial
result to the destination register. This would cause problems if the
destination register was also one of the source registers.
Fixes piglit tests glsl-fs-any, glsl-fs-struct-equal,
glsl-fs-struct-notequal, glsl-fs-vec4-operator-equal,
glsl-fs-vec4-operator-notequal.
NOTE: This is a candidate for the 7.9 and 7.10 branches.
Previously a register would be marked as available if any component
was written. This caused shaders such as this:
0: TEX TEMP[0].xyz, INPUT[14].xyyy, texture[0], 2D;
1: MUL TEMP[1], UNIFORM[0], TEMP[0].xxxx;
2: MAD TEMP[2], UNIFORM[1], TEMP[0].yyyy, TEMP[1];
3: MAD TEMP[1], UNIFORM[2], TEMP[0].zzzz, TEMP[2];
4: ADD TEMP[0].xyz, TEMP[1].xyzx, UNIFORM[3].xyzx;
5: TEX TEMP[1].w, INPUT[14].xyyy, texture[0], 2D;
6: MOV TEMP[0].w, TEMP[1].wwww;
7: MOV OUTPUT[2], TEMP[0];
8: END
to produce incorrect code such as this:
BEGIN
DCL S[0]
DCL T_TEX0
R[0] = MOV T_TEX0.xyyy
U[0] = TEXLD S[0],R[0]
R[0].xyz = MOV U[0]
R[1] = MUL CONST[0], R[0].xxxx
R[2] = MAD CONST[1], R[0].yyyy, R[1]
R[1] = MAD CONST[2], R[0].zzzz, R[2]
R[0].xyz = ADD R[1].xyzx, CONST[3].xyzx
R[0] = MOV T_TEX0.xyyy
U[0] = TEXLD S[0],R[0]
R[1].w = MOV U[0]
R[0].w = MOV R[1].wwww
oC = MOV R[0]
END
Note that T_TEX0 is copied to R[0], but the xyz components of R[0] are
still expected to hold a calculated value.
Fixes piglit tests draw-elements-vs-inputs, fp-kill, and
glsl-fs-color-matrix. It also fixes Meego bugzilla #13005.
NOTE: This is a candidate for the 7.9 and 7.10 branches.