Many GPUs have an instruction to do linear interpolation which is more
efficient than simply performing the algebra necessary (two multiplies,
an add, and a subtract).
Pattern matching or peepholing this is more desirable, but can be
tricky. By using an opcode, we can at least make shaders which use the
mix() built-in get the more efficient behavior.
Currently, all consumers lower ir_triop_lrp. Subsequent patches will
actually generate different code.
v2 [mattst88]:
- Add LRP_TO_ARITH flag to ir_to_mesa.cpp. Will be removed in a
subsequent patch and ir_triop_lrp translated directly.
v3 [mattst88]:
- Move changes from the next patch to opt_algebraic.cpp to accept
3-src operations.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Previously, we had separate constructors for one, two, and four operand
expressions. This patch consolidates them into a single constructor
which uses NULL default parameters.
The unary and binary operator constructors had assertions to verify that
the caller supplied the correct number of operands for the expression,
but the four-operand version did not. Since get_num_operands for
ir_quadop_vector returns the number of vector_elements, we can safely
add that without breaking the semantics of ir_quadop_vector.
This also paves the way for expressions with three operands. Currently,
none can be constructed since get_num_operands() never returns 3.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
total instructions in shared programs: 346873 -> 346847 (-0.01%)
instructions in affected programs: 364 -> 338 (-7.14%)
(All affected shaders are from Lightsmark)
Reviewed-by: Eric Anholt <eric@anholt.net>
Gen6 has write-only MRF registers, and for ease of implementation we
paritition off 16 general purposes registers to act as MRFs on Gen7.
Knowing that our Gen7 MRFs are actually GRFs, we can do things we can't
do with real MRFs:
- read from them;
- return values directly to them from a send instruction; and
- compute directly to them with math instructions.
Reviewed-by: Eric Anholt <eric@anholt.net>
This requirement was added by ARB_fragment_program
When the Steam overlay is enabled, this fixes:
* Menu corruption with the Puddle game
* The screen going black on Rochard when
the Steam overlay is accessed
NOTE: This is a candidate for the 9.0 and 9.1 branches.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Motivated by wanting to see if GenTextures was called by an
application while debugging another Steam overlay issue.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Now that buffers can be used as textures or render targets
make sure they aren't skipped.
Fix suggested by Jose Fonseca.
v2: added a couple of assertions so we can actually guarantee
we check the resources and don't skip them. Also added some comments
that this is actually a lie due to the way the opengl buffer api works.
Unfortunately not usable from OpenGL, and no cap bit.
Pretty similar to a 1d texture, though allows specifying a start element.
v2: also fix up renderbuffer width (which will get promoted to fb width)
to be the number of elements
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
For PIPE_BUFFER we need coord adjustments for the transfer.
And for pure integer formats util_pack_color just crashes,
need to handle that differently due to clear colors being ints/uints.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Use a single sampler adapter instead of per-sampler-unit samplers,
and just pass along texture unit and sampler unit in the calls.
The reason is that for dx10-style sample opcodes pre-wired
samplers including all the texture state aren't really feasible (and for
sample_i/sviewinfo we don't even have samplers).
Of course right now softpipe doesn't actually do anything more than
just look up all its pre-wired per-texunit/per-samplerunit sampler as
it did before so this doesn't really achieve much except one more
function call, however this is now all softpipe's fault (fixing that in
a way which doesn't suck is still an unsolved problem).
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
glXCreateWindow() and glXCreatePbuffer() always fail when built without
GLX_DIRECT_RENDERING defined since commit 48331047.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Normally the application will own the main event queue and be responsible
for moving events. In case of EGL_DEFAULT_DISPLAY, EGL opens the display
and has to own the main queue so it can move the events itself.
Call wl_display_dispatch_pending() to take ownership.
Previously only the 32-bit X visual would match the 32-bit RGBA8888
configs. This resulted in every config with alpha getting the "magic"
visual whose alpha is used by the compositor. This also resulted in no
multisample visuals being advertised. How many ways could we lose?
This patch inverts the problem... now you can't get the visual with
alpha used by the compositor even if you want it. I think we need to
invent a new value for EGL_TRANSPARENT_TYPE that apps can use to get
this. I'm surprised that there isn't already a choice for
EGL_TRANSPARENT_ALPHA.
NOTE: This is a candidate for the 9.1 branch.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Tian Ye <yex.tian@intel.com>
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59783
Streamout buffers need to be synchronized on r6xx as
well.
v2: Add DEST flush as well.
v3: drop DEST flush
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Since with llvm execution parts of sampler view and sampler state is baked into
the shader, we need to revalidate otherwise the wrong shader might get used.
(Not completely sure but I think this would not be required for non-llvm case,
along with everything else in these functions.)
This caused bugs in piglit arb_texture_buffer_object-formats, because we never
noticed that the view format changed.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This also fixes not honoring first/last_layer view parameters for array
textures, plus not honoring last_level view parameter for all textures
(neither is really used by OpenGL).
This mostly passes piglit arb_texture_buffer_object tests (it needs, however,
glsl 140 version override, plus GL 3.1 override, the latter only because
mesa does not allow ARB_tbo in non-core contexts).
Most arb_texture_buffer_object tests pass, with the exception of
arb_texture_buffer_object-formats. With "arb" parameter it passes most weirdo
formats before it segfaults in the state tracker, this looks to be some issue
with using legacy formats in core context (fails the same in softpipe).
With "core" parameter it passes with "fs", however fails with "vs" (for most
formats). This will be fixed later (debugging shows we're completely missing
the shader recompile depending on format).
v2: based on Jose's feedback, fix comments, variable/function names.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
When you didn't have a texcoord array bound (or a non-1 current w
attrib), we were telling the fragment shader that it could just use "1"
instead of doing expensive pre-gen6 math to invert it. If you drew the
point with a non-1 W value, then you'd get the right size (since all the
vertex computations worked), but we'd mis-interpolate the coordinate
across the face.
Fixes the mesa pointsprite demo on GM45.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30232
Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
Note: This is a candidate for the stable branches.
missing case GL_REQUIRED_TEXTURE_IMAGE_UNITS_OES is required
by OES_EGL_image_external extension.
Note: This is a candidate for the stable branches.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Previously when an input varying was optimized out of the
FS we would still retain it as an output of the VS.
We now build a hash of live FS input varyings rather
than looking in the FS symbol table. (The FS symbol table
will still contain the optimized out varyings.)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We might want to revisit the normalized_coords semantics, but this is
the current expected behavior.
Fixes fdo bug 61091.
Reviewed-by: Brian Paul <brianp@vmware.com>