The current gen_matypes logic assumes that the host compiler will produce
information that is useful for the target compiler. Unfortunately, this
is not the case when cross-compiling.
When we detect that we're cross-compiling and using GCC, use the target
compiler to produce assembly from the gen_matypes.c source, then process
it with a shell script to create a usable header. This is similar to how
the Linux kernel generates its asm-offsets.h header from asm-offsets.c.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
This is required in case a wrapper or symlink is used. This patch
has also been sent upstream, awaiting moderation.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andreas Oberritter <obi@saftware.de>
Adds the dependencies of builtin_compiler as sources when cross
compiling instead of using libtool to share compilation with src/glsl.
The builtin_compiler executable is built for the host when cross
compiling so it doesn't make sense to share compilation with src/glsl
built for the target in this case.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44618
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jonathan Liu <net147@gmail.com>
Usually with fixed point renderbuffers clamping is done as part of conversion.
However, since we blend in float format, we essentially skip all conversion
steps pre-blend. This is still a fixed point renderbuffer, so we must clamp
the inputs explicitly in this case. Makes no difference for piglit though.
Obviously we could skip this if fragment color clamping is enabled, but a)
this is deprecated in OpenGL (d3d never had it) and b) we don't support it
natively so it gets baked into the shader.
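A minimal sketch of why the pre-blend clamp matters, assuming a UNORM target
and a plain ONE/ONE additive blend. The function names are hypothetical and
this is not the llvmpipe code, only an illustration of the arithmetic:

```c
/* Clamp to [0, 1], the range a UNORM renderbuffer can store. */
static float clamp01(float x)
{
   if (x < 0.0f)
      return 0.0f;
   if (x > 1.0f)
      return 1.0f;
   return x;
}

/*
 * Hypothetical ONE/ONE additive blend carried out in float, then stored
 * to a UNORM target. clamp_src selects whether the incoming fragment
 * value is clamped before blending.
 */
static float blend_add_unorm(float src, float dst, int clamp_src)
{
   if (clamp_src)
      src = clamp01(src);
   return clamp01(src + dst); /* the store itself always clamps */
}
```

With src = -0.5 and dst = 0.75, skipping the input clamp stores 0.25, while
clamping first stores 0.75, so the two paths are not interchangeable.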
Also add a comment about logic ops being broken for srgb. Luckily no test
tries to do that, as there's no easy fix...
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
We were fixing up the blend factor to ZERO, however this only works correctly
with fixed point render buffers where the input values are clamped to 0/1
(because src_alpha_saturate is min(As, 1-Ad) so can be negative with unclamped
inputs). Haven't seen any failure anywhere due to that with fixed point SNORM
buffers (which clamp inputs to -1/1) but it should apply there as well (snorm
blending is rare, even opengl 4.3 doesn't require snorm rendertargets at all,
d3d10 requires them but they are not blendable).
Doesn't look like piglit hits this though (some internal testing hits the
float case at least). (With legacy OpenGL we could theoretically still use the
fixup to zero if the fragment color clamp is enabled, but we can't detect that
easily since we don't support native clamping hence it gets baked into the
shader.)
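To illustrate the min(As, 1-Ad) argument, here is a small sketch. It assumes
the fixup to ZERO was applied where destination alpha reads as 1 (the message
does not spell that out), and the function name is made up for illustration:

```c
/* SRC_ALPHA_SATURATE factor for the RGB channels: min(As, 1 - Ad). */
static float src_alpha_saturate(float src_alpha, float dst_alpha)
{
   float inv_dst = 1.0f - dst_alpha;
   return src_alpha < inv_dst ? src_alpha : inv_dst;
}
```

With clamped inputs and Ad = 1 the factor is always 0, so replacing it with
ZERO is safe. With unclamped inputs, As = -0.25 gives -0.25 and Ad = 1.5 gives
-0.5, so the factor is no longer zero.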
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
Adds H.264 and MPEG2 codec support via VP2, using firmware from the
blob. Acceleration is supported at the bitstream level for H.264 and
IDCT level for MPEG2.
Known issues:
- H.264 interlaced doesn't render properly
- H.264 shows very occasional artifacts on a small fraction of videos
- MPEG2 + VDPAU shows frequent but small artifacts, which aren't there
when using XvMC on the same videos
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Use grep -w instead of the empty string escape sequences
which are less portable. Makes the grep tests
function as intended on OpenBSD.
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Fixes this build error on OpenBSD 5.3.
In file included from ../../src/mesa/main/ff_fragment_shader.cpp:53:
./../glsl/ir_optimization.h:64: error: comma at end of enumerator list
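For reference, the shape of the fix is simply to drop the comma after the last
enumerator, which strict pre-C++11 C++ (and C89) compilers reject. The enum
below is a hypothetical stand-in, not the one in ir_optimization.h:

```c
enum example_pass {
   EXAMPLE_PASS_FIRST,
   EXAMPLE_PASS_SECOND,
   EXAMPLE_PASS_LAST    /* no comma after the final enumerator */
};
```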
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Fixes these build errors on OpenBSD 5.3.
In file included from ../../src/mesa/main/errors.h:47,
from ../../src/mesa/main/imports.h:41,
from ../../src/mesa/main/ff_fragment_shader.cpp:32:
../../src/mesa/main/mtypes.h:3286: error: comma at end of enumerator list
../../src/mesa/main/mtypes.h:3296: error: comma at end of enumerator list
../../src/mesa/main/mtypes.h:3303: error: comma at end of enumerator list
../../src/mesa/main/mtypes.h:3356: error: comma at end of enumerator list
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Use "or" instead of "add". This is a classic select sequence, which at
least newer llvm versions (3.2+?) can actually recognize, and the "add"
might prevent that. We really don't want an add instead of an or with avx
if it isn't recognized (even without avx, logic ops might be cheaper).
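As a scalar sketch of the select sequence in question (hypothetical helper
names, written in plain C rather than the LLVM IR the change actually emits):

```c
#include <stdint.h>

/* Classic bitwise select: take bits of a where mask is set, else bits of b. */
static uint32_t select_or(uint32_t mask, uint32_t a, uint32_t b)
{
   return (a & mask) | (b & ~mask);
}

/* Same result, since the two masked halves never overlap, but the "add"
 * hides the select pattern from anything trying to match it. */
static uint32_t select_add(uint32_t mask, uint32_t a, uint32_t b)
{
   return (a & mask) + (b & ~mask);
}
```

Both produce identical values because the masked operands have no set bits in
common, so no carries can occur in the add.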
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Instead of just ignoring the srgb/linear conversions, simply call the
corresponding conversion functions, for all of pack/unpack/fetch,
both for float and unorm8 versions (though some don't make a whole
lot of sense, i.e. unorm8/unorm8 srgb/linear combinations).
Refactored some functions a bit so we don't have to duplicate all the code.
There's a slight change for packing dxt1_rgb: 4 components will now always
be initialized and sent to the external compression function, so the same
code can be used for all formats. The (by now) quite horrid and ad-hoc
interface should always have worked with that.
Fixes llvmpipe/softpipe piglit texwrap GL_EXT_texture_sRGB-s3tc.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Scheduler/register allocator in r600-sb was developed and optimized
on evergreen (VLIW-5) hardware, so currently it's not optimal for
VLIW-4 chips.
This patch should improve performance on cayman gpus due to better alu
packing, but it also tends to increase register usage, so an overall positive
effect on performance has yet to be proven by real benchmarks.
Some results with bfgminer kernel on cayman:
source bytecode: 60 gprs, 3905 alu groups,
sbcl before the patch: 45 gprs, 4088 alu groups,
sbcl with this patch: 55 gprs, 3474 alu groups.
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Ex-scalar instructions that became multislot on cayman replicate their result
to all channels - handle them similarly to DOT4.
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Actually PS doesn't make sense for cayman and isn't even mentioned in
cayman docs, but llvm backend currently uses it in bytecode and, assuming
that hw seems to be mostly ok with it, this will allow sb to parse such
source bytecode correctly.
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Every function but the above four uses explicitly sized types for their
src and dst arguments. Even fetch_rgba_{s,u}int follows the convention.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Marek Olšák <maraeo@gmail.com>
MCJIT is the only supported LLVM JIT on AArch64 and ARM (the regular
JIT has bit-rotted badly on ARM and doesn't exist on AArch64.)
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
Historically, we indented grammar production rules with a single 8-space
tab, but code inside of blocks used Mesa's 3-space indents.
This meant when editing code, you had to use an 8-space tab for the
first level of indentation, and 3-spaces after that. Unless you
specifically configure your editor to understand this, it will get the
indentation wrong on every single line you touch, which quickly devolves
into a colossal waste of time.
It's also inconsistent with every other file in the entire project.
This patch removes all tabs and moves to a consistent 3-space indent.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
When working on a parser, it's very easy to accidentally introduce
new shift/reduce conflicts. Failing the build guarantees they'll
be noticed and fixed.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
The single remaining shift/reduce conflict was the classic ELSE problem:
  292 selection_rest_statement: statement . ELSE statement
  293                         | statement .
    ELSE  shift, and go to state 479
    ELSE      [reduce using rule 293 (selection_rest_statement)]
    $default  reduce using rule 293 (selection_rest_statement)
The correct behavior here is to shift, which is what happens by default.
However, resolving it explicitly will make it possible to fail the build
on new errors, making them much easier to detect.
The classic way to solve this is to use right associativity:
http://www.gnu.org/software/bison/manual/html_node/Non-Operators.html
Since there is no THEN token in GLSL, we need to fake one. %right THEN
creates a new terminal symbol; the %prec directive says to use the
precedence of that terminal.
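The effect of choosing "shift" can be seen from the language side. The sketch
below is plain C with a made-up function name, not the grammar itself, but C
resolves the same ambiguity the same way:

```c
/* The else pairs with the nearest unmatched if, i.e. the parser shifts
 * ELSE instead of reducing the inner if-statement first. */
static int nearest_if_wins(int a, int b)
{
   int r = 0;
   if (a)
      if (b)
         r = 1;
      else        /* pairs with "if (b)", not "if (a)" */
         r = 2;
   return r;
}
```

Compilers may warn about the ambiguous else here; that is expected, since the
missing braces are the point of the example.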
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
opt_return_value was not initialized if mode != ast_return.
Fixes "Uninitialized pointer field" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
This should fix missing symbols in an osmesa build against shared glapi.
All OpenGL exports that are defined in the static glapi were missing, so
link against both to fix this.
This is a candidate for the stable series.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47824
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
We always emit U,V,R coordinates for this message, but the sampler gets
very angry if we pass garbage in the R coordinate for at least some
texture formats.
Fill the remaining coordinates with zero instead.
Fixes broken rendering on GM45 in Source games, and in VDrift.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65236
NOTE: This is a candidate for stable branches.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
brw_tex_layout.c sets up the align_w/h fields, and has all the
appropriate spec references already.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
The Sandybridge code had a citation for the range of the "Maximum Number
of Threads" field, and the Ivybridge code just mentioned the "BSpec" in
general. That's documented in the obvious place, so people can find it
without a spec reference.
The real value of the comment is to say "we tried zero, and it exploded,
so program it to a valid number even if pixel shading is off."
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Unfortunately, the workaround text never made it into the Sandybridge
PRM, so we still have to refer to the BSpec.
It also wasn't obvious why we needed this workaround at all, since we
don't currently do VS passthrough - but BLORP can turn off the VS.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Sadly, the Ivybridge PRM can't be cited, as it is missing the relevant
text for some reason. However, the Sandybridge PRM has the text Chad
originally quoted, and the modern BSpec has the same text.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
I cut and pasted these comments from the Gen4 code during Ivybridge
enabling, and didn't understand what they meant at the time.
The data cache is NOT the same as the sampler cache on Ivybridge.
The sampler cache has L1 and L2 caches in addition to the L3 cache,
while data port messages to the "data cache" hit L3 directly.
This means that the sampler domain is technically wrong, but we stopped
caring about read/write domains quite a while ago. The kernel just
flushes all the caches at the end of each batchbuffer, and our render to
texture code flushes the sampler caches when necessary.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Presumably, this comment exists to justify the usage of
I915_GEM_DOMAIN_SAMPLER for this relocation. At one point, this was
necessary to ensure that the right flushing was done to keep caches
coherent. These days, the kernel just flushes everything, so I don't
think it matters.
Still, the comment is interesting, so leave it in place.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
The Ivybridge PRM adds new SFIDs and lists them in a different volume
than Sandybridge, so it's worth adding a reference.
I also removed the BSpec reference, as the section it referred to
was moved somewhere, and I couldn't find it. This leaves one Haswell
SFID without a citation, but we can add one once the PRMs are out.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Just use the new conversion functions to do the work. The way it's plugged
into the blend code is quite hacktastic but follows all the same hacks
as used by packed float format already.
Only support 4x8bit srgb formats (rgba/rgbx plus swizzle), 24bit formats never
worked anyway in the blend code and are thus disabled, and I don't think anyone
is interested in L8/L8A8. Would need even more hacks otherwise.
Unless I'm missing something, this is the last feature except MSAA needed for
OpenGL 3.0, and for OpenGL 3.1 as well I believe.
v2: prettify a bit, use separate function for packing.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
The splitting of a draw call into several draw commands was broken, because
the split sometimes took place in the middle of a primitive. The splitting
was supposed to be dealing with the case when there are more indices than
the maximum size of a CS.
This commit throws that code away and uses a real index buffer instead.
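A sketch of the failure mode only, not of the fix (which drops the splitting
altogether). The helper name and the numbers are hypothetical, and it only
covers list-type primitives with a fixed vertex count:

```c
/*
 * For a list primitive with verts_per_prim vertices each, return how many
 * indices of a partially emitted primitive are left dangling when a draw
 * is cut after `split` indices. 0 means the cut is on a primitive boundary.
 */
static unsigned dangling_indices(unsigned split, unsigned verts_per_prim)
{
   return split % verts_per_prim;
}
```

Cutting a triangle list after 1000 indices leaves 1 index of an unfinished
triangle in the first command, so that triangle is lost or corrupted.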
https://bugs.freedesktop.org/show_bug.cgi?id=66558
Cc: mesa-stable@lists.freedesktop.org