Commit Graph

66243 Commits

Author SHA1 Message Date
Chia-I Wu
b039dbfffd configure: check for xlocale.h and strtof
With the assumptions that xlocale.h implies newlocale and strtof_l.  SCons is
updated to define HAVE_XLOCALE_H on linux and darwin.

Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-30 02:26:19 -07:00
Chia-I Wu
e3f2029479 util: add _mesa_strtod and _mesa_strtof
Both core mesa and glsl have their own wrappers for strtof_l.  Merge
and move them to util/.  They are compiled with a C++ compiler so that
we can make them thread-safe in a following commit.

Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-30 02:26:19 -07:00
Mathias Fröhlich
2c2ada6720 mesa/gallium: Signal _NEW_TRANSFORM from glClipControl.
This removes the need for the gallium rasterizer state
to listen to viewport changes.
Thanks to Marek Olšák <maraeo@gmail.com>.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2014-10-30 07:52:00 +01:00
Matt Turner
600066af93 Revert "i965/compaction: Disable compaction on SNB temporarily."
This reverts commit cabc93c5ad.

Mark thinks the failures on the SNB GT2 in the lab are actually because
of faulty hardware, not instruction compaction. The GT1 didn't see any
problems after changes to the compaction code.
2014-10-29 21:38:39 -07:00
Matt Turner
601a134180 i965/vec4: Perform CSE on MAD instructions with final arguments switched.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-29 21:35:46 -07:00
Matt Turner
b65bd9583b i965/fs: Perform CSE on MAD instructions with final arguments switched.
Multiplication is commutative.

instructions in affected programs:     48314 -> 47954 (-0.75%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-29 21:35:46 -07:00
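
The reasoning in b65bd9583b ("Multiplication is commutative") can be sketched in plain C. This is a minimal illustration, not the i965 code: `struct mad_inst` and `mad_operands_match` are hypothetical names, operands are reduced to integer register numbers, and the operand layout (dst = src[0] + src[1] * src[2]) is an assumption made for the example.

```c
#include <stdbool.h>

/* Hypothetical, simplified MAD: dst = src[0] + src[1] * src[2]. */
struct mad_inst {
   int src[3];   /* register numbers standing in for real operands */
};

/* Two MADs compute the same value if the addend matches and the two
 * multiplicands match in either order, since multiplication commutes. */
bool
mad_operands_match(const struct mad_inst *a, const struct mad_inst *b)
{
   if (a->src[0] != b->src[0])
      return false;

   return (a->src[1] == b->src[1] && a->src[2] == b->src[2]) ||
          (a->src[1] == b->src[2] && a->src[2] == b->src[1]);
}
```

A CSE pass using a check like this could reuse the result of an earlier MAD whose last two sources appear in the opposite order.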
Matt Turner
d056863b3c glsl: Drop constant 0.0 components from dot products.
Helps a small number of vertex shaders in the games Dungeon Defenders
and Shank, as well as an internal benchmark.

instructions in affected programs:     2801 -> 2719 (-2.93%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-29 21:35:46 -07:00
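
The commit above (d056863b3c) operates on GLSL IR at compile time. The sketch below only illustrates the underlying idea in plain C: find which components of a constant operand are nonzero, so that a narrower dot product over just those components can be emitted. `nonzero_component_mask` is a hypothetical helper, and the sketch assumes that dropping a 0.0 term is acceptable even when the other operand could be NaN or infinite.

```c
/* Return a bitmask with bit i set when constant[i] is not zero.
 * Both 0.0f and -0.0f compare equal to zero and are dropped. */
unsigned
nonzero_component_mask(const float *constant, unsigned n)
{
   unsigned mask = 0;

   for (unsigned i = 0; i < n; i++) {
      if (constant[i] != 0.0f)
         mask |= 1u << i;
   }

   return mask;
}
```

For a constant such as (1.0, 0.0, 2.0, 0.0), only components 0 and 2 survive, so a four-component dot product could shrink to a two-component one.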
Kenneth Graunke
26122e09a3 glx/dri3: Implement LIBGL_SHOW_FPS=1 for DRI3/Present.
v2: Use the UST value provided in the PRESENT_COMPLETE_NOTIFY event
    rather than gettimeofday(), which gives us the presentation time
    instead of the time when SwapBuffers was called.  Suggested by
    Keith Packard.  This relies on the fact that the X DRI3/Present
    implementations use microseconds for UST.

v3: Properly ignore PresentCompleteKindMSCNotify; multiply in 64 bits
    (caught by Keith Packard).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Keith Packard <keithp@keithp.com> [v3]
Reviewed-by: Marek Olšák <marek.olsak@amd.com> [v1]
2014-10-29 15:13:58 -07:00
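
The v2 and v3 notes in 26122e09a3 (UST in microseconds, multiply in 64 bits) can be sketched as follows. This is an illustration under assumptions, not the glx/dri3 code: `fps_from_ust` and its parameters are hypothetical names, and the UST values are assumed to be in microseconds as the message states.

```c
#include <stdint.h>

/* Frames per second over an interval measured in microseconds (UST).
 * The frame count is widened to 64 bits before the multiply so that
 * frames * 1000000 cannot wrap a 32-bit intermediate. */
double
fps_from_ust(uint32_t frames, uint64_t ust_start, uint64_t ust_end)
{
   uint64_t elapsed_us = ust_end - ust_start;

   if (elapsed_us == 0)
      return 0.0;

   return (double)((uint64_t)frames * 1000000) / (double)elapsed_us;
}
```

Without the cast, any frame count above 4294 would overflow the 32-bit product, since 4295 * 1000000 exceeds 2^32.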
Kenneth Graunke
62b07b934e i965: Rename brw_vec4_gs.[ch] to brw_gs.[ch].
These source files support actual geometry shaders, so using "gs" for
the name makes a lot of sense.  We're going to be adding SIMD8 geometry
shader support as well, at which point "vec4_gs" will be a misnomer.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
2014-10-29 12:38:56 -07:00
Kenneth Graunke
02f8f90cc2 i965: Rename brw_gs{,_emit}.[ch] to brw_ff_gs{,_emit}.[ch].
The brw_gs.[ch] and brw_gs_emit.c source files contain code for
emulating fixed-function unit functionality (VF primitive decomposition
or SOL) using the GS unit.  They do not contain code to support proper
geometry shaders.

We've taken to calling that code "ff_gs" (see brw_ff_gs_prog_key,
brw_ff_gs_prog_data, brw_context::ff_gs, brw_ff_gs_compile,
brw_ff_gs_prog).  So it makes sense to make the filenames match.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
2014-10-29 12:38:42 -07:00
Kenneth Graunke
1480814173 i965: Rename intel_bufferobj_* functions to match GL and DD hooks.
The GL functions and driver hooks use corresponding names---for example,
glMapBufferRange and Driver.MapBufferRange.  But our implementation was
called "intel_bufferobj_map_range," which has the words "map" and
"buffer" swapped, as well as randomly adding "obj."

FlushMappedBufferRange was even trickier: it ordered the words
3, "obj", 1, 2, 4: intel_bufferobj_flush_mapped_range.

Even though the old names were consistent, I always had trouble
rearranging the jumble of words when searching for a function,
and it took a few tries to eventually land there.

The new names match the word order of GL and the driver hooks;
FlushMappedBufferRange is simply brw_flush_mapped_buffer_range.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-10-29 12:38:28 -07:00
Jan Vesely
993e2922c9 configure: fix typos
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
2014-10-29 19:10:48 +00:00
Jan Vesely
af9551e68c configure: include llvm systemlibs when using static llvm
v2: drop -Wl,--exclude-libs, it's not necessary
    fix tabs/spaces

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70410
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
2014-10-29 18:52:46 +00:00
Michel Dänzer
402ab50bed radeon/llvm: Dynamically allocate branch/loop stack arrays
This prevents us from silently overflowing the stack arrays, and allows
arbitrary stack depths.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85454

Cc: mesa-stable@lists.freedesktop.org
Reported-and-Tested-by: Nick Sarnie <commendsarnex@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-29 19:01:25 +09:00
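
The approach described in 402ab50bed, replacing fixed-size stack arrays with dynamically allocated ones, can be sketched in C. This is a generic growable stack, not the radeon/llvm code: `struct dyn_stack`, `dyn_stack_push` and `dyn_stack_pop` are hypothetical names, and plain ints stand in for the real branch/loop entries.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical growable stack; ints stand in for branch/loop entries. */
struct dyn_stack {
   int *items;
   unsigned depth;      /* entries in use */
   unsigned capacity;   /* entries allocated */
};

/* Push a value, doubling the array when it is full.  Returns false on
 * allocation failure and leaves the stack unchanged. */
bool
dyn_stack_push(struct dyn_stack *s, int value)
{
   if (s->depth == s->capacity) {
      unsigned new_capacity = s->capacity ? s->capacity * 2 : 4;
      int *grown = realloc(s->items, new_capacity * sizeof(*grown));

      if (!grown)
         return false;

      s->items = grown;
      s->capacity = new_capacity;
   }

   s->items[s->depth++] = value;
   return true;
}

/* Pop the top value into *out.  Returns false if the stack is empty. */
bool
dyn_stack_pop(struct dyn_stack *s, int *out)
{
   if (s->depth == 0)
      return false;

   *out = s->items[--s->depth];
   return true;
}
```

Because every push checks the capacity first, a deep nesting grows the array instead of writing past its end.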
Chris Forbes
0d5f4960a4 mesa: Fix order of errors for glDrawTransformFeedbackStream
The OpenGL 4.0 core profile specification, section 2.17.3
Transform Feedback Draw Operations says:

   "The error INVALID_VALUE is generated if <stream> is greater
    than or equal to the value of MAX_VERTEX_STREAMS.
    ...
    The error INVALID_OPERATION
    is generated if EndTransformFeedback has never been called
    while the object named by id was bound."

Fixes the piglit test:
   ARB_transform_feedback3/arb_transform_feedback3-draw_using_invalid_stream_index
   (with the test itself fixed to eliminate an unrelated failure)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-29 21:25:20 +13:00
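
The error ordering quoted in 0d5f4960a4 can be sketched as a validation function. This is an illustration only: the enum and `validate_draw_stream` are hypothetical and do not reflect Mesa's actual entry points, and the sketch assumes the intended order is the one in which the quoted specification text lists the errors.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GL error codes. */
enum sketch_error {
   SKETCH_NO_ERROR,
   SKETCH_INVALID_VALUE,
   SKETCH_INVALID_OPERATION
};

/* Check the stream index before the object state, so an out-of-range
 * stream reports INVALID_VALUE even if EndTransformFeedback was never
 * called on the object. */
enum sketch_error
validate_draw_stream(unsigned stream, unsigned max_vertex_streams,
                     bool end_ever_called)
{
   if (stream >= max_vertex_streams)
      return SKETCH_INVALID_VALUE;

   if (!end_ever_called)
      return SKETCH_INVALID_OPERATION;

   return SKETCH_NO_ERROR;
}
```

When both conditions fail at once, the caller sees INVALID_VALUE, which is the case the piglit test named above exercises.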
Eric Anholt
f87c700895 vc4: Add support for ARL and indirect register access on TGSI_FILE_CONSTANT.
Fixes 14 ARB_vp tests (which had no lowering done), and should improve
performance of indirect uniform array access in GLSL.
2014-10-28 17:16:05 -07:00
Eric Anholt
5539a5b685 vc4: Fix mixup of return type in reloc_tex().
2014-10-28 17:15:36 -07:00
Eric Anholt
926ab7dfa5 vc4: Drop redundant check for is_tmu_write().
This function is only called when it would return true.
2014-10-28 17:15:36 -07:00
Eric Anholt
8911879dec vc4: Don't forget to validate code that's got PROG_END on it.
This signal doesn't terminate the program now, it terminates the program
soon.  So you have to actually validate the code in the instruction.
2014-10-28 17:15:36 -07:00
Eric Anholt
fc1eb614a7 vc4: Add .dir-locals.el for kernel style in the kernel code.
2014-10-28 17:15:36 -07:00
Eric Anholt
6576dc1e92 vc4: Fix a couple missing '\n's in error output.
2014-10-28 17:15:36 -07:00
Brian Paul
6ad1c1eec1 st/mesa: use PIPE_BIND_DISPLAY_TARGET when checking for sRGB capability
When we're checking if the framebuffer is sRGB capable, call
is_format_supported() with the PIPE_BIND_DISPLAY_TARGET flag.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-10-28 18:07:54 -06:00
Marek Olšák
6fcb5520b7 Revert "st/mesa: set MaxUnrollIterations = 255"
This reverts commit 20836c8185.

255 is a huge number. If you have a loop with 255 iterations, unrolling it
will exceed the SM3 instruction limit. Let's use the default again.

The comment about a SM3 limit doesn't make sense. For SM3, we generally
want 32 (default) or a lower number due to the SM3 instruction limit, which
is 512 instructions. For SM4, we can try higher numbers if needed, but
some shaders can end up being pretty huge and shader compilation can take
more time.

This fixes a shader compile failure on R500/SM3. Reported on IRC.

Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-10-28 23:20:51 +01:00
David Heidelberger
b7186ebea9 r300g/vdpau: enable again
Signed-off-by: David Heidelberger <david.heidelberger@ixit.cz>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2014-10-28 23:20:51 +01:00
Marek Olšák
3fc499a1dd r300g: only set clip_halfz for chips with HW TCL
I forgot that we cannot emit vertex shader state on a chip without VS.
In such a case, clip_halfz is handled by the Draw module.
2014-10-28 23:20:45 +01:00
Marek Olšák
e05259b637 radeonsi: fix incorrect index buffer max size for lowered 8-bit indices
Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-28 23:20:45 +01:00
Marek Olšák
72424061e0 radeonsi: fix polygon mode for points and lines and point/line fill modes
Fixes piglit/polygon-mode-offset.

Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-28 23:20:45 +01:00
Marek Olšák
dab177ea99 r600g: fix polygon mode for points and lines and point/line fill modes
Fixes piglit/polygon-mode-offset.

Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-28 23:20:45 +01:00
Glenn Kennard
7b1c0cbc90 r600g: Implement sm5 UBO/sampler indexing
Caveat: Shaders using UBO/sampler indexing will
not be optimized by SB, due to SB not currently
supporting the necessary CF_INDEX_[01] index
registers.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
2014-10-28 23:20:45 +01:00
Glenn Kennard
444c8c2f28 r600g: Implement sm5 interpolation functions
Requires evergreen/cayman

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
2014-10-28 23:20:44 +01:00
Neil Roberts
3b83a5c35c docs: Update GL3.txt and relnotes for GL_KHR_context_flush_control
2014-10-28 16:51:12 +00:00
Neil Roberts
60ec95fa1e mesa: Add support for the GL_KHR_context_flush_control extension
The GL side of this extension just provides an accessor via glGetIntegerv for
the value of GL_CONTEXT_RELEASE_BEHAVIOR so it is trivial to implement. There
is a constant on the context for the value of the enum which is initialised to
GL_CONTEXT_RELEASE_BEHAVIOR_FLUSH. The extension is always enabled because it
doesn't need any driver interaction to retrieve the value.

If the value of the enum is anything but FLUSH then _mesa_make_current will
now refrain from calling _mesa_flush. This should only affect drivers that
explicitly change the enum to a non-default value.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-28 16:40:18 +00:00
Neil Roberts
1ecf6e1595 gles2: Update gl2ext.h to revision 28335
The main incentive to do this is to get the defines for the
GL_KHR_context_flush_control extension.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-10-28 16:40:18 +00:00
Jason Ekstrand
17d98ae254 i965/fs: Don't set dependency hints on instructions with spilled destinations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-27 17:54:10 -07:00
Jason Ekstrand
547a7fb458 i965/fs: Make scratch write instructions use the correct execution size
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-10-27 13:35:57 -07:00
Jason Ekstrand
9d1f72ebde i965/fs: Use correct spill offsets
Different platforms require the offset to be in different units.  However,
the generator fixes all of this up for us and only requires an offset in
bytes.  Previously, we were getting this wrong all over the place.  Some
computed/used it correctly as bytes while others treated the offset as
whole registers or computed it as bytes or bytes*2 in SIMD16 mode.  This
commit cleans all this up and makes us properly treat it as bytes
everywhere.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-10-27 13:35:57 -07:00
Jason Ekstrand
4242eb14c1 i965: Use the spill destination for the message header on GEN >= 7
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-10-27 13:35:57 -07:00
Jason Ekstrand
76bb695f09 i965/fs: Don't [un]spill multiple registers at a time in SIMD8 mode
I thought this would be a clever way to make spilling less expensive.
However, it appears that the oword read/write messages we are using for
spilling ignore the execution size and assume SIMD16 whenever working with
more than one register.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-10-27 13:35:57 -07:00
Jason Ekstrand
3a5df8b612 i965/fs: Use instruction execution sizes when generating scratch reads/writes
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-10-27 13:35:57 -07:00
Lionel Landwerlin
d175e7c16b egl/drm: do not crash when swapping buffers without any rendering
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-10-27 10:36:21 -07:00
Tobias Klausmann
1a170980a0 nv50: handle inverted render conditions
This enables ARB_conditional_render_inverted.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-26 07:33:16 -04:00
Rob Clark
13862812dc freedreno/ir3: consider instruction neighbors in cp
Fanin (merge) nodes require their srcs to be "adjacent" in consecutive
scalar registers.  Keep track of instruction neighbors in copy-
propagation step and avoid eliminating mov's which would cause an
instruction to need multiple distinct left and/or right neighbors.

This lets us not fall on our face when we encounter things like:

  1: MOV TEMP[2], IN[0].xyzw
  2: TEX OUT[0].xy, TEMP[2], SAMP[0], SHADOW2D
  3: MOV TEMP[2].xy, IN[0].yxzz
  4: TEX OUT[0].zw, TEMP[2], SAMP[0], SHADOW2D
  5: END

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-25 12:07:43 -04:00
Rob Clark
4dff2a6429 freedreno/ir3: always mov tex coords
Always insert extra mov's for the tex coord into the fanin.  This
simplifies things a bit, and avoids a scenario where multiple sam
instructions can have mutually exclusive inputs to their fanin, for
example:

  1: TEX OUT[0].xy, IN[0].xyxx, SAMP[0], 2D
  2: TEX OUT[0].zw, IN[0].yxxx, SAMP[0], 2D

The CP pass can always remove the mov's that are not actually needed,
so better to start out with too many mov's in the front end, than not
enough.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-25 12:07:34 -04:00
Rob Clark
33193540fc freedreno: rename a couple debug flags
dscis -> noscis
dbypass -> nobypass

a bit more consistent w/ nobin, etc.  And IMO a bit more sensible names.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-25 12:07:21 -04:00
Rob Clark
ded5013c4c freedreno/ir3: skip virtual outputs in standalone compiler
Kills get added to the outputs list, to ensure they get scheduled.  But
they aren't *really* outputs so skip them in the header comment block.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-25 10:25:15 -04:00
Mathias Fröhlich
a9c634dded glx: Fix make check.
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=85429.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2014-10-25 15:14:24 +02:00
Mathias Fröhlich
ce61559413 mesa: Add ARB_clip_control.xml to automake.
Adding this makes 'make check' catch failures introduced from
within ARB_clip_control.xml earlier.

Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
2014-10-25 15:14:24 +02:00
Rob Clark
d6252d0f63 freedreno/ir3: standalone compiler updates for ir3test
In order to test compiler changes more easily, spit out the assembled
shader with some header information so that we can know about
inputs/outputs more easily.

See: git://people.freedesktop.org/~robclark/ir3test

In ir3test we have a big collection of tgsi shaders and reference
ir3_compiler outputs.  When making compiler changes, regenerate the
compiler outputs and feed to ir3test to compare the new vs reference
shader.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-25 09:08:15 -04:00
Chia-I Wu
762c68b879 ilo: improve blob decoding
The last few dwords were skipped if the total number of dwords was not a
multiple of 4.  Change the formatting for better readability.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-10-25 14:28:08 +08:00
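
The bug described in 762c68b879, trailing dwords being skipped when the count is not a multiple of 4, can be sketched in C. This is not the ilo decoder: `dump_dwords` is a hypothetical function and the output layout is invented for the example. It prints four dwords per row and clamps the final row to whatever remains.

```c
#include <stdint.h>
#include <stdio.h>

/* Print dwords four per row, including a final partial row.
 * Returns the number of dwords printed. */
unsigned
dump_dwords(FILE *out, const uint32_t *dw, unsigned count)
{
   unsigned printed = 0;

   for (unsigned i = 0; i < count; i += 4) {
      /* The last row may hold fewer than four dwords. */
      unsigned row = count - i < 4 ? count - i : 4;

      fprintf(out, "%4u:", i);
      for (unsigned j = 0; j < row; j++) {
         fprintf(out, " %08x", (unsigned)dw[i + j]);
         printed++;
      }
      fputc('\n', out);
   }

   return printed;
}
```

A loop that instead iterated only over complete groups of four (for example `count / 4` rows) would drop the tail, which matches the symptom the message describes.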
Eric Anholt
08599f668c i965: Skip recalculating URB allocations if the entry size didn't change.
We only get here if the VS/GS compiled programs change, but we can even
skip it if the VS/GS size didn't change.

Affects cairo runtime on glamor by -1.26471% +/- 0.674335% (n=234)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-24 23:17:14 -07:00