Commit Graph

161 Commits

Author SHA1 Message Date
Eric Anholt
60177d5e2a glsl: Add an array splitting pass.
I've had this code laying around almost done for a long time.  The
idea is like opt_structure_splitting, that we've got a bunch of
transforms at the GLSL IR level that only understand scalars and
vectors, which just skip complicated dereferences.  While driver
backends may manage some optimization after they split matrices up
themselves, it would be better to bring all of our optimization to
bear on the problem.

While I wasn't expecting changes quite yet, a few programs end up
winning: a gstreamer convolution shader, and the Humus dynamic
branching demo:
Total instructions: 269430 -> 269342
3/2148 programs affected (0.1%)
1498 -> 1410 instructions in affected programs (5.9% reduction)
2012-04-11 18:08:21 -07:00
Kenneth Graunke
b2c0df2b60 glsl: Use (const char *) in AST nodes rather than plain (char *).
Nothing actually relied on them being mutable, and there was at least
one cast which discarded const qualifiers.  The next patch would have
introduced many more.

Casting away const qualifiers should be avoided if at all possible.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-04-09 14:30:34 -07:00
Eric Anholt
ac5a5b3243 glsl: Add support for parsing #version 140.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-03-15 08:33:54 -07:00
Eric Anholt
22d81f154f glsl: Save and restore the whole switch state for nesting.
This stuffs them all in a struct for sanity.  Fixes piglit
glsl-1.30/execution/switch/fs-uniform-nested.

NOTE: This is a candidate for the 8.0 branch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-02-03 11:06:50 +01:00
Eric Anholt
b9e27cc142 mesa: Add a flag for forcing all GLSL extensions to "warn".
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-01-30 11:41:49 -08:00
Ian Romanick
fa0a9ac5cd glsl: Track descriptions of some expressions that can't be l-values
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-01-06 14:32:50 -08:00
Marek Olšák
a92ee4abfe glsl: convervative_depth is not allowed in the vertex shader
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-11-22 20:56:50 +01:00
Marek Olšák
bbcb648bc2 mesa: rename the AMD_conservative_depth extension flag to ARB
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-11-22 20:56:50 +01:00
Dan McCabe
5c02e2e2de glsl: Generate IR for switch statements
Up until now modifying the GLSL compiler has been pretty straightforward.
This is where things get interesting. But still pretty straightforward.

Switch statements can be thought of a series of if/then/else statements.
Case labels are compared with the value of a test expression and the case
statements are executed if the comparison is true.

There are a couple of aspects of switch statements that complicate this simple
view of the world. The primary one is that cases can fall through sequentially
to subsequent case, unless a break statement is encountered, in which case,
the switch statement exits completely.

But break handling is further complicated by the fact that a break statement
can impact the exit of a loop. Thus, we need to coordinate break processing
between switch statements and loop statements.

The code generated by a switch statement maintains three temporary state
variables:
    int test_value;
    bool is_fallthru;
    bool is_break;

test_value is initialized to the value of the test expression at the head of
the switch statement. This is the value that case labels are compared against.

is_fallthru is used to sequentially fall through to subsequent cases and is
initialized to false. When a case label matches the test expression, this
state variable is set to true. It will also be forced to false if a break
statement has been encountered. This forcing to false on break MUST be
after every case test. In practice, we defer that forcing to immediately after
the last case comparison prior to executing a case statement, but that is
an optimization.

is_break is used to indicate that a break statement has been executed and is
initialized to false. When a break statement is encountered, it is set to true.
This state variable is then used to conditionally force is_fallthru to to false
to prevent subsequent case statements from executing.

Code generation for break statements depends on whether the break statement is
inside a switch statement or inside a loop statement. If it inside a loop
statement is inside a break statement, the same code as before gets generated.
But if a switch statement is inside a loop statement, code is emitted to set
the is_break state to true.

Just as ASTs for loop statements are managed in a stack-like
manner to handle nesting, we also add a bool to capture the innermost switch
or loop condition. Note that we still need to maintain a loop AST stack to
properly handle for-loop code generation on a continue statement. Technically,
we don't (yet) need a switch AST stack, but I am using one for orthogonality
with loop statements, in anticipation of future use. Note that a simple
boolean stack would have sufficed.

We will illustrate a switch statement with its analogous conditional code that
a switch statement corresponds to by examining an example.

Consider the following switch statement:
	switch (42) {
	case 0:
	case 1:
		gl_FragColor = vec4(1.0, 2.0, 3.0, 4.0);
	case 2:
	case 3:
		gl_FragColor = vec4(4.0, 3.0, 2.0, 1.0);
		break;
	case 4:
	default:
		gl_FragColor = vec4(0.0, 0.0, 0.0, 0.0);
	}

Note that case 0 and case 1 fall through to cases 2 and 3 if they occur.

Note that case 4 and the default case must be reached explicitly, since cases
2 and 3 break at the end of their case.

Finally, note that case 4 and the default case don't break but simply fall
through to the end of the switch.

For this code, the equivalent code can be expressed as:
	int test_val = 42; // capture value of test expression
	bool is_fallthru = false; // prevent initial fall through
	bool is_break = false; // capture the execution of a break stmt

	is_fallthru |= (test_val == 0); // enable fallthru on case 0
	is_fallthru |= (test_val == 1); // enable fallthru on case 1
	is_fallthru &= !is_break; // inhibit fallthru on previous break
	if (is_fallthru) {
		gl_FragColor = vec4(1.0, 2.0, 3.0, 4.0);
	}

	is_fallthru |= (test_val == 2); // enable fallthru on case 2
	is_fallthru |= (test_val == 3); // enable fallthru on case 3
	is_fallthru &= !is_break; // inhibit fallthru on previous break
	if (is_fallthru) {
		gl_FragColor = vec4(4.0, 3.0, 2.0, 1.0);
		is_break = true; // inhibit all subsequent fallthru for break
	}

	is_fallthru |= (test_val == 4); // enable fallthru on case 4
	is_fallthru = true; // enable fallthru for default case
	is_fallthru &= !is_break; // inhibit fallthru on previous break
	if (is_fallthru) {
		gl_FragColor = vec4(0.0, 0.0, 0.0, 0.0);
	}

The code generate for |= and &= uses the conditional assignment capabilities
of the IR.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-11-07 16:31:22 -08:00
Dan McCabe
85beb39e14 glsl: Reference data structure ctors in grammar
We now tie the grammar to the ctors of the ASTs they reference.

This requires that we actually have definitions of the ctors.

In addition, we also need to define "print" and "hir" methods for the AST
classes. The Print methods are pretty simple to flesh out. However, at this
stage of the development, we simply stub out the "hir" methods and flesh
them out later.

Also, since actual class instances get returned by the productions in the
grammar, we also need to designate the type of the productions that
reference those instances.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-11-07 16:31:22 -08:00
Chia-I Wu
2903816aad glsl: add support for GL_OES_EGL_image_external
This extension introduces a new sampler type: samplerExternalOES.
texture2D (and texture2DProj) can be used to do a texture look up in an
external texture.

Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Jakob Bornecrantz <jakob@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-11-03 15:09:44 +08:00
Ian Romanick
1d5d67f8ad glsl: Add uniform_locations_assigned parameter to do_dead_code opt pass
Setting this flag prevents declarations of uniforms from being removed
from the IR.  Since the IR is directly used by several API functions
that query uniforms in shaders, uniform declarations cannot be removed
after the locations have been set.  However, it should still be safe
to reorder the declarations (this is not tested).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41980
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Bryan Cain <bryancain3@gmail.com>
Cc: Vinson Lee <vlee@vmware.com>
Cc: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2011-10-25 17:51:43 -07:00
Kenneth Graunke
0fabf8e8dc glsl: Defer initialization of built-in functions until they're needed.
Very simple shaders don't actually use GLSL built-ins.  For example:
- gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
- gl_FragColor = vec4(0.0);
Both of the shaders used by _mesa_meta_glsl_Clear() also qualify.

By waiting to initialize the built-ins until the first time we need to
look for a signature, we can avoid the overhead entirely in these cases.

Makes piglit run roughly 18% faster (255 vs. 312 seconds).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2011-09-23 17:12:47 -07:00
Eric Anholt
60df737ad5 glsl: Don't do structure splitting until link time.
We were splitting on each side of an unlinked program, and the two
sides lost track of which variables they referenced, resulting in
assertion failure during validation.  Fixes piglit
link-struct-uniform-usage.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-09-08 20:20:49 -07:00
Kenneth Graunke
b9eb4d8a59 glsl: Implement the GL_ARB_conservative_depth extension.
It's the same as GL_AMD_conservative_depth.  The specs have slight
differences in wording, but don't differ in content or behavior.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2011-08-25 08:07:21 -07:00
Paul Berry
3097715d41 glsl: Rewrote _mesa_glsl_process_extension to use table-driven logic.
Instead of using a chain of manually maintained if/else blocks to
handle "#extension" directives, we now consult a table that specifies,
for each extension, the circumstances under which it is available, and
what flags in _mesa_glsl_parse_state need to be set in order to
activate it.

This makes it easier to add new GLSL extensions in the future, and
fixes the following bugs:

- Previously, _mesa_glsl_process_extension would sometimes set the
  "_enable" and "_warn" flags for an extension before checking whether
  the extension was supported by the driver; as a result, specifying
  "enable" behavior for an unsupported extension would sometimes cause
  front-end support for that extension to be switched on in spite of
  the fact that back-end support was not available, leading to strange
  failures, such as those in
  https://bugs.freedesktop.org/show_bug.cgi?id=38015.

- "#extension all: warn" and "#extension all: disable" had no effect.

Notes:

- All extensions are currently marked as unavailable in geometry
  shaders.  This should not have any adverse effects since geometry
  shaders aren't supported yet.  When we return to working on geometry
  shader support, we'll need to update the table for those extensions
  that are available in geometry shaders.

- Previous to this commit, if a shader mentioned
  ARB_shader_texture_lod, extension ARB_texture_rectangle would be
  automatically turned on in order to ensure that the types
  sampler2DRect and sampler2DRectShadow would be defined.  This was
  unnecessary, because (a) ARB_shader_texture_lod works perfectly well
  without those types provided that the builtin functions that
  reference them are not called, and (b) ARB_texture_rectangle is
  enabled by default in non-ES contexts anyway.  I eliminated this
  unnecessary behavior in order to make the behavior of all extensions
  consistent.

NOTE: This is a candidate for the 7.10 and 7.11 branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-06-28 14:00:20 -07:00
Paul Berry
26b566e19c AST dump: fixed printing of conditionals.
ast_expression::print() had an incorrect index into the subexpressions
array, so (a ? b : c) was being incorrectly rendered as (a ? b : b).

Signed-off-by: Brian Paul <brianp@vmware.com>
2011-06-03 11:07:00 -06:00
Kenneth Graunke
5a3a242a8f glsl: Add compiler support for ARB_shader_texture_lod.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marek Olšák <maraeo@gmail.com>
2011-05-09 11:23:54 -07:00
Marek Olšák
5ba2e7adf0 mesa: implement AMD_shader_stencil_export
It's just an alias of the ARB variant with some GLSL compiler changes.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-05-03 12:03:22 +02:00
Eric Anholt
6a35cbb656 glsl/opt_cpe: Reenable opt_copy_propagation_elements.cpp pass. 2011-04-13 10:51:03 -07:00
Ian Romanick
03eade164d glsl: Make GL_ARB_shader_stencil_export enable block be similar to other blocks
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-04-11 14:43:42 -07:00
Ian Romanick
f2bda1b566 glsl: Only let a shader enable GL_ARB_draw_instanced if the driver supports it
Also make the GL_ARB_draw_instanced block follow the same pattern as
the other blocks.

Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-04-11 14:43:41 -07:00
Kenneth Graunke
0a163cf56d glsl: Enable GL_OES_texture_3D extension for ES2. 2011-02-28 10:35:57 -08:00
Eric Anholt
60aab5f335 glsl: Disable the new copy propagation pass until it gets fixed.
It apparently regressed a bunch of ES2 cases.
2011-02-08 11:41:05 -08:00
Eric Anholt
e31266ed3e glsl: Add a new opt_copy_propagation variant that does it channel-wise.
This patch cleans up many of the extra copies in GLSL IR introduced by
i965's scalarizing passes.  It doesn't result in a statistically
significant performance difference on nexuiz high settings (n=3) or my
demo (n=10), due to brw_fs.cpp's register coalescing covering most of
those extra moves anyway.  However, it does make the debug of wine's
GLSL shaders much more tractable, and reduces instruction count of
glsl-fs-convolution-2 from 376 to 288.
2011-02-04 12:18:38 -06:00
Kenneth Graunke
a7d350790b glsl: Fix memory error when creating the supported version string.
Passing ralloc_vasprintf_append a 0-byte allocation doesn't work.  If
passed a non-NULL argument, ralloc calls strlen to find the end of the
string.  Since there's no terminating '\0', it runs off the end.

Fixes a crash introduced in 14880a510a.
2011-02-01 00:20:01 -08:00
Ian Romanick
14880a510a glsl: Reject shader versions not supported by the implementation
Previously we'd happily compile GLSL 1.30 shaders on any driver.  We'd
also happily compile GLSL 1.10 and 1.20 shaders in an ES2 context.
This has been a long standing FINISHME in the compiler.

NOTE: This is a candidate for the 7.9 and 7.10 branches
2011-01-31 15:32:56 -08:00
Kenneth Graunke
d3073f58c1 Convert everything from the talloc API to the ralloc API. 2011-01-31 10:17:09 -08:00
Chad Versace
8ba260e099 glsl: Enable AMD_conservative_depth in parser
All the necessary compiler infrastructure for AMD_conservative_depth is in
place, so it's safe to enable it in the parser.
2011-01-26 16:37:45 -08:00
Eric Anholt
58c988ada5 glsl: Skip the rest of loop unrolling if no loops were found.
Shaves 1.6% (+/- 1.0%) off of ff_fragment_shader glean texCombine time
(n=5).
2011-01-18 10:17:37 -08:00
Brian Paul
652901e95b Merge branch 'draw-instanced'
Conflicts:
	src/gallium/auxiliary/draw/draw_llvm.c
	src/gallium/drivers/llvmpipe/lp_state_fs.c
	src/glsl/ir_set_program_inouts.cpp
	src/mesa/tnl/t_vb_program.c
2011-01-15 10:24:08 -07:00
Brian Paul
7ce186358e glsl: add support for system values and GL_ARB_draw_instanced 2010-12-08 18:25:38 -07:00
Kenneth Graunke
9a1d063c6d glsl: Add an optimization pass to simplify discards.
NOTE: This is a candidate for the 7.9 branch.
2010-12-01 11:52:43 -08:00
Kenneth Graunke
63684a9ae7 glsl: Combine many instruction lowering passes into one.
This should save on the overhead of tree-walking and provide a
convenient place to add more instruction lowering in the future.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2010-11-19 15:56:28 -08:00
Kenneth Graunke
a75da2c0e8 glsl: Remove useless ir_shader enumeration value. 2010-10-20 15:07:47 -07:00
Kristian Høgsberg
f9995b3075 Drop GLcontext typedef and use struct gl_context instead 2010-10-13 09:43:25 -04:00
Dave Airlie
d9671863ea glsl: add support for shader stencil export
This adds proper support for the GL_ARB_shader_stencil_export extension
to the GLSL compiler. Thanks to Ian for pointing out where I need to add things.
2010-10-13 09:30:05 +10:00
Ian Romanick
7f68cbdc4d glsl: Add parser support for GL_ARB_explicit_attrib_location layouts
Only layout(location=#) is supported.  Setting the index requires GLSL
1.30 and GL_ARB_blend_func_extended.
2010-10-08 14:21:22 -07:00
Ian Romanick
e24d35a5b5 glsl: Wrap ast_type_qualifier contents in a struct in a union
This will ease adding non-bit fields in the near future.
2010-10-08 14:21:22 -07:00
Kenneth Graunke
ca92ae2699 glsl: Properly handle nested structure types.
Fixes piglit test CorrectFull.frag.
2010-09-18 11:21:34 +02:00
Ian Romanick
8f2214f489 glsl2: Add pass to remove redundant jumps 2010-09-13 14:25:26 -07:00
Luca Barbieri
3361cbac2a glsl: add continue/break/return unification/elimination pass (v2)
Changes in v2:
- Base class renamed to ir_control_flow_visitor
- Tried to comply with coding style

This is a new pass that supersedes ir_if_return and "lowers" jumps
to if/else structures.

Currently it causes no regressions on softpipe and nv40, but I'm not sure
whether the piglit glsl tests are thorough enough, so consider this
experimental.

It can be asked to:
1. Pull jumps out of ifs where possible
2. Remove all "continue"s, replacing them with an "execute flag"
3. Replace all "break" with a single conditional one at the end of the loop
4. Replace all "return"s with a single return at the end of the function,
   for the main function and/or other functions

This gives several great benefits:
1. All functions can be inlined after this pass
2. nv40 and other pre-DX10 chips without "continue" can be supported
3. nv30 and other pre-DX10 chips with no control flow at all are better supported

Note that for full effect we should also teach the unroller to unroll
loops with a fixed maximum number of iterations but with the canonical
conditional "break" that this pass will insert if asked to.

Continues are lowered by adding a per-loop "execute flag", initialized to
TRUE, that when cleared inhibits all execution until the end of the loop.

Breaks are lowered to continues, plus setting a "break flag" that is checked
at the end of the loop, and trigger the unique "break".

Returns are lowered to breaks/continues, plus adding a "return flag" that
causes loops to break again out of their enclosing loops until all the
loops are exited: then the "execute flag" logic will ignore everything
until the end of the function.

Note that "continue" and "return" can also be implemented by adding
a dummy loop and using break.
However, this is bad for hardware with limited nesting depth, and
prevents further optimization, and thus is not currently performed.
2010-09-13 13:03:09 -07:00
Luca Barbieri
e591c4625c glsl: add several EmitNo* options, and MaxUnrollIterations
This increases the chance that GLSL programs will actually work.

Note that continues and returns are not yet lowered, so linking
will just fail if not supported.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2010-09-08 20:36:37 -07:00
Chia-I Wu
dc754586ca glsl: Require a context in _mesa_glsl_parse_state.
Create a dummy context in the standalone compiler and pass it to
_mesa_glsl_parse_state.
2010-09-08 04:08:24 -07:00
Kenneth Graunke
719caa403e glsl: Accept language version 100 and make it the default on ES2. 2010-09-07 17:30:37 -07:00
Kenneth Graunke
814c89abdb glsl: Set default language version in mesa_glsl_parse_state constructor.
This should make it easier to change the default version based on the
API (say, version 1.00 for OpenGL ES).

Also, synchronize the symbol table's version with the parse state's
version just before doing AST-to-HIR.  This way, it will be set when
it matters, but the main initialization code doesn't have to care about
the symbol table.
2010-09-07 17:30:37 -07:00
Ian Romanick
de7c3fe31a glsl2: Add module to perform simple loop unrolling 2010-09-03 11:55:22 -07:00
Ian Romanick
8df2dbf91d glsl2: Perform initial bits of loop analysis during compilation 2010-09-03 11:55:21 -07:00
Chia-I Wu
bfd7c9ac22 glsl: Include main/core.h.
Make glsl include only main/core.h from core mesa.
2010-08-24 11:27:29 +08:00
Eric Anholt
b83846475b glsl2: Free the shader compiler at dri screen destruction.
Hooray, we can valgrind again without adding suppressions.  This also
adds an interface for use by an implementation of
glReleaseShaderCompiler().
2010-08-18 17:10:48 -07:00