184 Commits

Author SHA1 Message Date
Kenneth Graunke
30be2cc6c7 i965/fs: Implement texelFetch() on Ironlake and Sandybridge.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2011-09-19 15:30:54 -07:00
Marek Olšák
da7233840f ir_to_mesa: fix shadow2DArray comparison
The depth should be in W.

v2: adjust the assertion, add a comment
2011-09-10 08:53:29 +02:00
Bryan Cain
488fe51cf8 mesa: Replace the EmitNoIfs compiler flag with a MaxIfDepth flag.
This is a better, more fine-grained way of lowering if statements.  Fixes the
game And Yet It Moves on nv50.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-08-31 21:49:26 -05:00
Bryan Cain
478034f34a glsl: Use a separate div_to_mul_rcp lowering flag for integers.
Using multiply and reciprocal for integer division involves potentially
lossy floating point conversions.  This is okay for older GPUs that
represent integers as floating point, but undesirable for GPUs with
native integer division instructions.

TGSI, for example, has UDIV/IDIV instructions for integer division,
so it makes sense to handle this directly.  Likewise for i965.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Bryan Cain <bryancain3@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2011-08-31 12:02:18 -07:00
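The lossy conversion that 478034f34a describes can be illustrated with a small Python model (not Mesa code) of integer division done as a float32 multiply by a reciprocal. The helper names `f32` and `div_via_rcp` are hypothetical, and rounding through `struct` is only a stand-in for a GPU that represents integers as 32-bit floats.

```python
import struct

def f32(x):
    """Round a Python float to the nearest IEEE-754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def div_via_rcp(a, b):
    """Model a / b as a * (1 / b), rounding every step to float32."""
    rcp = f32(1.0 / f32(float(b)))
    return int(f32(f32(float(a)) * rcp))

# Small operands survive the round trip through floating point.
small = div_via_rcp(12, 4)

# 2**24 + 1 has no exact float32 representation, so the operand is already
# altered by the int-to-float conversion, before any arithmetic happens.
big = 2**24 + 1
lossy = div_via_rcp(big, 1)
exact = big // 1
```

Here `small` comes out as 3, while `lossy` differs from `exact`, which is the kind of error a native integer divide (such as TGSI's UDIV/IDIV) avoids.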
Kai Wasserbäch
79a486ead9 Change return type of try_emit_* methods to bool.
Ian Romanick explained (Message-Id: <4E528973.6080902@freedesktop.org>),
that the return type of non-API methods shouldn't use GLboolean but a
standard C++ bool.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Bryan Cain <bryancain3@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
2011-08-25 07:21:00 -07:00
Kenneth Graunke
ecf8963754 i965/fs: Implement textureSize (TXS) on Gen5+.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-08-23 11:18:25 -07:00
Kenneth Graunke
1e3bcbdf31 glsl: Add a new ir_txs (textureSize) opcode to ir_texture.
One unique aspect of TXS is that it doesn't have a coordinate.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2011-08-23 11:16:30 -07:00
Kenneth Graunke
07e9b9049f ir_to_mesa: Remove incorrect usage of the 'struct' keyword on classes.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2011-08-19 23:29:24 -07:00
Ian Romanick
ff2cfb8989 ir_to_mesa: Emit a MAD(b, -a, b) for !a && b
!a && b occurs frequently when nested if-statements have been
flattened.  It should also be possible to use a MAD for (a && b) || c,
though that would require a MAD_SAT.

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-08-16 14:09:44 -07:00
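As a quick check of the identity behind ff2cfb8989, the following Python sketch models MAD as x * y + z and evaluates MAD(b, -a, b) over the 0.0/1.0 logic values. The function names are hypothetical; this is a model of the arithmetic, not the ir_to_mesa code.

```python
def mad(x, y, z):
    """Multiply-add: x * y + z, modeled for a single component."""
    return x * y + z

def not_a_and_b(a, b):
    """!a && b on 0.0/1.0 logic values, written as MAD(b, -a, b) = b * (1 - a)."""
    return mad(b, -a, b)

# Evaluate all four input combinations.
truth_table = {
    (a, b): not_a_and_b(a, b)
    for a in (0.0, 1.0)
    for b in (0.0, 1.0)
}
```

Only the (a, b) = (0.0, 1.0) entry is 1.0, matching !a && b.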
Ian Romanick
ba01df11c4 ir_to_mesa: Implement ir_binop_all_equal using DP4 w/SGE
The operation ir_binop_all_equal is !(a.x != b.x || a.y != b.y || a.z
!= b.z || a.w != b.w).  Logical-or is implemented using addition
(followed by clamping to [0,1]) on values of 0.0 and 1.0.  Replacing
the logical-or operators with addition gives !bool(int(a.x != b.x) +
int(a.y != b.y) + int(a.z != b.z) + int(a.w != b.w)).  This can be
implemented using a dot-product with a vector of all 1.0.  After the
dot-product, the value will be an integer in the range [0,4].

Previously a SEQ instruction was used to clamp the resulting logic
value to [0,1] and invert the result.  Using an SGE instruction on the
negation of the dot-product result has the same effect.  Many older
shader architectures do not support the SEQ instruction.  It must be
emulated using two SGE instructions and a MUL.  On these
architectures, the single SGE saves two instructions.

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-08-16 14:09:43 -07:00
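The reasoning in ba01df11c4 can be sketched in Python as follows. This is a model under assumptions, not the Mesa implementation: the opcode helpers are hypothetical stand-ins, the SEQ emulation as SGE(x, y) * SGE(y, x) is one reading of "two SGE instructions and a MUL", and the "old" path compares the dot-product result against 0.0.

```python
ONES = [1.0, 1.0, 1.0, 1.0]

def sne(a, b):
    """Per-component set-on-not-equal: 1.0 where components differ."""
    return [1.0 if x != y else 0.0 for x, y in zip(a, b)]

def dp4(a, b):
    """Four-component dot product."""
    return sum(x * y for x, y in zip(a, b))

def sge(x, y):
    """Set-on-greater-or-equal: 1.0 if x >= y, else 0.0."""
    return 1.0 if x >= y else 0.0

def seq_emulated(x, y):
    """SEQ built from two SGE and a MUL (assumed emulation)."""
    return sge(x, y) * sge(y, x)

def all_equal_old(a, b):
    """Count differing components, then SEQ against 0.0."""
    return seq_emulated(dp4(sne(a, b), ONES), 0.0)

def all_equal_new(a, b):
    """Count differing components (0..4), then one SGE on the negation."""
    return sge(-dp4(sne(a, b), ONES), 0.0)

same = all_equal_new([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
diff = all_equal_new([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 0.0, 4.0])
```

Because the count is never negative, -count >= 0 holds only when the count is zero, so the single SGE both clamps and inverts.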
Ian Romanick
e7bf096e8b ir_to_mesa: Implement ir_binop_any_nequal using DP4 w/saturate or DP4 w/SLT
The operation ir_binop_any_nequal is (a.x != b.x) || (a.y != b.y) ||
(a.z != b.z) || (a.w != b.w), and that is the same as any(bvec4(a.x !=
b.x, a.y != b.y, a.z != b.z, a.w != b.w)).  Implement the any() part
the same way the regular ir_unop_any is implemented.

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-08-16 14:09:43 -07:00
Ian Romanick
92ca560d68 ir_to_mesa: Implement ir_unop_any using DP4 w/saturate or DP4 w/SLT
This is just like the ir_binop_logic_or case.  The operation
ir_unop_any is (a.x || a.y || a.z || a.w).  Logical-or is implemented
using addition (followed by clamping to [0,1]) on values of 0.0 and
1.0.  Replacing the logical-or operators with addition gives (a.x +
a.y + a.z + a.w).  This can be implemented using a dot-product with a
vector of all 1.0.

Previously a SNE instruction was used to clamp the resulting logic
value to [0,1].  In a fragment shader, using a saturate on the
dot-product has the same effect.  Adding the saturate to the
dot-product is free, so (at least) one instruction is saved.

In a vertex shader, using an SLT on the negation of the dot-product
result has the same effect.  Many older shader architectures do not
support the SNE instruction.  It must be emulated using two SLT
instructions and an ADD.  On these architectures, the single SLT saves
two instructions.

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-08-16 14:09:42 -07:00
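A minimal Python model of the two ir_unop_any variants in 92ca560d68, assuming a bvec4 is stored as four 0.0/1.0 floats. The helper names are hypothetical and the code only mirrors the arithmetic described in the message.

```python
ONES = [1.0, 1.0, 1.0, 1.0]

def dp4(a, b):
    """Four-component dot product."""
    return sum(x * y for x, y in zip(a, b))

def saturate(x):
    """Clamp to [0, 1], modeling the saturate modifier."""
    return min(max(x, 0.0), 1.0)

def slt(x, y):
    """Set-on-less-than: 1.0 if x < y, else 0.0."""
    return 1.0 if x < y else 0.0

def any_fragment(v):
    """Fragment-shader form: DP4 with saturate."""
    return saturate(dp4(v, ONES))

def any_vertex(v):
    """Vertex-shader form: DP4, then SLT on the negated result."""
    return slt(-dp4(v, ONES), 0.0)

none_set = [0.0, 0.0, 0.0, 0.0]
two_set = [0.0, 1.0, 1.0, 0.0]
results = [(any_fragment(v), any_vertex(v)) for v in (none_set, two_set)]
```

Both forms map a sum in [0,4] to 0.0 or 1.0, so `results` holds matching pairs for each input.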
Ian Romanick
7f4c65256c ir_to_mesa: Make ir_to_mesa_visitor::emit_dp return the instruction
Reviewed-by: Eric Anholt <eric@anholt.net>
2011-08-16 14:09:41 -07:00
Ian Romanick
41f8ffe5e0 ir_to_mesa: Implement ir_binop_logic_or using an add w/saturate or add w/SLT
Logical-or is implemented using addition (followed by clamping to
[0,1]) on values of 0.0 and 1.0.  Replacing the logical-or operators
with addition gives a + b which has a result on the range [0, 2].

Previously a SNE instruction was used to clamp the resulting logic
value to [0,1].  In a fragment shader, using a saturate on the add has
the same effect.  Adding the saturate to the add is free, so (at
least) one instruction is saved.

In a vertex shader, using an SLT on the negation of the add result has
the same effect.  Many older shader architectures do not support the
SNE instruction.  It must be emulated using two SLT instructions and
an ADD.  On these architectures, the single SLT saves two
instructions.

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-08-16 14:09:40 -07:00
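The instruction saving claimed in 41f8ffe5e0 can be seen in a short Python model. The emulation of SNE as SLT(x, y) + SLT(y, x) is an assumption about what "two SLT instructions and an ADD" refers to, and all function names are hypothetical.

```python
def slt(x, y):
    """Set-on-less-than: 1.0 if x < y, else 0.0."""
    return 1.0 if x < y else 0.0

def sne_emulated(x, y):
    """SNE built from two SLT and an ADD, for hardware without a native SNE."""
    return slt(x, y) + slt(y, x)

def logic_or_old(a, b):
    """ADD, then SNE against 0.0 (three instructions when SNE is emulated)."""
    return sne_emulated(a + b, 0.0)

def logic_or_fragment(a, b):
    """ADD with a saturate modifier: clamps the [0, 2] sum to [0, 1]."""
    return min(max(a + b, 0.0), 1.0)

def logic_or_vertex(a, b):
    """ADD, then a single SLT on the negated sum."""
    return slt(-(a + b), 0.0)

pairs = [(a, b) for a in (0.0, 1.0) for b in (0.0, 1.0)]
results = [
    (logic_or_old(a, b), logic_or_fragment(a, b), logic_or_vertex(a, b))
    for a, b in pairs
]
```

All three variants agree on every input pair; the vertex form replaces the emulated SNE sequence with one SLT.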
Ian Romanick
6ad08989d7 ir_to_mesa: Implement ir_unop_logic_not using 1-x
Since our logic values are 0.0 (false) and 1.0 (true), 1.0 - x
accurately implements logical not.

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-08-16 14:09:40 -07:00
Eric Anholt
4c7e215c7b ir_to_mesa: Replace open-coded swizzle_for_size()
2011-08-05 10:37:15 -07:00
Eric Anholt
62722d90af ir_to_mesa: Try to avoid emitting a MOV_SAT to saturate an expression tree.
Fixes a regression in codegen quality for ff_fragment_shader
conversion to GLSL -- glean texCombine produces 7.5% fewer Mesa IR
instructions.
2011-08-05 10:08:31 -07:00
Bryan Cain
4683529048 Merge branch 'glsl-to-tgsi'
Conflicts:
	src/mesa/state_tracker/st_atom_pixeltransfer.c
	src/mesa/state_tracker/st_program.c
2011-08-04 15:43:34 -05:00
Ian Romanick
322c3bf9dc ir_to_mesa: Emit warnings instead of errors for IR that can't be lowered
Rely on the driver to do the right thing.  This probably means falling
back to software.  Page 88 of the OpenGL 2.1 spec specifically says:

    "A shader should not fail to compile, and a program object should
    not fail to link due to lack of instruction space or lack of
    temporary variables. Implementations should ensure that all valid
    shaders and program objects may be successfully compiled, linked
    and executed."

There is no provision for saying "No" to a valid shader that is
difficult for the hardware to handle, so stop doing that.

On i915 this causes a large number of piglit tests to change from FAIL
to WARN.  The warning is because the driver still emits messages to
stderr like "i915_program_error: Unsupported opcode: BGNLOOP".

It also fixes ES2 conformance CorrectFull_frag and CorrectParse1_frag
on i915 (and probably other hardware that can't handle loops).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2011-08-02 08:23:15 -07:00
Ian Romanick
8aadd89d07 ir_to_mesa: Use linker_error instead of fail_link
The functions were almost identical.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2011-08-02 08:23:15 -07:00
Bryan Cain
6d89abadbc mesa: support boolean and integer-based parameters in prog_parameter
The functionality is not used by anything yet, and the glUniform functions will
need to be reworked before this can reach its full usefulness.  It is
nonetheless a step towards integer support in the state tracker and classic drivers.
2011-08-01 17:59:07 -05:00
Ian Romanick
f7cd9a858c ir_to_mesa: Copy reladdr in src_reg(dst_reg) constructor
Fixes i965 piglit:

    vs-temp-array-mat[234]-col-row-wr
    vs-temp-array-mat[234]-index-col-row-wr
    vs-temp-array-mat[234]-index-row-wr
    vs-temp-mat[234]-col-row-wr

Fixes swrast piglit:

    fs-temp-array-mat[234]-col-row-wr
    fs-temp-array-mat[234]-index-col-row-wr
    fs-temp-array-mat[234]-index-row-wr
    fs-temp-mat[234]-col-row-wr
    vs-temp-array-mat[234]-col-row-wr
    vs-temp-array-mat[234]-index-col-row-wr
    vs-temp-array-mat[234]-index-row-wr
    vs-temp-mat[234]-col-row-wr

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-23 01:24:18 -07:00
Ian Romanick
d6e1a8f714 ir_to_mesa: Add each relative address to the previous
This fixes many cases of accessing arrays of matrices using
non-constant indices at each level.

Fixes i965 piglit:

    vs-temp-array-mat[234]-index-col-rd
    vs-temp-array-mat[234]-index-col-row-rd
    vs-temp-array-mat[234]-index-col-wr
    vs-uniform-array-mat[234]-index-col-rd

Fixes swrast piglit:

    fs-temp-array-mat[234]-index-col-rd
    fs-temp-array-mat[234]-index-col-row-rd
    fs-temp-array-mat[234]-index-col-wr
    fs-uniform-array-mat[234]-index-col-rd
    fs-uniform-array-mat[234]-index-col-row-rd
    fs-varying-array-mat[234]-index-col-rd
    fs-varying-array-mat[234]-index-col-row-rd
    vs-temp-array-mat[234]-index-col-rd
    vs-temp-array-mat[234]-index-col-row-rd
    vs-temp-array-mat[234]-index-col-wr
    vs-uniform-array-mat[234]-index-col-rd
    vs-uniform-array-mat[234]-index-col-row-rd
    vs-varying-array-mat[234]-index-col-rd
    vs-varying-array-mat[234]-index-col-row-rd
    vs-varying-array-mat[234]-index-col-wr

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-23 01:24:18 -07:00
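The effect of d6e1a8f714 can be pictured with a toy register layout in Python. This is a sketch under stated assumptions, not the ir_to_mesa data structures: it assumes one register per matrix column and matrices packed contiguously, and the function names and parameters are hypothetical.

```python
def element_register(base, array_index, column, columns_per_matrix):
    """Register holding column `column` of matrix `array_index`.

    Each dereference level contributes a relative address, and the second
    level is added to the first rather than replacing it.
    """
    reladdr = array_index * columns_per_matrix  # first level: which matrix
    reladdr = reladdr + column                  # second level: which column
    return base + reladdr

def element_register_overwriting(base, array_index, column, columns_per_matrix):
    """Faulty variant for comparison: the second level discards the first."""
    reladdr = array_index * columns_per_matrix
    reladdr = column                            # earlier offset is lost
    return base + reladdr

# mats[2][1] in an array of mat4 starting at register 10.
good = element_register(10, 2, 1, 4)
bad = element_register_overwriting(10, 2, 1, 4)
```

With these inputs `good` is 19 and `bad` is 11, i.e. the overwriting variant reads a column of the first matrix instead of the third.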
Eric Anholt
a166720f2d ir_to_mesa: typo fix in a comment.
2011-07-11 08:58:37 -07:00
Ian Romanick
dbda466fc0 ir_to_mesa: Allocate temporary instructions on the visitor's ralloc context
And don't delete them.  Let ralloc clean them up.  Deleting the
temporary IR leaves dangling references in the prog_instruction.  That
results in a bad dereference when printing the IR with MESA_GLSL=dump.

NOTE: This is a candidate for the 7.10 and 7.11 branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38584
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-06 16:41:34 -07:00
Kenneth Graunke
006d5a1aa4 ir_to_mesa: "Support" u2f, i2u, and u2i operations by doing nothing.
Mesa IR actually stores all numbers as floating point, so this is
totally a farce, but we may as well keep it going.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2011-06-29 16:07:12 -07:00
Eric Anholt
9bd7e9c6b2 mesa: Include shader target in dumps of GLSL source.
This makes automatic parsing of MESA_GLSL=dump output easier.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-05-27 09:07:32 -07:00
Kenneth Graunke
68074387a4 ir_to_mesa: Emit TXD instruction.
Mesa already supports this because of NV_fragment_program.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marek Olšák <maraeo@gmail.com>
2011-05-09 11:23:54 -07:00
Marek Olšák
847d397b34 ir_to_mesa: remove set-but-unused variables
2011-05-01 14:02:36 +02:00
Brian Paul
847f991a87 ir_to_mesa: silence signed/unsigned comparison warnings
2011-04-11 21:29:06 -06:00
Kenneth Graunke
b4dfb7473e ir_to_mesa: Use gl_register_file enum type rather than 'int'.
src_reg already used this; make dst_reg use it too.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-04-05 15:31:58 -07:00
Kenneth Graunke
ce5d969adf ir_to_mesa: Unprefix ir_to_mesa_undef* and ir_to_mesa_address_reg.
Rename ir_to_mesa_undef to undef_src, for clarity.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-04-05 15:31:58 -07:00
Kenneth Graunke
5d9718f0db ir_to_mesa: Use emit overloads to avoid passing undef registers.
Makes the code just a little bit cleaner.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-04-05 15:31:58 -07:00
Kenneth Graunke
01e19fcf1f ir_to_mesa: Rename ir_to_mesa_emit_*_opX methods to emit_*.
There's really no need for a prefix on member functions, and overloading
takes care of the _op1/_op2 distinction quite nicely.  Eric already made
a similar change in the i965 FS backend.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-04-05 15:31:58 -07:00
Kenneth Graunke
cb21fa91b8 ir_to_mesa: Use constructors to convert between src_reg and dst_reg.
Rather than ir_to_mesa_dst_reg_from_src and ir_to_mesa_src_reg_from_dst.

The new constructors are marked 'explicit' so that the compiler can
catch cases where source and destination registers were accidentally
interchanged.

This also necessitated using constructors to initialize the undef and
address registers, as well as adding a default constructor.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-04-05 15:29:07 -07:00
Kenneth Graunke
fc6b4332c3 ir_to_mesa: Remove the "ir_to_mesa_" prefix on src_reg/dst_reg types.
Both classes are completely private to ir_to_mesa.cpp, so there won't be
any name conflicts with other parts of Mesa.  The prefix simply makes it
harder to read.

Also, use a class rather than typedef structs.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-04-05 15:29:07 -07:00
Kenneth Graunke
461273e910 ir_to_mesa: Rename src_reg and dst_reg variables to src and dst.
This is in preparation for removing the "ir_to_mesa_" prefix on the
src_reg and dst_reg types, which would cause a naming conflict.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-04-05 15:29:07 -07:00
Ian Romanick
9996a86085 ir_to_mesa: Handle shadow compare w/projection and LOD bias correctly
The code would previously handle the projection, then swizzle the
shadow comparator into place.  However, when the projection is done
"by hand," as in the TXB case, the unprojected shadow comparator would
over-write the projected shadow comparator.

Shadow comparison with projection and LOD is an extremely rare case in
real application code, so it shouldn't matter that we don't handle
that case with the greatest efficiency.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=32395
2011-04-05 10:57:27 -07:00
Ian Romanick
a99e80d795 mesa: Fix ugly indentation left from previous commit
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2011-03-29 13:21:09 -07:00
Ian Romanick
89d81ab16c glsl: Calculate Mesa state slots in front-end instead of back-end
This should be the last bit of infrastructure changes before
generating GLSL IR for assembly shaders.

This commit leaves some odd code formatting in ir_to_mesa and brw_fs.
This was done to minimize whitespace changes / reindentation in some
loops.  The following commit will restore formatting sanity.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2011-03-29 13:21:08 -07:00
Kenneth Graunke
5e9aa9926b mesa: Remove the CompileShader driver hook; it's just a no-op.
2011-03-17 14:48:54 -07:00
Marek Olšák
0f84ddad29 ir_to_mesa: do not check the number of uniforms against hw limits
The r300 compiler can eliminate unused uniforms and remap uniform locations
if their number surpasses hardware limits, so the limit is actually
NumParameters + NumUnusedParameters. This is important for some apps
under Wine to run.

Wine sometimes declares a uniform array of 256 vec4's and some Wine-specific
constants on top of that, so in total there are more uniforms than r300 can
handle. This was the main motivation for implementing the elimination
of unused constants.

We should allow drivers to implement fail & recovery paths where it makes
sense, so giving up too early, especially when it comes to uniforms, is not
such a good idea, though I agree there should be some hard limit for all drivers.

This patch fixes:
- glsl-fs-uniform-array-5
- glsl-vs-large-uniform-array
on drivers which can eliminate unused uniforms.
2011-03-14 03:12:34 +01:00
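The argument in 0f84ddad29 can be illustrated with a toy model of unused-uniform elimination in Python. This is not the r300 compiler: `HW_LIMIT`, the slot counts, and `remap_used_uniforms` are hypothetical values and names chosen only to mirror the Wine scenario described in the message.

```python
HW_LIMIT = 256  # hypothetical hardware limit, in vec4 slots

def remap_used_uniforms(used_slots):
    """Pack the used slots densely from 0, returning {old_slot: new_slot}."""
    return {old: new for new, old in enumerate(sorted(used_slots))}

# A 256-entry vec4 array plus 8 extra constants is declared...
declared = 256 + 8
# ...but the shader only reads the first 100 array entries and the extras.
used = set(range(0, 100)) | set(range(256, 264))

remap = remap_used_uniforms(used)
fits_before = declared <= HW_LIMIT    # check on declared uniforms
fits_after = len(remap) <= HW_LIMIT   # check after elimination and remapping
```

A front-end check on `declared` would reject this program even though the remapped set fits, which is why the message argues for leaving the decision to the driver.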
Brian Paul
d7db14ab7d mesa: test against MaxUniformComponents in check_resources()
Since we're compiling/linking GLSL shaders we should check against
the shader uniform limits, not the legacy vertex/fragment program
parameter limits which are usually lower.
2011-03-11 10:04:17 -07:00
Brian Paul
e0e94026a0 mesa: move location of some geometry program limits
The gl_program_constants struct is for limits that are applicable to
any/all shader stages.  Move the geometry shader-only fields into the
gl_constants struct.
Remove redundant MaxGeometryUniformComponents field too.
2011-03-11 09:25:22 -07:00
Brian Paul
8cc84b3e45 mesa: use check_resources() to check program against limits
Without these checks we could create shaders with more samplers or
constants than the driver could handle.  Fail linking rather than
dying later.
2011-03-11 09:25:22 -07:00
José Fonseca
8902c42db4 mesa: Do copy propagation across if-else-endif.
Addresses excessive TEMP allocation in vertex shaders where all CONSTs are
stored into TEMPS at the start, but copy propagation was failing due to
the presence of IFs.

We could do something about loops, but ifs are easy enough.
2011-02-17 15:29:30 +00:00
Ian Romanick
3803295fc2 ir_to_mesa: Don't dereference a NULL pointer during copy propagation
The ACP may already be NULL, so don't try to make it NULL again.

This should fix bugzilla #34119.
2011-02-11 15:44:19 -08:00
Eric Anholt
76857e8954 mesa: Fix the Mesa IR copy propagation to not read past writes to the reg.
Fixes glsl-vs-post-increment-01.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2011-02-08 11:42:35 -08:00
Kenneth Graunke
d3073f58c1 Convert everything from the talloc API to the ralloc API.
2011-01-31 10:17:09 -08:00
Brian Paul
7a4345fd83 glsl: use 'this' pointer to be consistent
2011-01-26 21:16:41 -07:00