This fixes many cases of accessing arrays of matrices using
non-constant indices at each level.
Fixes i965 piglit:
vs-temp-array-mat[234]-index-col-rd
vs-temp-array-mat[234]-index-col-row-rd
vs-temp-array-mat[234]-index-col-wr
vs-uniform-array-mat[234]-index-col-rd
Fixes swrast piglit:
fs-temp-array-mat[234]-index-col-rd
fs-temp-array-mat[234]-index-col-row-rd
fs-temp-array-mat[234]-index-col-wr
fs-uniform-array-mat[234]-index-col-rd
fs-uniform-array-mat[234]-index-col-row-rd
fs-varying-array-mat[234]-index-col-rd
fs-varying-array-mat[234]-index-col-row-rd
vs-temp-array-mat[234]-index-col-rd
vs-temp-array-mat[234]-index-col-row-rd
vs-temp-array-mat[234]-index-col-wr
vs-uniform-array-mat[234]-index-col-rd
vs-uniform-array-mat[234]-index-col-row-rd
vs-varying-array-mat[234]-index-col-rd
vs-varying-array-mat[234]-index-col-row-rd
vs-varying-array-mat[234]-index-col-wr
Reviewed-by: Eric Anholt <eric@anholt.net>
And don't delete them. Let ralloc clean them up. Deleting the
temporary IR leaves dangling references in the prog_instruction. That
results in a bad dereference when printing the IR with MESA_GLSL=dump.
NOTE: This is a candidate for the 7.10 and 7.11 branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38584
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Mesa IR actually stores all numbers as floating point, so this is
totally a farce, but we may as well keep it going.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Mesa already supports this because of NV_fragment_program.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marek Olšák <maraeo@gmail.com>
There's really no need for a prefix on member functions, and overloading
takes care of the _op1/_op2 distinction quite nicely. Eric already made
a similar change in the i965 FS backend.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Rather than ir_to_mesa_dst_reg_from_src and ir_to_mesa_src_reg_from_dst.
The new constructors are marked 'explicit' so that the compiler can
catch cases where source and destination registers were accidentally
interchanged.
This also necessitated using constructors to initialize the undef and
address registers, as well as adding a default constructor.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Both classes are completely private to ir_to_mesa.cpp, so there won't be
any name conflicts with other parts of Mesa. The prefix simply makes it
harder to read.
Also, use a class rather than typedef structs.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This is in preparation from removing the "ir_to_mesa_" prefix on the
src_reg and dst_reg types, which would cause a naming conflict.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The code would previously handle the projection, then swizzle the
shadow comparitor into place. However, when the projection is done
"by hand," as in the TXB case, the unprojected shadow comparitor would
over-write the projected shadow comparitor.
Shadow comparison with projection and LOD is an extremely rare case in
real application code, so it shouldn't matter that we don't handle
that case with the greatest efficiency.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=32395
This should be the last bit of infrastructure changes before
generating GLSL IR for assembly shaders.
This commit leaves some odd code formatting in ir_to_mesa and brw_fs.
This was done to minimize whitespace changes / reindentation in some
loops. The following commit will restore formatting sanity.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@intel.com>
The r300 compiler can eliminate unused uniforms and remap uniform locations
if their number surpasses hardware limits, so the limit is actually
NumParameters + NumUnusedParameters. This is important for some apps
under Wine to run.
Wine sometimes declares a uniform array of 256 vec4's and some Wine-specific
constants on top of that, so in total there is more uniforms than r300 can
handle. This was the main motivation for implementing the elimination
of unused constants.
We should allow drivers to implement fail & recovery paths where it makes
sense, so giving up too early especially when comes to uniforms is not
so good idea, though I agree there should be some hard limit for all drivers.
This patch fixes:
- glsl-fs-uniform-array-5
- glsl-vs-large-uniform-array
on drivers which can eliminate unused uniforms.
Since we're compiling/linking GLSL shaders we should check against
the shader uniform limits, not the legacy vertex/fragment program
parameter limits which are usually lower.
The gl_program_constants struct is for limits that are applicable to
any/all shader stages. Move the geometry shader-only fields into the
gl_constants struct.
Remove redundant MaxGeometryUniformComponents field too.
Addresses excessive TEMP allocation in vertex shaders where all CONSTs are
stored into TEMPS at the start, but copy propagation was failing due to
the presence of IFs.
We could do something about loops, but ifs are easy enough.
This catches more opportunities than the prog_optimize.c code on
openarena's fixed function shaders turned to GLSL, mostly due to
looking at multiple source instructions for copy propagation
opportunities. It should also be much more CPU efficient than
prog_optimize.c's code.
This adds a new optional max_depth parameter (defaulting to 0) to
lower_if_to_cond_assign, and makes the pass only flatten if-statements
nested deeper than that.
By default, all if-statements will be flattened, just like before.
This patch also renames do_if_to_cond_assign to lower_if_to_cond_assign,
to match the new naming conventions.
This is the same as what the array dereference handler does.
Fixes piglit test glsl-link-struct-array (bugzilla #31648).
NOTE: This is a candidate for the 7.9 and 7.10 branches.
This should allow lower_if_to_cond_assign to work in the presence of
discards, fixing bug #31690 and likely #31983.
NOTE: This is a candidate for the 7.9 branch.
This should save on the overhead of tree-walking and provide a
convenient place to add more instruction lowering in the future.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>