Commit Graph

78 Commits

Author SHA1 Message Date
Nicolai Hähnle
b71c415c3d glsl: split DIV_TO_MUL_RCP into single- and double-precision flags
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
2017-01-23 16:17:19 +01:00
Ian Romanick
7122d851aa glsl: Add a lowering pass for 64-bit integer modulus
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
82c31f3eb9 glsl: Add a lowering pass for 64-bit integer division
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
50d52df278 glsl: Add a lowering pass for 64-bit integer sign()
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
6c3af04363 glsl: Add a lowering pass for 64-bit integer multiplication
v2: Rename lower_64bit.cpp and lower_64bit_test.cpp to lower_int64.
Suggested by Matt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Marek Olšák
e33440070a glsl/lower_if: conditionally lower if-branches based on their size
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-15 20:23:39 +01:00
Marek Olšák
83d9b8a6f6 glsl/lower_if: don't lower branches touching tess control outputs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-11-15 20:23:35 +01:00
Ilia Mirkin
73f53c097a glsl: record number of components used in each slot for varying packing
Instead of packing varyings into vec4's, keep track of how many
components each slot uses and create varyings with matching types. This
ensures that we don't end up using more components than the orginal
shader, which is especially important for geometry shader output limits.

This comes up for NVIDIA hw, where the limit is 1024 output components
for a GS, and the hardware complains *loudly* if you even think about
going over.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-11-09 20:26:48 -05:00
Timothy Arceri
481d8ec291 glsl: use reproducible name for lowered const arrays
Otherwise we can end up with mismatching names between the cached
binary and the cached metadata.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-27 11:11:15 +10:00
Kenneth Graunke
8ab50f5dd1 glsl: Add a lowering pass to handle advanced blending modes.
Many GPUs cannot handle GL_KHR_blend_equation_advanced natively, and
need to emulate it in the pixel shader.  This lowering pass implements
all the necessary math for advanced blending.  It fetches the existing
framebuffer value using the MESA_shader_framebuffer_fetch built-in
variables, and the previous commit's state var uniform to select
which equation to use.

This is done at the GLSL IR level to make it easy for all drivers to
implement the GL_KHR_blend_equation_advanced extension and share code.

Drivers need to hook up MESA_shader_framebuffer_fetch functionality:
1. Hook up the fb_fetch_output variable
2. Implement BlendBarrier()

Then to get KHR_blend_equation_advanced, they simply need to:
3. Disable hardware blending based on ctx->Color._AdvancedBlendEnabled
4. Call this lowering pass.

Very little driver specific code should be required.

v2: Handle multiple output variables per render target (which may exist
    due to ARB_enhanced_layouts), and array variables (even with one
    render target, we might have out vec4 color[1]), and non-vec4
    variables (it's easier than finding spec text to justify not
    handling it).  Thanks to Francisco Jerez for the feedback.
v3: Lower main returns so that we have a single exit point where we
    can add our blending epilogue (caught by Francisco Jerez).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-08-25 19:22:10 -07:00
Ian Romanick
a2379e44aa glsl: Add lowering pass for ir_bin_imul_high
This isn't the lowering pass you want.  Most GPUs that can support GLSL
1.30 have a multiply unit that can do something more interesting than
32x32->32.  Many have 32x16->48.  Any GPU that does, should do the
lowering in the backend.  This is just the thing that will always work.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-07-19 12:19:29 -07:00
Ian Romanick
1b5477668a glsl: Add lowering pass for ir_unop_find_msb
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-07-19 12:19:29 -07:00
Ian Romanick
2a381a3c73 glsl: Add lowering pass for ir_unop_find_lsb
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-07-19 12:19:29 -07:00
Ian Romanick
ad9acb19c3 glsl: Add lowering pass for ir_unop_bitfield_reverse
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-07-19 12:19:28 -07:00
Ian Romanick
3079dcb00c glsl: Add lowering pass for ir_quadop_bitfield_insert
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-07-19 12:19:28 -07:00
Ian Romanick
4d6d219b58 glsl: Add lowering pass for ir_triop_bitfield_extract
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-07-19 12:19:28 -07:00
Ian Romanick
7340be8a01 glsl: Add lowering pass for ir_unop_bit_count
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-07-19 12:19:28 -07:00
Timothy Arceri
1fb8c6df88 glsl/mesa: split gl_shader in two
There are two distinctly different uses of this struct. The first
is to store GL shader objects. The second is to store information
about a shader stage thats been linked.

The two uses actually share few fields and there is clearly confusion
about their use. For example the linked shaders map one to one with
a program so can simply be destroyed along with the program. However
previously we were calling reference counting on the linked shaders.

We were also creating linked shaders with a name even though it
is always 0 and called the driver version of the _mesa_new_shader()
function unnecessarily for GL shader objects.

Acked-by: Iago Toral Quiroga <itoral@igalia.com>
2016-06-30 16:51:25 +10:00
Jason Ekstrand
27b9481d03 glsl: Add an option to clamp block indices when lowering UBO/SSBOs
This prevents array overflow when the block is actually an array of UBOs or
SSBOs.  On some hardware such as i965, such overflows can cause GPU hangs.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-23 19:12:34 -07:00
Dave Airlie
a08c4ebbe8 glsl: rewrite clip/cull distance lowering pass
The last version of this broke clipping, and I had to spend
sometime getting this working properly.

I had to introduce a third pass to count the clip/cull totals,
all due to one messy corner case. We have a piglit test
tes-input-gl_ClipDistance.shader_test
that doesn't actually output the clip distances, it just passes
them like a varying from TCS->TES, the older lowering pass worked
but to lower clip/cull we need to know the total number of clip+culls
used to defined the new variable correctly, and to offset culls
properly.

This adds an extra pass that works out the sizes for clip/cull,
then lowers gl_ClipDistance then gl_CullDistance into the new
gl_ClipDistanceMESA.

The pass checks using the fixed array sizes code if they array
has been referenced, or is actually never used, and ignores
it in the latter case.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-05-24 11:27:29 +10:00
Kenneth Graunke
329fe93210 glsl: Consolidate duplicate copies of constant folding.
We could probably clean this up more (maybe make it a method), but at
least there's only one copy of this code now, and that's a start.

No change in shader-db.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-05-15 23:59:20 -07:00
Dave Airlie
7a6d55826e Revert "glsl: Extend lowering pass for gl_ClipDistance to support other arrays (v4)"
This reverts commit ad355652c2.

This broke a bunch of clip tests.
2016-05-14 11:39:34 +10:00
Tobias Klausmann
ad355652c2 glsl: Extend lowering pass for gl_ClipDistance to support other arrays (v4)
This will come in handy when we want to lower gl_CullDistance into
gl_CullDistanceMESA.

[airlied: drop separate APIs for clip/cull - just use single API
to call both passes.]

v3: reexamine my sanity, this was pretty broken, the new code
creates one copy of gl_ClipDistanceMESA, as the clip distance
varying and lowers everything into that in two passes, one for clips
one for culls.
v4: rework using the passes in clip/cull sizes, instead of the
array sizes.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2016-05-14 08:28:07 +10:00
Jason Ekstrand
89b604922d glsl: Add a pass to propagate the "invariant" and "precise" qualifiers
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:28:06 -07:00
Timothy Arceri
d6b9202873 glsl: disable varying packing when its not safe
In GL 4.4+ there is no guarantee that interpolation qualifiers will
match between stages so we cannot safely pack varyings using the
current packing pass in Mesa.

We also disable packing on outerward facing interfaces for SSO
because in ES we need to retain the unpacked varying information
for draw time validation. For desktop GL we could allow packing for
SSO in versions < 4.4 but its just safer not to do so.

We do however enable packing on individual arrays, structs, and
matrices as these are required by the transform feedback code and it
is still safe to do so.

Finally we also enable packing when a varying is only used for
transform feedback and its not a SSO.

This fixes all remaining rendering issues with the dEQP SSO tests,
the only issues remaining with thoses tests are to do with validation.

Note: There is still one remaining SSO bug that this patch doesn't fix.
Their is a chance that VS -> TCS will have mismatching interfaces
because we pack VS output in case its used by transform feedback but
don't pack TCS input for performance reasons. This patch will make the
situation better but doesn't fix it.

V4: fix out of order function params after rebase, make sure packing
still disabled in tess stages. Update comments as to why we disable
packing on SSO.

V3: ES 3.1 *does* require interpolation to match so don't disable
packing there. Rebased on master rather than on enhanced layouts
component packing series.

V2: Make is_varying_packing_safe() a function in the varying_matches
class, fix spelling (Matt) and make sure to remove the outer array
when dealing with Geom and Tess shaders where appropriate.
Lastly fix piglit regression in new piglit test and document the
undefined behaviour it depends on:
arb_separate_shader_objects/execution/vs-gs-linking.shader_test

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-18 10:26:34 +11:00
Timothy Arceri
c0ae6eeb3b glsl: pass disable_varying_packing bool to the lowering pass
This will allow us to choose to ignore the disable which will be
useful for more fine grained control over when to enable or disable
packing.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-18 10:26:30 +11:00
Matt Turner
8709dc0713 glsl: Remove 2x16 half-precision pack/unpack opcodes.
i965/fs was the only consumer, and we're now doing the lowering in NIR.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-01 10:43:57 -08:00
Emil Velikov
eb63640c1d glsl: move to compiler/
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-01-26 16:08:33 +00:00