Commit Graph

16 Commits

Author SHA1 Message Date
Ian Romanick
912299cb39 glsl: Eliminate ir_assignment::condition
Reformatting is left for the next commit.

v2: Remove assignments from the contructors. :face_palm:

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Ian Romanick
fb630cd783 glsl: Make ir_assignment::condition private
And add get_condition().

This proof that nothing remains that could possibly set ::condition to
anything other than NULL.

v2: Fix bad rebase.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14573>
2022-02-11 17:25:33 +00:00
Caio Marcelo de Oliveira Filho
9fdded0cc3 src/compiler: use new hash table and set creation helpers
Replace calls to create hash tables and sets that use
_mesa_hash_pointer/_mesa_key_pointer_equal with the helpers
_mesa_pointer_hash_table_create() and _mesa_pointer_set_create().

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-01-14 10:49:28 -08:00
Eric Engestrom
e27902a261 util: use C99 declaration in the for-loop set_foreach() macro
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-25 12:43:18 +01:00
Caio Marcelo de Oliveira Filho
134b5a7047 glsl: teach copy_propagation_elements to deal with whole variables
Keep information in acp_entry whether the entry is full or not, and
use the ACP in more nodes when visiting the instructions:

- add_copy: write whole variables to the ACP state (regardless the
  type).

- visit(ir_dereference_variable *): perform the propagation here if we have a
  full candidate. Element-wise here doesn't apply because the mask
  isn't available at this point.

- visit_leave(ir_assignment *): process beyond scalar and vector, as
  the full variables might have other types.

Also import an improvement from opt_copy_propagation.cpp: if ir_call
is an intrinsic, we know the variables affected, so keep going.

v2: (all from Eric Anholt)
    Describe how acp_entry attributes are used.
    Don't do book-keeping to avoid adding repeated element to
    the dsts in write_elements().

v3: Use _mesa_set_remove_key. (Thomas Helland)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2018-07-27 10:51:25 -07:00
Caio Marcelo de Oliveira Filho
52d831ff83 glsl: remove delegating constructors to allow build with C++98
Delegating constructors is a C++11 feature, so this was breaking when
compiling with C++98. Change the copy_propagation_state() calls that
used the convenience constructor to use a static member function
instead.

Since copy_propagation_state is expected to be heap allocated, this
change is a good fit.

Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107305
2018-07-23 10:34:43 -07:00
Caio Marcelo de Oliveira Filho
507a8037a7 glsl: don't let an 'if' then-branch kill copy propagation (elements) for else-branch
When handling 'if' in copy propagation elements, if a certain variable
was killed when processing the first branch of the 'if', then the
second would get any propagation from previous nodes.

    x = y;
    if (...) {
        z = x;  // This would turn into z = y.
        x = 22; // x gets killed.
    } else {
        w = x;  // This would NOT turn into w = y.
    }

With the change, we let copy propagation happen independently in the
two branches and only then apply the killed values for the subsequent
code.

One example in shader-db part of shaders/unity/8.shader_test:

    (assign  (xyz) (var_ref col_1)  (var_ref tmpvar_8) )
    (if (expression bool < (swiz y (var_ref xlv_TEXCOORD0) )(constant float (0.000000)) ) (
      (assign  (xyz) (var_ref col_1)  (expression vec3 + (var_ref tmpvar_8) ... ) ... )
    )
    (
      (assign  (xyz) (var_ref col_1)  (expression vec3 lrp (var_ref col_1) ... ) ... )
    ))

The variable col_1 was replaced by tmpvar_8 in the then-part but not
in the else-part.

NIR deals well with copy propagation, so it already covered for the
missing ones that this patch fixes.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-19 10:00:59 -07:00
Caio Marcelo de Oliveira Filho
e4f32dec23 glsl: change opt_copy_propagation_elements data structures
Instead of keeping multiple acp_entries in lists, have a single
acp_entry per variable. With this, the implementation of clone is more
convenient and now fully implemented. In the previous code, clone was
only partial.

Before this patch, each acp_entry struct represented a write to a
variable including LHS, RHS and a mask of what channels were written
to. There were two main hash tables, the first (lhs_ht) stored a list
of acp_entries per LHS variable, with the values available to copy for
that variable; the second (rhs_ht) was a "reverse index" for the first
hash table, so stored acp_entries per RHS variable.

After the patch, there's a single acp_entry struct per LHS variable,
it contains an array with references to the RHS variables per
channel. There now is a single hash table, from LHS variable to the
corresponding entry. The "reverse index" is stored in the ACP entry,
in the form of a set of variables that copy from the LHS. To make the
clone operation cheaper, the ACP entries are created on demand.

This should not change the result of copy propagation, a later patch
will take advantage of the clone operation.

v2: Add note clarifying how the hashtable is destroyed.

v3: (all from Eric Anholt)
    Add remove_unused_var_from_dsts() function for reuse.
    Remove from dsts as we go instead of clearing at the end.
    Add clarifying comment to erase().

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-19 10:00:30 -07:00
Caio Marcelo de Oliveira Filho
7b0d395250 glsl: separate copy propagation state
Separate higher level logic of visiting instructions and chosing when
to store and use new copy data from the datastructure holding the copy
propagation information. This will also make easier later patches that
change the structure.

v2: Remove empty destructor and clarify how hash tables are destroyed.

Reviewed-by: Eric Anholt <eric@anholt.net>
2018-07-19 10:00:30 -07:00
Ian Romanick
37bd9ccd21 glsl: Don't copy propagate elements from SSBO or shared variables either
Since SSBOs can be written by a different GPU thread, copy propagating a
read can cause the value to magically change.  SSBO reads are also very
expensive, so doing it twice will be slower.

The same shader was helped by this patch and the previous.

Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14399119 -> 14399113 (<.01%)
instructions in affected programs: 683 -> 677 (-0.88%)
helped: 1
HURT: 0

total cycles in shared programs: 532973113 -> 532971865 (<.01%)
cycles in affected programs: 524666 -> 523418 (-0.24%)
helped: 1
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
2018-06-14 11:28:12 -07:00
Thomas Helland
5f129c05e6 glsl: Use hash table cloning in copy propagation
Walking the whole hash table, inserting entries by hashing them first
is just a really bad idea. We can simply memcpy the whole thing.

V2: Remove leftover creation of acp in two places

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-14 19:52:02 +01:00
Marek Olšák
b6f50e4640 glsl: use the linear allocator in opt_copy_propagation_elements
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-10-31 11:53:38 +01:00
Tapani Pälli
c64093e7d5 glsl: optimize copy_propagation_elements pass
Changes make copy_propagation_elements pass faster, reducing link
time spent in test case of bug 94477. Does not fix the actual issue
but brings down the total time. No regressions seen in CI.

v2 (idr): Formatting / whitespace fixes.  Embed the acp_ref in the
acp_entry.

v3 (idr): Delete unused copy constructor.  Use while(pop_head) instead
of foreach() { remove }.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-10-10 07:29:31 +03:00
Kenneth Graunke
13b859de04 glsl: Make opt_copy_propagation_elements actually propagate into loops.
We've had a FINISHME here since Eric originally wrote the code in 2011.
This patch implements his suggested approach, which makes us actually
able to copy propagate into the loops, at the unfortunate cost of making
this pass even more expensive.

The shader-db statistics are basically a wash:

   No change in instruction counts.

   total cycles in shared programs: 78685980 -> 78680730 (-0.01%)
   cycles in affected programs: 2102646 -> 2097396 (-0.25%)
   helped: 48
   HURT: 83

I figured if we're going to do this for one copy propagation pass,
we may as well do it in both.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-06-06 14:14:31 -07:00
Kenneth Graunke
95d622e16d glsl: Don't copy propagate or tree graft precise values.
This is kind of a hack.  We currently track precise requirements
by decorating ir_variables.  Propagating or grafting the RHS of an
assignment to a precise value into some other expression tree can
lose those decorations.

In the long run, it might be better to replace these ir_variable
decorations with an "exact" decoration on ir_expression nodes,
similar to what NIR does.

In the short run, this is probably good enough.  It preserves
enough information for glsl_to_nir to generate "exact" decorations,
and NIR will then handle optimizing these expressions reasonably.

Fixes ES31-CTS.gpu_shader5.precise_qualifier.

v2: Drop invariant handling, as it shouldn't be necessary (caught
    by Jason Ekstrand).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-12 15:57:48 -07:00
Emil Velikov
eb63640c1d glsl: move to compiler/
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-01-26 16:08:33 +00:00