Walking the whole hash table, inserting entries by hashing them first
is just a really bad idea. We can simply memcpy the whole thing.
V2: Remove leftover creation of acp in two places
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Changes make copy_propagation_elements pass faster, reducing link
time spent in test case of bug 94477. Does not fix the actual issue
but brings down the total time. No regressions seen in CI.
v2 (idr): Formatting / whitespace fixes. Embed the acp_ref in the
acp_entry.
v3 (idr): Delete unused copy constructor. Use while(pop_head) instead
of foreach() { remove }.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We've had a FINISHME here since Eric originally wrote the code in 2011.
This patch implements his suggested approach, which makes us actually
able to copy propagate into the loops, at the unfortunate cost of making
this pass even more expensive.
The shader-db statistics are basically a wash:
No change in instruction counts.
total cycles in shared programs: 78685980 -> 78680730 (-0.01%)
cycles in affected programs: 2102646 -> 2097396 (-0.25%)
helped: 48
HURT: 83
I figured if we're going to do this for one copy propagation pass,
we may as well do it in both.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This is kind of a hack. We currently track precise requirements
by decorating ir_variables. Propagating or grafting the RHS of an
assignment to a precise value into some other expression tree can
lose those decorations.
In the long run, it might be better to replace these ir_variable
decorations with an "exact" decoration on ir_expression nodes,
similar to what NIR does.
In the short run, this is probably good enough. It preserves
enough information for glsl_to_nir to generate "exact" decorations,
and NIR will then handle optimizing these expressions reasonably.
Fixes ES31-CTS.gpu_shader5.precise_qualifier.
v2: Drop invariant handling, as it shouldn't be necessary (caught
by Jason Ekstrand).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>