Previously, I tried implementing this in the i965 driver, but did so
in a way that violated the intent of the spec, and broke Tropics.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This will be convenient when I want to comment out optimization code
to see the raw program being optimized, but more importantly will let
the interference check be used during optimization.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
We could do more by handling abs/negate and non-GRF sources, but this is
a good start. Improves tropics performance 0.30% +/- .17% (n=43).
shader-db results:
Total instructions: 208032 -> 207184
60/1246 programs affected (4.8%)
23286 -> 22438 instructions in affected programs (3.6% reduction)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
When I had a bug causing the backend to never finish optimizing, it
also sent me deep into swap. This avoids extra memory allocation per
trip through optimization, and thus may reduce the peak memory
allocation of the driver even in the success case.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Total instructions: 18210 -> 17836
49/163 programs affected (30.1%)
12888 -> 12514 instructions in affected programs (2.9% reduction)
This reduces Lightsmark's "Scale down filter" shader from 395
instructions to 283, a whopping 28%. It also reduces register pressure
significantly: the SIMD8 program now uses 29 registers instead of 101,
giving us more than enough room for a SIMD16 program.
v2: Add && !inst->conditional_mod to the "skip some instructions" check.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This lets you omit some ampersands and is more idiomatic C++. Using
const also marks the function as not altering either register (which
was obvious, but nice to enforce).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This allows us to calculate the triangle's area using fixed point,
previously it was cacluated in floating point space. It was possible
that a triangle which had negative area in floating point space had
a positive area in fixed point space.
Fixes fdo 40920.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
As pointed out by Marek, if we have only one cb, we may as well add this
single register write here rather than adding it in the draw loop.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>