ir_to_mesa: Implement ir_binop_all_equal using DP4 w/SGE

The operation ir_binop_all_equal is !(a.x != b.x || a.y != b.y || a.z != b.z || a.w != b.w). Logical-or is implemented using addition (followed by clamping to [0,1]) on values of 0.0 and 1.0. Replacing the logical-or operators with addition gives !bool(int(a.x != b.x) + int(a.y != b.y) + int(a.z != b.z) + int(a.w != b.w)). This can be implemented using a dot-product with a vector of all 1.0.

After the dot-product, the value will be an integer on the range [0,4]. Previously a SEQ instruction was used to clamp the resulting logic value to [0,1] and invert the result. Using an SGE instruction on the negation of the dot-product result has the same effect.

Many older shader architectures do not support the SEQ instruction. It must be emulated using two SGE instructions and a MUL. On these architectures, the single SGE saves two instructions.

Reviewed-by: Eric Anholt <eric@anholt.net>
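
The instruction-count claim in the message can be illustrated with a small scalar model. This is only a sketch: `sge`, `seq`, `seq_emulated`, and `check_seq_emulation` are hypothetical helpers written for this note, not Mesa functions, and the emulation shown (the product of two opposite SGE comparisons) is one plausible way to build SEQ from two SGEs and a MUL, which may differ from what a given driver actually emits.

```cpp
#include <cassert>

// Simplified scalar models of the opcodes (assumption: real instructions
// operate per component on vec4 registers and write 1.0 or 0.0).
static float sge(float a, float b) { return a >= b ? 1.0f : 0.0f; }
static float seq(float a, float b) { return a == b ? 1.0f : 0.0f; }

// SEQ built from two SGEs and one MUL: a == b exactly when
// both a >= b and b >= a hold.
static float seq_emulated(float a, float b) { return sge(a, b) * sge(b, a); }

// Hypothetical helper: compare the emulation against the direct SEQ model
// on a handful of values, including the [0,4] range of the dot-product.
static void check_seq_emulation() {
    const float samples[] = {-4.0f, -1.0f, 0.0f, 1.0f, 2.0f, 4.0f};
    for (float a : samples)
        for (float b : samples)
            assert(seq_emulated(a, b) == seq(a, b));
}
```

Under this model, a SEQ against 0.0 costs three instructions on hardware without a native SEQ, while a single SGE on the negated dot-product costs one, consistent with the two-instruction saving stated above.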
@@ -1237,8 +1237,19 @@ ir_to_mesa_visitor::visit(ir_expression *ir)
           ir->operands[1]->type->is_vector()) {
          src_reg temp = get_temp(glsl_type::vec4_type);
          emit(ir, OPCODE_SNE, dst_reg(temp), op[0], op[1]);
+
+         /* After the dot-product, the value will be an integer on the
+          * range [0,4]. Zero becomes 1.0, and positive values become zero.
+          */
          emit_dp(ir, result_dst, temp, temp, vector_elements);
-         emit(ir, OPCODE_SEQ, result_dst, result_src, src_reg_for_float(0.0));
+
+         /* Negating the result of the dot-product gives values on the range
+          * [-4, 0]. Zero becomes 1.0, and negative values become zero. This
+          * achieved using SGE.
+          */
+         src_reg sge_src = result_src;
+         sge_src.negate = ~sge_src.negate;
+         emit(ir, OPCODE_SGE, result_dst, sge_src, src_reg_for_float(0.0));
       } else {
          emit(ir, OPCODE_SEQ, result_dst, op[0], op[1]);
       }
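
The equivalence the diff relies on can be checked with a small standalone model. This is a sketch under stated assumptions, not Mesa code: `Vec4`, `sne`, `seq`, `sge`, `mismatch_count`, `all_equal_seq`, and `all_equal_sge` are hypothetical names, and the opcodes are modelled as plain float functions that return 1.0 or 0.0. As in the diff, the dot-product is taken of the SNE result with itself, which for values of 0.0 and 1.0 gives the same sum as a dot-product with a vector of all 1.0.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;

// Simplified scalar models of the opcodes (assumption, not Mesa's code).
static float sne(float a, float b) { return a != b ? 1.0f : 0.0f; }
static float seq(float a, float b) { return a == b ? 1.0f : 0.0f; }
static float sge(float a, float b) { return a >= b ? 1.0f : 0.0f; }

// temp = SNE(a, b); result = DP4(temp, temp).
// The result counts mismatching components, so it lies in [0,4].
static float mismatch_count(const Vec4 &a, const Vec4 &b) {
    float dp = 0.0f;
    for (int i = 0; i < 4; ++i) {
        const float t = sne(a[i], b[i]);
        dp += t * t;
    }
    return dp;
}

// Old sequence: SEQ of the dot-product against 0.0.
static float all_equal_seq(const Vec4 &a, const Vec4 &b) {
    return seq(mismatch_count(a, b), 0.0f);
}

// New sequence: SGE of the negated dot-product against 0.0.
// Values in [-4,0] are >= 0.0 only when they are exactly zero.
static float all_equal_sge(const Vec4 &a, const Vec4 &b) {
    return sge(-mismatch_count(a, b), 0.0f);
}
```

Because the dot-product is never negative, its negation is never positive, so `>= 0.0` and `== 0.0` select the same single value. That property is what lets the SGE stand in for the SEQ here; it would not hold for an arbitrary operand.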