All of the GL image enums fit in 16-bits.
Also move the fields from the anonymous "image" structucture to the next
higher structure. This will enable packing the bits with the other
bitfield.
Valgrind massif results for a trimmed apitrace of dota2:
n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B)
Before (32-bit): 76 40,572,916,873 68,831,248 63,328,783 5,502,465 0
After (32-bit): 70 40,577,421,777 68,487,584 62,973,695 5,513,889 0
Before (64-bit): 60 36,822,640,058 96,526,824 88,735,296 7,791,528 0
After (64-bit): 74 37,124,603,758 95,891,808 88,466,712 7,425,096 0
A real savings of 346KiB on 32-bit and 262KiB on 64-bit.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Just use ir_variable::data.binding... because that's the where the
binding is stored for everything else that can use layout(binding=).
Valgrind massif results for a trimmed apitrace of dota2:
n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B)
Before (32-bit): 50 40,564,927,443 69,185,408 63,683,871 5,501,537 0
After (32-bit): 74 40,580,119,657 69,186,544 63,506,327 5,680,217 0
Before (64-bit): 59 36,822,048,449 96,526,888 89,113,000 7,413,888 0
After (64-bit): 89 36,822,971,897 96,526,616 88,735,296 7,791,320 0
A real savings of 173KiB on 32-bit and 368KiB on 64-bit.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Will be used to implement interpolateAt*() from ARB_gpu_shader5
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
This has the added perk that if you forget to set ir_type in the
constructor of a new subclass (or a new constructor of an existing
subclass) the compiler will tell you... instead of relying on
ir_validate or similar run-time detection.
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
This allows them to be moved to .rodata, and allow us to be sure that they
will not be modified.
Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
The i965 MUL instruction doesn't natively support 32-bit by 32-bit
integer multiplication; additional instructions (MACH/MOV) are required.
However, we can avoid those if we know one of the operands can be
represented in 16 bits or less. The vector backend's is_16bit_constant
static helper function checks for this.
We want to be able to use it in the scalar backend as well, which means
moving the function to a more generally-usable location. Since it isn't
i965 specific, I decided to make it an ir_constant method, in case it
ends up being useful to other people as well.
v2: Rename from is_16bit_integer_constant to is_uint16_constant, as
suggested by Ilia Mirkin. Update comments to clarify that it does
apply to both int and uint types, as long as the value is
non-negative and fits in 16-bits.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
To fix MSVC compile breakage. Evidently, _restrict is an MSVC keyword,
though the docs only mention __restrict (with two underscores).
Note: we may want to also rename _volatile to volatile_flag to be
consistent.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74900
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
No opaque types may be statically initialized in the shader, all
opaque variables must be declared uniform or be part of an "in"
function parameter declaration, no opaque types may be used as the
return type of a function.
v2: Add explicit check for opaque types in interface blocks. Check
for opaque types in ir_dereference::is_lvalue().
Reviewed-by: Paul Berry <stereotype441@gmail.com>
When handling function calls, we often want to walk through the list of
formal parameters and list of actual parameters at the same time.
(Both are guaranteed to be the same length.)
Previously, we used a pattern of:
exec_list_iterator 1st_iter = <1st list>.iterator();
foreach_iter(exec_list_iterator, 2nd_iter, <2nd list>) {
...
1st_iter.next();
}
This was awkward, since you had to manually iterate through one of
the two lists.
This patch introduces a foreach_two_lists macro which safely walks
through two lists at the same time, so you can simply do:
foreach_two_lists(1st_node, <1st list>, 2nd_node, <2nd list>) {
...
}
v2: Rename macro from foreach_list2 to foreach_two_lists, as suggested
by Ian Romanick.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
foreach_iter and exec_list_iterators have been deprecated for some time now;
we just hadn't ever bothered to convert code to the newer foreach_list
and foreach_list_safe macros.
In these cases, we aren't editing the list, so we can use foreach_list
rather than foreach_list_safe.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This patch creates a new generic is_value() method, which checks if an
ir_constant has a particular value. (For vectors, it must have the
single value repeated across all components.)
It then rewrites the is_zero/is_one/is_negative_one methods to use this
generic helper. All three were basically identical except for the value
they checked for. The other difference is that is_negative_one rejects
boolean types. The new is_value function maintains this behavior, only
allowing boolean types when checking for 0 or 1.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This patch moves following bitfields and variables to the data
structure:
explicit_location, explicit_index, explicit_binding, has_initializer,
is_unmatched_generic_inout, location_frac, from_named_ifc_block_nonarray,
from_named_ifc_block_array, depth_layout, location, index, binding,
max_array_access, atomic
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
This patch moves following bitfields in to the data structure:
used, assigned, how_declared, mode, interpolation,
origin_upper_left, pixel_center_integer
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Data section helps serialization and cloning of a ir_variable. This
patch includes the helper bits used for read only ir_variables.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Now that loop_controls no longer creates normatively bound loops,
there is no need for ir_loop::normative_bound or the
lower_bounded_loops pass.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This patch replaces the ir_loop fields "from", "to", "increment",
"counter", and "cmp" with a single integer ("normative_bound") that
serves the same purpose.
I've used the name "normative_bound" to emphasize the fact that the
back-end is required to emit code to prevent the loop from running
more than normative_bound times. (By contrast, an "informative" bound
would be a bound that is informational only).
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
There's no need to loop through the "parameters" list and remove every
element; move_nodes_to(¶meters) already throws away all elements of
the destination list.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
From section 7.1 (Built-In Language Variables) of the GLSL 4.10
spec:
Also, if a built-in interface block is redeclared, no member of
the built-in declaration can be redeclared outside the block
redeclaration.
We have been regarding this text as a clarification to the behaviour
established for gl_PerVertex by GLSL 1.50, so we apply it regardless
of GLSL version.
This patch enforces the rule by adding an enum to ir_variable to track
how the variable was declared: implicitly, normally, or in an
interface block.
Fixes piglit tests:
- gs-redeclares-pervertex-out-after-global-redeclaration.geom
- vs-redeclares-pervertex-out-after-global-redeclaration.vert
- gs-redeclares-pervertex-out-after-other-global-redeclaration.geom
- vs-redeclares-pervertex-out-after-other-global-redeclaration.vert
- gs-redeclares-pervertex-out-before-global-redeclaration
- vs-redeclares-pervertex-out-before-global-redeclaration
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
v2: Don't set "how_declared" redundantly in builtin_variables.cpp.
Properly clone "how_declared".
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
I made this a function (instead of a method of ir_variable) because it
made the change set smaller, and I expect that there will be an overload
that takes an ir_var_mode enum. Having both functions used the same way
seemed better.
v2: Add missing case for ir_var_system_value.
v3: Change the ir_var_mode_count case to just break. Move the assertion
and the return outside the switch-statment. In the unlikely event that
var->mode is an invalid value other than ir_var_mode_count, the
assertion will still fire, and in release builds we won't wind up
returning a garbage pointer. Suggested by Paul.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Fix the linker to deal with intrinsic functions which are undefined
all the way down to the driver back-end, and introduce intrinsic
definition helpers in the built-in generator.
We still need to figure out what kind of interface we want for drivers
to communicate to the GLSL front-end which of the supported intrinsics
should use a default GLSL implementation and which should use a
hardware-specific override. As there's no default GLSL implementation
for atomic ops, this seems like something we can worry about later on.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
v2: Define local helper function to generate ir_call nodes in the
builtin generator.
v2: Fix GLSL version in which the type became available. Add
contains_atomic() convenience method. Split off atomic counter
comparison error checking to a separate patch that will handle all
opaque types. Include new ir_variable fields for atomic types.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Ever since the addition of interface blocks with instance names, we have
had an implicit invariant:
var->type->is_interface() ==
(var->type == var->interface_type)
The odd use of == here is intentional because !var->type->is_interface()
implies var->type != var->interface_type.
Further, if var->type->is_array() is true, we have a related implicit
invariant:
var->type->fields.array->is_interface() ==
(var->type->fields.array == var->interface_type)
However, the ir_variable constructor doesn't maintain either invariant.
That seems kind of silly... and I tripped over it while writing some
other code. This patch makes the constructor do the right thing, and it
introduces some tests to verify that behavior.
v2: Add general-ir-test to .gitignore. Update the description of the
ir_variable invariant for arrays in the commit message. Both suggested
by Paul.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
For interface blocks that contain arrays, this field will contain the
maximum element of each contained array that is accessed by the
shader. This is a first step toward supporting unsized arrays in
interface blocks.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
These built-ins have two "out" parameters, which makes implementing them
efficiently with our current compiler infrastructure difficult. Instead,
implement them in terms of the existing ir_binop_mul IR (to return the
low 32-bits) and a new ir_binop_mul64 which returns the high 32-bits.
v2: Rename mul64 -> imul_high as suggested by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Calculates the carry out of the addition of two values and the
borrow from subtraction respectively. Will be used in uaddCarry() and
usubBorrow() built-in implementations.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Note the parameter name change in the int version of ir_constant, to
avoid the conflict with the loop iterator.
v2: Make analogous change to builtin_builder::imm().
Reviewed-by: Paul Berry <stereotype441@gmail.com>
It's a ?: that operates per-component on vectors. Will be used in
upcoming lowering pass for ldexp and the implementation of frexp.
csel(selector, a, b):
per-component result = selector ? a : b
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
builtin_info was originally going to be a structure containing a bunch
of information, but after various rewrites, it turned into a boolean
availability predicate.
builtin_avail is a better name than builtin_info, since it doesn't
store any information other than availability.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
We already have ir_expression constructors for unary and binary
operations, which automatically infer the type based on the opcode and
operand types.
These are convenient and also required for ir_builder support.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
We can simply call the stored predicate function. If state is NULL,
just report that the function is available.
v2: Add a comment (requested by Paul Berry).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
A signature is a built-in if and only if builtin_info != NULL, so we
don't actually need a separate flag bit. Making a boolean-valued
method allows existing code to ask the same question while not worrying
about the internal representation.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
For the upcoming built-in function rewrite, we'll need to be able to
answer "Is this built-in function signature available?".
This is actually a somewhat complex question, since it depends on the
language version, GLSL vs. GLSL ES, enabled extensions, and the current
shader stage.
Storing such a set of constraints in a structure would be painful, so
instead we store a function pointer. When creating a signature, we
simply point to a predicate that inspects _mesa_glsl_parse_state and
answers whether the signature is available in the current shader.
Unfortunately, IR reader doesn't actually know when built-in functions
are available, so this patch makes it lie and say that they're always
present. This allows us to hook up the new functionality; it just won't
be useful until real data is populated. In the meantime, the existing
profile mechanism ensures built-ins are available in the right places.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
This commit adds all of the parsing and semantics for GLSL 150 style
geometry shaders.
v2 (Paul Berry <stereotype441@gmail.com>): Add a few missing calls to
get_pipeline_stage(). Fix some signed/unsigned comparison warnings.
Fix handling of NULL consumer in assign_varying_locations().
v3 (Bryan Cain <bryancain3@gmail.com>): fix indexing order of 2D
arrays. Also, allow interpolation qualifiers in geometry shaders.
v4 (Paul Berry <stereotype441@gmail.com>): Eliminate
get_pipeline_stage()--it is no longer needed thanks to 030ca23 (mesa:
renumber shader indices according to their placement in pipeline).
Remove 2D stuff. Move vertices_per_prim() to ir.h, so that it will be
accessible from outside the linker. Remove
inject_num_vertices_visitor. Rework for GLSL 1.50.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
v5 (Paul Berry <stereotype441@gmail.com>): Split out
do_set_program_inouts() argument refactoring to a separate patch.
Move geom_array_resizing_visitor to later in the series.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The new opcode is used to generate a new vector with a single field from
the source vector replaced. This will eventually replace
ir_dereference_array of vectors in the LHS of assignments.
v2: Convert tabs to spaces. Suggested by Eric.
v3: Add constant expression handling for ir_triop_vector_insert. This
prevents the constant matrix inversion tests from regressing. Duh.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>