We are testing the behaviour of a tool for different input files, each
one using a different newline sequence ('\n' on UNIX, '\r\n' on
Windows, …).
Unfortunately, when opening a file in text mode, Python 3 by default
enables "universal newlines" mode, which means it replaces all the
known newline sequences with '\n'.
This (usually useful) behaviour breaks the tests, which are specifically
trying to handle files with newline sequences different from '\n'.
Disabling the universal newlines mode fixes the tests.
However, to keep the script compatible with both Python 2 and 3, we must
use the io.open() function instead of the open() builtin, as the latter
only knows about the `newline` argument on Python 3.
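A minimal sketch of the resulting call (the file name here is just an
example):

    import io

    # newline='' disables universal newlines, preserving the file's raw
    # '\r\n' or '\r' sequences; io.open() accepts this argument on both
    # Python 2 and 3.
    with io.open('crlf_input.txt', newline='') as f:
        content = f.read()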
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Python 3 does not automatically convert from bytes to unicode strings
like Python 2 used to do.
This commit makes sure we pass unicode strings to difflib.unified_diff,
so that the script works on both Python 2 and 3.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
v2: - explicitly decode the output of subprocesses
- handle bytes and string types consistently rather than relying on
python 2's coercion for bytes and ignoring them in python 3
v3: - explicitly encode as well as decode
    - use the `bytes` type available in both python 2.7 and 3.x
      instead of defining an alias
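For instance, a sketch of the resulting pattern (the tool name and
encoding are assumptions):

    import difflib
    import subprocess

    expected = u'line one\nline two\n'
    # check_output() returns bytes; decode explicitly so that
    # difflib.unified_diff always receives unicode strings.
    actual = subprocess.check_output(['some_tool']).decode('utf-8')
    for line in difflib.unified_diff(expected.splitlines(True),
                                     actual.splitlines(True)):
        print(line)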
Reviewed-by: Mathieu Bridon <bochecha@daitauha.fr>
The main purpose of having the NV_fragment_shader_interlock
extension is that it also applies to GLES 3.1, while the ARB
extension is for GL only.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Commit "nir: Use derefs in nir_lower_samplers" (75286c2d08) assumes
one deref for both the texture and the sampler. However, there are
cases (on OpenGL, using ARB_gl_spirv) where SPIR-V is not providing a
sampler, like for texture query levels ops. Although we could make
spirv_to_nir provide a sampler deref for those cases, it is not really
needed, and would be wrong from the Vulkan point of view.
This patch fixes the following (borrowed) tests, run in SPIR-V mode:
arb_compute_shader/execution/basic-texelFetch.shader_test
arb_gpu_shader5/execution/sampler_array_indexing/fs-simple-texture-size.shader_test
arb_texture_query_levels/execution/fs-baselevel.shader_test
arb_texture_query_levels/execution/fs-maxlevel.shader_test
arb_texture_query_levels/execution/fs-miptree.shader_test
arb_texture_query_levels/execution/fs-nomips.shader_test
arb_texture_query_levels/execution/vs-baselevel.shader_test
arb_texture_query_levels/execution/vs-maxlevel.shader_test
arb_texture_query_levels/execution/vs-miptree.shader_test
arb_texture_query_levels/execution/vs-nomips.shader_test
glsl-1.30/execution/fs-textureSize-compare.shader_test
v2: merge lower_tex_src_to_offset and calc_sampler_offsets together,
update texture/sampler index and texture_array_size directly on
lower_tex_src_to_offset (Jason)
v3: clarify one comment (Jason)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
So they are not exposed through the introspection API.
It is worth noting that the number of hidden uniforms from GLSL
linking vs SPIR-V linking would be somewhat different due to the
different order of the nir lowerings/optimizations.
For example, gl_FbWposYTransform, which is introduced as part of
nir_lower_wpos_ytransform. On GLSL that pass is executed after the
IR-based linking, which means that on GLSL the UniformStorage will not
include this uniform. With the SPIR-V linking, that uniform is already
present at link time, so it will be included in the UniformStorage,
but marked as hidden.
One alternative would be to create a special how_declared for that
case, but that seemed overkill. Using hidden should be fine as long as
it is used properly.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Now that all the build scripts are compatible with both Python 2 and 3,
we can flip the switch and tell Meson to use the latter.
Since Meson already depends on Python 3 anyway, this means we don't need
two different Python stacks to build Mesa.
Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Fixes new piglit test:
tests/spec/glsl-1.20/execution/qualifiers/vs-out-conversion-int-to-float-vec4-index.shader_test
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
v2: ignore names on purpose, for consistency with other places where
we are doing the same (Alejandro)
v3: changes proposed by Timothy Arceri, implemented by Alejandro Piñeiro:
* Remove redundant 'struct active_xfb_varying'
* Update several comments, including spec quotes if needed
* Rename struct 'active_xfb_varying_array' to 'active_xfb_varyings'
* Rename variable 'array' to 'active_varyings'
* Replace one if condition with an assert (<MAX_FEEDBACK_BUFFERS)
* Remove BufferMode initialization (was already done)
v4: simplify output pointer handling (Timothy)
Signed-off-by: Neil Roberts <nroberts@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
These are copied from the corresponding values in
ir_variable. The intention is to eventually use them in a pure-NIR
linker.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Now that the elements version handles both cases, remove the
non-elements version.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Keep information in acp_entry about whether the entry is full or not,
and use the ACP in more nodes when visiting the instructions:
- add_copy: write whole variables to the ACP state (regardless of the
  type).
- visit(ir_dereference_variable *): perform the propagation here if we
  have a full candidate. Element-wise propagation doesn't apply here
  because the mask isn't available at this point.
- visit_leave(ir_assignment *): process beyond scalar and vector, as
the full variables might have other types.
Also import an improvement from opt_copy_propagation.cpp: if ir_call
is an intrinsic, we know the variables affected, so keep going.
v2: (all from Eric Anholt)
Describe how acp_entry attributes are used.
    Don't do book-keeping to avoid adding repeated elements to
    the dsts in write_elements().
v3: Use _mesa_set_remove_key. (Thomas Helland)
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Python 2 has a range() function which returns a list, and an xrange()
one which returns an iterator.
Python 3 dropped the function returning a list, and renamed the
function returning an iterator to range().
As a result, using range() makes the scripts compatible with both Python
versions 2 and 3.
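For example, the same loop works on both versions (on Python 2 it
just builds a small list first):

    total = 0
    for i in range(10):
        total += i
    print(total)  # 45 on both Python 2 and 3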
Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
In Python 2, iterators had a .next() method.
In Python 3, instead they have a .__next__() method, which is
automatically called by the next() builtin.
In addition, it is better to use the iter() builtin to create an
iterator, rather than calling its __iter__() method.
These were also introduced in Python 2.6, so using them makes the
script compatible with both Python 2 and 3.
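A small illustration of the portable pattern:

    it = iter([1, 2, 3])
    # next() dispatches to it.next() on Python 2 and it.__next__() on
    # Python 3, so no version check is needed.
    first = next(it)   # 1
    second = next(it)  # 2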
Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Delegating constructors is a C++11 feature, so this was breaking when
compiling with C++98. Change the copy_propagation_state() calls that
used the convenience constructor to use a static member function
instead.
Since copy_propagation_state is expected to be heap allocated, this
change is a good fit.
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107305
When handling 'if' in copy propagation elements, if a certain variable
was killed when processing the first branch of the 'if', then the
second would not get any propagation from previous nodes.
    x = y;
    if (...) {
        z = x;   // This would turn into z = y.
        x = 22;  // x gets killed.
    } else {
        w = x;   // This would NOT turn into w = y.
    }
With the change, we let copy propagation happen independently in the
two branches and only then apply the killed values for the subsequent
code.
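A rough Python sketch of the new flow, modelling the ACP as a plain
dict (all names here are hypothetical, not the actual C++ API):

    # Model the ACP as a dict: variable -> variable it is a copy of.
    def propagate(acp, kills, block):
        for op, dst, src in block:
            if op == 'copy':                  # dst = src
                acp[dst] = acp.get(src, src)  # propagate into the RHS
            else:                             # any other write to dst
                acp.pop(dst, None)
            kills[dst] = True                 # dst changed in this block

    def visit_if(acp, then_block, else_block):
        kills = {}
        # Each branch starts from an independent clone of the current
        # state, so a kill in one branch cannot block the other branch.
        propagate(dict(acp), kills, then_block)
        propagate(dict(acp), kills, else_block)
        # Only afterwards apply the combined kills to the state used by
        # the code following the 'if'.
        for var in kills:
            acp.pop(var, None)

    acp = {}
    propagate(acp, {}, [('copy', 'x', 'y')])              # x = y
    visit_if(acp,
             [('copy', 'z', 'x'), ('write', 'x', None)],  # z = x; x = 22
             [('copy', 'w', 'x')])                        # w = x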
One example from shader-db, in shaders/unity/8.shader_test:
(assign (xyz) (var_ref col_1) (var_ref tmpvar_8) )
(if (expression bool < (swiz y (var_ref xlv_TEXCOORD0) )(constant float (0.000000)) ) (
(assign (xyz) (var_ref col_1) (expression vec3 + (var_ref tmpvar_8) ... ) ... )
)
(
(assign (xyz) (var_ref col_1) (expression vec3 lrp (var_ref col_1) ... ) ... )
))
The variable col_1 was replaced by tmpvar_8 in the then-part but not
in the else-part.
NIR deals well with copy propagation, so it already covered the
missing propagations that this patch fixes.
Reviewed-by: Eric Anholt <eric@anholt.net>
Instead of keeping multiple acp_entries in lists, have a single
acp_entry per variable. With this, the clone operation is more
convenient to implement and is now fully implemented. In the previous
code, clone was only partial.
Before this patch, each acp_entry struct represented a write to a
variable including LHS, RHS and a mask of what channels were written
to. There were two main hash tables, the first (lhs_ht) stored a list
of acp_entries per LHS variable, with the values available to copy for
that variable; the second (rhs_ht) was a "reverse index" for the first
hash table, so stored acp_entries per RHS variable.
After the patch, there's a single acp_entry struct per LHS variable,
which contains an array with references to the RHS variables per
channel. There is now a single hash table, from LHS variable to the
corresponding entry. The "reverse index" is stored in the ACP entry,
in the form of a set of variables that copy from the LHS. To make the
clone operation cheaper, the ACP entries are created on demand.
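In rough Python terms, the new layout looks like this (a hedged sketch
of the shape, not the actual C++ types):

    class ACPEntry(object):
        def __init__(self):
            # One RHS variable reference per channel of the LHS.
            self.rhs = [None, None, None, None]
            # Reverse index: variables whose entries copy from this one,
            # replacing the old second hash table.
            self.dsts = set()

    # The single hash table: LHS variable -> ACP entry, created on
    # demand so that cloning the state stays cheap.
    acp = {}

    def get_entry(var):
        if var not in acp:
            acp[var] = ACPEntry()
        return acp[var]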
This should not change the result of copy propagation; a later patch
will take advantage of the clone operation.
v2: Add note clarifying how the hashtable is destroyed.
v3: (all from Eric Anholt)
Add remove_unused_var_from_dsts() function for reuse.
Remove from dsts as we go instead of clearing at the end.
Add clarifying comment to erase().
Reviewed-by: Eric Anholt <eric@anholt.net>
Separate the higher-level logic of visiting instructions and choosing
when to store and use new copy data from the data structure holding
the copy propagation information. This will also make later patches
that change the structure easier.
v2: Remove empty destructor and clarify how hash tables are destroyed.
Reviewed-by: Eric Anholt <eric@anholt.net>
The "__inst" will contain the name used for the variable of type
"__type *". Parenthesis is not necessary as the name itself shouldn't
be an expression.
Fixes warning:
In file included from ../../src/mesa/main/mtypes.h:49,
from ../../src/intel/compiler/brw_compiler.h:30,
from ../../src/intel/compiler/brw_shader.h:29,
from ../../src/intel/compiler/brw_fs.h:31,
from ../../src/intel/compiler/brw_fs_cse.cpp:24:
../../src/intel/compiler/brw_fs_cse.cpp: In member function ‘bool fs_visitor::opt_cse_local(bblock_t*)’:
../../src/compiler/glsl/list.h:675:12: warning: unnecessary parentheses in declaration of ‘entry’ [-Wparentheses]
__type *(__inst); \
^
../../src/intel/compiler/brw_fs_cse.cpp:257:10: note: in expansion of macro ‘foreach_in_list_use_after’
foreach_in_list_use_after(aeb_entry, entry, &aeb) {
^~~~~~~~~~~~~~~~~~~~~~~~~
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
When handling loops in constant propagation, implement the "FINISHME"
comment the same way copy propagation does: perform a first pass to
find values that can't be propagated, then perform a second pass with
the ACP containing only the still-valid values.
Certain values are killed because the loop may run more than one
iteration, so we can't propagate them, as they would be invalid in
the later iterations.
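Sketched in Python, with the loop body reduced to a flat list of
writes (a simplification, not the real IR):

    def visit_loop(acp, body):
        # body: list of (dst, constant-or-None) writes.
        # First pass: any variable written inside the loop is killed,
        # because its old constant would be stale after iteration one.
        killed = set(dst for dst, _ in body)
        # Second pass: propagate using only the surviving constants.
        return dict((v, c) for v, c in acp.items() if v not in killed)

    acp = {'x': 1}
    print(visit_loop(acp, [('x', 22), ('y', None)]))  # {} -> x killed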
Reviewed-by: Eric Anholt <eric@anholt.net>
When handling 'if' in constant propagation, if a certain variable was
killed when processing the first branch of the 'if', then the second
would not get any propagation from previous nodes. This is similar to
the change done for the copy propagation code.
    x = 1;
    if (...) {
        z = x;   // This would turn into z = 1.
        x = 22;  // x gets killed.
    } else {
        w = x;   // This would NOT turn into w = 1.
    }
With the change, we let constant propagation happen independently in
the two branches and only then apply the killed values for the
subsequent code.
The new code uses a single hash table for keeping the kills of both
branches (the branches only write to it), and it gets deleted after
use, instead of waiting for mem_ctx to collect it.
NIR deals well with constant propagation, so it already covered the
missing propagations that this patch fixes.
Reviewed-by: Eric Anholt <eric@anholt.net>
The only value in kill_entry is the writemask, which can be stored in
the data pointer of the hash table entry.
Suggested by Eric Anholt.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Since 4654439fdd ("glsl: Use hash tables for
opt_constant_propagation() kill sets.") a hash_table is used for
storing kill_entries, so the structs can be simplified.
Remove the exec_node from kill_entry since it is not used in an
exec_list anymore.
Remove the 'var' from kill_entry since it is now redundant with the
key of the hash table.
Suggested by Eric Anholt.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
From the SPIR-V 1.0 specification, section 3.32.18, "Atomic
Instructions":
"OpAtomicIDecrement:
<skip>
The instruction's result is the Original Value."
However, we were implementing it, for uniform atomic counters, as a
pre-decrement operation, which is the one available in GLSL.
Renamed the former nir intrinsic 'atomic_counter_dec*' to
'atomic_counter_pre_dec*' for clarification purposes, as it implements
a pre-decrement operation as specified for GLSL. From GLSL 4.50 spec,
section 8.10, "Atomic Counter Functions":
"uint atomicCounterDecrement (atomic_uint c)
Atomically
1. decrements the counter for c, and
2. returns the value resulting from the decrement operation.
These two steps are done atomically with respect to the atomic
counter functions in this table."
Added a new nir intrinsic 'atomic_counter_post_dec*' which implements
a post-decrement operation as required by SPIR-V.
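The difference in the returned value, illustrated with plain Python
integers (ordinary arithmetic, not the actual atomic ops):

    counter = 5
    # GLSL atomicCounterDecrement: returns the value *after* the
    # decrement (pre-decrement semantics).
    counter -= 1
    glsl_result = counter      # 4

    counter = 5
    # SPIR-V OpAtomicIDecrement: returns the *original* value
    # (post-decrement semantics).
    spirv_result = counter     # 5
    counter -= 1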
v2: (Timothy Arceri)
* Add extra spec quotes on commit message
* Use "post" instead "pos" to avoid confusion with "position"
Signed-off-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This is mostly just a straightforward conversion of
link_assign_atomic_counter_resources to C directly using nir variables
instead of GLSL IR variables.
It is based on the version of link_assign_atomic_counter_resources in
6b8909f2d1. I’m noting this here to make it easier to track changes
and keep the NIR version up-to-date.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
ARB_gl_spirv points out that uniforms in general need an explicit
location. But there are still some cases of uniforms without a
location, like for example uniform atomic counters. Those don't have a
location from the OpenGL point of view (they are identified with a
binding and an offset), but Mesa internally assigns them a location.
Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Neil Roberts <nroberts@igalia.com>
v2: squash with another patch, minor variable name tweak (Timothy
Arceri)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This includes:
* Move the definition of empty_uniform_block to linker_util.h
* Move find_empty_block (with a rename) to linker_util.h
* Refactor some code from linker.cpp into a new method in linker_util.h
  (link_util_update_empty_uniform_locations)
So all that code could be used by the GLSL linker and the NIR linker
used for ARB_gl_spirv.
v2: include just "ir_uniform.h" (Timothy Arceri)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
The spec allows adding a scalar to a vector or matrix. In this case
the opt was losing swizzle and size information.
This fixes a bug with Doom (2016) shaders.
Fixes: 34ec1a24d6 ("glsl: Optimize (x + y cmp 0) into (x cmp -y).")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The always_active_io flag was only set according to the first variable
that got packed in, so NIR io compaction would end up compacting XFB
varyings that shouldn't move at that point.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>