Invalidated buffers don't have to go through it.
Split r600_init_resource into r600_init_resource_fields and
r600_alloc_resource.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Currently, due to the inverse order, strcmp will produce a negative result
when the needle is towards the start of the haystack. Thus on the next
iteration(s) we'll end up further towards the end and eventually fail to
locate the entry.
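As a minimal sketch of the invariant involved (the table contents and the
find_entry name are hypothetical, not the code touched by this patch): the
sign of the strcmp result and the half of the table that gets searched next
have to agree. With the needle as the first argument, a negative result
means the needle sorts before the middle entry, so the search moves towards
the start.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup table, sorted in ascending strcmp() order. */
static const char *const entries[] = {
    "glBegin", "glClear", "glEnd", "glFlush", "glVertex3f",
};

/* Returns the index of needle in a sorted table, or -1 if it is absent. */
static int find_entry(const char *const *table, size_t count,
                      const char *needle)
{
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        /* Needle first: negative means needle sorts before table[mid]. */
        int cmp = strcmp(needle, table[mid]);

        if (cmp == 0)
            return (int)mid;
        if (cmp < 0)
            hi = mid;      /* continue in the lower half */
        else
            lo = mid + 1;  /* continue in the upper half */
    }
    return -1;
}
```

Swapping the strcmp arguments without also swapping the two branches would
send every lookup into the wrong half of the table.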
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
We recently started to always emit the NDV (== dall) bit for quadops.
However it was folded into the wrong code word.
Fixes: e0a067ed48 (nv50/ir: always emit the NDV bit for OP_QUADOP)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
Unfortunately a3xx does not have a separate disable for depth clipping,
so when depth clamp is enabled, we disable the whole 3d clipper logic.
This in turn also gets rid of the xy clip that it would normally do.
When we detect this would happen, instead we integrate the viewport into
the window scissor. This may have slightly different behavior around
wide points, but it's unlikely that anything depends on this.
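Integrating the viewport into the window scissor amounts to a rectangle
intersection. A rough sketch, assuming the viewport has already been
converted to integer window coordinates (struct rect and rect_intersect are
illustrative names, not the driver's own):

```c
/* Hypothetical rectangle in window coordinates; max edges are exclusive. */
struct rect {
    int x0, y0, x1, y1;
};

static int imin(int a, int b) { return a < b ? a : b; }
static int imax(int a, int b) { return a > b ? a : b; }

/* Returns the overlap of a and b, or an empty rect if they do not overlap. */
static struct rect rect_intersect(struct rect a, struct rect b)
{
    struct rect r = {
        imax(a.x0, b.x0), imax(a.y0, b.y0),
        imin(a.x1, b.x1), imin(a.y1, b.y1),
    };

    /* Collapse inverted ranges so the result is empty, not negative. */
    if (r.x1 < r.x0)
        r.x1 = r.x0;
    if (r.y1 < r.y0)
        r.y1 = r.y0;
    return r;
}
```

The result would then be programmed as the window scissor in place of the
application's scissor alone.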
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97231
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
The hw clipper only handles up to 6 UCPs. If there are more than 6 UCPs,
or a clip vertex, or clip distances are in use, then we must use the
fallback discard-based clipping from the frag shader.
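The condition above can be sketched as a single predicate. The struct and
field names here are assumptions made for illustration; only the limit of
6 UCPs comes from the description.

```c
#include <stdbool.h>

#define HW_MAX_UCPS 6 /* hw clipper limit stated above */

/* Hypothetical summary of the clip-related state for one draw. */
struct clip_state {
    unsigned num_ucps;       /* enabled user clip planes */
    bool writes_clip_vertex; /* shader writes a clip vertex */
    bool uses_clip_distance; /* shader writes clip distances */
};

/* True when the hw clipper cannot be used and the frag shader must discard. */
static bool needs_frag_clip_fallback(const struct clip_state *s)
{
    return s->num_ucps > HW_MAX_UCPS ||
           s->writes_clip_vertex ||
           s->uses_clip_distance;
}
```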
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
This is the only remaining part of genX_l3.c and there's really no good
reason for it to be in its own file.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Now that we're using gen_l3_config.c, we no longer have one set of l3
config functions per gen and we can simplify a bit. Also, we know that
only compute uses SLM so we don't need to look for it in all of the stages.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
When Jordan first implemented L3$ configuration for Vulkan, he copied+pasted
from the GL driver because we had no good place to share it. Now that we
have src/intel/common, we should be sharing these tables.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Generated by:
sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.c
sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.h
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.c
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.cpp
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.h
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
In 144cbf8 ("nir: Make nir_opt_remove_phis see through moves."), Ken
made nir_opt_remove_phis able to coalesce phi nodes whose sources are
all moves with the same swizzle. However, he didn't add the logic
necessary for handling the fact that the phi may now have multiple
different sources, even though the sources point to the same thing. For
example, if we had something like:
   if (...)
      a1 = b.yx;
   else
      a2 = b.yx;
   a = phi(a1, a2)
   ... = a
then we would rewrite it to
   if (...)
      a1 = b.yx;
   else
      a2 = b.yx;
   ... = a1
by picking a random phi source, which in this case is invalid because
the source doesn't dominate the phi. Instead, we need to change it to:
   if (...)
      a1 = b.yx;
   else
      a2 = b.yx;
   a3 = b.yx;
   ... = a3;
Fixes 12 CTS tests:
ES31-CTS.functional.tessellation.invariance.outer_edge_symmetry.quads*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Curiously OES/EXT_tessellation_shader leave these out, while ES 3.2 adds
them in.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
I left this out of my previous commit that went around enabling all of
the other ES 3.2 entrypoints.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This is a newly added flag. We always pass false into it from
nv50_clear_texture, but other callers may want to respect the render
condition. (And the functions were originally spec'd to respect it.)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
When NIR was first introduced, Connor added this fake-edge hack to work
around issues related to unreachable blocks. Thanks to GLSL IR's jump
lowering code, the only unreachable code you can have is a block after an
infinite loop. With SPIR-V, we didn't have the jump lowering code so we
could also end up with the "if (...) { break; } else { continue; }" case
which generates an unreachable block after the if. Because of this, most
of NIR had to be fixed up for handling unreachable blocks. The only
remaining case of not handling unreachable blocks was specifically the
block-after-infinite-loop case in dead_cf which was fixed by the previous
commit. We can now delete the fake edge hack.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
When an application uses a ton of shaders, we need to evict them
when the code segment is full. This is not a good solution when
monster shaders are used, because code eviction will happen a lot.
To avoid this, dynamically resize the code segment area after each
eviction. The maximum size is arbitrarily fixed at 8MB, which should
be enough.
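A minimal sketch of the resize step follows. The 8MB cap is the one
mentioned above; doubling on every eviction is only an assumed growth
policy for illustration, and next_code_segment_size is a hypothetical
helper name.

```c
#include <stddef.h>

#define CODE_SEGMENT_MAX_SIZE ((size_t)8 << 20) /* 8MB cap */

/* Assumed policy: double the segment after an eviction, up to the cap. */
static size_t next_code_segment_size(size_t current)
{
    size_t next;

    if (current >= CODE_SEGMENT_MAX_SIZE)
        return CODE_SEGMENT_MAX_SIZE;

    next = current * 2;
    return next > CODE_SEGMENT_MAX_SIZE ? CODE_SEGMENT_MAX_SIZE : next;
}
```

Once the cap is reached, the size stops changing and evictions behave as
they did before.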
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
To keep the bins list from growing indefinitely when the code segment
size is bumped, we need to separate that bin from the SCREEN
one because it contains other resources like the uniform bo.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
This function will be helpful for resizing the code segment
area when we need to evict all shaders.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
This fixes a very old issue which happens when the code segment
is full. A number of real applications, like Tomb Raider,
F1 2015 and Elemental, hit that issue because they use a ton of shaders.
In this case, all shaders are evicted (to free space), but all
currently bound shaders also need to be re-uploaded and SP_START_ID
has to be updated accordingly.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
This refactoring will help with fixing the "out of code space"
eviction issue because we will need to reupload the code for
all currently bound shaders, which is slightly different from
uploading fresh code.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
If scissor X or Y was set to a negative value then the previous
code might have indicated noop scissors when the scissor range
actually was masking a portion of the framebuffer.
Since fb->_Xmin, _Xmax, _Ymin and _Ymax take scissors into
account, we can use these to test for a noop scissor.
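A rough sketch of the idea, using a hypothetical stand-in struct rather
than the real framebuffer type: once the bounds have been clamped against
the scissor, the scissor is a noop exactly when the bounds still cover the
whole framebuffer, whatever the raw scissor X and Y were.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the framebuffer fields named above. */
struct fb_bounds {
    int width, height;          /* full framebuffer size */
    int xmin, xmax, ymin, ymax; /* drawable bounds after scissoring */
};

static int clampi(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Intersect a scissor rectangle (x, y, w, h) with the framebuffer. */
static void apply_scissor(struct fb_bounds *fb, int x, int y, int w, int h)
{
    fb->xmin = clampi(x, 0, fb->width);
    fb->xmax = clampi(x + w, 0, fb->width);
    fb->ymin = clampi(y, 0, fb->height);
    fb->ymax = clampi(y + h, 0, fb->height);
}

/* The scissor is a noop only if the clamped bounds cover everything. */
static bool scissor_is_noop(const struct fb_bounds *fb)
{
    return fb->xmin == 0 && fb->ymin == 0 &&
           fb->xmax == fb->width && fb->ymax == fb->height;
}
```

For a 100x100 framebuffer, a scissor of (-10, 0, 50, 100) clamps to the
range 0..40 in X, so it is correctly reported as masking part of the
framebuffer.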
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Varying packing would like to mark certain variables as flat.
This works as long as both sides of the interface are changed
accordingly. However, with SSO, we disable varying packing on
the outermost stages. We also disable varying packing for
certain tessellation stages.
With SSO, we operate on the producer and consumer separately.
Checks based on the consumer stage and variable are risky, and
can easily lead to altering one half of the interface between
stages, breaking SSO pipeline IO validation.
Just stop monkeying around with interpolation modes unless
required for varying packing. There's no point. This also
disables it in unsafe SSO cases.
Fixes CTS tests:
*.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_MaxPatchVertices_Position_PointSize
Also fixes Piglit's spec/oes_geometry_shader/sso_validation:
- user-defined-gs-input-not-in-block.shader_test
- user-defined-gs-input-in-block.shader_test
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
We handled the unsized case, implicitly sizing arrays to the value
of gl_MaxPatchVertices. But if a size was present, we failed to
raise a compile error if it wasn't the value of gl_MaxPatchVertices.
Fixes CTS tests:
*.tessellation_shader.compilation_and_linking_errors.
{tc,te}_invalid_array_size_used_for_input_blocks
Piglit's tcs-input-read-nonconst-* tests have recently been fixed.
This patch will break older copies of those tests, but the latest
should continue working. Update to Piglit 75819c13af2ed5.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
When trying to get a device name for an fd using sysfs, it would always fail
as it was expecting key/value pairs to be delimited by '\0', which is not the
case.
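For illustration, sysfs uevent files separate their KEY=value entries with
newlines, so a lookup has to split on '\n'. The helper below is a
hypothetical sketch under that assumption, not the code from this patch.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Returns a malloc'd copy of the value stored under key in a
 * newline-delimited "KEY=value" buffer, or NULL if the key is absent.
 * The caller frees the result.
 */
static char *uevent_get_value(const char *buf, const char *key)
{
    size_t key_len = strlen(key);
    const char *line = buf;

    while (line && *line) {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);

        /* Match "key=" exactly, so DEVNAME does not match DEVNAMEX. */
        if (len > key_len && strncmp(line, key, key_len) == 0 &&
            line[key_len] == '=') {
            size_t val_len = len - key_len - 1;
            char *val = malloc(val_len + 1);

            if (!val)
                return NULL;
            memcpy(val, line + key_len + 1, val_len);
            val[val_len] = '\0';
            return val;
        }
        line = end ? end + 1 : NULL;
    }
    return NULL;
}
```

Splitting on '\0' instead would treat the whole file as a single entry, so
only a key on the first line could ever match.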
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
The device name is only needed for WL_bind_wayland_display so make this clear
by only storing the device name when Wayland support is built.
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>