We don't import textures with DCC now, but soon we will.
v2: if we can't disable DCC for image writes, at least decompress DCC
at bind time
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Could cause issues if you tried to read from an uninitialised pointer.
This just initalises the pointer to null to avoid that being a problem.
Discovered by Coverity.
CID: 1343616
Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
We've had a FINISHME here since Eric originally wrote the code in 2011.
This patch implements his suggested approach, which makes us actually
able to copy propagate into the loops, at the unfortunate cost of making
this pass even more expensive.
The shader-db statistics are basically a wash:
No change in instruction counts.
total cycles in shared programs: 78685980 -> 78680730 (-0.01%)
cycles in affected programs: 2102646 -> 2097396 (-0.25%)
helped: 48
HURT: 83
I figured if we're going to do this for one copy propagation pass,
we may as well do it in both.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
We've had a FINISHME here since Eric originally wrote the code in 2010.
This patch implements his suggested approach, which makes us actually
able to copy propagate into the loops, at the unfortunate cost of making
this pass even more expensive.
The shader-db statistics are not terribly impressive:
total instructions in shared programs: 9008589 -> 9008613 (0.00%)
instructions in affected programs: 4293 -> 4317 (0.56%)
helped: 0
HURT: 10
total cycles in shared programs: 78550978 -> 78575760 (0.03%)
cycles in affected programs: 655426 -> 680208 (3.78%)
helped: 75
HURT: 88
GAINED: 2
Most of the "regressions" appear to be us successfully copy propagating
uniforms, which i965 is loading as pull constants instead of push, so we
occasionally have two pulls instead of one. That doesn't seem like this
pass's job - it's propagating correctly, and we should be smarter about
pull loads in the backend.
This patch is also useful for a couple of reasons:
1. It can clean up copies created by varying packing (previously, we
couldn't if the uses were inside a loop).
This fixes a bug when interpolateAt*() is used on a packed varying
inside a loop: glsl_to_nir struggles to see through the extra copy
and mistakenly believed the variable was not an input.
2. It will help propagate uniform array access created by
lower_const_array_to_uniforms().
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Like floats, we should use the round toward 0 mode instead of the
nearest one (which is the default) for doubles to integers.
This fixes all arb_gpu_shader_fp64 piglits which convert doubles to
integers (16 tests).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
The GL4.5 spec quote seems clear on this:
"The value -1 will be returned by either command if an error occurs,
if name does not identify an active variable on programInterface,
or if name identifies an active variable that does not have a valid
location assigned, as described above."
This fixes:
GL45-CTS.program_interface_query.output-built-in
[airlied: use _mesa_program_resource_location_index as
suggested by Eduardo]
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cast the unsigned semantic index to integer datatype before comparing
to max_generic, otherwise, max_generic which is initialized to -1
will be converted to unsigned int before the comparison, causing a wrong
semantic index to be assigned to a shader output.
Fixes the assert running TurboCAD_gl.trace. (VMware bug 1667265)
Also tested with glretrace, mesa demos pointblast, spriteblast and pointcoord.
v2: use the original max_generic variable but add the (int) cast
to the semantic index, as suggested by Brian.
Reviewed-by: Brian Paul <brianp@vmware.com>
When TGSI debug flag is enabled, print the shader linkage info as well.
Tested with mesa demos with SVGA_DEBUG=tgsi
Reviewed-by: Brian Paul <brianp@vmware.com>
ARB_shader_image_load_store only requires a very fixed list of formats
to be supported, while textures may be in all kinds of formats, like
BGRA which are presently not supported on at least Kepler.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Switches to using truncf in micro_trunc.
Fixes the following piglit tests (for softpipe):
/spec/glsl-1.30/execution/built-in-functions/...
fs-trunc-float
fs-trunc-vec2
fs-trunc-vec3
fs-trunc-vec4
vs-trunc-float
vs-trunc-vec2
vs-trunc-vec3
vs-trunc-vec4
/spec/glsl-1.50/execution/built-in-functions/...
gs-trunc-float
gs-trunc-vec2
gs-trunc-vec3
gs-trunc-vec4
Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
When we are not packing a double input varying, we might need to
read its data in a non-aligned to 64-bit offset, so we read
the wrong data. This is happening when using explicit locations
in varyings because Mesa disables packing varying for that case.
const_index is in 32-bit size units but offset() is multiplying
it by destination type size units. When operating with double
input varyings, const_index value could be not aligned to 64 bits.
To fix it, we load the double vector as if it was a float based vector
with twice the number of components.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Data starts at suboffet 3 in 32-bit units (12 bytes), so it is not
64-bit aligned and the current implementation fails to read the data
properly. Instead, when there is is a double input varying, read it as
vector of floats with twice the number of components.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
From GLSL 4.5 spec, "4.4.2.3 Geometry Outputs".
"all geometry shader output vertex count declarations in a
program must declare the same count."
Fixes:
GL45-CTS.geometry_shader.output.conflicted_output_vertices_max
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Although the glsl_types.h stores this in a bitfield,
we should hide that from everyone else. Hide the cast
in an accessor method and use the enum everywhere.
This makes things a bit nicer in gdb, and improves type
safety.
v2: fix a few pieces of interface I missed that caused some
piglit regressions.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
For 3D textures we shouldn't be using NumLayers, we need
to get it from the depth.
This fixes:
GL45-CTS.geometry_shader.layered_framebuffer.clear_call_support
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
With tessellation shaders we can have cases where we have
arrays of anon structs, so make sure we match using without_array().
Fixes:
GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_in
v2:
test lengths match as well (Ilia)
v3:
descend array lengths to check for matches as well (Ilia)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This fixes a crash in
GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types
If we can't find the func_name in one of these paths,
we have emitted an earlier error so just return here.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
GL43-CTS.compute_shader.work-group-size does
uniform uint g_uniform[gl_WorkGroupSize.z + 20] = { 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24 };
The initializer triggers the GLSL 4.30/GLES3 tests
for constant sequence subexpressions, so it doesn't
happen unless you are using those, so just return
false as this path is now reachable.
v2: update commit msg with diagnosis
Acked-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
PATH_MAX is apparently not a thing on Windows. Borrow the hack from
pipe_loader.c to try and make this work.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
This writes linked shader programs to .shader_test files to
$MESA_SHADER_CAPTURE_PATH in the format used by shader-db
(http://cgit.freedesktop.org/mesa/shader-db).
It supports both GLSL shaders and ARB programs. All stages that
are linked together are written in a single .shader_test file.
This eliminates the need for shader-db's split-to-files.py, as Mesa
produces the desired format directly. It's much more reliable than
parsing stdout/stderr, as those may contain extraneous messages, or
simply be closed by the application and unavailable.
We have many similar features already, but this is a bit different:
- MESA_GLSL=dump writes to stdout, not files.
- MESA_GLSL=log writes each stage to separate files (rather than
all linked shaders in one file), at draw time (not link time),
with uniform data and state flag info.
- Tapani's shader replacement mechanism (MESA_SHADER_DUMP_PATH and
MESA_SHADER_READ_PATH) also uses separate files per shader stage,
but allows reading in files to replace an app's shader code.
v2: Dump ARB programs too, not just GLSL.
v3: Don't dump bogus 0.shader_test file.
v4: Add "GL_ARB_separate_shader_objects" to the [require] block.
v5: Print "GLSL 4.00" instead of "GLSL 4.0" in the [require] block.
v6: Don't hardcode /tmp/mesa.
v7: Fix memoization of getenv().
v8: Also print "SSO ENABLED" (suggested by Timothy).
v9: Also handle ES shaders (suggested by Ilia).
v10: Guard against MESA_SHADER_CAPTURE_PATH being too long; add
_mesa_warning calls on error handling (suggested by Ben).
v11: Fix crash when variable is unset introduced in v10.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Images invalidation is a bit weird on Fermi and there is already a hack
which forces invalidating all images when launching a computer shader
to help in fixing 3D<->CP interaction.
However, we need to re-validate images for compute because
nvc0_compute_invalidate_surfaces() will destroy the previous binding.
This is not really good for performance purposes but this might be
improved later.
This fixes the following piglits:
- spec/arb_compute_shader/execution/basic-uniform-access
- spec/arb_compute_shader/execution/mutiple-texture-reading
- spec/arb_compute_shader/execution/multiple-workgroups
- spec/glsl-4.30/execution/built-in-functions/cs-* (207 tests)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
This should fix spec@arb_shader_image_load_store@level.
Broken by:
Commit: 95c5bbae66
radeonsi: set some image descriptor fields at bind time
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
We would revalidate images when anything was touched at all. Which is
unfortunate, since the state tracker does not use CSO's to reduce the
workload. So instead implement a protocol to ensure that something has
changed before revalidating all the images.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
We would revalidate buffers when anything was touched at all. Which is
unfortunate, since the state tracker does not use CSO's to reduce the
workload. So instead implement a protocol to ensure that something has
changed before revalidating all the SSBOs.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
The fix in:
anv: let anv_entrypoints_gen.py generate proper Wayland/Xcb guards
breaks things if wayland headers aren't installed.
Separate things out properly to avoid that problem.
[airlied: fixed up to put in pre-existing sections].
Reported-by: Arjan van de Ven
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The main impact is that {upload, draw, upload, draw, ..} doesn't flush
framebuffer caches before every upload.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>