The issue we're running into when running CTS is that glsl types are
deleted while builtins depending on them are not.
This happens because on one hand we have glsl types ref counted, but
builtins are not. Instead builtins are destroyed when unloading libGL
or explicitly calling glReleaseShaderCompiler().
This change removes almost entirely any dealing with glsl types
ref/unref by letting the builtins deal with it instead. In turn we
introduce a builtin ref count mechanism. Each GL context takes a
reference on the builtins when compiling a shader for the first time.
It releases the reference when the context is destroyed. It can also
explicitly release those when glReleaseShaderCompiler() is called.
Finally we also take a reference on the glsl types when loading libGL
to avoid recreating glsl types too often.
v2: Ensure we take a reference if we don't have one in link step (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110796
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Use value "2" to signal that lowering is needed and supported and enable
it accordingly.
v2: - Note in CAP description that this lowering currently requires TGSI
- use "true" instead of GL_TRUE (both Erik)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Reviewed-by: Marek Olšák <marek.olsak@amd.com>
v1 implemented by Erik Faye-Lund <erik.faye-lund@collabora.com>
v2: - Add GS and TES
- fix constants state update flags (Erik)
v3: don't update rasterizer when depth_clamp is lowered (Erik)
v4: Correct NewDepthClamp and also set flags for NewClipControl (Erik)
v5: Also set shader_has_one_variant property acording to possible
depth_clamp lowering (Marek)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Reviewed-by: Marek Olšák <marek.olsak@amd.com>
For drivers supporting PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET the buffer_offset value
will be interpreted as an signed int.
An example of application code causing a negative offset:
float b[] = { ... }; // 3 float for pos, 3 for color
glBufferData(GL_ARRAY_BUFFER, ..., b, ...);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 6 * sizeof(float), 0);
glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, 6 * sizeof(float), &b[3]);
^
should be 3 * sizeof(float)
The offset is a ptr so when interpreted as a signed int it can be negative.
This commit adds a verification that (int) buffer_offset is not negative - this would
indicate an application bug. Since it's too late to emit a GL_INVALID_VALUE error,
we replace the negative offset by 0 and emit a debug message.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Shader Model 3.0 is a big promise to make to the state-tracker, and
for instance mobile hardware might support vertex-shader saturate but
not some of the other features of SM3. So let's give this its own cap
for simplicity.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This boolean is only consulted once during init, so there's nothing
much saved by storing this in the context. So let's just check directly
when we need it instead.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
When a context is destroyed the destroy_tex_sampler_cb makes sure that all the
sampler views created by that context are destroyed.
This is done by walking the ctx->Shared->TexObjects hash table.
In a multiple context environment the texture can be deleted by a different context,
so it will be removed from the TexObjects table and will prevent the above mechanism
to work.
This can result in an assertion in st_save_zombie_sampler_view because the
sampler_view owns a reference to a destroyed context.
This issue occurs in blender 2.80.
This commit fixes this by explicitly releasing sampler_view created by the destroyed
context for all texture attachments.
Fixes: 593e36f956 (st/mesa: implement "zombie" sampler views (v2))
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110944
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
This patch changes the code which sets EmitNoIndirectSampler to check
the core profile GLSL version, rather than the ARB_gpu_shader5 extension
enable. st/mesa exposes ARB_gpu_shader5 if GLSLVersion (in core
profiles) or GLSLVersionCompat (in compat profiles) >= 400.
The Intel drivers do not currently expose ARB_gpu_shader5 in compat
profiles. But the backend can absolutely handle indirect samplers.
Looking at the core profile version number should be a good indication
of what the driver supports.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This support requires the driver to be a NIR driver as we use the
NIR lowering pass to do the clamping.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This reverts commit 55376cb31e.
It's been over a year and both QT 5.9.5 and 5.11.0 contained a fix for the
original issue. It seems i965 only ever applied this workaround to the
18.0 branch.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Commit 624789e370 moved the destruction of types out of atexit() and
made use of a ref count instead. This is useful for avoiding a crash
where drivers such as radeonsi are still compiling in a thread when the app
exits and has not called MakeCurrent to change from the current context.
While the above scenario is technically an app bug we shouldn't crash.
However that change caused another race condition between the shader
compilation tread in radeonsi and context teardown functions.
This patch makes two changes to fix this new problem:
First we explicitly call _mesa_destroy_shader_compiler_types() when destroying
the st context rather than calling it indirectly via _mesa_free_context_data().
We do this as we must call it after st_destroy_context_priv() so that we don't
destory the glsl types before the compilation threads finish.
Next wait for the shader threads to finish in si_destroy_context() this
also means we need to call context destroy before destroying the queues
in si_destroy_screen().
Fixes: 624789e370 ("compiler/glsl: handle case where we have multiple users for types")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
if the driver (iris) indicates support for the inner_coverage pipe cap, this
will set the necessary states in the driver flags and rasterizer structs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
When we destroy a context, we need to temporarily make that context
the current one for the thread.
That's because during context tear-down we make many calls to
_mesa_reference_texobj(&texObj, NULL). Note there's no context
parameter. If the texture's refcount goes to zero and we need to
delete it, we use the thread's current context. But if that context
isn't the context we're tearing down, we get into trouble when
deallocating sampler views. See patch 593e36f956 ("st/mesa:
implement "zombie" sampler views (v2)") for background information.
Also, we need to release any sampler views attached to the fallback
textures.
Fixes a crash on exit with a glretrace of the Nobel Clinician
application.
v2: at end of st_destroy_context(), check if save_ctx == ctx and
unbind the context if so.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
In all instances here we can replace pipe_sampler_view_release(pipe,
view) with pipe_sampler_view_reference(view, NULL) because the views
in question are private to the state tracker context. So there's no
danger of freeing a sampler view with the wrong context.
Testing done: google chrome, misc GL demos, games
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
As with the preceding patch for sampler views, this patch does
basically the same thing but for shaders. However, reference counting
isn't needed here (instead of calling cso_delete_XXX_shader() we call
st_save_zombie_shader().
The Redway3D Watch is one app/demo that needs this change. Otherwise,
the vmwgfx driver generates an error about trying to destroy a shader
ID that doesn't exist in the context.
Note that if PIPE_CAP_SHAREABLE_SHADERS = TRUE, then we can use/delete
any shader with any context and this mechanism is not used.
Tested with: google-chrome, google earth, Redway3D Watch/Turbine demos
and a few Linux games.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
When st_texture_release_all_sampler_views() is called the texture may
have sampler views belonging to several contexts. If we unreference a
sampler view and its refcount hits zero, we need to be sure to destroy
the sampler view with the same context which created it.
This was not the case with the previous code which used
pipe_sampler_view_release(). That function could end up freeing a
sampler view with a context different than the one which created it.
In the case of the VMware svga driver, we detected this but leaked the
sampler view. This led to a crash with google-chrome when the kernel
module had too many sampler views. VMware bug 2274734.
Alternately, if we try to delete a sampler view with the correct
context, we may be "reaching into" a context which is active on
another thread. That's not safe.
To fix these issues this patch adds a per-context list of "zombie"
sampler views. These are views which are to be freed at some point
when the context is active. Other contexts may safely add sampler
views to the zombie list at any time (it's mutex protected). This
avoids the context/view ownership mix-ups we had before.
Tested with: google-chrome, google earth, Redway3D Watch/Turbine demos
a few Linux games. If anyone can recomment some other multi-threaded,
multi-context GL apps to test, please let me know.
v2: avoid potential race issue by always adding sampler views to the
zombie list if the view's context doesn't match the current context,
ignoring the refcount.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-By: Jose Fonseca <jfonseca@vmware.com>
st_init_driver_functions() is only called in st_context.c so there's
no need for the prototype in st_context.h
To avoid a forward declaration of st_init_driver_functions() in
st_context.c, we need to move around several other functions.
No functional change.
Reviewed-by: Neha Bhende <bhenden@vmware.com>
To de-clutter st_context.h.
Clean up remaining function prototypes in st_context.h.
The st_vp_uses_current_values() helper is only used in st_context.c
so move it there.
The st_get_active_states() function is only used in st_context.c so
remove its prototype in st_context.h
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Since using bitmasks we can easily check if we have any
current value that is potentially uploaded on array setup.
So check for any potential vertex program input that is not
already a vao enabled array. Only flag array update if there is
a potential overlap.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Intel's blending hardware does not properly return 1.0 for destination
alpha for RGBX formats; it requires the factors to be overridden to
either zero or one. Broadcom vc4 and v3d also could use this override.
While overriding these factors is safe in general, Nouveau and Radeon
would prefer not to. Their blending hardware already returns correct
values for RGB/RGBX formats, and would like to avoid the resulting
per-buffer blending and independent blend factors (rgb != a) since it
can cause additional overhead.
I considered simply handling this in the driver, but it's not as nice.
pipe_blend_state doesn't have any format information, so we'd need the
hardware blend state to depend on both pipe_blend_state and
pipe_framebuffer_state. Furthermore, Intel GPUs don't have a native
RGBX_SNORM format, so I avoid exposing one, which makes Gallium fall
back to RGBA_SNORM. The pipe_surfaces we get in the driver have an RGBA
format, making it impossible to tell that there shouldn't be an alpha
channel. One could argue that st not handling it in that case is a bug.
To work around this, we'd have to expose RGBX pipe formats, mapped to
RGBA hardware formats, and add format swizzling special cases. All
doable, but it ends up being more code than I'd like.
st_atom_blend already has access to the right information and it's
trivial to accomplish there, so we just add a cap bit and do that.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Gallium historically has treated pipeline statistics queries as a single
query, PIPE_QUERY_PIPELINE_STATISTICS, which returns a block of 11
values. This was originally patterned after the D3D1x API. Much later,
Brian introduced an OpenGL extension that exposed these counters - but
it exposes 11 separate queries, each of which returns a single value.
Today, st/mesa simply queries all 11 values, and returns a single value.
While pipeline statistics counters aren't typically performance
critical, this is still not a great fit. A D3D1x->GL translator might
request all 11 counters by creating 11 separate GL queries...which
Gallium would map to reads of all 11 values each time, resulting in a
total 121 counter reads. That's not ideal.
This patch adds a new cap, PIPE_CAP_QUERY_PIPELINE_STATISTICS_SINGLE,
and corresponding query type PIPE_QUERY_PIPELINE_STATISTICS_SINGLE.
When calling create_query(), q->index should be set to one of the
PIPE_STAT_QUERY_* enums to select a counter. Unlike the block query,
this returns the value in pipe_query_result::u64 (as it's a single
value) instead of the pipe_query_data_pipeline_statistics group.
We update st/mesa to expose ARB_pipeline_statistics_query if either
capability is set, preferring the new SINGLE variant when available.
Thanks to Roland, Ilia, and Marek for helping me sort this out.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
This might be required because some stages might generate different
programs depending on the other stages in the program. For example,
the i965 driver's tessellation control stage depends on the
tessellation evaluation shader.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This sets the LinkShader function for the driver, but for the st we
set it properly with the following call to st_init_program_functions().
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Let the gallium backend have its own gl_vertex_array array and basically
reimplement the way _vbo_draw works.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Move vbo draw functions into struct dd_function_table.
For now just wrap the underlying vbo functions.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
We assigned the function that gets the device uuid to the GetDriverUuid
function pointer and the function that gets the driver uuid to the
GetDeviceUuid function pointer inside the state tracker. Exchanged the
pointers.
cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
The newest version of WSI Fusion makes several glDrawPixels calls
per frame. By caching more than one image, we get better performance
when panning/zooming the map.
v2: move pixel unpack param checking out of cache search loop, per Roland
v3: also move unpack->BufferObj check out of loop, per Roland.
This resolves a game bug in Dead Island. The game doesn't properly
handle ARB_get_program_binary with 0 supported formats, and ends up
crashing.
This will enable ARB_get_program_binary binary support for any
driver that currently enables the on-disk shader cache.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85564
Currently we apply the extension overrides and construct the extensions
string upon MakeCurrent.
They are two distinct things, so let's slit the two while pushing the
overrides management _before_ _mesa_compute_version(). This ensures that
the version is updated to reflect the enabled/disabled extensions.
Cc: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
We already have piglit tests testing alpha, luminance, and intensity
formats. They were skipped by piglit until now.
Additionally, I'm enabling one ARB_texture_buffer_range piglit test to run
with the compat profile.
i965 behavior is unchanged except that it doesn't expose TBOs in the Compat
profile. Not sure how that affects the GL version override.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This adds a new atom that calls the new driver API to
bind buffers containing hw atomics.
v2: fixup bindings for sparse buffers. (mareko/nha)
don't bind buffer atomics when hw atomics are enabled.
use NewAtomicBuffer (mareko)
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>