Most modern hardware needs the edge flag added as a hidden vertex input
and needs code added to the vertex shader to copy the input to an
output. Intel hardware is a little different. Gfx4 and Gfx5 hardware
works in the previously described mannter. Gfx6+ hardware needs the
edge flag as a specific vertex shader input, and that input is magically
processed by fixed-function hardware without need for extra shader code.
This flag signals only that the vertex shader input is needed. It would
be nice if we could decouple adding the vertex shader input from
generating the copy-to-output code, but that has proven to be
challenging. Not having that code causes other passes to want to
eliminate that shader input.
v2: Convert conditional to assertion. This pass is only called for
vertex shaders. Suggested by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12858>
The lowering passes will soon be moved to another function, so there
won't be any choice.
As a side benefit, this allows eliminating the uses_atomic_load_store
**pointer** parameter from brw_nir_lower_storage_image. For some reason
crocus was passing false instead of NULL.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12858>
...instead of looking for "ARB" in the name of the shader. This matches
the behavior of i965. Using "ARB" was added in a1ebac3750 ("iris:
Implement ALT mode for ARB_{vertex,fragment}_shader"), but there's no
explanation of why that method was used.
v2: Just use shader_info::is_arb_asm everywhere instead of
iris_uncompiled_shader::use_alt_mode. Suggested by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12858>
This consolidates several duplicated pieces of code into devinfo.
max_scratch_ids is an array that provides the max number of threads
for the rendering and compute stages.
This fixes some exceptions missed by crocus for scratch ids on haswell
and cherryview.
It also fills out devinfo->max_scratch_ids properly for stages VS
through CS on Gfx12.5. But, functionally this should not make a
difference as Gfx12.5 already uses COMPUTE for all stages.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12799>
There are two problems with the current architecture.
In OpenGL, the id is supposed to be a unique identifier for a particular
log source. This is done so that applications can (theoretically)
filter particular log messages. The debug callback infrastructure in
Mesa assigns a uniqe value when a value of 0 is passed in. This causes
the id to get set once to a unique value for each message.
By passing a stack variable that is initialized to 0 on every call,
every time the same message is logged, it will have a different id.
This isn't great, but it's also not catastrophic.
When threaded shader compiles are used, the id *pointer* is saved and
dereferenced at a possibly much later time on a possibly different
thread. This causes one thread to access the stack from a different
thread... and that stack frame might not be valid any more. :(
I have not observed any crashes related to this particular issue.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12136>
There are a couple minor things that can be improved:
1. Eliminate (or reduce) the dynamic allocation of the
threaded_compile_job.
2. For apps like shader-db, improve the case where nr_threads=0. Right
now this adds thread switching and mutex overhead.
3. Other performance improvements? iris_uncompiled_shader::variants has
some special properties that make it ripe for replacement with a
lockless list. Without gathering some data, it's hard to guess what
impact that could have.
v2: Fix whitespace and formatting issues. Noticed by Ken.
s/threaded_compile_job/iris_threaded_compile_job/g. Suggested by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11229>
I tried to find a way to break this into some smaller commits, but
everything is very intertwined. :(
When searching the variants list in the iris_uncompiled_shader, add the
new variant if it is not found. This will be necessary for threaded
shader compilation. This conceptually simple change had a bunch of
fallout.
Much of this was at least conceptually borrowed from radeonsi.
- Other threads might find a variant in the list before the variant has
been compiled. To accomdate this, add a fence. Each thread will wait
on the fence in the variant when searching the list.
- A variant in the list may fail compilation. To accomodate this, add a
flag. All paths will examine iris_compiled_shader::compilation_failed
before trying to use the variant.
- The race condition between multiple threads trying to create the same
variant at the same time is handled *before* both thread spend the
effort to compile the shader. The means that iris_upload_shader
cannot change shaders on the caller, so it does not need to return
anything.
v2: Change "found" parameter of find_or_add_variant to "added." This
inverts the values returned, and it probably makes uses of the returned
value more easily understood. Always set the value in the called
function. Suggested by Ken.
v3: Move shader->compilation_failed check to avoid shader != NULL test.
Rearrange some logic and add a comment in iris_update_compiled_tcs.
Suggested by Ken. Don't call find_or_add_variant in
iris_create_shader_state. See
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11229#note_1000843
for more details. Noticed by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11229>
I tried /just/ ref counting the uncompiled shaders, but that is not
sufficient. At the very least, it's a problem for blorp shaders that
only have variants (and no uncompiled shader).
This is in prepartion for using the live shader cache.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11229>
Rework:
* Jordan: Handle prog_data->total_scratch==0 in iris_upload_compute_walker
* Jordan: Resolve iris_get_scratch_space conflict with e2c5ef6cd6
* Jordan: Rebase on 4256f7ed58. broken
* Ken: Mostly fixed the rebase
* Jordan: Fix two small compilation issues
* Jordan: Rebase on Ken's ("iris: Make a pin_scratch_space() helper")
* Lionel: Fix a few bugs with scratch handles
* Jason: Tidy the patch up a bit
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11582>
Based on a patch by Rafael Antognolli.
We already had a flags parameter, but omitted it from the simple alloc
interface because most callers were passing 0. However, we'll want to
use it for selecting between device local memory and system memory, and
possibly mmap cacheability modes, in the future. At that point, many
more callers will want to specify, so I think we should include flags
in iris_bo_alloc() as well.
A few places used the iris_bo_alloc_tiled() function simply to pass
flags, so this patch converts them to use iris_bo_alloc() instead now
it does everything they want.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11169>
Makes calling code more explicit about what is being set, and allows
take advantage of zero initialization for the ones the callsite don't
care.
Besides moving to the struct, two extra "ergonomic" changes were done:
- Add a new shader_time boolean, so shader_time_index is ignored when
unused -- this allow taking advantage of the zero initialization of
unset fields.
- Since we have a struct, provide space for the error_str pointer.
Both iris and i965 were using it, and the extra rstrdup in case of
failure shouldn't be a burden for the others.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779>
Makes calling code more explicit about what is being set, and allows
take advantage of zero initialization for the ones the callsite don't
care.
Besides moving to the struct, two extra "ergonomic" changes were done:
- Add a new shader_time boolean, so shader_time_index is ignored when
unused -- this allow taking advantage of the zero initialization of
unset fields.
- Since we have a struct, provide space for the error_str pointer.
Both iris and i965 were using it, and the extra rstrdup in case of
failure shouldn't be a burden for the others.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779>
When we enable u_threaded_context, the pipe->create_*_state hooks
(precompile variants) are going to be called from one thread, while
iris_update_compiled_shaders (on-the-fly variants) are going to be
called from a driver thread. BLORP shaders also happen from
clear, blit, and so on in the driver thread.
u_upload_mgr isn't thread-safe, so use an uploader for each purpose.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
We call update_last_vue_map after updating the shaders, which compares
the new and old VUE maps. Except...updating the shaders may have
dropped the last reference to the variant that ice->shaders.last_vue_map
belonged to, leading to a classic use-after-free.
Fix this by taking a reference to the variant for the last VUE stage,
so it stays around until we're done with it.
Fixes: 1afed51445 ("iris: Store a list of shader variants in the shader itself")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4311
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9143>
We've traditionally stored shader variants in a per-context hash table,
based on a key with many per-stage fields. On older hardware supported
by i965, there were potentially quite a few variants, as many features
had to be emulated in shaders, including things like texture swizzling.
However, on the modern hardware targeted by iris, our NOS dependencies
are much smaller. We almost always guess the correct state when doing
the initial precompile, and so we have maybe 1-3 variants. iris NOS
keys are also dramatically smaller (4 to 24 bytes) than i965's.
Unlike the classic world, Gallium also provides a single kind of object
for API shaders---pipe_shader_state aka iris_uncompiled_shader. We can
simply store a list of shader variants there. This makes it possible
to access shader variants across contexts, rather than compiling them
separately for each context, which better matches how the APIs work.
To look up variants, we simply walk the list and memcmp the keys.
Since the list is almost always singular (and rarely ever long),
and the keys are tiny, this should be quite low overhead.
We continue storing internally generated shaders for BLORP and
passthrough TCS in the per-context hash table, as they don't have
an associated pipe_shader_state / iris_uncompiled_shader object.
(There can also be many BLORP shaders, and the blit keys are large,
so having a hash table rather than a list makes sense there.)
Because iris_uncompiled_shaders are shared across multiple contexts,
we do require locking when accessing this list. Fortunately, this
is a per-shader lock, rather than a global one. Additionally, since
we only append variants to the list, and generate the first one at
precompile time (while only one context has the uncompiled shader),
we can assume that it is safe to access that first entry without
locking the list. This means that we only have to lock when we
have multiple variants, which is relatively uncommon.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7668>
There is a small gap of time where the currently bound uncompiled
shaders, and compiled shader variant, are out of sync. Specifically,
between pipe->bind_*_state() and the next draw.
Currently, shaders variants live entirely within a single context,
and when deleting an iris_uncompiled_shader, we check if any of its
variants are currently bound, and defer deleting those until the next
iris_update_compiled_shaders() hook runs and binds new shaders to
replace them. (This is due to the time gap between binding new
uncompiled shaders, and updating variants at draw time when we have
the required NOS in place.)
This works pretty well in a single context world. But as we move to
share compiled shader variants across multiple contexts, it breaks down.
When deleting a shader, we can't look at all contexts to see if its
variants are bound anywhere. We can't even quantify whether those
contexts will run a future draw any time soon, to update and unbind.
One fairly crazy solution would be to delete the variants anyway, and
leave the stale pointers to dead variants in place. This requires
removing any code that compares old and new variants. Today, we do
that sometimes for seeing if the old/new shaders toggled some feature.
Worse than that, though, we don't just have to avoid dereferences, we'd
have to avoid pointer comparisons. If we free a variant, and quickly
allocate a new variant, malloc may return the same pointer. If it's
for the same shader stage, we may get a new different program that has
the same pointer as a previously bound stale one, causing us to think
nothing had changed when we really needed to do updates. Again, this
is doable, but leaves the code fragile - we'd have to guard against
future patches adding such checks back in.
So, don't do that. Instead, do basic reference counting. When a
variant is bound in a context, up the reference. When it's unbound,
decrement it. When it hits zero, we know it's not bound anywhere and
is safe to delete, with no stale references. This ends up being
reasonably cheap anyway, since the atomic is usually uncontested.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7668>
Now that we're looking at shader info system values rather than
vs_prog_data, there's no reason we have to do this when updating
the shader variants. We can simply check it when binding a new
shader from the API point of view.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8759>