Add a new pass that inserts additional dependencies, rather than simply
relying on SSA srcs added in the nir->ir3 frontend. This makes it
easier to deal with barriers, but the additional false deps also lets us
deal properly with ensuring a write depends on all previous reads.
Since conversion to barrier instructions is lossy (ie. just knowing the
instruction doesn't tell us enough about what other instructions the
barrier applies to), use barrier_class/barrier_conflict fields in the
ir3_instruction to retain this information.
This could probably be relaxed somewhat by considering *which* array/
buffer/image variable is being referenced. Ie. a write to buffer A
can overtake a read from buffer B, if B is not coherent. (right?)
Signed-off-by: Rob Clark <robdclark@gmail.com>
I want to add a growable array to ir3_instruction, so we can append
false dependencies for purposes of scheduling barriers, atomics, and
dealing with write after read hazards.
Just code motion preparing for next patch.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Atomic instructions take a different # of src args depending on .g or .l
variant, split these out into different helpers with INSTR*F() helper
macro that lets you specify instruction flag.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Some instructions (like barriers) have no dst, which causes problems
with dereferencing a NULL dst. Flip the logic around to reject opc's
that can't be a type of move first, to filter out those instructions.
Signed-off-by: Rob Clark <robdclark@gmail.com>
User consts and driver consts such as UBO addresses and immediates are
handled the same for all shader stages, so split out a shared helper for
these, to make it easier to add more.
Signed-off-by: Rob Clark <robdclark@gmail.com>
It is unfortunate that image state isn't a real CSO, since (at least for
a4xx/a5xx) it is a combination of sampler and "SSBO" image state, and it
would be useful to pre-compute the state block "register" values rather
than doing it at emit time.
Signed-off-by: Rob Clark <robdclark@gmail.com>
In case the IR is NIR, the driver takes reference to the nir_shader.
Also, because there are no variants, we need to clone the shader,
instead of sharing the reference with gl_program, which would result
in a double free in _mesa_delete_program().
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Having both an ir3_compile (which was really context for compiling a
single shader variant) and ir3_compiler (which is the compiler object
that compiles all variants, ie. basically holds the RA regset) is a
bit confusing.
Signed-off-by: Rob Clark <robdclark@gmail.com>
We renamed "Function Enable" to "Enable", which broke our detection
of whether shaders are enabled or not. So, we'd see a bunch of HS/DS
packets with program offsets of 0, and think that was a valid TCS/TES.
Fixes: c032cae9ff (genxml: Rename "Function Enable" to "Enable".)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
As far as I can tell these fields are only used to query arb
program info and are not related to ATI_fragment_shader.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Miklós Máté <mtmkls@gmail.com>
Android fences can't be deferred, because st/dri calls fence_finish
with ctx = NULL, so the driver can't flush u_threaded_context.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
This prevents build failures when libdrm_freedreno is unavailable,
which started happening after the ir3_compiler build was enabled.
(Patch by Rob, commit message by Ken).
Fixes: fecd04a66a ("freedreno/ir3: fix standalone compiler meson build")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The L3 configuration code already considers the TCS and TES programs,
but failed to listen for TCS/TES program changes.
This was somehow missing.
Fixes: e9644cb1f9 ("i965: Consider tessellation in get_pipeline_state_l3_weights.")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
There is a bunch of churn in the main meson.build so that we can
correctly set the auto tristate of GLX. In particular, don't build
xlib-based glx when dri and gallium are disabled but vulkan is enabled,
in that case just turn glx off.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Because the same generation logic is required by xlib glx and
gallium-xlib glx, it makes sense to pull it out.
v2: - Ensure that libgl is defined before trying to generate a pkgconfig
file with it.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This ensure that it's properly guarded, but also makes the code clearer
by grouping related things together.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This creates a dependency on this header being generated before trying
to compile any of these targets, as well as passing the correct -I to
the compiler to ensure it's included correctly.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
These variables were removed from autotools in 2008 (sha:
80f68e1b6a), but they have lived on here. The Scons build
meanwhile doesn't set a patch/tiny version at all, just major and minor.
This patch removes the unused variables and simply sets the version,
leaving patch/tiny as 0 since that's what the autotools build as been
doing forever. This shouldn't change any behavior.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
This seems to be dropped in 222a2fb9 "util: move os_time.[ch] to src/util"
../../../src/util/os_time.c: In function ‘os_time_sleep’:
../../../src/util/os_time.c:104:4: error: implicit declaration of function ‘usleep’ [-Werror=implicit-function-declaration]
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
These flags are set for C sources, but not C++. This causes symbol
visibility leaks from the C++ parts of the Intel compiler.
Fixes: 700bebb958 ("i965: Move the back-end compiler to src/intel/compiler")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes piglit - egl_khr_fence_sync/android_native tests.
Broken by 884a0b2a9e.
Introduce state-tracker flush flags, analogous to the pipe ones. Use
the former when with stapi->flush().
Fixes: 884a0b2a9e ("st/dri: use stapi flush instead of pipe flush
when creating fences")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Apparently, it doesn't have pthread barriers.
p_config.h (which was originally used to guard this code) uses the
__APPLE__ macro to detect Mac OS.
Fixes: f0d3a4de75 ("util: move pipe_barrier into src/util and rename to util_barrier")
Cc: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
State validation is performed during clear and draw calls. Validation
during clear was still accessing vertex buffer state. When the currently
set vertex buffers are client arrays, this could lead to accessing freed
memory. Such is the case with the VMD application.
Previously, vertex buffer validation depended on a dirty bit or the
draw info indicating an indexed draw. This required special handling for
clears. But, vertex buffer validation still occurred which was unnecessary
and wrong.
Now, only minimal validation is performed during clear, deferring the
remainder to the next draw. And, by setting the dirty bit in swr_draw_vbo
for indexed draws, vertex buffer validation is only dependent upon a
single dirty bit.
This fixes a bug exposed by the VMD application when changing models.
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
Don't rely on intr->num_components having a valid value. It doesn't
seem to anymore for non-vectorized intrinsics.
Signed-off-by: Rob Clark <robdclark@gmail.com>