I forgot why that was required, but it still is the correct thing to do.
Hit it at some point when working on implementing more CL features.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Just like we already do in the llvm backend. The current constant buffer code
seems fundamentally flawed and right now we are thinking on how we want to
reimplement all of that.
But until that happens, just treat is as global memory and go on.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
The caller is responsible for setting up the ubo_addr_format value as
contrary to shared and global, it's not controlled by the spirv.
Right now clovers implementation of CL constant memory uses a 24/8 bit format
to encode the buffer index and offset, but that code is dead as all backends
treat constants as global memory to workaround annoying issues within OpenCL.
Maybe that will change, maybe not. But just in case somebody wants to look at
it, add a toggle for this inside vtn.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
The current code looks like a typo, and fails if the usage_mask
is for a y/z enabled input.
Fixes piglit ext_transform_feedback-immediate-reuse-index-buffer
with llvmpipe/nir
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
The previous fix worked when the second channel wasn't exposed, but
a couple of piglit tests have inputs with just the y/z chans, no x/w.
Partly Fixes piglit ext_transform_feedback-immediate-reuse-index-buffer
with llvmpipe/nir
Fixes: 5363cda52b ("gallivm: add swizzle support where one channel isn't defined.")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Including non-functional changes to get the value from the fd_reg_pair
in places.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
All developers now use gitlab, don't confuse newcomers by suggesting
they might use the mailing list. We want everyone to use gitlab so
that patches get run through basic CI before they are merged.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Vulkan 1.1 requires VK_KHR_external_fence which requires syncobj support
to be actually usable. However, it doesn't strictly require that we
support any external handle types. We should be able to advertise 1.1
even on old kernels that don't have syncobj support.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
When we have syncobj_wait, we can trust in WAIT_FOR_SUBMIT but when we
don't, we only have BO waits and those aren't quite as nice. This
commit adds a flag to _anv_queue_submit to wait for the queue to drain
before returning. This gives us the behavior we need to implement
DeviceWaitIdle.
Fixes: 246261f0ad "anv: prepare the driver for delayed submissions"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Otherwise we may lower some fdot to fdph which is not implemented in pp.
Fixes#2126
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Nir serializes uses nir_ssa_alu_instr_src_components in a few places to
determine how many components a src has, but that's not what this function
returns. It simply returns how many channels are used, which is still fine
for most of the code.
This was breaking code like this:
vec16 32 ssa_1 = intrinsic load_global
vec1 32 ssa_2 = fmax ssa_1.a, ssa_2.b
v2: make the 16bit encoding work for identify swizzles again
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
This is correct per the Vulkan spec format equivalence table.
Fixes: f36b52740a "radv/android: Add android hardware buffer queries."
Reviewed-by: Eric Anholt <eric@anholt.net>
Sometimes it's useful to get information about GPU faults in the
console, so it's synchronized with other messages.
This commit will cause Mesa to wait for completion and check if there
are any faults raised by the GPU.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
A lot of the brw_*_prog_key fields are for emulating features on legacy
hardware that iris doesn't support. In particular, all of the texture
swizzle fields take up a lot of space. These dead fields make hashing
the shader keys more expensive than it ought to be.
We introduce iris-specific keys with only the information we need, and
translate them to brw keys when actually compiling new variants. This
way, key comparisons can use the small keys. The size reductions are:
VS: 328 bytes -> 8 bytes
TCS: 312 bytes -> 24 bytes
TES: 304 bytes -> 24 bytes
GS: 284 bytes -> 8 bytes
FS: 304 bytes -> 16 bytes
CS: 280 bytes -> 4 bytes
Scores for the Piglit drawoverhead microbenchmark case with a shader
program change improve by roughly 30%.
Reviewed-by: Eric Anholt <eric@anholt.net>
macOS does not have pthread_getcpuclockid.
src/util/u_thread.h:156:4: error: implicit declaration of function 'pthread_getcpuclockid' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
pthread_getcpuclockid(thread, &cid);
^
Fixes: 4913215d14 ("util/u_thread: don't restrict u_thread_get_time_nano() to __linux__")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2171
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Eric Engestrom <eric@engestrom.ch>
This gets us shared non-UBWC layout code between gallium and turnip.
Until I fix up the rest of gallium to handle UBWC mipmapping, we do the
single-level UBWC setup in gallium as a fixup after layout.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
We pass in all the parameters for setting up the layout, though freedreno
still sets a few of them up early (since it uses layout helpers in making
some decisions about the layout setup parameters that will be cleaned up
once krh's blitter work lands).
This lets us start using some of the fdl_* helpers and have more obviously
matching code between gallium and turnip. We can't yet use the fdl_* UBWC
helpers, since the gallium driver doesn't do UBWC mipmaps (which I'm
working on in another branch).
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
It's the same logic for each of these being emitted, and I was about to
change the rsc->layout.* for UBWC.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
We can just bake the UBWC-goes-first delta into the slices at setup time.
I did have to fix up the resource shadowing swap path to swap the slice
fields, as it was missing and regressed the format reinterpets otherwise.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
i965 wants to use an offset from a base because everything is in a
single buffer whose address may be relocated, and all base addresses
are set to the start of that buffer.
iris wants to use a full 64-bit address, because state lives in separate
buffers which may be in the shader, surface, and dynamic memory zones,
where addresses grow downward from the top of a 4GB zone, So it's very
possible for a 32-bit offset to exist relative to multiple bases,
leading to the wrong state size.
low-level implementation of INTEL-performance-query APIs in
Intel iris driver. Most of functions and procedures defined here
are adopted from i965 driver (brw_performance_query.c)
v2: - replace genX_init_performance_query with
iris_init_perfquery_functions which is gen's version agnositic
- general code clean-up
v3: include gen_perf_gens.h as some of defines were moved to this new
header file
v4: - checking for kernel 4.13+ won't be needed here as Iris won't be
loaded anyway without DRM_SYNCOBJ that is enabled after Kernel
4.13.
- checking whether gen < 8 or is_cherryview won't be required as
well because those cases are screened in iris_screen_create.
v5: remove genX(init_performance_query)
v6: - remove oa_metrics_kernel_support as iris works only with kernel
4.18 and newer.
- use perf functions defined in separate file, iris_perf.h/c
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>