For the CL spec it really matters how a program object was created. We
never really cared all that much, but it didn't support the corner case of
having an empty string as the OpenCL C source code.
Enums feel like the more Rust way to do this kind of stuff anyway.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22280>
We want to validate the actual passed in SPIR-V, but we can only report
errors back on build/compile time. So instead of storing the initial IL
in the devices `ProgramBuild` objects, just store it on the Program
instead. This also simplifies setting spec constants as this is only valid
on program directly created from IL and not e.g. linked ones.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22280>
In OpenCL, we can have no shader-defined shared memory but some dispatch-time
variable memory. This is not reflected in ss->info.wls_size, so check the right
variable instead so we allocate the appropriate memory.
Fixes page faults accessing shared memory with Rusticl, e.g. in the vstore_local
test.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22228>
OpenCL can generate large loads and stores that we can't support, so we need to
lower. We can load/store up to 128-bits in a single go. We currently only handle
up to 32-bit components in the load and no more than vec4, so we split up
accordingly.
It's not clear to me what the requirements are for alignment on Valhall, so we
conservatively generate aligned access, at worst there's a performance penalty
in those cases. I think unaligned access is suppoerted, but likely with a
performance penalty of its own? So in the absence of hard data otherwise, let's
just use natural alignment.
Oddly, this shaves off a tiny bit of ALU in a few compute shaders on Valhall,
all in gfxbench. Seems to just be noise from the RA lottery.
total instructions in shared programs: 2686768 -> 2686756 (<.01%)
instructions in affected programs: 584 -> 572 (-2.05%)
helped: 6
HURT: 0
Instructions are helped.
total cvt in shared programs: 14644.33 -> 14644.14 (<.01%)
cvt in affected programs: 5.77 -> 5.58 (-3.25%)
helped: 6
HURT: 0
total quadwords in shared programs: 1455320 -> 1455312 (<.01%)
quadwords in affected programs: 56 -> 48 (-14.29%)
helped: 1
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22228>
The query buffer contains a batch to implement the multi pass
replay/accumulation of results. So we can't clear it with a memset.
An optimization for later would be to move the batches to the very end
of the query buffer so we can clear the query data without touching
the batches.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4dc7256bf9 ("anv: reset query pools using blorp")
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22421>
Primitive export used the gs_accepted variable after culling,
so we overwrote this variable after vertex compaction to make
sure not to hang the GPU.
This had an unintended side effect when storing the primitive ID
to LDS on GS threads: the LDS store was done even on threads whose
triangle was culled; potentially causing issues.
As a fix, create a separate boolean variable that remembers
which invocations need to export a primitive; and don't store
the primitive ID to LDS when gs_accepted is false.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8805
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22424>
when a new resource is created for an extant swapchain, the existing
acquire (if any) should be transferred to the resource to ensure
expected behavior
this should be enough to fix piglit's glx-make-current
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22431>
this is important for handing off the swapchain between resources
on makecurrent since a context that is made not-current will have its
swapchain resources destroyed while the swapchain persists
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22431>
This information was wrong in some places, let's fix it now.
GFX6:
The GPU has 64KB LDS, but only 32KB is usable by a workgroup.
NGG:
There was some misinformation about NGG only being able to
address 32 KB LDS, it turns out this is actually not true
and it can address the full 64K.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21935>
in a scenario like:
* bind fb
* clear
* bind fb attachment as sampler
* begin rp
* draw
* end rp
* flush
* bind new fs
* begin rp
* draw
the first draw will have an implicit feedback loop, but the second one will not
need a feedback loop. since no samplers or attachments are changed between
draws, however, the feedback loop will remain active for successive renderpasses,
which is problematic since the shader part of the driver (zink_update_barriers)
attempts to eliminate these same feedback loops, leading to layout desync
instead, add handling to attachment prep here to eliminate feedback loops
in the event that an attachment can be switched from a write layout to a read layout
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22423>