95 Commits

Author SHA1 Message Date
Alyssa Rosenzweig
ad5f0592cc vc4: Use u_default_set_debug_callback
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18351>
2022-09-01 14:50:24 +00:00
Eric Engestrom
e178ecf8a9 vc4: introduce VC4_DBG() macro to make VC4_DEBUG checks consistent
The main issue was the inconsistent use of `unlikely()`, but the macro
also simplifies the code a little bit.

Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18086>
2022-08-24 23:03:57 +00:00
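The idea behind e178ecf8a9 can be sketched as a single macro that wraps every debug-flag test in `unlikely()`, so no call site can forget the hint. This is only an illustration: the `EXAMPLE_*` names, the flag bits, and the `example_debug_flags` variable are assumptions and are not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Branch hint; falls back to a plain truth test on other compilers. */
#if defined(__GNUC__) || defined(__clang__)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define unlikely(x) (!!(x))
#endif

/* Hypothetical debug bits and flag word, for illustration only. */
#define EXAMPLE_DEBUG_CL     (1u << 0)
#define EXAMPLE_DEBUG_SHADER (1u << 1)

uint32_t example_debug_flags;

/* One macro, so every check gets the same unlikely() treatment. */
#define EXAMPLE_DBG(flag) unlikely(example_debug_flags & EXAMPLE_DEBUG_##flag)

bool example_should_dump_cl(void)
{
    return EXAMPLE_DBG(CL);
}

bool example_should_dump_shader(void)
{
    return EXAMPLE_DBG(SHADER);
}
```

Call sites then read `if (EXAMPLE_DBG(CL)) { ... }` instead of spelling out the mask test and the branch hint each time.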
Yonggang Luo
2ca6ef22f7 util: Rename pipe_debug_callback to util_debug_callback
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15657>
2022-04-01 01:52:43 +00:00
Jose Maria Casanova Crespo
deb55340ca vc4: remove primconvert
This loses the optimization that converted a single quad into a
triangle fan, reusing the same four vertices in the buffer.

Maybe this optimization can be ported to primconvert, since QUADS are
always converted to TRIANGLES with primconvert; i965, crocus and svga
have similar optimizations.

V3D doesn't implement this optimization, and the commit that introduced
it, 230e646a40 ("broadcom/vc4: Decompose single QUADs to a
TRIANGLE_FAN."), says:

 "No significant difference in the minetest replay, but it should reduce
  overhead by not requiring that we write quad indices to index buffers
  that we repeatedly re-upload (and making the draw packet smaller, as
  well)."

v2: Commit log includes more detail about the removed optimization.
    (Alejandro Piñeiro)

Reference: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5277
Suggested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12669>
2021-09-03 09:24:23 +02:00
Rhys Kidd
acc481ad79 vc4: Wire up core pipe_debug_callback
This lets the driver use pipe_debug_message() for GL_ARB_debug_output.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-12-20 11:31:19 -08:00
Eric Engestrom
bb84fa146f util: use C99 declaration in the for-loop hash_table_foreach() macro
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-25 12:43:18 +01:00
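As a rough sketch of what bb84fa146f describes, a foreach macro can declare its iterator in the for-loop initializer (C99), so callers no longer declare the entry variable themselves. The table layout and the `example_*` names below are assumptions made for illustration; they are not the util hash table API.

```c
#include <stddef.h>

/* Hypothetical container: a fixed array of slots; empty slots have a NULL key. */
struct example_entry { const char *key; int value; };
struct example_table { struct example_entry *slots; size_t size; };

/* Return the next occupied slot after prev, or the first one when prev is NULL. */
struct example_entry *
example_next_entry(struct example_table *t, struct example_entry *prev)
{
    struct example_entry *e = prev ? prev + 1 : t->slots;
    for (; e != t->slots + t->size; e++) {
        if (e->key)
            return e;
    }
    return NULL;
}

/* C99 style: the iterator is declared by the macro and scoped to the loop. */
#define example_table_foreach(t, entry)                              \
    for (struct example_entry *entry = example_next_entry(t, NULL);  \
         entry != NULL;                                              \
         entry = example_next_entry(t, entry))

int example_sum(struct example_table *t)
{
    int sum = 0;
    example_table_foreach(t, e)
        sum += e->value;
    return sum;
}
```

Scoping the iterator to the loop also means two loops in the same function can reuse the same entry name without clashing.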
Stefan Schake
b0acc3a562 broadcom/vc4: Native fence fd support
With the syncobj support in place, let's use it to implement the
EGL_ANDROID_native_fence_sync extension. This mostly follows previous
implementations in freedreno and etnaviv.

v2: Drop the flags (Eric)
    Handle in_fence_fd already in job_submit (Eric)
    Drop extra vc4_fence_context_init (Eric)
    Dup fds with CLOEXEC (Eric)
    Mention exact extension name (Eric)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-17 16:04:30 +01:00
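The "Dup fds with CLOEXEC" review note in b0acc3a562 refers to duplicating a file descriptor so that the copy is not inherited across `exec()`. A minimal sketch using the POSIX `fcntl()` command `F_DUPFD_CLOEXEC` follows; the helper name `example_dup_cloexec` is hypothetical, and this does not claim to be how the driver does it.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Duplicate fd with FD_CLOEXEC set atomically on the copy.
 * Returns the new descriptor, or -1 on error (errno is set by fcntl). */
int example_dup_cloexec(int fd)
{
    /* The third argument is the lowest acceptable new descriptor number;
     * 3 keeps the copy clear of stdin, stdout and stderr. */
    return fcntl(fd, F_DUPFD_CLOEXEC, 3);
}
```

Doing the duplicate and the flag in one call avoids the window that `dup()` followed by `fcntl(F_SETFD)` leaves open when another thread forks in between.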
Stefan Schake
44036c354d broadcom/vc4: Store job fence in syncobj
This gives us access to the fence created for the render job.

v2: Drop flag (Eric)

Signed-off-by: Stefan Schake <stschake@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-05-17 16:04:28 +01:00
Eric Anholt
c57d5ea3bb broadcom/vc4: Add an accelerated path to turn raster R8/RG88 into tiled.
Drawing a 1080p YV12 video stream generated by MMAL goes from 10.5 FPS to
36.
2018-03-09 09:59:54 -08:00
Eric Anholt
4aa700e0e0 broadcom/vc4: Implement GL_ARB_texture_barrier.
Improves x11perf -copywinwin100 from ~2000/sec to ~4700/sec.  More
importantly, this is a prerequisite for the new GL_MESA_tile_raster_order
extension.
2017-10-10 10:45:22 -07:00
Marek Olšák
55ad59d2b7 gallium: set pipe_context uploaders in drivers (v3)
Notes:
- make sure the default size is large enough to handle all state trackers
- pipe wrappers don't receive transfer calls from stream_uploader, because
  pipe_context::stream_uploader points directly to the underlying driver's
  stream_uploader (to keep it simple for now)

v2: add error handling to nv50, nvc0, noop
v3: set const_uploader

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1)
Tested-by: Charmaine Lee <charmainel@vmware.com>
2017-02-14 21:46:16 +01:00
Eric Anholt
9421a6065c vc4: Fix fallback to quad clears of depth in GLX.
The fix in the vc4-jobs series ended up triggering the fallback path on
GLX apps that use depth but not stencil.
2016-10-06 18:09:24 -07:00
Nicolai Hähnle
2a83036fe2 vc4: use the new parent/child pools for transfers
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-10-05 15:42:20 +02:00
Eric Anholt
f597ac3966 vc4: Implement job shuffling
Track rendering to each FBO independently and flush rendering only when
necessary.  This lets us avoid the overhead of storing and loading the
frame when an application momentarily switches to rendering to some other
texture in order to continue rendering the main scene.

Improves glmark -b desktop:effect=shadow:windows=4 by 27%
Improves glmark -b
    desktop:blur-radius=5:effect=blur:passes=1:separable=true:windows=4
    by 17%

While I haven't tested other apps, this should help X rendering a lot, and
I've heard GLBenchmark needed it too.
2016-09-14 06:25:41 +01:00
Eric Anholt
f473348468 vc4: Handle resolve skipping at job submit time.
This is done in vc4_flush currently, but I'm going to make the job always
track the surfaces it might be rendering to instead of putting in the
destinations at flush time.
2016-09-14 06:08:03 +01:00
Eric Anholt
9688166bd9 vc4: Move the render job state into a separate structure.
This is a preparation step for having multiple jobs being queued up at the
same time.
2016-09-14 06:08:03 +01:00
Eric Anholt
c31a7f529f vc4: Always unref the current job surfaces at job reset time.
Drops some tricky logic in vc4_flush() trying to update the pointers, and
fixes a broken lack of unref for MSAA surfaces at context destroy time.
2016-09-14 06:08:03 +01:00
Eric Anholt
774a556b6d vc4: Move job-submit skip cases to vc4_job_submit().
For calling job_submit() directly, I need the skipping here.
2016-09-14 06:08:03 +01:00
Eric Anholt
0ef1b32ebb vc4: Move bin CL trailer to job_submit() time.
To implement job shuffling, I want to be able to call submit() on specific
jobs, turning vc4_flush() into the context's flush-all-jobs hook.
2016-09-14 06:08:03 +01:00
Marek Olšák
e7a73b75a0 gallium: switch drivers to the slab allocator in src/util
2016-09-06 14:24:04 +02:00
Eric Anholt
21a9ed6207 vc4: Don't flush on read-only access of buffers read by the CL.
Fixes piglit mixed-immediate-and-vbo, and may significantly improve
performance of applications that store a 4-byte IB in the same VBO as
vertex data.
2016-04-18 10:10:44 -07:00
Marek Olšák
ecb2da1559 u_upload_mgr: allow specifying PIPE_USAGE_* for the upload buffer
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-02 15:15:45 +01:00
Marek Olšák
37d0aea772 u_upload_mgr: remove alignment parameter from u_upload_create
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-02 15:15:45 +01:00
Eric Anholt
c5b886b028 vc4: Only update vc4->msaa when the framebuffer changes.
Any update here should have been the same as in
vc4_set_framebuffer_state(), except for the point where vc4_blit.c
temporarily sets different state for its different buffers.
2015-12-15 12:02:53 -08:00
Eric Anholt
f2cf2a63f1 vc4: Don't consider nr_samples==1 surfaces to be MSAA.
This is apparently a weirdness of gallium -- nr_samples==1 is occasionally
used and means the same thing as nr_samples==0.  Fixes a bunch of
ARB_framebuffer_srgb blit cases in piglit.
2015-12-15 12:02:53 -08:00
Eric Anholt
edfd4d853a vc4: Add support for drawing in MSAA.
2015-12-08 09:49:53 -08:00
Edward O'Callaghan
13eb5f596b gallium/drivers: Sanitize NULL checks into canonical form
Use NULL tests of the form `if (ptr)' or `if (!ptr)'.
They do not depend on the definition of the symbol NULL.
Further, they remove the opportunity for accidental
assignment and are clear and succinct.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-12-06 17:10:23 +01:00
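For reference, the form of NULL test that 13eb5f596b settles on looks like the sketch below. The function is a made-up example and is not code from the gallium drivers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper showing the canonical test form. */
size_t example_name_length(const char *name)
{
    if (!name)              /* rather than: if (name == NULL) */
        return 0;
    return strlen(name);
}
```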
Eric Anholt
a664233042 vc4: Add support for loading sample mask.
2015-12-04 09:10:53 -08:00
Eric Anholt
b6cd39fc47 vc4: Fix a leak of the last color read/write surface on context destroy.
2015-10-06 16:32:03 -07:00
Marek Olšák
0fc21ecfc0 gallium: add flags parameter to pipe_screen::context_create
This allows creating compute-only and debug contexts.

Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:18 +02:00
Eric Anholt
7432017f65 vc4: Rework cl handling to be friendlier to the compiler.
Drops 680 bytes of code, from avoiding a bunch of extra updates to the
next pointer in the struct.
2015-07-14 11:31:57 -07:00
Eric Anholt
a0d3915663 vc4: Make a helper function for getting the current offset in the CL.
I needed to rewrite this a bit for safety checking in the next commit.
Despite being a static inline of the same thing that was being done, we
lose 36 bytes of code for some reason.
2015-07-14 11:31:57 -07:00
Rob Clark
ab3ba21f97 vc4: unref old fence
Some, but not all, state trackers will explicitly unref (and set to
NULL) the previous *fence before calling pipe->flush().  So driver
should use fence_ref() which will unref the old fence if not NULL.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Acked-by: Eric Anholt <eric@anholt.net>
2015-07-10 11:57:30 -04:00
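The pattern ab3ba21f97 relies on can be sketched as a reference helper that takes a reference on the new fence and drops the one previously held in the caller's pointer, so a stale fence is never leaked by overwriting it. The struct and the `example_*` functions are simplified stand-ins chosen for illustration, not the gallium interfaces.

```c
#include <stdlib.h>

/* Hypothetical reference-counted fence. */
struct example_fence { int refcount; };

struct example_fence *example_fence_create(void)
{
    struct example_fence *f = calloc(1, sizeof(*f));
    if (f)
        f->refcount = 1;
    return f;
}

/* Point *ptr at fence: reference the new fence first, then drop the old one. */
void example_fence_reference(struct example_fence **ptr,
                             struct example_fence *fence)
{
    if (fence)
        fence->refcount++;
    if (*ptr && --(*ptr)->refcount == 0)
        free(*ptr);
    *ptr = fence;
}

/* Flush-style hook: hand the caller a reference to the latest fence.
 * Using the helper means a non-NULL *out is unreferenced, not overwritten. */
void example_flush(struct example_fence *latest, struct example_fence **out)
{
    if (out)
        example_fence_reference(out, latest);
}
```

Taking the new reference before dropping the old one keeps the helper safe when both pointers refer to the same fence.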
Eric Anholt
1d45e44b2f vc4: Move tile state/alloc allocation into the kernel.
This avoids a security issue where userspace could have written the tile
state/tile alloc behind the GPU's back, and will apparently be necessary
for fixing stability bugs (tile state buffers are missing some top bits
for the tile alloc's address).
2015-06-17 23:53:49 -07:00
Eric Anholt
9adcd2d80a vc4: Move RCL generation into the kernel.
There weren't that many variations of RCL generation, and this lets us
skip all the in-kernel validation for what we generated.
2015-06-17 23:53:49 -07:00
Eric Anholt
731ac05cc4 vc4: Use VC4_SET/GET_FIELD for some RCL packets.
2015-06-16 15:15:14 -07:00
Eric Anholt
e22a192784 vc4: Make symbolic values for packet sizes.
2015-06-16 15:15:14 -07:00
Eric Anholt
10aacf5ae8 vc4: Just stream out fallback IB contents.
The idea I had when I wrote the original shadow code was that you'd see a
set_index_buffer to the IB, then a bunch of draws out of it.  What's
actually happening in openarena is that set_index_buffer occurs at every
draw, so we end up making a new shadow BO every time, and converting more
of the BO than is actually used in the draw.

While I could maybe come up with a better caching scheme, for now just
do the simple thing that doesn't result in a new shadow IB allocation
per draw.

Improves performance of isosurf in drawelements mode by 58.7967% +/-
3.86152% (n=8).
2015-05-27 17:29:11 -07:00
Eric Anholt
e214a59635 vc4: Separate out a bit of code for submitting jobs to the kernel.
I want to be able to have multiple jobs being set up at the same time (for
example, a render job to do a little fixup blit in the course of doing a
render to the main FBO).
2015-04-13 23:20:45 -07:00
Eric Anholt
5100221ff7 vc4: Skip sending down the clear colors if not clearing.
2015-04-13 10:39:24 -07:00
Eric Anholt
cb88d2cfcb vc4: Fix another space allocation mistake.
We're over-allocating our BCL in vc4_draw.c, so this never mattered.
However, new RCL-only blit support might end up here without having set up
any BCL contents.
2015-04-13 10:39:02 -07:00
Eric Anholt
8eb9304ee7 vc4: Add missed accounting for the size of the semaphore.
This wouldn't have mattered except in the worst case scenario RCL setup.
2015-04-13 10:33:30 -07:00
Eric Anholt
49d3c6a8e6 vc4: Update to current kernel sources.
New BO create and mmap ioctls are added.  The submit ABI gains a flags
argument, and the pointers are fixed at 64-bit.  Shaders are now fixed at
the start of their BOs.
2015-02-24 13:49:12 +00:00
Eric Anholt
d0d6d24723 vc4: Fix CL dumping trying to dump too far.
Execution will end at the cl->next, because that's what ct0ea/ct1ea get
programmed to.
2015-01-15 22:19:25 +13:00
Eric Anholt
d1f2fc834d vc4: Fix early Z behavior on hardware.
It turns out the simulator was not treating this bit the same as the RPi,
and I'd forgotten to remove it when turning on early Z.  The result was
that you'd get big chunks of your rendering missing.
2015-01-15 22:19:25 +13:00
Eric Anholt
b295403971 vc4: Skip storing the Z/S contents when it's invalidated.
Improves framerate of 5 seconds of es2gears by 1.57473% +/- 0.669409%
(n=67).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-01-06 15:40:41 -08:00
Eric Anholt
37478c638a vc4: Fix memory leak as of 0404e7fe0a.
Can't reset the CL before looking at how much we had put in it.
2014-12-31 11:34:28 -08:00
Eric Anholt
3ba57bae47 vc4: Only render tiles where the scissor ever intersected them.
This gives a 2.7x improvement in x11perf -rect100, since we only end up
load/storing the x11perf window, not the whole screen.
2014-12-30 14:33:52 -08:00
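One plausible reading of 3ba57bae47 is that the driver accumulates the bounds of every scissored draw and then renders only the tiles covering that box. The sketch below shows the arithmetic under stated assumptions: the 64x64 tile size, the struct, and the `example_*` functions are all hypothetical and are not taken from the driver.

```c
#include <stdint.h>

/* Assumed tile dimensions in pixels; the real values depend on the hardware setup. */
#define EXAMPLE_TILE_W 64u
#define EXAMPLE_TILE_H 64u

/* Accumulated draw bounds in pixels; max_x and max_y are exclusive. */
struct example_bounds { uint32_t min_x, min_y, max_x, max_y; };

/* Start with an empty box, meaning nothing has been drawn yet. */
void example_bounds_reset(struct example_bounds *b)
{
    b->min_x = b->min_y = UINT32_MAX;
    b->max_x = b->max_y = 0;
}

/* Grow the box to include one draw's scissor rectangle. */
void example_bounds_add(struct example_bounds *b,
                        uint32_t x0, uint32_t y0, uint32_t x1, uint32_t y1)
{
    if (x0 < b->min_x) b->min_x = x0;
    if (y0 < b->min_y) b->min_y = y0;
    if (x1 > b->max_x) b->max_x = x1;
    if (y1 > b->max_y) b->max_y = y1;
}

/* Number of tiles that need a load/store for the accumulated box. */
uint32_t example_tiles_to_render(const struct example_bounds *b)
{
    if (b->max_x <= b->min_x || b->max_y <= b->min_y)
        return 0;

    uint32_t tx0 = b->min_x / EXAMPLE_TILE_W;
    uint32_t ty0 = b->min_y / EXAMPLE_TILE_H;
    uint32_t tx1 = (b->max_x - 1) / EXAMPLE_TILE_W;
    uint32_t ty1 = (b->max_y - 1) / EXAMPLE_TILE_H;

    return (tx1 - tx0 + 1) * (ty1 - ty0 + 1);
}
```

With these assumed tile sizes, a 200x150 window at (100, 50) touches 16 tiles, while a full 1920x1080 screen spans 30 x 17 = 510.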
Eric Anholt
0404e7fe0a vc4: Move draw call reset handling to a helper function.
This will be more important in the next commit, when there's more state to
reset to nonzero values, and I want an early exit from the submit
function.
2014-12-30 14:30:59 -08:00
Eric Anholt
229bf4475f vc4: Optimize CL emits by doing size checks up front.
The optimizer obviously doesn't have the ability to rewrite these to skip
the size checks per call, so we have to do it manually.

Improves a norast benchmark on simulation by 0.779706% +/- 0.405838%
(n=6087).
2014-12-24 10:28:26 -10:00
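The technique named in 229bf4475f can be sketched as reserving space for a whole packet once and then writing its fields with unchecked helpers. The command-list struct, the growth policy, and the `example_*` names are assumptions for illustration, not the driver's CL code.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable command list. */
struct example_cl { uint8_t *base; size_t used; size_t size; };

/* Single up-front check: make sure space more bytes fit. Returns 0 on success. */
int example_cl_ensure_space(struct example_cl *cl, size_t space)
{
    if (cl->used + space <= cl->size)
        return 0;

    size_t new_size = cl->size ? cl->size : 64;
    while (new_size < cl->used + space)
        new_size *= 2;

    uint8_t *p = realloc(cl->base, new_size);
    if (!p)
        return -1;
    cl->base = p;
    cl->size = new_size;
    return 0;
}

/* Unchecked emits: the caller must already have reserved the space. */
static inline void example_cl_u8(struct example_cl *cl, uint8_t v)
{
    cl->base[cl->used++] = v;
}

static inline void example_cl_u32(struct example_cl *cl, uint32_t v)
{
    memcpy(cl->base + cl->used, &v, sizeof(v));
    cl->used += sizeof(v);
}

/* Emit a 9-byte packet with one size check instead of three. */
int example_emit_packet(struct example_cl *cl, uint8_t opcode,
                        uint32_t a, uint32_t b)
{
    if (example_cl_ensure_space(cl, 1 + 4 + 4) != 0)
        return -1;
    example_cl_u8(cl, opcode);
    example_cl_u32(cl, a);
    example_cl_u32(cl, b);
    return 0;
}
```

The trade-off is that the per-field helpers are only safe after a matching reservation, so the packet size has to be known before the first field is written.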