third_party_mesa3d

Author	SHA1	Message	Date
Eric Anholt	a8fd58eae5	vc4: Add labels to BOs for debug builds or with VC4_DEBUG=surf set. This has proven to be incredibly useful for debugging CMA allocation failures and driving memory management improvements. However, we don't want to burden entry and exit from the BO cache with the labeling ioctl's overhead on release builds.	2017-09-27 10:21:49 -07:00
Eric Anholt	68c91a87d7	broadcom/vc4: Keep pipe_sampler_view->texture matching the original texture. I was overwriting view->texture with the shadow resource when we need to do shadow copies (retiling or baselevel rebase), but that tripped up some critical new sanity checking in state_tracker (making sure that stObj->pt hasn't changed from view->texture through TexImage-related paths). To avoid that, move the shadow resource to the vc4_sampler_view struct. Fixes: `f0ecd36ef8` ("st/mesa: add an entirely separate codepath for setting up buffer views")	2017-09-26 14:49:43 -07:00
Lucas Stach	c481880899	renderonly/etnaviv: stop importing resource from renderonly. The current way of importing the resource from renderonly after allocation is opaque and is taking away control from the driver, which it needs in order to implement more advanced scenarios than the simple linear scanout with matching stride alignments. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Acked-by: Daniel Stone <daniels@collabora.com>	2017-07-19 16:26:49 +02:00
Eric Anholt	84ed8b67c5	vc4: Set shareable BOs as T tiled if possible. X11 and GL compositor performance on VC4 has been terrible because of our SHARED-usage buffers all being forced to linear. This swaps SHARED && !LINEAR buffers over to being tiled. This is an expected win for all GL compositors during rendering (a full copy of each shared texture per draw call), allows X11 to be used with decent performance without a GL compositor, and improves X11 windowed swapbuffers performance as well. It also halves the memory usage of shared buffers that get textured from. The only cost should be idle systems with a scanout-only buffer that isn't flagged as LINEAR, in which case the memory bandwidth cost of scanout goes up ~25%. This implements the EGL_EXT_image_dma_buf_import_modifiers extension, supporting the VC4 T_TILED modifier. v2: Added modifier support to resource creation/import, and advertisement (by daniels). v3: Fix old-kernel fallback path, fix compiler error and warnings, and comment touchups (by anholt). Reviewed-by: Daniel Stone <daniels@collabora.com>	2017-07-12 10:58:33 -07:00
Eric Anholt	bb466a996f	vc4: Use vc4_setup_slices for resource import. Rather than open-coding populating the first slice inside resource import, use vc4_setup_slices to do it for us. v2: Rebase on VC4_DEBUG=surf change. Reviewed-by: Daniel Stone <daniels@collabora.com>	2017-07-12 10:58:33 -07:00
Eric Anholt	111b6b77cb	vc4: Make the miptree debug code available under VC4_DEBUG=surf. I kept flipping the bool on for debug, so let's just make it available. Reviewed-by: Daniel Stone <daniels@collabora.com>	2017-07-12 10:58:33 -07:00
Eric Anholt	2aec62a45b	vc4: Remove a stale comment. The kernel hasn't been synchronous in a couple of years, plus there was synchronization code right there.	2017-07-12 10:58:33 -07:00
Eric Anholt	7029ec05e2	gallium: Add renderonly-based support for pl111+vc4. This follows the model of imx (display) and etnaviv (render): pl111 is a display-only device, so when asked to do GL for it, we see if we have a vc4 renderer, make the vc4 screen, and have vc4 call back to pl111 to do scanout allocations. The difference from etnaviv is that we share the same BO between vc4 and pl111, rather than having a vc4 bo and a pl111 bo and copies between the two. The only mismatch between their requirements is that vc4 requires 4-pixel (at 32bpp) stride alignment, while pl111 requires that stride match width. The kernel will reject any modesets to an incorrect stride, so the 3D driver doesn't need to worry about that. v2: Rebase on Android rework, drop unused include. v3: Fix another Android bug, from Rob Herring's build-testing. Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-06-15 11:41:22 -07:00
Rhys Kidd	499f45163a	vc4: Remove dead code in vc4_dump_surface_msaa(). Coverity caught the use of dead code copy-paste for found_colors[] and num_found_colors. CID: 1341850 Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-05-22 09:50:22 -07:00
Eric Anholt	e8ea42d245	vc4: Don't allocate new BOs to avoid synchronization when they're shared. If X11 did a software fallback to the entire screen, we would throw out the BO the screen is scanning out from and allocate a new one. Cc: mesa-stable@lists.freedesktop.org	2017-05-17 14:18:29 -07:00
Eric Anholt	50e78cd04f	vc4: Drop pointless indirections around BO import/export. I've since found them to be more confusing by adding indirections than clarifying by screening off resources from the handle/fd import/export process.	2017-05-17 14:18:26 -07:00
Eric Anholt	76e4ab5715	vc4: Drop the u_resource_vtbl no-op layer. We only ever attached one vtbl, so it was a waste of space and indirections.	2017-05-17 14:18:26 -07:00
Marek Olšák	330d0607ed	gallium: remove pipe_index_buffer and set_index_buffer. pipe_draw_info::indexed is replaced with index_size. index_size == 0 means non-indexed. Instead of pipe_index_buffer::offset, pipe_draw_info::start is used. For indexed indirect draws, pipe_draw_info::start is added to the indirect start. This is the only case when "start" affects indirect draws. pipe_draw_info::index is a union. Use either index::resource or index::user depending on the value of pipe_draw_info::has_user_indices. v2: fixes for nine, svga	2017-05-10 19:00:16 +02:00
Eric Anholt	80157466cd	vc4: Add miptree/texture state support for ETC1 compressed textures. The format isn't flagged as enabled at runtime yet, because we need kernel validation support.	2016-11-03 18:42:58 -07:00
Eric Anholt	99d790538d	vc4: Avoid loading from the texture during non-utile-aligned glTexImage(). Previously, the plan was "if the width/height we have to load/store isn't the size the user is planning on writing, then we need to load the old contents out beforehand to prevent writing back undefined". However, when we're doing glTexImage() we often end up aligning the width/height into the padding of the texture, and we don't actually need to read out that padding. Improves x11perf -aatrapezoid100 performance from ~460/sec to ~700/sec.	2016-10-13 14:27:30 -07:00
Eric Anholt	9421a6065c	vc4: Fix fallback to quad clears of depth in GLX. The fix in the vc4-jobs series ended up triggering the fallback path on GLX apps that use depth but not stencil.	2016-10-06 18:09:24 -07:00
Eric Anholt	8810270d06	vc4: Add the format name in miptree_debug. I was curious if my Z/S buffer was actually ZS or ZX, and the vc4 format of "0" didn't tell me much.	2016-10-06 18:09:24 -07:00
Nicolai Hähnle	2a83036fe2	vc4: use the new parent/child pools for transfers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-10-05 15:42:20 +02:00
Eric Anholt	f597ac3966	vc4: Implement job shuffling. Track rendering to each FBO independently and flush rendering only when necessary. This lets us avoid the overhead of storing and loading the frame when an application momentarily switches to rendering to some other texture in order to continue rendering the main scene. Improves glmark -b desktop:effect=shadow:windows=4 by 27%. Improves glmark -b desktop:blur-radius=5:effect=blur:passes=1:separable=true:windows=4 by 17%. While I haven't tested other apps, this should help X rendering a lot, and I've heard GLBenchmark needed it too.	2016-09-14 06:25:41 +01:00
Eric Anholt	a2014c2eb9	vc4: Simplify the DISCARD_RANGE handling. It's really just an upgrade to attempting WHOLE_RESOURCE. Pulling the logic out caught two bugs in it: We would try to do so on cubemaps (even though we're only mapping 1 of the 6 slices), and we would break persistent coherent mappings by trying to reallocate when we shouldn't.	2016-09-14 06:08:03 +01:00
Marek Olšák	e7a73b75a0	gallium: switch drivers to the slab allocator in src/util	2016-09-06 14:24:04 +02:00
Marek Olšák	1ffe77e7bb	gallium: split transfer_inline_write into buffer and texture callbacks to reduce the call indirections with u_resource_vtbl. The worst call tree you could get was: - u_transfer_inline_write_vtbl - u_default_transfer_inline_write - u_transfer_map_vtbl - driver_transfer_map - u_transfer_unmap_vtbl - driver_transfer_unmap That's 6 indirect calls. Some drivers only had 5. The goal is to have 1 indirect call for drivers that care. The resource type can be determined statically at most call sites. The new interface is: pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data) pipe_context::texture_subdata(ctx, resource, level, usage, box, data, stride, layer_stride) v2: fix whitespace, correct ilo's behavior Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Roland Scheidegger <sroland@vmware.com>	2016-07-23 13:33:42 +02:00
Eric Anholt	3bcd0f1912	vc4: Speed up glGenerateMipmaps by avoiding shadow baselevel. To support general GL_TEXTURE_BASE_LEVEL we have to copy to a temporary miptree. However, if a single level is being selected, we can use the existing miptree and force all the sampling to be from that particular level. This avoids a ton of software fallbacks in glGenerateMipmaps(), which uses base levels in the blit implementation in gallium. Improves "glmark2 -b terrain" from 2 fps to 3 (perhaps some more precision would be useful?), and cuts its CPU usage during the benchmarking from ~30% to ~10% (total CPU time from 8.8s to 7.6s).	2016-07-15 13:54:00 -07:00
Eric Engestrom	e8959ba7af	vc4: fix memory leak. The allocation has succeeded by that point, so it needs to be freed. CovID: 1358929 Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-07-12 15:47:12 -07:00
Rob Herring	067c5b10b6	vc4: fix vc4_resource_from_handle() stride calculation. The expected stride calculation is completely wrong. It should ultimately be multiplying cpp and width rather than dividing. The width also needs to be aligned to the tiling width first before converting to stride bytes. The whole stride check here is possibly pointless. Any buffers which were allocated outside of vc4 may have strides with larger alignment requirements. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2016-06-15 14:54:38 -07:00
Eric Anholt	a507dcc160	vc4: Size transfer temporary mappings appropriately for full maps of 3D. We don't really support reading/writing of 3D textures since the hardware doesn't do 3D, but we do need to make sure that a pipe_transfer for them has enough space to store the image. This was previously not a problem because the state tracker only mapped a slice at a time until `fb9fe352ea`. Fixes glean glsl1 tests, which all have setup of a 3D texture at the start.	2016-05-18 17:30:07 -07:00
Eric Anholt	48fe53bbb9	vc4: Add support for rendering to cube map surfaces. We need to fix up the offset to point at the face of the cube. Fixes piglit fbo-cubemap, copyteximage CUBE, and glean's fbo test. Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-04-18 10:10:44 -07:00
Eric Anholt	21a9ed6207	vc4: Don't flush on read-only access of buffers read by the CL. Fixes piglit mixed-immediate-and-vbo, and may significantly improve performance of applications that store a 4-byte IB in the same VBO as vertex data.	2016-04-18 10:10:44 -07:00
Eric Anholt	56b14adf85	vc4: Sanity check strides for imported BOs. If we're going to sample from or render to them at some particular size, we'd better make sure that they actually are that size. Causes some tests under simulation to generate appropriate error messages instead of failures.	2016-04-18 10:10:44 -07:00
Marek Olšák	82db518f15	gallium: add external usage flags to resource_from(get)_handle (v2). This will allow drivers to make better decisions about texture sharing for DRI2, DRI3, Wayland, and OpenCL. v2: add read/write flags, take advantage of __DRI_IMAGE_USE_BACKBUFFER. Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-03-09 15:02:25 +01:00
Eric Anholt	64253fdb2e	vc4: Fix build from upload changes.	2016-01-02 17:33:19 -08:00
Marek Olšák	020009f7cc	u_upload_mgr: pass alignment to u_upload_alloc manually. The fixed alignment of u_upload_mgr will go away. This is the first step. The motivation is that one u_upload_mgr can have multiple users, each allocating from the same buffer, but requiring a different alignment. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Eric Anholt	f2cf2a63f1	vc4: Don't consider nr_samples==1 surfaces to be MSAA. This is apparently a weirdness of gallium -- nr_samples==1 is occasionally used and means the same thing as nr_samples==0. Fixes a bunch of ARB_framebuffer_srgb blit cases in piglit.	2015-12-15 12:02:53 -08:00
Eric Anholt	fc4a1bfb88	vc4: Add support for mapping of MSAA resources. The pipe_transfer_map API requires that we do an implicit downsample/upsample and return a mapping of that.	2015-12-08 09:49:56 -08:00
Eric Anholt	a69ac4e89c	vc4: Add debug dumping of MSAA surfaces.	2015-12-04 09:24:36 -08:00
Eric Anholt	3c3b1184eb	vc4: Add support for laying out MSAA resources. For MSAA, we store full resolution tile buffer contents, which have their own tiling format. Since they're full resolution buffers, we have to align their size to full tiles.	2015-12-04 09:24:36 -08:00
Eric Anholt	eb8fb0064d	vc4: Return GL_OUT_OF_MEMORY when buffer allocation fails. I was afraid our callers weren't prepared for this, but it looks like at least for resource creation, mesa/st throws an error appropriately. Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-11-09 19:17:36 -08:00
Eric Anholt	855a3ca598	vc4: Fix a compiler warning.	2015-11-09 19:17:36 -08:00
Eric Anholt	04c42f3ab5	vc4: Allow user index buffers, to avoid slow readback for shadow IBs. Improves low-settings openarena performance by 31.9975% +/- 0.659931% (n=7).	2015-10-29 22:58:01 -07:00
Eric Anholt	2e04492a14	vc4: Skip re-emitting the shader_rec if it's unchanged. It's a bunch of work for us to emit it (and its uniforms), more work for the kernel to validate it, and additional work for the CLE to read it. Improves es2gears framerate by about 50%. Signed-off-by: Eric Anholt <eric@anholt.net>	2015-07-28 20:02:16 -07:00
Eric Anholt	bb107110a4	vc4: Fix write-only texsubimage when we had to align. We need to make sure that when we store the aligned box, we've got initialized contents in the border. We could potentially just load the border area, but for now let's get text rendering working in X (and fix the GL_TEXTURE_2D errors in piglit's texsubimage test and gl-2.1-pbo/test_tex_image)	2015-06-20 00:16:32 -07:00
Eric Anholt	10aacf5ae8	vc4: Just stream out fallback IB contents. The idea I had when I wrote the original shadow code was that you'd see a set_index_buffer to the IB, then a bunch of draws out of it. What's actually happening in openarena is that set_index_buffer occurs at every draw, so we end up making a new shadow BO every time, and converting more of the BO than is actually used in the draw. While I could maybe come up with a better caching scheme, for now just do the simple thing that doesn't result in a new shadow IB allocation per draw. Improves performance of isosurf in drawelements mode by 58.7967% +/- 3.86152% (n=8).	2015-05-27 17:29:11 -07:00
Eric Anholt	3a728d4dfb	vc4: Update the shadow texture for public textures on every draw. We don't know who else has written to it, so we'd better update it every time. This makes the gears spin in X again.	2015-04-15 16:50:23 -07:00
Eric Anholt	bd957b1b79	vc4: Hook up VC4_DEBUG=perf to some useful printfs.	2015-04-15 16:50:22 -07:00
Eric Anholt	43b20795b7	vc4: Move the blit code to a separate file. There will be other blit code showing up, and it seems like the place you'd look.	2015-04-13 23:20:45 -07:00
Eric Anholt	adae027260	vc4: Use the blit interface for updating shadow textures. This lets us plug in a better blit implementation and have it impact the shadow update, too.	2015-04-13 10:39:24 -07:00
Eric Anholt	7bc39c8418	vc4: Add a dump-the-surface-contents routine. This has been useful once again while trying to debug stride issues between render targets and texturing.	2015-03-24 10:39:12 -07:00
Eric Anholt	7f797e3d17	vc4: Fix pitch alignment of linear textures. Fixes some non-power-of-two texture rendering when I force ARGB8888 to raster.	2015-03-24 10:39:12 -07:00
Eric Anholt	8975a09494	vc4: Fix use of a bool as an enum. The enum compared to was 0, so it worked out, but it sure looked wrong.	2015-03-24 10:39:12 -07:00
Eric Anholt	04605c21f6	vc4: Decide the HW's format before laying out the miptree. I'm experimenting with a workaround for raster texture misrendering on hardware, and this lets me look at the format chosen when computing strides.	2015-03-24 10:39:12 -07:00

73 Commits