third_party_mesa3d

Author	SHA1	Message	Date
Eric Anholt	0a1c6bcfb0	intel: Finish renaming fallback_debug() to perf_debug(). They're about to change to handle GL_ARB_debug_output, so just make one function. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-03-05 14:25:00 -08:00
Eric Anholt	807eedf70f	intel: Hook up the WARN_ONCE macro to GL_ARB_debug_output. This doesn't provide detailed error type information, but it's important to get these relatively severe but rare error messages out to the developer through whatever mechanism they are using. v2: Rebase on new WARN_ONCE additions. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)	2013-03-05 14:25:00 -08:00
Paul Berry	a4b9678a54	Consolidate some redundant definitions of ARRAY_SIZE() macro. Previous to this patch, there were 13 identical definitions of this macro in Mesa source. That's ridiculous. This patch consolidates 6 of them to a single definition in src/mesa/main/macros.h. Unfortunately, I wasn't able to eliminate the remaining definitions, since they occur in places that don't include src/mesa/main/macros.h: - include/pci_ids/pci_id_driver_map.h - src/egl/drivers/dri2/egl_dri2.h - src/egl/main/egldefines.h - src/gbm/main/backend.c - src/gbm/main/gbm.c - src/glx/glxclient.h - src/mapi/mapi/stub.c I'm open to suggestions as to how to deal with the remaining redundancy. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-08 06:51:22 -08:00
Eric Anholt	99fe2b36cf	intel: Use a CPU map of the batch on LLC-sharing architectures. Before, we were keeping a CPU-only buffer to accumulate the batchbuffer in, which was an improvement over mapping the batch through the GTT directly (since any readback or other failure to stream through write combining correctly would hurt). However, on LLC-sharing architectures we can do better by mapping the batch directly, which reduces the cache footprint of the application since we no longer have this extra copy of a batchbuffer around. Improves performance of GLBenchmark 2.1 offscreen on IVB by 3.5% +/- 0.4% (n=21). Improves Lightsmark performance by 1.1 +/- 0.1% (n=76). Improves cairo-gl performance by 1.9% +/- 1.4% (n=57). No statistically significant difference in GLB2.1 on SNB (n=37). Improves cairo-gl performance by 2.1% +/- 0.1% (n=278).	2013-01-29 11:25:14 +11:00
Chad Versace	a11fe62058	intel: Move validation of context version into intelInitContext Each driver (i830, i915, i965) used independent but similar code to validate the requested context version. With the recent arrival of GLES3, that logic has needed an update. Rather than apply identical updates to each driver's validation code, let's just move the validation into the shared routine intelInitContext. This refactor required some incidental changes to functions i830CreateContext and intelInitContext. For each function, this patch: - Adds context version parameters to the signature. - Adds a DRI_CTX_ERROR out param to the signature. - Sets the DRI_CTX_ERROR at each early return. Tested against gen6 with piglit egl-create-context-verify-gl-flavor. Verified that this patch does not change the set of exposed EGL context flavors. Signed-off-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-01-15 13:45:51 -08:00
Kenneth Graunke	4a6753926f	i965: Add an INTEL_DEBUG=no16 option. Often when debugging, I don't want to see SIMD16 shaders. It makes INTEL_DEBUG=vs/fs output much easier to read, especially when a program dumps many shaders. Plus, I also want to verify that SIMD8 works before even considering SIMD16. v2: Fix the likeliness check (caught by Chris and Eric). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-01-12 15:35:38 -08:00
Paul Berry	8706395f25	mesa: Add ALIGN() macro to main/macros.h. Previously this macro existed in 3 separate places, some inside the intel driver and some outside of it. It makes more sense to have it in main/macros.h Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-01-08 09:08:57 -08:00
Eric Anholt	71f06344a0	i965: Add a debug flag for counting cycles spent in each compiled shader. This can be used for two purposes: Using hand-coded shaders to determine per-instruction timings, or figuring out which shader to optimize in a whole application. Note that this doesn't cover the instructions that set up the message to the URB/FB write -- we'd need to convert the MRF usage in these instructions to GRFs so that our offsets/times don't overwrite our shader outputs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) v2: Check the timestamp reset flag in the VS, which is apparently getting set fairly regularly in the range we watch, resulting in negative numbers getting added to our 32-bit counter, and thus large values added to our uint64_t. v3: Rebase on reladdr changes, removing a new safety check that proved impossible to satisfy. Add a comment to the AOP defs from Ken's review, and put them in a slightly more sensible spot. v4: Check timestamp reset in the FS as well.	2012-12-05 14:29:44 -08:00
Eric Anholt	1fe71848b6	intel: Add a macro for printing a debug warning once. There are a number of places where some obscure piece of the code is not currently worth fixing, and we have some workaround behavior available. It's nicer for users to do some lame workaround than to just assert, but without asserts we never knew when the workaround was at fault. This should give us a nice compromise: Execute the workaround, but mention that the obscure workaround was hit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2012-10-16 13:13:44 -07:00
Oliver McFadden	1b921acd5f	intel: print debug either to stdout or `logcat' depending on platform. Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2012-10-12 11:14:54 +03:00
Eric Anholt	b0d23b66cf	intel: Move RenderMode fallback func to i915 driver. The Fallback field of the context struct doesn't work that way on i965, and it's the only caller of FALLBACK() in the driver. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2012-08-28 11:43:04 -07:00
Kenneth Graunke	28fab4295e	i965: Un-hardcode WM binding table from update_texture_surface. Currently, we mirror the VS and WM binding tables' texture entries. That may not continue to be true, so in preparation, pass in the binding table and surface index as arguments. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2012-08-25 12:01:10 -07:00
Chad Versace	38b748ce29	intel: Refactor intel_downsample_for_dri2_flush Move it from intel_screen.c to intel_context.c. Redeclare as non-static. A future commit will use it in multiple files. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2012-08-14 16:19:25 -07:00
Eric Anholt	d72ff03e69	i965: Add INTEL_DEBUG=perf for failure to compile 16-wide shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2012-08-12 19:08:25 -07:00
Eric Anholt	79198063b8	intel: Rename INTEL_DEBUG=fall to INTEL_DEBUG=perf. I want to introduce some more debug output for performance surprises that includes fallbacks, but aren't necessarily software rasterization. Leave INTEL_DEBUG=fall in place for those that have used that flag before. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2012-08-12 19:08:24 -07:00
Eric Anholt	5bffbd7ba2	i965: Add an offset argument to constant buffer setup. We'll use this for UBO surfaces. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2012-08-07 13:54:51 -07:00
Kenneth Graunke	860d5bdf98	i965: Add hardware context support. With fixes and updates from Ben Widawsky and comments from Paul Berry. v2: Use drm_intel_gem_context_destroy to destroy hardware context; remove useless initialization of hw_ctx, both suggested by Eric. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Acked-by: Paul Berry <stereotype441@gmail.com>	2012-07-10 15:09:58 -07:00
Eric Anholt	54308f78a2	i965: Drop a layer of indirection in doing HiZ resolves. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2012-05-23 10:18:27 -07:00
Paul Berry	ea8e854b2c	i965: Completely annotate the batch bo when aub dumping. Previously, when the environment variable INTEL_DEBUG=aub was set, mesa would simply instruct DRM to start dumping data to an .aub file, but we would not provide DRM with any information about the format of the data in various buffers. As a result, a lot of the data in the generated .aub file would be unannotated, making further data analysis difficult. This patch causes the entire contents of each batch buffer to be annotated using the data in brw->state_batch_list (which was previously used only to annotate the output of INTEL_DEBUG=bat). This includes data that was allocated by brw_state_batch, such as binding tables, surface and sampler states, depth/stencil state, and so on. The new annotation mechanism requires DRM version 2.4.34. Reviewed-by: Eric Anholt <eric@anholt.net>	2012-05-22 15:19:00 -07:00
Paul Berry	f28a7d0e77	intel: Work around differences between C and C++ scoping rules. In C++, if a struct is defined inside another struct, or its name is first seen inside a struct or function, the struct is nested inside the namespace of the struct or function it appears in. In C, all structs are visible from toplevel. This patch explicitly moves the declarations of intel_batchbuffer to toplevel, so that it does not get nested inside a namespace when header files are included from C++. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2012-05-10 10:30:00 -07:00
Paul Berry	434fc8bde4	intel: Add extern "C" declarations to headers These declarations are necessary to allow C++ code to call C code without causing unresolved symbols (which would make the driver fail to load). Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2012-05-10 10:30:00 -07:00
Kenneth Graunke	180aecb6dc	i965: Add initial IS_HASWELL() macros. For now, these all return 0, as I don't yet want to enable Haswell support. Eventually they will be filled in with proper PCI IDs. Also add an is_haswell field similar to is_g4x to make it easy to distinguish Gen7 and Gen7.5. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2012-03-30 14:38:48 -07:00
Eric Anholt	0247d89183	intel: Ask libdrm to dump an AUB file if INTEL_DEBUG=aub. It also asks for BMPs in the aub file at SwapBuffers time. Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2012-03-21 12:45:05 -07:00
Eric Anholt	67d3ff760a	intel: Drop the INTEL_STRICT_CONFORMANCE environment variable. If you want to test the graphics driver, you want to test it under the conditions that users will see, not some set of additional fallbacks. If you want to test swrast, run the swrast driver (or no_rast=true) instead. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2012-03-20 15:27:46 -07:00
Daniel Vetter	f172eae8b2	i965: fixup W-tile offset computation to take swizzling into account There's even a comment in the code containing the right swizzling computations! Previously this has not been noticed because we need to manually enable swizzling on snb/ivb (kernel 3.4 will do that) and we don't use the separate stencil on ilk (where the bios enables swizzling). This fixes piglit ./bin/fbo-stencil readpixels GL_DEPTH32F_STENCIL8 -auto on recent drm-intel-next kernels. Also remove the comment about ivb, it's stale now. Swizzling detection is done by allocating a temporary x-tiled buffer object. Unfortunately kernels before v3.2 lie on snb/ivb because they claim that swizzling is enabled, but it isn't. The kernel commit that fixes this for backport to pre-v3.2 is commit acc83eb5a1e0ae7dbbf89ca2a1a943ade224bb84 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Mon Sep 12 20:49:16 2011 +0200 drm/i915: fix swizzling on gen6+ But if the kernel doesn't lie, this now works on swizzling and not swizzling machines. NOTE: This is a candidate for the 8.0 branch. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2012-03-05 12:02:47 -08:00
Eugeni Dodonov	7def293204	intel: verify if hardware has LLC support Rely on libdrm HAS_LLC parameter to verify if hardware supports it. In case the libdrm version does not support this check, fall back to the older way of detecting it, which assumed that GPUs newer than GEN6 have it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>	2012-02-04 18:21:22 -02:00
Eric Anholt	796f44d779	intel: Pass the gl_renderbuffer to render_target_supported() vtable method. I'm going to want to go looking at it for an integer texture fix. NOTE: This is a candidate for the 8.0 branch.	2012-01-27 11:46:10 -08:00
Eric Anholt	ccf0d31a21	intel: Fix warnings of undefined ffs(). For some reason these started showing up with the automake conversion. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2012-01-17 10:35:24 -08:00
Eric Anholt	c4089d444a	i965/gen7: Use the updated interface for SO write pointer resetting. The new kernel patch I submitted makes the interface opt-in, so all batchbuffers aren't preceded by the 4 MI_LOAD_REGISTER_IMMs. This requires the updated i915_drm.h present in libdrm 2.4.30. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2012-01-06 09:16:32 -08:00
Eric Anholt	b890f1090c	intel: Make the batchbuffer flush debug more useful. We were printing out the line triggering the flush, but a variety of different causes just printed the line number for intel_flush()'s call of intel_batchbuffer_flush(). Plumb the line numbers from the caller of intel_flush() on through.	2011-12-29 09:33:56 -08:00
Eric Anholt	d84a180417	i965: Base HW depth format setup based on MESA_FORMAT, not bpp. This will make handling new formats (like actually exposing Z32F) easier and more reliable. v2: Remove the check for hiz buffer -- the MESA_FORMAT should really be giving us the value we want even for hiz. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2011-11-29 16:44:51 -08:00
Eric Anholt	27505a105a	i915: Move the texture format setup for this driver out of shared code. The i965 driver is now enabling all of these formats on its own from the surface format table. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2011-11-22 13:58:39 -08:00
Eric Anholt	6661b7596f	intel: Add the context to the render_target_supported() vtbl method. We're going to want to provide different answers per chipset generation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2011-11-22 13:58:38 -08:00
Chad Versace	f17b12278d	intel: Change signature of HiZ resolve functions Now that intel_renderbuffer::region has been replaced with a miptree, the HiZ functions region parameter must be replaced with a miptree parameter. Change the return type from bool to void. Rename the 'depth' parameter to 'layer', because it will correspond to irb->mt_layer. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2011-11-22 10:50:49 -08:00
Kenneth Graunke	5d448b42b7	i965: Add new vtable entries for surface state updating functions. Gen7+ SURFACE_STATE is different from Gen4-6, so we need separate per-generation functions for creating and updating it. However, the usage is the same, and callers just want to utilize the appropriate functions with minimal pain. So, put them in the vtable. Since these take a brw_context pointer and are only used on Gen4, just add a forward declaration. This is the simplest (if not cleanest) solution. It would be nicer to have a i965-specific vtable, but that's a refactor for another day. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2011-11-10 22:51:18 -08:00
Eric Anholt	ac6a376f52	intel: Don't force a batchbuffer flush in readpixels. Renderbuffer mapping handles flushing the batchbuffer if required, so all we need to do is make sure any pending rendering has reached the batchbuffer. Reviewed-by: Brian Paul <brianp@vmware.com>	2011-11-03 23:29:53 -07:00
Eric Anholt	3faf56ffbd	intel: Add an interface for saving/restoring the batchbuffer state. This will be used to avoid the prepare() step in the i965 driver's state setup. Instead, we can just speculatively emit the primitive into the batchbuffer, then check if the batch is too big, rollback and flush, and replay the primitive. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>	2011-10-29 12:15:56 -07:00
Kenneth Graunke	47f1d9deff	i965: Remove "single threaded" INTEL_DEBUG mode. According to the docs for 3DSTATE_PS (Gen7+) and 3DSTATE_WM (Gen6), there is a platform dependent value for the minimum number of pixel shader threads. It may also vary based on whether WIZ Hashing is on. For example, Ivybridge requires at least 4 threads if WIZ hashing is disabled, and 8 if it's enabled. Programming it to use less threads is illegal. Sandybridge appears to have similar restrictions. So on newer platforms, INTEL_DEBUG=sing will probably just hang the GPU. Rather than try to patch it up for newer platforms and extend it to support geometry shaders, just remove it as it isn't that useful anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2011-10-25 17:09:21 -07:00
Chad Versace	7b0f748efa	intel: Add HiZ operations to intel_context::vtbl for all drivers Add the following to the vtbl: hiz_resolve_depthbuffer hiz_resolve_hizbuffer For all drivers for which HiZ is not enabled, the methods are set to be no-ops. If HiZ is enabled, the methods are currently set to empty stubs. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chad Versace <chad@chad-versace.us>	2011-10-18 11:42:54 -07:00
Kenneth Graunke	2e5a1a254e	intel: Convert from GLboolean to 'bool' from stdbool.h. I initially produced the patch using this bash command: for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i 's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i 's/GL_FALSE/false/g' $file; done Then I manually added #include <stdbool.h> to fix compilation errors, and converted a few functions back to GLboolean that were used in core Mesa's function pointer table to avoid "incompatible pointer" warnings. Finally, I cleaned up some whitespace issues introduced by the change. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chad Versace <chad@chad-versace.us> Acked-by: Paul Berry <stereotype441@gmail.com>	2011-10-18 11:38:39 -07:00
Chad Versace	e9adfa2ba1	intel: Assert that no batch is emitted if a region is mapped What I would prefer to assert is that, for each region that is currently mapped, no batch is emitted that uses that region's bo. However, it's much easier to implement this big hammer. Observe that this requires that the batch flush in intel_region_map() be moved to within the map_refcount guard. v2: Add comments (borrowed from anholt's reply) explaining why the assertion is a good idea. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad@chad-versace.us>	2011-10-11 17:16:31 -07:00
Chad Versace	9559ca600d	i965: Split brw_set_prim into brw/gen6 variants The "slight optimization to avoid the GS program" in brw_set_prim() is not used by Gen 6, since Gen 6 doesn't use a GS program. Also, Gen 6 doesn't use reduced primitives. Also, document that intel_context.reduced_primitive is only used for Gen < 6 Reviewed-by: Eric Anholt <eric@anho.net> Signed-off-by: Chad Versace <chad@chad-versace.us>	2011-10-10 13:23:41 -07:00
Kenneth Graunke	490e6470a0	intel: Introduce a new intel_context::gt field to go along with gen. It seems that GT1/GT2 sorts of variations are here to stay, and more special cases will likely be required in the future. Checking by PCI ID via the IS_xxx_GTx macros is cumbersome; introducing a new 'gt' field analogous to intel->gen will make this easier. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2011-09-26 11:50:31 -07:00
Kenneth Graunke	3f9f1b3659	intel: Remove intel_context::has_xrgb_textures/has_luminance_srgb. Seeing as they were only used once (in the same function they were defined), having them as context members seemed rather pointless. Remove them entirely (rather than using local variables) since the chipset generation checks are actually just as straightforward. While we're at it, clean up the remainder of the if-tree that set them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2011-09-26 11:50:31 -07:00
Eric Anholt	cb5e0ba2aa	i915: Simplify intel_wpos_* with a helper function.	2011-07-18 11:26:34 -07:00
Eric Anholt	f34ec6169d	intel: Move intel_draw_buffers() code into each driver. The illusion of shared code here wasn't fooling anybody. It was tempting to keep i830 and i915 still shared, but I think I actually want to make them diverge shortly. Reviewed-by: Chad Versace <chad@chad-versace.us>	2011-07-18 11:26:33 -07:00
Eric Anholt	6e6b388604	i915: Fix map/unmap mismatches from leaving INTEL_FALLBACK during TNL. The first rendering after context create didn't know of the color buffer yet, triggering a sw fallback. The intel_prepare_render() from intelSpanRenderStart then found the buffer and turned off fallbacks, but intelSpanRenderFinish was never called and things were left mapped. By checking buffers before making the call on whether to do the fallback pipeline or not, we avoid the fallback change inside of the rendering pipeline. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31561 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2011-07-12 14:40:39 -07:00
Eric Anholt	0ab7d6f437	i965/gen6: Limit the workaround flush to once per primitive. We're about to call this function in a bunch of state emits, so let's not spam the hardware with flushes too hard.	2011-06-20 08:37:43 -07:00
Eric Anholt	dfada714f8	i965/gen6: Use a BO instead of writing to address 0 for PIPE_CONTROL W/A. This was spectacularly unsafe. On my system, address 0 happens to be the hardware status page for the render ring, and the first quadword of that happens to contain nothing we ever look at, but I sure didn't look forward to having to debug some day when, for example, the kernel happened to bind the ringbuffer before binding the hwsp.	2011-06-20 08:37:43 -07:00
Eric Anholt	23b6f9606d	intel: Implement glFinish() correctly by waiting on all previous rendering. Before, we were waiting for (most of) the current framebuffer to be done, which is not quite the same thing.	2011-06-07 10:46:04 -07:00
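
Several commits in this log consolidate small helper macros (ARRAY_SIZE(), ALIGN()) or add a warn-once debug macro. The log does not show their definitions, so the following is only a minimal sketch of the conventional forms such macros usually take, not the code from src/mesa/main/macros.h or the intel driver. WARN_ONCE_SKETCH and hit_workaround() are hypothetical names introduced here for illustration, and ALIGN() as written assumes the alignment is a power of two.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Number of elements in a true array (not a pointer).
 * Conventional form; assumed, not copied from Mesa. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Round value up to the next multiple of alignment.
 * Assumption: alignment is a power of two. */
#define ALIGN(value, alignment) \
   (((value) + (alignment) - 1) & ~((alignment) - 1))

/* Hypothetical warn-once macro: each expansion site gets its own
 * static flag, so the message is printed at most once per site
 * while the caller carries on with its workaround. */
#define WARN_ONCE_SKETCH(cond, msg)                 \
   do {                                             \
      static bool warned_ = false;                  \
      if ((cond) && !warned_) {                     \
         fprintf(stderr, "WARNING: %s\n", (msg));   \
         warned_ = true;                            \
      }                                             \
   } while (0)

/* Hypothetical helper: reaches the same warning site n times;
 * only the first pass prints anything. */
static void hit_workaround(int n)
{
   for (int i = 0; i < n; i++)
      WARN_ONCE_SKETCH(true, "obscure workaround was hit");
}
```

With these definitions, ARRAY_SIZE on a five-element int array gives 5, ALIGN(13, 8) gives 16, and calling hit_workaround(3) writes the warning to stderr a single time.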


170 Commits