This doesn't provide detailed error type information, but it's important
to get these relatively severe but rare error messages out to the
developer through whatever mechanism they are using.
v2: Rebase on new WARN_ONCE additions.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Previous to this patch, there were 13 identical definitions of this
macro in Mesa source. That's ridiculous. This patch consolidates 6
of them to a single definition in src/mesa/main/macros.h.
Unfortunately, I wasn't able to eliminate the remaining definitions,
since they occur in places that don't include src/mesa/main/macros.h:
- include/pci_ids/pci_id_driver_map.h
- src/egl/drivers/dri2/egl_dri2.h
- src/egl/main/egldefines.h
- src/gbm/main/backend.c
- src/gbm/main/gbm.c
- src/glx/glxclient.h
- src/mapi/mapi/stub.c
I'm open to suggestions as to how to deal with the remaining redundancy.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Before, we were keeping a CPU-only buffer to accumulate the batchbuffer in,
which was an improvement over mapping the batch through the GTT directly
(since any readback or other failure to stream through write combining
correctly would hurt). However, on LLC-sharing architectures we can do better
by mapping the batch directly, which reduces the cache footprint of the
application since we no longer have this extra copy of a batchbuffer around.
Improves performance of GLBenchmark 2.1 offscreen on IVB by 3.5% +/- 0.4%
(n=21). Improves Lightsmark performance by 1.1 +/- 0.1% (n=76). Improves
cairo-gl performance by 1.9% +/- 1.4% (n=57).
No statistically significant difference in GLB2.1 on SNB (n=37). Improves
cairo-gl performance by 2.1% +/- 0.1% (n=278).
Each driver (i830, i915, i965) used independent but similar code to
validate the requested context version. With the rececnt arrival of GLES3,
that logic has needed an update. Rather than apply identical updates to
each drivers validation code, let's just move the validation into the
shared routine intelInitContext.
This refactor required some incidental changes to functions
i830CreateContext and intelInitContext. For each function, this patch:
- Adds context version parameters to the signature.
- Adds a DRI_CTX_ERROR out param to the signature.
- Sets the DRI_CTX_ERROR at each early return.
Tested against gen6 with piglit egl-create-context-verify-gl-flavor.
Verified that this patch does not change the set of exposed EGL context
flavors.
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Often when debugging, I don't want to see SIMD16 shaders. It makes
INTEL_DEBUG=vs/fs output much easier to read, especially when a program
dumps many shaders. Plus, I also want to verify that SIMD8 works before
even considering SIMD16.
v2: Fix the likeliness check (caught by Chris and Eric).
Reviewed-by: Eric Anholt <eric@anholt.net>
Previously this macro existed in 3 separate places, some inside the
intel driver and some outside of it. It makes more sense to have it
in main/macros.h
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This can be used for two purposes: Using hand-coded shaders to determine
per-instruction timings, or figuring out which shader to optimize in a
whole application.
Note that this doesn't cover the instructions that set up the message to
the URB/FB write -- we'd need to convert the MRF usage in these
instructions to GRFs so that our offsets/times don't overwrite our
shader outputs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
v2: Check the timestamp reset flag in the VS, which is apparently
getting set fairly regularly in the range we watch, resulting in
negative numbers getting added to our 32-bit counter, and thus large
values added to our uint64_t.
v3: Rebase on reladdr changes, removing a new safety check that proved
impossible to satisfy. Add a comment to the AOP defs from Ken's
review, and put them in a slightly more sensible spot.
v4: Check timestamp reset in the FS as well.
There are a number of places where some obscure piece of the code is not
currently worth fixing, and we have some workaround behavior available. It's
nicer for users to do some lame workaround than to just assert, but without
asserts we never knew when the workaround was at fault.
This should give us a nice compromise: Execute the workaround, but mention
that the obscure workaround was hit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The Fallback field of the context struct doesn't work that way on i965, and
it's the only caller of FALLBACK() in the driver.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Currently, we mirror the VS and WM binding tables' texture entries.
That may not continue to be true, so in preparation, pass in the binding
table and surface index as arguments.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Move it from intel_screen.c to intel_context.c. Redeclare as non-static.
A future commit will use it in multiple files.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
I want to introduce some more debug output for performance surprises that
includes fallbacks, but aren't necessarily software rasterization. Leave
INTEL_DEBUG=fall in place for those that have used that flag before.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
With fixes and updates from Ben Widawsky and comments from Paul Berry.
v2: Use drm_intel_gem_context_destroy to destroy hardware context;
remove useless initialization of hw_ctx, both suggested by Eric.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Paul Berry <stereotype441@gmail.com>
Previously, when the environment variable INTEL_DEBUG=aub was set,
mesa would simply instruct DRM to start dumping data to an .aub file,
but we would not provide DRM with any information about the format of
the data in various buffers. As a result, a lot of the data in the
generate .aub file would be unannotated, making further data analysis
difficult.
This patch causes the entire contents of each batch buffer to be
annotated using the data in brw->state_batch_list (which was
previously used only to annotate the output of INTEL_DEBUG=bat). This
includes data that was allocated by brw_state_batch, such as binding
tables, surface and sampler states, depth/stencil state, and so on.
The new annotation mechanism requires DRM version 2.4.34.
Reviewed-by: Eric Anholt <eric@anholt.net>
In C++, if a struct is defined inside another struct, or its name is
first seen inside a struct or function, the struct is nested inside
the namespace of the struct or function it appears in. In C, all
structs are visible from toplevel.
This patch explicitly moves the decalartions of intel_batchbuffer to
toplevel, so that it does not get nested inside a namespace when
header files are included from C++.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
These declarations are necessary to allow C++ code to call C code
without causing unresolved symbols (which would make the driver fail
to load).
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
For now, these all return 0, as I don't yet want to enable Haswell
support. Eventually they will be filled in with proper PCI IDs.
Also add an is_haswell field similar to is_g4x to make it easy to
distinguish Gen7 and Gen7.5.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
If you want to test the graphics driver, you want to test it under the
conditions that users will see, not some set of additional fallbacks.
If you want to test swrast, run the swrast driver (or no_rast=true)
instead.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
There's even a comment in the code containing the right swizzling
computations!
Previously this has not been noticed because we need to manually
enabled swizzling on snb/ivb (kernel 3.4 will do that) and we
don't use the separate stencil on ilk (where the bios enables
swizzling). This fixes
piglit ./bin/fbo-stencil readpixels GL_DEPTH32F_STENCIL8 -auto
on recent drm-intel-next kernels.
Also remove the comment about ivb, it's stale now.
Swizzling detection is done by allocating a temporary x-tiled
buffer object. Unfortunately kernels before v3.2 lie on snb/ivb
because they claim that swizzling is enable, but it isn't. The
kernel commit that fixes this for backport to pre-v3.2 is
commit acc83eb5a1e0ae7dbbf89ca2a1a943ade224bb84
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Mon Sep 12 20:49:16 2011 +0200
drm/i915: fix swizzling on gen6+
But if the kernel doesn't lie, this now works on swizzling and
not swizzling machines.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Rely on libdrm HAS_LLC parameter to verify if hardware supports it. In
case the libdrm version does not supports this check, fallback to older
way of detecting it which assumed that GPUs newer than GEN6 have it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
The new kernel patch I submitted makes the interface opt-in, so all
batchbuffers aren't preceded by the 4 MI_LOAD_REGISTER_IMMs. This
requires the updated i915_drm.h present in libdrm 2.4.30.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We were printing out the line triggering the flush, but a variety of
different causes just printed the line number for intel_flush()'s call
of intel_batchbuffer_flush(). Plumb the line numbers from the caller
of intel_flush() on through.
This will make handling new formats (like actually exposing Z32F)
easier and more reliable.
v2: Remove the check for hiz buffer -- the MESA_FORMAT should really
be giving us the value we want even for hiz.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Now that intel_renderbuffer::region has been replaced with a miptree, the
HiZ functions region parameter must be replaced with a miptree parameter.
Change the return type from bool to void.
Rename the 'depth' parameter to 'layer', because it will correspond to
irb->mt_layer.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Gen7+ SURFACE_STATE is different from Gen4-6, so we need separate
per-generation functions for creating and updating it. However, the
usage is the same, and callers just want to utilize the appropriate
functions with minimal pain. So, put them in the vtable.
Since these take a brw_context pointer and are only used on Gen4, just
add a forward declaration. This is the simplest (if not cleanest)
solution. It would be nicer to have a i965-specific vtable, but that's
a refactor for another day.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Renderbuffer mapping handles flushing the batchbuffer if required, so
all we need to do is make sure any pending rendering has reached the
batchbuffer.
Reviewed-by: Brian Paul <brianp@vmware.com>
This will be used to avoid the prepare() step in the i965 driver's
state setup. Instead, we can just speculatively emit the primitive
into the batchbuffer, then check if the batch is too big, rollback and
flush, and replay the primitive.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
According to the docs for 3DSTATE_PS (Gen7+) and 3DSTATE_WM (Gen6),
there is a platform dependent value for the minimum number of pixel
shader threads. It may also vary based on whether WIZ Hashing is on.
For example, Ivybridge requires at least 4 threads if WIZ hashing is
disabled, and 8 if it's enabled. Programming it to use less threads is
illegal. Sandybridge appears to have similar restrictions.
So on newer platforms, INTEL_DEBUG=sing will probably just hang the GPU.
Rather than try to patch it up for newer platforms and extend it to
support geometry shaders, just remove it as it isn't that useful anyway.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Add the following to the vtbl:
hiz_resolve_depthbuffer
hiz_resolve_hizbuffer
For all drivers for which HiZ is not enabled, the methods are set to be
no-ops. If HiZ is enabled, the methods are currently to set to empty
stubs.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad@chad-versace.us>
I initially produced the patch using this bash command:
for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i
's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i
's/GL_FALSE/false/g' $file; done
Then I manually added #include <stdbool.h> to fix compilation errors,
and converted a few functions back to GLboolean that were used in core
Mesa's function pointer table to avoid "incompatible pointer" warnings.
Finally, I cleaned up some whitespace issues introduced by the change.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Chad Versace <chad@chad-versace.us>
Acked-by: Paul Berry <stereotype441@gmail.com>
What I would prefer to assert is that, for each region that is currently
mapped, no batch is emitted that uses that region's bo. However, it's much
easier to implement this big hammer.
Observe that this requires that the batch flush in intel_region_map() be
moved to within the map_refcount guard.
v2: Add comments (borrowed from anholt's reply) explaining why the
assertion is a good idea.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad@chad-versace.us>
The "slight optimization to avoid the GS program" in brw_set_prim() is not
used by Gen 6, since Gen 6 doesn't use a GS program. Also, Gen 6 doesn't use
reduced primitives.
Also, document that intel_context.reduced_primitive is only used for Gen < 6
Reviewed-by: Eric Anholt <eric@anho.net>
Signed-off-by: Chad Versace <chad@chad-versace.us>
It seems that GT1/GT2 sorts of variations are here to stay, and more
special cases will likely be required in the future. Checking by PCI ID
via the IS_xxx_GTx macros is cumbersome; introducing a new 'gt' field
analogous to intel->gen will make this easier.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Seeing as they were only used once (in the same function they were
defined), having them as context members seemed rather pointless.
Remove them entirely (rather than using local variables) since the
chipset generation checks are actually just as straightforward.
While we're at it, clean up the remainder of the if-tree that set them.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
The illusion of shared code here wasn't fooling anybody. It was
tempting to keep i830 and i915 still shared, but I think I actually
want to make them diverge shortly.
Reviewed-by: Chad Versace <chad@chad-versace.us>
The first rendering after context create didn't know of the color
buffer yet, triggering a sw fallback. The intel_prepare_render() from
intelSpanRenderStart then found the buffer and turned off fallbacks,
but intelSpanRenderFinish was never called and things were left
mapped. By checking buffers before making the call on whether to do
the fallback pipeline or not, we avoid the fallback change inside of
the rendering pipeline.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31561
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This was spectacularly unsafe. On my system, address 0 happens to be
the hardware status page for the render ring, and the first quadword
of that happens to contain nothing we ever look at, but I sure didn't
look forward to having to debug some day when, for example, the kernel
happened to bind the ringbuffer before binding the hwsp.