third_party_mesa3d

Author	SHA1	Message	Date
Jason Ekstrand	3200a81a55	anv/image: Add a vk_format field We've been trying to move away from anv_format for a while and this should help with the transition. There are cases (mostly in meta) where we need the original format for the image and not the isl_format. These will be moved over to the new vk_format and everythign else will use the isl_format from the particular anv_surface.	2016-01-04 16:08:05 -08:00
Nicolai Hähnle	2123bfcc9c	st/mesa: make KHR_debug output independent of context creation flags (v2) Instead, keep track of GL_DEBUG_OUTPUT and (un)install the pipe_debug_callback accordingly. Hardware drivers can still use the absence of the callback to skip more expensive operations in the normal case, and users can no longer be surprised by the need to set the debug flag at context creation time. v2: - re-add the proper initialization of debug contexts (Ilia Mirkin) - silence a potential warning (Ilia Mirkin) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-04 18:40:49 -05:00
Chad Versace	0d7614dce6	isl: Document mnemonic in Yf and Ys tiling The 'f' means "four K". The 's' means "sixty-four K".	2016-01-04 15:37:39 -08:00
Kristian Høgsberg Kristensen	0f34a4ec4e	isl: Use isl_align_npot for row_pitch Many formats are not power-of-two bytes per pixels and we need the non-power-of-two align macro here. This reverts the revert from `4f9a211b`, but keeps the change from `a827b553` that fixed the yuv if-else mix-up.	2016-01-04 10:53:47 -08:00
Kristian Høgsberg Kristensen	abc1c9878f	vk: Don't leak pipeline if initialization fails	2016-01-04 10:42:50 -08:00
Kristian Høgsberg Kristensen	fca1c08e34	vk: Allocate subpass attachment in one big block This avoids making a lot of small allocations and handles allocation failure correctly. Fixes dEQP-VK.api.object_management.alloc_callback_fail.* failures.	2016-01-04 10:07:10 -08:00
Kristian Høgsberg Kristensen	5526c1782a	vk: Handle allocation failures in meta init paths Fixes dEQP-VK.api.object_management.alloc_callback_fail.* failures.	2016-01-04 10:07:08 -08:00
Kristian Høgsberg Kristensen	b2ad2a20b6	vk: Handle allocation failure in anv_pipeline_init() Fixes dEQP-VK.api.object_management.alloc_callback_fail.* failures.	2016-01-04 10:06:50 -08:00
Kristian Høgsberg Kristensen	3954594eb4	vk: Call vk_error when we generate a VK_ERROR	2016-01-04 10:02:50 -08:00
Kristian Høgsberg Kristensen	75e01c8b2d	vk: Only finish wayland wsi if we created it Failure during instance creation will leave instance->wayland_wsi undefined. When we then try to clean that up we crash. Set instance->wayland_wsi to NULL on failure and only clean it up if it's non-NULL. Fixes part of dEQP-VK.api.object_management.alloc_callback_fail.*	2016-01-04 10:02:50 -08:00
Chad Versace	05c22f2d74	isl: Fix row pitch for linear buffers isl always aligned the row pitch to the surface's image alignment. This was sometimes wrong when the surface backed a VkBuffer. For a VkBuffer, the surface's row pitch is set by VkBufferImageCopy::bufferRowLength, whose required alignment is only that of the VkFormat. In particular, VkBuffer rows are packed in many dEQP and Crucible tests. And packed rows are rarely aligned to the surface's image alignment. Fixes: dEQP-VK.pipeline.image.view_type.2d.format.r8g8b8a8_unorm.size.13x13	2016-01-04 09:57:25 -08:00
Chad Versace	a827b553d9	isl: Fix swapped if-else in isl_calc_row_pitch The YUV case was applied to non-YUV formats. Oops.	2016-01-04 09:57:23 -08:00
Ilia Mirkin	b16c9be4a5	nvc0: scale up inter_bo size so that it's 16M for a 4K video Experimentally, 4M causes corruption and slowness, try to ramp it up with size instead. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-01-04 11:32:45 -05:00
Ilia Mirkin	b5f2f7073f	nv50,nvc0: fix crash when increasing bsp bo size for h264 H264 doesn't have a bitplane bo. We just need a device reference, so use the one from the client. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-01-04 11:32:45 -05:00
Samuel Iglesias Gonsálvez	8cf2e892fc	i965/wm: use proper API buffer size for the surfaces. Commit `5bb5eeea` fixes a bug indicating that the surfaces should have the API buffer size. Hovewer it picked the wrong value. This patch adds a new variable, which takes into account glBindBufferRange() values. This patch fixes the following CTS regressions: ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeOffset ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeSize Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2016-01-04 07:52:24 +01:00
Marek Olšák	86fa48426c	radeonsi: remove unused parameter from si_shader_binary_read_config Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	b6d95248f0	radeonsi: move si_shader_binary_upload out of si_shader_binary_read Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	7fa6bb47e3	gallium/radeon: dump LLVM module outside of radeon_llvm_compile Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	fb98acb5a1	gallium/radeon: always add +DumpCode to the LLVM target machine for LLVM <= 3.5 It's the same behavior that we use for later LLVM. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	cd7f252b11	gallium/radeon: r600_can_dump_shader should get TGSI processor type directly Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	fd7000bd78	radeonsi: pass TGSI processor type to si_shader_binary_read for dumping the parameter will be used later Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	3ce0a2fd7f	radeonsi: pass TGSI processor type to si_compile_llvm for dumping the parameter will be used later Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	dd79034ca6	radeonsi: rename shader parameter definitions and variables for more clarity Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Ilia Mirkin	34217018c4	nvc0/ir: add support for PK2H/UP2H Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-03 16:20:52 -05:00
Ilia Mirkin	20dee333f3	st/mesa: use PK2H/UP2H when supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-03 16:20:47 -05:00
Ilia Mirkin	e9f43d6333	gallium: add PIPE_CAP_TGSI_PACK_HALF_FLOAT to indicate UP2H/PK2H support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-03 16:20:41 -05:00
Ilia Mirkin	459e4532af	tgsi: update PK2H/UP2H channel behavior info Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-03 16:20:27 -05:00
Ilia Mirkin	6eb74b87b8	gallium: document PK2H/UP2H Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-03 16:19:57 -05:00
Samuel Pitoiset	0ab2c21b93	st/mesa: fix parameter names for tesseval/tessctrl prototypes Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-03 22:01:18 +01:00
Ilia Mirkin	bf34748b39	nouveau: fix double-const qualifier Reported by Tom^ on IRC. The original intent was to mark the pointer constant as well as the data being pointed to, so move the *. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-03 11:32:15 -05:00
Rob Clark	3684e899ea	freedreno/ir3: use NIR_PASS helper macros Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-03 09:11:27 -05:00
Rob Clark	317628dbb3	nir: extract out helper macros for running passes Note these are a bit uglier, due to avoidance of GNU C extensions. But drivers which do not need to be built with compilers that don't support the extension can wrap these macros with their own. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-01-03 09:11:27 -05:00
Rob Clark	23bd6affb2	freedreno/ir3: we require block_index metadata Found during NIR_TEST_CLONE=1 piglit run. We were using block->index but forgetting to require it. Causing things to not work with a cloned shader which didn't preserve block_index. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-03 09:11:27 -05:00
Rob Clark	74135f804a	freedreno/ir3: refactor NIR IR handling Immediately convert into NIR and do an initial key-agnostic lowering/ optimization pass. This should let us share most of the per-variant transformations between each variant, and hopefully minimize the draw- time variant creation part of the compilation process. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-03 09:11:27 -05:00
Rob Clark	ab4efb19dc	freedreno/ir3: drop unnecessary unreachable() case It will still hit a compile_assert() in emit_tex, which has the advantage of dumping out the offending shader. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-03 09:11:27 -05:00
Samuel Pitoiset	6a49fcfb1f	gallium/tests: fix build with clang compiler Nested functions are supported as an extension in GNU C, but Clang don't support them. This fixes compilation errors when (manually) building compute.c, or by setting --enable-gallium-tests to the configure script. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75165 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-03 12:18:00 +01:00
Samuel Pitoiset	53dddab78c	nv50,nvc0: optimize coherent buffer checking at draw time Instead of iterating over all the buffer resources looking for coherent buffers, we keep track of a context-wide count. This will save some iterations (and CPU cycles) in 99.99% case because usually coherent buffers are not so used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-03 12:17:05 +01:00
Kenneth Graunke	28dea26626	i965: Make TCS precompile use the TES primitive mode when available. If there's a linked TES program, we should just use the actual primitive mode. If not, just guess triangles (as we did before). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-02 18:46:16 -08:00
Kenneth Graunke	4a1c8a3037	i965: Push most TES inputs in SIMD8 mode. Using the push model for inputs is much more efficient than pulling inputs - the hardware can simply copy a large chunk into URB registers at thread creation time, rather than having the thread send messages to request data from the L3 cache. Unfortunately, it's possible to have more TES inputs than fit in registers, so we have to fall back to the pull model in some cases. However, it turns out that most tessellation evaluation shaders are fairly simple, and don't use many inputs. An arbitrary cut-off of 32 vec4 slots (16 registers) is more than sufficient to ensure that 100% of TES inputs are pushed for Shadow of Mordor, Unigine Heaven, GPUTest/TessMark, and SynMark. Note that unlike most SIMD8 stages, this actually reads packed vec4 data, since that is what our vec4 TCS programs write. Improves performance in GPUTest's tessmark_x64 microbenchmark by 93.4426% +/- 5.35541% (n = 25) on my Lenovo X250 at 1024x768. Improves performance in Synmark's Gl40TerrainFlyTess microbenchmark by 22.74% +/- 0.309394% (n = 5). Improves performance in Shadow of Mordor at low settings with tessellation enabled at 1280x720 by 2.12197% +/- 0.478553% (n = 4). shader-db statistics for files containing tessellation shaders: total instructions in shared programs: 184358 -> 181181 (-1.72%) instructions in affected programs: 27971 -> 24794 (-11.36%) helped: 226 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-02 18:46:16 -08:00
Kenneth Graunke	b022150d70	i965: Use LOAD_PAYLOAD for SIMD8 TES input loads, not MOV. We need a MOV to replicate g0.0<0,1,0> to all 8 channels. Since the message payload is a single register, MOV seemed more sensible than LOAD_PAYLOAD. However, MOV cannot be CSE'd, while LOAD_PAYLOAD can. All input loads can use the same header - we don't need to re-expand g0 every time. CSE accomplishes this, saving instructions. shader-db statistics for files containing tessellation shaders: total instructions in shared programs: 186923 -> 184358 (-1.37%) instructions in affected programs: 30536 -> 27971 (-8.40%) helped: 226 HURT: 0 total cycles in shared programs: 1009850 -> 1005356 (-0.45%) cycles in affected programs: 168206 -> 163712 (-2.67%) helped: 226 HURT: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-02 18:46:16 -08:00
Kenneth Graunke	53a9b6223f	i965: Move 3-src subnr swizzle handling into the vec4 backend. While most align16 instructions only support a SubRegNum of 0 or 4 (using swizzling to control the other channels), 3-src instructions actually support arbitrary SubRegNums. When the RepCtrl bit is set, we believe it ignores the swizzle and uses the equivalent of a <0,1,0> region from the subnr. In the past, we adopted a vec4-centric approach of specifying subnr of 0 or 4 and a swizzle, then having brw_eu_emit.c convert that to a proper SubRegNum. This isn't a great fit for the scalar backend, where we don't set swizzles at all, and happily set subnrs in the range [0, 7]. This patch changes brw_eu_emit.c to use subnr and swizzle directly, relying on the higher levels to set them sensibly. This should fix problems where scalar sources get copy propagated into 3-src instructions in the FS backend. I've only observed this with TES push model inputs, but I suppose it could happen in other cases. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-02 18:46:16 -08:00
Eric Anholt	64253fdb2e	vc4: Fix build from upload changes.	2016-01-02 17:33:19 -08:00
Nicolai Hähnle	8f384d07a8	gallium/radeon: send LLVM diagnostics as debug messages Diagnostics sent during code generation and the every error message reported by LLVMTargetMachineEmitToMemoryBuffer are disjoint reporting mechanisms. We take care of both and also send an explicit message indicating failure at the end, so that log parsers can more easily tell the boundary between shader compiles. Removed an fprintf that could never be triggered. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:24 -05:00
Nicolai Hähnle	255ccd1e99	gallium/radeon: pass pipe_debug_callback into radeon_llvm_compile (v2) This will allow us to send shader debug info via the context's debug callback. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:24 -05:00
Nicolai Hähnle	f8cd11403a	radeonsi: send shader info as debug messages in addition to stderr output The output via stderr is very helpful for ad-hoc debugging tasks, so that remains unchanged, but having the information available via debug messages as well will allow the use of parallel shader-db runs. Shader stats are always provided (if the context is a debug context, that is), but you still have to enable the appropriate R600_DEBUG flags to get disassembly (since it is rather spammy and is only generated by LLVM when we explicitly ask for it). Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:24 -05:00
Nicolai Hähnle	4bb1c8dfec	radeonsi: pass pipe_debug_callback down into si_shader_binary_read (v2) This will allow us to send shader debug info. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:23 -05:00
Nicolai Hähnle	b6847062dd	gallium/radeon: implement set_debug_callback Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:23 -05:00
Jason Ekstrand	f6c4658cde	nir/spirv: Fix group decorations They were completely bogus before. For one thing, OpDecorationGroup created a value of type undef rather than decoration_group. Also OpGroupMemberDecorate didn't properly apply the decoration to the different members of the different groups. It should be correct now but there's no good way to test it yet.	2016-01-02 11:53:36 -08:00
Jason Ekstrand	6b0b57225c	anv/device: Only allocate whole pages in AllocateMemory The kernel is going to give us whole pages anyway, so allocating part of a page doesn't help. And this ensures that we can always work with whole pages.	2016-01-02 07:52:24 -08:00
Marek Olšák	ecb2da1559	u_upload_mgr: allow specifying PIPE_USAGE_* for the upload buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:45 +01:00

... 6 7 8 9 10 ...

77295 Commits