Not all queries are the same. Even the two queries we support today
require a different amount of data per slot. Once we introduce pipeline
statistics queries, the size will vary wildly.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
We're about to make slots variable-length and always having the
available bits at the front makes certain operations substantially
easier once we do that.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Instead of asserting inside the function, return a VkResult and then
use that information to return early from its callers upon failure.
v2:
- Make sure that clear_color_attachment() and
clear_depth_stencil_attachment() get the VkResult as well so they
avoid executing the batch if an error happened. (Topi)
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Growing the reloc list happens through calling anv_reloc_list_add() or
anv_reloc_list_append(). Make sure that we call these through helpers
that check the result and set the batch error status if needed.
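For illustration, a rough sketch of one such helper, add_surface_state_reloc()
(also mentioned in v4 below), assuming the post-patch anv_reloc_list_add()
signature; exact member names are approximations:

   static void
   add_surface_state_reloc(struct anv_cmd_buffer *cmd_buffer,
                           struct anv_state state,
                           struct anv_bo *bo, uint32_t offset)
   {
      const struct isl_device *isl_dev = &cmd_buffer->device->isl_dev;

      /* Record the relocation; if the reloc list could not be grown,
       * remember the error in the batch so it can be reported later.
       */
      VkResult result =
         anv_reloc_list_add(&cmd_buffer->surface_relocs,
                            &cmd_buffer->pool->alloc,
                            state.offset + isl_dev->ss.addr_offset, bo, offset);
      if (result != VK_SUCCESS)
         anv_batch_set_error(&cmd_buffer->batch, result);
   }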
v2:
- Handling the crashes is not good enough; we need to keep track of
the error. For that, keep track of the errors in the batch instead (Jason).
- Make reloc list growth go through helpers so we can have a central
place where we can do error tracking (Jason).
v3:
- Callers that need the offset returned by anv_reloc_list_add() can
compute it themselves since it is extracted from the inputs to the
function, so change the function to return a VkResult, make
anv_batch_emit_reloc() also return a VkResult and let their callers
do the error management (Topi)
v4:
- Let anv_batch_emit_reloc() return a uint64_t as it originally did,
there is no real benefit in having it return a VkResult.
- Do not add an is_aux parameter to add_surface_state_reloc(), instead
do error checking for aux in add_image_view_relocs() separately.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Most of the time we use macros that handle this situation transparently,
but there are some cases where we need to handle this explicitly.
This patch makes sure we don't crash. Notice that error handling takes
place in the function that actually failed the allocation,
anv_batch_emit_dwords(), which will set the status field of the batch
so it can be used at a later moment to report the error to the user.
v2:
- Not crashing is not good enough, we need to keep track of the error
(Topi, Jason). Iago: now that we track errors in the batch, this
is being handled.
- Added guards in a few more places that needed it (Iago)
v3:
- Check result of anv_batch_emitn() for NULL before calling memset()
in emit_vertex_input() (Topi)
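For illustration, the v3 check looks roughly like this (sketch of the
emit_vertex_input() case; variable names are approximate):

   /* anv_batch_emitn() returns NULL when the batch could not be grown;
    * the error has already been recorded in the batch, so just bail.
    */
   uint32_t *p = anv_batch_emitn(&pipeline->batch, num_dwords,
                                 GENX(3DSTATE_VERTEX_ELEMENTS));
   if (!p)
      return;
   memset(p + 1, 0, (num_dwords - 1) * 4);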
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
The anv_batch_set_error() helper will track the first error that happened
while recording a command buffer. The helper returns the currently tracked
error to help the job of internal functions that may generate errors that
need to be tracked and return a VkResult to the caller.
We will use the anv_batch_has_error() helper to guard parts of the driver
that are not safe to execute if an error has been generated while recording
a particular command buffer.
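A sketch of the two helpers consistent with the description above, assuming
the batch's error field is called "status" (as introduced by the patch that
adds batch error tracking):

   static inline VkResult
   anv_batch_set_error(struct anv_batch *batch, VkResult error)
   {
      assert(error != VK_SUCCESS);
      /* Only the first error is kept; later errors do not overwrite it. */
      if (batch->status == VK_SUCCESS)
         batch->status = error;
      return batch->status;
   }

   static inline bool
   anv_batch_has_error(struct anv_batch *batch)
   {
      return batch->status != VK_SUCCESS;
   }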
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
The vkCmd*() functions do not report errors; instead, any errors should be
reported by the time we call vkEndCommandBuffer(). This means that we
need to make the driver robust against inconsistent and/or incomplete
command buffer states throughout the command recording process and, in
particular, avoid crashes due to accessing memory that we previously
failed to allocate.
The strategy used to do this is to track the first error that occurred while
recording a command buffer in the batch associated with it. We use the
batch to track this information because the command buffer may not be
visible to all parts of the driver that can produce errors we need to be
aware of (such as allocation failures during batch emissions).
Later patches will use this error information to guard parts of the driver
that may not be safe to execute.
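A minimal sketch of the idea (not the full anv_batch definition; only the
error-tracking field and how it surfaces at the end of recording are shown):

   struct anv_batch {
      /* ... emission state (start/next/end pointers, relocs, ...) ... */

      /* First error recorded while emitting into this batch, or VK_SUCCESS. */
      VkResult status;
   };

   /* vkEndCommandBuffer() then becomes the natural place to report it. */
   VkResult genX(EndCommandBuffer)(VkCommandBuffer commandBuffer)
   {
      ANV_FROM_HANDLE(anv_cmd_buffer, cmd_buffer, commandBuffer);

      if (anv_batch_has_error(&cmd_buffer->batch))
         return cmd_buffer->batch.status;

      anv_cmd_buffer_end_batch_buffer(cmd_buffer);

      return VK_SUCCESS;
   }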
v2: Move the field from the command buffer to the batch so we can track
errors from batch emissions (Jason)
v3: Registering errors in the command buffer's batch during
anv_create_cmd_buffer() is unnecessary, since the command buffer
is freed at the end of the function in that case (Topi)
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
This situation can happen if we failed to allocate memory for the shader.
v2:
- We shouldn't see NULL shaders in anv_shader_bin_ref so we should not check
for that (Jason). Make sure that callers don't attempt to call this
function with a NULL shader and assert that this never happens (Iago).
v3:
- All callers to anv_shader_bin_unref seem to check for NULL before calling,
so just assert that it is not NULL (Topi)
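For illustration, a rough sketch of the resulting guard (the reference-count
field and destroy helper names are approximations):

   static inline void
   anv_shader_bin_unref(struct anv_device *device,
                        struct anv_shader_bin *shader)
   {
      /* Callers never pass NULL here; a NULL shader at this point means an
       * allocation failure leaked through somewhere up the stack.
       */
      assert(shader && shader->ref_cnt >= 1);

      if (p_atomic_dec_zero(&shader->ref_cnt))
         anv_shader_bin_destroy(device, shader);
   }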
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
We have a performance problem with dynamic buffer descriptors. Because
we are currently implementing them by pushing an offset into the shader
and adding that offset onto the already existing offset for the UBO/SSBO
operation, all UBO/SSBO operations on dynamic descriptors are indirect.
The back-end compiler implements indirect pull constant loads using what
basically amounts to a texelFetch instruction. For pull constant loads
with constant offsets, however, we use an oword block read message which
goes through the constant cache and reads a whole cache line at a time.
Because of these two things, direct pull constant loads are much faster
than indirect pull constant loads. Because all loads from dynamically
bound buffers are indirect, the user takes a substantial performance
penalty when using this "performance" feature.
There are two potential solutions I have seen for this problem. The
alternate solution is to continue pushing offsets into the shader but
wire things up in the back-end compiler so that we use the oword block
read messages anyway. The only reason we can do this is that we know a
priori that the dynamic offsets are uniform and 16-byte aligned.
Unfortunately, thanks to the 16-byte alignment requirement of the oword
messages, we can't do some general "if the indirect offset is uniform,
use an oword message" sort of thing.
This solution, however, is recommended for a few reasons:
1. Surface states are relatively cheap. We've been using on-the-fly
surface state setup for some time in GL and it works well. Also,
dynamic offsets with on-the-fly surface state should still be
cheaper than allocating new descriptor sets every time you want to
change a buffer offset which is really the only requirement of the
dynamic offsets feature.
2. This requires substantially less compiler plumbing. Not only can we
delete the entire apply_dynamic_offsets pass but we can also avoid
having to add architecture for passing dynamic offsets to the back-
end compiler in such a way that it can continue using oword messages.
3. We get robust buffer access range-checking for free. Because the
offset and range are baked into the surface state, we no longer need
to pass ranges around and do bounds-checking in the shader.
4. Once we finally get UBO pushing implemented, it will be much easier
to handle pushing chunks of dynamic descriptors if the compiler
remains blissfully unaware of dynamic descriptors.
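As an illustration of the chosen approach, the descriptor emission ends up
doing something along these lines (helper names and the format below are
simplifications, not the exact code):

   static struct anv_state
   emit_dynamic_buffer_surface_state(struct anv_cmd_buffer *cmd_buffer,
                                     const struct anv_buffer *buffer,
                                     uint32_t desc_offset, uint32_t desc_range,
                                     uint32_t dynamic_offset)
   {
      struct anv_state state =
         anv_cmd_buffer_alloc_surface_state(cmd_buffer);

      /* Fold the dynamic offset into the surface base address and clamp the
       * range, so the shader sees a plain direct UBO/SSBO access and robust
       * range checking comes for free.
       */
      uint32_t offset = buffer->offset + desc_offset + dynamic_offset;
      uint32_t range = MIN2(desc_range,
                            buffer->size - desc_offset - dynamic_offset);

      anv_fill_buffer_surface_state(cmd_buffer->device, state,
                                    ISL_FORMAT_R32G32B32A32_FLOAT,
                                    offset, range, 1);
      return state;
   }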
This commit improves performance of The Talos Principle on ULTRA
settings by around 50% and brings it nicely into line with OpenGL
performance.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Mostly a dummy git mv with a couple of noticeable parts:
- With the earlier header cleanups, nothing in src/intel depends on
files from src/mesa/drivers/dri/i965/
- Both Autoconf and Android builds are addressed. Thanks to Mauro and
Tapani for the fixups in the latter
- brw_util.[ch] is not really compiler specific, so it's moved to i965.
v2:
- move brw_eu_defines.h instead of brw_defines.h
- remove no-longer applicable includes
- add missing vulkan/ prefix in the Android build (thanks Tapani)
v3:
- don't list brw_defines.h in src/intel/Makefile.sources (Jason)
- rebase on top of the oa patches
[Emil Velikov: commit message, various small fixes throughout]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This is what the comment above the definition says, and the change
fixes an issue with 32-bit builds where BLOCK_POOL_MEMFD_SIZE is used
as an ftruncate parameter and the constant currently gets converted
from 4294967296 to 0.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Over the course of driver development, we've come up with a number of
different schemes for adding giant blocks of asserts inside the driver.
This one is only being used once in anv_pipeline.c and the way it's
being used actually generates compiler warnings in release builds. This
commit drops the anv_validate macro and just puts the contents of the
one validation function inside an "#ifdef DEBUG" guard.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Except for a few unimplemented things on gen7, we don't really have
stubs anymore so we should drop this. This commit replaces the few gen7
stub() calls with explicitly labeled finishme's and makes the sparse
binding stuff silently no-op or return a FEATURE_NOT_PRESENT error.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This acts identically to anv_finishme except that it only dumps out
these nice log messages if you run with INTEL_DEBUG=perf.
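For reference, a sketch of the macro, mirroring the existing anv_finishme()
pattern (the exact helper it calls is an implementation detail and may
differ):

   #define anv_perf_warn(format, ...) \
      do { \
         static bool reported = false; \
         if (!reported && unlikely(INTEL_DEBUG & DEBUG_PERF)) { \
            __anv_perf_warn(__FILE__, __LINE__, format, ##__VA_ARGS__); \
            reported = true; \
         } \
      } while (0)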
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
We'll loop through this array when performing automatic layout
transitions.
v2: Adjust formatting of an assignment (Jason Ekstrand)
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We will be using the image layout. Store the full struct directly from
the user.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This will be used to sample a depth input attachment without having to
pass through the HiZ buffer.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Validate the inputs, verify that this image has a depth
buffer, and use gen_device_info instead of gen.
v2:
- Add parenthesis (Jason Ekstrand)
- Make parameters const
- Use gen_device_info instead of gen
- Pass aspect to missed function in transition_depth_buffer
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This function supersedes layout_to_hiz_usage().
v2:
- Don't find the optimal buffer for layout transitions (Jason Ekstrand).
- Pass the devinfo instead of the gen (Jason Ekstrand)
- Update the function documentation.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This allows us to allocate surface states from the command buffer when
pushing descriptor sets rather than allocating them through a
descriptor set pool.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This allows the helper to check for llc instead of having to do it
manually at all the call sites.
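A sketch of what such a helper can look like (the anv_state_flush() and
anv_flush_range() names are assumptions based on this description):

   static void
   anv_state_flush(struct anv_device *device, struct anv_state state)
   {
      /* On LLC platforms the CPU and GPU caches are coherent, so there is
       * nothing to flush.
       */
      if (device->info.has_llc)
         return;

      anv_flush_range(state.map, state.alloc_size);
   }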
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
All this cache line address calculation stuff is tricky. Let's not
duplicate it more places than we have to.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
It's a bit shorter and easier to work with. Also, we're about to add a
helper called clflush which does the clflush but without any memory
fencing.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This adds support to radv_GetPhysicalDeviceXlibPresentationSupportKHR
and radv_GetPhysicalDeviceXcbPresentationSupportKHR to check if the
local device file descriptor is compatible with the descriptor
retrieved from the X server via DRI3.
This will stop radv binding to an X server until we have prime
support in place. Hopefully apps use this API before trying
to render things.
v2: drop unneeded function, don't leak memory. (jekstrand)
v3: also check in surface_get_support callback.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Unfortunately, this doesn't substantially improve the performance of any
known apps. With Dota 2 on my Sky Lake gt4, it seems to help by somewhere
between 0% and 1% but there's enough noise that it's hard to get a clear
picture.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
This helps Dota 2 on Broadwell by 8-9%. I also hacked up the driver and
used the Sascha "shadowmapping" demo to get some results. Setting
uses_kill to true dropped the framerate on the demo by 25-30%. Enabling
the PMA fix brought it back up to around 90% of the original framerate.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Vulkan doesn't have a stencilWriteEnable bit like it does for depth.
Instead, you have a stencil mask. Since the stencil mask is handled as
dynamic state, we have to handle it later during command buffer
construction. This, combined with a later commit, seems to help Dota2
on my Broadwell GT3e desktop by a couple percent because it allows the
hardware to move the depth and stencil writes to early in more cases.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
This allows shaders to write to storage images declared with unknown
format if they are decorated with NonReadable ("writeonly" in GLSL).
Previously an image view would always use a lowered format for its
surface state; however, when a shader declares a write-only image, we
should use the real format. Since we don't know at view creation time
whether it will be used with only write-only images in shaders, create
two surface states using both the original format and the lowered
format. When emitting the binding table, choose between the states
based on whether the image is declared write-only in the shader.
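In the binding table emission this amounts to roughly the following choice
(field names are approximations based on the description above):

   static struct anv_state
   storage_image_surface_state(const struct anv_pipeline_binding *binding,
                               const struct anv_image_view *image_view)
   {
      /* Use the surface state with the real (unlowered) format when the
       * shader declared the image write-only (NonReadable).
       */
      return binding->write_only ? image_view->writeonly_storage_surface_state
                                 : image_view->storage_surface_state;
   }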
Tested on both Sascha Willems' computeshader sample (with the original
shaders and ones modified to declare images writeonly and omit their
format qualifiers) and on our own shaders for which we need support
for this.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
For RGB formats in Vulkan, we use the corresponding RGBA format with a
swizzle of RGB1. While this swizzle is exactly what we want for
texturing, it's not allowed for rendering according to the docs. While
we haven't been getting hangs or anything, we should probably obey the
docs. This commit just sanitizes all render swizzles so that the alpha
channel maps to ALPHA.
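For illustration, the sanitizing boils down to something like this
(hypothetical helper name):

   static struct isl_swizzle
   sanitize_render_swizzle(struct isl_swizzle swizzle)
   {
      /* RGB1 is fine for texturing, but a render target swizzle must map
       * the alpha channel to ALPHA.
       */
      swizzle.a = ISL_CHANNEL_SELECT_ALPHA;
      return swizzle;
   }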
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
The struct was deleted by:
commit efe9d1cde3
Author: Edward O'Callaghan <funfunctor@folklore1984.net>
Subject: anv: Clean up some unused variables
Unlike the original anv_common, the new one has a non-const pNext
pointer because we will use it for the output structs of
VK_KHR_get_physical_device_properties2.
v2:
- Retype pNext from void* to struct anv_common*.
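With the v2 change applied, the struct boils down to:

   struct anv_common {
      VkStructureType sType;
      struct anv_common *pNext;
   };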
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This is a printf-like macro that prints a debug message to stderr when
built with DEBUG. If DEBUG is not defined, it does nothing.
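A sketch of the macro (the exact output prefix is an implementation detail):

   #ifdef DEBUG
   #define anv_debug(format, ...) \
      fprintf(stderr, "debug: " format, ##__VA_ARGS__)
   #else
   #define anv_debug(format, ...)
   #endif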
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes crash in dEQP-VK.ubo.random.all_shared_buffer.48 due to
fragment shader code bigger than 128 kB.
This patch increases the allocation size limit to 1 MB.
v2:
- Increase it to 1 MB (Jason)
- Increase device->instruction_block_pool allocation size in
anv_device.c (Jason)
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>