third_party_mesa3d

Author	SHA1	Message	Date
Sviatoslav Peleshko	94989b45a5	anv,driconf: Add fake non device local memory WA for Total War: Warhammer 3 Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8721 Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29127>	2024-06-07 04:14:10 +00:00
Erik Faye-Lund	df17f2b89a	meson: bump test-timeout This test usually takes around 15 seconds on CI, according to logs. It recently timed out under load, causing a job to fail spuriously. Let's bump the timeout here to 60. That's in line with the glsl compiler warnings test, which usually takes around the same time. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29547>	2024-06-07 03:38:59 +00:00
Lucas Fryzek	db38a4913e	llvmpipe: query winsys support for dmabuf mapping Fixes #11257 by ensuring the winsys mapping functions are only called if they are supported by the winsys, which should prevent llvmpipe from crashing with kms_swrast. If the winsys is kms_swrast then this method will be null, but on drisw it will be available. Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29546>	2024-06-07 02:42:20 +00:00
Erik Faye-Lund	d0d5fedbab	docs: wrap long words instead of overflowing This fixes rendering on mobile for the 24.1.1 release notes, where we'd otherwise end up with horizontal scrolling. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29573>	2024-06-07 02:32:48 +00:00
Yonggang Luo	85ff3f525c	util: Rename DETECT_OS_UNIX to DETECT_OS_POSIX Looking at each usage of DETECT_OS_UNIX, it's more about the POSIX API usage, not the Unix-like OS, so let's rename it. Also, POSIX is a standard that states which APIs are present, whereas for UNIX there is no such thing. Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29555>	2024-06-07 01:56:28 +00:00
Eric Engestrom	73cc6c6738	venus/ci: add flake that's been blocking MRs Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29590>	2024-06-07 01:24:26 +00:00
Nanley Chery	de22e20294	anv: Rely more on ISL_SURF_USAGE_DISABLE_AUX_BIT In order to support CCS, ISL may upgrade a main surface from Tile4 to Tile64 with miptails disabled. To avoid using this space consuming layout when not needed, inform ISL as soon as possible that compression won't be used. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29094>	2024-06-07 00:58:41 +00:00
Nanley Chery	fc57991b66	anv: Support multiple aspects in anv_formats_ccs_e_compatible Prevents the next patch from causing the following assert failure: Test case 'dEQP-VK.ycbcr.copy.g8_b8_r8_3plane_420_unorm.g8_b8_r8_3plane_444_unorm.linear_linear_disjoint'.. deqp-vk: ../../src/intel/vulkan/anv_private.h:4962: anv_aspect_to_plane: Assertion `!(aspect & ~all_aspects)' failed. We still disable CCS for multiplane formats elsewhere. I've attempted enabling CCS for those cases but end up with failures in CI that I cannot reproduce locally. Hopefully this change gets the next person a step closer towards enabling this feature. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29094>	2024-06-07 00:58:41 +00:00
Nanley Chery	14a0f7391d	anv,hasvk: Drop anv_get_isl_format_with_usage Since `3beaaa9ae8` ("anv: drop lowered storage images code"), this function has not used the VkImageUsageFlags parameter. So, we can drop it and simplify its callers. This function isn't used in hasvk. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29094>	2024-06-07 00:58:41 +00:00
Nanley Chery	3e9dc450a6	anv: Rely on the primary surf usage to disable aux Instead of passing isl_extra_usage_flags to add_aux_surface_if_supported, use the isl_surf::usage field of the primary surface to check for ISL_SURF_USAGE_DISABLE_AUX_BIT. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29094>	2024-06-07 00:58:41 +00:00
Nanley Chery	8e96b516ca	intel/isl: Assert alignments of surface addresses In the import paths in iris, there are several cases where surface VMAs are created without relying on the calculated surface alignment. Asserting the alignments of surface addresses should help catch any cases where we end up with the wrong alignment. This found a couple of issues during development. One, which required a change to existing code, is that when creating uncompressed surfaces from compressed ones, ISL will sometimes increase the image alignment as a result of the new format supporting CCS. This patch adds the usage flag to disable that behavior. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29094>	2024-06-07 00:58:41 +00:00
Nanley Chery	31560d82ad	iris: Simplify bo import in memobj_create_from_handle Looking at the caller, we only import FDs without modifiers. By asserting this behavior and dropping the unused cases, we gain some clarity on the alignment of the imported BO's VMA. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29094>	2024-06-07 00:58:41 +00:00
Nanley Chery	6b969a4b43	intel/isl: Add and use multi-engine surf usage bits Add and use two new surf usage bits: * ISL_SURF_USAGE_MULTI_ENGINE_SEQ_BIT: the surface may be accessed by multiple engines, but not in parallel. * ISL_SURF_USAGE_MULTI_ENGINE_PAR_BIT: the surface may be accessed by multiple engines in parallel. Both usages are not concerned with read-after-read access patterns. Using these bits allows ISL to conditionally use Tile64 or a 64KB alignment to account for the gfx12.5 CCS WA from HSD 22015614752. Apart from the potential space savings, there are three benefits of this approach: 1) CCS can now be used with miptails (though nothing makes use of this today). 2) CCS can now be used with 3D depth/stencil surfaces in GL. 3) CCS can now be used with 3D depth/stencil surfaces in Vulkan when apps only use a single queue. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11111 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11117 Tested-by: Mark Janes <markjanes@swizzler.org> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29094>	2024-06-07 00:58:41 +00:00
Erik Faye-Lund	3053268fd0	mesa/main: updates for EXT_texture_format_BGRA8888 The spec is about to change, so let's prepare for the new and brighter future. Reviewed-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27726>	2024-06-07 00:28:25 +00:00
Eric Engestrom	f81e38e5a9	docs: add sha256sum for 24.0.9 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29588>	2024-06-07 00:21:41 +00:00
Eric Engestrom	15627f9203	docs: update calendar for 24.0.9 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29588>	2024-06-07 00:21:41 +00:00
Eric Engestrom	92a44d3907	docs: add release notes for 24.0.9 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29588>	2024-06-07 00:21:41 +00:00
Nanley Chery	53440554c4	intel/isl: Add and use ISL_MAIN_TO_CCS_SIZE_RATIO_XE In iris, use the CCS scale down factor to calculate the impact of CCS on TBIMR tile sizes. Even though we fall back to a seemingly less accurate method to calculate the impact of CCS, it ends up giving the same answer, 1bpp. Anv already uses this factor, so this patch replaces the constant with this macro. There are two benefits to doing this: 1) Consistency between anv and iris. 2) Preparation for a future where we no longer use ISL surfaces to describe CCS on Xe+. In fact, in iris, we already don't create such surfaces on ACM. I considered using INTEL_AUX_MAP_MAIN_SIZE_SCALEDOWN for the calculation in both drivers, but the naming is aux-map specific and the scaledown actually exists on flat-ccs platforms as well. So, we introduce a new macro for all Xe platforms, currently only used for the specific use case of TBIMR calculations. We can add more such macros for future platforms, as needed. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28942>	2024-06-06 23:57:52 +00:00
Nanley Chery	26655a137f	intel/aux_map: Add and use INTEL_AUX_MAP_MAIN_SIZE_SCALEDOWN Introduce a macro so that drivers don't need to rely on the isl_surf struct to determine the size of the CCS buffer on gfx12. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28942>	2024-06-06 23:57:52 +00:00
Nanley Chery	4ae50eaf70	intel/aux_map: Add and use INTEL_AUX_MAP_META_ALIGNMENT_B Introduce a macro defining the alignment which aux data start addresses should have. This alignment is for the worst case of the CCS buffer being included in a dmabuf. Although a smaller alignment is possible for non-dmabuf cases on TGL, no drivers would make use of that today as they place CCS surfaces directly after tiled surfaces. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28942>	2024-06-06 23:57:52 +00:00
Nanley Chery	e27d951527	intel/aux_map: Add and use INTEL_AUX_MAP_MAIN_PITCH_SCALEDOWN Introduce a macro so that drivers don't need to rely on the isl_surf struct to determine the pitch of the CCS buffer on gfx12. This is useful during layout queries of dmabufs. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28942>	2024-06-06 23:57:52 +00:00
Nanley Chery	e9653b5833	anv: Refactor modifier plane layout queries Before this patch, we special-cased the clear color plane for layout queries. This was because that plane lacks an ISL surface whereas all others have one. We plan to drop the ISL surface for CCS buffers on gfx12 in a future commit. So, in preparation, generalize the clear color plane code to work for every plane queried on a surface that uses modifiers. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28942>	2024-06-06 23:57:52 +00:00
Nanley Chery	0194290bb5	intel/isl: Add and use ISL_DRM_CC_PLANE_PITCH_B At the interfaces which query the pitch of the clear color plane in GL and Vulkan, we've been returning 64B for various reasons. Unify the rationale under a macro. The documentation for the macro is picked from anv, which reflects the most recently synchronized copy of drm_fourcc.h. See the notable changes at `8cd8f3d697`. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28942>	2024-06-06 23:57:52 +00:00
Friedrich Vock	f1742d36f3	radv/rt: Fix memory leak when compiling libraries Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29579>	2024-06-06 21:56:11 +00:00
Daniel Schürmann	c452a4d1cc	aco/ra: use round robin register allocation Totals from 74681 (94.06% of 79395) affected shaders: (GFX11) MaxWaves: 2265668 -> 2263546 (-0.09%); split: +0.01%, -0.10% Instrs: 44941647 -> 44412809 (-1.18%); split: -1.23%, +0.05% CodeSize: 234173852 -> 232009132 (-0.92%); split: -0.97%, +0.05% VGPRs: 3033208 -> 3403000 (+12.19%); split: -0.02%, +12.22% Latency: 305575738 -> 301100302 (-1.46%); split: -1.70%, +0.23% InvThroughput: 49366070 -> 49020000 (-0.70%); split: -0.91%, +0.21% VClause: 875748 -> 854930 (-2.38%); split: -2.65%, +0.27% SClause: 1369614 -> 1327212 (-3.10%); split: -3.43%, +0.33% Copies: 2887932 -> 2883061 (-0.17%); split: -1.93%, +1.76% Branches: 885041 -> 885101 (+0.01%); split: -0.01%, +0.02% VALU: 25218078 -> 25215170 (-0.01%); split: -0.20%, +0.19% SALU: 4328640 -> 4326052 (-0.06%); split: -0.20%, +0.14% VOPD: 9129 -> 9611 (+5.28%); split: +7.48%, -2.20% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>	2024-06-06 21:02:15 +00:00
Daniel Schürmann	197943ae27	aco/ra: change heuristic to first fit Totals from 73175 (92.17% of 79395) affected shaders: (GFX11) MaxWaves: 2217690 -> 2217930 (+0.01%); split: +0.02%, -0.01% Instrs: 44780731 -> 44784895 (+0.01%); split: -0.14%, +0.15% CodeSize: 233238960 -> 233255604 (+0.01%); split: -0.11%, +0.12% VGPRs: 3009116 -> 3007684 (-0.05%); split: -0.29%, +0.24% Latency: 304320163 -> 304286592 (-0.01%); split: -0.31%, +0.30% InvThroughput: 49121992 -> 49145025 (+0.05%); split: -0.20%, +0.25% VClause: 872566 -> 873242 (+0.08%); split: -0.25%, +0.33% SClause: 1359666 -> 1361640 (+0.15%); split: -0.11%, +0.26% Copies: 2879649 -> 2881646 (+0.07%); split: -1.13%, +1.20% Branches: 887102 -> 887093 (-0.00%); split: -0.01%, +0.01% VALU: 25128240 -> 25128572 (+0.00%); split: -0.12%, +0.12% SALU: 4328852 -> 4330559 (+0.04%); split: -0.07%, +0.11% VOPD: 8861 -> 8992 (+1.48%); split: +2.63%, -1.15% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>	2024-06-06 21:02:15 +00:00
Daniel Schürmann	d76fc005b6	aco/ra: re-use registers from killed operands Totals from 77283 (97.34% of 79395) affected shaders: (GFX11) MaxWaves: 2348498 -> 2348250 (-0.01%); split: +0.01%, -0.02% Instrs: 45304558 -> 45097367 (-0.46%); split: -0.57%, +0.11% CodeSize: 235719656 -> 234957768 (-0.32%); split: -0.43%, +0.11% VGPRs: 3065984 -> 3073244 (+0.24%); split: -0.41%, +0.65% Latency: 308010576 -> 307008565 (-0.33%); split: -0.85%, +0.52% InvThroughput: 49560307 -> 49464214 (-0.19%); split: -0.54%, +0.34% VClause: 881895 -> 879739 (-0.24%); split: -0.78%, +0.53% SClause: 1388139 -> 1374634 (-0.97%); split: -1.12%, +0.14% Copies: 2918583 -> 2910434 (-0.28%); split: -1.92%, +1.64% Branches: 893947 -> 893712 (-0.03%); split: -0.06%, +0.03% VALU: 25260728 -> 25256766 (-0.02%); split: -0.20%, +0.19% SALU: 4377750 -> 4373595 (-0.09%); split: -0.17%, +0.07% VOPD: 8603 -> 9163 (+6.51%); split: +8.54%, -2.03% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>	2024-06-06 21:02:15 +00:00
Daniel Schürmann	b054cfe704	aco/ra: move can_write_m0() check into get_reg_specified() This way, affinities are also covered. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>	2024-06-06 21:02:15 +00:00
Daniel Schürmann	8e817cf52b	aco/ra: refactor get_reg_simple() with increased stride. This should avoid some redundant calls. Totals from 153 (0.19% of 79395) affected shaders: (GFX11) Instrs: 301717 -> 301687 (-0.01%); split: -0.06%, +0.05% CodeSize: 1583080 -> 1582988 (-0.01%); split: -0.06%, +0.05% VGPRs: 10068 -> 10348 (+2.78%) Latency: 6685446 -> 6685475 (+0.00%); split: -0.11%, +0.11% InvThroughput: 999241 -> 999316 (+0.01%); split: -0.01%, +0.02% VClause: 3868 -> 3870 (+0.05%) Copies: 23752 -> 23769 (+0.07%); split: -0.27%, +0.34% Branches: 6479 -> 6480 (+0.02%) VALU: 179290 -> 179307 (+0.01%); split: -0.04%, +0.04% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>	2024-06-06 21:02:15 +00:00
Daniel Schürmann	1b0edf3f33	aco/ra: Fix array access when finding register for subdword variables Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>	2024-06-06 21:02:15 +00:00
Daniel Schürmann	5326e033ff	aco/ra: fix handling of killed operands in compact_relocate_vars() Found by inspection. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>	2024-06-06 21:02:14 +00:00
Samuel Pitoiset	afa2070c99	radv: initialize compute preambles with the common helper The PM4 mechanism can emit paired packets on GFX11+ when possible. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29452>	2024-06-06 20:26:47 +00:00
Samuel Pitoiset	3c8b48e310	ac,radeonsi: add a function to initialize compute preambles Preambles are very similar between RADV and RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29452>	2024-06-06 20:26:47 +00:00
Samuel Pitoiset	428601095c	ac,radeonsi: import PM4 state from RadeonSI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29452>	2024-06-06 20:26:47 +00:00
Lionel Landwerlin	62c52fb59d	anv: expose VK_MESA_image_alignment_control Our implementation is a no-op for the following reasons: - ISL always tries to go for the smallest tiling mode (see isl_surf_choose_tiling()) - In the few cases where we need to use Tile64 for compression workarounds, VK_MESA_image_alignment_control doesn't require us to disable compression - vkd3d-proton has the ability to disable compression using VK_EXT_image_compression_control, disabling Tile64 requirements and ensuring ISL can select a 4k tiling mode So vkd3d-proton should always be able to get a 4k tiling mode if it wants to. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29175>	2024-06-06 19:00:47 +00:00
Eric Engestrom	3e7a82968d	nvk+zink/ci: add another flake seen in nightly Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29574>	2024-06-06 18:49:11 +00:00
Samuel Pitoiset	15fe733703	radv: add a helper to get image VA Similar to buffer, and less error prone. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29428>	2024-06-06 18:21:33 +00:00
Rhys Perry	4cfb7a0c17	aco: remove support for sub-dword push constants Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29480>	2024-06-06 17:52:05 +00:00
Rhys Perry	e21312018e	ac/llvm: remove support for sub-dword push constants Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29480>	2024-06-06 17:52:05 +00:00
Rhys Perry	41c5f71343	radv: lower sub-dword push constants Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29480>	2024-06-06 17:52:05 +00:00
Rhys Perry	69b7fcd775	ac/nir: support lowering of sub-dword push constants Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29480>	2024-06-06 17:52:04 +00:00
Yusuf Khan	e7a2127f0e	aux/draw: Use the draw info we get passed in instead of our own Signed-off-by: Yusuf Khan <yusisamerican@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28641>	2024-06-06 17:00:18 +00:00
Yusuf Khan	377600b9df	nv50/vbo: wrap draw_vbo to avoid overhead from multidraw Pretty much the same as the nvc0 patch, with a similar improvement. Signed-off-by: Yusuf Khan <yusisamerican@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> --- v2: remove tmp_info as per Karol Herbst's suggestion v3: nv50_draw_vbo -> nv50_draw_single_vbo per Karol's suggestion v4: mutex assertion and remove num_draws Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28641>	2024-06-06 17:00:18 +00:00
Yusuf Khan	225f2aac96	nvc0/vbo: wrap draw_vbo for multidraw performance This patch avoids the high overhead that exists when trying to kick every single draw during multidraw. glMultiDrawArrays performance profiling: 342.5 thousand draws/second -> 40 million draws/second Special thanks to Arthur Huillet for helping get this profiled on IRC. Signed-off-by: Yusuf Khan <yusisamerican@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> --- v2: fix typos pointed out by Arthur v3: nvc0_draw_vbo -> nvc0_draw_single_vbo, initialize count v4: remove num_draws from wrapped function and add mutex assert Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28641>	2024-06-06 17:00:18 +00:00
Georg Lehmann	3fb1a64918	aco: move s_add_u32 -> s_addk_i32 optimization fully to ra Having this in one place is better. When I wrote the old code, I wasn't aware that checking the kill flag on definitions is the same as checking zero uses. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512>	2024-06-06 16:28:23 +00:00
Georg Lehmann	60f3f0fdbb	aco/ra: use a switch to check vop2acc instruction support Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512>	2024-06-06 16:28:23 +00:00
Georg Lehmann	fdc2fb6835	aco: move literal unswizzle opt to RA Much simpler. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512>	2024-06-06 16:28:23 +00:00
Georg Lehmann	c63c750380	aco/gfx11+: fix inline constants for v_pk_fmac_f16 On newer hardware, the hi operation reads the lo half of the inline constant. On older hardware, it reads the hi half (zero). I tested this on Navi31 for gfx11 and Raphael for gfx10. Foz-DB Navi31: Totals from 4 (0.01% of 79395) affected shaders: CodeSize: 36832 -> 36448 (-1.04%) Latency: 20362 -> 20334 (-0.14%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512>	2024-06-06 16:28:23 +00:00
Georg Lehmann	39380d475a	aco: add affinities for possible sopk optimizations Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512>	2024-06-06 16:28:23 +00:00
Georg Lehmann	fac475bc25	aco: rework how affinities for acc operands are determined Improve accuracy by adding a helper that's also used by the optimization function. Foz-DB Navi31: Totals from 50 (0.06% of 79206) affected shaders: CodeSize: 126148 -> 126128 (-0.02%); split: -0.05%, +0.04% Latency: 334049 -> 334060 (+0.00%); split: -0.00%, +0.00% InvThroughput: 59203 -> 59205 (+0.00%) Copies: 2011 -> 1998 (-0.65%); split: -0.75%, +0.10% VALU: 14221 -> 14208 (-0.09%); split: -0.11%, +0.01% Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29512>	2024-06-06 16:28:23 +00:00

190251 Commits