third_party_mesa3d

Author	SHA1	Message	Date
Marek Olšák	1299f5c50a	gallium/radeon: import libdrm_radeon source code, drop the dependency Only radeon_surface.h/c is used from libdrm and radeon_drm.h is imported too. This code doesn't change anymore. We don't need the dependency. Acked-by: Pavel Ondračka <pavel.ondracka@gmail.com> Acked-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31827>	2024-11-10 00:52:18 +00:00
Russell Greene	ae9d365686	perfetto: fix macos compile On macos, <sys/types.h> does not declare clockid_t, but it's instead in <time.h>, which also includes <sys/types.h> on Linux, so just include <time.h> on all UNIX platforms. Fixes: `a871eabc` Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12064 Tested-by: Vinson Lee <vlee@freedesktop.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31881>	2024-11-09 09:23:22 +00:00
Deborah Brouwer	276447ef81	ci/b2c: update RESULTS_DIR for .b2c-test jobs Since $RESULTS_DIR is now centrally defined in setup-test-env.sh it's no longer necessary to manually add a hard-coded results directory for the b2c-test job results. This keeps the results directory consistent between b2c-test jobs and lava. Fixes: `9b6d14aed1` ("ci: Always create results dir from init") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32051>	2024-11-09 08:40:48 +00:00
Deborah Brouwer	b5b2515f86	ci: Remove duplicate slash before $RESULTS_DIR The RESULTS_DIR variable is defined by reference to the present working directory, but if the pwd is the root directory then the $RESULTS_DIR begins with two slashes instead of one like this: //results. This is harmless but not necessary, so remove it. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32051>	2024-11-09 08:40:48 +00:00
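The double slash described in the commit above comes from plain string concatenation when the working directory is the root directory. The following is a minimal sketch in Python, not the CI code itself (the real scripts are shell); both function names are hypothetical and exist only to illustrate the behaviour.

```python
def results_dir_naive(pwd: str) -> str:
    # Plain concatenation: yields "//results" when pwd is "/".
    return pwd + "/results"


def results_dir_clean(pwd: str) -> str:
    # Drop any trailing slash first, so the root directory gives "/results".
    return pwd.rstrip("/") + "/results"


if __name__ == "__main__":
    for pwd in ("/", "/builds/mesa"):
        print(pwd, results_dir_naive(pwd), results_dir_clean(pwd))
```

Both variants name the same directory on POSIX systems, which is why the commit calls the extra slash harmless.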
Deborah Brouwer	e368623fff	freedreno/ci: add prefix for a630-vk-asan tests Currently a630-vk-asan has separate files for its expected failures and skips, but by using the deqp-runner prefix option, the job can use the common a630 expectation files. This simplifies `a630-vk-asan` without any substantive changes to the ci job. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31970>	2024-11-09 08:15:36 +00:00
Alyssa Rosenzweig	0a81434adf	agx: rewrite address mode lowering AGX load/stores support a single family of addressing modes: 64-bit base + sign/zero-extend(32-bit) << (format shift + optional shift) This is a base-index addressing mode, where the index is minimally in elements (never bytes, unless we're doing 8-bit load/stores). Both base and the resulting address must be aligned to the format size; the mandatory shift means that alignment of base is equivalent to alignment of the final address, which is taken care of by lower_mem_access_bit_size anyhow. The other key thing to note is that this is a 64-bit shift, after the sign- or zero-extension of the 32-bit index. That means that AGX does NOT implement 64-bit base + sign/zero-extend(32-bit << shift) This has sweeping implications. For addressing math from C-based languages (including OpenCL C), the AGX mode is more helpful, since we tend to get 64-bit shifts instead of 32-bit shifts. However, for addressing math coming from GLSL, the AGX mode is rather annoying since we know UBOs/SSBOs are at most 4GB so nir_lower_io & friends are all 32-bit byte indexing. It's tricky to teach them to do otherwise, and would not be optimal either since 64-bit adds and shifts are usually much more expensive than 32-bit on AGX except for when fused into the load/store. So we don't want 32-bit NIR, since then we can't use the hardware addressing mode at all. We also don't want 64-bit NIR, since then we have excessive 64-bit math resulting from deep deref chains from complex struct/array cases. Instead, we want a middle ground: 32-bit operations that are guaranteed not to overflow 32-bit and can therefore be losslessly promoted to 64-bit. We can make that no-overflow guarantee as a consequence of the maximum UBO/SSBO size, and indeed Mesa relies on this already all over the place. So, in this series, we use relaxed amul opcodes for addressing math. Then, we rewrite our address mode pattern matching to fuse AGX address modes. 
The actual pattern matching is rewritten. The old code was brittle handwritten nir_scalar chasing, based on a faulty model of the hardware (with the 32-bit shift). We delete it all; it's broken. In the new approach, we add some NIR pseudo-opcodes for address math (ulea_agx/ilea_agx) which we pattern match with NIR algebraic rules. Then the chasing required to fuse LEA's into load/stores is trivial because we never go deeper than 1 level. After fusing, we then lower the leftover lea/amul opcodes and let regular nir_opt_algebraic take it from here. We do need to be very careful around pass order to make sure things like load/store vectorization still happen. Some passes are shuffled in this commit to make this work. We also need to clean up amul before fusing since we specifically do not have nir_opt_algebraic do so - the entire point of the pseudo-opcodes is to make nir_opt_algebraic ignore the opcodes until we've had a chance to fuse. If we simply used the .nuw bit on iadd/imul, nir_opt_algebraic would "optimize" things and lose the bit and then we would fail to fuse addressing modes, which is a much more expensive failure case than anything nir_opt_algebraic can do for us. I don't know what the "optimal" pass order for AGX would look like at this point, but what we have here is good enough for now and is a net positive for shader-db. That all ends up being much less code and much simpler code, while fixing the soundness holes in the old code, and also optimizing a significantly richer set of addressing calculations. Now we don't just optimize GL/VK modes, but also CL. This is crucial even for GL/VK performance, since we rely on CL via libagx even in graphics shaders. Terraintessellation is up 10% to ~310fps, which is quite nice. The following stats are for the end of the series together, including this change + libagx change + the NIR changes building up to this... but not including the SSBO vectorizer stats or the IC modelling fix. 
In other words, these are the stats for "rewriting address mode handling". This is on OpenGL, and since the old code was targeted at GL, anything that's not a loss is good enough - we need this for the soundness fix regardless. total instructions in shared programs: 2751356 -> 2750518 (-0.03%) instructions in affected programs: 372143 -> 371305 (-0.23%) helped: 715 HURT: 75 Instructions are helped. total alu in shared programs: 2279559 -> 2278721 (-0.04%) alu in affected programs: 304170 -> 303332 (-0.28%) helped: 715 HURT: 75 Alu are helped. total fscib in shared programs: 2277843 -> 2277008 (-0.04%) fscib in affected programs: 304167 -> 303332 (-0.27%) helped: 715 HURT: 75 Fscib are helped. total ic in shared programs: 632686 -> 621886 (-1.71%) ic in affected programs: 113078 -> 102278 (-9.55%) helped: 1159 HURT: 82 Ic are helped. total bytes in shared programs: 21489034 -> 21477530 (-0.05%) bytes in affected programs: 3018456 -> 3006952 (-0.38%) helped: 751 HURT: 107 Bytes are helped. total regs in shared programs: 865148 -> 865114 (<.01%) regs in affected programs: 1603 -> 1569 (-2.12%) helped: 10 HURT: 9 Inconclusive result (value mean confidence interval includes 0). total uniforms in shared programs: 2120735 -> 2120792 (<.01%) uniforms in affected programs: 22752 -> 22809 (0.25%) helped: 76 HURT: 49 Inconclusive result (value mean confidence interval includes 0). total threads in shared programs: 27613312 -> 27613504 (<.01%) threads in affected programs: 1536 -> 1728 (12.50%) helped: 3 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
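The difference between the two addressing forms in the commit message above can be made concrete with a small model. This is a sketch in Python under stated assumptions, not driver code: the function names are hypothetical, the format shift and optional shift are folded into a single `shift` argument, and alignment requirements are ignored.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def zext32(x: int) -> int:
    # Zero-extend a 32-bit value.
    return x & MASK32


def sext32(x: int) -> int:
    # Sign-extend a 32-bit value to a Python integer.
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def agx_address(base: int, index: int, shift: int, signed: bool = False) -> int:
    # Form described in the commit: extend the 32-bit index first,
    # then apply a 64-bit shift and add the 64-bit base.
    ext = sext32(index) if signed else zext32(index)
    return (base + (ext << shift)) & MASK64


def shift32_address(base: int, index: int, shift: int, signed: bool = False) -> int:
    # Form the hardware does NOT implement: shift in 32 bits (which may wrap),
    # then extend the result and add the base.
    shifted = (index << shift) & MASK32
    ext = sext32(shifted) if signed else zext32(shifted)
    return (base + ext) & MASK64


if __name__ == "__main__":
    base = 0x1_0000_0000
    # Small index: the 32-bit shift cannot wrap, so both forms agree.
    print(hex(agx_address(base, 16, 2)), hex(shift32_address(base, 16, 2)))
    # Large index: the 32-bit shift wraps to zero and the forms diverge.
    print(hex(agx_address(base, 0x4000_0000, 2)),
          hex(shift32_address(base, 0x4000_0000, 2)))
```

The two forms agree whenever the 32-bit shift does not overflow. That is the no-overflow guarantee the commit derives from the maximum UBO/SSBO size, and it is what allows 32-bit address math to be promoted losslessly to the 64-bit hardware mode.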
Alyssa Rosenzweig	d466ccc6bd	libagx: promote math to use AGX address mode we want to fit into the 64 + ext() << #n pattern to let us fuse address arithmetic into our loads, so rework some libagx addressing to better match that Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	77ce91e99b	hk: reduce max SSBO size Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	01d2aa1d53	agx: fix bfeil timing Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	db8d467ec6	agx: model IC dispatch Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	3c222da6c0	agx: vectorize SSBOs this was missed due to the lowering, and mitigates a lot of stats weirdness with the address mode rework. total instructions in shared programs: 2755170 -> 2751399 (-0.14%) instructions in affected programs: 16323 -> 12552 (-23.10%) helped: 71 HURT: 0 helped stats (abs) min: 10 max: 178 x̄: 53.11 x̃: 42 helped stats (rel) min: 2.04% max: 50.00% x̄: 34.73% x̃: 40.79% 95% mean confidence interval for instructions value: -60.94 -45.28 95% mean confidence interval for instructions %-change: -37.81% -31.65% Instructions are helped. total alu in shared programs: 2169888 -> 2168281 (-0.07%) alu in affected programs: 9547 -> 7940 (-16.83%) helped: 71 HURT: 0 helped stats (abs) min: 5 max: 90 x̄: 22.63 x̃: 16 helped stats (rel) min: 1.02% max: 43.33% x̄: 25.39% x̃: 29.41% 95% mean confidence interval for alu value: -26.33 -18.93 95% mean confidence interval for alu %-change: -27.91% -22.87% Alu are helped. total fscib in shared programs: 2165597 -> 2163990 (-0.07%) fscib in affected programs: 9547 -> 7940 (-16.83%) helped: 71 HURT: 0 helped stats (abs) min: 5 max: 90 x̄: 22.63 x̃: 16 helped stats (rel) min: 1.02% max: 43.33% x̄: 25.39% x̃: 29.41% 95% mean confidence interval for fscib value: -26.33 -18.93 95% mean confidence interval for fscib %-change: -27.91% -22.87% Fscib are helped. total bytes in shared programs: 21517750 -> 21489352 (-0.13%) bytes in affected programs: 126270 -> 97872 (-22.49%) helped: 71 HURT: 0 helped stats (abs) min: 80 max: 1084 x̄: 399.97 x̃: 324 helped stats (rel) min: 1.77% max: 50.57% x̄: 35.07% x̃: 42.31% 95% mean confidence interval for bytes value: -455.66 -344.28 95% mean confidence interval for bytes %-change: -38.34% -31.79% Bytes are helped. 
total regs in shared programs: 864490 -> 865162 (0.08%) regs in affected programs: 4567 -> 5239 (14.71%) helped: 4 HURT: 61 helped stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6 helped stats (rel) min: 4.51% max: 5.13% x̄: 4.82% x̃: 4.82% HURT stats (abs) min: 2 max: 24 x̄: 11.41 x̃: 12 HURT stats (rel) min: 1.98% max: 82.35% x̄: 21.05% x̃: 16.00% 95% mean confidence interval for regs value: 8.52 12.16 95% mean confidence interval for regs %-change: 14.91% 24.00% Regs are HURT. total threads in shared programs: 27613056 -> 27613312 (<.01%) threads in affected programs: 3200 -> 3456 (8.00%) helped: 4 HURT: 0 helped stats (abs) min: 64 max: 64 x̄: 64.00 x̃: 64 helped stats (rel) min: 7.69% max: 8.33% x̄: 8.01% x̃: 8.01% 95% mean confidence interval for threads value: 64.00 64.00 95% mean confidence interval for threads %-change: 7.42% 8.60% Threads are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
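The percentages in shader-db reports like the one above are relative changes between the before and after totals. A minimal sketch of that arithmetic in Python follows; the helper name `pct_change` is hypothetical and is not part of shader-db.

```python
def pct_change(before: int, after: int) -> float:
    # Relative change in percent; negative means the count went down.
    return (after - before) / before * 100.0


if __name__ == "__main__":
    # Instructions in affected programs, from the report above.
    print(f"{pct_change(16323, 12552):.2f}%")
    # Regs in affected programs, from the report above.
    print(f"{pct_change(4567, 5239):.2f}%")
```

With the figures from the report, `pct_change(16323, 12552)` rounds to -23.10 and `pct_change(4567, 5239)` rounds to 14.71, matching the printed values.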
Alyssa Rosenzweig	b593a6aa98	rusticl: respect late_lower_int64 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	5c73a8af44	nir/lower_uniforms_to_ubo: use amul Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	fc460e7f20	nir/opt_algebraic: don't lower amul if requested Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	1f3c97547a	nir/builder: use amul over ishl on agx ishl can wrap, amul cannot. so we need amul in the backend, or otherwise we would need to introduce an ashl opcode instead. that doesn't seem better. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	9ab8d70fa6	nir: add ilea_agx/ulea_agx opcodes to facilitate address mode lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	23afe968ad	nir: add late_lower_int64 option Some drivers generally need int64 lowered, but prefer to do this lowering themselves late, to have a chance to optimize targeted int64 patterns before lowering the rest. This isn't currently possible since nir_lower_int64 takes no options except what's const* in the shader, and frontends call nir_lower_int64 before passing the shader off to the driver. Add an option to defer int64 lowering. This is a bit ugly but the alternative is replumbing nir_lower_int64's option handling cross-tree and no-thank-you-not-right-now. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	eaf75169ee	nir: add amul flag Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	227026b7ad	nir/opt_algebraic: add another 64-bit pattern clpeak Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	2a3f133fd0	nir/opt_algebraic: add more 64-bit patterns Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:41 -04:00
Alyssa Rosenzweig	a4a3487aae	nir/opt_algebraic: optimize patterns from Skia shaders/skia/1567.shader_test relies on algebraic + constant folding, subtle changes in the input compiling flow can cause it to balloon. these patterns fix that. annoying! shader-db results aren't amazing, but they avert a major stats regression for that one Skia shader. total instructions in shared programs: 2751399 -> 2751295 (<.01%) instructions in affected programs: 6509 -> 6405 (-1.60%) helped: 21 HURT: 1 helped stats (abs) min: 1 max: 14 x̄: 5.62 x̃: 6 helped stats (rel) min: 0.53% max: 13.73% x̄: 3.57% x̃: 1.62% HURT stats (abs) min: 14 max: 14 x̄: 14.00 x̃: 14 HURT stats (rel) min: 2.45% max: 2.45% x̄: 2.45% x̃: 2.45% 95% mean confidence interval for instructions value: -7.09 -2.36 95% mean confidence interval for instructions %-change: -5.14% -1.45% Instructions are helped. total alu in shared programs: 2274577 -> 2274468 (<.01%) alu in affected programs: 6178 -> 6069 (-1.76%) helped: 21 HURT: 1 helped stats (abs) min: 1 max: 14 x̄: 5.86 x̃: 7 helped stats (rel) min: 0.55% max: 16.47% x̄: 3.93% x̃: 1.72% HURT stats (abs) min: 14 max: 14 x̄: 14.00 x̃: 14 HURT stats (rel) min: 2.83% max: 2.83% x̄: 2.83% x̃: 2.83% 95% mean confidence interval for alu value: -7.35 -2.56 95% mean confidence interval for alu %-change: -5.67% -1.57% Alu are helped. total fscib in shared programs: 2272894 -> 2272785 (<.01%) fscib in affected programs: 6178 -> 6069 (-1.76%) helped: 21 HURT: 1 helped stats (abs) min: 1 max: 14 x̄: 5.86 x̃: 7 helped stats (rel) min: 0.55% max: 16.47% x̄: 3.93% x̃: 1.72% HURT stats (abs) min: 14 max: 14 x̄: 14.00 x̃: 14 HURT stats (rel) min: 2.83% max: 2.83% x̄: 2.83% x̃: 2.83% 95% mean confidence interval for fscib value: -7.35 -2.56 95% mean confidence interval for fscib %-change: -5.67% -1.57% Fscib are helped. 
total bytes in shared programs: 21489352 -> 21488668 (<.01%) bytes in affected programs: 53362 -> 52678 (-1.28%) helped: 21 HURT: 2 helped stats (abs) min: 6 max: 98 x̄: 35.52 x̃: 40 helped stats (rel) min: 0.39% max: 10.63% x̄: 2.27% x̃: 1.27% HURT stats (abs) min: 2 max: 60 x̄: 31.00 x̃: 31 HURT stats (rel) min: 0.08% max: 1.40% x̄: 0.74% x̃: 0.74% 95% mean confidence interval for bytes value: -42.73 -16.74 95% mean confidence interval for bytes %-change: -3.13% -0.89% Bytes are helped. total regs in shared programs: 865162 -> 865148 (<.01%) regs in affected programs: 509 -> 495 (-2.75%) helped: 4 HURT: 5 helped stats (abs) min: 2 max: 14 x̄: 6.00 x̃: 4 helped stats (rel) min: 3.17% max: 35.90% x̄: 14.01% x̃: 8.48% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 3.17% max: 3.17% x̄: 3.17% x̃: 3.17% 95% mean confidence interval for regs value: -5.75 2.64 95% mean confidence interval for regs %-change: -14.31% 5.39% Inconclusive result (value mean confidence interval includes 0). total uniforms in shared programs: 2120731 -> 2120735 (<.01%) uniforms in affected programs: 358 -> 362 (1.12%) helped: 1 HURT: 2 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 2.94% max: 2.94% x̄: 2.94% x̃: 2.94% HURT stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3 HURT stats (rel) min: 1.05% max: 4.00% x̄: 2.53% x̃: 2.53% Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:41 -04:00
Chia-I Wu	015f6a7aff	panvk: ensure res table is restored after meta Set res_table to 0 to ensure that the res table is re-emitted. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Fixes: `5067921349` ("panvk: Switch to vk_meta") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32044>	2024-11-09 00:48:47 +00:00
Jianxun Zhang	8906816f49	anv,hasvk,genxml: Rename genxml files using verx10 It could be confusing that a newer platform is named with a smaller number than a half-generation of an older platform, like 'gfx20' and 'gfx75' in xml files. Down the road, it can be a little worse once we pass something like 'gfx40' when there is already a gfx45.xml for the oldest platform. Unify naming xml files with verx10 numbers to resolve the issue. Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31943>	2024-11-09 00:04:47 +00:00
Eric Engestrom	7e0e433482	radv+zink/ci: add flakes seen recently Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32066>	2024-11-08 22:49:21 +00:00
Eric Engestrom	66df09ffda	nvk+zink/ci: add flakes seen recently Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32066>	2024-11-08 22:49:21 +00:00
Eric Engestrom	f9593d9eb5	freedreno/ci: add flakes seen recently Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32066>	2024-11-08 22:49:21 +00:00
Eric Engestrom	4ab210f588	broadcom/ci: add flakes seen recently Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32066>	2024-11-08 22:49:21 +00:00
Eric Engestrom	8d2620569c	ci: make error handling quieter Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32054>	2024-11-08 20:56:46 +00:00
Eric Engestrom	e5708ab2b4	ci: use quiet alias for commands And set x_off again when nesting these functions but we're not done and we have more after. Fixes: `d69bd58365` ("ci: consistently restore `-x` after temporarily disabling it") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32054>	2024-11-08 20:56:46 +00:00
Eric Engestrom	5cd054ebe5	ci: move error handling functions at the end So that everything is defined by the time we use it in here. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32054>	2024-11-08 20:56:46 +00:00
Iván Briano	aee04bf4fb	intel/rt: fix ray_query stack address calculation While the documentation says to use NUM_SIMD_LANES_PER_DSS for the stack address calculation, what the HW actually uses is NUM_SYNC_STACKID_PER_DSS. The former may vary depending on the platform, while the latter is fixed to 2048 for all current platforms. Fixes: `6c84cbd8c9` ("intel/dev/xe: Set max_eus_per_subslice using topology query") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32049>	2024-11-08 18:31:52 +00:00
Ian Romanick	7aad19ccd2	brw/lower: Lower invalid source conversion to better code There are two fragment shaders from RDR2 that are hurt for spills and fills on Lunar Lake. Totals from 2 (0.00% of 551413) affected shaders: Spill count: 1252 -> 1317 (+5.19%) Fill count: 2518 -> 2642 (+4.92%) Those shaders... have a lot of room for improvement. There are some patterns in those shaders that we handle very, very poorly. Improving those patterns would likely improve the spills and fills in these shaders quite dramatically. Given how much other platforms are helped, I don't think this should block this commit. No shader-db or fossil-db changes on any pre-Gfx12.5 Intel platforms. v2: Add some comments and an additional assertion. Suggested by Ken. shader-db: Lunar Lake total instructions in shared programs: 18094517 -> 18094511 (<.01%) instructions in affected programs: 809 -> 803 (-0.74%) helped: 6 / HURT: 0 total cycles in shared programs: 921532158 -> 921532168 (<.01%) cycles in affected programs: 2266 -> 2276 (0.44%) helped: 0 / HURT: 3 Meteor Lake and DG2 had similar results. (Meteor Lake shown) total instructions in shared programs: 19820845 -> 19820839 (<.01%) instructions in affected programs: 803 -> 797 (-0.75%) helped: 6 / HURT: 0 total cycles in shared programs: 906372999 -> 906372949 (<.01%) cycles in affected programs: 3216 -> 3166 (-1.55%) helped: 6 / HURT: 0 fossil-db: Lunar Lake Totals: Instrs: 141887377 -> 141884465 (-0.00%); split: -0.00%, +0.00% Cycle count: 21990301498 -> 21990267232 (-0.00%); split: -0.00%, +0.00% Spill count: 69732 -> 69797 (+0.09%) Fill count: 128521 -> 128645 (+0.10%) Totals from 349 (0.06% of 551413) affected shaders: Instrs: 506117 -> 503205 (-0.58%); split: -0.79%, +0.21% Cycle count: 32362996 -> 32328730 (-0.11%); split: -0.52%, +0.41% Spill count: 1951 -> 2016 (+3.33%) Fill count: 4899 -> 5023 (+2.53%) Meteor Lake and DG2 had similar results. 
(Meteor Lake shown) Totals: Instrs: 152773732 -> 152761383 (-0.01%); split: -0.01%, +0.00% Cycle count: 17187529968 -> 17187450663 (-0.00%); split: -0.00%, +0.00% Spill count: 79279 -> 79003 (-0.35%) Fill count: 148803 -> 147942 (-0.58%) Scratch Memory Size: 3949568 -> 3946496 (-0.08%) Max live registers: 31879325 -> 31879230 (-0.00%) Totals from 366 (0.06% of 633185) affected shaders: Instrs: 557377 -> 545028 (-2.22%); split: -2.22%, +0.01% Cycle count: 26171205 -> 26091900 (-0.30%); split: -0.54%, +0.24% Spill count: 3238 -> 2962 (-8.52%) Fill count: 10018 -> 9157 (-8.59%) Scratch Memory Size: 257024 -> 253952 (-1.20%) Max live registers: 28187 -> 28092 (-0.34%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	2a57568ebd	brw/build: Add scalar_group() helper Some uses of the old pattern still exist. The use in brw_fs_nir.cpp is deleted by commits !29884. The use in brw_lower_logical_sends.cpp seems different, so I decided to keep it. The next commit wants to use this. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	5dfea87623	brw/opt: Always do both kinds of copy propagation before lower_load_payload shader-db: All Intel platforms except Skylake had similar results. (Lunar Lake shown) total instructions in shared programs: 18092932 -> 18092713 (<.01%) instructions in affected programs: 139290 -> 139071 (-0.16%) helped: 103 HURT: 18 helped stats (abs) min: 1 max: 8 x̄: 2.43 x̃: 2 helped stats (rel) min: 0.02% max: 9.09% x̄: 0.73% x̃: 0.29% HURT stats (abs) min: 1 max: 5 x̄: 1.72 x̃: 1 HURT stats (rel) min: 0.02% max: 0.55% x̄: 0.10% x̃: 0.08% 95% mean confidence interval for instructions value: -2.17 -1.45 95% mean confidence interval for instructions %-change: -0.83% -0.38% Instructions are helped. total cycles in shared programs: 922792268 -> 921495900 (-0.14%) cycles in affected programs: 400296984 -> 399000616 (-0.32%) helped: 765 HURT: 635 helped stats (abs) min: 2 max: 77018 x̄: 6739.33 x̃: 60 helped stats (rel) min: <.01% max: 35.59% x̄: 1.98% x̃: 0.32% HURT stats (abs) min: 2 max: 88658 x̄: 6077.51 x̃: 152 HURT stats (rel) min: <.01% max: 51.33% x̄: 2.75% x̃: 0.63% 95% mean confidence interval for cycles value: -1620.41 -231.54 95% mean confidence interval for cycles %-change: -0.10% 0.44% Inconclusive result (%-change mean confidence interval includes 0). LOST: 4 GAINED: 3 Skylake total instructions in shared programs: 18658324 -> 18579715 (-0.42%) instructions in affected programs: 2089957 -> 2011348 (-3.76%) helped: 9842 HURT: 23 helped stats (abs) min: 1 max: 24 x̄: 7.99 x̃: 8 helped stats (rel) min: 0.05% max: 40.00% x̄: 5.37% x̃: 4.52% HURT stats (abs) min: 1 max: 5 x̄: 1.57 x̃: 1 HURT stats (rel) min: 0.02% max: 1.28% x̄: 0.36% x̃: 0.24% 95% mean confidence interval for instructions value: -7.98 -7.95 95% mean confidence interval for instructions %-change: -5.43% -5.29% Instructions are helped. 
total cycles in shared programs: 860031654 -> 860237548 (0.02%) cycles in affected programs: 449175235 -> 449381129 (0.05%) helped: 7895 HURT: 4416 helped stats (abs) min: 1 max: 14129 x̄: 113.70 x̃: 22 helped stats (rel) min: <.01% max: 40.95% x̄: 1.31% x̃: 0.56% HURT stats (abs) min: 1 max: 33397 x̄: 249.89 x̃: 34 HURT stats (rel) min: <.01% max: 67.47% x̄: 2.65% x̃: 0.65% 95% mean confidence interval for cycles value: 1.46 31.98 95% mean confidence interval for cycles %-change: 0.02% 0.19% Cycles are HURT. LOST: 557 GAINED: 900 fossil-db: Lunar Lake Totals: Instrs: 141933621 -> 141884681 (-0.03%); split: -0.03%, +0.00% Cycle count: 21990657282 -> 21990200212 (-0.00%); split: -0.14%, +0.14% Spill count: 69754 -> 69732 (-0.03%); split: -0.05%, +0.02% Fill count: 128559 -> 128521 (-0.03%); split: -0.05%, +0.02% Scratch Memory Size: 5934080 -> 5925888 (-0.14%) Max live registers: 48021653 -> 48051253 (+0.06%); split: -0.00%, +0.06% Totals from 13510 (2.45% of 551410) affected shaders: Instrs: 19497180 -> 19448240 (-0.25%); split: -0.25%, +0.00% Cycle count: 2455370202 -> 2454913132 (-0.02%); split: -1.25%, +1.23% Spill count: 10975 -> 10953 (-0.20%); split: -0.32%, +0.12% Fill count: 21709 -> 21671 (-0.18%); split: -0.28%, +0.10% Scratch Memory Size: 674816 -> 666624 (-1.21%) Max live registers: 2502653 -> 2532253 (+1.18%); split: -0.01%, +1.19% Meteor Lake and DG2 had similar results. 
(Meteor Lake shown) Totals: Instrs: 152763523 -> 152772716 (+0.01%); split: -0.00%, +0.01% Cycle count: 17188701887 -> 17187510768 (-0.01%); split: -0.10%, +0.09% Spill count: 79280 -> 79279 (-0.00%); split: -0.00%, +0.00% Fill count: 148809 -> 148803 (-0.00%) Max live registers: 31879240 -> 31879093 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 5559984 -> 5559712 (-0.00%); split: +0.00%, -0.01% Totals from 20524 (3.24% of 633183) affected shaders: Instrs: 20366964 -> 20376157 (+0.05%); split: -0.01%, +0.05% Cycle count: 2406162382 -> 2404971263 (-0.05%); split: -0.68%, +0.63% Spill count: 19935 -> 19934 (-0.01%); split: -0.02%, +0.01% Fill count: 34487 -> 34481 (-0.02%) Max live registers: 1745598 -> 1745451 (-0.01%); split: -0.01%, +0.01% Max dispatch width: 117992 -> 117720 (-0.23%); split: +0.03%, -0.26% Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) Totals: Instrs: 150694108 -> 150683859 (-0.01%); split: -0.01%, +0.00% Cycle count: 15526754059 -> 15529031079 (+0.01%); split: -0.10%, +0.12% Max live registers: 31791599 -> 31791441 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 5569488 -> 5569296 (-0.00%); split: +0.00%, -0.01% Totals from 15000 (2.37% of 632406) affected shaders: Instrs: 10965577 -> 10955328 (-0.09%); split: -0.11%, +0.02% Cycle count: 2025347115 -> 2027624135 (+0.11%); split: -0.80%, +0.91% Max live registers: 983373 -> 983215 (-0.02%); split: -0.02%, +0.00% Max dispatch width: 83064 -> 82872 (-0.23%); split: +0.12%, -0.35% Skylake Totals: Instrs: 140588784 -> 140413758 (-0.12%); split: -0.13%, +0.00% Cycle count: 14724286265 -> 14723402393 (-0.01%); split: -0.04%, +0.04% Fill count: 100130 -> 100129 (-0.00%) Max live registers: 31418029 -> 31417146 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 5513400 -> 5535192 (+0.40%); split: +0.89%, -0.49% Totals from 39733 (6.35% of 625986) affected shaders: Instrs: 17240737 -> 17065711 (-1.02%); split: -1.02%, +0.01% Cycle count: 1994668203 -> 1993784331 (-0.04%); 
split: -0.31%, +0.27% Fill count: 44481 -> 44480 (-0.00%) Max live registers: 2766781 -> 2765898 (-0.03%); split: -0.03%, +0.00% Max dispatch width: 210600 -> 232392 (+10.35%); split: +23.23%, -12.89% Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	be26012f1d	brw/opt: Always do copy prop, DCE, and register coalesce after lower_regioning shader-db: Lunar Lake total instructions in shared programs: 18100289 -> 18083853 (-0.09%) instructions in affected programs: 790048 -> 773612 (-2.08%) helped: 3058 / HURT: 1 total cycles in shared programs: 921691992 -> 921293816 (-0.04%) cycles in affected programs: 37210762 -> 36812586 (-1.07%) helped: 2329 / HURT: 624 LOST: 27 GAINED: 26 Meteor Lake, DG2, Tiger Lake, and Ice Lake had similar results. (Meteor Lake shown) total instructions in shared programs: 19825635 -> 19821391 (-0.02%) instructions in affected programs: 138675 -> 134431 (-3.06%) helped: 877 / HURT: 0 total cycles in shared programs: 907900598 -> 907885713 (<.01%) cycles in affected programs: 7127161 -> 7112276 (-0.21%) helped: 318 / HURT: 242 total spills in shared programs: 5790 -> 5758 (-0.55%) spills in affected programs: 660 -> 628 (-4.85%) helped: 8 / HURT: 0 total fills in shared programs: 6744 -> 6712 (-0.47%) fills in affected programs: 708 -> 676 (-4.52%) helped: 8 / HURT: 0 LOST: 10 GAINED: 0 Skylake total instructions in shared programs: 18722197 -> 18637637 (-0.45%) instructions in affected programs: 2757553 -> 2672993 (-3.07%) helped: 12290 / HURT: 1 total cycles in shared programs: 859716039 -> 859432560 (-0.03%) cycles in affected programs: 113731837 -> 113448358 (-0.25%) helped: 9555 / HURT: 2422 LOST: 265 GAINED: 714 fossil-db: Lunar Lake, Meteor Lake, and DG2 had similar results. 
(Lunar Lake shown) Totals: Instrs: 142000618 -> 141928331 (-0.05%); split: -0.05%, +0.00% Subgroup size: 10995136 -> 10995072 (-0.00%) Cycle count: 21994723230 -> 21990481140 (-0.02%); split: -0.08%, +0.06% Spill count: 69911 -> 69754 (-0.22%); split: -0.23%, +0.00% Fill count: 128723 -> 128559 (-0.13%); split: -0.15%, +0.02% Scratch Memory Size: 5936128 -> 5934080 (-0.03%) Max live registers: 48006880 -> 48020936 (+0.03%); split: -0.01%, +0.04% Totals from 17450 (3.16% of 551410) affected shaders: Instrs: 14984149 -> 14911862 (-0.48%); split: -0.48%, +0.00% Subgroup size: 365744 -> 365680 (-0.02%) Cycle count: 2585095128 -> 2580853038 (-0.16%); split: -0.71%, +0.54% Spill count: 20893 -> 20736 (-0.75%); split: -0.76%, +0.00% Fill count: 44181 -> 44017 (-0.37%); split: -0.44%, +0.07% Scratch Memory Size: 995328 -> 993280 (-0.21%) Max live registers: 2378069 -> 2392125 (+0.59%); split: -0.20%, +0.79% Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown) Totals: Instrs: 150719758 -> 150676269 (-0.03%); split: -0.04%, +0.01% Subgroup size: 7764560 -> 7764632 (+0.00%) Cycle count: 15526689814 -> 15525687740 (-0.01%); split: -0.03%, +0.02% Spill count: 60120 -> 59472 (-1.08%); split: -1.17%, +0.10% Fill count: 105973 -> 104675 (-1.22%); split: -1.40%, +0.17% Scratch Memory Size: 2396160 -> 2381824 (-0.60%); split: -0.73%, +0.13% Max live registers: 31782879 -> 31788857 (+0.02%); split: -0.01%, +0.03% Max dispatch width: 5569200 -> 5569344 (+0.00%); split: +0.00%, -0.00% Totals from 10089 (1.60% of 632405) affected shaders: Instrs: 6389866 -> 6346377 (-0.68%); split: -0.87%, +0.19% Subgroup size: 102912 -> 102984 (+0.07%) Cycle count: 681310278 -> 680308204 (-0.15%); split: -0.65%, +0.51% Spill count: 19571 -> 18923 (-3.31%); split: -3.61%, +0.30% Fill count: 38229 -> 36931 (-3.40%); split: -3.88%, +0.48% Scratch Memory Size: 808960 -> 794624 (-1.77%); split: -2.15%, +0.38% Max live registers: 677473 -> 683451 (+0.88%); split: -0.45%, +1.33% Max 
dispatch width: 88672 -> 88816 (+0.16%); split: +0.27%, -0.11% Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	b2d7a823be	brw/lower: Don't emit spurious moves to or from NULL register Previously an instruction like cmp.l.f0.0(16) null:F, v359:F, 0f would get lowered to undef(16) v13703:UD cmp.l.f0.0(16) v13703:F, v359:F, 0f mov(16) null:UD, v13703:UD After copy propagation and dead-code elimination are run again, the original CMP gets turned back into its original form! Some cases can also emit MOVs from the original NULL register. It should be possible to not do any lowering here, but there are some interactions with source lowering passes for things like cmp.l.f0.0(16) null:HF, g89.1<16,16,1>:HF, 0hf What inspired this was... diff'ing step-by-step dumps from INTEL_DEBUG=optimizer had a lot of useless changes due to these MOVs and undefs. It was very annoying. This low-effort change gets the majority of the possible benefit. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	9aba731d03	brw/cse: Don't eliminate instructions that write flags With other changes in my tree, I observed this code from dEQP-VK.subgroups.vote.compute.subgroupallequal_float have the second cmp.z removed. undef(8) %69:UD cmp.z.f0.0(8) %69:F, %37:F, %57+0.0<0>:F mov(1) v58+0.0:D, 0d NoMask group0 (+f0.0) mov(1) v58+0.0:D, -1d NoMask group0 cmp.nz.f0.0(8) null:D, v58+0.0<0>:D, 0d ... undef(8) %72:UD cmp.z.f0.0(8) %72:F, %37:F, %57+0.0<0>:F mov(1) v63+0.0:D, 0d NoMask group0 (+f0.0) mov(1) v63+0.0:D, -1d NoMask group0 This was also fixed by running dead-code elimination before CSE. That seems more like avoiding the problem than fixing it, though. I believe this affects shader-db results because leaving the second CMP in the shader can give more opportunities for cmod propagation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `234c45c929` ("intel/brw: Write a new global CSE pass that works on defs") shader-db: All Intel platforms had similar results. (Lunar Lake shown) total cycles in shared programs: 922097690 -> 922260862 (0.02%) cycles in affected programs: 3178926 -> 3342098 (5.13%) helped: 130 HURT: 88 helped stats (abs) min: 2 max: 2194 x̄: 296.71 x̃: 16 helped stats (rel) min: <.01% max: 16.56% x̄: 1.86% x̃: 0.18% HURT stats (abs) min: 4 max: 11992 x̄: 2292.55 x̃: 47 HURT stats (rel) min: 0.04% max: 57.32% x̄: 11.82% x̃: 0.61% 95% mean confidence interval for cycles value: 320.36 1176.63 95% mean confidence interval for cycles %-change: 1.59% 5.73% Cycles are HURT. LOST: 2 GAINED: 1 fossil-db: Lunar Lake, Meteor Lake, Tiger Lake had similar results. 
(Lunar Lake shown) Totals: Instrs: 142022960 -> 142022928 (-0.00%); split: -0.00%, +0.00% Cycle count: 21995242782 -> 21995384040 (+0.00%); split: -0.00%, +0.00% Max live registers: 48013385 -> 48013343 (-0.00%) Totals from 507 (0.09% of 551441) affected shaders: Instrs: 886191 -> 886159 (-0.00%); split: -0.01%, +0.01% Cycle count: 69302492 -> 69443750 (+0.20%); split: -0.66%, +0.86% Max live registers: 94413 -> 94371 (-0.04%) DG2 Totals: Instrs: 152856370 -> 152856093 (-0.00%); split: -0.00%, +0.00% Cycle count: 17237159885 -> 17236804052 (-0.00%); split: -0.00%, +0.00% Fill count: 150673 -> 150631 (-0.03%) Max live registers: 31871520 -> 31871476 (-0.00%) Totals from 506 (0.08% of 633197) affected shaders: Instrs: 831795 -> 831518 (-0.03%); split: -0.04%, +0.01% Cycle count: 55578509 -> 55222676 (-0.64%); split: -1.38%, +0.74% Fill count: 2779 -> 2737 (-1.51%) Max live registers: 51383 -> 51339 (-0.09%) Ice Lake and Skylake had similar results. (Ice Lake shown) Totals: Instrs: 152017826 -> 152017793 (-0.00%); split: -0.00%, +0.00% Cycle count: 15180773451 -> 15180761166 (-0.00%); split: -0.00%, +0.00% Fill count: 106610 -> 106614 (+0.00%) Max live registers: 32195006 -> 32194966 (-0.00%) Totals from 411 (0.06% of 637268) affected shaders: Instrs: 705935 -> 705902 (-0.00%); split: -0.01%, +0.01% Cycle count: 47830019 -> 47817734 (-0.03%); split: -0.05%, +0.02% Fill count: 2865 -> 2869 (+0.14%) Max live registers: 42883 -> 42843 (-0.09%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	80a5d158ae	brw/copy: Don't copy propagate through smaller entry dest size Copy propagation would incorrectly occur in this code mov(16) v4+2.0:UW, u0<0>:UW NoMask ... mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0 to create mov(16) v4+2.0:UW, u0<0>:UW NoMask ... mov(8) v6+2.0:UD, u0<0>:UD NoMask group0 This has different behavior. I think I just made a mistake when I changed this condition in `e3f502e007`. It seems like this condition could be relaxed to cover cases like (note the change of destination stride) mov(16) v4+2.0<2>:UW, u0<0>:UW NoMask ... mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0 I'm not sure it's worth it. No shader-db or fossil-db changes on any Intel platform. Even the code for the test case mentioned in the original commit did not change. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `e3f502e007` ("intel/fs: Allow copy propagation between MOVs of mixed sizes") Closes: #12116 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Samuel Pitoiset	ced2404cb4	vulkan/runtime: return same cmdbuf level from the command pool freelist This fixes a performance issue on RADV because secondaries are allocated in GTT instead of VRAM for primaries. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12119 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32010>	2024-11-08 17:20:43 +00:00
Ian Romanick	c1c09e3c4a	brw/emit: Add correct 3-source instruction assertions for each platform Specifically, allow two immediate sources for BFE on Gfx12+. I stumbled on this while trying some stuff with !31852. v2: Don't be lazy. Add proper assertions for all the things on all the platforms. Based on a suggestion by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `7bed11fbde` ("intel/brw: Allow immediates in the BFE instruction on Gfx12+") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31858>	2024-11-08 16:48:57 +00:00
Gurchetan Singh	aebc6c974f	gfxstream: use vulkan_lite_runtime This results in faster compiles. Reviewed-by: Marcin Radomski <dextero@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32015>	2024-11-08 08:10:09 -08:00
Gurchetan Singh	dd5244e6ac	gfxstream: nuke android::base::SubAllocator Use u_mm, one less dependency on libaemu v0.1.2 Reviewed-by: Marcin Radomski <dextero@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32015>	2024-11-08 08:10:05 -08:00
Gurchetan Singh	6a9eb986c2	gfxstream: move isHostVisible function It's separable from the rest of CoherentMemory class. Reviewed-by: Marcin Radomski <dextero@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32015>	2024-11-08 08:10:00 -08:00
Gurchetan Singh	5d299a0bd4	util: add c++ guards to u_mm.h Needed for gfxstream. Reviewed-by: Marcin Radomski <dextero@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32015>	2024-11-08 08:09:49 -08:00
Hans-Kristian Arntzen	5f70858ece	vulkan/wsi/wayland: Use X11-style image count strategy when using FIFO. This is required, otherwise we regress latency in cases where applications are using FIFO without explicit KHR_present_wait. This is an unacceptable regression. The fix is to normalize the behavior to X11 WSI. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no> Fixes: `d052b0201e` ("vulkan/wsi/wayland: Use fifo protocol for FIFO") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32029>	2024-11-08 14:28:08 +00:00
Samuel Pitoiset	437bd63265	radv,aco: dump m0 and exec from the trap handler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>	2024-11-08 14:00:15 +00:00
Samuel Pitoiset	d1d41be43f	aco: declare phys regs for tba_hi/tma_hi Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>	2024-11-08 14:00:15 +00:00
Samuel Pitoiset	13bab450a2	aco: fix storing SQ_WAVE_STATUS in the trap handler shader SQ_WAVE_STATUS can change inside the trap because of SCC. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>	2024-11-08 14:00:14 +00:00
Samuel Pitoiset	494050d2ea	aco: add a helper to dump SGPR to memory for the trap handler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>	2024-11-08 14:00:14 +00:00
Samuel Pitoiset	8c6f2fef1b	aco: use scalar buffer stores for dumping SGPRS from the trap on GFX8 This avoids using any VGPRs on GFX8. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>	2024-11-08 14:00:14 +00:00
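
Note on 80a5d158ae (brw/copy): the commit message says the propagated form "has different behavior" without spelling out why. The sketch below is one way to see it, under simplifying assumptions: registers are modeled as little-endian byte strings, `<0>` is treated as a stride-0 broadcast of the first element, and the `+2.0` register offset is ignored. The helper names are hypothetical and are not Mesa functions.

```python
import struct

def mov16_uw_broadcast(src: bytes) -> bytes:
    """Model of mov(16) dst:UW, src<0>:UW: copy the first 16-bit word to 16 channels."""
    (word,) = struct.unpack_from("<H", src, 0)
    return struct.pack("<16H", *([word] * 16))

def mov8_ud_packed(src: bytes) -> list:
    """Model of mov(8) dst:UD, src:UD: read 8 consecutive 32-bit dwords."""
    return list(struct.unpack_from("<8I", src, 0))

def mov8_ud_broadcast(src: bytes) -> list:
    """Model of mov(8) dst:UD, src<0>:UD: copy the first 32-bit dword to 8 channels."""
    (dword,) = struct.unpack_from("<I", src, 0)
    return [dword] * 8

# Hypothetical uniform value whose high and low 16-bit halves differ.
u0 = struct.pack("<I", 0x12345678)

# Original sequence: the UD read sees pairs of the duplicated low word.
v4 = mov16_uw_broadcast(u0)
original = mov8_ud_packed(v4)        # each channel is 0x56785678

# After the incorrect propagation: the UD read sees the full dword of u0.
propagated = mov8_ud_broadcast(u0)   # each channel is 0x12345678

assert original != propagated
```

In this model the two sequences agree only when the high and low halves of `u0` happen to be equal, which fits the commit's decision to reject propagation when the entry's destination type is smaller than the type being read.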

197467 Commits