third_party_mesa3d

Author	SHA1	Message	Date
Faith Ekstrand	80a1836d8b	nir: Get rid of nir_dest_bit_size() We could add a nir_def_bit_size() helper but we use ->bit_size about 3x as often as nir_dest_bit_size() today so that's a major Coccinelle refactor anyway and this doesn't make it much worse. Most of this commit was generated by the following semantic patch: @@ expression D; @@ <... -nir_dest_bit_size(D) +D.ssa.bit_size ... Some manual fixup was needed, especially in cpp files where Coccinelle tends to give up the moment it sees any interesting C++. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>	2023-08-14 21:22:53 +00:00
Alyssa Rosenzweig	d9786a48aa	agx: Remove agx_nir_ssa_index Deduplicated from agx_def_index. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>	2023-08-14 21:22:52 +00:00
Alyssa Rosenzweig	6f66f3583e	agx: Stop passing nir_dest around Towards deleting nir_dest. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>	2023-08-14 21:22:52 +00:00
Alyssa Rosenzweig	09d31922de	nir: Drop "SSA" from NIR language Everything is SSA now. sed -e 's/nir_ssa_def/nir_def/g' \ -e 's/nir_ssa_undef/nir_undef/g' \ -e 's/nir_ssa_scalar/nir_scalar/g' \ -e 's/nir_src_rewrite_ssa/nir_src_rewrite/g' \ -e 's/nir_gather_ssa_types/nir_gather_types/g' \ -i $(git grep -l nir | grep -v relnotes) git mv src/compiler/nir/nir_gather_ssa_types.c \ src/compiler/nir/nir_gather_types.c ninja -C build/ clang-format cd src/compiler/nir && find *.c *.h -type f -exec clang-format -i \{} \; Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24585>	2023-08-12 16:44:41 -04:00
Alyssa Rosenzweig	7ac6176ea5	agx: Do not allow creating vec8 mem_access_bit_size needs to split up 64x4 into 2 loads. Fixes: dEQP-VK.spirv_assembly.instruction.compute.64bit_compare.int64.comp_opiequal_vector Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24635>	2023-08-11 20:31:28 +00:00
Alyssa Rosenzweig	fd481d00d3	agx: Handle <32-bit local memory access I don't know if this is possible to hit with GL, but it is with Vulkan. Fixes: dEQP-VK.spirv_assembly.instruction.compute.workgroup_memory.* Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24635>	2023-08-11 20:31:28 +00:00
Alyssa Rosenzweig	aeffd22c30	agx: Handle f2f16_rtne like f2f16 TBD whether we can control round modes later on. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24635>	2023-08-11 20:31:28 +00:00
Alyssa Rosenzweig	73657cd011	agx: Handle conversions to 8-bit These can't be lowered by nir_lower_bit_sizes but it doesn't actually matter. Fixes SPIR-V conversions tests. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24635>	2023-08-11 20:31:28 +00:00
Mary	0f4e3a03fd	agx: Move nir_lower_fragcolor out of agx_preprocess_nir Do not apply "nir_lower_fragcolor" in the common code. This fixes a crash on the agxv side when a frag shader has SSBO writes. This is caused by "nir_lower_fragcolor" assuming that every "store_deref" will have a variable backing the output. Signed-off-by: Mary <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24635>	2023-08-11 20:31:28 +00:00
Alyssa Rosenzweig	a30c668e44	agx: Require an immediate for `nest` There's no good reason to allow non-immediate nesting values, and this lets us use the (smaller) mov_imm instruction without special casing. This matches what Metal produces, so it seems like a good preference. total bytes in shared programs: 11720338 -> 11717310 (-0.03%) bytes in affected programs: 2341580 -> 2338552 (-0.13%) helped: 1385 HURT: 0 Bytes are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24635>	2023-08-11 20:31:27 +00:00
Alyssa Rosenzweig	e83b708676	agx: Optimize out pointless else instructions Now that they're in the right blocks, this is easy. Includes an informal proof and the implementation itself is built around a finite state machine, which together meant this code worked on its first try :~) And hey, it's a pointless little instruction saving optimization I've wanted to do for a while~ Major note is that this HAS to be done after register allocation, since it doesn't update the control flow graph and would introduce critical edges if it tried to actually delete the else block. The intuitive reason for this is simple: sometimes RA needs to insert instructions into the else block, even if it was empty in the original NIR, so we always need an else block even if we can delete it with this pass after RA. total instructions in shared programs: 1778390 -> 1776725 (-0.09%) instructions in affected programs: 268459 -> 266794 (-0.62%) helped: 1013 HURT: 0 Instructions are helped. total bytes in shared programs: 12185102 -> 12175112 (-0.08%) bytes in affected programs: 1927524 -> 1917534 (-0.52%) helped: 1013 HURT: 0 Bytes are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24635>	2023-08-11 20:31:27 +00:00
Alyssa Rosenzweig	782055106f	agx: Use unconditional else instruction Rather than duplicating the condition. This matches the blob, so is presumably the most energy-efficient way of expressing the logic. No shader-db changes. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24635>	2023-08-11 20:31:27 +00:00
Alyssa Rosenzweig	41b7891673	agx: Put else instructions in the right block According to Dougall's pseudocode, else_icmp operates as: if r0l == 0: r0l = n elif r0l == 1: if cc.compare(A[thread], B[thread]): r0l = 0 else: r0l = 1 exec_mask[thread] = (r0l == 0) Notice that the comparison only happens when r0l == 1, that is, for threads that are about to enter the else block. Threads that just executed the if body are still active (r0l = 0) and skip the comparison. As such, the sources of else_icmp are only read in the else block, and hence the whole instruction should be placed in the else block for correctness with respect to live range splitting. shader-db is a wash, but shows some improvements due to correctly modelling the liveness of the condition variable. total instructions in shared programs: 1778376 -> 1778390 (<.01%) instructions in affected programs: 14753 -> 14767 (0.09%) helped: 35 HURT: 39 Inconclusive result (value mean confidence interval includes 0). total bytes in shared programs: 12185018 -> 12185102 (<.01%) bytes in affected programs: 101522 -> 101606 (0.08%) helped: 35 HURT: 39 Inconclusive result (value mean confidence interval includes 0). total halfregs in shared programs: 531174 -> 531032 (-0.03%) halfregs in affected programs: 2320 -> 2178 (-6.12%) helped: 40 HURT: 1 Halfregs are helped. total threads in shared programs: 18909184 -> 18909440 (<.01%) threads in affected programs: 1792 -> 2048 (14.29%) helped: 2 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24635>	2023-08-11 20:31:27 +00:00
Alyssa Rosenzweig	0d7b8bfce5	agx: Don't lower load_local_invocation_index We have an SR for it, which can save a bit of math. This came up while working on the spiller. total instructions in shared programs: 1778396 -> 1778376 (<.01%) instructions in affected programs: 3036 -> 3016 (-0.66%) helped: 10 HURT: 3 Instructions are helped. total bytes in shared programs: 12185182 -> 12185018 (<.01%) bytes in affected programs: 38640 -> 38476 (-0.42%) helped: 18 HURT: 2 Bytes are helped. total halfregs in shared programs: 531218 -> 531174 (<.01%) halfregs in affected programs: 471 -> 427 (-9.34%) helped: 6 HURT: 0 Halfregs are helped. total threads in shared programs: 18909056 -> 18909184 (<.01%) threads in affected programs: 1280 -> 1408 (10.00%) helped: 2 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24635>	2023-08-11 20:31:27 +00:00
Alyssa Rosenzweig	3e5d2f0c1b	asahi,agx: Respect no16 even for I/O Don't call lower_mediump_io for no16. This is helpful for debugging and soon driconf-shaming apps with broken precision qualifiers. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24635>	2023-08-11 20:31:27 +00:00
Alyssa Rosenzweig	5f3d784c6c	agx: Handle 8-bit vecs These should "just" work, promoting the 8-bit channels to 16-bit registers internally, allowing us to use our 8-bit stores with 8-bit data vectors packed in 16-bit registers. All other non-conversion ALU gets lowered by the previous patch, this is just needed for simple things like nir_op_vec of lowered math passed to a vectorized store. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24635>	2023-08-11 20:31:27 +00:00
Alyssa Rosenzweig	c3b86bcbbc	agx: Lower 8-bit ALU No hardware support for it. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24635>	2023-08-11 20:31:27 +00:00
Alyssa Rosenzweig	144546f434	agx: Lower flat shading in NIR We get this as part of the lowering we added for interpolateAtOffset. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24498>	2023-08-11 09:50:12 +00:00
Alyssa Rosenzweig	48029548f3	agx: Forcibly vectorize pointcoord coeffs This avoids regressions from scalarizing pointcoord loads. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24498>	2023-08-11 09:50:11 +00:00
Alyssa Rosenzweig	22f694c008	agx: Implement nir_intrinsic_load_coefficients_agx Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24498>	2023-08-11 09:50:11 +00:00
Mike Blumenkrantz	e9a5da2f4b	nir: add a filter cb to lower_io_to_scalar this is useful for drivers that want to do selective scalarization of io Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24565>	2023-08-11 09:02:53 +00:00
Alyssa Rosenzweig	a8013644a1	nir: Drop nir_alu_src::{negate,abs} Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig	ab0d878932	treewide: Remove more is_ssa asserts Stuff Coccinelle missed. sed -i -e '/assert(.*\.is_ssa)/d' $(git grep -l is_ssa) sed -i -e '/ASSERT.*\.is_ssa)/d' $(git grep -l is_ssa) + a manual fixup to restore the assert for parallel copy lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig	51db19f7a2	nir: Rename scoped_barrier -> barrier sed + ninja clang-format + fix up spacing for common code. If you are unhappy that I did not manually change the whitespace of your driver, you need to enable clang-format for it so the formatting would happen automatically. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24428>	2023-08-01 23:18:29 +00:00
Alyssa Rosenzweig	5f167c9f72	asahi: Lower multisample image stores These will be used for spilling multisampled render targets. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>	2023-07-20 15:33:28 +00:00
Alyssa Rosenzweig	10fc9e3d59	agx: Plumb in coverage mask This is internally used by the hardware when writing to the tilebuffer. We need to use it externally to spill multisample render targets. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>	2023-07-20 15:33:28 +00:00
Alyssa Rosenzweig	56bb3dcc21	agx: Require tag writes with side effects Otherwise the fragment shader might be skipped entirely. (Possibly this is the wrong approach to this though...) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>	2023-07-20 15:33:28 +00:00
Alyssa Rosenzweig	7ed2596fe7	agx: Implement fence_*_to_tex_agx intrinsics We need these fencing intrinsics because our image caches aren't coherent with memory. Furthermore, we need some sync intrinsics for imageblocks (which are spicy images). These are a stub of what the final fragment shader interlock implementation will look like, or what a real Metal-grade imageblock implementation needs, but this is good enough for handling the sync requirements with spilled render targets. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>	2023-07-20 15:33:28 +00:00
Alyssa Rosenzweig	c1afe26be6	agx: Don't emit silly barriers Trust in the scoped_barrier. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>	2023-07-20 15:33:28 +00:00
Alyssa Rosenzweig	b618ba9330	agx: Emit global memory barriers for images This is part of image atomics, since those go through the regular memory path. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>	2023-07-20 15:33:28 +00:00
Alyssa Rosenzweig	93f26abe49	agx: Implement image_load Texture loads can be reordered freely but image loads can't be (since there could be writes). Implement image_load natively to avoid subtle problems with CSE and scheduling. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>	2023-07-20 15:33:28 +00:00
Alyssa Rosenzweig	e5f37ac5cb	agx: Extract texture write mask handling image_load will share the logic. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>	2023-07-20 15:33:28 +00:00
Alyssa Rosenzweig	02b1ddeca6	asahi,agx: Fix txf sampler Bizarrely, the clamps/wrap modes are respected so we need to set them appropriately for correct out-of-bounds behaviour (returning all zero). That in turn means we can't use whatever sampler is already there, instead we need to allocate a dedicated sampler just for txf. Good news is we have an extra sampler state register available for the purpose. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>	2023-07-20 15:33:28 +00:00
Alyssa Rosenzweig	e2cfd2a228	agx: Add interleave opcode We'll use it for texture atomics. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>	2023-07-20 15:33:28 +00:00
Alyssa Rosenzweig	76641762ce	agx: Implement image barriers Or cache flushes or whatever these actually are. Probably could be optimized once we understand what the 4 individual instructions are actually doing. Fixes dEQP-GLES31.functional.image_load_store.2d.qualifiers.*. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>	2023-07-20 15:33:28 +00:00
Alyssa Rosenzweig	4ef89e71ba	agx: Translate image_store from NIR Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>	2023-07-20 15:33:28 +00:00
Alyssa Rosenzweig	13bb1209e2	agx: Translate texture bindless handles Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>	2023-07-20 15:33:28 +00:00
Alyssa Rosenzweig	f4aa6fd22e	agx: Model texture bindless base Extra source we need to implement bindless. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>	2023-07-20 15:33:28 +00:00
Alyssa Rosenzweig	80e103d718	agx: Reduce un/packs with mem access lowering Often not needed and makes the NIR harder to read. shader-db is noise. total instructions in shared programs: 1752712 -> 1752688 (<.01%) instructions in affected programs: 8338 -> 8314 (-0.29%) helped: 21 HURT: 8 Inconclusive result (%-change mean confidence interval includes 0). total bytes in shared programs: 11943572 -> 11943434 (<.01%) bytes in affected programs: 56716 -> 56578 (-0.24%) helped: 21 HURT: 8 Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>	2023-07-20 15:33:28 +00:00
Alyssa Rosenzweig	8db9eeaeec	asahi: Upload image descriptors Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24258>	2023-07-20 15:33:28 +00:00
Asahi Lina	1140bdb783	asahi: Arrange VS varyings in the correct order The GPU ABI requires varyings to be grouped as follows: - Position - Smooth shaded fp32 - Flat shaded fp32 - Linear shaded fp32 - Smooth shaded fp16 - Flat shaded fp16 - Linear shaded fp16 - Point size Use the flat shaded mask info we now have in the vertex shader key to sort things properly, and pass the counts to the hardware. FP16 is still TODO. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23998>	2023-07-05 05:11:49 +00:00
Asahi Lina	4a65b4bb14	asahi: Fix type confusion for fragment shader keys We can't attempt to access the fs union member if this is not a FS. That worked so far since there wasn't a VS shader key at all, but we're about to introduce one. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23998>	2023-07-05 05:11:49 +00:00
Asahi Lina	90834353a1	asahi: Gather flat/linear shaded input info from uncompiled FS We need to propagate shading model metadata from the FS to the VS in order to correctly lay out the uniforms in the right order. This means we need VS variants depending on this data. We could use the existing shader info structure, but that applies to compiled shaders which would introduce a dependency from the VS compile to the FS compile. This information does not change with FS variants, so we can introduce an agx_uncompiled_shader_info structure and gather it early at precompilation time. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23998>	2023-07-05 05:11:49 +00:00
Alyssa Rosenzweig	d9bf52e00f	agx: Assert that barriers are not used in the preamble It is nonsensical and confuses the hardware. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23998>	2023-07-05 05:11:49 +00:00
Alyssa Rosenzweig	9bf7d14b2c	agx: Use nir_opt_shrink_vectors Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23998>	2023-07-05 05:11:49 +00:00
Alyssa Rosenzweig	c81a14c754	agx: Use nir_opt_shrink_stores This especially helps with image stores, where we otherwise insert a bunch of pointless moves to collect a vector even when we know the format only has a single channel. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23998>	2023-07-05 05:11:49 +00:00
Alyssa Rosenzweig	5a4c9136cd	agx: Add algebraic opt to help with discard lowering When lowering discards, it will be convenient to generate the pattern: (cond ? 255 : 0) ^ 255 Add rules to optimize that to (cond ? 0 : 255) This is not part of the main algebraic optimizer since this lowering happens late. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23998>	2023-07-05 05:11:49 +00:00
Yonggang Luo	62ce223245	treewide: Switch to use nir_foreach_function_with_impl when possible Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23903>	2023-06-29 08:36:03 +00:00
Alyssa Rosenzweig	05adeb850b	agx: Use nir_lower_frag_coord_to_pixel_coord Instead of open-coding the logic. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23836>	2023-06-27 14:38:21 +00:00
Alyssa Rosenzweig	766535c867	agx: Implement vector live range splitting The SSA killer feature is that, under an "optimal" allocator, the number of registers used (register demand) is equal to the number of registers required (register pressure, the maximum number of variables simultaneously live at any point in the program). I put "optimal" in scare quotes, because we don't need to use the exact minimum number of registers as long as we don't sacrifice thread count or introduce spilling, and using a few extra registers when possible can help coalesce moves. Details-shmetails. The problem is that, prior to this commit, our register allocator was not well-behaved in certain circumstances, and would require an arbitrarily large number of registers. In particular, since different variables have different sizes and require contiguous allocation, in large programs the register file may become fragmented, causing the RA to use arbitrarily many registers despite having lots of registers free. The solution is vector live range splitting. First, we calculate the register pressure (the minimum number of registers that it is theoretically possible to allocate successfully), and round up to the maximum number of registers we will actually use (to give some wiggle room to coalesce moves). Then, we will treat this maximum as a bound, requiring that we don't use more registers than chosen. In the event that register file fragmentation prevents us from finding a contiguous sequence of registers to allocate a variable, rather than giving up or using registers we don't have, we shuffle the register file around (defragmenting it) to make room for the new variable. That lets us use a few moves to avoid sacrificing thread count or introducing spilling, which is usually a great choice. Android GLES3.1 shader-db results are as expected: some noise / small regressions for instruction count, but a bunch of shaders with improved thread count. 
The massive increase in register demand may seem weird, but this is the RA doing exactly what it's supposed to: using more registers if and only if they would not hurt thread count. Notice that no programs whatsoever are hurt for thread count, which is the salient part. total instructions in shared programs: 1781473 -> 1781574 (<.01%) instructions in affected programs: 276268 -> 276369 (0.04%) helped: 1074 HURT: 463 Inconclusive result (value mean confidence interval includes 0). total bytes in shared programs: 12196640 -> 12201670 (0.04%) bytes in affected programs: 1987322 -> 1992352 (0.25%) helped: 1060 HURT: 513 Bytes are HURT. total halfregs in shared programs: 488755 -> 529651 (8.37%) halfregs in affected programs: 295651 -> 336547 (13.83%) helped: 358 HURT: 9737 Halfregs are HURT. total threads in shared programs: 18875008 -> 18885440 (0.06%) threads in affected programs: 64576 -> 75008 (16.15%) helped: 82 HURT: 0 Threads are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>	2023-06-23 17:37:41 +00:00
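
The else_icmp pseudocode quoted in commit 41b7891673 is flattened onto one line in the log above. A minimal per-thread sketch of it follows, written in Python purely for readability. The function name `else_icmp`, its signature, and the `compare_result` parameter (standing in for `cc.compare(A[thread], B[thread])`) are assumptions made for illustration and are not part of Mesa or the hardware documentation.

```python
def else_icmp(r0l: int, n: int, compare_result: bool) -> tuple[int, bool]:
    """Model one thread of the else_icmp pseudocode from commit 41b7891673.

    r0l            -- the thread's nesting counter (0 means active)
    n              -- the nesting value the instruction carries
    compare_result -- hypothetical stand-in for cc.compare(A[thread], B[thread])

    Returns the updated r0l and whether the thread is in the exec mask.
    """
    if r0l == 0:
        # The thread just ran the if body: deactivate it for the else body.
        r0l = n
    elif r0l == 1:
        # The thread skipped the if body: only now is the comparison read.
        r0l = 0 if compare_result else 1
    # Threads nested deeper (r0l > 1) are left untouched.
    return r0l, r0l == 0


if __name__ == "__main__":
    # Walk every combination of a few counter values and comparison results.
    for r0l in (0, 1, 2):
        for cmp in (True, False):
            print(r0l, cmp, else_icmp(r0l, n=1, compare_result=cmp))
```

Under this reading, the comparison sources are only consulted on the `r0l == 1` path, which is the point the commit message makes about placing the instruction in the else block.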

406 Commits