third_party_mesa3d

Author	SHA1	Message	Date
Faith Ekstrand	ed9affa02f	nir: Drop most instances of nir_ssa_dest_init() Generated using the following two semantic patches: @@ expression I, J, NC, BS; @@ -nir_ssa_dest_init(I, &J->dest, NC, BS); +nir_def_init(I, &J->dest.ssa, NC, BS); @@ expression I, J, NC, BS; @@ -nir_ssa_dest_init(I, &J->dest.dest, NC, BS); +nir_def_init(I, &J->dest.dest.ssa, NC, BS); Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24658>	2023-08-13 17:12:52 +00:00
Alyssa Rosenzweig	09d31922de	nir: Drop "SSA" from NIR language Everything is SSA now. sed -e 's/nir_ssa_def/nir_def/g' \ -e 's/nir_ssa_undef/nir_undef/g' \ -e 's/nir_ssa_scalar/nir_scalar/g' \ -e 's/nir_src_rewrite_ssa/nir_src_rewrite/g' \ -e 's/nir_gather_ssa_types/nir_gather_types/g' \ -i $(git grep -l nir \| grep -v relnotes) git mv src/compiler/nir/nir_gather_ssa_types.c \ src/compiler/nir/nir_gather_types.c ninja -C build/ clang-format cd src/compiler/nir && find .c .h -type f -exec clang-format -i \{} \; Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24585>	2023-08-12 16:44:41 -04:00
Lionel Landwerlin	6f694432e4	intel/fs: add variable for output of debug backend optimizer It can be useful to compare 2 runs with different compiler changes. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24552>	2023-08-10 06:39:57 +00:00
Lionel Landwerlin	0e244d56e3	intel/fs: track more steps with INTEL_DEBUG=optimizer One particular nice thing to have is the first generated backend IR before validation. Especially if you made a mistake in the NIR translation, you can at least look at it before validation tells you off. Then the last 2 steps of the optimize() function can be interesting to look at. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24552>	2023-08-10 06:39:57 +00:00
Lionel Landwerlin	9934613c74	anv/hasvk: track robustness per pipeline stage And split them into UBO and SSBO v2 (Lionel): - Get rid of robustness fields in anv_shader_bin v3 (Lionel): - Do not pass unused parameters around Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17545>	2023-08-09 09:00:12 +03:00
Kenneth Graunke	95db3e87fe	intel/compiler: Fix sparse cube map array coordinate lowering Brown paper bag fix for my untested review feedback comments. Cube array images use a coordinate of the form <X, Y, 6Slice+Face>, while cube array textures use a <X, Y, Slice, Face> style coordinate. This code tried to convert one to the other, but instead of writing Z / 6 and Z % 6, we tried to reuse our original division result. What we wanted was Z - (Z/6) 6, but instead we botched it and wrote Z-Z*6 which produced...totally invalid cube faces. Fixes: `fe81d40bff` ("intel/nir: add lower for sparse images & textures") Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24481>	2023-08-04 00:09:05 -07:00
Alyssa Rosenzweig	42ee8a55dd	nir: Remove nir_alu_dest::write_mask Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:30 +00:00
Alyssa Rosenzweig	4828881e28	intel/vec4: Don't use legacy write mask Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:30 +00:00
Alyssa Rosenzweig	11fc4f969c	intel: Collapse is_ssa checks Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:29 +00:00
Alyssa Rosenzweig	579bc1e72e	treewide: Drop some is_ssa if's Via Coccinelle patch: @@ expression x; @@ -if (!x.is_ssa) { -... -} and likewise with x->is_ssa, with invalid hunks manually filtered out. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:29 +00:00
Alyssa Rosenzweig	95e3df39c0	treewide: sed out more is_ssa Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig	a8013644a1	nir: Drop nir_alu_src::{negate,abs} Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig	ab0d878932	treewide: Remove more is_ssa asserts Stuff Coccinelle missed. sed -i -e '/assert(.\.is_ssa)/d' $(git grep -l is_ssa) sed -i -e '/ASSERT.\.is_ssa)/d' $(git grep -l is_ssa) + a manual fixup to restore the assert for parallel copy lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig	5fead24365	treewide: Drop is_ssa asserts We only see SSA now. Via Coccinelle patch: @@ expression x; @@ -assert(x.is_ssa); Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig	d559764e7c	nir: Remove nir_alu_dest::saturate Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:28 +00:00
Yonggang Luo	86bcc90c0e	intel/compiler,intel/blorp,intel/vulkan: decouple vulkan driver and compiler from gallium Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24438>	2023-08-03 22:00:15 +00:00
Yonggang Luo	9eb8a0b16a	intel/brw: Define and use BRW_SWIZZLE_* instead of SWIZZLE_* This is for avoid #include "program/prog_instrunction.h" in intel/brw code Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24438>	2023-08-03 22:00:15 +00:00
Alyssa Rosenzweig	17d66055ae	nir: Remove reg_intrinsics parameter to convert_from_ssa All users must set it. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24450>	2023-08-02 10:26:45 -04:00
Alyssa Rosenzweig	51db19f7a2	nir: Rename scoped_barrier -> barrier sed + ninja clang-format + fix up spacing for common code. If you are unhappy that I did not manually change the whitespace of your driver, you need to enable clang-format for it so the formatting would happen automatically. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24428>	2023-08-01 23:18:29 +00:00
Jordan Justen	5df97c27dc	intel/compiler: Use nir SUBGROUP_INVOCATION for RT TOPOLOGY_ID Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21774>	2023-07-28 22:54:59 +00:00
Jason Ekstrand	739e21fa9a	intel/fs: Add a parameter to speed up register spilling Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24299>	2023-07-28 14:51:42 +00:00
Mike Blumenkrantz	e68e612826	nir: add a helper for calculating variable slots this will maybe avoid future bugs, but probably not Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24163>	2023-07-28 13:14:35 +00:00
Mike Blumenkrantz	59396eefe6	nir: fix slot calculations for compact variables with location_frac a variable with a component offset may span multiple slots, and this cannot be inferred from its type alone (e.g., compacted clip+cull distances) cc: mesa-stable Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24163>	2023-07-28 13:14:35 +00:00
Lionel Landwerlin	fe81d40bff	intel/nir: add lower for sparse images & textures We have to lower images into image load + sampler residency. There is also a restriction on sampler access with a compare, lower those as 2 sampler instructions to meet the restriction. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23882>	2023-07-27 02:02:59 +03:00
Lionel Landwerlin	300cc829de	intel/nir: handle image_sparse_load in storage format lowering The last component of sparse load is the residency data. We don't want to touch/convert that value with the format lowering. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23882>	2023-07-27 02:02:34 +03:00
Lionel Landwerlin	d33aff783d	intel/fs: add support for sparse accesses Purely from the backend point of view it's just an additional parameter to sampler messages. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23882>	2023-07-27 02:02:30 +03:00
Lionel Landwerlin	d099e47de0	intel/fs: add more UNDEFs around SEND messages lower_find_live_channel() in particular is used a lot in control flow to find the live channel for the surface/sampler handle. Adding UNDEFs on the temporary registers used for finding the live channels helps reduce the liveness of those temporary registers, especially in loops. Some titles affected : Rise Of The Tomb Raider: Totals from 2780 (22.58% of 12311) affected shaders: Instrs: 1294455 -> 1294592 (+0.01%); split: -0.15%, +0.16% Cycles: 1473136441 -> 1471302617 (-0.12%); split: -1.52%, +1.40% Max live registers: 144282 -> 143595 (-0.48%) Max dispatch width: 22200 -> 22232 (+0.14%) Red Dead Redemption 2: Totals from 435 (7.28% of 5972) affected shaders: Instrs: 488472 -> 487594 (-0.18%); split: -0.31%, +0.14% Cycles: 11354732 -> 11384928 (+0.27%); split: -0.44%, +0.71% Spill count: 1217 -> 1172 (-3.70%) Fill count: 3521 -> 3447 (-2.10%) Scratch Memory Size: 64512 -> 62464 (-3.17%) Max live registers: 35997 -> 35798 (-0.55%) Fallout 4: Totals from 8 (0.49% of 1638) affected shaders: Instrs: 41908 -> 40509 (-3.34%) Cycles: 3638464 -> 3555680 (-2.28%); split: -2.67%, +0.39% Spill count: 717 -> 665 (-7.25%) Fill count: 2542 -> 2438 (-4.09%) Scratch Memory Size: 32768 -> 16384 (-50.00%) Max live registers: 567 -> 534 (-5.82%) Cyberpunk 2077: Totals from 2984 (28.97% of 10301) affected shaders: Instrs: 3888874 -> 3891600 (+0.07%); split: -0.20%, +0.27% Cycles: 67906489 -> 67767721 (-0.20%); split: -0.68%, +0.47% Spill count: 200 -> 98 (-51.00%) Fill count: 237 -> 90 (-62.03%) Scratch Memory Size: 10240 -> 8192 (-20.00%) Max live registers: 215715 -> 212727 (-1.39%) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24282>	2023-07-26 08:48:33 +00:00
Lionel Landwerlin	5c72724819	intel/fs: consider UNDEF as non-partial write A few titles show max live register reductions, but nothing significant in instruction count or other stats. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24282>	2023-07-26 08:48:32 +00:00
Lionel Landwerlin	d62e494b37	intel/vec4: fix log_data pointer Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `3384f029be` ("intel/compiler: rework input parameters") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9421 Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24307>	2023-07-26 06:36:18 +00:00
Iván Briano	377c2a045f	intel/compiler: call brw_nir_adjust_payload from brw_postprocess_nir Calling anything after nir_trivialize_registers() risks undoing some of its work. In this case, brw_nir_adjust_payload() will do a constant folding pass if any payload adjusting happened, and that can turn a bunch of @store_regs into basically noops. Fixes dEQP-VK.subgroups.*task Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24325>	2023-07-25 22:48:09 +00:00
Ian Romanick	cb0de0a1d3	intel/fs: Constant fold OR and AND The path taken in fs_visitor::swizzle_nir_scratch_addr for DG2 generates some AND and OR instructions before the SHL. This commit folds those so the whold calculation becomes a constant (like on older platforms). v2: Fix return type of src_as_uint. Noticed by Marcin. shader-db results: DG2 total instructions in shared programs: 23190475 -> 23179540 (-0.05%) instructions in affected programs: 36026 -> 25091 (-30.35%) helped: 7 / HURT: 0 total cycles in shared programs: 841196807 -> 841142563 (<.01%) cycles in affected programs: 1660670 -> 1606426 (-3.27%) helped: 7 / HURT: 0 No shader-db changes on any older Intel platforms. fossil-db results: DG2 Totals: Instrs: 197780372 -> 197773966 (-0.00%) Cycles: 14066410782 -> 14066399378 (-0.00%); split: -0.00%, +0.00% Subgroup size: 8438104 -> 8438112 (+0.00%) Send messages: 8049445 -> 8049446 (+0.00%) Scratch Memory Size: 14263296 -> 14264320 (+0.01%) Totals from 9 (0.00% of 668055) affected shaders: Instrs: 24547 -> 18141 (-26.10%) Cycles: 1984791 -> 1973387 (-0.57%); split: -0.98%, +0.40% Subgroup size: 88 -> 96 (+9.09%) Send messages: 867 -> 868 (+0.12%) Scratch Memory Size: 69632 -> 70656 (+1.47%) No fossil-db changes on any older Intel platforms. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23884>	2023-07-25 22:11:21 +00:00
Ian Romanick	61c786bad5	intel/fs: Constant fold SHL This is a modified version of a commit originally in !7698. This version add the changes to brw_fs_copy_propagation. If the address passed to fs_visitor::swizzle_nir_scratch_addr is a constant, that function will generate SHL with two constant sources. DG2 uses a different path to generate those addresses, so the constant folding can't occur there yet. That will be addressed in the next commit. What follows is the commit change history from that older MR. v2: Previously this commit was after `intel/fs: Combine constants for integer instructions too`. However, this commit can create invalid instructions that are only cleaned up by `intel/fs: Combine constants for integer instructions too`. That would potentially affect the shader-db results of each commit, but I did not collect new data for the reordering. v3: Fix masking for W/UW and for Q/UQ types. Add an assertion for !saturate. Both suggested by Ken. Also add an assertion that B/UB types don't matically come back. v4: Fix sources count. See also `ed3c2f73db` ("intel/fs: fixup sources number from opt_algebraic"). v5: Fix typo in comment added in v3. Noticed by Marcin. Fix a typo in a comment added when pulling this commit out of !7698. Noticed by Ken. shader-db results: DG2 No changes. Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown) total instructions in shared programs: 20655696 -> 20651648 (-0.02%) instructions in affected programs: 23125 -> 19077 (-17.50%) helped: 7 / HURT: 0 total cycles in shared programs: 858436639 -> 858407749 (<.01%) cycles in affected programs: 8990532 -> 8961642 (-0.32%) helped: 7 / HURT: 0 Broadwell and Haswell had similar results. (Broadwell shown) total instructions in shared programs: 18500780 -> 18496630 (-0.02%) instructions in affected programs: 24715 -> 20565 (-16.79%) helped: 7 / HURT: 0 total cycles in shared programs: 946100660 -> 946087688 (<.01%) cycles in affected programs: 5838252 -> 5825280 (-0.22%) helped: 7 / HURT: 0 total spills in shared programs: 17588 -> 17572 (-0.09%) spills in affected programs: 1206 -> 1190 (-1.33%) helped: 2 / HURT: 0 total fills in shared programs: 25192 -> 25156 (-0.14%) fills in affected programs: 156 -> 120 (-23.08%) helped: 2 / HURT: 0 No shader-db changes on any older Intel platforms. fossil-db results: DG2 Totals: Instrs: 197780415 -> 197780372 (-0.00%); split: -0.00%, +0.00% Cycles: 14066412266 -> 14066410782 (-0.00%); split: -0.00%, +0.00% Totals from 16 (0.00% of 668055) affected shaders: Instrs: 16420 -> 16377 (-0.26%); split: -0.43%, +0.17% Cycles: 220133 -> 218649 (-0.67%); split: -0.69%, +0.01% Tiger Lake, Ice Lake and Skylake had similar results. (Ice Lake shown) Totals: Instrs: 153425977 -> 153423678 (-0.00%) Cycles: 14747928947 -> 14747929547 (+0.00%); split: -0.00%, +0.00% Subgroup size: 8535968 -> 8535976 (+0.00%) Send messages: 7697606 -> 7697607 (+0.00%) Scratch Memory Size: 4380672 -> 4381696 (+0.02%) Totals from 6 (0.00% of 662749) affected shaders: Instrs: 13893 -> 11594 (-16.55%) Cycles: 5386074 -> 5386674 (+0.01%); split: -0.42%, +0.43% Subgroup size: 80 -> 88 (+10.00%) Send messages: 675 -> 676 (+0.15%) Scratch Memory Size: 91136 -> 92160 (+1.12%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23884>	2023-07-25 22:11:21 +00:00
Ian Romanick	56e6186dcf	intel/fs: Always do opt_algebraic after opt_copy_propagation makes progress opt_copy_propagation can create invalid instructions like shl(8) vgrf96:UD, 2d, 8u These instructions will be cleaned up by opt_algebraic. The irony is opt_algebraic converts these to simple mov instructions that opt_copy_propagation should clean up. I don't think we want a loop like do { progress = false; if (OPT(opt_copy_propagation)) { OPT(opt_algebraic); OPT(dead_code_eliminate); } } while (progress); But maybe we do? Maybe this would be sufficient: while (OPT(opt_copy_propagation)) OPT(opt_algebraic); OPT(dead_code_eliminate); No shader-db or fossil-db changes (yet) on any Intel platform. This is expected. v2: Do opt_algebraic immediately after every call to opt_copy_propagation instead of being clever. Suggested by Lionel. Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23884>	2023-07-25 22:11:21 +00:00
Faith Ekstrand	94f36cfaa3	intel/fs: Assume NIR is in SSA form Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24310>	2023-07-25 16:25:11 +00:00
Faith Ekstrand	965bbe5286	intel/fs: Rework the overlapping mov/vec case Now that we're using load/store_reg intrinsics, the previous checks for registers aren't what we want. Instead, we need to be looking for a mov or vec where both the destination and a source are load/store_reg with matching decl_reg. Fixes: `b8209d69ff` ("intel/fs: Add support for new-style registers") Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24310>	2023-07-25 16:25:11 +00:00
Faith Ekstrand	45ee952efb	intel/fs: Use write masks from store_reg intrinsics Fixes: `b8209d69ff` ("intel/fs: Add support for new-style registers") Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24310>	2023-07-25 16:25:10 +00:00
Marcin Ślusarz	4f1125e4ae	intel/compiler/test: fix crashes when TEST_DEBUG is set Dumping instructions requires that ISA info is not empty. Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24274>	2023-07-25 15:13:29 +00:00
Lionel Landwerlin	8cbf730145	intel/fs: don't try to rebuild sequences of non ssa values Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `04777171e0` ("intel/fs: try to rematerialize surface computation code") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9378 Reviewed-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24228>	2023-07-24 20:04:24 +00:00
Faith Ekstrand	079e8a9674	anv,hasvk,iris: sampler_prog_key::swizzles is only used on crocus The field is no longer consumed by brw_complie_* and is instead handled directly by the crocus driver. Therefore, it's safe to leave it zero and not even bother setting it. This removes our reliance on the SWIZZLE_* macros in prog_instructions.h. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24288>	2023-07-24 15:40:40 +00:00
Marcin Ślusarz	48885c7fe3	intel/compiler: load debug mesh compaction options once Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>	2023-07-24 07:55:29 +00:00
Marcin Ślusarz	c1685f08dd	intel/compiler,anv: put some vertex and primitive data in headers Both per-primitive and per-vertex space is allocated in MUE in 8 dword chunks and those 8-dword chunks (granularity of 3DSTATE_SBE_MESH.Per[Primitive\|Vertex]URBEntryOutputReadLength) are passed to fragment shaders as inputs (either non-interpolated for per-primitive and flat vertex attributes or interpolated for non-flat vertex attributes). Some attributes have a special meaning and must be placed in separate 8/16-dword slot called Primitive Header or Vertex Header. Primitive Header contains 4 such attributes (Cull Primitive, ViewportIndex, RTAIndex, CPS), leaving 4 dwords (the rest of 8-dword slot) potentially unused. Vertex Header is similar - it starts with 3 unused dwords, 1 dword for Point Size (but if we declare that shader doesn't produce Point Size then we can reuse it), followed by 4 dwords for Position and optionally 8 dwords for clip distances. This means we have an interesting optimization problem - we can put some user attributes into holes in Primitive and Vertex Headers, which may lead to smaller MUE size and potentially more mesh threads running in parallel, but we have to be careful to use those holes only when we need it, otherwise we could force HW to pass too much data to fragment shader. Example 1: Let's assume that Primitive Header is enabled and user defined 12 dwords of per-primitive attributes. Without packing we would consume 8 + ALIGN(12, 8) = 24 dwords of MUE space and pass ALIGN(12, 8) = 16 dwords to fragment shader. With packing, we'll consume 4 + 4 + ALIGN(12 - 4, 8) = 16 dwords of MUE space and pass ALIGN(4, 8) + ALIGN(12 - 4, 8) = 16 dwords to fragment shader. 16/16 is better than 24/16, so packing makes sense. Example 2: Now let's assume that Primitive Header is enabled and user defined 16 dwords of per-primitive attributes. Without packing we would consume 8 + ALIGN(16, 8) = 24 dwords of MUE space and pass ALIGN(16, 16) = 16 dwords to fragment shader. With packing, we'll consume 4 + 4 + ALIGN(16 - 4, 8) = 24 dwords of MUE space and pass ALIGN(4, 8) + ALIGN(16 - 4, 8) = 24 dwords to fragment shader. 24/24 is worse than 24/16, so packing doesn't make sense. This change doesn't affect vk_meshlet_cadscene in default configuration, but it speeds it up by up to 25% with "-extraattributes N", where N is some small value divisible by 2 (by default N == 1) and we are bound by URB size. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>	2023-07-24 07:55:29 +00:00
Marcin Ślusarz	a252123363	intel/compiler/mesh: compactify MUE layout Instead of using 4 dwords for each output slot, use only the amount of memory actually needed by each variable. There are some complications from this "obvious" idea: - flat and non-flat variables can't be merged into the same vec4 slot, because flat inputs mask has vec4 stride - multi-slot variables can have different layout: float[N] requires N 1-dword slots, but i64vec3 requires 1 fully occupied 4-dword slot followed by 2-dword slot - some output variables occur both in single-channel/component split and combined variants - crossing vec4 boundary requires generating more writes, so avoiding them if possible is beneficial This patch fixes some issues with arrays in per-vertex and per-primitive data (func.mesh.ext.outputs.*.indirect_array.q0 in crucible) and by reduction in single MUE size it allows spawning more threads at the same time. Note: this patch doesn't improve vk_meshlet_cadscene performance because default layout is already optimal enough. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>	2023-07-24 07:55:29 +00:00
Alyssa Rosenzweig	1466014184	nir: Rename lower_locals_to_reg_intrinsics back The short name is freed up. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24253>	2023-07-21 11:25:49 +00:00
Alyssa Rosenzweig	a08286f993	intel/fs: Don't read reg.base_offset It's not set in the new intrinsics path. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24253>	2023-07-21 11:25:48 +00:00
Marcin Ślusarz	3c83ac8002	intel/compiler: remove redundant code has_lsc is checked few lines above, so this code doesn't matter. Added in: `a358b97c58` ("intel/fs: optimize uniform SSBO & shared loads") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24260>	2023-07-21 07:22:22 +00:00
Felix DeGrood	d04be9770b	intel/compiler: use shader source hash in shader dump code Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>	2023-07-20 09:08:08 +00:00
Lionel Landwerlin	3384f029be	intel/compiler: rework input parameters Use a struct for various common parameters rather than per stage structure or arguments to stage specific entrypoints. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>	2023-07-20 09:08:08 +00:00
Lionel Landwerlin	46958bcb74	intel/fs: fix missing predicate on SEL instruction Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `d8dfd153c5` ("intel/fs: Make per-sample and coarse dispatch tri-state") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9381 Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24236>	2023-07-19 21:57:25 +00:00
Faith Ekstrand	61a90c2968	intel/vec4: Drop support for nir_register Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24104>	2023-07-19 02:11:57 +00:00
Faith Ekstrand	39b5bb0809	intel/fs: Drop support for nir_register Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24104>	2023-07-19 02:11:57 +00:00

1 2 3 4 5 ...

2745 Commits