third_party_mesa3d

Author	SHA1	Message	Date
Francisco Jerez	37e280f28a	intel/fs: Lower unsupported regioning with non-trivial 2D regions on FIXED_GRFs. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Caio Oliveira	dd632bf527	intel/fs/xe2+: Update TASK/MESH payload setup for Xe2 reg size. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Caio Oliveira	8944ac7d6c	intel/fs/xe2+: Update BS payload setup for Xe2 reg size. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	14e1b9ee69	intel/fs/xe2+: Update TES payload setup for Xe2 reg size. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	4b3243104c	intel/fs/xe2+: Update TCS payload setup for Xe2 reg size. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	6195eac210	intel/fs/xe2+: Update GS payload setup for Xe2 reg size. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Caio Oliveira	28744c8954	intel/compiler/xe2: Account for reg_unit() in TES intrinsics Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Caio Oliveira	9859f5b4d2	intel/compiler/xe2: Account for reg_unit() in TCS intrinsics Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	610daa3166	intel/fs/xe2+: Fix payload layout of sampler messages for Xe2 reg size Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Ian Romanick	c9f2857546	intel/compiler/xe2: TXD is lowered to SIMD16 in SIMD32 mode Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Ian Romanick	ef817650c9	intel/compiler/xe2: Use SIMD16 for nir_intrinsic_image_size Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Ian Romanick	0b23df3951	intel/compiler/xe2: Update fs_visitor::setup_vs_payload to account for Xe2 reg size [ Francisco Jerez: Simplify. ] Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Rohan Garg	42b90f05f6	intel/compiler: Adjust barrier emission for Xe2+ Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	8b1dc77521	intel/fs/xe2+: Scale BRW_MAX_MSG_LENGTH by native register size. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Rohan Garg	4de065f6a2	intel/compiler: Adjust fence message lengths for new register width on Xe2+ Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Rohan Garg	e1289d6135	intel/compiler: Adjust CS payload registers for new register width on Xe2+ Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	150b3e87c8	intel/fs/xe2+: Round up fs_builder::vgrf() size calculation to HW register unit. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	24dcc3269b	intel/fs/xe2+: Update encoding of FB write message payload. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	a573531785	intel/compiler/xe2+: Represent dispatch_grf_start_reg in native GRF units. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	17ef5e7ead	intel/fs/xe2+: Allow increased SIMD width for various get_fpu_lowered_simd_width() restrictions. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	6423cb9bfa	intel/eu/xe2+: Update validation of GRF region size to account for Xe2 reg size Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	00b614a5a7	intel/fs/xe2+: Scale MAX_SAMPLER_MESSAGE_SIZE by native register size. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	421d43fe62	intel/fs/xe2+: Fixes for increased accumulator register width. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	80e9031b44	intel/fs/xe2+: Fix grf_count in post-RA scheduling for updated register file size. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	571ddf8516	intel/fs/xe2+: Fix payload node live range calculations for change in register size. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	2b7419d090	intel/fs: Fix signedness of payload_node_count argument of calculate_payload_ranges(). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	abf8111560	intel/eu/xe2+: Fix encoding of various message descriptors for change in register size. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	6d39b3d6ae	intel/fs/ra/xe2: Scale up register allocation granularity by 2x on Xe2+ platforms. v2: Fix spill register allocation. Switch to brw_reg::nr representation in fake 256b units. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	bd98df5d8e	intel/compiler: Make MAX_VGRF_SIZE macro depend on devinfo and update it for Xe2. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Francisco Jerez	a7d521e556	intel/vec4/ra: Define REG_CLASS_COUNT constant specifying the number of register classes. Rework: * Jordan: 16=>20 following `d33aff783d` ("intel/fs: add support for sparse accesses") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:35 -07:00
Francisco Jerez	5d87f41a54	intel/fs/ra: Define REG_CLASS_COUNT constant specifying the number of register classes. Rework: * Jordan: 16=>20 following `d33aff783d` ("intel/fs: add support for sparse accesses") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:35 -07:00
Connor Abbott	4282386311	nir/spirv: Add inverse_ballot intrinsic This is actually a no-op on AMD, so we really don't want to lower it to something more complicated. There may be a more efficient way to do this on Intel too. In addition, in the future we'll want to use this for lowering boolean reduce operations, where the inverse ballot will operate on the backend's "natural" ballot type as indicated by options->ballot_bit_size, instead of uvec4 as produced by SPIR-V. In total, there are now three possible lowerings we may have to perform: - inverse_ballot with source type of uvec4 from SPIR-V to inverse_ballot with natural source type, when the backend supports inverse_ballot natively. - inverse_ballot with source type of uvec4 from SPIR-V to arithmetic, when the backend doesn't support inverse_ballot. - inverse_ballot with natural source type from reduce operation, when the backend doesn't support inverse_ballot. Previously we just did the second lowering unconditionally in vtn, but it's just a combination of the first and third. We add support here for the first and third lowerings in nir_lower_subgroups, instead of simply moving the second lowering, to avoid unnecessary churn. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25123>	2023-09-20 14:41:18 +00:00
Pavel Ondračka	1c72c71bdf	nir/move_vec_src_uses_to_dest: allow to skip reuse of constant sources And enable this for r300 and intel-vec4 crocus HSW (mostly helps few doplhin ubershaders): total instructions in shared programs: 1576736 -> 1576589 (<.01%) instructions in affected programs: 38235 -> 38088 (-0.38%) helped: 12 HURT: 0 total cycles in shared programs: 111025838 -> 110944796 (-0.07%) cycles in affected programs: 5646582 -> 5565540 (-1.44%) helped: 15 HURT: 6 total spills in shared programs: 447 -> 432 (-3.36%) spills in affected programs: 186 -> 171 (-8.06%) helped: 12 HURT: 0 total fills in shared programs: 792 -> 774 (-2.27%) fills in affected programs: 291 -> 273 (-6.19%) helped: 12 HURT: 0 r300 RV530: total instructions in shared programs: 96655 -> 96304 (-0.36%) instructions in affected programs: 15020 -> 14669 (-2.34%) helped: 79 HURT: 18 total temps in shared programs: 13027 -> 12952 (-0.58%) temps in affected programs: 677 -> 602 (-11.08%) helped: 41 HURT: 9 total cycles in shared programs: 147745 -> 147314 (-0.29%) cycles in affected programs: 21831 -> 21400 (-1.97%) helped: 84 HURT: 19 r300 RV370: total instructions in shared programs: 63678 -> 63669 (-0.01%) instructions in affected programs: 931 -> 922 (-0.97%) helped: 12 HURT: 6 total temps in shared programs: 10028 -> 10013 (-0.15%) temps in affected programs: 339 -> 324 (-4.42%) helped: 33 HURT: 10 total cycles in shared programs: 101118 -> 101087 (-0.03%) cycles in affected programs: 2659 -> 2628 (-1.17%) helped: 22 HURT: 6 Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24932>	2023-09-19 18:05:37 +02:00
Alyssa Rosenzweig	d1eb17e92e	treewide: Drop nir_ssa_for_src users Via Coccinelle patch: @@ expression b, s, n; @@ -nir_ssa_for_src(b, *s, n) +s->ssa @@ expression b, s, n; @@ -nir_ssa_for_src(b, s, n) +s.ssa Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25247>	2023-09-18 10:25:17 -04:00
Sviatoslav Peleshko	b1a63d5418	intel/fs: Check if the whole ubo load range is in the push const range Before this, we were checking only the beginning of the ubo range, so partially overlapping loads were trying to load undefined data. Fixes: `b2da1238` ("i965: Use pushed UBO data in the scalar backend.") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9748 Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25111>	2023-09-15 10:55:24 +00:00
Ian Romanick	92f5442489	intel/fs: Merge copy prop dataflow loops This is kept as a separate commit because the change looks like a lot more than it it. The order of the two loops is swapped, then the two loops are merged. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	fa2757aa97	intel/fs: Use rb_tree for copy prop dataflow Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	35644bb483	intel/fs: Use rb_tree to store ACP entries by destination Using a single data structure seems better. There's no appreciable performance change. On batman_arkham_city_goty.foz, the difference reported was 0.48%±0.36% (n=20). Several commits in the MR, including some that should have no effect at all, reported similar changes. I attribute this primarily changing of loop alignments and similar. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	c28bf1a249	intel/fs: Use rb_tree to store ACP entries by source On batman_arkham_city_goty.foz, this improves fossil-db time by -3.83%±0.24% (n=20). This fossil takes the longest time of any in my database. v2: Add some comments for cmp_entry_src_entry_src and cmp_entry_src_nr. Suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	06bdd3eac0	intel/fs: Encapsulate per-block ACP in a structure This simplifies some later changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	c262752d74	intel/fs: Make opt_copy_propagation_local file private This annoyed me durning development of this MR. Every time I changed the parameters to this internal function, I had to modify a public header file... and trigger a much large rebuild. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	0946108298	intel/fs: Simplify check in can_propagate_from The larger predicate here already requires that inst->opcode must be BRW_OPCODE_MOV, so it can't BRW_OPCODE_SEL. With that removed, the other simplifications are pretty straight forward. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	1f15a0f8b2	intel/fs: Don't loop in try_constant_propagate The caller already loops over the sources. This means that the caller must loop over the sources in reverse because constant propagation prefers to propagate into the last sources first. The shader-db and fossil-db changes (below) are all due to SEL instructions. Changing the order sources are visited changes whether a SEL with two immediate sources is (+f0.0) sel g12 IMM_A IMM_B or (-f0.0) sel g12 IMM_B IMM_A The ordering of the sources affects the order the constant combining encounters the values, and the determines which value is "combined" and which value remains an immediate. This affects the results by luck. If there are two instructions: (+f0.0) sel g12 IMM_A IMM_B (+f0.0) sel g13 IMM_A IMM_C Picking IMM_A is advantageous over picking IMM_B and IMM_C. Since the selection algorithm in constant combining is greedy, this case requires the algorithm see the values in just the right order for the right thing to happen. v2: Rebase on many, many changes. Move instruction source fixup reordering out or try_constant_propagate. v3: Rebase on !7698. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	ab23d89ade	intel/fs: Move src.file checks out of try_constant_propagate and try_copy_propagate Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:23 +00:00
Ian Romanick	b5b2338c5c	intel/fs: Make try_constant_propagate and try_copy_propagate file private This annoyed me durning development of this MR. Every time I changed the parameters to this internal function, I had to modify a public header file... and trigger a much large rebuild. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:22 +00:00
Ian Romanick	8665e37960	intel/fs: Don't try to copy propagate into a source again after progress is made If the linked list structure used depended on the list head to know when to terminate, this would be a pretty serious bug. If try_constant_propage or try_copy_propagate make progress, inst->src[i].nr will change. This results in the foreach_in_list using a different list header on later iterations of the loop. This causes two shaders in shader-db and 9 shaders in fossil-db to change. Looking at the code changes, these are cases where there was a copy of a copy that gets propagated. The part that confuses me is the VGRF numbers involved should not hash to the same bucket, so it should be impossible to find the original source from the intermediate VGRF. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:22 +00:00
Ian Romanick	e488b46419	intel/fs: Don't continue fixed point iteration just because liveout changes Unless the change in liveout also causes livein to change, updates to liveout cannot have any global effect. Changes to livein already flag additional interation. I had additional changes in this area that didn't pan out. While working on those change, I was a little confused about this bit of code. It's unnecessary, so it's better to delete it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>	2023-09-14 22:31:22 +00:00
Caio Oliveira	3890c60584	compiler/types: Remove unused GLSL_TYPE_FUNCTION and related functions GLSL doesn't use that type. SPIR-V used for a while but later started relying on its own data structures and stopped using it. See `ca62e849d3` ("nir/spirv: Stop using glsl_type for function types") If we were ever to add this one again, would be better to have a way to grab a key for lookup that did not require allocations, right now that's needed to inject return type as the first element in params array. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25160>	2023-09-12 23:18:12 +00:00
Iván Briano	f1bc58cb7b	intel/fs: use ffsll so we don't explode on 32 bits Fixes: `b200e5765c` ("anv: use a simpler MUE layout for fast linked libraries") Tested-by: Mark Janes <markjanes@swizzler.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25192>	2023-09-12 22:42:38 +00:00
Iván Briano	4eddeea7bf	intel/fs: handle URB setup for fast linked mesh pipelines Up until now, the mesh pipeline assumed it would be always linked to the fragment shader, and so the calculated MUE map would always be available. That is not the case for fast linked pipeline libraries, so the URB setup needs to account for this. We do this by replicating what's done for non-mesh pipelines, defining the URB based on the FS inputs, and always assuming they will be laid out in order of varying number, except that we also account for per-primitive attributes. Fixes all GPL using tests under dEQP-VK.mesh_shader.ext.smoke.* Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25047>	2023-09-12 02:51:31 +00:00

1 2 3 4 5 ...

2745 Commits