In programs with a lot of unused temps, if we don't do this, we may
end up recycling previously used rfs more often, which can be
detrimental to instruction pairing.
total instructions in shared programs: 11464335 -> 11444136 (-0.18%)
instructions in affected programs: 8976743 -> 8956544 (-0.23%)
helped: 33196
HURT: 33778
Inconclusive result
total max-temps in shared programs: 2230150 -> 2229445 (-0.03%)
max-temps in affected programs: 86413 -> 85708 (-0.82%)
helped: 2217
HURT: 1523
Max-temps are helped.
total sfu-stalls in shared programs: 18077 -> 17104 (-5.38%)
sfu-stalls in affected programs: 8669 -> 7696 (-11.22%)
helped: 2657
HURT: 2182
Sfu-stalls are helped.
total inst-and-stalls in shared programs: 11482412 -> 11461240 (-0.18%)
inst-and-stalls in affected programs: 8995697 -> 8974525 (-0.24%)
helped: 33319
HURT: 33708
Inconclusive result
total nops in shared programs: 298140 -> 296185 (-0.66%)
nops in affected programs: 52805 -> 50850 (-3.70%)
helped: 3797
HURT: 2662
Inconclusive result
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
The last 3 instructions can't use specific registers, so flag the nodes
for temps used in the last program instructions and try to avoid
assigning any of those registers to them. This may help us avoid
injecting nops for the last thread switch instruction.
Because register allocation needs to happen before QPU scheduling
and instruction merging, we can't tell exactly what the last 3
instructions will be, so we do this for a few more instructions than
just 3.
We only do this for fragment shaders because other shader stages
always end with VPM store instructions that take a small immediate
and therefore will never allow us to merge the final thread switch
earlier, so limiting allocation for these shaders would never improve
anything and might instead be detrimental.
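A minimal sketch of the idea, using hypothetical names (qinst,
flag_tail_temps and LAST_INST_SLACK are illustrative, not the actual
backend identifiers): walk the last few instructions and mark every
temp they touch so the allocator can steer those nodes away from the
restricted registers.

#include <stdbool.h>

/* Hypothetical, simplified instruction: reads up to two temps and
 * writes at most one.  Real backend instructions carry much more. */
struct qinst {
   int dst_temp;      /* -1 if the destination is not a temp */
   int src_temp[2];   /* -1 if the source slot is unused */
};

/* Register allocation runs before scheduling and instruction merging,
 * so we don't know the exact final 3 instructions; flag a few extra. */
#define LAST_INST_SLACK 6

static void
flag_tail_temps(const struct qinst *insts, int num_insts,
                bool *avoid_restricted /* indexed by temp */)
{
   int first = num_insts > LAST_INST_SLACK ? num_insts - LAST_INST_SLACK : 0;

   for (int i = first; i < num_insts; i++) {
      if (insts[i].dst_temp >= 0)
         avoid_restricted[insts[i].dst_temp] = true;
      for (int s = 0; s < 2; s++) {
         if (insts[i].src_temp[s] >= 0)
            avoid_restricted[insts[i].src_temp[s]] = true;
      }
   }
}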
total instructions in shared programs: 11471389 -> 11464335 (-0.06%)
instructions in affected programs: 582908 -> 575854 (-1.21%)
helped: 4669
HURT: 578
Instructions are helped.
total max-temps in shared programs: 2230497 -> 2230150 (-0.02%)
max-temps in affected programs: 5662 -> 5315 (-6.13%)
helped: 344
HURT: 44
Max-temps are helped.
total sfu-stalls in shared programs: 18068 -> 18077 (0.05%)
sfu-stalls in affected programs: 264 -> 273 (3.41%)
helped: 37
HURT: 48
Inconclusive result (value mean confidence interval includes 0).
total inst-and-stalls in shared programs: 11489457 -> 11482412 (-0.06%)
inst-and-stalls in affected programs: 585180 -> 578135 (-1.20%)
helped: 4659
HURT: 588
Inst-and-stalls are helped.
total nops in shared programs: 301738 -> 298140 (-1.19%)
nops in affected programs: 14680 -> 11082 (-24.51%)
helped: 3252
HURT: 108
Nops are helped.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
The rf variants need to encode the destination in the cond bits, which
prevents them from being merged with any other instruction that needs
those bits.
In 4.x, ldunif(a) writes to r5, which is a special register that only
ldunif(a) and ldvary can write, so we have a special register class for
it and only allow it for them. Then, when we need to choose a register
for a node, if this register is available we always use it.
In 7.x these instructions write to rf0, which can be used by any
instruction, so instead of restricting rf0, we track the temps that
are used as ldunif(a) destinations and use that information to favor
rf0 for them.
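A minimal sketch of the 7.x selection bias, with hypothetical names
(choose_reg_for_node and PHYS_RF0 are illustrative, not the actual
register allocator entry points):

#include <stdbool.h>

/* rf0 is just a regular register file entry on 7.x, so rather than a
 * dedicated class we bias selection towards it for ldunif(a) temps. */
#define PHYS_RF0 0

static int
choose_reg_for_node(bool node_is_ldunif_dst,
                    const bool *reg_is_free, int num_regs)
{
   /* Prefer rf0 for ldunif(a) destinations when it is available. */
   if (node_is_ldunif_dst && reg_is_free[PHYS_RF0])
      return PHYS_RF0;

   /* Otherwise pick the first free register. */
   for (int r = 0; r < num_regs; r++) {
      if (reg_is_free[r])
         return r;
   }

   return -1; /* allocation failed */
}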
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Having this pointer in the key is undesirable since it makes
copying keys difficult and error prone (as seen in previous
patches). Also, it is only there for convenience and we don't
strictly need it (in fact, the Vulkan driver doesn't use it at
all), so let's just get rid of it so our v3d_key is fully
static.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25418>
Our shader key includes a void pointer that we can't just memcmp,
so add helpers that allow us to get the 'static' portion and size
of a key. We will use this to fix up the shader cache in v3d in
a later patch.
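A minimal sketch of what such helpers might look like, assuming a
hypothetical key layout (demo_key, shader_state and the field names
are illustrative; the real v3d_key has different contents): the
pointer sits at the start and the comparable bytes follow it.

#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key: a driver-owned pointer followed by the static
 * data that actually identifies the shader variant. */
struct demo_key {
   void *shader_state;      /* convenience pointer, not hashable */
   unsigned num_samples;    /* first piece of "static" data */
   unsigned swizzle[4];
};

/* The portion of the key that is safe to memcmp or sha1. */
static const void *
demo_key_static_data(const struct demo_key *key)
{
   return (const char *)key + offsetof(struct demo_key, num_samples);
}

static size_t
demo_key_static_size(void)
{
   return sizeof(struct demo_key) - offsetof(struct demo_key, num_samples);
}

/* Usage: compare two keys while ignoring the pointer. */
static bool
demo_key_equal(const struct demo_key *a, const struct demo_key *b)
{
   return memcmp(demo_key_static_data(a), demo_key_static_data(b),
                 demo_key_static_size()) == 0;
}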
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25418>
It seems NIR is tracking this for us now, so we can stop doing it
in the backend.
Also, new CTS tests seem to add the requirement that, in the presence
of some builtins like gl_SampleID in a shader, sample shading is
expected to be enabled even if the builtin is unused, which is
something we can't track in the backend since the variable may have
been dropped by then.
Fixes 2 failures in:
dEQP-VK.draw.renderpass.implicit_sample_shading.sample*
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23984>
This was only used for versions < 40 (see commit 22a02f3e3).
Add some extra explanations and asserts at the places where it is used.
While we are here, also move the definition of the register with
QFILE_VPM to avoid defining it when it is not needed.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22984>
If we are trying to lower register pressure this can make a big
difference in some cases. To avoid adding even more strategies,
merge this with disabling UBO load sorting, since they are basically
trying to do the same thing.
total instructions in shared programs: 12848024 -> 12844510 (-0.03%)
instructions in affected programs: 236537 -> 233023 (-1.49%)
helped: 195
HURT: 87
Instructions are helped.
total uniforms in shared programs: 3815601 -> 3814932 (-0.02%)
uniforms in affected programs: 31773 -> 31104 (-2.11%)
helped: 67
HURT: 115
Inconclusive result (value mean confidence interval includes 0).
total max-temps in shared programs: 2210803 -> 2210622 (<.01%)
max-temps in affected programs: 9362 -> 9181 (-1.93%)
helped: 114
HURT: 34
Max-temps are helped.
total spills in shared programs: 2556 -> 2330 (-8.84%)
spills in affected programs: 1391 -> 1165 (-16.25%)
helped: 39
HURT: 9
total fills in shared programs: 3840 -> 3317 (-13.62%)
fills in affected programs: 2379 -> 1856 (-21.98%)
helped: 39
HURT: 23
total sfu-stalls in shared programs: 21965 -> 21978 (0.06%)
sfu-stalls in affected programs: 2618 -> 2631 (0.50%)
helped: 45
HURT: 81
Inconclusive result (value mean confidence interval includes 0).
total inst-and-stalls in shared programs: 12869989 -> 12866488 (-0.03%)
inst-and-stalls in affected programs: 238771 -> 235270 (-1.47%)
helped: 193
HURT: 87
Inst-and-stalls are helped.
total nops in shared programs: 303501 -> 303274 (-0.07%)
nops in affected programs: 4159 -> 3932 (-5.46%)
helped: 87
HURT: 105
Inconclusive result (value mean confidence interval includes 0).
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22824>
nir_opt_gcm gets us worse shader-db stats, but that is expected.
However, we want to avoid making spills/fills worse. Analyzing the
outcome with shader-db, this mostly happens with shaders that are
already complex and already spilling/filling.
So the best option here is to add a new strategy that falls back if
we get spills/fills when using nir_opt_gcm (see the sketch below).
It is not clear at which point in the strategy order we should
disable gcm; for now we disable it before loop unrolling.
We get a slight performance gain (on average) using nir_opt_gcm.
We don't show the shader-db stats, as they are worse, but as mentioned,
this is expected.
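A minimal sketch of how such a fallback might be wired, using
hypothetical names (strategy, compile_with and compile_best are
illustrative, not the actual backend code):

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical strategy table: each entry toggles expensive
 * optimizations, and the entry disabling GCM sits before the more
 * drastic fallbacks such as disabling loop unrolling. */
struct strategy {
   const char *name;
   bool enable_gcm;
   bool enable_loop_unroll;
};

static const struct strategy strategies[] = {
   { "default",           true,  true  },
   { "disable gcm",       false, true  },
   { "disable unrolling", false, false },
};

/* Stub standing in for the real compile; returns the spill count. */
static int
compile_with(const struct strategy *s)
{
   return s->enable_gcm ? 2 : 0; /* pretend GCM causes two spills */
}

static void
compile_best(void)
{
   for (unsigned i = 0; i < sizeof(strategies) / sizeof(strategies[0]); i++) {
      if (compile_with(&strategies[i]) == 0) {
         printf("compiled with strategy '%s'\n", strategies[i].name);
         return;
      }
      /* Spilled: fall back to the next, more conservative strategy. */
   }
}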
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17185>
These are optimizations that we are already calling in the Vulkan
driver, as preparation for the Vulkan frontend to use v3d_optimize_nir
too.
We need to add a new parameter to v3d_optimize_nir in order to know if
we can call nir_opt_find_array_copies. As we don't track whether we are
calling nir_lower_var_copies, we explicitly call it when the uncompiled
shader is created. So instead of tracking it, we assume that each
driver (v3d/v3dv) calls it when the shader is created, and when
v3d_optimize_nir is called later as part of compiling the shader, we
call it with allow_copies set to false.
We exclude nir_opt_gcm on purpose, as it is a case of an optimization
that could help performance even if it hurts shader-db stats.
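A simplified sketch of the gating, assuming the Mesa NIR headers
(optimize_once is an illustrative name; the real v3d_optimize_nir
loops over many more passes):

#include "nir.h"

static bool
optimize_once(nir_shader *s, bool allow_copies)
{
   bool progress = false;

   /* Only look for array copies when the caller guarantees the
    * shader's variable copies have not been lowered yet. */
   if (allow_copies)
      NIR_PASS(progress, s, nir_opt_find_array_copies);

   NIR_PASS(progress, s, nir_copy_prop);
   NIR_PASS(progress, s, nir_opt_dce);

   return progress;
}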
shaderdb stats:
total instructions in shared programs: 11705923 -> 11705034 (<.01%)
instructions in affected programs: 88350 -> 87461 (-1.01%)
helped: 201
HURT: 80
Instructions are helped.
total threads in shared programs: 375552 -> 375558 (<.01%)
threads in affected programs: 6 -> 12 (100.00%)
helped: 3
HURT: 0
total uniforms in shared programs: 3486108 -> 3485789 (<.01%)
uniforms in affected programs: 7473 -> 7154 (-4.27%)
helped: 90
HURT: 1
Uniforms are helped.
total max-temps in shared programs: 2021860 -> 2021802 (<.01%)
max-temps in affected programs: 800 -> 742 (-7.25%)
helped: 21
HURT: 3
Max-temps are helped.
total sfu-stalls in shared programs: 19299 -> 19296 (-0.02%)
sfu-stalls in affected programs: 18 -> 15 (-16.67%)
helped: 10
HURT: 7
Inconclusive result (value mean confidence interval includes 0).
total inst-and-stalls in shared programs: 11725222 -> 11724330 (<.01%)
inst-and-stalls in affected programs: 88402 -> 87510 (-1.01%)
helped: 201
HURT: 80
Inst-and-stalls are helped.
total nops in shared programs: 269674 -> 269386 (-0.11%)
nops in affected programs: 3641 -> 3353 (-7.91%)
helped: 103
HURT: 29
Nops are helped.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17185>
Two advantages:
* When using NIR_DEBUG=nir_print_xx, the outcome will only be printed
  if there is a change.
* We can use NIR_PASS(_, ...) instead of NIR_PASS_V, which has
  slightly more validation checks.
This includes:
* v3d_nir_lower_image_load_store
* v3d_nir_lower_io
* v3d_nir_lower_line_smooth
* v3d_nir_lower_load_store_bitsize
* v3d_nir_lower_robust_buffer_access
* v3d_nir_lower_scratch
* v3d_nir_lower_txf_ms
While we are here, we also simplify some of them by using the
nir_shader_instructions_pass helper (see the sketch below).
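A minimal sketch of that style of pass, assuming the Mesa NIR headers
(v3d_nir_lower_example and lower_instr are illustrative names, not one
of the passes listed above):

#include "nir.h"
#include "nir_builder.h"

/* Per-instruction callback: return true only when the instruction
 * was actually rewritten, so progress is reported accurately. */
static bool
lower_instr(nir_builder *b, nir_instr *instr, void *data)
{
   if (instr->type != nir_instr_type_intrinsic)
      return false;

   /* A real pass would rewrite the instruction here using 'b'. */
   return false;
}

static bool
v3d_nir_lower_example(nir_shader *s)
{
   return nir_shader_instructions_pass(s, lower_instr,
                                       nir_metadata_block_index |
                                       nir_metadata_dominance,
                                       NULL);
}

/* Called as NIR_PASS(_, s, v3d_nir_lower_example), the shader is only
 * printed with NIR_DEBUG when the pass reports progress. */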
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17609>
Inline uniform blocks store their contents in pool memory rather
than in a separate buffer, and are intended to give some platforms
a way to provide more efficient access to the uniform data, similar
to push constants but with more flexible size constraints.
We implement these in a similar way to push constants: for constant
access we copy the data into the uniform stream (using the new
QUNIFORM_UNIFORM_UBO_* enums to identify the inline buffer from
which we need to copy), and for indirect access we fall back to
regular UBO access.
Because at the NIR level there is no distinction between inline and
regular UBOs and the compiler isn't aware of Vulkan descriptor
sets, we use the UBO index on UBO load intrinsics to identify
inline UBOs, just like we do for push constants. Specifically,
we reserve indices 1..MAX_INLINE_UNIFORM_BUFFERS for this.
However, unlike push constants, inline buffers are accessed
through descriptor sets, so we need to make sure they are located
in the first slots of the UBO descriptor map. This means we store
them in the first MAX_INLINE_UNIFORM_BUFFERS slots of the map,
with regular UBOs always coming after these slots.
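A minimal sketch of the index handling described above; the exact
offsets are illustrative (it assumes UBO index 0 is the push constant
buffer) and the helper names are hypothetical:

#include <stdbool.h>
#include <stdint.h>

#define MAX_INLINE_UNIFORM_BUFFERS 4 /* illustrative limit */

/* UBO indices 1..MAX_INLINE_UNIFORM_BUFFERS are reserved for inline
 * uniform blocks; anything above that is a regular UBO. */
static bool
ubo_index_is_inline(uint32_t ubo_index)
{
   return ubo_index >= 1 && ubo_index <= MAX_INLINE_UNIFORM_BUFFERS;
}

/* Inline blocks occupy the first MAX_INLINE_UNIFORM_BUFFERS slots of
 * the UBO descriptor map, so inline UBO index N maps to slot N - 1.
 * Regular UBOs are then added starting at slot
 * MAX_INLINE_UNIFORM_BUFFERS. */
static uint32_t
inline_ubo_descriptor_map_slot(uint32_t ubo_index)
{
   return ubo_index - 1;
}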
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15575>
This can add quite a bit of register pressure, so it makes sense to
disable it to prevent us from dropping to 2 threads or increasing
spills:
total instructions in shared programs: 12672813 -> 12642413 (-0.24%)
instructions in affected programs: 256721 -> 226321 (-11.84%)
helped: 719
HURT: 77
total threads in shared programs: 415534 -> 416322 (0.19%)
threads in affected programs: 788 -> 1576 (100.00%)
helped: 394
HURT: 0
total uniforms in shared programs: 3711370 -> 3703861 (-0.20%)
uniforms in affected programs: 28859 -> 21350 (-26.02%)
helped: 204
HURT: 455
total max-temps in shared programs: 2159439 -> 2150686 (-0.41%)
max-temps in affected programs: 32945 -> 24192 (-26.57%)
helped: 585
HURT: 47
total spills in shared programs: 5966 -> 3255 (-45.44%)
spills in affected programs: 2933 -> 222 (-92.43%)
helped: 192
HURT: 4
total fills in shared programs: 9328 -> 4630 (-50.36%)
fills in affected programs: 5184 -> 486 (-90.62%)
helped: 196
HURT: 0
Compared to the stats before adding scheduling of non-filtered
memory reads, we see that we have now gotten back all that was
lost and then some:
total instructions in shared programs: 12663186 -> 12642413 (-0.16%)
instructions in affected programs: 2051803 -> 2031030 (-1.01%)
helped: 4885
HURT: 3338
total threads in shared programs: 415870 -> 416322 (0.11%)
threads in affected programs: 896 -> 1348 (50.45%)
helped: 300
HURT: 74
total uniforms in shared programs: 3711629 -> 3703861 (-0.21%)
uniforms in affected programs: 158766 -> 150998 (-4.89%)
helped: 1973
HURT: 499
total max-temps in shared programs: 2138857 -> 2150686 (0.55%)
max-temps in affected programs: 177920 -> 189749 (6.65%)
helped: 2666
HURT: 2035
total spills in shared programs: 3860 -> 3255 (-15.67%)
spills in affected programs: 2653 -> 2048 (-22.80%)
helped: 77
HURT: 21
total fills in shared programs: 5573 -> 4630 (-16.92%)
fills in affected programs: 3839 -> 2896 (-24.56%)
helped: 81
HURT: 15
total sfu-stalls in shared programs: 39583 -> 38154 (-3.61%)
sfu-stalls in affected programs: 8993 -> 7564 (-15.89%)
helped: 1808
HURT: 1038
total nops in shared programs: 324894 -> 323685 (-0.37%)
nops in affected programs: 30362 -> 29153 (-3.98%)
helped: 2513
HURT: 2077
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
Now that we don't sort our nodes, we can arrange them so we can
easily translate between nodes and temps without a mapping table,
just by applying an offset.
To do this we have a single array of nodes where we put the nodes
for accumulators first and then the nodes for temps. With this setup
we can ensure that for any given temp T, its node is always
T + ACC_COUNT.
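A minimal sketch of that translation (temp_to_node/node_to_temp and
the ACC_COUNT value are illustrative):

#include <assert.h>

#define ACC_COUNT 6 /* accumulator nodes come first in the array */

/* With accumulators first and temps after them, translating between a
 * temp index and its RA node is just a constant offset. */
static inline unsigned
temp_to_node(unsigned temp)
{
   return temp + ACC_COUNT;
}

static inline unsigned
node_to_temp(unsigned node)
{
   assert(node >= ACC_COUNT);
   return node - ACC_COUNT;
}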
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
When we spill we add new temps. We should be careful not to access
liveness for these until it has been re-computed, after all spills
and fills for the spilled temp have been processed, so as to avoid
out-of-bounds accesses to the c->temp_start and c->temp_end arrays.
This fixes a crash in a Three.js demo when we try to patch register
classes after a TMU spill, caused because we would incorrectly try
to patch the same temps we had just added for the spill itself,
which is not only unnecessary but also incorrect, since these temps
don't have liveness information available yet and thus cause
out-of-bounds accesses.
Fixes: f3c3228522 ('broadcom/compiler: do not rebuild the interference graph after each spill')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15107>
Instead, we only recompute liveness and add new nodes and
interferences to the graph manually (we also need to patch
register classes in some cases).
To assist in this process, we add an ip counter to our instructions,
which we recompute after each spill and use to identify registers
whose live ranges cross the thrsw boundaries introduced by TMU spills
and fills, adjusting their register classes accordingly (removing
their ability to use accumulators).
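A minimal sketch of that class patching, with hypothetical names
(temp_liveness and patch_classes_for_new_thrsw are illustrative):
after recomputing liveness in ip units, any temp whose range crosses
the new thrsw loses the accumulator class.

#include <stdbool.h>

/* Hypothetical per-temp liveness record, in instruction-counter (ip)
 * units that are recomputed after each spill. */
struct temp_liveness {
   int start_ip;
   int end_ip;
   bool can_use_accumulators;
};

/* Accumulator contents are not preserved across thread switches, so
 * any temp live across the newly added thrsw must drop that class. */
static void
patch_classes_for_new_thrsw(struct temp_liveness *temps, int num_temps,
                            int thrsw_ip)
{
   for (int t = 0; t < num_temps; t++) {
      if (temps[t].start_ip < thrsw_ip && temps[t].end_ip > thrsw_ip)
         temps[t].can_use_accumulators = false;
   }
}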
This significantly reduces the CPU cost of spills. Using
shaders/closed/gputest/piano/7.shader_test as reference:
Compile time up to the first successful compile strategy in main is
~24s and with this change it is ~11s. With this speed up, we can now
try all 2-thread compile strategies (including the fallback scheduler)
in only ~15s.
A full shader-db run results in:
Total CPU time (seconds): 9904.67 -> 9087.98 (-8.25%)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>
Instead of whether they are allowed to spill or not. This is more
flexible.
Also, while we are not currently enabling spilling on any 4-thread
strategies, should we do that in the future, we always prefer a
4-thread compile.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>
V3D hardware doesn't support vector access for general TMU load/store
operations like the ones we use for UBO and SSBO, so we need to split
these into scalar operations.
It should be noted that we also have a vectorization pass (which runs
later, during optimization) that may reconstruct some of these into
32-bit operations when possible (i.e. when the resulting operation
is 32-bit aligned).
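A minimal sketch of the splitting, with hypothetical names
(scalar_access and split_vector_access are illustrative): a vecN
32-bit access at a given offset becomes N scalar accesses at
consecutive 4-byte offsets.

#include <assert.h>
#include <stdint.h>

struct scalar_access {
   uint32_t offset;     /* byte offset of the scalar word */
   uint8_t  component;  /* which component of the original vector */
};

/* Split a vecN 32-bit load/store at 'offset' into N scalar accesses.
 * The later vectorization pass may re-merge neighbours when the
 * result is suitably aligned. */
static int
split_vector_access(uint32_t offset, uint8_t num_components,
                    struct scalar_access out[4])
{
   assert(num_components <= 4);

   for (uint8_t c = 0; c < num_components; c++) {
      out[c].offset = offset + c * 4;
      out[c].component = c;
   }

   return num_components;
}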
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
We had been storing pointers to a driver-owned swizzle table
rather than the actual swizzle values in various shader and
pipeline keys in both the GL and Vulkan drivers.
This doesn't look very robust, particularly since we also
compute sha1 hashes from these values and may store those
hashes to disk (for the disk cache).
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13738>