third_party_mesa3d

Author	SHA1	Message	Date
Lionel Landwerlin	8955d179d3	anv: fix MI_PREDICATE_RESULT write This register is only 32bits. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `1952fd8d2c` ("anv: Implement VK_EXT_conditional_rendering for gen 7.5+") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9428>	2021-03-05 16:19:20 +00:00
Alyssa Rosenzweig	718bfdb3da	pan/bi: Implement fsin/fcos Instead of lowering it in NIR, use the lookup tables as inputs to a second-order Taylor expansion. shader-db results aren't amazing but keep in mind this is without backend CSE yet. total instructions in shared programs: 115913 -> 115707 (-0.18%) instructions in affected programs: 3151 -> 2945 (-6.54%) helped: 12 HURT: 0 Instructions are helped. total nops in shared programs: 84045 -> 84041 (<.01%) nops in affected programs: 1571 -> 1567 (-0.25%) helped: 1 HURT: 7 Inconclusive result (value mean confidence interval includes 0). total clauses in shared programs: 20498 -> 20489 (-0.04%) clauses in affected programs: 188 -> 179 (-4.79%) helped: 6 HURT: 0 Clauses are helped. total quadwords in shared programs: 90395 -> 90291 (-0.12%) quadwords in affected programs: 2287 -> 2183 (-4.55%) helped: 12 HURT: 0 Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9420>	2021-03-05 15:15:10 +00:00
Alyssa Rosenzweig	253b795451	pan/bi: Allow negating constants Useful for representing -0 in transcendental sequences matching the blob. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9420>	2021-03-05 15:15:10 +00:00
Alyssa Rosenzweig	362756ad09	pan/bi: Use replace_index in more places Needed to respect abs/neg. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9420>	2021-03-05 15:15:10 +00:00
Pierre-Eric Pelloux-Prayer	c276bde34a	radeonsi/sqtt: export shader code to RGP With these changes the shader code is visible in RGP. Vk pipeline feature is emulated using si_update_shaders: when shaders are updated we compute a sha1 of their code and use it as a pipeline hash. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	729d3eb0e0	radeonsi/sqtt: don't always use WGP 0 Because it may be disabled. Instead use the cu mask to pick the first active WGP. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	47eafb3f51	radeonsi/sqtt: remove duplicate token V_008D18_REG_INCLUDE_CONTEXT was set twice. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	a27ea38d2a	radeonsi/sqtt: keep a copy of the uploaded shader code Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	7f5a8db96d	ac/rgp: move radv/sqtt functions to ac pso_correlation and code_object_loader don't depend on driver-specific logic, so move them to the shared code. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	b2ef94943f	ac/rtld: make ac_rtld_upload returns the code size This will be useful to keep a copy of the uploaded code. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	e5b1e645e7	ac/rgp: make the max gap between shader code a warning For radeonsi the shaders don't live in the same BOs, so they're unlikely to be less than 0x1000 bytes apart. So this commit bumps the threshold to 0x10000 and warns once when hitting it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	0e97d817f5	radeonsi: properly set SPI_SHADER_PGM_HI_ES When not using S_00B324_MEM_BASE the value isn't properly truncated. Cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Iago Toral Quiroga	6e6e71ddf9	broadcom/compiler: fix flags check for ldvary merge We were checking that the previous instruction doesn't write flags, but we also need to check it doesn't read them. Fixes: `1784dd22a3` ('broadcom/compiler: pipeline smooth ldvary sequences') Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9431>	2021-03-05 12:55:47 +00:00
Iago Toral Quiroga	21c1853c55	broadcom/compiler: ldvary doesn't implicitly write to r3 since V3D 4.1 total instructions in shared programs: 13805979 -> 13786037 (-0.14%) instructions in affected programs: 2263244 -> 2243302 (-0.88%) helped: 10646 HURT: 1508 Instructions are helped. total threads in shared programs: 412220 -> 412242 (<.01%) threads in affected programs: 58 -> 80 (37.93%) helped: 17 HURT: 6 Threads are helped. total uniforms in shared programs: 3793200 -> 3790401 (-0.07%) uniforms in affected programs: 131281 -> 128482 (-2.13%) helped: 1547 HURT: 281 Uniforms are helped. total max-temps in shared programs: 2326309 -> 2324834 (-0.06%) max-temps in affected programs: 31836 -> 30361 (-4.63%) helped: 1139 HURT: 153 Max-temps are helped. total spills in shared programs: 5932 -> 5940 (0.13%) spills in affected programs: 80 -> 88 (10.00%) helped: 2 HURT: 3 total fills in shared programs: 13370 -> 13372 (0.01%) fills in affected programs: 480 -> 482 (0.42%) helped: 2 HURT: 3 total sfu-stalls in shared programs: 30829 -> 30685 (-0.47%) sfu-stalls in affected programs: 2190 -> 2046 (-6.58%) helped: 570 HURT: 533 Sfu-stalls are helped. total inst-and-stalls in shared programs: 13836808 -> 13816722 (-0.15%) inst-and-stalls in affected programs: 2276152 -> 2256066 (-0.88%) helped: 10643 HURT: 1525 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9430>	2021-03-05 13:37:39 +01:00
Rhys Perry	524848707b	radv: don't set sx_blend_opt_epsilon for V_028C70_COLOR_10_11_11 Matches radeonsi and PAL. From PAL: // 1 is recommended, but doesn't provide sufficient precision Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4394 Fixes: `ed94638156` ("radv: Enable RB+ where possible.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9427>	2021-03-05 11:16:40 +00:00
Iago Toral Quiroga	839007e490	broadcom/compiler: always restart ldvary pipelining when scheduling ldvary When we were only able to pipeline smooth varyings, if we had to disable ldvary pipelining in the middle of a sequence it would stay disabled for the rest of the program, to prevent us from prioritizing scheduling of ldvary instructions that we would not be able to pipeline effectively. Now that we can pipeline all ldvary sequences we can change this. This change re-enables ldvary pipelining upon finding the next ldvary in the program in the hopes that we can continue pipelining successfully. To do this, we track the number of ldvary instructions we emitted so far and compare that to the number of inputs in the fragment shader we are scheduling. This also allows us to simplify our ldvary tracking at nir to vir time, since that is all now handled in the QPU scheduler. total instructions in shared programs: 13817048 -> 13810783 (-0.05%) instructions in affected programs: 810114 -> 803849 (-0.77%) helped: 4843 HURT: 591 Instructions are helped. total max-temps in shared programs: 2326612 -> 2326300 (-0.01%) max-temps in affected programs: 4689 -> 4377 (-6.65%) helped: 285 HURT: 7 Max-temps are helped. total sfu-stalls in shared programs: 30942 -> 30865 (-0.25%) sfu-stalls in affected programs: 207 -> 130 (-37.20%) helped: 120 HURT: 42 Sfu-stalls are helped. total inst-and-stalls in shared programs: 13847990 -> 13841648 (-0.05%) inst-and-stalls in affected programs: 825378 -> 819036 (-0.77%) helped: 4899 HURT: 590 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9404>	2021-03-05 10:32:19 +01:00
Samuel Pitoiset	2169c4f763	radv: re-enable TC-compat HTILE for MSAA D32S8 images on GFX9+ Should help MSAA games. Note that it's broken on GFX8 because the tiling doesn't match. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3868 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9284>	2021-03-05 08:44:40 +00:00
Xin He	97b196b921	virgl: use atomic operations when increase sub_ctx_id Use atomic operations to avoid competition. In addition, since sub_ctx_id 0 has been used by default, sub_ctx_id should start from 1. Signed-off-by: Xin He <hexin.op@bytedance.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9406>	2021-03-05 08:35:29 +00:00
Samuel Pitoiset	367a93830b	radv: skip useless FCE when fast-clearing MSAA images with DCC enabled The clear code is 0xCC which means CMASK isn't fast-cleared. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9392>	2021-03-05 08:11:28 +00:00
Samuel Pitoiset	6102507a74	radv: remove useless check about mips+layers for TC-compat HTILE images radv_use_htile_for_image() prevents it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9405>	2021-03-05 08:10:19 +01:00
Samuel Pitoiset	438f65fb1e	radv: cleanup enabling TC-compat HTILE for depth surfaces It makes more sense to try to enable TC-compat if the image has HTILE. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9405>	2021-03-05 08:09:42 +01:00
Mike Blumenkrantz	55b57db84d	zink: add vk/spirv caps/extension for shader LAYER variable this is required if gl_Layer is used outside of GEOMETRY stage Fixes: `c77df59c9e` ("zink: export PIPE_CAP_TGSI_VS_LAYER_VIEWPORT") Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9410>	2021-03-05 03:45:51 +00:00
Dave Airlie	1186fbcdf1	lavapipe: fix dynamic viewport/scissor pipeline emission Just fixup the tests for when the pipeline vp/scissors are emitted. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9422>	2021-03-05 03:34:47 +00:00
Dave Airlie	6bcd304278	lavapipe: fix pipeline vp/scissor mixup. Not copying all the scissors caused dEQP-VK.pipeline.extended_dynamic_state.two_draws_dynamic.2_viewports to fail but that test pointlessly relies on KHR_multiview (cts issue filed). Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Fixes: `b38879f8c5` ("vallium: initial import of the vulkan frontend") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9422>	2021-03-05 03:34:47 +00:00
Iván Briano	194e477615	anv: don't advertise mipmaps for linear 3D surfaces on BDW Prior to SKL, the mipmaps for 3D surfaces are laid out in a way that makes it impossible to represent in the way that VkSubresourceLayout expects. Since we can't tell users how to make sense of them, don't report them as available. "Fixes" dEQP-VK.image.subresource_layout.3d.* Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9419>	2021-03-04 16:23:23 -08:00
Ian Romanick	2c4fd24c01	nir/algebraic: Apply addition property of equality to the other ordering too Inequality comparison operations are not commutative, so `foo < bar` and `bar < foo` both have to be explicitly listed. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> All Intel GPUs had similar results. (Ice Lake shown) total instructions in shared programs: 20027051 -> 20026899 (<.01%) instructions in affected programs: 37181 -> 37029 (-0.41%) helped: 85 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 1.79 x̃: 1 helped stats (rel) min: 0.05% max: 6.78% x̄: 0.92% x̃: 0.68% 95% mean confidence interval for instructions value: -2.42 -1.15 95% mean confidence interval for instructions %-change: -1.23% -0.61% Instructions are helped. total cycles in shared programs: 979762793 -> 979753527 (<.01%) cycles in affected programs: 2653905 -> 2644639 (-0.35%) helped: 104 HURT: 50 helped stats (abs) min: 1 max: 1048 x̄: 119.99 x̃: 11 helped stats (rel) min: <.01% max: 9.88% x̄: 0.77% x̃: 0.20% HURT stats (abs) min: 1 max: 734 x̄: 64.26 x̃: 8 HURT stats (rel) min: <.01% max: 3.06% x̄: 0.36% x̃: 0.10% 95% mean confidence interval for cycles value: -98.65 -21.68 95% mean confidence interval for cycles %-change: -0.66% -0.15% Cycles are helped. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9374>	2021-03-04 22:50:53 +00:00
Ian Romanick	33031bdab6	nir/algebraic: Apply addition property of equality more conservatively This allows a lot more CSE. Depending on where the addition and the comparison are scheduled, it may also reduce register pressure by reducing the live range of the addends. Across all the platforms, the shaders affected for spills or fills were all fragment shaders from Dirt Rally. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) total instructions in shared programs: 21043103 -> 21038804 (-0.02%) instructions in affected programs: 892878 -> 888579 (-0.48%) helped: 1549 HURT: 724 helped stats (abs) min: 1 max: 225 x̄: 4.14 x̃: 2 helped stats (rel) min: 0.05% max: 11.18% x̄: 1.04% x̃: 0.78% HURT stats (abs) min: 1 max: 71 x̄: 2.93 x̃: 1 HURT stats (rel) min: 0.07% max: 6.90% x̄: 0.80% x̃: 0.56% 95% mean confidence interval for instructions value: -2.33 -1.45 95% mean confidence interval for instructions %-change: -0.50% -0.40% Instructions are helped. total cycles in shared programs: 855054155 -> 855757566 (0.08%) cycles in affected programs: 58275918 -> 58979329 (1.21%) helped: 1213 HURT: 1680 helped stats (abs) min: 1 max: 107405 x̄: 1684.00 x̃: 10 helped stats (rel) min: <.01% max: 38.09% x̄: 1.51% x̃: 0.25% HURT stats (abs) min: 1 max: 126632 x̄: 1634.59 x̃: 12 HURT stats (rel) min: <.01% max: 85.91% x̄: 2.75% x̃: 0.49% 95% mean confidence interval for cycles value: -98.06 584.35 95% mean confidence interval for cycles %-change: 0.71% 1.22% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 9843 -> 9771 (-0.73%) spills in affected programs: 72 -> 0 helped: 5 HURT: 0 total fills in shared programs: 9600 -> 9451 (-1.55%) fills in affected programs: 149 -> 0 helped: 5 HURT: 0 LOST: 14 GAINED: 9 Skylake total instructions in shared programs: 18185074 -> 18183866 (<.01%) instructions in affected programs: 575180 -> 573972 (-0.21%) helped: 1286 HURT: 468 helped stats (abs) min: 1 max: 15 x̄: 1.55 x̃: 1 helped stats (rel) min: 0.03% max: 4.08% x̄: 0.67% x̃: 0.65% HURT stats (abs) min: 1 max: 8 x̄: 1.69 x̃: 1 HURT stats (rel) min: 0.13% max: 7.69% x̄: 0.87% x̃: 0.45% 95% mean confidence interval for instructions value: -0.77 -0.60 95% mean confidence interval for instructions %-change: -0.30% -0.22% Instructions are helped. total cycles in shared programs: 960518105 -> 960608234 (<.01%) cycles in affected programs: 42536073 -> 42626202 (0.21%) helped: 1210 HURT: 1714 helped stats (abs) min: 1 max: 7015 x̄: 123.41 x̃: 10 helped stats (rel) min: <.01% max: 33.76% x̄: 1.32% x̃: 0.26% HURT stats (abs) min: 1 max: 14474 x̄: 139.71 x̃: 14 HURT stats (rel) min: <.01% max: 58.94% x̄: 2.00% x̃: 0.44% 95% mean confidence interval for cycles value: 4.02 57.63 95% mean confidence interval for cycles %-change: 0.43% 0.82% Cycles are HURT. LOST: 16 GAINED: 42 Broadwell total instructions in shared programs: 17856880 -> 17852158 (-0.03%) instructions in affected programs: 564836 -> 560114 (-0.84%) helped: 1243 HURT: 418 helped stats (abs) min: 1 max: 115 x̄: 4.36 x̃: 1 helped stats (rel) min: 0.03% max: 9.67% x̄: 0.90% x̃: 0.67% HURT stats (abs) min: 1 max: 8 x̄: 1.67 x̃: 1 HURT stats (rel) min: 0.14% max: 7.69% x̄: 0.89% x̃: 0.46% 95% mean confidence interval for instructions value: -3.45 -2.23 95% mean confidence interval for instructions %-change: -0.51% -0.38% Instructions are helped. 
total cycles in shared programs: 1031140321 -> 1029856892 (-0.12%) cycles in affected programs: 66986946 -> 65703517 (-1.92%) helped: 1084 HURT: 1653 helped stats (abs) min: 1 max: 415168 x̄: 1835.32 x̃: 10 helped stats (rel) min: <.01% max: 57.16% x̄: 1.19% x̃: 0.28% HURT stats (abs) min: 1 max: 43930 x̄: 427.14 x̃: 12 HURT stats (rel) min: <.01% max: 57.53% x̄: 1.32% x̃: 0.39% 95% mean confidence interval for cycles value: -915.76 -22.07 95% mean confidence interval for cycles %-change: 0.17% 0.47% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). total spills in shared programs: 20891 -> 20335 (-2.66%) spills in affected programs: 1567 -> 1011 (-35.48%) helped: 70 HURT: 0 total fills in shared programs: 27307 -> 25905 (-5.13%) fills in affected programs: 5381 -> 3979 (-26.05%) helped: 71 HURT: 0 LOST: 17 GAINED: 20 Haswell total instructions in shared programs: 16411850 -> 16409414 (-0.01%) instructions in affected programs: 602666 -> 600230 (-0.40%) helped: 1152 HURT: 781 helped stats (abs) min: 1 max: 103 x̄: 3.59 x̃: 1 helped stats (rel) min: 0.03% max: 8.61% x̄: 0.85% x̃: 0.65% HURT stats (abs) min: 1 max: 41 x̄: 2.18 x̃: 1 HURT stats (rel) min: 0.12% max: 7.69% x̄: 0.88% x̃: 0.69% 95% mean confidence interval for instructions value: -1.74 -0.78 95% mean confidence interval for instructions %-change: -0.21% -0.10% Instructions are helped. total cycles in shared programs: 1035338781 -> 1036977801 (0.16%) cycles in affected programs: 68961096 -> 70600116 (2.38%) helped: 1246 HURT: 2206 helped stats (abs) min: 1 max: 392022 x̄: 1040.28 x̃: 14 helped stats (rel) min: <.01% max: 56.44% x̄: 2.32% x̃: 0.38% HURT stats (abs) min: 1 max: 68630 x̄: 1330.56 x̃: 18 HURT stats (rel) min: <.01% max: 69.97% x̄: 3.31% x̃: 0.61% 95% mean confidence interval for cycles value: 90.43 859.17 95% mean confidence interval for cycles %-change: 1.02% 1.54% Cycles are HURT. 
total spills in shared programs: 17805 -> 17457 (-1.95%) spills in affected programs: 1202 -> 854 (-28.95%) helped: 34 HURT: 31 total fills in shared programs: 20939 -> 20387 (-2.64%) fills in affected programs: 2702 -> 2150 (-20.43%) helped: 34 HURT: 31 LOST: 24 GAINED: 45 Ivy Bridge and earlier Intel GPUs had similar results. (Ivy Bridge shown) total instructions in shared programs: 15515912 -> 15516757 (<.01%) instructions in affected programs: 396569 -> 397414 (0.21%) helped: 578 HURT: 858 helped stats (abs) min: 1 max: 9 x̄: 1.32 x̃: 1 helped stats (rel) min: 0.04% max: 3.70% x̄: 0.65% x̃: 0.65% HURT stats (abs) min: 1 max: 11 x̄: 1.87 x̃: 1 HURT stats (rel) min: 0.08% max: 12.90% x̄: 0.95% x̃: 0.53% 95% mean confidence interval for instructions value: 0.47 0.70 95% mean confidence interval for instructions %-change: 0.24% 0.37% Instructions are HURT. total cycles in shared programs: 584395455 -> 584466352 (0.01%) cycles in affected programs: 20346570 -> 20417467 (0.35%) helped: 1192 HURT: 1896 helped stats (abs) min: 1 max: 4108 x̄: 123.27 x̃: 14 helped stats (rel) min: <.01% max: 37.20% x̄: 2.27% x̃: 0.46% HURT stats (abs) min: 1 max: 3698 x̄: 114.89 x̃: 19 HURT stats (rel) min: <.01% max: 70.28% x̄: 3.02% x̃: 0.71% 95% mean confidence interval for cycles value: 10.75 35.16 95% mean confidence interval for cycles %-change: 0.73% 1.23% Cycles are HURT. LOST: 20 GAINED: 12 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9374>	2021-03-04 22:50:53 +00:00
Kenneth Graunke	206495cac4	iris: Enable u_threaded_context This implements most of the remaining u_threaded_context support. Most of the heavy lifting was done in the previous patches which fixed things up for the new thread safety requirements. Only a few things remain. u_threaded_context support can be disabled via an environment variable: GALLIUM_THREAD=0 On Felix's Tigerlake with the GPU at fixed frequency, enabling u_threaded_context improves performance of several games: - Civilization VI: +17% - Shadow of Mordor: +6% - Bioshock Infinite +6% - Xonotic: +6% Various microbenchmarks improve substantially as well: - GfxBench5 gl_driver2: +58% - SynMark2 OglBatch6: +54% - Piglit drawoverhead: +25% Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
Kenneth Graunke	c133d0930f	iris: Use thread safe slab allocators in transfer_map handling pipe->transfer_map can be called from u_threaded_context's thread rather than the driver thread. We need to use two different slab allocators, one for each thread. transfer_unmap, on the other hand, is only ever called from the driver thread. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
Kenneth Graunke	1b1c857248	iris: Make various classes inherit from u_threaded_context base classes u_threaded_context requires various objects to inherit from a new threaded_foo base class rather than directly from pipe_foo. This patch does most of the mechanical changes required for that. It also initializes the new threaded_resource fields. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
Kenneth Graunke	3358c7125a	iris: Use different shader uploaders for precompile vs. draw time When we enable u_threaded_context, the pipe->create_*_state hooks (precompile variants) are going to be called from one thread, while iris_update_compiled_shaders (on-the-fly variants) are going to be called from a driver thread. BLORP shaders also happen from clear, blit, and so on in the driver thread. u_upload_mgr isn't thread-safe, so use an uploader for each purpose. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
Kenneth Graunke	ec0d61c14c	iris: Support rebinding of stream output targets This enables us to replace the backing storage of resources that have been used as stream output targets, in case we're invalidating their entire contents. This can avoid stalls. We simply hadn't supported it because it was going to be tricky to re-emit 3DSTATE_SO_BUFFER without screwing up "reset offset to zero" vs. "keep appending". But that should be working fine with the previous patch's refactor. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
Kenneth Graunke	08e04ddd2c	iris: Rework zeroing of stream output buffer offsets The previous mechanism was a bit fragile. We stored the zero offset in the pre-baked packet, and used a flag to override 0xFFFFFFFF (append) offsets until our first emit - then prohibited anyone from trying to re-emit the packet by flagging IRIS_DIRTY_SO_BUFFERS, because that would re-emit the version with the zeroing of the offset. Now, we always store 0xFFFFFFFF in the pre-baked packet, and use a flag to override it to zero on the first emit. That way, we can re-emit that packet at any time, and it'll just keep appending. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
Kenneth Graunke	e40fafa991	iris: Defer stream output target space allocation until set time In the future, Marek is planning to make u_threaded_context call create_stream_output_target() from a different thread than the main driver thread, which means that we can't safely use uploaders there. To prepare for this eventual future, just defer the allocation of the offset BO 'til later. It's a very small amount of overhead. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:20 -08:00
Kenneth Graunke	5659460af4	iris: Defer uploading of surface states With u_threaded_context, create_surface and create_sampler_view will be called from a different thread than the driver thread. They aren't allowed to access the context, which means that they can't use the uploaders there to upload our SURFACE_STATE entries. Thanks to backing-storage replacement and iris_rebind_buffer, we already reworked things to maintain CPU-side copies of the SURFACE_STATE entries and added the ability to upload or re-upload them later. So we can skip the upload at object creation time, and add a simple resource-is-NULL check at binding table upload time to ensure that they get uploaded by the time we need them. (They might get uploaded earlier due to rebinds or clear color updates, but this is the last moment to do so.) Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:20 -08:00
Eric Anholt	3bdd39f03c	lima: avoid stomping over bound shader state when creating new shaders It shouldn't affect bound program state, and the current context state shouldn't be relevant for shader creation precompiles anyway (level load isn't going to have the eventual set of sampler views bound when you go to draw with that shader). Closes: #4306 Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9089>	2021-03-04 18:34:35 +00:00
Eric Anholt	4ac3f85054	lima: upload the shader to a BO at shader creation No need to conditionally upload later. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9089>	2021-03-04 18:34:35 +00:00
Eric Anholt	5a550c8dc7	lima: don't look at dirty bits for setup of FS key You always have to populate the key with the right texture swizzles, even if textures haven't changed since binding a new shader. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9089>	2021-03-04 18:34:35 +00:00
Eric Anholt	d4f706389c	lima: stop encoding the texture format in the shader key We can compose the swizzles at sampler view creation time, saving recompiles on texture format changes. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9089>	2021-03-04 18:34:34 +00:00
Lionel Landwerlin	8023d6de20	anv: implement INTEL_DEBUG=submit Name all the BOs! v2: Fix 32bit build issue (Thanks Marge!) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5736>	2021-03-04 19:46:24 +02:00
Rohan Garg	c6eb84ff30	virgl: Add support for querying detailed memory info This allows for virgl guests to expose GL_NVX_gpu_memory_info and GL_ATI_meminfo when the extensions are supported on the host. Signed-off-by: Rohan Garg <rohan.garg@collabora.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9337>	2021-03-04 17:14:14 +01:00
Jason Ekstrand	1e53e0d2c7	intel/mi_builder: Drop the gen_ prefix mi_ is already a unique prefix in Mesa so the gen_ isn't really gaining us anything except extra characters. It's possible that MI_ may conflict a tiny bit with GenXML but it doesn't seem to be a problem today and we can deal with that in the future if it's ever an issue. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9393>	2021-03-04 15:14:27 +00:00
Jason Ekstrand	6d522538b6	intel: Rename gen_mi_builder.h to mi_builder.h Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9393>	2021-03-04 15:14:27 +00:00
Danylo Piliaiev	7e25e5b56f	ir3: disallow moving memory writes over discard Writes to global memory should not be moved over discard, otherwise we could have unintended side-effects or lack of side-effects where they should be observed. Fixes tests: dEQP-VK.rasterization.frag_side_effects.color_at_beginning.kill dEQP-VK.rasterization.frag_side_effects.color_at_end.kill Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9365>	2021-03-04 11:40:58 +00:00
Juan A. Suarez Romero	7b3b8524ef	ci: Bump deqp to vk-gl-cts 1.2.5.2 Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9369>	2021-03-04 11:09:35 +00:00
Danylo Piliaiev	72a9f315db	ir3: make mark_kill_path exit early if instr is already seen Would bring down its complexity in pathological cases. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9386>	2021-03-04 10:52:06 +00:00
Danylo Piliaiev	9dbb678f5a	ir3: prevent duplication of instruction's dependencies Otherwise mark_kill_path() is happy to take exponential time to finish. It was possible to have such chains: ... stib.base0 imm[0.000000,0,0x0], ssa_233, ssa_234, false-deps:ssa_231, ssa_231 stib.base0 imm[0.000000,0,0x0], ssa_237, ssa_238, false-deps:ssa_235, ssa_235 stib.base0 imm[0.000000,0,0x0], ssa_241, ssa_242, false-deps:ssa_239, ssa_239 stib.base0 imm[0.000000,0,0x0], ssa_245, ssa_246, false-deps:ssa_243, ssa_243 stib.base0 imm[0.000000,0,0x0], ssa_249, ssa_250, false-deps:ssa_247, ssa_247 stib.base0 imm[0.000000,0,0x0], ssa_105, ssa_253, false-deps:ssa_251, ssa_251 stib.base0 imm[0.000000,0,0x0], ssa_109, ssa_256, false-deps:ssa_254, ssa_254 stib.base0 imm[0.000000,0,0x0], ssa_113, ssa_259, false-deps:ssa_257, ssa_257 stib.base0 imm[0.000000,0,0x0], ssa_117, ssa_262, false-deps:ssa_260, ssa_260 stib.base0 imm[0.000000,0,0x0], ssa_265, ssa_266, false-deps:ssa_263, ssa_263 stib.base0 imm[0.000000,0,0x0], ssa_269, ssa_270, false-deps:ssa_267, ssa_267 stib.base0 imm[0.000000,0,0x0], ssa_273, ssa_274, false-deps:ssa_271, ssa_271 ... Fixes tests: dEQP-VK.geometry.layered.cube_array.36_36_12.secondary_cmd_buffer_inherit_framebuffer dEQP-VK.geometry.layered.3d.64_64_8.secondary_cmd_buffer_inherit_framebuffer dEQP-VK.geometry.layered.cube_array.64_64_12.secondary_cmd_buffer_inherit_framebuffer Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9386>	2021-03-04 10:52:06 +00:00
Samuel Pitoiset	517600b4d5	Revert "radv: stop using VM_ALWAYS_VALID on APUs" Disabling VM_ALWAYS_VALID actually hurts more than it helps after doing more testing. Managing the global BO list in userspace is really costly and makes a bunch of games CPU bound. I think re-enabling VM_ALWAYS_VALID is a step in the right direction. This reverts commit `6ac6e2fbfb`. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9341>	2021-03-04 09:37:59 +00:00
Gert Wollny	e148d5ec99	r600/sfn: lower intrinsic_load_tess_coord to driver version Fixes KHR-GL45.tessellation_shader.tessellation_shader_tessellation.TCS_TES KHR-GL45.tessellation_shader.tessellation_shader_tessellation.TES Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9373>	2021-03-04 09:14:03 +00:00
Gert Wollny	81b41e0c76	nir: Add r600 specific intrinsic for loading the tesselation coords Only the XY pair is provided directly; the Z value has to be deduced from the primitive type. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9373>	2021-03-04 10:14:03 +01:00
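
Note on 72a9f315db and 9dbb678f5a (ir3): the early exit described for mark_kill_path() can be illustrated with a small sketch. This is not the ir3 code, which is C and works on ir3 instructions. It is a minimal Python model, assuming a dependency graph stored as a plain dict; the names `deps`, `seen`, and `build_chain` are hypothetical. Each node lists its predecessor twice, mimicking the duplicated `false-deps` entries in the log above. Without the `seen` check, the number of recursive calls would double at every level of the chain.

```python
def mark_kill_path(node, deps, seen):
    """Mark `node` and everything it depends on.

    Returns how many nodes were newly marked by this call.
    `deps` maps a node to a list of nodes it depends on (hypothetical model).
    """
    # Early exit: a node that was already marked has had its whole
    # dependency subtree walked, so there is nothing left to do.
    if node in seen:
        return 0
    seen.add(node)
    marked = 1
    for dep in deps.get(node, ()):
        marked += mark_kill_path(dep, deps, seen)
    return marked


def build_chain(length):
    # Node i depends on node i - 1 twice (a duplicated dependency).
    return {i: [i - 1, i - 1] for i in range(1, length)}


deps = build_chain(200)
seen = set()
newly_marked = mark_kill_path(199, deps, seen)
```

With the early exit, each of the 200 nodes is marked once and the walk is linear in the size of the graph. A second call on the same root returns immediately because the root is already in `seen`.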

136055 Commits