third_party_mesa3d

Author	SHA1	Message	Date
Samuel Pitoiset	7562a2cbe3	radv: fix vkUpdateDescriptorSets with inline uniform blocks descriptorCount is the number of bytes into the descriptor, so it shouldn't be used as an index. srcArrayElement/dstArrayElement specify the starting byte offset within the binding to copy from/to. This fixes new CTS tests: dEQP-VK.binding_model.descriptor_copy.*.inline_uniform_block_* dEQP-VK.binding_model.descriptor_copy.*.mix_3 dEQP-VK.binding_model.descriptor_copy.*.mix_array1 Fixes: `8d2654a419` ("radv: Support VK_EXT_inline_uniform_block.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-23 09:59:22 +02:00
Samuel Pitoiset	9c92a21fe5	radv/gfx10: fix 3D images GFX10 does act like GFX9 actually. This fixes dEQP-VK.glsl.texture_functions.query.texturesize.sampler3d_. Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-23 09:45:49 +02:00
Samuel Pitoiset	41ace1d939	radv/gfx10: re-enable fast depth/stencil clears with separate aspects It used to cause weird issues on GFX10 in the past with vkmark and Wreckfest, and they can't be reproduced now. Shadow Of Mordor (Vulkan beta) hits that path and it works fine. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-23 09:18:06 +02:00
Samuel Pitoiset	956d825ed8	radv: do not emit rbplus if attachments are undefined Fixes some crashes with dEQP-VK.geometry.layered.*.secondary_cmd_buffer on Raven and other chips that allow rbplus. This just prevents a crash and rbplus probably needs more work. Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-23 08:57:31 +02:00
Samuel Pitoiset	411ad8e7c5	radv: add an assertion in radv_gfx10_compute_bin_size() To prevent out of bounds access. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-23 08:33:12 +02:00
Samuel Pitoiset	f4ab58c1a0	radv: do not create meta pipelines with 16 samples The driver only supports up to 8 samples, so it's useless to create more pipelines than needed. This fixes a conditional jump reported by Valgrind on GFX10: ==194282== Conditional jump or move depends on uninitialised value(s) ==194282== at 0xDBF925A: radv_gfx10_compute_bin_size (radv_pipeline.c:3242) ==194282== by 0xDBF95A6: radv_pipeline_generate_binning_state (radv_pipeline.c:3334) ==194282== by 0xDBFC1A0: radv_pipeline_generate_pm4 (radv_pipeline.c:4440) ==194282== by 0xDBFD15E: radv_pipeline_init (radv_pipeline.c:4764) ==194282== by 0xDBFD23E: radv_graphics_pipeline_create (radv_pipeline.c:4788) ==194282== by 0xDBB95A3: create_pipeline (radv_meta_clear.c:114) ==194282== by 0xDBB9AC5: create_color_pipeline (radv_meta_clear.c:297) ==194282== by 0xDBBCF05: radv_device_init_meta_clear_state (radv_meta_clear.c:1277) ==194282== by 0xDB9ACD9: radv_device_init_meta (radv_meta.c:363) ==194282== by 0xDB7FE3A: radv_CreateDevice (radv_device.c:2080) This is caused by an out of bound access of 'fmask_array' (ie. index is 4 as for 16 samples). Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-23 08:33:08 +02:00
Lionel Landwerlin	2b5f30b1d9	anv: implement VK_INTEL_performance_query v2: Introduce the appropriate pipe controls Properly deal with changes in metric sets (using execbuf parameter) Record marker at query end v3: Fill out PerfCntr1&2 v4: Introduce vkUninitializePerformanceApiINTEL v5: Use new execbuf extension mechanism v6: Fix comments in genX_query.c (Rafael) Use PIPE_CONTROL workarounds (Rafael) Refactor on the last kernel series update (Lionel) v7: Only I915_PERF_IOCTL_CONFIG when perf stream is already opened (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:15 +00:00
Lionel Landwerlin	5ba6d9941b	intel/perf: add mdapi writes for register perf counters Those are not part of the OA reports. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:15 +00:00
Lionel Landwerlin	a2a1873a82	intel/genxml: add RPSTAT register for core frequency Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:15 +00:00
Lionel Landwerlin	e0ab658acd	intel/genxml: add generic perf counters registers We have 2 of those we can configure to source programmable events. Those are not part of the OA reports. Configuration happens in i915 through the metric set selected by the application. On the Mesa side we'll just sample those and do a diff. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:14 +00:00
Lionel Landwerlin	11c4bf9417	intel/perf: add support for querying kernel loaded configurations We use this as a communication mechanism between MDAPI & Anv. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:14 +00:00
Lionel Landwerlin	13f802291d	drm-uapi: Update headers from drm-next Pull new updates from drm-next as of the following commit: commit f1b4a9217efd61d0b84c6dc404596c8519ff6f59 Merge: 400e91347e1d f3a36d469621 Author: Dave Airlie <airlied@redhat.com> Date: Tue Oct 22 15:04:00 2019 +1000 Merge tag 'du-next-20191016' of git://linuxtv.org/pinchartl/media into drm-next Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-23 05:41:14 +00:00
Lionel Landwerlin	db7a6847dd	intel/perf: move registers to their own header Will conflict with the genxml RPSTAT register. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:14 +00:00
Lionel Landwerlin	e1d5d75257	intel/perf: extract register configuration We want to query the content of register configurations from the kernel. Let's pull this out of the query. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:14 +00:00
Lionel Landwerlin	a338b7d739	intel/perf: expose some utility functions The Vulkan performance query extension is a bit lower level than the GL one. Expose some of the functions to do the result accumulation directly in the Anv driver. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:14 +00:00
Lionel Landwerlin	a0e0e75db1	intel/perf: add mdapi maker helper A simple utility to put the marker at the right location. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:14 +00:00
Kenneth Graunke	c352cdf970	st/mesa: Silence chatty debug printf Other debug_printf's in this file are in if (0) blocks. Trivial.	2019-10-22 18:01:41 -07:00
Chris Wilson	0899bf55d4	st/mesa: Map MESA_FORMAT_RGB_UNORM8 <-> PIPE_FORMAT_R8G8B8_UNORM This is useful for PBO texture upload with GL_RGB and GL_UNSIGNED_BYTE. v2: Vasily Khoruzhick provided an update for the Lima CI expectations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-22 22:13:14 +00:00
Lionel Landwerlin	0dfa643feb	anv: fix unwind of vkCreateDevice fail We're skipping the context destruction in some cases, which in the grand scheme of things is not that important because closing device->fd will destroy the associated context as well. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Fixes: `b30e01aef5` ("anv: fix memory leak on device destroy")	2019-10-22 20:44:26 +00:00
Rhys Perry	118a32e5ba	Revert "aco: only emit waitcnt on loop continues if we there was some load or export" We don't properly pass on ctx.lgkm_cnt/ctx.barrier_imm/etc, so this waitcnt was necessary for barriers and correctly waiting for SMEM before s_dcache_wb on GFX10. Totals from affected shaders: SGPRS: 33200 -> 33200 (0.00 %) VGPRS: 31376 -> 31376 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 2431804 -> 2433956 (0.09 %) bytes LDS: 316 -> 316 (0.00 %) blocks Max Waves: 1609 -> 1609 (0.00 %) This reverts commit `2c050b49b3`. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	964ce47abc	aco: add missing bld.scc() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	c96289a70e	aco: keep can_reorder/barrier when combining addition into SMEM Affects 30 shaders in the pipeline-db (all youngblood). Totals from affected shaders: SGPRS: 2656 -> 2456 (-7.53 %) VGPRS: 2260 -> 2260 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 240680 -> 240944 (0.11 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 90 -> 90 (0.00 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	57c2cfb608	aco: add a few missing checks in value numbering Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	a8d0101d69	aco: use ds_read2_b64/ds_write2_b64 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	bdf47a1273	aco: properly combine additions into ds_write2_b64/ds_read2_b64 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	58d4aee5df	aco: fix sparse store_lds() p_extract_vector's second operand is in units of the definition size, not dwords. v2: move extract_subvector() to right before ds_write_helper Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	a856629e8f	aco: create load_lds/store_lds helpers We'll want these for GS, since VS->GS IO on Vega is done using LDS. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	a400928f4a	aco: fix 64-bit p_extract_vector on 32-bit p_create_vector Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	f6f15859de	aco: small stage corrections Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Marek Olšák	f764725b3e	st/mesa: replace pipe_shader_state with tgsi_token* in st_vp_variant we don't need more than that Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-22 14:41:25 -04:00
Marek Olšák	a0b711d8e9	nir: allow nir_lower_uniforms_to_ubo to be run repeatedly for st/mesa Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-22 14:41:23 -04:00
Rob Clark	aa8515463e	freedreno/ir3: fixup register footprint fixup Small typo resulted in not converting footprint to vec4, meaning that we could potentially ask for quite a few more registers than required Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-22 17:46:19 +00:00
Rob Clark	4c060235a2	freedreno/ir3: handle scalarized varying inputs If the load_interpolated_input is scalarized, we would be too conservative about deciding the tex instruction wasn't a candidate to pre-fetch: vec1 32 ssa_0 = load_const (0x00000000 /* 0.000000 */) vec2 32 ssa_1 = intrinsic load_barycentric_pixel () (0) /* interp_mode=0 */ vec1 32 ssa_2 = intrinsic load_interpolated_input (ssa_1, ssa_0) (0, 0) /* base=0 */ /* component=0 */ /* packed:v_uv,v_uv1 */ vec1 32 ssa_3 = intrinsic load_interpolated_input (ssa_1, ssa_0) (0, 1) /* base=0 */ /* component=1 */ /* packed:v_uv,v_uv1 */ vec2 32 ssa_8 = vec2 ssa_2, ssa_3 vec4 32 ssa_9 = tex ssa_8 (coord), 0 (texture), 0 (sampler) Really we don't care that the texcoord components come from different load_interpolated_input instructions, just that they have consecutive varying offsets. Reported-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-22 17:46:19 +00:00
Daniel Schürmann	3a20ef4a32	aco: refactor value numbering Previously, we used one hashset per BB, so that we could always initialize the current hashset from the immediate dominator. This patch changes the behavior to a single hashmap using the block index per instruction to resolve dominance. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-22 17:18:59 +02:00
Erik Faye-Lund	3a71e1d27b	mesa/st: assert that lowering is supported Some of these lowerings aren't supported for drivers that support tessellation and geometry shaders. Let's add a couple of asserts to make it obvious if these have been enabled when it's not possible. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-22 12:07:23 +00:00
Michel Dänzer	793f6b30d9	gitlab-ci: Enable llvmpipe in ARM build jobs v2: * Use LLVM 8 from buster-backports v3: * Use LLVM 7 again for armhf, llvmpipe is still broken there with LLVM 8 Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-22 10:26:29 +00:00
Michel Dänzer	59e7f1413c	gitlab-ci: Update the meson cross file for LLVM_VERSION as well Cross builds don't use the llvm-config path from the native file.	2019-10-22 10:26:29 +00:00
Michel Dänzer	163ec5d808	gitlab-ci: Use native aarch64 runner for ARM build jobs This allows running the regression tests. One downside is that we can't easily build the Vulkan overlay layer, because only x86 binaries of the glslang validator are available. If that's important, we could either use those binaries via qemu, or build it from source. v2: * Add :amd64 suffix to existing debian-9/10 job names (Eric Engestrom) Acked-by: Eric Engestrom <eric.engestrom@intel.com> # v1	2019-10-22 10:26:29 +00:00
Michel Dänzer	c5aa2711a4	gitlab-ci: Explicitly list debian-10 in needs: for .deqp-test template Apparently needs: in a definition overwrites inherited ones. So .deqp-test effectively didn't declare needs: for debian-10, which means any jobs based on .deqp-test could spuriously run after the debian-10 job failed or was cancelled.	2019-10-22 10:26:29 +00:00
Michel Dänzer	38d42cf1d5	gitlab-ci: Bring ARM docker image install script in line with x86_64 Use https:// URLs in the APT configuration. Drop --no-install-recommends, the image generation template disables installation of recommended packages in /etc/apt/apt.conf. Run apt-get autoremove at the end, cleaning up packages which were installed to satisfy dependencies but are no longer needed. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-22 10:26:29 +00:00
Michel Dänzer	e3c7e04dfa	gitlab-ci: Sort ARM docker image packages in alphabetical order No functional change. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-22 10:26:29 +00:00
Samuel Pitoiset	a13320370e	radv: fix updating bound fast ds clear values with different aspects On GFX9, the driver is able to do an optimized fast depth/stencil clear with only one aspect (ie. clear the stencil part of a depth/stencil image). When this happens, the driver should only update the clear values of the given aspect. Note that it's currently only supported on GFX9 but I have some local patches that extend this optimized path for other gens. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1967 Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-22 11:16:13 +02:00
Sagar Ghuge	97e6d34e66	intel/compiler: Refactor disassembly of sources in 3src instruction Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Sagar Ghuge	18b28b5654	intel/compiler: Don't move immediate in register On Gen12, we support mixed mode HF/F operands, and also 3 source instruction supports immediate value support, so keep immediate as it is, if it fits properly in 16 bit field. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Sagar Ghuge	bf943bdf24	intel/compiler: Set bits according to source file On Gen >= 12, if src0 or src2 holds immediate value, we need set src[0/2]_is_imm bits instead of register file. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Sagar Ghuge	c018c5a339	intel/compiler: Add Immediate support for 3 source instruction On Gen >= 10, Either src0 or src2 can use 16-bit immediate value, but not both. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Eric Anholt	fb9362c6fb	ci: Disable lima until its farm can get fixed. It's been throwing the following error today: "<Fault -32603: 'Internal Server Error (contact server administrator for details): could not extend file "base/17952/18226": No space left on device\nHINT: Check free disk space.\n'>" Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-10-21 20:31:34 -07:00
Sagar Ghuge	7fb75ddfa7	intel: Add missing entry for brw_nir_lower_alpha_to_coverage in Makefile Fixes: `7ecfbd4f6d` ("nir: Add alpha_to_coverage lowering pass") Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-21 16:19:24 -07:00
Dave Airlie	bde08ce4d7	llvmpipe: handle compute shader launch with 0 threads If you set LP_NUM_THREADS=0 compute shaders would hang, just execute the workloads in sequence if we have no threads in the pool. Fixes: `1b24e3ba75` ("llvmpipe: add compute threadpool + mutex") Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-10-21 22:51:23 +00:00
Marijn Suijten	0141a4cdc0	freedreno/ir3: Add missing ir3_nir_lower_tex_prefetch.c to Android.mk This file is created in `2a0d45ae6c` but addition to android makefiles was omitted. It breaks the build with missing references which are defined in this file. List the file in ir3_SOURCES to make the build succeed. Signed-off-by: Marijn Suijten <marijns95@gmail.com>	2019-10-21 22:43:00 +00:00
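
Commit 7562a2cbe3 above describes how inline uniform block copies are addressed: descriptorCount is a byte count, and srcArrayElement/dstArrayElement are byte offsets within the binding. A minimal sketch of that interpretation follows. It is not the radv code; the struct and function names are hypothetical stand-ins for the relevant VkCopyDescriptorSet fields, and the bindings are modelled as plain byte buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the VkCopyDescriptorSet fields that matter here. */
struct inline_block_copy {
    uint32_t src_array_element; /* starting byte offset in the source binding */
    uint32_t dst_array_element; /* starting byte offset in the destination binding */
    uint32_t descriptor_count;  /* number of bytes to copy, not an element count */
};

/* Copy a byte range between two inline uniform block bindings (illustrative only). */
static void copy_inline_uniform_block(uint8_t *dst, size_t dst_size,
                                      const uint8_t *src, size_t src_size,
                                      const struct inline_block_copy *copy)
{
    /* Both byte ranges must stay inside their bindings. */
    assert((size_t)copy->src_array_element + copy->descriptor_count <= src_size);
    assert((size_t)copy->dst_array_element + copy->descriptor_count <= dst_size);

    /* Treat all three fields as byte quantities; none of them is used as an index. */
    memcpy(dst + copy->dst_array_element,
           src + copy->src_array_element,
           copy->descriptor_count);
}
```

With src_array_element = 4, dst_array_element = 8 and descriptor_count = 4, this copies source bytes 4..7 into destination bytes 8..11 and leaves every other destination byte untouched.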


116640 Commits