third_party_mesa3d

Author	SHA1	Message	Date
Alyssa Rosenzweig	d49fdca229	pan/midgard: Identify 64-bit atomic opcodes They are symmetric to their 32-bit counterparts, just shifted. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	6601570ead	pan/midgard: Debug mir_insert_instruction_after_scheduled Add some comments explaining what's going on in a more natural flow in order to solve the actual bug. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Fixes: `2d914ebe81` ("pan/midgard: Fix memory corruption in register spilling")	2019-10-20 12:02:31 +00:00
Christian Gmeiner	a6de05a968	etnaviv: keep track of buffer valid ranges for PIPE_BUFFER This allows a write to proceed to an uninitialized part of a buffer even when the GPU is using the previously-initialized portions. Such a situation can be triggered with the following API usage example: glBufferSubData(..., offset, size, data1); glDrawArrays(...); // append new vertex data glBufferSubData(..., offset+size, size, data2); glDrawArrays(...); Same is done for freedreno, nouveau and radeon. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-10-20 09:03:06 +00:00
Christian Gmeiner	eab6d75066	etnaviv: store updated usage in pipe_transfer object Store the changed usage in the newly created transfer object. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-10-20 09:03:06 +00:00
Christian Gmeiner	cd4528563f	etnaviv: fix code style Fixes: `1194afdfe3` ("etnaviv: rework the stream flush to always go through the context flush") Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-10-20 10:20:22 +02:00
Lionel Landwerlin	b30e01aef5	anv: fix memory leak on device destroy v2: handle vma destruction if vkCreateDevice fails (Jordan) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1959 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-20 08:02:22 +00:00
Christian Gmeiner	f834656a41	etnaviv: fix compile warnings Fixes: `e5cc66dfad` ("etnaviv: Rework locking") Fixes: `1456aa61cc` ("etnaviv: Rework resource status tracking") Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-10-20 08:28:18 +02:00
Eric Anholt	d8741ad251	mesa: Redefine the RG formats as array formats. This is the layout used in the GL API, and maps directly to PIPE formats with no endianness trickery. As with the LA change, this fixes big-endian fetching from texbos. Also cleans up some endian shenanigans in shader images. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-20 04:39:48 +00:00
Eric Anholt	4f384ddf5f	gallium: Drop the unused PIPE_FORMAT_AL formats. Now that Mesa is also using an array format for LA, nothing was using these. (And, clearly, no HW driver had exposed them). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-20 04:39:48 +00:00
Eric Anholt	6a819cabe8	mesa: Replace MESA_FORMAT_L8A8/A8L8 UNORM/SNORM/SRGB with an array format. The array format is what the GL API wants (fixing texbos on big-endian), and matches directly to gallium's corresponding array format. The only driver exposing A8L8 was radeon/r200 in big-endian, where the HW's underlying format was trying to read as array and we needed to flip things around to make our packed format come out right (note that while the radeon format tables had both AL and LA, ChooseTextureFormat would only pick one of them based on endianness). v2: Don't make r200/radeon use endian swaps. v3: Rebase on dropping the r200 _be/_le format table removal patch v4: reword commit message to explain why we can drop both formats from radeon. Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2019-10-20 04:39:48 +00:00
Eric Anholt	236b478b2e	mesa: Replace the LA16_UNORM packed formats with one array format. The array format is what the GL API wants (and we made a mistake in the format returned for texbos on big-endian!), and it's exactly what the gallium-side PIPE_FORMAT_L16A16 is. The only downside is that dri_util tries to fall back to sampling RG16 using LA16, which doesn't have a match for big-endian any more. No HW drivers supported A16L16 anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-20 04:39:48 +00:00
Eric Anholt	1165e3f360	radeon: Drop the unused first arg of OUT_BATCH_RELOC. This was a trap when trying to figure out how to fit data bits into the reloc. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-20 04:39:48 +00:00
Eric Anholt	2a548cf92f	radeon: Fill in the TXOFFSET field containing the tile bits in our relocs. The first arg to OUT_BATCH_RELOC is ignored, we actually wanted these in the third arg. They're always 0 so far, so it didn't matter. v2: Reword commit message that I don't end up using the tile bits, but keep the commit as a cleanup anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2019-10-20 04:39:48 +00:00
Eric Anholt	ecddabfa76	r100/r200: factor out txformat/txfilter setup from the TFP path. No matter what, we deref the texFormat from the table, except for a mistake in cpp=4 where we pulled a 0 out of the table either way. v2: Rebase on dropping r200 table deduplication patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2019-10-20 04:39:48 +00:00
Vasily Khoruzhick	7ceafa4b40	lima: fix PP stack size PP stack size should be set to maximum PP stack size, not to stack size of last shader. Fixes: `27e7603c34` ("lima: fix ppir spill stack allocation") Tested-by: Icenowy Zheng <icenowy@aosc.io> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-10-19 18:15:18 -07:00
Marijn Suijten	224b267282	freedreno/a5xx: enable a510 Kernel support for this GPU is added by the following series: https://patchwork.kernel.org/project/linux-arm-msm/list/?series=187609 In particular https://patchwork.kernel.org/patch/11189953/ Tested on Sony Xperia X and X Compact. Signed-off-by: Marijn Suijten <marijns95@gmail.com> Tested-by: AngeloGioacchino Del Regno <kholk11@gmail.com>	2019-10-19 16:48:24 +02:00
Prodea Alexandru-Liviu	48d617118a	Appveyor/Meson: Add build test of osmesa gallium Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com> Acked-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-19 14:44:44 +00:00
Lionel Landwerlin	3f8f52b241	anv: fix vkUpdateDescriptorSets with inline uniform blocks With inline uniform blocks descriptor, the meaning of descriptorCount is a number of bytes to copy into the descriptor. Don't try to use that size as an index into the descriptor table. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `43f40dc7cb` ("anv: Implement VK_EXT_inline_uniform_block") Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1195 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-19 13:16:40 +03:00
Rob Clark	1cea76274e	freedreno/ir3: handle imad24_ir3 case in UBO lowering Similar to iadd, we can fold an added constant value from an imad24_ir3 into the load_uniform's constant offset. This avoids some cases where the addition of imad24_ir3 could otherwise be a regression in instr count. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	d9424e5821	freedreno/ir3: add imul24 opcode This maps to mul.s24 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	c7b8f16bee	freedreno/ir3: optimize immed 2nd src to mad We can't encode immed sources for cat3 (mad) instructions, but we can use const in first or third src. We handled this case already, but we weren't considering that we could lower immed to const. For manhattan: total instructions in shared programs: 35202 -> 34718 (-1.37%) instructions in affected programs: 14931 -> 14447 (-3.24%) helped: 90 HURT: 0 total full in shared programs: 2451 -> 2359 (-3.75%) full in affected programs: 653 -> 561 (-14.09%) helped: 69 HURT: 2 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 15:08:54 -07:00
Rob Clark	666b6236f7	freedreno/ir3: add rule to generate imad24 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	5e08f070f0	nir: add nir_lower_amul pass Lower amul to either imul or imul24, depending on whether 24b is enough bits to calculate an offset within the thing being dereferenced. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-10-18 15:08:54 -07:00
Rob Clark	1bdde31392	nir: add address calc related opt rules Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	6320e37d4b	nir: add amul instruction Used for address/offset calculation (ie. array derefs), where we can potentially use less than 32b for the multiply of array idx by element size. For backends that support `imul24`, this gives a lowering pass an easy way to find multiplies that potentially can be converted to `imul24`. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	0568761f8e	nir: Add a new ALU nir_op_imul24 Some hardware can do 24b multiply in a single instruction, but not 32b. However in most cases 24b is sufficient for address/offset calculation. Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Eduardo Lima Mitev	bc2ccdc45a	freedreno/ir3: Handle newly added opcode nir_op_imad24_ir3 Simply emit an ir3_MAD_S24 instruction in the backend. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Eduardo Lima Mitev	32e5fbf47c	nir: Add a new ALU nir_op_imad24_ir3 ir3 compiler has a signed integer multiply-add instruction (MAD_S24) that is used for different offset calculations in the backend. Since we intend to move some of these calculations to NIR, we need a new ALU op that can directly represent it. Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	6ad442acae	freedreno/ir3: rename mul.s/mul.u to mul.s24/mul.u24, to better reflect that these are 24b multiply. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	ad8167c1e0	nir/search: fix the PoT helpers Otherwise, if the base type is (for example) uint32, we would incorrectly think that PoT optimizations could not apply. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	f30c256ec0	freedreno/ir3: enable pre-fs texture fetch for a6xx Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	72048dd799	turnip: add support for pre-fs texture fetch Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	a5afcc76d5	freedreno/a6xx: add support for pre-fs texture fetch Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Hyunjun Ko	e9450ad27d	freedreno/ir3: Add support for texture sampling pre-dispatch Signed-off-by: Eduardo Lima Mitev <elima@igalia.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Eduardo Lima Mitev	2a0d45ae6c	freedreno/ir3: Add a NIR pass to select tex instructions eligible for pre-fetch The pass should run once at the end of shader compilation, for a4xx onwards. It iterates over texture sampling instructions and marks those eligible for pre-dispatch by changing the tex op from 'tex' to 'tex_prefetch'. An instruction is eligible if: * The coordinate is a vector where all its components come from a shader input. * The order of the components matches exactly that of the input (no swizzles). * The instruction is in the 'main' function, and in the outermost block. The first two restrictions were arrived at empirically, so more testing could tighten or loosen them. The 3rd restriction is there to allow moving the instructions eligible for pre-dispatch to the beginning of the shader, so that we don't block the registers holding the result for too long. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	7d4213fe88	freedreno/ir3: force i/j pixel to r0.x It seems that pre-fs texture fetch only works if ij_pix ends up in r0.x. I've tried unknown zero bits, to no avail, and blob also seems to force r0.x when this feature is used. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	07e9bf564f	freedreno/ir3: add pre-dispatch tex fetch to disasm Useful to see in disassembly listing texture fetches that were moved to pre-dispatch. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	2b93eb9c76	freedreno/ir3: add dummy bary.f(ei) for pre-fs-fetch If the only use of varyings is a pre-shader texture-fetch, we still need to issue a bary.f with the end-input flag, otherwise we'll block further VS invocations, as the hw will think varying storage is still busy. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	392a309a55	freedreno/ir3: fixup register footprint to account for prefetch It is possible that the result of a pre-fs texture fetch is an output (or partially an output) of the FS. Since the meta:tex_prefetch instructions are dropped before the assembler, we need to account for this when we fixup the register footprint. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	482e1b9955	freedreno/ir3: add meta instruction for pre-fs texture fetch Add a placeholder instruction to track texture fetches made prior to FS shader dispatch. These, like meta:input instructions are scheduled before any real instructions, so that RA realizes their result values are live before the first real instruction. And to give legalize a way to track usage of fetched sample requiring (sy) sync flags. There is some related special handling for varying texcoord inputs used for pre-fs-fetch, so that they are not DCE'd and remain in linkage between FS and previous stage. Note that we could almost avoid this special handling by giving meta:tex_prefetch real src arguments, except that in the FS stage, inputs are actual bary.f/ldlv instructions. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	11e467c378	freedreno/ir3: don't DCE ij_pix if used for pre-fs-texture-fetch When we enable pre-dispatch texture fetch, we could have a scenario where the barycentric i/j coord sysval is not used in the shader, but only used for the varying fetch for the pre-dispatch texture fetch. In this case we need to take care not to DCE this sysval. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	af817a44c1	freedreno/ir3: track sysval slot for inputs Will be needed for special handling of SYSTEM_VALUE_BARYCENTRIC_PIXEL (ij_pix) when pre-fs texture fetch is enabled. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	35692fab86	freedreno/ir3: remove unused ir3_instruction::inout Not sure I remember how long this has been unused for. But it's unused now. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Hyunjun Ko	fd14788e1f	freedreno/ir3: Add data structures to support texture pre-fetch Signed-off-by: Eduardo Lima Mitev <elima@igalia.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	766a68cdb9	freedreno: update registers Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Eduardo Lima Mitev	f1d4fadf1b	nir: Add new texop nir_texop_tex_prefetch This is like nir_texop_tex, but signals that the sampling coordinates are immutable during the shader stage, in a way that allows the HW that supports pre-dispatching sampling operations to pre-fetch the result prior to scheduling the shader stage. This is introduced to support the feature in Freedreno. Adreno HW from a4xx supports it. A NIR pass introduced later in this series will detect sampling operations that are eligible for pre-dispatch, and replace nir_texop_tex by this new op, to tell the backend to enable pre-fetch. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Eric Engestrom	27df3e015b	osmesa: add missing #include <stdint.h> Fixes: `281466332b` ("gallium/osmesa: Introduce a test.") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1947 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-18 22:07:21 +01:00
Dylan Baker	1ce23b5653	docs: Add new feature for compiling for windows with meson Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-18 13:02:58 -07:00
Dylan Baker	0b6b7ff3ca	appveyor: Move appveyor script into .appveyor directory This clears out the scripts directory completely Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-18 13:02:58 -07:00
Dylan Baker	fbb969b98a	appveyor: Add support for building llvmpipe with meson Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-18 13:02:58 -07:00
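
The imul24 and nir_lower_amul commits above (0568761f8e, 5e08f070f0) describe a multiply that only needs 24 bits and a pass that picks it when 24b is enough for the offset. A rough Python sketch of that arithmetic follows; it is an illustration only, not Mesa's code. The helper names (sign_extend_24, imul24, imul32, pick_mul_op) and the signed 24-bit size bound used as the selection rule are assumptions made for this example.

```python
def to_int32(x: int) -> int:
    """Wrap an arbitrary Python int to a signed 32-bit value."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def sign_extend_24(x: int) -> int:
    """Treat the low 24 bits of x as a signed two's-complement number."""
    x &= 0xFFFFFF
    return x - 0x1000000 if x & 0x800000 else x


def imul24(a: int, b: int) -> int:
    """24-bit multiply: bits above bit 23 of each operand are ignored."""
    return to_int32(sign_extend_24(a) * sign_extend_24(b))


def imul32(a: int, b: int) -> int:
    """Full 32-bit multiply, for comparison."""
    return to_int32(a * b)


def pick_mul_op(max_offset_bytes: int) -> str:
    """Assumed selection rule: use imul24 only when every offset inside
    the dereferenced object fits in the positive signed 24-bit range."""
    return "imul24" if max_offset_bytes < (1 << 23) else "imul"


if __name__ == "__main__":
    # Small index times element size: both multiplies agree.
    print(imul24(100, 16), imul32(100, 16))
    # An operand with bit 24 set shows where they diverge.
    print(imul24(0x1000001, 2), imul32(0x1000001, 2))
    print(pick_mul_op(4096), pick_mul_op(1 << 24))
```

The two multiplies agree whenever both operands fit in 24 bits, which is why a lowering pass only has to reason about the size of the thing being indexed.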


116855 Commits
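
The etnaviv valid-range commit above (a6de05a968) lets a write to an uninitialized part of a buffer proceed while the GPU still uses the initialized part. A minimal Python sketch of that idea is given here, assuming a single half-open valid range per buffer. The class and function names (ValidRange, write_needs_sync) are hypothetical and do not correspond to driver code.

```python
class ValidRange:
    """Half-open byte range [start, end) that has been written at least once."""

    def __init__(self) -> None:
        self.start = 0
        self.end = 0  # start == end means nothing is initialized yet

    def add(self, offset: int, size: int) -> None:
        """Grow the range so that it covers [offset, offset + size)."""
        if size <= 0:
            return
        if self.start == self.end:
            self.start, self.end = offset, offset + size
        else:
            self.start = min(self.start, offset)
            self.end = max(self.end, offset + size)

    def intersects(self, offset: int, size: int) -> bool:
        """True if [offset, offset + size) overlaps the valid range."""
        return (self.start < self.end
                and offset < self.end
                and offset + size > self.start)


def write_needs_sync(valid: ValidRange, offset: int, size: int,
                     gpu_busy: bool) -> bool:
    """Decide whether a CPU write must wait for the GPU, then record it.

    Only a write that touches already-valid bytes can race with the GPU.
    """
    needs_sync = gpu_busy and valid.intersects(offset, size)
    valid.add(offset, size)
    return needs_sync


if __name__ == "__main__":
    buf = ValidRange()
    # Mirrors the commit's example: upload, draw, append, draw.
    print(write_needs_sync(buf, 0, 64, gpu_busy=False))   # first upload
    print(write_needs_sync(buf, 64, 64, gpu_busy=True))   # append past valid data
    print(write_needs_sync(buf, 32, 16, gpu_busy=True))   # overwrite valid data
```

A single merged range is conservative: writing into a gap between two earlier uploads would be treated as overlapping, which costs a sync but never skips one that is needed.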