third_party_mesa3d

Author	SHA1	Message	Date
Marek Olšák	5e5573b1bf	radeonsi: disable RB+ blend optimizations for dual source blending This fixes dual source blending on Stoney. The fix was copied from Vulkan. The problem was discovered during internal testing. Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Marek Olšák	ff50c44a5f	radeonsi: set CB_BLEND1_CONTROL.ENABLE for dual source blending copied from Vulkan Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Marek Olšák	87b208a54e	radeonsi: always set all blend registers better safe than sorry Cc: 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Marek Olšák	fc9f7fc9d0	radeonsi: set the smallest possible CB_TARGET_MASK better safe than sorry; set_framebuffer_state always makes this dirty Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Marek Olšák	ea43d0b5e8	radeonsi: don't print bodies of header-only packets Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Marek Olšák	7abd94c9b0	radeonsi: print unknown registers with correct formatting Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Marek Olšák	9e1dc10432	ddebug: fix hang detection with deferred flushes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-29 23:52:31 +01:00
Dave Airlie	048143b9d9	radv: set spi_baryc_cntl.pos_float_location to 0 This fixes: dEQP-VK.pipeline.multisample_interpolation.offset_interpolate_at_sample_position.* This should probably be 2 when sample shading is enabled, but I'm not sure. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-29 22:48:23 +00:00
Dave Airlie	f3a3fea973	radv: force persample shading when required. We need to force persample shading when a) shader uses sample_id b) shader uses sample_position c) shader uses sample qualifier. Also since ps_iter_samples can now change independently of the rasterizer samples we need to move setting the regs more often. This fixes: dEQP-VK.pipeline.multisample_interpolation.centroid_interpolate_at_consistency.* dEQP-VK.pipeline.multisample_interpolation.centroid_qualifier_inside_primitive.137_191_1.* dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_distinct_values.* dEQP-VK.pipeline.multisample_interpolation.sample_qualifier_distinct_values.128_128_1.* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-29 22:48:03 +00:00
Dave Airlie	6a62026dd4	nir: print var binding in dumps. This only useful for spir-v shaders, but I keep finding myself having to add it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-29 22:07:13 +00:00
Eric Engestrom	fae5e1dc74	docs: fix small typo Fixes: `ba28f2136f` ("docs: add note about r-b/other tags when resending") Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-11-29 22:02:57 +00:00
Matt Turner	218fec66cc	i965/sched: Schedule trivial blocks. In commit `45cd76e342` schedule_instructions(bblock_t *) began setting bblock_t::cycle_count, but that function was not called on trivial blocks. Remove the code to skip trivial blocks so that cycle_count is set. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-11-29 11:53:36 -08:00
Matt Turner	cab0952d4b	i965/sched: Make 'time' a local variable. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-11-29 11:53:36 -08:00
Matt Turner	b0156702fa	i965/cfg: Initialize bblock_t::cycle_count. schedule_instructions(bblock_t *) isn't called on blocks with a single instruction, and since it is the only thing that set cycle_count, cycle_count would be uninitialized. A non-empty block with bblock_t::cycle_count == 0 is arguably a bug. That'll be fixed in the next commit. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-11-29 11:53:36 -08:00
Matt Turner	ca9e30e002	i965/cfg: Initialize cfg_t::cycle_count. This reverts commit `b4001af174`. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-11-29 11:53:36 -08:00
Bas Nieuwenhuizen	b8c9ce4459	ac/nir: Fix accessing an unitialized value. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-29 20:13:28 +01:00
Bas Nieuwenhuizen	029e8ff81c	radv: Initialize the shader_stats_dump flag. Meta was using it before it was set. I suspect we typically don't want to dump meta shaders, so just set it to false in the beginning. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-29 20:13:28 +01:00
Eric Anholt	d40a3212ae	vc4: Add a note for the future about texture latency calculation. Debugging a shader-db reported cycle count regression from the tex coalescing, I eventually figured out that the texture latencies were totally bogus. Really fixing it will probably involve mirroring vc4_qir_schedule.c's texture fifo management here.	2016-11-29 09:01:23 -08:00
Eric Anholt	4690a93b12	vc4: Add support for coalescing ALU ops into tex_[srtb] MOVs. This isn't as complete as I would like (can't merge interpolation because of the implicit r5 dependency, doesn't work with control flow), but this was cheap and easy. Improves 3DMMES Taiji performance by 1.15353% +/- 0.299896% (n=29, 16) total instructions in shared programs: 99810 -> 99059 (-0.75%) instructions in affected programs: 10705 -> 9954 (-7.02%)	2016-11-29 08:52:50 -08:00
Eric Anholt	f4baf80993	vc4: Restructure VPM write optimization into two passes. For texturing, there won't be a fixed limit on how many writes there are, so we need to compute uses up front.	2016-11-29 08:38:59 -08:00
Eric Anholt	a025983dd9	vc4: Make qir_for_each_inst_inorder() safe against removal. The dead code elimination wants it to be safe, and I actually got segfaults due to it being unsafe with the new coalescing pass.	2016-11-29 08:38:59 -08:00
Eric Anholt	27544ea8d3	vc4: Split optimizing VPM writes from VPM reads. The VPM write logic will be basically the same as the texture coordinate write logic we need, and it's not really related to the VPM read logic other than the reuse of the use_count array.	2016-11-29 08:38:59 -08:00
Eric Anholt	d4c20e82ae	vc4: Restructure texture insts as ALU ops with tex_[strb] as the dst. For now we're still just generating MOVs, but this will let us fold into other ops in the future. No difference on shader-db.	2016-11-29 08:38:59 -08:00
Eric Anholt	314f0c57e4	vc4: Refactor qir_get_op_nsrc(enum qop) to qir_get_nsrc(struct qinst *). Every caller was dereffing the qinst, and this will let us make the number of sources vary depending on the destination of the qinst so that we can have general ALU ops that store to tex_[strb] and get an implicit uniform.	2016-11-29 08:38:59 -08:00
Eric Anholt	51087327f2	vc4: Replace the qinst src[] with a fixed-size array. This may have made a tiny bit of sense when we had one 4-arg inst per shader, but if we only ever put 2 things in, having a pointer to 2 things almost every instruction is pointless indirection.	2016-11-29 08:38:59 -08:00
Eric Anholt	a220f1b5a9	vc4: Remove qir_inst4(). This was used originally for unorm4x8 packs, but we now represent those as a series of packed movs.	2016-11-29 08:38:59 -08:00
Ilia Mirkin	7a8def8c18	anv: bump the texture gather offset limits This matches what NVIDIA and AMD hardware expose, as well as what Intel hardware supports. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 07:44:01 -08:00
Ilia Mirkin	62b8dbf35e	i965/gen7: expose larger gather offsets This matches the capabilities of the hardware. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 07:44:01 -08:00
Ilia Mirkin	4f2d1d6ea7	i965: support constant gather offsets larger than 4 bits Offsets that don't fit into 4 bits need to force gather_po to be selected. Adjust the logic so that this happens. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-29 07:44:01 -08:00
Jason Ekstrand	faf20df143	i965/fs: Refactor handling of constant tg4 offsets Previously, we had an OFFSET_VALUE source for logical texture instructions that was intended to mean exactly what it says, "offset". In reality, we only fully used it for tg4 offsets. We used offset_value.file == IMM to mean, "you have a constant offset, go look in instr->offset" and didn't actually use the contents of the register at all in that case except for in nir_emit_texture where we used it as a temporary before we copy it into instr->offset. This commit renames OFFSET_VALUE to TG4_OFFSET and restricts its usage to indirect tg4 offsets only. The nir_emit_texture code is refactored so that we explicitly build a header_bits value which is placed in instr->offset and the constant offset values (both for tg4 and regular texture operations) are used to construct header_bits and don't go through the offset source at all. Finally, we stop passing offset_value in to lower_sampler_logical_send_gen5 because we can't do indirect offsets until gen7 anyway. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-29 07:44:01 -08:00
Bas Nieuwenhuizen	05533ce418	radv: Use different intrinsic for ubo loads. Not sure about the deprecation path, but this intrinsic can be lowered to SMEM loads. This results in a significant Talos performance improvement. v2: Fix for LLVM attribute changes. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-11-29 08:36:16 +01:00
Timothy Arceri	0303201dfb	mesa: fix active subroutine uniforms properly `07fe2d565b` introduced a big hack in order to return NumSubroutineUniforms when querying ACTIVE_RESOURCES for <shader>_SUBROUTINE_UNIFORM interfaces. However this is the wrong fix we are meant to be returning the number of active resources i.e. the count of subroutine uniforms in the resource list which is what the code was previously doing, anything else will cause trouble when trying to retrieve the resource properties based on the ACTIVE_RESOURCES count. The real problem is that NumSubroutineUniforms was counting array elements as separate uniforms but the innermost array is always considered a single uniform so we fix that count instead which was counted incorrectly in `7fa0250f9`. Idealy we could probably completely remove NumSubroutineUniforms and just compute its value when needed from the resource list but this works for now. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: 13.0 <mesa-stable@lists.freedesktop.org>	2016-11-29 15:29:51 +11:00
Jason Ekstrand	f469235a6e	anv/cmd_buffer: Remove the 1-D case from the HiZ QPitch calculation The 1-D special case doesn't actually apply to depth or HiZ. I discovered this while converting BLORP over to genxml and ISL. The reason is that the 1-D special case only applies to the new Sky Lake 1-D layout which is only used for LINEAR 1-D images. For tiled 1-D images, such as depth buffers, the old gen4 2-D layout is used and the QPitch should be in rows. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-28 20:17:29 -08:00
Jason Ekstrand	d4ef87c1bb	anv/cmd_buffer: Set the correct surface type for depth/stencil Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-11-28 20:17:16 -08:00
Ilia Mirkin	e6847f24f0	anv: enable drawIndirectFirstInstance This was already piped through in the CmdDraw(Indexed)Indirect handling. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-28 19:32:14 -08:00
Ilia Mirkin	d2280a007a	anv: expose depthBiasClamp, it is already set The gen7/8_cmd_buffer logic already sets the clamp, and it's piped through via the dynamic state. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-28 19:32:14 -08:00
Ilia Mirkin	e2c669a56b	anv: bump maxFramebufferLayers to 2048 This matches maxImageArrayLayers, as well as the same setting in the GL frontend. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-11-28 19:32:14 -08:00
Ilia Mirkin	76b97d544e	anv: enable storage image extended formats These are all regularly available in desktop GL, so the backend fully supports them. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-28 19:32:14 -08:00
Ilia Mirkin	a34f89c5e6	anv: expose imageCubeArray functionality This appears to be fully supported already. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-11-28 19:32:13 -08:00
Dave Airlie	eaf0768b8f	radv: set maxFragmentDualSrcAttachments to 1 Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-29 13:27:26 +10:00
Dave Airlie	f9ab60202d	anv: set maxFragmentDualSrcAttachments to 1 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-11-29 13:26:53 +10:00
Ilia Mirkin	e0fc18a435	swr: [rasterizer memory] only clear up to the LOD size Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-28 20:14:48 -05:00
Ilia Mirkin	2fca08e550	swr: [rasterizer memory] hook up stencil clears for ClearTile Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-28 20:14:48 -05:00
Ilia Mirkin	5582610ea1	swr: [rasterizer memory] add support for clearing Z32F_X32 and Z16 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2016-11-28 20:14:48 -05:00
Jason Ekstrand	6bc8bef1a1	intel/aubinator: Pull useful information from the AUB header This commit does two things. One is to pull useful and/or interesting information from the AUB file header and display it as a header above your decoded batches. Second, it is now capable of pulling the PCI ID from the AUB file comment left by intel_aubdump. This removes the need to use the --gen flag all the time. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-28 16:45:09 -08:00
Jason Ekstrand	da5ebeffdf	intel/aubinator: Wait to setup decoders until we parse the aub header This requires that a few more state bits become global. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-28 16:45:09 -08:00
Jason Ekstrand	e6c01fb17d	intel/aubinator: Rework handling of the --gen flag This makes it just store the pci_id instead of a struct pointer Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-28 16:45:09 -08:00
Jason Ekstrand	12f2eae7e7	intel/aubinator: Trust the packet size in the header for SUBOPCODE_HEADER We were reading from the "comment size" dword and incrementing by that amount. This never caused a problem because that field was always zero. However, experimenting with actual aub file comments indicates, the simulator seems to include the comment size in the packet size provided in the header. We should do the same. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-28 16:45:09 -08:00
Jason Ekstrand	89bb515e91	intel/aubinator: Add a get_offset helper The helper automatically handles masking for us so we don't have to worry about whether or not something is in the bottom bits. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-28 16:45:09 -08:00
Jason Ekstrand	318cf3ffa4	intel/aubinator: Fix the kernel start pointer for 3DSTATE_HS Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-28 16:45:09 -08:00


87313 Commits