third_party_mesa3d

Author	SHA1	Message	Date
Jason Ekstrand	39925d60ec	anv: Add pipeline cache support for xfb_info Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	2aa78e46e9	anv/pipeline: Add a pdevice helper variable Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-21 11:57:00 -06:00
Karol Herbst	6fefd69724	nir: rename nir_var_ssbo to nir_var_mem_ssbo Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 20:01:41 +01:00
Karol Herbst	3afc1e068f	nir: rename nir_var_ubo to nir_var_mem_ubo Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 20:01:41 +01:00
Karol Herbst	9b24028426	nir: rename nir_var_function to nir_var_function_temp Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 20:01:41 +01:00
Jason Ekstrand	eb4b1477dc	anv/pipeline: Cache the pre-lowered NIR This adds a second level of caching for the pre-lowered NIR that's only based off of the shader module, entrypoint and specialization constants. This is enough for spirv_to_nir as well as our first round of lowering and optimization. Caching at this level should allow for faster shader recompiles due to state changes. The NIR caching does not get serialized to disk via either the VkPipelineCache serialization mechanism or the transparent on-disk cache. We could but it's usually not that expensive to fall back to SPIR-V for the odd cache miss especially if it only happens once for several misses and it simplifies the cache. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-10 19:15:27 -06:00
Jason Ekstrand	8dfda5ebbe	anv/pipeline: Hash shader modules and spec constants separately The stuff hashed by anv_pipeline_hash_shader is exactly the inputs to anv_shader_compile_to_nir so it can be used for NIR caching. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-10 19:15:27 -06:00
Jason Ekstrand	73ddfbeb85	anv/pipeline: Move wpos and input attachment lowering to lower_nir This lets us make anv_pipeline_compile_to_nir take a device instead of a pipeline. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-10 19:15:27 -06:00
Jason Ekstrand	8ea8727a87	anv/pipeline: Constant fold after apply_pipeline_layout Thanks to the new NIR load_descriptor intrinsic added by the UBO/SSBO lowering series, we weren't getting UBO pushing because the UBO range detection pass couldn't see the constants it needed. This fixes that problem with a quick round of constant folding. Because we're folding we no longer need to go out of our way to generate constants when we lower the vulkan_resource_index intrinsic and we can make it a bit simpler. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-10 20:34:00 +00:00
Karol Herbst	d0c6ef2793	nir: rename global/local to private/function memory the naming is a bit confusing no matter how you look at it. Within SPIR-V "global" memory is memory accessible from all threads. glsl "global" memory normally refers to shader thread private memory declared at global scope. As we already use "shared" for memory shared across all thrads of a work group the solution where everybody could be happy with is to rename "global" to "private" and use "global" later for memory usually stored within system accessible memory (be it VRAM or system RAM if keeping SVM in mind). glsl "local" memory is memory only accessible within a function, while SPIR-V "local" memory is memory accessible within the same workgroup. v2: rename local to function as well v3: rename vtn_variable_mode_local as well Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-08 18:51:46 +01:00
Jason Ekstrand	05d72d6d48	spirv: Sort supported capabilities Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-07 18:41:15 -06:00
Jason Ekstrand	34af63fa22	anv: Enable the new deref-based UBO/SSBO path Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	63b9aa2e25	spirv: Add support for using derefs for UBO/SSBO access For now, it's hidden behind a cap. Hopefully, we can eventually drop that along with all the manual offset code in spirv_to_nir. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	adc155a815	spirv: Add explicit pointer types Instead of baking in uvec2 for UBO and SSBO pointers and uint for push constant and shared memory pointers, make it configurable. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	fc9c4f89b8	nir: Move propagation of cast derefs to a new nir_opt_deref pass We're going to want to do more deref optimizations going forward and this gives us a central place to do them. Also, cast propagation will get a bit more complicated with the addition of ptr_as_array derefs. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	a10a450db2	anv: Advertise support for MinLod on Skylake+ These are usually used for dealing with sparse resources but there's no reason why we can't hook them up before we have sparse. We have the hardware; let's light it up. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-11 21:26:23 -06:00
Jason Ekstrand	a24654b49d	anv/nir: Rework arguments to apply_pipeline_layout Instead of taking a whole pipeline (which could be anything!), just take a physical device and robust_buffer_access boolean. This makes it easier to verify that only the things in the hash actually affect pipeline compilation. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-11-22 09:17:28 -06:00
Jason Ekstrand	617e402b3d	anv: Put robust buffer access in the pipeline hash It affects apply_pipeline_layout. Shaders compiled with the wrong value will work but they may not be robust as requested by the app. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-11-22 09:17:10 -06:00
Jason Ekstrand	d3a0d8b750	intel/compiler: Stop assuming the entrypoint is called "main" This isn't true for Vulkan so we have to whack it to "main" in anv which is silly. Instead of walking the list of functions and asserting that everything is named "main" and hoping there's only one function named "main", just use the nir_shader_get_entrypoint() helper which has better assertions anyway. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-10-30 20:14:52 -05:00
Jason Ekstrand	28bb6abd1d	nir/validate: Print when the validation failed Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-10-26 11:45:29 -05:00
Dylan Baker	8396043f30	Replace uses of _mesa_bitcount with util_bitcount and _mesa_bitcount_64 with util_bitcount_64. This fixes a build problem in nir for platforms that don't have popcount or popcountll, such as 32bit msvc. v2: - Fix additional uses of _mesa_bitcount added after this was originally written Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1) Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-09-07 10:21:26 -07:00
Jason Ekstrand	09f1de97a7	anv,i965: Lower away image derefs in the driver Previously, the back-end compiler turn image access into magic uniform reads and there was a complex contract between back-end compiler and driver about setting up and filling out those params. As of this commit, both drivers now lower image_deref_load_param_intel intrinsics to load_uniform intrinsics controlled by the driver and lower the other image_deref_* intrinsics to image_* intrinsics which take an actual binding table index. There are still "magic" uniforms but they are now added and controlled entirely by the driver and that contract no longer spans components. This also has the side-effect of making most image use compile-time binding table indices. Previously, all image access pulled the binding table index from a uniform. Part of the reason for this was that the magic uniforms made it difficult to decouple binding table indices from the uniforms and, since they are indexed completely differently (especially in Vulkan), it was hard to pull them apart. Now that the driver is handling both, it's trivial to decouple the two and provide actual binding table indices. Shader-db results on Kaby Lake: total instructions in shared programs: 15166872 -> 15164293 (-0.02%) instructions in affected programs: 115834 -> 113255 (-2.23%) helped: 191 HURT: 0 total cycles in shared programs: 571311495 -> 571196465 (-0.02%) cycles in affected programs: 4757115 -> 4642085 (-2.42%) helped: 73 HURT: 67 total spills in shared programs: 10951 -> 10926 (-0.23%) spills in affected programs: 742 -> 717 (-3.37%) helped: 7 HURT: 0 total fills in shared programs: 22226 -> 22201 (-0.11%) fills in affected programs: 1146 -> 1121 (-2.18%) helped: 7 HURT: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:03 -05:00
Jason Ekstrand	37f7983bcc	intel/compiler: Do image load/store lowering to NIR This commit moves our storage image format conversion codegen into NIR instead of doing it in the back-end. This has the advantage of letting us run it through NIR's optimizer which is pretty effective at shrinking things down. In the common case of rgba8, the number of instructions emitted after NIR is done with it is half of what it was with the lowering happening in the back-end. On the downside, the back-end's lowering is able to directly use predicates and the NIR lowering has to use IFs. Shader-db results on Kaby Lake: total instructions in shared programs: 15166910 -> 15166872 (<.01%) instructions in affected programs: 5895 -> 5857 (-0.64%) helped: 15 HURT: 0 Clearly, we don't have that much image_load_store happening in the shaders in shader-db.... Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	d9ea015ced	anv/pipeline: Lower pipeline layouts etc. after linking This allows us to use the link-optimized shader for determining binding table layouts and, more importantly, URB layouts. For apps running on DXVK, this is extremely important as DXVK likes to declare max-size inputs and outputs and this lets is massively shrink our URB space requirements. VkPipeline-db results (Batman pipelines only) on KBL: total instructions in shared programs: 820403 -> 790008 (-3.70%) instructions in affected programs: 273759 -> 243364 (-11.10%) helped: 622 HURT: 42 total spills in shared programs: 8449 -> 5212 (-38.31%) spills in affected programs: 3427 -> 190 (-94.46%) helped: 607 HURT: 2 total fills in shared programs: 11638 -> 6067 (-47.87%) fills in affected programs: 5879 -> 308 (-94.76%) helped: 606 HURT: 3 Looking at shaders by hand, it makes the URB between TCS and TES go from containing 32 per-vertex varyings per tessellation shader pair to a more reasonable 8-12. For a 3-vertex patch, that's at least half the URB space no matter how big the patch section is. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-17 10:50:28 -05:00
Jason Ekstrand	f210a5f4bb	anv/pipeline: Set tess IO read/written key fields in compile_* We want these to be set as close to the final compile as possible so that they are guaranteed to happen after nir_shader_gather_info is called. The next commit is going to move nir_shader_gather_info to after the linking step which makes this necessary. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-17 10:50:28 -05:00
Jason Ekstrand	2e4094cd8f	anv/pipeline: Use more fields from stage in compile_cs Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-17 10:50:28 -05:00
Eric Engestrom	fcf259ef97	anv: set error in all failure paths Cc: Jason Ekstrand <jason.ekstrand@intel.com> Fixes: `5b196f39bd` "anv/pipeline: Compile to NIR in compile_graphics" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-09 11:20:27 +01:00
Jason Ekstrand	1d900e55fd	anv/pipeline: Disable FS dispatch for pointless fragment shaders Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-03 05:52:23 -07:00
Jason Ekstrand	7ef6cd0ee8	anv/pipeline: Do cross-stage linking optimizations This appears to help the Aztec Ruins benchmark by about 2% on my Kaby Lake gt2 laptop. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	a5bffa061d	anv/pipeline: Pull most of the anv_pipeline_compile_* into common code This leaves us with a series of little anv_pipeline_compile_* functions which each take a compiler object, a mem_ctx, the stage to compile, and the previous stage for VUE linking purposes. Some of them do interesting things but most are little more than wrappers around brw_compile_*. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	5351339554	anv/pipeline: Add a separate "link" stage This breaks compilation up a bit into "link" and "compile". In the "link" stage, new anv_pipeline_link_* helpers are called which are responsible for setting up the binding table and doing anything needed to properly link with the next stage in the pipeline if one exists. They are called in reverse order starting with the fragment shader so you can assume linking in later stages is already done. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	5b196f39bd	anv/pipeline: Compile to NIR in compile_graphics This pulls the SPIR-V to NIR step out into common code. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	946fcd02a9	anv/pipeline: Recompile all shaders if any are missing from the cache Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	f76d6d8a63	anv/pipeline: Drop anv_pipeline_add_compiled_stage We can set active_stages much more directly and then it's just candy around setting pipeline->stages[stage]. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	703a24932a	anv/pipeline: Pull shader compilation out into a helper. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	f3c59ca947	anv/pipeline: Call anv_pipeline_compile_* in a loop Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	bdc3565c8c	anv/pipeline: Hash the entire pipeline in one go Instead of hashing each stage separately (and TES and TCS together), we hash the entire pipeline. This means we'll get fewer cache hits if they, for instance, re-use the same VS over and over again but it also means we can now safely do cross-stage optimizations. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	4a8236ae17	anv/pipeline: Populate keys up-front Instead of having each anv_pipeline_compile_* function populate the shader key, make it part of the anv_pipeline_stage struct and fill it out up-front. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	76503b319a	anv/pipline: Add a helper struct for per-stage info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	de9e5cf35a	anv/pipeline: Add populate_tcs/tes_key helpers They don't really do anything interesting, but it's more consistent this way. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	e621f57556	anv/pipeline: Rework the parameters to populate_wm_prog_key Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	b2e0b0dad6	anv/pipeline: More aggressively optimize away color attachments Instead of just looking at the number of color attachments, look at which ones are actually used by the subpass. This lets us potentially throw away chunks of the fragment shader. In DXVK, for example, all subpasses have 8 attachments and most are VK_ATTACHMENT_UNUSED so this is very helpful in that case. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	80bc0b728c	anv: Restrict the number of color regions to those actually written The back-end compiler emits the number of color writes specified by wm_prog_key::nr_color_regions regardless of what nir_store_outputs we have. Once we've gone through and figured out which render targets actually exist and are written by the shader, we should restrict the key to avoid extra RT write messages. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	4d57e543b8	anv/pipeline: Fix up deref modes if we delete a FS output With the new deref instructions, we have to keep the modes consistent between the derefs and the variables they reference. Since we remove outputs by changing them to local variables, we need to run the fixup pass to fix the modes. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Kenneth Graunke	488972222c	i965: Combine both gl_PatchVerticesIn lowering passes. Until now, we had separate passes for lowering gl_PatchVerticesIn to a statically known constant (for TES inputs when linked against a TCS), and a uniform in the other cases. Annoyingly, one had to be run before nir_lower_system_values, and the other afterward. This simplified the passes, but made life painful for the callers. This patch combines both into a single pass. If you give it a non-zero static count, it uses that. If you give it Mesa state slots, it turns it back into a built-in uniform. Otherwise, it does nothing. This also moves the i965 uniform lowering out to shared code. v2: Make token arrays const. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-26 21:51:36 -07:00
Jason Ekstrand	820d5e51b7	intel/compiler: Account for built-in uniforms in analyze_ubo_ranges The original pass only looked for load_uniform intrinsics but there are a number of other places that could end up loading a push constant. One obvious omission was images which always implicitly use a push constant. Legacy VS clip planes also get pushed into the shader. This fixes some new Vulkan CTS tests that test random combinations of bindings and, in particular, test lots of UBOs and images together. Cc: mesa-stable@lists.freedesktop.org Cc: Kenneth Graunke <kenneth@whitecape.org>	2018-07-23 15:28:17 -07:00
Ilia Mirkin	257128079c	anv/gen9: expose VK_EXT_post_depth_coverage Note that the use of ICMS_INNER_CONSERVATIVE disagrees with the GL driver. Perhaps it's more performant than ICMS_NORMAL and is otherwise permitted? Not sure, so I left it as-is. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-22 14:56:44 -07:00
Jason Ekstrand	227dabc266	anv: Implement VK_EXT_vertex_attribute_divisor Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-09 15:37:51 -07:00
Jason Ekstrand	2caf6c0392	anv/pipeline: Add a per-VB instance divisor Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-09 15:37:51 -07:00
Jason Ekstrand	32f4feb5a0	anv/pipeline: Use a per-VB struct instead of separate arrays Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-09 15:37:51 -07:00

1 2 3 4 5

212 Commits