third_party_mesa3d

Author	SHA1	Message	Date
Timothy Arceri	c412ff426b	nir: fix nir_variable_data packing Before: /* size: 60, cachelines: 1, members: 29 */ After: /* size: 56, cachelines: 1, members: 29 */ Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-10-24 13:22:59 +11:00
Rhys Perry	8b98d0954e	nir/lower_idiv: add new llvm-based path v2: make variable names snake_case v2: minor cleanups in emit_udiv() v2: fix Panfrost build failure v3: use an enum instead of a boolean flag in nir_lower_idiv()'s signature v4: remove nir_op_urcp v5: drop nv50 path v5: rebase v6: add back nv50 path v6: add comment for nir_lower_idiv_path enum v7: rename _nv50/_llvm to _fast/_precise v8: fix etnaviv build failure Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-21 18:49:46 +00:00
Rob Clark	5e08f070f0	nir: add nir_lower_amul pass Lower amul to either imul or imul24, depending on whether 24b is enough bits to calculate an offset within the thing being dereferenced. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-10-18 15:08:54 -07:00
Rob Clark	ad8167c1e0	nir/search: fix the PoT helpers Otherwise, if the base type is (for example) uint32, we would incorrectly think that PoT optimizations could not apply. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Eduardo Lima Mitev	f1d4fadf1b	nir: Add new texop nir_texop_tex_prefetch This is like nir_texop_tex, but signals that the sampling coordinates are immutable during the shader stage, in a way that allows the HW that supports pre-dispatching sampling operations to pre-fetch the result prior to scheduling the shader stage. This is introduced to support the feature in Freedreno. Adreno HW from a4xx supports it. A NIR pass introduced later in this series will detect sampling operations that are eligible for pre-dispatch, and replace nir_texop_tex by this new op, to tell the backend to enable pre-fetch. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Kristian H. Kristensen	8e16fb1528	freedreno/ir3: Implement lowering passes for VS and GS This introduces two new lowering passes. One to lower VS to explicit outputs using STLW and one to lower GS to load input using LDLW and implement the GS specific functionality. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Erik Faye-Lund	71c0dcf266	nir: support feeding state to nir_lower_clip_[vg]s Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	eb3047c094	nir: support lowering clipdist to arrays This allows us to make sure clipdist is emitted as a scalar array rather than two vec4s. This matches SPIR-V semantics, and will be useful for Zink. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	878c94288a	nir: add lowering-pass for point-size mov Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	6d7e02e37d	nir: allow passing alpha-ref state to lowering-code Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Dave Airlie	dc91a02a72	nir: add a pass to lower flat shading. This takes any color or backcolor that has unspecified shading and converts it to flat shading. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Marek Olšák	cebc38ff60	nir: add nir_shader_compiler_options::lower_to_scalar This will replace PIPE_SHADER_CAP_SCALAR_ISA. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-10 15:49:18 -04:00
Marek Olšák	3340c066a1	nir: move gl_nir_opt_access from glsl directory Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-10 15:49:18 -04:00
Samuel Iglesias Gonsálvez	45668a8be1	nir: add auxiliary functions to detect if a mode is enabled v2: - Added more functions. v3: - Simplify most of the functions (Caio). v4: - Updated to renamed enum values (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v2] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v3]	2019-09-17 23:39:18 +03:00
Jason Ekstrand	f81a2623d8	nir: Add a block_is_unreachable helper Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-06 23:39:01 +00:00
Timur Kristóf	610cc3089c	nir: Carve out nir_lower_samplers from GLSL code. Lowering samplers is needed to produce NIR that can actually be consumed by some gallium drivers, so it doesn't make sense to keep it only in the GLSL code. This commit introduces nir_lower_samplers to compiler/nir, while maintaining the GL-specific function too. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-06 12:20:20 +03:00
Vasily Khoruzhick	9367d2ca37	nir: allow specifying filter callback in lower_alu_to_scalar Set of opcodes doesn't have enough flexibility in certain cases. E.g. Utgard PP has vector conditional select operation, but condition is always scalar. Lowering all the vector selects to scalar increases instruction number, so we need a way to filter only those ops that can't be handled in hardware. Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-06 01:51:28 +00:00
Alyssa Rosenzweig	a8f86fcb51	nir: Remove nir_const_load_to_arr There are no remaining users in-tree. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-22 12:24:13 -07:00
Daniel Schürmann	df86c5ffb3	nir: add divergence analysis pass. This pass expects the shader to be in LCSSA form. The algorithm is based on 'The Simple Divergence Analysis' from Diogo Sampaio, Rafael De Souza, Sylvain Collange, Fernando Magno Quintão Pereira. Divergence Analysis. ACM Transactions on Programming Languages and Systems (TOPLAS) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-08-20 17:40:13 +02:00
Rhys Perry	911a1dfad2	nir/lcssa: allow to create LCSSA phis for loop-invariant booleans ACO depends on LCSSA phis for divergent booleans to work correctly. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-20 17:40:05 +02:00
Daniel Schürmann	9c40ad49d5	nir/lcssa: Skip loop invariant variables when converting to LCSSA. Co-authored-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-20 17:40:01 +02:00
Rhys Perry	8a6cfaa15a	nir: make nir_to_lcssa() a general NIR pass. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-20 17:39:54 +02:00
Jason Ekstrand	5167e94f23	nir: Add more source types to nir_tex_instr_src_type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 17:03:34 +00:00
Iago Toral Quiroga	48f5c34301	nir: add a pass to clamp gl_PointSize to a range The OpenGL and OpenGL ES specs require that implementations clamp the value of gl_PointSize to an implementation-dependent range. This pass is useful for any GPU hardware that doesn't do this automatically for either one or both sides of the range, such as V3D. v2: - Turn into a generic NIR pass (Eric). - Make the pass work before lower I/O so we can use the deref variable to inspect if we are writing to gl_PointSize (Eric). - Make the pass take the range to clamp as parameter and allow it to clamp to both sides of the range or just one side. - Make the pass report progress. v3: - Fix copyright header (Eric) - use fmin/fmax instead of bcsel to clamp (Eric) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-13 09:44:12 +02:00
Rhys Perry	7740149852	nir: merge and extend nir_opt_move_comparisons and nir_opt_move_load_ubo v2: add to series v3: update Makefile.sources v4: don't remove a comment and break statement v4: use nir_can_move_instr Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-12 22:01:30 +00:00
Rhys Perry	da8ed68aca	nir: replace nir_move_load_const() with nir_opt_sink() This is mostly the same as nir_move_load_const() but can also move undef instructions, comparisons and some intrinsics (being careful with loops). v2: actually delete nir_move_load_const.c v3: fix nir_opt_sink() usage in freedreno v3: update Makefile.sources v4: replace get_move_def with nir_can_move_instr and nir_instr_ssa_def v4: handle if uses v4: fix handling of nested loops v5: re-write adjust_block_for_loops v5: re-write setting of use_block for if uses Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Co-authored-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-12 22:01:30 +00:00
Rhys Perry	fd73ed1bd7	nir: add nir_lower_to_explicit() v2: use glsl_type_size_align_func v2: move get_explicit_type() to glsl_types.cpp/nir_types.cpp v2: use align() instead of util_align_npot() v2: pack arrays a bit tighter v2: rename mem_* to field_* v2: don't attempt to handle when struct offsets are already set v2: use column_type() instead of recreating it v2: use a branch instead of |= in nir_lower_to_explicit_impl() v2: assign locations to variables and update shared_size and num_shared v2: allow the pass to be used with nir_var_{shader_temp,function_temp} v4: rebase v5: add TODO v5: small formatting changes v5: remove incorrect assert in get_explicit_type() v5: rename to nir_lower_vars_to_explicit_types v5: correctly update progress when only variables are updated v5: rename get_explicit_type() to get_explicit_shared_type() v5: add comment explaining how get_explicit_shared_type() is different v5: update cast strides v6: update progress when lowering nir_var_function_temp variables v6: formatting changes v6: add more detailed documentation comment for get_explicit_shared_type v6: rename get_explicit_shared_type to get_explicit_type_for_size_align v7: fix comment in nir_lower_vars_to_explicit_types_impl() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (v5) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-08 12:10:39 -05:00
Jason Ekstrand	078dcb7ccd	nir/lower_io: Add an option to lower 64-bit varyings Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 18:14:09 -05:00
Eric Engestrom	5d7bcac4e7	nir: remove explicit nir_intrinsic_index_flag values These were left after a rebase and happen to make NIR_INTRINSIC_SWIZZLE_MASK == NIR_INTRINSIC_SRC_ACCESS, which is how it was noticed. Fixes: `6f20643b47` ("nir: Allow qualifiers on copy_deref and image instructions") Cc: Connor Abbott <cwabbott0@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-31 23:28:20 +01:00
Erico Nunes	b3676a6548	nir/algebraic: rename lower_bitshift to lower_bitops Optimizations that insert bitshift or bitwise operations should not be applied on GPUs that don't support integer operations. The .lower_bitshift could be used to control the bitshift related ones, but there was also one bitwise optimization uncovered. Since only lima and freedreno use this option and the use case is that no bit operations are wanted, let's rename it to .lower_bitops and use it to control all bitops related optimizations. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-07-31 23:06:04 +02:00
Erico Nunes	4a407df682	nir/algebraic: add new fsum ops and fdot lowering The Mali400 pp doesn't implement fdot but has fsum3 and fsum4, which can be used to optimize fdot lowering. fsum2 is not implemented and can be further lowered to an add with the vector components. Currently lima ppir handles this lowering internally, however this happens in a very late stage and requires a big chunk of code compared to a nir_opt_algebraic lowering. By having fsum in nir, we can reduce ppir complexity and enable the lowered ops to be part of other nir optimizations in the optimization loop. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-31 21:35:58 +02:00
Connor Abbott	156306e5e6	nir/find_array_copies: Handle wildcards and overlapping copies This commit rewrites opt_find_array_copies to be able to handle an array copy sequence with other intervening operations in between. In particular, this handles the case where we OpLoad an array of structs and then OpStore it, which generates code like: foo[0].a = bar[0].a foo[0].b = bar[0].b foo[1].a = bar[1].a foo[1].b = bar[1].b ... that wasn't recognized by the previous pass. In order to correctly handle copying arrays of arrays, and in particular to correctly handle copies involving wildcards, we need to use a tree structure similar to lower_vars_to_ssa so that we can walk all the partial array copies invalidated by a particular write, including ones where one of the common indices is a wildcard. I actually think that when factoring in the needed hashing/comparing code, a hash table based approach wouldn't be a lot smaller anyways. All of the changes come from tessellation control shaders in Strange Brigade, where we're able to remove the DXVK-inserted copy at the beginning of the shader. These are the result for radv: Totals from affected shaders: SGPRS: 4576 -> 4576 (0.00 %) VGPRS: 13784 -> 5560 (-59.66 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 8696 -> 6876 (-20.93 %) dwords per thread Code Size: 329940 -> 263268 (-20.21 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 330 -> 898 (172.12 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-29 11:36:25 +02:00
Jonathan Marek	9be902097c	nir/algebraic: add option to lower fall_equalN/fany_nequalN Add generic lowerings for fall_equalN/fany_nequalN. These should be optimal for vec4 backends that don't have any special instructions for it, as long as they support saturate. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-24 17:36:21 -04:00
Jonathan Marek	1e089d0575	nir/algebraic: add option to lower fdph For backends that don't have an 'fdph' instruction. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-24 17:36:21 -04:00
Jonathan Marek	bc3b6168ba	nir: replace lower_sincos with algebraic opt This version has less ops for the same precision. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2019-07-24 17:36:21 -04:00
Jason Ekstrand	0e6cb481fa	nir: Add a nir_tex_instr_has_implicit_derivatives helper Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-23 13:40:41 -05:00
Jason Ekstrand	7a98c7804c	nir: Move nir_alu_instr_is_comparison to the ALU section Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-23 13:40:41 -05:00
Timothy Arceri	30038dd5ec	nir/lower_clip: add support for geometry shaders This will be used to enable compat profile support for geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-19 09:25:47 +10:00
Eric Anholt	251c64a53d	nir: Allow internal changes to the instr in nir_shader_lower_instructions(). v3d's NIR txf_ms lowering wants to swizzle around the input coordinates in NIR, but doesn't generate a new txf_ms instruction as replacement. It's pretty easy to allow that in nir_shader_lower_instructions, and it may be common in lowering passes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-18 11:28:56 -07:00
Jason Ekstrand	548da20b22	nir/lower_doubles: Handle fdiv and fsub directly Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	758fdce9fe	nir: Add some generic helpers for writing lowering passes Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	c74b98486a	nir: Add a helper for fetching the SSA def from an instruction Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	0ba508d7a3	nir,intel: Add support for lowering 64-bit nir_opt_extract_* We need this when doing full software 64-bit emulation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110309 Fixes: `cbad201c2b` "nir/algebraic: Add missing 64-bit extract_[iu]8..." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-07-15 16:08:37 -05:00
Jason Ekstrand	7a19e05e8c	nir/opt_if: Clean up single-src phis in opt_if_loop_terminator Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111071 Fixes: `2a74296f24` "nir: add opt_if_loop_terminator()" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-15 19:58:51 +00:00
Ian Romanick	1259f6d802	nir: intel/vec4: Add flag to disable some algebraic optimizations A couple patches later in this series use the flag to avoid a few thousand shader-db regressions on all vec4 platforms. I'm not particularly enamored with the name of this flag. However, I suspect the Intel vec4 backend is the only backend that will benefit from it. Specifically, the cases where this helps are all cases where we want to prevent nir_opt_algebraic from rearranging instructions to create 3-source instructions, such as ffma and flrp, with additional immediate value or uniform sources. The earlier commit "intel/vec4: Try to emit a single load for multiple 3-src instruction operands" solves most of the problems caused by additional immediate values, but the restrictions on register strides that cause problems for uniforms and shader inputs persist. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-11 10:20:03 -07:00
Jason Ekstrand	8f7405ed9d	nir: Add some helpers for chasing SSA values properly There are various cases in which we want to chase SSA values through ALU ops ranging from hand-written optimizations to back-end translation code. In all these cases, it can be very tricky to do properly because of swizzles. This set of helpers lets you easily work with a single component of an SSA def and chase through ALU ops safely. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-10 00:20:59 +00:00
Jason Ekstrand	6e984bcb92	nir/instr_set: Expose nir_instrs_equal() Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-10 00:20:59 +00:00
Jason Ekstrand	3acddc733f	nir: Refactor nir_src_as_* constant functions Now that we have the nir_const_value_as_* helpers, every one of these functions is effectively the same except for the suffix they use so we can easily define them with a repeated macro. This also means that they're inline and the fact that the nir_src is being passed by-value should no longer really hurt anything. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-10 00:20:59 +00:00
Jason Ekstrand	ce5581e23e	nir: Add more helpers for working with const values Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-10 00:20:59 +00:00
Ian Romanick	5450fd7a36	nir: Allow nir_ssa_alu_instr_src_components to operate on non-SSA destinations Existing users only operate on instructions with SSA destinations. Some later patches add new direct calls and indirect calls (via existing NIR functions) on instructions after going out of SSA. At the very least, these calls are added by: intel/vec4: Try to emit a VF source in try_immediate_source intel/vec4: Try to emit a single load for multiple 3-src instruction operands The first commit adds direct calls, and the second adds calls via nir_alu_srcs_equal and nir_alu_srcs_negative_equal. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-08 11:30:11 -07:00
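
Commit 4a407df682 above describes lowering fdot to the new fsum ops. As a rough illustration only, the idea can be modeled in plain C on float arrays: multiply component-wise, then sum the products. This is a sketch under stated assumptions, not Mesa code. The names `fsum` and `fdot_lowered` are hypothetical, no NIR API is used, and vectors are assumed to have at most 4 components.

```c
#include <stddef.h>

/* Hypothetical scalar model of an fsumN op: add up the first n components. */
static float fsum(const float *v, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

/* Model of the lowering: fdotN(a, b) becomes fsumN(fmul(a, b)).
 * Assumes n <= 4, matching vec2/vec3/vec4 operands. */
static float fdot_lowered(const float *a, const float *b, size_t n)
{
    float prod[4];
    for (size_t i = 0; i < n; i++)
        prod[i] = a[i] * b[i];   /* component-wise fmul */
    return fsum(prod, n);        /* horizontal sum of the products */
}
```

Calling `fdot_lowered(a, b, 3)` models the fsum3 case and `fdot_lowered(a, b, 4)` the fsum4 case. For n = 2 the sum is a single add of the two products, which corresponds to the further lowering the commit message mentions for fsum2.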

514 Commits