third_party_mesa3d

Author	SHA1	Message	Date
Ian Romanick	336eab0630	nir: Add a shallow clone function for nir_alu_instr Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Matt Turner <mattst88@gmail.com>	2019-05-31 08:47:03 -07:00
Jason Ekstrand	e84194686d	nir/deref: Add a has_complex_use helper This lets passes easily detect derefs which have uses that fall outside the standard load/store/copy pattern so they can bail appropriately. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-31 01:08:03 +00:00
Kenneth Graunke	c31b4420e7	st/nir: Re-vectorize shader IO We scalarize IO to enable further optimizations, such as propagating constant components across shaders, eliminating dead components, and so on. This patch attempts to re-vectorize those operations after the varying optimizations are done. Intel GPUs are a scalar architecture, but IO operations work on whole vec4's at a time, so we'd prefer to have a single IO load per vector rather than 4 scalar IO loads. This re-vectorization can help a lot. Broadcom GPUs, however, really do want scalar IO. radeonsi may want this, or may want to leave it to LLVM. So, we make a new flag in the NIR compiler options struct, and key it off of that, allowing drivers to pick. (It's a bit awkward because we have per-stage settings, but this is about IO between two stages...but I expect drivers to globally prefer one way or the other. We can adjust later if needed.) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-28 01:06:48 -07:00
Jason Ekstrand	f2dc0f2872	nir: Drop imov/fmov in favor of one mov instruction The difference between imov and fmov has been a constant source of confusion in NIR for years. No one really knows why we have two or when to use one vs. the other. The real reason is that they do different things in the presence of source and destination modifiers. However, without modifiers (which many back-ends don't have), they are identical. Now that we've reworked nir_lower_to_source_mods to leave one abs/neg instruction in place rather than replacing them with imov or fmov instructions, we don't need two different instructions at all anymore. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Acked-by: Rob Clark <robdclark@chromium.org>	2019-05-24 08:38:11 -05:00
Caio Marcelo de Oliveira Filho	f051fa6ad7	nir: Add nir_address_format_null_value() Returns the nir_const_value * with the representation of the NULL pointer for each address format. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 10:53:38 -07:00
Caio Marcelo de Oliveira Filho	6bc9cdb1b7	nir: Add nir_address_format_32bit_offset This is a simple 32-bit address which is not a global address. Gives us a format that don't use 0 as its null pointer value. We will need this in anv to represent nir_var_mem_shared addresses. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 10:53:38 -07:00
Caio Marcelo de Oliveira Filho	bdaf41107a	nir: Add nir_address_format_logical An address format representing a purely logical addressing model. In this model, all deref chains must be complete from the dereference operation to the variable. Cast derefs are not allowed. These addresses will be 32-bit scalars but the format is immaterial because you can always chase the chain. E.g. push constants in anv. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 10:53:38 -07:00
Ian Romanick	ede45bf9cf	nir: Rename commutative to 2src_commutative The meaning of the new name is that the first two sources are commutative. Since this is only currently applied to two-source operations, there is no change. A future change will mark ffma as 2src_commutative. It is also possible that future work will add 3src_commutative for opcodes like fmin3. v2: s/commutative_2src/2src_commutative/g. I had originally considered this, but I discarded it because I did't want to deal with identifiers that (should) start with 2. Jason suggested it in review, so we decided that _2src_commutative would be used in nir_opcodes.py. Also add some comments documenting what 2src_commutative means. Also suggested by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-14 11:25:02 -07:00
Jonathan Marek	d0bff89159	nir: allow specifying a set of opcodes in lower_alu_to_scalar This can be used by both etnaviv and freedreno/a2xx as they are both vec4 architectures with some instructions being scalar-only. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-10 15:10:41 +00:00
Vasily Khoruzhick	e67e4e90b2	nir: implement lowering for fsin and fcos Lower sin and cos using Nick's fast sin/cos approximation from https://web.archive.org/web/20180105155939/http://forum.devmaster.net/t/fast-and-accurate-sine-cosine/9648 It's suitable for GLES2, but it throws warnings in dEQP GLES3 precision tests. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-05-07 15:25:21 +00:00
Ian Romanick	158370ed2a	nir/flrp: Add new lowering pass for flrp instructions This pass will soon grow to include some optimizations that are difficult or impossible to implement correctly within nir_opt_algebraic. It also include the ability to generate strictly correct code which the current nir_opt_algebraic lowering lacks (though that could be changed). v2: Document the parameters to nir_lower_flrp. Rebase on top of `3766334923` ("compiler/nir: add lowering for 16-bit flrp") Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-06 22:52:28 -07:00
Christian Gmeiner	4e110eca42	nir: nir_shader_compiler_options: drop native_integers Driver which do not support native integers should use a lowering pass to go from integers to floats. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-07 07:35:52 +02:00
Vasily Khoruzhick	443c5a3cd6	nir: add int_to_float lowering pass This new pass lowers ints and bools to floats. It allows hardware that doesn't have native integers (e.g. Mali4x0) use the same code paths as modern hardware. It uses newly introduced pass to gather SSA types and should be used as late as possible. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-05-07 01:07:27 +00:00
Karol Herbst	d11b807da5	nir: Add nir_op_vec helper with that we can simplify code where nir vectors are created v2: merge both lines in nir_vec Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-04 12:27:51 +02:00
Jason Ekstrand	91899495a1	nir: Add a SSA type gathering pass This new pass (which isn't even compile-tested) attempts to determine the ALU type of all the SSA values in a function impl. It takes a greedy approach and assigns intness or floatness to everything it thinks can possibly contain an int or a float. Some values will be labled as both int and float and some will be labled as neither and it is up to the caller to decide what to do with this information. However, for a "nice" shader where the original source contained no bit-casts and no implicit bit-casts were introduced by optimizations, there shouldn't be any overlap in the two sets save for the odd CSEd zero constant. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-05-04 03:52:05 +00:00
Rob Clark	a99c360a46	nir: add pass to lower fb reads Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-05-02 11:19:22 -07:00
Kenneth Graunke	2b44b27dbe	nir: Add a new nir_cf_list_is_empty_block() helper. Helper and name suggested by Eric Anholt. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-04-28 22:36:08 -07:00
Andreas Baierl	b82de2b4d7	nir: add rcp(w) lowering for gl_FragCoord On some hardware (e.g. Mali400) the shader needs to apply some transformations for correct gl_FragCoord handling. The lowering actions look like the following in pseudocode: gl_FragCoord.xyz = gl_FragCoord_orig.xyz gl_FragCoord.w = 1.0 / gl_FragCoord_orig.w Add this lowering as a nir pass in preparation for using it in the driver. Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-29 02:46:44 +00:00
Caio Marcelo de Oliveira Filho	d5ac5d6e83	nir: Add option to lower tex to txl when shader don't support implicit LOD We already add the LOD src, so go ahead and update the texop as well when this option is set. v2: Make it an option. (Rob Clark) v3: Use a more concise name suggested by Jason. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-04-25 12:13:06 -07:00
Jason Ekstrand	ccb25aaeaf	nir: Use the NIR_SRC_AS_ macro to define nir_src_as_deref We have a macro for this now; no reason to hand-roll it for derefs. While we're here, move the NIR_DEFINE_CAST for derefs down to where all the other ones are. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-22 15:23:24 +00:00
Jason Ekstrand	470422870a	nir: Add helpers for getting the type of an address format Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	2edf29b933	intel,nir: Lower TXD with a bindless sampler When we have a bindless sampler, we need an instruction header. Even in SIMD8, this pushes the instruction over the sampler message size maximum of 11 registers. Instead, we have to lower TXD to TXL. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Jason Ekstrand	995dc4e5c3	nir/lower_io: Expose some explicit I/O lowering helpers Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-19 19:56:42 +00:00
Erico Nunes	4577eb7b7c	nir/algebraic: add lowering for fsign The mali utgard pp doesn't support a sign instruction. In the ARM offline shader compiler, the sign function is implemented using sub(gt(0.0, a), lt(0.0, a)). This is a generic optimization, so implement it in the nir level when lower_fsign is set, alongside the lowering for isign. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-19 15:42:23 +00:00
Jason Ekstrand	c6463f8ac2	nir: Add a nir_src_as_intrinsic() helper Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-18 17:12:44 +00:00
Jason Ekstrand	85c35885b3	nir: Rework nir_src_as_alu_instr to not take a pointer Other nir_src_as_* functions just take a nir_src. It's not that much more memory copying and the constness preserving really isn't worth the cognitive dissonance. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-18 17:12:44 +00:00
Jason Ekstrand	eee994e769	nir: Drop "struct" from some nir_* declarations Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-04-18 17:12:44 +00:00
Marek Olšák	d3ce8a7f6b	nir: optimize gl_SampleMaskIn to gl_HelperInvocation for radeonsi when possible Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-04-16 10:24:19 -04:00
Karol Herbst	14531d676b	nir: make nir_const_value scalar v2: remove & operator in a couple of memsets add some memsets v3: fixup lima Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)	2019-04-14 22:25:56 +02:00
Alyssa Rosenzweig	2ce4adefa5	nir: Add nir_lower_viewport_transform On Mali hardware (supported by Panfrost and Lima), the fixed-function transformation from world-space to screen-space coordinates is done in the vertex shader prior to writing out the gl_Position varying, rather than in dedicated hardware. This commit adds a shared NIR pass for implementing coordinate transformation and lowering gl_Position writes into screen-space gl_Position writes. v2: Run directly on derefs before io/vars are lowered to cleanup the code substantially. Thank you to Qiang for this suggestion! v3: Bikeshed continues. v4: Add to Makefile.sources (per Jason's comment). Bikeshed comment. Ian and Qiang's reviews are from v3, but no real functional changes from v4. Rob's review is from v4. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Suggested-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-04-14 19:15:13 +00:00
Christian Gmeiner	b6bed115a5	nir: add lower_ftrunc Port TGSI TRUNC lowering to nir Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-04-13 17:54:48 +00:00
Jason Ekstrand	18ed82b084	nir: Add a pass for selectively lowering variables to scratch space This commit adds new nir_load/store_scratch opcodes which read and write a virtual scratch space. It's up to the back-end to figure out what to do with it and where to put the actual scratch data. v2: Drop const_index comments (by anholt) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-04-12 15:59:31 -07:00
Karol Herbst	4a3c04a11f	glsl/nir: add support for lowering bindless images_derefs v2: handle atomics as well make use of nir_rewrite_image_intrinsic v3: remove call to nir_remove_dead_derefs v4: (Timothy Arceri) dont actually call lowering yet Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v3) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 09:02:59 +02:00
Timothy Arceri	035759b61b	nir/i965/freedreno/vc4: add a bindless bool to type size functions This required to calculate sizes correctly when we have bindless samplers/images. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 09:02:59 +02:00
Karol Herbst	3b2a9ffd60	nir: move brw_nir_rewrite_image_intrinsic into common code Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-12 09:02:59 +02:00
Jason Ekstrand	6279074de1	nir: Get rid of global registers We have a pass to lower global registers to locals and many drivers dutifully call it. However, no one ever creates a global register ever so it's all dead code. It's time we bury it. Acked-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-09 00:29:36 -05:00
Jason Ekstrand	b28bad89b9	nir: Get rid of nir_register::is_packed All we ever do is initialize it to zero, clone it, print it, and validate it. No one ever sets or uses it. Acked-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-04-09 00:29:36 -05:00
Timothy Arceri	e30804c602	nir/radv: remove restrictions on opt_if_loop_last_continue() When I implemented opt_if_loop_last_continue() I had restricted this pass from moving other if-statements inside the branch opposite the continue. At the time it was causing a bunch of spilling in shader-db for i965. However Samuel Pitoiset noticed that making this pass more aggressive significantly improved the performance of Doom on RADV. Below are the statistics he gathered. 28717 shaders in 14931 tests Totals: SGPRS: 1267317 -> 1267549 (0.02 %) VGPRS: 896876 -> 895920 (-0.11 %) Spilled SGPRs: 24701 -> 26367 (6.74 %) Code Size: 48379452 -> 48507880 (0.27 %) bytes Max Waves: 241159 -> 241190 (0.01 %) Totals from affected shaders: SGPRS: 23584 -> 23816 (0.98 %) VGPRS: 25908 -> 24952 (-3.69 %) Spilled SGPRs: 503 -> 2169 (331.21 %) Code Size: 2471392 -> 2599820 (5.20 %) bytes Max Waves: 586 -> 617 (5.29 %) The codesize increases is related to Wolfenstein II it seems largely due to an increase in phis rather than the existing jumps. This gives +10% FPS with Doom on my Vega56. Rhys Perry also benchmarked Doom on his VEGA64: Before: 72.53 FPS After: 80.77 FPS v2: disable pass on non-AMD drivers Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-09 11:29:41 +10:00
Rob Clark	1ae0c030cb	nir: add lower_all_io_to_elements I need this part of lower_all_io_to_temps but without the actual lowering to temps part. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-03-30 12:56:01 -04:00
Ian Romanick	2cf59861a8	nir: Add partial redundancy elimination for compares This pass attempts to dectect code sequences like if (x < y) { z = y - x; ... } and replace them with sequences like t = x - y; if (t < 0) { z = -t; ... } On architectures where the subtract can generate the flags used by the if-statement, this saves an instruction. It's also possible that moving an instruction out of the if-statement will allow nir_opt_peephole_select to convert the whole thing to a bcsel. Currently only floating point compares and adds are supported. Adding support for integer will be a challenge due to integer overflow. There are a couple possible solutions, but they may not apply to all architectures. v2: Fix a typo in the commit message and a couple typos in comments. Fix possible NULL pointer deref from result of push_block(). Add missing (-A + B) case. Suggested by Caio. v3: Fix is_not_const_zero to work correctly with types other than nir_type_float32. Suggested by Ken. v4: Add some comments explaining how this works. Suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-28 15:35:53 -07:00
Ian Romanick	c6ee46a753	nir: Add nir_alu_srcs_negative_equal v2: Move bug fix in get_neg_instr from the next patch to this patch (where it was intended to be in the first place). Noticed by Caio. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-28 15:35:52 -07:00
Ian Romanick	be1cc3552b	nir: Add nir_const_value_negative_equal v2: Rebase on 1-bit Boolean changes. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> [v1] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-28 15:35:52 -07:00
Timothy Arceri	e76ae39ae2	nir: add support for user defined select control This will allow us to make use of the selection control support in spirv and the GL support provided by EXT_control_flow_attributes. Note this only supports if-statements as we dont support switches in NIR. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108841	2019-03-27 02:39:12 +00:00
Timothy Arceri	b56451f82c	nir: add support for user defined loop control This will allow us to make use of the loop control support in spirv and the GL support provided by EXT_control_flow_attributes. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108841	2019-03-27 02:39:12 +00:00
Jason Ekstrand	40074ebf74	nir: Add texture sources and intrinsics for bindless On Intel, we have both bindless and bindful and we'd like to use them at the same time if we can so we need to be able to distinguish at the NIR level between the two. This also fixes nir_lower_tex to properly handle bindless in its tex_texture_size and get_texture_lod helpers. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-25 16:12:09 -05:00
Jason Ekstrand	3bd5457641	nir: Add a lowering pass for non-uniform resource access Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-03-25 15:00:36 -05:00
Jason Ekstrand	39da1deb49	nir/lower_io: Add a bounds-checked 64-bit global address format Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-25 14:40:54 -05:00
Iago Toral Quiroga	3766334923	compiler/nir: add lowering for 16-bit flrp And enable it on Intel. v2: - Squash the change to enable it on Intel (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-25 16:08:25 +01:00
Iago Toral Quiroga	ca31df6f1f	compiler/nir: add lowering option for 16-bit fmod And enable it on Intel. v2: - Squash the change to enable this lowering on Intel (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-25 16:08:25 +01:00
Samuel Pitoiset	23d30f4099	spirv,nir: lower frexp_exp/frexp_sig inside a new NIR pass This lowering isn't needed for RADV because AMDGCN has two instructions. It will be disabled for RADV in an upcoming series. While we are at it, factorize a little bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-22 19:41:46 +01:00

1 2 3 4 5 ...

482 Commits