third_party_mesa3d

Author	SHA1	Message	Date
Lionel Landwerlin	d33aff783d	intel/fs: add support for sparse accesses Purely from the backend point of view it's just an additional parameter to sampler messages. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23882>	2023-07-27 02:02:30 +03:00
Marcin Ślusarz	a252123363	intel/compiler/mesh: compactify MUE layout Instead of using 4 dwords for each output slot, use only the amount of memory actually needed by each variable. There are some complications from this "obvious" idea: - flat and non-flat variables can't be merged into the same vec4 slot, because flat inputs mask has vec4 stride - multi-slot variables can have different layout: float[N] requires N 1-dword slots, but i64vec3 requires 1 fully occupied 4-dword slot followed by 2-dword slot - some output variables occur both in single-channel/component split and combined variants - crossing vec4 boundary requires generating more writes, so avoiding them if possible is beneficial This patch fixes some issues with arrays in per-vertex and per-primitive data (func.mesh.ext.outputs.*.indirect_array.q0 in crucible) and by reduction in single MUE size it allows spawning more threads at the same time. Note: this patch doesn't improve vk_meshlet_cadscene performance because default layout is already optimal enough. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>	2023-07-24 07:55:29 +00:00
Lionel Landwerlin	3384f029be	intel/compiler: rework input parameters Use a struct for various common parameters rather than per stage structure or arguments to stage specific entrypoints. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>	2023-07-20 09:08:08 +00:00
Lionel Landwerlin	46958bcb74	intel/fs: fix missing predicate on SEL instruction Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `d8dfd153c5` ("intel/fs: Make per-sample and coarse dispatch tri-state") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9381 Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24236>	2023-07-19 21:57:25 +00:00
Faith Ekstrand	39b5bb0809	intel/fs: Drop support for nir_register Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24104>	2023-07-19 02:11:57 +00:00
Rohan Garg	ef2b763d9c	anv: fix incorrect asserts when combining CPS and per sample interpolation CPS is dynamically turned off when per sample interpolation is active. Update the asserts to reflect this. Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `5644011f06` ("intel/compiler: Convert wm_prog_key::persample_interp to a tri-state") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23103>	2023-05-31 19:26:59 +00:00
Lionel Landwerlin	04777171e0	intel/fs: try to rematerialize surface computation code This helps a lot with accessing surface handles in control flow. Our resource_intel intrinsic has a non_uniform flag, in which case we cannot apply this optimization. But in uniform cases, this is just a massive win. We drop all kind of pipeline stalls due to find_live_channel. We also reduce register pressure by doing the surface handle computation in a single GRF (instead of 2 or 4). There are some regressions in max dispatch width but those I think are only on SIMD32 and due to the current heuristic disabling it after throughput comparison with SIMD16. We know this heuristic is not perfect, it should probably be updated in another change. Here are some stats (all titles seem to have similar gains), as percentage deltas; for each title the Total row is identical to the per-title row. red_dead_redemption2 (5860 shaders): Instrs -36.80%, Cycles -5.67%, Subgroup size +0.77%, Send messages +0.06%, Spill count -81.26%, Fill count -79.16%, Scratch Memory Size -70.62%, Max live registers -8.63%, Max dispatch width -6.93%. All affected (4716 shaders): Instrs -37.29%, Cycles -5.67%, Subgroup size +0.95%, Send messages +0.07%, Spill count -81.26%, Fill count -79.16%, Scratch Memory Size -70.62%, Max live registers -9.15%, Max dispatch width -8.47%. rise_of_the_tomb_raider_g2 (12010 shaders): Instrs -37.19%, Cycles -22.12%, Subgroup size +0.01%, Send messages +0.00%, Spill count -99.01%, Fill count -99.14%, Scratch Memory Size -98.65%, Max live registers -7.62%, Max dispatch width -4.96%. All affected (11732 shaders): Instrs -37.27%, Cycles -22.14%, Subgroup size +0.01%, Send messages +0.00%, Spill count -99.01%, Fill count -99.14%, Scratch Memory Size -98.65%, Max live registers -7.67%, Max dispatch width -5.11%. total_war_warhammer2 (462 shaders): Instrs -27.45%, Cycles -12.42%, Spill count -82.35%, Fill count -88.46%, Scratch Memory Size -66.67%, Max live registers -5.52%, Max dispatch width -5.62%. All affected (335 shaders): Instrs -28.31%, Cycles -12.77%, Spill count -82.35%, Fill count -88.46%, Scratch Memory Size -66.67%, Max live registers -6.25%, Max dispatch width -7.24%. witcher_3_dxvk_g2 (1049 shaders): Instrs -36.94%, Cycles -57.82%, Subgroup size +0.06%, Send messages +0.01%, Spill count -98.52%, Fill count -97.29%, Scratch Memory Size -98.10%, Max live registers -7.81%, Max dispatch width -1.00%. All affected (693 shaders): Instrs -41.93%, Cycles -58.45%, Subgroup size +0.09%, Send messages +0.01%, Spill count -98.52%, Fill count -97.29%, Scratch Memory Size -98.10%, Max live registers -10.25%, Max dispatch width -1.33%. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:37 +00:00
Lionel Landwerlin	3d0cc3f63b	intel/fs: keep track of new resource_intel information Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:37 +00:00
Lionel Landwerlin	56474fae93	intel/fs: fix subgroup invocation read bounds checking nir->info.subgroup_size can be set to an enum : SUBGROUP_SIZE_VARYING = 0 SUBGROUP_SIZE_UNIFORM = 1 SUBGROUP_SIZE_API_CONSTANT = 2 SUBGROUP_SIZE_FULL_SUBGROUPS = 3 So compute the API subgroup size value and compare it to the dispatch size to determine whether we need some bound checking. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `9ac192d79d` ("intel/fs: bound subgroup invocation read to dispatch size") Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21856>	2023-03-14 12:15:48 +00:00
Lionel Landwerlin	09cdb77a92	intel/fs: report max register pressure in shader stats Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21756>	2023-03-08 13:37:07 +00:00
Tapani Pälli	207eb94445	intel/compiler: add comment about workaround on simd width Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21619>	2023-03-02 14:06:36 +00:00
Jason Ekstrand	f3969e2413	intel/fs: Rework dynamic coarse handling Use 2 flags for PI & RT messages. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>	2023-02-06 09:12:18 +00:00
Jason Ekstrand	964b878986	intel/fs: Break out yet another FB write helper This new helper, do_emit_fb_writes() does the actual walk over all the render targets to emit each of the different FB writes. We want this in a helper because we're about to go a bit crazy with coarse. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>	2023-02-06 09:12:18 +00:00
Jason Ekstrand	5644011f06	intel/compiler: Convert wm_prog_key::persample_interp to a tri-state This allows for the possibility that we may not know at compile time if sample shading is enabled through the API. While we're here, also document exactly what this bit means so we don't confuse ourselves. v2: Fixup coarse pixel values (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>	2023-02-06 09:12:18 +00:00
Jason Ekstrand	d8dfd153c5	intel/fs: Make per-sample and coarse dispatch tri-state Whenever one of them is BRW_SOMETIMES, we depend on dynamic flag pushed in as a push constant. In this case, we have to often have to do the calculation both ways and SEL the result. It's a bit more code but decouples MSAA from the shader key. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>	2023-02-06 09:12:18 +00:00
Kenneth Graunke	bafbe7c23a	intel/compiler: Set NoMask on cr0 access for float controls mode This is trying to clear a bit in the control register. However, it's executing with whatever channel mask happens to be active. Typically this is the one at the start of the program, so at least some channels will be active. Typically the first channel will be active due to packed dispatch, but that's not always guaranteed. Without NoMask, the float controls writes may randomly not happen. Recent GPUs also seem to have a hang issue when the first instruction in the shader doesn't have any active channels. Having an instruction with NoMask at the start of the program works around the issue. See HSD bug 14017989577. In our case, the float controls preamble was breaking that restriction every time, causing us to run into this problem frequently. Thanks to Tapani Pälli for finding this hang issue, and Francisco Jerez and Lionel Landwerlin for helping pinpoint this issue during review of a workaround patch in !20194. Fixes GPU hangs in Elder Scrolls Online, Witcher 3, and likely more. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7639 Fixes: `9da56ffc52` ("i965/fs: add emit_shader_float_controls_execution_mode() and aux functions") Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20214>	2022-12-08 09:54:09 +00:00
Caio Oliveira	3272868218	intel/compiler: Make thread_payload struct abstract Each shader stage has its own struct and will instantiate it, so the base class doesn't need to be instantiated anymore. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	5b6987daee	intel/compiler: Create and use struct for GS thread payload Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	0ca65b3c4c	intel/compiler: Create and use struct for VS thread payload Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	19c6e1b447	intel/compiler: Create and use struct for TES thread payload Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	73920b7e2f	intel/compiler: Use FS thread payload only for FS Move the setup into the FS thread payload constructor. Consolidate payload setup for that in brw_fs_thread_payload.cpp file. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Tapani Pälli	40c2e0a317	intel/compiler: fix assert from ver to verx10 Fixes: `027b8b4249` ("intel/compiler: Add helper for barrier message payload setup for gfx >= 125") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18546>	2022-09-12 19:03:17 +00:00
Caio Oliveira	027b8b4249	intel/compiler: Add helper for barrier message payload setup for gfx >= 125 CS-like and TCS control barriers converged in gfx >= 125, so use a common helper for the message payload setup. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18362>	2022-09-09 09:35:08 -07:00
Caio Oliveira	55db3aaa3a	intel/compiler: Create fs_visitor::emit_tcs_barrier() Allow us to implement this in brw_fs_visitor.cpp, which then will let us deduplicate code between the CS-like barrier and the TCS barrier in a later patch. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18362>	2022-09-09 09:35:08 -07:00
Marcin Ślusarz	7ebae85955	intel/compiler: insert URB fence before task/mesh termination Bspec 53421 says: "A URB fence memory is typically performed prior the thread exit message, so that the next thread dispatch that reads that URB memory will see it." Cc: 22.1 <mesa-stable> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16665>	2022-08-02 09:31:24 +00:00
Ian Romanick	f7f232385f	intel/fs: Use canonical form for "work around" tags Trivial. Also clean up some weird whitespace. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17605>	2022-07-26 17:25:19 +00:00
Ian Romanick	377246318a	intel/fs: Eliminate "masked" and "per slot offset" URB messages All of this information can be inferred from the sources. v2: Fix "error: unused variable 'opcode'" detected by marge-bot. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17605>	2022-07-26 17:25:19 +00:00
Ian Romanick	349a040f68	intel/fs: Make logical URB write instructions more like other logical instructions The changes to fs_visitor::validate() helped track down a place where I initially forgot to convert a message to the new sources layout. This had caused a different validation failure in dEQP-GLES31.functional.tessellation.tesscoord.triangles_equal_spacing, but this were not detected until after SENDs were lowered. Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown) total instructions in shared programs: 19951145 -> 19951133 (<.01%) instructions in affected programs: 2429 -> 2417 (-0.49%) helped: 8 / HURT: 0 total cycles in shared programs: 858904152 -> 858862331 (<.01%) cycles in affected programs: 5702652 -> 5660831 (-0.73%) helped: 2138 / HURT: 1255 Broadwell total cycles in shared programs: 904869459 -> 904835501 (<.01%) cycles in affected programs: 7686744 -> 7652786 (-0.44%) helped: 2861 / HURT: 2050 Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown) Instructions in all programs: 141442369 -> 141442032 (-0.0%) Instructions helped: 337 Cycles in all programs: 9099270231 -> 9099036492 (-0.0%) Cycles helped: 40661 Cycles hurt: 28606 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17605>	2022-07-26 17:25:18 +00:00
Ian Romanick	a477587b4a	intel/fs: Add _LOGICAL versions of URB messages The lowering is currently fake. It just changes the opcode from the _LOGICAL version to the non-_LOGICAL version. v2: Remove some rebase cruft. 's/gfx8_//;s/simd8_/' in brw_instruction_name. Both suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17379>	2022-07-08 19:45:34 +00:00
Kenneth Graunke	9886615958	intel/compiler: Move spill/fill tracking to the register allocator Originally, we had virtual opcodes for scratch access, and let the generator count spills/fills separately from other sends. Later, we started using the generic SHADER_OPCODE_SEND for spills/fills on some generations of hardware, and simply detected stateless messages there. But then we started using stateless messages for other things: - anv uses stateless messages for the buffer device address feature. - nir_opt_large_constants generates stateless messages. - XeHP curbe setup can generate stateless messages. So counting stateless messages is not accurate. Instead, we move the spill/fill accounting to the register allocator, as it generates such things, as well as the load/store_scratch intrinsic handling, as those are basically spill/fills, just at a higher level. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16691>	2022-05-25 06:56:01 +00:00
Lionel Landwerlin	0cd93c59ef	intel/compiler: add primitive rate output support Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13739>	2022-02-02 17:09:46 +00:00
Jason Ekstrand	a1de102479	intel/fs: Use compare_func for wm_prog_key::alpha_test_func Because 0 is no longer a recognizable value (it's NEVER, which isn't a good default), we add an emit_alpha_test bool to tell the back-end when to bother alpha testing. This lets us only touch crocus with the change. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14157>	2022-01-14 15:08:09 +00:00
Dave Airlie	e12b0d0d60	intel/compiler: remove gfx6 gather wa from backend. Crocus lowers this in the frontend, they key member is still used but reset prior to backend. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14202>	2021-12-22 21:37:55 +00:00
Jason Ekstrand	4fa58d27a5	intel/fs,vec4: Drop support for shader time Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14056>	2021-12-10 21:20:47 +00:00
Jason Ekstrand	8f3c100d61	intel/fs,vec4: Drop uniform compaction and pull constant support The only driver using these was i965 and it's gone now. This is all dead code. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14056>	2021-12-10 21:20:47 +00:00
Caio Oliveira	be89ea3231	intel/compiler: Handle per-primitive inputs in FS In Fragment Shader, regular inputs are laid out in the thread payload in a one dword per each half-GRF, that gives room for having the two delta dwords needed for interpolation. Per-primitive inputs are laid out before the regular inputs, and since there's no need to have delta information, they are packed. So half-GRF will be fully filled with 4 dwords of input. When num_per_primitive_inputs is zero (the default case), behavior should be the same as before. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>	2021-12-04 00:41:46 +00:00
Caio Oliveira	858424bd2e	intel/compiler: Use gl_shader_stage_uses_workgroup() helpers Instead of checking for MESA_SHADER_COMPUTE (and KERNEL). Where appropriate, also use gl_shader_stage_is_compute(). This allows most of the workgroup-related lowering to be applied to Task and Mesh shaders. These will be added later and "inherit" from cs_prog_data structure. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13629>	2021-11-03 11:09:48 -07:00
Jason Ekstrand	31fdd26d01	intel/compiler: Add unified barrier support for CS Program CS barrier message fields for producers/consumers. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>	2021-08-24 01:31:48 +00:00
Jason Ekstrand	f5e58838c2	intel/fs: Handle non-perspective-correct interpolation on gen4-5 Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11125>	2021-06-03 02:36:17 +00:00
Jason Ekstrand	3f36e027d3	intel/fs: Don't use pixel_z for Gen4-5 source_depth_to_render_target The source_depth_to_render_target flag can get set on old gen4-5 HW in a few cases which are independent of the app writing gl_FragDepth. It should be safe to just use fetch_payload_reg in that case instead of depending in interpolation setup. This fixes a bug with certain very simple shaders where we might end up not including the depth when we should have. While we're here, rework the logic around setting src_depth and add a comment so it's more clear what's going on. Fixes: `6d4070f3dd` "intel/compiler: add support for fragment coordinate..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10596>	2021-05-03 23:51:51 +00:00
Lionel Landwerlin	81f369c93b	intel/compiler: add coarse pixel offset on Gfx12.5+ Gfx12.5 has a slightly different code path. v2: Document the oddness Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>	2021-05-02 20:20:06 +00:00
Lionel Landwerlin	6d4070f3dd	intel/compiler: add support for fragment coordinate with coarse pixels v2: Drop new internal opcodes (Jason) Simplify code (Jason) v3: Add Z computation for coarse pixels v4: Document things a little Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>	2021-05-02 20:20:06 +00:00
Francisco Jerez	a2572a9da4	intel/fs: Add more efficient fragment coordinate calculation. The PIXEL_X/Y opcodes used by the current implementation are broken on XeHP due to the new regioning restrictions of the floating-point pipe. We could have the regioning lowering pass fix it in theory by lowering the conversions into separate MOV instructions, but that would be more costly than this implementation that only needs a pair of pipelined ADDs and a pair of pipelined MOVs. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>	2021-04-16 08:27:35 +00:00
Yevhenii Kharchenko	edd12acbec	intel/compiler: remove unused member 'input_vue_map' v2: Instead of fixing unitialized member 'fs_visitor::input_vue_map' (as reported by Coverity Scan in defect CID 1474559), remove unused members 'vec4_tcs_visitor::input_vue_map' and 'fs_visitor::input_vue_map'. Also fixed 'debug_enabled' argument skipped in a fs_visitor constructor call from brw_compile_tes(). Signed-off-by: Yevhenii Kharchenko <yevhenii.kharchenko@globallogic.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10040>	2021-04-08 18:20:10 +00:00
Anuj Phogat	1d296484b4	intel: Rename Genx keyword to Gfxx Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "Gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/Gen\([[:digit:]]\+\)/Gfx\1/g" Exclude changes in src/intel/perf/oa-*.xml: find src/intel/perf -type f \( -name "*.xml" \) | xargs sed -ie "s/Gfx/Gen/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	b75f095bc7	intel: Rename genx keyword to gfxx in source files Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/gen\([[:digit:]]\+\)/gfx\1/g" Exclude pack.h and xml changes in this patch: grep -E "gfx[[:digit:]]+_pack\.h" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+_pack\.h\)/gen\1/g" grep -E "gfx[[:digit:]]+\.xml" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+\.xml\)/gen\1/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	c1f3a778de	intel: Rename GENx prefix in macros to GFXx in source files Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "GEN" -rIl src/intel/genxml | grep -E ".py" | xargs sed -ie "s/GEN\([%{]\)/GFX\1/g" grep -E "[^_]GEN[[:digit:]]+" -rIl $SEARCH_PATH | grep -E ".(\.c|\.h|\.y|\.l)" | xargs sed -ie "s/\([^_]\)GEN\([[:digit:]]\+\)/\1GFX\2/g" Leave out renaming GFX12_CCS_E macros. They fall under renaming pattern like "_GEN[[:digit:]]+": grep -E "GFX12_CCS_E" -rIl $SEARCH_PATH | xargs sed -ie "s/GFX12_CCS_E/GEN12_CCS_E/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	abe9a71a09	intel: Rename gen field in gen_device_info struct to ver Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "info\)*(.|->)gen" -rIl $SEARCH_PATH | xargs sed -ie "s/info\()*\)\(\.\|->\)gen/info\1\2ver/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Caio Marcelo de Oliveira Filho	7fb1e58651	intel/compiler: Make visitors take debug_enabled as a parameter The callers already have this value, and we would like to make it follow different rules other than stage that might not be visible to the helper function, so just pass explicitly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779>	2021-03-24 23:18:46 +00:00
Jason Ekstrand	7280b0911d	intel/compiler: Add support for bindless shaders The Intel bindless thread dispatch model is very simple. When a compute shader is to be used for bindless dispatch, it can request a set of stack IDs. These are allocated per-dual-subslice by the hardware and recycled automatically when the stack ID is returned. Passed to the bindless dispatch are a global argument address, a stack ID, and an address of the BINDLESS_SHADER_RECORD to invoke. When the bindless shader is dispatched, it is passed its stack ID as well as the global and local argument pointers. The local argument pointer is the address of the BINDLESS_SHADER_RECORD plus some offset which is specified as part of the BINDLESS_SHADER_RECORD. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7356>	2020-11-25 05:37:09 +00:00
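
The rename commits above (1d296484b4 and its siblings) rely on a sed capture group to carry the generation number across the substitution. As a minimal sketch, assuming Python's `re` module as a stand-in for sed, the first substitution behaves roughly as follows. The function name `rename_gen_keyword` and the sample string are hypothetical and do not come from the Mesa tree.

```python
import re

# Rough Python equivalent of the sed expression
#   s/Gen\([[:digit:]]\+\)/Gfx\1/g
# The capture group keeps the digits so they can be re-emitted after "Gfx".
GEN_KEYWORD = re.compile(r"Gen([0-9]+)")


def rename_gen_keyword(text: str) -> str:
    """Replace every 'Gen<digits>' with 'Gfx<digits>' (hypothetical helper)."""
    return GEN_KEYWORD.sub(r"Gfx\1", text)


# Illustrative input only: 'Generic' has no digits after 'Gen' and 'GEN11'
# is upper case, so neither matches this case-sensitive pattern.
sample = "Gen12 adds a path that Gen9 lacks; Generic and GEN11 stay as-is."
print(rename_gen_keyword(sample))
```

Because the pattern is case-sensitive and requires at least one digit, the lower-case `gen` and upper-case `GEN` spellings need the separate passes that the sibling commits describe.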

108 Commits