third_party_mesa3d

Author	SHA1	Message	Date
Lionel Landwerlin	a26c7b0b03	intel/ds: new tracepoints for generated commands Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:45 +00:00
Lionel Landwerlin	472f49ef43	genxml: remove NDEBUG_UNUSED Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:45 +00:00
Lionel Landwerlin	41b2ed65e2	genxml: generate opencl packing headers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:45 +00:00
Lionel Landwerlin	2a0328ba8b	genxml: enable opencl code generation Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:45 +00:00
Lionel Landwerlin	e6b5196079	intel-clc: print text input Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:45 +00:00
Lionel Landwerlin	4fd7495c69	intel/clc: add ability to output NIR This will be used to generate a serialized NIR of functions for internal shaders. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:45 +00:00
Lionel Landwerlin	2bae1b6b66	intel-clc: move ISA generation to its own function Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:45 +00:00
Lionel Landwerlin	2a1ff08376	intel/compiler: make default NIR compiler options visible Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:45 +00:00
Lionel Landwerlin	012489e55c	meson: add a new option to enable intel-clc without building RT shaders Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:44 +00:00
Lionel Landwerlin	c53a4711cb	anv: fix incorrect flushing on shader query copy When doing query result copies in 3D mode, we're flushing the render target cache, but the shader writes go through the dataport. Fixes flakes/fails in piglit with shader query copies forced with Zink : $ query_copy_with_shader_threshold=0 ./bin/arb_query_buffer_object-coherency -auto -fbo Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `b3b12c2c27` ("anv: enable CmdCopyQueryPoolResults to use shader for copies") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:44 +00:00
Lionel Landwerlin	2437556d83	intel/fs: rerun divergence prior to lowering non-uniform interpolate at sample Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `74a40cc4b6` ("intel/fs: move lower of non-uniform at_sample barycentric to NIR") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:44 +00:00
Lionel Landwerlin	8f5a7f57df	intel/fs: indent lowering code to make it more readable Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:44 +00:00
Lionel Landwerlin	c517088cf1	anv: factor out post submit queue debug code Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:44 +00:00
Lionel Landwerlin	67f3fa896e	intel/dev: fix missing dependency on generated packing heaers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:44 +00:00
Lepton Wu	04d26ceb0a	llvmpipe: Set "+64bit" for X86_64 Without this, on some "buggy" qemu cpu setup, LLVM could crash if LLVM detects the wrong CPU type. Fixes: `f92cadccc6` ("llvmpipe: Always using util_get_cpu_caps to get cpu caps for llvm on x86") Signed-off-by: Lepton Wu <lepton@chromium.org> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27539>	2024-02-12 22:43:46 +00:00
Danylo Piliaiev	5dd5d4c4b5	tu: Exclude more a7xx regs from stomping Stomping these regs even for a short time leads to crashes. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Danylo Piliaiev	e4631bee61	freedreno/devices: Update magic regs for a7xx These regs are written by blob, for some of them blob could write non-zero values. So executing Turnip after blob without writing these regs could lead to nasty GPU crashes. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Danylo Piliaiev	eb1e71e707	freedreno,tu: Move varying interp and varying repl modes to xml Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Danylo Piliaiev	78c843230c	tu/a750: Consider vertex attr buff in gmem allocation A750 added a new optimization - placement of vertex attributes into GMEM, so part of GMEM is carved out for it and needs to be considered during GMEM allocations. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Mark Collins	5266815ca9	tu/a7xx: Update CCU layout logic for A7XX A7XX introduces some changes into the CCU such as having different amounts of memory per CCU for depth and color and dividing up CCU control into two registers A7XX_RB_CCU_CNTL and A7XX_RB_CCU_CNTL2 where CNTL2 no longer requires a complete flush to be updated, we currently don't take advantage of this as any CCU updates set both registers but it's a potential optimization we can add in the future. Signed-off-by: Mark Collins <mark@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Danylo Piliaiev	98d6d93a82	turnip,ir3/a750: Implement inline uniforms via ldg.k Inline consts suffer the same issue as driver params, so they also should be preloaded via preamble. There is special instruction to load from global memory into consts. Co-Authored-By: Connor Abbott <cwabbott0@gmail.com> Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Connor Abbott	6a744ddebc	ir3: Initial support for pushing globals with ldg.k Add a separate pass which uses the analyze_ubo_ranges machinery to construct ranges of readonly globals accessed in the shader and push them to constants in the preamble, using ldg.k if possible. This is enough to handle inline uniforms in turnip but also provides a base for OpenCL, although the pass would need further work for that. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Connor Abbott	513fa1873c	ir3/a7xx: Fix load_global_ir3 with immediate offset Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Connor Abbott	45c71803f9	tu: Add more info to ldg inline uniform path This will let us push the ldg into the preamble. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Danylo Piliaiev	b87b8fdf73	tu: Use SS6_INDIRECT for VS params a750 has SS6_DIRECT path broken, we should either use UBO lowering or SS6_INDIRECT path. It is implemented as INDIRECT load even on a750+ because with UBO lowering it would be tricky to get const offset for to use in multidraw, also we would need to ensure the offset is not 0. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Danylo Piliaiev	76e417ca59	turnip,ir3/a750: Implement consts loading via preamble A750 expects driver params loaded through the preamble, old path does work but has issues when the same LOAD_STATE is used between several draw calls (it seems that LOAD_STATE is executed only for the first draw call). To solve this we now lower driver params to UBOs and let NIR deal with them. Notes: - VS params are loaded via old path since blob do the same and there are no issues observed. - FDM is not supported at the moment. - For now driver params data is emitted via CP_NOP because it's tricky to allocate space for the data. (It is emitted when we are already in sub_cs) Co-Authored-By: Connor Abbott <cwabbott0@gmail.com> Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Danylo Piliaiev	7429ca3115	tu: Use SS6_INDIRECT consts upload path for 3d blits 3d blits used DIRECT consts upload path, which doesn't work properly on a750+, however uploading them via SS6_INDIRECT seem to be working. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Danylo Piliaiev	30597970a5	tu/a7xx: Do not preload shaders, HW does it by default Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Danylo Piliaiev	ac75edb8c4	tu/a7xx: Correctly set A7XX_HLSQ_UNKNOWN_A9AE.SYSVAL_REGS_COUNT Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Danylo Piliaiev	bc6b847017	ir3: Add ldg.k instruction Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:13 +00:00
Danylo Piliaiev	ad52f92cb8	tu: Define and set to zero all SP__VGPR_CONFIG regs SP_FS_VGPR_CONFIG was found to be correlated with blob using avgs/uvgs. Other SP__VGPR_CONFIG where undefined per-stage regs and it was tested via rddecompiler that they "fix" hangs in respective shader stage, when such stage uses the following instructions pattern: avgs.s.1.tex.0 (ss) avgs.e; uvgs.s.tex.0; uvgs.e The exact meaning of SP_*_VGPR_CONFIG is to be investigated. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:12 +00:00
Jonathan Marek	c166c5100b	tu/a750: Basic a750 support Could run vkcube. Based on changes from Jonathan Marek <jonathan@marek.ca> Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:12 +00:00
Danylo Piliaiev	cdadead230	tu/a7xx: Make A7XX_RB_UNKNOWN_8E06 value configurable per-gen It is some kind of DBG register which has different value on different gens. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>	2024-02-12 22:05:12 +00:00
Sagar Ghuge	98b62434bd	intel/compiler: Lower texture operation to combine LOD and AI We have to push the lowering of texture operations a bit further in pipeline since nir_lower_tex gets invoked twice and if there is no LOD source present, nir_lower_tex adds that as a source. Once that's all done we can easily combine the LOD and array index into a single 32-bit value. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27458>	2024-02-12 21:25:48 +00:00
Sagar Ghuge	c984d6e2fc	nir: Drop intel specific lowering code In previous patches, we have moved the Intel specific lowering code in brw_nir_lower_texture file. We can go ahead and drop the Intel specific texture source too. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27458>	2024-02-12 21:25:48 +00:00
Sagar Ghuge	15129c7634	intel/compiler: Use nir_tex_src_backend1 to pack LOD and array index Since this lowering is totally Intel specific, we don't have to introduce the new texture source. We can use the nir_tex_src_backend1 source to pack LOD/LOD Bias and array index into 32 bit single value. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27458>	2024-02-12 21:25:48 +00:00
Sagar Ghuge	73a3257968	intel/compiler: Add texture operation lowering pass This pass combines the LOD or LOD bias and array index into a single 32-bit value since Xe2+ sampler messages requires us to do that. v2: (Alyssa) - Use nir_iand_imm instead of nir_iand and nir_imm_int - Use nir_trim_vector instead of nir_swizzle Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27458>	2024-02-12 21:25:48 +00:00
Lionel Landwerlin	646a7c864d	anv: re-introduce BO CCS allocations On Gfx12.0, CCS allocations have to be allocated per image because the format of the image goes into the AUX-TT PTEs. The effect on memory allocations is limited since the main surface granularity in the AUX-TT PTE is 64KB. On Gfx12.5, the granularity of the AUX-TT PTE is 1MB. This creates a lot of waste in the application memory allocations. Fortunately the HW doesn't care about the format put into the PTEs anymore. So it becomes possible to have 2 images share the same PTE. To implement this we bring back an earlier version of AUX-TT mappings where we used to allocate additional CCS space at the end of the VkDeviceMemory objects. On Gfx12.5, if the BO has additional CCS space, we will now map the main surface to that space. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26822>	2024-02-12 21:00:27 +00:00
Lionel Landwerlin	bd197c6bcf	intel/aux_map: add helper to compute offset in aux data Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26822>	2024-02-12 21:00:27 +00:00
Lionel Landwerlin	c0889a127b	intel/aux_map: add BSpec reference Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26822>	2024-02-12 21:00:27 +00:00
Lionel Landwerlin	da6484a8a4	anv: use address helper to compute address u64 value Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26822>	2024-02-12 21:00:27 +00:00
Lionel Landwerlin	7763e75eea	anv: move ALLOC_HOST_CACHED_COHERENT as define That way gdb can decode the other flags when looking at the variables. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26822>	2024-02-12 21:00:27 +00:00
Lionel Landwerlin	3f64ec141e	isl: add a no-aux-align usage flag This flag signals that the driver will be dealing with aux-tt alignment requirements on its own. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26822>	2024-02-12 21:00:27 +00:00
Lionel Landwerlin	44515bb92c	isl: printout sparse usage Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26822>	2024-02-12 21:00:27 +00:00
Rhys Perry	926d9f1cef	radv: support minmax filter for more formats Support should be the same as AMDVLK, except for these formats: - VK_FORMAT_R4G4_UNORM_PACK8 - VK_FORMAT_A4R4G4B4_UNORM_PACK16_EXT - VK_FORMAT_A4B4G4R4_UNORM_PACK16_EXT - VK_FORMAT_A1B5G5R5_UNORM_PACK16_KHR - VK_FORMAT_A8_UNORM_KHR - VK_FORMAT_X8_D24_UNORM_PACK32 - VK_FORMAT_D24_UNORM_S8_UINT And the various emulated compressed formats. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27551>	2024-02-12 20:05:27 +00:00
Faith Ekstrand	05cf04ac97	nvk: Convert shader addresses to offsets in nvk_shader.c Fixes: `e162c2e78e` ("nvk: Use VM_BIND for contiguous heaps instead of copying") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27565>	2024-02-12 18:47:07 +00:00
Faith Ekstrand	afd42f5951	nvk/heap: Rework over-allocation Instead of making it part of every BO, just reserve a bit of space at the end of the top buffer as part of setting up our vma_heap. This reduces our memory allocation by nvk_heap::overalloc per BO and means that the over-allocation is taken into account when sparse binding heap BOs in the contiguous case. Fixes: `e162c2e78e` ("nvk: Use VM_BIND for contiguous heaps instead of copying") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27565>	2024-02-12 18:47:07 +00:00
Faith Ekstrand	728256e994	nvk/heap: Use nvk_heap_bo::addr instead of bo->offset Fixes: `e162c2e78e` ("nvk: Use VM_BIND for contiguous heaps instead of copying") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27565>	2024-02-12 18:47:07 +00:00
Faith Ekstrand	83521dd486	nvk: Don't set CONSTANT_BUFFER_SELECTOR with a zero size Kepler complains about this and it's unnecessary since we set ENABLE_FALSE whenever we have a zero size anyway. Fixes: `55413e33dc` ("nvk: Disable all cbufs in nvk_queue_init_context_draw_state()") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27565>	2024-02-12 18:47:07 +00:00
Sviatoslav Peleshko	28ad2f488a	anv: Store host-located copy of NULL surface state for faster memcpy Real null_surface_state is located in the GPU memory, so copying from there will be slow for dGPUs. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10594 Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27577>	2024-02-12 17:48:15 +00:00


184483 Commits