diff --git a/docs/relnotes.rst b/docs/relnotes.rst
index 731c3384143..3c22bc8b5f4 100644
--- a/docs/relnotes.rst
+++ b/docs/relnotes.rst
@@ -3,6 +3,7 @@ Release Notes
 
 The release notes summarize what's new or changed in each Mesa release.
 
+- :doc:`24.0.0 release notes `
 - :doc:`23.3.4 release notes `
 - :doc:`23.3.3 release notes `
 - :doc:`23.3.2 release notes `
@@ -408,6 +409,7 @@ The release notes summarize what's new or changed in each Mesa release.
    :maxdepth: 1
    :hidden:
 
+   24.0.0
    23.3.4
    23.3.3
    23.3.2
diff --git a/docs/relnotes/24.0.0.rst b/docs/relnotes/24.0.0.rst
new file mode 100644
index 00000000000..cf8feba9a65
--- /dev/null
+++ b/docs/relnotes/24.0.0.rst
@@ -0,0 +1,4455 @@
+Mesa 24.0.0 Release Notes / 2024-02-01
+======================================
+
+Mesa 24.0.0 is a new development release. People who are concerned
+with stability and reliability should stick with a previous release or
+wait for Mesa 24.0.1.
+
+Mesa 24.0.0 implements the OpenGL 4.6 API, but the version reported by
+glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
+glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
+Some drivers don't support all the features required in OpenGL 4.6. OpenGL
+4.6 is **only** available if requested at context creation.
+Compatibility contexts may report a lower version depending on each driver.
+
+Mesa 24.0.0 implements the Vulkan 1.3 API, but the version reported by
+the apiVersion property of the VkPhysicalDeviceProperties struct
+depends on the particular driver being used.
+
+SHA256 checksum
+---------------
+
+::
+
+    TBD.
+
+
+New features
+------------
+
+- VK_EXT_image_compression_control on RADV
+- VK_EXT_device_fault on RADV
+- OpenGL 3.3 on Asahi
+- Geometry shaders on Asahi
+- GL_ARB_texture_cube_map_array on Asahi
+- GL_ARB_clip_control on Asahi
+- GL_ARB_timer_query on Asahi
+- GL_EXT_disjoint_timer_query on Asahi
+- GL_ARB_base_instance on Asahi
+- OpenGL 4.6 (up from 4.2) on d3d12
+- VK_EXT_depth_clamp_zero_one on RADV
+- GL_ARB_shader_texture_image_samples on Asahi
+- GL_ARB_indirect_parameters on Asahi
+- GL_ARB_viewport_array on Asahi
+- GL_ARB_fragment_layer_viewport on Asahi
+- GL_ARB_cull_distance on Asahi
+- GL_ARB_transform_feedback_overflow_query on Asahi
+- VK_KHR_calibrated_timestamps on RADV
+- VK_KHR_vertex_attribute_divisor on RADV
+- VK_KHR_maintenance6 on RADV
+- VK_KHR_ray_tracing_position_fetch on RADV
+- EGL_EXT_query_reset_notification_strategy
+
+
+Bug fixes
+---------
+
+- vlc crashes when playing 1920x1080 video with Radeon RX6600 hardware acceleration and deinterlacing enabled.
+- [radeonsi] Regression: graphical artifacting on water texture in OpenGOAL
+- Assertion when creating dmabuf-compatible VkImage on Tigerlake
+- VAAPI: EFC on VCN2 produces broken H264 video and crashes the HEVC encoder
+- [AMDGPU RDNA3] Antialiasing is broken in Blender
+- MTL: vulkan cooperative matrix tests gpu hang on MTL
+- Assassin's Creed Odyssey wrong colors on Arc A770
+- The Finals fails to launch with DX12 on Intel Arc unless "force_vk_vendor" is set to -1.
+- VA-API CI tests freeze +- radv: games render with garbage output on RX5600M through PRIME with DCC +- radv: RGP reports for mesh shaders are confusing +- zink crashes on nvidia +- d3d10umd: Build failure regression with MSVC during 23.3 development cycle +- Error during SPIR-V parsing of OpCopyLogical +- rusticl: fails to find SPIRV-Tools headers via pkg-config under non-default prefix +- Conservative depth output doesn't work with RADV +- RADV: DOA-X3 (yuzu) missing hair, eyes and skybox +- intel: Require 64KB alignment when using CCS and multiple engines +- radv: Atlas Fallen corrupted rendering +- r300: nir pass to lower indirect regression +- r300: LRP present even with .lower_flrp32=true +- 23.3.2 regression: kms_swrast_dri.so segfaults +- Radeon: YUYV DMA BUF eglCreateImageKHR fails +- No support for a644 +- anv: importing memory for a compressed image using modifier is hitting an assert +- anv: importing memory for a compressed image using modifier is hitting an assert +- anv: importing memory for a compressed image using modifier is hitting an assert +- Large regression in \`glbench --tests context` on Intel +- Android 14 depends on Vulkan EXT_swapchain_maintenance1, which breaks radv +- nvk,nak: Implement shaderFloat64 +- Mesa is not compatible with Python 3.12 due to use of distutils +- anv: glcts regression on zink +- nir: Trivial loop not unrolling +- Possible regression with AMD GPU with flatpak apps +- nvk,nak: Implement VK_KHR_vulkan_memory_model +- Compiling Mesa with X in custom prefix fails in Intel Vulkan driver +- anv: implement recommended AUX-TT invalidation on compute/transfer queues +- anv: implement recommended AUX-TT invalidation on compute/transfer queues +- !26307 broke some piglit tests with rusticl on radeonsi on Navi 14 +- Compute shader with imageStore() to a swapchain image (from a display surface) produces incorrect results (Raspberry, Vulkan). 
+- nvk: Implement VK_EXT_multi_draw +- radv/aco: Crysis 2 Remastered RT reflections are blocky around the edges with ACO, renders normally with LLVM +- radv: Major regression in main branch causing all Vulkan apps to crash on 6600M (Navi 23) +- [23.3.0] Parallel build failure - fatal error: vtn_generator_ids.h: No such file or directory +- crocus: Assertion failures in NIR divergence analysis +- nak: Implement nir_op_fmulz +- nvk,nak: Implement VK_KHR_shader_float_controls +- 748b7f80ef1cf6a3fed9991d70230e69fef51a0e - Regression on Doom Eternal w/ RT Reflections +- glFlush() blocks until close to GPU completion on Radeon R9 270 +- nvk: Implement VK_EXT_texel_buffer_alignment +- rusticl: fails to find X11 headers via pkg-config under non-default prefix +- nvk,nak: Implement VK_EXT_shader_image_atomic_int64 +- nvk,nak: Implement VK_KHR_shader_atomic_int64 +- nvk,nak: Implement VK_KHR_shader_subgroup_extended_types +- nvk,nak: Implement shaderInt64 +- nvk: Implement VK_EXT_subgroup_size_control +- mesa:freedreno / afuc-disasm unit test failure +- anv: Resident Evil 2 hang +- Mesa 23.3.0 release build fails on 22.04 LTS +- Segfault in SDL2 game when using environment variables: \`SDL_VIDEODRIVER=wayland DRI_PRIME=1` +- Mesa 22.3.0 SEGFAULT in nir shader creation for r600 cards on FreeBSD +- radeonsi: merge request 26055 causes thousands of piglit failures +- iris: INTEL_COMPUTE_CLASS causes gpu hangs on MTL platforms +- anv: piglit tests regressed for zink +- aco,radeonsi: GFX11 dEQP-GLES31.functional.separate_shader.random.0 fail when AMD_DEBUG=useaco +- crash in si_update_tess_io_layout_state during _mesa_ReadPixels (radeonsi_dri, mesa 23.2.1) +- Compilation error with current LLVM git (createLoopSinkPass) +- [RADV] War Thunder has some grass flickering. 
+- radv: satisfactory broken shader +- RADV problem with R7 M440 in some games +- nvk,nak: Weird fog effect in old GTA games with DXVK +- gpu driver crashes when opening ingame map playing dead space 2023 +- [anv] Valheim water misrendering +- radv, zink: dEQP-GLES3.functional.fbo.msaa.4_samples.depth_component16 fails on gfx9 +- Armored Core 6 (1888160) fake_sparse support +- radv: fix sparseResidencyImage3D on GFX8 +- build still broken on Slackware 15.0 i586 +- mesa fails to build on arch +- EGL/v3d: EGL applications under a X compositor doesn't work +- nvk,nak: Implement VK_KHR_fragment_shader_barycentric +- RADV: trunc_coord breaks ambient occlusion in Dirt Rally and other games +- radv: Mass Effect Legendary Edition: a line going across the screen is visible in some areas with Ambient Occlusion enabled +- LTO-related build failures +- anv: DIRT5 gfx11_generated_draws_spv_source triggers "assert(!copy_value_is_divergent(src) || copy_value_is_divergent(dest));" +- nvk: Implement VK_KHR_synchronization2 +- nvk: Implement bufferDeviceAddressCaptureReplay +- nvk,nak,codegen: Implement VK_KHR_pipeline_executable_properties +- panfrost: gbm_bo_get_offset() wrongly returns 0 for second plane of NV12 buffers +- Sastisfactory since Update 8 needs force_vk_vendor set +- [RADV][TONGA] - BeamNG.drive (284160) - Artifacts are present when looking at the skybox. 
+- LEGO Star Wars: The Skywalker Saga graphical glitches (DXVK) on R9 380 +- [radv] Crypt not rendering properly +- Leaks of DescriptorSet debug names +- [Tracing flake] Missing geometry in trace\@freedreno-a630\@freedoom\@freedoom-phase2-gl-high.trace +- Unreal Engine 5.2 virtual shadow maps have glitchy/lazy tile updates +- RADV: Visual glitches in Unreal Engine 5.2.1 when using material with anisotropy and light channel 2 +- radv: Regression with UE5 test +- SIGSEGV with MESA_VK_TRACE=rgp and compute only queue +- mesa: vertex attrib regression +- [ANV] Corruptions in Battlefield 4 +- anv regression w/ commit e488773b29d97 ("anv: Fast clear depth/stencil surface in vkCmdClearAttachments") +- freedreno uses wrong patch size +- ir3: dEQP-GLES31.functional.synchronization.inter_invocation.image_atomic_read_write crash on a6xx gen4 +- a630: antichamber crashes with pack_A6XX_GRAS_CL_GUARDBAND_CLIP_ADJ: Assertion +- mesa:amd+compiler / aco_tests assembler.gfx11.vop12c_v128/gfx11 failure with llvm-17 +- ci_run_n_monitor crash because of incorrect parsing of dag +- Zink + Venus: driver can't handle INVALID<->LINEAR! 
+- anv not initializing engine correctly with INTEL_COPY_CLASS=1 +- Anv: Particles have black square artifacts on Counter Strike 2 on Skylake +- Lords of the Fallen 2023 Red Eye mode crashing game and desktop +- [radeonsi] [vulkan] [23.3-rc1 regression] Video output corrupted in QMplay2 with Vulkan renderer +- [BISECTED] ac/radeon commit somehow breaks nv12 surface from HEVC decode +- radv: Chrome crashes when ANGLE uses GPL +- Parsec displays completely green screen with hardware decoder selected while using Mesa 23.3 and Mesa 24 +- H264 to H264 transcode output corruption with gst-vaapi +- opencl-jpeg-encoder does not work with nouveau/rusticl, works with nouveau/clover +- [rusticl] [radeonsi] [darktable4] [ppc64le] Darktable always renders black images despite not throwing any error +- [R600] X-plane 11 demo (Linux Native) crashes upon launch on HD5870 and HD6970 +- [CI] .gitlab-ci/setup-test-env.sh date -d parsing fails on Alpine Linux containers +- ANV not handling VkMutableDescriptorTypeCreateInfoEXT::pMutableDescriptorTypeLists[i] being out of range +- Ubuntu 23.10 build error with rusticl_opencl_bindings.rs +- Rusticl fails to build +- tu: Wolfenstein: The New Order misrenders on a740 +- DRI_PRIME fails with ACO only radeonsi +- ci_run_n_monitor: undetected sanity dep breaks the pipeline + + +Changes +------- + +Alejandro Piñeiro (10): + +- broadcom/qpu: use back BITFIELD64_RANGE for ANYOPMASK +- broadcom/compiler: add v3d_pack_unnormalized_coordinates helper +- broadcom: only support v42 and v71 +- broadcom/compiler: set properly lod query +- broadcom/cle: remove v33 and v41 from xml definition +- broadcom/cle: rename xml files +- docs/v3d: update v3d documentation +- nir: add new opcodes to map new v71 packing/conversion instructions +- broadcom/compiler: update image store lowering to use v71 new packing/conversion instructions +- broadcom/compiler: remove one superfluous call to nir_opt_undef + +Alessandro Astone (2): + +- asahi: Use the compat version 
of qsort_r +- zink: Fix resizable BAR detection logic + +Alexander von Gluck IV (3): + +- egl/haiku: Cleanup includes; minor build fix +- hgl: Redefine visual options in hgl_context.h +- egl/haiku: Remove some dead cleanup code + +Alyssa Rosenzweig (286): + +- hasvk: Support builiding on non-Intel +- crocus: Support building on non-Intel +- meson: Add vulkan-drivers=all option +- meson: Add gallium-drivers=all option +- gitlab: Highlight .cl as C +- nir,vtn: Add exported bool to nir_function +- nir: Add nir_remove_non_exported +- nir/builder: Add nir_call helper +- meson: Simplify clc expression +- meson: Require clc for asahi +- vtn: Add spirv_library_to_nir_builder feature +- clc: Add missing idep_vtn +- agx: Fix lower regular texture metadata +- agx: Vectorize load/stores +- agx: Fuse (unmasked) extr_agx +- agx: Fuse ubitfield_extract +- asahi: Fix agx_pack unrolling +- asahi: Make GenXML compatible with OpenCL +- asahi: Unpack at 32-bit granularity +- asahi: Reexpress genxml pack macro +- asahi: Add folder for internal shaders +- asahi: Add asahi_clc infrastructure +- asahi: Pass valid memctx to open_device +- asahi: Deserialize libagx when opening device +- asahi,agx: Plumb libagx +- asahi: Add software-defined field to texture desc +- agx: Use CL for texture lowerings +- asahi: Remove placeholder shader +- asahi: Fix tools=all builds +- ci: Opt out asahi from clang-format +- ttn: Set sample shading for sample ID reads +- compiler: Make shader_enums.h CL-safe +- compiler: Inline mesa_vertices_per_prim +- compiler: Make u_decomposed_prims_for_vertices available to CL +- nir/lower_gs_intrinsics: Include primitive counts +- nir/lower_gs_intrinsics: Append EndPrimitive +- nir/lower_gs_intrinsics: Count decomposed primitives too +- nir: Also gather decomposed primitive count +- nir: Add intrinsics for lowering GS +- nir: Add intrinsics for lowering bindless textures/samplers +- nir/print: handle adjacency +- asahi: Clamp 8-bit integer RTs +- agx: Legalize image MS 
index +- agx: Fix fragment side effects scheduling +- agx: Check for spilling in release builds +- docs/features: Mark ARB_mdi done on asahi +- agx: Cleanup 8-bit math before lowering +- agx: Require 32-bit alignment for EOT offset +- agx: Add scaffolding for subgroup ops +- agx: Translate simple subgroup ops +- asahi: Pack non-border colour sampler desc +- agx: Allow drivers to lower texture handles +- asahi: Lower samplers to bindless if needed +- agx: Lower LOD bias earlier +- agx: Handle bindless samplers +- asahi: Handle load_sampler_handle +- asahi: Add sampler heap data structure +- asahi: Use the sampler heap +- asahi: Upload tex/samplers properly with merged shaders +- asahi: Don't hazard track fake resources +- asahi: Refactor encoder data structure +- asahi: Factor out agx_launch +- asahi: Make encoder_allocate public +- asahi: Add data structures for geometry shaders +- asahi: Add helpers for lowering GS +- asahi: Add GS lowering pass +- asahi: Wire up geometry shaders +- asahi: Advertise geometry shaders +- asahi: rm unused deqp debug flag +- asahi: Don't use OpenGL clip bit +- asahi: Plumb clip_halfz bit from RS +- asahi: Advertise ARB_clip_control +- asahi: Implement timer queries +- docs: Mark timer queries as done on asahi +- asahi: Implement ARB_base_instance +- nir: Simplify nir_alu_instr_channel_used definition +- nir/validate: Optimize ssa_srcs set +- nir/validate: Don't spam nir_alu_instr_channels +- nir/validate: Don't validate out-of-bounds channels +- nir/validate: Use unlikely for validate_assert +- nir/validate: Don't check dimensions in validate_def +- nir/validate: Drop stale todo +- nir/validate: Inline validate_ssa_src +- nir/validate: Split out validate_sized_src +- nir/validate: Specialize if source validation +- panfrost: Add an allow_rotating_primitives() helper +- panfrost: Factor out vertex attribute stride calculation +- panfrost: Add panfrost_get_{position,varying}_shader() helpers +- gallium: add pipe_shader_from_nir helper 
+- radeonsi: use pipe_shader_from_nir +- v3d: use pipe_shader_from_nir +- asahi: use pipe_shader_from_nir +- vc4: use pipe_shader_from_nir +- zink: use pipe_shader_from_nir +- nouveau: use pipe_shader_from_nir +- panfrost: use pipe_shader_from_nir +- gallium: drop pipe_shader_state_from_nir +- mesa/st: collapse tgsi deadcode +- mesa/st: use pipe_shader_from_nir +- nir/lower_tex: Add 1D lowering +- agx: fix 1D texture sampling +- ac,radv,radeonsi: use common 1D texture lowering +- nir/format_convert: handle clamping smaller bit sizes +- nir/lower_idiv: Optimize idiv sign calculation +- agx: Hotfix for stack_adjust in GS +- asahi/decode: Decode multiple macOS commands +- asahi: Quiet clang warning +- asahi: Add half float type to genxml +- asahi: Add XML for hw tessellation +- asahi: Identify Primitive ID frag input +- asahi: Identify bicubic filtering mode +- asahi: fix index bias with GS/XFB +- asahi: Sync heap size +- asahi: init clear colour between batches +- asahi: clamp clear colours +- asahi: handle self blits +- asahi: bump limits +- asahi: remove bogus assertion +- asahi: be robust about null xfb +- asahi: fix dirty tracking fail with point sprites +- asahi: handle null PBE +- asahi: Be robust with arrays of images +- asahi: fix imageSize of null image +- asahi: rm compact image atomic descriptors +- asahi: use 2D descriptors for cubes +- asahi: defer texture packing to draw-time +- ail: handle >4GiB textures +- asahi: return GL_OOM for excessive image sizes +- asahi: fix meta usc builder allocation +- asahi: implement xfb stream queries +- asahi: fix output to non-rast streams +- asahi: bump glsl version +- asahi: minify when blitting for transition +- asahi: blit with the old format when transitioning +- asahi: flush before resource transition +- agx: Fix flatshading of matrices +- asahi: fix xfb of pointsize when not drawing points +- asahi: defeature quads +- asahi: Rotate tri fans based on provoking vtx +- asahi: use GS for first-provoking fans +- 
asahi: Early out for GS + rast discard +- asahi: Implement draw parameters +- agx: wire up texture_samples/image_samplers +- asahi: advertise ARB_shader_texture_image_samples +- asahi: fix layout transitions with arrays +- asahi: use correct target packing PBE +- asahi: choose staging bind better +- asahi: fix destroy_query leaving dangling references +- asahi: add agx_push macro +- asahi: collapse unreachable condition +- asahi: use agx_push +- asahi: remove dead declarations +- asahi: rm unnecessary uniform upload for GS +- asahi: make UB easier to see +- asahi: force GS for indirect prim gen query +- asahi: rework GS input assembly +- asahi: Implement multidraw indirect +- asahi: move heap alloc to first use +- asahi: double depth bias +- asahi: add static assert +- agxdecode: fix stack smash with border colour +- asahi: Support L/A/I formats for texture buffers +- asahi: fix tri fan enum +- asahi: rework cf binding xml +- asahi: add xml for flatshading fans +- agx: fix VARYING_SLOT_COL0 getting flatshaded +- agx: Avoid scratch mem with tri strip w/ adjacency +- agx: rework libagx linking a bit +- asahi: Unroll GS/XFB primitive restart on the GPU +- asahi: Lower edge flags +- asahi: assert hw invariant +- asahi: rewrite pointsize handling +- agx: remove spurious z/s writes in force early-z shaders +- agx: handle force early-z + discard +- agx: note that sample_mask runs occlusion queries +- agx: allocate varying slot if writing viewport only +- agx: report if we have a nonzero viewport +- asahi: allow empty scissor box +- asahi: add XML for multiple viewports +- asahi: Implement ARB_viewport_array +- asahi: handle some components/offsets in GS lowering +- asahi: prepare gs copy shaders for compact clip/cull +- asahi: handle compact clip/cull in gs component gather +- asahi: Implement ARB_cull_distance +- asahi: add more BGR formats +- asahi: fix dupe rgb65 formats +- asahi: fix pbe swizzling +- asahi: fix integer RT clamping +- agx: fix fp64 lowering options +- 
agx: Lower 64-bit I/O to 32-bit +- agx: don't produce split of immediate +- asahi: fix size calculation for 2d msaa arrays +- asahi: allow more format reinterpretation +- asahi: respect render condition for compute +- asahi: wire up hardware gl_PrimitiveID +- asahi: clamp draw count for mdi +- gallium: fix util_clamp_color type confusion +- gallium: add PIPE_IMAGE_ACCESS_DRIVER_INTERNAL +- nir/validate: allow bias on nir_texop_lod +- asahi: Implement lod queries +- vtn: fuse OpenCL mad if we can can +- asahi: fix eMRT + background load interaction +- ail: add is_level_compressed query +- ail: use is_level_compressed +- ail: add ail_is_level_twiddled_uncompressed +- asahi: do not use compression blits for uncompressed levels +- agx: allow bindful arrays if not clamping +- asahi: don't format convert with staging blits +- asahi: implement arrays as 2d for internal images +- asahi: respect last_block +- asahi: allow compressed image stores in blits +- asahi: fix image_mask with unbind num trailing +- asahi: add compute blitter +- asahi: add and use batch_is_compute helper +- asahi: fix get_batch with compute batches +- asahi: allow multiple compute dispatches in a batch +- asahi: drop custom mipmap generate +- asahi: set data_valid on first draw +- asahi: fix data valid tracking +- asahi: reduce transfer map flushing with staging blits +- asahi: do not stall for writers with invalid mips +- asahi: implement blit-based resource_copy_region +- asahi: fix snorm staging blits +- asahi: use copy region for decompression +- asahi: fix scissor arrays +- asahi: disable compute-based blitter for now +- agx: use more mem->tex barriers even on g13g +- agx: fix early-z + discard together +- asahi: fix set_sampler_views +- asahi: fix max tex sizes +- agx: optimize fcmp like fcmpsel +- agx: wire up some ballots +- agx: lower votes to ballots +- agx: implement query_levels +- agx: skip scoreboard bit in builder for !wait +- agx: make vec widths explicit in IR +- agx: validate 
post-RA +- agx: rm silly todo +- agx: rm outdated comment +- agx: add index size helper +- agx: trust in agx_index size +- agx: mv agx_read/write_regs to validator +- agx: use custom assert when packing +- agx: use mov imm for pcopies +- agx: allow phis with 16bit imms +- agx: prepare for immediates in phis +- agx: handle imm inlining into phis +- asahi: rework compute emptiness tracking +- asahi: stub qbo on the cpu +- asahi: implement xfb overflow queries +- agx: const fold after discard lowering +- agx: fix xfb of invalid comp +- agx: fix xfb of invalid var +- asahi: bump vertex shader outputs +- asahi: rm pointless multisample key bit +- asahi: rm layered bit from shader key +- asahi: implement point sprites w/o shader key +- asahi: rm unused blend enable bit +- asahi: rm logicop enable bit +- asahi: rm nr_cbufs from key +- asahi: rm blend->store from shader key +- asahi: rm vbuf.count from key +- asahi: rm agx_vbufs wrapper +- asahi: invert program_point_size +- asahi: divide by xfb stride for xfb draws +- asahi: disable fp16 cbuf cap +- asahi: add missing GS line strip (+adj) handling +- asahi: link libagx before lowering mem access widths +- asahi: cl-ify some xfb logic +- asahi: factor out libagx_map_vertex_in_tri_strip +- asahi: rotate xfb'd tri strips +- asahi: inline something silly +- asahi: plumb get_ubo_size +- asahi: make txf robust properly +- asahi: fix passthrough GS with poly modes +- asahi: add missing tib alignment check +- agx: optimize split(64-bit uniform) +- agx: expand agx_index +- agx: fix 64-bit phis with inlined immediates +- agx: add unit test for pcopy lowering bug +- agx: require min alignment for load/store vectorize +- asahi: fallback some resource copies +- asahi: don't canonicalize nans/flush denorms when copying +- agx: unit test split uniform opt +- agx: clang-fmt +- nir,zink: Redefine flat_mask in terms of I/O locations + +Andrew Gazizov (4): + +- venus: Add use_guest_vram capset to enable guest-based blob alloc +- venus: Use 
vk_object_id as blob_id for guest_vram device memory alloc +- venus: Tighten the conditions for guest_vram device memory alloc +- venus: Make sure that guest allocated blobs from hostmem are mappable + +Anthony Roberts (1): + +- glsl: Use unsigned instead of enum type in ir_variable_data + +Antoine Coutant (1): + +- clc: retrieve libclang path at runtime. + +Antonio Gomes (14): + +- rusticl, meson: Move libc functions to their own crate +- rusticl, meson: Add gl/egl/glx bindings +- iris: Fixups in resource_get_handle and resource_from_handle +- mesa/st: Add new data to mesa_glinterop +- mesa/st, dri2, wgl, glx: Modify flush_objects interop func to export a fence_fd +- rusticl: Add xplat helpers to dynamic link interop functions +- rusticl/device: Function to check for gl interop support +- rusticl/device: Enable gl_sharing only if create_fence_fd is implemented +- rusticl: Add functions to create CL ctxs from GL, and also to query them +- rusticl/format: Add conversion table for GL->CL +- rusticl: Create CL mem objects from GL +- rusticl: Add support for cube maps +- rusticl: Flush objects just before importing them +- rusticl: Advertise cl_khr_gl_sharing extension + +Anuj Phogat (1): + +- intel/l3: Adjust URB weight calculation for gfx12.5+. + +Asahi Lina (12): + +- asahi: Fix CDM Launch/Barrier naming +- asahi: Add extra CDM barrier bit for G13X +- asahi: Move USC cache flush to agx_batch_init_state +- asahi: Add more memory barrier opcodes +- asahi: Add extra barrier for texture atomics on G13X +- ail: Fix miptree offset generation for compressed textures +- ail: Add explicit specification of mip level strides +- ail: Fix tile size & strides for compressed textures +- asahi: Add .editorconfig for CL files +- asahi: Implement BO alignment +- agx: Fix packing of stack map/unmap +- agx: Add scoreboarding to stack instructions + +Bas Nieuwenhuizen (11): + +- radv: Add DGC preprocessing barrier support. +- radv: Add compute DGC preprocessing support. 
+- radv: Add some initial graphics DGC preprocessing support. +- radv: Add implementation of cmd buffers for a sparse binding queue. +- radv: Remove the sparse binding queue from coherent images. +- radv: Move sparse binding into a dedicated queue. +- nir: Add nir_static_workgroup_size helper. +- nir: Add pass for clearing memory at the end of a shader. +- radv: Add option to clear LDS at the end of a shader. +- radeonsi: Add support to clear LDS at the end of a shader. +- radv: Use correct writemask for cooperative matrix ordering. + +Benjamin Lee (14): + +- nak: make sm available in builders +- nak: Legalize a bunch of instructions for SM50 +- nak: add IADD instruction for SM50 +- nak: implement ST* and LD* on SM50 +- nak: add ATOM{G,S} encoding for SM50 +- nak: add carry register file +- nak: move iadd64 construction to a builder method +- nak: use carry register file for IADD2 +- nak: make as_imm_not_{i,f}20 helper methods public +- nak: implement SHL and SHR on SM50 +- nak: implement IMUL for SM50 +- nak: encode Dst::None as RZ on SM50 +- nak: implement SHFL on SM50 +- nak: implement VOTE on SM50 + +Boris Brezillon (74): + +- pan/genxml: Fix "{Last,First} Heap Chunk" field position +- panfrost: Fix format_minimum_alignment() for v6- +- pan/bo: Make sure we catch refcnt underflows +- pan/genxml: Fix 'Shader Program' descriptor definition on v9 and v10 +- pan/decode: Print the resource table label +- pan/decode: Make CSF decoding more robust to NULL pointers +- pan/decode: Fix the pan_unpack() call for JUMP instruction unpacking +- panfrost: Flag the right shader when updating images +- panfrost: Kill unused panfrost_batch::polygon_list field +- panfrost: Emit attribs in panfrost_update_state_3d() on bifrost/midgard +- panfrost: Emit image attribs for compute in panfrost_update_shader_state() +- panfrost: Rename panfrost_vtable::context_init +- panfrost: Inline pan_emit_tiler_heap() +- panfrost: Inline pan_emit_tiler_ctx() +- panfrost: Count draws at the batch 
level +- panfrost: Express the per-batch limit in term of draws +- panfrost: Count the number of compute jobs at the batch level +- panfrost: Make panfrost_has_fragment_job() public +- panfrost: Stop using the scoreboard to check the presence of draws/compute +- panfrost: Store the fragment job descriptor address in the batch +- panfrost: Emit the fragment job from panfrost_batch_submit() +- panfrost: Move the panfrost_emit_tile_map() call around +- panfrost: Get rid of unused in_sync parameter in panfrost_batch_submit[_ioctl]() +- panfrost: Get rid of the out_sync parameter in panfrost_batch_submit_jobs() +- panfrost: Get rid of unused fb parameter passed to panfrost_batch_submit_jobs() +- panfrost: Add a submit_batch() hook to panfrost_vtable +- panfrost: Store the index pointer in panfrost_batch +- panfrost: Stop passing vertex attribute arrays around +- panfrost: Store varying related fields in panfrost_batch +- panfrost: Use u_reduced_prim() to do the is_line check +- panfrost: Move JM specific fields to their own struct +- panfrost: s/panfrost_emit_vertex_tiler_jobs/jm_push_vertex_tiler_jobs/ +- panfrost: Move the JM-specific bits out of emit_fragment_job() +- panfrost: Rename several job emission helpers +- panfrost: Factor out the point-sprite shader update logic +- panfrost: Factor out the vertex count logic +- panfrost: Re-order things in panfrost_direct_draw() +- panfrost: Move all JM-specific bits out of panfrost_direct_draw() +- panfrost: Use batch->tls.gpu to store the compute TLS descriptor +- panfrost: Move JM-specific bits out of panfrost_launch_grid_on_batch() +- panfrost: Move JM specific bits out of panfrost_launch_xfb() +- panfrost: Drop the vertex_count argument passed to panfrost_batch_get_bifrost_tiler() +- panfrost: Rename panfrost_batch_get_bifrost_tiler() +- panfrost: s/panfrost_emit_shader/jm_emit_shader_env/ +- panfrost: s/panfrost_emit_primitive/jm_emit_primitive/ +- panfrost: Rename JM-specific batch submission helpers +- panfrost: 
s/preload/jm_preload_fb/ +- panfrost: s/init_batch/jm_init_batch/ +- panfrost: Prepare things for the common/JM cmdstream split +- panfrost: Move JM helpers to their own source file +- panfrost: Add a JOBX() macro to simplify job-frontend selection +- panfrost: Fix multiplanar YUV texture descriptor emission on v9+ +- panfrost: Don't leak NIR compute shaders +- panfrost: s/pan_scoreboard/pan_jc/ +- panfrost: Rename pan_cs.{c,h} into pan_desc.{c,h} +- panfrost: Make pan_afbc_compression_mode() per-gen +- panfrost: Restrict job chain helpers to JM hardware +- panfrost: Restrict job descriptor emission to JM hardware +- util/hash_table: Use FREE() to be consistent with the CALLOC_STRUCT() call +- util/hash_table: Don't leak hash_u64_key objects when the entry exists +- util/hash_table: Don't leak hash_key_u64 objects when the u64 hash table is destroyed +- panfrost: Abstract kernel driver operations +- pan/kmod: Add a backend for the panfrost kernel driver +- panfrost: Avoid direct accesses to some panfrost_device fields +- panfrost: Avoid direct accesses to some panfrost_bo fields +- panfrost: Back panfrost_device with pan_kmod_dev object +- panfrost: Add a VM to panfrost_device +- panfrost: Back panfrost_bo with pan_kmod_bo object +- panfrost: Introduce a PAN_BO_SHAREABLE flag +- panvk: Pass PAN_BO_SHAREABLE when relevant +- panfrost: Flag BO shareable when appropriate +- panvk: Fix tracing +- panvk: Fix access to unitialized panvk_pipeline_layout::num_sets field +- panfrost: Clamp the render area to the damage region + +Boyuan Zhang (4): + +- gallium/pipe: define hevc max slices number +- frontend/va: add support for multi slices reflist +- radeonsi: add new interface to handle multi slice reflist +- radeonsi/vcn: add new logic for hevc multi slices reflist + +Brian King ((MEDIA)) (1): + +- d3d12: Add constraint_set1_flag support + +Caio Oliveira (90): + +- anv: Fix leak when compiling internal kernels +- intel/compiler: Remove unused parameter from 
brw_nir_adjust_payload() +- intel/compiler: Take more precise params in brw_nir_optimize() +- intel/compiler: Remove unused parameter from brw_nir_analyze_ubo_ranges() +- intel/compiler: Clarify the asserts in nir_load_workgroup_id lowering +- intel/compiler: Rework opt_split_sends to not rely/modify LOAD_PAYLOAD +- intel/compiler: Re-enable opt_zero_samples() for Gfx7+ +- intel/compiler: Re-enable opt_zero_samples() in many cases for Gfx12.5 +- intel/compiler: Remove is_tex() +- intel/compiler: Use linear allocator in parts of brw_schedule_instructions +- intel/compiler: Remove reference to brw_isa_info from schedule_node +- intel/compiler: Allocate all schedule_nodes at once +- intel/compiler: Use array to iterate the scheduler nodes +- intel/compiler: Add only available instructions to scheduling list +- intel/compiler: Extract scheduling related basic functions +- intel/compiler: Cache issue_time information +- intel/compiler: Remove virtual calls from scheduler +- intel/compiler: Move FS specific fields to fs_instruction_scheduler +- intel/compiler: Merge child/latency arrays in schedule_node +- intel/compiler: Tidy up code in scheduler related to reads_remaining +- intel/compiler: Move earlier scheduler code that is not mode-specific +- intel/compiler: Separate schedule_node temporary data +- intel/compiler: Make scheduler classes take an external mem_ctx +- intel/compiler: Reuse same scheduler for all pre-RA scheduling modes +- intel/compiler: Clear up block instructions before re-adding them +- intel/compiler: Simplify allocation of NIR related arrays +- intel/compiler: Prefer ctor/dtors in some Google Tests +- intel/compiler: Don't use fs_visitor::bld in tests +- intel/compiler: Don't use fs_visitor::bld in fs_reg_alloc +- intel/compiler: Don't use fs_visitor::bld in thread payload classes +- intel/compiler: Add a few more helpers to fs_builder +- intel/compiler: Allow dumping CFG to a specific FILE* +- intel/compiler: Sort lists of succs and preds in CFG 
dump output +- intel/compiler: Add a few tests to opt_predicated_break +- anv/xe2+: Use Region-based Tessellation redistribution +- iris/xe2+: Use Region-based Tessellation redistribution +- intel/compiler: Refactor program exit in intel_clc +- intel/compiler: Use single variable instead of dynarray +- intel/compiler: Fix memory leaks in intel_clc +- intel/compiler: Remove the linking step in intel_clc +- intel/compiler: Remove unused headers +- intel/compiler: Move NIR emission code to brw_fs_nir.cpp +- intel/compiler: Make a NIR intrinsic emission functions static +- intel/compiler: Make more functions in NIR conversion static +- intel/compiler: Make functions for NIR control flow conversion static +- intel/compiler: Make setup functions of NIR emission static +- intel/compiler: Make non-intrinsic NIR conversion functions static +- intel/compiler: Make NIR atomic conversion functions static +- intel/compiler: Make NIR resources helpers static +- intel/compiler: Move nir_ssa_value into a local structure +- intel/compiler: Move remaining NIR conversion fields to nir_to_brw_state +- intel/compiler: Stop using fs_visitor::bld field in NIR conversion +- intel/compiler: Annotate and use nir_to_brw_state::bld +- intel/compiler: Don't use fs_visitor::bld in remaining places +- intel/compiler: Remove fs_visitor::bld +- intel/compiler: Make fs_visitor not depend on fs_builder +- intel/compiler: Make fs_builder include fs_visitor and not the other way +- intel/compiler: Add ctor to fs_builder that just takes the shader +- intel/compiler: Create and use nir_to_brw() function +- intel/compiler: Use reference instead of pointer for nir_to_brw_state +- intel/compiler: Use reference instead of pointer for fs_visitor +- compiler/glsl: Reduce scope of is_anonymous +- clover: Remove usage of glsl_type C++ helpers +- compiler/types: Add a few more helpers to get builtin types +- intel/compiler: Use C helpers to access builtin types +- compiler: Remove C++ static member pointers to 
builtin types +- intel/compiler: Use glsl_type C helpers +- r600/sfn: Use glsl_type C helpers +- nouveau: Use glsl_type C helpers +- nir: Use glsl_type C helpers +- mesa: Use glsl_type C helpers +- lima: Use glsl_type C helpers +- compiler/types: Add a few more glsl_type C helpers +- glsl: Use glsl_type C helpers +- compiler/types: Remove glsl_type C++ helpers +- compiler/types: Use a typedef for glsl_type +- intel/cmat: Add pass to lower cooperative matrix to subgroup operations +- intel/dev: Add cooperative matrix configuration information +- anv: Implement VK_KHR_cooperative_matrix +- util: Add a way to set the min_buffer_size in linear_alloc +- spirv: Use linear_alloc for parsing-only data +- spirv: Use value_id_bound to set initial memory allocated +- intel/fs: Only allocate acp_entry if we are adding one +- intel/fs: Use linear allocator in opt_copy_propagation +- intel/fs: Use linear allocator in fs_live_variables +- anv: Don't print warnings for GRL kernel compilations +- intel/compiler: Use INTEL_DEBUG=cs to ask for brw_compiler output +- nir: Disable -Wmisleading-indentation when compiling with GCC +- ci: Add Werror=misleading-indentation to debian-clang +- intel/compiler: Fix rebuilding the CFG in fs_combine_constants + +Casey Bowman (1): + +- anv: Override vendorID for Diablo IV + +Chia-I Wu (14): + +- radv: fix vkCmdCopyImage2 for emulated etc2/astc +- radv: stop using vk_render_pass_state::render_pass +- vulkan, tu, pvr: remove vk_render_pass_state::render_pass +- radv: fix image view extent override for astc +- radv: minor clean up to image view extent override +- ac: be careful with stencil_offset override +- radv: disable TC-compat htile on GFX9 in some cases +- radv: fix VkDrmFormatModifierProperties2EXT for multi-planar formats +- radv: fix VkSubresourceLayout2KHR for multi-planar formats with modifiers +- radv: fix a typo in radv_image_view_make_descriptor +- radv: fix asserts for radv_init_metadata +- radv: convert a check in radv_get_memory_fd 
to assert +- vk/util: ignore unsupported feature structs +- Revert "vk/util: ignore unsupported feature structs" + +Chris Spencer (7): + +- meson: Add option to ignore artificial Android limitations +- android.mk: Add option to pass arbitrary parameters to meson +- anv/android: Only limit advertised Vulkan version in strict mode +- radv/android: Only limit advertised Vulkan version in strict mode +- v3dv/android: Only limit advertised Vulkan version in strict mode +- vn/android: Only limit advertised Vulkan version in strict mode +- vulkan/android: Only limit advertised extensions in strict mode + +Christian Gmeiner (13): + +- agx: Re-index nir defs to reduce memory usage +- ci/etnaviv: Update ci expectation +- etnaviv: rs: Call etna_rs_gen_clear_surface(..) when needed +- etnaviv: Mark etna_rs_gen_clear_surface(..) private +- docs: Update etnaviv extensions +- etnaviv: Update headers from rnndb +- etnaviv: Add static_assert(..) to catch memory corruption +- isaspec: Add bool_inv type to print inverted bools +- etnaviv: Add isaspec support +- etnaviv: disassembler: Switch to isaspec +- mesa: Drop not used program_written_to_cache +- nir/opt_peephole_select: handle speculative ubo loads +- pan/mdg: Use nir_builder for load_sampler_lod_parameters_pan + +Colin Marc (1): + +- vulkan video: correctly set SPS VUI bits + +Connor Abbott (32): + +- util/rb_tree: Fix editorconfig +- util/rb_tree: Add augmented trees and interval trees +- freedreno/ci: Remove minetest trace +- v3d/ci: Remove minetest trace +- vk,lvp,tu,radv,anv: Add common vk_*_pipeline_create_flags() helper +- vk/graphics_state: Support VK_KHR_maintenance5 +- vk/graphics_state, tu: Rewrite renderpass flags handling +- vk/graphics_state: Support VK_EXT_attachment_feedback_loop_dynamic_state +- vk/graphics_state: Add vk_pipeline_flags_feedback_loops helper +- tu: Assume no raster-order attachment access with NULL DS/blend state +- tu: Fix order of rasterizer_discard check +- tu: Make sure copies to half-float 
formats are bit exact +- tu: Fix getting VkDescriptorSetVariableDescriptorCountLayoutSupport +- ir3/ra: Don't swap killed sources for early-clobber destination +- nir: Add quad vote intrinsics +- amd: Implement quad_vote intrinsics +- nir/subgroups: Add option to lower Boolean subgroup reductions +- amd: Enable boolean subgroup lowering +- tu: Fix re-emitting VS param state after it is re-enabled +- tu: Don't use pipeline layout to emit shared const enable +- tu: Rework dynamic offset handling +- tu: Make filling out tu_program_state not depend on the pipeline +- tu: Move shader linking to tu_shader.cc +- freedreno/afuc: Handle store instruction on a5xx +- freedreno/afuc: Add separate "SQE registers" +- freedreno/afuc: Use SQE registers for call stack +- freedreno/afuc: Add syntax for pre-increment addressing +- freedreno/afuc: Decode (sdsN) modifier +- freedreno: Update more control/pipe registers for a7xx +- freedreno/afuc: README updates for a7xx +- freedreno/afuc: Fix gen autodetection for a7xx +- ir3/legalize: Fix helper propagation with b.any/b.all/getone + +Corentin Noël (10): + +- mesa/bufferobj: ensure that very large width+offset are always rejected +- virgl: fill the array_size value when using PIPE_TEXTURE_CUBE +- virgl/texture: Align destination box to block depth +- mesa/ffvs: Use gl_state_index16 in helpers directly +- gallivm: Initialize indir_index to NULL before use +- gallivm/lp_bld_nir_aos: Use TGSI instead of PIPE enum +- mesa: Use a switch for state_iter and be more precise about its type +- frontends/va: Remove wrong use of ProfileToPipe +- virgl: Only send the same amount of data than declared in pipe_sampler_state +- virgl: Assert build_id_note before dereferencing it + +Daniel Almeida (33): + +- nak: derive From for Op through a proc macro +- nak: make Instr::new() generic +- nak: compiler: add From> for Instr +- nak: compiler: replace Instr::new(..) 
with OpFoo {}.into() +- nak: Heap-allocate Instrs +- nak: Do not allocate vectors needlessly in optimization passes +- nak: add support for floor, ceil and trunc +- nak: run nir_lower_frexp and nir_opt_algebraic_late +- nak: more lowerings +- nak: change ishl data type to I32 +- nak: add support for nir_op_isign +- nak: Add support for nir_op_bitcount +- nak: add support for nir_op_bitfield_reverse +- nak: add support for findmsb,findlsb +- nak: add support for packhalf2x16_split +- nak: add support for nir_op_unpack_half_2x16_split_{x|y} +- nak: add support for atomic cmpxcgh on images +- nak/sm50: rewrite encode_iadd2 to not use encode_alu() +- nak: sm50: rewrite fsetp to not use encode_alu +- nak: sm50: Rewrite fmnmx to not use encode_alu +- nak: sm50: rewrite fmul to not use encode_alu +- nak: sm50: rewrite fset to not use encode_alu +- nak: sm50: rewrite iabs to not use encode_alu +- nak: sm50: convert sel to not use encode_alu() +- nak: sm50: convert i2f to not use encode_alu() +- nak: sm50: rewrite encode_f2f to not use encode_alu() +- nak: convert encode_imad to not use encode_alu() +- nak: sm50: rewrite encode_popc to not use encode_alu() +- nak: sm50: rewrite encode_prmt to not use encode_alu() +- nak: sm50: remove encode_alu() and friends +- nak/sm50: remove ALUSrc and friends +- nak/sm50: remove \*fmod* calls from iabs +- nak: sm50: fix ineg legalization + +Daniel Schürmann (24): + +- nir/lower_subgroups: optimize reductions with cluster_size == 1 +- nir: optimize open-coded quadVote* directly to new nir_quad intrinsics +- aco: delete instruction selection for boolean subgroup operations +- nir: remove info.fs.needs_all_helper_invocations +- nir/gather_info: add missing wide subgroup operations +- nir: add info.fs.require_full_quads +- aco: enable helper lanes if shader->info.fs.require_full_quads +- amd: rename max_wave64_per_simd -> max_waves_per_simd +- aco: rename max_wave64_per_simd -> max_waves_per_simd +- radv: fix number of physical SGPRs on 
GFX10+ +- aco: remove VCCZ and EXECZ register handling +- nir/opt_loop: move loop control-flow optimizations into separate pass +- treewide: replace calls to nir_opt_trivial_continues() with nir_opt_loop() +- nir: remove nir_opt_trivial_continues() +- nir: remove redundant passes from nir_opt_if() +- nir/opt_loop_cf: generalize removal of "trivial" continues +- aco: fix should_form_clause() for memory instructions without operands +- aco: form clauses for LDS instructions +- aco: add new post-RA scheduler for ILP +- aco: refactor and speed-up dead code analysis +- nir/opt_move_discards_to_top: don't schedule discard/demote across subgroup operations +- nir/gather_info: fix enumeration of wide subgroup intrinsics +- aco: give spiller more room to assign spilled SGPRs to VGPRs +- aco/insert_exec_mask: Fix unconditional demote at top-level control flow. + +Daniel Stone (7): + +- ci: Try really hard to print final result string +- ci/radeonsi: Occlusion queries are flaky on stoney +- ci: Fix trivial typo in ARTIFACTS_BASE_URL +- panfrost/ci: Remove Vulkan expectations from G57 +- panfrost/ci: Add environment variable to suppress warnings +- panfrost/ci: Skip broken image copy tests +- ci: Re-enable Collabora farm + +Danylo Piliaiev (15): + +- tu: Fix reading of stale (V)PC_PRIMITIVE_CNTL_0 +- tu/a7xx: Zero out A7XX_VPC_PRIMITIVE_CNTL_0 in 3d blits +- tu/a6xx: Exclude REG_A6XX_TPL1_UNKNOWN_B602 from reg stomping +- tu/a7xx: Fix occlusion queries on pre-A740 GPUs +- tu: Always print startup failure messages +- tu: Return error when GPU is unsupported +- freedreno/devices: Support Adreno 725 +- tu: Add a725 workaround dispatch at the start of each cmdbuf +- freedreno/devices: Separate device definition into base + gen features +- freedreno,tu,ir3: Pass fd_dev_info into ir3_compiler_create +- freedreno,tu: Add env vars to modify fd_dev_info +- freedreno: Add a644 support +- freedreno/devices: Update a690 magic regs from WSL blob +- turnip: Disable UBWC for D/S images on 
A690 +- freedreno: Disable UBWC for D/S images on A690 + +Dave Airlie (38): + +- vulkan: update video headers +- vulkan/video: add support for h264 encode to common code +- vulkan/video: add h265 encode support +- vulkan/video: add h264 nal enum +- vulkan/video: add a nal_unit lookup for hevc +- util: add a bitstream encoder for video stream headers. +- vulkan/video: add h264 level idc convertor utility +- vulkan/video: add a h265 level translator. +- vulkan/video: add h264 headers encode +- vulkan/video: add h265 header encoders. +- nak: fix backtrace crash running computeheadless +- nak: make ipa encoding match the order in codegen gv100 +- nak: do perspective divide for interp none as well +- nvk/xfb: set correct counter buffer for writing stream out counters. +- nvk/nil: allow storage on VK_FORMAT_A2B10G10R10_UINT_PACK32 +- nvk: fix transform feedback with multiple saved counters. +- nvk/nak/xfb: handle skipping properly when setting xfb_attr. +- nvk: drop unneeded shader type conversion function +- nvk/nak: fix regression with shf changes on sm70 +- intel/compiler: move gen5 final pass to actually be final pass +- vulkan/video: drop encode beta checks and rename EXT->KHR +- gallivm: handle llvm 16 atexit ordering problems. +- intel/compiler: fix release build unused variable. +- intel/compiler: revert part of "Move earlier scheduler code that is not mode-specific" +- llvmpipe: fix caching for texture shaders. +- gallivm/sample: refactor first/last level handling and use level_zero_only. +- gallivm/sample: add some num_samples vs level zero only support +- gallivm/sample: make the load_mip helper useful outside this file. +- gallivm/lp: reduce size of lp_jit_texture. +- gallivm/lp: reduce image descriptor size. 
+- gallivm/lp: merge sample info into normal info +- gallivm/lp: move sampler index around to reduce struct +- lavapipe: bump .maxResourceDescriptorBufferRange +- intel/compiler: reemit boolean resolve for inverted if on gen5 +- radv: don't emit cp dma packets on video rings. +- radv/video: refactor sq start/end code to avoid decode hangs. +- radv: don't submit empty command buffers on encoder ring. +- gallivm: passing fp16_split_fp64 to fp16 lowering. + +Dave Stevenson (2): + +- gallium: Add more TinyDRM drivers to the list of kmsro drivers +- gallium: Add udl (DisplayLink) to the list of kmsro drivers + +David Heidelberg (53): + +- ci/docs: add coreutils +- ci: bump tags +- ci/zink: reduce premerge testing on a618 to ~ 12 minutes +- ci: hide Mesa install phase +- ci: drop clover from release builds and remove rusticl build +- ci: simplify debian-rusticl-testing definition +- ci: drop mingw and wine from the x86_64 build container +- ci: always cleanup pip and cargo leftovers +- ci: bashify scripts, use arrays +- ci: drop debootstrap, unused +- ci/panfrost: run T860 traces as intended (nightly job) +- ci/venus: reduce pre-merge to fit under 15 min +- ci/alpine: do not store apk cache +- ci/wine: move wine configuration into rootfs where is wine available +- Revert "ci/wine: move wine configuration into rootfs where is wine available" +- ci/lava: add wine into the amd64 ephemeral container packages +- ci/zink: restore full premerge testing on Adreno 618 +- ci: fixup section names +- ci/nouveau: define a kernel and dtb, so we can fetch it from external sources +- ci: inject gfx-ci/linux S3 artifacts without rebuilding containers +- ci/zink: disable nheko trace, as it sometimes crashes +- gitlab: make commit more commit-like formatted +- ci: tag sanity, rustfmt and clang-format job as a "placeholder" job +- ci/traces: drop the freedoom-phase2-gl-high.trace +- ci: disable Anholt farm +- ci/freedreno: disable a660 as it's down now +- Revert "ci/freedreno: disable a660 
as it's down now" +- ci: bump kernel to 6.6.4 +- docs: drop unused manual optimizations override +- ci/freedreno: mark unvanquished-lowest trace as flaky and skip +- ci/freedreno: switch Adreno 630 boards back to 6.4 kernel +- ci/freedreno: increase fraction for Vulkan testing +- ci/tu: add another failing pipeline strip draw +- ci/freedreno: extend timeout for full runs +- ci/freedreno: re-enable two Adreno 618 tests +- ci/freedreno: timestamp-get no longer fails on Adreno +- ci/freedreno: downgrade a618_piglit to 6.4 kernel +- ci/freedreno: fail introduced by ARB_post_depth_coverage +- rusticl: add freedreno alias for RUSTICL_ENABLE +- ci/freedreno: more issues showed up on a618, let's use 6.4 +- ci/austriancoder: separate HW definition from SW +- ci/freedreno: downgrade whole Adreno 6xx series, incl. zink-a618 jobs +- ci/broadcom: separate HW definition from SW +- ci: skip EGL functional color_clears tests for Wayland +- ci/lava: separate HW definitions from SW +- ci/google: re-enable farm +- ci/zink: update piano trace +- ci/radeonsi: disable VA-API testing on raven +- ci: enable ci-deb-repo for libdrm 2.4.119 (and others in the future) +- ci/alpine: update to latest to get libdrm 2.4.119 +- ci: bump Fedora and Android libdrm2 to 2.4.119 +- ci/rootfs: add libdrm also inside the rootfs +- ci/deqp: uprev deqp-runner for Linux too to 0.18.0 + +David Rosca (19): + +- frontends/va: Map decoder and postproc surfaces for reading +- radeonsi/vce: Implement destroy_fence vfunc +- radeonsi/uvd: Implement destroy_fence vfunc +- radeonsi/uvd_enc: Implement destroy_fence vfunc +- radeonsi/uvd_enc: Fix leaking session info buffer +- Revert "radeon/radeon_vce: fix out of target bitrate in CBR mode (H.264)" +- radeonsi/vce: Tweak motion estimation params for better quality +- radeonsi/vce: Add VUI parameters in output bitstream +- radeonsi/uvd_enc: Add VUI parameters in output bitstream +- radeonsi: Fix offset for linear surfaces on GFX < 9 +- gallium/auxiliary/vl: Fix 
coordinates clamp in compute shaders +- gallium/auxiliary: Fix coordinates clamp in util_compute_blit +- gallium/auxiliary/vl: Scale dst_rect x0/y0 when rendering chroma plane +- gallium/auxiliary/vl: Support interleaved input in deinterlace filter +- Revert "frontends/va: Alloc interlaced surface for interlaced pics" +- gallium/auxiliary: NIR blit_compute_shader +- gallium/auxiliary/vl: NIR compute shaders +- util/rbsp: Fill bits twice if reading more than 16 bits +- radeonsi/vcn: Fix H264 slice header when encoding I frames + +Dennis Bonke (1): + +- mesa: add managarm support + +Dmitry Baryshkov (9): + +- freedreno/regs/mdp_common: change BPC1 -> BPC4 +- freedreno/regs/mdp_common: fix BPC comments +- freedreno/regs: add mdp_fetch_mode enum +- freedreno/drm: fallback to default BO allocation if heap alloc fails +- ir3: fix shift amount for 8-bit shifts +- ir3/a6xx: fix ldg/stg of ulong2 and ulong4 data +- freedreno/drm: notify valgrind about FD_BO_NOMAP maps +- freedreno/drm: don't crash in heap allocator when run under valgrind +- freedreno/drm: don't crash for unsupported devices + +Dudemanguy (1): + +- vulkan/wsi/wayland: fix wl_event_queue memory leak + +Dylan Baker (3): + +- docs: add release notes for 23.2.1 +- docs: Add sha256 sum for 23.2.1 +- meson: add wrap for libdrm + +Echo J (2): + +- nvk: Set HOST_CACHED_BIT for the GTT type +- vulkan: Remove nonexistent output in vk_synchronization_helpers target + +Eric Engestrom (236): + +- VERSION: bump to 24.0 +- docs: reset new_features.txt +- docs: update calendar for 23.3.0-rc1 +- ci/rpi4: group all spec\@ext_image_dma_buf_import\@ext_image_dma_buf_import-sample_* together +- ci/rpi4: add spec\@ext_image_dma_buf_import\@ext_image_dma_buf_import-sample_yvyu to the list of known failures +- ci/zink+radv: add another flake on polaris +- ci: drop confusing fake \`rules`, \`if` and \`when` on the list of rules strings +- docs/ci: allow sanity job to be missing +- ci: don't run sanity in Marge pipelines +- ci: add 
\`.never-post-merge-rules` to avoid re-running pre-merge jobs after merging +- broadcom: use \`.never-post-merge-rules` for all rpi tests +- ci/radeonsi: add another flake +- rpi4/ci: add more known dEQP-EGL.functional.*.*_context.gles*.other failures +- rpi4/ci: move \`spec\@!opengl 1.1\@depthstencil-default_fb-drawpixels-24_8 samples=2` from fails for flakes after an UnexpectedPass +- rpi4/ci: remove \`spec\@!opengl 1.1\@depthstencil-default_fb-drawpixels-32f_24_8_rev samples=2` from fails as it's a flaky test and already marked as such +- Revert "ci: backport two mesh/task query fixes for VKCTS" +- ci/build-deqp: stop ignoring failures while fetching patches +- ci/build-deqp: split deqp version into a variable +- ci/build-deqp: move mkdir earlier +- ci/build-deqp: print more detailed information about what deqp version is running +- ci: bump image tags to rebuild deqp +- ci/rules: add missing clang-format files to what needs containers to build +- broadcom/ci: merge gl test lists to use a single deqp instance +- broadcom/ci: fix list indentation +- broadcom/ci: split broadcom-common manual rules to .broadcom-common-manual-rules +- vc4/ci: add manual variant of .vc4-rules +- v3dv/ci: add manual variant of .v3dv-rules +- v3d/ci: add "full run" variant of v3d-rpi4-gl:arm64 as a manual job +- v3dv/ci: add "full run" variant of v3dv-rpi4-vk:arm64 as a manual job +- vc4/ci: add piglit "full run" variant of vc4-rpi3-gl:arm32 as a manual job +- rpi4/ci: skip more timing out tests in the dEQP-VK.ssbo.layout.* group +- zink+radv/ci: simplify deqp config +- zink+radv/ci: ensure renderer is "zink on radv" +- ci: restore sanity (aka. 
Revert "ci: don't run sanity in Marge pipelines") +- gitlab_gql: strip newline at the end of the token file +- ci_run_n_monitor: compile target_jobs_regex only once +- ci/gitlab_gql: stop re-compiling regex now that all users pre-compile it +- v3d/ci: run manual jobs in daily pipeline +- radeonsi/ci: document new failures and flakes +- ci: disable lima farm as it appears to be down +- radv/ci: add navi21 flakes +- radv/ci: add vega10 flakes +- radv/ci: add polaris10 flakes +- radv+zink/ci: add polaris10 flakes +- radv+zink/ci: add navi10 flakes +- bin/gitlab_gql: resolve sha locally to be able to use things like \`HEAD` +- gitlab_gql: make \`--rev` optional, defaulting to \`HEAD` +- bin/gitlab_gql: fix command in example +- bin/gitlab_gql: only get the pipeline when a pipeline is needed +- v3d/ci: add new failures +- bin/gitlab_gql: only allow a single \`--print-\*` argument per invocation +- bin/gitlab_gql: rename get_job_final_definition() to print\_...() since that's what it actually does +- bin/gitlab_gql: deduplicate fetch_merged_yaml() logic between print branches +- bin/gitlab_gql: give a better name to the --print-job-manifest argument value than PRINT_JOB_MANIFEST +- ci/valve-infra: ensure the correct farm picks up the job +- docs: update calendar for 23.3.0-rc{2,3,4} and add another release candidate +- util/xmlconfig: drop default SYSCONFDIR & DATADIR values +- lima: drop unused lima_get_absolute_timeout() +- intel/ci: fix gl/vk dependencies in hsw jobs +- intel/dev: use libdrm.h wrapper to support builds without libdrm +- ci_run_n_monitor: require user to add an explicit \`.*` at the end if jobs like \`*-full` are wanted +- amd/ci: avoid re-running all the test jobs when changing the expectations for only one of them +- egl/dri2: increase NUM_ATTRIBS to fit all the attributes +- asahi: use util_resource_num() instead of open-coding it +- ci/piglit: specify only the traces file in the job config +- amd/ci: track changes to the traces config file as well 
+- ci: fix kdl commit fetch +- ci: uprev deqp-runner from 0.16.1 to 0.18.0 +- ci/deqp-runner: turn paths in errors into links +- docs: update calendar for 23.0.0-rc5 +- docs: add another -rc +- ci: use released version of meson +- lp: make sure 0xff is unsigned before shifting it past signed int range +- intel/perf: fix regex escaping +- intel/ci: fix .hasvk-manual-rules +- docs: update calendar for 23.3.0 +- docs/calendar: add 23.3.x releases +- bin/python-venv: detect python version change +- ci: disable opengl & gles in debian-vulkan build +- radv/ci: add navi21-aco flake +- bin/gen_release_notes: fix regex raw string +- bin/python-venv: fix venv folder check +- bin/gen_release_notes: include removed 'new_features.txt' in commit +- docs: add release notes for 23.3.0 +- docs: add sha256sum for 23.3.0 +- docs: fix release date for 23.3.0 +- turnip: fix typo in comment +- ci_run_n_monitor: allow picking a pipeline by its MR +- amd/ci: radeonsi is gl, not vk +- v3dv: update symbols that have become aliases for newer ones +- v3dv: drop duplicate flag +- radv: update symbols that have become aliases for newer ones +- pvr: update symbols that have become aliases for newer ones +- anv: update symbols that have become aliases for newer ones +- hasvk: update symbols that have become aliases for newer ones +- amd/ci: fix yaml indentation +- amd/ci: split common amd files list from radeonsi files list +- amd/ci: limit radv jobs to radv + aco files changes +- nvk: update symbols that have become aliases for newer ones +- vk/runtime: update symbols that have become aliases for newer ones +- vk/wsi: update symbols that have become aliases for newer ones +- vk/util: update symbols that have become aliases for newer ones +- vk/overlay-layer: update symbols that have become aliases for newer ones +- venus: update symbols that have become aliases for newer ones +- venus: fix typo in comment +- amd/ci: reuse .radeonsi-rules in .radeonsi-vaapi-rules +- nvk: use \`||` instead of \`|` 
between bools +- radeonsi/ci: update vangogh piglit expectations +- freedreno/ci: add flake seen on a630 +- freedreno/ci: add more flakes seen on a630 +- freedreno/ci: add more a630 flakes +- v3d: drop leftover from "move v3d_tiling to common" +- radeonsi/ci: track changes to \`vpelib` +- turnip: update symbols that have become aliases for newer ones +- util/blob: fix trivial typo +- ci: explain what we mean by the various types of pipelines +- ci: turn comment into code in \`sanity` job rules +- ci: identify merge request pipelines using \`$CI_PIPELINE_SOURCE == merge_request_event` instead of \`$CI_COMMIT_BRANCH` being missing +- ci: rename is-pre-merge-for-marge to is-merge-attempt to be clearer +- ci: drop containers, builds, and tests from post-merge pipeline +- ci: add pipeline for direct pushes to main +- ci: give an explicit priority to the scheduled nightly pipelines +- ci: clean up pre-merge and fork pipelines rules +- ci: make sure pre-merge pipelines have the same jobs as merge pipelines +- ci: improve comments +- ci: take microsoft farm offline +- ci: fix rules for formatting checks +- zink/ci: fix yaml indentation +- zink/ci: use variable to avoid repeating the list +- zink/ci: expand first (and only) level of folders in the list of files +- zink/ci: run only the relevant jobs when changing the ci expectations +- panfrost/ci: fix yaml indendation +- panfrost/ci: run only the relevant jobs when changing the ci expectations +- freedreno/ci: fix yaml indentation +- freedreno/ci: run only the relevant jobs when changing the ci expectations +- intel/ci: fix yaml indentation +- intel/ci: deduplicate common intel files rules +- intel/ci: expand first level of common intel files +- intel/ci: anv changes should only trigger anv jobs +- intel/ci: hasvk changes should only trigger hasvk jobs +- intel/ci: run only the relevant jobs when changing the ci expectations +- docs/calendar: add 24.0 branchpoint and release schedule +- etnaviv/ci: fix yaml indentation +- 
etnaviv/ci: expand first level of files in src/etnaviv/ +- etnaviv/ci: run only the relevant jobs when changing the ci expectations +- broadcom/ci: avoid running the rpi4 jobs when changing the rpi3 expectations, and vice-versa +- vk/update-aliases.py: drop dead --check-only +- vk/update-aliases.py: allow specifying the files we want to update +- vk/update-aliases.py: handle "no match" grep call +- vk/update-aliases.py: sort files when informing the user of the matches +- vk/update-aliases.py: simplify addition of other concatenated prefixes +- vk/update-aliases.py: handle more concatenated prefixes +- vk/update-aliases.py: enforce correct list order +- vk/update-aliases.py: only apply renames for the vulkan api (not vulkansc) +- v3dv/ci: only trigger on relevant changes +- a630/ci: add another flake +- freedreno/ci: move hang-y a630 jobs from pre-merge to nightly +- spirv: add missing build dependency +- ci/b2c: drop passthrough of unset CI_JOB_JWT +- ci/b2c: stop ignoring errors in before_script +- ci/b2c: fix indentation of comment and after_script: list +- ci/b2c: drop unused B2C_EXTRA_VOLUME_ARGS +- ci/b2c: tags are mandatory +- ci/b2c: drop support for harbor.freedesktop.org +- ci/b2c: drop unused --volume and --mount-volume +- ci/b2c: always define job_volume_exclusions +- ci/b2c: always define cmdline_extras +- ci/b2c: use with:write instead of manually doing open;write;close +- ci/b2c: export B2C_TEST_SCRIPT +- ci/b2c: use envvars directly instead of converting them back and forth into cli args +- ci/b2c: import all variables starting with \`B2C_` +- ci/b2c: rename B2C_TEST_SCRIPT to B2C_CONTAINER_CMD to match the automatic import +- ci/b2c: identify dut by its id instead of its tags +- docs: add release notes for 23.3.1 +- docs: add sha256sum for 23.3.1 +- docs: update calendar for 23.3.1 +- ci: deduplicate constructing the ARTIFACTS_BASE_URL +- bin/gitlab_gql: fix --print-merged-yaml when --rev != HEAD +- bin/gitlab_gql: print merged yaml as yaml instead 
of a python dict +- v3d/ci: add flake +- ci: fix indentation +- ci: run every test when changing the build +- docs: drop \`:` in title +- radv/ci: add flake +- docs: document how to build the docs +- vulkan/wsi: fix build when platform headers are installed in non-standard locations +- ci/build: drop redundant meson/build.sh from jobs that already inherit from .meson-build +- radv/ci: add flake on raven +- ci: add nvk to the clang build +- ci: disable collabora farm as it is currently offline +- ci: fix farm restore pipelines +- meson: always define {,DRAW_}LLVM_AVAILABLE one way or the other +- docs: add release notes for 23.3.2 +- docs: add sha256sum for 23.3.2 +- docs: update calendar for 23.3.2 +- meson: update expat wrap +- meson: update libarchive wrap +- meson: update libxml2 wrap +- meson: update zlib wrap +- meson: use \`allow_fallback` instead of manually listing the deps and what they provide +- ci/containers: use build-libdrm.sh in debian/android +- Revert "meson: add wrap for libdrm" +- zink: update symbols that have become aliases for newer ones +- zink/requirements: update feature and property names that have been promoted +- docs/backport-mr: fix invalid nested formatting +- docs: fix list whitespace +- docs: mention that python package \`packaging` is required on python 3.12+ +- lvp: update symbols that have become aliases for newer ones +- egl: only accept APIs that are compiled in +- ci: split & reuse debian version identifier +- ci: convert several \`find | xargs` to \`find -exec` +- ci/deqp: set default platform to \`default` instead of glx, to also support wayland +- docs: add release notes for 23.3.3 +- docs: add sha256sum for 23.3.3 +- docs: update calendar for 23.3.3 +- docs: close the 23.2 cycle +- VERSION: bump for 24.0.0-rc1 +- .pick_status.json: Update to 4fe5f06d400a7310ffc280761c27b036aec86646 +- .pick_status.json: Mark 0557f0d59c5b22a8a934900ddc91f7a6057e146f as denominated +- ci: make sure we evaluate the python-test rules first +- 
.pick_status.json: Update to ff84aef116f9d0d13440fd13edf2ac0b69a8c132 +- .pick_status.json: Update to 10e2dbb63b9d1f8f35c4fc3f570cd19b3fc03b43 +- ci: fix job dependency error in MRs for bin/ci/* scripts +- VERSION: bump for 24.0.0-rc2 +- ci/deqp: ensure that in \`default` builds, wayland + x11 + xcb are all built +- .pick_status.json: Update to d2b08f9437f692f6ff4be2512967973f18796cb2 +- .pick_status.json: Update to d0a3bac163ca803eda03feb3afea80e516568caf +- .pick_status.json: Update to 90939e93f6657e1334a9c5edd05e80344b17ff66 +- .pick_status.json: Update to eca4f0f632b1e3e6e24bd12ee5f00522eb7d0fdb +- VERSION: bump for 24.0.0-rc3 +- .pick_status.json: Update to b75ee1a0670a3207dfd99917e4f47d064a44197f +- .pick_status.json: Update to 4cd5b2b5426e8d670fc3657eee040a79e3f9df1e +- util: rename __check_suid() to __normal_user() +- tree-wide: use __normal_user() everywhere instead of writing the check manually +- util: simplify logic in __normal_user() +- util: check for setgid() as well in __normal_user() + +Eric R. 
Smith (1): + +- panfrost: fix panfrost drm-shim + +Erico Nunes (6): + +- v3dv: Rework to remove drm authentication for wsi +- lima/ci: update piglit ci expectations +- Revert "ci: disable lima farm as it appears to be down" +- panvk: Support modifiers for Wayland WSI +- ci: lima farm is down +- Revert "ci: lima farm is down" + +Erik Faye-Lund (34): + +- docs: prepare for hawkmoth +- docs: remove breathe/doxygen stuff +- docs: improve readability of c-signatures +- util: remove unused lut +- panfrost: allow packing formats outside of pan_format.c +- panfrost: bypass format-table for null-textures +- panfrost: pass blendable formats to pan_pack_color +- panfrost: store blendable_formats in panfrost_device +- panfrost: look at correct blendable format version +- panfrost: use perf_debug instead of open-coding +- mesa/ffvs: use unreachable instead of assert +- docs: apply permanent redirect +- panfrost: do not open-code panfrost_has_fragment_job() +- ci: opt-out panfrost from clang-format +- panfrost: minify dimensions when converting modifiers +- util/format: document NONE swizzle +- lavapipe: do not use NONE-swizzle +- panfrost: do not handle NONE-swizzle +- d3d12: do not handle PIPE_SWIZZLE_NONE from sampler-view +- zink: do not handle PIPE_SWIZZLE_NONE +- meson: work around meson 0.62 issue +- mesa/main: remove unused Log2 variants of width/height/depth +- mesa/main: remove unused ClassID +- mesa/main: use _mesa_is_zero_size_texture-helper +- mesa/main: remove unused function +- mesa/st: use _mesa_is_zero_size_texture-helper +- zink: update profile schema +- zink: use KHR version of maint5 features +- panfrost: document ci failure +- mesa/st: do not require render-target support for texture-only exts +- mesa/st: do not check for emulated format +- mesa: actually check for EXT_color_buffer_float support +- mesa/main: require EXT_color_buffer_float for ES 3.2 +- mesa: check for float-format support + +Etaash Mathamsetty (1): + +- driconf: add a workaround for Rainbow 
Six Siege + +Faith Ekstrand (663): + +- nir: Add a lower_first_invocation_to_ballot option to lower_subgroups +- nir: Add a lower_read_first_invocation option to lower_subgroups +- nir/lower_bit_size: Fix subgroup lowering for floats +- nir/lower_bit_size: Handle vote_feq/ieq separately +- nir/lower_bit_size: Use u_intN_min/max() +- nir: Split nir_lower_subgroup_options::lower_vote_eq into two bits +- nir: Return b2b ops from nir_type_conversion_op() +- nir/lower_bit_size: Use b2b for boolean subgroup ops +- nir: add deref follower builder for casts. +- nir: Handle wildcards with casts in copy_prop_vars +- nir: Use nir_builder to insert movs +- nir: Add asserts to nir_phi_builder_value_set_block_def +- vc4: Stop assuming glsl_get_length() returns 0 for vectors +- v3d: Stop assuming glsl_get_length() returns 0 for vectors +- nir/lower_io_to_vector: Only call glsl_get_length() on arrays +- nir/types: Support vectors in glsl_get_length() +- nir: Handle array-deref-of-vec in vars_to_ssa +- nir: Handle array-deref-of-vec in var split passes +- nir/validate: Allow array derefs on vectors on function/shader_temp +- nvk: Force all mappable BOs into GART pre-Maxwell +- nvk: Fix nvk_heap_free() for contiguous heaps +- nvk: Drop a bogus assert +- nvk: Assert no storage images on Kepler +- nir: Optimize boolean ieq/ine with an immediate +- nouveau: Add initial headers and meson for the new compoiler +- nak: Copy the optimization loop from Intel +- nak: Add a bunch of shader lowering code in NIR +- nak: Add initial stubs for rust code +- nvk: Run shaders through NAK +- nak: Add the core IR +- nak: Add Rust bindings for NIR +- nak: Add initial translation from NIR +- nak: Add a copy-prop pass +- nak: Add a dead-code pass +- nak: Add a util library +- nak: Add a trivial register allocator +- nak: Add a lowering pass for VEC and SPLIT instructions +- nak: Add a lowering pass for ZERO sources and destinations +- nak: Add bitset infrastructure +- nak: Add encoding for a few 
instructions +- nak: Encode program headers +- nak: Header stuff +- nak: Lower system values to a new load_sysval_nak intrinsic +- nak: Implement load_sysval_nv as S2R +- nak: Implement load_ubo +- nak: Implement load/store_global +- nak: Zero out the .w component of descriptors +- nak: Add an instruction fuzzing tool +- nak: Implement iadd and ishl +- nak: Add a pass for computing instruction dependencies +- nak: Implement 32-bit logic ops +- nak: Add support for instruction predicates +- nak: Implement integer comparisons +- nak: Implement bcsel +- nak: Rework ALU instruction encode +- nak/meson: Use bindgen dependencies +- nak: Add nak_compiler_create/destroy +- nvk: Pass an actual nak_compiler to nak_compile_shader() +- nak: Plumb the SM through to nak::Shader +- nak: Encode load/store correctly on SM80 +- nak: Rework instruction encoding +- nak: Implement boolean logic ops +- nak: Lower 8 and 16-bit types +- HACK: Support old meson +- nak: Use Instr::num_srcs/dsts() less +- nak: Get rid of meta instructions +- meson: Pull in syn from crates.io +- nak: Add SrcAsSlice and DstAsSlice traits +- nak: Add a SrcModsAsSlice trait +- nak: Use a different inner struct type for each opcode +- nak: Use Src::Zero for load_const(0) +- nak: Handle zeroes at emit time +- nak: Implement i2f +- nak: Implement fadd +- nak: Rework integer compare ops +- nak: Implement float comparisons +- nak: Implement nir_op_b2f32 +- nak: Implement unary float and integer ops +- nak: Allow iadd3 to take an immediate in srcs[2] +- nak: Implement fsign +- nak: Rework ALUSrc in emit code +- nak: Rework source modifiers +- nak: One of the predicates in IADD3 is a destination +- nak: Implement Display for SSAValue +- nak: Make Dst its own type +- nak: Add modifier propagation +- nak: Implement basic control-flow +- nak: Move nak_compiler to nak_private.h +- nak: Add a nir_shader_compiler_options to nak_compiler +- nvk: Pull the NIR options from NAK +- nak: Implement b2i32 +- nak: Implement iadd64 +- 
nak: Implement phis +- nak: Add a union-find implementation +- nak: Lower global access to scalars as needed +- nak: Print names of missing instructions +- nak: Implement unpack_64_2x32_split_* +- WIP: nak: Rework the barrier assignment pass +- nak: Add an SSAValueAllocator struct +- nak: Pass an SSAValueAllocator through to map methods +- nak: Handle fadd funnyness in the emit code +- WIP: nak: Add a legalization pass +- nak: Rename Imm to Imm32 +- nak: Add separate True and False source types +- nak: Handle phis with non-SSA sources +- nak: Support both destinations in PLOP3 +- nak: Drop the special cases for single-component vec/split +- nak: Don't emit MOVs for overlapping vec and split src/dst +- HACK: nak: Lower iadd64 again +- nak: Add a parallel copy in struction with lowering +- nak: Use OpParCopy for OpVec and OpSplit lowering +- nak: Get rid of the BitSet and BitSetMut traits +- nak: Rename BitSetView to BitView +- nak: Add a BitSet struct +- nak: Add an SSAComp struct +- nak: Rework dead-code +- nak: Rework phis +- nak: Add a space to the end of vec and split arg lists +- nak: Add a liveness analysis pass +- nak: Add a non-trivial register allocator +- nak: Improve the dependency tracker +- nak: Handle token re-use in dep tracking +- nak: Implement nir_op_i(eq|ne) for booleans +- nak: Fold [P]Lop3 sources +- nak: Predicates default to true +- nak: Implement nir_op_[iu](min|max) +- nak: Implement nir_op_fmul +- nak: Implement nir_op_(fmin|fmax) +- nak: Implement nir_op_u2f +- nak: Implement nir_op_vecN +- nak: Implement MuFu and a bunch of float unops +- nak: Move nak_sysval_attr_addr/sysval_idx higher in the file +- nak: Implement input interpolation +- nak: Handle multiple vector destinations in RA +- nak: Use immediage offsets for load/store_global +- nak: Implement OpFSOut with an OpParCopy +- nak: Implement f2[iu]32 +- nak: Wire up ffma +- nak: Add more legalization +- nak: Implement right-shifts +- nak: Implement nir_op_[iu]mul[_high] +- nak: 
Enable nir_lower_idiv +- nak: Add a NIR texture lowering pass +- nak: Use more core NIR texture lowering +- nak: Wire up texture ops +- nak: Simplify the FromVariants proc macro +- nak: Simplify the (Srcs|Dsts)AsSlice proc macro +- HACK: spirv: Add a MESA_SPIRV_DUMP_PATH environment variable +- nak: Add a NAK_DEBUG environment variable +- nvk: Drop printing of NAK shaders +- nvk: Pass NAK flags through to shader cache UUIDs +- nak: Add a debug flag to assign worst-case instruction deps +- nak: Rework vector handling +- nak: Legalize vector sources +- nak: Add a use tracker to RA +- nak: Much more believable try_find_unused_reg_range() +- nak: Implement nir_op[iu]mul_2x32_64 +- Revert "HACK: nak: Lower iadd64 again" +- nak: Implement nir_op_ixor +- nak: Implement undef instructions +- nak: Implement image load/store +- nak: Wire up OpLd and OpSt for local and shared +- nak: Implement nir_intrinsic_load/store_scratch +- nak: Add a smarter new_lop2 helper +- nak: Improve RA failure messages +- nak: Legalize OpShf +- nak: Only put actually live SSA values in the ra.live_in sets +- nak: Legalize more stuff +- nak/nir: Lower image size and samples to txq +- nak: Improve [FI]SETP encoding +- nak: Legalize Op[FI]Setp +- nak: Don't allow r255 in texture or surface ops +- nak: sin() and cos() require we divide by 2pi +- nak: Add F2F and implement fquantize16 +- nak: Implement barriers +- nvk: Plumb num_barriers through from NAK +- nak: Implement load/store_shared +- nak: Integers don't have abs() source modifiers +- nak: Add a mechanism for decorating sources with types +- nak: Decorate sources with types +- nak: Only divide FS inputs by .w for smooth interpolation +- nak: Rework source modifiers a bit +- nak: Add a Src::supports_src_type() helper +- nak: Rework copy-prop to use soruce type decorations +- nak: Implement nir_intrinsic_global_atomic_* +- nak: Implement nir_intrinsic_shared_atomic_* +- nak: Implement global/shared_atomic_comp_swap +- nak: Implement image 
atomics +- nak: Fix the 2nd predicate on LOP3 +- nak: Optimize OpLop3 and OpPLop3 +- nak: DCE things with constant false predicates +- nak: Rework source modifiers instructions a bit +- nak: Fold fsat into FAdd/FFma/FMul +- nak: Delete unused imports and dead code +- nak: Add accum predicates to Op[FI]Setp +- nak: Add a Pred struct move the enum to PredRef +- nak: Fix multisampled textureing +- nak: Legalize everything +- nak: Rework cbufs a bit +- nak: Implement indirect UBO loads +- nak: Implement nir_op_b2b1 and nir_op_b2b32 +- nak: Follow memcpy semantics with OpParCopy +- nak: Work in terms of bits for type sizes +- nak: Add a builder +- nak: Use the builder in some lowering passes +- nak: Compute liveness in reverse block order +- nak: Rework liveness to add next-use information +- nak: Add a PerRegFile helper struct +- nak: Record register pressure in liveness +- nak: Initialize RA with only live registers +- nak: Use num_regs instead of max_reg in RA +- nak: Use pcopy.push() in RA +- nak: Rework RA a bit +- nak: Add some documentation for SSA values +- nak: Print to stderr +- nak/ra: Pass a PerRegFile num_regs into the allocator +- nak: Allocate the minimum number of GPRs. 
+- nak: Separate the CFG from liveness +- nak: Break guts of liveness into traits +- nak: Require Rust 1.70.0 +- nak: Handle dead destinations in RA +- nak: Make calc_max_live a function of the Liveness trait +- nak: Bring back bitset-based liveness +- nak: Add mum_gprs and tls_size to Shader +- nak: Accurately set num_gprs +- nak: Add a RegFileSet struct +- nak: Add more SSA iterator options +- nak: Add a new VecPair type +- nak/nir: Add more helpers +- nak: Emit if branches in the predecessor block +- nak: Add a more awesome CFG data structure +- nak: Store the blocks in the CFG +- nak: Base liveness on CFG indices +- nak: Add loop detection to the CFG +- nak: Add a phi allocator +- nak: Refactor nak_assign_regs a bit +- nak: Use u32 for register indices +- nak: Rework map_instrs() +- nak: Add a new OpCopy instruction for parallel copy lowering +- nak: Use the builder for the legalize pass +- nak: Use OpCopy in legalize +- nak: Use more OpCopy +- nak: Add a Mem register file +- nak: Handle RegFile::Mem in parallel copy lowering +- nak: Allow DCE on functions +- nak: Restructure liveness construction +- nak: Add interference helpers +- nak: Add a dominance check to CFG +- nak: Add helpers to BasicBlock to get phis +- nak: Add a to-CSSA pass +- nak: Add an SSA repair pass +- nak: Union find +- nak/ra: Drop the pointless AssignRegs struct +- nak/ra: Handle parallel copies as a special case +- nak/ra: Don't free killed for OpPhiSrcs +- nak: Expose LiveSet for incremental liveness tracking +- nak: Add a RegFileSet filter to NextUseLiveness::for_function() +- nak: Add more NextUseLiveness helpers +- nak: Add a spilling pass +- nak: Use the correct number of GPRs on Turing+ +- nak: Spill registers before RA +- nak: Add a debug flag to test spilling +- nak: Implement shader clock +- nak/ra: Improve coalescing +- nak/spill: Tweak the construction of S sets +- nak: Document spilling and RA +- nak: Add an alloc_vec() to SSAValueAllocator +- nak: Move all the IADD3 insanity 
to a new OpIAdd3X opcode +- nak/legalize: Fix too many IADD3 source modifiers +- nak: Disable lower_image_size_to_txs for NAK +- nak: IMAD also has a destination predicate +- nak: Remap GLSL_SAMPLER_DIM_SUBPASS and SUBPASS_MS to 2D and MS +- nak: Fix instruction ordering in nak_ir.rs +- nak: Rename OpBFind to OpFlo +- nak: Implement Index[Mut] for RegTracker +- nak: Use the right number of predicates in RegTracker +- nak: Rework the barrier insert pass +- nak: Rework calc_delay.rs +- nak: Re-work Instr::get_latency() +- nak: Emit FS_OUT before EXIT +- nvk: Use sysvals for fragcoord etc. with NAK +- nak: Handle flat FS inputs +- nak: Add support for centroid and sample interp modes +- nak: Use load_interpolated_input for frag_coord +- nak: Properly handle OpFSOut in RA and liveness +- nak: Handle empty OpFSOut +- nak/nir: Several FS output fixes +- nak: Implement load_sample_id and load_sample_mask_in +- nak: Implement discard and demote +- nak: Set TLS size properly in the shader header +- nvk,nak: Plumb through the zs_self_dep key bit +- nak: Use count_attribute_slots for FS input var sizes +- nak: Pull sm, num_gprs, and tls_size into a ShaderInfo struct +- nak: Stash a ShaderInfo in ShaderFromNir +- nak: Rework FS outputs again +- nak: Re-plumb compute shader info +- nak: Plumb more FS info through to the C API +- nvk/nak: Translate our new FS flags from NAK to nvk_shader +- nak: Saturate depth writes +- nak: Add support for gl_FrontFace +- nak/nir: Fix helper invocations +- nak/nir: Use nir_shader_intrinsics_pass for FS inputs +- nak: Handle interpolate_at_offset +- nak: Take components into account in load_*input +- nak: Plumb uses_kill through from nak_from_nir +- nak/nir: Plumb the FS key into lower_fs_input_intrin +- nak/nir: Move frag_coord/sample_pos lowering to FS input lowering +- nak/nir: Fix sample vs. 
pixel input interpolation +- nak/nir: Add a load_frag_w helper +- nak/nir: Interpolate gl_PointCoord +- nak/nir: Return one sample for gl_SampleMaskIn[0] when sample shading +- nak: Fold source modifiers in legalize +- nak: Provide more detail when printing IR after passes +- nak: Handle modifiers in dedup_srcs() in opt_lop() +- nvk: Add a helper for lowering system values to root table loads +- nvk: Lower more draw system values +- nak: Take component into account in store_output +- nak: Fix printing of OpASt +- nak: Move NIR enum translation out of nak_sph.rs +- nak: rustfmt fixes +- nak: Simplify I/O gathering +- nvk: Set clip/cull_enable for NAK shaders +- nak: Run simple liveness data-flow bottom-up +- nak/bitset: Add a helper for modifying in-place +- nak: Don't allocate bitsets in liveness data-flow +- nak: Handle non-constant I/O offsets +- nouveau/parser: Dump SET_STREAM_OUT_CONTROL_* properly +- nak: Translate XFB info +- nvk: Plumb through XFB info from NAK +- nak: Add a Label struct for branch targets +- nak: Add OpNop which can have a label +- nak: Break indirect offset encoding into a helper +- nak: Allow encoding Dst::None +- nak: Add barrier instructions +- nak/builder: Return the instruction from push_*() +- nak: Implement NIR control barriers +- nak: Implement From for SrcRef for more types +- nak: Add enums for sysvals and attributes +- nak: Plumb clip/cull enables through nak +- nak/nir: Lower tessellation and geometry I/O +- spirv: Fix locations for per-patch varyings +- nak: NVIDIA calls them tessellation init shaders +- nak: Rework OpALd and OpASt a bit +- nak: Set per patch attribute count both places in the SPH +- nak: Handle location_frac for FS outputs in nak_from_nir.rs +- nak: Add lowering for per-vertex I/O +- nak: Implement more attribute I/O +- nak/nir: Lower load_primitive_id +- nak,nvk: Plumb through tessellation info +- nak: Implement load_tess_coord +- nak: Fix lowering for patch_vertices_in +- HACK: Only emit OpBar in compute 
shaders +- nak/nir: Use count_vec4_slots instead of count_attribute_slots +- nak: Add NIR lowering for attribute I/O +- nak/nir: Lower systm values before lowering I/O +- nak: Use nak_nir_lower_vtg_io +- nak: Fix a bunch of warnings +- nak: Fix opt_out +- nak/bitset: Improve set_words() +- nak/bitset: Add an is_empty() helepr +- nak/bitset: Fix next_set() +- nak/sph: Round tls_size up to a multiple of 16 +- nak: Fix repair_ssa() for back-edges +- nak: Fix parallel copy handling in spilling +- nak: Fix to_cssa() +- nak/nir: Don't lower 1-bit phis +- nak: Support encoding -Zero +- nak: Fix fneg to do fadd(-0, x) +- nak: Rename lower_vec_split() to lower_ineg() +- nak: Use Src::From and Src::From +- nak: A quick rustfmt fix +- nak: Upgrade to more modern meson +- nak: Add some #[allow(dead_code)] +- nak: Drop some unused helpers +- nak: Get rid of dead code warnings in RegFileSet +- nak: Get rid of warnings in nak_sph.rs +- nak: Drop the final calc_max_live() after GPR spilling +- nak: Don't print a range for one register +- nir: Add nvidia barrier intrinsics +- nak/nir: Add a pass for adding convergence barriers +- nak: Add OpBreak +- nak: Handle control-flow barriers +- nak: Use barriers for re-convergence +- nak: Remove unnecessary control barriers +- nak: Call nir_lower_subgroups() +- nak: Use nir_shader_intrinsics_pass for system values +- nak: Lower subgroup_id and num_subgroups +- nak/nir: Allow boolean vote_ieq +- nak/nir: Zero-pad subgroup masks +- nak: Implement vote and ballot +- nak: Fix the encoding of OpShfl +- nak: Implement read_invocation and shuffle_* +- nak: Allow 1-component image load/store +- nak: Emit CCtl in barriers with acq/rel semantics +- nak: Use strong ordering for Image load/store +- nak: Use the simplified BAR.SYNC encoding +- nak: Emit MemBar before Bar +- nak: Insert an OpNop after OpBar +- nak: Document a bit in encode_lds() +- nvk: Enable subgroups features +- nak: Rely on Rust 1.73 for next_multiple_of() and div_ceil() +- nak: 
Require meson 1.3.0 and clean up a couple bits +- meson: Set build.rust_std +- ci: Bump container images for NAK dependencies +- ci: Add syn to --force-fallback-for +- ci: Update the python env for ci_run_n_monitor.py +- nvk: Default to NAK on Turing+ +- nvk: Stop asserting 11-bit storage image handles +- nvk: Free NAK shaders +- nak: Fix copy-prop for OpPLop3 sources +- nak: Drop OpAtomCas in favor of OpAtom with atom_op == CmpExch +- nak: Make ALD/AST.PHYS a boolean +- nak: Make encode_sm75 a method of Shader +- nak: Plumb the nak_compiler through to lower_fs_input_intrin +- nak: Rework FS input interpolation +- nvk: Only advertise VK_KHR_shader_terminate_invocation if using NAK +- nvk: Handle load_first_vertex in nvk_nir_lower_descriptors() +- nak/nir: Lower indirect FS inputs +- nvk: Only lower outputs to temporaries +- nvk: Add a codegen helper for nir_shader_compiler_options +- nvk: Move a bunch of codegen-specific lowering to helpers +- nvk: Move the optimization loop to the nvk_codegen.c +- nvk: Move the guts of nvk_compile_nir() to nvk_codegen.c +- nvk: Move even more lowering into nvk_codegen.c +- nvk: Use nak_fs_key instead of rolling our own +- nak: Rename TLS to SLM +- nak: Properly prefix nak_xfb_info +- nak: Move clip, cull, and XFB into a nak_shader_info.vtg +- nak: Add a writes_layer bit to nak_shader_info::vtg +- nak: Handle the num_gpr offsetting inside nak +- nvk: Use nak_shader_info natively +- nak: Enable SM70 for Volta +- nak: Stop passing undefs to ipa_nv +- nak: Support dumping shader assembly as part of compile +- nvk: Don't set pipeline->base.type manually +- nvk: Implement VK_KHR_pipeline_executable_properties +- nvk: Drop nouveau_ws_bo_new_tiled() +- nvk: Rework error handling in nouveau_ws_bo_new() and from_dma_buf() +- nvk: Handle VMA allocation failure +- nvk: Add a separate VMA heap for BDA capture/replay +- nvk: Implement bufferDeviceAddressCaptureReplay +- nvk: Advertise VK_KHR_synchronization2 +- nvk: Set the right API version in 
the ICD json files +- nak: Add the predicate destination to OpShfl +- nak: Add builder helpers for a few ops +- nak: Use c == 0x0 for shuffle_up +- nak: Lower scan/reduce in NIR +- nak: Implement quad ops +- nvk: Advertise the rest of the subgroup ops +- nak: Rework reg and SSA value printing +- nak: Make most Display stuff lower-case +- nak: Rework opcode printing to use a new trait +- nak: Implement DisplayOp on Op instead of Display +- nak: Default InstrDeps::delay to 0 +- nak: Only write deps.delay when set +- nak: Align instructions when printing +- nak: Display memory access bits with the "." prefix +- nak: Make MemAddrType a part of MemSpace +- nak: Display memory type at the end for load/store ops +- nak: Rework printing of texture and image dims +- nak: Two more print fixes +- nak: gl_FragCoord and gl_PointCoord are screen-space interpolated +- nvk/codegen: Fragment shader builtins are noperspective +- nvk: Wire up MESA_VK_VERSION_OVERRIDE +- nvk: Limit shader stages to supported stages +- nak: Run rustfmt +- nak: Only insert barriers around ifs if they actually re-converge +- vulkan: Default override patch version to VK_HEADER_VERSION +- nvk: Advertise Vulkan 1.1 on Turing+ +- nak: Drop the PrmtSelection stuff +- nak: Add a builder helper for OpPrmt +- nak: Rework OpPrmt a bit +- nak: Implement nir_op_extract_* +- nak: Fix int8/16 lowering +- nak: Add base support for 8 and 16-bit types +- nak: Implement more int/float conversions +- nak: Implement integer conversions +- nak: Handle non-DW-aligned UBO loads +- nvk: Enable 8 and 16-bit integer types +- nak: Implement scan/reduce on booleans +- nak/nir: Handle CBuf alignment rules +- nak: Revert "nak: Handle non-DW-aligned UBO loads" +- nvk: Use the copy engine for CmdFillBuffer +- nvk: Use the copy engine for NVK_DEBUG=zero_memory +- nvk: Stop initializing the 2D engine +- vulkan: Move vk_synchronization2 to vk_synchronization +- vulkan: Add some auto-generated synchronization helpers +- vulkan: Add 
helpers for pipeline stage flags +- vulkan: Add helpers for access flags +- nvk: Move Begin/EndTransformFeedback to nvk_cmd_draw.c +- nvk: Rework transform feedback stalling +- nvk: Implement vkCmdPipelineBarrier2 for real +- nvk: Drop unnecessary per-draw/dispatch cache maintenance +- nvk: Drop MME_DMA_SYSMEMBAR before indirect draw/dispatch +- nak: Drop a bunch of SET_REFERENCE from the pre-Turing paths +- nvk: Advertise VK_EXT_subgroup_size_control +- nil: Add support for filling out linear texture headers +- nouveau: Rename nvidia-headers to headers +- nouveau: Move headers/classes to headers/nvidia/classes +- nak: Run rustfmt again +- nak: Fix integer roll-over when we have a u64vec4 +- nak: Set .64/.32 on CSSR as needed +- nak/nir: Don't use nir_lower_bit_size on 64-bit values +- nak: Implement 64-bit ineg +- nak: Natively implement 64-bit shifts +- nak: Lower isign in NIR +- nak: Rework printing of comparisons +- nak: Implement 64-bit comparisons +- nak: Don't ask NIR to lower [iu]mul64_2x32 +- nak: Use the right source types for I2F, F2I, and F2F +- nak: Fix encoding of 64-bit F2I, I2F, and F2F +- nak: Implement b2i64 +- nak/nir: Don't lower 64-bit conversions +- nvk: Advertise shaderInt64 +- nvk: Advertise VK_EXT_shader_subgroup_ballot/vote +- nak/nir: Handle non-32-bit data in lower_scan_reduce +- nvk: Advertise KHR_shader_subgroup_extended_types +- nvk: Advertise VK_KHR_shader_atomic_int64 +- nak/nir: Trim image load/stores based on format +- nak: Lower 64-bit image load/store +- nak: Handle 64-bit image atomics +- nil: Add R64_SINT and R64_UINT formats +- nvk: Don't disable non-texturable formats +- nvk: Implement VK_EXT_shader_image_atomic_int64 +- nak: Simplify Src::is_predicate() +- nak: Replace OpBMov with OpBClear +- nak: Fix scheduling for control barriers +- nak: Add a barrier register file +- nak: Add back OpBMov with better semantics +- nak: Add support for spilling barriers +- nak: Take num_barriers from RA +- nak: Make barriers SSA-friendly 
+- nak: Force RA to allocate bar_in/out to the same register +- nak: Add a barrier propagation pass +- dxil: Use mesa_prim consistently +- glsl: Properly remap GL_* to MESA_PRIM +- intel/vec4: Use MESA_PRIM_* instead of GL_* +- nir: Return a mesa_prim from gs_in_prim_for_topology +- compiler: Fix a comment +- radeonsi: Drop an unnecessary cast +- nvk: Advertise VK_EXT_scalar_block_layout +- nak: Advertise subgroupBroadcastDynamicId +- nak: Add a B32 source type +- nak: Rework the OpIAdd3/OpIAdd3X split +- nak/legalize: Handle the src0/1 source mod condition for OpIAdd3X +- nak: Legalize immediates with source modifiers +- nak: Implement uadd_sat +- nak: Implement usub_sat +- nvk: Implement VK_EXT_texel_buffer_alignment +- spirv: Plumb variable alignments through to NIR +- nir: Respect variable alignments in lower_vars_to_explicit_types +- nak: rustfmt +- nak: Restructure for better module separation +- ci: Also rustfmt binaries +- nir: Split has_[su]dot_4x8 bits into regular and _sat versions +- nir: Lower [su]dot_4x8_[ui]add_sat to [su]dot_4x8_[ui]add +- microsoft: Stop claiming dot_4x8_sat support +- nak: Rework printing of int/float types and rounding modes +- nak: Wire up DP4 +- nvk: Advertise KHR_shader_integer_dot_product +- nak: Split legalize into per-SM functions +- nak: Initial WIP SM50 backend +- nak: Rework set_src_imm20 in nak_encode_sm50 +- nak: Rewrite SM50 encode_fadd to not use encode_alu +- nak: Rename LogicOp to LogicOp3 +- nak: Use OpLop2 and OpPSetP pre-SM70 +- nak: Rework the SM50 encoding of isetp +- nak: Add SM50 encodings for ALD and AST +- nak: Only split texture destinations on Volta+ +- nak: Rework nvfuzz for SM50 +- nak/nv50: Rewrite the encoding of OpShf +- nak/sm50: Wire up tex ops +- nak: Rewrite the SM50 encoding of OpF2I +- nak/sm50: Rewrite the encoding for OpIMnMx +- nak: Implement FS input interpolation on SM50 +- nak/sm50: Rewrite the encoding for OpMov +- nak: Drop the SM50 encoding of BREV +- nak/sm50: Add better helpers for 
encoding sources with modifiers +- nak/sm50: Stop using ALUSrc for IADD2 +- nak/sm50: Drop src_mod_has* in favor of core helpers +- nak: Clean up compiler warnings +- nak: Add barriers on Volta +- nak/nvfuzz: Add an SM parameter +- nak: Drop the fmnmx from Builder +- nak: Add an ftz bit to a bunch of float ops +- nak: Plumb through float controls +- nvk: Advertise VK_KHR_shader_float_controls +- nak: Plumb through float controls for fset[p] +- nak: Plumb through float controls for frnd[p] +- nak: Add dnz bits to OpFMul and OpFFma +- nak: Audit remaining FTZ/DNZ bits on sm70+ +- nak: Audit sm50 for FTZ/DNZ bits +- nak: Clean up instruction printing a bit +- nak: Rework barrier handling a bit +- nvk: Make NVK_DEBUG=push an alias for push_dump +- nvk: s/device/dev in nvk_descriptor_set_layout.c +- nvk: Plumb a physical device into descriptor_stride_align_for_type +- nvk: Add a nvk_min_cbuf_alignment() helper and use it +- nvk: Add an NVK_MIN_TEXEL_BUFFER_ALIGNMENT #define +- nak: Reduce minStorageBufferAlignment +- nvk: Simplify alignment limit plumbing +- nvk: CBuf alignment reduces to 64B on Turing +- nvk: Throw Tegra behind NVK_I_WANT_A_BROKEN_VULKAN_DRIVER +- nvk: Rework the way we set up memory heaps/types +- nir: Add a new has_fmulz_no_denorms flag +- nak: Set .ftz on f32 ops by default +- nak: Implement fmulz and ffmaz +- nvk: Enable NAK by default for Volta +- nak: Don't set both FTZ and DNZ at the same time +- nvk: Implement VK_EXT_multi_draw +- nak: Add a delay of 2 cycles for barriers +- nak: Rework the dependency pass +- nak: Handle negative cbuf offset immediates +- nak/sm50: Fix immediate encodings +- nak/sm50: Fix legalization of OpIAdd +- nak/sm50: Add legalization and encoding for OpLdc +- nvk/nir: Add cbuf analysis to nvi_nir_lower_descriptors() +- nvk/nir: Lower UBO loads to load_ubo when we have a cbuf +- nvk: Add a cbuf_bind_map to nvk_shader +- nvk: Stash descriptor set sizes +- nvk: Rework push_indirect to take an address +- nvk: Set 
MME_DATA_FIFO_CONFIG on device init +- nvk: Don't flush descriptors in BeginConditionalRendering +- nvk: Upload cbufs based on the cbuf_map +- nvk: Add debug flags to the physical device +- nvk: Enable cbufs +- nvk: Use ENUM_PACKED for enums instead of PACKED +- nir: Scalarize bounds checked loads and stores +- nak: Switch to //-style comments +- nak: Plumb shader model into instruction latency queries +- nak: Handle minimum execution latencies in the dep tracker +- nvk: Advertise VK_KHR_vulkan_memory_model +- nvk: Use render->color_att_count for color write enables +- nvk: Support extendedDynamicState3ColorWriteMask +- nak: Move the copy detection part of opt_copy_prop to a helper +- nak: Fix copy-prop for fp64 +- nak: Copy propagate and constant fold OpPrmt +- nak: Make OpAtom::cmpr a GPR source +- nak: Pass SrcTypes around instead of RegFile in legalize +- nak/sm70: Allow src2 of 3src ops to be an immediate +- nak: OpDAdd doesn't have saturate +- nak: Rework encoding of ALU instructions on SM70+ +- nak: Add the rest of the double-precision ops +- nak: Split fmul/ffma handling from fmulz/ffmaz +- nak: Wire up 64-bit nir_op_fadd/ffma/fmul and comparisons +- nak: Fix nir_op_f2f64 +- nak: Implement b2f64 +- nak/nir: Set nir_lower_io_lower_64bit_to_32 for varyings +- meson: Update our rust dependencies +- nak: Fix encoding of dsetp with RZ on SM70+ +- nak: Implement 64-bit nir_op_fsign +- nak/sm50: Add encoding and legalization for dadd/dfma/dmul/dsetp +- nak/sm50: Fix encoding of f20 immediates +- nak/sm50: Fix encoding of iadd with imm32 +- nak/sm50: Properly legalize OpSel and drop an assert +- nak/sm50: Add DMnMx and use it for fp64 fmin/fmax +- nir/lower_doubles: Add lowering for fmin/fmax/fsat +- nak/nir: Lower a bunch of fp64 +- nvk: Advertise shaderFloat64 +- nvk: Free shaders created by codegen +- nvk: Unref shaders on pipeline free +- nvk: Don't exnore ExternalImageFormatInfo +- nak: Fix TCS output reads + +Felix DeGrood (3): + +- anv: remove CS_FLUSH from 
query regression +- driconf: add Dying Light 2 to Intel XeSS workaround +- driconf: add Witcher3 to Intel XeSS workaround + +Felix bridault (1): + +- radv: use 32bit va range for sparse descriptor buffers + +Florian Weimer (1): + +- meson: C type error in strtod_l/strtof_l probe + +Francisco Jerez (70): + +- intel/l3/gfx11+: Add tile cache partition to intel_l3_config struct. +- intel/l3: Define helper for obtaining the size of an L3 partition in KB. +- intel/l3: Set up L3FullWayAllocationEnable config if ALL partition has over 126 ways. +- intel/dg2: Import L3 cache configurations. +- intel/mtl: Import L3 cache configurations. +- intel/xehp+: Add TBIMR-related genxml definitions. +- intel/xehp+: Import algorithm for TBIMR tiling parameter calculation. +- intel/xehp+: Add dynamic state flags controlling whether TBIMR is enabled during 3D primitives. +- intel/xehp+: Define driconf option for selectively disabling TBIMR. +- iris/xehp: Implement TBIMR tile pass setup and pipeline bandwidth estimation. +- anv/xehp: Implement TBIMR tile pass setup and pipeline bandwidth estimation. +- anv/xehp+: Enable TBIMR in generated draw calls. +- intel/xehp: Adjust TBIMR performance chicken bits. +- intel/xehp+: Adjust TBIMR batch size based on slice count. +- intel/xehp+: Use TBIMR tile box check in order to avoid performance regressions. +- intel/xehp: Enable TBIMR by default. +- intel/eu/xe2+: Add support for 10-bit SWSB representation on Xe2+ platforms. +- intel/fs/xe2+: Add comment reminding us to take advantage of the 32 SBID tokens. +- intel/fs/xe2+: Teach SWSB pass about the behavior of double precision instructions. +- intel/fs/xe2+: Handle extended math instructions as in-order in SWSB pass. +- intel/eu/xe2+: Add definition for size of GRF space on Xe2. +- intel/fs/xe2+: Don't special case SEL_EXEC in inferred_exec_pipe(). +- intel: Improve N-way pixel hashing computation to handle pixel pipes with asymmetric processing power. 
+- intel/compiler: Add max_polygons FS compilation parameter. +- intel/compiler: Add multipolygon dispatch fields to brw_wm_prog_data. +- intel/compiler: Add polygon count statistic to brw_compile_stats. +- intel/fs: Add separate constructor of fs_visitor for fragment shaders. +- intel/fs: Map all GS input attributes to ATTR register number 0. +- intel/fs: Map all VS input attributes to ATTR register number 0. +- intel/fs: Map all TES input attributes to ATTR register number 0. +- intel/fs: Assert fs_reg::nr is always zero for ATTR registers in geometry stages. +- intel/fs: Consider ATTR registers with different fs_reg::nr as belonging to disjoint register spaces. +- intel/fs: Provide component index explicitly to interp_reg(). +- intel/fs: Pass builder to per_primitive_reg(). +- intel/fs: Fix fs_reg::component_size() to handle two-dimensional register regions. +- intel/fs: Rework layout of FS vertex setup data in ATTR file to support multi-polygon dispatch. +- intel/fs: Don't copy-propagate ATTR registers in multi-polygon FS shaders when invalid. +- intel/compiler: Don't change types for copies from ATTR file. +- intel/fs/gfx12+: Don't set nir_divergence_single_prim_per_subgroup option for fragment shaders. +- intel/fs/gfx12: Don't consider multipolygon PS to have packed dispatch. +- intel/fs: No need to copy null destinations in lower_simd_width. +- intel/fs: Fix PS thread payload setup for depth_w_coef_reg. +- intel/fs/gfx12: Implement multi-polygon format of back/front-facing flag in PS payload. +- intel/fs/gfx12: Implement multi-polygon format of render target array index in PS payload. +- intel: Add debug flag for enabling dual-SIMD8 fragment shader dispatch. +- intel/compiler: Attempt to build dual-SIMD8 variant of fragment shaders on gfx12+ platforms. +- intel/genxml: Add 3DSTATE_PS definitions needed for dual-SIMD8 dispatch on Gfx12+. +- intel/gfx12: Enable SIMD8 dispatch in 3DSTATE_PS for FS multipolygon dispatch. 
+- iris/gfx12: Hook up dual-SIMD8 fragment shader dispatch. +- anv/gfx12: Hook up dual-SIMD8 fragment shader dispatch. +- intel/fs/xe2+: Stop building SIMD8 compute-like shaders (CS/BS/TS/MS). +- intel/fs/xe2+: Stop building SIMD8 fragment shaders. +- intel/fs/xe2+: Stop building SIMD8 shaders for geometry stages (VS/TCS/TES/GS). +- intel/eu/xe2+: Add helpers for constructing registers in 512b units. +- intel/fs/xe2+: Implement PS thread payload register offset setup. +- intel/fs/xe2+: Fix for new layout of X/Y pixel coordinates in PS payload. +- intel/fs/xe2+: Update uses of pixel/sample mask from PS thread payload. +- intel/fs/xe2+: Update location of sample ID fields in PS payload. +- intel/fs/xe2+: Update poly info PS payload for new multi-polygon dispatch format. +- intel/fs: Add support for vector payload values to fetch_payload_reg(). +- intel/fs/xe2+: Enable new format of barycentrics in PS payload. +- intel/fs/xe2+: Update for new layout of vertex setup data in PS payload. +- intel/fs/xe2+: Implement support for multi-polygon vertex setup data in PS payload. +- intel/fs/xe2+: Implement layout of mesh shading per-primitive inputs in PS thread payloads. +- intel/fs: Plumb shader instead of compiler to get_lowered_simd_width() and friends. +- intel/fs/xe2+: Lower SIMD width of instructions that access ATTR file from SIMD2x8/4x8 FS. +- intel: Add debug flags for enabling Xe2+ multipolygon fragment shader dispatch modes. +- intel/fs/xe2+: Attempt to build quad-SIMD8 and dual-SIMD16 FS variants on Xe2+ platforms. +- intel/xe2+: Implement fragment shader dispatch state setup. +- intel/compiler/xe2: Don't disassemble non-existent fields. 
+ +Frank Binns (4): + +- pvr: rename some more instances of 'reserved' to 'carveout' for consistency +- include/drm-uapi: add pvr_drm.h +- pvr: Add powervr winsys implementation +- pvr: alloc WSI memory via GPU when there isn't a valid display FD + +Friedrich Vock (24): + +- aco: Update printed block kinds +- vulkan: Don't use set_foreach_remove when destroying pipeline caches +- radv/ci: Update skips comments +- ac/gpu_info: Manually compute L3 size for Navi33 +- radv: Enable compute dispatch tunneling +- radv,vtn,driconf: Add and use radv_rt_ssbo_non_uniform workaround for Crysis 2/3 Remastered +- radv/rt: Initialize unused children in PLOC early-exit +- radv/rt: bsearch inlined shaders +- radv/rt: Free traversal NIR after compilation +- radv,aco: Convert 1D ray launches to 2D +- radv/rt: Move per-geometry build info into a geometry_data struct +- radv/rt: Acceleration structure updates +- radv/rt: Add workaround to make leaves always active +- radv: Fix shader replay allocation condition +- nir: Make is_trivial_deref_cast public +- nir: Handle casts in nir_opt_copy_prop_vars +- util: Provide a secure_getenv fallback for platforms without it +- vulkan: Use secure_getenv for trigger files +- aux/trace: Guard triggers behind __normal_user +- vtn: Use secure_getenv for shader dumping +- mesa/main: Use secure_getenv for shader dumping +- radv: Use secure_getenv in radv_builtin_cache_path +- radv: Use secure_getenv for RADV_THREAD_TRACE_TRIGGER +- util/disk_cache: Use secure_getenv to determine cache directories + +GKraats (1): + +- i915G: show correct number of needed ALU instructions at errmess + +Ganesh Belgur Ramachandra (9): + +- radeonsi: Fix clear-render-target shader for 1darrays in NIR +- radeonsi: "create_dma_compute" shader in nir +- radeonsi: "create_fmask_expand_cs" shader in nir +- radeonsi: "get_blitter_vs" shader in nir +- asahi: fixes prevailing '-Werror=maybe-uninitialized' issue +- radeonsi: enable nir pass for 64 bit operations +- radeonsi: add 
comments for unpack_2x16* utility functions +- radeonsi: convert "create_query_result_cs" shader to nir +- radeonsi: convert "gfx11_create_sh_query_result_cs" shader to nir + +Georg Lehmann (28): + +- aco, radv: vectorize f2f16 if rounding mode is rtz +- aco: force uniform result for LDS load with uniform address if it can be non uniform +- aco: stop using cstdint +- aco: namespace aco_opcode +- aco: deduplicate instr_class definition +- aco: deduplicate Format definition +- aco: don't CSE v_permlane across exec +- aco: use null operand for SOPK s_waitcnt +- aco: fix detecting sgprs read by SMEM hazard +- aco/tests: add some missing scc defs +- aco/tests: use correct operand size for some 64bit ops +- aco: use lm for carry out in vsub32 +- aco: add missing scc def for SALU quad broadcast +- aco/gfx10+: don't use v_cmpx with VCC def +- aco: use correct operand size for int tg4 wa +- aco: add src/def count and size for all ALU opcodes +- aco: validate ALU operands and defs +- aco/sched: treat p_dual_src_export_gfx11 like export +- aco: don't optimize DPP across more than one block +- aco: add test for post-ra DPP clobbered in linear cfg +- aco: optimize 32bit fsign by using fmulz with Inf +- aco: shrink buffer stores with undef/zero components +- aco/gfx12: implement broadcast dmask shrink behavior +- aco: apply packed fneg commutatively +- aco: fix applying input modifiers to DPP8 +- aco: clean up fneg/fabs combining +- aco: apply fneg/fabs to VOP3P +- aco: stop scheduling at p_logical_end + +George Ouzounoudis (9): + +- nvk: Move SET_BLEND_STATE_PER_TARGET to graphics state initialization +- nvk: Support extendedDynamicState3ColorBlendEnable +- nvk: Support extendedDynamicState3ColorBlendEquation +- nvk: Support extendedDynamicState3SampleMask +- nvk: Support extended dynamic state for alpha to coverage/one +- vulkan: Fix dynamic graphics state enum usage +- nvk: Support extended dynamic state for rasterization stream +- nvk: Remove pipeline state setting functions 
+- nvk: Support extended dynamic state for tessellation domain origin + +Gert Wollny (15): + +- virgl: Use host reported limits for max outputs +- r600: Add callbacks for get_driver_uuid and get_device_uuid +- r600: Add experimental get_compute_state_info +- r600: Link with libgalliumvl, when enabling rusticl this is needed +- r600/sfn: Fixup component count only if intrinsic has it +- r600/sfn: Allow skipping backend shader optimization for a subset of shaders +- r600/sfn: keep workgroup and invocation ID registers for whole shader +- r600/sfn: Fix usage of std::string constructor +- r600/sfn: Don't try to re-use iterators when the set is made empty +- zink: Don't pass a blend state when we have full ds3 support +- r600: lower dround_even also on hardware that supports fp64 +- virgl: Use better reporting for mirror_clamp features +- radv: Fix compilation with gcc-13 and tsan enabled +- nir/lower_int64: Fix compilation with gcc-13 and tsan enabled +- nir/builder: Fix compilation with gcc-13 when tsan is enabled + +Giancarlo Devich (1): + +- nir: Workaround MSVC internal compiler error in ARM64 build + +Guilherme Gallo (19): + +- ci/bin: Use iid instead of SHA in gitlab_gql +- ci/bin: Do not forget to add early-stage dependencies +- ci/bin: Refactor create_job_needs_dag +- ci/lava: Use project_name instead of hardcoded \`mesa` +- ci/lava: Fix imports formatting +- ci/lava: Refactor UART definition building blocks +- ci/lava: Create LAVAJobDefinition +- ci/lava: Make SSH definition wrap the UART one +- ci/lava: Enable SSH by default in fastboot devices +- ci/lava: Add unit tests covering job definition +- ci/bin: Fix find_dependency function calls +- ci/bin: Replace AIOHTTPTransport with RequestsHTTPTransport +- ci/bin: gql: make the query cache optional +- ci/bin: gql: Log the caching errors +- ci/bin: gql: Implement pagination +- ci/bin: gql: Improve queries for jobs/stages retrieval +- ci/bin: Fix gitlab_gql methods that uses needs DAG +- ci/bin: Fix mypy errors 
in gitlab_gql.py +- ci/bin: Print a summary list of dependency and target jobs + +Haihao Xiang (1): + +- anv: Fix typo in transition_color_buffer + +Hans-Kristian Arntzen (2): + +- radv/radeonsi: Forward correct GPU instance to umr. +- wsi/x11: Add workaround for Detroit Become Human. + +Helen Koike (3): + +- ci/zink: add spec\@ext_timer_query\@time-elapsed to flakes +- ci/ci_run_n_monitor: abort when target gets skipped +- ci: fix python-test dependency error on merge requests + +Hyunjun Ko (2): + +- vulkan/video: fix a typo +- anv/video: fix out-of-bounds read + +Iago Toral Quiroga (13): + +- v3d,v3dv: fix MMU error from hardware prefetch after ldunifa +- v3d: implement support for PIPE_CAP_NATIVE_FENCE_FD +- broadcom: fix scheduling dependencies for SETMSF instruction +- v3dv: disallow image stores on VK_KHR_DISPLAY surfaces +- v3dv: switch timestamp queries to using BO memory +- broadcom: disable perquad tmu loads after discards +- broadcom: lower null pointers +- v3dv: implement VK_KHR_shader_terminate_invocation +- v3dv: implement VK_EXT_shader_demote_to_helper_invocation +- v3dv: expose VK_EXT_subgroup_size_control +- broadcom/compiler: fix incorrect flags setup in non-uniform if path +- broadcom/compiler: fix incorrect flags update for subgroup elect +- broadcom/compiler: be more careful with unifa in non-uniform control flow + +Ian Romanick (39): + +- nir/split_vars: Don't split arrays of cooperative matrix types +- nir/lower_packing: Don't generate nir_pack_32_4x8_split on drivers that can't handle it +- nir/lower_packing: Add lowering for nir_op_unpack_32_4x8 +- nir/builder: Teach nir_pack_bits and nir_unpack_bits about 32_4x8 +- intel/vec4: Don't emit an empty ELSE +- intel/compiler: Add basic CFG validation +- intel/compiler: Limit scope of cur_endif variable +- intel/compiler: Delete bidirectional block links in opt_predicated_break +- intel/compiler: Don't create extra CFG links in opt_predicated_break +- intel/compiler: Don't create extra CFG links 
when deleting a block +- intel/compiler: Don't promote CFG link types when removing a block +- intel/fs: Don't add MOV instructions to DO blocks in combine constants +- intel/compiler: Verify that DO is alone in the block +- nir: Handle divergence for decl_reg +- intel/fs/xe2+: Pass correct dispatch_width to fs_generator for geometry-processing stages. +- intel/cmat: Update get_slice_type for packed slices +- intel/cmat: Add lowering for cmat_insert and cmat_extract +- intel/cmat: Enable packed formats for unary, length, and construct +- intel/cmat: Enable packed formats for binary ops +- intel/cmat: Enable packed formats for scalar ops +- intel/cmat: Add lowering for cmat_bitcast +- intel/cmat: Lower cmat_load and cmat_store +- intel/compiler: Initial bits for DPAS instruction +- intel/disasm: Disassembly support for DPAS +- intel/compiler: Validation for DPAS instructions +- intel/fs: Fix scoreboarding for DPAS +- intel/fs: DPAS lowering +- intel/fs: nir: Add nir_intrinsic_dpas_intel +- anv: Add anv_physical_device::has_cooperative_matrix +- anv: Set COMPUTE_WALKER systolic mode enable flag +- anv: Set PIPELINE_SELECT systolic mode enable flag +- anv: Lower indirect derefs again after lowering cooperative matrices +- anv: Select the SIMD mode very early when cooperative matrices are used +- intel/dev: Advertise integer configs with saturatingAccumulation too +- intel/dev: Enable VK_KHR_cooperative_matrix on all Gfx9+ GPUs +- intel/cmat: Generate better code for nir_intrinsic_cmat_insert +- intel/compiler: Disable DPAS instructions on MTL +- intel/compiler: Track lower_dpas flag in brw_get_compiler_config_value +- intel/compiler: Track mue_compaction and mue_header_packing flags in brw_get_compiler_config_value + +Italo Nicola (4): + +- panfrost: fix untracked dependency when converting resource modifier +- gallium: stop calling resource_copy_region for multisampled copy_image +- panfrost: legalize afbc before blitting +- panfrost: expose support for 
EXT_copy_image + +Iván Briano (8): + +- anv: use the right vertexOffset on CmdDrawMultiIndexed +- hasvk: ensure we reapply always pipeline dynamic state in runtime state +- anv: allow NULL index buffers +- anv: remove no longer valid assert +- anv: handle VkBindMemoryStatusKHR on buffer/image memory bind +- anv: add support for Cmd*DescriptorSet*2KHR +- anv: move astc_emu to use descriptors2 calls +- anv: enable VK_KHR_maintenance6 + +Jan Beich (2): + +- intel: make CLOCK_TAI optional for non-Linux +- intel: make CLOCK_BOOTTIME optional for non-Linux + +Jani Nikula (7): + +- nir: add names to some typedef'd structs/enums +- nir: drop \**< style documentation comments +- isl: drop \**< style documentation comments +- docs: Add docs/header-stubs/README.rst +- docs/vulkan: use hawkmoth instead of doxygen +- docs/nir: use hawkmoth instead of doxygen +- docs/isl: use hawkmoth instead of doxygen + +Janne Grunau (4): + +- gallium: Avoid empty version scripts in pipe-loader +- gallium: Fix i915 pipe-loader build +- gallium: Do not create pipe-loader version scripts for disabled drivers +- asahi: Fix typo in arch check in agx_get_gpu_timestamp + +Jesse Natalie (64): + +- microsoft: Disable post-merge CI for Windows +- d3d12: Only set draw params root parameter index for actual draw params +- dzn: Implement VK_MSFT_layered_driver +- wgl: Take pixelformat color channels into account for choosing a PFD +- winsys/gdi: Handle 4444 and 1010102 texture formats +- winsys/gdi: Update is_displaytarget_format_supported to reflect reality +- d3d12: Don't support displaytargets that can't be supported by GDI/DXGI +- dzn: Use vk_properties helper +- vulkan: Remove no-longer-needed prototypes for ICD entrypoints +- vulkan: Consolidate common ICD methods +- vulkan: Support loader interface v7 +- dzn: Fix memory type sorting +- microsoft/compiler: Set src/dest nir types on image intrinsics when deducing format +- d3d12: Disable common state promotion for non-simultaneous-access textures +- 
d3d12: Initialize shader key swizzle for non-int textures +- d3d12: Add a fallback for int clears where value can't be cast to float +- d3d12: Binding buffers as SSBO/storage image needs to add buffer ranges +- d3d12: Change memory barrier implementation +- d3d12: Support ARB_texture_view +- d3d12: Use format casting for shader images +- d3d12: GL4.3 +- microsoft/compiler: Bump signature limits for 32 rows of 4 components +- microsoft/compiler: Don't declare PS output registers split across variables +- microsoft/compiler: Don't use 64-bit types for signature entries +- microsoft/compiler: When packing fractional inputs, find a row with space for it +- microsoft/compiler: Stop lowering all I/O to temps +- d3d12: Fix location_frac_mask bitfield size +- d3d12: Split dvec3 interpolatns into devc2 and double +- d3d12: Support enhanced layouts for VS inputs +- d3d12: Fix GS variant I/O slot counts +- d3d12: Enable ARB_enhanced_layouts and ARB_texture_mirror_clamp_to_edge +- d3d12: Reference count queries in a batch +- d3d12: ARB_query_buffer_object and GL4.4 +- d3d12: PRIMITIVES_GENERATED for stream > 0 should only be an SO query +- d3d12: Handle cull distance as an XFB target +- d3d12: Fix MSAA-disabling pass; sample mask should be 0 for helper lanes +- d3d12: GL4.5 +- nir_lower_mem_access_bit_sizes: Fix write-mask-constrained 3-byte stores as atomics +- nir: Add a flag to opt_if to prevent fighting with splitting 64bit phis +- d3d12: Fixes for QBO shaders +- d3d12: Enable some 4.6 extensions that were already implemented +- d3d12: GL4.6 +- nir_lower_mem_access_bit_sizes: Fix assert (bit -> byte size) +- microsoft/compiler: Fix lower_mem_access_bit_size callback result +- d3d12/driconf: Force on ARB_texture_view for Blender +- d3d12: Fix multidimensional array ordering +- d3d12: Fix h264 encoder 32-bit build (uint64_t -> size_t) +- d3d12: Fix hevc encoder 32-bit build (uint64_t -> size_t) +- microsoft/clc: Fix image lowering pass to only erase variables at the end +- 
microsoft/clc: Fix images with multiple derefs for real +- microsoft/clc: Add a test which sinks image derefs +- microsoft/clc: One more image lowering fix +- compiler/clc: Don't fail to parse SPIR-V if there's no kernels +- microsoft/clc: Flip on capabilities to prevent warning spew +- microsoft: Whitespace change to trigger CI +- vulkan/wsi: Convert bit tests to bool with != 0 +- util: Re-implement getenv for Windows +- d3d12: Add a debug flag to opt out of singleton behavior +- d3d12: Only destroy the winsys during screen destruction, not reset +- libgl-gdi: Update wgl test to use a 32bit framebuffer +- libgl-gdi: Update wgl test to set debug flags needed for tests +- dzn: Fix 3D to 2D image copies +- zink: Add ASSERTED to vars that are only used for asserts +- mesa: Consider mesa format in addition to internal format for mip/cube completeness + +Jianxun Zhang (12): + +- intel/isl: Add a debug option to override modifer list +- intel: Move mod_plane_is_clear_color() into isl +- intel/vulkan: Report clear color in subresource layout +- intel/vulkan: Allow modifiers supporting fast clear +- intel/vulkan: Specify offset when creating aux state tracker +- intel/vulkan: Import aux state tracking buffer +- intel/vulkan: Remove private binding on fast clear region +- intel/vulkan: Use the last 2 dwords of clear color struct +- intel/vulkan: Correct a comment about an offset in fast clear +- intel/vulkan: Update comment of a workaround of modifiers +- intel/vulkan: Add COMPRESSED_CLEAR state in layout translation +- intel/isl: Add Gfx 12.x RC_CCS_CC into modifier scores + +Job Noorman (5): + +- ir3: correctly set bit size for 64b constant \@load_ubo +- nir: add _safe variants of nir_foreach_reg_load/store +- ir3: lower 64b registers +- nir: add helper to create cursor after all \@decl_regs +- ir3: lower 64b registers before creating preamble + +Jonathan Gray (2): + +- intel/common: add directory prefix to intel_gem.h include +- zink: put sysmacros.h include under #ifdef 
MAJOR_IN_SYSMACROS + +Jordan Justen (25): + +- intel/l3: Use devinfo->urb.size when cfg urb-size is 0. +- anv: Add more space for init_render_queue_state() batch (MTL regression) +- intel/dev/wa: Raise error if mesa_defs.json contains unknown platforms +- intel/dev: Rename mtl-m to mtl-u +- intel/dev: Rename mtl-p to mtl-h +- intel/compiler: Define XE2 compiler enum +- intel/genxml: Update COMPUTE_WALKER for xe2 +- iris: Set COMPUTE_WALKER Message SIMD field +- anv: Set COMPUTE_WALKER Message SIMD field +- intel/genxml: Update INTERFACE_DESCRIPTOR_DATA for xe2 +- anv, iris: Update INTERFACE_DESCRIPTOR_DATA programming for xe2 +- iris: xe2 doesn't have INTERFACE_DESCRIPTOR_DATA::BarrierEnable +- intel/genxml: Update 3DSTATE_TE for xe2 +- isl: Add mocs for xe2 +- intel/genxml: Add UNIFIED_COMPRESSION_FORMAT enum for xe2 +- anv, blorp, iris: Update 3DSTATE_PS programming for xe2 +- anv, blorp, iris, intel/genxml: Update 3DSTATE_VS for xe2 +- anv, blorp, iris, intel/genxml: Update 3DSTATE_PS_EXTRA for xe2 +- intel/batch_decoder: Update 3DSTATE_PS decoding for xe2 +- anv, iris, intel/genxml: Update 3DSTATE_GS for xe2 +- anv, iris, intel/genxml: Update 3DSTATE_HS for xe2 +- intel/compiler: Pass max_polygons to copy-prop from fs_visitor. +- intel/xe2+: Implement brw_wm_state_simd_width_for_ksp() on Xe2+. 
+- intel/genxml/gfx125: Move L1_CACHE_CONTROL to enum +- intel/genxml/gfx125: Move STATE_SURFACE_TYPE to enum + +Jordan Petridis (1): + +- Revert "ci: take microsoft farm offline" + +Joshua Ashton (2): + +- nvk: Hook up driconf for nvk_instance +- nvk: Enable KHR_present_id and KHR_present_wait + +José Expósito (5): + +- zink: Fix crash on zink_create_screen error path +- zink: fix dereference before NULL check +- zink: allow software rendering only if selected +- zink: initialize drm_fd to -1 +- egl/glx: fallback to software when Zink is forced and fails + +José Roberto de Souza (56): + +- anv: Add missing ANV_BO_ALLOC_EXTERNAL flags when calling anv_device_import_bo() +- intel: Add more information about the PAT entry used +- intel: Update MTL scanout PAT entry +- intel: Add a write combining PAT entry +- anv: Honor memory coherency of the memory type selected +- anv: Move PAT entry selection to common code +- anv: Change default PAT entry to WC +- anv: Calculate mmap mode based on alloc_flags +- anv: Remove anv_bo flags that can be inferred from alloc_flags +- iris: Add iris_bufmgr_get_pat_entry_for_bo_flags() +- intel/common: Add intel_gem_read_correlate_cpu_gpu_timestamp() +- anv: Reduce ifdefs in anv_GetCalibratedTimestampsEXT() +- anv: Make use of intel_gem_read_correlate_cpu_gpu_timestamp() +- intel/common/xe: Re implement xe_gem_read_render_timestamp() with xe_gem_read_correlate_cpu_gpu_timestamp() +- anv: Bring back the non optimized version of build_load_render_surface_state_address() +- intel: Sync xe_drm.h +- intel: Sync xe_drm.h +- iris: Change default PAT entry to WC +- intel: Rename PAT entries +- intel: Share function to do device query in Xe KMD +- iris: Check for maximum allowed priority in Xe KMD +- anv: Rename ANV_BO_ALLOC_SNOOPED to ANV_BO_ALLOC_HOST_CACHED_COHERENT +- anv: Add support all possible cached and coherent memory types +- intel: Add PAT entries for gfx12 and newer +- intel: Sync xe_drm.h +- intel: Enable has_set_pat_uapi for Xe +- 
iris: Prepare iris_heap_to_pat_entry() for discrete GPUs +- iris: Fill PAT fields in Xe KMD gem_create and vm_bind uAPIs +- anv: Prepare anv_device_get_pat_entry() for discrete GPUs +- anv: Fill PAT fields in Xe KMD gem_create and vm_bind uAPIs +- anv: Add heaps for Xe KMD in platforms without LLC +- intel/dev: Adjust prefetch_size values for Xe2 engines +- anv: Fix vm bind of DRM_XE_VM_BIND_FLAG_NULL +- iris: Fix the mmap mode for IRIS_HEAP_DEVICE_LOCAL_PREFERRED +- intel: Sync xe_drm.h take 2 part 3 +- intel/isl: Set mocs.blitter_dst/src for MTL +- anv: Fix handling of host_cached_coherent bos in gen9 lp in older kernels +- anv: Split ANV_BO_ALLOC_HOST_CACHED_COHERENT into two actual flags +- anv: Promote bos to host_cached+host_coherent in platforms with LLC +- anv: Avoid unnecessary intel_flush calls +- intel/genxml/xe2: Update PIPE_CONTROL +- intel/genxml/xe2: Update PIPELINE_SELECT +- intel: Sync xe_drm.h final part +- anv: Remove libdrm usage from Xe KMD backend +- anv: Add ANV_BO_ALLOC_IMPORTED +- anv: Replace anv_bo.vram_only by anv_bo.alloc_flags check +- anv: Assume that imported bos already have flat CCS requirements satisfied +- intel/isl/xe2: Enable route of Sampler LD message to LSC +- utils/u_debug: Fix parse of "all, +- anv: Increase ANV_MAX_QUEUE_FAMILIES +- anv: Drop useless STATIC_ASSERT in anv_physical_device_init_queue_families() +- anv: Simply companion_rcs handling +- anv: Add missing anv_measure_submit() calls in Xe KMD backend +- anv: Fix anv_measure_start/stop_snapshot() over copy or video engine +- anv: Call anv_measure_submit() before anv_cmd_buffer_chain_command_buffers() +- anv: Fix PAT entry for userptr in integrated GPUs + +Juan A. 
Suarez Romero (12): + +- v3d/ci: run V3D GL tests in 64-bits +- v3d: use kmsro to create drm screen on real hw +- vc4/ci: comment why piglit is disabled +- broadcom/ci: separate hiden jobs to -inc.yml files +- v3d: include the revision in the device name +- ci/baremetal: make BM_BOOTCONFIG optional +- ci: do not mount already mounted directories +- ci/v3d/vc4: remove explicit modules to load +- ci/v3dv: add new failures +- ci/v3dv: update results +- ci/vc4/v3d: remove some flakes +- ci/v3d: add support for rpi5 + +Julia Zhang (1): + +- radeonsi: modify binning settings to improve performance + +Juston Li (17): + +- venus: add helper function to get cmd handle +- venus: refactor out common cmd feedback functions +- venus: support deferred query feedback recording +- venus: track/recycle appended query feedback cmds +- venus: append query feedback at submission time +- venus: switch to unconditionally deferred query feedback +- venus: sync protocol for VK_EXT_extended_dynamic_state3 +- venus: pipeline fixes for VK_EXT_extended_dynamic_state3 +- venus: enable VK_EXT_extended_dynamic_state3 +- venus: disable unsupported ExtendedDynamicState3Features +- venus: implement vkGet[Device]ImageSparseMemoryRequirements +- radv: enable stippledBresenhamLines on GFX9 chips +- venus: fix query feedback copy sanitize off by 1 +- venus: rename buffer cache to buffer reqs cache +- venus: use vk_format helper for plane count +- venus: support caching image memory requirements +- venus: add LRU cache eviction for image mem reqs cache + +Kai Wasserbäch (1): + +- fix: ac/llvm: LLVM 18: remove useless passes, partially removed upstream + +Karol Herbst (74): + +- vtn/opencl: always lower to libclc fmod +- rusticl/device: restrict image_buffer_size +- rusticl/device: restrict param_max_size further +- rusticl/mem: properly set pipe_image_view::access +- zink: support CLAMP_TO_BORDER with unnormalized coords +- zink: alias nir scratch memory by lowering to common bit_size +- zink: emit 
float controls +- zink: lower fisnormal as it requires the Kernel Cap +- radv: fix buffers in vkGetDescriptorEXT with size not aligned to 4 +- rusticl/queue: Only take a weak ref to the last Event +- rusticl/device: restrict const max size to 1 << 26 bytes +- rusticl/mesa: pass PIPE_BIND_LINEAR in resource_create_texture_from_user +- rusticl: handle failed maps gracefully +- zink: validate pointer alignment in resource_from_user_memory +- zink: handle denorm preserve execution modes +- zink: deallocate global_bindings array +- zink: emit MemoryAccess flags for coherent global load/stores +- rusticl/mesa/screen: do not derefence the entire pipe_screen struct +- nir: Stop assuming glsl_get_length() returns 0 for vectors +- ir2: Stop assuming glsl_get_length() returns 0 for vectors +- nvc0: implement PIPE_CAP_TIMER_RESOLUTION +- radeonsi: support importing arbitrary resources +- radeonsi: hack for importing 3D textures +- rusticl/context: fix importing gl cube maps +- docs/features: mark rusticl gl_sharing as done +- rusticl/queue: do not send empty lists of event to worker queue +- rusticl/queue: fix implicit flushing of queue dependencies +- rusticl: only support the matching device for gl_sharing +- rusticl/memory: fix new clippy::needless-borrow warning +- nir: allow vec derefs on system values +- vtn: add hack for system values placed in CrossWorkgroup memory +- rusticl/api: workaround DPCPP fetching clSetProgramSpecializationConstant +- rusticl: add x11 dependency +- rusticl/gl: make GLX support optional +- clc: allow debug flag to be read from other files +- clc: add dump_llvm debug options +- nir/opt_preamble: make load_workgroup_size handling optional +- radeonsi: lower relative shuffle subgroup ops +- radeonsi: lower 64bit subgroup shuffle to 32 bit +- clc: add support for cl_khr_subgroup_shuffle and shuffle_relative +- rusticl: implement cl_khr_subgroup_shuffle and shuffle_relative +- ci/fedora: bump to meson 1.3.0 +- rusticl: bump meson req +- rusticl: use 
rust.proc_macro for proc macros +- clc: use addMacroDef/Undef instead of -D/-U flags +- nak: fix some sm checks for volta +- nir/algebraic: add support for custom arguments +- nak: add algebraic lowering pass +- nak: move nir_lower_subgroups into nak_postprocess_nir +- rusticl/kernel: explicitly set rounding modes +- radeonsi: fix reg_saved_mask for non graphics contexts +- clc: add workaround for clang always defining __IMAGE_SUPPORT_ and __opencl_c_int64 +- rusticl: do not warn on empty RUSTICL_DEBUG or RUSTICL_FEATURES +- rusticl: silence clippy::arc-with-non-send-sync for now +- rusticl: fix constant and printf buffer size +- rusticl/nir: add missing nir include +- rusticl: check rustc version for flags requiring newer rustc/clippy +- ci: merge debian-rusticl-testing into debian-testing +- zink: lock screen queue on context_destroy and CreateSwapchain +- clc: remove code supporting pre llvm-10 +- zink: fix heap-use-after-free on batch_state with sub-allocated pipe_resources +- rusticl: specify buffer bindings explicitly +- rusticl: add QueueContext to track GPU state +- rusticl/queue: release bound constant buffer +- rusticl: use real buffer for cb0 for drivers prefering +- ci,rusticl: bump meson req to 1.3.1 +- rusticl/meson: generate bindings for LLVM +- rusticl/program: add LLVM functions to cache timestamp +- rusticl/llvm: do not include spirv-tools/linker.hpp +- rusticl/kernel: run opt/lower_memcpy later to fix a crash +- nir: rework and fix rotate lowering +- nak/opt_out: fix comparison in try_combine_outs +- rusticl/kernel: check that local size on dispatch doesn't exceed limits +- clc: force fPIC for every user when using shared LLVM + +Kenneth Graunke (21): + +- intel/compiler: Delete unused emit_dummy_fs() +- intel/compiler: Delete unused repclear shader uniform handling +- intel/compiler: Delete repclear shader's special case for 1 color target +- intel/compiler: Drop unused saturate handling in repclear shader +- intel/compiler: Convert the repclear 
shader to use send-from-GRF +- intel/compiler: Assert that FS_OPCODE_[REP\_]FB_WRITE is for pre-Gfx7 +- iris: Make an iris_bucket_cache structure and array per heap +- iris: Make an iris_heap_is_device_local() helper +- iris: Rename heap_flags -> heap in i915_gem_create +- iris: Split system memory heap into cached-coherent and uncached heaps +- iris: Use 64K BOs for the shader uploader +- iris: Align fresh BO allocations to 2MB in size +- iris: Ensure virtual addresses are aligned to 2MB for 2MB+ blocks +- anv: Implement rudimentary VK_AMD_buffer_marker support +- anv: Drop 3/4 of PPGTT size restriction for sys heap size calculation +- anv: Don't report more memory available than the heap size +- intel/fs: Allow omitting the destination of A64 untyped atomics +- intel/fs: Drop opt_register_renaming() +- iris: Initialize bo->index to -1 when importing buffers +- iris: Don't search the exec list if BOs have never been added to one +- iris: Skip mi_builder init for indirect draws + +Konstantin Seurer (40): + +- radv: Add RADV_MAX_HIT_ATTRIB_DWORDS +- radv/nir: Add radv_nir_lower_hit_attrib_derefs +- radv/nir: Handle boolean hit attribs +- radv/clang-format: Do not indent C++ modifiers +- radv: Add radv_nir_lower_hit_attrib_derefs_tests +- radv/sqtt: Fix tracing acceleration structure commands +- radv/sqtt: Handle monolithic RT pipelines +- radv/rt: Use a helper for inlining non-recursive stages +- radv/rt: Skip null checks for small case counts +- nir/lower_vars_to_scratch: Remove all unused derefs +- drm-shim/nouveau: Set nv_device_info_v0::platform +- drm-shim/nouveau: Expose the 2D engine on NV50+ +- drm-shim/nouveau: Stub mitting ioctls +- nvk: Do not preserve metadata after lower_load_global_constant_offset_instr +- radv: Add more offsets acceleration_structure_layout +- radv/bvh: Stop emitting leaf nodes inside the encoder +- nir: Optimize fpow with small constant exponents +- radv: Implement VK_KHR_ray_tracing_position_fetch +- radv: Make pipeline cache object 
data generic +- radv: Don't store library stack sizes +- radv: Add more ray tracing data to the cache +- radv/rt: Skip compiling a traversal shader +- radv: Skip compiling chit and miss shaders +- radv/rt: Remove useless assert +- radv/rt: Use radv_shader for compiled shaders +- radv/sqtt: Avoid duplicate stage check +- radv/rt: Repurpose radv_ray_tracing_stage_is_compiled +- vtn: Remove transpose(m0)*m1 fast path +- ac/nir: Export clip distances according to clip_cull_mask +- vtn: Handle DepthReplacing correctly +- radv/rmv: Fix tracing ray tracing pipelines +- radv/rt/rmv: Log pipeline library creation +- radv: Use PLOC for TLAS builds +- radv: Remove the BVH depth heuristics +- radv/rt: Lower ray payloads to registers +- vtn: Allow for OpCopyLogical with different but compatible types +- ac/llvm: Enable helper invocations for quad OPs +- lavapipe: Fix DGC vertex buffer handling +- lavapipe: Mark vertex elements dirty if the stride changed +- lavapipe: Report the correct preprocess buffer size + +Lang Yu (1): + +- radeonsi: emit SQ_NON_EVENT for GFX11_5 + +Leo Liu (2): + +- gallium/vl: match YUYV/UYVY swizzle with change of color channels +- radeonsi: fix video processing path without VPE enabled + +LingMan (9): + +- rusticl: Show an error message if the build is attempted with an outdated bindgen version +- rusticl: Show an error message if the version of bindgen can't be detected +- rusticl: Directly pass a \`&Device` to \`Mem::map_image` and \`Mem::map_buffer` +- rusticl: Only put an Arc around PipeScreen where needed +- rusticl: Avoid repeatedly creating Vecs during Platform initialization +- rusticl: Turn pointers in enqueue_svm_mem_fill_impl into proper Rust types +- rusticl: Turn pointers in enqueue_svm_memcpy_impl into slices +- rusticl/api: Add checking wrappers around \`slice::from_raw_parts{_mut}` +- rusticl: Use the \`from_raw_parts` wrappers + +Lionel Landwerlin (88): + +- intel/fs: fix dynamic interpolation mode selection +- anv/meson: add missing 
dependency on the interface header +- anv: ensure we reapply always pipeline dynamic state in runtime state +- intel/fs: Xe2 fix for ExBSO on UGM +- blorp: handle binding table & surface state allocation failures +- anv: rename internal heaps +- anv: deal with state stream allocation failures +- anv: add max_size argument for block & state pools +- anv: make sure pools can handle more than 2Gb +- anv: fail pool allocation when over the maximal size +- anv: use anv_state_pool_state_address for blorp vertex buffer address +- anv: fix corner case of mutable descriptor pool creation +- anv: dynamically allocate utrace batch buffers +- perfetto/pps-producer: add optimized cpu/gpu timestamp correlation support +- intel/ds: use improved timestamp correlation if available +- isl: disable MCS compression on R9G9B9E5 +- intel: fix PXP status check +- anv: handle protected memory allocation +- anv: allow creation of protected queues +- anv: Emit protection + session ID on protected command buffers +- anv: allow protected GEM context creation +- anv: enable protected memory +- intel/fs: fix residency handling on Xe2 +- anv: workaround XeSS for Satisfactory +- intel/fs: rerun divergence analysis prior to convert_from_ssa +- intel/nir/rt: fix reportIntersection() hitT handling +- anv: fix source_hash propagation with libraries +- anv: fix missing naming for dirty bit +- anv: fix CC_VIEWPORT pointer dirty after blorp/simple-shaders +- anv: fix dirty state tracking for 3DSTATE_PUSH_CONSTANT_ALLOC +- intel/decoder: handle 3DPRIMITIVE_EXTENDED in accumulated prints +- intel/blorp: move Wa_18019816803 out of blorp code +- anv: get rid of the duplicate pipeline fields in command buffer state +- anv/blorp: move helper function about BTI changes to blorp +- intel/perf: fix querying of configurations +- intel/fs: fix incorrect register flag interaction with dynamic interpolator mode +- intel/fs: reuse set_predicate() +- intel/aux_map: introduce ref count of L1 entries +- anv: use main 
image address to determine ccs compatibility +- anv: track & unbind image aux-tt binding +- anv: remove heuristic preferring dedicated allocations +- intel/ds: add trace of buffer markers +- intel/tools: add hang_replay tool +- intel/hang_replay: add the ability to pass the context image to sim-drm +- intel: add error2hangdump tool +- intel/aubinator_error_decode: bump max buffers to 1024 +- intel/error_decode: map i915 gfx12.5 register names to our names +- intel/tools: hang viewer/editor +- anv: add a sampler state pool +- anv: move descriptor set type selection to earlier +- anv: make a couple of descriptor function private +- anv: add missing push descriptor flush on ray tracing pipelines +- anv: set layout printer +- anv: use 2 different buffers for surfaces/samplers in descriptor sets +- intel/hang_replay: fix compile race with generated files +- intel/tools: 32bit compile fixes +- vulkan/runtime: retain video session creation flags +- anv/video: only report matching memory types for protected sessions +- util/u_printf: add a u_printf_ptr() variant +- nir: make printf_info (de)serializer available +- nir/clone: fix missing printf_info clone +- nir: include printfs from linked shaders +- nir/divergence: handle printf intrinsic +- nir/serialize: untangle printf serialization from a particular stage +- nir: fixup nir_printf intrinsic description +- anv: fix incorrect queue_family access on command buffer +- isl: constify isl_device_get_sample_counts() +- anv: get features after initializing drm +- anv: switch to use runtime physical device properties infrastructure +- anv: promote EXT_vertex_attribute_divisor to KHR +- anv: promote EXT_calibrated_timestamps to KHR +- isl: drop AUX-TT CCS alignment with INTEL_DEBUG=noccs +- anv: wait for CS write completion before executing secondary +- isl: further restrict alignment constraints +- isl: implement Wa_22015614752 +- intel/fs: fix depth compute state for unchanged depth layout +- anv: remove 
ANV_ENABLE_GENERATED_INDIRECT_DRAWS variable +- anv: fix disabled Wa_14017076903/18022508906 +- intel/aux_map: fix fallback unmapping range on failure +- anv: hide vendor ID for The Finals +- anv: fix pipeline executable properties with graphics libraries +- anv: implement undocumented tile cache flush requirements +- anv: don't prevent L1 untyped cache flush in 3D mode +- anv: add missing alignment for AUX-TT mapping +- anv: factor out aux-tt binding logic for future reuse +- anv: rename aux_tt image field +- anv: retain ccs image binding address +- anv: fix transfer barriers flushes with compute queue + +Louis-Francis Ratté-Boulianne (4): + +- panfrost: factor out method to check whether we can discard resource +- panfrost: add copy_resource flag to pan_resource_modifier_convert +- panfrost: add can_discard flag to pan_legalize_afbc_format +- panfrost: Legalize before updating part of a AFBC-packed texture + +Luc Ma (1): + +- loader: Remove a line of unused include + +Luca Weiss (1): + +- freedreno: Enable A305B + +Lucas Fryzek (2): + +- freedreno/drm: Add more APIs to per backend API +- gallivm/nir: Load all inputs into indirect inputs array + +Lucas Stach (2): + +- etnaviv: drm: don't update cmdstream timestamp when skipping submit +- etnaviv: disable 64bpp render/sampler formats + +Lynne (1): + +- radv: change queue family order in radv_get_physical_device_queue_family_properties + +M Henning (21): + +- nak: Fix a warn(unused_must_use) by calling drop +- nak: Remove MemScope::Cluster +- nak: Memory order/scope encodings for Ampere +- nak: Specify MemScope on MemOrder::Strong +- nak: Bind nir_intrinsic_access +- nak: Add MemOrder::Constant +- nvk: Use load_global_constant for ubo loads +- nak: Add encodings for cache eviction priorities +- nak: Set "evict first" from ACCESS_NON_TEMPORAL +- nak: Request alignment that matches the load width +- nak: Use nir_combined_align +- nvk: Fix descriptor alignment offset +- nak: Provide robustness info to postprocess_nir 
+- nak: Call nir_opt_load_store_vectorize +- nak: Call nir_opt_combine_barriers +- nak: Call nir_opt_shrink_vectors +- nak: Clamp negative texture array indices to zero +- nak: Enable loop unrolling. +- nak: Print out an instruction count +- nak: Add a jump threading pass +- nak: Optimize jumps to fall-through if possible + +Marcin Ślusarz (1): + +- anv: fix minSubgroupSize for xe2 + +Marek Olšák (199): + +- radeonsi: initialize perfetto in the right place +- ac: add missing gfx11.5 bits +- ac/gpu_info: adjust attribute ring size for gfx11 +- ac/surface: cosmetic changes +- ac/surface/tests: cosmetic changes +- radeonsi: don't use nir_optimization_barrier_vgpr_amd with ACO +- radeonsi: inline si_allocate_gds and si_add_gds_to_buffer_list +- radeonsi: inline si_screen_clear_buffer +- radeonsi: remove redundant VS_PARTIAL_FLUSH for streamout +- radeonsi: remove AMD_DEBUG=nogfx +- radeonsi: rename ctx -> sctx in si_emit_guardband +- radeonsi: remove and inline si_shader::ngg::prim_amp_factor +- radeonsi: decrease PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS to 1024 +- radeonsi: cosmetic changes in si_pm4.c +- radeonsi: split setting num_threads in si_emit_dispatch_packets +- radeonsi: use si_shader_uses_streamout properly +- radeonsi: adjust setting PA_SC_EDGERULE once more +- radeonsi: various isolated cosmetic changes +- radeonsi: move max_dist for MSAA into si_state_msaa.c +- radeonsi: cosmetic changes in si_state_viewport.c +- radeonsi: cosmetic changes in si_state_binning.c, si_state_msaa.c +- radeonsi: move setting registers at the end of si_emit_cb_render_state +- ac/gpu_info: split has_set_pairs_packets into context and sh flags +- ac/gpu_info,llvm: trivial cosmetic changes +- radeonsi: clean up si_set_streamout_targets +- radeonsi: upload shaders using a compute queue instead of gfx +- radeonsi: rewrite PM4 packet building helpers with less duplication +- radeonsi: move buffered_xx_regs into a substructure +- radeonsi: rename HAS_PAIRS -> HAS_SH_PAIRS_PACKED 
+- radeonsi: rename radeon_*push_*_sh_reg -> gfx11_*push_*_sh_reg +- radeonsi: rewrite gfx11_*push*_sh_reg helpers +- radeonsi: restructure blocks in si_setup_nir_user_data +- radeonsi: restructure blocks in si_emit_graphics_{shader,compute}_pointers +- radeonsi/gfx11: use PKT3_SET_CONTEXT_REG_PAIRS_PACKED for PM4 states +- radeonsi: don't call nir_lower_compute_system_values too many times +- radeonsi: don't check DCC compatibility on chips where it's no-op +- radeonsi: cosmetic changes in si_emit_db_render_state +- radeonsi: prettify code around PA_SC_LINE_STIPPLE +- radeonsi: move emitting VGT_TF_PARAM into gfx10_emit_shader_ngg +- radeonsi: remove num_params variable from gfx10_shader_ngg +- radeonsi: move SPI_SHADER_IDX_FORMAT into the preamble (it's immutable) +- radeonsi: adjust the total viewport area +- radeonsi/gfx11: use SET_CONTEXT_REG_PAIRS_PACKED for other states +- radeonsi/gfx11: don't set OREO_MODE to fix rare corruption +- radeonsi: don't dma-upload shaders on APUs +- radeonsi/ci: update failures for gfx103 +- st/mesa: disable light_twoside if back faces are culled +- glsl/nir: return failure from link_varyings if there is a linker error +- nir: add lowering from FS LAYER input to LAYER_ID sysval +- nir: return progress from nir_remove_sysval_output +- ac/nir: add kill_layer flag to VS/GS/NGG lowering +- st/mesa: set pipe_framebuffer_state::layers for PBO blits +- radeonsi: clean up si_nir_kill_outputs +- radeonsi: don't allocate output space for LAYER/VIEWPORT before TES and GS +- radeonsi: implement gl_Layer in FS as a system value +- radeonsi: remove the LAYER output if the framebuffer state has only 1 layer +- nir: fix gathering TESS_LEVEL_INNER/OUTER usage with lowered IO +- nir: don't declare illegal varyings in nir_create_passthrough_tcs +- nir/print: print PATCH0 and VARn_16BIT names instead of numbers for TCS and TES +- gallium/docs: make CAP doc order match definition order +- gallium: add PIPE_CAP_PERFORMANCE_MONITOR for 
GL_AMD_performance_monitor +- radeonsi: group equal CAP cases +- radeonsi: only expose GL_AMD_performance_monitor on gfx7-10.3 +- ac: rename ac_parse_ib.c -> ac_ib_parser.c +- ac: move the IB parsers into ac_parse_ib.c +- ac: add an IB parser that gathers context rolls +- mesa: optimize _mesa_matrix_is_identity +- mesa: skip checking for identity matrix in glMultMatrixf with glthread +- mesa: optimize setting the identity matrix +- glthread: add a marker at the end of batches indicating the end +- glthread: eliminate push/pop calls in PushMatrix+Draw/MultMatrixf+PopMatrix +- glthread: add option to put autogenerated marshal structures in the header file +- glapi: rename primcount -> instance_count in a few Draw functions +- glthread: use autogenerated marshal structures for custom functions +- glthread: rework type reduction and reduce vertex stride params to 16 bits +- glapi: only expose GL_EXT_direct_state_access functions to GL compatibility +- glthread: don't do "if (COMPAT)" if the function is not in the GL core profile +- glapi: only allow deprecated="" on non-aliased functions +- glthread: pass struct marshal_cmd_DrawElementsUserBuf into Draw directly +- mesa: deduplicate glVertexPointer and glNormalPointer vs DSA error checking +- glthread: add a string table of function names +- radeonsi/gfx11: fix unaligned SET_CONTEXT_PAIRS_PACKED +- radeonsi: don't set non-existent VGT_GS_MAX_PRIMS_PER_SUBGROUP on gfx10 +- radeonsi: change the low-priority compiler queue to normal priority +- radeonsi: update shaders for blend state only if the shader key changed +- radeonsi: update shaders for rasterizer state only if the shader key changed +- radeonsi: clean up setting poly/line/stipple shader key bits +- radeonsi: rewrite how shader key bits dependent on current_rast_prim are updated +- radeonsi: rewrite si_get_total_colormask as si_any_colorbuffer_written +- radeonsi: in bind_{blend,rs}_state, only call 1 update function per if +- radeonsi/gfx11: skip 
si_set_streamout_enable because it has no effect +- radeonsi: execute streamout_begin after cache flushes +- radeonsi: don't print the preamble state separately for GALLIUM_DDEBUG +- radeonsi: replace gl_FrontFacing with a constant if one side is always culled +- radeonsi: set OOB_SELECT for VBOs in si_create_vertex_elements +- radeonsi: group most vertex element fields +- radeonsi/gfx11: prefer Wave64 for PS without inputs for better VALU perf +- radeonsi/gfx11: disable the shader profile for Medical that forces Wave64 +- radeonsi/gfx11: disable the shader profile for Medical that disables binning +- radeonsi: clean up how debug flags and shader profiles determine the wave size +- radeonsi/gfx11: prefer Wave64 for VS/TCS/TES/GS because it's slightly faster +- winsys/amdgpu: bypass GL2 for command buffers +- radeonsi: track NIR progress properly for optimizations in si_get_nir_shader +- ac,radeonsi: rename pos_inputs -> fragcoord_components +- nir,radeonsi: add FLAGS into load_vector_arg_amd to record color input usage +- radeonsi: change the signature of si_nir_lower_ps_color_input +- radeonsi: gather lowered color inputs for monolithic PS +- radeonsi: add PS input info into si_shader_binary_info +- radeonsi: don't include the PARAM_GEN input in si_shader_info +- radeonsi: decrease NUM_INTERP if uniform inlining eliminated PS inputs +- radeonsi: update comments about uniform inlining +- radeonsi: decrease NUM_INTERP if export formats/colormask eliminated PS inputs +- util: make BITSET_TEST_RANGE_INSIDE_WORD take a value to compare with +- radeonsi: merge context_reg_saved_mask and other_reg_saved_mask into a BITSET +- radeonsi: convert depth-stencil-alpha state to tracked registers +- radeonsi: convert rasterizer state to tracked registers +- ac/gpu_info: fix printing radeon_info after adding VPE +- radeonsi: rework how guardband registers are updated to decrease overhead +- mesa: fix _mesa_matrix_is_identity +- mesa: remove some DrawTransformFeedback duplication 
+- mesa: remove some DrawElementsInstanced duplication +- mesa: remove more DrawArrays/Elements duplication +- mesa: remove non-relevant 16-year-old comment +- st/mesa: make prepare_(indexed\_)draw non-static +- mesa: inline st_draw_transform_feedback +- mesa: call st_prepare_(indexed\_)draw before Driver.DrawGallium(MultiMode) +- st/mesa: no need to check index_size in st_prepare_indexed_draw anymore +- mesa: move index bounds code (st_prepare_indexed_draw) into draw.c +- cso: do cso_context inheritance how we do it elsewhere +- cso: inline cso_get_pipe_context +- mesa: execute an error path sooner in _mesa_validated_drawrangeelements +- gallium: add typedef pipe_draw_func matching the draw_vbo signature and use it +- ac/llvm: remove code for converting txd from 1D to 2D because NIR does it +- ac,radeonsi: require DRM 3.27+ (kernel 4.20+) same as RADV +- winsys/amdgpu: don't return a value from cs_add_buffer +- winsys/amdgpu: cosmetic changes in amdgpu_cs_add_buffer +- winsys/amdgpu: inline amdgpu_add_fence_dependencies_bo_lists +- winsys/amdgpu: use inheritance for the cache_entry BO field +- winsys/amdgpu: use inheritance for the real BO +- winsys/amdgpu: use inheritance for the sparse BO +- winsys/amdgpu: use inheritance for the slab BO +- winsys/amdgpu: move lock from amdgpu_winsys_bo into sparse and real BOs +- winsys/amdgpu: don't count memory usage because it's unused +- winsys/amdgpu: change real/slab/sparse_buffers to buffer_lists[3] +- winsys/amdgpu: change amdgpu_lookup_buffer to take struct amdgpu_buffer_list +- winsys/amdgpu: clean up duplicated code around amdgpu_lookup/add_buffer +- winsys/amdgpu: return amdgpu_cs_buffer* from add/lookup_buffer instead of index +- winsys/amdgpu: pass amdgpu_buffer_list* to amdgpu_add_bo_fences_to_dependencies +- winsys/amdgpu: clean up the rest of the code for cs->buffer_lists +- winsys/amdgpu: fix amdgpu_cs_has_user_fence for VPE +- winsys/amdgpu: document BO structures +- ci: disable the google/freedreno farm 
because it's down +- glthread: add a missing end-of-batch marker +- mesa: micro-improvements in draw.c +- st/mesa: restore pipe_draw_info::mode at the end of st_hw_select_draw_gallium +- mesa: add a pipe_draw_indirect_info* parameter into the DrawGallium callback +- mesa: enable GL_SELECT and GL_FEEDBACK modes for indirect draws +- winsys/amdgpu: reduce wasted memory due to the size tolerance in pb_cache +- gallium/pb_slab: move group_index and entry_size from pb_slab_entry to pb_slab +- iris,zink,winsys/amdgpu: remove unused/redundant slab->entry_size +- winsys/amdgpu: rename to amdgpu_bo_slab to amdgpu_bo_slab_entry +- winsys/amdgpu: stop using pb_buffer::vtbl +- gallium/pb_cache: remove pb_cache_entry::end to save space +- gallium/pb_cache: switch time variables to milliseconds and 32-bit type +- radeon_winsys: add struct radeon_winsys* parameter into fence_reference +- r300,r600,radeon/winsys: always pass the winsys to radeon_bo_reference +- winsys/amdgpu: don't layer slabs, use only 1 level of slabs, it improves perf +- winsys/amdgpu: add amdgpu_bo_real_reusable slab for the backing buffer +- winsys/amdgpu: remove now-redundant amdgpu_bo_slab_entry::real +- winsys/amdgpu: remove va (gpu_address) from amdgpu_bo_slab_entry +- winsys/amdgpu: don't use gpu_address to compute slab entry offset in bo_map +- gallium/pb_buffer: define pb_buffer_lean without vtbl, inherit it by pb_buffer +- gallium/pb_cache: switch to pb_buffer_lean +- gallium/pb_cache: remove pb_cache_entry::mgr +- gallium/pb_cache: remove pb_cache_entry::buffer +- winsys/radeon: stop using pb_buffer::vtbl +- r300,r600,radeonsi: switch to pb_buffer_lean +- winsys/amdgpu: allocate 1 amdgpu_bo_slab_entry per cache line +- winsys/amdgpu: compute bo->unique_id at pb_slab_alloc, not at memory allocation +- winsys/amdgpu: rewrite BO fence tracking by adding a new queue fence system +- winsys/amdgpu: rename amdgpu_winsys_bo::bo -> bo_handle +- winsys/amdgpu: rename amdgpu_bo_sparse::lock -> commit_lock +- 
winsys/amdgpu: rename amdgpu_bo_real::lock to map_lock +- winsys/amdgpu: remove dependency_flags parameter from cs_add_fence_dependency +- winsys/amdgpu: implement explicit fence dependencies as sequence numbers +- winsys/amdgpu: use pipe_reference for amdgpu_ctx refcounting +- winsys/amdgpu: don't use amdgpu_fence::ctx for fence dependencies +- winsys/amdgpu: simplify code using amdgpu_cs_context::chunk_ib +- radeonsi/ci: add gfx11 flakes +- glthread: don't unroll draws using user VBOs with GLES +- glthread: add proper helpers for call fences +- gallium/u_threaded_context: use function table to jump to different draw impls +- mesa,u_threaded_context: add a fast path for glDrawElements calling TC directly +- gallium/u_threaded: use a dummy end call to indicate the end of the batch +- gallium/u_threaded: remove unused param from tc_bind_buffer/add_to_buffer_list +- gallium/u_threaded: keep it enabled even if the CPU count is 1 +- meson: require libdrm_amdgpu 2.4.119 +- winsys/amdgpu: remove amdgpu_bo_real::gpu_address, use amdgpu_va_get_start_addr +- winsys/amdgpu: remove amdgpu_bo_sparse::gpu_address, use amdgpu_va_get_start_addr + +Mario Kleiner (1): + +- v3d: add B10G10R10[X2/A2]_UNORM to format table. + +Mark Collins (8): + +- meson: Only include virtio when DRM available +- meson: Only link libvdrm to Turnip with virtio KMD +- meson: Update lua wrap to 5.4.6-4 +- freedreno/rddecompiler: Emit explicit scope for CP_COND_REG_EXEC +- freedreno/rddecompiler: Decode ELSE branches using NOPs +- freedreno/rddecompiler: Reset buffers after RD_CMDSTREAM_ADDR +- freedreno/rddecompiler: Print pkt values in hex +- freedreno/rddecompiler: Add ability to read GPU buffer into file + +Mark Janes (7): + +- iris: make shader cache content deterministic +- anv: make shader cache content deterministic +- intel: remove workaround for preproduction DG2 steppings +- intel/dev: improve descriptions of workaround macros. 
+- intel/dev: poison macros for workarounds fixed at a stepping +- intel: remove MTL a0 workarounds +- intel/dev: update workaround definitions to latest defect status + +Mart Raudsepp (1): + +- docs: Fix typo in OpenGL 3.3 support on Asahi + +Martin Roukala (né Peres) (12): + +- zink/ci: drop the concurrency of the zink-radv-vangogh-valve job +- ci/b2c: fix artifact collection +- radv/ci: fix \`vkcts-navi21-valve` execution +- Revert "ci/deqp-runner: turn paths in errors into links" +- radv: disable meshShaderQueries on gfx10.3 +- amd/ci: reduce Renoir's concurrency to 16 +- ci/b2c: fix the \`cmdline_extra` variable name +- ci: disable the valve-kws farm until it can be rebooted +- Revert "ci: disable the valve-kws farm until it can be rebooted" +- ci: disable mupuf's farm +- ci: disable collabora's farm which appears to be down +- Revert "ci: disable mupuf's farm" + +Mary Guillemard (37): + +- venus: skip bind sparse info when checking for feedback query +- nir: Add AGX-specific doorbell and stack mapping opcodes +- agx: Add doorbell and stack mapping opcodes +- agx: Handle doorbell and stack mapping intrinsics +- asahi: clc: Handle doorbell and stack mapping intrinsics +- agx: Add stack load and store opcodes +- agx: Implement scratch load/store +- agx: Add stack adjust opcode +- agx: Emit stack_adjust in the entrypoint +- zink: Check for VK_EXT_extended_dynamic_state3 before setting A2C +- nak: sm75: Fix panic when encoding MUFU with SQRT and TANH +- nak: Make PRMT selection a Src +- nak: Add support for fddx and fddy +- nak: Add for_each_instr in Shader +- nak: Gather global memory usage for ShaderInfo +- nak: Fix ALD/AST encoding for vtx and offset +- nak: Add a complete wrapper around SPH +- nak: Collect information to create SPH +- nak: Remove encode_hdr_for_nir +- nak: Restructure ShaderInfo +- nak: Add geometry shader support +- nak: Ensure we allocate one barrier when using BAR.SYNC +- nak: Implement VK_KHR_shader_terminate_invocation +- nak: Move 
nir_lower_int64 after I/O lowering +- nak: Pass offset to load_frag_w +- nak: Rewrite nir_intrinsic_load_sample_pos and implement nir_intrinsic_load_barycentric_at_sample +- nir: Add a ldtram_nv intrinsic +- nak: Add more bits discovered in SPH +- nvk: Implement VK_KHR_fragment_shader_barycentric +- nvk: Disable flush on each queries and flush at the end +- nvk: Implement VK_EXT_primitives_generated_query +- venus: Do not submit batch manually when no feedback is required +- nak: Fix NAK_ATTR_CLIP_CULL_DIST_7 wrong value +- nak: sm50: Implement FFMA +- zink: Force 128 fs input components under Venus for Intel +- zink: Initialize pQueueFamilyIndices for image query / create +- zink: Always fill external_only in zink_query_dmabuf_modifiers + +Matt Turner (11): + +- r600: Add missing dep on git_sha1.h +- util: Include stdint.h in libdrm.h +- util: Provide DRM_DEVICE_GET_PCI_REVISION definition +- ci/lava: Add firmware-misc-nonfree on amd64 +- intel: Only validate inst compaction if debugging a shader stage +- iris: Only initialize batch decoder if necessary +- symbols-check: Add _GLOBAL_OFFSET_TABLE_ +- nir: Fix cast +- nir/tests: Reenable tests that failed on big-endian +- util: Add DETECT_ARCH_HPPA macro +- util/tests: Disable half-float NaN test on hppa/old-mips + +Mauro Rossi (3): + +- Android.mk: filter out cflags to build with Android 14 bundled clang +- Android.mk: disable android-libbacktrace to build with Android 14 +- Android.mk: be able to build radeonsi without llvm + +Max R (3): + +- virgl: Implement clear_render_target and clear_depth_stencil +- ci: Uprev virglrenderer +- d3d10umd: Fix compilation + +Maíra Canal (22): + +- v3dv: implement VK_EXT_multi_draw +- v3dv: move multisync functions to the beginning of the file +- v3dv: allow different in/out sync queues +- v3dv: allow set_multisync() to accept more wait syncobjs +- drm-uapi: extend interface for indirect CSD CPU job +- v3dv: check CPU queue availability +- v3dv: create a CPU queue type +- v3dv: 
use the indirect CSD user extension +- v3dv: occlusion queries aren't handled with a CPU job +- drm-uapi: extend interface for timestamp query CPU job +- v3dv: use the timestamp query user extension +- drm-uapi: extend interface for reset timestamp CPU job +- v3dv: use the reset timestamp user extension +- drm-uapi: extend interface for copy timestamp results CPU job +- v3dv: use the copy timestamp query results user extension +- drm-uapi: extend interface for the reset performance query CPU job +- v3dv: don't start iterating performance queries at zero +- v3dv: use the reset performance query user extension +- drm-uapi: extend interface for copy performance query CPU job +- v3dv: use the copy performance query results user extension +- v3d/v3dv: move V3D_CSD definitions to a separate file +- v3dv: enable CPU jobs in the simulator + +Michael Catanzaro (1): + +- util: create parents of disk cache directory if needed + +Michael Tretter (1): + +- egl/wayland: fix formatting and add trailing comma + +Michel Dänzer (2): + +- gallium/dri: Return __DRI_ATTRIB_SWAP_UNDEFINED for _SWAP_METHOD +- glx: Handle IGNORE_GLX_SWAP_METHOD_OML regardless of GLX_USE_APPLEGL + +Mike Blumenkrantz (48): + +- zink: don't block large vram allocations +- vulkan/wsi: unify all the image usage flag caps +- draw: fix uninit variable false positive +- zink: add copy box locking +- tc: add non-definitive tracking for batch completion +- tc: always track fb attachments +- tc: add batch usage tagging to threaded_resource +- tc: use strong refs for fb attachment tracking +- tc: allow unsynchronized texture_subdata calls where possible +- zink: handle unsynchronized image maps from tc +- zink: barrier_cmdbuf -> reordered_cmdbuf +- zink: assert that transfer_dst is available before doing buf2img +- zink: rework cmdbuf submission to be more extensible +- zink: add a third cmdbuf for unsynchronized (not reordered) ops +- zink: add flag to restrict unsynchronized texture access +- zink: add locking for 
batch refs +- zink: enable unsynchronized texture uploads using staging buffers +- ci: skip zink vram test +- ci: bump VVL to 1.3.269 +- zink: emit SpvCapabilitySampleRateShading with SampleId +- zink: always set VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT for usermem +- zink: clamp resolve extents to src/dst geometry +- zink: only emit xfb execution mode for last vertex stage +- aux/u_transfer_helper: set rendertarget bind for msaa staging resource +- zink: unset explicit_xfb_buffer for non-xfb shaders +- mesa/st/texture: match width+height for texture downloads of cube textures +- zink: add more locking for compute pipelines +- radv: correctly return oom from the device when failing to create a cs +- zink: make (some) vk allocation commands more robust against vram depletion +- zink: check for cbuf0 writes before setting A2C +- vk/cmd_queue: exempt more descriptor functions from autogeneration +- vulkan: add wrappers for descriptor '2' functions +- zink: enforce maxTexelBufferElements for texel buffer sizing +- zink: always force flushes when originating from api frontend +- vk/cmd_queue: stop using explicit casts +- vk/cmd_queue: generate maint6 functions +- vk/cmd_queue: fix up indentation a little +- lavapipe: maint6 descriptor stuff +- lavapipe: maint6 +- zink: fix buffer rebind early-out check +- zink: ignore tc buffer replacement info +- vk/cmdbuf: add back deleted maint6 workgraph bits +- lavapipe: use pushconstants2 for dgc +- lavapipe: fix devenv icd filename +- zink: fix separate shader patch variable location adjustment +- zink: set more dynamic states when using shader objects +- zink: always map descriptor buffers as COHERENT +- zink: fix descriptor buffer unmaps on screen destroy + +Mohamed Ahmed (4): + +- nvk: Fix GetImageSubResourceLayout for non-disjoint images +- nil: Add support for linear images +- nvk: Wire up rendering to linear +- nvk: Enable linear images for texturing + +Molly Sophia (1): + +- tu: Fix KHR_present_id and 
KHR_present_wait being used without initialization + +Nanley Chery (11): + +- iris: Optimize BO_ALLOC_ZEROED for suballocations +- iris: Zero the clear color before FCV_CCS_E rendering +- iris: Don't memset the clear color BO during aux init +- iris: Simplify get_main_plane_for_plane +- iris: Simplify a plane count check in from_handle +- iris: Use helpers for generic aux plane importing +- iris: Inline import_aux_info +- iris: Use common res fields for imported planes +- iris: Delay main and aux resource creation on import +- isl: Handle MOD_INVALID in clear color plane check +- iris: Fix lowered images in get_main_plane_for_plane + +Neha Bhende (1): + +- ntt: lower indirect tesslevels in ntt + +Patrick Lerda (1): + +- glsl/nir: fix gl_nir_cross_validate_outputs_to_inputs() memory leak + +Paulo Zanoni (34): + +- anv: don't forget to destroy device->vma_mutex +- anv: alloc client visible addresses at the bottom of vma_hi +- anv/sparse: join multiple bind operations when possible +- anv/sparse: join multiple NULL binds when possible +- anv/sparse: also print bind->address at dump_anv_vm_bind +- intel/genxml: add the Gen12+ TR-TT registers +- anv/sparse: extract anv_sparse_bind() +- anv: setup the TR-TT vma heap +- vulkan: fix potential memory leak in create_rect_list_pipeline() +- anv/sparse: allow sparse resouces to use TR-TT as its backend +- anv/sparse: fix limits.sparseAddressSpaceSize when using vm_bind +- anv/trtt: join L1 writes into a single MI_STORE_DATA_IMM when possible +- anv/trtt: also join the L3/L2 writes into a single MI_STORE_DATA_IMM +- anv/sparse: drop anv_sparse_binding_data from dump_anv_vm_bind() +- anv/sparse: join all submissions into a single anv_sparse_bind() call +- anv/sparse: pass anv_sparse_submission to the backend functions +- anv/sparse: add 'queue' to anv_sparse_submission +- anv/trtt: use 'queue' from anv_sparse_submission in the backend +- anv/sparse: move waiting/signaling syncobjs to the backends +- anv/sparse: process image 
binds before opaque image binds +- anv/i915: extract setup_execbuf_fence_params() +- anv/xe: allow passing extra syncs to xe_exec_process_syncs() +- anv/trtt: don't wait/signal syncobjs using the CPU anymore +- anv/trtt: add struct anv_trtt_batch_bo and pass it around +- anv/trtt: add support for queue->sync to the TR-TT batches +- anv/trtt: properly handle the lifetime of TR-TT batch BOs +- anv: enable sparse by default on i915.ko +- anv/sparse: don't support YCBCR 2x1 compressed formats +- anv+zink/ci: document new sparse failures +- anv/sparse: reject binds that are not a multiple of the granularity +- anv/tr-tt: assert the bind size is a multiple of the granularity +- anv/sparse: check if the non-sparse version is supported first +- anv/sparse: document USAGE_2D_3D_COMPATIBLE as non-standard too +- intel/tools: fix compilation of intel_hang_viewer on 32 bits + +Pavel Asyutchenko (1): + +- mesa/main: allow S3TC for 3D textures + +Pavel Ondračka (17): + +- r300: add late vectorization after nir_move_vec_src_uses_to_dest +- r300: small adress register load optimization +- r300: nir fcsel/CMP lowering pass for R500 +- r300: add some more early bool lowering +- r300: lower flrp in NIR +- r300: fcsel_ge lowering from lowered ftrunc +- r300: lower ftrunc in NIR +- r300: remove backend CMP lowering +- r300: remove backend LRP lowering +- r300: mark load_ubo_vec4 with ACCESS_CAN_SPECULATE +- r300: fix memory leaks in compiler tests +- ci: uprev mesa-trigger container +- ci: add r300 RV530 dEQP gles2 CI job +- r300/ci: add missing kernel url quotes +- r300/ci: switch to b2c v0.9.11 +- r300/ci: add piglit job +- r300: fix reusing of color varying slots for generic ones + +Peyton Lee (6): + +- frontends, va: add new parameters of post processor +- amd,radeonsi: add libvpe +- amd: add new hardware ip for vpe +- amd, radeonsi: add si_vpe.c with helper functions of VPE lib +- amd, radeonsi: supports post processing entrypoint +- winsys, amdgpu, drm: add VPE submission handle 
+ +Phillip Pearson (1): + +- radeonsi: use PRIu64 instead of %lu for uint64_t formatting + +Pierre-Eric Pelloux-Prayer (23): + +- mesa: restore call to _mesa_set_varying_vp_inputs from set_vertex_processing_mode +- radeonsi/ci: update failures +- radeonsi: check sctx->tess_rings is valid before using it +- Revert "radeonsi: decrease PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS to 1024" +- egl/wayland: set the correct modifier for the linear_copy image +- radeonsi: use a compute shader to convert unsupported indices format +- radeonsi: update guardband if vs_disables_clipping_viewport changes +- radeonsi/sqtt: fix RGP pm4 state emit function +- radeonsi/sqtt: clear record_counts variable +- radeonsi/sqtt: rework pm4.reg_va_low_idx +- radeonsi/sqtt: use calloc instead of malloc +- radeonsi/sqtt: reformat with clang-format +- radeonsi/sqtt: fix capturing indirect dispatches with SQTT +- radeonsi/winsys: add cs_get_ip_type function +- radeonsi/sqtt: fix emitting SQTT userdata when CAM is needed +- radeonsi/sqtt: fix capturing RGP on RDNA3 with more than one Shader Engine +- radeonsi/sqtt: handle COMPUTE queues as well +- radeonsi: fix extra_md handling with fmask +- ac/surface: don't oversize surf_size +- radeonsi: compute epitch when modifying surf_pitch +- Revert "ci/radeonsi: disable VA-API testing on raven" +- radeonsi: emit cache flushes before draw registers +- radeonsi: adjust flags for si_compute_shorten_ubyte_buffer + +Qiang Yu (35): + +- aco: do not fix_exports when separately compiled ngg vs or es +- aco: add create_end_for_merged_shader +- aco: extend max operands in a instruction to 128 +- aco: move end program handling to select_shader +- aco: stop emit s_endpgm for first stage of merged shader +- aco: add aco_is_gpu_supported +- radeonsi: add vs prolog args needed by aco ls vgpr fix +- radeonsi: fill aco shader info for part mode merged shader +- radeonsi: enable aco compilation for merged shader parts +- radeonsi: move use_aco to si_screen +- radeonsi: 
move llvm compiler alloc/free into create/destroy funcntion +- radeonsi: stop llvm context creation when use aco +- radeonsi: move llvm internal header to si_shader_llvm.h +- radeonsi: selectively build si llvm compiler create/destroy +- radeonsi: selectively build llvm compile +- radeonsi: set use_aco when no llvm available +- radeonsi: include ac_llvm_util.h when llvm available +- radeonsi: disk cache remove llvm dependancy when use aco +- radeonsi: does not call llvm init when no llvm available +- radeonsi: change compiler name for aco +- radeonsi: selectively build llvm files +- meson: be able to build radeonsi without llvm +- radeonsi: fix piglit image coherency test when use aco +- aco,radv: add aco_is_nir_op_support_packed_math_16bit +- radeonsi: only vectorize nir ops that aco support +- ac/llvm: remove nir_op_*2*mp ops handling +- nir: add force_f2f16_rtz option to lower f2f16 to f2f16_rtz +- aco,ac/llvm,radeonsi: lower f2f16 to f2f16_rtz in nir +- aco: set MIMG unrm for GL_TEXTURE_RECTANGLE +- aco: handle GL_TEXTURE_RECTANGLE in tg4_integer_workarounds +- radeonsi: add missing args in spi_ps_input_ena when fbfetch output +- nir: fix load layer id system_values_read info gather +- aco: fix set_wqm segfault when ps prolog +- radeonsi: fix legacy merged LS/ES workgroup size for aco compilation +- radeonsi: unify elf and raw shader binary upload + +Raphaël Gallais-Pou (1): + +- gallium: add sti DRM entry point + +Rhys Perry (55): + +- nir: add helpers to skip idempotent passes +- radv: use NIR_LOOP_PASS helpers +- aco: add VALU/SALU/VMEM/SMEM statistics +- aco: collect Pre-Sched SGPRs/VGPRs before spilling +- radv: call lower_array_deref_of_vec before lower_io_arrays_to_elements +- radv: skip radv_remove_varyings for mesh shaders +- radv: disable gs_fast_launch=2 by default +- aco/tests: fix tests with LLVM 17 +- aco/tests: fix tests with LLVM 18 +- aco: workaround LS VGPR initialization bug in RADV prologs +- aco: skip LS VGPR initialization bug workaround 
if the prolog exists +- radv: set prolog as_ls if has_ls_vgpr_init_bug=true +- docs: fix RADV_THREAD_TRACE_CACHE_COUNTERS default +- nir/lower_fp16_casts: correctly round RTNE f64->f16 casts +- nir/lower_fp16_casts: add option to split fp64 casts +- radeonsi: use nir_lower_fp16_casts +- radv: use nir_lower_fp16_casts +- aco: remove f16<->f64 conversions +- intel/compiler: use nir_lower_fp16_casts +- radv: add radv_disable_trunc_coord option +- radv: enable radv_disable_trunc_coord for vkd3d-proton/DXVK +- ac/gpu_info: update conformant_trunc_coord comment +- ac/nir: fix partial mesh shader output writes on GFX11 +- ac/nir: ignore 8/16-bit global access offset +- ac/nir: fix 32-bit offset global access optimization +- aco: flush denormals for 16-bit fmin/fmax on GFX8 +- aco: implement 16-bit fsign on GFX8 +- aco: implement 16-bit derivatives +- aco: implement 16-bit fsat on GFX8 +- aco: simplify v_mul_* labelling slightly +- aco: insert p_end_wqm before p_jump_to_epilog +- nir/loop_analyze: skip if basis/limit/comparison is vector +- nir/loop_analyze: scalarize try_eval_const_alu +- nir/loop_analyze: fix vector basis/limit/comparison +- nir/loop_analyze: check min compatibility with comparison +- nir/loop_analyze: support umin and {u,i,f}max +- nir/loop_analyze: support loops with min/max and non-add incrementation +- vulkan/wsi: don't support present with queues where blit is unsupported +- vulkan/wsi: fix win32 compilation +- vulkan/wsi: always create command buffer for special blit queues +- nir/loop_analyze: remove invariance analysis +- aco/tests: use more raw strings +- aco: correctly set min/max_subgroup_size for wave32-as-wave64 +- radv: use CS wave selection for task shaders +- radv: remove radv_shader_info's cs.subgroup_size +- nir: add msad_4x8 +- nir/algebraic: optimize vkd3d-proton's MSAD +- aco: implement msad_4x8 +- ac/llvm: implement msad_4x8 +- radv: enable msad_4x8 +- nir: remove sad_u8x4 +- radv: do nir_shader_gather_info after 
radv_nir_lower_rt_abi +- nir/lower_non_uniform: set non_uniform=false when lowering is not needed +- nir/lower_shader_calls: remove CF before nir_opt_if +- aco: fix labelling of s_not with constant + +Rob Clark (34): + +- ci: Only strip debug symbols +- tu/msm: Fix timeline semaphore support +- tu/virtio: Fix timeline semaphore support +- freedreno/drm: Fix race in zombie import +- freedreno: Fix modifier determination +- freedreno: Handle DRM_FORMAT_MOD_QCOM_TILED3 import +- virtio/drm: Split out common virtgpu drm structs +- freedreno/drm: Simplify backend mmap impl +- virtio: Add vdrm native-context helper +- freedreno/drm/virtio: Switch to vdrm helper +- tu/drm/virtio: Switch to vdrm helper +- freedreno/a6xx: Assume MOD_INVALID imports are linear +- freedreno/a6xx: Fix antichamber trace replay assert +- Revert "ci/freedreno: disable antichambers trace" +- freedreno/a6xx: Don't set patch_vertices if no tess +- freedreno/a6xx: Rework wave input size +- freedreno/drm: Fix mmap leak +- freedreno: Always attach bo to submit +- isaspec: Sort labels with same output +- freedreno/drm: Fix zombie BO import harder +- freedreno/a6xx: Fix NV12+UBWC import +- freedreno: De-duplicate 19.2MHz RBBM tick conversion +- freedreno: Fix timestamp conversion +- freedreno: Implement PIPE_CAP_TIMER_RESOLUTION +- drm-uapi: Sync drm-uapi +- freedreno/layout: Add layout metadata +- tu: Add metadata support for dedicated allocations +- freedreno/drm: Add BO metadata support +- freedreno: Add layout metadata support +- ci: More context for color_clear skips for Wayland +- ci: List specific color_clears skips +- ci: Add wayland-dEQP-EGL.functional.render.* skips +- ci: Remove per-driver wayland-dEQP-EGL xfails +- freedreno/drm/virtio: Fix typo + +Robert Foss (3): + +- egl/surfaceless: Fix EGL_DEVICE_EXT implementation +- egl: Add _eglHasAttrib() function +- egl/surfaceless: Don't overwrire disp->Device if using EGL_DEVICE_EXT + +Robert Mader (4): + +- util: Add new helpers for pipe 
resources +- panfrost: Support parameter queries for main planes +- vc4/resource: Support offset query for multi-planar planes +- v3d/resource: Support offset query for multi-planar planes + +Rohan Garg (31): + +- intel/compiler: migrate WA 14013672992 to use WA framework +- blorp,anv,iris: refactor blorp functions into something more generic +- iris: Wa 16014538804 for DG2, MTL A0 +- iris: pull WA 22014412737 into emit_3dprimitive_was +- anv: WA 16014538804 for DG2, MTL A0 +- blorp: WA 16014538804 for DG2, MTL A0 +- anv: Refactor loading indirect parameters and filling IDD +- anv: refactor kernel dispatch to use new common functions +- intel/dev: Add a bit for when the HW can do a indirect draw/dispatch unroll +- genxml/12.5: Add the EXECUTE_INDIRECT_DRAW instruction +- genxml/12.5: Add the EXECUTE_INDIRECT_DISPATCH instruction +- anv: Emit EXECUTE_INDIRECT_DRAW when available +- anv: Emit a EXECUTE_INDIRECT_DISPATCH when available +- iris: Emit a EXECUTE_INDIRECT_DISPATCH when available +- anv: memcpy the thread dimentions only when they're on the CPU +- anv: introduce ANV_TIMESTAMP_REWRITE_INDIRECT_DISPATCH +- intel/genxml: Add the preferred slm size enum for xe2 +- intel: Set a preferred SLM size for LNL +- intel/genxml: Update COMPUTE_WALKER_BODY for xe2 +- intel/genxml: Update IDD for new fields +- blorp: set min/max viewport depths to -FLT_MAX/FLT_MAX when EXT_depth_range_unrestricted is enabled +- anv: ensure that we clamp only when EXT_depth_range_unrestricted is not enabled +- anv: enable VK_EXT_depth_range_unrestricted +- iris: Emit EXECUTE_INDIRECT_DRAW when available +- intel/compiler: use the proper enum type to store the op +- intel/compiler: infer the number of operands using lsc_op_num_data_values +- anv: rename anv_create_companion_rcs_command_buffer to anv_cmd_buffer_ensure_rcs_companion +- iris,isl: Adjust driver for several commands of clear color (xe2) +- intel/fs/xe2+: Lift CPS dispatch width restrictions on Xe2+. 
+- intel/compiler: Update disassembly for new LSC cache enums +- anv: untyped data port flush required when a pipeline sets the VK_ACCESS_2_SHADER_STORAGE_READ_BIT + +Roland Scheidegger (1): + +- lavapipe: bump image alignment up to 64 bytes + +Roman Stratiienko (5): + +- v3d: Don't implicitly clear the content of the imported buffer +- u_gralloc: Extract common code from fallback gralloc +- u_gralloc: Add QCOM gralloc support +- egl/android: Switch to generic buffer-info code +- u_gralloc: Add support for gbm_gralloc + +Ruijing Dong (12): + +- radeonsi/vcn: vcn4 encoding interface dummy update +- radeonsi/vcn: preparation for enc intra-refresh +- radeonsi/vcn: change intra-ref name +- radonesi/vcn: enable intra-refresh in vcn encoders +- frontends/va: add intra-refresh in VAAPI interface +- radesonsi/vcn add qp_map definition +- frontends/va: add ROI feature +- radeonsi/vcn: ROI feature implementation +- radeonsi/vcn: enable ROI feature in vcn. +- radeonsi/vcn: ROI capability value initialization. +- frontends/va: remove some TODOs in hevc encoding +- radeonsi/vcn: update session_info from vcn3 and up. 
+ +Ryan Neph (6): + +- virgl: implemement resource_get_param() for modifier query +- venus: add VN_PERF=no_tiled_wsi_image +- venus: strip ALIAS_BIT for WSI image creation on ANV +- venus: reject multi-plane modifiers for tiled wsi images +- venus: add dri option to enable multi-plane wsi modifiers +- venus: fix shmem leak on vn_ring_destroy + +Sagar Ghuge (24): + +- iris: Disable auxiliary buffer if MSRT is bound as texture +- iris: Disable CCS compression on top of MSAA compression on ACM +- isl: Enable MCS compression on ACM platform +- anv: Write timestamp using MI_FLUSH_DW on blitter +- anv: Avoid emitting PIPE_CONTROL command for copy/video queue +- anv: Flush data cache while clearing depth using HIZ_CCS_WT +- anv: Add comment to copy image code block +- iris: Init aux map state for compute engine +- anv,hasvk: Use uint32_t for queue family indices +- blorp: Handle stencil buffer compression on blitter engine +- anv: Use RCS cmd buffer if blit src/dest has 3 components +- intel/compiler: Adjust assertion in lower_get_buffer_size() for Xe2 +- intel/fs: Adjust destination size for image size intrinsic +- intel/fs: Adjust destination size for global load constant on Xe2+ +- intel/fs: Adjust destination size for load ubo on Xe2+ +- intel/genxml: Add BCS/VD0 aux table base address register +- anv: Handle video/copy engine queue initialization +- anv: Invalidate aux map for copy/video engine +- iris: Handle aux map init for copy engine +- docs: Document INTEL_COPY_CLASS +- anv: Enable blitter engine unconditionally on ACM+ +- iris: No need to emit PIPELINE_SELECT on Xe2+ +- anv: No need to emit PIPELINE_SELECT on Xe2+ +- intel/fs: Check fs_visitor instance before using it + +Samuel Pitoiset (169): + +- radv: move RADV_DEBUG_NO_HIZ check in radv_use_htile_for_image() +- radv: implement VK_EXT_image_compression_control +- radv: advertise VK_EXT_image_compression_control +- ac/gpu_info: remove bogus assertion about number of COMPUTE/SDMA queues +- radv: dump the 
pipeline hash to the gpu hang report +- radv: fix a synchronization issue with primitives generated query on RDNA1-2 +- ac/registers: allow to parse GCVM_L2_PROTECTION_FAULT_STATUS +- ac/debug: add a helper to print GPUVM fault protection status +- radv: use the GPUVM fault protection status helper +- radv: remove NGG streamout support for RDNA1-2 +- radv: remove unnecessary VS_PARTIAL_FLUSH for NGG streamout +- ac/nir: remove dead code in nir_intrinsic_xfb_counter_{add,sub}_amd +- aco: remove dead code in nir_intrinsic_xfb_counter_{add,sub}_amd +- radv/ci: update list of expected failures/flakes for NAVI31 +- radv: add RADV_DEBUG=nomeshshader +- radv/ci: enable RADV_DEBUG=nomeshshader for vkcts-navi31-valve +- radv: bind the non-dynamic graphics state from the pipeline unconditionally +- radv: adjust binning settings to improve performance on GFX9 +- radv: fix compute shader invocations query on compute queue on GFX6 +- radv: emit COMPUTE_PIPELINESTAT_ENABLE for CS invocations on ACE +- ci: backport two mesh/task query fixes for VKCTS +- radv/ci: document one more flake test +- nir: fix inserting the break instruction for partial loop unrolling +- radv: add initial VK_EXT_device_fault support +- radv: advertise VK_EXT_device_fault +- ci: re-apply two mesh/task query fixes for VKCTS +- radv: add a helper to determine if it's possible to preprocess DGC +- radv: emit individual SET_SH_REG for inlined push constants with DGC +- radv: optimize emitting inlined push constants with DGC +- radv: enable DGC preprocessing when all push constants are inlined +- radv: restore sampling CPU/GPU clocks before starting SQTT trace +- ac/rgp: update dumping queue event records to the capture +- radv: add radv_write_timestamp() helper +- radv: add support for RGP queue events +- radv: add drirc options to force re-compilation of shaders when needed +- radv: fix VRS subpass attachment when HTILE can't be enabled on GFX10.3 +- radv: fix registering queues for RGP with compute only +- 
radv: set radv_zero_vram=true for Unreal Engine 4/5 +- radv: fix a descriptor leak with debug names and host base descriptor set +- radv: add a missing async compute workaround for Tonga/Iceland +- zink/ci: add a manual job on radv-navi31 +- aco: remove useless nir_intrinsic_load_force_vrs_rates_amd +- radv: remove redundant check when forcing VRS rates +- radv: check earlier if a graphics pipeline can force VRS per vertex +- ac/surface: change tile mode for 3D PRT surfaces with bpp < 64 on GFX6-8 +- radv: re-enable sparseResidencyImage3D on POLARIS10+ +- aco: rename color_exports to exports in create_fs_jump_to_epilog() +- radv: rename ps_epilog_inputs to colors for PS epilogs +- radv: add radv_physical_device::emulate_mesh_shader_queries for GFX10.3 +- radv: add support for mesh primitives queries on GFX10.3 +- radv: define new pipeline statistics indices for mesh/task on GFX11 +- radv: bump the pipeline state query size to 14 on GFX10.3 +- radv: do not harcode the pipeline stats mask for query resolves +- radv: add support for mesh shader invocations queries on GFX10.3 +- radv: rework gfx10_copy_gds_query() slightly +- radv: make some gang functions non-static +- radv: add support for task shader invocations queries on GFX10.3 +- radv: enable meshShaderQueries on GFX10.3 +- radv/ci: add missing expected failures for mesh queries on VANGOGH +- radv: disable TC-compatible HTILE on Tonga and Iceland +- radv: add missing FDCC_CONTROL bits for GFX1103 R2 +- radv: set radv_invariant_geom=true for War Thunder +- radv: do not set OREO_MODE to fix rare corruption on GFX11 +- ci: uprev vkd3d-proton to 2.11 +- radv/ci: add new flakes for VEGA10 +- radv: remove useless NIR instructions when emitting IBO with DGC +- radv: set the stream VA for DGC graphics +- radv: use an indirect draw when IBO isn't updated as part of DGC +- radv: enable DGC preprocessing for IBO +- radv: fix bogus interaction between DGC and RT with descriptor bindings +- radv: make sure to prefetch the 
compute shader for DGC +- radv: remove radv_pipeline_key::dynamic_color_write_mask +- radv: simplify creating image views for src resolve images +- radv: stop performing redundant resolves with the HW resolve path +- radv: remove unused layers support for the HW/FS resolve paths +- radv: only re-initialize DCC for one level for the HW resolve path +- radv: adjust assertions for multi-layer resolves with the HW/FS paths +- radv: remove never used binds_state for DGC +- radv: only initialize the VBO reg if VBOs are bound with DGC +- radv: only initialize the VTX base SGPR if non-zero with DGC +- radv: add DGC support for mesh shader only +- radv: advertise VK_EXT_depth_clamp_zero_one +- radv: update the reset stipple pattern mode +- radv: change the reset stipple pattern mode for adjacent lines +- radv: make sure to reset the stipple line state when it's disabled +- radv: set combinedImageSamplerDescriptorCount to 1 for multi-planar formats +- radv: switch to on-demand PS epilogs for GPL +- radv: remove unused code for compiling PS epilogs as part of pipelines +- aco: export depth/stencil/samplemask in create_fs_jump_to_epilog() +- ac/nir: add an option to skip MRTZ exports in ac_nir_lower_ps() +- radv: determine if MRTZ needs to be exported via PS epilogs +- radv: prepare the PS epilog key for exporting MRTZ on RDNA3 +- radv,aco: declare PS epilog VGPR arguments for depth/stencil/samplemask +- radv: determine and emit SPI_SHADER_Z_FORMAT for PS epilogs +- zink/ci: remove skipped tests from the list of expected failures for NAVI31 +- radv: export MRTZ via PS epilogs when alpha to coverage is dynamic on GFX11 +- radv: enable extendedDynamicState3AlphaToCoverageEnable on GFX11 +- zink/ci: skip more tests that run OOM on NAVI31 +- zink/ci: update list of failures for NAVI31 +- zink/ci: stop running zink-radv-navi31-valve sequentially +- ci: uprev vkd3d-proton to a0ccc383937903f4ca0997ce53e41ccce7f2f2ec +- radv: simplify disabling MRT compaction for PS epilogs +- vulkan: 
bump headers/registry to 1.3.273 +- radv: promote EXT_calibrated_timestamps to KHR +- docs: update features.txt for RADV +- radv: remove useless check for TC-compat CMASK images during fb emission +- radv: stop clearing FMASK_COMPRESS_1FRAG_ONLY for TC-compat CMASK images +- vulkan/runtime: promote VK_EXT_vertex_attribute_divisor to KHR +- radv: advertise VK_KHR_vertex_attribute_divisor +- radv/ci: remove dEQP-VK.mesh_shader.ext.query.* from the lists +- radv: emit the task shader in radv_emit_graphics_pipeline() +- radv: cleanup ac_nir_lower_ps options +- radv: cleanup gathering PS info with/without PS epilogs +- radv: cleanup radv_pipeline_generate_ps_epilog_key() +- radv: add support for MRT compaction with PS epilogs +- radv: fix binding partial depth/stencil views with dynamic rendering +- radv: stop asserting some image create info fields +- radv: remove some declared but unused functions/macros +- radv: add missing HTILE support for fb mip tail workaround +- radv: stop checking FMASK for the fb mip tail workaround +- radv: move emitting the fb mip tail workaround when rendering begins +- radv: remove radv_get_tess_output_topology() declaration +- radv: move meta declarations to radv_meta.h +- radv: move RADV_HASH_SHADER_xxx flags to radv_pipeline.c +- radv: move radv_image_is_renderable() to radv_image.c +- radv: move more descriptor related declarations to radv_descriptor_set.h +- radv: move radv_depth_clamp_mode to radv_cmd_buffer.c +- radv: move more shader related declarations to radv_shader.h +- radv: move SI_GS_PER_ES to radv_constants.h +- radv: move buffer view related code to radv_buffer_view.c +- radv: move image view related code to radv_image_view.c +- vulkan: bump headers/registry to 1.3.274 +- vulkan: drop VK_ENABLE_BETA_EXTENSIONS for video encode layouts +- radv/ci: update CI lists for NAVI10,NAVI31 and RENOIR +- ci: apply two bugfixes for VKCTS +- radv: move radv_{emulate,enable}_rt() to radv_physical_device.c +- radv: make a couple of NIR 
RT functions as static +- radv: move radv_rt_{common,shader} files to nir/ +- radv: move radv_BindImageMemory2() to radv_image.c +- radv: add support for VkBindMemoryStatusKHR +- radv: rename RADV_GRAPHICS_STAGES to RADV_GRAPHICS_STAGE_BITS +- radv: add support for version 2 of all descriptor binding commands +- radv: add support for NULL index buffer +- radv: advertise VK_KHR_maintenance6 +- radv: disable FMASK for MSAA images with layers on GFX9 +- radv: stop clearing CMASK to 0xcc when FMASK is present on GFX9 +- radv: disable stencil test without a stencil attachment +- radv: constify a variable in radv_emit_depth_control() +- radv: remove duplicated si_tile_mode_index() function +- radv: rename si_make_texture_descriptor() to gfx6_make_texture_descriptor() +- radv: remove radv_write_scissors() +- radv: drop si\_ prefix from all functions +- Revert "radv: disable DCC with signedness reinterpretation on GFX11" +- radv: stop disabling DCC for mutable with 0 formats on GFX11 +- radv: do not program COMPUTE_MAX_WAVE_ID (GDS register) on GFX6 +- radv/winsys: replace '<= GFX6' by '== GFX6' +- radv: query drirc options in only one place +- radv: move dri options to radv_instance::drirc +- radv: rework declaring color arguments for PS epilogs +- Revert "radv/rt: Lower ray payloads to registers" +- radv: do not issue SQTT marker with DISPATCH_MESH_INDIRECT_MULTI +- radv: add missing disable_shrink_image_store to the pipeline key +- radv: move RADV_HASH_SHADER_KEEP_STATISTICS to radv_pipeline_key +- radv: initialize radv_device::disable_trunc_coord earlier +- radv: introduce radv_device_cache_key for per-device cache compiler options +- radv: move all per-device keys from radv_pipeline_key to radv_device_cache_key +- radv: fix indirect dispatches on the compute queue on GFX7 +- radv: fix indirect draws with NULL index buffer on GFX10 +- radv: fix segfault when getting device vm fault info + +Sarah Walker (3): + +- pvr: Update AM62 DSS compatible string to match upstream 
+- pvr: csbgen: Add dummy implementation of stream type +- pvr: Add command stream and static context state layout to rogue_kmd_stream.xml + +Sathishkumar S (1): + +- frontends/va: use va interface for jpeg partial decode + +Sebastian Wick (1): + +- radeonsi: Destroy queues before the aux contexts + +Sergi Blanch Torne (8): + +- ci: disable Collabora's LAVA lab for maintance +- Revert "ci: disable Collabora's LAVA lab for maintance" +- ci: disable Collabora's LAVA lab for maintance +- Revert "ci: disable Collabora's LAVA lab for maintance" +- Revert "ci: disable collabora farm as it is currently offline" +- ci: disable Collabora's LAVA lab for maintance +- Revert "ac/nir: Export clip distances according to clip_cull_mask" +- Revert "ci: disable Collabora's LAVA lab for maintance" + +Shuicheng Lin (1): + +- intel/xe: Correct DRM_XE_EXEC_QUEUE_SET_PROPERTY's ioctl + +Sil Vilerino (76): + +- d3d12: d3d12_video_buffer_create_impl - Fix resource importing +- d3d12: Allow creating d3d12_dxcore_screen from existing ID3D12Device +- vl/win32: Add vl_win32_screen_create_from_d3d12_device +- gallium/auxiliary: Fix pb_bufmgr_slab.c leak +- pipe: Extend get_feedback with additional metadata +- pipe: Add PIPE_VIDEO_CAP_ENC_H264_DISABLE_DBK_FILTER_MODES_SUPPORTED +- pipe: Add PIPE_VIDEO_CAP_ENC_INTRA_REFRESH_MAX_DURATION +- pipe: Add H264 VUI encode params +- pipe: Add HEVC VUI encode params +- pipe: Add max_slice_bytes for H264, HEVC encoding +- frontend/va: Add log2_max_frame_num_minus4 and log2_max_pic_order_cnt_lsb_minus4 for h264enc +- frontend/va: Parse VUI H264 parameters +- frontend/va: Parse VUI HEVC parameters +- frontend/va: Support VAEncMiscParameterMaxSliceSize +- meson: add vp9 and av1 codec support options +- gallium/vl: Check for VP9 and AV1 meson option support flags +- d3d12: Plumb pipe_h264_enc_picture_desc.dbk.disable_deblocking_filter_idc +- d3d12: Use log2_max_frame_num_minus4 and log2_max_pic_order_cnt_lsb_minus4 from pipe_pic_params_h264 +- d3d12: Video 
Encode - Remove PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE as not supported +- d3d12: Disable codecs according to meson video-codecs option +- d3d12: Implement H264 VUI Writer +- d3d12: Implement HEVC VUI Writer +- d3d12: Implement Intra Refresh for H264, HEVC, AV1 +- d3d12: Support PIPE_VIDEO_CAP_ENC_H264_DISABLE_DBK_FILTER_MODES_SUPPORTED +- d3d12: Implement get_feedback with additional metadata +- d3d12: fix usage of GetAdapterLuid() in mingw/GCC using ABI helper +- ci: Build d3d12 gallium driver in debian-x86_32 +- pipe: Support inserting new headers on each H264/HEVC IDR frame +- pipe: Add get_feedback_fence for encode async waiting on pipe_feedback_fence +- pipe: Add fence_get_win32_handle to get HANDLE from pipe_fence_handle +- pipe: Add p_video_codec.get_encode_headers for out of band VPS, SPS, PPS +- pipe: Add PIPE_VIDEO_FEEDBACK_METADATA_TYPE_AVERAGE_FRAME_QP +- pipe: Add PIPE_VIDEO_CAP_ENC_H264_SUPPORTS_CABAC_ENCODE +- pipe: Add PIPE_H264_MAX_REFERENCES +- frontend/va: Add h264 encode ip_period param +- frontend/va: Add VACodedBufferSegment Average QP metadata +- frontend/va: Use p_video_codec.get_feedback_fence to report errors on frame submission +- vl_winsys_win32: call winsys->destroy(winsys) in error conditions +- d3d12: Implement inserting optional new headers on each H264/HEVC IDR frame +- d3d12: Do not increase active_seq_parameter_set_id on new SPS. 
Force PPS on new SPS +- d3d12: H264 encode - Allow CONSTRAINED_BASELINE profile to be written in headers +- d3d12: Implement get_feedback_fence for encode async waiting on pipe_feedback_fence +- d3d12: Implement fence_get_win32_handle to get HANDLE from d3d12_fence +- d3d12: Only pass texture dimensions to d3d12_video_encoder_update_current_encoder_config_state +- d3d12: Implement d3d12_video_encoder_get_encode_headers for out of band VPS, SPS, PPS +- d3d12: Use new pipe h264 encode ip_period param +- d3d12: max_frame_poc workaround for infinite GOPs +- d3d12: Fix max slice size and max frame size metadata reporting +- d3d12: Implement PIPE_VIDEO_FEEDBACK_METADATA_TYPE_AVERAGE_FRAME_QP +- d3d12: Autodetect d3d12_video_buffer imported handle/resource format and dimensions when not passed +- d3d12: Implement PIPE_VIDEO_CAP_ENC_H264_SUPPORTS_CABAC_ENCODE +- d3d12: Detect imported resource buffer unknown format +- d3d12: Improve error detection and reporting for video encoder +- d3d12: Fix d3d12_tcs_variant_cache_destroy leak in d3d12_context +- d3d12: Fix screen->winsys leak in d3d12_screen +- d3d12: d3d12_create_fence_win32 - Fix double refcount bump +- d3d12: Fix max reference frames reporting when HW does not support B frame +- d3d12: Video Encoder - When setting rate control dirty flags take into account rolled back optional configs +- d3d12: Video Encoder: Support reporting non contiguous NALU, offsets for frontend extraction +- meson: Add all, all_free (default) options for video-codecs option. 
+- d3d12: Fix usage of H264/HEVC specific classes when VIDEO_CODEC_H26XENC not set +- d3d12: Fix AV1 video encode 32 bits build +- d3d12: Fix typos in d3d12_video_encoder_bitstream_builder_h264 +- d3d12: Use enc_constraint_set_flags for H264 NALU writing +- frontends/va: Parse enc_constraint_set_flags from packed SPS +- d3d12: Check video encode codec cap before checking encode profile/level cap +- meson: Only build WGL for Windows platform when opengl option is active +- d3d12: Bump directx-headers dependency to v611.0 for latest video codecs and features +- d3d12: Remove D3D12_SDK_VERSION checks after bumping directx-headers dependency to v611 +- d3d12: Fix warning C4065 switch statement contains default but no case labels +- d3d12: Implement Delta QP ROI In h264, hevc and av1 video encode +- d3d12: Report support for PIPE_VIDEO_CAP_ENC_ROI for Delta QP +- Revert "d3d12: Only destroy the winsys during screen destruction, not reset" +- Revert "d3d12: Fix screen->winsys leak in d3d12_screen" +- d3d12: Fix AV1 Encode - log2 rounding for tile_info section +- d3d12: Implement cap for PIPE_VIDEO_CAP_ENC_INTRA_REFRESH + +Simon Ser (3): + +- egl: extract EGLDevice setup in dedicated function +- egl: move dri2_setup_device() after dri2_setup_extensions() +- egl: ensure a render node is passed to _eglFindDevice() + +Simon Zeni (2): + +- EGL: sync files with Khronos +- egl: implement EGL_EXT_query_reset_notification_strategy + +Sviatoslav Peleshko (23): + +- nir/loop_analyze: Fix inverted condition handling in iterations calculation +- anv: Fix MI_ARB_CHECK calls in generated indirect draws optimization +- nir/loop_analyze: Don't test non-positive iterations count +- intel/fs: Don't optimize DW*1 MUL if it stores value to the accumulator +- intel/compiler: Add variable to dump binaries of all compiled shaders +- intel/disasm: Print half-float values instead of placeholder +- intel/compiler: Set flag reg to 0 when disabling predication +- intel/disasm: Print src1_len 
correctly depending on ExDesc type +- intel/fs: Set group 0 for Wa_14010017096 MOV instruction +- intel/eu/validate: Validate that the ExecSize is a factor of chosen ChanOff +- intel/tools/i965_asm: Add SWSB handling +- intel/tools/i965_asm: Handle HF immediates +- intel/tools/i965_asm: Handle sync instruction +- intel/tools/i965_asm: Allow neg and abs modifiers on accumulator register +- intel/tools/i965_asm: Don't override flag reg from cond modifier +- intel/tools/i965_asm: Allow src0 and src2 of ternary instructions to be imm +- intel/tools/i965_asm: Implement gfx12 and gfx12.5 send/sendc +- intel/tools/i965_asm: Add dp4a and add3 instructions +- intel/tools/i965_asm: Don't set src0 for break and while on gfx12 +- intel/tools/tests: Fix sends indirect argument in gfx9 test +- intel/tools/tests: Unbreak i965_asm tests +- intel/tools/tests: Add i965_asm tests for gfx12 and gfx12.5 +- nir: Use alu source components count in nir_alu_srcs_negative_equal + +Sylvain Munaut (1): + +- mesa/st, dri2, wgl, glx: Restore flush_objects interop backward compat + +Tapani Pälli (34): + +- intel/dev: provide intel_device_info_is_adln helper +- iris: add required PC for Wa_14014966230 +- anv: add current_pipeline for batch_emit_pipe_control +- anv: add required PC for Wa_14014966230 +- intel/dev: fix intel_device_info_is_adln check +- iris: handle tile case where cso width, height is zero +- anv: skip engine initialization if vm control not supported +- iris: add data cache flush for pre hiz op +- anv/drirc: add option to disable FCV optimization +- drirc: use fake_sparse for Armored Core 6 +- drirc: Set limit_trig_input_range option for Valheim +- iris: implement Wa_18020335297 +- anv: refactor state emission +- anv: implement Wa_18020335297 +- iris: implement dummy blit for Wa_16018063123 +- anv: implement dummy blit for Wa_16018063123 +- mesa: lower EXT_render_snorm version requirement +- anv: use slow clear for small surfaces with Wa_18020603990 +- iris: use slow clear for 
small surfaces with Wa_18020603990 +- anv/hasvk/drirc: change anv_assume_full_subgroups to have subgroup size +- drirc: setup anv_assume_full_subgroups=16 for UnrealEngine5.1 +- anv: cleanup, use intel_needs_workaround instead of is_dg2 +- iris: cleanup, use intel_needs_workaround instead of is_dg2 +- iris: use intel_needs_workaround with 14015055625 +- mesa: fix enum support for EXT_clip_cull_distance +- drirc/anv: disable FCV optimization for Baldur's Gate 3 +- isl: implement Wa_14018471104 +- iris: use workaround framework for Wa_22018402687 +- anv: use workaround framework for Wa_22018402687 +- anv: check for wa 16013994831 in emit_so_memcpy_end +- iris: expand pre-hiz data cache flush to gfx >= 125 +- anv: expand pre-hiz data cache flush to gfx >= 125 +- iris: replace constant cache invalidate with hdc flush +- anv: move \*bits_for_access_flags to genX_cmd_buffer + +Tatsuyuki Ishi (25): + +- fast_urem_by_const: #ifdef DEBUG an assertion. +- radv: Fix mis-sizing of pipeline_flags in radv_hash_rt_shaders. +- radv: Use sizeof(flags) instead of hardcoded size in radv_hash_shaders. +- aco: Replace aco_vs_input_state.divisors with bitfields. +- radv: Remove last VS prolog reuse logic. +- radv, aco: Rework VS prolog key handling. +- radv, aco: Inline struct aco_vs_input_state. +- radv: Pre-mask misaligned_mask for VS prolog. +- radv: Implement helpers for shader part caching. +- radv: Use shader part caching helpers for VS prolog and PS/TCS epilog. +- zink: Fix missing sparse buffer bind synchronization. +- zink: Defer freeing sparse backing buffers. +- zink: Fix waiting for texture commit semaphores. +- zink: Remove now unused dead_framebuffers. +- radv: Remove aspect mask "expansion" for copy_image. +- radv: Add workaround to allow sparse binding on gfx queues. +- radv: Enable radv_legacy_sparse_binding for DOOM Eternal. +- radv/amdgpu: Remove virtual bo dump logic. +- radv/amdgpu: Separate the concept of residency from use_global_list. 
+- radv: Simplify shader config assignment.
+- radv: Move up radv_get_max_waves, radv_get_max_scratch_waves.
+- radv: Precompute shader max_waves.
+- radv: Add layer to skip UnmapMemory for Quantic Dream Engine
+- radv: Recompute max_waves after postprocessing RT config
+- radv: never set DISABLE_WR_CONFIRM for CP DMA clears and copies
+
+Tele42 (1):
+
+- drirc: enable \`vk_wsi_force_swapchain_to_current_extent` for "The Talos Principle VR"
+
+Teng, Jin Chung (1):
+
+- d3d12: Decode - Adding more supported resolution
+
+Thomas Devoogdt (1):
+
+- util: os_same_file_description: fix unknown linux < 3.5 syscall SYS_kcmp
+
+Thomas H.P. Andersen (13):
+
+- docs: update nvk extensions
+- nvk: use nvk_pipeline_zalloc
+- nouveau: drop unused #includes of tgsi_parse.h
+- nvk: VK_EXT_color_write_enable
+- docs: update features.txt for nvk
+- nvk: loop over stages in MESA order
+- nvk: add hashing for shaders
+- nvk: allocatable nvk_shaders
+- nvk: pipeline shader cache
+- nvk: VK_EXT_pipeline_creation_feedback
+- nvk: VK_EXT_pipeline_creation_cache_control
+- nvk: VK_EXT_shader_module_identifier
+- docs: update features.txt for nvk
+
+Thong Thai (1):
+
+- radeonsi/vcn: remove EFC support for renoir
+
+Timothy Arceri (24):
+
+- nir: move build_write_masked_stores() to nir builder
+- glsl/nir: implement a nir based lower distance pass
+- glsl: switch to NIR distance lowering pass
+- glsl: remove now unused lower distance pass
+- nir: simplify nir_build_write_masked_store()
+- glsl: drop ir_binop_ubo_load
+- glsl: add nir based lower_named_interface_blocks()
+- glsl: use the nir based lower_named_interface_blocks()
+- glsl: remove GLSL IR lower_named_interface_blocks()
+- nir: add nir_fixup_deref_types()
+- glsl: support glsl linking in nir block linker
+- glsl: use new nir based block linker
+- glsl: remove now unused GLSL IR block linker
+- glsl/st: move has_half_float_packing flag to consts struct
+- glsl/st: move remaining glsl ir lowering to linker
+- mesa/st: drop additional validate_ir_tree() call
+- glsl: combine shader stage loops in linker
+- radeonsi: fix divide by zero in si_get_small_prim_cull_info()
+- glsl: tidy up validation loop in linker
+- glsl: remove some unused linker code
+- glsl: copy precision val of function output params
+- glsl: add additional lower mediump test
+- glsl: move glsl ir lowering out of glsl_to_nir()
+- glsl: add support for inout params to glsl_to_nir()
+
+Timur Kristóf (32):
+
+- radv: Remove always false tmz variables from SDMA functions.
+- radv: Expose radv_get_dcc_max_uncompressed_block_size function.
+- radv: Implement buffer/image copies on transfer queues.
+- radv: Add temporary BO for transfer queues.
+- radv: Implement workaround for unaligned buffer/image copies.
+- ac: Rename SDMA max copy size macros to reflect SDMA version.
+- ac: Remove CIK prefix from SDMA opcodes.
+- ac: Add sdma_version enum and use it for SDMA features.
+- radv: Use GPU info for determining SDMA metadata support.
+- radv: Use SDMA version instead of gfx_level where possible.
+- radv: disable HTILE/DCC for concurrent images with transfer queue if unsupported.
+- radv: Disable DCC on exclusive images with transfer queue when SDMA doesn't support it.
+- radv: Disable HTILE on exclusive images with transfer queues when SDMA doesn't support it.
+- radv: Don't retile DCC on transfer queues.
+- radv: Implement barriers for transfer queues.
+- radv: Implement vkCmdFillBuffer on transfer queues.
+- radv: Implement vkCmdWriteTimestamp2 on transfer queues.
+- radv: Implement vkCmdWriteBufferMarker2AMD on transfer queues.
+- radv: Implement buffer copies on transfer queues.
+- radv: Implement vkCmdUpdateBuffer on transfer queues.
+- radv: Move SDMA function and struct declarations to a new header.
+- radv: Unify SDMA surface struct for linear and tiled images.
+- radv: Refactor and simplify SDMA surface info functions.
+- radv: Pass radv_sdma_surf from copy functions to SDMA.
+- radv: Use SDMA surface structs for determining unaligned buffer copies.
+- radv: Clean up SDMA chunked copy info struct.
+- radv: Use correct plane and binding index with SDMA.
+- radv: Correct binding index for transfer buffer-image copies.
+- radv: Implement image copies on transfer queues.
+- radv: Implement T2T scanline copy workaround.
+- radv: Expose transfer queues, hidden behind a perftest flag.
+- radv: Correctly select SDMA support for PRIME blit.
+
+Vignesh Raman (5):
+
+- ci: Add CustomLogger class and CLI tool
+- ci: copy logging script to install
+- ci: bare-metal: poe: Create strutured logs
+- ci: bare-metal: cros-servo: Create strutured logs for a630
+- ci/freedreno: add FARM variable
+
+Vinson Lee (6):
+
+- ac/surface/tests: Remove duplicate variable block_size_bits
+- nir: Fix decomposed_prmcnt copy-paste error
+- nvk: Fix tautological-overlap-compare warning
+- etnaviv: Remove duplicate initializers
+- ac/rgp: Fix single-bit-bitfield-constant-conversion warning
+- intel/disasm: Remove duplicate variable reg_file
+
+Violet Purcell (1):
+
+- gallium: Fix undefined symbols in version scripts
+
+Vitaliy Triang3l Kuzmin (13):
+
+- r600: Move r600_create_vertex_fetch_shader to r600_shader.c
+- r600: Remove Gallium dependencies in r600_isa
+- r600: Replace R600_ERR with R600_ASM_ERR in shader code
+- r600: Remove Gallium dependencies in r600_asm
+- r600: Split r600_shader.h into common and Gallium parts
+- r600/sfn: Make r600 header include paths relative
+- r600/sfn: Split r600_shader_from_nir into common and Gallium parts
+- r600: Fix outputs typo in print_pipe_info
+- r600: Replace TGSI I/O semantics with shader_enums
+- r600/sfn: Change sampler_index to texture_index in buffer txs
+- r600/sfn: Remove unused sampler reference in emit_tex_lod
+- nir: Don't skip lower_alu if only bit_count needs lowering
+- vulkan: Fix pipeline layout allocation scope
+
+Vlad Schiller (1):
+
+- pvr: Fix VK_EXT_texel_buffer_alignment
+
+VladimirTechMan (1):
+
+- venus/android: Switch to using u_gralloc
+
+Yiwei Zhang (57):
+
+- venus: use common vk_image_format_to_ahb_format helper
+- venus: use common vk_image_usage_to_ahb_usage helper
+- venus: tiny refactor of device memory report interface
+- venus: avoid modifier prop query in vn_android_get_image_builder
+- venus: use common vk_image as vn_image base
+- venus: use common vk_device_memory as vn_device_memory base
+- venus: use common AHB management and export impl
+- venus: use vk_device_memory tracked export and import handle types
+- venus: use vk_device_memory tracked size
+- venus: use vk_device_memory tracked memory_type_index
+- venus: fix query feedback batch leak and race upon submission
+- zink: apply can_do_invalid_linear_modifier to Venus
+- venus: scrub msaa sample mask only with valid msaa state
+- venus: fix async compute pipeline creation
+- venus: properly initialize ring monitor initial alive status
+- venus: add missing shmem pool fini for cs_shmem pool
+- venus: reduce ring idle timeout from 50ms to 5ms
+- venus: use STACK_ARRAY to prepare for indirect submission
+- venus: enable renderer shmem cache dump for cache debug
+- venus: add ring helper to avoid redundant ring wait requests
+- venus: use instance allocator for ring allocs
+- venus: use instance allocator for indirect cs storage alloc
+- venus: add vn_instance_fini_ring helper
+- venus: refactor instance creation failure path
+- venus: move ring monitor to instance for sharing across rings
+- venus: refactor to add vn_watchdog
+- venus: further cleanup vn_relax_init to take instance instead of ring
+- venus: always set reply command stream to avoid seek
+- venus: make vn_renderer_shmem_pool thread-safe
+- venus: remove command_dropped tracking
+- venus: relax ring mutex
+- venus: move ring shmem into vn_ring
+- venus: move the rest ring belongings into ring
+- venus: move ring submission into ring
+- venus: move the actual ring creation into ring as well
+- venus: add vn_ring_get_id and hide vn_ring internals entirely
+- venus: switch to vn_ring as the protocol interface - part 1
+- venus: switch to vn_ring as the protocol interface - part 2
+- venus: switch to vn_ring as the protocol interface - part 3
+- venus: add vn_gettid helper
+- venus: dispatch background shader tasks to secondary ring
+- driconfig: add a workaround for Hades (Vulkan backend)
+- vulkan/wsi/wayland: ensure drm modifiers stored in chain are immutable
+- venus: clang format fixes
+- venus: split up the pipeline fix description into self and pnext
+- venus: refactor to add pipeline info fixes helpers
+- venus: properly ignore formats in VkPipelineRenderingCreateInfo
+- meson/vulkan/util: allow venus to drop compiler deps
+- venus: make tls hint specific to pipeline creation
+- venus: TLS ring
+- venus: clean up secondary ring
+- venus: allow to retrieve pipeline cache on TLS ring
+- venus: populate oom from ring submit alloc failures
+- vulkan/wsi/wayland: fix returns and avoid leaks for failed swapchain
+- venus: fix pipeline layout lifetime
+- venus: fix pipeline derivatives
+- venus: fix to respect the final pipeline layout
+
+Yogesh Mohan Marimuthu (10):
+
+- winsys/amdgpu: add _dw to max_ib_size variable for code readability
+- winsys/amdgpu: remove ib_type variable from struct amdgpu_ib
+- winsys/amdgpu: rename struct amdgpu_ib main variable as main_ib everywhere
+- winsys/amdgpu: rename ib variable name to chunk_ib
+- winsys/amdgpu: remove rcs variable from struct amdgpu_ib
+- winsys/amdgpu: move 125% comment to correct line of code
+- winsys/amdgpu: rename requested_size_dw to projected_size_dw
+- winsys/amdgpu: rename ptr_ib_size_inside_ib to is_chained_ib
+- winsys/amdgpu: rename big_ib_buffer,ib_mapped variables in struct amdgpu_ib
+- winsys/radeon: remove unused gpu_address variable from struct radeon_cmdbuf
+
+Yonggang Luo (61):
+
+- compiler: Implement num_mesh_vertices_per_primitive to match u_vertices_per_prim
+- treewide: Merge num_mesh_vertices_per_primitive and u_vertices_per_prim into mesa_vertices_per_prim
+- nir: remove redundant include of gallium headers
+- nir: #include "util/macros.h" for BITFIELD64_MASK in nir.c
+- compiler,vulkan,drm-shim: Remove unused include directories from meson.build
+- nvk: Should use alignment instead of align
+- microsoft/clc: Using sampler_id instead PIPE_MAX_SHADER_SAMPLER_VIEWS for dxil_lower_sample_to_txf_for_integer_tex
+- microsoft/clc: Use 128 instead of PIPE_MAX_SHADER_SAMPLER_VIEWS
+- micosoft: define enum dxil_tex_wrap to avoid the usage of enum pipe_tex_wrap
+- micosoft: decouple microsoft vulkan driver and compiler from gallium
+- dzn: Fixes -Werror=incompatible-pointer-type
+- d3d12,dzn: Simplify the usage of #include
+- util: Fixes note: the alignment of ‘_Atomic long long int’ fields changed in GCC 11.
+- glsl: move glsl_get_gl_type into glsl/linker_util.h
+- meson/win32: There is no need install OpenGL headers on win32
+- intel: Remove unused ALIGN macro
+- clover: Rename function align to align_vector to avoid conflict with global align
+- treewide: Avoid use align as variable, replace it with other names
+- util,vulkan,mesa,compiler: Generate source files with utf8 encoding from mako template
+- intel: Generate source file with utf-8 encoding from mako template
+- zink: Generate source file with utf-8 encoding from mako template
+- docs: Generate document with utf8 encoding
+- v3dv: Use correct type VkStencilOp in function translate_stencil_op
+- broadcom/compiler: Use correct type pipe_logicop for logicop_func in struct v3d_fs_key
+- broadcom/compiler: remove unused blend in v3d_fs_key
+- broadcom: remove unused headers include
+- osmesa: Make osmesa.h compatible with Windows SDK's GL.h
+- broadcom/(compiler,common): avoid include of gallium headers in header files
+- broadcom/compiler: remove include of gallium headers from meson.build
+- osmesa: Fixes building osmesa.c on windows
+- meson: Support for both packaging and distutils
+- dzn: Remove #if D3D12_SDK_VERSION blocks now that 611 is required
+- ci/msvc: update flex and bison to winflexbison3
+- ci/msvc: Install graphics tools(DirectX debug layer) easy to stuck, place it at the beginning
+- ci/msvc: Split install vulkan sdk out of choco
+- ci/msvc: Rename vs2019 to msvc
+- ci/msvc: Rename vs to msvc for consistence
+- ci/msvc: Improve msvc init
+- ci/msvc: Remove &windows_msvc_image_tag
+- ci/msvc: Upgrade to vs2022 build tools
+- ci/msvc: Install msvc2019 only from vs2022
+- ci/msvc: Install both msvc2019 and msvc2022
+- ci/msvc: Stick deqp-runner to version v0.16.1
+- ci/msvc: Stick VK-GL-CTS to specific version 56114106d860c121cd6ff0c3b926ddc50c4c11fd
+- ci/msvc: Split the install of rust and d3d out of mesa_deps_test.ps1
+- ci/microsoft: Update the image-tag and image-path for msvc2019/msvc2022
+- treewide: Replace the include of nir_types.h with glsl_types.h
+- compiler/glsl: Move glsl specific _mesa_glsl_initialize_types out and glsl_symbol_table of glsl_types.h
+- intel: Avoid use align as variable, replace it with other names
+- intel: Use ALIGN_POT instead of ALIGN inside macro define
+- intel: Cleanup duplicate ALIGN macro defines
+- intel,crocus,iris: Use align64 instead of ALIGN for 64 bit value parameter
+- amd: Use align64 instead of ALIGN for 64 bit value parameter
+- util,compiler: Avoid use align as variable, replace it with other names
+- panfrost: Avoid use align as variable, replace it with other names
+- glsl: Fixes glcpp/tests with mingw/gcc
+- util: Add align_uintptr and use it treewide to replace ALIGN that works on size_t and uintptr_t
+- nvk: Avoid use align as variable, replace it with alignment
+- nouveau: Use align64 instead of ALIGN for 64 bit value parameter
+- etnaviv/drm: Remove redundant ALIGN macro by #include "util/u_math.h"
+- compiler/spirv: The spirv shader is binary, should write in binary mode
+
+Zhang Ning (2):
+
+- iris: use helper util_resource_at_index
+- lima: Support parameter queries for PIPE_RESOURCE_PARAM_NPLANES
+
+Zhang, Jianxun (5):
+
+- intel/genxml: Remove 3DSTATE_CLEAR_PARAMS instruction (xe2)
+- intel/genxml: update 3DSTATE_WM_HZ_OP instruction (xe2)
+- intel/genxml: update 3DSTATE_DEPTH_BUFFER instruction (xe2)
+- intel/isl: update 3DSTATE_STENCIL_BUFFER (xe2)
+- intel/genxml: Add RENDER_SURFACE_STATE for xe2
+
+antonino (4):
+
+- nir: don't take the derivative of the array index in \`nir_lower_tex`
+- vulkan: use instance allocator for \`object_name` in some objects
+- nir/zink: drop NIH helper in favor of \`mesa_vertices_per_prim`
+- egl: only check dri3 on X11
+
+daoxianggong (1):
+
+- zink - Fix for blend color change without blend state change
+
+duncan.hopkins (4):
+
+- util: Update util/libdrm.h stubs to allow loader.c to compile on MacOS.
+- dri: added build dependencies for systems using non-standard prefixed X11 libs.
+- glx: fix automatic zink fallback loading between hw and sw drivers on MacOS
+- vulkan: added build dependencies for systems using non-standard prefixed X11 libs.
+
+i509VCB (3):
+
+- asahi,docs: add PBE to hardware glossary
+- asahi: create queue for screen
+- agx: remove internal agx_device queue
+
+jphuang (1):
+
+- dzn: Change dst image layout according to aspect
+
+llyyr (1):
+
+- docs: document AMD_DEBUG=noefc and useaco
+
+ratatouillegamer (2):
+
+- hasvk: Add Vulkan API version override
+- hasvk: Enable hasvk override Vulkan API Version for Brawlhalla