With this option enabled range of input values for fsin and fcos is
limited to [-2*pi : 2*pi] by calculating the reminder after 2*pi modulo
division. This helps to improve calculation precision for large input
arguments on Intel.
-v2: Add limit_trig_input_range option to prog_key to update shader
cache (Lionel)
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16388>
On Gfx7 we can only give the sample location for a given multisample
number. This means everytime the multisampling value changes, we have
to re-emit the locations. It's fine because it's also where
(3DSTATE_MULTISAMPLE) the number of samples is stored.
On Gfx8+ though, 3DSTATE_MULTISAMPLE only holds the number of samples
and all the sample locations for all number of samples are located in
3DSTATE_SAMPLE_PATTERN. So to be more effecient there, we need to
track the locations for all sample numbers and compare new values with
the relevant sample count when touching the dynamic state.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16220>
Discrete platforms don't have LLC, but on those, we mmap our buffers
with WC. So we shouldn't need to clflush there.
Anv already had a boolean field on the physical device to know whether
we need to use clflush(), based off the memory heaps available. So use
that instead.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15780>
Without this we might choose 8 or 16 width, while the app assumes 32.
With subgroup operations it may cause wrong calculations and thus bugs.
Examples of such games are Aperture Desk Job and DOOM Eternal.
v2: Make it a driconf option instead of applying unconditionally, move
from brw_required_dispatch_width to brw_compile_cs
v3: Rename allow_assuming_full_subgroups -> assume_full_subgroups.
Include assume_full_subgroups value in anv_pipeline_hash_compute().
v4: Move actual workaround code from brw_fs.c -> anv_pipeline.c.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6171
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15708>
Instead of having two different helpers, delete the pipeline_cache ones.
Also, instead of manually handling the cache == NULL case in every
vkCreateFooPipelines call, handle it inside the helpers. This means
that BLORP can use them too by passing cache=NULL.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13184>
The problem is that we missed looking at pipeline changes. Pipelines
hold bits of dynamic states and when it changes we might need to
reemit a packet.
v2: fix comment (Tapani)
Add missing anv_cmd_buffer_needs_dynamic_state() use (Tapani)
Cc: mesa-stable
Fixes: 505d176a8e ("anv: disable baked in pipeline bits from dynamic emission path")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15310>
This is about the only "small" change we can make in the process of
converting from render-pass-based to dynamic-rendering-based. Make
everything in pipeline creation work in terms of dynamic rendering and
create the dynamic rendering structs from the render pass as-needed.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14961>
This raises our vertex input bindings limit from 28 to 31, and our
vertex input attribute limit from 28 to 29. We could theoretically
go higher, but it will take additional work.
The 3DSTATE_VERTEX_BUFFERS and 3DSTATE_VERTEX_ELEMENTS limits are 33
vertex buffers, and 34 vertex elements. But we need up to two vertex
elements for system values (FirstVertex, BaseVertex, BaseInstance,
DrawID), and we currently use two vertex bindings for those.
There is another hidden limit: our compiler backend only supports the
push model for VS inputs currently. 3DSTATE_VS only allows URB Read
Lengths between [0, 15], which is measured in pairs of inputs, which
means we can theoretically push no more than 32 vertex elements. This
is no artifical limit either, as a vec4 element takes up 4 registers
in the payload, and 32 * 4 = 128, the entire size of our register file.
Plus, the VS Thread payload needs at least g0 and g1 for other things,
so we can really only push 31.
We can theoretically support one additional binding, by combining our
two SGV bindings into a single upload. In order to support additional
vertex elements, we would need to add support to the backend compiler
for the pull model for VS inputs.
References: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5917
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14991>
With Vulkan 1.1, we have a VkImageSwapchainCreateInfoKHR struct which
lets you create a new VkImage which aliases a swapchain image. However,
there is no corresponding swapchain create flag so we have to set
VK_IMAGE_CREATE_ALIAS_BIT all the time.
We need to do a bit of work in ANV to prevent it from asserting the
moment it sees one of these. Fortunately, they're already safe because
WSI images go through a different bind path for
VkBindImageMemorySwapchainInfoKHR.
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12031>
There's two of them because they can be created from three points in the
code that provide different details and this is the least ugly way I
could think of for now.
v2: Avoid allocations (Lionel)
v3: Move definition closer to its usage (Lionel)
v4: (Lionel)
- Simplify anv_dynamic_pass_init_full
- Zero out pass/subpass to avoid stall pointers
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13980>
This change introduces the anv_descriptor_size_for_mutable_type and
anv_descriptor_data_for_mutable_type helpers to compute the size and
data flags respectively for mutable descriptor types.
In order to make handling these types easier we now store a precomputed
descriptor stride for all types and use in the in appropriate places.
We also need to adjust the compiler to take into account this descriptor
stride. To that extent, we now pack the stride into the upper 16 bits
alongside the index and the dynamic offset index and use it later to
compute the correct offset.
Closes: #4250
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14633>