Commit Graph

91386 Commits

Author SHA1 Message Date
Nicolai Hähnle
ce55afc4d6 glsl: remove the shader_group_vote and shader_ballot expression ops
They are now no longer used.
2017-04-28 11:33:59 +02:00
Nicolai Hähnle
0aef96e00c glsl: implement arb_shader_ballot builtins using intrinsics 2017-04-28 11:33:59 +02:00
Nicolai Hähnle
2c30ea3fcd glsl: implement arb_shader_group_vote builtins via intrinsics
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-28 11:33:59 +02:00
Nicolai Hähnle
944455217b st/glsl_to_tgsi: implement shader_group_vote and shader_ballot intrinsics
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-28 11:33:59 +02:00
Nicolai Hähnle
99941a9724 glsl: add intrinsics for ARB_shader_group_vote and ARB_shader_ballot
These operations are currently implemented as IR expressions. However,
they cannot be transformed and moved in the way that other IR
expressions can because they have non-trivial interactions with
control-flow.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-28 11:33:58 +02:00
Samuel Pitoiset
24011ead71 glsl: reject image qualifiers with non-image types inside uniform blocks
Fixes the following ARB_shader_image_load_store tests:

format-layout-with-non-image-type.frag
memory-qualifier-with-non-image-type.frag

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-28 10:43:53 +02:00
Samuel Pitoiset
edb4a1ab2d glsl: introduce validate_image_qualifier_for_type() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-28 10:43:13 +02:00
Samuel Pitoiset
80738425e4 glsl: fix error when using format qualifiers with non-image types
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-28 10:43:04 +02:00
Timothy Arceri
22fa3d90a9 util/disk_cache: remove percentage based max cache limit
The more I think about it the more this seems like a bad idea.
When we were deleting old cache dirs this wasn't so bad as it
was unlikely we would ever hit the actual limit before things
were cleaned up. Now that we only start cleaning up old cache
items once the limit is reached the a percentage based max
cache limit is more risky.

For the inital release of shader cache I think its better to
stick to a more conservative cache limit, at least until we
have some way of cleaning up the cache more aggressively.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-04-28 14:35:27 +10:00
Jason Ekstrand
032861693e anv: Move queues, events, and semaphores to their own file
Things are about to get more complicated, especially as far as
semaphores are concerned.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
9bd1f03487 anv: Implement VK_KHX_external_memory_fd
This commit just exposes the memory handle type.  There's interesting we
need to do here for images.  So long as the user doesn't set any crazy
environment variables such as INTEL_DEBUG=nohiz, all of the compression
formats etc. should "just work" at least for opaque handle types.

v2 (chadv):
  - Rebase.
  - Fix vkGetPhysicalDeviceImageFormatProperties2KHR when
    handleType == 0.
  - Move handleType-independency comments out of handleType-switch, in
    vkGetPhysicalDeviceExternalBufferPropertiesKHX.  Reduces diff in
    future dma_buf patches.

Co-authored-with: Chad Versace <chadversary@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
818b857914 anv: Use the BO cache for DeviceMemory allocations
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
494d6f65a7 anv/allocator: Add a BO cache
This cache allows us to easily ensure that we have a unique anv_bo for
each gem handle.  We'll need this in order to support multiple-import of
memory objects and semaphores.

v2 (Jason Ekstrand):
 - Reject BO imports if the size doesn't match the prime fd size as
   reported by lseek().

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
5d25ac6a4b anv: Implement VK_KHX_external_memory
This is the trivial implementation that just exposes the extension
string but exposes zero external handle types.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-27 20:08:46 -07:00
Chad Versace
354ca7a1d4 anv: Implement VK_KHX_external_memory_capabilities
This is a complete but trivial implementation. It's trivial becasue We
support no external memory capabilities yet.  Most of the real work in
this commit is in reworking the UUIDs advertised by the driver.

v2 (chadv):
  - Fix chain traversal in vkGetPhysicalDeviceImageFormatProperties2KHR.
    Extract VkPhysicalDeviceExternalImageFormatInfoKHX from the chain of
    input structs, not the chain of output structs.
  - In vkGetPhysicalDeviceImageFormatProperties2KHR, iterate over the
    input chain and the output chain separately. Reduces diff in future
    dma_buf patches.

Co-authored-with: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
d4d9258b61 anv/physical_device: Rename uuid to pipeline_cache_uuid
We're about to have more UUIDs for different things so this one really
needs to be properly labeled.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
02767cb4ff anv: Refactor device_get_cache_uuid into physical_device_init_uuids
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
35e626bd0e anv: Set EXEC_OBJECT_ASYNC when available
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
bd3a9813b9 anv/cmd_buffer: Use the device allocator for QueueSubmit
The command is really operating on a Queue not a command buffer and the
nearest object to that with an allocator is VkDevice.

Reviewed-by: Chad Versace <chadversary@chromium.org>
Cc: "17.0 17.1" <mesa-dev@lists.freedesktop.org>
2017-04-27 20:08:46 -07:00
Timothy Arceri
2bc06767e1 mesa: remove wip framebuffer code
This was added in 34b3b40af9 back in 2006. Seems it wasn't
needed.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-28 10:19:59 +10:00
Samuel Pitoiset
75a31a20af glsl: set vector_elements to 1 for samplers
I don't see any reasons why vector_elements is 1 for images and
0 for samplers. This increases consistency and allows to clean
up some code a bit.

This will also help for ARB_bindless_texture.

No piglit regressions with RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-27 22:52:21 +02:00
Jan Vesely
b295a52836 clover: Fix build since clang r301442
v2: rename default_ik -> ik_opencl

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-27 12:52:25 -04:00
Timothy Arceri
4e1f3afea9 disk_cache: use block size rather than file size
The majority of cache files are less than 1kb this resulted in us
greatly miscalculating the amount of disk space used by the cache.

Using the number of blocks allocated to the file is more
conservative and less likely to cause issues.

This change will result in cache sizes being miscalculated further
until old items added with the previous calculation have all been
removed. However I don't see anyway around that, the previous
patch should help limit that problem.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2017-04-27 20:44:00 +10:00
Timothy Arceri
ce41237151 disk_cache: reduce default cache size to 5% of filesystem
Modern disks are extremely large and are only going to get bigger.
Usage has shown frequent Mesa upgrades can result in the cache
growing very fast i.e. wasting a lot of disk space unnecessarily.

5% seems like a more reasonable default.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2017-04-27 20:43:50 +10:00
Dave Airlie
f4743763ce radeon/ac: remove assert causing regression
This assert wasn't in the original radeonsi code but I added
it without totally understanding the original code, it caused
some regressions in variable-indexing tessellation shaders.

Fixes: e2659176 radeonsi/ac: move vertex export remove to common code.
Reported-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-27 11:38:54 +01:00
Dave Airlie
550281f934 radeon/ac: fix build on llvm 3.8.1
Add missing include to fix build.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-27 11:22:12 +01:00
Boyan Ding
63df869f08 nvc0: Enable compute support for Pascal
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-27 11:11:15 +02:00
Boyan Ding
d03bfb078b nvc0: Add new launch descriptor format for GP100
v2:
Also handle the the new format in indirect dispatch
Use compute class check instead of chipset check

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-27 11:11:12 +02:00
Boyan Ding
2e35bd964e nvc0: Fix index of unk fields in nve4_cp_launch_desc
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-27 11:11:10 +02:00
Boyan Ding
4a9f7bfe90 nouveau: Fix indentation of maxwell compute class definitions
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-27 11:11:07 +02:00
Jason Ekstrand
c43b4bc85e anv: Don't place scratch buffers above the 32-bit boundary
This fixes rendering corruptions in DOOM.  Hopefully, it will also make
Jenkins a bit more stable as we've been seeing some random failures and
GPU hangs ever since turning on 48bit.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100620
Fixes: 651ec926fc "anv: Add support for 48-bit addresses"
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-04-27 02:04:57 -07:00
Dave Airlie
f205e19e4f radv/ac: eliminate unused vertex shader outputs. (v2)
This is ported from radeonsi, and I can see at least one
Talos shader drops an export due to this, and saves some
VGPR usage.

v2: use shared code.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-27 05:18:52 +01:00
Dave Airlie
e2659176ce radeonsi/ac: move vertex export remove to common code.
This code can be shared by radv, we bump the max to
VARYING_SLOT_MAX here, but that shouldn't have too
much fallout.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-27 05:17:47 +01:00
Dave Airlie
9da1045933 radv: fix regression in descriptor set freeing.
Since the host pool changes,

Fixes:
dEQP-VK.api.descriptor_pool.out_of_pool_memory

Fixes: 126d5ad "radv: Use host memory pool for non-freeable descriptors."
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-27 10:50:46 +10:00
Timothy Arceri
f8a2d00046 glsl: remove duplicate validation
Varying types have already been validated in
apply_type_qualifier_to_variable() by this point.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-27 08:21:28 +10:00
Timothy Arceri
52c76dbad3 glsl: use without_array() rather than get_scalar_type()
Here get_scalar_type() was just being use to remove the array
after that we converted it back to base_type anyway so just
use the without_array() helper.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-04-27 08:21:21 +10:00
Brian Paul
28feb63580 svga: fix vertex buffer binding issue
When we ran Viewperf11's Maya-03 test 3 we saw warnings about flushing
the command buffer with mapped buffers.  This happened when transitioning
from hardware rendering to a 'draw' fallback path.

The problem is the util_set_vertex_buffers_count() function doesn't do
exactly what we want in svga_hwtnl_vertex_buffers().  In a case such as
dst_count=2, dst={bufA, bufB}, count=1 and src={bufC}, when the function
returns we'll have dst_count=2 and dst={bufC, bufB}.  What we really want
is dst_count=1 and dst={bufC, NULL}.  As it was, we were telling the svga
device that there were two vertex buffers when in fact we really only
needed one for the subsequent drawing command.

In this particular case, we first did hardware drawing with {bufA, bufB}
then we transitioned to the 'draw' module, consuming vertex data from
bufA and bufB and writing the new vertex data to bufC.  bufA and bufB are
mapped for reading when we flush the command buffer but should not be
referenced by the command buffer.  The above change fixes that.

No Piglit regressions.  Also tested with Viewperf, Google Earth, Heaven,
etc.

VMware bug 1842059

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-26 11:38:00 -06:00
Brian Paul
a36a1ea80a gallium/util: reduce util_snprintf() calls in debug_flush_might_flush_cb()
We only need to construct the debug message if the mapped_sync flag is set.
This should make the function faster since the flag is usually false.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-26 11:38:00 -06:00
Brian Paul
495840658e gallium/util: add some comments in u_debug_flush.c
Trivial.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-26 11:37:59 -06:00
Charmaine Lee
fbda9b905a svga: Removed the unused label 'done' in svga_validate_surface_view()
Trivial fix
2017-04-26 11:37:59 -06:00
Charmaine Lee
019d5d5346 svga: use the winsys interface to invalidate surface
Instead of directly sending the InvalidateGBSurface command,
this patch uses the invalidate_surface interface.

Fixes Linux VM piglit failures including
   ext_texture_array-gen-mipmap, fbo-generatemipmap-array S3TC_DXT1

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-26 11:37:59 -06:00
Charmaine Lee
5bd5ec6a0f svga: fix format for screen target
This patch revises the fix in commit 606f13afa31c9f041a68eb22cc32112ce813f944
to properly translate the surface format for screen target.
Instead of changing the svga format for PIPE_FORMAT_B5G6R5_UNORM
to SVGA3D_R5G6B5 for all texture surfaces, this patch only restricts
SVGA3D_R5G6B5 for screen target surfaces. This avoids rendering
failures when specify a non-vgpu10 format in a vgpu10 context with
software renderer.

Fixes piglit failures spec@!opengl 1.1@draw-pixels,
                      spec@!opengl 1.1@teximage-colors gl_r3_g3_b2
                      spec@!opengl 1.1@texwrap formats

Tested Xorg with 16bits depth.
Also tested with MTT piglit, MTT glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-26 11:37:59 -06:00
Charmaine Lee
3626112214 svga: cache the backing surface handle in the texture object
CinebenchR15 not only binds the same texture for rendering and sampling,
it actually changes the framebuffer buffer attachment very often, causing
a lot of backed surface view to be created and a lot of surface copies
to be done. This patch caches the backed surface handle
in the texture resource and allows the backed surface view to
reuse the backed surface handle.  With this patch, the number of
backed surface view reduces from 1312 to 3. Unfortunately, this
does not eliminate all the surface copies. There are still surface
copies involved when we switch from original to backed surface handle
for rendering.

Tested with CinebenchR15, NobelClinicianViewer, Turbine, Lightsmark2008,
            MTT glretrace, MTT piglit.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-26 11:37:59 -06:00
Charmaine Lee
7f2f695d4d svga: Update the backing resource only if needed
This patch adds a timestamp in svga_surface structure to keep track
of when the backing surface is last sync with the original resource.
This helps to avoid unnecessary surface copy from the original
resource to the backing surface if the original resource has not
since been modified.

This reduces the amount of surface copy with CinebenchR15.

Tested with CinebenchR15, mtt glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-26 11:37:59 -06:00
Charmaine Lee
c6576461f5 svga: Set the surface dirty bit for the right surface view
For VGPU10, we will render to a backed surface view when
the same resource is used for rendering and sampling.
In this case, we will mark the dirty bit for the backed surface view.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-26 11:37:59 -06:00
Charmaine Lee
dc30ac5c24 svga: Move rendertarget view related fields to hw_clear state
This patch moves the rendertarget view related fields from
svga_hw_draw_state to svga_hw_clear_state where all the hw
framebuffer related state resides.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-26 11:37:59 -06:00
Charmaine Lee
f482493dcf svga: Move setting the rendered_to flags to framebuffer emit time
Instead of setting the rendered_to flags at set time, this patch
moves the setting of the flags to framebuffer emit time.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-26 11:37:59 -06:00
Brian Paul
1ee181b354 svga: add const qualifiers on svga_check_sampler_view_resource_collision()
We don't change any of the argument objects.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-26 11:37:59 -06:00
Brian Paul
0f236ea785 svga: improve surface view debug messages
The old ones were somewhat cryptic.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-26 11:37:59 -06:00
Brian Paul
943f4f47e0 svga: add DEBUG_SAMPLERS
The debug output in svga_create_sampler_state() was controlled by
DEBUG_VIEWS but that's not consistent with the other debug output for
sampler views.  Create/use a new debug flag just for this.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-26 11:37:59 -06:00