117322 Commits

Author SHA1 Message Date
Erik Faye-Lund
477f019812 zink: only enable KHR_external_memory_fd if supported
While we're at it, make sure we error out if it's not supported when
required.

This brings us a bit closer to being able to test on SwiftShader, which
doesn't currently support KHR_external_memory_fd.
2019-10-30 19:40:50 +00:00
Bas Nieuwenhuizen
780c937a5d radv: Start signalling semaphores in WSI acquire.
Winsys semaphores without signal operation get silently ignored.

Not so for syncobjs, so actually signal them.

Fixes: 84d9551b23 "radv: Always enable syncobj when supported for all fences/semaphores."
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2030
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 19:42:10 +01:00
Rhys Perry
e1bcc7a828 aco: rename README to README.md
Closes: #1974
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 18:16:00 +00:00
Rhys Perry
d4684a294b aco: a couple loop handling fixes for GFX10 hazard pass
It was joining from the wrong blocks and block.kind is a bitmask instead
of an enum.

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-30 18:13:53 +00:00
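Illustration for d4684a294b: the message notes that block.kind is a bitmask rather than an enum. A minimal sketch of why that matters follows; the flag names and struct are hypothetical and are not taken from the ACO sources. With a bitmask, an equality comparison only matches when no other flag is set, so the bit has to be tested with a mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag names; the real ACO block kinds may differ. */
enum {
   BLOCK_KIND_LOOP_HEADER = 1u << 0,
   BLOCK_KIND_LOOP_EXIT   = 1u << 1,
   BLOCK_KIND_TOP_LEVEL   = 1u << 2,
};

struct block {
   uint32_t kind; /* bitwise OR of BLOCK_KIND_* flags */
};

/* Enum-style test: only true when LOOP_HEADER is the sole flag set. */
static bool is_loop_header_eq(const struct block *b)
{
   assert(b != NULL);
   return b->kind == BLOCK_KIND_LOOP_HEADER;
}

/* Bitmask-style test: true whenever the LOOP_HEADER bit is set. */
static bool is_loop_header_mask(const struct block *b)
{
   assert(b != NULL);
   return (b->kind & BLOCK_KIND_LOOP_HEADER) != 0;
}
```

For a block whose kind is LOOP_HEADER | TOP_LEVEL, the first helper reports false and the second reports true.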
Matt Turner
12d3b11908 intel/compiler: Add instruction compaction support on Gen12
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-30 11:11:50 -07:00
Matt Turner
c8fbc8823f intel/compiler: Make separate src0/src1 index tables
TGL uses different data (and even a different format!) for each source.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-30 11:11:50 -07:00
Matt Turner
cde73625f8 intel/compiler: Inline get_src_index()
TGL will have separate tables for src0 and src1, so the shared function
will no longer make sense.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-30 11:11:50 -07:00
Matt Turner
d0eff8a539 intel/compiler: Restructure instruction compaction in preparation for Gen12
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-30 11:11:50 -07:00
Matt Turner
ded9fb2b18 intel/compiler: Remove unreachable() from brw_reg_type.c
The EU compaction unit test fuzzes the compaction code by flipping bits.
We use a simple skip_bits() function with a list of reserved bits to
ignore, but for more complex cases like invalid combinations of register
file:type, we need either machinery to check validity or for these
functions to simply inform us whether a combination was valid.

enum brw_reg_type is a 4-bit field in brw_reg, so rather than expanding it
with an "INVALID" value, just return -1 and let the caller check for
that.

Scott suggested redefining unreachable() within the unit test to
longjmp() which would allow driver code like this to still use it and
allow the test to handle expected failures like this. If that plan works
out, I plan to revert this.
2019-10-30 11:11:50 -07:00
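Illustration for ded9fb2b18: a minimal sketch of the "return -1 and let the caller check" pattern described in the message. The enums, the encoding table, and both function names are made up for this example and do not reflect the real brw_reg_type.c tables.

```c
#include <assert.h>

/* Hypothetical register files and types, for illustration only. */
enum reg_file { FILE_GRF = 0, FILE_IMM = 1 };
enum reg_type { TYPE_F = 0, TYPE_D = 1, TYPE_V = 2 };

/* Decode a 4-bit hardware encoding into a reg_type.
 * Returns -1 for an invalid file:type combination instead of calling
 * unreachable(), so a fuzzing test can observe the failure. */
static int decode_reg_type(enum reg_file file, unsigned hw_bits)
{
   assert(hw_bits < 16); /* the encoding is a 4-bit field */

   switch (hw_bits) {
   case 0:
      return TYPE_F;
   case 1:
      return TYPE_D;
   case 2:
      /* In this made-up table, V is only valid for immediates. */
      return file == FILE_IMM ? TYPE_V : -1;
   default:
      return -1;
   }
}

/* Fuzz-style caller: tries every encoding and skips the invalid ones. */
static unsigned count_valid_encodings(enum reg_file file)
{
   unsigned valid = 0;
   for (unsigned bits = 0; bits < 16; bits++) {
      if (decode_reg_type(file, bits) >= 0)
         valid++;
   }
   return valid;
}
```

The caller never has to add an "INVALID" member to the enum; it only checks for a negative result.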
Jonathan Marek
fa3baeab76 freedreno/a2xx: add missing vertex formats (SSCALE/USCALE/FIXED)
Mostly for vertex formats, but they are supported as texture formats too
(untested however).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-10-30 18:04:17 +00:00
Pierre-Eric Pelloux-Prayer
03a132912f radeonsi: disable sdma for gfx10
Disable sdma on gfx10 until all timeout bugs are fixed.

See:
    https://gitlab.freedesktop.org/mesa/mesa/issues/1907
    https://bugs.freedesktop.org/show_bug.cgi?id=111481

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-30 18:03:14 +01:00
Pierre-Eric Pelloux-Prayer
2fb4b3c476 radeonsi: sdma misc fixes
SDMA IB doesn't need to be padded for SDMA.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-30 18:03:14 +01:00
Pierre-Eric Pelloux-Prayer
21b9a6b590 radeonsi: align sdma byte count to dw
If src/dst addresses are dw aligned and size is > 4 then we align
byte count to dw as well.

PAL implementation works like this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-30 18:03:14 +01:00
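Illustration for 21b9a6b590: a minimal sketch of the alignment rule stated in the message. The helper name is hypothetical, and the sketch assumes that "align to dw" means rounding the byte count down to a multiple of 4, with the 1 to 3 leftover bytes handled by a separate copy; the message itself does not spell out the rounding direction.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of the first SDMA copy chunk (hypothetical helper).
 * Assumption: when both addresses are dword aligned and size > 4, the
 * byte count is rounded down to a multiple of 4; otherwise it is used
 * unchanged. */
static uint64_t sdma_first_chunk_size(uint64_t src, uint64_t dst, uint64_t size)
{
   assert(size > 0);

   if ((src & 3) == 0 && (dst & 3) == 0 && size > 4)
      return size & ~(uint64_t)3;

   return size;
}
```

Under that assumption, a 10-byte copy between dword-aligned addresses yields an 8-byte first chunk and a 2-byte remainder, while the same copy from an unaligned source is left at 10 bytes.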
Timur Kristóf
f53811aeac radv: Enable ACO on Navi.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 16:54:41 +00:00
Leo Liu
a886ae5162 radeonsi: enable 8K video decode support for HEVC and VP9
HW 8K decode support starts at Renoir

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
2019-10-30 12:43:04 -04:00
Leo Liu
b4c812a269 radeon/vcn: Add VP9 8K decode support
Requires increasing the context buffer size.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
2019-10-30 12:43:04 -04:00
Rhys Perry
8235bc6411 aco: try to group together VMEM loads of the same resource
v2: remove accidental shaderInt16 change
v2: simplify can_move_down initialization
v2: simplify VMEM_CLAUSE_MAX_GRAB_DIST

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-30 17:23:49 +01:00
Daniel Schürmann
8b5aee78cc aco: don't schedule instructions through depending VMEM instructions
Previously, the scheduler tried to move up instructions from below depending
VMEM instructions only to move them down again when scheduling the VMEM
instruction.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 16:12:10 +00:00
Daniel Schürmann
636d45e46a aco: add can_reorder flags to load_ubo and load_constant
These got lost due to some refactoring.
Due to the way our scheduler works currently, for now
we add back the reorder flag for divergent loads only.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 16:12:10 +00:00
Daniel Schürmann
576f92d900 aco: only skip RAR dependencies if the variable is killed somewhere
This patch changes VMEM scheduling in a way that they can only
be moved upwards by previous VMEM instructions but not downwards.
This way, it improves the order of VMEM instructions in relation
to their users.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 16:12:10 +00:00
Daniel Schürmann
703ce617ca aco: restrict scheduling depending on max_waves
Previously, we allowed all shaders to reduce the number of max_waves to as low as 5.
Restricting this for shaders with low register demand increases the total number of waves
while the VMEM def-use distances hardly change.
This patch also changes the max number of move operations per MEM instruction.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 16:12:10 +00:00
Jason Ekstrand
beca63c6c0 anv: Avoid emitting UBO surface states that won't be used
This shaves around 4-5% off of a CPU-limited example running with the
Dawn WebGPU implementation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-30 16:05:57 +00:00
Jason Ekstrand
24c0545b2d intel/vec4: Set brw_stage_prog_data::has_ubo_pull
In 0e4a75f917, Ken added a flag brw_stage_prog_data which indicates
whether any UBO pulls ever occur.  Unfortunately, he neglected to set
the bit in the vec4 back-end.  This was fine at the time because the
optimization was intended for iris which does not support gen7 and using
the vec4 back-end on Gen8+ requires an environment variable.  We want to
use this in Vulkan which does support Gen7 so we want the information
from the vec4 back-end as well as scalar.

Fixes: 0e4a75f917 "intel/compiler: Record whether any pull constant..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-30 16:05:57 +00:00
Samuel Pitoiset
5a9d777f5a radv: fix perftest options
RADV_PERFTEST=outooforder was removed a while ago. This fixes
dumping the options into hang reports.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 14:49:30 +01:00
Samuel Pitoiset
c895e08281 radv: move nomemorycache debug option to the right place
Fixes: 6571000071 ("radv: add debug option to turn off in memory cache")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 14:49:28 +01:00
Samuel Pitoiset
d4e0bef1bb radv: fix dumping SPIR-V into hang reports
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 13:02:08 +00:00
Tapani Pälli
4f8c86e6a5 mesa: enable ARB_gpu_shader_int64 in compat profile
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-30 14:37:27 +02:00
Tapani Pälli
2d8b8d3bd1 mesa: add [Program]Uniform*64ARB display list support
This is required for int64 to be enabled in compat profile.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-30 14:37:27 +02:00
Bas Nieuwenhuizen
396195e8f1 radv: Enable VK_KHR_timeline_semaphore.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
4aa75bb3bd radv: Add wait-before-submit support for timelines.
This is actually a non-threaded implementation. I'd summarize this
as event-based submission.

When a submit happens, we walk a tree of submissions that depend on
the syncobj signal operations to be submitted, and if those submissions
have no other dependencies we start to execute them immediately.

Or, well I still use a list to avoid issues with long chains and
the stacksize when using recursion.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
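Illustration for 4aa75bb3bd: a minimal sketch of event-based submission with an explicit list in place of recursion, as the message describes. The struct layout, field names, and function name are hypothetical and much simpler than anything in RADV; the point is only that each executed submission appends its newly unblocked waiters to a worklist, so a long dependency chain does not grow the call stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical deferred submission. */
struct submission {
   unsigned pending_deps;          /* signals this submission still waits on */
   struct submission *waiters[4];  /* submissions waiting on this one */
   size_t num_waiters;
   int executed;
   struct submission *next;        /* worklist link */
};

/* Execute `first` and every submission that becomes ready as a result.
 * Returns the number of submissions executed. Iterative on purpose:
 * ready waiters are appended to a singly linked worklist. */
static unsigned run_ready_submissions(struct submission *first)
{
   unsigned executed = 0;
   struct submission *head = first;
   struct submission *tail = first;

   assert(first != NULL && first->pending_deps == 0);
   first->next = NULL;

   while (head != NULL) {
      struct submission *s = head;

      s->executed = 1; /* stand-in for the real queue submission */
      executed++;

      for (size_t i = 0; i < s->num_waiters; i++) {
         struct submission *w = s->waiters[i];

         assert(w->pending_deps > 0);
         if (--w->pending_deps == 0) {
            /* Last dependency satisfied: queue it for execution. */
            w->next = NULL;
            tail->next = w;
            tail = w;
         }
      }

      head = s->next;
   }

   return executed;
}
```

A submission that still has an unsatisfied dependency after the walk simply stays deferred until a later submit signals it.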
Bas Nieuwenhuizen
88d41367b8 radv: Add timelines with a VK_KHR_timeline_semaphore impl.
This does not fully do wait-before-submit, to be done in a follow
up patch.

For kernels without support for timeline syncobjs, this adds an
implementation of non-shareable timelines using legacy syncobjs.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
2117c53b72 radv: Add temporary datastructure for submissions.
So we can defer them.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
c3eae659e7 radv: Split semaphore into two parts as enum+union.
This is in preparation for adding more types.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
84d9551b23 radv: Always enable syncobj when supported for all fences/semaphores.
This simplifies code for timeline semaphores by needing to support
fewer configurations.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
45f4a639a8 radv: Improve fence signalling in QueueSubmit.
Only signalling it once.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
a9c8424e08 radv: Do sparse binding in queue submission.
So we have one place to do queue things if we end up deferring
submissions.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
915e9178fa radv: Split out commandbuffer submission.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
43ba44357c radv: Clean up unused variable.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
2e3a635ee6 radv: Add an early exit in the secure compile if we already have the cache entries.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-30 11:38:50 +01:00
Bas Nieuwenhuizen
d78809632f radv: Compute hashes in secure process for secure compilation.
To prevent poisoning arbitrary cache entries.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-30 11:37:41 +01:00
Erik Faye-Lund
4c4ac2d4d5 zink: drop nop descriptor-updates
If there's nothing to be done, let's actually do nothing. Seems like a
good idea.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-10-30 10:29:23 +00:00
Erik Faye-Lund
b222f28357 zink: use bitfield for dirty flagging
Bitfields are a bit more idiomatic than explicit flags, and harder to
get wrong.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-10-30 10:29:23 +00:00
Erik Faye-Lund
6d30abb4f1 zink: use dynamic state for line-width
This will lead to fewer pipelines in the cache, which is assumed to
become our most unavoidable performance bottleneck down the line.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-10-30 10:29:23 +00:00
Duncan Hopkins
d2bb63c8d4 zink: Use optimal layout instead of general. Reduces validation layer warnings. Fixes RADV image noise.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-10-30 09:09:49 +00:00
Michel Dänzer
aaf1b09270 gitlab-ci: Disable meson-windows job for the time being
It needs a CI runner carrying the mesa-windows tag, but there's none
available currently.
2019-10-30 09:38:20 +01:00
Timothy Arceri
cf25664686 radv: make use of radv_sc_read()
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 04:49:58 +00:00
Timothy Arceri
28fff3efbc radv: add radv_sc_read() helper
This is a function with timeout support for reading from the pipe
between processes used for secure compile.

Initially we hardcode the timeout to 5 seconds. We can adjust the
timeout limit in future if needed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 04:49:58 +00:00
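Illustration for 28fff3efbc and 23a6827e4d: a minimal sketch of reading from a pipe with a timeout by calling select() before each read(). This is not the actual radv_sc_read() code; the function name and parameters are hypothetical, and the timeout here restarts for every chunk rather than bounding the whole read.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly `size` bytes from `fd`, waiting at most `timeout_sec`
 * seconds for each chunk. Returns false on timeout, error, or EOF. */
static bool read_with_timeout(int fd, void *buf, size_t size, int timeout_sec)
{
   char *p = buf;

   assert(fd >= 0 && fd < FD_SETSIZE);

   while (size > 0) {
      fd_set fds;
      struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };

      FD_ZERO(&fds);
      FD_SET(fd, &fds);

      int ready = select(fd + 1, &fds, NULL, NULL, &tv);
      if (ready < 0 && errno == EINTR)
         continue; /* interrupted by a signal: wait again */
      if (ready <= 0)
         return false; /* select() error or timeout */

      ssize_t n = read(fd, p, size);
      if (n < 0 && errno == EINTR)
         continue;
      if (n <= 0)
         return false; /* read() error or writer closed the pipe */

      p += n;
      size -= (size_t)n;
   }

   return true;
}
```

With the 5-second limit mentioned in the message, a caller would pass 5 as timeout_sec; a stalled writer then makes the call fail instead of blocking forever.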
Timothy Arceri
23a6827e4d radv: allow select() calls in secure compile
This will be used in the following patch to support timeouts for
reading the pipe between processes.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 04:49:58 +00:00
Lepton Wu
1abf05764b mapi: Improve the x86 tsd stubs performance.
This skips touching %ebx most times and it shows that glGetString performance
increased from 114M/s to 120M/s on my desktop.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2019-10-29 20:50:05 -07:00
Lepton Wu
41407d5e9f mapi: Inline call x86_current_tls.
This saves one return and a simple benchmark which calls glGetString
repeatedly on my desktop shows it improves calls per second from 123M
to 141M.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1997
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2019-10-29 17:18:06 -07:00