Commit Graph

106973 Commits

Author SHA1 Message Date
Boris Brezillon
ada752afe4 panfrost: Extend the panfrost_batch_add_bo() API to pass access flags
The type of access being done on a BO has impacts on job scheduling
(shared resources being written enforce serialization while those
being read only allow for job parallelization) and BO lifetime (the
fragment job might last longer than the vertex/tiler ones, if we can,
it's good to release BOs earlier so that others can re-use them
through the BO re-use cache).

Let's pass extra access flags to panfrost_batch_add_bo() and
panfrost_batch_create_bo() so the batch submission logic can take the
appropriate when submitting batches. Note that this information is not
used yet, we're just patching callers to pass the correct flags here.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Boris Brezillon
12f790f7da panfrost: Add the shader BO to the batch in patch_shader_state()
We know a shader will be used by a batch when
panfrost_patch_shader_state() is called, so let's add the shader BO at
that time.

Suggested-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 16:55:38 -04:00
Andres Gomez
02c265be9d egl: Remove the 565 pbuffer-only EGL config under X11.
The CTS finally has agreed to drop the requirement for a
565-no-depth-no-stencil config for ES 3.0. Hence we can now remove the
code to satisfy this requirement using a pbuffer-only visual with
whatever other buffers the driver happens to have given us.

This reverts commit 82607f8a90,
commit 6ad31c4ff3 and
commit dacb11a585.

v2:
  - Reference the VK-GL-CTS issue (Eric E.).

v3:
  - Don't revert
    fc21394bc4 ("egl: Quiet warning about front buffer rendering for pixmaps/pbuffers")
    (Kenneth).

References: VK-GL-CTS issue 1601.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-03 23:51:46 +03:00
Rafael Antognolli
cdc331c6f9 anv/block_pool: Align anv_block_pool state to 64 bits.
On 64 bits platforms, some atomic operations like __sync_fetch_and_add()
have constant time, but on 32 bits platforms they are implemented with a
loop and might take much longer.

Additionally, it seems like if their operands are not aligned to 64
bits, they also require extra memory accesses. From the Intel
Architecture's Developer Manual Vol. 1, 4.1.1:

 "A word or doubleword operand that crosses a 4-byte boundary or a
 quadword operand that crosses an 8-byte boundary is considered
 unaligned and requires two separate memory bus cycles for access."

Forcing the u64 field to be aligned to 64 bits seems to make the unit
tests that are stressing this finish much faster.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-03 12:40:33 -07:00
Erik Faye-Lund
0103d4747a loader/dri3: do not blit outside old/new buffers
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-03 18:58:34 +00:00
Anuj Phogat
0d60621101 intel/isl/icl: Use halign 8 instead of 4 hw workaround
v1 by Topi Pohjolainen
v2,v3 by Anuj Phogat:
- Apply for gen >= 11
- Remove wa_bug_xxx function
- Use helper functions

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-03 17:18:41 +00:00
Samuel Pitoiset
d861401554 ac/nir: remove unused code for nir_op_{fmod,frem}
RADV and RadeonSI both lower these two NIR instructions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-03 18:15:17 +02:00
Samuel Pitoiset
5ebe1a17e9 radv: enable lower_fmod for the LLVM path
This lowers fmod and frem at NIR level like RadeonSI. fmod is
already lowered directly in NIR->LLVM, and frem will be lowered by
LLVM anyways.

This fixes a LLVM crash with:
dEQP-VK.glsl.builtin.precision_fp16_storage32b.frem.compute.scalar.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-03 18:15:14 +02:00
Adam Jackson
1b87f4058d egl/dri2: Don't dlclose() the driver on dri2_load_driver_common failure
... because it's wrong to do so. The error path out of
dri2_initialize_drm ends with dri2_display_destroy, which calls
functions in the vtable we're trying to set up, so if we dlclose the
driver then those function pointers will point off into space and things
crash.

Noticed this because after !1923 eglinfo would crash when setting up the
GBM platform. This was something of a cascade failure, because my kernel
is too old for DRM_IOCTL_I915_GETPARAM to work without DRM_AUTH, so i965
wouldn't load. platform_drm.c then got very confused when it tries to
load swrast as a dri2 driver.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-03 09:39:51 -04:00
Bas Nieuwenhuizen
c837872fba radv: Fix warning in 32-bit build.
uintptr_t is 32 bits in a 32-bits build, resulting in shifting out
of bounds.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-03 13:06:08 +00:00
Bas Nieuwenhuizen
8ad3d8b178 radv: Fix condition for skipping the continue CS.
We need the continue CS for referencing the tess/GDS/sample position BOs.

Fixes: 46e52df34d "radv: add tessellation ring allocation support. (v2)"
Fixes: e1dc3ab753 "radv/gfx10: allocate GDS/OA buffer objects for NGG streamout"
Fixes: 1171b304f3 "radv: overhaul fragment shader sample positions."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-03 13:02:07 +00:00
Gurchetan Singh
a1a5672118 virgl: honor winsys supplied metadata
To truly to do this correctly, we'll have to fix the discrepancy between
drm_virtgpu_3d_transfer_to_host and virtio_gpu_transfer_host_3d. However,
this is a good starting point.

Since virtio-gpu only supports self-import and export, this should be fine.
Let's only do WINSYS_HANDLE_TYPE_FD for this currently.

Reviewed by: Robert Tarasov <tutankhamen@chromium.org>
2019-10-02 17:57:59 -07:00
Gurchetan Singh
9bde8f3a8f virgl: modify internal structures to track winsys-supplied data
The winsys might supply dimensions that are different than
those we calculate.  In additional, it may supply virtualized
modifiers.

In practice, a stride != bpp * width and virtualized modifiers don't
happen yet, but the plan is to move in that direction.

Also make virgl_resource_layout static.

Reviewed by: Robert Tarasov <tutankhamen@chromium.org>
2019-10-02 17:57:53 -07:00
Gurchetan Singh
aad4127c41 virgl: modify resource_create_from_handle(..) callback
This commit makes no functional changes, just adds the revelant
plumbing.

Reviewed by: Robert Tarasov <tutankhamen@chromium.org>
2019-10-02 17:57:47 -07:00
Gurchetan Singh
2899bbe37a virgl: remove stride from virgl_hw_res
It's not used anywhere, and stride isn't really an intrinsic
property of a GEM buffer.

Reviewed by: Robert Tarasov <tutankhamen@chromium.org>
2019-10-02 17:57:40 -07:00
Lionel Landwerlin
1c6fdbc83c intel: fix topology query
i915 will report ENODEV on generations prior to Haswell because there
is no point in reporting values on those. This is prior any fusing
could happen on parts with identical PCI ids.

This query call was previously only triggered on generations that
support performance queries, which happens to match generation for
which i915 reports topology, but the commit pointed below started
using it on all generations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1860
Cc: <mesa-stable@lists.freedesktop.org>
Fixes: 96e1c945f2 ("i965: Move device info initialization to common code")
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2019-10-02 22:25:44 +00:00
Samuel Pitoiset
a2a68d551c radv/gfx10: fix the ESGS ring size symbol
Random hangs no longer happen, I'm actually not sure if they were
related to this.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 21:50:40 +02:00
Samuel Pitoiset
34be977f80 radv: fix build
Forgot to amend the commit before updating the MR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-02 20:37:43 +02:00
Samuel Pitoiset
4304162744 Revert "radv: disable viewport clamping even if FS doesn't write Z"
This was actually the wrong fix.

This reverts commit 0a313cc285.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 19:40:39 +02:00
Samuel Pitoiset
b8fe6189a9 radv: rework the slow depthstencil clear to write depth from PS
Make sure to export the expected clear values to the depth
stencil attachment.

This fixes dEQP-VK.pipeline.depth_range_unrestricted.* on GFX10.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 19:31:51 +02:00
Samuel Pitoiset
e19d1ee2d1 radv/gfx10: fix NGG streamout with triangle strips for VS
The number of vertices has to be adjusted with the output primitive
type.

This fixes dEQP-VK.transform_feedback.simple.triangle_strip_*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:09:35 +02:00
Samuel Pitoiset
08ab13d340 radv/gfx10: fix storing/loading NGG stream outputs for GS
The GS outputs are stored differently in the LDS storage, they
are indexed by out_idx which is incremented for each stored DWORD.
Thus, we need a different path for exporting the stream outputs.

This fixes a bunch of CTS failures when NGG GS is force enabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:09:32 +02:00
Samuel Pitoiset
3be21b5ab1 radv/gfx10: use the component mask when storing/loading NGG stream outputs
It's unnecessary to store/load more components that needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:09:30 +02:00
Samuel Pitoiset
60f8224171 radv/gfx10: fix storing/loading NGG stream outputs for VS and TES
The LDS storage allocated for stream outputs is 4 * N, where N
is the number of outputs. So, we have to store/load with N as index
and not with the output location as index.

This doesn't fix anything known but it should fix out-of-bounds
access and it also reduces the number of outputs written to the
LDS storage.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:09:27 +02:00
Samuel Pitoiset
56e1b1ff0c radv/gfx10: add missing counter buffer to the BO list
The buffer isn't necessarily used before.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:09:25 +02:00
Samuel Pitoiset
683c5e27c7 radv/gfx10: add radv_device::use_ngg
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:06:01 +02:00
Gert Wollny
c5da8230de etnaviv: enable triangle strips only when the hardware supports it
Some hardware has a bug with triangle strips and it is signalled by the
flag BUG_FIXED8 whether this bug has been fixed. So only enable triangle
strips when this flag is set.

Thanks: Jonathan Marek and Christian Gmeiner for the pointers

v2: Add TODO to indicate that the handling should be refined
    (Jonathan & Christian)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-10-02 07:34:36 +00:00
Dylan Baker
d855e19b87 meson: remove -DGALLIUM_SOFTPIPE from st/osmesa
It's unused here, and undefined in scons. It is used in targets/osmesa,
but it's properly defined there already.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-01 12:34:27 -07:00
Lionel Landwerlin
2208d79dde mesa: don't forget to clear _Layer field on texture unit
On the Android Antutu benchmark we ran into an assert in ISL where the
(base layer + num layers) > total layers. It turns out the core of
mesa forgot to clear the _Layer variable, potentially leaving an
inconsistent value.

v2: Pull setting u->_Layer out of the conditional blocks (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-01 21:49:13 +03:00
Robin Murphy
563f8974d8 egl/gbm: Fix config validation
In converting to shift/size-based validation, we lost a condition from
the ARGB/XRGB equivalence check, which left it working one way round
but not the other, and broke applications like glmark2-es2-drm on some
platforms. Restore the equivalent check that *both* configs actually
have an alpha channel before considering a mismatch.

Fixes: 7b4ed2b513 ("egl: Convert configs to use shifts and sizes instead of masks")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-01 14:45:15 +01:00
Ken Mays
4943c89d6d haiku: fix Mesa build
1. The hgl.c file is a read-only file versus read-write.
Ref: src/gallium/state_trackers/hgl/hgl.c

2.  I've included the Haiku-specific patches I used to get a successful
build of Mesa 19.1.7 on Haiku using the meson/ninja build procedure.
Shows "[764/764] linking target ... libswpipe.so" at build completion.

v2:
Remove autotools files (Eric)

v3:
Update the patch

Reported-by: Ken Mays <kmays2000@gmail.com>
Tested-by: Ken Mays <kmays2000@gmail.com>
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Alexander von Gluck IV <kallisti5@unixzen.com>
2019-10-01 10:31:02 +00:00
Kevin Strasser
641320ce02 egl: Fix implicit declaration of ffs
Found when building for Android in C99 mode. Include bitscan.h to ensure ffs is
available.

Fixes: 7b4ed2b5 ("egl: Convert configs to use shifts and sizes instead of masks")

Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-30 14:33:43 -07:00
Rafael Antognolli
b9994cb8d5 intel/tools: Fix aubinator usage of rb_tree.
The order of comparison has changed, so we need to invert the logic of
"insert_left" when using rb_tree_insert_at().

Fixes: dae33052db (util/rb_tree: Reverse the order of comparison
                    functions).
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-09-30 13:43:23 -07:00
Caio Marcelo de Oliveira Filho
54f1de1c5c i965: Enable EXT_demote_to_helper_invocation
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho
a3776df7b1 iris: Enable EXT_demote_to_helper_invocation
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho
008de52305 gallium: Add PIPE_CAP_DEMOTE_TO_HELPER_INVOCATION
To enable EXT_demote_to_helper_invocation:

    This extension adds a "demote" keyword that is similar to "discard" but
    only suppresses subsequent writes and outputs to the framebuffer, and
    does not terminate the execution of the invocation. For the remainder
    of the execution, the invocation is "demoted" to act like a helper
    invocation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho
61fa4b5707 glsl: Add helperInvocationEXT() builtin
From EXT_demote_to_helper_invocation, implemented with the existing
nir_intrinsic_is_helper_invocation.

Such builtin is necessary when using `demote` because we can't
redefine the value of gl_HelperInvocation (since it is an input
variable).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho
3439956377 glsl: Parse demote statement
When the EXT_demote_to_helper_invocation extension is enabled,
`demote` is treated as a keyword, and produces an ir_demote.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho
af1a6f0f77 glsl: Add ir_demote
To represent the new `demote` keyword when using
EXT_demote_to_helper_invocation extension.  Most of the changes are to
include it in the visitors.

Demote is not considered a control flow, so also include an empty
visit member function in ir_control_flow_visitor.

Only NIR actually supports `demote`, so assert the translations for
TGSI and Mesa's gl_program -- since the demote is not expected to
appear for those.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho
c81b912eb7 mesa: Extension boilerplate for EXT_demote_to_helper_invocation
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-30 12:44:30 -07:00
Kenneth Graunke
309924c3c9 iris: Fix iris_rebind_buffer() for VBOs with non-zero offsets.
We can't just check for the BO base address, we need to check for the
full address including any offset we may have applied.  When updating
the address, we need to include the offset again.

Fixes: 5ad0c88dbe ("iris: Replace buffer backing storage and rebind to update addresses.")
2019-09-30 12:41:03 -07:00
Marek Olšák
a1545af079 ac/nir: fix GLSL imageSamples()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-30 14:21:42 -04:00
Marek Olšák
0cc233e3dc ac: add ac_build_image_get_sample_count from radeonsi
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-30 14:21:42 -04:00
Marek Olšák
39e638c14e ac/surface: don't allocate FMASK if there is no graphics
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-30 14:21:42 -04:00
Marek Olšák
f704fb7f0b tgsi_to_nir: handle PIPE_FORMAT_NONE in image opcodes
radeonsi doesn't use the format and internal shaders don't set it.

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-09-30 14:20:48 -04:00
Dylan Baker
3b265f61f5 meson: gallium media state trackers require libdrm with x11
v2: - update copyright year in all changed files
    - rebase on master

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-30 18:06:56 +00:00
Kenneth Graunke
a0a93763fb iris: Disable CCS_E for 32-bit floating point textures.
A while back, Michael Larabel noticed that Paraview's Wavelet Volume
case runs significantly slower on iris than i965.  It turns out this
is because we enable CCS_E for 32-bit floating point formats, while
i965 disables it, with an oblique comment saying that we benchmarked
it (on what exactly?) and determined that it was a loss.

Paraview uses both R32_FLOAT and R32G32B32A32_FLOAT, and I observed
large framerate drops when enabling CCS_E for either format.  However,
several other benchmarks (Aztec Ruins, many Synmark cases) use 16-bit
floating point formats, with no apparent ill effects.

So, disable compression for 32-bit float formats for now, but leave it
enabled for 16-bit float formats as they seem to be working fine.

Improves performance in Paraview's Wavelet Volume test by 62% on a
Skylake GT4e.

Fixes: 3cfc6a207b ("iris: Fill out res->aux.possible_usages")
2019-09-30 10:44:52 -07:00
Marek Olšák
4a0d2e2880 ac: reorder and print all radeon_info fields
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-30 13:36:21 -04:00
Marek Olšák
e8b1538587 ac: set the number of SDPs same as the number of TCCs
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-30 13:36:21 -04:00
Marek Olšák
b7c2f7c5a6 ac: fix num_good_cu_per_sh for harvested chips
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-30 13:36:20 -04:00