98495 Commits

Author SHA1 Message Date
Samuel Iglesias Gonsálvez
ba4bb0838b anv: fix bug when using component qualifier in FS outputs
We can write to the same output but in different components, like
in this example:

layout(location = 0, component = 0) out ivec2 dEQP_FragColor_0;
layout(location = 0, component = 2) out ivec2 dEQP_FragColor_1;

Therefore, they are not two different outputs but only one.

Fixes:

dEQP-VK.glsl.440.linkage.varying.component.frag_out.*

v3:
- Remove FRAG_RESULT_MAX.
- Add const and use sizeof (Ian).
- Do three-pass to set properly the locations of fragment
  outputs when having arrays (Jason).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-12 07:24:55 +01:00
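Sketch for ba4bb0838b: one way to see why the two declarations in the message above form a single output is to compute a per-location component mask. The struct and function names below are hypothetical and are not taken from the anv source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one fragment shader output variable. */
struct frag_out {
    unsigned location;       /* layout(location = N) */
    unsigned component;      /* layout(component = N), 0..3 */
    unsigned num_components; /* e.g. 2 for ivec2 */
};

/* Returns a 4-bit mask of the components written at `location`. */
unsigned location_write_mask(const struct frag_out *outs, size_t count,
                             unsigned location)
{
    unsigned mask = 0;
    for (size_t i = 0; i < count; i++) {
        assert(outs[i].num_components >= 1);
        assert(outs[i].component + outs[i].num_components <= 4);
        if (outs[i].location == location) {
            unsigned bits = (1u << outs[i].num_components) - 1u;
            mask |= bits << outs[i].component;
        }
    }
    return mask;
}
```

For the two ivec2 outputs in the example (components 0 and 2 of location 0), the mask for location 0 covers all four components, so they can be treated as one output rather than two.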
Ilia Mirkin
0332c7484b st/mesa: swizzle argument when there's a vector size mismatch
GLSL IR operation arguments can sometimes have an implicit swizzle as a
result of a vector arg and a scalar arg, where the scalar argument is
implicitly expanded to the size of the vector argument.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103955
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-12-11 23:08:43 -05:00
Roland Scheidegger
84c363fb09 gallivm: fix texture wrapping for texture gather for mirror modes
Care must be taken that all coords end up correct, the tests are very
sensitive that everything is correctly rounded. This doesn't matter
for bilinear filter (since picking a wrong texel with weight zero is
ok), and we could also switch the per-sample coords mistakenly.
While here, also optimize the coord_mirror helper a bit (we can do the
mirroring directly by exploiting float rounding, no need for fixing up
odd/even manually).
I did not touch the mirror_clamp and mirror_clamp_to_border modes.
In contrast to mirror_clamp_to_edge and mirror_repeat these are legacy
modes. They are specified against old gl rules, which actually does
the mirroring not per sample (so you get swapped order if the coord
is in the mirrored section). I think the idea though is that they should
follow the respecified mirror_clamp_to_edge rules so the order would be
correct.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-12-12 04:23:02 +01:00
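Sketch for 84c363fb09: a minimal illustration of mirroring a coordinate through rounding instead of fixing up odd/even periods by hand. This is an assumption about the general technique, not the gallivm code; both function names are hypothetical.

```c
#include <assert.h>

/* Round to the nearest integer (ties away from zero). Hypothetical helper,
 * only valid while the value fits comfortably in an int. */
float round_nearest(float x)
{
    assert(x > -1.0e6f && x < 1.0e6f);
    return (float)(int)(x + (x >= 0.0f ? 0.5f : -0.5f));
}

/* Mirror-repeat a normalized coordinate into [0, 1]:
 * subtract the nearest even integer, then take the absolute value. */
float mirror_repeat(float coord)
{
    float d = coord - 2.0f * round_nearest(coord * 0.5f);
    return d < 0.0f ? -d : d;
}
```

Subtracting the nearest even integer folds every period of length 2 onto [-1, 1], and the absolute value then produces the mirrored half without a separate odd/even test.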
Jason Ekstrand
24f019fd69 spirv: Allow ignoring decorations for workgroup variables
Since we switched over to lowering SLM access directly in SPIR-V -> NIR,
we no longer have vtn_variables for SLM.  It's all safe as with UBOs and
SSBOs but we need to let it through in the assert.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104213
Fixes: 8761a04d0d
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-11 19:02:47 -08:00
Jason Ekstrand
2bc9123c33 spirv: Set lengths on scalar and vector types
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-11 19:02:47 -08:00
Bas Nieuwenhuizen
3342a432fa ac/nir: Support vulkan_resource_reindex.
Fixes: 93b4cb61eb "spirv: Allow OpPtrAccessChain for block indices"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-12 00:16:18 +01:00
Bas Nieuwenhuizen
368f49b284 ac/nir: Don't load the descriptor in vulkan_resource_index.
To support the reindex intrinsic, we need the result to be
something on which we can adjust the index/address.

Since it is all within a basic block, the compiler should be
able to merge any extra loads.

v2: Change visit_get_buffer_size too.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-12 00:16:18 +01:00
Marek Olšák
bf0904e31f winsys/amdgpu: disable local BOs again due to worse performance
Cc: 17.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-11 19:11:14 +01:00
Marek Olšák
8a821fa91c drirc: whitelist glthread for Mount and Blade Warband again
2017-12-11 19:11:12 +01:00
Bas Nieuwenhuizen
6469669beb radv: Don't use local BOs when allocating with export options.
If the app does not plan to put a buffer or image in it
(why? But it is allowed and CTS does it), they do not need to
allocate it with the dedicated allocation struct.

Fixes: a639d40f13 "radv: add support for local bos. (v3)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-10 23:47:23 +01:00
Bas Nieuwenhuizen
b926da241a spirv: Fix loading an entire block at once.
There is no chain, so checking the length ends with a SEGFAULT.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103579
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-10 01:43:26 +01:00
Jason Ekstrand
4c7af87fb9 anv: Enable UBO pushing
Push constants on Intel hardware are significantly more performant than
pull constants.  Since most Vulkan applications don't use push constants
at all, or at least don't use them heavily, we're pulling way
more than we should be.  By enabling pushing chunks of UBOs we can get
rid of a lot of those pulls.

On my SKL GT4e, this improves the performance of Dota 2 and Talos by
around 2.5% and improves Aztec Ruins by around 2%.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 15:43:26 -08:00
Jason Ekstrand
f1ce0b905a i965/fs: Handle !supports_pull_constants and push UBOs properly
In Vulkan, we don't support classic pull constants and everything the
client asks us to push, we push.  However, for pushed UBOs, we still
want to fall back to conventional pulls if we run out of space.
2017-12-08 15:43:25 -08:00
Jason Ekstrand
8d34077182 anv/device: Increase the UBO alignment requirement to 32
Push constants work in terms of 32-byte chunks so if we want to be able
to push UBOs, everything needs to be 32-byte aligned.  Currently, we
only require 16-byte which is too small.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 15:43:25 -08:00
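Sketch for 8d34077182: the arithmetic behind the 32-byte requirement, assuming only what the message states (push constants are handled in 32-byte chunks). The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PUSH_CHUNK_BYTES 32u /* chunk size stated in the commit message */

/* Round `value` up to a power-of-two `alignment`. */
uint32_t align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* A UBO range can only start on a chunk boundary. */
int is_pushable_offset(uint32_t offset)
{
    return offset % PUSH_CHUNK_BYTES == 0;
}

/* Number of 32-byte chunks needed to cover `size` bytes. */
uint32_t push_chunk_count(uint32_t size)
{
    return align_up(size, PUSH_CHUNK_BYTES) / PUSH_CHUNK_BYTES;
}
```

An offset of 16 satisfies the old 16-byte requirement but does not start on a chunk boundary, which is why the alignment has to grow to 32.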
Jason Ekstrand
2f9eb045f3 anv/cmd_buffer: Add support for pushing UBO ranges
In order to do this we have to modify push constant set up to handle
ranges.  We also have to tweak the way we handle dirty bits a bit so
that we re-push whenever a descriptor set changes.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 15:43:25 -08:00
Jason Ekstrand
0c879b62b0 anv/cmd_buffer: Add some stage asserts
There are several places where we look up opcodes in an array of stages.
Assert that we don't end up going out-of-bounds.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 15:43:25 -08:00
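Sketch for 0c879b62b0: the kind of bounds assert the message describes, guarding a per-stage lookup table. The enum, the table, and the opcode values are placeholders invented for illustration; they are not the anv definitions.

```c
#include <assert.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stage enum and placeholder opcode values. */
enum shader_stage { STAGE_VERTEX, STAGE_FRAGMENT, STAGE_COMPUTE };

static const uint32_t stage_opcodes[] = {
    [STAGE_VERTEX]   = 0x10,
    [STAGE_FRAGMENT] = 0x20,
    [STAGE_COMPUTE]  = 0x30,
};

/* Look up the opcode for a stage, asserting the index is in range. */
uint32_t opcode_for_stage(unsigned stage)
{
    assert(stage < ARRAY_SIZE(stage_opcodes));
    return stage_opcodes[stage];
}
```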
Jason Ekstrand
1968cd07a2 anv/cmd_buffer: Add some helpers for working with descriptor sets
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 15:43:25 -08:00
Jason Ekstrand
1bce04deb8 anv/pipeline: Translate vulkan_resource_index to a constant when possible
We want to call brw_nir_analyze_ubo_ranges immediately after
anv_nir_apply_pipeline_layout and it badly wants constants.  We could
run an optimization step and let constant folding do it but that's way
more expensive than needed.  It's really easy to just handle constants
in apply_pipeline_layout.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 15:43:25 -08:00
Jason Ekstrand
3b34ed79f1 i965/fs: Rewrite assign_constant_locations
This rewires the logic for assigning uniform locations to work in terms
of "complex alignments".  The basic idea is that, as we walk the list of
instructions, we keep track of the alignment and continuity requirements
of each slot and assert that the alignments all match up.  We then use
those alignments in the compaction stage to ensure that everything gets
placed at a properly aligned register.  The old mechanism handled
alignments by special-casing each of the bit sizes and placing 64-bit
values first followed by 32-bit values.

The old scheme had the advantage of never leaving a hole since all the
64-bit values could be tightly packed and so could the 32-bit values.
However, the new scheme has no type size special cases so it handles not
only 32 and 64-bit types but should gracefully extend to 16 and 8-bit
types as the need arises.

Tested-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-12-08 15:43:25 -08:00
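Sketch for 3b34ed79f1: a simplified view of alignment-driven placement with no per-bit-size special cases. It only illustrates the trade-off described in the message (holes can appear) and is not the i965 implementation; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct uniform_slot {
    uint32_t size;   /* bytes */
    uint32_t align;  /* bytes, power of two */
    uint32_t offset; /* filled in by place_slots() */
};

/* Place each slot at the next offset that satisfies its own alignment.
 * Returns the total number of bytes used, including any holes. */
uint32_t place_slots(struct uniform_slot *slots, size_t count)
{
    uint32_t cursor = 0;
    for (size_t i = 0; i < count; i++) {
        uint32_t a = slots[i].align;
        assert(a != 0 && (a & (a - 1)) == 0);
        cursor = (cursor + a - 1) & ~(a - 1);
        slots[i].offset = cursor;
        cursor += slots[i].size;
    }
    return cursor;
}
```

With a 4-byte, an 8-byte, and a 2-byte value in that order, the 8-byte value lands at offset 8, leaving bytes 4 to 7 unused. That is the kind of hole the old 64-bit-first ordering avoided, and the price of handling 16-bit and 8-bit sizes with the same code path.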
Jason Ekstrand
597c194487 anv: Disable VK_KHR_16bit_storage
The testing for this extension is currently very poor.  The CTS tests
only test accessing UBOs and SSBOs at dynamic offsets so none of our
constant-offset paths get triggered at all.  Also, there's an assertion
in our handling of nir_intrinsic_load_uniform that offset % 4 == 0 which
is never triggered indicating that nothing ever gets loaded from an
offset which is not a dword.  Both push constants and the constant
offset pull paths are complex enough, we really don't want to ship
without tests.  We'll turn the extension back on once we have decent
tests.
2017-12-08 15:42:55 -08:00
Leo Liu
6d74cb2570 radeon/vce: move destroy command before feedback command
VCE processing of IBs starts from the session and task info at the first
level; other commands are processed subsequently. The task info for destroy
is embedded in the destroy command, so the feedback command is not
properly processed. This causes kernel spin VM fault messages on
Polaris and Vega10 cards when an encode application finishes running.

The fix is also verified on VCE physical mode card.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Christian König <christian.koenig@amd.com>
2017-12-08 12:56:48 -05:00
Ben Crocker
060eb314eb docs/llvmpipe: document ppc64le as alternative architecture to x86.
Power8, Power8NV, and Power9 are supported on an equal footing
with X86.

Cc: "17.2" "17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

[Eric: changed formatting, reworded a bit (with Ben's ack)]
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-08 14:49:00 +00:00
Emil Velikov
bce489a4ed docs/release-calendar: drop 17.3.0 from the table
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-08 13:59:27 +00:00
Emil Velikov
95c9d751ce docs: add news item and link release notes for 17.3.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-12-08 13:58:03 +00:00
Emil Velikov
706986bcc9 docs: add sha256 checksums for 17.3.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 49a612d1580b3316392273a069d20d93967126a8)
2017-12-08 13:54:34 +00:00
Emil Velikov
4124ac51f4 docs: Update 17.3.0 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 8d55da9f579463038f4305ed7d505aa7fffa0f37)
2017-12-08 13:54:33 +00:00
Samuel Pitoiset
572b2bad1d radv: do not print ASM to stderr when dumping shaders
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:24:24 +01:00
Samuel Pitoiset
33b329f769 radv/winsys: implement query_value()
Might be useful to know the VRAM/GTT usage, the number of VRAM
CPU page faults, etc. Nothing is currently using that new
interface, but it's a first step.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:22:35 +01:00
Samuel Pitoiset
c202119286 radv: remove useless check radv_set_dcc_need_cmask_elim_pred()
emit_fast_color_clear() already checks that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:22:03 +01:00
Samuel Pitoiset
d90b7a4c50 radv: remove useless checks in radv_set_{color,depth}_clear_regs()
Already checked by the respective callers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:22:00 +01:00
Samuel Pitoiset
c7c7b00889 radv: only re-emit the index type when it changes
dota2 binds a ton of index buffers but the type is always 16-bit.
Note that we have to invalidate the type when switching from
indexed draws to normal draws.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:21:36 +01:00
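Sketch for c7c7b00889: a minimal model of caching the last emitted index type and invalidating it on non-indexed draws, as the message describes. The state struct, the sentinel value, and the emit counter are assumptions made for illustration, not radv code.

```c
#include <assert.h>

#define INDEX_TYPE_INVALID (-1) /* hypothetical "nothing cached" sentinel */

struct draw_state {
    int      last_index_type; /* last type written to the hardware */
    unsigned emit_count;      /* how many times the type was emitted */
};

/* Indexed draw: emit the index type only if it differs from the cache. */
void draw_indexed(struct draw_state *s, int index_type)
{
    assert(index_type != INDEX_TYPE_INVALID);
    if (s->last_index_type != index_type) {
        s->emit_count++;
        s->last_index_type = index_type;
    }
}

/* Non-indexed draw: the cached type can no longer be trusted. */
void draw_non_indexed(struct draw_state *s)
{
    s->last_index_type = INDEX_TYPE_INVALID;
}
```

Binding many index buffers of the same type then costs a single emit, and the next indexed draw after a normal draw emits again.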
Samuel Pitoiset
a302009b7b radv: only reset command buffers that are not in the initial state
dota2 always calls vkResetCommandBuffer() before
vkBeginCommandBuffer() which is quite useless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:21:23 +01:00
Samuel Pitoiset
a380bc7ecf radv: track different status of a command buffer
RADV_CMD_BUFFER_STATUS_INVALID is not used for now, but I think
it makes sense to declare it. Could be used later with better
command buffer error handling.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:21:21 +01:00
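Sketch for a380bc7ecf and a302009b7b: a small state machine showing how tracking a command buffer's status lets the begin path skip a redundant reset. Only RADV_CMD_BUFFER_STATUS_INVALID is named in the messages; every identifier below is hypothetical.

```c
#include <assert.h>

enum cmd_status {
    CMD_STATUS_INITIAL,
    CMD_STATUS_RECORDING,
    CMD_STATUS_EXECUTABLE,
};

struct cmd_buffer {
    enum cmd_status status;
    unsigned        reset_count; /* counts the expensive reset work */
};

/* Explicit reset requested by the application. */
void cmd_reset(struct cmd_buffer *cb)
{
    cb->reset_count++;
    cb->status = CMD_STATUS_INITIAL;
}

/* Begin recording: reset only if the buffer is not already pristine. */
void cmd_begin(struct cmd_buffer *cb)
{
    if (cb->status != CMD_STATUS_INITIAL)
        cmd_reset(cb);
    cb->status = CMD_STATUS_RECORDING;
}

void cmd_end(struct cmd_buffer *cb)
{
    assert(cb->status == CMD_STATUS_RECORDING);
    cb->status = CMD_STATUS_EXECUTABLE;
}
```

An application that resets explicitly and then begins recording pays for one reset in this model, not two.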
Samuel Pitoiset
fc6c77e162 radv: fix TC-compat HTILE with VK_FORMAT_D32_SFLOAT_S8_UINT on Vega
Copied from RadeonSI.

This fixes all CTS
dEQP-VK.renderpass.dedicated_allocation.formats.d32_sfloat_s8_uint.clear.*

And some other ones which use the same format.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:15:44 +01:00
Jordan Justen
4d81c8e43e docs: Update GL_ARB_get_program_binary docs to support 1 format
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 17:01:02 +11:00
Jordan Justen
b4c37ce214 i965: Add ARB_get_program_binary support using nir_serialization
This resolves an apparent game bug described in 85564. The game
doesn't properly handle ARB_get_program_binary with 0 supported
formats.

V2 (Timothy Arceri):
 - less driver code as more has been moved into the common helpers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85564
Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 17:00:57 +11:00
Jordan Justen
c1ff99fd70 main: Clear shader program data whenever ProgramBinary is called
The GL_ARB_get_program_binary extension spec says:

 "If ProgramBinary fails to load a binary, no error is generated, but
  any information about a previous link or load of that program object
  is lost."

v2:
 * Re-initialize shProg->data after clear. (Jordan)
   (Required after 6a72eba755)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
50c09a648f main: add binary support to ProgramBinary
V2: call generic mesa_program_binary() helper rather than driver
    function directly to allow greater code sharing.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
7ee54ad057 main: add binary support to GetProgramBinary
V2: call generic _mesa_get_program_binary() helper rather than driver
    function directly to allow greater code sharing.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
e30ed18215 main: Support getting GL_PROGRAM_BINARY_LENGTH
V2: call generic _mesa_get_program_binary_length() helper
    rather than driver function directly to allow greater
    code sharing.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
c20fd744fe mesa: Add Mesa ARB_get_program_binary helper functions
V2 (Timothy Arceri):
 - add extra code comment
 - stop passing around void *binary and just pass
   program_binary_header *hdr instead.
 - move to src/mesa/main rather than src/util

V3 (Timothy Arceri):
 - Move more code out of the backend and into the common
   helpers.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 16:59:25 +11:00
Timothy Arceri
90d4abdd87 mesa: add driver callbacks for serialising ProgramBinary blobs
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
64ad804e59 main: Support 1 Mesa format with get for GL_PROGRAM_BINARY_FORMATS
Mesa supports either 0 or 1 formats. If 1 format is supported, it is
GL_PROGRAM_BINARY_FORMAT_MESA as defined in the
GL_MESA_program_binary_formats extension spec.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
fb077d603b main: Allow non-zero NUM_PROGRAM_BINARY_FORMATS
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
2e28494af2 i965: Fix memory leak when serializing nir
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:59:25 +11:00
Jordan Justen
25b3ce6e3b i965: Add brw_program_serialize_nir
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:59:22 +11:00
Jordan Justen
b3f1b765e9 i965: Free serialized nir after deserializing
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:44:35 +11:00
Jordan Justen
cdc7ac23b9 i965: Add brw_program_deserialize_nir
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:44:35 +11:00
Jordan Justen
7cf1037d5a main, glsl: Add UniformDataDefaults which stores uniform defaults
The ARB_get_program_binary extension requires that uniform values in a
program be restored to their initial value just after linking.

This patch saves off the initial values just after linking. When the
program is restored by glProgramBinary, we can use this to copy the
initial value of uniforms into UniformDataSlots.

V2 (Timothy Arceri):
 - Store UniformDataDefaults only when serializing GLSL as this
   is what we want for both disk cache and ARB_get_program_binary.
   This saves us having to come back later and reset the Uniforms
   on program binary restores.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-12-08 16:44:35 +11:00
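Sketch for 7cf1037d5a: the save-then-restore idea from the message, reduced to two plain arrays. The struct is a stand-in; its fields only mirror the roles of UniformDataSlots and UniformDataDefaults and do not reflect Mesa's actual data layout. Function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct program_uniforms {
    size_t    num_slots;
    uint32_t *slots;    /* live values, stands in for UniformDataSlots */
    uint32_t *defaults; /* snapshot, stands in for UniformDataDefaults */
};

/* Snapshot the initial uniform values just after linking.
 * Returns 0 on success, -1 if the allocation fails. */
int save_uniform_defaults(struct program_uniforms *p)
{
    assert(p->num_slots > 0);
    p->defaults = malloc(p->num_slots * sizeof(*p->defaults));
    if (p->defaults == NULL)
        return -1;
    memcpy(p->defaults, p->slots, p->num_slots * sizeof(*p->defaults));
    return 0;
}

/* Copy the snapshot back when the program binary is restored. */
void restore_uniform_defaults(struct program_uniforms *p)
{
    assert(p->defaults != NULL);
    memcpy(p->slots, p->defaults, p->num_slots * sizeof(*p->slots));
}
```

The caller owns the snapshot and frees `defaults` when the program is destroyed.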
Jordan Justen
ebd9e789c4 glsl: Split out shader program serialization
This will allow us to use the program serialization to implement
ARB_get_program_binary.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-12-08 16:44:35 +11:00