Commit Graph

106697 Commits

Author SHA1 Message Date
Roland Scheidegger
6f4083143b gallivm: use llvm jit code for decoding s3tc
This is (much) faster than using the util fallback.
(Note that there's two methods here, one would use a cache, similar to
the existing code (although the cache was disabled), except the block
decode is done with jit code, the other directly decodes the required
pixels. For now don't use the cache (being direct-mapped is suboptimal,
but it's difficult to come up with something better which doesn't have
too much overhead.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-12-20 06:03:20 +01:00
Jason Ekstrand
ec1d5841fa radv/query: Use 1-bit booleans in query shaders
Fixes: 44227453ec "nir: Switch to using 1-bit Booleans for almost..."
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-19 16:36:40 -06:00
Jason Ekstrand
6896c91c10 radv/query: Add a nir_test_flag helper
This is little more than an iadd_imm right now but it will help in the
next commit where we refactor things further.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-19 16:36:26 -06:00
Eduardo Lima Mitev
c2ebc38052 freedreno/ir3: Handle GL_NONE in get_num_components_for_glformat()
An earlier patch that introduced the function failed to handle the case
where an image format layout qualifier is not specified, which is allowed
on desktop GL profiles. In these cases, nir_variable's image format is
GL_NONE, and we don't need to print a debug message for those.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2018-12-19 22:49:05 +01:00
Eric Anholt
90818558f0 docs: Add an encouraging note about providing reviews and acks.
Across several projects I've seen new contributors say "I wasn't sure if I
should provide a review tag since I'm not really an expert in this area."
Everyone I know already applies some implicit weighting to reviews from
different people, so encourage participation.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-19 12:49:17 -08:00
Eric Anholt
463df0ffe2 docs: Add a note that MRs should still include any r-b or a-b tags.
v2: Mention "Tested-by" too

Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-12-19 12:48:13 -08:00
Eric Anholt
fcfb7f573c v3d: Load and store aligned utiles all at once.
This calls the expensive uif offset function once per utile, but it still
gets us a 212.218% +/- 2.41216% (n=10) win on 1024x1024 glTexImage over
calling it on each pixel.
2018-12-19 10:27:26 -08:00
Eric Anholt
7c56b7a6ea v3d: Add a fallthrough path for utile load/store of 32 byte lines.
Now that V3D has 8 byte per pixel formats exposed, we've got stride==32
utiles to load and store.  Just handle them through the non-NEON paths for
now.
2018-12-19 10:27:26 -08:00
Eric Anholt
f6a0f4f41e vc4: Move the utile load/store functions to a header for reuse by v3d.
These implementations of whole-utile load/stores would be the same for
v3d, though the layouts of blocks of utiles has changed.
2018-12-19 10:27:26 -08:00
Eric Anholt
8ee752194c v3d: Implement texture_subdata to reduce teximage upload copies.
This lets us store the non-PBO glTexImage data directly into the tiled
image without making an extra untiled memcpy for the gallium transfer.
Improves 1024x1024 TexImage perf by ~19%, mostly from not thrashing around
in the kernel mapping and unmapping the transfer's temporary area.
2018-12-19 10:27:26 -08:00
Eric Anholt
e09d8aecb4 v3d: Remove dead prototypes for load/store utile functions. 2018-12-19 10:27:26 -08:00
Eric Anholt
fcf881adda v3d: Don't try to create shadow tiled temporaries for 1D textures.
They're raster order anyway, so we'd assertion fail along with wasting
bandwidth.

Fixes: 6ad9e8690d ("v3d: Add support for texturing from linear.")
2018-12-19 10:27:21 -08:00
Eric Anholt
b5adc744ba v3d: Fix check for TFU job completion in the simulator.
We're waiting for the jobs-completed count to increment (with wrapping),
not to reach its starting state.  This mostly ended up working out because
the next v3d_hw_tick() for a submit CL would end up doing the TFU
operation first, but it did fail when a blit was used for glReadPixels()
at the end of a test.

Fixes: ee0549ff9a ("v3d: Add the V3D TFU submit interface to the simulator.")
2018-12-19 10:26:04 -08:00
Eric Anholt
365728dc5d v3d: Put the dst bo first in the list of BOs for TFU calls.
In the UAPI, the first BO is the destination, and the one the kernel
should do an exclusive reservation on.  Currently we only do exclusive
reservations, anyway.  However, in the simulator path I was only copying
back the "destination" BO (actually src in this case), and this caused
regressions once I fixed the simulator to actually complete TFU before
returning (since otherwise, the TFU op would happen at the start of the
next CL submit and the draw would get the right contents).

Fixes: 976ea90bdc ("v3d: Add support for using the TFU to do some blits.")
2018-12-19 10:26:04 -08:00
Caio Marcelo de Oliveira Filho
947f7b452a nir: properly find the entry to keep in copy_prop_vars
When copy propagation handles a store/copy, it iterates the current
copy entries to remove aliases, but keeps the "equal" entry (if
exists) to be updated.

The removal step may swap the entries around (to ensure there are no
holes), invalidating previous iteration pointers.  The bug was saving
such pointer to use later.  Change the code to first perform the
removals and then find the remaining right entry.

This was causing updates to be lost since they were being made to an
entry that was not part of the current copies.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108624
Fixes: b3c6146925 "nir: Copy propagation between blocks"
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-19 09:33:36 -08:00
Michel Dänzer
9d8395bf0e winsys/amdgpu: Pull in LLVM CFLAGS
Fixes build failure if the LLVM headers aren't in a standard include
directory.

Fixes: ec22dd34c8 "radeonsi: move SI_FORCE_FAMILY functionality to
                     winsys"
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-12-19 17:54:18 +01:00
Caio Marcelo de Oliveira Filho
0ddc911f4d nir: properly clear the entry sources in copy_prop_vars
When updating a copy entry source value from a "non-SSA" (the data
come from a copy instruction) to a "SSA" (the data or parts of it come
from SSA values), it was possible to hold invalid data in ssa[0]
depending on the writemask.  Because the union, ssa[0] could contain a
pointer to a nir_deref_instr left-over from previous non-SSA usage.

Change code to clean up the array before use to avoid invalid data
around.

Fixes: 62332d139c "nir: Add a local variable-based copy propagation pass"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-19 08:35:48 -08:00
Eric Engestrom
0e4c7c3d5b docs: format code blocks a bit nicely
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2018-12-19 16:32:30 +00:00
Eric Engestrom
b0319d0768 docs: add meson cross compilation instructions
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-12-19 16:31:51 +00:00
Gurchetan Singh
b45aa6290b virgl: move resource creation / import / destruction to common code
We can remove some duplicated code.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
1d3d311133 virgl: move resource metadata into base resource
A resource is just a buffer with some metadata.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
db77573d7b virgl: modify how we handle GL_MAP_FLUSH_EXPLICIT_BIT
Previously, we ignored the the glUnmap(..) operation and
flushed before we flush the cbuf.  Now, let's just flush
the data when we unmap.

Neither method is optimal, for example:

glMapBufferRange(.., 0, 100, GL_MAP_FLUSH_EXPLICIT_BIT)
glFlushMappedBufferRange(.., 25, 30)
glFlushMappedBufferRange(.., 65, 70)

We'll end up flushing 25 --> 70.  Maybe we can fix this later.

v2: Add fixme comment in the code (Elie)

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
11939f6fa2 virgl: make virgl_buffers use resource helpers
We can reuse the helpers we created.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
4e2c77cd51 virgl: make transfer code with PIPE_BUFFER targets
util_format_get_blocksize returns 1 for R8 formats (all
PIPE_BUFFERs are R8).

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
174f530008 virgl: consolidate transfer code
We could allocate and destroy transfers in one place.

v2: Keep l_stride around.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
13626b46f1 virgl: store layer_stride in metadata
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
2a44acc83b virgl: move vrend_get_tex_image_offset to common code
Will be reused.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
f749229a8e virgl: move virgl_resource_layout to common code
Will be reused.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
a63da9c062 virgl: move texture metadata to common code
Will be reused.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
6e7d396ad3 virgl: remove unnessecary code
With commit 89b479, we moved to tracking buffer cleanliness
when binding.

TEST=dEQP-GLES31.functional.image_load_store.buffer.load_store.r32ui

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Gurchetan Singh
6d13d1aadb virgl: texture_transfer_pool --> transfer_pool
It's used for all types of resources.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-12-19 13:29:16 +01:00
Nicolai Hähnle
d73a25f2c0 radeonsi: const-ify the si_query_ops
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:02:07 +01:00
Nicolai Hähnle
c85b0dea0a radeonsi: split perfcounter queries from si_query_hw
Remove a level of indirection to make the code more explicit -- should
make it easier to follow what's going on.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:02:04 +01:00
Nicolai Hähnle
e0f0d3675d radeonsi: factor si_query_buffer logic out of si_query_hw
This is a move towards using composition instead of inheritance for
different query types.

This change weakens out-of-memory error reporting somewhat, though this
should be acceptable since we didn't consistently report such errors in
the first place.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:02:01 +01:00
Nicolai Hähnle
0fc6e573dd radeonsi: move query suspend logic into the top-level si_query struct
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:59 +01:00
Nicolai Hähnle
e2b9329f17 radeonsi: move remaining perfcounter code into si_perfcounter.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:57 +01:00
Nicolai Hähnle
7dd289d9e4 radeonsi: track constant buffer bind history in si_pipe_set_constant_buffer
Other callers of si_set_constant_buffer don't need it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:54 +01:00
Nicolai Hähnle
829d417914 radeonsi: use si_set_rw_shader_buffer for setting streamout buffers
Reduce the number of places that encode buffer descriptors.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:52 +01:00
Nicolai Hähnle
ce785f5ffd radeonsi: add an si_set_rw_shader_buffer convenience function
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:50 +01:00
Nicolai Hähnle
556c4c42b7 radeonsi: avoid using hard-coded SI_NUM_RW_BUFFERS
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:48 +01:00
Nicolai Hähnle
1e49d72317 radeonsi: show the fixed function TCS in debug dumps
This is rather important for merged VS/TCS as LSHS shaders...

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:45 +01:00
Nicolai Hähnle
6e67e79de4 radeonsi: const-ify si_set_tesseval_regs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:42 +01:00
Nicolai Hähnle
5c841a1b1e radeonsi: rename SI_RESOURCE_FLAG_FORCE_TILING to clarify its purpose
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:39 +01:00
Nicolai Hähnle
0d58dcc3cf radeonsi: don't set RAW_WAIT for CP DMA clears
There is never a read-after-write hazard because the command doesn't read.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:34 +01:00
Nicolai Hähnle
23af72af25 radeonsi/gfx9: use SET_UCONFIG_REG_INDEX packets when available
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:32 +01:00
Nicolai Hähnle
f18b2ac0db radeonsi: add si_init_draw_functions and make some functions static
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:30 +01:00
Nicolai Hähnle
555cb668cc radeonsi: extract declare_vs_blit_inputs
Prepare for some later refactoring.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:27 +01:00
Nicolai Hähnle
ec22dd34c8 radeonsi: move SI_FORCE_FAMILY functionality to winsys
This helps some debugging cases by initializing addrlib with
slightly more appropriate settings.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:25 +01:00
Nicolai Hähnle
0ef263d62f ac/surface: 3D and cube surfaces are never displayable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:22 +01:00
Nicolai Hähnle
8efaffa893 amd/common: add i1 special case to ac_build_{inclusive,exclusive}_scan
Allow for a unified but efficient treatment of adding a bitmask over a
wave or an entire threadgroup.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:19 +01:00