Commit Graph

225 Commits

Author SHA1 Message Date
Jason Ekstrand
0291bf4db2 Revert "i965: use nir_lower_indirect_derefs() for GLSL"
This reverts commit 9404439a75.  I didn't
intend to push it and it breaks clip and cull distance.
2016-12-05 15:21:20 -08:00
Timothy Arceri
9404439a75 i965: use nir_lower_indirect_derefs() for GLSL
This moves the nir_lower_indirect_derefs() call into
brw_preprocess_nir() so thats is called by both OpenGL and Vulkan
and removes that call to the old GLSL IR pass
lower_variable_index_to_cond_assign()

We want to do this pass in nir to be able to move loop unrolling
to nir.

There is a increase of 1-3 instructions in a small number of shaders,
and 2 Kerbal Space program shaders that increase by 32 instructions.

Shader-db results BDW:

total instructions in shared programs: 8705873 -> 8706194 (0.00%)
instructions in affected programs: 32515 -> 32836 (0.99%)
helped: 3
HURT: 79

total cycles in shared programs: 74618120 -> 74583476 (-0.05%)
cycles in affected programs: 528104 -> 493460 (-6.56%)
helped: 47
HURT: 37

LOST:   2
GAINED: 0
2016-12-05 14:00:35 -08:00
Jason Ekstrand
347f43c8ec anv: Add an input attachment lowering pass
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-11-22 13:44:55 -08:00
Kenneth Graunke
a9eabd539c anv: Combine ClipDistance and CullDistance arrays.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-11-22 00:29:23 -08:00
Jason Ekstrand
49f08ad77f anv: Handle null in all destructors
This fixes a bunch of new CTS tests which look for exactly this.  Even in
the cases where we just call vk_free to free a CPU data structure, we still
handle NULL explicitly.  This way we're less likely to forget to handle
NULL later should we actually do something less trivial.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-11-16 20:07:23 -08:00
Jason Ekstrand
623e1e06d8 anv/pipeline: Get rid of the kernel pointer fields
Now that we have anv_shader_bin, they're completely redundant with other
information we have in the pipeline.  For vertex shaders, we also go
through way too much work to put the offset in one or the other field and
then look at which one we put it in later.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2016-11-16 10:08:38 -08:00
Jason Ekstrand
71cc1e188d anv/pipeline: Properly cache prog_data::param
Before we were caching the prog data but we weren't doing anything with
brw_stage_prog_data::param so anything with push constants wasn't getting
cached properly.  This commit fixes that.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98012
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
2016-11-02 09:32:28 -07:00
Jason Ekstrand
ff3185e3ba anv/pipeline: Put actual pointers in anv_shader_bin
While we can simply calculate offsets to get to things such as the
prog_data and the key, it's much more user-friendly if there are just
pointers.  Also, it's a bit more fool-proof.

While we're at it, we rework the pipeline cache API to use the
brw_stage_prog_data type directly.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98012
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
2016-11-02 09:32:22 -07:00
Timothy Arceri
91d61fbf7c i965: rewrite brw_setup_vue_interpolation()
Here brw_setup_vue_interpolation() is rewritten not to use the InterpQualifier
array in gl_fragment_program which will allow us to remove it.

This change also makes the code which is only used by gen4/5 more self contained
as it now has its own gen5_fragment_program struct rather than storing the map
in brw_context. This means the interpolation map will only get processed once
and will get stored in the in memory cache rather than being processed everytime
the fs changes.

Also by calling this from the fs compile code rather than from the upload code
and using the interpolation assigned there we can get rid of the
BRW_NEW_INTERPOLATION_MAP flag.

It might not seem ideal to add a gen5_fragment_program struct however by the end
of this series we will have gotten rid of all the brw_{shader_stage}_program
structs and replaced them with a generic brw_program struct so there will only
be two program structs which is better than what we have now.

V2: Don't remove BRW_NEW_INTERPOLATION_MAP from dirty_bit_map until the following
patch to fix build error.

V3 - Suggestions by Jason:
- name struct gen4_fragment_program rather than gen5_fragment_program
- don't use enum with memset()
- create interp mode set helper and simplify logic to call it
- add assert when calling function to show prog will never be NULL for
 gen4/5 i.e. no Vulkan

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-10-26 14:29:36 +11:00
Timothy Arceri
e1af20f18a nir/i965/anv/radv/gallium: make shader info a pointer
When restoring something from shader cache we won't have and don't
want to create a nir_shader this change detaches the two.

There are other advantages such as being able to reuse the
shader info populated by GLSL IR.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-26 14:29:36 +11:00
Dave Airlie
1ae6ece980 anv: move to using vk_alloc helpers.
This moves all the alloc/free in anv to the generic helpers.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-19 09:05:26 +10:00
Jason Ekstrand
8e1a8dd47e anv: Move Create*Pipelines into genX_cmd_buffer.c
Now that we don't have meta, we have no need for a gen-agnostic pipeline
create path.  We can, instead, just generate one Create*Pipelines function
per gen and be done with it.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-14 15:40:39 -07:00
Jason Ekstrand
7df46b7533 anv/pipeline: Remove support for direct-from-nir shaders
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-14 15:40:39 -07:00
Jason Ekstrand
ac77528f7d anv: Get rid of graphics_pipeline_create_info_extra
Now that we no longer have meta, all pipelines get created via the normal
Vulkan pipeline creation mechanics.  There is no more need for this bit of
extra magic data that we've been passing around.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-10-14 15:40:39 -07:00
Jason Ekstrand
dedc406ec8 anv: Get rid of meta
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-10-14 15:40:39 -07:00
Kenneth Graunke
ba38a9d380 anv: Fix anv_pipeline_validate_create_info assertions.
Many of these can be "NULL if the pipeline has rasterization disabled."
Also, we should assert that pMultisampleState exists.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-10-11 22:50:09 -07:00
Jason Ekstrand
8cb144bd93 anv/pipeline: Roll compute_urb_partition into emit_urb_setup
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2016-09-13 12:40:12 -07:00
Jason Ekstrand
20b2f1ecb9 anv/pipeline: Lower indirect outputs when EmitNoIndirectOutput is set
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reported-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-09-06 08:27:23 -07:00
Jason Ekstrand
42d03c204c anv: Refactor pipeline l3 config setup
Now that we're using gen_l3_config.c, we no longer have one set of l3
config functions per gen and we can simplify a bit.  Also, we know that
only compute uses SLM so we don't need to look for it in all of the stages.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-09-03 08:23:07 -07:00
Jason Ekstrand
527f371999 intel: s/brw_device_info/gen_device_info/
Generated by:

sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.c
sed -i -e 's/brw_device_info/gen_device_info/g' src/intel/**/*.h
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.c
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.cpp
sed -i -e 's/brw_device_info/gen_device_info/g' **/i965/*.h

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-09-03 08:23:06 -07:00
Jason Ekstrand
10f9901bce anv: Rework pipeline caching
The original pipeline cache the Kristian wrote was based on a now-false
premise that the shaders can be stored in the pipeline cache.  The Vulkan
1.0 spec explicitly states that the pipeline cache object is transiant and
you are allowed to delete it after using it to create a pipeline with no
ill effects.  As nice as Kristian's design was, it doesn't jive with the
expectation provided by the Vulkan spec.

The new pipeline cache uses reference-counted anv_shader_bin objects that
are backed by a large state pool.  The cache itself is just a hash table
mapping keys hashes to anv_shader_bin objects.  This has the added
advantage of removing one more hand-rolled hash table from mesa.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97476
Acked-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
2016-08-30 15:08:23 -07:00
Jason Ekstrand
d5945bec12 anv/pipeline: Properly handle OOM during shader compilation
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-08-30 15:08:23 -07:00
Jason Ekstrand
4200c2266e anv/pipeline: Fix bind maps for fragment output arrays
Found by inspection.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-08-30 15:08:23 -07:00
Kenneth Graunke
93bfa1d7a2 nir: Change nir_shader_get_entrypoint to return an impl.
Jason suggested adding an assert(function->impl) here.  All callers
of this function actually want ->impl, so I decided just to change
the API.

We also change the nir_lower_io_to_temporaries API here.  All but one
caller passed nir_shader_get_entrypoint(), and with the previous commit,
it now uses a nir_function_impl internally.  Folding this change in
avoids the need to change it and change it back.

v2: Fix one call I missed in ir3_compiler (caught by Eric).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-08-25 19:18:24 -07:00
Jason Ekstrand
2301705dee anv: Include the pipeline layout in the shader hash
The pipeline layout affects shader compilation because it is what
determines binding table locations as well as whether or not a particular
buffer has dynamic offsets.  Since this affects the generated shader, it
needs to be in the hash.  This fixes a bunch of CTS tests now that the CTS
is using a pipeline cache.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-08-24 20:42:05 -07:00
Jason Ekstrand
3c0077a6ec anv/pipeline: Set binding_table.gather_texture_start
This should get texture gather working on gen8+ and mostly working on gen7.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-dev@lists.freedesktop.org>
2016-07-22 16:27:35 -07:00
Jason Ekstrand
1eed753ee8 anv/pipeline: Assert that the number of uniforms from NIR fits 2016-07-13 11:35:24 -07:00
Jason Ekstrand
c2f2c8e407 anv: Use different BOs for different scratch sizes and stages
This solves a race condition where we can end up having different stages
stomp on each other because they're all trying to scratch in the same BO
but they have different views of its layout.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-22 12:39:45 -07:00
Jason Ekstrand
eb6764c4a7 anv: Add proper support for depth clamping
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-20 12:04:08 -07:00
Jason Ekstrand
e6c2fe4519 anv/pipeline: Do invariance propagation on SPIR-V shaders
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-20 12:02:58 -07:00
Chad Versace
c99a0a8bce anv: Fix a harmless overflow warning
anv_pipeline_binding::index is a uint8_t, but some code assigned to it
UINT16_MAX.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewd-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-15 15:34:13 -07:00
Nanley Chery
a4a5917248 anv/pipeline: Don't dereference NULL dynamic state pointers
Add guards to prevent dereferencing NULL dynamic pipeline state. Asserts
of pCreateInfo members are moved to the earliest points at which they
should not be NULL.

This fixes a segfault seen in the McNopper demo, VKTS_Example09.

v3 (Jason Ekstrand):
   - Fix disabled rasterization check
   - Revert opaque detection of color attachment usage

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-13 11:35:45 -07:00
Nanley Chery
a0d84a9ef9 anv: Document and rename anv_pipeline_init_dynamic_state()
To reduce confusion, clarify that the state being copied is not dynamic.

This agrees with the Vulkan spec's usage of the term. Various sections
specify that the various pipeline state which have VkDynamicState enums
(e.g. viewport, scissor, etc.) may or may not be dynamic.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-13 11:35:45 -07:00
Jason Ekstrand
a1a25db699 anv/pipeline: Store the (set, binding, index) tripple in the bind map
This way the the bind map (which we're caching) is mostly independent of
the pipeline layout.  The only coupling remaining is that we pull the array
size of a binding out of the layout.  However, that size is also specified
in the shader and should always match so it's not really coupled.  This
rendering issues in Dota 2.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-10 09:43:07 -07:00
Jason Ekstrand
a19ae36ce5 anv/pipeline: Refactor specialization constant handling a bit
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-06-03 19:29:28 -07:00
Jordan Justen
fa279dfbf0 i965: Add uniform for a CS thread local base ID
v4:
 * Force thread_local_id_index to -1 for now, and have
   fs_visitor::setup_cs_payload look at thread_local_id_index. This
   enables us to more easily cut over from the old local ID layout to
   the new layout, as suggested by Jason.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-06-01 19:29:02 -07:00
Jason Ekstrand
5432487792 anv: Move push constant allocation to the command buffer
Instead of blasting it out as part of the pipeline, we put it in the
command buffer and only blast it out when it's really needed.  Since the
PUSH_CONSTANT_ALLOC commands aren't pipelined, they immediately cause a
stall which we would like to avoid.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-05-27 15:17:43 -07:00
Kenneth Graunke
08bc74e694 i965: Delete brw_wm_prog_key::render_to_fbo and drawable_height.
Now that we handle flipping and other gl_FragCoord transformations
via a uniform, these key fields have no users.

This patch actually eliminates the associated recompiles.  The Tomb
Raider benchmark's minimum FPS increases from ~1 FPS to a reasonable
number.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-20 14:30:09 -07:00
Kenneth Graunke
dac10e8a13 i965, anv: Use NIR FragCoord re-center and y-transform passes.
This handles gl_FragCoord transformations and other window system vs.
user FBO coordinate system flipping by multiplying/adding uniform
values, rather than recompiles.

This is much better because we have no decent way to guess whether
the application is going to use a shader with the window system FBO
or a user FBO, much less the drawable height.  This led to a lot of
recompiles in many applications.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-20 14:30:08 -07:00
Jordan Justen
8a80af2820 anv: Port L3 cache programming from i965
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-17 13:04:03 -07:00
Jordan Justen
8ee31828c6 anv: Keep track of whether the data cache should be enabled in L3
If images or shader buffers are used, we will enable the data cache in
the the L3 config.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-05-17 13:04:03 -07:00
Jason Ekstrand
265487aedf i965/fs: Add an allow_spilling flag to brw_compile_fs
This allows us to disable spilling for blorp shaders since blorp state
setup doesn't handle spilling.  Without this, blorp fails hard if you run
with INTEL_DEBUG=spill.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tested-by: Francisco Jerez <currojerez@riseup.net>
2016-05-17 10:20:11 -07:00
Jason Ekstrand
bee160b31b i965/fs: Organize prog_data by ksp number rather than SIMD width
The hardware packets organize kernel pointers and GRF start by slots that
don't map directly to dispatch width.  This means that all of the state
setup code has to re-arrange the data from prog_data into these slots.
This logic has been duplicated 4 times in the GL driver and one more time
in the Vulkan driver.  Let's just put it all in brw_fs.cpp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-14 13:34:25 -07:00
Jason Ekstrand
712a980add i965/fs: Rework the persample shading key/prog_data bits
This commit reworks and simplifies the way we handle persample shading in
the shader key and prog_data.  The previous approach had three different
key bits that had slightly different and hard-to-decern meanings while the
new bits are far more clear.  This commit changes it to two easily
understood bits that communicate everything we need:

 1) key->persample_interp: means that the user has requested persample
    interpolation through the API.  This is equivalent to having
    SAMPLE_SHADING enabled and having MIN_SAMPLE_SHADING_VALUE set high
    enough that you actually get multiple per-sample invocations.

 2) key->multisample_fbo: means that the shader will be running on an
    actual multi-sampled framebuffer.

This commit also adds a new "persample_dispatch" bit to prog_data which
indicates that the shader should be run in persample mode.  This way the
state setup code doesn't have to look at the fragment program or GL state
and can just pull that data out of the prog_data.

In theory, this shuffle could mean more recompiles.  However, in practice,
we were shoving enough state into the key before that we were probably
hitting a recompile on every per-sample shader anyway.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-14 13:34:05 -07:00
Rob Clark
5886d1bad1 anv: fix build break
Previous rename of lower-output-to-temps pass predated merging of anv,
and apparently vulkan wasn't enabled in my local builds so overlooked
this when rebasing.

Reported-by: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-11 14:03:24 -04:00
Kenneth Graunke
81407531e0 i965: Generalize wm_key->compute_sample_id to wm_key->multisample_fbo.
I'm going to need a key entry meaning "we have a multisample FBO,
and multisampling is enabled" in an upcoming patch.  This is basically
wm_key->compute_sample_id, except that it also checks that the SAMPLE_ID
system value is read.

The only use of wm_key->compute_sample_id is in emit_sampleid_setup(),
which is only called when handling the SAMPLE_ID system value.  So we
can just eliminate the check and generalize the field.

v2: Also update the Vulkan driver.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-20 16:18:47 -07:00
Kenneth Graunke
de0a46a040 i965: Delete now dead persample_2x FS program key flag.
This was only used by the old gl_SampleID calculations.  The new code
doesn't need to handle 2x specially.

v2: Delete it from the Vulkan driver, too.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-20 16:18:47 -07:00
Jason Ekstrand
35b758c378 anv/lower_push_constants: Stop treating scalar specially
All of the code that did something special based on vec4 vs. scalar is
bogus.  In the backend, everything is now in units of bytes and the vec4
backend can handle full std140 packing so we don't need to do anything
special anymore.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
2016-04-20 09:14:47 -07:00
Jason Ekstrand
e61c812f76 anv/pipeline: Use the right mask for lower_indirect_derefs 2016-04-14 15:13:29 -07:00
Jason Ekstrand
c34be07230 spirv: Move to compiler/
While it does rely on NIR, it's not really part of the NIR core.  At the
moment, it still builds as part of libnir but that can be changed later if
desired.
2016-04-14 10:28:47 -07:00