Commit Graph

115 Commits

Author SHA1 Message Date
Jason Ekstrand
cfc7ed75eb i965: Store image_param in brw_context instead of prog_data
This burns an extra 10k of memory or so in the case where you don't have
any images.  However, if you have several shaders which use images, this
should be much less memory.  It also gets rid of a part of prog_data
that really has nothing to do with the compiler.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:30 -07:00
Jason Ekstrand
2975e4c56a intel: Rewrite the world of push/pull params
This moves us away to the array of pointers model and onto a model where
each param is represented by a generic uint32_t handle.  We reserve 2^16
of these handles for builtins that get generated by somewhere inside the
compiler and have well-defined meanings.  Generic params have handles
whose meanings are defined by the driver.

The primary downside to this new approach is that it moves a little bit
of the work that we would normally do at compile time to draw time.  On
my laptop this hurts OglBatch6 by no more than 1% and doesn't seem to
have any measurable affect on OglBatch7.  So, while this may come back
to bite us, it doesn't look too bad.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:29 -07:00
Iago Toral Quiroga
5e584a9db7 i965: skip reading unused slots at the begining of the URB for the FS
We can start reading the URB at the first offset that contains varyings
that are actually read in the URB. We still need to make sure that we
read at least one varying to honor hardware requirements.

This helps alleviate a problem introduced with 99df02ca26 for
separate shader objects: without separate shader objects we assign
locations sequentially, however, since that commit we have changed the
method for SSO so that the VUE slot assigned depends on the number of
builtin slots plus the location assigned to the varying. This fixed
layout is intended to help SSO programs by avoiding on-the-fly recompiles
when swapping out shaders, however, it also means that if a varying uses
a large location number close to the maximum allowed by the SF/FS units
(31), then the offset introduced by the number of builtin slots can push
the location outside the range and trigger an assertion.

This problem is affecting at least the following CTS tests for
enhanced layouts:

KHR-GL45.enhanced_layouts.varying_array_components
KHR-GL45.enhanced_layouts.varying_array_locations
KHR-GL45.enhanced_layouts.varying_components
KHR-GL45.enhanced_layouts.varying_locations

which use SSO and the the location layout qualifier to select such
location numbers explicitly.

This change helps these tests because for SSO we always have to include
things such as VARYING_SLOT_CLIP_DIST{0,1} even if the fragment shader is
very unlikely to read them, so by doing this we free builtin slots from
the fixed VUE layout and we avoid the tests to crash in this scenario.

Of course, this is not a proper fix, we'd still run into problems if someone
tries to use an explicit max location and read gl_ViewportIndex, gl_LayerID or
gl_CullDistancein in the FS, but that would be a much less common bug and we
can probably wait to see if anyone actually runs into that situation in a real
world scenario before making the decision that more aggresive changes are
required to support this without reverting 99df02ca26.

v2:
- Add a debug message when we skip clip distances (Ilia)
- we also need to account for this when we compute the urb setup
  for the fragment shader stage, so add a compiler util to compute
  the first slot that we need to read from the URB instead of
  replicating the logic in both places.

v3:
- Make the util more generic so it can account for all unused slots
  at the beginning of the URB, that will make it more useful (Ken).
- Drop the debug message, it was not what Ilia was asking for.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-02 08:27:13 +02:00
Kenneth Graunke
274afad4cd i965: Add a brw_wm_prog_data::has_render_target_reads field.
State upload code should use prog_data rather than poking at shader_info
directly.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-08-23 11:55:17 -07:00
Iago Toral Quiroga
81615ad444 intel/compiler: properly size attribute wa_flags array for Vulkan
Mesa will map user defined vertex input attributes to slots
starting at VERT_ATTRIB_GENERIC0 which gives us room for only 16
slots (up to GL_VERT_ATTRIB_MAX). This sufficient for GL, where
we expose exactly 16 vertex attributes for user defined inputs, but
in Vulkan we can expose up to 28 (which are also mapped from
VERT_ATTRIB_GENERIC0 onwards) so we need to account for this when
we scope the size of the array of attribute workaround flags
that is used during the brw_vertex_workarounds NIR pass. This
prevents out-of-bounds accesses in that array for NIR shaders
that use more than 16 vertex input attributes.

Fixes:
dEQP-VK.pipeline.vertex_input.max_attributes.*

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-11 10:41:44 +02:00
Kenneth Graunke
6d28c6e52c i965: Select ranges of UBO data to be uploaded as push constants.
This adds a NIR pass that decides which portions of UBOS we should
upload as push constants, rather than pull constants.

v2: Switch to uint16_t for the UBO block number, because we may
    have a lot of them in Vulkan (suggested by Jason).  Add more
    comments about bitfield trickery (requested by Matt).

v3: Skip vec4 stages for now...I haven't finished wiring up support
    in the vec4 backend, and so pushing the data but not using it
    will just be wasteful.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 19:56:49 -07:00
Kenneth Graunke
8ec5a4e4a4 i965: Switch to absolute addressing for constant buffer 0.
By default, 3DSTATE_CONSTANT_* Constant Buffer 0 is relative to dynamic
state base address.  This makes it unusable for pushing UBOs.  I'd like
to be able to use all four push buffers.

There is a bit in the INSTPM register (or CS_DEBUG_MODE2 on Skylake)
which controls whether buffer 0 is relative to dynamic state base
address, or simply a normal pointer.  Setting that gives us full
flexibility.

We can't currently write this on Haswell and earlier, and will need
to update the kernel command parser, and then do the whole version
checking song and dance.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 19:56:49 -07:00
Johnson Lin
165e704719 i965/i915: Add UYVY as the supported format
Trigger the correct sampler options for it. Similar with YUYV

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-06-30 10:16:26 +01:00
Anuj Phogat
30e749c8f1 i965/cnl: Update few assertions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:59 -07:00
Jason Ekstrand
18e18a1863 i965: Move clip program compilation to the compiler
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
9fb8a8775b i965: Move SF compilation to the compiler
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Rafael Antognolli
8fa8abef4b i965: Move enums to brw_compiler.h.
These enums live inside struct brw_wm_prog_data, so it makes sense to
keep them in the same header. It also allows to use them without
including brw_eu_defines.h.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 16:55:58 -07:00
Iago Toral Quiroga
ddb2bb3ed4 anv/pipeline: make FragCoord include sample positions when sample shading
We need to know if sample shading has been requested during shader
compilation since that affects the way fragment coordinates are
computed.

Notice that the semantics of fragment coordinates only depend on
whether sample shading has been requested, not on whether more
than one sample will actually be produced (that is,
minSampleShading and rasterizationSamples do not affect this
behavior).

Because this setting affects the code we generate for the shader, we also
need to include it in the WM prog key. Notice we don't need to alter the
OpenGL code because it doesn't ever use this behavior, so they key's
value is always false (the default).

Fixes:
dEQP-VK.glsl.builtin_var.fragcoord_msaa.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-24 08:11:53 +01:00
Emil Velikov
2438c0a236 intel/compiler: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:22 +00:00
Jason Ekstrand
700bebb958 i965: Move the back-end compiler to src/intel/compiler
Mostly a dummy git mv with a couple of noticable parts:
 - With the earlier header cleanups, nothing in src/intel depends
files from src/mesa/drivers/dri/i965/
 - Both Autoconf and Android builds are addressed. Thanks to Mauro and
Tapani for the fixups in the latter
 - brw_util.[ch] is not really compiler specific, so it's moved to i965.

v2:
 - move brw_eu_defines.h instead of brw_defines.h
 - remove no-longer applicable includes
 - add missing vulkan/ prefix in the Android build (thanks Tapani)

v3:
 - don't list brw_defines.h in src/intel/Makefile.sources (Jason)
 - rebase on top of the oa patches

[Emil Velikov: commit message, various small fixes througout]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:34 +00:00