Commit Graph

119548 Commits

Author SHA1 Message Date
Marek Olšák
77393cf39b ac: add prefix bitcount functions
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 16:16:11 -05:00
Marek Olšák
679b6244e1 radeonsi: turn an assertion into return in si_nir_store_output_tcs
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 15:40:13 -05:00
Marek Olšák
27cc7703d3 radeonsi: fix doubles and int64
Fixes: 57bd73e229 - radeonsi: remove llvm_type_is_64bit

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 15:40:10 -05:00
Marek Olšák
df34fa14bb radeonsi: don't invoke decompression inside internal launch_grid
Decompress resources properly but don't do it inside launch_grid
to prevent recursion.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
2020-01-20 15:40:08 -05:00
Marek Olšák
58c929be0d radeonsi: clean up how internal compute dispatches are handled
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
2020-01-20 15:40:07 -05:00
Marek Olšák
d69483270e Revert "radeonsi: unbind image before compute clear"
This reverts commit 3a527eda7c.

It's incorrect.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-20 15:40:05 -05:00
Samuel Pitoiset
dbdf3b3ef9 aco: implement nir_intrinsic_load_barycentric_at_sample on GFX6
GFX6 doesn't have FLAT instructions which means we have to emit
a 64-bit MUBUF load.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
2020-01-20 16:24:55 +00:00
Samuel Pitoiset
9e2fde84fc aco: add new addr64 bit to MUBUF instructions on GFX6-GFX7
According to the different ISA docs (and to LLVM), this bit seems
to only exists on GFX6-GFX7.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
2020-01-20 16:24:55 +00:00
Samuel Pitoiset
fe9157a700 aco: do not use the vec3 variant for loads on GFX6
GFX6 only supports vec3 with load/store format.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
2020-01-20 16:24:55 +00:00
Samuel Pitoiset
1b5bb204d9 aco: do not use the vec3 variant for stores on GFX6
GFX6 only supports vec3 with load/store format.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
2020-01-20 16:24:55 +00:00
Samuel Pitoiset
b8abfafe86 aco: fix constant folding of SMRD instructions on GFX6
SMRD instructions have an 8-bit dword offset on SI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>
2020-01-20 16:24:55 +00:00
Jason Ekstrand
dd92179a72 anv: Canonicalize buffer formats for image/buffer copies
Some formats, in particular YCbCr formats and ASTC have additional
restrictions.  We already whack ASTC formats to RGBA32_UINT because the
hardware doesn't allow LINEAR with ASTC.  However, we need to fix YCbCr
formats as well because they come with alignment restrictions that we
can't guarantee are satisfied.  We're using blorp_copy to do the copies
so we may as well just stomp formats for everything.

Fixes: b24b93d584 "anv: enable VK_KHR_sampler_ycbcr_conversion"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3460>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3460>
2020-01-20 16:08:17 +00:00
Jason Ekstrand
14c6e665f7 anv/blorp: Rename buffer image stride parameters
The new names fit better with the Vulkan names and don't pretend to be
an actual image extent.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3460>
2020-01-20 16:08:17 +00:00
Daniel Stone
cf5fccb0d9 Revert "gallium: add st_context_iface::flush_resource to call FLUSH_VERTICES"
This reverts commit bec9c90b5e.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3472>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3472>
2020-01-20 12:33:29 +00:00
Daniel Stone
32d45733ae Revert "st/dri: do FLUSH_VERTICES before calling flush_resource"
This reverts commit 3ba16d36c9.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3472>
2020-01-20 12:33:22 +00:00
Rhys Perry
29bfe18abd aco: fix fall-through test in try_remove_simple_block() with back-edges
3bca0af2 enhanced empty block determination which exposed this bug and
created an infinite loop in a Guild Wars 2 shader.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 3bca0af25d
     ('aco: ignore parallelcopies to the same register on jump threading')

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2364
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3452>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3452>
2020-01-20 11:51:45 +00:00
Krzysztof Raszkowski
afb75e71e0 docs/GL4: update gallium/swr features
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2020-01-20 11:37:16 +00:00
Rhys Perry
e151398de6 aco: fix stack buffer overflow in apply_sgprs()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: cef7879719 ('aco: rewrite apply_sgprs()')
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2361
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3442>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3442>
2020-01-20 11:13:11 +00:00
Tapani Pälli
9b2ccd6a0e anv: add assert for isl_mod_info in choose_isl_tiling_flags
CID: 1457859
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3469>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3469>
2020-01-20 12:12:29 +02:00
Tapani Pälli
8eebdd594b anv: fix assert in GetImageDrmFormatModifierPropertiesEXT
CID: 1457861
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3469>
2020-01-20 12:11:43 +02:00
Tapani Pälli
31feae1c21 isl/gen12: add reminder comment about missing WA with 3D surfaces
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3441>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3441>
2020-01-20 08:06:19 +02:00
Icecream95
d8a3501f1b panfrost: Dynamically allocate shader variants
This fixes a crash in LZDoom where over 16 shader variants are needed
for a few shaders in some maps, and should also save a few kilobytes
of RAM as most of the time only one or two variants of the 8 previously
allocated are actually needed.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2020-01-18 11:47:34 -05:00
Alyssa Rosenzweig
bef716b56c panfrost: Expose some functionality with dEQP flag
These features are stable enough that they don't need to be hidden.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3464>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3464>
2020-01-18 14:57:52 +00:00
Alyssa Rosenzweig
4af8d5b064 pan/midgard: Fix recursive csel scheduling
Corner case causing invalid scheduling on shaders with nested csels,
i.e. GLSL code resembling:

   (foo ? bool1 : bool2) ? x : y

By explicitly disallowing csels this is fixed.

Fixes INSTR_INVALID_ENC on a glamor shader (noticeable with slowdown and
visual corruption when scrolling "too far" on GTK apps).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3463>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3463>
2020-01-18 14:40:05 +00:00
Alyssa Rosenzweig
564a782ff7 panfrost: Identify un/pack colour opcodes
We still need to identify formats in the disassembler, but this will at
least get the opcode name clear.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3462>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3462>
2020-01-18 14:18:48 +00:00
Alyssa Rosenzweig
13c32e5fed pan/midgard: Bytemasks should round up, not round down
Otherwise we'll lost components in DCE.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3462>
2020-01-18 14:18:48 +00:00
Icecream95
5e8386c606 panfrost: Compact the bo_access readers array
Previously, the array bo_access->readers was only cleared when there
were no unsignaled fences, which in some situations never happened.

That resulted in the array having thousands of NULL pointers, but only
a handful of active readers.

With this patch, all the unsignaled readers are moved to the front of
the array, effectively building a new array only containing the active
readers in-place. This results in the readers array usually only having
a couple of elements.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3419>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3419>
2020-01-18 13:58:43 +00:00
Erik Faye-Lund
c0ba9000d2 zink: support arrays of samplers
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>
2020-01-18 10:45:38 +00:00
Erik Faye-Lund
a9023ec566 zink: support sampling non-float textures
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>
2020-01-18 10:45:38 +00:00
Erik Faye-Lund
3e1acff560 zink: store image-type per texture
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>
2020-01-18 10:45:38 +00:00
Erik Faye-Lund
5fc1562a72 zink: avoid incorrect vector-construction
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>
2020-01-18 10:45:38 +00:00
Erik Faye-Lund
8112240d29 zink: support offset-variants of texturing
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>
2020-01-18 10:45:38 +00:00
Erik Faye-Lund
f1a5bcdc16 zink: implement nir_texop_txs
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>
2020-01-18 10:45:38 +00:00
Erik Faye-Lund
7ee94d1b21 docs: fixup indentation
The most canonical indentation-style here is two spaces, which is what
the standard boilerplate in all documents use. So let's normalize to
that.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
2020-01-18 11:39:32 +01:00
Erik Faye-Lund
2ef989473a docs: remove pointless, stray newline
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
2020-01-18 11:39:29 +01:00
Erik Faye-Lund
199572b65b docs: use [1] instead of asterisk for footnote
While we're at it, make it a link as well.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
2020-01-18 11:39:25 +01:00
Erik Faye-Lund
063a28642e docs: remove trailing newlines
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
2020-01-18 11:39:22 +01:00
Erik Faye-Lund
9954120b38 docs: remove leading spaces
There's no good reason to have leading space in these pre-formatted
blocks. It looks strange, so let's get rid of it.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
2020-01-18 11:39:18 +01:00
Erik Faye-Lund
c871862744 docs: remove trailing header
This header has been there since the document was added, but contains
nothing. So let's get rid of it.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
2020-01-18 11:39:14 +01:00
Erik Faye-Lund
37daddd3e4 docs: use figure/figcaption instead of tables
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
2020-01-18 11:39:07 +01:00
Erik Faye-Lund
f5983a6eed docs: do not use definition-list for sub-topics
The dl-tag isn't a neat tool for defining sub-headings, it's a semantic
tool for defining definitions and their meaning. Let's insetad use
normal sub-headings instead.

To make the last few paragraphs stand out from the above, let's add a
sub-heading for those as well.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>
2020-01-18 11:38:56 +01:00
Rob Clark
95187083c4 freedreno/a6xx: add PROG_FB_RAST stateobj
For the handful of registers that depend on the union of program/
framebuffer/rasterizer state.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
2020-01-17 15:43:51 -08:00
Rob Clark
6dc9b292d0 freedreno/a6xx: move dynamic program state to streaming stateobj
Move the program state which we can't pre-bake to a streaming state
object, rather than emitting directly in the draw cmdstream.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
2020-01-17 15:43:51 -08:00
Rob Clark
d2fd6469c3 freedreno/a6xx: drop a few more per-draw registers
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
2020-01-17 15:43:51 -08:00
Rob Clark
4d8f42c851 freedreno/a6xx: separate rast stateobj for prim restart
This lets us move PC_PRIMITIVE_CNTL into the rasterizr stateobj, rather
than unconditionally emitting it directly in the cmdstream on every
draw.

This also starts adding some tracking about previous draw state, so that
following patches can limit some of the register writes we currently
emit on every draw.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
2020-01-17 15:43:51 -08:00
Rob Clark
0e063b3079 freedreno/a6xx: cleanup rasterizer state
All but one of the reg values is only used in the stateobj, so we can
inline the register value setup and stateobj construction.  While we
are at it, switch over to the new register builders.

Prep work for next patch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
2020-01-17 15:43:51 -08:00
Rob Clark
fba7e6f896 freedreno/a6xx: limit scratch/debug markers to debug builds
The overhead does seem to matter when you have a high enough # of draw
calls that effect few bins/pixels, because these writes would happen
unconditionally (ie. not part of a state-group).

Possibly we could keep these if we moved them into a state-group so the
register writes would be no-ops on bins with no geometry.  OTOH I
usually end up adding in a WFI when using them scratch reg values to
track down a crash.  (So add a WFI to mitigate the annoyance of needing
to use a debug build to get scratch regs to locate the position of a
crash/hang in the cmdstream.)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
2020-01-17 15:43:51 -08:00
Jordan Justen
5d7381c645 iris: Fix some indentation in iris_init_render_context
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
2020-01-17 15:28:07 -08:00
C Stout
c1104e4cee util/vector: Fix u_vector_foreach when head rolls over
Also add unit tests for u_vector.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3453>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3453>
2020-01-17 22:21:00 +00:00
Francisco Jerez
b54b67e067 intel/fs: Switch to standard vector layout for barycentrics at optimization time.
This involves permuting the registers of barycentric vectors to have
the standard X[0-n] Y[0-n] layout at NIR translation time.
Barycentrics are converted to the format expected by the PLN
instruction in the lower_barycentrics() pass run after the
optimization loop.

Main reason is correctness of SIMD32 fragment shaders.  The
shuffle_from_pln_layout() and shuffle_to_pln_layout() helpers used
during NIR translation are busted for SIMD32.  This leads to serious
corruption at present with INTEL_DEBUG=do32, especially on Gen11+
where these helpers are hit more frequently due to the lack of a
hardware PLN instruction.

Of course one could have chosen to fix those helpers instead, but
there is another far more subtle issue that was reported during review
of the SIMD32 fragment shader codegen changes: The SIMD splitting pass
currently handles SIMD32 barycentric vectors as if they had the
standard X[0-n] Y[0-n] layout, even though they are interleaved for
the PLN instruction, which causes incorrect execution masks to be
applied to the MOVs unzipping barycentric vectors in cases where a
LINTERP instruction occurs under non-uniform control flow.

I'm not aware of any conformance regressions due to the latter issue
at present, but for our peace of mind let's move the conversion to the
PLN layout into the lower_barycentrics() pass run after
lower_simd_width().

This leads to the following shader-db improvements (including SIMD32
shaders) in combination with the previous back-end preparation changes
-- Without them (especially the copy propagation changes) this would
lead to a massive number of regressions.  On ICL:

   total instructions in shared programs: 20662316 -> 20466903 (-0.95%)
   instructions in affected programs: 10538474 -> 10343061 (-1.85%)
   helped: 68775
   HURT: 6

   total spills in shared programs: 8938 -> 8748 (-2.13%)
   spills in affected programs: 376 -> 186 (-50.53%)
   helped: 9
   HURT: 5

   total fills in shared programs: 8965 -> 8663 (-3.37%)
   fills in affected programs: 965 -> 663 (-31.30%)
   helped: 9
   HURT: 6

   LOST:   146
   GAINED: 43

On SKL:

   total instructions in shared programs: 18725867 -> 18614912 (-0.59%)
   instructions in affected programs: 3876590 -> 3765635 (-2.86%)
   helped: 27492
   HURT: 2

   LOST:   191
   GAINED: 417

On SNB:

   total instructions in shared programs: 14573613 -> 13980646 (-4.07%)
   instructions in affected programs: 5199074 -> 4606107 (-11.41%)
   helped: 29998
   HURT: 0

   LOST:   21
   GAINED: 30

Results are somewhat less impressive but still significant without
SIMD32 fragment shaders enabled.  On ICL:

   total instructions in shared programs: 16148728 -> 16061659 (-0.54%)
   instructions in affected programs: 6114788 -> 6027719 (-1.42%)
   helped: 42046
   HURT: 6

   total spills in shared programs: 8218 -> 8028 (-2.31%)
   spills in affected programs: 376 -> 186 (-50.53%)
   helped: 9
   HURT: 5

   total fills in shared programs: 8953 -> 8651 (-3.37%)
   fills in affected programs: 965 -> 663 (-31.30%)
   helped: 9
   HURT: 6

   LOST:   0
   GAINED: 3

On SKL:

   total instructions in shared programs: 14927994 -> 14926738 (-0.01%)
   instructions in affected programs: 168850 -> 167594 (-0.74%)
   helped: 711
   HURT: 2

On SNB:

   total instructions in shared programs: 10770538 -> 10734403 (-0.34%)
   instructions in affected programs: 2702172 -> 2666037 (-1.34%)
   helped: 17818
   HURT: 0

All of the hurt shaders are either spilling slightly more or emitting
additional NOP instructions due to the SIMD16 POW workaround for
Gen8-9 combined with differences in scheduling.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2020-01-17 13:23:12 -08:00