Kenneth Graunke
b3942beecf
intel/compiler: Set divergence analysis options
...
Although we don't use divergence analysis yet, we've had several
work-in-progress series that make use of it. We may as well set
our options so that those series can assume they're in place.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484 >
2022-03-26 00:28:19 +00:00
Danylo Piliaiev
b8d486f298
nir/algebraic: Separate has_dot_4x8 into has_sdot_4x8 and has_udot_4x8
...
Adreno GPUs has native instruction for unsigned and mixed dot_4x8 but
not signed dot product.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986 >
2022-01-10 13:20:39 +02:00
Ian Romanick
ff74d5dd1b
intel/compiler: Don't store "scalar stage" bits on Gfx8 or Gfx9
...
Since 1d71b1a311
, only Gfx7 and earlier have any vec4 stages ever.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14128 >
2021-12-08 14:59:32 -08:00
Dave Airlie
9bb375b0be
intel/compiler: drop glsl options from brw_compiler
...
Only the nir options are used now, since i965 was dropped,
the glsl options come from the state tracker
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14102 >
2021-12-07 08:52:36 +00:00
Caio Oliveira
db23c41537
intel/compiler: Add backend compiler basics for Task/Mesh
...
Task/Mesh stages are CS-like stages, and include many
builtins (e.g. workgroup ID/index) and intrinsics (e.g. workgroup
memory primitives) originally present only in CS.
This commit add two new stages (task and mesh) that 'inherit' from CS
by embedding a brw_cs_prog_data in their own prog_data structure, so
that CS functionality can be easily reused. They also currently use
the same helpers to select the SIMD variant to use -- that was
recently added for CS.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661 >
2021-12-04 00:41:46 +00:00
Marcin Ślusarz
d05f7b4a2c
intel: fix INTEL_DEBUG environment variable on 32-bit systems
...
INTEL_DEBUG is defined (since 4015e1876a
) as:
#define INTEL_DEBUG __builtin_expect(intel_debug, 0)
which unfortunately chops off upper 32 bits from intel_debug
on platforms where sizeof(long) != sizeof(uint64_t) because
__builtin_expect is defined only for the long type.
Fix this by changing the definition of INTEL_DEBUG to be function-like
macro with "flags" argument. New definition returns 0 or 1 when
any of the flags match.
Most of the changes in this commit were generated using:
for c in `git grep INTEL_DEBUG | grep "&" | grep -v i915 | awk -F: '{print $1}' | sort | uniq`; do
perl -pi -e "s/INTEL_DEBUG & ([A-Z0-9a-z_]+)/INTEL_DBG(\1)/" $c
perl -pi -e "s/INTEL_DEBUG & (\([A-Z0-9_ |]+\))/INTEL_DBG\1/" $c
done
but it didn't handle all cases and required minor cleanups (like removal
of round brackets which were not needed anymore).
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13334 >
2021-10-15 19:55:14 +00:00
Ian Romanick
0f809dbf40
intel/compiler: Basic support for DP4A instruction
...
v2: Very significant rebase on changes to previous commits.
Specifically, brw_fs_nir.cpp changes were pretty much rewritten from
scratch after changing the NIR opcode names and types.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142 >
2021-08-24 19:58:57 +00:00
Ian Romanick
5ce3bfcdf3
intel/compiler: Lower 8-bit ops to 16-bit in NIR on all platforms
...
This fixes the Crucible func.shader.shift.int8_t test on Gen8 and Gen9.
See https://gitlab.freedesktop.org/mesa/crucible/-/merge_requests/76 .
With the previous optimizations in place, this change seems to improve
the quality of the generated code. Comparing a couple Vulkan CTS tests
on Skylake had the following results.
dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:
SIMD8 shader: 36 instructions. 1 loops. 3822 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 27 instructions. 1 loops. 2742 cycles. 0:0 spills:fills, 5 sends
dEQP-VK.spirv_assembly.type.vec3.i8.max_frag:
SIMD8 shader: 39 instructions. 1 loops. 3922 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 37 instructions. 1 loops. 3682 cycles. 0:0 spills:fills, 5 sends
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025 >
2021-08-18 22:03:37 +00:00
Ian Romanick
f0a8a9816a
nir: intel/compiler: Add and use nir_op_pack_32_4x8_split
...
A lot of CTS tests write a u8vec4 or an i8vec4 to an SSBO. This results
in a lot of shifts and MOVs. When that pattern can be recognized, the
individual 8-bit components can be packed much more efficiently.
v2: Rebase on b4369de27f
("nir/lower_packing: use
shader_instructions_pass")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025 >
2021-08-18 22:03:37 +00:00
Rhys Perry
ed70b256ce
nir: add ffma creation helpers
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056 >
2021-08-16 17:19:45 +00:00
Timothy Arceri
a9ed4538ab
nir: add indirect loop unrolling to compiler options
...
This is where it should be rather than having to pass it into the
optimisation pass every time.
It also allows us to call the loop analysis pass without having to
duplicate these options which we will do later in this series.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064 >
2021-08-03 10:54:50 +00:00
Sagar Ghuge
89b7a40212
intel/compiler: Enable has_iadd3 option on XeHP
...
shader-db result is inconclusive but doesn't harm to include it for
reference.
Shader-db result on XeHPG:
total instructions in shared programs: 1397405 -> 1397315 (<.01%)
instructions in affected programs: 88252 -> 88162 (-0.10%)
helped: 20
HURT: 7
helped stats (abs) min: 1 max: 18 x̄: 7.20 x̃: 7
helped stats (rel) min: 0.03% max: 2.20% x̄: 0.37% x̃: 0.23%
HURT stats (abs) min: 4 max: 23 x̄: 7.71 x̃: 4
HURT stats (rel) min: 0.10% max: 0.68% x̄: 0.22% x̃: 0.11%
95% mean confidence interval for instructions value: -6.81 0.14
95% mean confidence interval for instructions %-change: -0.42% -0.02%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 119924219 -> 119931868 (<.01%)
cycles in affected programs: 45029193 -> 45036842 (0.02%)
helped: 11
HURT: 16
helped stats (abs) min: 15 max: 5490 x̄: 1655.73 x̃: 140
helped stats (rel) min: <.01% max: 0.35% x̄: 0.11% x̃: <.01%
HURT stats (abs) min: 1 max: 2944 x̄: 1616.38 x̃: 1743
HURT stats (rel) min: <.01% max: 0.17% x̄: 0.09% x̃: 0.10%
95% mean confidence interval for cycles value: -606.11 1172.70
95% mean confidence interval for cycles %-change: -0.04% 0.07%
Inconclusive result (value mean confidence interval includes 0).
v2:
- Include shader-db result (Jason)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596 >
2021-07-16 15:59:56 +00:00
Andrii Simiklit
57f54bb9cc
Remove redundant assignment
...
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4957
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11780 >
2021-07-09 09:34:27 +00:00
Jason Ekstrand
d055ac9bdf
intel/compiler: Add a U32 reloc type
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637 >
2021-06-22 21:09:25 +00:00
Jason Ekstrand
55508bbe66
intel/compiler: Generalize shader relocations a bit
...
This commit adds a delta to be added to the relocated value as well as
the possibility of multiple types of relocations.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637 >
2021-06-22 21:09:25 +00:00
Rhys Perry
1cbcfb8b38
nir, nir/algebraic: add byte/word insertion instructions
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151 >
2021-06-08 08:57:42 +00:00
Anuj Phogat
61e8636557
intel: Rename gen_device prefix to intel_device
...
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "gen_device" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_device/intel_device/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241 >
2021-04-20 20:06:33 +00:00
Anuj Phogat
926d343acf
intel: Rename files with gen_debug prefix
...
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
find $SEARCH_PATH -type f -name "*gen_debug.*[cph]" -exec sh -c 'f="{}"; mv -- "$f" "${f/gen_debug/intel_debug}"' \;
grep -E "gen_debug" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_debug\./intel_debug\./g"
grep -E "GEN_DEBUG" -rIl $SEARCH_PATH | xargs sed -ie "s/GEN_DEBUG_H/INTEL_DEBUG_H/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241 >
2021-04-20 20:06:33 +00:00
Anuj Phogat
1d296484b4
intel: Rename Genx keyword to Gfxx
...
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "Gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/Gen\([[:digit:]]\+\)/Gfx\1/g"
Exclude changes in src/intel/perf/oa-*.xml:
find src/intel/perf -type f \( -name "*.xml" \) | xargs sed -ie "s/Gfx/Gen/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936 >
2021-04-02 18:33:07 +00:00
Anuj Phogat
abe9a71a09
intel: Rename gen field in gen_device_info struct to ver
...
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "info\)*(.|->)gen" -rIl $SEARCH_PATH | xargs sed -ie "s/info\()*\)\(\.\|->\)gen/info\1\2ver/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936 >
2021-04-02 18:33:07 +00:00
Christian Gmeiner
3fbde2fd93
nir: add has_txs flag
...
Some nir lowerings might need to know if txs is supported by
the backend.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com >
Reviewed-by: Eric Anholt <eric@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8898 >
2021-02-23 14:04:30 +00:00
Daniel Schürmann
bd8e84eb8d
nir: replace .lower_sub with .has_fsub and .has_isub
...
This allows a more fine-grained control about whether
a backend supports one of these instructions.
Reviewed-by: Eric Anholt <eric@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597 >
2021-01-11 19:13:51 +00:00
Jason Ekstrand
7280b0911d
intel/compiler: Add support for bindless shaders
...
The Intel bindless thread dispatch model is very simple. When a compute
shader is to be used for bindless dispatch, it can request a set of
stack IDs. These are allocated per-dual-subslice by the hardware and
recycled automatically when the stack ID is returned. Passed to the
bindless dispatch are a global argument address, a stack ID, and an
address of the BINDLESS_SHADER_RECORD to invoke. When the bindless
shader is dispatched, it is passed its stack ID as well as the global
and local argument pointers. The local argument pointer is the address
of the BINDLESS_SHADER_RECORD plus some offset which is specified as
part of the BINDLESS_SHADER_RECORD.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7356 >
2020-11-25 05:37:09 +00:00
Jason Ekstrand
68092df8d8
intel/nir: Lower 8-bit ops to 16-bit in NIR on Gen11+
...
Intel hardware supports 8-bit arithmetic but it's tricky and annoying:
- Byte operations don't actually execute with a byte type. The
execution type for byte operations is actually word. (I don't know
if this has implications for the HW implementation. Probably?)
- Destinations are required to be strided out to at least the
execution type size. This means that B-type operations always have
a stride of at least 2. This means wreaks havoc on the back-end in
multiple ways.
- Thanks to the strided destination, we don't actually save register
space by storing things in bytes. We could, in theory, interleave
two byte values into a single 2B-strided register but that's both a
pain for RA and would lead to piles of false dependencies pre-Gen12
and on Gen12+, we'd need some significant improvements to the SWSB
pass.
- Also thanks to the strided destination, all byte writes are treated
as partial writes by the back-end and we don't know how to copy-prop
them.
- On Gen11, they added a new hardware restriction that byte types
aren't allowed in the 2nd and 3rd sources of instructions. This
means that we have to emit B->W conversions all over to resolve
things. If we emit said conversions in NIR, instead, there's a
chance NIR can get rid of some of them for us.
We can get rid of a lot of this pain by just asking NIR to get rid of
8-bit arithmetic for us. It may lead to a few more conversions in some
cases but having back-end copy-prop actually work is probably a bigger
bonus. There is still a bit we have to handle in the back-end. In
particular, basic MOVs and conversions because 8-bit load/store ops
still require 8-bit types.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482 >
2020-11-09 18:58:51 +00:00
Jason Ekstrand
3d22de05ca
intel/fs: Add an option to use dataport messages for UBOs
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3932 >
2020-10-08 01:17:06 -05:00
Ian Romanick
60e1d0f028
intel/compiler: Remove INTEL_SCALAR_... env variables
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826 >
2020-09-28 11:43:10 -07:00
Kenneth Graunke
140f53e646
Revert "nir: replace lower_ffma and fuse_ffma with has_ffma"
...
This reverts commit 939ddf3f67
.
Intel has a separate pass for fusing FFMAs selectively. We split
these flags in commit 1b72c31e1f
and
the reasoning still stands. The patch being reverted was just a
cleanup, so there should be no issue with reverting it.
Acked-by: Matt Turner <mattst88@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6849 >
2020-09-24 13:11:50 -07:00
Marek Olšák
939ddf3f67
nir: replace lower_ffma and fuse_ffma with has_ffma
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756 >
2020-09-24 12:29:11 +00:00
Marek Olšák
771aad3027
nir: split lower_ffma into lower_ffma16/32/64
...
AMD wants different behavior for each bit size
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756 >
2020-09-24 12:29:11 +00:00
Gert Wollny
80cde3ad55
intel/compiler: Set lower_uniform_to_ubo compiler flag
...
Signed-off-by: Gert Wollny <gert.wollny@collabora.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6316 >
2020-09-16 10:07:42 +00:00
Jason Ekstrand
c897cd0278
intel/compiler: Handle all indirect lowering choices in brw_nir.c
...
Since everything flows through NIR and we're doing all of our indirect
deref lowering there now, there's no reason to keep making those
decisions in brw_compiler and stuffing them in the GLSL compiler
structs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909 >
2020-09-03 14:26:49 +00:00
Jason Ekstrand
8d8a3815ef
intel/eu: Add a mechanism for emitting relocatable constant MOVs
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244 >
2020-09-02 19:48:44 +00:00
Jason Ekstrand
54ba0daa28
intel/compiler: Get rid of the global compaction table pointers
...
With discrete GPUs, it's going to be possible to have GPUs from two
different hardware generations in the machine at the same time. Global
singletons like this aren't going to fly. Have a struct containing the
pointers which gets initialized once per shader disassemble instead.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244 >
2020-09-02 19:48:44 +00:00
Jason Ekstrand
003b04e266
intel/compiler: Allow MESA_SHADER_KERNEL
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6280 >
2020-08-12 10:11:06 +00:00
Caio Marcelo de Oliveira Filho
e2b6ccbdad
intel/compiler: Use C99 array initializers for prog_data/key sizes
...
This is way better than a pile of STATIC_ASSERTs.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6280 >
2020-08-12 10:11:06 +00:00
Boris Brezillon
025988f818
intel: Set int64_options to ~0 when lowering 64b ops
...
That's more future proof than setting each bit manually. Looks like we
already miss nir_lower_ufind_msb64 because of that.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Acked-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5588 >
2020-07-30 16:54:24 +00:00
Boris Brezillon
345b5847b4
nir: Replace the scoped_memory barrier by a scoped_barrier
...
SPIRV OpControlBarrier can have both a memory and a control barrier
which some hardware can handle with a single instruction. Let's
turn the scoped_memory_barrier into a scoped barrier which can embed
both barrier types. Note that control-only or memory-only barriers can
be supported through this new intrinsic by passing NIR_SCOPE_NONE to the
unused barrier type.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com >
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4900 >
2020-06-03 07:39:52 +00:00
Ian Romanick
f46eabf84e
nir/algebraic: Split ibfe and ubfe with two constant sources
...
I also tried splitting ubfe instructions with one or zero constants,
and zero shaders in shader-db were affected.
The "lost" shader is a compute shader that was promoted from SIMD8 to
SIMD16, so is also counted as the gained shader.
v2: Further restrict bfe splitting. bfe with multiple constants is
better on at least some Radeon GPUs. Use -x instead of 32-x in shift
counts.
v3: Fix the outer shift count for ibfe lowering. Add c=0 optimizations
to prevent bad lowering. Both suggested by Rhys. Add shift by -32
optimizations.
Tiger Lake
total instructions in shared programs: 17608764 -> 17596316 (-0.07%)
instructions in affected programs: 303765 -> 291317 (-4.10%)
helped: 113
HURT: 46
helped stats (abs) min: 1 max: 458 x̄: 120.67 x̃: 21
helped stats (rel) min: 0.09% max: 11.23% x̄: 3.47% x̃: 1.39%
HURT stats (abs) min: 1 max: 201 x̄: 25.83 x̃: 6
HURT stats (rel) min: 0.23% max: 5.18% x̄: 1.53% x̃: 1.11%
95% mean confidence interval for instructions value: -101.13 -55.45
95% mean confidence interval for instructions %-change: -2.61% -1.44%
Instructions are helped.
total cycles in shared programs: 338390770 -> 333530868 (-1.44%)
cycles in affected programs: 79438330 -> 74578428 (-6.12%)
helped: 112
HURT: 64
helped stats (abs) min: 2 max: 268955 x̄: 44261.93 x̃: 1452
helped stats (rel) min: <.01% max: 29.51% x̄: 4.72% x̃: 2.23%
HURT stats (abs) min: 2 max: 17618 x̄: 1522.41 x̃: 84
HURT stats (rel) min: <.01% max: 7.34% x̄: 1.35% x̃: 0.34%
95% mean confidence interval for cycles value: -37232.47 -17993.69
95% mean confidence interval for cycles %-change: -3.37% -1.65%
Cycles are helped.
total spills in shared programs: 8944 -> 8138 (-9.01%)
spills in affected programs: 3240 -> 2434 (-24.88%)
helped: 67
HURT: 0
total fills in shared programs: 9373 -> 7842 (-16.33%)
fills in affected programs: 4736 -> 3205 (-32.33%)
helped: 67
HURT: 0
LOST: 1
GAINED: 2
Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 16123288 -> 16116876 (-0.04%)
instructions in affected programs: 241155 -> 234743 (-2.66%)
helped: 126
HURT: 2
helped stats (abs) min: 1 max: 209 x̄: 50.90 x̃: 7
helped stats (rel) min: 0.07% max: 5.94% x̄: 1.76% x̃: 0.65%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.05% max: 0.24% x̄: 0.15% x̃: 0.15%
95% mean confidence interval for instructions value: -61.29 -38.89
95% mean confidence interval for instructions %-change: -2.05% -1.42%
Instructions are helped.
total cycles in shared programs: 335419163 -> 330438819 (-1.48%)
cycles in affected programs: 77515502 -> 72535158 (-6.42%)
helped: 139
HURT: 37
helped stats (abs) min: 2 max: 269140 x̄: 36374.19 x̃: 597
helped stats (rel) min: <.01% max: 28.60% x̄: 3.67% x̃: 1.31%
HURT stats (abs) min: 4 max: 17618 x̄: 2045.08 x̃: 174
HURT stats (rel) min: 0.02% max: 8.32% x̄: 2.61% x̃: 0.62%
95% mean confidence interval for cycles value: -37799.30 -18795.51
95% mean confidence interval for cycles %-change: -3.13% -1.57%
Cycles are helped.
total spills in shared programs: 8065 -> 7306 (-9.41%)
spills in affected programs: 3153 -> 2394 (-24.07%)
helped: 67
HURT: 0
total fills in shared programs: 8710 -> 7412 (-14.90%)
fills in affected programs: 4466 -> 3168 (-29.06%)
helped: 67
HURT: 0
LOST: 1
GAINED: 1
Broadwell
total instructions in shared programs: 14970538 -> 14965967 (-0.03%)
instructions in affected programs: 227040 -> 222469 (-2.01%)
helped: 126
HURT: 2
helped stats (abs) min: 1 max: 136 x̄: 36.29 x̃: 8
helped stats (rel) min: 0.07% max: 6.02% x̄: 1.47% x̃: 0.89%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.05% max: 0.24% x̄: 0.14% x̃: 0.14%
95% mean confidence interval for instructions value: -43.05 -28.37
95% mean confidence interval for instructions %-change: -1.69% -1.19%
Instructions are helped.
total cycles in shared programs: 336237662 -> 333035960 (-0.95%)
cycles in affected programs: 72066394 -> 68864692 (-4.44%)
helped: 134
HURT: 42
helped stats (abs) min: 4 max: 122640 x̄: 24344.54 x̃: 1833
helped stats (rel) min: <.01% max: 26.93% x̄: 4.02% x̃: 2.38%
HURT stats (abs) min: 1 max: 17205 x̄: 1439.69 x̃: 92
HURT stats (rel) min: <.01% max: 7.12% x̄: 1.34% x̃: 0.62%
95% mean confidence interval for cycles value: -23753.58 -12629.40
95% mean confidence interval for cycles %-change: -3.50% -1.98%
Cycles are helped.
total spills in shared programs: 21122 -> 20204 (-4.35%)
spills in affected programs: 3644 -> 2726 (-25.19%)
helped: 67
HURT: 0
total fills in shared programs: 24879 -> 23460 (-5.70%)
fills in affected programs: 4883 -> 3464 (-29.06%)
helped: 67
HURT: 0
Haswell
total instructions in shared programs: 13148269 -> 13145444 (-0.02%)
instructions in affected programs: 137046 -> 134221 (-2.06%)
helped: 97
HURT: 3
helped stats (abs) min: 1 max: 137 x̄: 30.58 x̃: 3
helped stats (rel) min: 0.14% max: 4.38% x̄: 1.38% x̃: 0.44%
HURT stats (abs) min: 1 max: 70 x̄: 47.00 x̃: 70
HURT stats (rel) min: 0.05% max: 5.82% x̄: 3.90% x̃: 5.82%
95% mean confidence interval for instructions value: -37.15 -19.35
95% mean confidence interval for instructions %-change: -1.56% -0.89%
Instructions are helped.
total cycles in shared programs: 321221834 -> 318333159 (-0.90%)
cycles in affected programs: 54932349 -> 52043674 (-5.26%)
helped: 95
HURT: 53
helped stats (abs) min: 4 max: 123390 x̄: 30648.39 x̃: 702
helped stats (rel) min: <.01% max: 28.87% x̄: 4.27% x̃: 2.87%
HURT stats (abs) min: 4 max: 2357 x̄: 432.49 x̃: 113
HURT stats (rel) min: <.01% max: 3.44% x̄: 1.03% x̃: 0.54%
95% mean confidence interval for cycles value: -26154.16 -12881.99
95% mean confidence interval for cycles %-change: -3.20% -1.55%
Cycles are helped.
total spills in shared programs: 19878 -> 19293 (-2.94%)
spills in affected programs: 3020 -> 2435 (-19.37%)
helped: 41
HURT: 2
total fills in shared programs: 20918 -> 19875 (-4.99%)
fills in affected programs: 3968 -> 2925 (-26.29%)
helped: 41
HURT: 2
LOST: 0
GAINED: 1
Ivy Bridge
total instructions in shared programs: 11875585 -> 11873641 (-0.02%)
instructions in affected programs: 78065 -> 76121 (-2.49%)
helped: 27
HURT: 0
helped stats (abs) min: 8 max: 134 x̄: 72.00 x̃: 72
helped stats (rel) min: 0.36% max: 4.23% x̄: 2.42% x̃: 2.42%
95% mean confidence interval for instructions value: -83.68 -60.32
95% mean confidence interval for instructions %-change: -2.78% -2.07%
Instructions are helped.
total cycles in shared programs: 178232734 -> 175769085 (-1.38%)
cycles in affected programs: 50018707 -> 47555058 (-4.93%)
helped: 27
HURT: 0
helped stats (abs) min: 82035 max: 99953 x̄: 91246.26 x̃: 92278
helped stats (rel) min: 4.40% max: 5.69% x̄: 4.93% x̃: 4.95%
95% mean confidence interval for cycles value: -93674.20 -88818.32
95% mean confidence interval for cycles %-change: -5.09% -4.78%
Cycles are helped.
total spills in shared programs: 4182 -> 3739 (-10.59%)
spills in affected programs: 1089 -> 646 (-40.68%)
helped: 27
HURT: 0
total fills in shared programs: 5216 -> 4345 (-16.70%)
fills in affected programs: 1874 -> 1003 (-46.48%)
helped: 27
HURT: 0
No changes on any earlier Intel platforms.
Reviewed-by: Matt Turner <mattst88@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4156 >
2020-05-07 10:55:50 -07:00
Rhys Perry
32d871b48f
nir/algebraic: don't undo lowering of 8/16-bit comparisons to 32-bit
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4387 >
2020-04-23 10:57:38 +00:00
Caio Marcelo de Oliveira Filho
956e4b2d37
nir, intel: Move use_scoped_memory_barrier to nir_options
...
This option will be used later by GLSL, so move to a common struct.
Because nir_options is filled in the compiler instead of the Vulkan
driver, fix that up. GLSL will ignore that for now.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3913 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3913 >
2020-02-24 19:12:11 +00:00
Ian Romanick
4e9079d0c7
i965: Enable INTEL_shader_integer_functions2 on Gen8+
...
v2: Use new lower_hadd64 and lower_usub_sat64 flags.
v3: Enable SPIR-V capability.
v4: Move lowering options to COMMON_SCALAR_OPTIONS. Suggested by Caio.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767 >
2020-01-23 00:18:57 +00:00
Matt Turner
49c21802cb
intel/compiler: Split has_64bit_types into float/int
...
Gen7 has 64-bit floats but not 64-bit ints.
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635 >
2020-01-22 00:19:20 +00:00
Kenneth Graunke
d0d28c783d
iris: Set nir_shader_compiler_options::unify_interfaces.
...
This is technically enabling the option in the common intel backend
code, but only the st/nir linker uses the option, so it's iris-only.
Fixes Piglit's spec/glsl-1.50/execution/geometry/clip-distance-vs-gs-out
Closes : #2274
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3249 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3249 >
2020-01-03 00:41:50 +00:00
Kenneth Graunke
44754279ac
intel/fs/gen12: Use TCS 8_PATCH mode.
...
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
2019-10-11 12:24:16 -07:00
Jordan Justen
f9ec4ac5a1
intel/ir: Lower fpow on Gen12.
...
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com >
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
2019-10-11 12:24:16 -07:00
Marek Olšák
cebc38ff60
nir: add nir_shader_compiler_options::lower_to_scalar
...
This will replace PIPE_SHADER_CAP_SCALAR_ISA.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com >
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
2019-10-10 15:49:18 -04:00
Ian Romanick
b418269d7d
intel/compiler: Request bitfield_reverse lowering on pre-Gen7 hardware
...
See the previous commit for the explanation of the Fixes tag.
Hurts 21 shaders in shader-db. All of the hurt shaders are in Unreal
Engine 4 tech demos.
Reviewed-by: Matt Turner <mattst88@gmail.com >
Fixes: 7afa26d4e3
("nir: Add lowering for nir_op_bitfield_reverse.")
2019-08-28 11:39:29 -07:00
Jason Ekstrand
110669c85c
st,i965: Stop looping on 64-bit lowering
...
Now that the 64-bit lowering passes do a complete lowering in one go, we
don't need to loop anymore. We do, however, have to ensure that int64
lowering happens after double lowering because double lowering can
produce int64 ops.
Reviewed-by: Eric Anholt <eric@anholt.net >
2019-07-16 16:05:16 +00:00
Jason Ekstrand
0ba508d7a3
nir,intel: Add support for lowering 64-bit nir_opt_extract_*
...
We need this when doing full software 64-bit emulation.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110309
Fixes: cbad201c2b
"nir/algebraic: Add missing 64-bit extract_[iu]8..."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
2019-07-15 16:08:37 -05:00
Ian Romanick
1259f6d802
nir: intel/vec4: Add flag to disable some algebraic optimizations
...
A couple patches later in this series use the flag to avoid a few
thousand shader-db regresions on all vec4 platforms.
I'm not particularly enamored with the name of this flag. However, I
suspect the Intel vec4 backend is the only backend that will benefit
from it. Specifically, the cases where this helps are all cases where
we want to prevent nir_opt_algebraic from rearranging instructions to
create 3-source instructions, such as ffma and flrp, with additional
immediate value or uniform sources.
The earlier commit "intel/vec4: Try to emit a single load for multiple
3-src instruction operands" solves most of the problems caused by
additional immediate values, but the restrictions on register strides
that cause problems for uniforms and shader inputs persist.
Reviewed-by: Matt Turner <mattst88@gmail.com >
2019-07-11 10:20:03 -07:00