Commit Graph

1448 Commits

Author SHA1 Message Date
Jason Ekstrand
90b6745bc8 intel/fs,vec4: Stuff the constant data from NIR in the end of the program
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>
2020-09-02 19:48:44 +00:00
Jason Ekstrand
91348d125d intel/eu: Add some new helpers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>
2020-09-02 19:48:44 +00:00
Jason Ekstrand
54ba0daa28 intel/compiler: Get rid of the global compaction table pointers
With discrete GPUs, it's going to be possible to have GPUs from two
different hardware generations in the machine at the same time.  Global
singletons like this aren't going to fly.  Have a struct containing the
pointers which gets initialized once per shader disassemble instead.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244>
2020-09-02 19:48:44 +00:00
Matt Turner
e4dadb545f intel/tools: Disassemble WAIT's argument as a destination
WAIT takes a notification register as a destination and a src0 argument.
Since the same notification register is specified in both fields, we
treat it as a special case and disassemble it only once.

If we disassemble it as if it is a source register, its scalar region
will be printed as <0,1,0>. This causes difficulties round-tripping
through the assembler <-> disassembler because that is not an acceptable
destination region. If we instead disassemble the destination, we
instead get a <1> region which is an acceptable and equivalent region
for source and destination.

The test .asm files are regenerated by round-tripping them through the
assembler/disassembler. Note that the <0> region in the tests was a
harmless mistake: the compiler translated it to a <0,1,0> source region
and a <1> destination region, since <0> isn't valid.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6543>
2020-09-02 17:18:18 +00:00
Marcin Ślusarz
ed9ac3d60c intel/fs,vec4: remove unused assignments
Reported by Coverity.

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6126>
2020-09-02 15:08:01 +00:00
Marcin Ślusarz
8e8356e3dc intel/compiler: mark debug constant as const
Should quiet Coverity's "'Constant' variable guards dead code".

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6126>
2020-09-02 15:08:01 +00:00
Marcin Ślusarz
c7a9dc76dc intel/compiler/test: use TEST_DEBUG env var consistently
Other tests use the same environment variable to decide whether they
should print debugging information.

Will quiet Coverity's "'Constant' variable guards dead code".

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6126>
2020-09-02 15:08:01 +00:00
Danylo Piliaiev
87fa645b94 intel/compiler: Fix pointer arithmetic when reading shader assembly
start_offset is a byte offset.

Fixes: 04a9951580
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6557>
2020-09-02 13:23:14 +00:00
Danylo Piliaiev
bc4a127d6e intel/disasm: Label support in shader disassembly for UIP/JIP
Shader instructions which use UIP/JIP now get formatted with a label
in addition with immediate value, labels have "LABEL%d" format.

v2: - Consider brw_jump_scale when calculating label's offset

From: "Lonnberg, Toni" <toni.lonnberg@intel.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4245>
2020-09-02 10:33:29 +00:00
Danylo Piliaiev
6cbd4764cd intel/disasm: brw_label and support functions
Pre-work for shader disassembly label support.

Introduction of the structures and functions used by the shader disassembly
jump target labeling.

From: "Lonnberg, Toni" <toni.lonnberg@intel.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4245>
2020-09-02 10:33:29 +00:00
Danylo Piliaiev
afa39d07e4 intel/disasm: Change visibility of has_uip and has_jip
Pre-work for shader disassembly label support.

From: "Lonnberg, Toni" <toni.lonnberg@intel.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4245>
2020-09-02 10:33:29 +00:00
Jason Ekstrand
ff2f44d865 intel/fs: Implement nir_intrinsic_load_global_constant
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6379>
2020-09-01 20:50:04 +00:00
Jason Ekstrand
cccb497d3c intel/fs: Fix MOV_INDIRECT and BROADCAST of Q types on Gen11+
The immediate case is pretty uncommon to see but it can happen, in
theory.  BROADCAST is typically used to uniformize values and those are
usually 32-bit.  However, it does come up in some subgroup ops.

Fixes: 49c21802cb "intel/compiler: Split has_64bit_types into float/int"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6211>
2020-09-01 13:25:20 -05:00
Karol Herbst
70cbddc4a7 nir: use enum operator helper for nir_variable_mode and nir_metadata
those are used quite a bit

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6520>
2020-09-01 17:45:08 +00:00
Jason Ekstrand
4d18e71fea nir: Rename num_shared to shared_size
This one is always a size in bytes.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6524>
2020-09-01 17:30:51 +00:00
Jason Ekstrand
e8b3bc1d55 intel/nir: Lower things with > 4 components in lower_mem_access_bit_sizes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6502>
2020-08-31 17:04:40 +00:00
Jason Ekstrand
55ae704513 intel/fs: Add support for vec8 and vec16 ops
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6502>
2020-08-31 17:04:40 +00:00
Jason Ekstrand
b7db9ee320 intel/nir: Clean up lower_alpha_to_coverage a bit
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6233>
2020-08-29 16:41:05 +00:00
Jason Ekstrand
b6fdb1405e intel/nir: Rewrite the guts of lower_alpha_to_coverage
I have no idea how this pass ever worked.  I guess it worked ok on the
one or two piglit tests but the whole thing seemed very fragile.  It
makes a number of undocumented and unasserted assumptions and they
aren't always valid.  This rewrite makes a number of changes:

 1. It now properly handles the case where the gl_SampleMask write comes
    before the gl_FragColor or gl_FragData[0] write.

 2. It should early-exit faster because it now looks at bits in
    shader_info::outputs_written instead of looking for variables.

 3. Instead of the fragile variable lookup where we try to look the
    variable up by both location and driver_location and match, we just
    use the driver_location calculations used by brw_fs_nir.

 4. It asserts that the index parameter to store_output is a constant
    instead of silently failing if it isn't.

 5. We now actually assert the implicit assumption that the two writes
    are in the same block.  We go even further and assert that they are
    in the last block in the shader.

 6. In the case where 3 or fewer components of the output are written,
    we explicitly choose to leave the sample mask alone.

Fixes: 7ecfbd4f6d "nir: Add alpha_to_coverage lowering pass"
Closes: #3166
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6233>
2020-08-29 16:41:05 +00:00
Jason Ekstrand
72dc06e07e intel/nir: Pass the nir_builder by reference in lower_alpha_to_coverage
I'm honestly not sure how passing a builder by-value ever worked.  I
guess the struct is mostly copyable.  In any case, that's the wrong way
to use it and it's causing issues.

Fixes: 7ecfbd4f6d "nir: Add alpha_to_coverage lowering pass"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6233>
2020-08-29 16:41:05 +00:00
Jason Ekstrand
c84e2784eb intel/nir: Allow splitting a single load into up to 32 loads
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6405>
2020-08-21 22:49:54 +00:00
Jason Ekstrand
febe762246 intel/fs: Fix an assert in load_scratch
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6405>
2020-08-21 22:49:54 +00:00
Jesse Natalie
d3faac7a15 nir: Add options to nir_lower_compute_system_values to control compute ID base lowering
If no options are provided, existing intrinsics are used.
If the lowering pass indicates there should be offsets used for global
invocation ID or work group ID, then those instructions are lowered to
include the offset.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891>
2020-08-21 22:07:05 +00:00
Jesse Natalie
2e1df6a17f nir: Move compute system value lowering to a separate pass
The actual variable -> intrinsic lowering stays where it is, but
ops which convert one intrinsic to be implemented in terms of
another have moved.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891>
2020-08-21 22:07:05 +00:00
Karol Herbst
e5899c1e88 nir: rename nir_op_fne to nir_op_fneu
It was always fneu but naming it fne causes confusion from time to time. So
lets rename it. Later we also want to add other unordered and fne, this is
a smaller preparation for that.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6377>
2020-08-21 17:26:21 +00:00
Jason Ekstrand
1ccd681109 nir: Add an LOD parameter to image_*_size
The OpenCL image_width/height/depth functions have variants which can
take an LOD parameter.  More importantly, LLVM-SPIRV-Translator always
generates OpImageQuerySizeLod even if the LOD is guaranteed to be zero.
Given that over half the hardware out there has an LOD field for image
size queries (based on a rudimentary scan through their NIR -> whatever
code), we may as well just add the source to the NIR intrinsic.  If this
is ever a problem for anyone, the lowering is pretty trivial.

I've also added asserts to everyone's drivers that should alert them if
they ever see an LOD other than zero.  This will never happen with GL or
Vulkan so there's no need for panic.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6396>
2020-08-20 20:48:10 +00:00
Louis-Francis Ratté-Boulianne
7dcb1d272f st/mesa: Replace UsesStreams by ActiveStreamMask for GS
Some drivers need to know which streams are used by a geometry
shader. Adding a mask of active streams makes the use of
UsesStreams superfluous as it's the equivalent of:

    ActiveStreamMask != (1 << 0)

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5984>
2020-08-18 11:17:26 +00:00
Jason Ekstrand
003b04e266 intel/compiler: Allow MESA_SHADER_KERNEL
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6280>
2020-08-12 10:11:06 +00:00
Caio Marcelo de Oliveira Filho
e2b6ccbdad intel/compiler: Use C99 array initializers for prog_data/key sizes
This is way better than a pile of STATIC_ASSERTs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6280>
2020-08-12 10:11:06 +00:00
Jason Ekstrand
8e1de8e5ac intel/cs_intrinsics: Handle 64-bit intrinsics
It's safe to do the math in 32 bits because they're all local workgroup
calculations.  We just need to do a conversion at the end.  For a couple
of intrinsics, we just turn them into 32-bit intrinsics and add a u2u64.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6280>
2020-08-12 10:11:06 +00:00
Mark Janes
cf52b40fb0 intel/fs: work around gen12 lower-precision source modifier limitation
GEN:BUG:1604601757 prevents source modifiers for multiplication of
lower precision integers.

lower_mul_dword_inst() splits 32x32 multiplication into 32x16, and
needs to eliminate source modifiers in this case.

Closes: #3329
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2020-08-10 13:30:45 -07:00
Mark Janes
ee06e47c5b intel/fs: Assert if lower_source_modifiers converts 32x16 to 32x32 multiplication
Lowering source modifiers will convert a 16bit source to a 32bit
value.  In the case of integer multiplication, this will reverse
previous lowering performed by lower_mul_dword_inst.

Assert to prevent an illegal DxD operation (and GPU hang).

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2020-08-10 13:29:56 -07:00
Eric Anholt
023e6669cc i965: Enable vector shrinking in the vec4 backend.
This manages to make some extra vec operations that would turn into movs
go away.

brw shader-db:
total instructions in shared programs: 3895037 -> 3893221 (-0.05%)
total cycles in shared programs: 113832759 -> 113792154 (-0.04%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6050>
2020-08-03 21:26:45 +00:00
Matt Turner
c883c482be intel/compiler: Relax SENDS regioning assertions
The next commit fixes a mistake in the assembler and ends up running
afoul of this assertion.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5956>
2020-07-31 12:59:24 -07:00
David Stevens
6c11a7994d i965/i915: Add colorspace support to YUV sampling
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6122>
2020-07-31 07:27:03 +00:00
Boris Brezillon
025988f818 intel: Set int64_options to ~0 when lowering 64b ops
That's more future proof than setting each bit manually. Looks like we
already miss nir_lower_ufind_msb64 because of that.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5588>
2020-07-30 16:54:24 +00:00
Boris Brezillon
bfee35b45c nir: Stop passing an options arg to nir_lower_int64()
This information is exposed through shader->options->lower_int64_options.
Removing the extra arg forces drivers to initialize this field correctly.

This also allows us to check the int64 lowering options from each int64
lowering helper and decide if we should lower the instructions we
introduce.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5588>
2020-07-30 16:54:24 +00:00
Marcin Ślusarz
cb19fe24d3 intel/vec4: fix out of bounds read
NIR_MAX_VEC_COMPONENTS was bumped from 4 to 16 in a8ec4082
(2019.03.09, merged 2019.12.21)

float[4] array was added in acd7796a
(2019.06.11, merged 2019.07.11)

Found by Coverity.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3014

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Fixes: a8ec4082a4 ("nir+vtn: vec8+vec16 support")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6067>
2020-07-30 10:41:00 +00:00
Jason Ekstrand
5c5555a862 nir: Add a find_variable_with_[driver_]location helper
We've hand-rolled this loop 10 places and those are just the ones I
found easily.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5966>
2020-07-29 17:38:58 +00:00
Jason Ekstrand
2956d53400 nir: Add nir_foreach_shader_in/out_variable helpers
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5966>
2020-07-29 17:38:57 +00:00
Francisco Jerez
4d73988f6f intel/ir/gen12+: Work around FS performance regressions due to SIMD32 discard divergence.
This avoids some performance regressions on Gen12 platforms caused by
SIMD32 fragment shaders reported in titles like Dota2, TF2, Xonotic,
and GFXBench5 Car Chase and Aztec Ruins.

The most obvious pattern in the regressing shaders I identified among
these workloads is that they all had non-uniform discard statements,
which are handled rather optimistically by the current IR analysis
pass: No penalty is currently applied to the SIMD32 variant of the
shader in the form of differing branching weights like we do for other
control flow instructions in order to account for the greater
likelihood of divergence of a SIMD32 shader.

Simply changing that by giving the same treatment to discard
statements as we give to other branching instructions seemed to hurt
more than it helped on platforms earlier than Gen12, since it reversed
most of the improvement obtained from SIMD32 fragment shaders in
Manhattan for no measurable benefit in other workloads (Manhattan has
a handful of shaders with statically non-uniform discard statements
which actually perform better in SIMD32 mode due to their approximate
dynamic uniformity).  For that reason this change is applied to Gen12+
platforms only.

I've been running a number of tests trying to understand the
difference in behavior between Gen12 and earlier platforms, and most
of the evidence I've gathered seems to point at EU fusion being the
culprit: Unlike previous generations, on Gen12 EUs are arranged in
pairs which execute instructions in lockstep, giving an effective warp
size of 64 threads in SIMD32 mode, which seems to increase the
likelihood for control flow divergence in some of the affected shaders
significantly.

Fixes: 188a3659ae "intel/ir: Import shader performance analysis pass."
Reported-by: Caleb Callaway <caleb.callaway@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5910>
2020-07-23 01:40:06 +00:00
Jason Ekstrand
675d7b19a9 intel/fs: Use the correct logical op for global float atomics
Fixes: e644ed468f "intel/fs: Implement nir_intrinsic_global_atomic_*"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992>
2020-07-21 05:01:34 +00:00
Yevhenii Kolesnikov
36abb0c691 intel/compiler: don't propagate cmp to add if add is saturated
From the Kaby Lake PRM Vol. 7 "Assigning Conditional Flags":

   * Note that the [post condition signal] bits generated at
     the output of a compute are before the .sat.

Paragraph about post_zero does not mention saturation, but
testing it on actual GPUs shows that conditional modifiers
are applied after saturation.

   * post_zero bit: This bit reflects whether the final
     result is zero after all the clamping, normalizing,
     or format conversion logic.

For signed types we don't care about saturation: it won't
change the result of conditional modifier.

For floating and unsigned types there two special cases,
when we can remove inst even if scan_inst is saturated: G
and LE. Since conditional modifiers are just comparations
against zero, saturating positive values to the upper
limit never changes the result of comparation.

For negative values:
(sat(x) >  0) == (x >  0) --- false
(sat(x) <= 0) == (x <= 0) --- true

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2610

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4167>
2020-07-11 00:25:48 +00:00
Jordan Justen
8dfa072ed8 intel/compiler/fs: Still attempt simd32 when INTEL_DEBUG=no16 is used
If INTEL_DEBUG=no16 is used, then simd16 will not be attempted. This,
in turn prevents simd32 from running, because we attempt to skip
simd32 when simd16 fails to compile.

This change more accurately recognizes when we attempted simd16, but
simd16 failed.

One easy way to cause an issue is to set both no8 and no16. Before
this change, we would be left with no FS program, even though simd32
could still be generated in some cases.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5269>
2020-07-09 15:44:57 -07:00
Jordan Justen
1a4a2f563b intel/compiler/cs: Allow simd32 in some more cases with no8 and/or no16
If no16 was specified, and the shader can't run in simd8 due to the
local_size, then we need to generate a simd32 program.

If both no8 and no16 are specified, then we need to generate a simd32
program.

Rework:
 * Drop update of `if` that would have changed `do32` to try simd32
   even if simd16 spilled registers. (Caio)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5269>
2020-07-09 15:44:34 -07:00
Timothy Arceri
1a8f918050 intel/compiler: add and fix up fallthrough comments for gcc warnings
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5714>
2020-07-02 12:11:30 +10:00
Matt Turner
8da810a7fb intel/compiler: Don't emit no-op cr0 changes
If mask is 0, we're asking for no changes to cr0.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5566>
2020-07-02 01:24:06 +00:00
Matt Turner
fe14dc98bf intel/compiler: Add assert that set bits are within mask
We generate bitfields of bits that we want to retain (mask) and bits
that we want to set (brw_mode) in the cr0 register, so the bits we want
to set should be in the set of bits we want to retain.

Also, remove the initialization of mask from
fs_visitor::emit_shader_float_controls_execution_mode since
brw_rnd_mode_from_nir initializes the mask parameter unconditionally.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5566>
2020-07-02 01:24:06 +00:00
Jason Ekstrand
561aaeeb48 intel/eu: Add the RNDU opcode
We don't want to use it on gen5 and earlier because only RNDD can be
done with a single instruction and we can implement RNDU(x) as -RNDD(-x)
so it's better to just do that when we have the instruction.  On gen6
and above, we may as well just use the right instruction.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>
2020-06-23 17:43:54 +00:00
Jason Ekstrand
e0ab48e3ea intel/eu: Set the right subnr for ALIGN16 destinations
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5596>
2020-06-23 17:43:54 +00:00