Commit Graph

115 Commits

Author SHA1 Message Date
Christian Gmeiner
9383009809 nir: rename has_txs to has_texture_scaling
Convert it to an opt-in for backends to prefer and use nir_load_texture_scale
instead of txs for nir lowerings.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Suggested-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24054>
2023-07-12 10:03:06 +00:00
Alyssa Rosenzweig
1d4a59448c treewide: Remove use_scoped_barrier
It is now set by all relevant drivers and not checked anywhere.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23191>
2023-06-13 16:36:10 +00:00
Kenneth Graunke
a2d384a5c0 intel/compiler: Fix 64-bit ufind_msb, find_lsb, and bit_count
We only support 32-bit versions of ufind_msb, find_lsb, and bit_count,
so we need to lower them via nir_lower_int64.

Previously, we were failing to do so on platforms older than Icelake
and let those operations fall through to nir_lower_bit_size, which
used a callback to determine it should lower them for bit_size != 32.
However, that pass only emulates small bit-size operations by promoting
them to supported, larger bit-sizes (i.e. 16-bit using 32-bit).  It
doesn't support emulating larger operations (i.e. 64-bit using 32-bit).

So nir_lower_bit_size would just u2u32 the 64-bit source, causing us to
flat ignore half of the bits.

Commit 78a195f252 (intel/compiler: Postpone most int64 lowering to
brw_postprocess_nir) provoked this bug on Icelake and later as well,
by moving the nir_lower_int64 handling for ufind_msb until late in
compilation, allowing it to reach nir_lower_bit_size which broke it.

To fix this, we always set int64 lowering for these opcodes, and also
correct the nir_lower_bit_size callback to ignore 64-bit operations.

Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23123>
2023-05-19 22:44:37 +00:00
Ian Romanick
28311f9d02 nir: intel/compiler: Move ufind_msb lowering to NIR
Fossil-db results:

All Intel platforms had similar results. (Ice Lake shown)
Cycles in all programs: 9098346105 -> 9098333765 (-0.0%)
Cycles helped: 6

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Ian Romanick
0cc7bf63b7 nir: intel/compiler: Move ifind_msb lowering to NIR
Unlike ufind_msb, ifind_msb is only defined in NIR for 32-bit values, so
no @32 annotation is required.

No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Ian Romanick
15c6c859cf intel/compiler: Lower find_lsb in NIR
No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Marcin Ślusarz
432e263284 intel/compiler: fine-grained control of dispatch widths
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> [v2]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20854>
2023-01-27 11:00:41 +00:00
Nico Cortes
29adbb132f Revert "intel/compiler: fine-grained control of dispatch widths"
This reverts commit bed18ab3e2.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8063
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20654>
2023-01-12 00:33:25 +00:00
Marcin Ślusarz
bed18ab3e2 intel/compiler: fine-grained control of dispatch widths
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20535>
2023-01-11 08:17:12 +00:00
Ian Romanick
8ab7ec0129 intel/compiler: Enable lower_bitfield_extract_to_shifts and lower_bitfield_insert_to_shifts for pre-Gfx7
GLSL IR opcodes generated for bitfieldExtract and bitfieldInsert are
lowered by lower_instructions.  4dff3ff005 ("nir/opt_algebraic:
Optimize open coded bfm.") adds an optimization that can rematerialize
nir_op_bfm that was prevented by the GLSL IR lowering.

It appears that every piece of hardware, except older Intel GPUS, that
has real integers (i.e., lower_bitops is not set) also sets
lower_bitfield_extract_to_shifts and lower_bitfield_insert_to_shifts.

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 4dff3ff005 ("nir/opt_algebraic: Optimize open coded bfm.")
Closes: #7874
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>
2023-01-03 18:37:53 -08:00
Lionel Landwerlin
25608659a0 intel/compiler: mark shader_record_ptr as uniform
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20413>
2022-12-23 09:22:13 +00:00
Illia Abernikhin
aa4ac5ff8b utils: Merge util/debug.* into util/u_debug.* and remove util/debug.*
Rename env_var_as_unsigned() -> debug_get_num_option(), because duplicate
Rename env_var_as_bool() -> debug_get_bool_option(), because duplicate

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7177

Signed-off-by: Illia Abernikhin <illia.abernikhin@globallogic.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19336>
2022-11-02 07:25:39 +00:00
Kenneth Graunke
2dfab687ec intel/compiler: Vectorize gl_TessLevelInner/Outer[] writes [v2]
Setting the NIR options takes care of iris thanks to the common st/mesa
linking code, and updating brw_nir_link_shaders should handle anv.

The main effort here is updating remap_tess_levels, which needs to
handle vector stores, writemasking, and swizzling.  Unfortunately,
we also need to continue handling the existing single-component
access because it's used for TES inputs, which we don't vectorize.

We could try to vectorize TES inputs too, but they're all pushed
anyway, so it wouldn't buy us much other than deleting this code.
Also, we do have opt_combine_stores, but not one for loads.

One limitation of using nir_vectorize_tess_levels is that it works
on variables, and so isn't able to combine outer/inner writes that
happen to live in the same vec4 slot (for triangle domains).  That
said, it's still better than before.

For writes, we allow the intrinsics to supply up to the full size
of the variable (vec4 for outer, vec2 for inner) even if the domain
only requires a subset of those components (i.e. triangles needs 3).

shader-db results on Icelake:

   total instructions in shared programs: 19600314 -> 19597528 (-0.01%)
   instructions in affected programs: 65338 -> 62552 (-4.26%)
   helped: 271 / HURT: 0
   helped stats (abs) min: 6 max: 24 x̄: 10.28 x̃: 12
   helped stats (rel) min: 1.30% max: 18.18% x̄: 5.80% x̃: 7.59%
   95% mean confidence interval for instructions value: -10.71 -9.85
   95% mean confidence interval for instructions %-change: -6.17% -5.43%
   Instructions are helped.

   total cycles in shared programs: 851842332 -> 851808165 (<.01%)
   cycles in affected programs: 618577 -> 584410 (-5.52%)
   helped: 271 / HURT: 0
   helped stats (abs) min: 64 max: 540 x̄: 126.08 x̃: 111
   helped stats (rel) min: 2.57% max: 37.97% x̄: 6.12% x̃: 5.06%
   95% mean confidence interval for cycles value: -135.35 -116.80
   95% mean confidence interval for cycles %-change: -6.67% -5.57%
   Cycles are helped.

   total sends in shared programs: 1025238 -> 1024308 (-0.09%)
   sends in affected programs: 6454 -> 5524 (-14.41%)
   helped: 271 / HURT: 0
   helped stats (abs) min: 2 max: 8 x̄: 3.43 x̃: 4
   helped stats (rel) min: 5.71% max: 25.00% x̄: 14.98% x̃: 17.39%
   95% mean confidence interval for sends value: -3.57 -3.29
   95% mean confidence interval for sends %-change: -15.42% -14.54%
   Sends are helped.

According to Felix DeGrood, this results in a 10% improvement in
the draw call time for certain draw calls from Strange Brigade.

v2: Fix assertions about number of components and add more of them.
    Combine the quads and triangles handling as it's nearly identical.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19061>
2022-10-13 11:38:21 -07:00
Kenneth Graunke
b61b1d5a4c Revert "intel/compiler: Vectorize gl_TessLevelInner/Outer[] writes"
This reverts commit abba55382f.
The assertions I added late in the process broke shader-db, and my
quick fix broke CI, so let's just revert it for now and I'll resubmit
this later when it's working better.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7385
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18895>
2022-09-29 17:39:18 -07:00
Kenneth Graunke
abba55382f intel/compiler: Vectorize gl_TessLevelInner/Outer[] writes
Setting the NIR options takes care of iris thanks to the common st/mesa
linking code, and updating brw_nir_link_shaders should handle anv.

The main effort here is updating remap_tess_levels, which needs to
handle vector stores, writemasking, and swizzling.  Unfortunately,
we also need to continue handling the existing single-component
access because it's used for TES inputs, which we don't vectorize.

We could try to vectorize TES inputs too, but they're all pushed
anyway, so it wouldn't buy us much other than deleting this code.
Also, we do have opt_combine_stores, but not one for loads.

One limitation of using nir_vectorize_tess_levels is that it works
on variables, and so isn't able to combine outer/inner writes that
happen to live in the same vec4 slot (for triangle domains).  That
said, it's still better than before.

For writes, we allow the intrinsics to supply up to the full size
of the variable (vec4 for outer, vec2 for inner) even if the domain
only requires a subset of those components (i.e. triangles needs 3).

shader-db results on Icelake:

   total instructions in shared programs: 19605070 -> 19602284 (-0.01%)
   instructions in affected programs: 65338 -> 62552 (-4.26%)
   helped: 271 / HURT: 0
   helped stats (abs) min: 6 max: 24 x̄: 10.28 x̃: 12
   helped stats (rel) min: 1.30% max: 18.18% x̄: 5.80% x̃: 7.59%
   95% mean confidence interval for instructions value: -10.71 -9.85
   95% mean confidence interval for instructions %-change: -6.17% -5.43%
   Instructions are helped.

   total cycles in shared programs: 851854659 -> 851820320 (<.01%)
   cycles in affected programs: 618749 -> 584410 (-5.55%)
   helped: 271 / HURT: 0
   helped stats (abs) min: 69 max: 540 x̄: 126.71 x̃: 108
   helped stats (rel) min: 2.57% max: 37.97% x̄: 6.17% x̃: 5.06%
   95% mean confidence interval for cycles value: -135.89 -117.54
   95% mean confidence interval for cycles %-change: -6.72% -5.63%
   Cycles are helped.

   total sends in shared programs: 1025285 -> 1024355 (-0.09%)
   sends in affected programs: 6454 -> 5524 (-14.41%)
   helped: 271 / HURT: 0
   helped stats (abs) min: 2 max: 8 x̄: 3.43 x̃: 4
   helped stats (rel) min: 5.71% max: 25.00% x̄: 14.98% x̃: 17.39%
   95% mean confidence interval for sends value: -3.57 -3.29
   95% mean confidence interval for sends %-change: -15.42% -14.54%
   Sends are helped.

According to Felix DeGrood, this results in a 10% improvement in
the draw call time for certain draw calls from Strange Brigade.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17944>
2022-09-27 18:17:56 -07:00
Caio Oliveira
a1b1fdf70d intel/compiler: Rename 8_PATCH to MULTI_PATCH
Make it clearer we are dealing with multiple patches,
works better in constrast with SINGLE_PATCH.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18151>
2022-08-24 00:39:57 +00:00
Caio Oliveira
7cd06249b9 intel/compiler: Remove INTEL_DEBUG=tcs8
For Gen11 and prior, the dispatch mode for TCS was SINGLE_PATCH, and
this debug setting could be used to change it to 8_PATCH (falling back
to SINGLE_PATCH when shader couldn't be in the multi dispatch mode).
However after talking to Ken, seems this debug setting is not really
worth keeping around, so removing it.

For Gen12+ the only option is 8_PATCH, so it was always using that
dispatch mode as before.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18151>
2022-08-24 00:39:57 +00:00
Ian Romanick
95e50d198f intel/vec4: Set lower_usub_sat
Reviewed-by: Emma Anholt <emma@anholt.net>
Closes: #6900
Fixes: 90a8fb03 ("nir/lower_io: Fix array length of buffers larger than INT32_MAX.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17637>
2022-07-22 17:54:28 +00:00
Jordan Justen
4246a1ff47 intel/compiler: Don't create vec4 reg-set for gen8+
After 60e1d0f028, we know that vec4 will never be used for gen >= 8.

Ref: 60e1d0f028 ("intel/compiler: Remove INTEL_SCALAR_... env variables")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17437>
2022-07-14 17:49:01 +00:00
Kenneth Graunke
72e9843991 intel/compiler: Introduce a new brw_isa_info structure
This structure will contain the opcode mapping tables in the next
commit.  For now, this is the mechanical change to plumb it into all
the necessary places, and it continues simply holding devinfo.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17309>
2022-06-30 23:46:35 +00:00
Marcin Ślusarz
f4386b81e6 intel: fix typos found by codespell
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17191>
2022-06-27 10:20:55 +00:00
Paulo Zanoni
8f02e6cb19 intel/compiler: compute int64_options based on devinfo->has_64bit_int
Don't compute it based on devinfo->has_64bit_float. Othwerwise we may
end up emitting 64bit-int (Q) instructions on platforms with 64bit
floats but not 64bit integers.

Right now, the only platforms where has_64bit_int is different from
has_64bit_float are the platforms that use GFX7_FEATURES.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15835>
2022-06-02 23:04:39 +00:00
Timothy Arceri
d7a071a28f gallium/drivers: set force_indirect_unrolling_sampler for all required drivers
This is set to true for all drivers that have a GLSL level
of support lower than 4.00. This matches the rule for setting the
GLSL IR option EmitNoIndirectSampler.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16543>
2022-05-17 02:12:21 +00:00
Jason Ekstrand
ad0dc8e4ab intel/compiler: Set lower_fisnormal
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15985>
2022-04-16 00:26:43 +00:00
Georg Lehmann
922916bf64 nir: Move lower_usub_sat64 to nir_lower_int64_options.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15421>
2022-03-28 20:02:52 +00:00
Kenneth Graunke
b3942beecf intel/compiler: Set divergence analysis options
Although we don't use divergence analysis yet, we've had several
work-in-progress series that make use of it.  We may as well set
our options so that those series can assume they're in place.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>
2022-03-26 00:28:19 +00:00
Danylo Piliaiev
b8d486f298 nir/algebraic: Separate has_dot_4x8 into has_sdot_4x8 and has_udot_4x8
Adreno GPUs has native instruction for unsigned and mixed dot_4x8 but
not signed dot product.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>
2022-01-10 13:20:39 +02:00
Ian Romanick
ff74d5dd1b intel/compiler: Don't store "scalar stage" bits on Gfx8 or Gfx9
Since 1d71b1a311, only Gfx7 and earlier have any vec4 stages ever.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14128>
2021-12-08 14:59:32 -08:00
Dave Airlie
9bb375b0be intel/compiler: drop glsl options from brw_compiler
Only the nir options are used now, since i965 was dropped,
the glsl options come from the state tracker

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14102>
2021-12-07 08:52:36 +00:00
Caio Oliveira
db23c41537 intel/compiler: Add backend compiler basics for Task/Mesh
Task/Mesh stages are CS-like stages, and include many
builtins (e.g. workgroup ID/index) and intrinsics (e.g. workgroup
memory primitives) originally present only in CS.

This commit add two new stages (task and mesh) that 'inherit' from CS
by embedding a brw_cs_prog_data in their own prog_data structure, so
that CS functionality can be easily reused.  They also currently use
the same helpers to select the SIMD variant to use -- that was
recently added for CS.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13661>
2021-12-04 00:41:46 +00:00
Marcin Ślusarz
d05f7b4a2c intel: fix INTEL_DEBUG environment variable on 32-bit systems
INTEL_DEBUG is defined (since 4015e1876a) as:

 #define INTEL_DEBUG __builtin_expect(intel_debug, 0)

which unfortunately chops off upper 32 bits from intel_debug
on platforms where sizeof(long) != sizeof(uint64_t) because
__builtin_expect is defined only for the long type.

Fix this by changing the definition of INTEL_DEBUG to be function-like
macro with "flags" argument. New definition returns 0 or 1 when
any of the flags match.

Most of the changes in this commit were generated using:
for c in `git grep INTEL_DEBUG | grep "&" | grep -v i915 | awk -F: '{print $1}' | sort | uniq`; do
    perl -pi -e "s/INTEL_DEBUG & ([A-Z0-9a-z_]+)/INTEL_DBG(\1)/" $c
    perl -pi -e "s/INTEL_DEBUG & (\([A-Z0-9_ |]+\))/INTEL_DBG\1/" $c
done
but it didn't handle all cases and required minor cleanups (like removal
of round brackets which were not needed anymore).

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13334>
2021-10-15 19:55:14 +00:00
Ian Romanick
0f809dbf40 intel/compiler: Basic support for DP4A instruction
v2: Very significant rebase on changes to previous commits.
Specifically, brw_fs_nir.cpp changes were pretty much rewritten from
scratch after changing the NIR opcode names and types.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Ian Romanick
5ce3bfcdf3 intel/compiler: Lower 8-bit ops to 16-bit in NIR on all platforms
This fixes the Crucible func.shader.shift.int8_t test on Gen8 and Gen9.
See https://gitlab.freedesktop.org/mesa/crucible/-/merge_requests/76.

With the previous optimizations in place, this change seems to improve
the quality of the generated code.  Comparing a couple Vulkan CTS tests
on Skylake had the following results.

dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:
SIMD8 shader: 36 instructions. 1 loops. 3822 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 27 instructions. 1 loops. 2742 cycles. 0:0 spills:fills, 5 sends

dEQP-VK.spirv_assembly.type.vec3.i8.max_frag:
SIMD8 shader: 39 instructions. 1 loops. 3922 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 37 instructions. 1 loops. 3682 cycles. 0:0 spills:fills, 5 sends

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Ian Romanick
f0a8a9816a nir: intel/compiler: Add and use nir_op_pack_32_4x8_split
A lot of CTS tests write a u8vec4 or an i8vec4 to an SSBO.  This results
in a lot of shifts and MOVs.  When that pattern can be recognized, the
individual 8-bit components can be packed much more efficiently.

v2: Rebase on b4369de27f ("nir/lower_packing: use
shader_instructions_pass")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Rhys Perry
ed70b256ce nir: add ffma creation helpers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>
2021-08-16 17:19:45 +00:00
Timothy Arceri
a9ed4538ab nir: add indirect loop unrolling to compiler options
This is where it should be rather than having to pass it into the
optimisation pass every time.

It also allows us to call the loop analysis pass without having to
duplicate these options which we will do later in this series.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>
2021-08-03 10:54:50 +00:00
Sagar Ghuge
89b7a40212 intel/compiler: Enable has_iadd3 option on XeHP
shader-db result is inconclusive but doesn't harm to include it for
reference.

Shader-db result on XeHPG:

total instructions in shared programs: 1397405 -> 1397315 (<.01%)
instructions in affected programs: 88252 -> 88162 (-0.10%)
helped: 20
HURT: 7
helped stats (abs) min: 1 max: 18 x̄: 7.20 x̃: 7
helped stats (rel) min: 0.03% max: 2.20% x̄: 0.37% x̃: 0.23%
HURT stats (abs)   min: 4 max: 23 x̄: 7.71 x̃: 4
HURT stats (rel)   min: 0.10% max: 0.68% x̄: 0.22% x̃: 0.11%
95% mean confidence interval for instructions value: -6.81 0.14
95% mean confidence interval for instructions %-change: -0.42% -0.02%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 119924219 -> 119931868 (<.01%)
cycles in affected programs: 45029193 -> 45036842 (0.02%)
helped: 11
HURT: 16
helped stats (abs) min: 15 max: 5490 x̄: 1655.73 x̃: 140
helped stats (rel) min: <.01% max: 0.35% x̄: 0.11% x̃: <.01%
HURT stats (abs)   min: 1 max: 2944 x̄: 1616.38 x̃: 1743
HURT stats (rel)   min: <.01% max: 0.17% x̄: 0.09% x̃: 0.10%
95% mean confidence interval for cycles value: -606.11 1172.70
95% mean confidence interval for cycles %-change: -0.04% 0.07%
Inconclusive result (value mean confidence interval includes 0).

v2:
- Include shader-db result (Jason)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>
2021-07-16 15:59:56 +00:00
Andrii Simiklit
57f54bb9cc Remove redundant assignment
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4957
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11780>
2021-07-09 09:34:27 +00:00
Jason Ekstrand
d055ac9bdf intel/compiler: Add a U32 reloc type
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Jason Ekstrand
55508bbe66 intel/compiler: Generalize shader relocations a bit
This commit adds a delta to be added to the relocated value as well as
the possibility of multiple types of relocations.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
2021-06-22 21:09:25 +00:00
Rhys Perry
1cbcfb8b38 nir, nir/algebraic: add byte/word insertion instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:42 +00:00
Anuj Phogat
61e8636557 intel: Rename gen_device prefix to intel_device
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "gen_device" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_device/intel_device/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:33 +00:00
Anuj Phogat
926d343acf intel: Rename files with gen_debug prefix
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
find $SEARCH_PATH -type f -name "*gen_debug.*[cph]" -exec sh -c 'f="{}"; mv -- "$f" "${f/gen_debug/intel_debug}"' \;
grep -E "gen_debug" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_debug\./intel_debug\./g"
grep -E "GEN_DEBUG" -rIl $SEARCH_PATH | xargs sed -ie "s/GEN_DEBUG_H/INTEL_DEBUG_H/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:33 +00:00
Anuj Phogat
1d296484b4 intel: Rename Genx keyword to Gfxx
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "Gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/Gen\([[:digit:]]\+\)/Gfx\1/g"

Exclude changes in src/intel/perf/oa-*.xml:
find src/intel/perf -type f \( -name "*.xml" \) | xargs sed -ie "s/Gfx/Gen/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>
2021-04-02 18:33:07 +00:00
Anuj Phogat
abe9a71a09 intel: Rename gen field in gen_device_info struct to ver
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "info\)*(.|->)gen" -rIl $SEARCH_PATH | xargs sed -ie "s/info\()*\)\(\.\|->\)gen/info\1\2ver/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>
2021-04-02 18:33:07 +00:00
Christian Gmeiner
3fbde2fd93 nir: add has_txs flag
Some nir lowerings might need to know if txs is supported by
the backend.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8898>
2021-02-23 14:04:30 +00:00
Daniel Schürmann
bd8e84eb8d nir: replace .lower_sub with .has_fsub and .has_isub
This allows a more fine-grained control about whether
a backend supports one of these instructions.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597>
2021-01-11 19:13:51 +00:00
Jason Ekstrand
7280b0911d intel/compiler: Add support for bindless shaders
The Intel bindless thread dispatch model is very simple.  When a compute
shader is to be used for bindless dispatch, it can request a set of
stack IDs.  These are allocated per-dual-subslice by the hardware and
recycled automatically when the stack ID is returned.  Passed to the
bindless dispatch are a global argument address, a stack ID, and an
address of the BINDLESS_SHADER_RECORD to invoke.  When the bindless
shader is dispatched, it is passed its stack ID as well as the global
and local argument pointers.  The local argument pointer is the address
of the BINDLESS_SHADER_RECORD plus some offset which is specified as
part of the BINDLESS_SHADER_RECORD.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7356>
2020-11-25 05:37:09 +00:00
Jason Ekstrand
68092df8d8 intel/nir: Lower 8-bit ops to 16-bit in NIR on Gen11+
Intel hardware supports 8-bit arithmetic but it's tricky and annoying:

  - Byte operations don't actually execute with a byte type.  The
    execution type for byte operations is actually word.  (I don't know
    if this has implications for the HW implementation.  Probably?)

  - Destinations are required to be strided out to at least the
    execution type size.  This means that B-type operations always have
    a stride of at least 2.  This means wreaks havoc on the back-end in
    multiple ways.

  - Thanks to the strided destination, we don't actually save register
    space by storing things in bytes.  We could, in theory, interleave
    two byte values into a single 2B-strided register but that's both a
    pain for RA and would lead to piles of false dependencies pre-Gen12
    and on Gen12+, we'd need some significant improvements to the SWSB
    pass.

  - Also thanks to the strided destination, all byte writes are treated
    as partial writes by the back-end and we don't know how to copy-prop
    them.

  - On Gen11, they added a new hardware restriction that byte types
    aren't allowed in the 2nd and 3rd sources of instructions.  This
    means that we have to emit B->W conversions all over to resolve
    things.  If we emit said conversions in NIR, instead, there's a
    chance NIR can get rid of some of them for us.

We can get rid of a lot of this pain by just asking NIR to get rid of
8-bit arithmetic for us.  It may lead to a few more conversions in some
cases but having back-end copy-prop actually work is probably a bigger
bonus.  There is still a bit we have to handle in the back-end.  In
particular, basic MOVs and conversions because 8-bit load/store ops
still require 8-bit types.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>
2020-11-09 18:58:51 +00:00
Jason Ekstrand
3d22de05ca intel/fs: Add an option to use dataport messages for UBOs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3932>
2020-10-08 01:17:06 -05:00