Commit Graph

109 Commits

Author SHA1 Message Date
Lionel Landwerlin
952a523abb intel: switch over to unified atomics
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23004>
2023-05-15 16:32:21 +00:00
Kenneth Graunke
14f9f98dcb i965/vec4: Implement uclz in the vec4 backend
Commit 28311f9d02 moved ufind_msb lowering to NIR and started emitting
uclz.  Unfortunately, the vec4 backend never actually implemented uclz.

It's trivial to do.  Now it does.

Fixes: 28311f9d02 ("nir: intel/compiler: Move ufind_msb lowering to NIR")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21974>
2023-03-17 09:01:18 +00:00
Kenneth Graunke
84197bc0a4 intel/vec4: Retype texture/sampler indexes to UD
generate_tex() asserts that sampler_index.type == UD, but commit
83fd7a5ed1 removed the uint temporary, which caused us to see D at
some points.  Really, either should be fine, but let's just put the
UD retype back.  This fixes a ton of things in crocus.

Fixes: 83fd7a5ed1 ("intel: Use nir_lower_tex_options::lower_index_to_offset")
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21974>
2023-03-17 09:01:18 +00:00
Ian Romanick
28311f9d02 nir: intel/compiler: Move ufind_msb lowering to NIR
Fossil-db results:

All Intel platforms had similar results. (Ice Lake shown)
Cycles in all programs: 9098346105 -> 9098333765 (-0.0%)
Cycles helped: 6

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Ian Romanick
08ca862ef8 intel/compiler: Tighter src and dest size bounds checking for some opcodes
Enforce the sizes listed in the Skylake PRM:

BFREV:
    source types: *D
    destination types: *D

CBIT:
    source types: UB, UW, UD
    destination types: UD

FBH:
    source types: D, UD
    destination types: UD

FBL:
    source types: UD
    destination types: UD

LZD:
    source types: D, UD
    destination types: UD

v2: Update BFREV commit message documentation. Suggested by Ken.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Ian Romanick
0cc7bf63b7 nir: intel/compiler: Move ifind_msb lowering to NIR
Unlike ufind_msb, ifind_msb is only defined in NIR for 32-bit values, so
no @32 annotation is required.

No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Ian Romanick
15c6c859cf intel/compiler: Lower find_lsb in NIR
No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>
2023-03-10 15:27:17 +00:00
Caio Oliveira
c92d589597 intel/compiler: Drop non-scoped barrier handling
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21634>
2023-03-07 00:41:13 +00:00
Caio Oliveira
dfc34b1a65 intel/vec4: Handle scoped barriers with execution scope
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21634>
2023-03-07 00:41:13 +00:00
Faith Ekstrand
83fd7a5ed1 intel: Use nir_lower_tex_options::lower_index_to_offset
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21546>
2023-03-06 21:38:32 +00:00
Caio Oliveira
c80268a20d intel/compiler: Mark various memory barriers intrinsics unreachable
Now that both SPIR-V and GLSL are using scoped barriers, we can stop
handling the specialized ones.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3339>
2023-02-27 20:24:01 +00:00
Daniel Schürmann
2bb369dd8d nir: add assertions that loops don't have a Continue Construct
Hoping that I didn't miss any, this *should* add assertions
to all functions and passes which explicitly handle 'nir_loop'.

Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>
2023-02-21 10:41:11 +00:00
Väinö Mäkelä
dcad4a2cd1 intel/vec4: Set the rounding mode
The rounding mode only needs to be set once, because 16-bit floats or
preserving denorms aren't supported for the platforms where vec4 is
used.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20232>
2023-02-10 16:34:00 +00:00
Ian Romanick
ea413e826b nir: Eliminate nir_op_f2b
Builds on the work of !15121.  This gets to delete even more code
because many drivers shared a lot of code for i2b and f2b.

No shader-db or fossil-db changes on any Intel platform.

v2: Rebase on 1a35acd8d9.

v3: Update a comment in nir_opcodes_c.py. Suggested by Konstantin.

v4: Another rebase. Remove f2b stuff from Midgard.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20509>
2023-02-03 22:39:57 +00:00
Kenneth Graunke
90a2137cd5 intel/compiler: Use LSC opcode enum rather than legacy BRW_AOPs
This gets our logical atomic messages using the lsc_opcode enum rather
than the legacy BRW_AOP_* defines.  We have to translate one way or
another, and using the modern set makes sense going forward.

One advantage is that the lsc_opcode encoding has opcodes for both
integer and floating point atomics in the same enum, whereas the legacy
encoding used overlapping values (BRW_AOP_AND == 1 == BRW_AOP_FMAX),
which made it impossible to handle both sensibly in common code.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>
2023-01-19 08:42:22 +00:00
Ian Romanick
eb76cee9f8 nir: Eliminate nir_op_i2b
There are a lot of optimizations in opt_algebraic that match ('ine', a,
0), but there are almost none that match i2b.  Instead of adding a huge
pile of additional patterns (including variations that include both ine
and i2b), always lower i2b to a != 0.

At this point in the series, it should be impossible for anything to
generate i2b, so there /should not/ be any changes.

The failing test on d3d12 is a pre-existing bug that is triggered by
this change.  I talked to Jesse about it, and, after some analysis, he
suggested just adding it to the list of known failures.

v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b.

v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py.

v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after
nir_lower_doubles makes progress.  The latter can generate b2i
instructions, but nir_lower_int64 can't handle them (anymore).

v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I
had accidentally removed the f2b(bf2(x)) optimization.

v6: Just eliminate the i2b instruction.

v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused)
emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this
function was still used. 🤷

No shader-db changes on any Intel platform.

All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141165875 -> 141165873 (-0.0%)
Instructions helped: 2

Cycles in all programs: 9098956382 -> 9098956350 (-0.0%)
Cycles helped: 2

The two Vulkan shaders are helped because of the "new" (('b2i32',
('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern.

Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version]
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
2022-12-14 06:23:21 +00:00
Kenneth Graunke
584e18863e intel: Drop compressed_multisample_layout_mask from the compiler keys
The compiler looks at this key field to determine whether to perform
an MCS fetch for a txf_ms or samples_identical texture message, if a
nir_tex_src_ms_mcs_intel source wasn't provided.  If it isn't set,
it instead uses constant 0 (nothing is compressed).

All of the drivers (iris, crocus, anv, hasvk) unconditionally set this
to ~0 because we don't want to pay for costly shader recompiles (which
can cause nasty stuttering).  Most textures are compressed anyway, and
the hardware ignores the l2dms MCS parameter if MCS is disabled.

The only user was BLORP, which sets the key field based on whether the
texture's aux usage has MCS.  But if it has MCS, it also does the MCS
fetch itself and supplies it directly.  Otherwise, it relies on the
compiler to fill in the 0 value.  But it could easily just provide the
0 value itself in that case and not rely on the compiler at all.

With that fixed, we can just drop the key fields entirely.  We leave
them as padding for now to avoid repacking structures; we won't need
to after the next commits anyway.

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20223>
2022-12-09 10:18:25 +00:00
Emma Anholt
5f66a927ec gallium,glsl: Delete PIPE_CAP_VERTEXID_NOBASE and lower_vertex_id.
Every driver uses the nir_lower_system_values path now.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18327>
2022-08-31 22:57:03 +00:00
Kenneth Graunke
bb5d09da6c intel/compiler: Use named NIR intrinsic const index accessors
In the early days of NIR, you had to prod at inst->const_index[]
directly, but a long while back, we added handy accessor functions
that let you use the actual name of the thing you want instead of
memorizing the exact order of parameters.

Also rewrite a comment I had a hard time parsing.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18067>
2022-08-16 05:44:30 +00:00
Marcin Ślusarz
f4386b81e6 intel: fix typos found by codespell
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17191>
2022-06-27 10:20:55 +00:00
Emma Anholt
464b32c030 glsl: Drop the div-to-mul-rcp lowering for floats.
NIR has fdiv, and all the NIR backends have to have lower_fdiv set
appropriately already since various passes (format conversions,
tgsi_to_nir, nir_fast_normalize(), etc.) might generate one.

This causes softpipe and llvmpipe to now do actual divides, since
lower_fdiv is not set there.  Note that llvmpipe's rcp implementation is a
divide of 1.0 by x, so now we're going to be just doing div(x, y) instead
of mul(x, div(1.0, y)).

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16823>
2022-06-07 02:38:42 +00:00
Emma Anholt
3a42e92a4f glsl: Drop the dead MOD_TO_FLOOR path.
It's now called lower_fmod in NIR.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8044>
2022-05-05 22:25:03 +00:00
Dave Airlie
e12b0d0d60 intel/compiler: remove gfx6 gather wa from backend.
Crocus lowers this in the frontend, they key member is still used
but reset prior to backend.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14202>
2021-12-22 21:37:55 +00:00
Kenneth Graunke
d3f4f23ca3 intel/vec4: Inline emit_texture and move helpers to brw_vec4_nir.cpp
emit_texture() only has one caller, nir_emit_texture().  We may as well
inline that.  Move the associated helper functions for emitting sampler
messages there as well, to keep associated code nearby.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5183
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14191>
2021-12-16 00:09:45 -08:00
Kenneth Graunke
92d194427d intel/vec4: Use nir_texop in emit_texture instead of translating
We eliminated the GLSL IR -> vec4 backend ages ago, so the only caller
uses a nir_texop enum.  Drop a layer of translating.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14191>
2021-12-16 00:09:44 -08:00
Jason Ekstrand
278d12f991 intel/fs,vec4: Drop prog_data binding tables
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14056>
2021-12-10 21:20:47 +00:00
Lionel Landwerlin
361b3fee3c intel: move away from booleans to identify platforms
v2: Drop changes around GFX_VERx10 == 75 (Luis)

v3: Replace
   (GFX_VERx10 < 75 && devinfo->platform != INTEL_PLATFORM_BYT)
   by
   (devinfo->platform == INTEL_PLATFORM_IVB)
   Replace
   (devinfo->ver >= 5 || devinfo->platform == INTEL_PLATFORM_G4X)
   by
   (devinfo->verx10 >= 45)
   Replace
   (devinfo->platform != INTEL_PLATFORM_G4X)
   by
   (devinfo->verx10 != 45)

v4: Fix crocus typo

v5: Rebase

v6: Add GFX3, ILK & I965 platforms (Jordan)
    Move ifdef to code expressions (Jordan)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12981>
2021-11-08 16:48:06 +00:00
Ian Romanick
a9120eccff intel/compiler: Move type_is_unsigned_int to brw_reg_type.h
...and rename it to brw_reg_type_is_unsigned_integer.  It is now next to
brw_reg_type_is_floating_point and brw_reg_type_is_integer.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
2021-08-30 14:00:14 -07:00
Ian Romanick
f9665040f1 intel/compiler: Document and assert some aspects of 8-bit integer lowering
In the vec4 compiler, 8-bit types should never exist.

In the scalar compiler, 8-bit types should only ever be able to exist on
Gfx ver 8 and 9.

Some instructions are handled in non-obvious ways.

Hopefully this will save the next person some time.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Jason Ekstrand
3d934ee03f glsl: Delete lower_texture_projection
This is only used by i965 and we've been getting it through
nir_lower_tex since forever.  Get rid of the GLSL IR pass.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11827>
2021-07-13 14:06:33 +00:00
Jason Ekstrand
06ae2723d1 intel/vec4: Also use MOV_FOR_SCRATCH for swizzle resolves
In 2db8867943, we introduced a new meta-op MOV_FOR_SCRATCH which is
identical to MOV except it lets us identify MOVs emitted during spilling
so we know not to re-spill those instructions.  We emit them from
shuffle_for_64bit_data whenever the new for_scratch parameter is true.
Unfortunately, we missed the one used for resolving swizzles.

Fixes: 2db8867943 "intel/vec4: Don't spill fp64 registers more..."
Tested-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11155>
2021-06-03 06:14:17 +00:00
Jason Ekstrand
ebba3cad81 intel/vec4: Add support for UBO pushing
Shader-db results on Haswell (vec4 only):

    total instructions in shared programs: 2853928 -> 2726576 (-4.46%)
    instructions in affected programs: 855840 -> 728488 (-14.88%)
    helped: 9500
    HURT: 18
    helped stats (abs) min: 1 max: 359 x̄: 13.54 x̃: 11
    helped stats (rel) min: 0.44% max: 53.33% x̄: 19.13% x̃: 17.44%
    HURT stats (abs)   min: 4 max: 124 x̄: 71.00 x̃: 92
    HURT stats (rel)   min: 3.64% max: 77.86% x̄: 46.43% x̃: 52.12%
    95% mean confidence interval for instructions value: -13.78 -12.98
    95% mean confidence interval for instructions %-change: -19.21% -18.81%
    Instructions are helped.

    total cycles in shared programs: 101822616 -> 60245580 (-40.83%)
    cycles in affected programs: 93312382 -> 51735346 (-44.56%)
    helped: 13292
    HURT: 4506
    helped stats (abs) min: 2 max: 1229260 x̄: 3370.82 x̃: 776
    helped stats (rel) min: 0.04% max: 96.70% x̄: 47.56% x̃: 43.76%
    HURT stats (abs)   min: 2 max: 17644 x̄: 716.37 x̃: 82
    HURT stats (rel)   min: 0.02% max: 491.80% x̄: 41.00% x̃: 11.11%
    95% mean confidence interval for cycles value: -3037.07 -1635.03
    95% mean confidence interval for cycles %-change: -26.03% -24.25%
    Cycles are helped.

    total spills in shared programs: 1080 -> 1314 (21.67%)
    spills in affected programs: 74 -> 308 (316.22%)
    helped: 0
    HURT: 47

    total fills in shared programs: 310 -> 497 (60.32%)
    fills in affected programs: 71 -> 258 (263.38%)
    helped: 0
    HURT: 47

    total sends in shared programs: 239884 -> 151799 (-36.72%)
    sends in affected programs: 129302 -> 41217 (-68.12%)
    helped: 9547
    HURT: 0
    helped stats (abs) min: 1 max: 226 x̄: 9.23 x̃: 8
    helped stats (rel) min: 3.12% max: 98.15% x̄: 72.38% x̃: 80.00%
    95% mean confidence interval for sends value: -9.48 -8.98
    95% mean confidence interval for sends %-change: -72.80% -71.97%
    Sends are helped.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10571>
2021-05-19 14:38:13 +00:00
Jason Ekstrand
2db8867943 intel/vec4: Don't spill fp64 registers more than once
The way we handle spilling for fp64 in vec4 is to emit a series of MOVs
which swizzles the data around and then a pair of 32-bit spills.  This
works great except that the next time we go to pick a spill reg, the
compiler isn't smart enough to figure out that the register has already
been spilled.  Normally we do this by looking at the sources of spill
instructions (or destinations of fills) but, because it's separated from
the actual value by a MOV, we can't see it.  This commit adds a new
opcode VEC4_OPCODE_MOV_FOR_SCRATCH which is identical to MOV in
semantics except that it lets RA know not to spill again.

Fixes: 82c69426a5 "i965/vec4: support basic spilling of 64-bit registers"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10571>
2021-05-19 14:38:13 +00:00
Anuj Phogat
61e8636557 intel: Rename gen_device prefix to intel_device
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "gen_device" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_device/intel_device/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:33 +00:00
Michel Dänzer
2928c21eb7 Convert most remaining free-form fall-through comments to FALLTHROUGH
One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is
explicitly disabled.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>
2021-04-15 16:01:22 +00:00
Anuj Phogat
b75f095bc7 intel: Rename genx keyword to gfxx in source files
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/gen\([[:digit:]]\+\)/gfx\1/g"

Exclude pack.h and xml changes in this patch:
grep -E "gfx[[:digit:]]+_pack\.h" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+_pack\.h\)/gen\1/g"
grep -E "gfx[[:digit:]]+\.xml" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+\.xml\)/gen\1/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>
2021-04-02 18:33:07 +00:00
Anuj Phogat
c1f3a778de intel: Rename GENx prefix in macros to GFXx in source files
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "GEN" -rIl src/intel/genxml | grep -E ".*py" |  xargs sed -ie "s/GEN\([%{]\)/GFX\1/g"
grep -E "[^_]GEN[[:digit:]]+" -rIl $SEARCH_PATH | grep -E ".*(\.c|\.h|\.y|\.l)" | xargs sed -ie "s/\([^_]\)GEN\([[:digit:]]\+\)/\1GFX\2/g"

Leave out renaming GFX12_CCS_E macros. They fall under renaming pattern like "_GEN[[:digit:]]+":
grep -E "GFX12_CCS_E" -rIl $SEARCH_PATH | xargs sed -ie "s/GFX12_CCS_E/GEN12_CCS_E/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>
2021-04-02 18:33:07 +00:00
Anuj Phogat
abe9a71a09 intel: Rename gen field in gen_device_info struct to ver
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "info\)*(.|->)gen" -rIl $SEARCH_PATH | xargs sed -ie "s/info\()*\)\(\.\|->\)gen/info\1\2ver/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>
2021-04-02 18:33:07 +00:00
Connor Abbott
68969cbcb7 brw/vec4: Don't convert tex dest type to glsl_type
We were using nir_tex_instr::dest_type to a glsl_type, then passing it
to emit_texture(), only to just check the number of components. Just
pass the number of components directly. This lets us delete
brw_glsl_base_type_for_nir_type, which was asserting with
nir_texop_all_samples_equal because it didn't handle bool32.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7989>
2021-01-25 11:21:42 +01:00
Ian Romanick
67956689bb nir: Rename replicated-result dot-product instructions
All these instructions replicate the result of a N-component dot-product
to a vec4.  Naming them fdot_replicatedN gives the impression that are
some sort of abstract dot-product that replicates the result to a vecN.
They also deviate from fdph_replicated... which nobody would reasonably
consider naming fdot_replicatedh.

Naming these opcodes fdotN_replicated more closely matches what they
are, and it matches the pattern of fdph_replicated.

I believe that the only reason these opcodes were named this way was
because it simplified the implementation of the binop_reduce function in
nir_opcodes.py.  I made some fairly simple changes to that function, and
I think the end result is ok.

The bulk of the changes come from the sed rename:

    sed --in-place -e 's/fdot_replicated\([234]\)/fdot\1_replicated/g' \
        $(grep -r 'fdot_replicated[234]' src/)

v2: Use a named parameter to binop_reduce instead of using
isinstance(name, str).  Suggested by Jason.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5725>
2020-10-22 18:00:19 +00:00
Ian Romanick
2a49007411 intel/vec4: Remove all support for Gen8+ [v2]
v2: Restore the gen == 10 hunk in brw_compile_vs (around line 2940).
This function is also used for scalar VS compiles.  Squash in:

    intel/vec4: Reindent after removing Gen8+ support
    intel/vec4: Silence unused parameter warning in try_immediate_source

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826>
2020-09-28 11:43:10 -07:00
Jason Ekstrand
9750164c09 nir: Rename get_buffer_size to get_ssbo_size
This makes it explicit that this intrinsic is only for SSBOs.  For the
v3dv driver, we'll be adding a get_ubo_size intrinsic and we want to be
able to distinguish between the two.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6812>
2020-09-22 13:34:12 +00:00
Marcin Ślusarz
ed9ac3d60c intel/fs,vec4: remove unused assignments
Reported by Coverity.

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6126>
2020-09-02 15:08:01 +00:00
Karol Herbst
e5899c1e88 nir: rename nir_op_fne to nir_op_fneu
It was always fneu but naming it fne causes confusion from time to time. So
lets rename it. Later we also want to add other unordered and fne, this is
a smaller preparation for that.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6377>
2020-08-21 17:26:21 +00:00
Marcin Ślusarz
cb19fe24d3 intel/vec4: fix out of bounds read
NIR_MAX_VEC_COMPONENTS was bumped from 4 to 16 in a8ec4082
(2019.03.09, merged 2019.12.21)

float[4] array was added in acd7796a
(2019.06.11, merged 2019.07.11)

Found by Coverity.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3014

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Fixes: a8ec4082a4 ("nir+vtn: vec8+vec16 support")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6067>
2020-07-30 10:41:00 +00:00
Boris Brezillon
345b5847b4 nir: Replace the scoped_memory barrier by a scoped_barrier
SPIRV OpControlBarrier can have both a memory and a control barrier
which some hardware can handle with a single instruction. Let's
turn the scoped_memory_barrier into a scoped barrier which can embed
both barrier types. Note that control-only or memory-only barriers can
be supported through this new intrinsic by passing NIR_SCOPE_NONE to the
unused barrier type.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4900>
2020-06-03 07:39:52 +00:00
Caio Marcelo de Oliveira Filho
f858fa26b4 intel/fs,vec4: Pull stall logic for memory fences up into the IR
Instead of emitting the stall MOV "inside" the
SHADER_OPCODE_MEMORY_FENCE generation, use the scheduling fences when
creating the IR.

For IvyBridge, every (data cache) fence is accompained by a render
cache fence, that now is explicit in the IR, two
SHADER_OPCODE_MEMORY_FENCEs are emitted (with different SFIDs).

Because Begin and End interlock intrinsics are effectively memory
barriers, move its handling alongside the other memory barrier
intrinsics.  The SHADER_OPCODE_INTERLOCK is still used to distinguish
if we are going to use a SENDC (for Begin) or regular SEND (for End).

This change is a preparation to allow emitting both SENDs in Gen11+
before we can stall on them.

Shader-db results for IVB (i965):

    total instructions in shared programs: 11971190 -> 11971200 (<.01%)
    instructions in affected programs: 11482 -> 11492 (0.09%)
    helped: 0
    HURT: 8
    HURT stats (abs)   min: 1 max: 3 x̄: 1.25 x̃: 1
    HURT stats (rel)   min: 0.03% max: 0.50% x̄: 0.14% x̃: 0.10%
    95% mean confidence interval for instructions value: 0.66 1.84
    95% mean confidence interval for instructions %-change: 0.01% 0.27%
    Instructions are HURT.

  Unlike the previous code, that used the `mov g1 g2` trick to force
  both `g1` and `g2` to stall, the scheduling fence will generate `mov
  null g1` and `mov null g2`.  During review it was decided it was not
  worth keeping the special codepath for the small effect will have.

Shader-db results for HSW (i965), BDW and SKL don't have a change
on instruction count, but do report changes in cycles count, showing
SKL results below

    total cycles in shared programs: 341738444 -> 341710570 (<.01%)
    cycles in affected programs: 7240002 -> 7212128 (-0.38%)
    helped: 46
    HURT: 5
    helped stats (abs) min: 14 max: 1940 x̄: 676.22 x̃: 154
    helped stats (rel) min: <.01% max: 2.62% x̄: 1.28% x̃: 0.95%
    HURT stats (abs)   min: 2 max: 1768 x̄: 646.40 x̃: 362
    HURT stats (rel)   min: <.01% max: 0.83% x̄: 0.28% x̃: 0.08%
    95% mean confidence interval for cycles value: -777.71 -315.38
    95% mean confidence interval for cycles %-change: -1.42% -0.83%
    Cycles are helped.

  This seems to be the effect of allocating two registers separatedly
  instead of a single one with size 2, which causes different register
  allocation, affecting the cycle estimates.

while ICL also has not change on instruction count but report changes
negative changes in cycles

    total cycles in shared programs: 352665369 -> 352707484 (0.01%)
    cycles in affected programs: 9608288 -> 9650403 (0.44%)
    helped: 4
    HURT: 104
    helped stats (abs) min: 24 max: 128 x̄: 88.50 x̃: 101
    helped stats (rel) min: <.01% max: 0.85% x̄: 0.46% x̃: 0.49%
    HURT stats (abs)   min: 2 max: 2016 x̄: 408.36 x̃: 48
    HURT stats (rel)   min: <.01% max: 3.31% x̄: 0.88% x̃: 0.45%
    95% mean confidence interval for cycles value: 256.67 523.24
    95% mean confidence interval for cycles %-change: 0.63% 1.03%
    Cycles are HURT.

  AFAICT this is the result of the case above.

Shader-db results for TGL have similar cycles result as ICL, but also
affect instructions

    total instructions in shared programs: 17690586 -> 17690597 (<.01%)
    instructions in affected programs: 64617 -> 64628 (0.02%)
    helped: 55
    HURT: 32
    helped stats (abs) min: 1 max: 16 x̄: 4.13 x̃: 3
    helped stats (rel) min: 0.05% max: 2.78% x̄: 0.86% x̃: 0.74%
    HURT stats (abs)   min: 1 max: 65 x̄: 7.44 x̃: 2
    HURT stats (rel)   min: 0.05% max: 4.58% x̄: 1.13% x̃: 0.69%
    95% mean confidence interval for instructions value: -2.03 2.28
    95% mean confidence interval for instructions %-change: -0.41% 0.15%
    Inconclusive result (value mean confidence interval includes 0).

  Now that more is done in the IR, more dependencies are visible and
  more SWSB annotations are emitted.  Mixed with different register
  allocation decisions like above, some shaders will see more `sync
  nops` while others able to avoid them.

  Most of the new `sync nops` are also redundant and could be dropped,
  which will be fixed in a separate change.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3278>
2020-04-29 07:17:27 +00:00
Kenneth Graunke
4459a70a6e intel/compiler: Delete abs/neg handling in fsign code
This should have gone away when removing source modifiers.  They won't
be set any longer, so this is simply dead code.

Fixes: b7c47c4f7c ("intel/compiler: Drop nir_lower_to_source_mods() and related handling.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4691>
2020-04-22 17:04:37 -07:00
Kenneth Graunke
b7c47c4f7c intel/compiler: Drop nir_lower_to_source_mods() and related handling.
I think we're unanimous in wanting to drop nir_lower_to_source_mods.
It's a bit of complexity to handle in the backend, but perhaps more
importantly, would be even more complexity to handle in nir_search.

And, it turns out that since we made other compiler improvements in the
last few years, they no longer appear to buy us anything of value.
Summarizing the results from shader-db from this patch:

 - Icelake (scalar mode)

   Instruction counts:

   - 411 helped, 598 hurt (out of 139,470 shaders)
   - 99.2% of shaders remain unaffected.  The average increase in
     instruction count in hurt programs is 1.78 instructions.
   - total instructions in shared programs: 17214951 -> 17215206 (<.01%)
   - instructions in affected programs: 1143879 -> 1144134 (0.02%)

   Cycles:

   - 1042 helped, 1357 hurt
   - total cycles in shared programs: 365613294 -> 365882263 (0.07%)
   - cycles in affected programs: 138155497 -> 138424466 (0.19%)

 - Haswell (both scalar and vector modes)

   Instruction counts:

   - 73 helped, 1680 hurt (out of 139,470 shaders)
   - 98.7% of shaders remain unaffected.  The average increase in
     instruction count in hurt programs is 1.9 instructions.
   - total instructions in shared programs: 14199527 -> 14202262 (0.02%)
   - instructions in affected programs: 446499 -> 449234 (0.61%)

   Cycles:

   - 5253 helped, 5559 hurt
   - total cycles in shared programs: 359996545 -> 360038731 (0.01%)
   - cycles in affected programs: 155897127 -> 155939313 (0.03%)

Given that ~99% of shader-db remains unaffected, and the affected
programs are hurt by about 1-2 instructions - which are all cheap
ALU instructions - this is unlikely to be measurable in terms of
any real performance impact that would affect users.

So, drop them and simplify the backend, and hopefully enable other
future simplifications in NIR.

Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4616>
2020-04-21 21:42:21 +00:00
Jason Ekstrand
7c43b8ce1b nir: Delete the fnoise opcodes
As of the previous commit, they are never used.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4624>
2020-04-21 06:16:13 +00:00