Commit Graph

162575 Commits

Author SHA1 Message Date
Yonggang Luo
d61ac94658 c11: Remove _MTX_INITIALIZER_NP for windows
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18493>
2022-11-09 04:38:28 +00:00
Yonggang Luo
37d79e38e9 egl: Remove the need of _MTX_INITIALIZER_NP by using simple_mtx_t/SIMPLE_MTX_INITIALIZER in egllog.c
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18493>
2022-11-09 04:38:28 +00:00
Yonggang Luo
23e6a4ccda nir: Remove the need of _MTX_INITIALIZER_NP by using simple_mtx_t/SIMPLE_MTX_INITIALIZER in nir/nir_validate.c
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18493>
2022-11-09 04:38:28 +00:00
Yonggang Luo
e518ff4fd5 glsl: Remove the need of _MTX_INITIALIZER_NP by using simple_mtx_t/SIMPLE_MTX_INITIALIZER
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18493>
2022-11-09 04:38:28 +00:00
Yonggang Luo
db708b7e9c llvmpipe: Remove the need of _MTX_INITIALIZER_NP by using simple_mtx_t/SIMPLE_MTX_INITIALIZER in lp_texture.c
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18493>
2022-11-09 04:38:28 +00:00
Yonggang Luo
fb979a19b0 vulkan/device-select-layer: Remove the need of call_once by using simple_mtx_t instead mtx_t
Function device_select_once_init are removed in-favor of SIMPLE_MTX_INITIALIZER

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18493>
2022-11-09 04:38:28 +00:00
Rob Clark
c0fc8d5046 freedreno/a6xx: Switch to global bcolor buffer
Since we expect a limited # of unique border-color entry states, we can
use a global table of border-color entries, rather than constructing the
state at draw time.  This shifts all the border-color overhead from draw
time to sampler state CSO creation time.  And it's less code!

A hashtable is used to map unique border-color table value to entry so
multiple usages of what maps to the same table entry all re-use a single
slot in the table.  This puts an upper bound on the # of unique border-
color plus format value.  In practice this shouldn't be a problem, we'll
just size the table to be large enough to not run into problems with
CTS.  Note that the border-color table entry is not completely format
dependent (mostly just integer vs float dependent), so for example a
single color with different float formats can map to a single table
entry.

This also fixes the problem that we completely ignored border-color for
GS/tess stages.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7518
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19561>
2022-11-09 02:51:17 +00:00
Rob Clark
27b2496bae freedreno/a6xx: Rename tex cache key/equals fxns
We'll need different functions for border-color cache.  Prep for next
patch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19561>
2022-11-09 02:51:17 +00:00
Rob Clark
c8cf786976 freedreno/a6xx: Move bcolor entry setup
Just code motion, in prep for a following patch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19561>
2022-11-09 02:51:17 +00:00
Rob Clark
755e3ff0ee freedreno/ci: Update a5xx expectations
These seem to have not been updated in a while.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19561>
2022-11-09 02:51:17 +00:00
Rob Clark
ed9152e2c1 freedreno: Use our border-color quirk
This will let us remove our assumption that samplers and views map 1:1,
and generally simplify our border-color handling.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19561>
2022-11-09 02:51:17 +00:00
David Heidelberg
26e742c661 ci/bare-metal: remove consolidations leftovers
All defined in the baremetal-test-arm*

Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19548>
2022-11-09 02:23:37 +00:00
Rob Clark
e090e313fa freedreno/ir3: Reduce compiler thread pool size
With the current scheme, looking at game startup which should be the
worst case (most heavily loaded) time for the compiler threads, and they
seem to be ~10% busy.  Furthermore we typically have a mix of "big" and
"LITTLE" cores.. with about half being "big".  So sizing the thread pool
to the half the # of CPU cores seems reasonable.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19387>
2022-11-08 23:36:51 +00:00
Rob Clark
a6e4f8d03f util/disk_cache: Add some blob cache traces
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19387>
2022-11-08 23:36:51 +00:00
Rob Clark
d831fd40c8 util/disk_cache: Add compression in blob cb path
Android's implementation of the blob-cache get/put funcs do not
implement any compression.  And the default cache size is rather small,
at 2MB (!!) per app (although I assume everyone patches android to
increase the size limit).

We don't bother compressing the has_key/put_key path, since that path is
only storing a uint32_t.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19387>
2022-11-08 23:36:51 +00:00
Simon Ser
2fdc3846e7 vulkan/wsi/wayland: return VK_ERROR_NATIVE_WINDOW_IN_USE_KHR
If the surface is already in use by another swapchain, return
VK_ERROR_NATIVE_WINDOW_IN_USE_KHR. The spec states:

> If pCreateInfo->oldSwapchain is VK_NULL_HANDLE, and the native
> window referred to by pCreateInfo->surface is already associated
> with a Vulkan swapchain, VK_ERROR_NATIVE_WINDOW_IN_USE_KHR must
> be returned.

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Leandro Ribeiro <leandro.ribeiro@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
References: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7467
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19229>
2022-11-08 22:52:41 +00:00
Eric Engestrom
b4921b5d7a ci: run shaderdb on vc4 as well
Signed-off-by: Eric Engestrom <eric@igalia.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19305>
2022-11-08 21:23:27 +00:00
Eric Engestrom
83b1cb936e vc4: add DRM_VC4_CREATE_SHADER_BO support to drm-shim
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19305>
2022-11-08 21:23:27 +00:00
Yusuf Khan
2c5b1d0e3b nv50/ir: Support fmulz and ffmaz
Signed-off-by: Yusuf Khan <yusisamerican@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19333>
2022-11-08 21:10:08 +00:00
Yusuf Khan
47251d2852 nv50/ir: add prefer_nir flag for getting compiler options
So that we dont expose certain options for nir_to_tgsi

Signed-off-by: Yusuf Khan <yusiamerican@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19333>
2022-11-08 21:10:08 +00:00
Connor Abbott
def56b531c tu: Support GMEM with layered rendering and multiview
It turns out that this actually is supported. GMEM can hold multiple
layers which are cleared, loaded, and resolved separately. The stride
between layers seems to be implicitly calculated based on the tile size,
and we have to match it when blitting to/from GMEM. One tricky thing is
that now we may realize that we don't have enough space for GMEM only
when computing the tiling config, because we may not know the number of
framebuffer layers until we have the framebuffer and too many
framebuffer layers will exhaust GMEM.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19505>
2022-11-08 16:35:02 +00:00
Samuel Pitoiset
a9ab53fbe2 radv: stop emulating number of generated primitives by GS on GFX11
According to RadeonSI, only GFX10 and GFX10.3 need to emulate.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19319>
2022-11-08 16:15:16 +00:00
Lionel Landwerlin
97b3dd34c1 anv: fix missing VkPhysicalDeviceExtendedDynamicState3PropertiesEXT handling
Fixes: 13c422e1b2 ("anv: toggle on EXT_extended_dynamic_state3")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19573>
2022-11-08 15:28:57 +00:00
Tapani Pälli
2a60037523 crocus: enable NV_alpha_to_coverage_dither_control
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19463>
2022-11-08 11:45:46 +00:00
Tapani Pälli
3c84809ca6 iris: enable NV_alpha_to_coverage_dither_control
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19463>
2022-11-08 11:45:46 +00:00
Samuel Pitoiset
bff6a38ed9 radv: advertise extendedDynamicState3ColorWriteMask
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19589>
2022-11-08 11:04:54 +00:00
Samuel Pitoiset
a92d1d13c5 radv: add support for dynamic color write mask
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19589>
2022-11-08 11:04:54 +00:00
Caio Oliveira
22d8ed84b8 intel/compiler: Remove unused fs_visitor::emit_percomp()
Since 7ef7738a61 ("i965: Write gl_FragCoord directly to the destination.") this
is not used.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19586>
2022-11-08 07:33:09 +00:00
Caio Oliveira
90861e6fea intel/compiler: Remove various unused function declarations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19586>
2022-11-08 07:33:08 +00:00
Caio Oliveira
48506a9029 intel/compiler: Remove unused data members
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19586>
2022-11-08 07:33:08 +00:00
Yonggang Luo
7fe5fec747 util: Remove os/os_thread.h and replace #include "os/os_thread.h" with #include "util/u_thread.h"
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19557>
2022-11-08 05:21:42 +00:00
Yonggang Luo
a72d57fe26 util: cleanup os_thread.h
__pipe_mutex_assert_locked is not used anymore so remove it from os_thread.h
The remove of "pipe/p_compiler.h" caused compiling failure also fixed

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19557>
2022-11-08 05:21:42 +00:00
Yonggang Luo
1129537e4c util: Move pipe_semaphore to u_thread.h and rename it to util_semaphore
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19557>
2022-11-08 05:21:42 +00:00
Yonggang Luo
b732064f9e gallium/util: Remove the EMBEDDED_DEVICE macro because nobody use it
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7641

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19552>
2022-11-08 02:37:20 +00:00
Ian Romanick
9abeb3d739 intel/fs: Optimize integer multiplication of large constants by factoring
Many Intel platforms can only perform 32x16 bit multiplication.  The
straightforward way to implement 32x32 bit multiplications is by
splitting one of the operands into high and low parts called H and L,
repsectively.  The full multiplication can be implemented as:

         ((A * H) << 16) + (A * L)

On Intel platforms, special register accesses can be used to eliminate
the shift operation.  This results in three instructions and a temporary
register for most values.

If H or L is 1, then one (or both) of the multiplications will later be
eliminated.  On some platforms it may be possible to eliminate the
multiplication when H is 256.

If L is zero (note that H cannot be zero), one of the multiplications
will also be eliminated.

Instead of splitting the operand into high and low parts, it may
possible to factor the operand into two 16-bit factors X and Y.  The
original multiplication can be replaced with (A * (X * Y)) = ((A * X) *
Y).  This requires two instructions without a temporary register.

I may have gone a bit overboard with optimizing the factorization
routine.  It was a fun brainteaser, and I couldn't put it down. :) On my
1.3GHz Ice Lake, a standalone test could chug through 1,000,000 randomly
selected values in about 5.7 seconds.  This is about 9x the performance
of the obvious, straightforward implementation that I started with.

v2: Drop an unnecessary return.  Rearrange logic slightly and rename
variables in factor_uint32 to better match the names used in the large
comment.  Both suggested by Caio. Rearrange logic to avoid possibly
using `a` uninitialized. Noticed by Marcin.

v3: Use DIV_ROUND_UP instead of open coding it. Noticed by Caio.

Tiger Lake, Ice Lake, Haswell, and Ivy Bridge had similar results. (Ice Lake shown)
total instructions in shared programs: 19912558 -> 19912526 (<.01%)
instructions in affected programs: 3432 -> 3400 (-0.93%)
helped: 10 / HURT: 0

total cycles in shared programs: 856413218 -> 856412810 (<.01%)
cycles in affected programs: 122032 -> 121624 (-0.33%)
helped: 9 / HURT: 0

No shader-db changes on any other Intel platforms.

Tiger Lake and Ice Lake had similar results. (Ice Lake shown)
Instructions in all programs: 141997227 -> 141996923 (-0.0%)
Instructions helped: 71

Cycles in all programs: 9162524757 -> 9162523886 (-0.0%)
Cycles helped: 63
Cycles hurt: 5

No fossil-db changes on any other Intel platforms.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>
2022-11-08 00:02:16 +00:00
Ian Romanick
5ec75ca10d intel/compiler: Teach signed integer range analysis about imax and imin
This is especially helpful for a*isign(a) generated by idiv_by_const
optimization.  On many GPUs, isign(a) is lowered to imax(imin(a, 1),
-1).

There are no changes on fossil-db because ANV uses a different
optimization path for idiv with a constant denominator.  A future MR
will change this.

NOTE: This commit used to help a few hundred shader-db shaders, but
now none are affected.  I suspect this is due to some change in the
idiv_by_const optimization.  This could possibly be dropped.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>
2022-11-08 00:02:16 +00:00
Ian Romanick
1b0da3a765 intel/compiler: Signed integer range analysis for imul_32x16 generation
Only iabs and ineg are treated specially.  Everything else just uses
nir_unsigned_upper_bound.  The special treatment of source modifiers is
because they cause problems for nir_unsigned_upper_bound.  Once those
are peeled off, nir_unsigned_upper_bound can generally produce a
tighter bound.

Future commits will add more opcodes.  This mostly introduces the
basic framework.

v2: Add a bunch of comments to signed_integer_range_analysis. Re-arrange
the code a little to reduce duplication.  Both suggested by
Caio. Rearrange some logic to simplify things. Suggested by Marcin.

Tiger Lake, Ice Lake, Haswell, and Ivy Bridge had similar results. (Ice Lake shown)
total instructions in shared programs: 19912894 -> 19912558 (<.01%)
instructions in affected programs: 109275 -> 108939 (-0.31%)
helped: 74 / HURT: 0

total cycles in shared programs: 856422769 -> 856413218 (<.01%)
cycles in affected programs: 15268102 -> 15258551 (-0.06%)
helped: 65 / HURT: 4

total fills in shared programs: 8218 -> 8217 (-0.01%)
fills in affected programs: 1171 -> 1170 (-0.09%)
helped: 1 / HURT: 0

Skylake and Broadwell had similar results. (Skylake shown)
total cycles in shared programs: 845145547 -> 845142263 (<.01%)
cycles in affected programs: 15261465 -> 15258181 (-0.02%)
helped: 65 / HURT: 0

Tiger Lake
Tiger Lake
Instructions in all programs: 157580768 -> 157579730 (-0.0%)
Instructions helped: 312
Instructions hurt: 28

Cycles in all programs: 7566977172 -> 7566967746 (-0.0%)
Cycles helped: 288
Cycles hurt: 53

Spills in all programs: 19701 -> 19700 (-0.0%)
Spills helped: 2
Spills hurt: 4

Fills in all programs: 33311 -> 33335 (+0.1%)
Fills helped: 5
Fills hurt: 4

Ice Lake
Instructions in all programs: 141998667 -> 141997227 (-0.0%)
Instructions helped: 420
Instructions hurt: 3

Cycles in all programs: 9162565297 -> 9162524757 (-0.0%)
Cycles helped: 389
Cycles hurt: 29

Spills in all programs: 19918 -> 19916 (-0.0%)
Spills helped: 2
Spills hurt: 3

Fills in all programs: 32795 -> 32814 (+0.1%)
Fills helped: 6
Fills hurt: 3

Skylake
Instructions in all programs: 132567691 -> 132567745 (+0.0%)
Instructions hurt: 24

Cycles in all programs: 8828897462 -> 8828889517 (-0.0%)
Cycles helped: 405
Cycles hurt: 6

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>
2022-11-08 00:02:16 +00:00
Ian Romanick
f90d71055b intel/compiler: Add and use a pass to generate imul_32x16 instructions
Gfx8 and Gfx9 platforms are helped for cycles because now many
instructions like

    mul(8)          g12<1>D         g10<8,8,1>D     6D

become

    mul(8)          g12<1>D         g10<8,8,1>D     6W

It is the same number of instructions, but the 32x16 multiply is a
little faster.

v2: Fix transposed hi and lo in "(hi >= INT16_MIN && lo <= INT16_MAX)".
Noticed by Caio.  Use nir_src_is_const instead of open coding it.
Suggested by Caio.

Broadwell and Skylake had similar results. (Skylake shown)
total cycles in shared programs: 845748380 -> 845145547 (-0.07%)
cycles in affected programs: 446346348 -> 445743515 (-0.14%)
helped: 6017
HURT: 0
helped stats (abs) min: 2 max: 7380 x̄: 100.19 x̃: 8
helped stats (rel) min: <.01% max: 3.72% x̄: 0.41% x̃: 0.39%
95% mean confidence interval for cycles value: -113.37 -87.00
95% mean confidence interval for cycles %-change: -0.42% -0.41%
Cycles are helped.

Skylake
Cycles in all programs: 8844820715 -> 8828897462 (-0.2%)
Cycles helped: 47914
Cycles hurt: 1

No shader-db or fossil-db changes on any other Intel platform.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>
2022-11-08 00:02:16 +00:00
Ian Romanick
9479e3a19b intel/fs: Allow constant copy prop from DW to W
This enables copy propagation of

    mov(8)          g5<1>UD         0x00000180UD
    mul(8)          g10<1>D         g2.3<0,1,0>D    g5<16,8,2>W

into

    mul(8)          g10<1>D         g2.3<0,1,0>D    180W

This is necessary for any optimization passes that generate imul_32x16
instructions.

No fossil-db or shader-db changes on any Intel platform.

v2: Fix type size check to (src size != 2) || (dest size != 4).  It was
previously &&. :( This allowed copying constants into UB sources, and
that is invalid.

v3: Fix incorrect extraction of upper 16-bits of immediate value when
subnr=2. Noticed by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>
2022-11-08 00:02:16 +00:00
Ian Romanick
90d267b2d1 intel/fs: Fix bounds checking for integer multiplication lowering
The previous bounds checking would cause

    mul(8)          g121<1>D        g120<8,8,1>D    0xec4dD

to be lowered to

    mul(8)          g121<1>D        g120<8,8,1>D    0xec4dUW
    mul(8)          g41<1>D         g120<8,8,1>D    0x0000UW
    add(8)          g121.1<2>UW     g121.1<16,8,2>UW g41<16,8,2>UW

Instead of picking the bounds (and the new type) based on the old type,
pick the new type based on the value only.

This helps a few fossil-db shaders in Witcher 3 and Geekbench5.  No
changes on any other Intel platforms.

Tiger Lake
Instructions in all programs: 157581069 -> 157580768 (-0.0%)
Instructions helped: 24

Cycles in all programs: 7566979620 -> 7566977172 (-0.0%)
Cycles helped: 22
Cycles hurt: 4

Ice Lake
Instructions in all programs: 141998965 -> 141998667 (-0.0%)
Instructions helped: 26

Cycles in all programs: 9162568666 -> 9162565297 (-0.0%)
Cycles helped: 24
Cycles hurt: 2

Skylake
No changes.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>
2022-11-08 00:02:16 +00:00
Ian Romanick
db20412168 intel/fs: Fix constant propagation into 32x16 integer multiplication
Don't copy propagate the constant in situations like

    mov(8)          g8<1>D          0x7fffffffD
    mul(8)          g16<1>D         g8<8,8,1>D      g15<16,8,2>W

On platforms that only have a 32x16 multiplier, this will result in
lowering the multiply to

    mul(8)          g15<1>D         g14<8,8,1>D     0xffffUW
    mul(8)          g16<1>D         g14<8,8,1>D     0x7fffUW
    add(8)          g15.1<2>UW      g15.1<16,8,2>UW g16<16,8,2>UW

On Gfx8 and Gfx9, which have the full 32x32 multiplier, it results in

    mul(8)          g16<1>D         g15<16,8,2>W    0x7fffffffD

Volume 2a of the Skylake PRM says:

    When multiplying a DW and any lower precision integer, the
    DW operand must on src0.

See also https://gitlab.freedesktop.org/mesa/crucible/-/merge_requests/104.

Previous to INTEL_shader_integer_functions2 (in Vulkan or OpenGL), I
don't think it would be possible to create a situation where this could
occur.  I discovered this via some optimizations that can determine that
the non-constant source must be able to fit in 16-bits.  The case listed
above came from piglit's "ext_transform_feedback-order arrays points"
with those optimizations in place.

No shader-db or fossil-db changes on any Intel platform.

Fixes: de6c0f8487 ("intel/fs: Implement support for NIR opcodes for INTEL_shader_integer_functions2")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>
2022-11-08 00:02:16 +00:00
Charmaine Lee
3194fe9362 wgl: fix reference to wgl(Create|Delete)Context function pointers
Currently in wglCreateContextAttribsARB(), we get and save the
pointers to OPENGL32.DLL's wglCreate/DeleteContext() functions.
But these function pointers might be invalid after opengl32.dll is
unloaded and reloaded again and possibly in a different address space.
This patch, provided by Jose Fonseca, uses GetModuleHandle and gets
the proc address of wglCreate/DeleteContext functions every time the
function is called.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19478>
2022-11-07 23:48:30 +00:00
Gert Wollny
4f599dc3a5 r600: Fix some border color swizzles on Evergreen
Note: (u)int32 is broken on this hardware.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19532>
2022-11-07 22:37:06 +00:00
Gert Wollny
923d635357 r600: fix some border color swizzles on CAYMAN
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19532>
2022-11-07 22:37:06 +00:00
Dylan Baker
196499d75e docs: update calendar and link releases notes for 22.2.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19585>
2022-11-07 10:28:11 -08:00
Dylan Baker
616635909e docs: Add sha256 sum for 22.2.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19585>
2022-11-07 10:28:07 -08:00
Dylan Baker
2fe1aab18f docs: add release notes for 22.2.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19585>
2022-11-07 10:28:06 -08:00
Mauro Rossi
fd8ec189e5 Android.mk: Fix gnu++14 related build failures
This patch filters-out '-std=gnu++14' from the cflags obtained
from AOSP/KATI dummy target output to avoid the following building errors:

FAILED: src/gallium/drivers/r600/45f68e3@@r600@sta/sfn_sfn_assembler.cpp.o
...
clang++ ... -std=c++17 ... -std=gnu++14
...
In file included from ../src/gallium/drivers/r600/sfn/sfn_assembler.cpp:27:
In file included from ../src/gallium/drivers/r600/sfn/sfn_assembler.h:32:
In file included from ../src/gallium/drivers/r600/sfn/sfn_shader.h:31:
../src/gallium/drivers/r600/sfn/sfn_instr.h:369:56: error: no template named 'is_base_of_v' in namespace 'std'; did you mean 'is_base_of'?
template <typename T, typename = std::enable_if_t<std::is_base_of_v<Instr, T>>>
                                                  ~~~~~^~~~~~~~~~~~
                                                       is_base_of
/home/utente/pie-x86_kernel/external/libcxx/include/type_traits:1412:29: note: 'is_base_of' declared here
struct _LIBCPP_TEMPLATE_VIS is_base_of
                            ^
In file included from ../src/gallium/drivers/r600/sfn/sfn_assembler.cpp:27:
In file included from ../src/gallium/drivers/r600/sfn/sfn_assembler.h:32:
In file included from ../src/gallium/drivers/r600/sfn/sfn_shader.h:31:
../src/gallium/drivers/r600/sfn/sfn_instr.h:369:51: error: template argument for non-type template parameter must be an expression
template <typename T, typename = std::enable_if_t<std::is_base_of_v<Instr, T>>>
                                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/utente/pie-x86_kernel/external/libcxx/include/type_traits:439:16: note: template parameter is declared here
template <bool _Bp, class _Tp = void> using enable_if_t = typename enable_if<_Bp, _Tp>::type;
               ^
2 errors generated.

Cc: "22.2" "22.3" mesa-stable
Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19563>
2022-11-07 17:46:42 +00:00
José Roberto de Souza
41ee836c9a intel: Add and use intel_gem_can_render_on_fd()
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19425>
2022-11-07 17:22:14 +00:00
José Roberto de Souza
29550bc50a intel: Add has_context_isolation to intel_device_info
Iris, hasvk and anv were fetching the same information, better do it
on one place.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19425>
2022-11-07 17:22:14 +00:00