Commit Graph

67525 Commits

Author SHA1 Message Date
Jason Ekstrand
458a6ce500 nir/glsl: Add support for saturate
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
4582341ea7 i965/fs_nir: Add support for sample_pos and sample_id 2015-01-15 07:18:59 -08:00
Jason Ekstrand
7cd1537aae Fix up varying pull constants
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
4bb81f6d02 Fix what I think are a few NIR typos
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
b092bc9805 i965/fs_nir: Use the correct texture offset immediate
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
c181ff268e i965/fs_nir: Use the correct types for texture inputs
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
c2ded36bb6 i965/fs_nir: Make the sampler register always unsigned
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Jason Ekstrand
ae2880d131 i965/fs: Only use nir for 8-wide non-fast-clear shaders.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:18:59 -08:00
Connor Abbott
2faf7f87d6 i965/fs: add a NIR frontend
This is similar to the GLSL IR frontend, except consuming NIR. This lets
us test NIR as part of an actual compiler.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   Make brw_fs_nir build again
   Only use NIR of INTEL_USE_NIR is set
   whitespace fixes
2015-01-15 07:18:59 -08:00
Connor Abbott
9afc566e2d i965/fs: Don't pass through the coordinate type
All we really need is the number of components.
2015-01-15 07:18:58 -08:00
Connor Abbott
616a48ebc6 i965/fs: make emit_fragcoord_interpolation() not take an ir_variable 2015-01-15 07:18:58 -08:00
Connor Abbott
7602385ac5 nir: add an SSA-based dead code elimination pass
v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   whitespace fixes
2015-01-15 07:18:58 -08:00
Connor Abbott
8b7cb7674c nir: add an SSA-based copy propagation pass 2015-01-15 07:18:58 -08:00
Connor Abbott
4553887d4a nir: add a pass to convert to SSA
v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   whitespace fixes
2015-01-15 07:18:58 -08:00
Connor Abbott
b559ee709b nir: calculate dominance information 2015-01-15 07:18:58 -08:00
Connor Abbott
cff1deff72 nir: add an optimization to turn global registers into local registers
After linking and inlining, this allows us to convert these registers
into SSA values and optimise more code.
2015-01-15 07:18:58 -08:00
Connor Abbott
613bf6818a nir: add a pass to lower atomics
v2: Jason Ekstrand <jason.ekstrand@intel.com>
   whitespace fixes
2015-01-15 07:18:58 -08:00
Connor Abbott
8692c6a023 nir: add a pass to lower system value reads
v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   whitespace fixes
2015-01-15 07:18:58 -08:00
Connor Abbott
8cdcfce5ce nir: add a pass to lower sampler instructions 2015-01-15 07:18:58 -08:00
Connor Abbott
370e875b32 nir: add a pass to remove unused variables
After we lower variables, we want to delete them in order to free up
some memory.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
    whitespace fixes
2015-01-15 07:18:58 -08:00
Connor Abbott
494790b2a9 nir: keep track of the number of input, output, and uniform slots 2015-01-15 07:18:58 -08:00
Connor Abbott
c2f36cf125 nir: add a pass to lower variables for scalar backends 2015-01-15 07:18:58 -08:00
Connor Abbott
7f0daaa5e7 nir: add a glsl-to-nir pass
v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   Make glsl_to_nir build again
   fix whitespace
2015-01-15 07:18:58 -08:00
Connor Abbott
dbb76421da nir: add a validation pass
This is similar to ir_validate.cpp.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   whitespace fixes
2015-01-15 07:18:58 -08:00
Connor Abbott
98fa28bff7 nir: add a printer
This is similar to ir_print_visitor.cpp.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   whitespace fixes
2015-01-15 07:18:58 -08:00
Jason Ekstrand
9b1139649d SQUASH: Fix comments from eric
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 07:18:58 -08:00
Jason Ekstrand
8b4c860580 SQUASH: Add an assert 2015-01-15 07:18:58 -08:00
Connor Abbott
2812e5de93 nir: add core helper functions
These include functions for adding and removing various bits of IR and
helpers for iterating over all the sources and destinations of an
instruction. This is similar to ir.cpp.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   whitespace and automake fixes
2015-01-15 07:18:58 -08:00
Jason Ekstrand
f521a3c543 SQUASH: Use the enum for the variable mode 2015-01-15 07:18:57 -08:00
Connor Abbott
30c4678f64 nir: add the core datastructures
This includes all the instructions, ifs, loops, functions, etc. This is
similar to the information in ir.h.

v2: Jason Ekstrand <jason.ekstrand@intel.com>:
   Include ralloc and hash_table from the util directory
   whitespace fixes

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-By glenn.kennard <glenn.kennard@gmail.com>
2015-01-15 07:18:57 -08:00
Connor Abbott
b5ca34a211 nir: add a simple C wrapper around glsl_types.h
v2: Jason Ekstrand <jason.ekstrand@intel.com>:
    whitespace and automake fixes

Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 07:18:57 -08:00
Connor Abbott
77e7a00267 nir: add initial README
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 07:18:57 -08:00
Connor Abbott
ab2ae63854 exec_list: add a list_foreach_typed_reverse() macro
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 07:18:57 -08:00
Eric Anholt
84ef2d4156 vc4: Add some dumping for STORE_TILE_BUFFER_GENERAL. 2015-01-15 22:21:29 +13:00
Eric Anholt
1b241c59e8 vc4: Add dumping for the TILE_RENDERING_MODE_CONFIG packet.
I wanted to read it, so I wrote parsing.
2015-01-15 22:19:25 +13:00
Eric Anholt
d0d6d24723 vc4: Fix CL dumping trying to dump too far.
Execution will end at the cl->next, because that's what ct0ea/ct1ea get
programmed to.
2015-01-15 22:19:25 +13:00
Eric Anholt
0471f72755 vc4: Fix texture type masking.
Everything from ETC1 to RGBA64 was getting its top bit dropped, but we
didn't use any of those formats.
2015-01-15 22:19:25 +13:00
Eric Anholt
6313a2c8f0 vc4: Colormask should apply after all other fragment ops (like logic op).
Theoretically it should apply after dithering as well, but ditehring for
565 happens in fixed function in the TLB store.
2015-01-15 22:19:25 +13:00
Eric Anholt
0289a26201 vc4: No turning unpack arguments into small immediates.
Since unpack only happens on things read from the A register file, we have
to leave them as something that can be allocated to A (temp or uniform).
2015-01-15 22:19:25 +13:00
Eric Anholt
772c47aefe vc4: Move the tests for src needing to be an A register to vc4_qir.c.
I want it from another location.
2015-01-15 22:19:25 +13:00
Eric Anholt
8f2fb68026 vc4: Don't swap the raddr on instructions doing unpacks.
It would mean different unpacking behavior, since only the A file does
unpack (with PM==0).
2015-01-15 22:19:25 +13:00
Eric Anholt
5d5707707f vc4: Don't let pairing happen with badly mismatched unpack flags.
No difference on shader-db, but prevents definite regressions in the
blending changes.
2015-01-15 22:19:25 +13:00
Eric Anholt
3820866e40 vc4: Don't let pairing happen with badly mismatched pack flags.
No difference on shader-db, but will become more important as I introduce
more use of pack flags with the blending changes.
2015-01-15 22:19:25 +13:00
Eric Anholt
d1f2fc834d vc4: Fix early Z behavior on hardware.
It turns out the simulator was not treating this bit the same as the RPi,
and I'd forgotten to remove it when turning on early Z.  The result was
that you'd get big chunks of your rendering missing.
2015-01-15 22:19:25 +13:00
Michel Dänzer
82b7ee62fc Revert "radeonsi: only set BC_OPTIMIZE_DISABLE when necessary"
This reverts commit 0543630d0b.

It caused flickering artifacts in Steam games such as Team Fortress 2 or
Left 4 Dead 2.

We could probably only enable this optimization by also making sure the
shader code only uses either SI_PARAM_LINEAR_CENTROID or
SI_PARAM_LINEAR_CENTER, not both. This would probably require a shader
variant.

Sorry I didn't remember this when reviewing the reverted change.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-01-15 15:09:48 +09:00
Michel Dänzer
a6a75f1286 st/clover: Adapt to TargetLibraryInfo.h move in LLVM SVN r226078
Trivial.
2015-01-15 12:57:05 +09:00
Ian Romanick
0a0d2c9443 mesa: Micro-optimize _mesa_is_valid_prim_mode
You would not believe the mess GCC 4.8.3 generated for the old
switch-statement.

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: Difference at 95.0% confidence -0.37374% +/- 0.184057% (n=40)
64-bit: Difference at 95.0% confidence 0.966722% +/- 0.338442% (n=40)

The regression on 32-bit is odd.  Callgrind says the caller,
_mesa_is_valid_prim_mode is faster.  Before it says 2,293,760
cycles, and after it says 917,504.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-14 17:09:50 -08:00
Ian Romanick
ead200d156 mesa: Check for vertex program the same way in desktop GL and ES
On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Multithread:

32-bit: Difference at 95.0% confidence 0.416027% +/- 0.163529% (n=40)
64-bit: Difference at 95.0% confidence 0.494771% +/- 0.259985% (n=40)

Gl32Batch7 had no difference proven at 95.0% confidence (n=120) on
32-bit or 64-bit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-14 17:09:50 -08:00
Ian Romanick
d5f936367f mesa: Drop index buffer bounds check
The previous check was insufficient (as it did not take 'indices' into
consideration), and DX10 hardware does not need this check anyway.

Since index_bytes is no longer used, remove it.

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: Difference at 95.0% confidence 1.66929% +/- 0.230107% (n=40)
64-bit: Difference at 95.0% confidence -1.40848% +/- 0.288038% (n=40)

The regression on 64-bit is odd.  Callgrind says the caller,
validate_DrawElements_common is faster.  Before it says 10,321,920
cycles, and after it says 8,945,664.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-14 17:09:50 -08:00
Ian Romanick
a4aeb534ea mesa: Only check for a current vertex shader in core profile
This doesn't affect performance, but it feels more correct.

On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic
for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects
Gl32Batch7:

32-bit: No difference proven at 95.0% confidence (n=120)
64-bit: No difference proven at 95.0% confidence (n=120)

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-14 17:09:50 -08:00