Commit Graph

1106 Commits

Author SHA1 Message Date
Francisco Jerez
a81f9b5e3e intel/eu/validate/gen12: Fix validation of SYNC instruction.
src0 will typically be null for this instruction.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
45768e6b3c intel/eu/validate/gen12: Implement integer multiply restrictions in EU validator.
Due to hardware bug filed as HSDES#1604601757.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Jordan Justen
f9ec4ac5a1 intel/ir: Lower fpow on Gen12.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
cb6db5bfb3 intel/fs/gen12: Don't support source mods for 32x16 integer multiply.
Due to hardware bug filed as HSDES#1604601757.

v2: Only return if result of fs_inst::can_do_source_mods() is known to
    be false for the case new orthogonal restrictions are implemented
    below in the future. (Caio)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
de5d106ccf intel/disasm: Disassemble register file of split SEND sources.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
c03869323b intel/disasm: Don't disassemble saturate control on SEND instructions.
The field is gone on Gen12+ and it was illegal on previous
generations.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
f15e0b3439 intel/disasm/gen12: Disassemble Gen12 SEND instructions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
fd7e21dd90 intel/disasm/gen12: Disassemble Gen12 SYNC instruction.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
606d823b42 intel/disasm/gen12: Disassemble three-source instruction source and destination regions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
8263d300c2 intel/disasm/gen12: Fix disassembly of some common instruction controls.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
83612c0127 intel/disasm/gen12: Disassemble software scoreboard information.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
396f6b27a7 intel/fs/gen12: Demodernize software scoreboard lowering pass.
Kept as a separate commit in order to avoid distracting reviewers of
the software scoreboard pass with memory management boilerplate.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
265c7c8971 intel/fs/gen12: Introduce software scoreboard lowering pass.
Gen12+ hardware lacks the register scoreboard logic that used to
guarantee data coherency between register reads and writes in previous
generations.  This lowering pass runs after register allocation in
order to make up for it.

It works by performing global dataflow analysis in order to determine
the set of potential dependencies of every instruction in the shader,
and then inserts any required SWSB annotations and additional SYNC
instructions in order to guarantee data coherency.

v2: Drop unnecessary _safe list iteration (Caio).

v3: Temporarily workaround potential WaR hazard between FPU
    instruction and subsequent out-of-order write, pending
    clarification from the hardware team.  Drop redundant tracking of
    implicit access of acc0-1, since the hardware guarantees coherency
    of these (but not the other accumulators...).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
e0b8d7953e intel/fs/gen12: Add scheduling information to the IR.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
15e3a0d9d2 intel/eu/gen12: Set SWSB annotations in hand-crafted assembly.
Reviewers are encouraged to audit the code generation pass
independently for the case I missed some potential data hazard or new
code has been added in the meantime.

v2: Add SYNC instruction to cr0 workaround in brw_float_controls_mode().

v3: Drop likely redundant (and potentially harmful) RegDist SWSB
    annotation from ce0 read in brw_find_live_channel() (Caio).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
d3f3bdcd18 intel/eu/gen12: Add tracking of default SWSB state to the current brw_codegen instruction.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
6154cdf924 intel/eu/gen12: Add auxiliary type to represent SWSB information during codegen.
v2: Introduce extra tgl_swsb_sbid() constructor (Caio).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
c22db5e188 intel/fs/gen12: Add codegen support for the SYNC instruction.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
0e57dbc55c intel/ir/gen12: Add SYNC hardware instruction.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
7499e10383 intel/eu/gen12: Don't set thread control, it's gone.
An effect similar to the one formerly provided by setting thread
control to "switch" can be achieved now by setting a RegDist of 1 on
the SWSB field.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
a66ea33991 intel/eu/gen12: Don't set DD control, it's gone.
A future lowering pass will simulate the same behavior originally
provided by NoDDChk/NoDDClr at the IR level by using appropriate SWSB
annotations.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
8a5fad0d92 intel/eu/gen12: Use SEND instruction for split sends.
The new SEND instruction behaves like the former SENDS instruction.
The original single-payload SEND instruction is gone.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
6634ede7aa intel/eu/gen12: Codegen SEND descriptor regions correctly.
The SEND instruction is now four-source.  The descriptor is no longer
part of source 1, so avoid touching it to avoid corruption while
initializing the descriptor.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
2c4c9aba30 intel/eu/gen12: Codegen pathological SEND source and destination regions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
bafc9515db intel/eu/gen12: Codegen control flow instructions correctly.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
6e1daba3b4 intel/eu/gen12: Codegen three-source instruction source and destination regions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
9fdb67aa09 intel/eu/gen12: Fix codegen of immediate source regions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
6cb764ae9c intel/eu/gen12: Add Gen12 opcode descriptions to the table.
Quite a lot of churn because the encoding of most hardware opcodes has
changed unfortunately.

v2: Split dot-product description fixes to separate patch (Caio).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
31182e7aa9 intel/eu/gen11+: Mark dot product opcodes as unsupported on opcode_descs table.
These instructions have been removed from the hardware.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
c742be1437 intel/eu/gen12: Implement datatype binary encoding.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Sagar Ghuge
a12533f2ce intel/eu/gen12: Implement immediate 64 bit constant encoding.
On Gen12, 64 bit immediate constants are loaded in reverse order. Lower
32 bit gets loaded from bit 96-127 and higher 32 bits from 64-95 in
instruction encoding.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Co-authored-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
5291283af0 intel/eu/gen12: Implement compact instruction binary encoding.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
77d09d0d50 intel/eu/gen12: Implement indirect region binary encoding.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
81400470be intel/eu/gen12: Implement SEND instruction binary encoding.
v2: Fix off-by-one upper GET_BITS() bound, combine 25-29 and 30-31
    descriptor fields (Ken).  Shorten name of GEN12_MD() macro, drop
    some removed TS message descriptor fields (Jordan).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
d24b8af23d intel/eu/gen12: Implement control flow instruction binary encoding.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
956c156dc4 intel/eu/gen12: Implement three-source instruction binary encoding.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
fa48281795 intel/eu/gen12: Implement basic instruction binary encoding.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
143176163d intel/eu/gen12: Add sanity-check asserts to brw_inst_bits() and brw_inst_set_bits().
These caught a few bugs during the development of this series.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
7e5a8638d3 intel/eu/gen12: Extend brw_inst.h macros for Gen12 support.
The encoding of almost every instruction field has changed in Gen12,
so this involves adding a Gen12+ bitfield spec to every brw_inst
macro.  In addition some new macros are required to handle certain
discontiguous and variable-length fields.

This commit doesn't actually include the Gen12 updated bitfield specs,
only the macros are extended here for reviewability.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

v2: Rename FDC() to FFDC() and FDC1() to FDC() for consistency with
    the existing F() and FF() macros.
2019-10-11 12:24:16 -07:00
Francisco Jerez
6965a02e09 intel/ir: Represent physical edge of unconditional CONTINUE instruction.
This edge doesn't exist in the original scalar program, but it
represents a potential control flow path the EU will take in cases
where control flow isn't uniform across channels of the same SIMD
thread.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
eeaad2992c intel/ir: Represent physical edge of ELSE instruction.
This edge doesn't exist in the original scalar program, but it
represents a potential control flow path the EU will take in cases
where the condition isn't uniform across channels of the same SIMD
thread.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
152754665a intel/ir: Represent logical edge of BREAK instruction.
Currently only the physical back-edge is represented, which
incidentally also leads to the exit block of the loop, but we need the
direct logical edge in addition for our logical CFG representation to
be complete.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
c344c92b31 intel/ir: Add helper function to push block onto CFG analysis stack.
Requested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
d6a9731d8f intel/ir: Represent physical and logical subsets of the CFG.
This represents two control flow graphs in the same cfg_t data
structure: The physical CFG that will include all possible control
flow paths the EU can physically take, and the logical CFG restricted
to the control flow paths that exist in the original scalar program.
The latter is a subset of the former because in case of divergence the
SIMD vectorized program will take control flow paths that aren't part
of the original scalar program.

The bblock_link constructor and bblock_t::add_successor() now take a
"kind" parameter that specifies whether the edge is purely physical or
whether it's part of both the logical and physical CFGs (a logical
edge is of course always guaranteed to be in the physical CFG as
well).  bblock_t::is_predecessor_of() and ::is_successor_of() also
take a kind parameter specifying which CFG is being queried.  The '~>'
notation will be used now in order to represent purely physical edges
in IR dumps.

This commit doesn't actually add nor remove any edges from the CFG
(the only edges marked as purely physical here are the two WHILE loop
ones that already existed).  Optimization passes should continue using
the same (incomplete) physical CFG they were using before until
they're fixed to do something smarter in a later commit, so this
shouldn't lead to any functional changes.

v2: Remove tabs from lines changed in this file (Caio).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
1b570456ca intel/ir: Drop hard-coded correspondence between IR and HW opcodes.
Having the IR opcodes locked to their hardware representation is risky
because it causes opcodes as different as BRC and IFF to compare equal
at the IR level (luckily the back-end only ever uses one opcode from
each group, right now), and it prevents us from supporting
instructions that change their hardware representation across
generations, which will become a problem on Gen12+ platforms.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
057902dcf8 intel/eu: Encode and decode native instruction opcodes from/to IR opcodes.
Change brw_inst_set_opcode() and brw_inst_opcode() to call
brw_opcode_encode/decode() transparently in order to translate between
hardware and IR opcodes, and update the EU compaction code in order to
do the same as needed, so we can eventually drop the one-to-one
correspondence between hardware and IR opcodes.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
25dd67099d intel/eu: Rework opcode description tables to allow efficient look-up by either HW or IR opcode.
This rewrites the current opcode description tables as a more compact
flat data structure.  The purpose is to allow efficient constant-time
look-up by either HW or IR opcode, which will allow us to drop the
hard-coded correspondence between HW and IR opcodes -- See the next
commits for the rationale.

brw_eu.c is now built as C++ source so we can take advantage of
pointers to member in order to make the look-up function work
regardless of the opcode_desc member used as look-up key.

v2: Optimize devinfo struct comparison (Caio)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
51dc40cefb intel/eu: Fix up various type conversions in brw_eu.c that are illegal C++.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-11 12:24:16 -07:00
Francisco Jerez
35bcd08d61 intel/eu: Split brw_inst ex_desc accessors for SEND(C) vs. SENDS(C).
The brw_inst opcode accessors are going away in one of the following
commits.  We could potentially replace them with the new helpers that
do opcode remapping, but that would lead to a circular dependency
between brw_inst.h and brw_eu.h.  This way we also avoid ordering
issues that can cause the semantics of the ex_desc accessors to change
depending on whether the ex_desc field is set after or before the
opcode instruction field.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00
Francisco Jerez
b2ae65c7d9 intel/fs: Fix constness of implied_mrf_writes() argument.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-11 12:24:16 -07:00