docs/freedreno: Update for the fanin/fanout -> collect/split rename.
See 611258d578
("freedreno/ir3: rename fanin/fanout to collect/split")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11147>
@@ -163,14 +163,14 @@ Meta Instructions
|
|||||||
**phi**
|
**phi**
|
||||||
TODO
|
TODO
|
||||||
|
|
||||||
**fanin**
|
**collect**
|
||||||
Groups registers which need to be assigned to consecutive scalar
|
Groups registers which need to be assigned to consecutive scalar
|
||||||
registers, for example `sam` (texture fetch) src instructions (see
|
registers, for example `sam` (texture fetch) src instructions (see
|
||||||
`register groups`_) or array element dereference
|
`register groups`_) or array element dereference
|
||||||
(see `relative addressing`_).
|
(see `relative addressing`_).
|
||||||
|
|
||||||
**fanout**
|
**split**
|
||||||
The counterpart to **fanin**, when an instruction such as `sam`
|
The counterpart to **collect**, when an instruction such as `sam`
|
||||||
writes multiple components, splits the result into individual
|
writes multiple components, splits the result into individual
|
||||||
scalar components to be consumed by other instructions.
|
scalar components to be consumed by other instructions.
|
||||||
|
|
||||||
@@ -202,22 +202,22 @@ Before register assignment, to group the two components of the texture src toget
|
|||||||
|
|
||||||
digraph G {
|
digraph G {
|
||||||
{ rank=same;
|
{ rank=same;
|
||||||
fanin;
|
collect;
|
||||||
};
|
};
|
||||||
{ rank=same;
|
{ rank=same;
|
||||||
coord_x;
|
coord_x;
|
||||||
coord_y;
|
coord_y;
|
||||||
};
|
};
|
||||||
sam -> fanin [label="regs[1]"];
|
sam -> collect [label="regs[1]"];
|
||||||
fanin -> coord_x [label="regs[1]"];
|
collect -> coord_x [label="regs[1]"];
|
||||||
fanin -> coord_y [label="regs[2]"];
|
collect -> coord_y [label="regs[2]"];
|
||||||
coord_x -> coord_y [label="right",style=dotted];
|
coord_x -> coord_y [label="right",style=dotted];
|
||||||
coord_y -> coord_x [label="left",style=dotted];
|
coord_y -> coord_x [label="left",style=dotted];
|
||||||
coord_x [label="coord.x"];
|
coord_x [label="coord.x"];
|
||||||
coord_y [label="coord.y"];
|
coord_y [label="coord.y"];
|
||||||
}
|
}
|
||||||
|
|
||||||
The frontend sets up the SSA ptrs from ``sam`` source register to the ``fanin`` meta instruction, which in turn points to the instructions producing the ``coord.x`` and ``coord.y`` values. And the grouping_ pass sets up the ``left`` and ``right`` neighbor pointers to the ``fanin``\'s sources, used later by the `register assignment`_ pass to assign blocks of scalar registers.
|
The frontend sets up the SSA ptrs from ``sam`` source register to the ``collect`` meta instruction, which in turn points to the instructions producing the ``coord.x`` and ``coord.y`` values. And the grouping_ pass sets up the ``left`` and ``right`` neighbor pointers to the ``collect``\'s sources, used later by the `register assignment`_ pass to assign blocks of scalar registers.
|
||||||
|
|
||||||
And likewise, for the consecutive scalar registers for the destination:
|
And likewise, for the consecutive scalar registers for the destination:
|
||||||
|
|
||||||
@@ -230,23 +230,23 @@ And likewise, for the consecutive scalar registers for the destination:
|
|||||||
C;
|
C;
|
||||||
};
|
};
|
||||||
{ rank=same;
|
{ rank=same;
|
||||||
fanout_0;
|
split_0;
|
||||||
fanout_1;
|
split_1;
|
||||||
fanout_2;
|
split_2;
|
||||||
};
|
};
|
||||||
A -> fanout_0;
|
A -> split_0;
|
||||||
B -> fanout_1;
|
B -> split_1;
|
||||||
C -> fanout_2;
|
C -> split_2;
|
||||||
fanout_0 [label="fanout\noff=0"];
|
split_0 [label="split\noff=0"];
|
||||||
fanout_0 -> sam;
|
split_0 -> sam;
|
||||||
fanout_1 [label="fanout\noff=1"];
|
split_1 [label="split\noff=1"];
|
||||||
fanout_1 -> sam;
|
split_1 -> sam;
|
||||||
fanout_2 [label="fanout\noff=2"];
|
split_2 [label="split\noff=2"];
|
||||||
fanout_2 -> sam;
|
split_2 -> sam;
|
||||||
fanout_0 -> fanout_1 [label="right",style=dotted];
|
split_0 -> split_1 [label="right",style=dotted];
|
||||||
fanout_1 -> fanout_0 [label="left",style=dotted];
|
split_1 -> split_0 [label="left",style=dotted];
|
||||||
fanout_1 -> fanout_2 [label="right",style=dotted];
|
split_1 -> split_2 [label="right",style=dotted];
|
||||||
fanout_2 -> fanout_1 [label="left",style=dotted];
|
split_2 -> split_1 [label="left",style=dotted];
|
||||||
sam;
|
sam;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -292,9 +292,9 @@ results in:
|
|||||||
|
|
||||||
The scheduling pass has some smarts to schedule things such that only a single ``a0.x`` value is used at any one time.
|
The scheduling pass has some smarts to schedule things such that only a single ``a0.x`` value is used at any one time.
|
||||||
|
|
||||||
To implement variable arrays, values are stored in consecutive scalar registers. This has some overlap with `register groups`_, in that ``fanin`` and ``fanout`` are used to help group things for the `register assignment`_ pass.
|
To implement variable arrays, values are stored in consecutive scalar registers. This has some overlap with `register groups`_, in that ``collect`` and ``split`` are used to help group things for the `register assignment`_ pass.
|
||||||
|
|
||||||
To use a variable array as a src register, a slight variation of what is done for const array src. The instruction src is a `fanin` instruction that groups all the array members:
|
To use a variable array as a src register, a slight variation of what is done for const array src. The instruction src is a `collect` instruction that groups all the array members:
|
||||||
|
|
||||||
::
|
::
|
||||||
|
|
||||||
@@ -312,21 +312,21 @@ results in:
|
|||||||
a2 [label="r1.x"];
|
a2 [label="r1.x"];
|
||||||
a3 [label="r1.y"];
|
a3 [label="r1.y"];
|
||||||
sub;
|
sub;
|
||||||
fanin;
|
collect;
|
||||||
mova;
|
mova;
|
||||||
add;
|
add;
|
||||||
add -> sub;
|
add -> sub;
|
||||||
add -> fanin [label="off=2"];
|
add -> collect [label="off=2"];
|
||||||
add -> mova;
|
add -> mova;
|
||||||
fanin -> a0;
|
collect -> a0;
|
||||||
fanin -> a1;
|
collect -> a1;
|
||||||
fanin -> a2;
|
collect -> a2;
|
||||||
fanin -> a3;
|
collect -> a3;
|
||||||
}
|
}
|
||||||
|
|
||||||
TODO better describe how actual deref offset is derived, i.e. based on array base register.
|
TODO better describe how actual deref offset is derived, i.e. based on array base register.
|
||||||
|
|
||||||
To do an indirect write to a variable array, a ``fanout`` is used. Say the array was assigned to registers ``r0.z`` through ``r1.y`` (hence the constant offset of 2):
|
To do an indirect write to a variable array, a ``split`` is used. Say the array was assigned to registers ``r0.z`` through ``r1.y`` (hence the constant offset of 2):
|
||||||
|
|
||||||
Note that only cat1 (mov) can do indirect write.
|
Note that only cat1 (mov) can do indirect write.
|
||||||
|
|
||||||
@@ -338,7 +338,7 @@ To do an indirect write to a variable array, a ``fanout`` is used. Say the arra
|
|||||||
mul r0.x, r0.z, c0.z
|
mul r0.x, r0.z, c0.z
|
||||||
|
|
||||||
|
|
||||||
In this case, the ``mov`` instruction does not write all elements of the array (compared to usage of ``fanout`` for ``sam`` instructions in grouping_). But the ``mov`` instruction does need an additional dependency (via ``fanin``) on instructions that last wrote the array element members, to ensure that they get scheduled before the ``mov`` in scheduling_ stage (which also serves to group the array elements for the `register assignment`_ stage).
|
In this case, the ``mov`` instruction does not write all elements of the array (compared to usage of ``split`` for ``sam`` instructions in grouping_). But the ``mov`` instruction does need an additional dependency (via ``collect``) on instructions that last wrote the array element members, to ensure that they get scheduled before the ``mov`` in scheduling_ stage (which also serves to group the array elements for the `register assignment`_ stage).
|
||||||
|
|
||||||
.. graphviz::
|
.. graphviz::
|
||||||
|
|
||||||
@@ -351,20 +351,20 @@ In this case, the ``mov`` instruction does not write all elements of the array (
|
|||||||
mova;
|
mova;
|
||||||
mov;
|
mov;
|
||||||
mul;
|
mul;
|
||||||
fanout [label="fanout\noff=0"];
|
split [label="split\noff=0"];
|
||||||
mul -> fanout;
|
mul -> split;
|
||||||
fanout -> mov;
|
split -> mov;
|
||||||
fanin;
|
collect;
|
||||||
fanin -> a0;
|
collect -> a0;
|
||||||
fanin -> a1;
|
collect -> a1;
|
||||||
fanin -> a2;
|
collect -> a2;
|
||||||
fanin -> a3;
|
collect -> a3;
|
||||||
mov -> min;
|
mov -> min;
|
||||||
mov -> mova;
|
mov -> mova;
|
||||||
mov -> fanin;
|
mov -> collect;
|
||||||
}
|
}
|
||||||
|
|
||||||
Note that there would in fact be ``fanout`` nodes generated for each array element (although only the reachable ones will be scheduled, etc).
|
Note that there would in fact be ``split`` nodes generated for each array element (although only the reachable ones will be scheduled, etc).
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@@ -401,7 +401,7 @@ The eventual plan is to invert that, with the front-end inserting no ``mov``\s a
|
|||||||
Grouping
|
Grouping
|
||||||
~~~~~~~~
|
~~~~~~~~
|
||||||
|
|
||||||
In the grouping pass, instructions which need to be grouped (for ``fanin``\s, etc) have their ``left`` / ``right`` neighbor pointers setup. In cases where there is a conflict (i.e. one instruction cannot have two unique left or right neighbors), an additional ``mov`` instruction is inserted. This ensures that there is some possible valid `register assignment`_ at the later stages.
|
In the grouping pass, instructions which need to be grouped (for ``collect``\s, etc) have their ``left`` / ``right`` neighbor pointers setup. In cases where there is a conflict (i.e. one instruction cannot have two unique left or right neighbors), an additional ``mov`` instruction is inserted. This ensures that there is some possible valid `register assignment`_ at the later stages.
|
||||||
|
|
||||||
|
|
||||||
.. _depth:
|
.. _depth:
|
||||||
|
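Reviewer's sketch of the grouping step described in the last hunk: the sources of a ``collect`` get ``left`` / ``right`` neighbor pointers, and a ``mov`` is inserted when a source cannot take a second set of neighbors. This is a toy model in Python, not the ir3 implementation. The ``Instr`` class, the ``group_collect`` helper, and the conservative conflict rule (any source that already has a neighbor, or that repeats within the same ``collect``, is copied) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare instructions by identity, like pointers
class Instr:
    """Hypothetical stand-in for an IR instruction (not the real ir3 struct)."""
    name: str
    srcs: List["Instr"] = field(default_factory=list)
    left: Optional["Instr"] = None
    right: Optional["Instr"] = None


def group_collect(collect: Instr) -> None:
    """Link the sources of a collect as left/right neighbors.

    Simplified conflict rule (an assumption): a source that already has
    a neighbor, or that appears twice in this collect, is replaced by a
    fresh mov of it, so every instruction ends up in at most one group.
    """
    seen = set()
    for i, src in enumerate(collect.srcs):
        already_grouped = src.left is not None or src.right is not None
        if already_grouped or id(src) in seen:
            src = Instr("mov", srcs=[src])  # copy resolves the conflict
            collect.srcs[i] = src
        seen.add(id(src))
    # Chain neighbors so register assignment can pick consecutive registers.
    for a, b in zip(collect.srcs, collect.srcs[1:]):
        a.right = b
        b.left = a


# Two collects that want the same values in opposite orders.
coord_x = Instr("coord.x")
coord_y = Instr("coord.y")
first = Instr("collect", srcs=[coord_x, coord_y])
second = Instr("collect", srcs=[coord_y, coord_x])
group_collect(first)
group_collect(second)
```

After both calls, ``first`` still points directly at ``coord.x`` and ``coord.y``, while ``second`` points at two ``mov`` copies that form their own group. The real pass may accept more cases without a copy; the sketch only shows why a copy always leaves some valid assignment available.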