rfc9692v4.txt | rfc9692.txt | |||
---|---|---|---|---|
Internet Engineering Task Force (IETF) T. Przygienda, Ed. | Internet Engineering Task Force (IETF) T. Przygienda, Ed. | |||
Request for Comments: 9692 J. Head, Ed. | Request for Comments: 9692 J. Head, Ed. | |||
Category: Standards Track Juniper Networks | Category: Standards Track Juniper Networks | |||
ISSN: 2070-1721 A. Sharma | ISSN: 2070-1721 A. Sharma | |||
Hudson River Trading | Hudson River Trading | |||
P. Thubert | P. Thubert | |||
B. Rijsman | B. Rijsman | |||
Individual | Individual | |||
D. Afanasiev | March 2025 | |||
Yandex | ||||
January 2025 | ||||
RIFT: Routing in Fat Trees | RIFT: Routing in Fat Trees | |||
Abstract | Abstract | |||
This document defines a specialized, dynamic routing protocol for | This document defines a specialized, dynamic routing protocol for | |||
Clos, fat tree, and variants thereof. These topologies were | Clos, fat tree, and variants thereof. These topologies were | |||
initially used within crossbar interconnects and consequently router | initially used within crossbar interconnects and consequently router | |||
and switch backplanes, but their characteristics make them ideal for | and switch backplanes, but their characteristics make them ideal for | |||
constructing IP fabrics as well. The protocol specified by this | constructing IP fabrics as well. The protocol specified by this | |||
skipping to change at line 467 | skipping to change at line 465 | |||
Clos / fat tree: | Clos / fat tree: | |||
This document uses the terms "Clos" and "fat tree" interchangeably | This document uses the terms "Clos" and "fat tree" interchangeably | |||
where it always refers to a folded spine-and-leaf topology with | where it always refers to a folded spine-and-leaf topology with | |||
possibly multiple Points of Delivery (PoDs) and one or multiple | possibly multiple Points of Delivery (PoDs) and one or multiple | |||
Top of Fabric (ToF) planes. Several modifications such as L2L | Top of Fabric (ToF) planes. Several modifications such as L2L | |||
shortcuts and multi-level shortcuts are possible and described | shortcuts and multi-level shortcuts are possible and described | |||
further in the document. | further in the document. | |||
Cost: | Cost: | |||
A natural number without the unit associated with two entities. | A natural number without a unit associated with a single entity. | |||
The cost is a monoid under addition. A cost may be associated | The cost is a monoid under addition. A cost may be associated | |||
with either a single link or prefix, or it may represent the sum | with either a single link or prefix, or it may represent the sum | |||
of costs (distance) of links in the path between two nodes. | of costs (distance) of links in the path between two nodes. | |||
Crossbar: | Crossbar: | |||
Physical arrangement of ports in a switching matrix without | Physical arrangement of ports in a switching matrix without | |||
implying any further scheduling or buffering disciplines. | implying any further scheduling or buffering disciplines. | |||
Directed Acyclic Graph (DAG): | Directed Acyclic Graph (DAG): | |||
A finite directed graph with no directed cycles (loops). If links | A finite directed graph with no directed cycles (loops). If links | |||
skipping to change at line 1743 | skipping to change at line 1741 | |||
derived by the node automatically. | derived by the node automatically. | |||
Further leaf flag definitions are found in Section 6.7 as they have | Further leaf flag definitions are found in Section 6.7 as they have | |||
implications in terms of level and adjacency formation. Leaf flags | implications in terms of level and adjacency formation. Leaf flags | |||
are carried in _HierarchyIndications_. | are carried in _HierarchyIndications_. | |||
A node MUST form a _ThreeWay_ adjacency if, at a minimum, the | A node MUST form a _ThreeWay_ adjacency if, at a minimum, the | |||
following first order logic conditions are satisfied on a LIE packet, | following first order logic conditions are satisfied on a LIE packet, | |||
as specified by the _LIEPacket_ schema element and received on a link | as specified by the _LIEPacket_ schema element and received on a link | |||
(such a LIE is considered a "minimally valid" LIE). Observe that, | (such a LIE is considered a "minimally valid" LIE). Observe that, | |||
depending on the FSM involved and its state further, conditions may | depending on the FSM involved and its state, further conditions may | |||
be checked, and even a minimally valid LIE can be considered | be checked, and even a minimally valid LIE can be considered | |||
ultimately invalid if any of the additional conditions fail: | ultimately invalid if any of the additional conditions fail: | |||
1. the neighboring node is running the same major schema version as | 1. the neighboring node is running the same major schema version as | |||
indicated in the _major_version_ element in _PacketHeader_ *and* | indicated in the _major_version_ element in _PacketHeader_ *and* | |||
2. the neighboring node uses a valid System ID (i.e., a value | 2. the neighboring node uses a valid System ID (i.e., a value | |||
different from _IllegalSystemID_) in the _sender_ element in | different from _IllegalSystemID_) in the _sender_ element in | |||
_PacketHeader_ *and* | _PacketHeader_ *and* | |||
skipping to change at line 1964 | skipping to change at line 1962 | |||
1. if LIE has a major version not equal to this node's major | 1. if LIE has a major version not equal to this node's major | |||
version *or* System ID equal to this node's System ID or | version *or* System ID equal to this node's System ID or | |||
_IllegalSystemID_, then CLEANUP, else | _IllegalSystemID_, then CLEANUP, else | |||
2. if both sides advertise Layer 2 MTU values and the MTU in the | 2. if both sides advertise Layer 2 MTU values and the MTU in the | |||
received LIE does not match the MTU advertised by the local | received LIE does not match the MTU advertised by the local | |||
system *or* at least one of the nodes does not advertise an | system *or* at least one of the nodes does not advertise an | |||
MTU value and the advertising node's LIE does not match the | MTU value and the advertising node's LIE does not match the | |||
_default_mtu_size_ of the system not advertising an MTU, then | _default_mtu_size_ of the system not advertising an MTU, then | |||
CLEANUP, PUSH UpdateZTPOffer, and PUSH MTUMismatch, else | CLEANUP, then PUSH UpdateZTPOffer, then PUSH MTUMismatch, else | |||
3. if the LIE has an undefined level *or* this node's level is | 3. if the LIE has an undefined level *or* this node's level is | |||
undefined *or* this node is a leaf and the remote level is | undefined *or* this node is a leaf and the remote level is | |||
lower than HAT *or* the LIE's level is not leaf *and* its | lower than HAT *or* the LIE's level is not leaf *and* its | |||
difference is more than one from this node's level, then | difference is more than one from this node's level, then | |||
CLEANUP, PUSH UpdateZTPOffer, and PUSH UnacceptableHeader, | CLEANUP, then PUSH UpdateZTPOffer, then PUSH | |||
else | UnacceptableHeader, else | |||
4. PUSH UpdateZTPOffer, construct a temporary new neighbor | 4. PUSH UpdateZTPOffer, construct a temporary new neighbor | |||
structure with values from LIE, if no current neighbor exists, | structure with values from LIE, if no current neighbor exists, | |||
then set current neighbor to new neighbor, PUSH NewNeighbor | then set current neighbor to new neighbor, PUSH NewNeighbor | |||
event, CHECK_THREE_WAY, else | event, CHECK_THREE_WAY, else | |||
a. if the current neighbor System ID differs from LIE's | a. if the current neighbor System ID differs from LIE's | |||
System ID, then PUSH MultipleNeighbors, else | System ID, then PUSH MultipleNeighbors, else | |||
b. if the current neighbor stored level differs from LIE's | b. if the current neighbor stored level differs from LIE's | |||
skipping to change at line 4160 | skipping to change at line 4158 | |||
+---> | Via S4 | +---> | Via S4 | +---> | Via S4 | | +---> | Via S4 | +---> | Via S4 | +---> | Via S4 | | |||
+--------+ +--------+ +--------+ | +--------+ +--------+ +--------+ | |||
Figure 27: Abstract FIB After Negative 2001:db8:2::/48 from S4 | Figure 27: Abstract FIB After Negative 2001:db8:2::/48 from S4 | |||
6.7. Optional Zero Touch Provisioning (RIFT ZTP) | 6.7. Optional Zero Touch Provisioning (RIFT ZTP) | |||
Each RIFT node can operate in Zero Touch Provisioning (ZTP) mode, | Each RIFT node can operate in Zero Touch Provisioning (ZTP) mode, | |||
i.e., it has no RIFT-specific configuration (unless it is a ToF or it | i.e., it has no RIFT-specific configuration (unless it is a ToF or it | |||
is explicitly configured to operate in the overall topology as a leaf | is explicitly configured to operate in the overall topology as a leaf | |||
and/or support L2L procedures), and it will fully, automatically | and/or support L2L procedures), and it will fully automatically | |||
derive necessary RIFT parameters itself after being attached to the | derive necessary RIFT parameters itself after being attached to the | |||
topology. Manually configured nodes and nodes operating using RIFT | topology. Manually configured nodes and nodes operating using RIFT | |||
ZTP can be mixed freely and will form a valid topology if achievable. | ZTP can be mixed freely and will form a valid topology if achievable. | |||
The derivation of the level of each node happens based on offers | The derivation of the level of each node happens based on offers | |||
received from its neighbors, whereas each node (with the possible | received from its neighbors, whereas each node (with the possible | |||
exception of nodes configured as leaves) tries to attach at the | exception of nodes configured as leaves) tries to attach at the | |||
highest possible point in the fabric. This guarantees that even if | highest possible point in the fabric. This guarantees that even if | |||
the diffusion front of offers reaches a node from "below" faster than | the diffusion front of offers reaches a node from "below" faster than | |||
from "above", it will greedily abandon an already negotiated level | from "above", it will greedily abandon an already negotiated level | |||
skipping to change at line 8425 | skipping to change at line 8423 | |||
Email: as3957@gmail.com | Email: as3957@gmail.com | |||
Pascal Thubert | Pascal Thubert | |||
Individual | Individual | |||
France | France | |||
Email: pascal.thubert@gmail.com | Email: pascal.thubert@gmail.com | |||
Bruno Rijsman | Bruno Rijsman | |||
Individual | Individual | |||
Email: brunorijsman@gmail.com | Email: brunorijsman@gmail.com | |||
Dmitry Afanasiev | ||||
Yandex | ||||
Email: fl0w@yandex-team.ru | ||||
End of changes. 7 change blocks. | ||||
9 lines changed or deleted | 7 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. |
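
The LIE processing steps 1 to 3 in the change block at line 1962 (right-hand, published wording, where the actions run in the order "CLEANUP, then PUSH UpdateZTPOffer, then PUSH ...") can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the RFC's normative state machine. The names `Lie`, `Node`, `process_lie`, `mtu_mismatch`, and `unacceptable_header` are hypothetical. The values chosen for `ILLEGAL_SYSTEM_ID` and `LEAF_LEVEL` are placeholders. "This node is a leaf" is modeled as the node's level being equal to `LEAF_LEVEL`, both systems are assumed to share one `default_mtu_size`, and the case where neither side advertises an MTU is treated as a match because the quoted text does not cover it. Step 4 is only represented by its first action.

```python
from dataclasses import dataclass
from typing import List, Optional

ILLEGAL_SYSTEM_ID = 0  # assumption: placeholder for IllegalSystemID
LEAF_LEVEL = 0         # assumption: placeholder for the leaf level value


@dataclass
class Lie:
    major_version: int
    system_id: int
    level: Optional[int]  # None models an undefined level
    mtu: Optional[int]    # None models "MTU not advertised"


@dataclass
class Node:
    major_version: int
    system_id: int
    level: Optional[int]
    mtu: Optional[int]
    default_mtu_size: int      # assumption: same default on both systems
    hat: Optional[int] = None  # HAT, if one is known


def mtu_mismatch(node: Node, lie: Lie) -> bool:
    """Step 2 condition."""
    if node.mtu is not None and lie.mtu is not None:
        return node.mtu != lie.mtu
    if node.mtu is None and lie.mtu is None:
        return False  # assumption: not covered by the quoted text
    # Exactly one side advertises: compare it with the shared default.
    advertised = lie.mtu if lie.mtu is not None else node.mtu
    return advertised != node.default_mtu_size


def unacceptable_header(node: Node, lie: Lie) -> bool:
    """Step 3 condition."""
    if lie.level is None or node.level is None:
        return True
    if node.level == LEAF_LEVEL and node.hat is not None and lie.level < node.hat:
        return True
    return lie.level != LEAF_LEVEL and abs(lie.level - node.level) > 1


def process_lie(node: Node, lie: Lie) -> List[str]:
    """Return the ordered actions for steps 1 to 3; step 4 is not modeled."""
    # Step 1: version mismatch, own System ID, or illegal System ID.
    if (lie.major_version != node.major_version
            or lie.system_id in (node.system_id, ILLEGAL_SYSTEM_ID)):
        return ["CLEANUP"]
    # Step 2: MTU check; CLEANUP runs before the two PUSH actions.
    if mtu_mismatch(node, lie):
        return ["CLEANUP", "PUSH UpdateZTPOffer", "PUSH MTUMismatch"]
    # Step 3: level check, same ordering of actions.
    if unacceptable_header(node, lie):
        return ["CLEANUP", "PUSH UpdateZTPOffer", "PUSH UnacceptableHeader"]
    # Step 4 starts with PUSH UpdateZTPOffer; the rest is out of scope here.
    return ["PUSH UpdateZTPOffer"]
```

Returning the actions as an ordered list makes the sequencing that the revised wording spells out ("then ... then") directly visible to a caller or a test.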