rfc9689.original   rfc9689.txt 
TEAS Working Group Z. Li Internet Engineering Task Force (IETF) Z. Li
Internet-Draft D. Dhody Request for Comments: 9689 D. Dhody
Intended status: Informational Huawei Technologies Category: Informational Huawei Technologies
Expires: 2 December 2024 Q. Zhao ISSN: 2070-1721 Q. Zhao
Etheric Networks Etheric Networks
K. He K. He
Tencent Holdings Ltd. Tencent Holdings Ltd.
B. Khasanov B. Khasanov
Yandex LLC Yandex LLC
31 May 2024 November 2024
Use Cases for a PCE as a Central Controller (PCECC) Use Cases for a PCE as a Central Controller (PCECC)
draft-ietf-teas-pcecc-use-cases-18
Abstract Abstract
The PCE is a core component of a Software-Defined Networking (SDN) The PCE is a core component of a Software-Defined Networking (SDN)
system. It can be used to compute optimal paths for network traffic system. It can be used to compute optimal paths for network traffic
and update existing paths to reflect changes in the network or and update existing paths to reflect changes in the network or
traffic demands. PCE was developed to derive traffic-engineered traffic demands. The PCE was developed to derive traffic-engineered
paths in MPLS networks, which are supplied to the head end of the (TE) paths in MPLS networks, which are supplied to the head end of
paths using the Path Computation Element Communication Protocol the paths using the Path Computation Element Communication Protocol
(PCEP). (PCEP).
SDN has much broader applicability than signaled MPLS traffic- SDN has much broader applicability than signalled MPLS TE networks,
engineered (TE) networks, and the PCE may be used to determine paths and the PCE may be used to determine paths in a range of use cases
in a range of use cases including static LSPs, Segment Routing (SR), including static Label-Switched Paths (LSPs), Segment Routing (SR),
Service Function Chaining (SFC), and most forms of a routed or Service Function Chaining (SFC), and most forms of a routed or
switched network. It is, therefore, reasonable to consider PCEP as a switched network. Therefore, it is reasonable to consider PCEP as a
control protocol for use in these environments to allow the PCE to be control protocol for use in these environments to allow the PCE to be
fully enabled as a central controller. fully enabled as a central controller.
A PCE as a Central Controller (PCECC) can simplify the processing of A PCE as a Central Controller (PCECC) can simplify the processing of
a distributed control plane by blending it with elements of SDN a distributed control plane by blending it with elements of SDN
without necessarily completely replacing it. This document describes without necessarily completely replacing it. This document describes
general considerations for PCECC deployment and examines its general considerations for PCECC deployment and examines its
applicability and benefits, as well as its challenges and applicability and benefits, as well as its challenges and
limitations, through a number of use cases. PCEP extensions which limitations, through a number of use cases. PCEP extensions, which
are required for the PCECC use cases are covered in separate are required for the PCECC use cases, are covered in separate
documents. documents.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This document is not an Internet Standards Track specification; it is
provisions of BCP 78 and BCP 79. published for informational purposes.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months This document is a product of the Internet Engineering Task Force
and may be updated, replaced, or obsoleted by other documents at any (IETF). It represents the consensus of the IETF community. It has
time. It is inappropriate to use Internet-Drafts as reference received public review and has been approved for publication by the
material or to cite them other than as "work in progress." Internet Engineering Steering Group (IESG). Not all documents
approved by the IESG are candidates for any level of Internet
Standard; see Section 2 of RFC 7841.
This Internet-Draft will expire on 2 December 2024. Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
https://www.rfc-editor.org/info/rfc9689.
Copyright Notice Copyright Notice
Copyright (c) 2024 IETF Trust and the persons identified as the Copyright (c) 2024 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents
license-info) in effect on the date of publication of this document. (https://trustee.ietf.org/license-info) in effect on the date of
Please review these documents carefully, as they describe your rights publication of this document. Please review these documents
and restrictions with respect to this document. Code Components carefully, as they describe your rights and restrictions with respect
extracted from this document must include Revised BSD License text as to this document. Code Components extracted from this document must
described in Section 4.e of the Trust Legal Provisions and are include Revised BSD License text as described in Section 4.e of the
provided without warranty as described in the Revised BSD License. Trust Legal Provisions and are provided without warranty as described
in the Revised BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Terminology
3. Use Cases . . . . . . . . . . . . . . . . . . . . . . . . . . 4 3. Use Cases
3.1. PCECC for Label Management . . . . . . . . . . . . . . . 5 3.1. PCECC for Label Management
3.2. PCECC and Segment Routing . . . . . . . . . . . . . . . . 7 3.2. PCECC and Segment Routing
3.2.1. PCECC SID Allocation for SR-MPLS . . . . . . . . . . 8 3.2.1. PCECC SID Allocation for SR-MPLS
3.2.2. PCECC for SR-MPLS Best Effort (BE) Path . . . . . . . 9 3.2.2. PCECC for SR-MPLS Best Effort (BE) Paths
3.2.3. PCECC for SR-MPLS TE Path . . . . . . . . . . . . . . 9 3.2.3. PCECC for SR-MPLS TE Paths
3.2.4. PCECC for SRv6 . . . . . . . . . . . . . . . . . . . 12 3.2.4. PCECC for SRv6
3.3. PCECC for Static TE LSP . . . . . . . . . . . . . . . . . 14 3.3. PCECC for Static TE LSPs
3.4. PCECC for Load Balancing (LB) . . . . . . . . . . . . . . 16 3.4. PCECC for Load Balancing (LB)
3.5. PCECC and Inter-AS TE . . . . . . . . . . . . . . . . . . 18 3.5. PCECC and Inter-AS TE
3.6. PCECC for Multicast LSPs . . . . . . . . . . . . . . . . 21 3.6. PCECC for Multicast LSPs
3.6.1. PCECC for P2MP/MP2MP LSPs' Setup . . . . . . . . . . 21 3.6.1. PCECC for the Setup of P2MP/MP2MP LSPs
3.6.2. PCECC for the End-to-End Protection of P2MP/MP2MP 3.6.2. PCECC for the End-to-End Protection of P2MP/MP2MP LSPs
LSPs . . . . . . . . . . . . . . . . . . . . . . . . 24 3.6.3. PCECC for the Local Protection of P2MP/MP2MP LSPs
3.6.3. PCECC for the Local Protection of the P2MP/MP2MP 3.7. PCECC for Traffic Classification
LSPs . . . . . . . . . . . . . . . . . . . . . . . . 25 3.8. PCECC for SFC
3.7. PCECC for Traffic Classification . . . . . . . . . . . . 26 3.9. PCECC for Native IP
3.8. PCECC for SFC . . . . . . . . . . . . . . . . . . . . . . 27 3.10. PCECC for BIER
3.9. PCECC for Native IP . . . . . . . . . . . . . . . . . . . 28 4. IANA Considerations
3.10. PCECC for BIER . . . . . . . . . . . . . . . . . . . . . 29 5. Security Considerations
4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 29 6. References
5. Security Considerations . . . . . . . . . . . . . . . . . . . 29 6.1. Normative References
6. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 30 6.2. Informative References
7. References . . . . . . . . . . . . . . . . . . . . . . . . . 30 Appendix A. Other Use Cases of the PCECC
7.1. Normative References . . . . . . . . . . . . . . . . . . 30 A.1. PCECC for Network Migration
7.2. Informative References . . . . . . . . . . . . . . . . . 31 A.2. PCECC for L3VPN and PWE3
Appendix A. Other Use Cases of PCECC . . . . . . . . . . . . . . 38 A.3. PCECC for Local Protection (RSVP-TE)
A.1. PCECC for Network Migration . . . . . . . . . . . . . . . 38 A.4. Using Reliable P2MP TE-Based Multicast Delivery for
A.2. PCECC for L3VPN and PWE3 . . . . . . . . . . . . . . . . 39 Distributed Computations (MapReduce-Hadoop)
A.3. PCECC for Local Protection (RSVP-TE) . . . . . . . . . . 40 Acknowledgments
A.4. Using reliable P2MP TE based multicast delivery for Contributors
distributed computations (MapReduce-Hadoop) . . . . . . . 41 Authors' Addresses
Appendix B. Contributor Addresses . . . . . . . . . . . . . . . 43
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 44
1. Introduction 1. Introduction
The PCE [RFC4655] was developed to offload the path computation The PCE [RFC4655] was developed to offload the path computation
function from routers in an MPLS traffic-engineered (TE) network. It function from routers in an MPLS traffic-engineered (TE) network. It
can compute optimal paths for traffic across a network and can also can compute optimal paths for traffic across a network and can also
update the paths to reflect changes in the network or traffic update the paths to reflect changes in the network or traffic
demands. The role and function of PCE have grown to cover several demands. The role and function of the PCE have grown to cover
other uses (such as GMPLS [RFC7025] or Multicast), and to allow several other uses (such as GMPLS [RFC7025] or Multicast) and to
delegated stateful control [RFC8231] and PCE-initiated use of network allow delegated stateful control [RFC8231] and PCE-initiated use of
resources [RFC8281]. network resources [RFC8281].
According to [RFC7399], Software-Defined Networking (SDN) refers to a According to [RFC7399], Software-Defined Networking (SDN) refers to a
separation between the control elements and the forwarding components separation between the control elements and the forwarding components
so that software running in a centralized system, called a so that software running in a centralized system, called a
controller, can act to program the devices in the network to behave "controller", can act to program the devices in the network to behave
in specific ways. A required element in an SDN architecture is a in specific ways. A required element in an SDN architecture is a
component that plans how the network resources will be used and how component that plans how the network resources will be used and how
the devices will be programmed. It is possible to view this the devices will be programmed. It is possible to view this
component as performing specific computations to place traffic flows component as performing specific computations to place traffic flows
within the network given knowledge of the availability of the network within the network given knowledge of the availability of the network
resources, how other forwarding devices are programmed, and the way resources, how other forwarding devices are programmed, and the way
that other flows are routed. This is the function and purpose of a that other flows are routed. This is the function and purpose of a
PCE, and the way that a PCE integrates into a wider network control PCE, and the way that a PCE integrates into a wider network control
system (including an SDN system) is presented in [RFC7491]. system (including an SDN system) is presented in [RFC7491].
[RFC8283] introduces the architecture for the PCE as a central [RFC8283] introduces the architecture for the PCE as a central
controller as an extension to the architecture described in [RFC4655] controller as an extension to the architecture described in [RFC4655]
and assumes the continued use of PCEP as the protocol used between and assumes the continued use of PCEP as the protocol used between
the PCE and PCC. [RFC8283] further examines the motivations and the PCE and Path Computation Client (PCC). [RFC8283] further
applicability of PCEP as a Southbound Interface (SBI) and introduces examines the motivations and applicability of PCEP as a Southbound
the implications for the protocol. Interface (SBI) and introduces the implications for the protocol.
[RFC9050] introduces the procedures and extensions for PCEP to [RFC9050] introduces the procedures and extensions for PCEP to
support the PCECC architecture [RFC8283]. support the PCECC architecture [RFC8283].
This document describes the various use cases for the PCECC This document describes the various use cases for the PCECC
architecture. architecture.
2. Terminology 2. Terminology
The following terminology is used in this document. The following terminology is used in this document.
BGP-LS: Border Gateway Protocol - Link State [RFC9552]. AS: Autonomous System
LSP: Label Switched Path. ASBR: Autonomous System Border Router
IGP: Interior Gateway Protocol. In the document, we assume either BGP-LS: Border Gateway Protocol - Link State [RFC9552]
Open Shortest Path First (OSPF) [RFC2328][RFC5340] or Intermediate
System to Intermediate System (IS-IS) [RFC1195] as IGP.
PCC: Path Computation Client. As per [RFC4655], any client IGP: Interior Gateway Protocol (in this document, we assume IGP as
application requesting a path computation to be performed by a Path either Open Shortest Path First (OSPF) [RFC2328] [RFC5340] or
Computation Element. Intermediate System to Intermediate System (IS-IS) [RFC1195])
PCE: Path Computation Element. As per [RFC4655], an entity LSP: Label-Switched Path
(component, application, or network node) that is capable of
computing a network path or route based on a network graph and
applying computational constraints.
PCECC: PCE as a Central Controller. Extension of PCE to support SDN PCC: Path Computation Client (as per [RFC4655], any client
functions as per [RFC8283]. application requesting a path computation to be performed by a PCE)
PST: Path Setup Type [RFC8408]. PCE: Path Computation Element (as per [RFC4655], an entity such as a
component, application, or network node that is capable of computing
a network path or route based on a network graph and applying
computational constraints)
RR: Route Reflector [RFC4456]. PCECC: PCE as a Central Controller (an extension of a PCE to support
SDN functions as per [RFC8283])
SID: Segment Identifier [RFC8402]. PST: Path Setup Type [RFC8408]
SR: Segment Routing [RFC8402]. RR: Route Reflector [RFC4456]
SRGB: Segment Routing Global Block [RFC8402]. SID: Segment Identifier [RFC8402]
SRLB: Segment Routing Local Block [RFC8402]. SR: Segment Routing [RFC8402]
TE: Traffic Engineering [RFC9522]. SRGB: Segment Routing Global Block [RFC8402]
SRLB: Segment Routing Local Block [RFC8402]
TE: Traffic Engineering [RFC9522]
3. Use Cases 3. Use Cases
[RFC8283] describes various use cases for PCECC such as: [RFC8283] describes various use cases for a PCECC such as:
* Use of PCECC to set up Static TE LSPs. The PCEP extension for * use of a PCECC to set up static TE LSPs (the PCEP extension for
this use case is in [RFC9050]. this use case is in [RFC9050])
* Use of PCECC in Segment Routing [RFC8402]. * use of a PCECC in Segment Routing [RFC8402]
* Use of PCECC to set up Multicast Point-to-Multipoint (P2MP) LSP. * use of a PCECC to set up Multicast Point-to-Multipoint (P2MP) LSPs
* Use of PCECC to set up Service Function Chaining (SFC) [RFC7665]. * use of a PCECC to set up Service Function Chaining (SFC) [RFC7665]
* Use of PCECC in Optical Networks. * use of a PCECC in optical networks
Section 3.1 describes the general case of PCECC being in charge of Section 3.1 describes the general case of a PCECC being in charge of
managing MPLS label space which is a prerequisite for further use managing MPLS label space, which is a prerequisite for further use
cases. Further, various use cases (SR, Multicast etc) are described cases. Further, various use cases (SR, Multicast, etc.) are
in the following sections to showcase scenarios that can benefit from described in the following sections to showcase scenarios that can
the use of PCECC. benefit from the use of a PCECC.
It is interesting to note that some of the use cases listed can also It is interesting to note that some of the use cases listed can also
be supported via BGP instead of PCEP. However, within the scope of be supported via BGP instead of PCEP. However, within the scope of
this document, the focus is on the use of PCEP. this document, the focus is on the use of PCEP.
3.1. PCECC for Label Management 3.1. PCECC for Label Management
As per [RFC8283], in some cases, the PCE-based controller can take As per [RFC8283], in some cases, the PCE-based controller can take
responsibility for managing some part of the MPLS label space for responsibility for managing some part of the MPLS label space for
each of the routers that it controls, and it may take wider each of the routers that it controls, and it may take wider
skipping to change at page 5, line 43 skipping to change at line 231
ranges to the router using PCEP. ranges to the router using PCEP.
[RFC9050] describes a mode where LSPs are provisioned as explicit [RFC9050] describes a mode where LSPs are provisioned as explicit
label instructions at each hop on the end-to-end path. Each router label instructions at each hop on the end-to-end path. Each router
along the path must be told what label forwarding instructions to along the path must be told what label forwarding instructions to
program and what resources to reserve. The controller uses PCEP to program and what resources to reserve. The controller uses PCEP to
communicate with each router along the path of the end-to-end LSP. communicate with each router along the path of the end-to-end LSP.
For this to work, the PCE-based controller will take responsibility For this to work, the PCE-based controller will take responsibility
for managing some part of the MPLS label space for each of the for managing some part of the MPLS label space for each of the
routers that it controls. An extension to PCEP could be done to routers that it controls. An extension to PCEP could be done to
allow a PCC to inform the PCE of such a label space to control. (See allow a PCC to inform the PCE of such a label space to control (see
[I-D.li-pce-controlled-id-space] for a possible PCEP extension to [PCE-ID] for a possible PCEP extension to support the advertisement
support the advertisement of the MPLS label space to the PCE to of the MPLS label space for the PCE to control).
control.)
[RFC8664] specifies extensions to PCEP that allow a stateful PCE to [RFC8664] specifies extensions to PCEP that allow a stateful PCE to
compute, update or initiate SR-TE paths. compute, update, or initiate SR-TE paths. [PCECC-SR] describes the
[I-D.ietf-pce-pcep-extension-pce-controller-sr] describes the mechanism for a PCECC to allocate and provision the node/prefix/
mechanism for PCECC to allocate and provision the node/prefix/
adjacency label (Segment Identifier (SID)) via PCEP. To make adjacency label (Segment Identifier (SID)) via PCEP. To make
such an allocation PCE needs to be aware of the label space from the such an allocation, the PCE needs to be aware of the label space from
Segment Routing Global Block (SRGB) or Segment Routing Local Block the Segment Routing Global Block (SRGB) or Segment Routing Local
(SRLB) [RFC8402] of the node that it controls. A mechanism for a PCC Block (SRLB) [RFC8402] of the node that it controls. A mechanism for
to inform the PCE of such a label space to control is needed within a PCC to inform the PCE of such a label space to control is needed
the PCEP. The full SRGB/SRLB of a node could be learned via existing within the PCEP. The full SRGB/SRLB of a node could be learned via
IGP or BGP-LS [RFC9552] mechanisms. existing IGP or BGP-LS [RFC9552] mechanisms.
Further, there have been proposals for a global label range in MPLS, Further, there have been proposals for a global label range in MPLS
the PCECC architecture could be used as a means to learn the label as well as PCECC architecture being used as a means to learn the
space of nodes, and could also be used to determine and provision the label space of nodes and being used to determine and provision the
global label range. global label range.
+------------------------------+    +------------------------------+
|         PCE DOMAIN 1         |    |         PCE DOMAIN 2         |
|          +--------+          |    |          +--------+          |
|          |        |          |    |          |        |          |
|          | PCECC1 |  ---------PCEP---------- | PCECC2 |          |
|          |        |          |    |          |        |          |
|          |        |          |    |          |        |          |
|          +--------+          |    |          +--------+          |
|         ^          ^         |    |         ^          ^         |
|        /            \  PCEP  |    |  PCEP  /            \        |
|       V              V       |    |       V              V       |
| +--------+        +--------+ |    | +--------+        +--------+ |
| |NODE 11 |        | NODE 1n| |    | |NODE 21 |        | NODE 2n| |
| |        | ...... |        | |    | |        | ...... |        | |
| | PCECC  |        |  PCECC | |    | | PCECC  |        |PCECC   | |
| |Enabled |        | Enabled| |    | |Enabled |        |Enabled | |
| +--------+        +--------+ |    | +--------+        +--------+ |
|                              |    |                              |
+------------------------------+    +------------------------------+

Figure 1: PCECC for MPLS Label Management
* As shown in Figure 1, PCC will advertise the PCECC capability to * As shown in Figure 1, the PCC will advertise the PCECC capability
the PCE central controller (PCECC) [RFC9050]. to the PCECC [RFC9050].
* The PCECC could also learn the label range set aside by the PCC * The PCECC could also learn the label range set aside by the PCC
(via [I-D.li-pce-controlled-id-space]). (via [PCE-ID]).
* Optionally, the PCECC could determine the shared MPLS global label * Optionally, the PCECC could determine the shared MPLS global label
range for the network. range for the network.
- In the case that the shared global label range needs to be - In the case that the shared global label range needs to be
negotiated across multiple domains, the central controllers of negotiated across multiple domains, the central controllers of
these domains will also need to negotiate a common global label these domains will also need to negotiate a common global label
range across domains. range across domains.
- The PCECC will need to set the shared global label range to all - The PCECC will need to set the shared global label range to all
PCC nodes in the network. PCC nodes in the network.
As per [RFC9050], PCECC could also rely on the PCC to make label As per [RFC9050], the PCECC could also rely on the PCC to make label
allocations initially and use PCEP to distribute it to where it is allocations initially and use PCEP to distribute it to where it is
needed. needed.
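The label-management steps in Section 3.1 can be sketched as controller-side bookkeeping. The following is a minimal illustration only, not a PCEP implementation: the class and method names (LabelSpaceManager, learn_range, allocate, shared_global_range) are hypothetical, and deriving the shared global range as the overlap of the per-node ranges is an assumption made for the example, since the text does not specify how the range is determined.

```python
from dataclasses import dataclass, field


@dataclass
class LabelRange:
    """Inclusive range of MPLS labels that a PCC sets aside for the PCECC."""
    start: int
    end: int


@dataclass
class LabelSpaceManager:
    """Hypothetical PCECC-side bookkeeping (illustrative only)."""
    ranges: dict = field(default_factory=dict)     # node -> LabelRange
    allocated: dict = field(default_factory=dict)  # node -> set of labels

    def learn_range(self, node, start, end):
        # Models the PCC informing the PCECC of the label space to control.
        if start > end:
            raise ValueError("empty label range")
        self.ranges[node] = LabelRange(start, end)
        self.allocated.setdefault(node, set())

    def allocate(self, node):
        # Hand out the lowest free label in the node's delegated range.
        rng = self.ranges[node]
        for label in range(rng.start, rng.end + 1):
            if label not in self.allocated[node]:
                self.allocated[node].add(label)
                return label
        raise RuntimeError(f"label space exhausted on {node}")

    def shared_global_range(self):
        # Assumption: the global range is the overlap of all delegated ranges.
        if not self.ranges:
            return None
        start = max(r.start for r in self.ranges.values())
        end = min(r.end for r in self.ranges.values())
        return LabelRange(start, end) if start <= end else None
```

For instance, if NODE 11 delegates labels 5000-5999 and NODE 1n delegates 5500-6499 (made-up values), shared_global_range() yields 5500-5999, and a range negotiated across domains could be computed the same way from the per-domain results.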
3.2. PCECC and Segment Routing 3.2. PCECC and Segment Routing
Segment Routing (SR) [RFC8402] leverages the source routing paradigm. Segment Routing (SR) [RFC8402] leverages the source routing paradigm.
Using SR, a source node steers a packet through a path without Using SR, a source node steers a packet through a path without
relying on hop-by-hop signalling protocols such as LDP [RFC5036] or relying on hop-by-hop signalling protocols such as LDP [RFC5036] or
RSVP-TE [RFC3209]. Each path is specified as an ordered list of RSVP-TE [RFC3209]. Each path is specified as an ordered list of
instructions called "segments". Each segment is an instruction to instructions called "segments". Each segment is an instruction to
route the packet to a specific place in the network, or to perform a route the packet to a specific place in the network or to perform a
specific service on the packet. A database of segments can be specific service on the packet. A database of segments can be
distributed through the network using a intra-domain routing protocol distributed through the network using an intra-domain routing
(such as IS-IS or OSPF) or an inter-domain protocol (BGP), or by any protocol (such as IS-IS or OSPF), an inter-domain protocol (such as
other means. PCEP could also be one of other protocols. BGP), or by any other means. PCEP could also be one of other
protocols.
[RFC8664] specifies the SR-specific PCEP extension for SR-MPLS. [RFC8664] specifies the PCEP extension specific to SR for SR over
PCECC may further use PCEP protocol for SR SIDs (Segment Identifiers) MPLS (SR-MPLS). The PCECC may further use the PCEP protocol for
distribution to the SR nodes (PCC) with some benefits. If the PCECC distributing SR Segment Identifiers (SIDs) to the SR nodes (PCC) with
allocates and maintains the SIDs in the network for the nodes and some benefits. If the PCECC allocates and maintains the SIDs in the
adjacencies; and further distributes them to the SR nodes directly network for the nodes and adjacencies, and further distributes them
via the PCEP session then it is more advantageous over the to the SR nodes directly via the PCEP session, then it is more
configurations on each SR node and flooding them via IGP, especially advantageous over the configurations on each SR node and flooding
in an SDN environment. them via IGP, especially in an SDN environment.
When the PCECC is used for the distribution of the Node-SID and Adj- When the PCECC is used for the distribution of the Node-SID and Adj-
SID for SR-MPLS, the Node-SID is allocated from the SRGB of the node. SID for SR-MPLS, the Node-SID is allocated from the SRGB of the node
For the allocation of Adj-SID, the allocation is from the SRLB of the and the Adj-SID is allocated from the SRLB of the node as described
node as described in [I-D.ietf-pce-pcep-extension-pce-controller-sr]. in [PCECC-SR].
[RFC8355] identifies various protection and resiliency usecases for [RFC8355] identifies various protection and resiliency use cases for
SR. Path protection lets the ingress node be in charge of the SR. Path protection lets the ingress node be in charge of the
failure recovery (used for SR-TE [RFC8664]). Also, protection can be failure recovery (used for SR-TE [RFC8664]). Also, protection can be
performed by the node adjacent to the failed component, commonly performed by the node adjacent to the failed component, commonly
referred to as local protection techniques or fast-reroute (FRR) referred to as "local protection techniques" or "fast-reroute (FRR)
techniques. In the case of PCECC, the protection paths can be pre- techniques". In the case of the PCECC, the protection paths can be
computed and set up by the PCE. precomputed and set up by the PCE.
The Figure 2 illustrates the use case where the Node-SID and Adj-SID Figure 2 illustrates the use case where the Node-SID and Adj-SID are
are allocated by the PCECC for SR-MPLS. allocated by the PCECC for SR-MPLS.
192.0.2.1/32 192.0.2.1/32
+----------+ +----------+
| R1(1001) | | R1(1001) |
+----------+ +----------+
| |
+----------+ +----------+
| R2(1002) | 192.0.2.2/32 | R2(1002) | 192.0.2.2/32
+----------+ +----------+
* | * * * | * *
skipping to change at page 8, line 38 skipping to change at line 366
| R8(1008) | 192.0.2.8/32 | R8(1008) | 192.0.2.8/32
+-----------+ +-----------+
Figure 2: SR Topology Figure 2: SR Topology
3.2.1. PCECC SID Allocation for SR-MPLS 3.2.1. PCECC SID Allocation for SR-MPLS
Each node (PCC) is allocated a Node-SID by the PCECC. The PCECC Each node (PCC) is allocated a Node-SID by the PCECC. The PCECC
needs to update the label mapping of each node to all the other nodes needs to update the label mapping of each node to all the other nodes
in the domain. After receiving the label mapping, each node (PCC) in the domain. After receiving the label mapping, each node (PCC)
uses the local routing information to determine the nexthop and uses the local routing information to determine the next hop and
download the label forwarding instructions accordingly. The download the label forwarding instructions accordingly. The
forwarding behaviour and the end result are the same as IGP shortest- forwarding behavior and the end result are the same as IGP shortest-
path SR forwarding based on Node-SID. Thus, from anywhere in the path SR forwarding based on Node-SIDs. Thus, from anywhere in the
domain, it enforces the ECMP-aware shortest-path forwarding of the domain, it enforces the ECMP-aware shortest-path forwarding of the
packet towards the related node. packet towards the related node.
For each adjacency in the network, a PCECC can allocate an Adj-SID. The PCECC can allocate an Adj-SID for each adjacency in the network.
The PCECC sends a PCInitiate message to update the label mapping of The PCECC sends a PCInitiate message to update the label mapping of
each adjacency to the corresponding nodes in the domain. Each node each adjacency to the corresponding nodes in the domain. Each node
(PCC) downloads the label forwarding instructions accordingly. The (PCC) downloads the label forwarding instructions accordingly. The
forwarding behaviour and the end result are similar to IGP-based Adj- forwarding behavior and the end result are similar to IGP-based Adj-
SID allocation and usage in SR. SID allocation and usage in SR.
These mechanisms are described in These mechanisms are described in [PCECC-SR].
[I-D.ietf-pce-pcep-extension-pce-controller-sr].
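As a rough sketch of the allocation described in this section, the code below derives each Node-SID from the SRGB and each Adj-SID from the SRLB, then builds the label mapping that a PCECC might send to one node. All names are hypothetical. The SRGB (1000-1999) and SRLB (9001-9099) bounds used in the usage note are assumptions chosen so that the results match the values in Figure 2 and the 9001 Adj-SID used in Section 3.2.3. The actual PCEP encoding is out of scope here.

```python
class SidAllocator:
    """Hypothetical PCECC-side SID bookkeeping for SR-MPLS (illustrative only)."""

    def __init__(self, srgb, srlb):
        # Assumption: every node uses the same SRGB and SRLB, given as
        # inclusive (first_label, last_label) tuples.
        self.srgb = srgb
        self.srlb = srlb
        self.node_sids = {}  # node name -> Node-SID label
        self.adj_sids = {}   # (node name, link name) -> Adj-SID label

    def allocate_node_sid(self, node, index):
        # Node-SID label = SRGB base + index; must be unique in the domain.
        label = self.srgb[0] + index
        if not self.srgb[0] <= label <= self.srgb[1]:
            raise ValueError(f"index {index} falls outside the SRGB")
        if label in self.node_sids.values():
            raise ValueError(f"Node-SID {label} is already in use")
        self.node_sids[node] = label
        return label

    def allocate_adj_sid(self, node, link):
        # Adj-SIDs are only locally significant, so each node draws from
        # its own SRLB independently of the other nodes.
        used = {sid for (n, _), sid in self.adj_sids.items() if n == node}
        for label in range(self.srlb[0], self.srlb[1] + 1):
            if label not in used:
                self.adj_sids[(node, link)] = label
                return label
        raise RuntimeError(f"SRLB exhausted on {node}")

    def mappings_for(self, node):
        # What the PCECC would push to one PCC: every Node-SID in the
        # domain, plus only that node's own Adj-SIDs.
        return {
            "node_sids": dict(self.node_sids),
            "adj_sids": {
                link: sid
                for (n, link), sid in self.adj_sids.items()
                if n == node
            },
        }
```

With SidAllocator(srgb=(1000, 1999), srlb=(9001, 9099)), allocating index 8 for R8 gives Node-SID 1008, and the first adjacency allocated on R2 receives Adj-SID 9001. Each node would then combine the received mapping with its local routing information to install forwarding entries.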
3.2.2. PCECC for SR-MPLS Best Effort (BE) Path 3.2.2. PCECC for SR-MPLS Best Effort (BE) Paths
In this use case, the PCECC needs to allocate the Node-SID (without In this use case, the PCECC needs to allocate the Node-SID (without
calculating the explicit path for the SR path). The ingress router calculating the explicit path for the SR path). The ingress router
of the forwarding path needs to encapsulate the destination Node-SID of the forwarding path needs to encapsulate the destination Node-SID
on top of the packet. All the intermediate nodes will forward the on top of the packet. All the intermediate nodes will forward the
packet based on the destination Node-SID. It is similar to the LDP packet based on the destination Node-SID. It is similar to the LDP
LSP. LSP.
R1 may send a packet to R8 simply by pushing an SR label with segment R1 may send a packet to R8 simply by pushing an SR label with segment
{1008} (Node-SID for R8). The path will be based on the routing/ {1008} (Node-SID for R8). The path will be based on the routing /
nexthop calculation on the routers. next hop calculation on the routers.
3.2.3. PCECC for SR-MPLS TE Path 3.2.3. PCECC for SR-MPLS TE Paths
SR-TE paths may not follow an IGP shortest path tree (SPT). Such SR-TE paths may not follow an IGP shortest path tree (SPT). Such
paths may be chosen by a PCECC and provisioned on the ingress node of paths may be chosen by a PCECC and provisioned on the ingress node of
the SR-TE path. The SR header consists of a list of SIDs (or MPLS the SR-TE path. The SR header consists of a list of SIDs (or MPLS
labels). The header has all necessary information so that the labels). The header has all necessary information so that the
packets can be guided from the ingress node to the egress node of the packets can be guided from the ingress node to the egress node of the
path. Hence, there is no need for any signalling protocol. For the path. Hence, there is no need for any signalling protocol. For the
case where a strict traffic engineering path is needed, all the Adj- case where a strict traffic engineering path is needed, all the Adj-
SID are stacked, otherwise, a combination of node-SID or adj-SID can SIDs are stacked; otherwise, a combination of Node-SIDs or Adj-SIDs
be used for the SR-TE paths. can be used for the SR-TE paths.
As shown in Figure 3, R1 may send a packet to R8 by pushing an SR As shown in Figure 3, R1 may send a packet to R8 by pushing an SR
header with segment list {1002, 9001, 1008}. Where 1002 and 1008 are header with segment list {1002, 9001, 1008}, where 1002 and 1008 are
the Node-SID of R2 and R8 respectively. 9001 is the Adj-SID for the Node-SIDs of R2 and R8, respectively. 9001 is the Adj-SID for
link1. The path should be: R1-R2-link1-R3-R8. link1. The path should be: R1-R2-link1-R3-R8.
To achieve this, the PCECC first allocates and distributes SIDs as To achieve this, the PCECC first allocates and distributes SIDs as
described in Section 3.2.1. [RFC8664] describes the mechanism for a described in Section 3.2.1. [RFC8664] describes the mechanism for a
PCE to compute, update, or initiate SR-MPLS TE paths. PCE to compute, update, or initiate SR-MPLS TE paths.
192.0.2.1/32 192.0.2.1/32
+----------+ +----------+
| R1 (1001)| | R1 (1001)|
+----------+ +----------+
skipping to change at page 10, line 37 skipping to change at line 448
+-----------+ +-----------+ +-----------+ +-----------+
| | | |
|link8 | |link8 |
| |----------|link9 | |----------|link9
+-----------+ +-----------+
| R8 (1008) | 192.0.2.8/32 | R8 (1008) | 192.0.2.8/32
+-----------+ +-----------+
Figure 3: PCECC TE LSP Setup Example Figure 3: PCECC TE LSP Setup Example
Refer to Figure 3 for an example of TE topology, where, 100x - are Refer to Figure 3 for an example of TE topology, where 100x are Node-
Node-SIDs and 900xx - are Adj-SIDs. SIDs and 900xx are Adj-SIDs.
* The SID allocation and distribution are done by the PCECC with all * The SID allocation and distribution are done by the PCECC with all
Node-SIDs (100x) and all Adj-SIDs (900xx). Node-SIDs (100x) and all Adj-SIDs (900xx).
* Based on path computation request/delegation or PCE initiation, * Based on path computation request/delegation or PCE initiation,
the PCECC receives a request with constraints and optimization the PCECC receives a request with constraints and optimization
criteria from a PCC. criteria from a PCC.
* PCECC will calculate the optimal path according to the given * The PCECC will calculate the optimal path according to the given
constraints (e.g. bandwidth). constraints (e.g., bandwidth).
* PCECC will provision SR-MPLS TE LSP (path R1-link1-R2-link6-R3-R8) * The PCECC will provision the SR-MPLS TE LSP path
at the ingress node: {90011,1002,90026,1003,1008} (R1-link1-R2-link6-R3-R8) at the ingress node:
{90011,1002,90026,1003,1008}
* For the end-to-end protection, PCECC can provision the secondary * For the end-to-end protection, the PCECC can provision the
path (R1-link2-R2-link4-R5-R8): {90012,1002,90024,1005,1008}. secondary path (R1-link2-R2-link4-R5-R8):
{90012,1002,90024,1005,1008}.
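
The SID stacks in the list above can be reproduced with a short sketch. It assumes a numbering scheme inferred from the example values only, not defined by this document: the Node-SID of Rn is 1000 + n, and the Adj-SID of linkm at Rn is 90000 + 10*n + m (single-digit node and link numbers). The function names are hypothetical.

```python
# Assumed numbering, inferred from the example values in the text:
NODE_SID_BASE = 1000   # Node-SID of Rn is 1000 + n
ADJ_SID_BASE = 90000   # Adj-SID of linkm at Rn is 90000 + 10*n + m


def node_sid(node: str) -> int:
    """Return the assumed Node-SID for a node named like 'R2'."""
    return NODE_SID_BASE + int(node[1:])


def adj_sid(node: str, link: str) -> int:
    """Return the assumed Adj-SID for a link named like 'link6' at 'R2'."""
    return ADJ_SID_BASE + 10 * int(node[1:]) + int(link[4:])


def sid_stack(path: str) -> list:
    """Translate a path such as 'R1-link1-R2-R8' into a SID list.

    A link hop becomes the Adj-SID of that link at the current node;
    a node hop becomes the Node-SID of that node. The ingress node's
    own Node-SID is not pushed.
    """
    hops = path.split("-")
    current = hops[0]
    stack = []
    for hop in hops[1:]:
        if hop.startswith("link"):
            stack.append(adj_sid(current, hop))
        else:
            stack.append(node_sid(hop))
            current = hop
    return stack


primary = sid_stack("R1-link1-R2-link6-R3-R8")
secondary = sid_stack("R1-link2-R2-link4-R5-R8")
```

Under these assumptions, the primary and secondary stacks come out as the two lists given in the bullets above. A real PCECC would take SIDs from its own allocation state rather than derive them arithmetically.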
3.2.3.1. PCECC for SR Policy 3.2.3.1. PCECC for SR Policy
[RFC8402] defines Segment Routing architecture, which uses an SR [RFC8402] defines Segment Routing architecture, which uses an SR
Policy to steer packets from a node through an ordered list of Policy to steer packets from a node through an ordered list of
segments. The SR Policy could be configured on the headend or segments. The SR Policy could be configured on the headend or
instantiated by an SR controller. The SR architecture does not instantiated by an SR controller. The SR architecture does not
restrict how the controller programs the network. In this case, the restrict how the controller programs the network. In this case, the
focus is on PCEP as the protocol for SR Policy delivery from PCE to focus is on PCEP as the protocol for SR Policy delivery from the PCE
PCC. to PCC.
An SR Policy architecture is described in [RFC9256]. An SR Policy is An SR Policy architecture is described in [RFC9256]. An SR Policy is
a framework that enables the instantiation of an ordered list of a framework that enables the instantiation of an ordered list of
segments on a node for implementing a source routing policy for the segments on a node for implementing a source routing policy for the
steering of traffic for a specific purpose (e.g. for a specific SLA) steering of traffic for a specific purpose (e.g., for a specific
from that node. Service Level Agreement (SLA)) from that node.
An SR Policy is identified through the tuple <headend, color, An SR Policy is identified through the tuple <headend, color,
endpoint>. endpoint>.
Figure 3 is used as an example of PCECC application for SR Policy Figure 3 is used as an example of PCECC application for SR Policy
instantiation for SR-MPLS, where, 100x - are Node-SIDs and 900xx - instantiation for SR-MPLS, where the Node-SIDs are 100x and the Adj-
are Adj-SIDs. SIDs are 900xx.
Let's assume that R1 needs to have two disjoint SR Policies towards Let's assume that R1 needs to have two disjoint SR Policies towards
R8 based on different bandwidths, the possible paths are: R8 based on different bandwidths. This means the possible paths are:
POL1: {Headend R1, color 100, Endpoint R8; Candidate Path1: * POL1: {Headend R1, color 100, Endpoint R8; Candidate Path1:
Segment List 1: {90011,1002,90023,1004,1003,1008}} Segment List 1: {90011,1002,90023,1004,1003,1008}}
POL2: {Headend R1, color 200, Endpoint R8; Candidate Path1: * POL2: {Headend R1, color 200, Endpoint R8; Candidate Path1:
Segment List 1: {90012,1002,90024,1005,1006,1008}} Segment List 1: {90012,1002,90024,1005,1006,1008}}
Each SR Policy (including candidate path and segment list) will be Each SR Policy (including the candidate path and segment list) will
signalled to a headend (R1) via PCEP be signalled to a headend (R1) via PCEP [PCEP-POLICY] with the
[I-D.ietf-pce-segment-routing-policy-cp] with the addition of an addition of an ASSOCIATION object. A Binding SID (BSID) [RFC8402]
ASSOCIATION object. Binding SID (BSID) [RFC8402] can be used for can be used for traffic steering of labelled traffic into an SR
traffic steering of labelled traffic into SR Policy, BSID can be Policy; a BSID can be provisioned from the PCECC also via PCEP
provisioned from PCECC also via PCEP [RFC9604]. For non-labelled traffic steering into the SR Policy POL1
[I-D.ietf-pce-binding-label-sid]. For non-labelled traffic steering or POL2, a per-destination traffic steering will be used by means of
into the SR Policy POL1 or POL2, a per-destination traffic steering the BGP Color Extended Community [RFC9012].
will be used by means of the BGP Color extended community [RFC9012]
The procedure: The procedure is as follows:
PCECC allocates Node-SIDs and Adj-SIDs using the mechanism * The PCECC allocates Node-SIDs and Adj-SIDs using the mechanism
described in Section 3.2.1 for all nodes and links. described in Section 3.2.1 for all nodes and links.
PCECC will calculate disjoint paths for POL1 and POL2 and create * The PCECC calculates disjoint paths for POL1 and POL2 and creates
Segment Lists for them:{90011,1002,90023,1004,1003,1008};{90012,10 segment lists for them: {90011,1002,90023,1004,1003,1008};{90012,1
02,90024,1005,1006,1008}. 002,90024,1005,1006,1008}.
PCECC will form both SR Policies POL1 and POL2. * The PCECC forms both SR Policies POL1 and POL2.
PCECC will send both POL1 and POl2 to R1 via PCEP. * The PCECC sends both POL1 and POL2 to R1 via PCEP.
PCECC optionally can allocate BSIDs for the SR Policies. * The PCECC optionally allocates BSIDs for the SR Policies.
The traffic from R1 to R8 which fits color 100 will be steered * The traffic from R1 to R8, which fits color 100, will be
to POL1 and follows the path: R1-link1-R2-link3-R4-R3-R8. The steered to POL1 and follows the path: R1-link1-R2-link3-R4-R3-R8.
traffic from R1 to R8 which fits color 200 will be steered to POL2 The traffic from R1 to R8, which fits color 200, will be steered
and follows the path: R1-link2-R2-link4-R5-R6-R8. Due to the to POL2 and follows the path: R1-link2-R2-link4-R5-R6-R8. Due to
possibility of having many Segment Lists in the same Candidate the possibility of having many segment lists in the same candidate
Path of each POL1/POL2, PCECC could provision more paths towards path of each POL1/POL2, the PCECC could provision more paths
R8 and traffic will be balanced either as ECMP or as w/ECMP. This towards R8 and traffic will be balanced either as ECMP or as w/
is the advantage of SR Policy architecture. ECMP. This is the advantage of SR Policy architecture.
Note that an SR Policy can be associated with multiple candidate Note that an SR Policy can be associated with multiple candidate
paths. A candidate path is selected when it is valid and it is paths. A candidate path is selected when it is valid and it is
determined to be the best path of the SR Policy as described in determined to be the best path of the SR Policy as described in
[RFC9256]. [RFC9256].
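
As an illustration only, the SR Policy model described above (a policy identified by <headend, color, endpoint>, holding candidate paths that in turn hold segment lists) could be represented as follows. The class and function names are hypothetical, the preference value of 100 is invented for the example, and the selection rule is a simplification: it picks the valid candidate path with the highest preference and omits the tie-breaking rules of [RFC9256].

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class PolicyKey:
    """The <headend, color, endpoint> tuple that identifies an SR Policy."""
    headend: str
    color: int
    endpoint: str


@dataclass
class CandidatePath:
    preference: int        # higher is preferred (simplified selection)
    segment_lists: list    # one or more SID lists
    valid: bool = True


@dataclass
class SRPolicy:
    key: PolicyKey
    candidate_paths: list = field(default_factory=list)

    def active_path(self) -> Optional[CandidatePath]:
        """Pick the valid candidate path with the highest preference."""
        valid = [cp for cp in self.candidate_paths if cp.valid]
        return max(valid, key=lambda cp: cp.preference, default=None)


def steer(policies, headend, color, endpoint):
    """Return the segment lists of the active path for a matching policy."""
    policy = policies.get(PolicyKey(headend, color, endpoint))
    if policy is None:
        return None
    path = policy.active_path()
    return path.segment_lists if path else None


pol1 = SRPolicy(PolicyKey("R1", 100, "R8"),
                [CandidatePath(100, [[90011, 1002, 90023, 1004, 1003, 1008]])])
pol2 = SRPolicy(PolicyKey("R1", 200, "R8"),
                [CandidatePath(100, [[90012, 1002, 90024, 1005, 1006, 1008]])])
policies = {p.key: p for p in (pol1, pol2)}
```

Keying the table by the identifying tuple mirrors how traffic that matches a color is mapped to a policy; adding further segment lists to a candidate path would model the load-balanced case mentioned above.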
3.2.4. PCECC for SRv6 3.2.4. PCECC for SRv6
As per [RFC8402], with Segment Routing (SR), a node steers a packet As per [RFC8402], with Segment Routing (SR), a node steers a packet
through an ordered list of instructions, called segments. Segment through an ordered list of instructions, called segments. Segment
Routing can be applied to the IPv6 architecture with the Segment Routing can be applied to the IPv6 architecture with the Segment
Routing Header (SRH) [RFC8754]. A segment is encoded as an IPv6 Routing Header (SRH) [RFC8754]. A segment is encoded as an IPv6
address. An ordered list of segments is encoded as an ordered list address. An ordered list of segments is encoded as an ordered list
of IPv6 addresses in the routing header. The active segment is of IPv6 addresses in the routing header. The active segment is
indicated by the Destination Address of the packet. Upon completion indicated by the destination address of the packet. Upon completion
of a segment, a pointer in the new routing header is incremented and of a segment, a pointer in the new routing header is incremented and
indicates the next segment. indicates the next segment.
As per [RFC8754], an SRv6 Segment is a 128-bit value. "SRv6 SID" or As per [RFC8754], an SR over IPv6 (SRv6) Segment is a 128-bit value.
simply "SID" are often used as a shorter reference for "SRv6 "SRv6 SID" or simply "SID" are often used as a shorter reference for
Segment". [RFC8986] defines the SRv6 SID as consisting of "SRv6 Segment". [RFC8986] defines the SRv6 SID as consisting of
LOC:FUNCT:ARG. LOC:FUNCT:ARG.
[I-D.ietf-pce-segment-routing-ipv6] extends [RFC8664] to support SR [RFC9603] extends [RFC8664] to support SR for the IPv6 data plane.
for the IPv6 data plane. Further, a PCECC could be extended to Further, a PCECC could be extended to support SRv6 SID allocation and
support SRv6 SID allocation and distribution. An example of how PCEP distribution. An example of how PCEP extensions could be extended
extensions could be extended for SRv6 for PCECC is described in for SRv6 for a PCECC is described in [PCECC-SRv6].
[I-D.dhody-pce-pcep-extension-pce-controller-srv6].
2001:db8::1 2001:db8::1
+----------+ +----------+
| R1 | | R1 |
+----------+ +----------+
| |
+----------+ +----------+
| R2 | 2001:db8::2 | R2 | 2001:db8::2
+----------+ +----------+
* | * * * | * *
skipping to change at page 13, line 39 skipping to change at line 589
+-----------+ +-----------+ +-----------+ +-----------+
2001:db8::3 | R3 | |R6 |2001:db8::6 2001:db8::3 | R3 | |R6 |2001:db8::6
+-----------+ +-----------+ +-----------+ +-----------+
| |
+-----------+ +-----------+
| R8 | 2001:db8::8 | R8 | 2001:db8::8
+-----------+ +-----------+
Figure 4: PCECC for SRv6 Figure 4: PCECC for SRv6
In this case, as shown in Figure 4, PCECC could assign the SRv6 SID In this case, as shown in Figure 4, the PCECC could assign the SRv6
(in the form of an IPv6 address) to be used for node and adjacency. SID (in the form of an IPv6 address) to be used for node and
Later, the SRv6 path in the form of a list of SRv6 SIDs could be used adjacency. Later, the SRv6 path in the form of a list of SRv6 SIDs
at the ingress. Some examples - could be used at the ingress. Some examples:
* SRv6 SID-List={2001:db8::8} - The best path towards R8 * The best path towards R8: SRv6 SID-List={2001:db8::8}
* SRv6 SID-List={2001:db8::5, 2001:db8::8} - The path towards R8 via * The path towards R8 via R5: SRv6 SID-List={2001:db8::5,
R5 2001:db8::8}
The rest of the procedures and mechanisms remain the same as SR-MPLS. The rest of the procedures and mechanisms remain the same as SR-MPLS.
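
A minimal sketch of how an ingress might turn one of the SRv6 SID lists above into packet fields is shown below. It follows the SRH layout of [RFC8754], in which the segment list is stored in reverse order and the active segment is carried in the destination address. The function name and the dictionary layout are hypothetical, and the sketch ignores optimizations such as omitting the SRH for a single-SID path.

```python
import ipaddress


def srh_fields(sid_list):
    """Derive destination address and SRH contents from an ordered SID list.

    The first SID in path order is the active segment, so it becomes the
    IPv6 destination address. The SRH stores the segments in reverse
    order (index 0 holds the last segment), and Segments Left points at
    the active one.
    """
    sids = [ipaddress.IPv6Address(s) for s in sid_list]
    if not sids:
        raise ValueError("SID list must not be empty")
    return {
        "destination": sids[0],
        "segment_list": list(reversed(sids)),
        "segments_left": len(sids) - 1,
    }


via_r5 = srh_fields(["2001:db8::5", "2001:db8::8"])
```

Parsing each SID with the standard ipaddress module also rejects malformed addresses before they are provisioned.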
3.3. PCECC for Static TE LSP 3.3. PCECC for Static TE LSPs
As described in Section 3.1.2 of [RFC8283], PCECC architecture As described in Section 3.1.2 of [RFC8283], the PCECC architecture
supports the provisioning of static TE LSP. To achieve this, the supports the provisioning of static TE LSPs. To achieve this, the
existing PCEP can be used to communicate between the PCECC and nodes existing PCEP can be used to communicate between the PCECC and nodes
along the path to provision explicit label instructions at each hop along the path to provision explicit label instructions at each hop
on the end-to-end path. Each router along the path must be told what on the end-to-end path. Each router along the path must be told what
label-forwarding instructions to program and what resources to label-forwarding instructions to program and what resources to
reserve. The PCE-based controller keeps a view of the network and reserve. The PCE-based controller keeps a view of the network and
determines the paths of the end-to-end LSPs, and the controller uses determines the paths of the end-to-end LSPs, and the controller uses
PCEP to communicate with each router along the path of the end-to-end PCEP to communicate with each router along the path of the end-to-end
LSP. LSP.
192.0.2.1/32 192.0.2.1/32
skipping to change at page 15, line 9 skipping to change at line 652
+-----------+ +-----------+
Figure 5: PCECC TE LSP Setup Example Figure 5: PCECC TE LSP Setup Example
Refer to Figure 5 for an example TE topology. Refer to Figure 5 for an example TE topology.
* Based on path computation request/delegation or PCE initiation, * Based on path computation request/delegation or PCE initiation,
the PCECC receives a request with constraints and optimization the PCECC receives a request with constraints and optimization
criteria. criteria.
* PCECC will calculate the optimal path according to the given * The PCECC will calculate the optimal path according to the given
constraints (e.g. bandwidth). constraints (e.g., bandwidth).
* PCECC will provision each node along the path and assign incoming * The PCECC will provision each node along the path and assign
and outgoing labels from R1 to R8 with the path as incoming and outgoing labels from R1 to R8 with the path as
"R1-link1-R2-link3-R4-link10-R3-link8-R8": "R1-link1-R2-link3-R4-link10-R3-link8-R8":
- R1: Outgoing label 1001 on link 1 - R1: Outgoing label 1001 on link 1
- R2: Incoming label 1001 on link 1 - R2: Incoming label 1001 on link 1
- R2: Outgoing label 2003 on link 3 - R2: Outgoing label 2003 on link 3
- R4: Incoming label 2003 on link 3 - R4: Incoming label 2003 on link 3
- R4: Outgoing label 4010 on link 10 - R4: Outgoing label 4010 on link 10
- R3: Incoming label 4010 on link 10 - R3: Incoming label 4010 on link 10
- R3: Outgoing label 3008 on link 8 - R3: Outgoing label 3008 on link 8
- R8: Incoming label 3008 on link 8 - R8: Incoming label 3008 on link 8
* This can also be represented as {R1, link1, 1001}, {1001, R2, * This can also be represented as: {R1, link1, 1001}, {1001, R2,
link3, 2003], {2003, R4, link10, 4010}, {4010, R3, link8, 3008}, link3, 2003}, {2003, R4, link10, 4010}, {4010, R3, link8, 3008},
{3008, R8}. {3008, R8}.
* For the end-to-end protection, PCECC programs each node along the * For the end-to-end protection, the PCECC programs each node along
path from R1 to R8 with the secondary path: {R1, link2, 1002}, the path from R1 to R8 with the secondary path: {R1, link2, 1002},
{1002, R2, link4, 2004], {2004, R5, link7, 5007}, {5007, R3, {1002, R2, link4, 2004}, {2004, R5, link7, 5007}, {5007, R3,
link9, 3009}, {3009, R8}. link9, 3009}, {3009, R8}.
* It is also possible to have a bypass path for the local protection * It is also possible to have a bypass path for the local protection
set up by the PCECC. For example, the primary path as above, then set up by the PCECC. For example, use the primary path as above,
to protect the node R4 locally, PCECC can program the bypass path then to protect the node R4 locally, the PCECC can program the
like this: {R2, link5, 2005}, {2005, R3}. By doing this, the node bypass path like this: {R2, link5, 2005}, {2005, R3}. By doing
R4 is locally protected at R2. this, the node R4 is locally protected at R2.
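
The per-node label instructions listed above can be derived mechanically from the computed path. The sketch below is illustrative only: the hop tuple layout (node, outgoing link, outgoing label) and the function name are assumptions, and the label values are the ones used in the example.

```python
def label_instructions(hops):
    """Expand a path into per-node label instructions.

    hops is a list of (node, out_link, out_label) tuples in path order;
    the egress node carries None for link and label. Each label is
    programmed twice: as outgoing on the upstream node and as incoming
    on the downstream node of the same link.
    """
    instructions = []
    for (node, link, label), (next_node, _, _) in zip(hops, hops[1:]):
        instructions.append((node, "out", label, link))
        instructions.append((next_node, "in", label, link))
    return instructions


primary_path = [
    ("R1", 1, 1001),
    ("R2", 3, 2003),
    ("R4", 10, 4010),
    ("R3", 8, 3008),
    ("R8", None, None),
]

for node, direction, label, link in label_instructions(primary_path):
    kind = "Outgoing" if direction == "out" else "Incoming"
    print(f"{node}: {kind} label {label} on link {link}")
```

The same function applied to the secondary or bypass path yields the instructions for those LSPs, since only the hop list changes.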
3.4. PCECC for Load Balancing (LB) 3.4. PCECC for Load Balancing (LB)
Very often many service providers use TE tunnels for solving issues Very often, many service providers use TE tunnels for solving issues
with non-deterministic paths in their networks. One example of such with non-deterministic paths in their networks. One example of such
applications is the usage of TEs in the mobile backhaul (MBH). applications is the usage of TEs in the mobile backhaul (MBH).
Consider the topology as shown in Figure 6 (AGG1...AGGN are Consider the topology as shown in Figure 6 (where AGG 1...AGG N are
Aggregation Routers, Core 1...Core N are Core routers) - Aggregation routers, and Core 1...Core N are Core routers).
TE1 --------------> TE1 ----------->
+---------+ +--------+ +--------+ +--------+ +------+ +---+ +--------+ +------+ +-----+ +-------+ +------+ +---+
| Access |----| Access |----| AGG 1 |----| AGG N-1|----|Core 1|--|SR1| |Access |----|Access|----|AGG 1|----|AGG N-1|----|Core 1|--|SR1|
| SubNode1| | Node 1 | +--------+ +--------+ +------+ +---+ |SubNode1| |Node 1| +-----+ +-------+ +------+ +---+
+---------+ +--------+ | | | ^ | +--------+ +------+ | | | ^ |
| Access | Access | AGG Ring 1 | | | | Access | Access | AGG Ring 1| | |
| SubRing 1 | Ring 1 | | | | | | SubRing 1 | Ring 1 | | | | |
+---------+ +--------+ +--------+ | | | +--------+ +------+ +-----+ | | |
| Access | | Access | | AGG 2 | | | | |Access | |Access| |AGG 2| | | |
| SubNode2| | Node 2 | +--------+ | | | |SubNode2| |Node 2| +-----+ | | |
+---------+ +--------+ | | | | | +--------+ +------+ | | | | |
| | | | | | | | | | | | | |
| | | +----TE2----|-+ | | | | +---TE2---|-+ |
+---------+ +--------+ +--------+ +--------+ +------+ +---+ +--------+ +------+ +-----+ +-------+ +------+ +---+
| Access | | Access |----| AGG 3 |----| AGG N |----|Core N|--|SRn| |Access | |Access|----|AGG 3|----| AGG N |----|Core N|--|SRn|
| SubNodeN|----| Node N | +--------+ +--------+ +------+ +---+ |SubNodeN|----|Node N| +-----+ +-------+ +------+ +---+
+---------+ +--------+ +--------+ +------+
Figure 6: PCECC Load Balancing (LB) Use Case Figure 6: PCECC Load Balancing (LB) Use Case
This MBH architecture uses L2 access rings and sub-rings. L3 starts This MBH architecture uses L2 access rings and sub-rings. L3 starts
at the aggregation layer. For the sake of simplicity, the figure at the aggregation layer. For the sake of simplicity, the figure
shows only one access sub-ring. The access ring and aggregation ring shows only one access sub-ring. The access ring and aggregation ring
are connected by Nx10GE interfaces. The aggregation domain runs its are connected by Nx10GE interfaces. The aggregation domain runs its
own IGP. There are two Egress routers (AGG N-1, AGG N) that are own IGP. There are two egress routers (AGG N-1 and AGG N) that are
connected to the Core domain (Core 1...Core N) via L2 interfaces. connected to the Core domain (Core 1...Core N) via L2 interfaces.
Core also has connections to service routers, RSVP-TE or SR-TE is The Core also has connections to service routers; RSVP-TE or SR-TE is
used for MPLS transport inside the ring. There could be at least 2 used for MPLS transport inside the ring. There could be at least two
tunnels (one way) from each AGG router to egress AGG routers. There tunnels (one way) from each AGG router to egress AGG routers. There
are also many L2 access rings connected to AGG routers. are also many L2 access rings connected to AGG routers.
Service deployment is made by means of Layer 2 Virtual Private Service deployment is made by means of Layer 2 Virtual Private
Networks (L2VPNs) (Virtual Private LAN Services (VPLS)), Layer 3 Networks (L2VPNs), Virtual Private LAN Services (VPLSs), Layer 3
Virtual Private Networks (L3VPNs) or Ethernet VPNs (EVPNs). Those Virtual Private Networks (L3VPNs), or Ethernet VPNs (EVPNs). Those
services use MPLS TE (or SR-TE) as transport towards egress AGG services use MPLS TE (or SR-TE) as transport towards egress AGG
routers. TE tunnels could be used as transport towards service routers. TE tunnels could be used as transport towards service
routers in case of seamless MPLS ([I-D.ietf-mpls-seamless-mpls]) routers in case of architecture based on seamless MPLS
based architecture. [MPLS-SEAMLESS].
Load balancing between TE tunnels involves distributing network Load balancing between TE tunnels involves distributing network
traffic across multiple TE tunnels to optimize the use of available traffic across multiple TE tunnels to optimize the use of available
network resources, enhance performance, and ensure reliability. Some network resources, enhance performance, and ensure reliability. Some
common techniques include Equal-Cost Multi-Path (ECMP) and Unequal- common techniques include Equal-Cost Multipath (ECMP) and Unequal-
Cost Multi-Path (UCMP) based on the bandwidth of the TE tunnels. Cost Multipath (UCMP) based on the bandwidth of the TE tunnels.
There is a need to solve the following tasks: There is a need to solve the following tasks:
* Perform automatic load-balance amongst TE tunnels according to * Perform automatic load balancing amongst TE tunnels according to
current traffic load. current traffic loads.
* TE bandwidth (BW) management: Provide guaranteed BW for specific * TE bandwidth (BW) management: Provide guaranteed BW for specific
services: High-Speed Data Service (HSI), IPTV, etc., and provide services: High-Speed Data Service (HSI), IPTV, etc., and provide
time-based BW reservation (BW on demand (BoD)) for other services. time-based BW reservation (BW on demand (BoD)) for other services.
* Simplify the development of TE tunnels by automation without any * Simplify the development of TE tunnels by automation without any
manual intervention. manual intervention.
* Provide flexibility for Service Router placement (anywhere in the * Provide flexibility for service router placement (anywhere in the
network by the creation of transport LSPs to them). network by the creation of transport LSPs to them).
In this section, the focus is on load balancing (LB) tasks. LB task In this section, the focus is on load balancing (LB) tasks. LB tasks
could be solved by means of PCECC in the following way: could be solved by means of the PCECC in the following ways:
* Application or network service or operator can ask the SDN * Applications, network services, or operators can ask the SDN
controller (PCECC) for LSP-based load balancing between AGG X and controller (PCECC) for LSP-based load balancing between AGG X and
AGG N/AGG N-1 (egress AGG routers that have connections to the AGG N/AGG N-1 (egress AGG routers that have connections to the
core). Each of these will have associated constraints (i.e. core). Each of these will have associated constraints (such as
bandwidth, inclusion or exclusion specific links or nodes, number bandwidth, inclusion or exclusion of specific links or nodes,
of paths, objective function (OF), need for disjoint LSP paths number of paths, Objective Function (OF), need for disjoint LSP
etc.); paths, etc.).
* PCECC could calculate multiple (say N) LSPs according to given * The PCECC could calculate multiple (say N) LSPs according to given
constraints, the calculation is based on results of Objective constraints. The calculation is based on the results of Objective
Function (OF) [RFC5541], constraints, endpoints, same or different Function (OF) [RFC5541], constraints, endpoints, same or different
bandwidth (BW), different links (in case of disjoint paths) and bandwidth (BW), different links (in case of disjoint paths), and
other constraints. other constraints.
* Depending on the given LSP Path setup type (PST), PCECC will * Depending on the given LSP Path Setup Type (PST), the PCECC will
download instructions to the PCC. At this stage, it is assumed download instructions to the PCC. At this stage, it is assumed
the PCECC is aware of the label space it controls and SID the PCECC is aware of the label space it controls and SID
allocation and distribution is already done in the case of SR. allocation and distribution is already done in the case of SR.
* PCECC will send PCInitiate message [RFC8281] towards ingress AGG X * The PCECC will send a PCInitiate message [RFC8281] towards the
router(PCC) for each of N LSPs and receive PCRpt message [RFC8231] ingress AGG X router (PCC) for each of N LSPs and receive a PCRpt
back from PCCs. If PST is PCECC-SR, the PCECC will include a SID message [RFC8231] back from PCCs. If the PST is a PCECC-SR, the
stack as per [RFC8664]. If PST is PCECC (basic), then the PCECC PCECC will include a SID stack as per [RFC8664]. If the PST is a
will assign labels along the calculated path and set up the path PCECC (basic), then the PCECC will assign labels along the
by sending central controller instructions in a PCEP message to calculated path and set up the path by sending central controller
each node along the path of the LSP as per [RFC9050] and then send instructions in a PCEP message to each node along the path of the
PCUpd message to the ingress AGG X router with information about LSP as per [RFC9050]. Then, the PCECC will send a PCUpd message
new LSP. AGG X(PCC) will respond with PCRpt with LSP status. to the ingress AGG X router with information about the new LSP.
AGG X (PCC) will respond with a PCRpt with LSP status.
* AGG X as an ingress router now has N LSPs towards AGG N and AGG * AGG X as an ingress router now has N LSPs towards AGG N and AGG
N-1 which are available for installation to the router's N-1, which are available for installation to the router's
forwarding table and load-balance traffic between them. Traffic forwarding table and for load balancing traffic between them.
distribution between those LSPs depends on the particular Traffic distribution between those LSPs depends on the particular
realization of the hash-function on that router. realization of the hash function on that router.
* Since PCECC is aware of TEDB (TE state) and LSP-DB, it can manage * Since the PCECC is aware of the TEDB (TE state) and the LSP
and prevent possible over-subscriptions and limit the number of Database (LSP-DB), it can manage and prevent possible over-
available load-balance states. Via PCECC mechanism the control subscriptions and limit the number of available load-balance
can take quick actions into the network by directly provisioning states. Via a PCECC mechanism, the control can take quick actions
the central control instructions. into the network by directly provisioning the central control
instructions.
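
As noted above, the traffic split across the N LSPs depends on the hash function of the ingress router. The sketch below shows one possible realization, purely as an assumption: a flow identifier is hashed and mapped onto the LSPs in proportion to integer weights (equal weights give ECMP-like behavior, unequal weights give UCMP-like behavior). The function name, the flow tuple layout, and the LSP names are hypothetical, and SHA-256 is used only for determinism, not because routers use it.

```python
import hashlib


def pick_lsp(flow, lsps):
    """Map a flow onto one LSP, proportionally to the LSP weights.

    flow is any tuple identifying the flow (for example a 5-tuple).
    lsps is a list of (name, weight) with non-negative integer weights.
    The same flow always maps to the same LSP, which keeps packets of
    one flow in order.
    """
    total = sum(weight for _, weight in lsps)
    if total <= 0:
        raise ValueError("at least one LSP needs a positive weight")
    digest = hashlib.sha256(repr(flow).encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    for name, weight in lsps:
        if bucket < weight:
            return name
        bucket -= weight


# Two LSPs from AGG X, weighted 3:1 by (assumed) tunnel bandwidth.
lsps = [("LSP-to-AGG-N", 3), ("LSP-to-AGG-N-1", 1)]
flow = ("198.51.100.7", "203.0.113.9", 6, 40000, 443)
chosen = pick_lsp(flow, lsps)
```

Because the PCECC knows the bandwidth of each LSP it provisioned, it is in a position to supply such weights; how they are conveyed to the router is outside this sketch.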
3.5. PCECC and Inter-AS TE 3.5. PCECC and Inter-AS TE
There are various signalling options for establishing Inter-AS TE There are various signalling options for establishing Inter-AS TE
LSP: contiguous TE LSP [RFC5151], stitched TE LSP [RFC5150], and LSPs: contiguous TE LSPs [RFC5151], stitched TE LSPs [RFC5150], and
nested TE LSP [RFC4206]. nested TE LSPs [RFC4206].
Requirements for PCE-based Inter-AS setup [RFC5376] describe the The requirements for PCE-based Inter-AS setup [RFC5376] describe the
approach and PCEP functionality that is needed for establishing approach and PCEP functionality that is needed for establishing
Inter-AS TE LSPs. Inter-AS TE LSPs.
[RFC5376] also gives Inter- and Intra-AS PCE Reference Model (as [RFC5376] also gives an Inter-AS and Intra-AS PCE Reference Model (as
shown in Figure 7) that is provided below in shortened form for the shown in Figure 7) that is provided below in shortened form for the
sake of simplicity. sake of simplicity.
Inter-AS Inter-AS Inter-AS Inter-AS
PCC <-->PCE1<--------->PCE2 PCC <-->PCE1<--------->PCE2
:: :: :: :: :: ::
:: :: :: :: :: ::
R1----ASBR1====ASBR3---R3---ASBR5 R1----ASBR1====ASBR3---R3---ASBR5
| AS1 | | PCC | | AS1 | | PCC |
| | | AS2 | | | | AS2 |
R2----ASBR2====ASBR4---R4---ASBR6 R2----ASBR2====ASBR4---R4---ASBR6
:: :: :: ::
:: :: :: ::
Intra-AS Intra-AS Intra-AS Intra-AS
PCE3 PCE4 PCE3 PCE4
Figure 7: Shortened form of Inter- and Intra-AS PCE Reference Model Figure 7: Shortened Form of the Inter-AS and Intra-AS PCE
Reference Model
The PCECC belonging to the different domains can cooperate to set up The PCECC belonging to the different domains can cooperate to set up
inter-AS TE LSP. The stateful H-PCE [RFC8751] mechanism could also Inter-AS TE LSPs. The stateful Hierarchical PCE (H-PCE) mechanism
be used to establish a per-domain PCECC LSP first. These could be [RFC8751] could also be used to establish a per-domain PCECC LSP
stitched together to form inter-AS TE LSP as described in first. These could be stitched together to form an Inter-AS TE LSP
[I-D.ietf-pce-stateful-interdomain]. as described in [PCE-INTERDOMAIN].
For the sake of simplicity, here the focus is on a simplified Inter- For the sake of simplicity, here the focus is on a simplified Inter-
AS case when both AS1 and AS2 belong to the same service provider AS case when both AS1 and AS2 belong to the same service provider
administration. In that case, Inter and Intra-AS PCEs could be administration. In that case, Inter-AS and Intra-AS PCEs could be
combined in one single PCE if such combined PCE performance is enough combined in one single PCE if such combined PCE performance is enough
to handle the load. The PCE will require interfaces (PCEP and BGP- to handle the load. The PCE will require interfaces (PCEP and BGP-
LS) to both domains. PCECC redundancy mechanisms are described in LS) to both domains. PCECC redundancy mechanisms are described in
[RFC8283]. Thus routers (PCCs) in AS1 and AS2 can send PCEP messages [RFC8283]. Thus, routers (PCCs) in AS1 and AS2 can send PCEP
towards the same PCECC. In Figure 8, PCECC maintains a BGP-LS messages towards the same PCECC. In Figure 8, the PCECC maintains a
session with route reflectors (RRs) in each AS. This allows the RRs BGP-LS session with Route Reflectors (RRs) in each AS. This allows
to redistribute routes to other BGP routers (clients) without the RRs to redistribute routes to other BGP routers (clients) without
requiring a full mesh. The RRs act as BGP-LS Propagator and PCECC requiring a full mesh. The RRs act as a BGP-LS Propagator, and the
act as a BGP-LS Consumer [RFC9552]. PCECC acts as a BGP-LS Consumer [RFC9552].
+----BGP-LS------+ +------BGP-LS-----+ +----BGP-LS------+ +------BGP-LS-----+
| | | | | | | |
+-PCEP-|----++-+-------PCECC-----PCEP--++-+-|-------+ +-PCEP-|----++-+-------PCECC-----PCEP--++-+-|-------+
+-:------|----::-:-+ +--::-:-|-------:---+ +-:------|----::-:-+ +--::-:-|-------:---+
| : | :: : | | :: : | : | | : | :: : | | :: : | : |
| : RR1 :: : | | :: : RR2 : | | : RR1 :: : | | :: : RR2 : |
| v v: : | LSP1 | :: v v | | v v: : | LSP1 | :: v v |
| R1---------ASBR1=======================ASBR3--------R3 | | R1---------ASBR1=======================ASBR3--------R3 |
| | v : | | :v | | | | v : | | :v | |
| +----------ASBR2=======================ASBR4---------+ | | +----------ASBR2=======================ASBR4---------+ |
| | Region 1 : | | : Region 1 | | | | Region 1 : | | : Region 1 | |
|----------------:-| |--:-------------|--| |----------------:-| |--:-------------|--|
| | v | LSP2 | v | | | | v | LSP2 | v | |
| +----------ASBR5=======================ASBR6---------+ | | +----------ASBR5=======================ASBR6---------+ |
| Region 2 | | Region 2 | | Region 2 | | Region 2 |
+------------------+ <--------------> +-------------------+ +------------------+ <--------------> +-------------------+
MPLS Domain 1 Inter-AS MPLS Domain 2 MPLS Domain 1 Inter-AS MPLS Domain 2
<=======AS1=======> <========AS2=======> <=======AS1=======> <========AS2=======>
Figure 8: Particular case of Inter-AS PCE Figure 8: Particular Case of Inter-AS PCE
In the case of the PCECC Inter-AS TE scenario (as shown in Figure 8) In the case of the PCECC Inter-AS TE scenario (as shown in Figure 8),
where the service provider controls both domains (AS1 and AS2), each where the service provider controls both domains (AS1 and AS2), each
of them has its own IGP and MPLS transport. There is a need to set of them has its own IGP and MPLS transport. There is a need to set
up Inter-AS LSPs for transporting different services on top of them up Inter-AS LSPs for transporting different services on top of them
(Voice, L3VPN etc.). Inter-AS links with different capacities exist (such as Voice, L3VPN, etc.). Inter-AS links with different
in several regions. The task is not only to provision those Inter-AS capacities exist in several regions. The task is not only to
LSPs with given constraints but also to calculate the path and pre- provision those Inter-AS LSPs with given constraints but also to
setup the backup Inter-AS LSPs that will be used if the primary LSP calculate the path and pre-setup the backup Inter-AS LSPs that will
fails. be used if the primary LSP fails.
As per Figure 8, LSP1 from R1 to R3 goes via ASBR1 and ASBR3, and it As per Figure 8, LSP1 from R1 to R3 goes via ASBR1 and ASBR3, and it
is the primary Inter-AS LSP. R1-R3 LSP2 that goes via ASBR5 and is the primary Inter-AS LSP. R1-R3 LSP2 that goes via ASBR5 and
ASBR6 are the backup ones. In addition, there could also be a bypass ASBR6 is the backup one. In addition, there could also be a bypass
LSP setup to protect against ASBR or inter-AS link failures. LSP setup to protect against ASBR or Inter-AS link failures.
After the addition of PCECC functionality to PCE (SDN controller), After the addition of PCECC functionality to the PCE (SDN
the PCECC-based Inter-AS TE model should follow the PCECC use case controller), the PCECC-based Inter-AS TE model should follow the
for TE LSP including requirements of [RFC5376] with the following PCECC use case for TE LSP including the requirements described in
details: [RFC5376] with the following details:
* Since PCECC needs to know the topology of both domains AS1 and * Since the PCECC needs to know the topology of both domains AS1 and
AS2, PCECC can utilize the BGP-LS peering with BGP routers (or AS2, the PCECC can utilize the BGP-LS peering with BGP routers (or
RRs) in both domains. RRs) in both domains.
* PCECC needs to establish PCEP connectivity with all routers in * The PCECC needs to establish PCEP connectivity with all routers in
both domains (see also section 4 in [RFC5376]). both domains (see also Section 4 of [RFC5376]).
* After the operator's application or service orchestrator creates a * After the operator's application or service orchestrator creates a
request for tunnel creation of a specific service, PCECC will request for tunnel creation of a specific service, the PCECC will
receive that request via NBI (NBI type is implementation receive that request via NBI (note that the NBI type is
dependent, it could be NETCONF/Yang, REST etc.). Then PCECC will implementation-dependent; it could be NETCONF/YANG, REST, etc.).
calculate the optimal path based on Objective Function (OF) and Then, the PCECC will calculate the optimal path based on Objective
given constraints (e.g. path setup type, bandwidth etc.), Function (OF) and given constraints (e.g., path setup type,
including those from [RFC5376]: priority, AS sequence, preferred bandwidth, etc.). These constraints include those from [RFC5376],
ASBR, disjoint paths, and protection type. In this step, we will such as priority, AS sequence, preferred ASBR, disjoint paths, and
have two paths: R1-ASBR1-ASBR3-R3, R1-ASBR5-ASBR6-R3 protection type. In this step, we will have two paths:
R1-ASBR1-ASBR3-R3, R1-ASBR5-ASBR6-R3.
* PCECC will use central control download instructions to the PCC * The PCECC will use central control download instructions to the
based on the PST. At this stage, it is assumed the PCECC is aware PCC based on the PST. At this stage, it is assumed the PCECC is
of the label space it controls and in the case of SR the SID aware of the label space it controls, and in the case of SR, the
allocation and distribution is already done. SID allocation and distribution is already done.
* PCECC will send PCInitiate message [RFC8281] towards the ingress * The PCECC will send a PCInitiate message [RFC8281] towards the
router R1 (PCC) in AS1 and receive the PCRpt message [RFC8231] ingress router R1 (PCC) in AS1 and receive the PCRpt message
back from it. [RFC8231] back from it.
- If the PST is SR-MPLS, the PCECC will include the SID stack as - If the PST is SR-MPLS, the PCECC will include the SID stack as
per [RFC8664]. Optionally, a binding SID or BGP Peering-SID per [RFC8664]. Optionally, a BSID or BGP Peering-SID [RFC9087]
[RFC9087] can also be included on the AS boundary. The backup can also be included on the AS boundary. The backup SID stack
SID stack can be installed at ingress R1 but more importantly, can be installed at ingress R1, but more importantly, each node
each node along the SR path could also do the local protection along the SR path could also do the local protection just based
just based on the top segment. on the top segment.
- If the PST is PCECC, the PCECC will assign labels along the - If the PST is a PCECC, the PCECC will assign labels along the
calculated paths (R1-ASBR1-ASBR3-R3, R1-ASBR5-ASBR6-R3) and calculated paths (R1-ASBR1-ASBR3-R3, R1-ASBR5-ASBR6-R3) and
sets up the path by sending central controller instructions in sets up the path by sending central controller instructions in
PCEP message to each node along the path of the LSPs as per a PCEP message to each node along the path of the LSPs as per
[RFC9050]. After these steps, the PCECC will send a PCUpd [RFC9050]. After these steps, the PCECC will send a PCUpd
message to the ingress R1 router with information about new message to the ingress R1 router with information about new
LSPs and R1 will respond by PCRpt with LSP(s) status. LSPs and R1 will respond by a PCRpt with LSP(s) status.
* After that step, R1 now have primary and backup TEs (LSP1 and * After that step, R1 now has primary and backup TEs (LSP1 and LSP2)
LSP2) towards R3. It is up to router implementation how to make towards R3. It is up to the router implementation for how to
switchover to backup LSP2 if LSP1 fails. switchover to backup LSP2 if LSP1 fails.
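The path computation step above can be illustrated with a minimal sketch. It assumes a toy topology loosely based on Figure 8; the adjacencies of R1 and R3 to each ASBR and all link costs are hypothetical, since the figure gives no metrics. The backup path is found by simply excluding the transit nodes of the primary path and recomputing. This is a simplification and not an optimal disjoint-path algorithm, and the names used (LINKS, shortest_path, primary_and_backup) are illustrative only.

```python
import heapq

# Hypothetical adjacencies and costs; Figure 8 does not specify metrics.
LINKS = [
    ("R1", "ASBR1", 10), ("R1", "ASBR2", 10), ("R1", "ASBR5", 10),
    ("ASBR1", "ASBR3", 10), ("ASBR2", "ASBR4", 30), ("ASBR5", "ASBR6", 20),
    ("ASBR3", "R3", 10), ("ASBR4", "R3", 10), ("ASBR6", "R3", 10),
]


def build_graph(links):
    """Build an undirected adjacency map {node: {neighbor: cost}}."""
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def shortest_path(graph, src, dst, excluded=frozenset()):
    """Dijkstra shortest path that never enters a node in `excluded`."""
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nbr, weight in graph.get(node, {}).items():
            if nbr not in visited and nbr not in excluded:
                heapq.heappush(queue, (cost + weight, nbr, path + [nbr]))
    return None


def primary_and_backup(graph, src, dst):
    """Compute a primary path, then a backup avoiding its transit nodes."""
    primary = shortest_path(graph, src, dst)
    if primary is None:
        return None, None
    backup = shortest_path(graph, src, dst, excluded=set(primary[1:-1]))
    return primary, backup


GRAPH = build_graph(LINKS)
LSP1, LSP2 = primary_and_backup(GRAPH, "R1", "R3")
print("-".join(LSP1))
print("-".join(LSP2))
```

With the assumed costs, the two results correspond to the primary and backup paths named in the text. A real PCECC would also apply the other constraints listed above (priority, AS sequence, preferred ASBR, protection type).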
3.6. PCECC for Multicast LSPs 3.6. PCECC for Multicast LSPs
The multicast LSPs can be set up via the RSVP-TE P2MP or Multipoint The multicast LSPs can be set up via the RSVP-TE P2MP or Multipoint
LDP (mLDP) protocols. The setup of these LSPs may require manual LDP (mLDP) protocols. The setup of these LSPs may require manual
configurations and complex signalling when the protection is configurations and complex signalling when the protection is
considered. By using the PCECC solution, the multicast LSP can be considered. By using the PCECC solution, the multicast LSP can be
computed and set up through a centralized controller which has the computed and set up through a centralized controller that has the
full picture of the topology and bandwidth usage for each link. It full picture of the topology and bandwidth usage for each link. It
not only reduces the complex configurations compared with the distributed not only reduces the complex configurations compared with the distributed
RSVP-TE P2MP or mLDP signalling, but it can also compute the disjoint RSVP-TE P2MP or mLDP signalling, but it can also compute the disjoint
primary path and secondary P2MP path efficiently. primary path and secondary P2MP path efficiently.
3.6.1. PCECC for P2MP/MP2MP LSPs' Setup 3.6.1. PCECC for the Setup of P2MP/MP2MP LSPs
It is assumed the PCECC is aware of the label space it controls for It is assumed the PCECC is aware of the label space it controls for
all nodes and makes allocations accordingly. all nodes and makes allocations accordingly.
+----------+ +----------+
| R1 | Root node of the multicast LSP | R1 | Root Node of the multicast LSP
+----------+ +----------+
|9000 (L0) |9000 (L0)
+----------+ +----------+
Transit Node | R2 | Transit Node | R2 |
branch +----------+ branch +----------+
* | * * * | * *
9001* | * *9002 9001* | * *9002
L1 * | * *L2 L1 * | * *L2
+-----------+ | * +-----------+ +-----------+ | * +-----------+
| R4 | | * | R5 | Transit Nodes | R4 | | * | R5 | Transit Nodes
skipping to change at page 22, line 29 skipping to change at line 981
9003* | * * +9004 9003* | * * +9004
L3 * | * * +L4 L3 * | * * +L4
+-----------+ +-----------+ +-----------+ +-----------+
| R3 | | R6 | Leaf Node | R3 | | R6 | Leaf Node
+-----------+ +-----------+ +-----------+ +-----------+
9005| L5 9005| L5
+-----------+ +-----------+
| R8 | Leaf Node | R8 | Leaf Node
+-----------+ +-----------+
Figure 9: Using PCECC for P2MP/MP2MP LSPs' Setup Figure 9: Using a PCECC for the Setup of P2MP/MP2MP LSPs
The P2MP examples (based on Figure 9) are explained here, where R1 is The P2MP examples (based on Figure 9) are explained here, where R1 is
the root and the router R8 and R6 are the leaves. the root and the routers R8 and R6 are the leaves.
* Based on the P2MP path computation request/delegation or PCE * Based on the P2MP path computation request/delegation or PCE
initiation, the PCECC receives the request with constraints and initiation, the PCECC receives the request with constraints and
optimization criteria. optimization criteria.
* PCECC will calculate the optimal P2MP path according to given * The PCECC will calculate the optimal P2MP path according to given
constraints (e.g. bandwidth). constraints (e.g., bandwidth).
* PCECC will provision each node along the path and assign incoming * The PCECC will provision each node along the path and assign
and outgoing labels from R1 to {R6, R8} with the path as incoming and outgoing labels from R1 to {R6, R8} with the path as
"R1-L0-R2-L2-R5-L4-R6" and "R1-L0-R2-L1-R4-L3-R3-L5-R8": "R1-L0-R2-L2-R5-L4-R6" and "R1-L0-R2-L1-R4-L3-R3-L5-R8":
- R1: Outgoing label 9000 on link L0 - R1: Outgoing label 9000 on link L0
- R2: Incoming label 9000 on link L0 - R2: Incoming label 9000 on link L0
- R2: Outgoing label 9001 on link L1 (*) - R2: Outgoing label 9001 on link L1 (*)
- R2: Outgoing label 9002 on link L2 (*) - R2: Outgoing label 9002 on link L2 (*)
- R5: Incoming label 9002 on link L2 - R5: Incoming label 9002 on link L2
- R5: Outgoing label 9004 on link L4 - R5: Outgoing label 9004 on link L4
- R6: Incoming label 9004 on link L4 - R6: Incoming label 9004 on link L4
- R4: Incoming label 9001 on link L1 - R4: Incoming label 9001 on link L1
- R4: Outgoing label 9003 on link L3 - R4: Outgoing label 9003 on link L3
- R3: Incoming label 9003 on link L3 - R3: Incoming label 9003 on link L3
- R3: Outgoing label 9005 on link L5 - R3: Outgoing label 9005 on link L5
- R8: Incoming label 9005 on link L5 - R8: Incoming label 9005 on link L5
* This can also be represented as : {R1, 9000}, {9000, R2, * This can also be represented as: {R1, 9000}, {9000, R2,
{9001,9002}}, {9001, R4, 9003}, {9002, R5, 9004}, {9003, R3, 9005}, {9001,9002}}, {9001, R4, 9003}, {9002, R5, 9004}, {9003, R3, 9005},
{9004, R6}, {9005, R8}. The main difference (*) is in the branch {9004, R6}, {9005, R8}. The main difference (*) is in the branch
node instruction at R2 where two copies of a packet are sent node instruction at R2, where two copies of a packet are sent
towards R4 and R5 with 9001 and 9002 labels respectively. towards R4 and R5 with 9001 and 9002 labels, respectively.
The packet forwarding involves - The packet forwarding involves the following:
Step 1: R1 sends a packet to R2 simply by pushing the label of Step 1. R1 sends a packet to R2 simply by pushing the label of 9000
9000 to the packet. to the packet.
Step 2: When R2 receives the packet with label 9000, it will Step 2. When R2 receives the packet with label 9000, it will forward
forward it to R4 by swapping label 9000 to 9001 and at the same it to R4 by swapping label 9000 to 9001. At the same time,
time, it will replicate the packet and swap the label 9000 to 9002 it will replicate the packet and swap the label 9000 to 9002
and forward it to R5. and forward it to R5.
Step 3: When R4 receives the packet with label 9001, it will Step 3. When R4 receives the packet with label 9001, it will forward
forward it to R3 by swapping 9001 to 9003. When R5 receives the it to R3 by swapping 9001 to 9003. When R5 receives the
packet with the label 9002, it will forward it to R6 by swapping packet with the label 9002, it will forward it to R6 by
9002 to 9004. swapping 9002 to 9004.
Step 4: When R3 receives the packet with label 9003, it will Step 4. When R3 receives the packet with label 9003, it will forward
forward it to R8 by swapping it to 9005. it to R8 by swapping it to 9005.
Step 5: When R8 receives the packet with label 9005, it will pop Step 5. When R8 receives the packet with label 9005, it will pop the
the label; when R6 receives the packet with label 9004, it will label. When R6 receives the packet with label 9004, it will
pop the label. pop the label.
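The label operations in the steps above can be traced with a small sketch. It models the per-node instructions as a table keyed by node and incoming label, using the label values from the Figure 9 example. The table layout and the names LFIB and deliver are illustrative assumptions and do not reflect any PCEP encoding.

```python
# (node, incoming label) -> list of (outgoing label, next hop).
# An empty list means the node pops the label (leaf behaviour).
LFIB = {
    ("R2", 9000): [(9001, "R4"), (9002, "R5")],  # branch node: two copies
    ("R4", 9001): [(9003, "R3")],
    ("R5", 9002): [(9004, "R6")],
    ("R3", 9003): [(9005, "R8")],
    ("R6", 9004): [],
    ("R8", 9005): [],
}


def deliver(node, label):
    """Follow the label-swap entries and return the leaves that are reached."""
    entries = LFIB[(node, label)]
    if not entries:
        return [node]  # label popped here
    reached = []
    for out_label, next_hop in entries:
        reached.extend(deliver(next_hop, out_label))
    return reached


# R1 pushes label 9000 and sends the packet on link L0 towards R2.
print(sorted(deliver("R2", 9000)))
```

The only entry with more than one outgoing label is the one at R2, which matches the branch node instruction marked with (*) in the list above.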
3.6.2. PCECC for the End-to-End Protection of P2MP/MP2MP LSPs 3.6.2. PCECC for the End-to-End Protection of P2MP/MP2MP LSPs
In this section, the end-to-end managed path protection service as This section describes the end-to-end managed path protection service
well as the local protection with the operation management in the as well as the local protection with the operation management in the
PCECC network for the P2MP/MP2MP LSP. PCECC network for the P2MP/MP2MP LSP.
An end-to-end protection principle can be applied for computing An end-to-end protection principle can be applied for computing
backup P2MP or MP2MP LSPs. During the computation of the primary backup P2MP or MP2MP LSPs. During the computation of the primary
multicast trees, PCECC could also take the computation of a secondary multicast trees, the PCECC could also take the computation of a
tree into consideration. A PCECC could compute the primary and secondary tree into consideration. A PCECC could compute the primary
backup P2MP (or MP2MP) LSPs together or sequentially. and backup P2MP (or MP2MP) LSPs together or sequentially.
+----+ +----+ +----+ +----+
Root node of LSP | R1 |--| R11| Root Node of LSP | R1 |--| R11|
+----+ +----+ +----+ +----+
/ + / +
10/ +20 10/ +20
/ + / +
+----------+ +-----------+ +----------+ +-----------+
Transit Node | R2 | | R3 | Transit Node | R2 | | R3 |
+----------+ +-----------+ +----------+ +-----------+
| \ + + | \ + +
| \ + + | \ + +
10| 10\ +20 20+ 10| 10\ +20 20+
| \ + + | \ + +
| \ + | \ +
| + \ + | + \ +
+-----------+ +-----------+ Leaf Nodes +-----------+ +-----------+ Leaf Nodes
| R4 | | R5 | (Downstream LSR) | R4 | | R5 | (Downstream LSR)
+-----------+ +-----------+ +-----------+ +-----------+
Figure 10: PCECC for the End-to-End Protection of the P2MP/MP2MP LSPs Figure 10: PCECC for the End-to-End Protection of P2MP/MP2MP LSPs
In Figure 10, when the PCECC setups the primary multicast tree from In Figure 10, when the PCECC sets up the primary multicast tree from
the root node R1 to the leaves, which is R1->R2->{R4, R5}, at the the root node R1 to the leaves, which is R1->R2->{R4, R5}, it can set
same time, it can setup the backup tree, which is R1->R11->R3->{R4, up the backup tree at the same time, which is R1->R11->R3->{R4, R5}.
R5}. Both of them (primary forwarding tree and secondary forwarding Both of them (the primary forwarding tree and secondary forwarding
tree) will be downloaded to each router along the primary path and tree) will be downloaded to each router along the primary path and
the secondary path. The traffic will be forwarded through the the secondary path. The traffic will be forwarded through the
R1->R2->{R4, R5} path normally, but when a node in the primary tree R1->R2->{R4, R5} path normally, but when a node in the primary tree
fails (say R2) the root node R1 will switch the flow to the backup fails (say R2), the root node R1 will switch the flow to the backup
tree, which is R1->R11->R3->{R4, R5}. By using the PCECC a path tree, which is R1->R11->R3->{R4, R5}. By using the PCECC, path
computation, label downloading and finally forwarding can be done computation, label downloading, and finally forwarding can be done
without complex signalling used in the P2MP RSVP-TE or mLDP. without the complex signalling used in the P2MP RSVP-TE or mLDP.
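As a rough illustration of the switchover described above, the following sketch represents both trees from Figure 10 as parent-to-children maps and selects the backup tree when a transit node of the primary tree is reported as failed. How the root actually detects the failure is not covered here, and the function names are hypothetical.

```python
# Parent -> children maps for the two trees in the Figure 10 example.
PRIMARY_TREE = {"R1": ["R2"], "R2": ["R4", "R5"]}
BACKUP_TREE = {"R1": ["R11"], "R11": ["R3"], "R3": ["R4", "R5"]}


def nodes_below_root(tree, root):
    """Return every node reachable from the root, excluding the root."""
    seen, stack = set(), list(tree.get(root, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(tree.get(node, []))
    return seen


def leaves(tree, root):
    """Leaves are the reachable nodes that have no children in the tree."""
    return sorted(n for n in nodes_below_root(tree, root) if n not in tree)


def select_tree(failed_nodes, root="R1"):
    """Use the backup tree if any transit node of the primary tree failed."""
    transit = {n for n in nodes_below_root(PRIMARY_TREE, root)
               if n in PRIMARY_TREE}
    if transit & set(failed_nodes):
        return BACKUP_TREE
    return PRIMARY_TREE


print(leaves(select_tree([]), "R1"))
print(leaves(select_tree(["R2"]), "R1"))
```

Both trees reach the same set of leaves, which is why the root can switch trees without the leaves changing their membership.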
3.6.3. PCECC for the Local Protection of the P2MP/MP2MP LSPs 3.6.3. PCECC for the Local Protection of P2MP/MP2MP LSPs
In this section, we describe the local protection service in the In this section, we describe the local protection service in the
PCECC network for the P2MP/MP2MP LSP. PCECC network for the P2MP/MP2MP LSP.
While the PCECC sets up the primary multicast tree, it can also build While the PCECC sets up the primary multicast tree, it can also build
the backup LSP between the Point of Local Repair (PLR), the protected the backup LSP between the Point of Local Repair (PLR), protected
node and Merge Points (MPs) (the downstream nodes of the protected node, and Merge Points (MPs) (the downstream nodes of the protected
node). In cases where the number of downstream nodes is large, node). In cases where the number of downstream nodes is large,
this mechanism can avoid unnecessary packet duplication on PLR and this mechanism can avoid unnecessary packet duplication on the PLR
protect the network from traffic congestion risk. and protect the network from traffic congestion risks.
+------------+ +------------+
| R1 | Root Node | R1 | Root Node
+------------+ +------------+
. .
. .
. .
+------------+ Point of Local Repair/ +------------+ Point of Local Repair /
| R10 | Switchover Point | R10 | Switchover Point
+------------+ (Upstream LSR) +------------+ (Upstream LSR)
/ + / +
10/ +20 10/ +20
/ + / +
+----------+ +-----------+ +----------+ +-----------+
Protected Node | R20 | | R30 | Protected Node | R20 | | R30 |
+----------+ +-----------+ +----------+ +-----------+
| \ + + | \ + +
| \ + + | \ + +
10| 10\ +20 20+ 10| 10\ +20 20+
| \ + + | \ + +
| \ + | \ +
| + \ + | + \ +
+-----------+ +-----------+ Merge Point +-----------+ +-----------+ Merge Point
| R40 | | R50 | (Downstream LSR) | R40 | | R50 | (Downstream LSR)
+-----------+ +-----------+ +-----------+ +-----------+
. . . .
. . . .
Figure 11: PCECC for the Local Protection of the P2MP/MP2MP LSPs Figure 11: PCECC for the Local Protection of P2MP/MP2MP LSPs
In Figure 11, when the PCECC setups the primary multicast path around In Figure 11, when the PCECC sets up the primary multicast path
the PLR node R10 to protect node R20, which is R10->R20->{R40, R50}, around the PLR node R10 to protect node R20, which is R10->R20->{R40,
at the same time, it can set up the backup path R10->R30->{R40, R50}. R50}, it can set up the backup path R10->R30->{R40, R50} at the same
Both the primary forwarding path and secondary bypass forwarding path time. Both the primary forwarding path and the secondary bypass
will be downloaded to each router along the primary path and the forwarding path will be downloaded to each router along the primary
secondary bypass path. The traffic will be forwarded through the path and the secondary bypass path. The traffic will be forwarded
R10->R20->{R40, R50} path normally and when there is a node failure through the R10->R20->{R40, R50} path normally, and when there is a
for node R20, the PLR node R10 will switch the flow to the backup node failure for node R20, the PLR node R10 will switch the flow to
path, which is R10->R30->{R40, R50}. By using the PCECC, path the backup path, which is R10->R30->{R40, R50}. By using the PCECC,
computation, label downloading and finally forwarding can be done path computation, label downloading, and finally forwarding can be
without complex signalling used in the P2MP RSVP-TE or mLDP. done without the complex signalling used in the P2MP RSVP-TE or mLDP.
3.7. PCECC for Traffic Classification 3.7. PCECC for Traffic Classification
As described in [RFC8283], traffic classification is an important As described in [RFC8283], traffic classification is an important
part of traffic engineering. It is the process of looking into a part of traffic engineering. It is the process of looking into a
packet to determine how it should be treated while it is forwarded packet to determine how it should be treated while it is forwarded
through the network. It applies in many scenarios including the through the network. It applies in many scenarios, including the
following: following:
MPLS traffic engineering (where it determines what traffic is * MPLS traffic engineering (where it determines what traffic is
forwarded into which LSPs), forwarded into which LSPs),
Segment Routing (where it is used to select which set of * Segment Routing (where it is used to select which set of
forwarding instructions (SIDs) to add to a packet), forwarding instructions (SIDs) to add to a packet), and
SFC (where it indicates which service function path a packet * SFC (where it indicates which service function path a packet
should be forwarded across). should be forwarded across).
In conjunction with traffic engineering, traffic classification is an In conjunction with traffic engineering, traffic classification is an
important enabler for load balancing. Traffic classification is important enabler for load balancing. Traffic classification is
closely linked to the computational elements of planning for the closely linked to the computational elements of planning for the
network functions because it determines how traffic is balanced and network functions because it determines how traffic is balanced and
distributed through the network. Therefore, selecting what traffic distributed through the network. Therefore, selecting what traffic
classification mechanism should be performed by a router is an classification mechanism should be performed by a router is an
important part of the work done by a PCECC. important part of the work done by a PCECC.
The description of traffic flows by the combination of multiple Flow The description of traffic flows by the combination of multiple Flow
Specification components and their dissemination as traffic flow Specification components and their dissemination as traffic flow
specifications (Flow Specifications) is described for BGP in specifications (Flow Specifications) is described for BGP in
[RFC8955]. When a PCECC is used to initiate tunnels (such as TE-LSPs [RFC8955]. When a PCECC is used to initiate tunnels (such as TE LSPs
or SR paths) using PCEP, it is important that the head end of the or SR paths) using PCEP, it is important that the head end of the
tunnels understands what traffic to place on each tunnel. [RFC9168] tunnels understands what traffic to place on each tunnel. [RFC9168]
specifies a set of extensions to PCEP to support the dissemination of specifies a set of extensions to PCEP to support the dissemination of
Flow Specification components where the instructions are passed from Flow Specification components where the instructions are passed from
the PCECC to the routers using PCEP. the PCECC to the routers using PCEP.
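The idea of a head end mapping traffic onto tunnels can be sketched as a first-match rule table. This is a simplified model that assumes rules match only on destination prefix, IP protocol, and destination port. It does not reproduce the Flow Specification encoding of [RFC8955] or [RFC9168], and the class name, field names, and tunnel names are hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowRule:
    """One classification rule; None means the field is a wildcard."""
    dst_prefix: str
    tunnel: str
    protocol: Optional[int] = None
    dst_port: Optional[int] = None

    def matches(self, dst_ip, protocol, dst_port):
        network = ipaddress.ip_network(self.dst_prefix)
        if ipaddress.ip_address(dst_ip) not in network:
            return False
        if self.protocol is not None and self.protocol != protocol:
            return False
        if self.dst_port is not None and self.dst_port != dst_port:
            return False
        return True


def classify(rules, dst_ip, protocol, dst_port):
    """Return the tunnel of the first matching rule, or None."""
    for rule in rules:
        if rule.matches(dst_ip, protocol, dst_port):
            return rule.tunnel
    return None


# Example rules: UDP/5060 traffic to 192.0.2.0/24 uses LSP1, the rest LSP2.
RULES = [
    FlowRule("192.0.2.0/24", "LSP1", protocol=17, dst_port=5060),
    FlowRule("192.0.2.0/24", "LSP2"),
]
print(classify(RULES, "192.0.2.10", 17, 5060))
print(classify(RULES, "192.0.2.10", 6, 443))
```

Rule order matters in this sketch because the first match wins; a real implementation would follow the precedence rules of the relevant specification.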
Along with traffic classification, there are a few more questions Along with traffic classification, there are a few more
that need to be considered after path setup: considerations after path setup:
* how to use it * how to use it,
* Whether it is a virtual link * whether it is a virtual link,
* Whether to advertise it in the IGP as a virtual link * whether to advertise it in the IGP as a virtual link, and
* What bits of this information to signal to the tail end
* what bits of this information to signal to the tail end.
These are out of the scope of this document. These are out of the scope of this document.
3.8. PCECC for SFC 3.8. PCECC for SFC
Service Function Chaining (SFC) is described in [RFC7665]. It is the Service Function Chaining (SFC) is described in [RFC7665]. It is the
process of directing traffic in a network such that it passes through process of directing traffic in a network such that it passes through
specific hardware devices or virtual machines (known as service specific hardware devices or virtual machines (known as service
function nodes) that can perform particular desired functions on the function nodes) that can perform particular desired functions on the
traffic. The set of functions to be performed and the order in which traffic. The set of functions to be performed and the order in which
skipping to change at page 27, line 33 skipping to change at line 1222
be told how to mark packets entering the network. Additionally, it be told how to mark packets entering the network. Additionally, it
may be necessary to establish tunnels between service function nodes may be necessary to establish tunnels between service function nodes
to carry the traffic. Planning an SFC network requires load to carry the traffic. Planning an SFC network requires load
balancing between service function nodes and traffic engineering balancing between service function nodes and traffic engineering
across the network that connects them. As per [RFC8283], these are across the network that connects them. As per [RFC8283], these are
operations that can be performed by a PCE-based controller, and that operations that can be performed by a PCE-based controller, and that
controller can use PCEP to program the network and install the controller can use PCEP to program the network and install the
service function chains and any required tunnels. service function chains and any required tunnels.
A possible mechanism could add support for SFC-based central control A possible mechanism could add support for SFC-based central control
instructions. PCECC will be able to instruct each SFF along the SFP. instructions. The PCECC will be able to instruct each Service
Function Forwarder (SFF) along the SFP.
* Service Path Identifier (SPI): Uniquely identifies an SFP. * Service Path Identifier (SPI): Uniquely identifies an SFP.
* Service Index (SI): Provides location within the SFP. * Service Index (SI): Provides location within the SFP.
* SFC Proxy handling * SFC Proxy handling
PCECC can play the role of setting the traffic classification rules The PCECC can play the role of setting the traffic classification
(as per Section 3.7) at the classifier to impose the Network Service rules (as per Section 3.7) at the classifier to impose the Network
Header (NSH) [RFC8300] as well as downloading the forwarding Service Header (NSH) [RFC8300]. It can also download the forwarding
instructions to each SFF along the way so that they could process the instructions to each SFF along the way so that they could process the
NSH and forward accordingly. Including instructions for the service NSH and forward accordingly. This includes instructions for the
classifier that handles the context header, metadata etc. This service classifier that handles the context header, metadata, etc.
metadata/context is shared amongst SFs and classifiers, between SFs, This metadata/context is shared amongst SFs and classifiers, between
and between external systems (such as PCECC) and SFs. As described SFs, and between external systems (such as a PCECC) and SFs. As
in [RFC7665], the SFC encapsulation enables the sharing of metadata/ described in [RFC7665], the SFC encapsulation enables the sharing of
context information along the SFP. metadata/context information along the SFP.
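The role of the SPI and SI in steering a packet along an SFP can be sketched as a lookup keyed by the (SPI, SI) pair. The sketch assumes an initial SI of 255 that is decremented after each service function, which is the common NSH convention. The SPI value, the service function names, and the table itself are hypothetical.

```python
# (SPI, SI) -> service function to apply next. Values are illustrative.
SFF_TABLE = {
    (10, 255): "firewall",
    (10, 254): "nat",
}


def walk_sfp(spi, si=255):
    """Apply service functions in order, decrementing SI after each one.

    Returns the list of functions visited and the final SI value.
    """
    visited = []
    while (spi, si) in SFF_TABLE:
        visited.append(SFF_TABLE[(spi, si)])
        si -= 1
    return visited, si


print(walk_sfp(10))
```

In this model, the instructions a PCECC downloads to each SFF would amount to the entries of such a table for the SFPs crossing that SFF.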
It is also possible to support SFC with SR in conjunction with or It is also possible to support SFC with SR in conjunction with or
without NSH such as [RFC9491] and without an NSH such as described in [RFC9491] and [SR-SERVICE].
[I-D.ietf-spring-sr-service-programming]. PCECC technique can also PCECC techniques can also be used for service-function-related
be used for service function-related segments and SR service segments and SR service policies.
policies.
3.9. PCECC for Native IP 3.9. PCECC for Native IP
[RFC8735] describes the scenarios and simulation results for the [RFC8735] describes the scenarios and simulation results for the
"Centrally Control Dynamic Routing (CCDR)" solution, which integrates "Centralized Control Dynamic Routing (CCDR)" solution, which
the advantage of using distributed protocols (IGP/BGP) and the power integrates the advantage of using distributed protocols (IGP/BGP) and
of a centralized control technology (PCE/SDN), providing traffic the power of a centralized control technology (PCE/SDN), providing
engineering for native IP networks. [RFC8821] defines the framework traffic engineering for native IP networks. [RFC8821] defines the
for CCDR traffic engineering within a Native IP network, using framework for CCDR traffic engineering within a native IP network,
multiple BGP sessions and a PCE as the centralized controller. It using multiple BGP sessions and a PCE as the centralized controller.
requires the PCECC to send the instructions to the PCCs, to build It requires the PCECC to send the instructions to the PCCs to build
multiple BGP sessions, distribute different prefixes on the multiple BGP sessions, distribute different prefixes on the
established BGP sessions and assign the different paths to the BGP established BGP sessions, and assign the different paths to the BGP
next hops. PCEP protocol is used to transfer the key parameters next hops. The PCEP protocol is used to transfer the key parameters
between PCE and the underlying network devices (PCC) using the PCECC between the PCE and the underlying network devices (PCC) using the
technique. The central control instructions from PCECC to PCC will PCECC technique. The central control instructions from the PCECC to
identify which prefix should be advertised on which BGP session. PCC will identify which prefix should be advertised on which BGP
There are PCEP extensions defined in session. There are PCEP extensions defined in [PCEP-NATIVE] for it.
[I-D.ietf-pce-pcep-extension-native-ip] for it.
+------+ +------+
+----------+ PCECC+-------+ +----------+ PCECC+-------+
| +------+ | | +------+ |
| | | |
PCEP | BGP Session 1(lo11/lo21)| PCEP PCEP | BGP Session 1(lo11/lo21)| PCEP
+-------------------------+ +-------------------------+
| | | |
| BGP Session 2(lo12/lo22)| | BGP Session 2(lo12/lo22)|
+-------------------------+ +-------------------------+
PF12 | | PF22 PF12 | | PF22
PF11 | | PF21 PF11 | | PF21
+---+ +-----+-----+ +-----+-----+ +---+ +---+ +-----+-----+ +-----+-----+ +---+
|SW1+---------+(lo11/lo12)+-------------+(lo21/lo22)+-----------+SW2| |SW1+---------+(lo11/lo12)+-------------+(lo21/lo22)+-----------+SW2|
+---+ | R1 +-------------+ R2 | +---+ +---+ | R1 +-------------+ R2 | +---+
+-----------+ +-----------+ +-----------+ +-----------+
Figure 12: PCECC for Native IP Figure 12: PCECC for Native IP
In the case, as shown in Figure 12, PCECC will instruct both R1 and In the case as shown in Figure 12, the PCECC will instruct both R1
R2 via PCEP how to form BGP sessions with each other and which IP and R2 how to form BGP sessions with each other via PCEP and which IP
prefixes need to be advertised via which BGP session. prefixes need to be advertised via which BGP session.
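As an illustration only, the instruction described above can be pictured as a mapping from prefixes to BGP sessions that the controller groups per session before signalling R1 and R2. The sketch below is not the encoding defined in [PCEP-NATIVE]. The class name, the function name, and the choice of which prefix goes on which session are assumptions made for the example. Only the session endpoints (lo11/lo21, lo12/lo22) and the prefix labels (PF11, PF12, PF21, PF22) come from Figure 12.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BgpSession:
    """One BGP session between R1 and R2 (hypothetical model, not a PCEP object)."""
    name: str
    r1_loopback: str
    r2_loopback: str


def build_instructions(sessions, prefix_to_session):
    """Group prefixes by the session they should be advertised on."""
    plan = {s.name: [] for s in sessions}
    for prefix, session_name in prefix_to_session.items():
        if session_name not in plan:
            raise ValueError(f"unknown BGP session: {session_name}")
        plan[session_name].append(prefix)
    # Sort so the result does not depend on insertion order.
    return {name: sorted(prefixes) for name, prefixes in plan.items()}


# Sessions as drawn in Figure 12.
SESSIONS = [
    BgpSession("session1", "lo11", "lo21"),
    BgpSession("session2", "lo12", "lo22"),
]

# Assumed assignment: the figure does not state which prefix uses which session.
ASSIGNMENT = {
    "PF11": "session1",
    "PF21": "session1",
    "PF12": "session2",
    "PF22": "session2",
}

PLAN = build_instructions(SESSIONS, ASSIGNMENT)
```

Under these assumptions, PLAN lists PF11 and PF21 for session 1 and PF12 and PF22 for session 2; a controller would then translate each entry into the PCEP messages sent to the PCCs.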
3.10. PCECC for BIER 3.10. PCECC for BIER
Bit Index Explicit Replication (BIER) [RFC8279] defines an Bit Index Explicit Replication (BIER) [RFC8279] defines an
architecture where all intended multicast receivers are encoded as a architecture where all intended multicast receivers are encoded as a
bitmask in the multicast packet header within different BitMask in the multicast packet header within different
encapsulations. A router that receives such a packet will forward encapsulations. A router that receives such a packet will forward
that packet based on the bit position in the packet header towards that packet based on the bit position in the packet header towards
the receiver(s) following a precomputed tree for each of the bits in the receiver(s) following a precomputed tree for each of the bits in
the packet. Each receiver is represented by a unique bit in the the packet. Each receiver is represented by a unique bit in the
bitmask. BitMask.
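The per-bit forwarding described above can be sketched as follows. This is a simplified illustration, not the forwarding procedure of [RFC8279]: it assumes zero-based bit positions and a hypothetical table (next_hop_by_bit) that maps each bit directly to a neighbor. For each bit set in the packet, the router looks up the neighbor on the precomputed tree and builds one copy per neighbor that carries only the bits reached through that neighbor.

```python
def replicate(bitmask, next_hop_by_bit):
    """Return {neighbor: bitmask of the copy sent to that neighbor}."""
    copies = {}
    bit = 0
    remaining = bitmask
    while remaining:
        if remaining & 1:
            # Raises KeyError if the router has no entry for this receiver bit.
            neighbor = next_hop_by_bit[bit]
            copies[neighbor] = copies.get(neighbor, 0) | (1 << bit)
        remaining >>= 1
        bit += 1
    return copies


# Hypothetical table: receivers 0 and 1 are reached via B, receiver 2 via C.
TABLE = {0: "B", 1: "B", 2: "C"}
COPIES = replicate(0b101, TABLE)
```

With this table, a packet addressed to receivers 0 and 2 yields one copy towards B carrying bit 0 and one copy towards C carrying bit 2, so no receiver is reached twice.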
BIER-TE [RFC9262] shares architecture and packet formats with BIER. BIER-TE [RFC9262] shares architecture and packet formats with BIER.
BIER-TE forwards and replicates packets based on a BitString in the BIER-TE forwards and replicates packets based on a BitString in the
packet header, but every BitPosition of the BitString of a BIER-TE packet header, but every BitPosition of the BitString of a BIER-TE
packet indicates one or more adjacencies. BIER-TE paths can be packet indicates one or more adjacencies. BIER-TE paths can be
derived from a PCE and used at the ingress ( a possible mechanism is derived from a PCE and used at the ingress (a possible mechanism is
described in [I-D.chen-pce-bier]). described in [PCEP-BIER]).
PCECC mechanism could be used for the allocation of bits for the BIER The PCECC mechanism could be used for the allocation of bits for the
router for BIER as well as for the adjacencies for BIER-TE. PCECC- BIER router for BIER as well as for the adjacencies for BIER-TE.
based controllers can use PCEP to instruct the BIER-capable routers PCECC-based controllers can use PCEP to instruct the BIER-capable
on the meaning of the bits as well as other fields needed for BIER routers on the meaning of the bits as well as other fields needed for
encapsulation. The PCECC could be used to program the BIER router BIER encapsulation. The PCECC could be used to program the BIER
with various parameters used in the BIER encapsulation such as BIER router with various parameters used in the BIER encapsulation (such
subdomain-ID, BFR-ID, BIER Encapsulation etc. for both node and as BIER sub-domain-id, BFR-id, etc.) for both node and adjacency.
adjacency.
A possible way for the PCECC usage and PCEP extension is described in A possible way to use the PCECC and PCEP extension is described in
[I-D.chen-pce-pcep-extension-pce-controller-bier]. [PCECC-BIER].
4. IANA Considerations 4. IANA Considerations
This document does not require any action from IANA. This document has no IANA actions.
5. Security Considerations 5. Security Considerations
[RFC8283] describes how the security considerations for a PCE-based [RFC8283] describes how the security considerations for a PCE-based
controller are a little different from those for any other PCE controller are a little different from those for any other PCE
system. PCECC operations rely heavily on the use and security of system. PCECC operations rely heavily on the use and security of
PCEP, so due consideration should be given to the security features PCEP, so due consideration should be given to the security features
discussed in [RFC5440] and the additional mechanisms described in discussed in [RFC5440] and the additional mechanisms described in
[RFC8253]. It further lists the vulnerability of a central [RFC8253]. It further lists the vulnerability of a central
controller architecture, such as a central point of failure, denial controller architecture, such as a central point of failure, denial
skipping to change at page 30, line 18 skipping to change at line 1345
for associating peer identities with different levels of access and/ for associating peer identities with different levels of access and/
or authoritativeness via an attribute in X.509 certificates or a or authoritativeness via an attribute in X.509 certificates or a
local policy with a specific accept-list of X.509 certificates. This local policy with a specific accept-list of X.509 certificates. This
can be used to check the authority for the PCECC operations. can be used to check the authority for the PCECC operations.
It is expected that each new document that is produced for a specific It is expected that each new document that is produced for a specific
use case will also include considerations of the security impacts of use case will also include considerations of the security impacts of
the use of a PCE-based central controller on the network type and the use of a PCE-based central controller on the network type and
services being managed. services being managed.
6. Acknowledgments 6. References
Thanks to Adrian Farrel, Aijun Wang, Robert Tao, Changjiang Yan,
Tieying Huang, Sergio Belotti, Dieter Beller, Andrey Elperin and
Evgeniy Brodskiy for their useful comments and suggestions.
Thanks to Mach Chen and Carlos Pignataro for the RTGDIR review.
Thanks to Derrell Piper for the SECDIR review. Thanks to Sue Hares
for GENART review.
Thanks to Vishnu Pavan Beeram for being the document shepherd and Jim
Guichard for being the responsible AD.
Thanks to Roman Danyliw for the IESG review comments.
7. References
7.1. Normative References 6.1. Normative References
[RFC5440] Vasseur, JP., Ed. and JL. Le Roux, Ed., "Path Computation [RFC5440] Vasseur, JP., Ed. and JL. Le Roux, Ed., "Path Computation
Element (PCE) Communication Protocol (PCEP)", RFC 5440, Element (PCE) Communication Protocol (PCEP)", RFC 5440,
DOI 10.17487/RFC5440, March 2009, DOI 10.17487/RFC5440, March 2009,
<https://www.rfc-editor.org/info/rfc5440>. <https://www.rfc-editor.org/info/rfc5440>.
[RFC7665] Halpern, J., Ed. and C. Pignataro, Ed., "Service Function [RFC7665] Halpern, J., Ed. and C. Pignataro, Ed., "Service Function
Chaining (SFC) Architecture", RFC 7665, Chaining (SFC) Architecture", RFC 7665,
DOI 10.17487/RFC7665, October 2015, DOI 10.17487/RFC7665, October 2015,
<https://www.rfc-editor.org/info/rfc7665>. <https://www.rfc-editor.org/info/rfc7665>.
skipping to change at page 31, line 28 skipping to change at line 1388
Architecture for Use of PCE and the PCE Communication Architecture for Use of PCE and the PCE Communication
Protocol (PCEP) in a Network with Central Control", Protocol (PCEP) in a Network with Central Control",
RFC 8283, DOI 10.17487/RFC8283, December 2017, RFC 8283, DOI 10.17487/RFC8283, December 2017,
<https://www.rfc-editor.org/info/rfc8283>. <https://www.rfc-editor.org/info/rfc8283>.
[RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L., [RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L.,
Decraene, B., Litkowski, S., and R. Shakir, "Segment Decraene, B., Litkowski, S., and R. Shakir, "Segment
Routing Architecture", RFC 8402, DOI 10.17487/RFC8402, Routing Architecture", RFC 8402, DOI 10.17487/RFC8402,
July 2018, <https://www.rfc-editor.org/info/rfc8402>. July 2018, <https://www.rfc-editor.org/info/rfc8402>.
7.2. Informative References 6.2. Informative References
[I-D.cbrt-pce-stateful-local-protection] [MAP-REDUCE]
Lee, K., Choi, T., Ganguly, A., Wolinsky, D., Boykin, P.,
and R. Figueiredo, "Parallel Processing Framework on a P2P
System Using Map and Reduce Primitives",
DOI 10.1109/IPDPS.2011.315, May 2011,
<https://leeky.me/publications/mapreduce_p2p.pdf>.
[MPLS-DC] Afanasiev, D. and D. Ginsburg, "MPLS in DC and inter-DC
networks: the unified forwarding mechanism for network
programmability at scale", March 2014,
<https://www.slideshare.net/DmitryAfanasiev1/yandex-
nag201320131031>.
[MPLS-SEAMLESS]
Leymann, N., Ed., Decraene, B., Filsfils, C.,
Konstantynowicz, M., Ed., and D. Steinberg, "Seamless MPLS
Architecture", Work in Progress, Internet-Draft, draft-
ietf-mpls-seamless-mpls-07, 28 June 2014,
<https://datatracker.ietf.org/doc/html/draft-ietf-mpls-
seamless-mpls-07>.
[PCE-ID] Li, C., Shi, H., Ed., Wang, A., Cheng, W., and C. Zhou,
"Path Computation Element Communication Protocol (PCEP)
extension to advertise the PCE Controlled Identifier
Space", Work in Progress, Internet-Draft, draft-ietf-pce-
controlled-id-space-00, 4 June 2024,
<https://datatracker.ietf.org/doc/html/draft-ietf-pce-
controlled-id-space-00>.
[PCE-INTERDOMAIN]
Dugeon, O., Meuric, J., Lee, Y., and D. Ceccarelli, "PCEP
Extension for Stateful Inter-Domain Tunnels", Work in
Progress, Internet-Draft, draft-ietf-pce-stateful-
interdomain-05, 5 July 2024,
<https://datatracker.ietf.org/doc/html/draft-ietf-pce-
stateful-interdomain-05>.
[PCE-PROTECTION]
Barth, C. and R. Torvi, "PCEP Extensions for RSVP-TE Barth, C. and R. Torvi, "PCEP Extensions for RSVP-TE
Local-Protection with PCE-Stateful", Work in Progress, Local-Protection with PCE-Stateful", Work in Progress,
Internet-Draft, draft-cbrt-pce-stateful-local-protection- Internet-Draft, draft-cbrt-pce-stateful-local-protection-
01, 29 June 2018, <https://datatracker.ietf.org/doc/html/ 01, 29 June 2018, <https://datatracker.ietf.org/doc/html/
draft-cbrt-pce-stateful-local-protection-01>. draft-cbrt-pce-stateful-local-protection-01>.
[I-D.chen-pce-bier] [PCECC-BIER]
Chen, R., Zhang, Z., Chen, H., Dhanaraj, S., Qin, F., and Chen, R., Zhu, C., Xu, B., Chen, H., and A. Wang, "PCEP
A. Wang, "PCEP Extensions for Tree Engineering for Bit Procedures and Protocol Extensions for Using PCE as a
Index Explicit Replication (BIER-TE)", Work in Progress, Central Controller (PCECC) of BIER", Work in Progress,
Internet-Draft, draft-chen-pce-bier-13, 1 October 2023, Internet-Draft, draft-chen-pce-pcep-extension-pce-
controller-bier-06, 8 July 2024,
<https://datatracker.ietf.org/doc/html/draft-chen-pce- <https://datatracker.ietf.org/doc/html/draft-chen-pce-
bier-13>. pcep-extension-pce-controller-bier-06>.
[I-D.chen-pce-pcep-extension-pce-controller-bier] [PCECC-SR] Li, Z., Peng, S., Negi, M. S., Zhao, Q., and C. Zhou, "PCE
Chen, R., Xu, B., Chen, H., and A. Wang, "PCEP Procedures Communication Protocol (PCEP) Extensions for Using PCE as
and Protocol Extensions for Using PCE as a Central a Central Controller (PCECC) for Segment Routing (SR) MPLS
Controller (PCECC) of BIER", Work in Progress, Internet- Segment Identifier (SID) Allocation and Distribution.",
Draft, draft-chen-pce-pcep-extension-pce-controller-bier- Work in Progress, Internet-Draft, draft-ietf-pce-pcep-
05, 19 October 2023, extension-pce-controller-sr-09, 4 July 2024,
<https://datatracker.ietf.org/doc/html/draft-chen-pce- <https://datatracker.ietf.org/doc/html/draft-ietf-pce-
pcep-extension-pce-controller-bier-05>. pcep-extension-pce-controller-sr-09>.
[I-D.dhody-pce-pcep-extension-pce-controller-srv6] [PCECC-SRv6]
Li, Z., Peng, S., Geng, X., and M. S. Negi, "PCE Li, Z., Peng, S., Geng, X., and M. S. Negi, "PCE
Communication Protocol (PCEP) Extensions for Using the PCE Communication Protocol (PCEP) Extensions for Using the PCE
as a Central Controller (PCECC) for Segment Routing over as a Central Controller (PCECC) for Segment Routing over
IPv6 (SRv6) Segment Identifier (SID) Allocation and IPv6 (SRv6) Segment Identifier (SID) Allocation and
Distribution.", Work in Progress, Internet-Draft, draft- Distribution.", Work in Progress, Internet-Draft, draft-
dhody-pce-pcep-extension-pce-controller-srv6-10, 15 ietf-pce-pcep-extension-pce-controller-srv6-03, 18 August
January 2023, <https://datatracker.ietf.org/doc/html/ 2024, <https://datatracker.ietf.org/doc/html/draft-ietf-
draft-dhody-pce-pcep-extension-pce-controller-srv6-10>. pce-pcep-extension-pce-controller-srv6-03>.
[I-D.ietf-mpls-seamless-mpls]
Leymann, N., Decraene, B., Filsfils, C., Konstantynowicz,
M., and D. Steinberg, "Seamless MPLS Architecture", Work
in Progress, Internet-Draft, draft-ietf-mpls-seamless-
mpls-07, 28 June 2014,
<https://datatracker.ietf.org/doc/html/draft-ietf-mpls-
seamless-mpls-07>.
[I-D.ietf-pce-binding-label-sid] [PCEP-BIER]
Sivabalan, S., Filsfils, C., Tantsura, J., Previdi, S., Chen, R., Zhang, Z., Chen, H., Dhanaraj, S., Qin, F., and
and C. Li, "Carrying Binding Label/Segment Identifier A. Wang, "PCEP Extensions for BIER-TE", Work in Progress,
(SID) in PCE-based Networks.", Work in Progress, Internet- Internet-Draft, draft-ietf-pce-bier-te-00, 4 November
Draft, draft-ietf-pce-binding-label-sid-16, 27 March 2023, 2023, <https://datatracker.ietf.org/doc/html/draft-ietf-
<https://datatracker.ietf.org/doc/html/draft-ietf-pce- pce-bier-te-00>.
binding-label-sid-16>.
[I-D.ietf-pce-pcep-extension-native-ip] [PCEP-NATIVE]
Wang, A., Khasanov, B., Fang, S., Tan, R., and C. Zhu, Wang, A., Khasanov, B., Fang, S., Tan, R., and C. Zhu,
"Path Computation Element Communication Protocol (PCEP) "Path Computation Element Communication Protocol (PCEP)
Extensions for Native IP Networks", Work in Progress, Extensions for Native IP Networks", Work in Progress,
Internet-Draft, draft-ietf-pce-pcep-extension-native-ip- Internet-Draft, draft-ietf-pce-pcep-extension-native-ip-
30, 1 February 2024, 40, 10 September 2024,
<https://datatracker.ietf.org/doc/html/draft-ietf-pce-
pcep-extension-native-ip-30>.
[I-D.ietf-pce-pcep-extension-pce-controller-sr]
Li, Z., Peng, S., Negi, M. S., Zhao, Q., and C. Zhou, "PCE
Communication Protocol (PCEP) Extensions for Using PCE as
a Central Controller (PCECC) for Segment Routing (SR) MPLS
Segment Identifier (SID) Allocation and Distribution.",
Work in Progress, Internet-Draft, draft-ietf-pce-pcep-
extension-pce-controller-sr-08, 1 January 2024,
<https://datatracker.ietf.org/doc/html/draft-ietf-pce-
pcep-extension-pce-controller-sr-08>.
[I-D.ietf-pce-segment-routing-ipv6]
Li, C., Kaladharan, P., Sivabalan, S., Koldychev, M., and
Y. Zhu, "Path Computation Element Communication Protocol
(PCEP) Extensions for IPv6 Segment Routing", Work in
Progress, Internet-Draft, draft-ietf-pce-segment-routing-
ipv6-25, 4 April 2024,
<https://datatracker.ietf.org/doc/html/draft-ietf-pce- <https://datatracker.ietf.org/doc/html/draft-ietf-pce-
segment-routing-ipv6-25>. pcep-extension-native-ip-40>.
[I-D.ietf-pce-segment-routing-policy-cp] [PCEP-POLICY]
Koldychev, M., Sivabalan, S., Barth, C., Peng, S., and H. Koldychev, M., Sivabalan, S., Barth, C., Peng, S., and H.
Bidgoli, "Path Computation Element Communication Protocol Bidgoli, "Path Computation Element Communication Protocol
(PCEP) Extensions for Segment Routing (SR) Policy (PCEP) Extensions for Segment Routing (SR) Policy
Candidate Paths", Work in Progress, Internet-Draft, draft- Candidate Paths", Work in Progress, Internet-Draft, draft-
ietf-pce-segment-routing-policy-cp-16, 28 May 2024, ietf-pce-segment-routing-policy-cp-18, 14 October 2024,
<https://datatracker.ietf.org/doc/html/draft-ietf-pce-
segment-routing-policy-cp-16>.
[I-D.ietf-pce-stateful-interdomain]
Dugeon, O., Meuric, J., Lee, Y., and D. Ceccarelli, "PCEP
Extension for Stateful Inter-Domain Tunnels", Work in
Progress, Internet-Draft, draft-ietf-pce-stateful-
interdomain-04, 23 October 2023,
<https://datatracker.ietf.org/doc/html/draft-ietf-pce- <https://datatracker.ietf.org/doc/html/draft-ietf-pce-
stateful-interdomain-04>. segment-routing-policy-cp-18>.
[I-D.ietf-spring-sr-service-programming]
Clad, F., Xu, X., Filsfils, C., Bernier, D., Li, C.,
Decraene, B., Ma, S., Yadlapalli, C., Henderickx, W., and
S. Salsano, "Service Programming with Segment Routing",
Work in Progress, Internet-Draft, draft-ietf-spring-sr-
service-programming-09, 20 February 2024,
<https://datatracker.ietf.org/doc/html/draft-ietf-spring-
sr-service-programming-09>.
[I-D.li-pce-controlled-id-space]
Li, C., Shi, H., Wang, A., Cheng, W., and C. Zhou, "Path
Computation Element Communication Protocol (PCEP)
extension to advertise the PCE Controlled Identifier
Space", Work in Progress, Internet-Draft, draft-li-pce-
controlled-id-space-16, 25 January 2024,
<https://datatracker.ietf.org/doc/html/draft-li-pce-
controlled-id-space-16>.
[MAP-REDUCE]
Lee, K., Choi, T., Ganguly, A., Wolinsky, D., Boykin, P.,
and R. Figueiredo, "Parallel Processing Framework on a P2P
System Using Map and Reduce Primitives", , May 2011,
<http://leeky.me/publications/mapreduce_p2p.pdf>.
[MPLS-DC] Afanasiev, D. and D. Ginsburg, "MPLS in DC and inter-DC
networks: the unified forwarding mechanism for network
programmability at scale", , March 2014,
<https://www.slideshare.net/DmitryAfanasiev1/yandex-
nag201320131031>.
[RFC1195] Callon, R., "Use of OSI IS-IS for routing in TCP/IP and [RFC1195] Callon, R., "Use of OSI IS-IS for routing in TCP/IP and
dual environments", RFC 1195, DOI 10.17487/RFC1195, dual environments", RFC 1195, DOI 10.17487/RFC1195,
December 1990, <https://www.rfc-editor.org/info/rfc1195>. December 1990, <https://www.rfc-editor.org/info/rfc1195>.
[RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, [RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328,
DOI 10.17487/RFC2328, April 1998, DOI 10.17487/RFC2328, April 1998,
<https://www.rfc-editor.org/info/rfc2328>. <https://www.rfc-editor.org/info/rfc2328>.
[RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
skipping to change at page 38, line 14 skipping to change at line 1683
[RFC9522] Farrel, A., Ed., "Overview and Principles of Internet [RFC9522] Farrel, A., Ed., "Overview and Principles of Internet
Traffic Engineering", RFC 9522, DOI 10.17487/RFC9522, Traffic Engineering", RFC 9522, DOI 10.17487/RFC9522,
January 2024, <https://www.rfc-editor.org/info/rfc9522>. January 2024, <https://www.rfc-editor.org/info/rfc9522>.
[RFC9552] Talaulikar, K., Ed., "Distribution of Link-State and [RFC9552] Talaulikar, K., Ed., "Distribution of Link-State and
Traffic Engineering Information Using BGP", RFC 9552, Traffic Engineering Information Using BGP", RFC 9552,
DOI 10.17487/RFC9552, December 2023, DOI 10.17487/RFC9552, December 2023,
<https://www.rfc-editor.org/info/rfc9552>. <https://www.rfc-editor.org/info/rfc9552>.
Appendix A. Other Use Cases of PCECC [RFC9603] Li, C., Ed., Kaladharan, P., Sivabalan, S., Koldychev, M.,
and Y. Zhu, "Path Computation Element Communication
Protocol (PCEP) Extensions for IPv6 Segment Routing",
RFC 9603, DOI 10.17487/RFC9603, July 2024,
<https://www.rfc-editor.org/info/rfc9603>.
This section lists some more use cases of PCECC that were proposed by [RFC9604] Sivabalan, S., Filsfils, C., Tantsura, J., Previdi, S.,
operators and discussed within the working group, but are not in and C. Li, Ed., "Carrying Binding Label/SID in PCE-Based
active development at the time of publication. They are listed here Networks", RFC 9604, DOI 10.17487/RFC9604, August 2024,
for future consideration. <https://www.rfc-editor.org/info/rfc9604>.
[SR-SERVICE]
Clad, F., Ed., Xu, X., Ed., Filsfils, C., Bernier, D., Li,
C., Decraene, B., Ma, S., Yadlapalli, C., Henderickx, W.,
and S. Salsano, "Service Programming with Segment
Routing", Work in Progress, Internet-Draft, draft-ietf-
spring-sr-service-programming-10, 23 August 2024,
<https://datatracker.ietf.org/doc/html/draft-ietf-spring-
sr-service-programming-10>.
Appendix A. Other Use Cases of the PCECC
This section lists some more use cases of the PCECC that were
proposed by operators and discussed within the working group but are
not in active development at the time of publication. They are
listed here for future consideration.
A.1. PCECC for Network Migration A.1. PCECC for Network Migration
One of the main advantages of the PCECC solution is its backward One of the main advantages of the PCECC solution is its backward
compatibility. The PCE server can function as a proxy node of the compatibility. The PCE server can function as a proxy node of the
MPLS network for all the new nodes that no longer support the MPLS network for all the new nodes that no longer support the
signalling protocols. signalling protocols.
As illustrated in the following example, the current network could As illustrated in the following example, the current network could
migrate to a total PCECC-controlled network gradually by replacing migrate to a total PCECC-controlled network gradually by replacing
the legacy nodes. During the migration, the legacy nodes still need the legacy nodes. During the migration, the legacy nodes still need
to use the existing MPLS signalling protocols such as LDP and to use the existing MPLS signalling protocols such as LDP and
RSVP-TE, and the new nodes will set up their portion of the forwarding RSVP-TE, and the new nodes will set up their portion of the forwarding
path through PCECC directly. With the PCECC function as the proxy of path through the PCECC directly. With the PCECC function as the
these new nodes, MPLS signalling can populate through the network for proxy of these new nodes, MPLS signalling can populate through the
both: old and new nodes. network for both old and new nodes.
The example described in this section is based on network The example described in this section is based on network
configurations illustrated using Figure 13: configurations illustrated in Figure 13:
+------------------------------------------------------------------+ +------------------------------------------------------------------+
| PCE DOMAIN | | PCE DOMAIN |
| +-----------------------------------------------------+ | | +-----------------------------------------------------+ |
| | PCECC | | | | PCECC | |
| +-----------------------------------------------------+ | | +-----------------------------------------------------+ |
| ^ ^ ^ ^ | | ^ ^ ^ ^ |
| | PCEP | | PCEP | | | | PCEP | | PCEP | |
| V V V V | | V V V V |
| +--------+ +--------+ +--------+ +--------+ +--------+ | | +--------+ +--------+ +--------+ +--------+ +--------+ |
| | NODE 1 | | NODE 2 | | NODE 3 | | NODE 4 | | NODE 5 | | | | NODE 1 | | NODE 2 | | NODE 3 | | NODE 4 | | NODE 5 | |
| | |...| |...| |...| |...| | | | | |...| |...| |...| |...| | |
| | Legacy |if1| Legacy |if2|Legacy |if3| PCECC |if4| PCECC | | | | Legacy |if1| Legacy |if2|Legacy |if3| PCECC |if4| PCECC | |
| | Node | | Node | |Enabled | |Enabled | | Enabled| | | | Node | | Node | |Enabled | |Enabled | | Enabled| |
| +--------+ +--------+ +--------+ +--------+ +--------+ | | +--------+ +--------+ +--------+ +--------+ +--------+ |
| | | |
+------------------------------------------------------------------+ +------------------------------------------------------------------+
Figure 13: PCECC Initiated LSP Setup In the Network Migration Figure 13: PCECC-Initiated LSP Setup in the Network Migration
In this example, there are five nodes for the TE LSP from the head In this example, there are five nodes for the TE LSP from the head
end (Node1) to the tail end (Node5). Where Node4 and Node5 are end (Node1) to the tail end (Node5), where Node4 and Node5 are
centrally controlled and other nodes are legacy nodes. centrally controlled and other nodes are legacy nodes.
* Node1 sends a path request message for the setup of LSP with the * Node1 sends a path request message for the setup of the LSP with
destination as Node5. the destination as Node5.
* PCECC sends to Node1 a reply message for LSP setup with the path: * The PCECC sends a reply message to Node1 for LSP setup with the
(Node1, if1),(Node2, if2), (Node3, if3), (Node4, if4), Node5. path: (Node1, if1), (Node2, if2), (Node3, if3), (Node4, if4),
Node5.
* Node1, Node2, and Node3 will set up the LSP to Node5 using the * Node1, Node2, and Node3 will set up the LSP to Node5 using the
local labels as usual. Node3 with the help of PCECC could proxy local labels as usual. With the help of the PCECC, Node3 could
the signalling. proxy the signalling.
* Then the PCECC will program the out-segment of Node3, the in- * Then, the PCECC will program the out-segment of Node3, the in-
segment/ out-segment of Node4, and the in-segment for Node5. segment/out-segment of Node4, and the in-segment for Node5.
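The last step can be sketched as a small rule over the computed path. The generalization is an assumption made for illustration: the PCECC programs the in-segment and out-segment of every centrally controlled node on the path, plus the out-segment of a legacy node whose next hop is centrally controlled (the role Node3 plays in this example). The function name and data layout are hypothetical.

```python
def pcecc_programmed_segments(path, controlled):
    """List the (node, segment) pairs the PCECC would program along the path."""
    segments = []
    last = len(path) - 1
    for i, node in enumerate(path):
        next_is_controlled = i < last and path[i + 1] in controlled
        if node in controlled:
            if i > 0:
                segments.append((node, "in"))
            if i < last:
                segments.append((node, "out"))
        elif next_is_controlled:
            # Border case: a legacy node handing over to a controlled node.
            segments.append((node, "out"))
    return segments


PATH = ["Node1", "Node2", "Node3", "Node4", "Node5"]
CONTROLLED = {"Node4", "Node5"}
SEGMENTS = pcecc_programmed_segments(PATH, CONTROLLED)
```

For the five-node example, this yields the out-segment of Node3, the in-segment and out-segment of Node4, and the in-segment of Node5, which matches the list above. Node1 and Node2 keep using their locally signalled labels.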
A.2. PCECC for L3VPN and PWE3 A.2. PCECC for L3VPN and PWE3
As described in [RFC8283], various network services may be offered As described in [RFC8283], various network services may be offered
over a network. These include protection services (including Virtual over a network. These include protection services (including Virtual
Private Network (VPN) services (such as Layer 3 VPNs [RFC4364] or Private Network (VPN) services such as Layer 3 VPNs [RFC4364] or
Ethernet VPNs [RFC7432]); or Pseudowires [RFC3985]. Delivering Ethernet VPNs [RFC7432]) or pseudowires [RFC3985]. Delivering
services over a network in an optimal way requires coordination in services over a network in an optimal way requires coordination in
the way that network resources are allocated to support the the way that network resources are allocated to support the
services. A PCE-based central controller can consider the whole services. A PCE-based central controller can consider the whole
network and all components of a service at once when planning how to network and all components of a service at once when planning how to
deliver the service. It can then use PCEP to manage the network deliver the service. It can then use PCEP to manage the network
resources and to install the necessary associations between those resources and to install the necessary associations between those
resources. resources.
In the case of L3VPN, VPN labels could also be assigned and In the case of L3VPN, VPN labels could also be assigned and
distributed through PCEP among the PE routers instead of using the BGP distributed through PCEP among the Provider Edge (PE) routers instead
protocols. of using the BGP protocols.
The example described in this section is based on network The example described in this section is based on network
configurations illustrated using Figure 14: configurations illustrated in Figure 14:
+-------------------------------------------+ +-------------------------------------------+
| PCE DOMAIN | | PCE DOMAIN |
| +-----------------------------------+ | | +-----------------------------------+ |
| | PCECC | | | | PCECC | |
| +-----------------------------------+ | | +-----------------------------------+ |
| ^ ^ ^ | | ^ ^ ^ |
|PWE3/L3VPN | PCEP PCEP|LSP PWE3/L3VPN|PCEP | | PWE3/L3VPN|PCEP PCEP|LSP PWE3/L3VPN|PCEP |
| V V V | | V V V |
+--------+ | +--------+ +--------+ +--------+ | +--------+ +--------+ | +--------+ +--------+ +--------+ | +--------+
| CE | | | PE1 | | NODE x | | PE2 | | | CE | | CE | | | PE1 | | NODE x | | PE2 | | | CE |
| |...... | |...| |...| |.....| | | |...... | |...| |...| |.....| |
| Legacy | |if1 | PCECC |if2|PCECC |if3| PCECC |if4 | Legacy | | Legacy | |if1 | PCECC |if2|PCECC |if3| PCECC |if4 | Legacy |
| Node | | | Enabled| |Enabled | |Enabled | | | Node | | Node | | | Enabled| |Enabled | |Enabled | | | Node |
+--------+ | +--------+ +--------+ +--------+ | +--------+ +--------+ | +--------+ +--------+ +--------+ | +--------+
| | | |
+-------------------------------------------+ +-------------------------------------------+
Figure 14: PCECC for L3VPN and PWE3 Figure 14: PCECC for L3VPN and PWE3
In the case of PWE3, instead of using the LDP signalling protocols, In the case of PWE3, instead of using the LDP signalling protocols,
the label and port pairs assigned to each pseudowire can be assigned the label and port pairs assigned to each pseudowire can be assigned
through PCECC among the PE routers and the corresponding forwarding through the PCECC among the PE routers and the corresponding
entries will be distributed into each PE router through the extended forwarding entries will be distributed into each PE router through
PCEP and PCECC mechanism. the extended PCEP and PCECC mechanism.
A.3. PCECC for Local Protection (RSVP-TE) A.3. PCECC for Local Protection (RSVP-TE)
[I-D.cbrt-pce-stateful-local-protection] claim that there is a need [PCE-PROTECTION] claims that there is a need for the PCE to maintain
for the PCE to maintain and associate the local protection paths for and associate the local protection paths for the RSVP-TE LSP. Local
the RSVP-TE LSP. Local protection requires the setup of a bypass at protection requires the setup of a bypass at the PLR. This bypass
the PLR. This bypass can be PCC-initiated and delegated, or PCE- can be PCC-initiated and delegated or PCE-initiated. In either case,
initiated. In either case, the PLR needs to maintain a PCEP session the PLR needs to maintain a PCEP session with the PCE. The bypass
with the PCE. The Bypass LSPs need to be mapped to the primary LSP. LSPs need to be mapped to the primary LSP. This could be done
This could be done locally at the PLR based on a local policy but locally at the PLR based on a local policy, but there is a need for a
there is a need for a PCE to do the mapping as well to exert greater PCE to do the mapping as well to exert greater control.
control.
This mapping can be done via PCECC procedures where the PCE could This mapping can be done via PCECC procedures where the PCE could
instruct the PLR to perform the mapping and identify the primary LSP for instruct the PLR to perform the mapping and identify the primary LSP for
which bypass should be used. which bypass should be used.
A.4. Using reliable P2MP TE based multicast delivery for distributed A.4. Using Reliable P2MP TE-Based Multicast Delivery for Distributed
computations (MapReduce-Hadoop) Computations (MapReduce-Hadoop)
MapReduce model of distributed computations in computing clusters is The MapReduce model of distributed computations in computing clusters
widely deployed. In Hadoop (https://hadoop.apache.org/) 1.0 is widely deployed. In Hadoop (https://hadoop.apache.org/) 1.0
architecture MapReduce operations on big data in the Hadoop architecture, MapReduce operations occur on big data in the Hadoop
Distributed File System (HDFS), where NameNode knows about resources Distributed File System (HDFS), where NameNode knows about resources
of the cluster and where actual data (chunks) for a particular task of the cluster and where actual data (chunks) for a particular task
are located (which DataNode). Each chunk of data (64MB or more) are located (which DataNode). Each chunk of data (64 MB or more)
should have 3 saved copies in different DataNodes based on their should have three saved copies in different DataNodes based on their
proximity. proximity.
The proximity level currently has a semi-manual allocation and is The proximity level currently has a semi-manual allocation and is
based on Rack IDs (The assumption is that closer data are better based on Rack IDs (the assumption is that closer data is better
because of access speed/smaller latency). because of access speed / smaller latency).
The JobTracker node is responsible for computation tasks and
scheduling across DataNodes and also has Rack awareness. Currently,
transport protocols between NameNode/JobTracker and DataNodes are
based on IP unicast. This approach has simplicity as an advantage
but has numerous drawbacks related to its flat design.

There is a need to go beyond one data center (DC) for Hadoop cluster
creation and move towards distributed clusters. In that case, one
needs to handle performance and latency issues. Latency depends on
the speed of light in the fiber links and on the latency introduced
by intermediate devices in between. The latter is closely correlated
with network device architecture and performance. The current
performance of routers based on Network Processing Units (NPUs)
should be enough for creating distributed Hadoop clusters with
predictable latency. The performance of software-based routers
(mainly Virtual Network Functions (VNFs)) with additional features
such as the Data Plane Development Kit (DPDK) is promising but
requires additional research and testing.
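
The two latency components named above can be estimated with a short
calculation. This is a rough sketch under stated assumptions: the
refractive index of 1.468 is a typical value for single-mode fiber,
and the per-hop device latency is a placeholder parameter rather than
a measured figure for any router.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum, km/s
FIBER_REFRACTIVE_INDEX = 1.468      # assumed typical single-mode fiber value


def one_way_latency_ms(fiber_km: float, hops: int = 0,
                       per_hop_us: float = 0.0) -> float:
    """Estimate one-way latency in milliseconds.

    Propagation delay comes from the fiber length; device delay is
    hops * per_hop_us, where per_hop_us is a caller-supplied assumption.
    """
    propagation_s = fiber_km * FIBER_REFRACTIVE_INDEX / SPEED_OF_LIGHT_KM_S
    device_s = hops * per_hop_us * 1e-6
    return (propagation_s + device_s) * 1e3
```

Under these assumptions, propagation alone contributes a little under
5 ms per 1000 km of fiber, so inter-DC distance tends to dominate
unless per-hop device latency is large.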
The main question is how to create a simple but effective
architecture for a distributed Hadoop cluster.

There is research [MAP-REDUCE] that shows how usage of the multicast
tree could improve the speed of resource or cluster member discovery
inside the cluster as well as increase redundancy in communications
between cluster nodes.

The traditional IP-based multicast may not be appropriate because it
requires an additional control plane (IGMP, PIM) and a lot of
signalling, which is not suitable for high-performance computations
that are very sensitive to latency.

P2MP TE tunnels are more suitable as a potential solution for the
creation of multicast-based communications between NameNode as the
root and DataNodes as leaves inside the cluster. These P2MP tunnels
could be dynamically created and turned down (with no manual
intervention). Here, the PCECC comes into play with the main
objective of creating an optimal topology for each particular request
for MapReduce computation and creating P2MP tunnels with needed
parameters such as bandwidth and delay.

This solution will require the use of MPLS label-based forwarding
inside the cluster. The usage of label-based forwarding inside the DC
was proposed by Yandex [MPLS-DC]. Technically, it is already
possible because MPLS on switches is already supported by some
vendors, and MPLS also exists on Linux and Open vSwitch (OVS).

A possible framework for this task is shown in Figure 15:
                     +--------+
                     |  APP   |
                     +--------+
                          | NBI (REST API,...)
                          |
            PCEP     +----------+  REST API
     +---------+ +---|  PCECC   |----------+
skipping to change at page 43, line 4 skipping to change at line 1918
   | Job Tracker |   |   | |  |   | NameNode |
   |             |   |   | |  |   |          |
   +-------------+   |   | |  |   +----------+
        +------------------+  | +-----------+
        |            |        |             |
    |---+-----P2MP TE--+-----|-----------|  |
+----------+       +----------+      +----------+
| DataNode1|       | DataNode2|      | DataNodeN|
|TaskTraker|       |TaskTraker| .... |TaskTraker|
+----------+       +----------+      +----------+
Figure 15: Using Reliable P2MP TE-Based Multicast Delivery for
Distributed Computations (MapReduce-Hadoop)

Communication between the JobTracker, NameNode, and PCECC can be done
via REST API directly or via a cluster manager such as Mesos.

* Phase 1: Distributed cluster resource discovery occurs during this
  phase. JobTracker and NameNode should identify and find available
  DataNodes according to computing requests from the application
  (APP). NameNode should query the PCECC about available DataNodes,
  and NameNode may provide additional constraints to the PCECC such
  as topological proximity and redundancy level.

  The PCECC should analyze the topology of the distributed cluster
  and perform a constraint-based path calculation from the client
  towards the most suitable NameNodes. The PCECC should reply to
  NameNode with the list of the most suitable DataNodes and their
  resource capabilities. The topology discovery mechanism for the
  PCECC will be added later to that framework.

* Phase 2: The PCECC should create P2MP LSPs from the client towards
  those DataNodes by means of PCEP messages following the previously
  calculated path.

* Phase 3: NameNode should send this information to the client, and
  the PCECC should inform the client about the optimal P2MP path
  towards DataNodes via a PCEP message.

* Phase 4: The client sends data blocks to those DataNodes for
  writing via the created P2MP tunnel.

When this task is finished, the P2MP tunnel could be turned down.
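
The selection and tree-building steps of Phases 1 and 2 can be
sketched as follows. This is an illustrative sketch under stated
assumptions, not the PCECC procedure itself: the function names, the
delay-only link metric, and the use of a union of shortest paths as
the P2MP tree are all assumptions (a shortest-path union is a simple
stand-in and is not guaranteed to be an optimal tree). PCEP message
exchange is not modeled.

```python
import heapq
from typing import Dict, List, Set, Tuple

Graph = Dict[str, Dict[str, float]]  # node -> {neighbor: link delay in ms}


def shortest_paths(graph: Graph, root: str) -> Tuple[Dict[str, float], Dict[str, str]]:
    """Dijkstra from root; returns (delay to each node, predecessor map)."""
    dist: Dict[str, float] = {root: 0.0}
    prev: Dict[str, str] = {}
    heap: List[Tuple[float, str]] = [(0.0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def select_datanodes(graph: Graph, client: str, candidates: List[str],
                     redundancy: int, max_delay_ms: float
                     ) -> Tuple[List[str], Dict[str, str]]:
    """Phase 1 (sketch): keep the `redundancy` closest candidates within the delay bound."""
    dist, prev = shortest_paths(graph, client)
    ok = [n for n in candidates if dist.get(n, float("inf")) <= max_delay_ms]
    ok.sort(key=lambda n: (dist[n], n))  # name breaks ties deterministically
    return ok[:redundancy], prev


def p2mp_tree(client: str, leaves: List[str],
              prev: Dict[str, str]) -> Set[Tuple[str, str]]:
    """Phase 2 (sketch): directed links of the tree rooted at the client."""
    links: Set[Tuple[str, str]] = set()
    for leaf in leaves:
        node = leaf
        while node != client:
            links.add((prev[node], node))
            node = prev[node]
    return links
```

In this sketch, the redundancy level maps to the number of leaves and
the proximity constraint maps to the delay bound. A real PCECC could
also apply bandwidth constraints, which are omitted here.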
Acknowledgments

Thanks to Adrian Farrel, Aijun Wang, Robert Tao, Changjiang Yan,
Tieying Huang, Sergio Belotti, Dieter Beller, Andrey Elperin, and
Evgeniy Brodskiy for their useful comments and suggestions.

Thanks to Mach Chen and Carlos Pignataro for the RTGDIR review.
Thanks to Derrell Piper for the SECDIR review. Thanks to Sue Hares
for GENART review.

Thanks to Vishnu Pavan Beeram for being the document shepherd and Jim
Guichard for being the responsible AD.

Thanks to Roman Danyliw for the IESG review comments.

Contributors

Luyuan Fang
United States of America
Email: luyuanf@gmail.com

Chao Zhou
HPE
Email: chaozhou_us@yahoo.com

Boris Zhang
Amazon
Email: zhangyud@amazon.com

Artsiom Rachytski
Belarus
Email: arachyts@gmail.com

Anton Gulida
EPAM Systems, Inc.
Belarus
Email: Anton_Hulida@epam.com
Authors' Addresses

Zhenbin (Robin) Li
Huawei Technologies
Huawei Bld., No.156 Beiqing Rd.
Beijing
100095
China
Email: lizhenbin@huawei.com

Dhruv Dhody
Huawei Technologies
India
Email: dhruv.ietf@gmail.com

Quintin Zhao
Etheric Networks
1009 S Claremont St.
San Mateo, CA 94402
United States of America
Email: qzhao@ethericnetworks.com

King He
Tencent Holdings Ltd.
Shenzhen
China
Email: kinghe@tencent.com

Boris Khasanov
Yandex LLC
Ulitsa Lva Tolstogo 16
Moscow
Russian Federation
Email: bhassanov@yahoo.com