draft-ietf-tsvwg-ecn-l4s-id-29.original | draft-ietf-tsvwg-ecn-l4s-id-30v2.txt | |||
---|---|---|---|---|
Transport Services (tsv) K. De Schepper | Transport Services (tsv) K. De Schepper | |||
Internet-Draft Nokia Bell Labs | Internet-Draft Nokia Bell Labs | |||
Intended status: Experimental B. Briscoe, Ed. | Intended status: Experimental B. Briscoe, Ed. | |||
Expires: 2 March 2023 Independent | Expires: April 24, 2023 Independent | |||
29 August 2022 | October 21, 2022 | |||
Explicit Congestion Notification (ECN) Protocol for Very Low Queuing | Explicit Congestion Notification (ECN) Protocol for Very Low Queuing | |||
Delay (L4S) | Delay (L4S) | |||
draft-ietf-tsvwg-ecn-l4s-id-29 | draft-ietf-tsvwg-ecn-l4s-id-30 | |||
Abstract | Abstract | |||
This specification defines the protocol to be used for a new network | This specification defines the protocol to be used for a new network | |||
service called low latency, low loss and scalable throughput (L4S). | service called low latency, low loss and scalable throughput (L4S). | |||
L4S uses an Explicit Congestion Notification (ECN) scheme at the IP | L4S uses an Explicit Congestion Notification (ECN) scheme at the IP | |||
layer that is similar to the original (or 'Classic') ECN approach, | layer that is similar to the original (or 'Classic') ECN approach, | |||
except as specified within. L4S uses 'scalable' congestion control, | except as specified within. L4S uses 'scalable' congestion control, | |||
which induces much more frequent control signals from the network and | which induces much more frequent control signals from the network and | |||
it responds to them with much more fine-grained adjustments, so that | it responds to them with much more fine-grained adjustments, so that | |||
skipping to change at page 2, line 10 ¶ | skipping to change at page 2, line 10 ¶ | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on 2 March 2023. | This Internet-Draft will expire on April 24, 2023. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2022 IETF Trust and the persons identified as the | Copyright (c) 2022 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents | |||
license-info) in effect on the date of publication of this document. | (https://trustee.ietf.org/license-info) in effect on the date of | |||
Please review these documents carefully, as they describe your rights | publication of this document. Please review these documents | |||
and restrictions with respect to this document. Code Components | carefully, as they describe your rights and restrictions with respect | |||
extracted from this document must include Revised BSD License text as | to this document. Code Components extracted from this document must | |||
described in Section 4.e of the Trust Legal Provisions and are | include Simplified BSD License text as described in Section 4.e of | |||
provided without warranty as described in the Revised BSD License. | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | ||||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
1.1. Latency, Loss and Scaling Problems . . . . . . . . . . . 5 | 1.1. Latency, Loss and Scaling Problems . . . . . . . . . . . 5 | |||
1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 7 | 1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
1.3. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 9 | 1.3. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
2. L4S Packet Identification: Document Roadmap . . . . . . . . . 10 | 2. L4S Packet Identification: Document Roadmap . . . . . . . . . 10 | |||
3. Choice of L4S Packet Identifier: Requirements . . . . . . . . 11 | 3. Choice of L4S Packet Identifier: Requirements . . . . . . . . 11 | |||
4. Transport Layer Behaviour (the 'Prague Requirements') . . . . 12 | 4. Transport Layer Behaviour (the 'Prague Requirements') . . . . 12 | |||
4.1. Codepoint Setting . . . . . . . . . . . . . . . . . . . . 12 | 4.1. Codepoint Setting . . . . . . . . . . . . . . . . . . . . 12 | |||
4.2. Prerequisite Transport Feedback . . . . . . . . . . . . . 13 | 4.2. Prerequisite Transport Feedback . . . . . . . . . . . . . 12 | |||
4.3. Prerequisite Congestion Response . . . . . . . . . . . . 14 | 4.3. Prerequisite Congestion Response . . . . . . . . . . . . 13 | |||
4.3.1. Guidance on Congestion Response in the RFC Series . . 17 | 4.3.1. Guidance on Congestion Response in the RFC Series . . 16 | |||
4.4. Filtering or Smoothing of ECN Feedback . . . . . . . . . 20 | 4.4. Filtering or Smoothing of ECN Feedback . . . . . . . . . 19 | |||
5. Network Node Behaviour . . . . . . . . . . . . . . . . . . . 20 | 5. Network Node Behaviour . . . . . . . . . . . . . . . . . . . 20 | |||
5.1. Classification and Re-Marking Behaviour . . . . . . . . . 20 | 5.1. Classification and Re-Marking Behaviour . . . . . . . . . 20 | |||
5.2. The Strength of L4S CE Marking Relative to Drop . . . . . 22 | 5.2. The Strength of L4S CE Marking Relative to Drop . . . . . 21 | |||
5.3. Exception for L4S Packet Identification by Network Nodes | 5.3. Exception for L4S Packet Identification by Network Nodes | |||
with Transport-Layer Awareness . . . . . . . . . . . . . 23 | with Transport-Layer Awareness . . . . . . . . . . . . . 22 | |||
5.4. Interaction of the L4S Identifier with other | 5.4. Interaction of the L4S Identifier with other Identifiers 22 | |||
Identifiers . . . . . . . . . . . . . . . . . . . . . . . 23 | ||||
5.4.1. DualQ Examples of Other Identifiers Complementing L4S | 5.4.1. DualQ Examples of Other Identifiers Complementing L4S | |||
Identifiers . . . . . . . . . . . . . . . . . . . . . 23 | Identifiers . . . . . . . . . . . . . . . . . . . . . 23 | |||
5.4.1.1. Inclusion of Additional Traffic with L4S . . . . 23 | 5.4.1.1. Inclusion of Additional Traffic with L4S . . . . 23 | |||
5.4.1.2. Exclusion of Traffic From L4S Treatment . . . . . 26 | 5.4.1.2. Exclusion of Traffic From L4S Treatment . . . . . 25 | |||
5.4.1.3. Generalized Combination of L4S and Other | 5.4.1.3. Generalized Combination of L4S and Other | |||
Identifiers . . . . . . . . . . . . . . . . . . . . 26 | Identifiers . . . . . . . . . . . . . . . . . . . 26 | |||
5.4.2. Per-Flow Queuing Examples of Other Identifiers | 5.4.2. Per-Flow Queuing Examples of Other Identifiers | |||
Complementing L4S Identifiers . . . . . . . . . . . . 28 | Complementing L4S Identifiers . . . . . . . . . . . . 27 | |||
5.5. Limiting Packet Bursts from Links . . . . . . . . . . . . 28 | 5.5. Limiting Packet Bursts from Links . . . . . . . . . . . . 27 | |||
5.5.1. Limiting Packet Bursts from Links Fed by an L4S | 5.5.1. Limiting Packet Bursts from Links Fed by an L4S AQM . 28 | |||
AQM . . . . . . . . . . . . . . . . . . . . . . . . . 29 | ||||
5.5.2. Limiting Packet Bursts from Links Upstream of an L4S | 5.5.2. Limiting Packet Bursts from Links Upstream of an L4S | |||
AQM . . . . . . . . . . . . . . . . . . . . . . . . . 29 | AQM . . . . . . . . . . . . . . . . . . . . . . . . . 28 | |||
6. Behaviour of Tunnels and Encapsulations . . . . . . . . . . . 29 | 6. Behaviour of Tunnels and Encapsulations . . . . . . . . . . . 29 | |||
6.1. No Change to ECN Tunnels and Encapsulations in General . 29 | 6.1. No Change to ECN Tunnels and Encapsulations in General . 29 | |||
6.2. VPN Behaviour to Avoid Limitations of Anti-Replay . . . . 30 | 6.2. VPN Behaviour to Avoid Limitations of Anti-Replay . . . . 29 | |||
7. L4S Experiments . . . . . . . . . . . . . . . . . . . . . . . 31 | 7. L4S Experiments . . . . . . . . . . . . . . . . . . . . . . . 30 | |||
7.1. Open Questions . . . . . . . . . . . . . . . . . . . . . 32 | 7.1. Open Questions . . . . . . . . . . . . . . . . . . . . . 31 | |||
7.2. Open Issues . . . . . . . . . . . . . . . . . . . . . . . 33 | 7.2. Open Issues . . . . . . . . . . . . . . . . . . . . . . . 32 | |||
7.3. Future Potential . . . . . . . . . . . . . . . . . . . . 33 | 7.3. Future Potential . . . . . . . . . . . . . . . . . . . . 33 | |||
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 34 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 33 | |||
9. Security Considerations . . . . . . . . . . . . . . . . . . . 35 | 9. Security Considerations . . . . . . . . . . . . . . . . . . . 34 | |||
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 35 | 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 35 | |||
10.1. Normative References . . . . . . . . . . . . . . . . . . 35 | 10.1. Normative References . . . . . . . . . . . . . . . . . . 35 | |||
10.2. Informative References . . . . . . . . . . . . . . . . . 36 | 10.2. Informative References . . . . . . . . . . . . . . . . . 35 | |||
Appendix A. Rationale for the 'Prague L4S Requirements' . . . . 46 | Appendix A. Rationale for the 'Prague L4S Requirements' . . . . 44 | |||
A.1. Rationale for the Requirements for Scalable Transport | A.1. Rationale for the Requirements for Scalable Transport | |||
Protocols . . . . . . . . . . . . . . . . . . . . . . . . 47 | Protocols . . . . . . . . . . . . . . . . . . . . . . . . 45 | |||
A.1.1. Use of L4S Packet Identifier . . . . . . . . . . . . 47 | A.1.1. Use of L4S Packet Identifier . . . . . . . . . . . . 45 | |||
A.1.2. Accurate ECN Feedback . . . . . . . . . . . . . . . . 47 | A.1.2. Accurate ECN Feedback . . . . . . . . . . . . . . . . 45 | |||
A.1.3. Capable of Replacement by Classic Congestion | A.1.3. Capable of Replacement by Classic Congestion Control 46 | |||
Control . . . . . . . . . . . . . . . . . . . . . . . 47 | ||||
A.1.4. Fall back to Classic Congestion Control on Packet | A.1.4. Fall back to Classic Congestion Control on Packet | |||
Loss . . . . . . . . . . . . . . . . . . . . . . . . 48 | Loss . . . . . . . . . . . . . . . . . . . . . . . . 46 | |||
A.1.5. Coexistence with Classic Congestion Control at Classic | A.1.5. Coexistence with Classic Congestion Control at | |||
ECN bottlenecks . . . . . . . . . . . . . . . . . . . 49 | Classic ECN bottlenecks . . . . . . . . . . . . . . . 47 | |||
A.1.6. Reduce RTT dependence . . . . . . . . . . . . . . . . 52 | A.1.6. Reduce RTT dependence . . . . . . . . . . . . . . . . 50 | |||
A.1.7. Scaling down to fractional congestion windows . . . . 53 | A.1.7. Scaling down to fractional congestion windows . . . . 51 | |||
A.1.8. Measuring Reordering Tolerance in Time Units . . . . 54 | A.1.8. Measuring Reordering Tolerance in Time Units . . . . 53 | |||
A.2. Scalable Transport Protocol Optimizations . . . . . . . . 57 | A.2. Scalable Transport Protocol Optimizations . . . . . . . . 55 | |||
A.2.1. Setting ECT in Control Packets and Retransmissions . 57 | A.2.1. Setting ECT in Control Packets and Retransmissions . 55 | |||
A.2.2. Faster than Additive Increase . . . . . . . . . . . . 58 | A.2.2. Faster than Additive Increase . . . . . . . . . . . . 56 | |||
A.2.3. Faster Convergence at Flow Start . . . . . . . . . . 58 | A.2.3. Faster Convergence at Flow Start . . . . . . . . . . 56 | |||
Appendix B. Compromises in the Choice of L4S Identifier . . . . 59 | Appendix B. Compromises in the Choice of L4S Identifier . . . . 57 | |||
Appendix C. Potential Competing Uses for the ECT(1) Codepoint . 64 | Appendix C. Potential Competing Uses for the ECT(1) Codepoint . 62 | |||
C.1. Integrity of Congestion Feedback . . . . . . . . . . . . 64 | C.1. Integrity of Congestion Feedback . . . . . . . . . . . . 62 | |||
C.2. Notification of Less Severe Congestion than CE . . . . . 65 | C.2. Notification of Less Severe Congestion than CE . . . . . 63 | |||
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 66 | Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 64 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 66 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 65 | |||
1. Introduction | 1. Introduction | |||
This experimental track specification defines the protocol to be used | This experimental track specification defines the protocol to be used | |||
for a new network service called low latency, low loss and scalable | for a new network service called low latency, low loss and scalable | |||
throughput (L4S). L4S uses an Explicit Congestion Notification (ECN) | throughput (L4S). L4S uses an Explicit Congestion Notification (ECN) | |||
scheme at the IP layer with the same set of codepoint transitions as | scheme at the IP layer with the same set of codepoint transitions as | |||
the original (or 'Classic') Explicit Congestion Notification | the original (or 'Classic') Explicit Congestion Notification | |||
(ECN [RFC3168]). RFC 3168 required an ECN mark to be equivalent to a | (ECN [RFC3168]). RFC 3168 required an ECN mark to be equivalent to a | |||
drop, both when applied in the network and when responded to by a | drop, both when applied in the network and when responded to by a | |||
skipping to change at page 4, line 26 ¶ | skipping to change at page 4, line 19 ¶ | |||
transport response to each mark is reduced and smoothed relative to | transport response to each mark is reduced and smoothed relative to | |||
that for drop. The two changes counterbalance each other so that the | that for drop. The two changes counterbalance each other so that the | |||
throughput of an L4S flow will be roughly the same as a comparable | throughput of an L4S flow will be roughly the same as a comparable | |||
non-L4S flow under the same conditions. Nonetheless, the much more | non-L4S flow under the same conditions. Nonetheless, the much more | |||
frequent ECN control signals and the finer responses to these signals | frequent ECN control signals and the finer responses to these signals | |||
result in very low queuing delay without compromising link | result in very low queuing delay without compromising link | |||
utilization, and this low delay can be maintained during high load. | utilization, and this low delay can be maintained during high load. | |||
For instance, queuing delay under heavy and highly varying load with | For instance, queuing delay under heavy and highly varying load with | |||
the example DCTCP/DualQ solution described below on a DSL or Ethernet | the example DCTCP/DualQ solution described below on a DSL or Ethernet | |||
link is sub-millisecond on average and roughly 1 to 2 milliseconds at | link is sub-millisecond on average and roughly 1 to 2 milliseconds at | |||
the 99th percentile without losing link utilization [DualPI2Linux], | the 99th percentile without losing link utilization [L4Seval22] | |||
[DCttH19]. Note that the queuing delay while waiting to acquire a | [DualPI2Linux]. Note that the queuing delay while waiting to acquire | |||
shared medium such as wireless has to be added to the above. It is a | a shared medium such as wireless has to be added to the above. It is | |||
different issue that needs to be addressed, but separately (see | a different issue that needs to be addressed, but separately (see | |||
section 6.3 of the L4S architecture [I-D.ietf-tsvwg-l4s-arch]). | section 6.3 of the L4S architecture [I-D.ietf-tsvwg-l4s-arch]). | |||
L4S relies on 'scalable' congestion controls for these delay | L4S relies on 'scalable' congestion controls for these delay | |||
properties and for preserving low delay as flow rate scales, hence | properties and for preserving low delay as flow rate scales, hence | |||
the name. The congestion control used in Data Center TCP (DCTCP) is | the name. The congestion control used in Data Center TCP (DCTCP) is | |||
an example of a scalable congestion control, but DCTCP is applicable | an example of a scalable congestion control, but DCTCP is applicable | |||
solely to controlled environments like data centres [RFC8257], | solely to controlled environments like data centres [RFC8257], | |||
because it is too aggressive to co-exist with existing TCP-Reno- | because it is too aggressive to co-exist with existing TCP-Reno- | |||
friendly traffic. The DualQ Coupled AQM, which is defined in a | friendly traffic. The DualQ Coupled AQM, which is defined in a | |||
complementary experimental | complementary experimental | |||
skipping to change at page 7, line 37 ¶ | skipping to change at page 7, line 25 ¶ | |||
sharing the same queue, which is why they have been confined to | sharing the same queue, which is why they have been confined to | |||
private data centres or research testbeds. | private data centres or research testbeds. | |||
It turns out that these scalable congestion control algorithms that | It turns out that these scalable congestion control algorithms that | |||
solve the latency problem can also solve the scalability problem of | solve the latency problem can also solve the scalability problem of | |||
Classic congestion controls. The finer sawteeth in the congestion | Classic congestion controls. The finer sawteeth in the congestion | |||
window have low amplitude, so they cause very little queuing delay | window have low amplitude, so they cause very little queuing delay | |||
variation and the average time to recover from one congestion signal | variation and the average time to recover from one congestion signal | |||
to the next (the average duration of each sawtooth) remains | to the next (the average duration of each sawtooth) remains | |||
invariant, which maintains constant tight control as flow-rate | invariant, which maintains constant tight control as flow-rate | |||
scales. A background paper [DCttH19] gives the full explanation of | scales. A background paper [L4Seval22] gives the full explanation of | |||
why the design solves both the latency and the scaling problems, both | why the design solves both the latency and the scaling problems, both | |||
in plain English and in more precise mathematical form. The | in plain English and in more precise mathematical form. The | |||
explanation is summarized without the mathematics in Section 4 of the | explanation is summarized without the mathematics in Section 4 of the | |||
L4S architecture [I-D.ietf-tsvwg-l4s-arch]. | L4S architecture [I-D.ietf-tsvwg-l4s-arch]. | |||
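As a rough numerical illustration of the invariance described above, the following sketch compares the average number of round trips between congestion signals as the congestion window scales. The steady-state relations are assumptions that do not appear in this text: W = sqrt(1.5/p) is the textbook Reno approximation, and W = 2/p is the relation commonly quoted for DCTCP-like scalable controls. All function names are hypothetical.

```python
# Sketch only: the window/probability relations below are assumed textbook
# approximations, not formulas taken from this draft.

def reno_p(window):
    """Signal probability giving a Reno-like flow this average window
    (assumed model: W = sqrt(1.5 / p))."""
    return 1.5 / window ** 2

def scalable_p(window):
    """Signal probability giving a scalable flow this average window
    (assumed model: W = 2 / p)."""
    return 2.0 / window

def rtts_between_signals(window, p):
    """Average round trips from one congestion signal to the next.
    A flow sends `window` packets per round trip, each signalled with
    probability p, so it sees window * p signals per round trip."""
    return 1.0 / (window * p)

if __name__ == "__main__":
    for window in (10, 100, 1000):
        classic = rtts_between_signals(window, reno_p(window))
        scalable = rtts_between_signals(window, scalable_p(window))
        print(f"W={window:5d}  Reno-like: {classic:8.2f} RTTs  "
              f"scalable: {scalable:4.2f} RTTs")
```

Under these assumed models the Reno-like recovery time grows in proportion to the window, whereas the scalable one stays constant, which is the sense in which control remains tight as flow rate scales.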
1.2. Terminology | 1.2. Terminology | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
"OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in BCP | |||
skipping to change at page 10, line 9 ¶ | skipping to change at page 9, line 41 ¶ | |||
applicable for the unicast, multicast and anycast forwarding modes. | applicable for the unicast, multicast and anycast forwarding modes. | |||
The L4S identifier is an orthogonal packet classification to the | The L4S identifier is an orthogonal packet classification to the | |||
Differentiated Services Code Point (DSCP) [RFC2474]. Section 5.4 | Differentiated Services Code Point (DSCP) [RFC2474]. Section 5.4 | |||
explains what this means in practice. | explains what this means in practice. | |||
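The orthogonality mentioned above can be pictured with a minimal sketch. It relies on the field layout defined in RFC 2474 and RFC 3168, where the DSCP occupies the upper six bits of the former IPv4 TOS / IPv6 Traffic Class octet and the ECN field the lower two. The function names are hypothetical, and the example values (DSCP 46, ECN 0b01) are chosen only for illustration.

```python
# Sketch only: helper names are hypothetical; the bit layout follows
# RFC 2474 (DSCP = upper 6 bits) and RFC 3168 (ECN = lower 2 bits).

def split_traffic_class(octet):
    """Return (dscp, ecn) from the 8-bit TOS / Traffic Class octet."""
    return octet >> 2, octet & 0b11

def set_dscp(octet, dscp):
    """Rewrite the DSCP while leaving the two ECN bits untouched."""
    return ((dscp & 0x3F) << 2) | (octet & 0b11)

if __name__ == "__main__":
    octet = (46 << 2) | 0b01            # illustrative DSCP 46, ECN 0b01
    print(split_traffic_class(octet))
    remarked = set_dscp(octet, 0)       # a node re-marks the DSCP to 0
    print(split_traffic_class(remarked))
```

Because the two fields occupy disjoint bits, a node that re-marks the DSCP need not disturb the ECN field, and vice versa.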
This document is intended for experimental status, so it does not | This document is intended for experimental status, so it does not | |||
update any standards track RFCs. Therefore, it depends on [RFC8311], | update any standards track RFCs. Therefore, it depends on [RFC8311], | |||
which is a standards track specification that: | which is a standards track specification that: | |||
* updates the ECN proposed standard [RFC3168] to allow experimental | o updates the ECN proposed standard [RFC3168] to allow experimental | |||
track RFCs to relax the requirement that an ECN mark must be | track RFCs to relax the requirement that an ECN mark must be | |||
equivalent to a drop (when the network applies markings and/or | equivalent to a drop (when the network applies markings and/or | |||
when the sender responds to them). For instance, in the ABE | when the sender responds to them). For instance, in the ABE | |||
experiment [RFC8511] this permits a sender to respond less to ECN | experiment [RFC8511] this permits a sender to respond less to ECN | |||
marks than to drops; | marks than to drops; | |||
* changes the status of the experimental ECN nonce [RFC3540] to | o changes the status of the experimental ECN nonce [RFC3540] to | |||
historic; | historic; | |||
* makes consequent updates to the following additional proposed | o makes consequent updates to the following additional proposed | |||
standard RFCs to reflect the above two bullets: | standard RFCs to reflect the above two bullets: | |||
- ECN for RTP [RFC6679]; | * ECN for RTP [RFC6679]; | |||
- the congestion control specifications of various DCCP | * the congestion control specifications of various DCCP | |||
congestion control identifier (CCID) profiles [RFC4341], | congestion control identifier (CCID) profiles [RFC4341], | |||
[RFC4342], [RFC5622]. | [RFC4342], [RFC5622]. | |||
This document is about identifiers that are used for interoperation | This document is about identifiers that are used for interoperation | |||
between hosts and networks. So the audience is broad, covering | between hosts and networks. So the audience is broad, covering | |||
developers of host transports and network AQMs, as well as covering | developers of host transports and network AQMs, as well as covering | |||
how operators might wish to combine various identifiers, which would | how operators might wish to combine various identifiers, which would | |||
require flexibility from equipment developers. | require flexibility from equipment developers. | |||
2. L4S Packet Identification: Document Roadmap | 2. L4S Packet Identification: Document Roadmap | |||
skipping to change at page 11, line 50 ¶ | skipping to change at page 11, line 31 ¶ | |||
are covered in the L4S architecture [I-D.ietf-tsvwg-l4s-arch]. | are covered in the L4S architecture [I-D.ietf-tsvwg-l4s-arch]. | |||
3. Choice of L4S Packet Identifier: Requirements | 3. Choice of L4S Packet Identifier: Requirements | |||
This section briefly records the process that led to the chosen | This section briefly records the process that led to the chosen | |||
L4S identifier. | L4S identifier. | |||
The identifier for packets using the Low Latency, Low Loss, Scalable | The identifier for packets using the Low Latency, Low Loss, Scalable | |||
throughput (L4S) service needs to meet the following requirements: | throughput (L4S) service needs to meet the following requirements: | |||
* it SHOULD survive end-to-end between source and destination end- | o it SHOULD survive end-to-end between source and destination end- | |||
points: across the boundary between host and network, between | points: across the boundary between host and network, between | |||
interconnected networks, and through middleboxes; | interconnected networks, and through middleboxes; | |||
* it SHOULD be visible at the IP layer; | o it SHOULD be visible at the IP layer; | |||
* it SHOULD be common to IPv4 and IPv6 and transport-agnostic; | o it SHOULD be common to IPv4 and IPv6 and transport-agnostic; | |||
* it SHOULD be incrementally deployable; | o it SHOULD be incrementally deployable; | |||
* it SHOULD enable an AQM to classify packets encapsulated by outer | o it SHOULD enable an AQM to classify packets encapsulated by outer | |||
IP or lower-layer headers; | IP or lower-layer headers; | |||
* it SHOULD consume minimal extra codepoints; | o it SHOULD consume minimal extra codepoints; | |||
* it SHOULD be consistent on all the packets of a transport layer | o it SHOULD be consistent on all the packets of a transport layer | |||
flow, so that some packets of a flow are not served by a different | flow, so that some packets of a flow are not served by a different | |||
queue to others. | queue to others. | |||
Whether the identifier would be recoverable if the experiment failed | Whether the identifier would be recoverable if the experiment failed | |||
is a factor that could be taken into account. However, this has not | is a factor that could be taken into account. However, this has not | |||
been made a requirement, because that would favour schemes that would | been made a requirement, because that would favour schemes that would | |||
be easier to fail, rather than those more likely to succeed. | be easier to fail, rather than those more likely to succeed. | |||
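To make the "visible at the IP layer" and "common to IPv4 and IPv6" requirements concrete, here is a minimal sketch of how a classifier could read the 2-bit ECN field from either header type. It assumes, based on parts of the draft not shown in this excerpt, that ECT(1) is the chosen identifier and that CE-marked packets are also given L4S treatment. The function names are hypothetical, and the sketch ignores the transport-aware exception of Section 5.3.

```python
# Sketch only: function names are hypothetical. Codepoint values are from
# RFC 3168; treating ECT(1) and CE as L4S is an assumption drawn from
# sections of the draft outside this excerpt.

NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11

def ecn_field(header: bytes) -> int:
    """Extract the ECN field from the start of a raw IPv4 or IPv6 header."""
    version = header[0] >> 4
    if version == 4:
        # IPv4: byte 1 is the former TOS octet; ECN is its low two bits.
        return header[1] & 0b11
    if version == 6:
        # IPv6: the Traffic Class straddles bytes 0 and 1; its low two
        # bits sit in bits 5..4 of byte 1.
        return (header[1] >> 4) & 0b11
    raise ValueError("not an IPv4 or IPv6 header")

def is_l4s(header: bytes) -> bool:
    """Classify a packet as L4S purely from its IP-layer ECN field."""
    return ecn_field(header) in (ECT1, CE)

if __name__ == "__main__":
    ipv4_ect1 = bytes([0x45, 0x01]) + bytes(18)   # IPv4, ECN = ECT(1)
    ipv6_ect0 = bytes([0x60, 0x20]) + bytes(38)   # IPv6, ECN = ECT(0)
    print(is_l4s(ipv4_ect1), is_l4s(ipv6_ect0))
```

The same two bits are read in both cases without touching any transport header, which is what lets the identifier be transport-agnostic.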
It is recognized that any choice of identifier is unlikely to satisfy | It is recognized that any choice of identifier is unlikely to satisfy | |||
all these requirements, particularly given the limited space left in | all these requirements, particularly given the limited space left in | |||
skipping to change at page 18, line 19 ¶ | skipping to change at page 17, line 36 ¶ | |||
which has remained by far the most prevalent case since the ECN | which has remained by far the most prevalent case since the ECN | |||
RFC was published in 2001. | RFC was published in 2001. | |||
* L4S: Coexistence is not in doubt if the bottleneck supports | * L4S: Coexistence is not in doubt if the bottleneck supports | |||
L4S. | L4S. | |||
* Classic ECN [RFC3168]: The compromises centre around cases | * Classic ECN [RFC3168]: The compromises centre around cases | |||
where the bottleneck supports Classic ECN but not L4S. But it | where the bottleneck supports Classic ECN but not L4S. But it | |||
depends on which sub-case: | depends on which sub-case: | |||
- Shared Queue with Classic ECN: At the time of writing, the | + Shared Queue with Classic ECN: At the time of writing, the | |||
members of the Transport Area Working Group (tsvwg) are not aware of any | members of the Transport Area Working Group (tsvwg) are not aware of any | |||
current deployments of single-queue Classic ECN bottlenecks | current deployments of single-queue Classic ECN bottlenecks | |||
in the Internet. Nonetheless, at the scale of the Internet, | in the Internet. Nonetheless, at the scale of the Internet, | |||
rarity need not imply small numbers, nor that there will be | rarity need not imply small numbers, nor that there will be | |||
rarity in the future. | rarity in the future. | |||
- Per-Flow-queues with Classic ECN: Most AQMs with per-flow- | + Per-Flow-queues with Classic ECN: Most AQMs with per-flow- | |||
queuing (FQ) deployed from 2012 onwards had Classic ECN | queuing (FQ) deployed from 2012 onwards had Classic ECN | |||
enabled by default, specifically FQ-CoDel [RFC8290] and | enabled by default, specifically FQ-CoDel [RFC8290] and | |||
COBALT [COBALT]. But the compromises only apply to the | COBALT [COBALT]. But the compromises only apply to the | |||
second of two further sub-cases: | second of two further sub-cases: | |||
o With per-flow-queuing, co-existence between Classic and | - With per-flow-queuing, co-existence between Classic and | |||
L4S flows is not normally a problem, because different | L4S flows is not normally a problem, because different | |||
flows are not meant to be in the same queue | flows are not meant to be in the same queue | |||
(BCP 124 [RFC4774] did not foresee the introduction of | (BCP 124 [RFC4774] did not foresee the introduction of | |||
per-flow-queuing, which appeared as a potential isolation | per-flow-queuing, which appeared as a potential isolation | |||
technique some eight years after the BCP was published). | technique some eight years after the BCP was published). | |||
o However, the isolation between L4S and Classic flows is | - However, the isolation between L4S and Classic flows is | |||
not perfect in cases where the hashes of flow IDs collide | not perfect in cases where the hashes of flow IDs collide | |||
or where multiple flows within a layer-3 VPN are | or where multiple flows within a layer-3 VPN are | |||
encapsulated within one flow ID. | encapsulated within one flow ID. | |||
To summarize, the coexistence problem is confined to cases of | To summarize, the coexistence problem is confined to cases of | |||
imperfect flow isolation in an FQ, or in potential cases where a | imperfect flow isolation in an FQ, or in potential cases where a | |||
Classic ECN AQM has been deployed in a shared queue (see the L4S | Classic ECN AQM has been deployed in a shared queue (see the L4S | |||
operational guidance [I-D.ietf-tsvwg-l4sops] for further details | operational guidance [I-D.ietf-tsvwg-l4sops] for further details | |||
including recent surveys attempting to quantify prevalence). | including recent surveys attempting to quantify prevalence). | |||
Further, if one of these cases does occur, the coexistence problem | Further, if one of these cases does occur, the coexistence problem | |||
skipping to change at page 19, line 26 ¶ | skipping to change at page 18, line 44 ¶ | |||
with a base RTT below 100 ms, and falls below ~5:1 for base RTTs | with a base RTT below 100 ms, and falls below ~5:1 for base RTTs | |||
below 20 ms. | below 20 ms. | |||
These throughput ratios can clearly fall well outside current RFC | These throughput ratios can clearly fall well outside current RFC | |||
guidance on coexistence. However, the tendency towards leaving a | guidance on coexistence. However, the tendency towards leaving a | |||
greater share for Classic flows at lower link rate and the very | greater share for Classic flows at lower link rate and the very | |||
limited prevalence of the conditions necessary for harm to occur led | limited prevalence of the conditions necessary for harm to occur led | |||
to the possibility of allowing the RFC requirements to be | to the possibility of allowing the RFC requirements to be | |||
compromised, albeit briefly: | compromised, albeit briefly: | |||
* The recommended approach is still to detect and adapt to a Classic | o The recommended approach is still to detect and adapt to a Classic | |||
ECN AQM in real time, which is fully consistent with all the RFCs | ECN AQM in real time, which is fully consistent with all the RFCs | |||
on coexistence. In other words, the "SHOULD"s in the third bullet | on coexistence. In other words, the "SHOULD"s in the third bullet | |||
of Section 4.3 above expect the sender to implement something | of Section 4.3 above expect the sender to implement something | |||
similar to the proof of concept code that detects the presence of | similar to the proof of concept code that detects the presence of | |||
a Classic ECN AQM and falls back to a Classic congestion response | a Classic ECN AQM and falls back to a Classic congestion response | |||
within a few round trips [ecn-fallback]. However, although this | within a few round trips [ecn-fallback]. However, although this | |||
code reliably detects a Classic ECN AQM, the current code can also | code reliably detects a Classic ECN AQM, the current code can also | |||
wrongly categorize an L4S AQM as Classic, most often in cases when | wrongly categorize an L4S AQM as Classic, most often in cases when | |||
link rate is low or RTT is high. Although this is the safe way | link rate is low or RTT is high. Although this is the safe way | |||
round, and although implementers are expected to be able to | round, and although implementers are expected to be able to | |||
improve on this proof of concept, concerns have been raised that | improve on this proof of concept, concerns have been raised that | |||
implementers might lose faith in such detection and disable it. | implementers might lose faith in such detection and disable it. | |||
* Therefore the third bullet in Section 4.3 above allows a | o Therefore the third bullet in Section 4.3 above allows a | |||
compromise where coexistence could diverge from the requirements | compromise where coexistence could diverge from the requirements | |||
in the RFC Series briefly, but mandatory monitoring is required, | in the RFC Series briefly, but mandatory monitoring is required, | |||
in order to detect such cases and trigger remedial action. This | in order to detect such cases and trigger remedial action. This | |||
approach tolerates a brief divergence from the RFCs given the | approach tolerates a brief divergence from the RFCs given the | |||
likely low prevalence and given harm here means a flow progresses | likely low prevalence and given harm here means a flow progresses | |||
more slowly than otherwise, but it does progress. The L4S | more slowly than otherwise, but it does progress. The L4S | |||
operational guidance [I-D.ietf-tsvwg-l4sops] outlines a range of | operational guidance [I-D.ietf-tsvwg-l4sops] outlines a range of | |||
example remedial actions that include alterations either to the | example remedial actions that include alterations either to the | |||
sender or to the network. However, the final normative | sender or to the network. However, the final normative | |||
requirement in the third bullet of Section 4.3 above places | requirement in the third bullet of Section 4.3 above places | |||
skipping to change at page 20, line 43 ¶ | skipping to change at page 20, line 13 ¶ | |||
out variations in any particular way. Implementers are encouraged to | out variations in any particular way. Implementers are encouraged to | |||
openly publish the approach they take to smoothing, and the results | openly publish the approach they take to smoothing, and the results | |||
and experience they gain during the L4S experiment. | and experience they gain during the L4S experiment. | |||
5. Network Node Behaviour | 5. Network Node Behaviour | |||
5.1. Classification and Re-Marking Behaviour | 5.1. Classification and Re-Marking Behaviour | |||
A network node that implements the L4S service: | A network node that implements the L4S service: | |||
* MUST classify arriving ECT(1) packets for L4S treatment, unless | o MUST classify arriving ECT(1) packets for L4S treatment, unless | |||
overridden by another classifier (e.g., see Section 5.4.1.2); | overridden by another classifier (e.g., see Section 5.4.1.2); | |||
* MUST classify arriving CE packets for L4S treatment as well, | o MUST classify arriving CE packets for L4S treatment as well, | |||
unless overridden by another classifier or unless the exception | unless overridden by another classifier or unless the exception | |||
referred to next applies; | referred to next applies; | |||
CE packets might have originated as ECT(1) or ECT(0), but the | CE packets might have originated as ECT(1) or ECT(0), but the | |||
above rule to classify them as if they originated as ECT(1) is the | above rule to classify them as if they originated as ECT(1) is the | |||
safe choice (see Appendix B for rationale). The exception is | safe choice (see Appendix B for rationale). The exception is | |||
where some flow-aware in-network mechanism happens to be available | where some flow-aware in-network mechanism happens to be available | |||
for distinguishing CE packets that originated as ECT(0), as | for distinguishing CE packets that originated as ECT(0), as | |||
described in Section 5.3, but there is no implication that such a | described in Section 5.3, but there is no implication that such a | |||
mechanism is necessary. | mechanism is necessary. | |||
An L4S AQM treatment follows similar codepoint transition rules to | An L4S AQM treatment follows similar codepoint transition rules to | |||
those in RFC 3168. Specifically, the ECT(1) codepoint MUST NOT be | those in RFC 3168. Specifically, the ECT(1) codepoint MUST NOT be | |||
skipping to change at page 24, line 5 ¶ | skipping to change at page 23, line 19 ¶ | |||
In a typical case for the public Internet a network element that | In a typical case for the public Internet a network element that | |||
implements L4S in a shared queue might want to classify some low-rate | implements L4S in a shared queue might want to classify some low-rate | |||
but unresponsive traffic (e.g. DNS, LDAP, NTP, voice, game sync | but unresponsive traffic (e.g. DNS, LDAP, NTP, voice, game sync | |||
packets) into the low latency queue to mix with L4S traffic. In this | packets) into the low latency queue to mix with L4S traffic. In this | |||
case it would not be appropriate to call the queue an L4S queue, | case it would not be appropriate to call the queue an L4S queue, | |||
because it is shared by L4S and non-L4S traffic. Instead, it will be | because it is shared by L4S and non-L4S traffic. Instead, it will be | |||
called the low latency or L queue. The L queue then offers two | called the low latency or L queue. The L queue then offers two | |||
different treatments: | different treatments: | |||
* The L4S treatment, which is a combination of the L4S AQM treatment | o The L4S treatment, which is a combination of the L4S AQM treatment | |||
and a priority scheduling treatment; | and a priority scheduling treatment; | |||
* The low latency treatment, which is solely the priority scheduling | o The low latency treatment, which is solely the priority scheduling | |||
treatment, without ECN-marking by the AQM. | treatment, without ECN-marking by the AQM. | |||
To identify packets for just the scheduling treatment, it would be | To identify packets for just the scheduling treatment, it would be | |||
inappropriate to use the L4S ECT(1) identifier, because such traffic | inappropriate to use the L4S ECT(1) identifier, because such traffic | |||
is unresponsive to ECN marking. Examples of relevant non-ECN | is unresponsive to ECN marking. Examples of relevant non-ECN | |||
identifiers are: | identifiers are: | |||
* address ranges of specific applications or hosts configured to be, | o address ranges of specific applications or hosts configured to be, | |||
or known to be, safe, e.g. hard-coded IoT devices sending low | or known to be, safe, e.g. hard-coded IoT devices sending low | |||
intensity traffic; | intensity traffic; | |||
* certain low data-volume applications or protocols (e.g. ARP, DNS); | o certain low data-volume applications or protocols (e.g. ARP, DNS); | |||
* specific Diffserv codepoints that indicate traffic with limited | o specific Diffserv codepoints that indicate traffic with limited | |||
burstiness such as the EF (Expedited Forwarding [RFC3246]), Voice- | burstiness such as the EF (Expedited Forwarding [RFC3246]), Voice- | |||
Admit [RFC5865] or proposed NQB (Non-Queue- | Admit [RFC5865] or proposed NQB (Non-Queue- | |||
Building [I-D.ietf-tsvwg-nqb]) service classes or equivalent | Building [I-D.ietf-tsvwg-nqb]) service classes or equivalent | |||
local-use DSCPs (see [I-D.briscoe-tsvwg-l4s-diffserv]). | local-use DSCPs (see [I-D.briscoe-tsvwg-l4s-diffserv]). | |||
To be clear, classifying into the L queue based on application layer | To be clear, classifying into the L queue based on application layer | |||
identification (e.g. DNS) is an example of a local optimization, not | identification (e.g. DNS) is an example of a local optimization, not | |||
a recommendation. Applications will not be able to rely on such | a recommendation. Applications will not be able to rely on such | |||
unsolicited optimization. A more reliable approach would be for the | unsolicited optimization. A more reliable approach would be for the | |||
sender to set an appropriate IP layer identifier, such as one of the | sender to set an appropriate IP layer identifier, such as one of the | |||
skipping to change at page 27, line 45 ¶ | skipping to change at page 27, line 9 ¶ | |||
In summary, there are numerous ways in which the L4S ECN identifier | In summary, there are numerous ways in which the L4S ECN identifier | |||
(ECT(1) and CE) could be combined with other identifiers to achieve | (ECT(1) and CE) could be combined with other identifiers to achieve | |||
particular objectives. The following categorization articulates | particular objectives. The following categorization articulates | |||
those that are valid, but it is not necessarily exhaustive. Those | those that are valid, but it is not necessarily exhaustive. Those | |||
tagged 'Recommended-standard-use' could be set by the sending host or | tagged 'Recommended-standard-use' could be set by the sending host or | |||
a network. Those tagged 'Local-use' would only be set by a network: | a network. Those tagged 'Local-use' would only be set by a network: | |||
1. Identifiers Complementing the L4S Identifier | 1. Identifiers Complementing the L4S Identifier | |||
a. Including More Traffic in the L Queue | A. Including More Traffic in the L Queue | |||
(Could use Recommended-standard-use or Local-use identifiers) | (Could use Recommended-standard-use or Local-use identifiers) | |||
b. Excluding Certain Traffic from the L Queue | B. Excluding Certain Traffic from the L Queue | |||
(Local-use only) | (Local-use only) | |||
2. Identifiers to place L4S classification in a PHB Hierarchy | 2. Identifiers to place L4S classification in a PHB Hierarchy | |||
(Could use Recommended-standard-use or Local-use identifiers) | (Could use Recommended-standard-use or Local-use identifiers) | |||
a. PHBs Before L4S ECN Classification | A. PHBs Before L4S ECN Classification | |||
b. PHBs After L4S ECN Classification | B. PHBs After L4S ECN Classification | |||
5.4.2. Per-Flow Queuing Examples of Other Identifiers Complementing L4S | 5.4.2. Per-Flow Queuing Examples of Other Identifiers Complementing L4S | |||
Identifiers | Identifiers | |||
At a node with per-flow queueing (e.g. FQ-CoDel [RFC8290]), the L4S | At a node with per-flow queueing (e.g. FQ-CoDel [RFC8290]), the L4S | |||
identifier could complement the Layer-4 flow ID as a further level of | identifier could complement the Layer-4 flow ID as a further level of | |||
flow granularity (i.e. Not-ECT and ECT(0) queued separately from | flow granularity (i.e. Not-ECT and ECT(0) queued separately from | |||
ECT(1) and CE packets). "Risk of reordering Classic CE packets" in | ECT(1) and CE packets). "Risk of reordering Classic CE packets" in | |||
Appendix B discusses the resulting ambiguity if packets originally | Appendix B discusses the resulting ambiguity if packets originally | |||
marked ECT(0) are marked CE by an upstream AQM before they arrive at | marked ECT(0) are marked CE by an upstream AQM before they arrive at | |||
skipping to change at page 31, line 5 ¶ | skipping to change at page 30, line 15 ¶ | |||
If a VPN carrying traffic participating in the L4S experiment | If a VPN carrying traffic participating in the L4S experiment | |||
experiences inappropriate replay detection, the foremost remedy would | experiences inappropriate replay detection, the foremost remedy would | |||
be to ensure that the egress is configured to comply with the above | be to ensure that the egress is configured to comply with the above | |||
window-sizing requirements. | window-sizing requirements. | |||
If an implementation of a VPN egress does not support a sufficiently | If an implementation of a VPN egress does not support a sufficiently | |||
large anti-replay window, e.g. due to hardware limitations, one of | large anti-replay window, e.g. due to hardware limitations, one of | |||
the temporary alternatives listed in order of preference below might | the temporary alternatives listed in order of preference below might | |||
be feasible instead: | be feasible instead: | |||
* If the VPN can be configured to classify packets into different | o If the VPN can be configured to classify packets into different | |||
SAs indexed by DSCP, apply the appropriate locally defined DSCPs | SAs indexed by DSCP, apply the appropriate locally defined DSCPs | |||
to Classic and L4S packets. The DSCPs could be applied by the | to Classic and L4S packets. The DSCPs could be applied by the | |||
network (based on the least significant bit of the ECN field), or | network (based on the least significant bit of the ECN field), or | |||
by the sending host. Such DSCPs would only need to survive as far | by the sending host. Such DSCPs would only need to survive as far | |||
as the VPN ingress. | as the VPN ingress. | |||
* If the above is not possible and it is necessary to use L4S, | o If the above is not possible and it is necessary to use L4S, | |||
either of the following might be appropriate as a last resort: | either of the following might be appropriate as a last resort: | |||
- disable anti-replay protection at the VPN egress, after | * disable anti-replay protection at the VPN egress, after | |||
considering the security implications (it is mandatory to allow | considering the security implications (it is mandatory to allow | |||
the anti-replay facility to be disabled in both IPsec and | the anti-replay facility to be disabled in both IPsec and | |||
DTLS); | DTLS); | |||
- configure the tunnel ingress not to propagate ECN to the outer, | * configure the tunnel ingress not to propagate ECN to the outer, | |||
which would lose the benefits of L4S and Classic ECN over the | which would lose the benefits of L4S and Classic ECN over the | |||
VPN. | VPN. | |||
Modification to VPN implementations is outside the present scope, | Modification to VPN implementations is outside the present scope, | |||
which is why this section has so far focused on reconfiguration. | which is why this section has so far focused on reconfiguration. | |||
Although this document does not define any requirements for VPN | Although this document does not define any requirements for VPN | |||
implementations, determining whether there is a need for such | implementations, determining whether there is a need for such | |||
requirements could be one aspect of L4S experimentation. | requirements could be one aspect of L4S experimentation. | |||
7. L4S Experiments | 7. L4S Experiments | |||
skipping to change at page 32, line 9 ¶ | skipping to change at page 31, line 19 ¶ | |||
The specification of each scalable congestion control will need to | The specification of each scalable congestion control will need to | |||
include protocol-specific requirements for configuration and | include protocol-specific requirements for configuration and | |||
monitoring performance during experiments. Appendix A of the | monitoring performance during experiments. Appendix A of the | |||
guidelines in [RFC5706] provides a helpful checklist. | guidelines in [RFC5706] provides a helpful checklist. | |||
7.1. Open Questions | 7.1. Open Questions | |||
L4S experiments would be expected to answer the following questions: | L4S experiments would be expected to answer the following questions: | |||
* Have all the parts of L4S been deployed, and if so, what | o Have all the parts of L4S been deployed, and if so, what | |||
proportion of paths support it? | proportion of paths support it? | |||
- What types of L4S AQMs were deployed, e.g. FQ, coupled DualQ, | * What types of L4S AQMs were deployed, e.g. FQ, coupled DualQ, | |||
uncoupled DualQ, other? And how prevalent was each? | uncoupled DualQ, other? And how prevalent was each? | |||
- Are the signalling patterns emitted by the deployed AQMs in any | * Are the signalling patterns emitted by the deployed AQMs in any | |||
way different from those expected when the Prague requirements | way different from those expected when the Prague requirements | |||
for endpoints were written? | for endpoints were written? | |||
* Does use of L4S over the Internet result in significantly improved | o Does use of L4S over the Internet result in significantly improved | |||
user experience? | user experience? | |||
* Has L4S enabled novel interactive applications? | o Has L4S enabled novel interactive applications? | |||
* Did use of L4S over the Internet result in improvements to the | o Did use of L4S over the Internet result in improvements to the | |||
following metrics: | following metrics: | |||
- queue delay (mean and 99th percentile) under various loads; | * queue delay (mean and 99th percentile) under various loads; | |||
- utilization; | * utilization; | |||
- starvation / fairness; | * starvation / fairness; | |||
- scaling range of flow rates and RTTs? | * scaling range of flow rates and RTTs? | |||
* How dependent was the performance of L4S service on the bottleneck | o How dependent was the performance of L4S service on the bottleneck | |||
bandwidth or the path RTT? | bandwidth or the path RTT? | |||
* How much do bursty links in the Internet affect L4S performance | o How much do bursty links in the Internet affect L4S performance | |||
(see "Underutilization with Bursty Links" in [Heist21]) and how | (see "Underutilization with Bursty Links" in [Heist21]) and how | |||
prevalent are they? How much limitation of burstiness from | prevalent are they? How much limitation of burstiness from | |||
upstream links was needed and/or was realized - both at senders | upstream links was needed and/or was realized - both at senders | |||
and at links, especially radio links - or how much did L4S target | and at links, especially radio links - or how much did L4S target | |||
delay have to be increased to accommodate the bursts (see bullet | delay have to be increased to accommodate the bursts (see bullet | |||
#7 in Section 4.3 and Section 5.5.2)? | #7 in Section 4.3 and Section 5.5.2)? | |||
* Is the initial experiment with mis-marked bursty traffic at high | o Is the initial experiment with mis-marked bursty traffic at high | |||
RTT (see "Underutilization with Bursty Traffic" in [Heist21]) | RTT (see "Underutilization with Bursty Traffic" in [Heist21]) | |||
indicative of similar problems at lower RTTs and, if so, how | indicative of similar problems at lower RTTs and, if so, how | |||
effective is the suggested remedy in Appendix A.1 of the DualQ | effective is the suggested remedy in Appendix A.1 of the DualQ | |||
spec [I-D.ietf-tsvwg-aqm-dualq-coupled] (or possible other | spec [I-D.ietf-tsvwg-aqm-dualq-coupled] (or possible other | |||
remedies)? | remedies)? | |||
* Was per-flow queue protection typically (un)necessary? | o Was per-flow queue protection typically (un)necessary? | |||
- How well did overload protection or queue protection work? | * How well did overload protection or queue protection work? | |||
* How well did L4S flows coexist with Classic flows when sharing a | o How well did L4S flows coexist with Classic flows when sharing a | |||
bottleneck? | bottleneck? | |||
- How frequently did problems arise? | o | |||
- What caused any coexistence problems, and were any problems due | * How frequently did problems arise? | |||
* What caused any coexistence problems, and were any problems due | ||||
to single-queue Classic ECN AQMs (this assumes single-queue | to single-queue Classic ECN AQMs (this assumes single-queue | |||
Classic ECN AQMs can be distinguished from FQ ones)? | Classic ECN AQMs can be distinguished from FQ ones)? | |||
* How prevalent were problems with the L4S service due to tunnels / | o How prevalent were problems with the L4S service due to tunnels / | |||
encapsulations that do not support ECN decapsulation? | encapsulations that do not support ECN decapsulation? | |||
* How easy was it to implement a fully compliant L4S congestion | o How easy was it to implement a fully compliant L4S congestion | |||
control, over various different transport protocols (TCP, QUIC, | control, over various different transport protocols (TCP, QUIC, | |||
RMCAT, etc.)? | RMCAT, etc.)? | |||
Monitoring for harm to other traffic, specifically bandwidth | Monitoring for harm to other traffic, specifically bandwidth | |||
starvation or excess queuing delay, will need to be conducted | starvation or excess queuing delay, will need to be conducted | |||
alongside all early L4S experiments. It is hard, if not impossible, | alongside all early L4S experiments. It is hard, if not impossible, | |||
for an individual flow to measure its impact on other traffic. So | for an individual flow to measure its impact on other traffic. So | |||
such monitoring will need to be conducted using bespoke monitoring | such monitoring will need to be conducted using bespoke monitoring | |||
across flows and/or across classes of traffic. | across flows and/or across classes of traffic. | |||
7.2. Open Issues | 7.2. Open Issues | |||
* What is the best way forward to deal with L4S over single-queue | o What is the best way forward to deal with L4S over single-queue | |||
Classic ECN AQM bottlenecks, given current problems with | Classic ECN AQM bottlenecks, given current problems with | |||
misdetecting L4S AQMs as Classic ECN AQMs? See the L4S | misdetecting L4S AQMs as Classic ECN AQMs? See the L4S | |||
operational guidance [I-D.ietf-tsvwg-l4sops]. | operational guidance [I-D.ietf-tsvwg-l4sops]. | |||
* Fixing the poor interaction between current L4S congestion | o Fixing the poor interaction between current L4S congestion | |||
controls and CoDel with only Classic ECN support during flow | controls and CoDel with only Classic ECN support during flow | |||
startup. Originally, this was due to a bug in the initialization | startup. Originally, this was due to a bug in the initialization | |||
of the congestion EWMA in the Linux implementation of TCP Prague. | of the congestion EWMA in the Linux implementation of TCP Prague. | |||
That was quickly fixed, which removed the main performance impact, | That was quickly fixed, which removed the main performance impact, | |||
but further improvement would be useful (either by modifying | but further improvement would be useful (either by modifying | |||
CoDel, Scalable congestion controls, or both). | CoDel, Scalable congestion controls, or both). | |||
7.3. Future Potential | 7.3. Future Potential | |||
Researchers might find that L4S opens up the following interesting | Researchers might find that L4S opens up the following interesting | |||
areas for investigation: | areas for investigation: | |||
* Potential for faster convergence time and tracking of available | o Potential for faster convergence time and tracking of available | |||
capacity; | capacity; | |||
* Potential for improvements to particular link technologies, and | o Potential for improvements to particular link technologies, and | |||
cross-layer interactions with them; | cross-layer interactions with them; | |||
* Potential for using virtual queues, e.g. to further reduce latency | o Potential for using virtual queues, e.g. to further reduce latency | |||
jitter, or to leave headroom for capacity variation in radio | jitter, or to leave headroom for capacity variation in radio | |||
networks; | networks; | |||
* Development and specification of reverse path congestion control | o Development and specification of reverse path congestion control | |||
using L4S building blocks (e.g. AccECN, QUIC); | using L4S building blocks (e.g. AccECN, QUIC); | |||
* Once queuing delay is cut down, what becomes the 'second-longest | o Once queuing delay is cut down, what becomes the 'second-longest | |||
pole in the tent' (other than the speed of light)? | pole in the tent' (other than the speed of light)? | |||
* Novel alternatives to the existing set of L4S AQMs; | o Novel alternatives to the existing set of L4S AQMs; | |||
* Novel applications enabled by L4S. | o Novel applications enabled by L4S. | |||
8. IANA Considerations | 8. IANA Considerations | |||
The 01 codepoint of the ECN Field of the IP header is specified by | The 01 codepoint of the ECN Field of the IP header is specified by | |||
the present Experimental RFC. The process for an experimental RFC to | the present Experimental RFC. The process for an experimental RFC to | |||
assign this codepoint in the IP header (v4 and v6) is documented in | assign this codepoint in the IP header (v4 and v6) is documented in | |||
Proposed Standard [RFC8311], which updates the Proposed Standard | Proposed Standard [RFC8311], which updates the Proposed Standard | |||
[RFC3168]. | [RFC3168]. | |||
When the present document is published as an RFC, IANA is asked to | When the present document is published as an RFC, IANA is asked to | |||
update the 01 entry in the registry, "ECN Field (Bits 6-7)" to the | update the 01 entry in the registry, "ECN Field (Bits 6-7)" to the | |||
following (see https://www.iana.org/assignments/dscp-registry/dscp- | following (see https://www.iana.org/assignments/dscp-registry/dscp- | |||
registry.xhtml#ecn-field ): | registry.xhtml#ecn-field ): | |||
+========+=====================+=============================+ | +--------+-----------------------------+----------------------------+ | |||
| Binary | Keyword | References | | | Binary | Keyword | References | | |||
+========+=====================+=============================+ | +--------+-----------------------------+----------------------------+ | |||
| 01 | ECT(1) (ECN-Capable | [RFC8311] [RFC Errata 5399] | | | 01 | ECT(1) (ECN-Capable | [RFC8311] | | |||
| | Transport(1))[1] | [RFCXXXX] | | | | Transport(1))[1] | [RFC Errata 5399] | | |||
+--------+---------------------+-----------------------------+ | | | | [RFCXXXX] | | |||
+--------+-----------------------------+----------------------------+ | ||||
Table 1 | ||||
[XXXX is the number that the RFC Editor assigns to the present | [XXXX is the number that the RFC Editor assigns to the present | |||
document (this sentence to be removed by the RFC Editor)]. | document (this sentence to be removed by the RFC Editor)]. | |||
9. Security Considerations | 9. Security Considerations | |||
Approaches to assure the integrity of signals using the new | Approaches to assure the integrity of signals using the new | |||
identifier are introduced in Appendix C.1. See the security | identifier are introduced in Appendix C.1. See the security | |||
considerations in the L4S architecture [I-D.ietf-tsvwg-l4s-arch] for | considerations in the L4S architecture [I-D.ietf-tsvwg-l4s-arch] for | |||
further discussion of mis-use of the identifier, as well as extensive | further discussion of mis-use of the identifier, as well as extensive | |||
skipping to change at page 36, line 33 ¶ | skipping to change at page 35, line 42 ¶ | |||
2012, <https://www.rfc-editor.org/info/rfc6679>. | 2012, <https://www.rfc-editor.org/info/rfc6679>. | |||
10.2. Informative References | 10.2. Informative References | |||
[A2DTCP] Zhang, T., Wang, J., Huang, J., Huang, Y., Chen, J., and | [A2DTCP] Zhang, T., Wang, J., Huang, J., Huang, Y., Chen, J., and | |||
Y. Pan, "Adaptive-Acceleration Data Center TCP", IEEE | Y. Pan, "Adaptive-Acceleration Data Center TCP", IEEE | |||
Transactions on Computers 64(6):1522-1533, June 2015, | Transactions on Computers 64(6):1522-1533, June 2015, | |||
<https://ieeexplore.ieee.org/xpl/ | <https://ieeexplore.ieee.org/xpl/ | |||
articleDetails.jsp?arnumber=6871352>. | articleDetails.jsp?arnumber=6871352>. | |||
[Ahmed19] Ahmed, A.S., "Extending TCP for Low Round Trip Delay", | [Ahmed19] Ahmed, A., "Extending TCP for Low Round Trip Delay", | |||
Master's Thesis, Uni Oslo, August 2019, | Master's Thesis, Uni Oslo, August 2019, | |||
<https://www.duo.uio.no/handle/10852/70966>. | <https://www.duo.uio.no/handle/10852/70966>. | |||
[Alizadeh-stability] | [Alizadeh-stability] | |||
Alizadeh, M., Javanmard, A., and B. Prabhakar, "Analysis | Alizadeh, M., Javanmard, A., and B. Prabhakar, "Analysis | |||
of DCTCP: Stability, Convergence, and Fairness", ACM | of DCTCP: Stability, Convergence, and Fairness", ACM | |||
SIGMETRICS 2011, June 2011, | SIGMETRICS 2011, June 2011, | |||
<https://people.csail.mit.edu/alizadeh/papers/ | <https://dl.acm.org/doi/10.1145/1993744.1993753>. | |||
dctcp_analysis-sigmetrics11.pdf>. | ||||
[ARED01] Floyd, S., Gummadi, R., and S. Shenker, "Adaptive RED: An | [ARED01] Floyd, S., Gummadi, R., and S. Shenker, "Adaptive RED: An | |||
Algorithm for Increasing the Robustness of RED's Active | Algorithm for Increasing the Robustness of RED's Active | |||
Queue Management", ACIRI Technical Report, August 2001, | Queue Management", ACIRI Technical Report 301, August | |||
<https://www.icir.org/floyd/red.html>. | 2001, <http://www.icsi.berkeley.edu/icsi/node/2032>. | |||
[BBRv2] Cardwell, N., "TCP BBR v2 Alpha/Preview Release", GitHub | [BBRv2] Cardwell, N., "TCP BBR v2 Alpha/Preview Release", GitHub | |||
repository; Linux congestion control module, | repository; Linux congestion control module, | |||
<https://github.com/google/bbr/blob/v2alpha/README.md>. | <https://github.com/google/bbr/blob/v2alpha/README.md>. | |||
[Bufferbloat] | [Bufferbloat] | |||
"Bufferbloat", <https://bufferbloat.net/>. (last accessed | "Bufferbloat", <https://bufferbloat.net/>. | |||
27 Aug 2022) | ||||
(last accessed 27 Aug 2022) | ||||
[COBALT] Palmei, J., Gupta, S., Imputato, P., Morton, J., | [COBALT] Palmei, J., Gupta, S., Imputato, P., Morton, J., | |||
Tahiliani, M., Avallone, S., and D. Taht, "Design and | Tahiliani, M., Avallone, S., and D. Taht, "Design and | |||
Evaluation of COBALT Queue Discipline", In Proc. IEEE | Evaluation of COBALT Queue Discipline", In Proc. IEEE | |||
Int'l Symp. on Local and Metropolitan Area Networks 2019, | Int'l Symp. on Local and Metropolitan Area Networks 2019, | |||
pp1--6, 2019, | pp1--6, 2019, | |||
<https://doi.org/10.1109/LANMAN.2019.8847054>. | <https://doi.org/10.1109/LANMAN.2019.8847054>. | |||
[DCttH19] De Schepper, K., Bondarenko, O., Tilmans, O., and B. | [DCttH19] De Schepper, K., Bondarenko, O., Tilmans, O., and B. | |||
Briscoe, "`Data Centre to the Home': Ultra-Low Latency for | Briscoe, "`Data Centre to the Home': Ultra-Low Latency for | |||
skipping to change at page 37, line 35 ¶ | skipping to change at page 36, line 45 ¶ | |||
<https://www.netdevconf.org/0x13/session.html?talk- | <https://www.netdevconf.org/0x13/session.html?talk- | |||
DUALPI2-AQM>. | DUALPI2-AQM>. | |||
[Dukkipati06] | [Dukkipati06] | |||
Dukkipati, N. and N. McKeown, "Why Flow-Completion Time is | Dukkipati, N. and N. McKeown, "Why Flow-Completion Time is | |||
the Right Metric for Congestion Control", ACM CCR | the Right Metric for Congestion Control", ACM CCR | |||
36(1):59--62, January 2006, | 36(1):59--62, January 2006, | |||
<https://dl.acm.org/doi/10.1145/1111322.1111336>. | <https://dl.acm.org/doi/10.1145/1111322.1111336>. | |||
[ecn-fallback] | [ecn-fallback] | |||
Briscoe, B. and A.S. Ahmed, "TCP Prague Fall-back on | Briscoe, B. and A. Ahmed, "TCP Prague Fall-back on | |||
Detection of a Classic ECN AQM", bobbriscoe.net Technical | Detection of a Classic ECN AQM", bobbriscoe.net Technical | |||
Report TR-BB-2019-002, April 2020, | Report TR-BB-2019-002, April 2020, | |||
<https://arxiv.org/abs/1911.00710>. | <https://arxiv.org/abs/1911.00710>. | |||
[Heist21] Heist, P. and J. Morton, "L4S Tests", GitHub README, May | [Heist21] Heist, P. and J. Morton, "L4S Tests", GitHub README, May | |||
2021, <https://github.com/heistp/l4s-tests/>. | 2021, <https://github.com/heistp/l4s-tests/>. | |||
[I-D.briscoe-docsis-q-protection] | [I-D.briscoe-docsis-q-protection] | |||
Briscoe, B. and G. White, "The DOCSIS(r) Queue Protection | Briscoe, B. and G. White, "Queue Protection to Preserve | |||
Algorithm to Preserve Low Latency", Work in Progress, | Low Latency", draft-briscoe-docsis-q-protection-00 (work | |||
Internet-Draft, draft-briscoe-docsis-q-protection-06, 13 | in progress), July 2019. | |||
May 2022, | ||||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
briscoe-docsis-q-protection/>. | ||||
[I-D.briscoe-iccrg-prague-congestion-control] | [I-D.briscoe-iccrg-prague-congestion-control] | |||
Schepper, K. D., Tilmans, O., and B. Briscoe, "Prague | Schepper, K. D., Tilmans, O., and B. Briscoe, "Prague | |||
Congestion Control", Work in Progress, Internet-Draft, | Congestion Control", draft-briscoe-iccrg-prague- | |||
draft-briscoe-iccrg-prague-congestion-control-01, 11 July | congestion-control-01 (work in progress), July 2022, | |||
2022, <https://datatracker.ietf.org/api/v1/doc/document/ | <https://www.ietf.org/archive/id/draft-briscoe-iccrg- | |||
draft-briscoe-iccrg-prague-congestion-control/>. | prague-congestion-control-01.txt>. | |||
[I-D.briscoe-tsvwg-l4s-diffserv] | [I-D.briscoe-tsvwg-l4s-diffserv] | |||
Briscoe, B., "Interactions between Low Latency, Low Loss, | Briscoe, B., "Interactions between Low Latency, Low Loss, | |||
Scalable Throughput (L4S) and Differentiated Services", | Scalable Throughput (L4S) and Differentiated Services", | |||
Work in Progress, Internet-Draft, draft-briscoe-tsvwg-l4s- | draft-briscoe-tsvwg-l4s-diffserv-02 (work in progress), | |||
diffserv-02, 2 July 2018, | November 2018. | |||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
briscoe-tsvwg-l4s-diffserv/>. | ||||
[I-D.cardwell-iccrg-bbr-congestion-control] | [I-D.cardwell-iccrg-bbr-congestion-control] | |||
Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V. | Cardwell, N., Cheng, Y., Yeganeh, S., and V. Jacobson, | |||
Jacobson, "BBR Congestion Control", Work in Progress, | "BBR Congestion Control", draft-cardwell-iccrg-bbr- | |||
Internet-Draft, draft-cardwell-iccrg-bbr-congestion- | congestion-control-00 (work in progress), July 2017. | |||
control-02, 7 March 2022, | ||||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
cardwell-iccrg-bbr-congestion-control/>. | ||||
[I-D.ietf-tcpm-accurate-ecn] | [I-D.ietf-tcpm-accurate-ecn] | |||
Briscoe, B., Kühlewind, M., and R. Scheffenegger, "More | Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More | |||
Accurate ECN Feedback in TCP", Work in Progress, Internet- | Accurate ECN Feedback in TCP", draft-ietf-tcpm-accurate- | |||
Draft, draft-ietf-tcpm-accurate-ecn-20, 25 July 2022, | ecn-14 (work in progress), February 2021. | |||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
ietf-tcpm-accurate-ecn/>. | ||||
[I-D.ietf-tcpm-generalized-ecn] | [I-D.ietf-tcpm-generalized-ecn] | |||
Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit | Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit | |||
Congestion Notification (ECN) to TCP Control Packets", | Congestion Notification (ECN) to TCP Control Packets", | |||
Work in Progress, Internet-Draft, draft-ietf-tcpm- | draft-ietf-tcpm-generalized-ecn-07 (work in progress), | |||
generalized-ecn-10, 27 July 2022, | February 2021. | |||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
ietf-tcpm-generalized-ecn/>. | ||||
[I-D.ietf-trill-ecn-support] | [I-D.ietf-trill-ecn-support] | |||
Eastlake, D. E. and B. Briscoe, "TRILL (TRansparent | Eastlake, D. and B. Briscoe, "TRILL (TRansparent | |||
Interconnection of Lots of Links): ECN (Explicit | Interconnection of Lots of Links): ECN (Explicit | |||
Congestion Notification) Support", Work in Progress, | Congestion Notification) Support", draft-ietf-trill-ecn- | |||
Internet-Draft, draft-ietf-trill-ecn-support-07, 25 | support-07 (work in progress), February 2018. | |||
February 2018, | ||||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
ietf-trill-ecn-support/>. | ||||
[I-D.ietf-tsvwg-aqm-dualq-coupled] | [I-D.ietf-tsvwg-aqm-dualq-coupled] | |||
Schepper, K. D., Briscoe, B., and G. White, "DualQ Coupled | Schepper, K., Briscoe, B., and G. White, "DualQ Coupled | |||
AQMs for Low Latency, Low Loss and Scalable Throughput | AQMs for Low Latency, Low Loss and Scalable Throughput | |||
(L4S)", Work in Progress, Internet-Draft, draft-ietf- | (L4S)", draft-ietf-tsvwg-aqm-dualq-coupled-14 (work in | |||
tsvwg-aqm-dualq-coupled-24, 7 July 2022, | progress), March 2021. | |||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
ietf-tsvwg-aqm-dualq-coupled/>. | ||||
[I-D.ietf-tsvwg-ecn-encap-guidelines] | [I-D.ietf-tsvwg-ecn-encap-guidelines] | |||
Briscoe, B. and J. Kaippallimalil, "Guidelines for Adding | Briscoe, B. and J. Kaippallimalil, "Guidelines for Adding | |||
Congestion Notification to Protocols that Encapsulate IP", | Congestion Notification to Protocols that Encapsulate IP", | |||
Work in Progress, Internet-Draft, draft-ietf-tsvwg-ecn- | draft-ietf-tsvwg-ecn-encap-guidelines-15 (work in | |||
encap-guidelines-17, 11 July 2022, | progress), March 2021. | |||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
ietf-tsvwg-ecn-encap-guidelines/>. | ||||
[I-D.ietf-tsvwg-l4s-arch] | [I-D.ietf-tsvwg-l4s-arch] | |||
Briscoe, B., Schepper, K. D., Bagnulo, M., and G. White, | Briscoe, B., Schepper, K., Bagnulo, M., and G. White, "Low | |||
"Low Latency, Low Loss, Scalable Throughput (L4S) Internet | Latency, Low Loss, Scalable Throughput (L4S) Internet | |||
Service: Architecture", Work in Progress, Internet-Draft, | Service: Architecture", draft-ietf-tsvwg-l4s-arch-08 (work | |||
draft-ietf-tsvwg-l4s-arch-19, 27 July 2022, | in progress), November 2020. | |||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
ietf-tsvwg-l4s-arch/>. | ||||
[I-D.ietf-tsvwg-l4sops] | [I-D.ietf-tsvwg-l4sops] | |||
White, G., "Operational Guidance for Deployment of L4S in | White, G., "Operational Guidance for Deployment of L4S in | |||
the Internet", Work in Progress, Internet-Draft, draft- | the Internet", draft-ietf-tsvwg-l4sops-03 (work in | |||
ietf-tsvwg-l4sops-03, 28 April 2022, | progress), April 2022, <https://www.ietf.org/archive/id/ | |||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | draft-ietf-tsvwg-l4sops-03.txt>. | |||
ietf-tsvwg-l4sops/>. | ||||
[I-D.ietf-tsvwg-nqb] | [I-D.ietf-tsvwg-nqb] | |||
White, G. and T. Fossati, "A Non-Queue-Building Per-Hop | White, G. and T. Fossati, "A Non-Queue-Building Per-Hop | |||
Behavior (NQB PHB) for Differentiated Services", Work in | Behavior (NQB PHB) for Differentiated Services", draft- | |||
Progress, Internet-Draft, draft-ietf-tsvwg-nqb-10, 4 March | ietf-tsvwg-nqb-05 (work in progress), March 2021. | |||
2022, <https://datatracker.ietf.org/api/v1/doc/document/ | ||||
draft-ietf-tsvwg-nqb/>. | ||||
[I-D.ietf-tsvwg-rfc6040update-shim] | [I-D.ietf-tsvwg-rfc6040update-shim] | |||
Briscoe, B., "Propagating Explicit Congestion Notification | Briscoe, B., "Propagating Explicit Congestion Notification | |||
Across IP Tunnel Headers Separated by a Shim", Work in | Across IP Tunnel Headers Separated by a Shim", draft-ietf- | |||
Progress, Internet-Draft, draft-ietf-tsvwg-rfc6040update- | tsvwg-rfc6040update-shim-13 (work in progress), March | |||
shim-15, 11 July 2022, | 2021. | |||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
ietf-tsvwg-rfc6040update-shim/>. | ||||
[I-D.mathis-iccrg-relentless-tcp] | [I-D.mathis-iccrg-relentless-tcp] | |||
Mathis, M., "Relentless Congestion Control", Work in | Mathis, M., "Relentless Congestion Control", draft-mathis- | |||
Progress, Internet-Draft, draft-mathis-iccrg-relentless- | iccrg-relentless-tcp-00 (work in progress), March 2009. | |||
tcp-00, 4 March 2009, <https://www.ietf.org/archive/id/ | ||||
draft-mathis-iccrg-relentless-tcp-00.txt>. | ||||
[I-D.sridharan-tcpm-ctcp] | [I-D.sridharan-tcpm-ctcp] | |||
Sridharan, M., Tan, K., Bansal, D., and D. Thaler, | Sridharan, M., Tan, K., Bansal, D., and D. Thaler, | |||
"Compound TCP: A New TCP Congestion Control for High-Speed | "Compound TCP: A New TCP Congestion Control for High-Speed | |||
and Long Distance Networks", Work in Progress, Internet- | and Long Distance Networks", draft-sridharan-tcpm-ctcp-02 | |||
Draft, draft-sridharan-tcpm-ctcp-02, 29 October 2007, | (work in progress), November 2008. | |||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
sridharan-tcpm-ctcp/>. | ||||
[I-D.stewart-tsvwg-sctpecn] | [I-D.stewart-tsvwg-sctpecn] | |||
Stewart, R. R., Tuexen, M., and X. Dong, "ECN for Stream | Stewart, R., Tuexen, M., and X. Dong, "ECN for Stream | |||
Control Transmission Protocol (SCTP)", Work in Progress, | Control Transmission Protocol (SCTP)", draft-stewart- | |||
Internet-Draft, draft-stewart-tsvwg-sctpecn-05, 15 January | tsvwg-sctpecn-05 (work in progress), January 2014. | |||
2014, <https://www.ietf.org/archive/id/draft-stewart- | ||||
tsvwg-sctpecn-05.txt>. | [L4Seval22] | |||
De Schepper, K., Albisser, O., Tilmans, O., and B. | ||||
Briscoe, "Dual Queue Coupled AQM: Deployable Very Low | ||||
Queuing Delay for All", Preprint submitted to IEEE/ACM | ||||
Transactions on Networking arXiv:2209.01078 [cs.NI], | ||||
September 2022, <https://arxiv.org/abs/2209.01078>. | ||||
[LinuxPacedChirping] | [LinuxPacedChirping] | |||
Misund, J. and B. Briscoe, "Paced Chirping - Rethinking | Misund, J. and B. Briscoe, "Paced Chirping - Rethinking | |||
TCP start-up", Proc. Linux Netdev 0x13, March 2019, | TCP start-up", Proc. Linux Netdev 0x13, March 2019, | |||
<https://www.netdevconf.org/0x13/session.html?talk-chirp>. | <https://www.netdevconf.org/0x13/session.html?talk-chirp>. | |||
[Paced-Chirping] | ||||
Misund, J., "Rapid Acceleration in TCP Prague", Master's | ||||
Thesis, May 2018, | ||||
<https://riteproject.files.wordpress.com/2018/07/ | ||||
misundjoakimmastersthesissubmitted180515.pdf>. | ||||
[PI2] De Schepper, K., Bondarenko, O., Tsang, I., and B. | [PI2] De Schepper, K., Bondarenko, O., Tsang, I., and B. | |||
Briscoe, "PI^2 : A Linearized AQM for both Classic and | Briscoe, "PI^2 : A Linearized AQM for both Classic and | |||
Scalable TCP", Proc. ACM CoNEXT 2016 pp.105-119, December | Scalable TCP", Proc. ACM CoNEXT 2016 pp.105-119, December | |||
2016, | 2016, | |||
<https://dl.acm.org/citation.cfm?doid=2999572.2999578>. | <https://dl.acm.org/citation.cfm?doid=2999572.2999578>. | |||
[PragueLinux] | [PragueLinux] | |||
Briscoe, B., De Schepper, K., Albisser, O., Misund, J., | Briscoe, B., De Schepper, K., Albisser, O., Misund, J., | |||
Tilmans, O., Kühlewind, M., and A.S. Ahmed, "Implementing | Tilmans, O., Kuehlewind, M., and A. Ahmed, "Implementing | |||
the `TCP Prague' Requirements for Low Latency Low Loss | the `TCP Prague' Requirements for Low Latency Low Loss | |||
Scalable Throughput (L4S)", Proc. Linux Netdev 0x13, | Scalable Throughput (L4S)", Proc. Linux Netdev 0x13, | |||
March 2019, <https://www.netdevconf.org/0x13/ | March 2019, <https://www.netdevconf.org/0x13/ | |||
session.html?talk-tcp-prague-l4s>. | session.html?talk-tcp-prague-l4s>. | |||
[QV] Briscoe, B. and P. Hurtig, "Up to Speed with Queue View", | [QV] Briscoe, B. and P. Hurtig, "Up to Speed with Queue View", | |||
RITE Technical Report D2.3; Appendix C.2, August 2015, | RITE Technical Report D2.3; Appendix C.2, August 2015, | |||
<https://riteproject.files.wordpress.com/2015/12/rite- | <https://riteproject.files.wordpress.com/2015/12/rite- | |||
deliverable-2-3.pdf>. | deliverable-2-3.pdf>. | |||
skipping to change at page 41, line 24 ¶ | skipping to change at page 40, line 5 ¶ | |||
Queue Management and Congestion Avoidance in the | Queue Management and Congestion Avoidance in the | |||
Internet", RFC 2309, DOI 10.17487/RFC2309, April 1998, | Internet", RFC 2309, DOI 10.17487/RFC2309, April 1998, | |||
<https://www.rfc-editor.org/info/rfc2309>. | <https://www.rfc-editor.org/info/rfc2309>. | |||
[RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, | [RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, | |||
"Definition of the Differentiated Services Field (DS | "Definition of the Differentiated Services Field (DS | |||
Field) in the IPv4 and IPv6 Headers", RFC 2474, | Field) in the IPv4 and IPv6 Headers", RFC 2474, | |||
DOI 10.17487/RFC2474, December 1998, | DOI 10.17487/RFC2474, December 1998, | |||
<https://www.rfc-editor.org/info/rfc2474>. | <https://www.rfc-editor.org/info/rfc2474>. | |||
[RFC3246] Davie, B., Charny, A., Bennet, J.C.R., Benson, K., Le | [RFC3246] Davie, B., Charny, A., Bennet, J., Benson, K., Le Boudec, | |||
Boudec, J.Y., Courtney, W., Davari, S., Firoiu, V., and D. | J., Courtney, W., Davari, S., Firoiu, V., and D. | |||
Stiliadis, "An Expedited Forwarding PHB (Per-Hop | Stiliadis, "An Expedited Forwarding PHB (Per-Hop | |||
Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002, | Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002, | |||
<https://www.rfc-editor.org/info/rfc3246>. | <https://www.rfc-editor.org/info/rfc3246>. | |||
[RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit | [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit | |||
Congestion Notification (ECN) Signaling with Nonces", | Congestion Notification (ECN) Signaling with Nonces", | |||
RFC 3540, DOI 10.17487/RFC3540, June 2003, | RFC 3540, DOI 10.17487/RFC3540, June 2003, | |||
<https://www.rfc-editor.org/info/rfc3540>. | <https://www.rfc-editor.org/info/rfc3540>. | |||
[RFC3649] Floyd, S., "HighSpeed TCP for Large Congestion Windows", | [RFC3649] Floyd, S., "HighSpeed TCP for Large Congestion Windows", | |||
skipping to change at page 46, line 11 ¶ | skipping to change at page 44, line 31 ¶ | |||
Johansson, I., "SCReAM", GitHub repository, | Johansson, I., "SCReAM", GitHub repository, | |||
<https://github.com/EricssonResearch/scream/blob/master/ | <https://github.com/EricssonResearch/scream/blob/master/ | |||
README.md>. | README.md>. | |||
[sub-mss-prob] | [sub-mss-prob] | |||
Briscoe, B. and K. De Schepper, "Scaling TCP's Congestion | Briscoe, B. and K. De Schepper, "Scaling TCP's Congestion | |||
Window for Small Round Trip Times", BT Technical Report | Window for Small Round Trip Times", BT Technical Report | |||
TR-TUB8-2015-002, May 2015, | TR-TUB8-2015-002, May 2015, | |||
<https://arxiv.org/abs/1904.07598>. | <https://arxiv.org/abs/1904.07598>. | |||
[TCP-CA] Jacobson, V. and M.J. Karels, "Congestion Avoidance and | [TCP-CA] Jacobson, V. and M. Karels, "Congestion Avoidance and | |||
Control", Lawrence Berkeley Labs Technical Report, | Control", Lawrence Berkeley Labs Technical Report, | |||
November 1988, <https://ee.lbl.gov/papers/congavoid.pdf>. | November 1988, <https://ee.lbl.gov/papers/congavoid.pdf>. | |||
[TCPPrague] | [TCPPrague] | |||
Briscoe, B., "Notes: DCTCP evolution 'bar BoF': Tue 21 Jul | Briscoe, B., "Notes: DCTCP evolution 'bar BoF': Tue 21 Jul | |||
2015, 17:40, Prague", tcpprague mailing list archive, | 2015, 17:40, Prague", tcpprague mailing list archive, | |||
July 2015, <https://www.ietf.org/mail- | July 2015, <https://www.ietf.org/mail- | |||
archive/web/tcpprague/current/msg00001.html>. | archive/web/tcpprague/current/msg00001.html>. | |||
[VCP] Xia, Y., Subramanian, L., Stoica, I., and S. Kalyanaraman, | [VCP] Xia, Y., Subramanian, L., Stoica, I., and S. Kalyanaraman, | |||
skipping to change at page 49, line 24 ¶ | skipping to change at page 47, line 39 ¶ | |||
be resolved in real-time, or via administrative intervention. | be resolved in real-time, or via administrative intervention. | |||
Motivation: Similarly to the discussion in Appendix A.1.4, this | Motivation: Similarly to the discussion in Appendix A.1.4, this | |||
requirement in Section 4.3 is a safety condition to ensure an L4S | requirement in Section 4.3 is a safety condition to ensure an L4S | |||
congestion control coexists well with Classic flows when it builds a | congestion control coexists well with Classic flows when it builds a | |||
queue at a shared network bottleneck that has not been upgraded to | queue at a shared network bottleneck that has not been upgraded to | |||
support L4S. Nonetheless, if necessary, it is considered reasonable | support L4S. Nonetheless, if necessary, it is considered reasonable | |||
to resolve such problems over management timescales (possibly | to resolve such problems over management timescales (possibly | |||
involving human intervention) because: | involving human intervention) because: | |||
* although a Classic flow can considerably reduce its throughput in | o although a Classic flow can considerably reduce its throughput in | |||
the face of a competing scalable flow, it still makes progress and | the face of a competing scalable flow, it still makes progress and | |||
does not starve; | does not starve; | |||
* implementations of a Classic ECN AQM in a queue that is intended | o implementations of a Classic ECN AQM in a queue that is intended | |||
to be shared are believed to be rare; | to be shared are believed to be rare; | |||
* detection of such AQMs is not always clear-cut; so focused out-of- | o detection of such AQMs is not always clear-cut; so focused out-of- | |||
band testing (or even contacting the relevant network operator) | band testing (or even contacting the relevant network operator) | |||
would improve certainty. | would improve certainty. | |||
Therefore, the relevant normative requirement (Section 4.3) is | Therefore, the relevant normative requirement (Section 4.3) is | |||
divided into three stages: monitoring, detection and action: | divided into three stages: monitoring, detection and action: | |||
Monitoring: Monitoring involves collection of the measurement data | Monitoring: Monitoring involves collection of the measurement data | |||
to be analysed. Monitoring is expressed as a 'MUST' for | to be analysed. Monitoring is expressed as a 'MUST' for | |||
uncontrolled environments, although the placement of the | uncontrolled environments, although the placement of the | |||
monitoring function is left open. Whether monitoring has to be | monitoring function is left open. Whether monitoring has to be | |||
skipping to change at page 51, line 27 ¶ | skipping to change at page 49, line 43 ¶ | |||
Classic ECN AQM and determine whether it is in a shared queue, | Classic ECN AQM and determine whether it is in a shared queue, | |||
summarized here. | summarized here. | |||
An L4S-enabled test server could be set up so that, when a test | An L4S-enabled test server could be set up so that, when a test | |||
client accesses it, it serves a script that gets the client to open | client accesses it, it serves a script that gets the client to open | |||
two parallel long-running flows. It could serve one with a Classic | two parallel long-running flows. It could serve one with a Classic | |||
congestion control (C, that sets ECT(0)) and one with a scalable CC | congestion control (C, that sets ECT(0)) and one with a scalable CC | |||
(L, that sets ECT(1)). If neither flow induces any ECN marks, it can | (L, that sets ECT(1)). If neither flow induces any ECN marks, it can | |||
be presumed the path does not contain a Classic ECN AQM. If either | be presumed the path does not contain a Classic ECN AQM. If either | |||
flow induces some ECN marks, the server could measure the relative | flow induces some ECN marks, the server could measure the relative | |||
flow rates and round trip times of the two flows. Table 2 shows the | flow rates and round trip times of the two flows. Table 1 shows the | |||
AQM that can be inferred for various cases (presuming the AQM | AQM that can be inferred for various cases (presuming the AQM | |||
behaviours known at the time of writing). | behaviours known at the time of writing). | |||
+========+=======+========================+ | +--------+-------+------------------------+ | |||
| Rate | RTT | Inferred AQM | | | Rate | RTT | Inferred AQM | | |||
+========+=======+========================+ | ||||
| L > C | L = C | Classic ECN AQM (FIFO) | | ||||
+--------+-------+------------------------+ | +--------+-------+------------------------+ | |||
| L > C | L = C | Classic ECN AQM (FIFO) | | ||||
| L = C | L = C | Classic ECN AQM (FQ) | | | L = C | L = C | Classic ECN AQM (FQ) | | |||
+--------+-------+------------------------+ | ||||
| L = C | L < C | FQ-L4S AQM | | | L = C | L < C | FQ-L4S AQM | | |||
+--------+-------+------------------------+ | ||||
| L ~= C | L < C | Coupled DualQ AQM | | | L ~= C | L < C | Coupled DualQ AQM | | |||
+--------+-------+------------------------+ | +--------+-------+------------------------+ | |||
Table 2: Out-of-band testing with two | Table 1: Out-of-band testing with two parallel flows. L:=L4S, | |||
parallel flows. L:=L4S, C:=Classic. | C:=Classic. | |||
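The inference in the table above can be sketched as a small lookup. This is only an illustration, not part of either draft revision: the function name, the `tight` and `loose` tolerances used to approximate "=" and "~=", and the "inconclusive" fallback for combinations the table does not list are all assumptions made here.

```python
def infer_aqm(rate_l, rate_c, rtt_l, rtt_c, marks_seen,
              tight=0.05, loose=0.5):
    """Map measurements of the L (ECT(1)) and C (ECT(0)) flows to a table row.

    tight and loose are hypothetical tolerances: a ratio within `tight` of 1
    is treated as "=", and a ratio within `loose` of 1 is treated as "~=".
    """
    if not marks_seen:
        # Neither flow induced ECN marks.
        return "no Classic ECN AQM presumed"

    rate_ratio = rate_l / rate_c
    rtt_ratio = rtt_l / rtt_c

    rate_equal = abs(rate_ratio - 1) <= tight
    rate_roughly_equal = abs(rate_ratio - 1) <= loose
    rtt_equal = abs(rtt_ratio - 1) <= tight
    rtt_lower = rtt_ratio < 1 - tight

    if rtt_equal:
        if rate_equal:
            return "Classic ECN AQM (FQ)"
        if rate_ratio > 1 + tight:
            return "Classic ECN AQM (FIFO)"
    elif rtt_lower:
        if rate_equal:
            return "FQ-L4S AQM"
        if rate_roughly_equal:
            return "Coupled DualQ AQM"

    # Combination not covered by the table.
    return "inconclusive"
```

For example, `infer_aqm(80, 20, 30, 30, True)` falls in the first row of the table (L > C in rate, equal RTT). Real measurements are noisy, so any practical tolerances would need calibrating against the AQM behaviours actually deployed.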
Finally, we motivate the recommendation in Section 4.3 that a | Finally, we motivate the recommendation in Section 4.3 that a | |||
scalable congestion control is not expected to change to setting | scalable congestion control is not expected to change to setting | |||
ECT(0) while it adapts its behaviour to coexist with Classic flows. | ECT(0) while it adapts its behaviour to coexist with Classic flows. | |||
This is because the sender needs to continue to check whether it made | This is because the sender needs to continue to check whether it made | |||
the right decision - and switch back if it was wrong, or if a | the right decision - and switch back if it was wrong, or if a | |||
different link becomes the bottleneck: | different link becomes the bottleneck: | |||
* If, as recommended, the sender changes only its behaviour but not | o If, as recommended, the sender changes only its behaviour but not | |||
its codepoint to Classic, its codepoint will still be compatible | its codepoint to Classic, its codepoint will still be compatible | |||
with either an L4S or a Classic AQM. If the bottleneck does | with either an L4S or a Classic AQM. If the bottleneck does | |||
actually support both, it will still classify ECT(1) into the same | actually support both, it will still classify ECT(1) into the same | |||
L4S queue, where the sender can measure that switching to Classic | L4S queue, where the sender can measure that switching to Classic | |||
behaviour was wrong, so that it can switch back. | behaviour was wrong, so that it can switch back. | |||
* In contrast, if the sender changes both its behaviour and its | o In contrast, if the sender changes both its behaviour and its | |||
codepoint to Classic, even if the bottleneck supports both, it | codepoint to Classic, even if the bottleneck supports both, it | |||
will classify ECT(0) into the Classic queue, reinforcing the | will classify ECT(0) into the Classic queue, reinforcing the | |||
sender's incorrect decision so that it never switches back. | sender's incorrect decision so that it never switches back. | |||
* Also, not changing codepoint avoids the risk of being flipped to a | o Also, not changing codepoint avoids the risk of being flipped to a | |||
different path by a load balancer or multipath routing that hashes | different path by a load balancer or multipath routing that hashes | |||
on the whole of the ex-ToS byte (unfortunately still a common | on the whole of the ex-ToS byte (unfortunately still a common | |||
pathology). | pathology). | |||
Note that if a flow is configured to _only_ use a Classic congestion | Note that if a flow is configured to _only_ use a Classic congestion | |||
control, it is then entirely appropriate not to use ECT(1). | control, it is then entirely appropriate not to use ECT(1). | |||
A.1.6. Reduce RTT dependence | A.1.6. Reduce RTT dependence | |||
Description: A scalable congestion control needs to reduce RTT bias | Description: A scalable congestion control needs to reduce RTT bias | |||
skipping to change at page 53, line 11 ¶ | skipping to change at page 51, line 28 ¶ | |||
in current Classic congestion controls and still work satisfactorily. | in current Classic congestion controls and still work satisfactorily. | |||
So, there is no additional requirement in Section 4.3 for high RTT | So, there is no additional requirement in Section 4.3 for high RTT | |||
L4S flows to remove RTT bias - they can but they don't have to. | L4S flows to remove RTT bias - they can but they don't have to. | |||
One way for a Scalable congestion control to satisfy these | One way for a Scalable congestion control to satisfy these | |||
requirements is to make its additive increase behave as if it were a | requirements is to make its additive increase behave as if it were a | |||
standard Reno flow but over a larger RTT by using a virtual RTT | standard Reno flow but over a larger RTT by using a virtual RTT | |||
(rtt_virt) that is a function of the actual RTT (rtt). Example | (rtt_virt) that is a function of the actual RTT (rtt). Example | |||
functions might be: | functions might be: | |||
rtt_virt = max(rtt, 25ms) | "rtt_virt = max(rtt, 25ms)" | |||
rtt_virt = rtt + 10ms | "rtt_virt = rtt + 10ms" | |||
These example functions are chosen so that, as the actual RTT reduces | These example functions are chosen so that, as the actual RTT reduces | |||
from high to low, the virtual RTT reduces less (see | from high to low, the virtual RTT reduces less (see | |||
[I-D.briscoe-iccrg-prague-congestion-control] for details). | [I-D.briscoe-iccrg-prague-congestion-control] for details). | |||
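As a minimal sketch, the two example functions above can be written out to show how the virtual RTT shrinks less than the actual RTT. The function and parameter names are chosen here for illustration; only the 25 ms floor and the 10 ms offset come from the text.

```python
def rtt_virt_max(rtt_ms, floor_ms=25.0):
    """rtt_virt = max(rtt, 25ms)"""
    return max(rtt_ms, floor_ms)


def rtt_virt_add(rtt_ms, offset_ms=10.0):
    """rtt_virt = rtt + 10ms"""
    return rtt_ms + offset_ms


def rtt_ratio(fn, high_rtt_ms, low_rtt_ms):
    """Ratio between the virtual RTTs of a high-RTT and a low-RTT flow."""
    return fn(high_rtt_ms) / fn(low_rtt_ms)
```

With actual RTTs of 100 ms and 1 ms (a 100:1 ratio), `rtt_ratio(rtt_virt_max, 100, 1)` gives 4 and `rtt_ratio(rtt_virt_add, 100, 1)` gives 10, so the spread the additive increase sees is much narrower than the spread of the actual RTTs.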
However, short RTT flows can more rapidly respond to changes in | However, short RTT flows can more rapidly respond to changes in | |||
available capacity, whether due to other flows arriving and departing | available capacity, whether due to other flows arriving and departing | |||
or radio capacity varying. So it would be wrong to require short RTT | or radio capacity varying. So it would be wrong to require short RTT | |||
flows to be as sluggish as long-RTT flows, which would unnecessarily | flows to be as sluggish as long-RTT flows, which would unnecessarily | |||
under-utilize capacity and result in unnecessary overshoots and | under-utilize capacity and result in unnecessary overshoots and | |||
skipping to change at page 55, line 24 ¶ | skipping to change at page 53, line 39 ¶ | |||
End-systems cannot know whether a missing packet is due to loss or | End-systems cannot know whether a missing packet is due to loss or | |||
reordering, except in hindsight - if it appears later. So they can | reordering, except in hindsight - if it appears later. So they can | |||
only deem that there has been a loss if a gap in the sequence space | only deem that there has been a loss if a gap in the sequence space | |||
has not been filled, either after a certain number of subsequent | has not been filled, either after a certain number of subsequent | |||
packets has arrived (e.g. the 3 DupACK rule of standard TCP | packets has arrived (e.g. the 3 DupACK rule of standard TCP | |||
congestion control [RFC5681]) or after a certain amount of time | congestion control [RFC5681]) or after a certain amount of time | |||
(e.g. the RACK approach [RFC8985]). | (e.g. the RACK approach [RFC8985]). | |||
As we attempt to scale packet rate over the years: | As we attempt to scale packet rate over the years: | |||
* Even if only _some_ sending hosts still deem that loss has | o Even if only _some_ sending hosts still deem that loss has | |||
occurred by counting reordered packets, _all_ networks will have | occurred by counting reordered packets, _all_ networks will have | |||
to keep reducing the time over which they keep packets in order. | to keep reducing the time over which they keep packets in order. | |||
If some link technologies keep the time within which reordering | If some link technologies keep the time within which reordering | |||
occurs roughly unchanged, then loss over these links, as perceived | occurs roughly unchanged, then loss over these links, as perceived | |||
by these hosts, will appear to continually rise over the years. | by these hosts, will appear to continually rise over the years. | |||
* In contrast, if all senders detect loss in units of time, the time | o In contrast, if all senders detect loss in units of time, the time | |||
over which the network has to keep packets in order stays roughly | over which the network has to keep packets in order stays roughly | |||
invariant. | invariant. | |||
Therefore, hosts have an incentive to detect loss in time units (so | Therefore, hosts have an incentive to detect loss in time units (so | |||
as not to fool themselves too often into detecting losses when there | as not to fool themselves too often into detecting losses when there | |||
are none). And for hosts that are changing their congestion control | are none). And for hosts that are changing their congestion control | |||
implementation to L4S, there is no downside to including time-based | implementation to L4S, there is no downside to including time-based | |||
loss detection code in the change (loss recovery implemented in | loss detection code in the change (loss recovery implemented in | |||
hardware is an exception, covered later). Therefore, requiring L4S | hardware is an exception, covered later). Therefore, requiring L4S | |||
hosts to detect loss in time-based units would not be a burden. | hosts to detect loss in time-based units would not be a burden. | |||
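The scaling argument above can be illustrated with a toy model. It is a sketch under simplifying assumptions made here, not taken from the draft: a link reorders a packet by a fixed time, the number of packets that overtake it is the packet rate multiplied by that time, and the time-based detector uses a fixed reordering window (all names and the window value are hypothetical).

```python
def spurious_loss_by_count(packet_rate_pps, reorder_time_s, dupack_threshold=3):
    """Count-based rule: a delayed packet is deemed lost once at least
    dupack_threshold later packets have arrived ahead of it."""
    packets_overtaking = packet_rate_pps * reorder_time_s
    return packets_overtaking >= dupack_threshold


def spurious_loss_by_time(reorder_time_s, reorder_window_s):
    """Time-based rule: a delayed packet is deemed lost only if it is
    later than the reordering window, whatever the packet rate."""
    return reorder_time_s > reorder_window_s
```

With a link that reorders by 1 ms, the count-based rule sees no loss at 1,000 packets per second but a spurious loss at 100,000 packets per second, whereas a 2 ms reordering window gives the same answer at both rates.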
skipping to change at page 59, line 4 ¶ | skipping to change at page 57, line 17 ¶ | |||
Motivation: As an example, a new DCTCP flow takes longer than a | Motivation: As an example, a new DCTCP flow takes longer than a | |||
Classic congestion control to obtain its share of the capacity of the | Classic congestion control to obtain its share of the capacity of the | |||
bottleneck when there are already ongoing flows using the bottleneck | bottleneck when there are already ongoing flows using the bottleneck | |||
capacity. In a data centre environment DCTCP takes about a factor of | capacity. In a data centre environment DCTCP takes about a factor of | |||
1.5 to 2 longer to converge due to the much higher typical level of | 1.5 to 2 longer to converge due to the much higher typical level of | |||
ECN marking that DCTCP background traffic induces, which causes new | ECN marking that DCTCP background traffic induces, which causes new | |||
flows to exit slow start early [Alizadeh-stability]. In testing for | flows to exit slow start early [Alizadeh-stability]. In testing for | |||
use over the public Internet the convergence time of DCTCP relative | use over the public Internet the convergence time of DCTCP relative | |||
to a regular loss-based TCP slow start is even less | to a regular loss-based TCP slow start is even less | |||
favourable [Paced-Chirping] due to the shallow ECN marking threshold | favourable [LinuxPacedChirping] due to the shallow ECN marking | |||
needed for L4S. It is exacerbated by the typically greater mismatch | threshold needed for L4S. It is exacerbated by the typically greater | |||
between the link rate of the sending host and typical Internet access | mismatch between the link rate of the sending host and typical | |||
bottlenecks. This problem is detrimental in general, but would | Internet access bottlenecks. This problem is detrimental in general, | |||
particularly harm the performance of short flows relative to Classic | but would particularly harm the performance of short flows relative | |||
congestion controls. | to Classic congestion controls. | |||
Appendix B. Compromises in the Choice of L4S Identifier | Appendix B. Compromises in the Choice of L4S Identifier | |||
This appendix is informative, not normative. As explained in | This appendix is informative, not normative. As explained in | |||
Section 3, there is insufficient space in the IP header (v4 or v6) to | Section 3, there is insufficient space in the IP header (v4 or v6) to | |||
fully accommodate every requirement. So the choice of L4S identifier | fully accommodate every requirement. So the choice of L4S identifier | |||
involves tradeoffs. This appendix records the pros and cons of the | involves tradeoffs. This appendix records the pros and cons of the | |||
choice that was made. | choice that was made. | |||
Non-normative recap of the chosen codepoint scheme: | Non-normative recap of the chosen codepoint scheme: | |||
Packets with ECT(1) and conditionally packets with CE signify L4S | Packets with ECT(1) and conditionally packets with CE signify L4S | |||
semantics as an alternative to the semantics of Classic | semantics as an alternative to the semantics of Classic | |||
ECN [RFC3168], specifically: | ECN [RFC3168], specifically: | |||
- The ECT(1) codepoint signifies that the packet was sent by an | * The ECT(1) codepoint signifies that the packet was sent by an | |||
L4S-capable sender. | L4S-capable sender. | |||
- Given shortage of codepoints, both L4S and Classic ECN sides of | * Given shortage of codepoints, both L4S and Classic ECN sides of | |||
an AQM have to use the same CE codepoint to indicate that a | an AQM have to use the same CE codepoint to indicate that a | |||
packet has experienced congestion. If a packet that had | packet has experienced congestion. If a packet that had | |||
already been marked CE in an upstream buffer arrived at a | already been marked CE in an upstream buffer arrived at a | |||
subsequent AQM, this AQM would then have to guess whether to | subsequent AQM, this AQM would then have to guess whether to | |||
classify CE packets as L4S or Classic ECN. Choosing the L4S | classify CE packets as L4S or Classic ECN. Choosing the L4S | |||
treatment is a safer choice, because then a few Classic packets | treatment is a safer choice, because then a few Classic packets | |||
might arrive early, rather than a few L4S packets arriving | might arrive early, rather than a few L4S packets arriving | |||
late. | late. | |||
- Additional information might be available if the classifier | * Additional information might be available if the classifier | |||
were transport-aware. Then it could classify a CE packet for | were transport-aware. Then it could classify a CE packet for | |||
Classic ECN treatment if the most recent ECT packet in the same | Classic ECN treatment if the most recent ECT packet in the same | |||
flow had been marked ECT(0). However, the L4S service ought | flow had been marked ECT(0). However, the L4S service ought | |||
not to need transport-layer awareness. | not to need transport-layer awareness. | |||
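The codepoint recap above can be illustrated with a minimal, non-normative sketch. The codepoint values follow RFC 3168; the names `ECN`, `classify` and `TransportAwareClassifier`, and the use of an opaque flow identifier, are assumptions made for illustration only and do not appear in either draft. The second classifier shows the optional transport-aware refinement described in the last bullet, which the text says the L4S service ought not to need.

```python
from enum import IntEnum


class ECN(IntEnum):
    """The 2-bit IP-ECN field values defined in RFC 3168."""
    NOT_ECT = 0b00
    ECT1 = 0b01
    ECT0 = 0b10
    CE = 0b11


def classify(ecn):
    """Transport-unaware choice: ECT(1) and CE get L4S treatment."""
    return "L4S" if ecn in (ECN.ECT1, ECN.CE) else "Classic"


class TransportAwareClassifier:
    """Hypothetical refinement: remember the last ECT codepoint per flow."""

    def __init__(self):
        self.last_ect = {}  # flow id -> most recent ECT(0)/ECT(1) seen

    def classify(self, flow_id, ecn):
        if ecn in (ECN.ECT0, ECN.ECT1):
            self.last_ect[flow_id] = ecn
        # A CE packet is treated as Classic only if the most recent ECT
        # packet in the same flow was ECT(0); otherwise fall back to the
        # transport-unaware rule.
        if ecn == ECN.CE and self.last_ect.get(flow_id) == ECN.ECT0:
            return "Classic"
        return classify(ecn)
```

With no per-flow history, a CE packet defaults to the L4S treatment, which matches the "safer choice" reasoning in the second bullet.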
Cons: | Cons: | |||
Consumes the last ECN codepoint: The L4S service could potentially | Consumes the last ECN codepoint: The L4S service could potentially | |||
supersede the service provided by Classic ECN, therefore using | supersede the service provided by Classic ECN, therefore using | |||
ECT(1) to identify L4S packets could ultimately mean that the | ECT(1) to identify L4S packets could ultimately mean that the | |||
skipping to change at page 64, line 37 ¶ | skipping to change at page 62, line 43 ¶ | |||
the sum, so it could not suppress feedback of a loss or mark without | the sum, so it could not suppress feedback of a loss or mark without | |||
a 50-50 chance of guessing the sum incorrectly. | a 50-50 chance of guessing the sum incorrectly. | |||
It is highly unlikely that ECT(1) will be needed as a nonce for | It is highly unlikely that ECT(1) will be needed as a nonce for | |||
integrity protection of congestion notifications in future. The ECN | integrity protection of congestion notifications in future. The ECN | |||
Nonce RFC [RFC3540] has been reclassified as historic, partly because | Nonce RFC [RFC3540] has been reclassified as historic, partly because | |||
other ways (that do not consume a codepoint in the IP header) have | other ways (that do not consume a codepoint in the IP header) have | |||
been developed to protect feedback integrity of TCP and other | been developed to protect feedback integrity of TCP and other | |||
transports [RFC8311]. For instance: | transports [RFC8311]. For instance: | |||
* the sender can test the integrity of a small random sample of the | o the sender can test the integrity of a small random sample of the | |||
receiver's feedback by occasionally setting the IP-ECN field to a | receiver's feedback by occasionally setting the IP-ECN field to a | |||
value normally only set by the network. Then it can test whether | value normally only set by the network. Then it can test whether | |||
the receiver's feedback faithfully reports what it expects (see | the receiver's feedback faithfully reports what it expects (see | |||
para 2 of Section 20.2 of the ECN spec [RFC3168]). This works for | para 2 of Section 20.2 of the ECN spec [RFC3168]). This works for | |||
loss and it will work for the accurate ECN feedback [RFC7560] | loss and it will work for the accurate ECN feedback [RFC7560] | |||
intended for L4S. Like the (historic) ECN nonce, this technique | intended for L4S. Like the (historic) ECN nonce, this technique | |||
does not protect against a misbehaving sender. But it allows a | does not protect against a misbehaving sender. But it allows a | |||
well-behaved sender to check that each receiver is correctly | well-behaved sender to check that each receiver is correctly | |||
feeding back congestion notifications. | feeding back congestion notifications. | |||
* A network can check that its ECN markings (or packet losses) have | o A network can check that its ECN markings (or packet losses) have | |||
been passed correctly round the full feedback loop by auditing | been passed correctly round the full feedback loop by auditing | |||
congestion exposure (ConEx) [RFC7713]. This assures that the | congestion exposure (ConEx) [RFC7713]. This assures that the | |||
integrity of congestion notifications and feedback messages must | integrity of congestion notifications and feedback messages must | |||
have both been preserved. ConEx information is also available | have both been preserved. ConEx information is also available | |||
anywhere along the network path, so it can be used to enforce a | anywhere along the network path, so it can be used to enforce a | |||
congestion response. Whether the receiver or a downstream network | congestion response. Whether the receiver or a downstream network | |||
is suppressing congestion feedback or the sender is unresponsive | is suppressing congestion feedback or the sender is unresponsive | |||
to the feedback, or both, ConEx is intended to neutralise any | to the feedback, or both, ConEx is intended to neutralise any | |||
advantage that any of these three parties would otherwise gain. | advantage that any of these three parties would otherwise gain. | |||
* Congestion feedback fields in transport layer headers are | o Congestion feedback fields in transport layer headers are | |||
immutable end-to-end, and therefore amenable to end-to-end | immutable end-to-end, and therefore amenable to end-to-end | |||
integrity protection. This preserves the integrity of a | integrity protection. This preserves the integrity of a | |||
receiver's feedback messages to the sender, but it does not | receiver's feedback messages to the sender, but it does not | |||
protect against misbehaving receivers or misbehaving senders. The | protect against misbehaving receivers or misbehaving senders. The | |||
TCP authentication option (TCP-AO [RFC5925]), QUIC's end-to-end | TCP authentication option (TCP-AO [RFC5925]), QUIC's end-to-end | |||
protection [RFC9001] or end-to-end IPsec integrity protection | protection [RFC9001] or end-to-end IPsec integrity protection | |||
[RFC4303] can be used to detect any tampering with congestion | [RFC4303] can be used to detect any tampering with congestion | |||
feedback (whether malicious or accidental), respectively in TCP, | feedback (whether malicious or accidental), respectively in TCP, | |||
QUIC or any transport. TCP-AO covers the main TCP header and TCP | QUIC or any transport. TCP-AO covers the main TCP header and TCP | |||
options by default, but it is often too brittle to use on many | options by default, but it is often too brittle to use on many | |||
end-to-end paths, where middleboxes can make verification fail in | end-to-end paths, where middleboxes can make verification fail in | |||
their attempts to improve performance or security, e.g. by | their attempts to improve performance or security, e.g. by | |||
resegmentation or shifting the sequence space. | resegmentation or shifting the sequence space. | |||
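The first technique in the list above (the sender occasionally setting the IP-ECN field to a value normally only set by the network) can be sketched as a small simulation. This is a hedged illustration, not an implementation from either draft: the function names, the probe probability, and the simplifying assumption that no probe packets are lost are all hypothetical. The check relies only on the fact that the network can add CE marks but the sender knows a lower bound on how many must be reported.

```python
import random


def send_with_probes(n_packets, probe_prob, rng):
    """Return one flag per packet: True where the sender itself set CE."""
    return [rng.random() < probe_prob for _ in range(n_packets)]


def receiver_feedback(ce_flags, honest):
    """Number of CE marks the receiver feeds back.

    An honest receiver reports every CE it saw; the dishonest receiver
    modelled here suppresses all congestion feedback.
    """
    return sum(ce_flags) if honest else 0


def feedback_is_plausible(probes_sent, ce_reported):
    """Assuming no probe was lost, reported CE must cover every probe."""
    return ce_reported >= probes_sent
```

A sender would compare the number of probes it set with the CE count fed back; a shortfall suggests the receiver (or something on the return path) is suppressing feedback. As the text notes, this does not protect against a misbehaving sender.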
At the time of writing, it is not common to protect the integrity of | At the time of writing, it is becoming common to protect the | |||
congestion feedback, whether loss or Classic ECN. If this position | integrity of transport feedback, using QUIC. However, it is still | |||
changes during the L4S experiment, one or more of the above | not common to protect the integrity of the wider congestion feedback | |||
techniques might need to be developed and deployed. | loop, whether based on loss or Classic ECN. If this position changes | |||
during the L4S experiment, more of the above techniques might need to | ||||
be developed and deployed. | ||||
C.2. Notification of Less Severe Congestion than CE | C.2. Notification of Less Severe Congestion than CE | |||
Various researchers have proposed to use ECT(1) as a less severe | Various researchers have proposed to use ECT(1) as a less severe | |||
congestion notification than CE, particularly to enable flows to fill | congestion notification than CE, particularly to enable flows to fill | |||
available capacity more quickly after an idle period, when another | available capacity more quickly after an idle period, when another | |||
flow departs or when a flow starts, e.g. VCP [VCP], Queue View | flow departs or when a flow starts, e.g. VCP [VCP], Queue View | |||
(QV) [QV]. | (QV) [QV]. | |||
Before assigning ECT(1) as an identifier for L4S, we must carefully | Before assigning ECT(1) as an identifier for L4S, we must carefully | |||
skipping to change at page 67, line 4 ¶ | skipping to change at page 65, line 6 ¶ | |||
The authors' contributions were part-funded by the European Community | The authors' contributions were part-funded by the European Community | |||
under its Seventh Framework Programme through the Reducing Internet | under its Seventh Framework Programme through the Reducing Internet | |||
Transport Latency (RITE) project (ICT-317700). The contribution of | Transport Latency (RITE) project (ICT-317700). The contribution of | |||
Koen De Schepper was also part-funded by the 5Growth and DAEMON EU | Koen De Schepper was also part-funded by the 5Growth and DAEMON EU | |||
H2020 projects. Bob Briscoe was also funded partly by the Research | H2020 projects. Bob Briscoe was also funded partly by the Research | |||
Council of Norway through the TimeIn project, partly by CableLabs and | Council of Norway through the TimeIn project, partly by CableLabs and | |||
partly by the Comcast Innovation Fund. The views expressed here are | partly by the Comcast Innovation Fund. The views expressed here are | |||
solely those of the authors. | solely those of the authors. | |||
Authors' Addresses | Authors' Addresses | |||
Koen De Schepper | ||||
Nokia Bell Labs | Koen De Schepper | |||
Antwerp | Nokia Bell Labs | |||
Belgium | Antwerp | |||
Email: koen.de_schepper@nokia.com | Belgium | |||
URI: https://www.bell-labs.com/about/researcher-profiles/ | ||||
koende_schepper/ | Email: koen.de_schepper@nokia.com | |||
URI: https://www.bell-labs.com/about/researcher-profiles/koende_schepper/ | ||||
Bob Briscoe (editor) | Bob Briscoe (editor) | |||
Independent | Independent | |||
United Kingdom | UK | |||
Email: ietf@bobbriscoe.net | Email: ietf@bobbriscoe.net | |||
URI: https://bobbriscoe.net/ | URI: https://bobbriscoe.net/ | |||
End of changes. 138 change blocks. | ||||
270 lines changed or deleted | 237 lines changed or added | |||