rfc9626xml2.original.xml | rfc9626.xml | |||
---|---|---|---|---|
<?xml version="1.0" encoding="UTF-8"?> | <?xml version='1.0' encoding='UTF-8'?> | |||
<!DOCTYPE rfc > | ||||
<?rfc compact="yes"?> | ||||
<?rfc subcompact="yes"?> | ||||
<?rfc iprnotified="no" ?> | ||||
<?rfc strict="yes"?> | ||||
<?rfc symrefs="yes"?> | ||||
<?rfc toc="yes"?> | ||||
<?rfc tocdepth="4"?> | ||||
<rfc category="exp" docName="draft-ietf-avtext-framemarking-16" ipr="trust200902 | <!DOCTYPE rfc [ | |||
" submissionType="IETF"> | <!ENTITY nbsp "&#160;"> | |||
<!ENTITY zwsp "&#8203;"> | ||||
<!ENTITY nbhy "&#8209;"> | ||||
<!ENTITY wj "&#8288;"> | ||||
]> | ||||
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="exp" docName="draft-ie | ||||
tf-avtext-framemarking-16" number="9626" consensus="true" updates="" obsoletes=" | ||||
" ipr="trust200902" submissionType="IETF" symRefs="true" tocInclude="true" tocDe | ||||
pth="4" version="3" xml:lang="en"> | ||||
<front> | <front> | |||
<title abbrev="Video Frame Marking">Video Frame Marking RTP Header Extension </title> | <title abbrev="Video Frame Marking">Video Frame Marking RTP Header Extension </title> | |||
<seriesInfo name="RFC" value="9626"/> | ||||
<author fullname="Mo Zanaty" initials="M" surname="Zanaty"> | <author fullname="Mo Zanaty" initials="M" surname="Zanaty"> | |||
<organization>Cisco Systems</organization> | <organization>Cisco Systems</organization> | |||
<address> | <address> | |||
<postal> | <postal> | |||
<street>170 West Tasman Drive</street> | <street>170 West Tasman Drive</street> | |||
<city>San Jose</city> | <city>San Jose</city> | |||
<region>CA</region> | <region>CA</region> | |||
<code>95134</code> | <code>95134</code> | |||
<country>US</country> | <country>United States of America</country> | |||
</postal> | </postal> | |||
<email>mzanaty@cisco.com</email> | <email>mzanaty@cisco.com</email> | |||
</address> | </address> | |||
</author> | </author> | |||
<author initials="E." surname="Berger" fullname="Espen Berger"> | <author initials="E." surname="Berger" fullname="Espen Berger"> | |||
<organization>Cisco Systems</organization> | <organization>Cisco Systems</organization> | |||
<address> | <address> | |||
<email>espeberg@cisco.com</email> | <email>espeberg@cisco.com</email> | |||
</address> | </address> | |||
</author> | </author> | |||
<author fullname="Suhas Nandakumar" initials="S" surname="Nandakumar"> | <author fullname="Suhas Nandakumar" initials="S" surname="Nandakumar"> | |||
<organization>Cisco Systems</organization> | <organization>Cisco Systems</organization> | |||
<address> | <address> | |||
<postal> | <postal> | |||
<street>170 West Tasman Drive</street> | <street>170 West Tasman Drive</street> | |||
<city>San Jose</city> | <city>San Jose</city> | |||
<region>CA</region> | <region>CA</region> | |||
<code>95134</code> | <code>95134</code> | |||
<country>US</country> | <country>United States of America</country> | |||
</postal> | </postal> | |||
<email>snandaku@cisco.com</email> | <email>snandaku@cisco.com</email> | |||
</address> | </address> | |||
</author> | </author> | |||
<date day="04" month="March" year="2024"/> | <date month="August" year="2024"/> | |||
<area>Applications</area> | <area>WIT</area> | |||
<keyword>Internet-Draft</keyword> | <workgroup>avtcore</workgroup> | |||
<!-- [rfced] Please insert any keywords (beyond those that appear in | ||||
the title) for use on https://www.rfc-editor.org/search. --> | ||||
<keyword>example</keyword> | ||||
<abstract> | <abstract> | |||
<t>This document describes a Video Frame Marking RTP header extension used to | <t>This document describes a Video Frame Marking RTP header extension used to | |||
convey information about video frames that is critical for error recovery | convey information about video frames that is critical for error recovery | |||
and packet forwarding in RTP middleboxes or network nodes. It is most | and packet forwarding in RTP middleboxes or network nodes. It is most | |||
useful when media is encrypted, and essential when the middlebox or node | useful when media is encrypted and essential when the middlebox or node | |||
has no access to the media decryption keys. It is also useful for | has no access to the media decryption keys. It is also useful for | |||
codec-agnostic processing of encrypted or unencrypted media, while it also | codec-agnostic processing of encrypted or unencrypted media, while it also | |||
supports extensions for codec-specific information.</t> | supports extensions for codec-specific information.</t> | |||
</abstract> | </abstract> | |||
</front> | </front> | |||
<middle> | <middle> | |||
<section anchor="intro"> | ||||
<section title="Introduction" anchor="intro"> | <name>Introduction</name> | |||
<t>Many widely deployed RTP <xref target="RFC3550" /> topologies | <t>Many widely deployed RTP <xref target="RFC3550"/> topologies | |||
<xref target="RFC7667" /> used in modern voice and video | <xref target="RFC7667"/> used in modern voice and video | |||
conferencing systems include a centralized component that acts as an RTP switch. | conferencing systems include a centralized component that acts as an RTP switch. | |||
It receives voice and video streams from each participant, which may be encrypted using | It receives voice and video streams from each participant, which may be encrypted using | |||
SRTP <xref target="RFC3711" />, or extensions that provide participants wi | Secure Real-time Transport Protocol (SRTP) <xref target="RFC3711"/> or ext | |||
th | ensions that provide participants with | |||
private media <xref target="RFC8871" /> | private media <xref target="RFC8871"/> | |||
via end-to-end encryption where the switch has no access to media decryption keys. | via end-to-end encryption where the switch has no access to media decryption keys. | |||
The goal is to provide a set of streams back to | The goal is to provide a set of streams back to | |||
the participants which enable them to render the right media content. In a | the participants, which enable them to render the right media content. For | |||
simple video configuration, for example, the goal will be that each partic | example, in a | |||
ipant | simple video configuration, the goal will be that each participant | |||
sees and hears just the active speaker. In that case, the goal of the switch is to | sees and hears just the active speaker. In that case, the goal of the switch is to | |||
receive the voice and video streams from each participant, determine the active | receive the voice and video streams from each participant, determine the active | |||
speaker based on energy in the voice packets, possibly using the client-to-mixer | speaker based on energy in the voice packets, possibly using the client-to-mixer | |||
audio level RTP header extension <xref target="RFC6464" />, and select the | audio level RTP header extension <xref target="RFC6464"/>, and select the | |||
corresponding video | corresponding video | |||
stream for transmission to participants; see <xref target="rtpswitch" /> | stream for transmission to participants; see <xref target="rtpswitch"/>. | |||
.</t> | </t> | |||
<t>In this document, an "RTP switch" is used as shorthand for the terms | ||||
<t>In this document, an "RTP switch" is used as a common short term for th | ||||
e terms | ||||
"switching RTP mixer", "source projecting middlebox", | "switching RTP mixer", "source projecting middlebox", | |||
"source forwarding unit/middlebox" and "video switching MCU" as | "source forwarding unit/middlebox" and "video switching Multipoint Control | |||
discussed in <xref target="RFC7667" />.</t> | Unit (MCU)", as | |||
discussed in <xref target="RFC7667"/>.</t> | ||||
<figure title="RTP switch" anchor="rtpswitch"><artwork><![CDATA[ | <figure anchor="rtpswitch"> | |||
<name>RTP Switch</name> | ||||
<artwork><![CDATA[ | ||||
+---+ +------------+ +---+ | +---+ +------------+ +---+ | |||
| A |<---->| |<---->| B | | | A |<---->| |<---->| B | | |||
+---+ | | +---+ | +---+ | | +---+ | |||
| RTP | | | RTP | | |||
+---+ | Switch | +---+ | +---+ | Switch | +---+ | |||
| C |<---->| |<---->| D | | | C |<---->| |<---->| D | | |||
+---+ +------------+ +---+ | +---+ +------------+ +---+ | |||
]]> | ||||
</artwork></figure> | ||||
<t>In order to properly support switching of video streams, the RTP switch t | ]]></artwork> | |||
ypically needs | </figure> | |||
<t>In order to properly support the switching of video streams, the RTP sw | ||||
itch typically needs | ||||
some critical information about video frames in order to start and stop forwarding streams. | some critical information about video frames in order to start and stop forwarding streams. | |||
<list style="symbols"> | </t> | |||
<t>Because of inter-frame dependencies, it should ideally switch video s | <ul> | |||
treams at a point | <li> | |||
<!--[rfced] Please review whether "e.g." in the following should | ||||
instead be "i.e.": | ||||
Original: | ||||
Because of inter-frame dependencies, it should ideally switch video | ||||
streams at a point where the first frame from the new speaker can be | ||||
decoded by recipients without prior frames, e.g. switch on an | ||||
intra-frame. | ||||
--> | ||||
<t>Because of inter-frame dependencies, it should ideally switch video | ||||
streams at a point | ||||
where the first frame from the new speaker can be decoded by recipients without prior | where the first frame from the new speaker can be decoded by recipients without prior | |||
frames, e.g switch on an intra-frame.</t> | frames, e.g., switch on an intra-frame.</t> | |||
<t>In many cases, the switch may need to drop frames in order to realize | </li> | |||
congestion control | <li> | |||
techniques, and needs to know which frames can be dropped with minimal i | <t>In many cases, the switch may need to drop frames in order to reali | |||
mpact to video quality.</t> | ze congestion control | |||
<t>For scalable streams with dependent layers, the switch may need to se | techniques, and it needs to know which frames can be dropped with minima | |||
lectively forward | l impact to video quality.</t> | |||
</li> | ||||
<li> | ||||
<t>For scalable streams with dependent layers, the switch may need to | ||||
selectively forward | ||||
specific layers to specific recipients due to recipient bandwidth or decoder limits.</t> | specific layers to specific recipients due to recipient bandwidth or decoder limits.</t> | |||
</list> | </li> | |||
</t> | </ul> | |||
<t>Furthermore, it is highly desirable to do this in a payload format-agno | ||||
<t>Furthermore, it is highly desirable to do this in a payload format-agno | stic way that is not | |||
stic way which is not | ||||
specific to each different video codec. | specific to each different video codec. | |||
Most modern video codecs share common concepts around frame types and other critical information | Most modern video codecs share common concepts around frame types and other critical information | |||
to make this codec-agnostic handling possible.</t> | to make this codec-agnostic handling possible.</t> | |||
<t>It is also desirable to be able to do this for SRTP without requiring the video switch to | <t>It is also desirable to be able to do this for SRTP without requiring the video switch to | |||
decrypt the packets. SRTP will encrypt the RTP payload format contents and consequently this | decrypt the packets. SRTP will encrypt the RTP payload format contents; consequently, this | |||
data is not usable for the switching function without decryption, which may not even | data is not usable for the switching function without decryption, which may not even | |||
be possible in the case of end-to-end encryption of private media | be possible in the case of end-to-end encryption of private media | |||
<xref target="RFC8871" />.</t> | <xref target="RFC8871"/>.</t> | |||
<t>By providing meta-information about the RTP streams outside the encrypted media payload, an | <t>By providing meta-information about the RTP streams outside the encrypted media payload, an | |||
RTP switch can do codec-agnostic selective forwarding without decrypting the payload. | RTP switch can do codec-agnostic selective forwarding without decrypting the payload. | |||
This document specifies the necessary meta-information in an RTP header extension. | This document specifies the necessary meta-information in an RTP header extension. | |||
</t> | </t> | |||
</section> | ||||
<section title="Key Words for Normative Requirements"> | ||||
<t> | ||||
The key words "MUST", "MUST NOT", "REQUIRED&quo | ||||
t;, | ||||
"SHALL", "SHALL NOT", "SHOULD", "SHOU | ||||
LD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY&q | ||||
uot;, and | ||||
"OPTIONAL" in this document are to be interpreted as described | ||||
in | ||||
BCP 14 <xref target="RFC2119" /> <xref target="RFC8174" /> when, and on | ||||
ly when, they | ||||
appear in all capitals, as shown here. </t> | ||||
</section> | </section> | |||
<section> | ||||
<name>Requirements Language</name> | ||||
<section title="Frame Marking RTP Header Extension"> | <t> | |||
<t>This specification uses RTP header extensions as defined in <xref targe | The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", | |||
t="RFC8285" />. A subset of | "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL NOT</bcp14> | |||
", | ||||
"<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>", | ||||
"<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>", | ||||
"<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to | ||||
be | ||||
interpreted as described in BCP 14 <xref target="RFC2119"/> <xref | ||||
target="RFC8174"/> when, and only when, they appear in all capitals, as | ||||
shown here. | ||||
</t> | ||||
</section> | ||||
<section> | ||||
<name>Frame Marking RTP Header Extension</name> | ||||
<t>This specification uses RTP header extensions as defined in <xref targe | ||||
t="RFC8285"/>. A subset of | ||||
meta-information from the video stream is provided as an RTP header extension to allow an RTP switch | meta-information from the video stream is provided as an RTP header extension to allow an RTP switch | |||
to do generic selective forwarding of video streams encoded with potentially different video codecs.</t> | to do generic selective forwarding of video streams encoded with potentially different video codecs.</t> | |||
<t>The Frame Marking RTP header extension is encoded | ||||
<t>The Frame Marking RTP header extension is encoded | using the one-byte header or two-byte header as described in <xref target | |||
using the one-byte header or two-byte header as described in <xref target | ="RFC8285"/>. | |||
="RFC8285" />. | The one-byte header format is used for examples in this document. | |||
The one-byte header format is used for examples in this memo. | ||||
The two-byte header format is used when other two-byte header extensions | The two-byte header format is used when other two-byte header extensions | |||
are present in the same RTP packet, since mixing one-byte and two-byte extensions | are present in the same RTP packet since mixing one-byte and two-byte extensions | |||
is not possible in the same RTP packet.</t> | is not possible in the same RTP packet.</t> | |||
<t>This extension is only specified for Source (not Redundancy) RTP Stream | ||||
<t>This extension is only specified for Source (not Redundancy) RTP Stre | s | |||
ams | <xref target="RFC7656"/> that carry video payloads. | |||
<xref target="RFC7656" /> that carry video payloads. | ||||
It is not specified for audio payloads, nor is it specified for Redundancy RTP Streams. | It is not specified for audio payloads, nor is it specified for Redundancy RTP Streams. | |||
The (separate) specifications for Redundancy RTP Streams often include | The (separate) specifications for Redundancy RTP Streams often include | |||
provisions for recovering any header extensions that were part of the original source packet. | provisions for recovering any header extensions that were part of the original source packet. | |||
Such provisions can be followed to recover the Frame Marking RTP header extension of the | Such provisions can be followed to recover the Frame Marking RTP header extension of the | |||
original source packet. | original source packet. | |||
Source packet frame markings may be useful when generating Redundancy RTP Streams; | Source packet frame markings may be useful when generating Redundancy RTP Streams; | |||
for example, the I (Independent Frame) and D (Discardable Frame) bits, | for example, the I (Independent Frame) and D (Discardable Frame) bits, | |||
defined in <xref target="mandatory-scalable" />, | defined in <xref target="mandatory-scalable"/>, | |||
can be used to generate extra or no redundancy, respectively, | can be used to generate extra or no redundancy, respectively, | |||
and redundancy schemes with source blocks can align source block boundaries with | and redundancy schemes with source blocks can align source block boundaries with | |||
independent frame boundaries as marked by the I bit. | independent frame boundaries as marked by the I bit. | |||
</t> | </t> | |||
<t>A frame, in the context of this specification, is the set of RTP pack | <t>A frame, in the context of this specification, is the set of RTP packet | |||
ets | s | |||
with the same RTP timestamp from a specific RTP synchronization source | with the same RTP timestamp from a specific RTP Synchronization Source | |||
(SSRC). | (SSRC). | |||
A frame within a layer is the set of RTP packets with the same RTP timestamp, SSRC, | A frame within a layer is the set of RTP packets with the same RTP timestamp, SSRC, | |||
Temporal ID (TID), and Layer ID (LID).</t> | Temporal ID (TID), and Layer ID (LID).</t> | |||
<section anchor="mandatory-scalable"> | ||||
<section title="Long Extension for Scalable Streams" anchor="mandatory-sca | <name>Long Extension for Scalable Streams</name> | |||
lable"> | <t>The following RTP header extension is <bcp14>RECOMMENDED</bcp14> for | |||
<t>The following RTP header extension is RECOMMENDED for scalable streams | scalable streams. | |||
. | It <bcp14>MAY</bcp14> also be used for non-scalable streams, in which | |||
It MAY also be used for non-scalable streams, in which case TID, LID | case the TID, LID, and TL0PICIDX <bcp14>MUST</bcp14> be 0 or omitted. | |||
and TL0PICIDX MUST be 0 or omitted. | The ID is assigned per <xref target="RFC8285"/>. | |||
The ID is assigned per <xref target="RFC8285" />, | The length is encoded as follows:</t> | |||
and the length is encoded as L=2 which indicates 3 octets of data whe | <ul> | |||
n nothing is omitted, | <li>L=2 to indicate 3 octets of data when nothing is omitted,</li> | |||
or L=1 for 2 octets when TL0PICIDX is omitted, or L=0 for 1 octet whe | <li>L=1 for 2 octets when TL0PICIDX is omitted, or</li> | |||
n both LID and TL0PICIDX are omitted.</t> | <li>L=0 for 1 octet when both the LID and TL0PICIDX are omitted.</li | |||
<figure> | ></ul> | |||
<artwork><![CDATA[ | <artwork><![CDATA[ | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| ID=? | L=2 |S|E|I|D|B| TID | LID | TL0PICIDX | | | ID=? | L=2 |S|E|I|D|B| TID | LID | TL0PICIDX | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
or | or | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| ID=? | L=1 |S|E|I|D|B| TID | LID | (TL0PICIDX omitted) | | ID=? | L=1 |S|E|I|D|B| TID | LID | (TL0PICIDX omitted) | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
or | or | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| ID=? | L=0 |S|E|I|D|B| TID | (LID and TL0PICIDX omitted) | | ID=? | L=0 |S|E|I|D|B| TID | (LID and TL0PICIDX omitted) | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
]]></artwork></figure> | ]]></artwork> | |||
<t>The following information is extracted from the media payload and sen | ||||
t in the Frame Marking RTP header extension. | ||||
</t> | ||||
<dl newline="true"> | ||||
<t>The following information are extracted from the media payload and se | <dt>S: Start of Frame (1 bit)</dt><dd><bcp14>MUST</bcp14> be 1 in th | |||
nt in the Frame Marking RTP header extension. | e first packet in a frame | |||
<list style='symbols'> | within a layer; otherwise, <bcp14>MUST</bcp14> be 0.</dd> | |||
<t>S: Start of Frame (1 bit) - MUST be 1 in the first packet in a fr | ||||
ame | <dt>E: End of Frame (1 bit)</dt><dd><bcp14>MUST</bcp14> be 1 in the | |||
within a layer; otherwise MUST be 0.</t> | last packet in a frame | |||
<t>E: End of Frame (1 bit) - MUST be 1 in the last packet in a frame | within a layer; otherwise, <bcp14>MUST</bcp14> be 0. | |||
within a layer; otherwise MUST be 0. | Note that the RTP header marker bit <bcp14>MAY</bcp14> be used t | |||
Note that the RTP header marker bit MAY be used to infer the las | o infer the last packet of the highest enhancement layer in payload formats with | |||
t packet of the highest enhancement layer, in payload formats with such semantic | such semantics.</dd> | |||
s.</t> | ||||
<t>I: Independent Frame (1 bit) - MUST be 1 for a frame within a la | <dt>I: Independent Frame (1 bit)</dt><dd><bcp14>MUST</bcp14> be 1 f | |||
yer that can be | or a frame within a layer that can be | |||
decoded independent of temporally prior frames, e.g. intra-frame, | decoded independent of temporally prior frames, e.g., intra-frame | |||
VPX keyframe, | , VPX keyframe, | |||
H.264 IDR <xref target="RFC6184" />, | H.264 Instantaneous Decoding Refresh (IDR) <xref target="RFC6184" | |||
H.265 IDR/CRA/BLA/RAP <xref target="RFC7798" />; | />, or | |||
otherwise MUST be 0. | H.265 IDR / Clean Random Access (CRA) / Broken Link Access (BLA) | |||
/ Random Access Point (RAP) <xref target="RFC7798"/>; | ||||
otherwise, <bcp14>MUST</bcp14> be 0. | ||||
Note that this bit only signals temporal independence, so it can be | Note that this bit only signals temporal independence, so it can be | |||
1 in spatial or quality enhancement layers that depend on temporally | 1 in spatial or quality enhancement layers that depend on temporally | |||
co-located layers but not temporally prior frames.</t> | co-located layers but not temporally prior frames.</dd> | |||
<t>D: Discardable Frame (1 bit) - MUST be 1 for a frame within a lay | ||||
er the sender knows can be discarded, | ||||
and still provide a decodable media stream; otherwise MUST be 0. | ||||
</t> | ||||
<t>B: Base Layer Sync (1 bit) - When TID is not 0, this MUST be 1 if | ||||
the sender knows this frame within a layer only depends | ||||
on the base temporal layer; otherwise MUST be 0. When TID is 0 o | ||||
r if no scalability is used, this MUST be 0.</t> | ||||
<t>TID: Temporal ID (3 bits) - Identifies the temporal layer/sub-lay | ||||
er encoded, | ||||
starting with 0 for the base layer, and increasing with higher te | ||||
mporal fidelity. | ||||
If no scalability is used, this MUST be 0. It is implicitly 0 in | ||||
the short extension format.</t> | ||||
<t>LID: Layer ID (8 bits) - Identifies the spatial and quality layer | ||||
encoded, | ||||
starting with 0 for the base layer, and increasing with higher fi | ||||
delity. | ||||
If no scalability is used, this MUST be 0 or omitted to reduce le | ||||
ngth. | ||||
When omitted, TL0PICIDX MUST also be omitted. It is implicitly 0 | ||||
in the short extension format | ||||
or when omitted in the long extension format.</t> | ||||
<t>TL0PICIDX: Temporal Layer 0 Picture Index (8 bits) - When TID is | ||||
0 and LID is 0, this is a cyclic counter labeling | ||||
base layer frames. When TID is not 0 or LID is not 0, | ||||
this indicates a dependency on the given index, such that this fr | ||||
ame within this layer | ||||
depends on the frame with this label in the layer with TID 0 and | ||||
LID 0. | ||||
If no scalability is used, or the cyclic counter is unknown, this | ||||
MUST be omitted to reduce length. | ||||
Note that 0 is a valid index value for TL0PICIDX.</t> | ||||
</list> | ||||
</t> | ||||
<t>The layer information contained in TID and LID convey useful aspects o | <dt>D: Discardable Frame (1 bit)</dt><dd><bcp14>MUST</bcp14> be 1 fo | |||
f the layer structure that | r a frame within a layer the sender knows can be discarded | |||
and still provide a decodable media stream; otherwise, <bcp14>MU | ||||
ST</bcp14> be 0. </dd> | ||||
<dt>B: Base Layer Sync (1 bit)</dt><dd>When the TID is not 0, this < | ||||
bcp14>MUST</bcp14> be 1 if the sender knows this frame within a layer only depen | ||||
ds | ||||
on the base temporal layer; otherwise, <bcp14>MUST</bcp14> be 0. | ||||
When the TID is 0 or if no scalability is used, this <bcp14>MUST</bcp14> be 0. | ||||
</dd> | ||||
<dt>TID: Temporal ID (3 bits)</dt><dd>Identifies the temporal layer/ | ||||
sub-layer encoded, | ||||
starting with 0 for the base layer and increasing with higher tem | ||||
poral fidelity. | ||||
If no scalability is used, this <bcp14>MUST</bcp14> be 0. It is i | ||||
mplicitly 0 in the short extension format. | ||||
</dd> | ||||
<dt>LID: Layer ID (8 bits)</dt><dd>Identifies the spatial and qualit | ||||
y layer encoded, | ||||
starting with 0 for the base layer and increasing with higher fid | ||||
elity. | ||||
If no scalability is used, this <bcp14>MUST</bcp14> be 0 or omitt | ||||
ed to reduce length. | ||||
When the LID is omitted, TL0PICIDX <bcp14>MUST</bcp14> also be om | ||||
itted. It is implicitly 0 in the short extension format | ||||
or when omitted in the long extension format.</dd> | ||||
<dt>TL0PICIDX: Temporal Layer 0 Picture Index (8 bits)</dt><dd>When | ||||
the TID is 0 and the LID is 0, this is a cyclic counter labeling | ||||
base layer frames. When the TID is not 0 or the LID is not 0, | ||||
the indication is that a dependency on the given index, such that | ||||
this frame within this layer | ||||
depends on the frame with this label in the layer with a TID 0 an | ||||
d LID 0. | ||||
If no scalability is used, or the cyclic counter is unknown, TL0P | ||||
ICIDX <bcp14>MUST</bcp14> be omitted to reduce length. | ||||
Note that 0 is a valid index value for TL0PICIDX.</dd> | ||||
</dl> | ||||
<t>The layer information contained in the TID and LID convey useful aspe | ||||
cts of the layer structure that | ||||
can be utilized in selective forwarding.</t> | can be utilized in selective forwarding.</t> | |||
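The bit layout and field definitions above can be sketched as a small decoder. This is only an illustrative Python sketch, not part of either XML source: the names `FrameMarking` and `parse_frame_marking` are hypothetical, and the input is assumed to be the extension data octets only (1, 2, or 3 octets), with the RFC 8285 ID/length octet already removed by the caller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FrameMarking:
    """Decoded fields of the long (scalable) extension; names are illustrative."""
    start: bool            # S: first packet of a frame within a layer
    end: bool              # E: last packet of a frame within a layer
    independent: bool      # I: decodable without temporally prior frames
    discardable: bool      # D: can be dropped without breaking decoding
    base_layer_sync: bool  # B: depends only on the base temporal layer
    tid: int               # Temporal ID (3 bits)
    lid: int               # Layer ID (8 bits), implicitly 0 when omitted
    tl0picidx: Optional[int]  # None when omitted (0 is a valid index)


def parse_frame_marking(data: bytes) -> FrameMarking:
    """Parse 1, 2, or 3 octets of extension data (L=0, L=1, or L=2)."""
    if not 1 <= len(data) <= 3:
        raise ValueError("frame marking data must be 1, 2, or 3 octets")
    first = data[0]
    return FrameMarking(
        start=bool(first & 0x80),
        end=bool(first & 0x40),
        independent=bool(first & 0x20),
        discardable=bool(first & 0x10),
        base_layer_sync=bool(first & 0x08),
        tid=first & 0x07,
        lid=data[1] if len(data) >= 2 else 0,
        tl0picidx=data[2] if len(data) == 3 else None,
    )
```

Keeping `tl0picidx` as `None` when the octet is absent preserves the distinction between "omitted" and the valid index value 0.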
<t>Without further information about the layer structure, | <t>Without further information about the layer structure, | |||
these TID/LID identifiers can only be used for relative priority of layers | these TID/LID identifiers can only be used for relative priority of layers | |||
and implicit dependencies between layers. | and implicit dependencies between layers. | |||
They convey a layer hierarchy with TID=0 and LID=0 identifying the base layer. | They convey a layer hierarchy with TID = 0 and LID = 0 identifying the base layer. | |||
Higher values of TID identify higher temporal layers with higher frame rates. | Higher values of TID identify higher temporal layers with higher frame rates. | |||
Higher values of LID identify higher spatial and/or quality layers with higher resolutions and/or bitrates. | Higher values of LID identify higher spatial and/or quality layers with higher resolutions and/or bitrates. | |||
Implicit dependencies between layers assume that a layer with a given | Implicit dependencies between layers assume that a layer with a given | |||
TID/LID MAY depend | TID/LID <bcp14>MAY</bcp14> depend | |||
on layer(s) with the same or lower TID/LID, but MUST NOT depend on lay | on a layer or layers with the same or lower TID/LID, but they <bcp14>M | |||
er(s) with higher TID/LID. | UST NOT</bcp14> depend on a layer or layers with higher TID/LID. | |||
</t><t> | </t> | |||
<t> | ||||
With further information, | With further information, | |||
for example, possible future RTCP SDES items that convey full layer st | for example, possible future RTCP source description (SDES) items that | |||
ructure information, it may | convey full layer structure information, it may | |||
be possible to map these TIDs and LIDs to specific absolute frame rate | be possible to map these TIDs and LIDs to specific absolute frame rate | |||
s, resolutions and bitrates, | s, resolutions, bitrates, and explicit dependencies between layers. | |||
as well as explicit dependencies between layers. | Such additional layer information may be useful for forwarding decisio | |||
Such additional layer information may be useful for forwarding decisio | ns in the RTP switch | |||
ns in the RTP switch, | ||||
but is beyond the scope of this memo. The relative layer information is still useful | but is beyond the scope of this memo. The relative layer information is still useful | |||
for many selective forwarding decisions even without such additional layer information. | for many selective forwarding decisions, even without such additional layer information. | |||
</t> | </t> | |||
</section> | </section> | |||
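Because a layer never depends on a layer with a higher TID or LID, a switch can cap both values per recipient and the remaining layers stay decodable. The sketch below shows one possible policy in Python; `should_forward`, `max_tid`, and `max_lid` are hypothetical names, and how a switch picks the caps (for example, from bandwidth estimates) is an assumption outside the text.

```python
def should_forward(tid: int, lid: int, max_tid: int, max_lid: int) -> bool:
    """Return True if a packet's layer is within the recipient's caps.

    Relies only on the implicit dependency rule: layers at or below the
    caps never depend on layers above them, so dropping everything above
    the caps leaves a decodable stream.
    """
    return tid <= max_tid and lid <= max_lid


# Example: a recipient limited to the base spatial layer (LID 0)
# and the two lowest temporal layers (TID 0 and 1).
packets = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1)]  # (TID, LID) pairs
forwarded = [p for p in packets if should_forward(*p, max_tid=1, max_lid=0)]
```

This uses only the relative layer information; mapping TID and LID to absolute frame rates or resolutions would need the additional layer structure information mentioned above.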
<section anchor="mandatory-non-scalable"> | ||||
<section title="Short Extension for Non-Scalable Streams" anchor="mandator | <name>Short Extension for Non-scalable Streams</name> | |||
y-non-scalable"> | <t>The following RTP header extension is <bcp14>RECOMMENDED</bcp14> for | |||
<t>The following RTP header extension is RECOMMENDED for non-scalable str | non-scalable streams. | |||
eams. | ||||
It is identical to the shortest form of the extension for scalable streams, | It is identical to the shortest form of the extension for scalable streams, | |||
except the last four bits (B and TID) are replaced with zeros. | except the last four bits (B and TID) are replaced with zeros. | |||
It MAY also be used for scalable streams if the sender has limited or no | It <bcp14>MAY</bcp14> also be used for scalable streams if the sender has limited or no | |||
information about stream scalability. | information about stream scalability. | |||
The ID is assigned per <xref target="RFC8285" />, | The ID is assigned per <xref target="RFC8285"/>; | |||
and the length is encoded as L=0 which indicates 1 octet of data.</t> | the length is encoded as L=0, which indicates 1 octet of data.</t> | |||
<artwork><![CDATA[ | ||||
<figure> | ||||
<artwork><![CDATA[ | ||||
0 1 | 0 1 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| ID=? | L=0 |S|E|I|D|0 0 0 0| | | ID=? | L=0 |S|E|I|D|0 0 0 0| | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
]]></artwork></figure> | ]]></artwork> | |||
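As a rough illustration of the two-octet layout above, the following Python sketch builds the short extension as a one-byte-header element per RFC 8285 (ID in the upper 4 bits, L in the lower 4 bits, with L=0 meaning 1 data octet). The function name `build_short_frame_marking` and its parameters are hypothetical, and `ext_id` is assumed to be the ID negotiated for this extension, shown as `ID=?` in the diagram.

```python
def build_short_frame_marking(ext_id: int, start: bool, end: bool,
                              independent: bool, discardable: bool) -> bytes:
    """Build the one-byte-header element for the short (non-scalable) format."""
    # One-byte-header extension IDs are limited to 1..14 (RFC 8285).
    if not 1 <= ext_id <= 14:
        raise ValueError("one-byte header extension ID must be in 1..14")
    header = (ext_id << 4) | 0x0  # L=0 indicates 1 octet of data
    flags = (
        (int(start) << 7)
        | (int(end) << 6)
        | (int(independent) << 5)
        | (int(discardable) << 4)
    )  # the low 4 bits stay 0, as shown in the diagram
    return bytes([header, flags])
```

For instance, `build_short_frame_marking(3, True, False, True, False)` yields the ID/length octet `0x30` followed by the flags octet `0xA0` (S and I set).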
<t>The following information is extracted from the media payload and sen | ||||
<t>The following information are extracted from the media payload and se | t in the Frame Marking RTP header extension. | |||
nt in the Frame Marking RTP header extension. | ||||
<list style='symbols'> | ||||
<t>S: Start of Frame (1 bit) - MUST be 1 in the first packet in a fr | ||||
ame; otherwise MUST be 0.</t> | ||||
<t>E: End of Frame (1 bit) - MUST be 1 in the last packet in a frame | ||||
; otherwise MUST be 0. | ||||
SHOULD match the RTP header marker bit in payload formats with such | ||||
semantics for marking end of frame.</t> | ||||
<t>I: Independent Frame (1 bit) - MUST be 1 for frames that can be | ||||
decoded independent of temporally prior frames, e.g. intra-frame, | ||||
VPX keyframe, | ||||
H.264 IDR <xref target="RFC6184" />, | ||||
H.265 IDR/CRA/BLA/IRAP <xref target="RFC7798" />; | ||||
otherwise MUST be 0. </t> | ||||
<t>D: Discardable Frame (1 bit) - MUST be 1 for frames the sender kn | ||||
ows can be discarded, | ||||
and still provide a decodable media stream; otherwise MUST be 0. | ||||
</t> | ||||
<t>The remaining (4 bits) - are reserved/fixed values and not used f | ||||
or non-scalable streams; | ||||
they MUST be set to 0 upon transmission and ignored upon reception | ||||
.</t> | ||||
</list> | ||||
</t> | </t> | |||
</section> | ||||
<section title="Layer ID Mappings for Scalable Streams"> | <dl newline="true"> | |||
<t> This section maps the specific Layer ID information contained in speci | <dt>S: Start of Frame (1 bit)</dt><dd><bcp14>MUST</bcp14> be 1 in th | |||
fic scalable codecs to the generic LID and TID fields. </t> | e first packet in a frame; otherwise, <bcp14>MUST</bcp14> be 0.</dd> | |||
<t> Note that non-scalable streams have no Layer ID information and thus n | ||||
o mappings. </t> | ||||
<section title="VP9 LID Mapping"> | <dt>E: End of Frame (1 bit)</dt><dd><bcp14>MUST</bcp14> be 1 in the | |||
<t> The VP9 <xref target="I-D.ietf-payload-vp9" /> | last packet in a frame; otherwise, <bcp14>MUST</bcp14> be 0. | |||
<bcp14>SHOULD</bcp14> match the RTP header marker bit in payload for | ||||
mats with such semantics for marking end of frame.</dd> | ||||
<dt>I: Independent Frame (1 bit)</dt><dd><bcp14>MUST</bcp14> be 1 f | ||||
or frames that can be | ||||
decoded independent of temporally prior frames, e.g., intra-frame | ||||
, VPX keyframe, | ||||
H.264 IDR <xref target="RFC6184"/>, or | ||||
H.265 IDR/CRA/BLA/IRAP <xref target="RFC7798"/>; | ||||
otherwise, <bcp14>MUST</bcp14> be 0. </dd> | ||||
<dt>D: Discardable Frame (1 bit)</dt><dd><bcp14>MUST</bcp14> be 1 fo | ||||
r frames the sender knows can be discarded | ||||
and still provide a decodable media stream; otherwise, <bcp14>MU | ||||
ST</bcp14> be 0. </dd> | ||||
<dt>The remaining (4 bits)</dt><dd>These are reserved/fixed values a | ||||
nd not used for non-scalable streams; | ||||
they <bcp14>MUST</bcp14> be set to 0 upon transmission and ignored | ||||
upon reception.</dd> | ||||
</dl> | ||||
</section> | ||||
<section> | ||||
<name>LID Mappings for Scalable Streams</name> | ||||
<t> This section maps the specific Layer ID (LID) information contained | ||||
in specific scalable codecs to the generic LID and TID fields. </t> | ||||
<t> Note that non-scalable streams have no LID information; thus, they h | ||||
ave no mappings. </t> | ||||
<section> | ||||
<name>VP9 LID Mapping</name> | ||||
<t> The VP9 <xref target="RFC9628"/> | ||||
Spatial Layer ID (SID, 3 bits) and Temporal Layer ID (TID, 3 bits) | Spatial Layer ID (SID, 3 bits) and Temporal Layer ID (TID, 3 bits) | |||
in the VP9 payload descriptor are mapped to the generic LID and TID fields | in the VP9 payload descriptor are mapped to the generic LID and TID fields | |||
in the header extension as shown in the following figure.</t> | in the header extension as shown in the following figure.</t> | |||
<artwork><![CDATA[ | ||||
<figure> | ||||
<artwork><![CDATA[ | ||||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| ID=? | L=2 |S|E|I|D|B| TID |0|0|0|0|0| SID | TL0PICIDX | | | ID=? | L=2 |S|E|I|D|B| TID |0|0|0|0|0| SID | TL0PICIDX | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
]]></artwork></figure> | ]]></artwork> | |||
<t> The S bit <bcp14>MUST</bcp14> match the B bit in the VP9 payload d | ||||
escriptor.</t> | ||||
<t> The E bit <bcp14>MUST</bcp14> match the E bit in the VP9 payload d | ||||
escriptor.</t> | ||||
<t> The I bit <bcp14>MUST</bcp14> match the inverse of the P bit in th | ||||
e VP9 payload descriptor.</t> | ||||
<t> The S bit MUST match the B bit in the VP9 payload descriptor.</t> | <!--[rfced] Should "field" or some other noun follow | |||
<t> The E bit MUST match the E bit in the VP9 payload descriptor.</t> | "refresh_frame_flags" in this sentence? Or is this referring to | |||
<t> The I bit MUST match the inverse of the P bit in the VP9 payload desc | the flags (as the verb "are" is plural)? | |||
riptor.</t> | ||||
<t> The D bit MUST be 1 if the refresh_frame_flags in the VP9 payload unc | Original: | |||
ompressed header are all 0, otherwise it MUST be 0.</t> | The D bit MUST be 1 if the refresh_frame_flags in the VP9 payload | |||
<t> The B bit MUST be 0 if TID is 0; otherwise, if TID is not 0, it MUST | uncompressed header are all 0, otherwise it MUST be 0. | |||
match the U bit in the VP9 payload descriptor. Note: When using temporally neste | --> | |||
d scalability structures as recommended in <xref target="scalable-structures" /> | ||||
, the B bit and VP9 U bit will always be 1 if TID is not 0, since it is always | <t> The D bit <bcp14>MUST</bcp14> be 1 if the refresh_frame_flags in t | |||
he VP9 payload uncompressed header are all 0; otherwise, it <bcp14>MUST</bcp14> | ||||
be 0.</t> | ||||
<t> The B bit <bcp14>MUST</bcp14> be 0 if the TID is 0; if the TID is | ||||
not 0, it <bcp14>MUST</bcp14> match the U bit in the VP9 payload descriptor. Not | ||||
e: when using temporally nested scalability structures as recommended in <xref t | ||||
arget="scalable-structures"/>, the B bit and VP9 U bit will always be 1 if the T | ||||
ID is not 0 since it is always | ||||
possible to switch up to a higher temporal layer in such nested structures.</t> | possible to switch up to a higher temporal layer in such nested structures.</t> | |||
<t> TID, SID and TL0PICIDX MUST match the correspondingly named fields in the VP9 payload descriptor, | <t>The TID, SID, and TL0PICIDX <bcp14>MUST</bcp14> match the correspondingly named fields in the VP9 payload descriptor, | |||
with SID aligned in the least significant 3 bits of the 8-bit LID field and zeros | with SID aligned in the least significant 3 bits of the 8-bit LID field and zeros | |||
in the most significant 5 bits.</t> | in the most significant 5 bits.</t> | |||
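The VP9 rules above can be sketched as a single function that builds the L=2 element. This is an illustrative sketch only: the function and parameter names are hypothetical, and the caller is assumed to have already extracted the B, E, P, and U bits, TID, SID, and TL0PICIDX from the VP9 payload descriptor and refresh_frame_flags from the uncompressed header.

```python
def vp9_frame_marking(ext_id: int, b_bit: int, e_bit: int, p_bit: int, u_bit: int,
                      refresh_frame_flags: int, tid: int, sid: int,
                      tl0picidx: int) -> bytes:
    """Map VP9 payload descriptor fields to the frame marking extension."""
    s = int(bool(b_bit))                   # S matches the VP9 B (start of frame) bit
    e = int(bool(e_bit))                   # E matches the VP9 E bit
    i = int(not p_bit)                     # I is the inverse of the VP9 P bit
    d = int(refresh_frame_flags == 0)      # D is 1 only if no reference buffer is refreshed
    b = int(bool(u_bit)) if tid != 0 else 0  # B is 0 at TID 0, else the VP9 U bit
    flags = (s << 7) | (e << 6) | (i << 5) | (d << 4) | (b << 3) | (tid & 0x07)
    lid = sid & 0x07                       # SID in the low 3 bits, zeros in the top 5
    header = (ext_id << 4) | 2             # L=2 indicates three octets of data
    return bytes([header, flags, lid, tl0picidx & 0xFF])
```

For example, the first packet of a keyframe at TID 0 and SID 2 yields flags 0xA0 (S and I set) and LID 0x02.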
</section> | ||||
<section> | ||||
<name>H265 LID Mapping</name> | ||||
</section> | <t> The H265 <xref target="RFC7798"/> LayerID (6 bits), and TID (3 bit | |||
s) | ||||
<section title="H265 LID Mapping"> | from the Network Abstraction Layer (NAL) unit header are mapped to | |||
<t> The H265 <xref target="RFC7798" /> LayerID (6 bits) and TID (3 bits) | the generic LID and TID fields | |||
from the NAL unit header are mapped to the generic LID and TID fiel | ||||
ds | ||||
in the header extension as shown in the following figure.</t> | in the header extension as shown in the following figure.</t> | |||
<figure> | ||||
<artwork><![CDATA[ | <artwork><![CDATA[ | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| ID=? | L=2 |S|E|I|D|B| TID |0|0| LayerID | TL0PICIDX | | | ID=? | L=2 |S|E|I|D|B| TID |0|0| LayerID | TL0PICIDX | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
]]></artwork></figure> | ]]></artwork> | |||
<t>The S and E bits MUST match the correspondingly named bits in PACI:PHE | <!--[rfced] [*AD] We see several (similar) sentences like the example | |||
S:TSCI payload structures.</t> | below where it might be difficult for the reader to correctly | |||
<t>The I bit MUST be 1 when the NAL unit type is 16-23 (inclusive) or 32- | understand what part(s) of the sentence the keyword MUST applies | |||
34 (inclusive), or an aggregation packet or fragmentation unit encapsulating any | to. We wonder if a rewrite may be helpful to the reader, | |||
of these types, otherwise it MUST be 0. These ranges cover intra (IRAP) frames | possibly using a list... Please see the example below (again, | |||
as well as | other similar instances exist in the document) and let us know if | |||
critical parameter sets (VPS, SPS, PPS).</t> | an update like one of the following might work. | |||
<t>The D bit MUST be 1 when the NAL unit type is 0, 2, 4, 6, 8, 10, 12, 1 | ||||
4, or 38, or an aggregation packet or fragmentation unit encapsulating only thes | ||||
e types, otherwise it MUST be 0. These ranges cover non-reference frames as well | ||||
as filler data.</t> | ||||
<t>The B bit can not be determined reliably from simple inspection of pay | ||||
load headers, and therefore is determined by implementation-specific means. For | ||||
example, internal codec interfaces may provide information to set this reliably. | ||||
</t> | ||||
<t> TID and LayerID MUST match the correspondingly named fields in the H2 | ||||
65 NAL unit header, | ||||
with LayerID aligned in the least significant 6 bits of the 8-bit LID | ||||
field and zeros | ||||
in the most significant 2 bits.</t> | ||||
</section> | Original: | |||
<section title="H264-SVC LID Mapping"> | The D bit MUST be 1 when the NAL unit header NRI field is 0, or an | |||
<t> The following shows H264-SVC <xref target="RFC6190" /> Layer encoding inf | aggregation packet or fragmentation unit encapsulating only NAL units | |||
ormation (3 bits for | with NRI=0, otherwise it MUST be 0. | |||
spatial/dependency layer, 4 bits for quality layer and 3 bits for temporal la | ||||
yer) mapped to the generic LID and TID fields.</t> | Perhaps A (the "when" clause applies to both the D bit being set to 1 or NRI=0): | |||
<t>The S, E, I and D bits MUST match the correspondingly named bits in PACSI | ||||
payload structures.</t> | When the NAL unit header NRI field is 0, the D bit MUST be either 1 or | |||
<t>The I bit MUST be 1 when the NAL unit type is 5, 7, 8, 13, or 15, | an aggregation packet or fragmentation unit encapsulating only NAL | |||
or an aggregation packet or fragmentation unit encapsulating any of these t | units with NRI=0. When the NAL unit header NRI field is not set to 0, | |||
ypes, otherwise it MUST be 0. These ranges cover intra (IDR) frames as well as | the D bit MUST be 0. | |||
Perhaps B (the "when" clause only applies to the D bit being 0): | ||||
The D bit MUST be: | ||||
-1 when the NAL unit header NRI field is 0, | ||||
-an aggregation packet or fragmentation unit encapsulating only NAL units | ||||
with NRI=0, or | ||||
- 0. | ||||
--> | ||||
<t>The S and E bits <bcp14>MUST</bcp14> match the correspondingly name | ||||
d bits in PACI:PHES:TSCI payload structures.</t> | ||||
<t>The I bit <bcp14>MUST</bcp14> be 1 when the NAL unit type is 16-23 | ||||
(inclusive) or 32-34 (inclusive), or an aggregation packet or fragmentation unit | ||||
encapsulating any of these types; otherwise, it <bcp14>MUST</bcp14> be 0. These | ||||
ranges cover intra (IRAP) frames as well as | ||||
critical parameter sets (Video Parameter Set (VPS), Sequence Paramete | ||||
r Set (SPS), Picture Parameter Set (PPS)).</t> | ||||
<t>The D bit <bcp14>MUST</bcp14> be 1 when the NAL unit type is 0, 2, | ||||
4, 6, 8, 10, 12, 14, 38, or an aggregation packet or fragmentation unit encapsul | ||||
ating only these types; otherwise, it <bcp14>MUST</bcp14> be 0. These ranges cov | ||||
er non-reference frames as well as filler data.</t> | ||||
<t>The B bit cannot be determined reliably from simple inspection of p | ||||
ayload headers; therefore, it is determined by implementation-specific means. Fo | ||||
r example, internal codec interfaces may provide information to set this reliabl | ||||
y.</t> | ||||
<t>The TID and LayerID <bcp14>MUST</bcp14> match the correspondingly n | ||||
amed fields in the H265 NAL unit header, | ||||
with LayerID aligned in the least significant 6 bits of the 8-bit LID | ||||
field and zeros | ||||
in the most significant 2 bits.</t> | ||||
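The I and D rules for H265 reduce to set membership tests on the NAL unit types carried in a packet. The sketch below assumes the caller supplies the list of NAL unit types in the packet (one entry for a single NAL unit packet, or the encapsulated types for an aggregation packet or fragmentation unit); all names are hypothetical.

```python
# NAL unit types 16-23 and 32-34 (inclusive): IRAP frames and VPS/SPS/PPS.
H265_I_TYPES = frozenset(range(16, 24)) | frozenset(range(32, 35))
# Non-reference frame types plus filler data (38).
H265_D_TYPES = frozenset({0, 2, 4, 6, 8, 10, 12, 14, 38})


def parse_h265_type_and_layer(header: bytes) -> tuple:
    """Return (nal_unit_type, LayerID) from the 2-byte H265 NAL unit header."""
    nal_type = (header[0] >> 1) & 0x3F
    layer_id = ((header[0] & 0x01) << 5) | (header[1] >> 3)
    return nal_type, layer_id


def h265_i_bit(nal_types) -> bool:
    """I is 1 if ANY carried NAL unit type is in the independent set."""
    return any(t in H265_I_TYPES for t in nal_types)


def h265_d_bit(nal_types) -> bool:
    """D is 1 only if ALL carried NAL unit types are in the discardable set."""
    types = list(nal_types)
    return bool(types) and all(t in H265_D_TYPES for t in types)


def h265_lid(layer_id: int) -> int:
    """LayerID in the low 6 bits of the 8-bit LID field, zeros in the top 2."""
    return layer_id & 0x3F
```

Note the asymmetry between the two checks: "any" for the I bit and "only" for the D bit, following the wording of the rules above.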
</section> | ||||
<section> | ||||
<name>H264 Scalable Video Coding (SVC) LID Mapping</name> | ||||
<t> The following shows H264-SVC <xref target="RFC6190"/> Layer encodi | ||||
ng information (3 bits for | ||||
spatial/dependency layer, 4 bits for quality layer, and 3 bits for temporal l | ||||
ayer) mapped to the generic LID and TID fields.</t> | ||||
<t>The S, E, I, and D bits <bcp14>MUST</bcp14> match the corresponding | ||||
ly named bits in Payload Content Scalability Information (PACSI) payload structu | ||||
res.</t> | ||||
<t>The I bit <bcp14>MUST</bcp14> be 1 when the NAL unit type is 5, 7, | ||||
8, 13, 15, | ||||
or an aggregation packet or fragmentation unit encapsulating any of these t | ||||
ypes; otherwise, it <bcp14>MUST</bcp14> be 0. These ranges cover intra (IDR) fra | ||||
mes as well as | ||||
critical parameter sets (SPS/PPS variants).</t> | critical parameter sets (SPS/PPS variants).</t> | |||
<t>The D bit MUST be 1 when the NAL unit header NRI field is 0, or an aggregation packet or fragmentation unit encapsulating only NAL units with NRI=0, otherwise it MUST be 0. | <t>The D bit <bcp14>MUST</bcp14> be 1 when the NAL unit header nal_ref_idc (NRI) field is 0, or an aggregation packet or fragmentation unit encapsulating only NAL units with NRI=0; otherwise, it <bcp14>MUST</bcp14> be 0. | |||
The NRI=0 condition signals non-reference frames.</t> | The NRI=0 condition signals non-reference frames.</t> | |||
<t>The B bit can not be determined reliably from simple inspection of payload | <t>The B bit cannot be determined reliably from simple inspection of p | |||
headers, and therefore is determined by implementation-specific means. For exam | ayload headers; therefore, it is determined by implementation-specific means. Fo | |||
ple, internal codec interfaces may provide information to set this reliably.</t> | r example, internal codec interfaces may provide information to set this reliabl | |||
y.</t> | ||||
<figure> | ||||
<artwork><![CDATA[ | <artwork><![CDATA[ | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| ID=? | L=2 |S|E|I|D|B| TID |0| DID | QID | TL0PICIDX | | | ID=? | L=2 |S|E|I|D|B| TID |0| DID | QID | TL0PICIDX | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
]]></artwork></figure> | ]]></artwork> | |||
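Reading the figure above, the 8-bit LID field for H264-SVC holds a zero bit, then the 3-bit DID, then the 4-bit QID. A minimal sketch of that packing follows; the helper names are hypothetical.

```python
def h264_svc_lid(did: int, qid: int) -> int:
    """Pack DID (3 bits) and QID (4 bits) into the 8-bit LID; top bit is zero."""
    if not 0 <= did <= 7:
        raise ValueError("DID is a 3-bit field")
    if not 0 <= qid <= 15:
        raise ValueError("QID is a 4-bit field")
    return (did << 4) | qid


def split_h264_svc_lid(lid: int) -> tuple:
    """Recover (DID, QID) from an 8-bit LID value."""
    return (lid >> 4) & 0x07, lid & 0x0F
```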
</section> | </section> | |||
<section> | ||||
<section title="H264 (AVC) LID Mapping"> | <name>H264 Advanced Video Coding (AVC) LID Mapping</name> | |||
<t> The following shows the header extension for H264 (AVC) <xref target="RF | <t> The following shows the header extension for H264 (AVC) <xref tar | |||
C6184" /> that contains | get="RFC6184"/> that contains | |||
only temporal layer information.</t> | only temporal layer information.</t> | |||
<t> The S bit MUST be 1 when the timestamp in the RTP header differs from the | <t> The S bit <bcp14>MUST</bcp14> be 1 when the timestamp in the RTP h | |||
timestamp | eader differs from the timestamp | |||
in the prior RTP sequence number from the same SSRC, otherwise it MUST be 0 | in the prior RTP sequence number from the same SSRC; otherwise, it <bcp14>M | |||
.</t> | UST</bcp14> be 0.</t> | |||
<t> The E bit MUST match the M bit in the RTP header.</t> | <t> The E bit <bcp14>MUST</bcp14> match the M bit in the RTP header.</ | |||
<t>The I bit MUST be 1 when the NAL unit type is 5, 7, or 8, | t> | |||
or an aggregation packet or fragmentation unit encapsulating any of these t | <t>The I bit <bcp14>MUST</bcp14> be 1 when the NAL unit type is 5, 7, | |||
ypes, | or 8, | |||
otherwise it MUST be 0. These ranges cover intra (IDR) frames as well as | or an aggregation packet or fragmentation unit encapsulating any of these t | |||
ypes; | ||||
otherwise, it <bcp14>MUST</bcp14> be 0. These ranges cover intra (IDR) fram | ||||
es as well as | ||||
critical parameter sets (SPS/PPS).</t> | critical parameter sets (SPS/PPS).</t> | |||
<t>The D bit MUST be 1 when the NAL unit header NRI field is 0, | <t>The D bit <bcp14>MUST</bcp14> be 1 when the NAL unit header NRI field is 0, | |||
or an aggregation packet or fragmentation unit encapsulating only | or an aggregation packet or fragmentation unit encapsulating only | |||
NAL units with NRI=0, otherwise it MUST be 0. | NAL units with NRI=0; otherwise, it <bcp14>MUST</bcp14> be 0. | |||
The NRI=0 condition signals non-reference frames.</t> | The NRI=0 condition signals non-reference frames.</t> | |||
<t>The B bit can not be determined reliably from simple inspection of payload | <t>The B bit cannot be determined reliably from simple inspection of p | |||
headers, and therefore is determined by implementation-specific means. For exam | ayload headers; therefore, it is determined by implementation-specific means. Fo | |||
ple, internal codec interfaces may provide information to set this reliably.</t> | r example, internal codec interfaces may provide information to set this reliabl | |||
<figure> | y.</t> | |||
<artwork><![CDATA[ | <artwork><![CDATA[ | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| ID=? | L=2 |S|E|I|D|B| TID |0|0|0|0|0|0|0|0| TL0PICIDX | | | ID=? | L=2 |S|E|I|D|B| TID |0|0|0|0|0|0|0|0| TL0PICIDX | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
]]></artwork></figure> | ]]></artwork> | |||
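The S, E, I, and D rules for H264 (AVC) can be sketched as follows. The names are hypothetical, and the caller is assumed to pass the first header byte of every NAL unit carried in the packet, having already unpacked any aggregation packet or fragmentation unit. The B bit is omitted because it cannot be derived from payload headers.

```python
H264_I_TYPES = frozenset({5, 7, 8})  # IDR slice, SPS, PPS


def parse_h264_nal_header(first_byte: int) -> tuple:
    """Return (NRI, nal_unit_type) from the 1-byte H264 NAL unit header."""
    return (first_byte >> 5) & 0x03, first_byte & 0x1F


def h264_avc_flags(timestamp: int, prev_timestamp: int, marker: int,
                   nal_header_bytes) -> dict:
    """Derive S, E, I, D for one RTP packet from RTP and NAL header fields."""
    parsed = [parse_h264_nal_header(b) for b in nal_header_bytes]
    return {
        # S: timestamp differs from the previous sequence number on the same SSRC
        "S": timestamp != prev_timestamp,
        # E: mirrors the RTP marker (M) bit
        "E": bool(marker),
        # I: any carried NAL unit is of type 5, 7, or 8
        "I": any(t in H264_I_TYPES for _, t in parsed),
        # D: every carried NAL unit has NRI=0
        "D": bool(parsed) and all(nri == 0 for nri, _ in parsed),
    }
```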
</section> | </section> | |||
<section> | ||||
<section title="VP8 LID Mapping"> | <name>VP8 LID Mapping</name> | |||
<t> The following shows the header extension for VP8 <xref target="RFC7741" | <t> The following shows the header extension for VP8 <xref target="RF | |||
/> that contains | C7741"/> that contains | |||
only temporal layer information.</t> | only temporal layer information.</t> | |||
<t> The S bit MUST match the correspondingly named bit in the VP8 payload des | <t> The S bit <bcp14>MUST</bcp14> match the correspondingly named bit | |||
criptor when PID=0, otherwise it MUST be 0.</t> | in the VP8 payload descriptor when PID=0; otherwise, it <bcp14>MUST</bcp14> be 0 | |||
<t> The E bit MUST match the M bit in the RTP header. </t> | .</t> | |||
<t> The I bit MUST match the inverse of the P bit in the VP8 payload header.< | <t> The E bit <bcp14>MUST</bcp14> match the M bit in the RTP header. < | |||
/t> | /t> | |||
<t> The D bit MUST match the N bit in the VP8 payload descriptor.</t> | <t> The I bit <bcp14>MUST</bcp14> match the inverse of the P bit in th | |||
<t> The B bit MUST match the Y bit in the VP8 payload descriptor. Note: When | e VP8 payload header.</t> | |||
using temporally nested scalability structures as recommended in <xref target="s | <t> The D bit <bcp14>MUST</bcp14> match the N bit in the VP8 payload d | |||
calable-structures" />, the B bit and VP8 Y bit will always be 1 if TID is not 0 | escriptor.</t> | |||
, since it is always | ||||
<!-- [rfced] Please review whether any of the notes in this document | ||||
should be in the <aside> element. It is defined as "a container for | ||||
content that is semantically less important or tangential to the | ||||
content that surrounds it" (https://authors.ietf.org/en/rfcxml-vocabulary#aside) | ||||
. | ||||
--> | ||||
<t> The B bit <bcp14>MUST</bcp14> match the Y bit in the VP8 payload d | ||||
escriptor. Note: when using temporally nested scalability structures as recommen | ||||
ded in <xref target="scalable-structures"/>, the B bit and VP8 Y bit will always | ||||
be 1 if the TID is not 0 since it is always | ||||
possible to switch up to a higher temporal layer in such nested structures.</t> | possible to switch up to a higher temporal layer in such nested structures.</t> | |||
<t> TID and TL0PICIDX MUST match the correspondingly named fields in the VP8 | <t>The TID and TL0PICIDX <bcp14>MUST</bcp14> match the correspondingly | |||
payload descriptor. </t> | named fields in the VP8 payload descriptor. </t> | |||
<figure> | ||||
<artwork><![CDATA[ | <artwork><![CDATA[ | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| ID=? | L=2 |S|E|I|D|B| TID |0|0|0|0|0|0|0|0| TL0PICIDX | | | ID=? | L=2 |S|E|I|D|B| TID |0|0|0|0|0|0|0|0| TL0PICIDX | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
]]></artwork></figure> | ]]></artwork> | |||
</section> | </section> | |||
<section> | ||||
<section title="Future Codec LID Mapping"> | <name>Future Codec LID Mapping</name> | |||
<t>The RTP payload format specification for future video codecs SHOULD inc | <t>The RTP payload format specification for future video codecs <bcp14 | |||
lude a section describing | >SHOULD</bcp14> include a section describing | |||
the LID mapping and TID mapping for the codec.</t> | the LID mapping and TID mapping for the codec.</t> | |||
</section> | </section> | |||
</section> | ||||
</section> | <section> | |||
<name>Signaling Information</name> | ||||
<section title="Signaling Information"> | <t>The URI for declaring this header extension in an extmap attribute is | |||
<t>The URI for declaring this header extension in an extmap attribute is | ||||
"urn:ietf:params:rtp-hdrext:framemarking". It does not contain any | "urn:ietf:params:rtp-hdrext:framemarking". It does not contain any | |||
extension attributes. </t> | extension attributes. </t> | |||
<t>An example attribute line in SDP:</t> | <t>An example attribute line in SDP:</t> | |||
<figure> | <artwork><![CDATA[ | |||
<artwork><![CDATA[ | ||||
a=extmap:3 urn:ietf:params:rtp-hdrext:framemarking | a=extmap:3 urn:ietf:params:rtp-hdrext:framemarking | |||
]]></artwork></figure> | ]]></artwork> | |||
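A receiver needs the negotiated extension ID before it can locate the frame marking element in RTP packets. The sketch below scans SDP text for the extmap line shown above and returns the ID; the function name is hypothetical, and the optional "/direction" suffix on the ID follows the extmap syntax of RFC 8285.

```python
from typing import Optional

FRAME_MARKING_URI = "urn:ietf:params:rtp-hdrext:framemarking"


def frame_marking_ext_id(sdp: str) -> Optional[int]:
    """Return the extmap ID mapped to the frame marking URI, or None if absent."""
    prefix = "a=extmap:"
    for line in sdp.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        fields = line[len(prefix):].split()
        if len(fields) >= 2 and fields[1] == FRAME_MARKING_URI:
            # The ID may carry a direction suffix such as "3/sendonly".
            return int(fields[0].split("/")[0])
    return None
```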
</section> | ||||
<section> | ||||
<name>Usage Considerations</name> | ||||
</section> | <!--[rfced] May we update this sentence as follows for the ease of the | |||
reader? Note that the introductory "when" phrase mentions a | ||||
single frame while the recommendation mentions plural frames: | ||||
please consider if further updates are necessary. | ||||
<section title="Usage Considerations"> | Original: | |||
<t>The header extension values MUST represent what is already in the RTP p | When an RTP switch needs to discard a received video frame due to | |||
ayload.</t> | congestion control considerations, it is RECOMMENDED that it | |||
<t> When an RTP switch needs to discard a received video frame due to cong | preferably drop frames marked with the D (Discardable) bit set, or the | |||
estion control considerations, | highest values of TID and LID, which indicate the highest temporal and | |||
it is RECOMMENDED that it preferably drop frames marked with the D (Discar | spatial/quality enhancement layers, since those typically have fewer | |||
dable) bit set, | dependenices on them than lower layers. | |||
or the highest values of TID and LID, which indicate the highest tempora | ||||
l and spatial/quality enhancement layers, since those typically have fewer depen | ||||
denices on them than lower layers.</t> | ||||
<t> When an RTP switch wants to forward a new video stream to a receiver, | ||||
it is RECOMMENDED to | ||||
select the new video stream from the first switching point with the I (Ind | ||||
ependent) bit set in all spatial layers and forward the same. | ||||
An RTP switch can request a media source to generate a switching point by | ||||
sending | ||||
Full Intra Request (RTCP FIR) as defined in <xref target="RFC5104" />, for | ||||
example. </t> | ||||
<section title="Relation to Layer Refresh Request (LRR)"> | Perhaps A: | |||
<t>Receivers can use the Layer Refresh Request (LRR) <xref target="I-D.i | When an RTP switch needs to discard a received video frame due to | |||
etf-avtext-lrr" /> | congestion control considerations, it is RECOMMENDED that it drop: | |||
- frames marked with the D (Discardable) bit set, or | ||||
-frames with the highest values of TID and LID (which indicate the | ||||
highest temporal and spatial/quality enhancement layers) since those | ||||
typically have fewer dependencies on them than lower layers. | ||||
Perhaps B (to update the sg/pl switch): | ||||
When an RTP switch needs to discard received video frames due to | ||||
congestion control considerations, it is RECOMMENDED that it drop: | ||||
- frames marked with the D (Discardable) bit set, or | ||||
-frames with the highest values of TID and LID (which indicate the | ||||
highest temporal and spatial/quality enhancement layers) since those | ||||
typically have fewer dependencies on them than lower layers. | ||||
--> | ||||
<t>The header extension values <bcp14>MUST</bcp14> represent what is alr | ||||
eady in the RTP payload.</t> | ||||
<t> When an RTP switch needs to discard a received video frame due to co | ||||
ngestion control considerations, | ||||
it is <bcp14>RECOMMENDED</bcp14> that it preferably drop frames marked wit | ||||
h the D (Discardable) bit set, | ||||
or the highest values of TID and LID, which indicate the highest tempora | ||||
l and spatial/quality enhancement layers, since those typically have fewer depen | ||||
dencies on them than lower layers.</t> | ||||
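One possible reading of this drop preference is a sort order over candidate frames: discardable frames first, then the highest TID, then the highest LID. The sketch below assumes that ordering; the text does not rank TID against LID, so the tie-breaking here is an assumption, and the MarkedFrame type is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MarkedFrame:
    seq: int    # identifier for the frame (hypothetical bookkeeping field)
    d: bool     # D (Discardable) bit
    tid: int    # temporal layer ID
    lid: int    # spatial/quality layer ID


def drop_order(frames):
    """Return frames ordered from most to least preferable to drop."""
    # Assumed ranking: D bit set first, then highest TID, then highest LID.
    return sorted(frames, key=lambda f: (not f.d, -f.tid, -f.lid))
```

A switch under congestion would drop from the front of this list and stop as soon as enough bandwidth has been recovered.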
<!--[rfced] Please clarify what "and forward the same" means in this text. | ||||
Original: | ||||
When an RTP switch wants to forward a new video stream to a receiver, | ||||
it is RECOMMENDED to select the new video stream from the first | ||||
switching point with the I (Independent) bit set in all spatial | ||||
layers and forward the same. | ||||
--> | ||||
<t> When an RTP switch wants to forward a new video stream to a receiver | ||||
, it is <bcp14>RECOMMENDED</bcp14> to | ||||
select the new video stream from the first switching point with the I (In | ||||
dependent) bit set in all spatial layers and forward the same. | ||||
<!--[rfced] How may we update this text to more easily illustrate the | ||||
1:1 mapping between initialism and expansion? | ||||
Original: | ||||
... source to generate a switching point by sending Full Intra | ||||
Request (RTCP FIR) as defined in [RFC5104]... | ||||
Perhaps: | ||||
... source to generate a switching point by sending RTCP Full Intra | ||||
Request (FIR) as defined in [RFC5104]... | ||||
--> | ||||
An RTP switch can request that a media source generate a switching point b | ||||
y sending | ||||
Full Intra Request (RTCP FIR) as defined in <xref target="RFC5104"/>, for | ||||
example. </t> | ||||
<section> | ||||
<name>Relation to Layer Refresh Request (LRR)</name> | ||||
<t>Receivers can use the Layer Refresh Request (LRR) <xref target="RFC | ||||
9627"/> | ||||
RTCP feedback message | RTCP feedback message | |||
to upgrade to a higher layer in scalable encodings. The TID/LID values | to upgrade to a higher layer in scalable encodings. The TID/LID values | |||
and formats used in LRR messages MUST correspond to the same values an | and formats used in LRR messages <bcp14>MUST</bcp14> correspond to the | |||
d formats | same values and formats | |||
specified in <xref target="mandatory-scalable" />. | specified in <xref target="mandatory-scalable"/>. | |||
</t> | </t> | |||
<t>Because frame marking can only be used with temporally-nested strea | ||||
ms, | <!--[rfced] In the following, are "layer" and "refreshes" redundant | |||
with what LRR stands for? Please let us know if any updates are | ||||
necessary. | ||||
Original: | ||||
Because frame marking can only be used with temporally-nested | ||||
streams, temporal-layer LRR refreshes are unnecessary for frame- | ||||
marked streams. | ||||
As expanded it would be: | ||||
Because frame marking can only be used with temporally nested | ||||
streams, temporal-layer Layer Refresh Request (LRR) refreshes are | ||||
unnecessary for frame-marked streams. | ||||
--> | ||||
<t>Because frame marking can only be used with temporally nested strea | ||||
ms, | ||||
temporal-layer LRR refreshes are unnecessary for frame-marked streams. | temporal-layer LRR refreshes are unnecessary for frame-marked streams. | |||
Other refreshes can be detected based on the I bit being set for the specific spatial layers. | Other refreshes can be detected based on the I bit being set for the specific spatial layers. | |||
</t> | </t> | |||
</section> | </section> | |||
<section title="Scalability Structures" anchor="scalable-structures"> | <section anchor="scalable-structures"> | |||
<name>Scalability Structures</name> | ||||
<t>The LID and TID information is most useful for fixed scalability structures, | <t>The LID and TID information is most useful for fixed scalability structures, | |||
such as nested hierarchical temporal layering structures, where each temporal | such as nested hierarchical temporal layering structures, where each temporal | |||
layer only references lower temporal layers or the base temporal layer. | layer only references lower temporal layers or the base temporal layer. | |||
The LID and TID information is less useful, or even not useful at all, | The LID and TID information is less useful, or even not useful at all, | |||
for complex, irregular scalability structures that do not conform to common, | for complex, irregular scalability structures that do not conform to common, | |||
fixed patterns of inter-layer dependencies and referencing structures. | fixed patterns of inter-layer dependencies and referencing structures. | |||
Therefore it is RECOMMENDED to use LID and TID information for | Therefore, it is <bcp14>RECOMMENDED</bcp14> to use LID and TID information for | |||
RTP switch forwarding decisions only in the case of temporally nested | RTP switch forwarding decisions only in the case of temporally nested | |||
scalability structures, and it is NOT RECOMMENDED for other | scalability structures, and it is <bcp14>NOT RECOMMENDED</bcp14> for other | |||
(more complex or irregular) scalability structures.</t> | (more complex or irregular) scalability structures.</t> | |||
</section> | ||||
</section> | </section> | |||
</section> | </section> | |||
</section> | <section> | |||
<name>Security and Privacy Considerations</name> | ||||
<section title="Security Considerations and Privacy Considerations" > | <t>In "<xref target="RFC3711" format="title"/>" <xref target="RFC3711"/>, | |||
<t>In the Secure Real-Time Transport Protocol (SRTP) <xref target="RFC3711 | RTP header extensions are | |||
" />, RTP header extensions are | authenticated and optionally encrypted <xref target="RFC9335"/>. | |||
authenticated and optionally encrypted <xref target="RFC9335" />. | ||||
When unencrypted header extensions are used, some metadata is | When unencrypted header extensions are used, some metadata is | |||
exposed and visible to middle boxes on the network path, | exposed and visible to middleboxes on the network path, | |||
while encrypted media data and metadata in encrypted header extensions are not exposed.</t> | while encrypted media data and metadata in encrypted header extensions are not exposed.</t> | |||
<t>The primary utility of this specification is for RTP switches to make proper media forwarding decisions. | <t>The primary utility of this specification is for RTP switches to make proper media forwarding decisions. | |||
RTP switches are the SRTP peers of endpoints, so they can access encrypted header extensions, | RTP switches are the SRTP peers of endpoints, so they can access encrypted header extensions, | |||
but not end-to-end encrypted private media payloads. Other middle boxes on | but not end-to-end encrypted private media payloads. Other middleboxes on | |||
the network path can only access | the network path can only access | |||
unencrypted header extensions, since they are not SRTP peers.</t> | unencrypted header extensions since they are not SRTP peers.</t> | |||
<t>RTP endpoints that negotiate this extension should consider whether: | ||||
<t>RTP endpoints which negotiate this extension should consider whether th | </t> | |||
is video frame marking metadata | <ul><li>this video frame marking metadata | |||
needs to be exposed to the SRTP peer only, in which case the header extens | needs to be exposed to the SRTP peer only, in which case the header extens | |||
ion can be encrypted; or whether | ion can be encrypted; or</li> | |||
other middle boxes on the network path also need this metadata, for exampl | <li>other middleboxes on the network path also need this metadata, for exa | |||
e, to optimize packet drop decisions | mple, to optimize packet drop decisions | |||
that minimize media quality impacts, in which case the header extension can be unencrypted, if the endpoint | that minimize media quality impacts, in which case the header extension can be unencrypted, if the endpoint | |||
accepts the potential privacy leakage of this metadata. For example, it wo | accepts the potential privacy leakage of this metadata.</li> | |||
uld be possible to determine | </ul> | |||
<t> | ||||
For example, it would be possible to determine | ||||
keyframes and their frequency in unencrypted header extensions. This information can often be obtained via | keyframes and their frequency in unencrypted header extensions. This information can often be obtained via | |||
statistical analysis of encrypted data. For example, keyframes are usually much larger than other frames, | statistical analysis of encrypted data. For example, keyframes are usually much larger than other frames, | |||
so frame size alone can leak this in the absence of any unencrypted metadata. However, unencrypted metadata | so frame size alone can leak this in the absence of any unencrypted metadata. However, unencrypted metadata | |||
provides a reliable signal rather than a statistical probability; so endpoints should take that into consideration | provides a reliable signal rather than a statistical probability; so endpoints should take that into consideration | |||
to balance the privacy leakage risk against the potential benefit of optimized media delivery when deciding | to balance the privacy leakage risk against the potential benefit of optimized media delivery when deciding | |||
whether to negotiate and encrypt this header extension.</t> | whether to negotiate and encrypt this header extension.</t> | |||
</section> | </section> | |||
<section title="Acknowledgements"> | <section> | |||
<t>Many thanks to Bernard Aboba, Jonathan Lennox, Stephan Wenger, Dale Wor | <name>IANA Considerations</name> | |||
ley, and Magnus Westerlund for their inputs.</t> | <t>This document defines a new extension URI listed in the "RTP Compact He | |||
</section> | ader Extensions" subregistry of the | |||
"Real-Time Transport Protocol (RTP) Parameters" registry, according to th | ||||
<section title="IANA Considerations"> | e following data:</t> | |||
<t>This document defines a new extension URI to the RTP Compact HeaderExte | ||||
nsions sub-registry of the | ||||
Real-Time Transport Protocol (RTP) Parameters registry, according to the | ||||
following data:</t> | ||||
<t>Extension URI: urn:ietf:params:rtp-hdrext:framemarkinginfo </t> | <t>Extension URI: urn:ietf:params:rtp-hdrext:framemarkinginfo </t> | |||
<t>Description: Frame marking information for video streams </t> | <t>Description: Frame marking information for video streams </t> | |||
<t>Contact: mzanaty@cisco.com </t> | <t>Contact: mzanaty@cisco.com </t> | |||
<t>Reference: RFC XXXX</t> | <t>Reference: RFC 9626</t> | |||
<t>Note to RFC Editor: please replace RFC XXXX with the number of this RFC.</t> | ||||
</section> | </section> | |||
</middle> | </middle> | |||
<back> | <back> | |||
<references title="Normative References"> | <references> | |||
<?rfc include="reference.RFC.2119"?> | ||||
<?rfc include="reference.RFC.8174"?> | <!-- [rfced] Would you like the references to be alphabetized or left | |||
<?rfc include="reference.RFC.8285"?> | in their current order? | |||
<?rfc include="reference.RFC.6184"?> | --> | |||
<?rfc include="reference.RFC.6190"?> | ||||
<?rfc include="reference.RFC.7741"?> | <name>References</name> | |||
<?rfc include="reference.RFC.7798"?> | <references> | |||
</references> | <name>Normative References</name> | |||
<references title="Informative References"> | <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2 | |||
<?rfc include="reference.RFC.7656"?> | 119.xml"/> | |||
<?rfc include="reference.RFC.7667"?> | <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8 | |||
<?rfc include="reference.RFC.6464"?> | 174.xml"/> | |||
<?rfc include="reference.RFC.3550"?> | <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8 | |||
<?rfc include="reference.RFC.3711"?> | 285.xml"/> | |||
<?rfc include="reference.RFC.5104"?> | <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6 | |||
<?rfc include="reference.RFC.8871"?> | 184.xml"/> | |||
<?rfc include="reference.RFC.9335"?> | <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6 | |||
<?rfc include="reference.I-D.ietf-avtext-lrr"?> | 190.xml"/> | |||
<?rfc include="reference.I-D.ietf-payload-vp9"?> | <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7 | |||
741.xml"/> | ||||
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7 | ||||
798.xml"/> | ||||
</references> | ||||
<references> | ||||
<name>Informative References</name> | ||||
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7 | ||||
656.xml"/> | ||||
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7 | ||||
667.xml"/> | ||||
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6 | ||||
464.xml"/> | ||||
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3 | ||||
550.xml"/> | ||||
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3 | ||||
711.xml"/> | ||||
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5 | ||||
104.xml"/> | ||||
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8 | ||||
871.xml"/> | ||||
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9 | ||||
335.xml"/> | ||||
<!-- [I-D.ietf-avtext-lrr]; Companion document --> | ||||
<reference anchor="RFC9627" target="https://www.rfc-editor.org/info/rfc9627"> | ||||
<front> | ||||
<title>The Layer Refresh Request (LRR) RTCP Feedback Message</title> | ||||
<author initials="J." surname="Lennox" fullname="Jonathan Lennox"> | ||||
<organization>Vidyo, Inc.</organization> | ||||
</author> | ||||
<author initials="D." surname="Hong" fullname="Danny Hong"> | ||||
<organization>Vidyo, Inc.</organization> | ||||
</author> | ||||
<author initials="J." surname="Uberti" fullname="Justin Uberti"> | ||||
<organization>Google, Inc.</organization> | ||||
</author> | ||||
<author initials="S." surname="Holmer" fullname="Stefan Holmer"> | ||||
<organization>Google, Inc.</organization> | ||||
</author> | ||||
<author initials="M." surname="Flodman" fullname="Magnus Flodman"> | ||||
<organization>Google, Inc.</organization> | ||||
</author> | ||||
<date month="August" year="2024" /> | ||||
</front> | ||||
<seriesInfo name="RFC" value="9627" /> | ||||
<seriesInfo name="DOI" value="10.17487/RFC9627"/> | ||||
</reference> | ||||
<!-- [I-D.ietf-payload-vp9]; Companion document --> | ||||
<reference anchor="RFC9628" target="https://www.rfc-editor.org/info/rfc9628"> | ||||
<front> | ||||
<title>RTP Payload Format for VP9 Video</title> | ||||
<author initials="J." surname="Uberti" fullname="Justin Uberti"> | ||||
<organization>Google, Inc.</organization> | ||||
</author> | ||||
<author initials="S." surname="Holmer" fullname="Stefan Holmer"> | ||||
<organization>Google, Inc.</organization> | ||||
</author> | ||||
<author initials="M." surname="Flodman" fullname="Magnus Flodman"> | ||||
<organization>Google, Inc.</organization> | ||||
</author> | ||||
<author initials="D." surname="Hong" fullname="Danny Hong"> | ||||
<organization>Google, Inc.</organization> | ||||
</author> | ||||
<author initials="J." surname="Lennox" fullname="Jonathan Lennox"> | ||||
<organization>8x8, Inc. / Jitsi</organization> | ||||
</author> | ||||
<date month="August" year="2024" /> | ||||
</front> | ||||
<seriesInfo name="RFC" value="9628"/> | ||||
<seriesInfo name="DOI" value="10.17487/RFC9628"/> | ||||
</reference> | ||||
</references> | ||||
</references> | </references> | |||
<section numbered="false"> | ||||
<name>Acknowledgements</name> | ||||
<t>Many thanks to <contact fullname="Bernard Aboba"/>, <contact fullname | ||||
="Jonathan Lennox"/>, <contact fullname="Stephan Wenger"/>, <contact fullname= | ||||
"Dale Worley"/>, and <contact fullname="Magnus Westerlund"/> for their inputs.< | ||||
/t> | ||||
</section> | ||||
<!-- [rfced] We had the following questions related to abbreviations | ||||
used throughout the document. | ||||
a) Please note that we have expanded these abbreviations as follows on | ||||
first use. Please let us know any objections. | ||||
MCU - Multipoint Control Unit (per RFC 7667) | ||||
SRTP - Secure Real-time Transport Protocol | ||||
IDR - Instantaneous Decoding Refresh (per RFC 6184) | ||||
SDES - source description | ||||
NAL - Network Abstraction Layer | ||||
CRA - Clean Random Access | ||||
BLA - Broken Link Access | ||||
RAP - Random Access Point | ||||
AVC - Advanced Video Coding (per RFC 6184) | ||||
SVC - Scalable Video Coding (per RFC 6190) | ||||
PACSI - Payload Content Scalability Information | ||||
NRI - Network Remote Identification | ||||
VPS - Video Parameter Set | ||||
SPS - Sequence Parameter Set | ||||
PPS - Picture Parameter Set | ||||
b) Please clarify if/how we may expand the following abbreviations: | ||||
VPX | ||||
PACI - is this intentionally different from PACSI? | ||||
c) Should "intra (IDR)" frames instead be "IDR intra-frames"? This | ||||
formation occurs twice in this document. | ||||
d) Please note that the following similar abbreviations appear to be | ||||
differently treated with regard to punctuation: | ||||
H264 (AVC) | ||||
H264-SVC | ||||
We have expanded the abbreviations on first use, but please let us | ||||
know if/how these should be made uniform with regard to parens and | ||||
hyphenation. | ||||
See also our question regarding H264 vs. H.264. | ||||
e) We note that in Section 3.3.2, "LayerID" is used. Later, in Figure | ||||
8, we see "LayerId" (lowercase d). May these be made consistent? If | ||||
so, which is preferred? Further, could these actually be made "LID" | ||||
instead (we see TID in both figures in question, which seems similar)? | ||||
Please review our related cluster-wide AQ prior to responding. | ||||
--> | ||||
<!--[rfced] We had the following questions related to terminology used | ||||
throughout the document. | ||||
a) Two questions about the header extension: | ||||
Should this RTP header extension appear using "Video" throughout? We | ||||
see both of the following forms. | ||||
Video Frame Marking RTP header extension vs. Frame Marking RTP header extension | ||||
Secondly, in the Abstract, we see: | ||||
Original: | ||||
This document describes a Video Frame Marking RTP header extension | ||||
used to convey information about video frames that is critical for | ||||
error recovery and packet forwarding in RTP middleboxes or network | ||||
nodes. | ||||
Is the use of the indefinite article "a" intentional ("a Video Frame | ||||
Marking RTP header extension")? This seems (possibly) contradictory | ||||
with the capitalization of the proper noun and use in Section 3 (are | ||||
there more types of Video Frame Marking RTP header extensions?). | ||||
Please review. | ||||
--> | ||||
<!-- [rfced] Please review the "Inclusive Language" portion of the | ||||
online Style Guide | ||||
<https://www.rfc-editor.org/styleguide/part2/#inclusive_language> | ||||
and let us know if any changes are needed. Updates of this | ||||
nature typically result in more precise language, which is | ||||
helpful for readers. | ||||
Note that our script did not flag any words in particular, but this | ||||
should still be reviewed as a best practice. | ||||
--> | ||||
</back> | </back> | |||
</rfc> | </rfc> | |||
End of changes. 95 change blocks. | ||||
432 lines changed or deleted | 771 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. |