The present disclosure relates generally to video teleconferencing, and more specifically to scaling multicast video streams according to an endpoint's coder/decoder (codec) capabilities.
In a typical video communications environment, it is not unusual to have endpoints with heterogeneous or diverse video resolutions and codec capabilities, such as International Telecommunication Union (ITU) Recommendations H.263 and H.264 Advanced Video Coding (AVC) with various profiles. In other words, each endpoint may have a different codec, and the endpoints are therefore heterogeneous with respect to their respective codecs and/or decoding equipment. In a video session among heterogeneous endpoints, the video content quality suffers for some endpoints because the video usually has to be encoded to accommodate the lowest resolution and quality, i.e., a minimum content quality, from among the endpoints in the session. Thus, endpoints with more advanced codecs or better equipment still have to decode video that does not fully utilize their decoding capability. The minimum content quality can result in a poor user experience, i.e., Quality of Experience (QoE), for those participants in the video session whose endpoints are capable of higher-quality decoding.
Scalable video coding (SVC) technologies, such as those defined in H.264 Annex G and the Moving Picture Experts Group (MPEG) MPEG-2 standards, have been proposed to partially address endpoint heterogeneity. Beyond addressing heterogeneity, the advantages of these scalable codecs also include fault resilience and traffic engineering with respect to packet losses and priority queuing in wide area network and wireless network environments. SVC implementations have high complexity, however, and have not yet become widely supported by video endpoints in many environments. Thus, many video communications environments have yet to benefit from SVC multilayer traffic engineering and fault resilience for network Quality of Service (QoS), or from the ability to maximize the content quality delivered to each individual endpoint for QoE.
Overview
Techniques are provided for a scalable video multicast framework. In one form, a network device receives information indicating video decoding parameters of an endpoint network device. One or more video streams are received at the network device. A video stream is generated from the one or more video streams for the endpoint network device based on the video decoding parameters. The video stream is transmitted to the endpoint network device.
In another form, video decoding parameters from a plurality of endpoint network devices are received at a network device. An incoming video stream is received at the network device that is intended for the plurality of endpoint network devices. One or more outgoing video streams are generated from the incoming video stream for the plurality of endpoint network devices based on a highest video quality indicated by the video decoding parameters of each of the plurality of endpoint network devices. The one or more outgoing video streams are transmitted to the plurality of endpoint network devices.
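By way of a non-limiting illustration, the following Python sketch shows one hypothetical way the video decoding parameters that an endpoint publishes might be represented and serialized; the class name, field names, and JSON encoding are assumptions of this sketch, not part of any standard or of the techniques described herein.

    from dataclasses import dataclass, asdict
    import json

    @dataclass
    class VideoDecodingParameters:
        # Hypothetical description an endpoint publishes when joining a session.
        codec: str            # e.g., "H.263", "H.264-AVC", "H.264-SVC"
        width: int            # maximum decodable horizontal resolution
        height: int           # maximum decodable vertical resolution
        frame_rate: int       # maximum frame rate in frames per second
        bitrate_kbps: int     # desired bit rate in kilobits per second

    def publish(params: VideoDecodingParameters) -> bytes:
        # Serialize the parameters for transmission to the associated service node.
        return json.dumps(asdict(params)).encode("utf-8")

    if __name__ == "__main__":
        print(publish(VideoDecodingParameters("H.264-AVC", 1280, 720, 30, 1500)))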
Referring first to FIG. 1, an example network environment is shown that comprises a video source 110, a service node 120, endpoint service nodes 130(1) and 130(2), and endpoints 140(1) and 140(2).
The video source 110 can be any video source for video that is to be distributed over a network to various endpoints, e.g., switched digital video, video on demand, Internet video, etc. The video source may encode the video with its native or embedded codec, e.g., an H.264 AVC, an H.263, or an MPEG-2 video codec. The video source 110 transmits video to the service node 120. The service node 120 transcodes the video to an SVC video stream by executing the scalable video generation process logic 600. The SVC stream comprises at least a base layer and one or more optional enhancement layers. The SVC standard is built upon the AVC standard, and the base layer conforms to the AVC standard. The SVC stream is multicast to endpoints that have joined the multicast group, e.g., endpoints 140(1) and 140(2).
Before reaching the endpoints 140(1) and 140(2), the SVC video stream passes through endpoint service nodes 130(1) and 130(2). The endpoint service nodes 130(1) and 130(2) adapt the SVC video stream to each endpoint's decoding requirements. For example, endpoint service node 130(1) knows that its associated endpoint, endpoint 140(1), has an AVC decoder, and will forward the SVC base layer when executing the scalable video multicast process logic 500. Similarly, endpoint service node 130(2) knows that endpoint 140(2) has an H.263 decoder and will transcode the SVC stream to an H.263 compliant stream before sending it to endpoint 140(2). Depending on system requirements, each of the endpoint service nodes may or may not be configured with a transcoder.
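A minimal sketch of this per-endpoint adaptation decision is given below, assuming the SVC stream is modeled as an ordered list of layers with the AVC-compliant base layer at index 0; the function name and return convention are hypothetical.

    def adapt_for_endpoint(svc_layers, endpoint_codec):
        # Decide how an endpoint service node handles an incoming SVC stream.
        # svc_layers: ordered list of layer payloads, index 0 is the AVC-compliant
        #             base layer, higher indices are enhancement layers.
        # endpoint_codec: codec advertised by the endpoint, e.g., "SVC", "AVC", "H.263".
        if endpoint_codec == "SVC":
            # The endpoint can decode the multilayer stream directly.
            return ("forward", svc_layers)
        if endpoint_codec == "AVC":
            # The SVC base layer is AVC compliant, so forwarding layer 0 avoids transcoding.
            return ("forward", svc_layers[:1])
        # Otherwise, e.g., H.263 or MPEG-2, the stream must be transcoded.
        return ("transcode", endpoint_codec)

    print(adapt_for_endpoint(["L0", "L1", "L2"], "AVC"))    # ('forward', ['L0'])
    print(adapt_for_endpoint(["L0", "L1", "L2"], "H.263"))  # ('transcode', 'H.263')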
In one example, the service nodes 120, 130(1), and 130(2) receive information about how the video is to be encoded for each endpoint. The information may include parameters such as codec type, desired bit rate, frame rate, video resolution, etc. The bit rates may be lowered by the encoders via quantization parameters, which also lowers the signal-to-noise ratio (SNR); i.e., a lower bit rate results in a lower SNR and, therefore, a lower video quality. Once the service node 120 receives this information from all of the endpoints that will receive the multicast SVC stream, the service node 120 can encode the SVC stream to the highest video quality that will be used by the multicast group. In some instances, the video source 110 may provide an SVC compliant stream and the service node 120 will not have to perform transcoding.
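The following sketch illustrates, under simplified assumptions, how a service node might derive the highest video quality to encode from the parameters published by the endpoints in a multicast group; the dictionary keys are hypothetical.

    def highest_quality_target(endpoint_params):
        # Given the decoding parameters published by every endpoint in the
        # multicast group, return the most demanding values so the SVC stream
        # is encoded to the highest quality any endpoint will use.
        return {
            "width": max(p["width"] for p in endpoint_params),
            "height": max(p["height"] for p in endpoint_params),
            "frame_rate": max(p["frame_rate"] for p in endpoint_params),
            "bitrate_kbps": max(p["bitrate_kbps"] for p in endpoint_params),
        }

    group = [
        {"width": 352, "height": 288, "frame_rate": 15, "bitrate_kbps": 384},    # H.263 endpoint
        {"width": 1280, "height": 720, "frame_rate": 30, "bitrate_kbps": 1500},  # AVC endpoint
    ]
    print(highest_quality_target(group))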
As can be seen from the examples described in connection with FIG. 1, the service nodes adapt the video delivered to each endpoint to that endpoint's own decoding capability, rather than forcing every endpoint to a minimum content quality.
Referring to FIG. 2, an example video conferencing environment is shown that comprises a conference bridge 210, service nodes 220(1)-220(4), and conference endpoints 230(1)-230(4).
As shown in FIG. 2, the conference bridge 210 encodes an outgoing multicast SVC stream that is distributed to the conference endpoints 230(1)-230(4) via the corresponding service nodes 220(1)-220(4).
Each of the conference endpoints 230(1)-230(4) has a different codec, e.g., the conference endpoints 230(1)-230(4) have H.263, AVC, MPEG-2, and SVC codecs, respectively. The conference bridge 210 will account for the various codecs and other requirements when encoding the outgoing multicast SVC stream. The corresponding service nodes 220(1)-220(4) will then transcode the SVC stream or forward the appropriate SVC layers to a corresponding conference endpoint.
The SVM framework can support any video endpoint whether or not an endpoint can decode a multilayer video stream that has been encoded according to a scalable coding scheme. When joining a session, an endpoint publishes its desired video quality and codec, and the SVM framework uses this description to automatically allocate and associate a service node to the endpoint. The associated service node intercepts video streams going to and from the endpoint, depending on whether the endpoint joined the session to receive and/or send video, and performs any necessary transcoding, content conversion, bit-stream rewriting (multilayer to/from single-layer), up-scaling with enhancement, and/or quality down-scaling. If an endpoint both sends and receives video in the same session, SVM may allocate either one or two service nodes to handle the video streams in both directions. If only one service node is allocated, the service node may need to handle codec asymmetry that may exist in each direction. If two service nodes are allocated, they can operate separately for each direction.
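One possible, greatly simplified session-join and service-node allocation policy is sketched below; the function name, the free_nodes pool, and the one-node-per-direction choice are assumptions made for illustration only.

    def join_session(endpoint_id, desired_codec, desired_quality, sends, receives, free_nodes):
        # Hypothetical session-join handler: associate one or more service nodes
        # with the endpoint based on its published codec and quality.
        # free_nodes is a list of available service node identifiers.
        allocation = {
            "endpoint": endpoint_id,
            "codec": desired_codec,
            "quality": desired_quality,
            # A single node may serve both directions, or one node per direction;
            # for clarity this sketch allocates one node per active direction.
            "nodes": {},
        }
        if receives:
            allocation["nodes"]["receive"] = free_nodes.pop()
        if sends:
            allocation["nodes"]["send"] = free_nodes.pop()
        return allocation

    pool = ["sn-3", "sn-2", "sn-1"]
    print(join_session("ep-230-1", "H.263", "CIF@15fps", sends=True, receives=True, free_nodes=pool))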
Referring now to FIG. 3, an example block diagram of a service node configured to perform the scalable video multicast process logic 500 is shown. The service node comprises, among other components, a processor 310 and a memory 330.
The processor 310 is a data processing device, e.g., a microprocessor, a microcontroller, a system on a chip (SoC), or other fixed or programmable logic. The processor 310 interfaces with the memory 330, which may be any form of random access memory (RAM) or other data storage memory block that stores data used for the techniques described herein. The memory 330 may be separate from or part of the processor 310. Instructions for performing the scalable video multicast process logic 500 may be stored in the memory 330 for execution by the processor 310.
The functions of the processor 310 may be implemented by logic encoded in one or more tangible media, e.g., embedded logic such as an application specific integrated circuit (ASIC), digital signal processor (DSP) instructions, or software that is executed by a processor. The memory 330 stores data used for the computations or functions described herein, and/or stores software or processor instructions that are executed to carry out those computations or functions. Thus, the process 500 may be implemented with fixed logic or programmable logic, e.g., software or computer instructions executed by a processor or a field programmable gate array (FPGA), or a computer readable tangible medium may be encoded with instructions that, when executed by a processor, cause the processor to execute the process 500.
Referring to FIG. 4, an example block diagram of a network device configured to perform the scalable video generation process logic 600 is shown; its hardware components are similar to those described above in connection with FIG. 3.
Turning now to FIG. 5, the scalable video multicast process logic 500 is described. At 510, information indicating video decoding parameters of an endpoint network device is received at a network device.
At 520, one or more video streams are received by the network device. The one or more video streams may be multicast video streams that were encoded according to a scalable video encoding scheme that comprises at least a base layer and one or more optional enhancement layers. At 530, a video stream is generated for the endpoint network device from the one or more video streams based on the video decoding parameters of the endpoint network device. At 540, the video stream is transmitted to the endpoint network device. In one example, one or more optional enhancement layers may be removed from the one or more multicast video streams and any remaining layers are forwarded to the endpoint network device.
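A minimal sketch of steps 520-540, under the assumption that the incoming multicast stream is modeled as a list of layers and that the endpoint's published parameters include a hypothetical max_layers field, is as follows.

    def scalable_video_multicast(endpoint_params, incoming_layers):
        # Keep the base layer plus only as many enhancement layers as the
        # endpoint's published parameters call for, then transmit the result.
        # endpoint_params: dict with hypothetical 'max_layers' and 'address' fields.
        # incoming_layers: list of layer payloads, index 0 = base layer.
        wanted = incoming_layers[: endpoint_params.get("max_layers", 1)]
        return transmit(wanted, endpoint_params["address"])

    def transmit(layers, address):
        # Placeholder for the actual network send (e.g., unicast RTP to the endpoint).
        print(f"sending {len(layers)} layer(s) to {address}")
        return layers

    scalable_video_multicast({"max_layers": 2, "address": "endpoint-140-1"}, ["L0", "L1", "L2"])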
Referring to FIG. 6, the scalable video generation process logic 600 is described. At 610, video decoding parameters from a plurality of endpoint network devices are received at a network device. At 620, an incoming video stream that is intended for the plurality of endpoint network devices is received at the network device.
At 630, one or more outgoing video streams are generated for the plurality of endpoint network devices based on a highest video quality indicated by the video decoding parameters. In one example, the one or more outgoing video streams are encoded according to a scalable video encoding scheme, e.g., the SVC standard, with a base layer and one or more optional enhancement layers. In one example, the one or more outgoing video streams are generated as a plurality of multicast streams. Each endpoint network device may be joined to the appropriate multicast group in order to receive video generated according to the respective decoding parameters. At 640, the one or more outgoing video streams are transmitted to the plurality of endpoint network devices.
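The following sketch illustrates steps 610-640 at a very high level; the placeholder encoder output, the num_layers field, and the returned group memberships are assumptions of this illustration rather than a definitive implementation.

    def scalable_video_generation(all_params, incoming_frames):
        # Derive the highest requested quality, stand in for an SVC encoder that
        # produces a base layer plus enhancement layers, and map each endpoint to
        # the multicast groups for the layers it needs.
        target_kbps = max(p["bitrate_kbps"] for p in all_params.values())
        layers = ["layer0(base)", "layer1", "layer2"]  # placeholder for encoder output
        memberships = {
            ep: layers[: p.get("num_layers", 1)] for ep, p in all_params.items()
        }
        return target_kbps, layers, memberships

    params = {
        "ep-A": {"bitrate_kbps": 1500, "num_layers": 3},
        "ep-B": {"bitrate_kbps": 384, "num_layers": 1},
    }
    print(scalable_video_generation(params, incoming_frames=None))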
In the example network 700 shown in FIG. 7, the video source 710 sends an AVC video stream that is initially encoded with a high quality. At the transcoder node 720, the high quality AVC stream is transcoded by the scalable video generation process logic 600 into an SVC video stream comprising three layers. The AVC base layer is passed through the transcoder node 720 and is denoted "SVC layer 0," while two enhancement layers are denoted SVC layer 1 and SVC layer 2. Each of the SVC layers is individually encapsulated into a separate Real-time Transport Protocol (RTP) stream as shown. The stream for SVC layer 0 is depicted as a thicker line, the stream for SVC layer 1 is depicted as a dashed line, and the stream for SVC layer 2 is depicted as a thinner line. Any transcoded stream transmitted to an endpoint is also depicted as a thinner line, e.g., the AVC high quality stream that is transmitted from endpoint service node with XCoder 740(2) to endpoint with AVC decoder 750(2) and the AVC medium quality stream that is transmitted from service node with XCoder 740(3) to endpoint with AVC decoder 750(3). The streams are multicast along distribution trees built for each scalable layer, as described below.
Thus, the packets in each scalable layer are sent only to those service nodes that require the respective scalable layer. This is possible because the transcoder node 720 has received each endpoint's video requirements (decoding parameters), which were published in advance by endpoint service nodes 740(1)-740(4) on behalf of endpoints 750(1)-750(4), respectively. Each multicast distribution tree is built at the granularity of scalable layers and based on the published requirements from the endpoints 750(1)-750(4). For example, the stream for SVC layer 0 is needed by all of the endpoints 750(1)-750(4), and as such, each of the endpoints 750(1)-750(4) is joined to the multicast group for SVC layer 0. The stream for SVC layer 1 is needed by endpoints 750(1)-750(3), and those endpoints are joined to the multicast group for SVC layer 1. The stream for SVC layer 2 is needed only by endpoint 750(2), and endpoint 750(2) is joined to the multicast group for SVC layer 2. Each of the endpoint service nodes 740(1)-740(4) can convert the multicast streams to unicast streams when transmitting the SVC layers or transcoded video to each of the endpoints 750(1)-750(4).
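A simplified sketch of building per-layer multicast group memberships from the published requirements is shown below; representing each endpoint's requirement as a number of layers is an assumption of the sketch, and the example data reproduces the memberships described above.

    def layer_memberships(published_requirements):
        # Group endpoints by the scalable layers they need so that each layer's
        # packets are multicast only toward service nodes that require them.
        # published_requirements maps endpoint -> number of layers (hypothetical).
        groups = {}
        for endpoint, num_layers in published_requirements.items():
            for layer in range(num_layers):
                groups.setdefault(layer, set()).add(endpoint)
        return groups

    reqs = {"750-1": 2, "750-2": 3, "750-3": 2, "750-4": 1}
    for layer, members in sorted(layer_memberships(reqs).items()):
        print(f"SVC layer {layer} multicast group: {sorted(members)}")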
The SVM framework can be easily implemented by any receiver-based multicast method that uses techniques to prevent loops from forming during the forwarding process, e.g., Reverse Path Forwarding (RPF) as used according to the Protocol Independent Multicast (PIM) and Internet Group Management Protocol (IGMP) standards.
In addition, traffic engineering can be applied on a per-layer basis by way of the multiple scalable layers for use in various applications. For example, each scalable layer may have its packets denoted with a QoS marking different from those in other scalable layers, for administrative purposes in traffic engineering and network bandwidth reservation as defined in existing methods, such as Differentiated Services Code Point (DSCP) marking and the Resource Reservation Protocol (RSVP). SVM service nodes can perform QoS marking on SVC packets transparently to the endpoints so that customized traffic engineering with SVC can be easily deployed for optimized QoE and network efficiency.
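By way of illustration only, a service node might assign a distinct DSCP value per scalable layer as sketched below; the specific code points (AF41, AF42, AF43) and the layer-to-marking mapping are examples, not a prescribed configuration.

    # Hypothetical per-layer DSCP assignment: the base layer gets the lowest
    # drop precedence so it is better protected under congestion; the code
    # points (AF41=34, AF42=36, AF43=38) are examples only.
    LAYER_DSCP = {0: 34, 1: 36, 2: 38}

    def mark_packet(layer_id, packet_bytes):
        # Return the packet together with the DSCP value a service node would
        # write into the IP header for this scalable layer (illustrative only).
        return LAYER_DSCP.get(layer_id, 0), packet_bytes

    print(mark_packet(0, b"base-layer-slice"))
    print(mark_packet(2, b"enhancement-slice"))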
In this example, the RTP streams arrive at router 730(1), which routes SVC layers 0-2 to router 730(2) and routes SVC layers 0 and 1 to router 730(3). Router 730(2), in turn, routes SVC layers 0 and 1 to service node 740(1) and routes SVC layers 0-2 to service node with XCoder 740(2), while router 730(3) routes SVC layers 0 and 1 to service node with XCoder 740(3) and routes SVC layer 0 to endpoint service node 740(4).
Service node 740(1) forwards SVC layers 0 and 1 to the endpoint 750(1) that employs an SVC decoder. SVC layers 0 and 1 provide SVC video at medium quality. Had endpoint 750(1) indicated in the decoding parameters that high quality video was desired, then SVC layer 2 could have been multicast to endpoint 750(1). Note that no transcoding is necessary at endpoint service node 740(1). At service node 740(2), SVC layers 0-2 are transcoded to provide high quality AVC video to endpoint 750(2), and at service node 740(3), SVC layers 0 and 1 are transcoded to provide medium quality AVC video to endpoint 750(3). Service node 740(4) passes SVC layer 0 to endpoint 750(4) to provide SVC base quality to the AVC decoder.
SVM service nodes are typically deployed in proximity to the network endpoints, e.g., with or as part of Internet edge routers on enterprise campuses and at branches. An SVM implementation may be included in an application with a centralized call control agent, such as the Cisco Unified Communications Manager (CUCM) or other call manager. With a call manager, the SVM service nodes can act as media resources similar to Media Termination Points (MTPs) or hardware transcoders, such that they may be allocated by the call manager when the call manager sets up a switched video conference with heterogeneous endpoints. The call manager can use standard or proprietary signaling protocols, e.g., Session Initiation Protocol (SIP)/Session Description Protocol (SDP), H.323, or Skinny Client Control Protocol (SCCP), to negotiate the desired codec and quality of each endpoint.
With the deployment of SVM service nodes, the video stream originating from a conference bridge can be multicast in scalable-coded layers to traverse the network, and then reconstructed by SVM service nodes at the edge routers before reaching the endpoints. In the reverse direction, the video streams originating from the endpoints can optionally traverse the network via the SVM service nodes; but because each such stream has only one receiver, e.g., a conference bridge, multicast streams may simply be replaced by a unicast stream that uses scalable coding, e.g., for traffic engineering in the reverse path.
Referring to FIG. 8, an example network 800 is shown that is similar to network 700, except that one or more of the routers are media-aware network elements (MANEs), e.g., MANE router 810(3).
In network 800, the endpoints 750(1)-750(4) receive essentially the same video and quality, and perform essentially the same functions, as those described in connection with network 700 (FIG. 7).
Referring to FIG. 9, an example is shown in which the SVC layers are aggregated into fewer RTP streams: SVC layers 0 and 1 are carried in a first RTP multicast stream 910, and SVC layer 2 is carried in a second RTP multicast stream 920.
The router 730(2) forwards the first and second RTP multicast streams 910 and 920 to service node with XCoder 740(2) and the first RTP multicast stream 910 to service node 740(1). In this example, service node with XCoder 740(2) combines the first and second RTP multicast streams 910 and 920 when transcoding video for network endpoint 750(2) during execution of the scalable video multicast process logic 500. MANE router 810(3) forwards the first RTP multicast stream 910 to service node with XCoder 740(3), removes SVC layer 1 information from the first RTP multicast stream 910, and transmits the modified RTP multicast stream 930 with SVC layer 0 to service node 740(4). The service node 740(4) forwards the SVC layer 0 to endpoint 750(4) thereby delivering base quality video.
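A simplified sketch of the MANE behavior of removing enhancement-layer data from an aggregated stream is given below; packets are modeled as (layer, payload) pairs, whereas a real MANE would derive the layer from the SVC NAL unit or RTP payload headers, which this illustration deliberately omits.

    def mane_filter(rtp_packets, max_layer):
        # Drop enhancement-layer data above max_layer from an aggregated stream.
        # Each packet here is modeled as a (layer_id, payload) pair.
        return [(layer, payload) for layer, payload in rtp_packets if layer <= max_layer]

    aggregated_910 = [(0, b"base"), (1, b"enh-1"), (0, b"base"), (1, b"enh-1")]
    stream_930 = mane_filter(aggregated_910, max_layer=0)  # SVC layer 0 only, toward 740(4)
    print(stream_930)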
The examples described in connection with FIGS. 7-9 illustrate how the SVM framework delivers to each endpoint the video quality that the endpoint can decode, while multicasting only those scalable layers that are needed downstream.
Techniques are provided for an SVM video-service framework. The SVM video-service framework seamlessly adapts video for heterogeneous video endpoints that may have diverse codec capabilities such as different video resolutions and codec attributes, but do not necessarily support scalable video coding and/or multicast streaming. SVM uses in-network media services to perform multicast streaming and to convert video formats to and from a scalable coding scheme, on behalf of the endpoints transparently, in order to maximize the QoE at the video endpoints and facilitate traffic engineering for network QoS.
The above description is intended by way of example only.