Embodiments of the invention generally relate to encoding of motion pictures and, more particularly, to a mechanism for facilitating cost-efficient and low-latency encoding of video streams.
Encoding of video streams (e.g., motion pictures) is a well-known technique for removing redundancy from the spatial and temporal domains of the video streams. For example, an I-picture of a video stream is obtained by reducing the spatial redundancy of a given picture of the video stream, while a P-picture is produced by removing the temporal redundancy between a current frame and any previously-encoded (reference) frames or pictures of the video stream. Conventional systems attempt to reduce spatial and temporal redundancy by investigating multiple reference frames to determine redundant portions of video streams; consequently, these systems require high processing time and added hardware resources, inevitably incurring high latency as well as requiring large amounts of memory. The excessive hardware cost makes the conventional systems expensive to employ, while the associated high latency keeps these conventional systems inefficient and unsuitable for certain latency-sensitive applications, such as video conferencing applications and games, etc.
A mechanism for facilitating cost-efficient and low-latency video stream encoding for limited channel bandwidth is described.
In one embodiment, an apparatus includes a source device having encoding logic. The encoding logic includes a first logic to receive a video stream having a plurality of video frames. The video stream is received frame-by-frame. The encoding logic may further include a second logic to determine an input data rate relating to a first current video frame of the plurality of video frames received at the encoding logic, and a third logic to generate one or more zero-delta frames based on the input data rate and allocate the one or more zero-delta frames to one or more first video frames of the plurality of video frames subsequent to the first current video frame.
In one embodiment, a system includes a source device having a processor coupled to a memory device and further having an encoding mechanism. The encoding mechanism is to receive a video stream having a plurality of video frames. The video stream is received frame-by-frame. The encoding mechanism may be further to determine an input data rate relating to a first current video frame of the plurality of video frames received at the encoding mechanism, generate one or more zero-delta frames based on the input data rate, and allocate the one or more zero-delta frames to one or more first video frames of the plurality of video frames subsequent to the first current video frame.
In one embodiment, a method may include receiving a video stream having a plurality of video frames. The video stream is received frame-by-frame. The method may further include determining an input data rate relating to a first current video frame of the plurality of video frames received at the encoding mechanism, generating one or more zero-delta frames based on the input data rate, and allocating the one or more zero-delta frames to one or more first video frames of the plurality of video frames subsequent to the first current video frame.
Embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like reference numerals refer to similar elements:
Embodiments of the invention are directed to facilitating cost-efficient and low-latency video stream encoding for limited channel bandwidth. In one embodiment, this novel scheme applies rate control frame-by-frame such that if a single frame consumes too much bandwidth, the quality of the next (following) frame(s) may be controlled by raising a quantization parameter (QP) value and, at the same time, one or more frames may be skipped by inserting one or more zero-delta prediction (ZDP) frames (ZDPFs) or zero-delta prediction macro-blocks (ZDP-MBs). This novel technique, for example, is distinct from and advantageous over a conventional rate control system where the rate control is performed over a large number of frames to gather information on how much data is accumulated for a leading I-frame and the corresponding set of P-frames that follows it, which, naturally, results in a slow response to the channel status.
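For illustration, the frame-by-frame rate control described above might be sketched as follows; the function name, the single-step QP adjustment, and the bit-budget parameters are hypothetical assumptions made for the sketch, not details of the described embodiment:

```python
import math

def rate_control_step(frame_bits, channel_bits_per_frame, qp, qp_max=51):
    """One frame-by-frame rate-control step (hypothetical sketch).

    If the just-encoded frame overshoots the per-frame channel budget,
    raise the QP for the following frame and schedule enough zero-delta
    prediction frames (ZDPFs) to absorb the excess bits.
    """
    if frame_bits <= channel_bits_per_frame:
        # Frame fits in a single frame time: no skipping, QP may relax.
        return max(qp - 1, 0), 0
    # Raise QP to lower the quality (and size) of the next frame(s).
    next_qp = min(qp + 1, qp_max)
    # Extra frame times needed to drain the overshoot; each becomes a ZDPF.
    zdpf_count = math.ceil(frame_bits / channel_bits_per_frame) - 1
    return next_qp, zdpf_count
```

For instance, a frame needing 1.5 times the per-frame budget would yield one ZDPF under this sketch, responding within a single frame time rather than over a long group of pictures.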
A P-frame or predicted frame may refer to a frame constructed from a previous frame (e.g., through prediction) with some modification (e.g., delta). To calculate the delta portion, an encoder may need a large memory to store one or more full frames. A ZDPF refers to a P-frame having zero delta. Since its delta portion is zero, a ZDPF may be the same as the predicted frame, with no frame memory requirement. A ZDP-MB refers to a zero-delta prediction macro-block, which may include 4×4 or 16×16 pixel blocks of a frame. Generally, an I-frame is composed of all I-MBs, while a P-frame may be composed of I-MBs and P-MBs. A P-MB refers to a macro-block that is composed of prediction and delta, while a ZDP-MB refers to a P-MB with zero delta. Although certain advantages of using a ZDP-MB may be the same as those of using a ZDPF, using ZDP-MBs may provide better fine-grained MB-wise control in choosing between an I-frame and a ZDPF. For example and in one embodiment, decision logic along with hash memory of a data rate measurement module may be used to decide whether to send an I-MB or a ZDP-MB.
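One way the hash-memory-based decision mentioned above might work is sketched below; the use of SHA-256, the per-block hash comparison, and the function interface are all illustrative assumptions rather than the described implementation:

```python
import hashlib

def choose_macroblocks(curr_mbs, prev_hashes):
    """Hypothetical MB-wise decision: send a ZDP-MB when a macro-block
    is unchanged from the reference (its hash matches the stored hash),
    otherwise send an I-MB. Storing hashes instead of full reference
    frames keeps the memory footprint small.
    """
    decisions, new_hashes = [], []
    for i, mb in enumerate(curr_mbs):
        h = hashlib.sha256(bytes(mb)).hexdigest()
        if i < len(prev_hashes) and prev_hashes[i] == h:
            decisions.append("ZDP-MB")   # zero delta: repeat the prediction
        else:
            decisions.append("I-MB")     # intra-code the changed block
        new_hashes.append(h)
    return decisions, new_hashes
```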
In one embodiment, source device 200 employs a dynamic encoding mechanism (encoding mechanism) 210 for dynamic cost-efficient and low-latency frame-by-frame encoding of video streams (e.g., motion pictures). Source device 200 may include an operating system 206 serving as an interface between any hardware or physical resources of the source device 200 and a sink device or a user. Source device 200 may further include one or more processors 202, memory devices 204, network devices, drivers, or the like, as well as input/output (I/O) sources 208, such as a touchscreen, a touch panel, a touch pad, a virtual or regular keyboard, a virtual or regular mouse, etc. Terms like “frame” and “picture” may be used interchangeably throughout this document.
In one embodiment, a data rate of the current frame 422 is calculated using the data rate measurement process 410. For example and in one embodiment, the data rate measurement process 410 may be used to perform several tasks, the results of which may be used to determine the amount of bandwidth required to send or pass the current frame 422 to the sink device. It is contemplated that the data rate measurement process 410 may control the QP value to meet the required channel bandwidth by sacrificing the quality of the image associated with the current frame 422; however, the required bandwidth for the current frame 422 may not be achieved even with a significantly lowered image quality (such as when reaching virtually the minimum image quality). In one embodiment, to overcome this problem, ZDPFs 426 may be generated and inserted into one or more frames subsequent to or following the current frame 422 to carry the additional bandwidth required by the current frame 422. The number of ZDPFs 426, or the number of subsequent frames representing the ZDPFs 426, may be based on the amount of extra bandwidth, as compared to the available channel bandwidth, demanded by the current frame 422. The data rate measurement process 410 may be used to calculate the QP value that is then applied to the next input video frame. Further, using the data rate measurement process 410, the decision to use ZDPFs may also be made. However, the two processes of calculating the QP value and deciding to use a ZDPF are regarded as two separate and independent tasks performed in the data rate measurement process 410. For example and in one embodiment, the decision to use a ZDPF is made from the input data rate (not the QP value) obtained from the data rate measurement process 410.
In one embodiment, ZDPF generation 414 is performed using the ZDPF generator 314 of
Referring now to
In one embodiment, when a ZDPF 444, 448-450, 454-456 is received by a decoder at the sink device, the decoder may simply repeat the previously decoded picture or frame 442, 446, 452, which achieves the same effect with a dynamic frame rate control. For example, when ZDPF 444 (representing frame 6 of the encoded video stream 440) is received at the decoder, the decoder simply repeats the previous frame 5 442 until it reaches the subsequent frame 7; similarly, frame 10 446 may be repeated for the ZDPF-based frames 448-450 until their subsequent frame 13 is reached, and so forth. To explain further, suppose frame 5 442 (the fifth input frame) is a complex frame that needs 1.5 times the bandwidth of a single frame time. To tackle this situation, in one embodiment, the encoding mechanism 210 generates and sends compressed data for frame 5 442 in the fifth frame time, equaling 1.0 of the required 1.5 frame times of bandwidth, and further inserts a ZDPF in frame 6 444, sending it in the sixth frame time to carry the remaining 0.5 of the required bandwidth. Stated differently, in the fifth frame time, the encoding mechanism 210 sends the data of frame 5 442, while in the sixth frame time, the encoding mechanism 210 puts the remaining data of frame 5 442 and a ZDPF in frame 6 444 to be received at the decoder at the sink device.
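The decoder-side behavior described above, repeating the previously decoded picture whenever a ZDPF arrives, can be sketched in a few lines; the function and the string frame labels are illustrative assumptions:

```python
def decode_stream(encoded_frames):
    """Hypothetical decoder-side handling of ZDPFs: on receiving a ZDPF,
    the decoder simply repeats the previously decoded frame, which acts
    as a dynamic frame-rate reduction and needs no frame memory beyond
    the single last-decoded picture.
    """
    displayed, last = [], None
    for frame in encoded_frames:
        if frame == "ZDPF":
            displayed.append(last)   # repeat the previous picture
        else:
            last = frame             # decode and remember the new picture
            displayed.append(last)
    return displayed
```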
Similarly, suppose frame 10 446 is even more complex than frame 5 442 and requires 2.5 times the bandwidth of a single frame time. In this case, the encoding mechanism 210 sends the compressed data of frame 10 446 over the tenth frame time as well as the eleventh and twelfth frame times, using frame 11 448 and frame 12 450, respectively. The ZDPF generation process 414 of
Method 450 begins at block 452 with a current frame of an input video stream being received at the dynamic encoding mechanism employed at a source device coupled to a sink device over a communication network. At block 454, a number of encoding processes (e.g., intra-prediction, transformation, quantization, entropy coding, etc.), as described with reference to
At block 458, if the bandwidth is less than or equal to the channel bandwidth of the single frame time, the current frame data is compressed and the current frame is labeled as an I-picture and transmitted on to the sink device to be handled by its decoder. At block 460, if the bandwidth is determined to be greater than the channel bandwidth of a single frame time, the current frame data is compressed to be delivered over multiple frames. In other words and in one embodiment, the current frame is labeled as an I-picture, while one or more frames following the current frame are assigned ZDPFs to carry the burden of the remaining compressed data and/or provide the additional bandwidth necessitated by the current frame. The current frame (as an I-picture) and the one or more subsequent frames (as ZDPFs) are transmitted over to the sink device to be decoded and displayed. As described earlier, the number of frames to be referenced as ZDPFs may depend on the complexity of the current frame, such as the amount of bandwidth, in addition to or over the normal channel bandwidth, needed to compress the current frame data and transmit the current frame to the sink device.
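The branch described at blocks 458 and 460 might be sketched as a simple scheduling rule; the function name and the bit-count parameters are hypothetical, and real encoders would of course operate on compressed bitstream sizes:

```python
import math

def schedule_frame(compressed_bits, channel_bits_per_frame):
    """Hypothetical scheduling decision: a frame that fits within one
    frame time goes out as a single I-picture; a larger frame is spread
    over several frame times, with ZDPFs assigned to the following frame
    slots to carry the remaining compressed data.
    """
    slots = max(1, math.ceil(compressed_bits / channel_bits_per_frame))
    return ["I-picture"] + ["ZDPF"] * (slots - 1)
```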
In one embodiment, as described with reference to
Further, a data rate measurement process 410 may be used to calculate a QP value that is then applied to the next input video frame. Further, using the data rate measurement process 410, the decision to use ZDP-MBs 526 may also be made. However, the two processes of calculating the QP value and deciding to use a ZDP-MB 526 are regarded as two separate and independent tasks performed in the data rate measurement process 410. For example and in one embodiment, the decision to use a ZDP-MB 526 is made from the input data rate (not the QP value) obtained from the data rate measurement process 410. The higher the QP value determined and used by the data rate measurement process 410, the more the current frame data is compressed, and vice versa. Generally, an I-frame is composed of all I-MBs 424, while a P-frame may be composed of I-MBs 424 and P-MBs. A P-MB refers to a macro-block that is composed of prediction and delta, while a ZDP-MB 526 refers to a P-MB with zero delta. Although certain advantages of using a ZDP-MB 526 may be the same as those of using a ZDPF of
Stated differently, instead of sending a ZDPF in a frame having no information different from that contained in a preceding frame, as described with reference to the previous embodiment, in this embodiment, various I-blocks are distributed over multiple P-pictures. For example, as illustrated in
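One way the distribution of I-blocks over multiple P-pictures described above might be planned is sketched below; the function, the even slicing of macro-blocks, and the dictionary output are illustrative assumptions, not the described embodiment:

```python
import math

def distribute_i_mbs(num_mbs, num_pictures):
    """Hypothetical I-MB distribution: each of num_pictures P-pictures
    intra-codes one slice of the frame's macro-blocks and fills the rest
    with ZDP-MBs, so the full refresh completes over num_pictures frame
    times without any single frame exceeding the per-frame budget.
    """
    per_pic = math.ceil(num_mbs / num_pictures)
    plan = []
    for p in range(num_pictures):
        start, end = p * per_pic, min((p + 1) * per_pic, num_mbs)
        plan.append({"I-MB": list(range(start, end)),
                     "ZDP-MB": [i for i in range(num_mbs)
                                if i < start or i >= end]})
    return plan
```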
Method 550 begins at block 552 with a current frame of an input video stream being received at the dynamic encoding mechanism employed at a source device coupled to a sink device over a communication network. At block 554, a number of encoding processes (e.g., intra-prediction, transformation, quantization, entropy coding, etc.), as described with reference to
At block 558, if the current frame is not too complex and/or its required bandwidth is less than or equal to the channel bandwidth of the single frame time, the current frame data is compressed and the current frame is labeled as an I-picture and transmitted on to the sink device to be handled by its decoder. At block 560, if the current frame is determined to be too complex and/or the bandwidth is determined to be greater than the channel bandwidth of a single frame time, the current frame data is compressed to be delivered over multiple frames. In other words and in one embodiment, the current frame is labeled as an I-picture, while one or more ZDP-MBs are associated with one or more subsequent frames following the current frame. The current frame and the subsequent ZDP-MB-based frames are transmitted on to the sink device to be decoded at a decoder employed by the sink device and subsequently displayed as images on a display device.
In some embodiments, the network unit 610 includes a processor 615 for the processing of data. The processing of data may include the generation of media data streams, the manipulation of media data streams in transfer or storage, and the decrypting and decoding of media data streams for usage. The network device may also include memory to support network operations, such as Dynamic Random Access Memory (DRAM) 620 or other similar memory, and flash memory 625 or other nonvolatile memory. Network device 605 may also include a read-only memory (ROM) and/or other static storage device for storing static information and instructions used by processor 615.
A data storage device, such as a magnetic disk or optical disc and its corresponding drive, may also be coupled to network device 605 for storing information and instructions. Network device 605 may also be coupled to an input/output (I/O) bus via an I/O interface. A plurality of I/O devices may be coupled to the I/O bus, including a display device and an input device (e.g., an alphanumeric input device and/or a cursor control device). Network device 605 may include or be coupled to a communication device for accessing other computers (servers or clients) via an external data network. The communication device may comprise a modem, a network interface card, or another well-known interface device, such as those used for coupling to Ethernet, token ring, or other types of networks.
Network device 605 may also include a transmitter 630 and/or a receiver 640 for transmission of data on the network or the reception of data from the network, respectively, via one or more network interfaces 655. Network device 605 may be the same as the source device 200 employing the cost-efficient, low-latency dynamic encoding mechanism 210 of
Network device 605 may be interconnected in a client/server network system or a communication media network (such as satellite or cable broadcasting). A network may include a communication network, a telecommunication network, a Local Area Network (LAN), Wide Area Network (WAN), Metropolitan Area Network (MAN), a Personal Area Network (PAN), an intranet, the Internet, etc. It is contemplated that there may be any number of devices connected via the network. A device may transfer data streams, such as streaming media data, to other devices in the network system via a number of standard and non-standard protocols.
In the description above, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without some of these specific details. In other instances, well-known structures and devices are shown in block diagram form. There may be intermediate structure between illustrated components. The components described or illustrated herein may have additional inputs or outputs which are not illustrated or described.
Various embodiments of the present invention may include various processes. These processes may be performed by hardware components or may be embodied in computer program or machine-executable instructions, which may be used to cause a general-purpose or special-purpose processor or logic circuits programmed with the instructions to perform the processes. Alternatively, the processes may be performed by a combination of hardware and software.
One or more modules, components, or elements described throughout this document, such as the ones shown within or associated with an embodiment of the dynamic encoding mechanism, may include hardware, software, and/or a combination thereof. In a case where a module includes software, the software data, instructions, and/or configuration may be provided via an article of manufacture by a machine/electronic device/hardware. An article of manufacture may include a machine accessible/readable medium having content to provide instructions, data, etc.
Portions of various embodiments of the present invention may be provided as a computer program product, which may include a computer-readable medium having stored thereon computer program instructions, which may be used to program a computer (or other electronic devices) to perform a process according to the embodiments of the present invention. The machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, compact disk read-only memory (CD-ROM), magneto-optical disks, read-only memory (ROM), random access memory (RAM), erasable programmable read-only memory (EPROM), EEPROM, magnetic or optical cards, flash memory, or other types of media/machine-readable media suitable for storing electronic instructions. Moreover, the present invention may also be downloaded as a computer program product, wherein the program may be transferred from a remote computer to a requesting computer.
Many of the methods are described in their most basic form, but processes can be added to or deleted from any of the methods and information can be added or subtracted from any of the described messages without departing from the basic scope of the present invention. It will be apparent to those skilled in the art that many further modifications and adaptations can be made. The particular embodiments are not provided to limit the invention but to illustrate it. The scope of the embodiments of the present invention is not to be determined by the specific examples provided above but only by the claims below.
If it is said that an element “A” is coupled to or with element “B,” element A may be directly coupled to element B or be indirectly coupled through, for example, element C. When the specification or claims state that a component, feature, structure, process, or characteristic A “causes” a component, feature, structure, process, or characteristic B, it means that “A” is at least a partial cause of “B” but that there may also be at least one other component, feature, structure, process, or characteristic that assists in causing “B.” If the specification indicates that a component, feature, structure, process, or characteristic “may”, “might”, or “could” be included, that particular component, feature, structure, process, or characteristic is not required to be included. If the specification or claim refers to “a” or “an” element, this does not mean there is only one of the described elements.
An embodiment is an implementation or example of the present invention. Reference in the specification to “an embodiment,” “one embodiment,” “some embodiments,” or “other embodiments” means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least some embodiments, but not necessarily all embodiments. The various appearances of “an embodiment,” “one embodiment,” or “some embodiments” are not necessarily all referring to the same embodiments. It should be appreciated that in the foregoing description of exemplary embodiments of the present invention, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims are hereby expressly incorporated into this description, with each claim standing on its own as a separate embodiment of this invention.