The present disclosure is generally related to decoding video data.
Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, so-called “smart phones,” video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video coding techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), ITU-T H.265/High Efficiency Video Coding (HEVC), ITU-T H.266/Versatile Video Coding (VVC) and extensions of such standards. Such video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video coding techniques.
Conventionally, a video device can implement a video playback dataflow that includes a video decoder receiving an input data frame via a bitstream and also receiving reference frame pixel data from a dynamic random access memory (DRAM) to generate a reconstructed frame (a “decoded frame”) of the video, which is stored back into the DRAM. If the decoded frame has a higher resolution than is supported by a display device for video playback, a downscale pass is performed in which the decoded frame is transferred from the DRAM to a graphics processing unit (GPU) to generate a downscaled frame that matches a resolution supported by the display device.
For example, when a user selects to play an 8K video on a smart phone, the decoded video frames may have a resolution of 8192×4320 pixels, while the display may only be capable of supporting a resolution of 3120×1440 pixels. In this case, each decoded frame is stored into the DRAM, and a GPU performs a downscale pass to generate a downscaled frame, which is stored to the DRAM. Downscaled frames are read from the DRAM to a display processing unit (DPU) and provided to the display device via a display refresh.
Performance of the downscale pass, which can include waking the GPU for each frame of the video data, transferring decoded full-resolution frames from the DRAM to the GPU, downscaling at the GPU, and transferring the downscaled frames to the DRAM, consumes additional power, uses DRAM and GPU resources, and increases data traffic.
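For purposes of illustration only, the following sketch models the data movement of this conventional dataflow in Python. The function names, the dictionary-based DRAM model, and the resolutions are hypothetical stand-ins for the hardware blocks described above, not an actual decoder implementation:

```python
import numpy as np

# Minimal sketch of the conventional playback dataflow described above. The
# names (decode, gpu_downscale, DRAM) are hypothetical stand-ins; the point
# is the data movement through DRAM, not codec fidelity.

DRAM = {}  # models DRAM as a dict: buffer name -> pixel array

def decode(encoded_frame):
    """Stand-in decoder: pretend every frame decodes to an 8192x4320 luma plane."""
    return np.zeros((4320, 8192), dtype=np.uint8)

def gpu_downscale(frame, out_h, out_w):
    """Stand-in GPU downscale pass: nearest-neighbor subsampling."""
    ys = np.linspace(0, frame.shape[0] - 1, out_h).astype(int)
    xs = np.linspace(0, frame.shape[1] - 1, out_w).astype(int)
    return frame[np.ix_(ys, xs)]

def play_conventional(bitstream, out_h=1440, out_w=3120):
    for i, encoded in enumerate(bitstream):
        DRAM[f"decoded_{i}"] = decode(encoded)                  # decoder -> DRAM
        full = DRAM[f"decoded_{i}"]                             # DRAM -> GPU (wake GPU)
        DRAM[f"down_{i}"] = gpu_downscale(full, out_h, out_w)   # GPU -> DRAM
        refresh = DRAM[f"down_{i}"]                             # DRAM -> DPU for display
```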
According to a particular implementation of the techniques disclosed herein, a device includes a memory configured to store video data. The device also includes a video decoder coupled to the memory and to a cache. The video decoder is configured to decode an input frame of the video data to generate a first video frame and includes an inline downscaler configured to generate a second video frame corresponding to the first video frame downscaled for display output.
According to a particular implementation of the techniques disclosed herein, a method of processing video data includes obtaining, at a video decoder, an input frame of video data. The method includes decoding, at the video decoder, the input frame to generate a first video frame. The method also includes generating, at an inline downscaler of the video decoder, a second video frame corresponding to the first video frame downscaled for display output.
According to a particular implementation of the techniques disclosed herein, a non-transitory computer-readable medium includes instructions that, when executed by one or more processors that include a video decoder, cause the one or more processors to decode, at the video decoder, an input frame of video data to generate a first video frame. The instructions, when executed by the one or more processors, also cause the one or more processors to generate, at an inline downscaler of the video decoder, a second video frame corresponding to the first video frame downscaled for display output.
According to a particular implementation of the techniques disclosed herein, an apparatus includes means for obtaining an input frame of video data. The apparatus also includes means for decoding the input frame to generate a first video frame, the means for decoding including means for inline downscaling to generate a second video frame corresponding to the first video frame downscaled for display output.
Other implementations, advantages, and features of the present disclosure will become apparent after review of the entire application, including the following sections: Brief Description of the Drawings, Detailed Description, and the Claims.
Systems and methods to perform video decoding using a video decoder with an inline downscaler are disclosed. In conventional video decoding techniques, when a decoded frame has a higher resolution than is supported by a display device for video playback, a downscale pass is performed in which the decoded frame is transferred from DRAM to a GPU to generate a downscaled frame that matches a resolution supported by the display device. Performance of the downscale pass can include waking the GPU for each frame of the video data, transferring decoded full-resolution frames from the DRAM to the GPU, downscaling at the GPU, and transferring the downscaled frame to the DRAM, which consumes power, uses DRAM and GPU resources, and increases data traffic.
The disclosed systems and methods include techniques to bypass GPU downscale processing of full-resolution video frames by using an inline downscaler at the video decoder to generate downscaled frames. Because the video decoder can generate full-resolution and downscaled versions of each video frame, the downscaled versions of the video frames output by the video decoder can be provided to a display unit for output without accessing the GPU. As a result, use of the inline downscaler in the video decoder provides the technical advantages of generating reduced resolution video frames for playout without incurring the power consumption, DRAM and GPU resource consumption, and data traffic that result from downscaling using the conventional GPU downscaling pass.
In some aspects, the downscaled frames are stored at a system cache/on-chip memory instead of at the DRAM and retrieved from the system cache/on-chip memory for playout. Using the system cache/on-chip memory for storage and retrieval of downscaled video frames provides the technical advantage of reducing DRAM usage and data traffic associated with transferring the frames into and out of the DRAM.
According to an aspect, additional benefits are obtained by selectively storing the full-resolution video frames generated by the video decoder into the DRAM based on whether the full-resolution video frames are reference frames that will later be used to decode other video frames. For example, the video decoder may decode input frames from a bitstream that also includes indications of which input frames are reference frames. The video decoder may store a particular full-resolution video frame to the DRAM only if the bitstream indicates that the input frame is a reference frame; otherwise, the full-resolution video frame may be discarded. As a result, storage of full-resolution non-reference frames to the DRAM is skipped, providing the technical advantage of further reducing memory usage, data traffic, and power consumption associated with video decoding and playback.
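As a non-limiting sketch of this storage policy, assuming (hypothetically) that bitstream parsing yields a per-frame is_reference flag and that DRAM is modeled as a dictionary:

```python
# Sketch of the selective full-resolution storage described above. The
# is_reference flag and the dict-based DRAM model are assumptions for
# illustration; the actual bitstream indication format is codec-specific.

def store_full_resolution(frame_id, decoded_frame, is_reference, dram):
    if is_reference:
        dram[frame_id] = decoded_frame   # needed to decode later frames
    # Otherwise the full-resolution frame is discarded; only the downscaled
    # copy proceeds toward the display path, so the DRAM write is skipped.
```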
Particular aspects of the present disclosure are described below with reference to the drawings. In the description, common features are designated by common reference numbers. As used herein, various terminology is used for the purpose of describing particular implementations only and is not intended to be limiting of implementations. For example, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. Further, some features described herein are singular in some implementations and plural in other implementations. To illustrate,
It may be further understood that the terms “comprise,” “comprises,” and “comprising” may be used interchangeably with “include,” “includes,” or “including.” Additionally, it will be understood that the term “wherein” may be used interchangeably with “where.” As used herein, “exemplary” may indicate an example, an implementation, and/or an aspect, and should not be construed as limiting or as indicating a preference or a preferred implementation. As used herein, an ordinal term (e.g., “first,” “second,” “third,” etc.) used to modify an element, such as a structure, a component, an operation, etc., does not by itself indicate any priority or order of the element with respect to another element, but rather merely distinguishes the element from another element having a same name (but for use of the ordinal term). As used herein, the term “set” refers to one or more of a particular element, and the term “plurality” refers to multiple (e.g., two or more) of a particular element.
As used herein, “coupled” may include “communicatively coupled,” “electrically coupled,” or “physically coupled,” and may also (or alternatively) include any combinations thereof. Two devices (or components) may be coupled (e.g., communicatively coupled, electrically coupled, or physically coupled) directly or indirectly via one or more other devices, components, wires, buses, networks (e.g., a wired network, a wireless network, or a combination thereof), etc. Two devices (or components) that are electrically coupled may be included in the same device or in different devices and may be connected via electronics, one or more connectors, or inductive coupling, as illustrative, non-limiting examples. In some implementations, two devices (or components) that are communicatively coupled, such as in electrical communication, may send and receive signals (e.g., digital signals or analog signals) directly or indirectly, via one or more wires, buses, networks, etc. As used herein, “directly coupled” may include two devices that are coupled (e.g., communicatively coupled, electrically coupled, or physically coupled) without intervening components.
In the present disclosure, terms such as “obtaining,” “determining,” “calculating,” “estimating,” “shifting,” “adjusting,” etc. may be used to describe how one or more operations are performed. It should be noted that such terms are not to be construed as limiting and other techniques may be utilized to perform similar operations. Additionally, as referred to herein, “obtaining,” “generating,” “calculating,” “estimating,” “using,” “selecting,” “accessing,” and “determining” may be used interchangeably. For example, “obtaining,” “generating,” “calculating,” “estimating,” or “determining” a parameter (or a signal) may refer to actively generating, estimating, calculating, or determining the parameter (or the signal) or may refer to using, selecting, retrieving, receiving, or accessing the parameter (or signal) that is already generated, such as by another component or device.
Referring to
The one or more processors 116 are configured to execute the instructions 112 to perform operations associated with decoding encoded video data 122 at the video decoder 124. In various implementations, some or all of the functionality associated with the video decoder 124 is performed via execution of the instructions 112 by the one or more processors 116, performed by processing circuitry of the one or more processors 116 in a hardware implementation, or a combination thereof.
The one or more processors 116 include the video decoder 124 coupled to an encoded data source 120. The video decoder 124 is also coupled to a system cache/on-chip memory 150, which is also referred to herein as a cache 150. The video decoder 124 is configured to obtain the encoded video data 122 from the encoded data source 120. For example, the encoded data source 120 may correspond to a portion of one or more media files (e.g., a media file including the encoded video data 122 that is retrieved from the memory 110), a game engine, one or more other sources of video information, such as a remote media server, or a combination thereof.
In a particular implementation, the cache 150 and the video decoder 124 are integrated into a single substrate 190 (e.g., a single chip). Although the cache 150 is illustrated as distinct from and coupled to the video decoder 124, in other examples the cache 150 is integrated in the video decoder 124. According to an aspect, the cache 150 includes a static random access memory (SRAM).
The video decoder 124 is configured to decode an input frame of the video data 122 to generate a first video frame, illustrated as a decoded frame 130, and includes the inline downscaler 126 configured to generate a second video frame corresponding to the first video frame downscaled for display output, illustrated as a downscaled frame 132. For example, the video decoder 124 may be configured to receive a first input frame 162 in the encoded video data 122 and to process the first input frame 162 to generate the decoded frame 130 that has a first resolution (e.g., 8192×4320 pixels) that is determined by the encoded video data 122 and that is used in conjunction with decoding other frames of the encoded video data 122. To illustrate, the encoded video data 122 can include motion vectors that are associated with one or more previously decoded reference frames that are based on the first resolution and used for pixel prediction. However, the first resolution may be too large for the display device 104. For example, the display device 104 may support resolutions up to 3120×1440 pixels and may therefore be unable to display the decoded frame 130.
The inline downscaler 126 is configured to generate the downscaled frame 132 that corresponds to a downscaled version of the decoded frame 130 and that has a resolution that is supported by the display device 104. The decoded frame 130 and the downscaled frame 132 may be output by the video decoder 124 in parallel. For example, the downscaled frame 132 may be output to the cache 150 for storage and later retrieval by a display unit 140 (e.g., a DPU), and in parallel, the decoded frame 130 may be output to the memory 110 for storage, such as included with the decoded frame(s) 134. The decoded frame(s) 134 may include one or more reference frames, one or more non-reference frames, or a combination thereof. Additional details regarding operation of the video decoder 124 and the inline downscaler 126 are provided with reference to
Optionally, the video decoder 124 may be configured to selectively store the decoded frame 130 into the memory 110 based on whether the decoded frame 130 corresponds to a reference frame. To illustrate, the encoded video data 122 may include information that indicates which frames are reference frames, and the video decoder 124 may use the information to determine whether to store the decoded frame 130 or discard the decoded frame 130. For example, the decoded frame 130 may be stored into the memory 110 based on a determination that the decoded frame 130 is a reference frame, and storage of full-resolution non-reference frames to the memory 110 may be skipped. Skipping storage of full-resolution non-reference frames reduces the memory access bandwidth, power consumption, and storage capacity of the memory 110 used during decoding.
The display unit 140 is configured to receive the downscaled frame 132 and to generate a video data output 142, such as a display refresh, to the display device 104. To illustrate, the display unit 140 is configured to receive the downscaled frame 132 from the cache 150. The display unit 140 may also retrieve additional data from the memory 110 for use in conjunction with processing the downscaled frame 132 (e.g., layer composition) to generate the video data output 142.
The display device 104 is configured to display the video data output 142, which is based on the downscaled frame 132. For example, the video data output 142 can include a reconstructed frame that is based on the downscaled frame 132 for viewing by a user of the device 102.
The device 102 optionally includes a modem 118 that is coupled to the one or more processors 116 and configured to enable communication with one or more other devices, such as via one or more wireless networks. According to some aspects, the modem 118 is configured to receive the encoded video data 122 from a second device, such as video data that is streamed via a wireless transmission 194 from a remote device 192 (e.g., a remote server) for playback at the device 102.
During operation, the encoded video data 122 may be received at the video decoder 124 as a bitstream that includes a sequence of frames including a first input frame 162, a second input frame 164, and one or more additional input frames including an Nth input frame 166 (N is a positive integer). The encoded video data 122 is processed by the video decoder 124 to generate full-resolution decoded frames and downscaled frames for each of the input frames 162-166. For example, the video decoder 124 processes the first input frame 162 to generate the full-resolution decoded frame 130 and also to generate the downscaled frame 132 via operation of the inline downscaler 126. The video decoder 124 stores the downscaled frame 132 into the cache 150, and the downscaled frame 132 is retrieved from the cache 150 and provided to the display unit 140 for output at the display device 104.
The full-resolution decoded frames generated by the video decoder 124, such as the decoded frame 130, may be stored into the memory 110. In some implementations, however, only the full-resolution decoded frames that are reference frames are stored to the memory 110, and the rest of the full-resolution decoded frames that are not reference frames are discarded (e.g., erased or overwritten) at the video decoder 124.
The downscaled frames generated by the inline downscaler 126 in the video decoder 124 are stored into the cache 150, and later retrieved from the cache 150 by the display unit 140 for generation of the video data output 142. For example, the downscaled frame 132 may be generated at the video decoder 124 and transferred to the display unit 140 via the cache 150 without being stored at the memory 110 or processed at the GPU 160. However, in some cases, such as based on the size of the cache 150 and a management policy used by the device 102, the downscaled frame 132 may be evicted from the cache 150 and stored into the memory 110. In such cases, in response to a request for the downscaled frame 132 resulting in a cache miss, the downscaled frame 132 may be retrieved from the memory 110 and provided to the display unit 140. The display unit 140 generates the video data output 142 based on the downscaled frame 132 and provides the video data output 142 to the display device 104 for playout, such as to a user of the device 102.
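For purposes of illustration, the following sketch summarizes this dataflow, including the eviction fallback described above; the function names and the dictionary-based models of the cache 150 and the memory 110 are hypothetical:

```python
# Sketch of the inline-downscaler dataflow described above: downscaled frames
# travel decoder -> cache -> display unit, with DRAM used only as an eviction
# fallback for downscaled pixels and for full-resolution reference frames.

CACHE, DRAM = {}, {}

def decoder_emit(frame_id, full_frame, downscaled_frame, is_reference):
    if is_reference:
        DRAM[f"full_{frame_id}"] = full_frame      # kept for future prediction
    CACHE[f"down_{frame_id}"] = downscaled_frame   # toward the display unit

def evict(frame_id):
    """Model cache eviction: move a downscaled frame from cache to DRAM."""
    DRAM[f"down_{frame_id}"] = CACHE.pop(f"down_{frame_id}")

def display_fetch(frame_id):
    key = f"down_{frame_id}"
    if key in CACHE:        # common case: served from the on-chip cache
        return CACHE[key]
    return DRAM[key]        # cache miss: the frame was evicted to DRAM
```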
Generation of the downscaled frame 132 at the inline downscaler 126 in the video decoder 124 enables reduced usage of the storage capacity of the memory 110, reduced memory bandwidth associated with data transfer into and out of the memory 110, and reduced power consumption associated with the memory 110 and the GPU 160 as compared to a conventional technique in which the downscaled frame 132 is generated by the GPU 160 accessing the decoded frame 130 from the memory 110 and processing the decoded frame 130 to generate the downscaled frame 132, which is then stored back to the memory 110. Use of the inline downscaler 126 thus provides improved power efficiency (e.g., reduced power consumption by reducing/bypassing operations at the memory 110 and the GPU 160), improved memory bandwidth efficiency (e.g., reduced data transfer into and out of the memory 110), and improved end-to-end efficiency from the video decoder 124 to the display device 104.
In addition, storage of the downscaled frame 132 at the cache 150 enables the downscaled frame 132 to be conveyed from the video decoder 124 to the display unit 140 without being stored into, and later retrieved from, the memory 110. As a result, usage of the storage capacity of the memory 110, memory bandwidth associated with data transfer into and out of the memory 110, and power consumption associated with the memory 110 are reduced as compared to storing the downscaled frame 132 into the memory 110 and later reading the downscaled frame 132 from the memory 110.
Table 1 provides illustrative, non-limiting examples of memory bandwidth savings and power savings that may be obtained via use of the disclosed techniques as compared to conventional decoding techniques in which the decoded frame 130 is stored to the memory 110 and processed by the GPU 160 in a downscaling pass to generate a downscaled frame, which is stored into the memory 110 and then read from the memory 110 to the display unit 140.
As shown in Table 1, for an 8192×4320 video resolution that is downscaled by the inline downscaler 126 to a 3120×1440 display resolution, the present techniques can result in a 1165 megabytes-per-second (MBps) reduction in memory bandwidth and a 124 milliwatt (mW) reduction in power consumption as compared to conventional techniques that use the GPU 160 in a downscaling pass.
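As a back-of-envelope check on the order of magnitude (not a reproduction of Table 1, whose figures depend on the actual pixel format, frame rate, and any framebuffer compression in use), the avoidable traffic can be estimated as follows, assuming uncompressed NV12 frames at 30 fps:

```python
# Illustrative estimate of the DRAM traffic added by the GPU downscale pass:
# per frame, the GPU reads the full-resolution frame and writes the downscaled
# frame, and the display unit then reads the downscaled frame back from DRAM.
# NV12 (12 bits/pixel) and 30 fps are assumptions; Table 1's exact figures
# additionally reflect implementation factors such as framebuffer compression.

def frame_bytes(w, h, bits_per_pixel=12):
    return w * h * bits_per_pixel // 8

fps = 30
full = frame_bytes(8192, 4320)   # decoded resolution
down = frame_bytes(3120, 1440)   # display resolution

extra_mbps = (full + 2 * down) * fps / 1e6
print(f"~{extra_mbps:.0f} MBps of avoidable DRAM traffic under these assumptions")
```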
Although specific examples of resolutions are described herein, such as 8192×4320 pixels as an example of the first resolution of the decoded frames and 3120×1440 pixels as an example of the largest resolution supported by the display device 104, these resolution examples are provided for purposes of illustration only and should not be construed as limiting. In general, the techniques described herein can be applied to decoded frames of any resolution to generate lower-resolution frames. In addition, the largest resolution supported by the display device 104 may also be any resolution. Although the present techniques can be used when the largest resolution supported by the display device 104 is less than the resolution of the decoded frames, the present techniques can also be used to generate lower-resolution video for playout even when the display device 104 is capable of playback using the original resolution of the decoded frames. For example, lower-resolution video playback may be selected (e.g., selected by user input via a user interface, or selected by a power/performance management policy of the device 102) to reduce power consumption as compared to video playback at the resolution of the decoded frames.
Although
For example, in some implementations, the video decoder 124 is configured to select between storing the downscaled frame 132 into the memory 110 or into the cache 150. To illustrate, the video decoder 124 may make the selection based on cache size and/or availability for storage of decoded pixel data, based on a user setting, based on a configuration setting that indicates a product tier, or any combination thereof. For example, a “value” tier may indicate that the device 102 has a relatively small cache 150, in which case the video decoder 124 may select to store downscaled video frames to the memory 110, while a “premium” tier may indicate that the device 102 has a relatively large cache 150, in which case the video decoder 124 may select to store downscaled video frames into the cache 150. The display unit 140 may also be configured to selectively receive the downscaled frame 132 from the memory 110 or from the cache 150 based on the storage location of the downscaled frame 132. To illustrate, the display unit 140 may select to retrieve the downscaled frame 132 from the memory 110 or from the cache 150 based on cache size and/or availability, based on a user setting, based on a configuration setting (e.g., indicating a tier of the device 102), or any combination thereof.
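A minimal sketch of such a selection policy follows; the tier strings and the capacity check are illustrative assumptions standing in for the user and configuration settings described above:

```python
# Sketch of selecting between DRAM and cache for the downscaled frame 132.
# The tier strings and threshold logic are illustrative assumptions.

def select_downscaled_store(product_tier, cache_free_bytes, frame_bytes):
    if product_tier == "value":          # relatively small cache: spare it
        return "memory"
    if cache_free_bytes >= frame_bytes:  # premium tier with room available
        return "cache"
    return "memory"                      # fall back when the cache is full
```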
In some implementations, the video decoder 124 is configured to store a first portion of the downscaled frame 132 into the memory 110 and to store a second portion of the downscaled frame 132 into the cache 150, such as described in further detail with reference to
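For example, a row-based split (the split point being a hypothetical choice for illustration) could place one portion of the downscaled frame in each location:

```python
# Sketch of storing one downscaled frame partly in DRAM and partly in the
# cache, as described above. A row-based split is assumed for illustration.

def store_split(frame_id, downscaled, split_row, dram, cache):
    dram[f"{frame_id}_part_a"] = downscaled[:split_row]    # first portion
    cache[f"{frame_id}_part_b"] = downscaled[split_row:]   # second portion
```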
According to some aspects, the one or more processors 116 are integrated in at least one of a mobile phone or a tablet computer device, such as illustrated in
Although in some implementations the input frames 162-166 of the encoded video data 122 may all have the same resolution, in other implementations the present techniques can also be performed in an adaptive resolution decoding environment. Although the display device 104 is illustrated as included in (e.g., integrated with) the device 102, in other implementations the display device 104 may be coupled to, but not included in, the device 102. Although the modem 118 is illustrated as included in the device 102, in other examples the modem 118 may be omitted.
The various units shown in
In general, the video decoder 124 reconstructs a picture on a block-by-block basis. The video decoder 124 may perform a reconstruction operation on each block individually (where the block currently being reconstructed, i.e., decoded, may be referred to as a “current block”).
The bitstream parsing unit 210 receives the encoded video data 122 via a bitstream 222 and may entropy decode the encoded video data 122 to reproduce syntax elements. The pixel prediction processing unit 212, the inverse transform processing unit 214, and the pixel reconstruction and inloop filtering unit 216 may generate decoded video data based on the syntax elements extracted from the bitstream 222. In some implementations, the bitstream parsing unit 210 may decode information indicating which frames in the bitstream 222 are reference frames, which the video decoder 124 may use to determine which full-resolution decoded frames to store into the memory 110.
The bitstream parsing unit 210 may entropy decode syntax elements defining quantized transform coefficients of a quantized transform coefficient block, as well as transform information, such as a quantization parameter (QP) and/or transform mode indication(s). The QP associated with the quantized transform coefficient block may be used to determine a degree of quantization and a degree of inverse quantization to apply. In an example, a bitwise left-shift operation may be performed to inverse quantize the quantized transform coefficients, and a transform coefficient block including transform coefficients may be formed.
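A simplified sketch of shift-based inverse quantization follows; the QP-to-shift mapping is a hypothetical simplification, since standardized codecs combine a per-QP level-scale table with the shift:

```python
import numpy as np

# Sketch of the bitwise left-shift inverse quantization mentioned above:
# each quantized coefficient is scaled back up by a power of two derived
# from the QP. Real HEVC/VVC dequantization also applies a level-scale
# table indexed by QP % 6; that detail is omitted here.

def inverse_quantize(quantized_block, qp):
    shift = qp // 6                        # assumed QP-to-shift mapping
    return quantized_block.astype(np.int32) << shift

coeffs = np.array([[4, -2], [1, 0]], dtype=np.int32)
print(inverse_quantize(coeffs, qp=18))     # scales by 2**3 = 8
```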
The pixel prediction processing unit 212 may include one or more units to perform prediction in accordance with one or more prediction modes. As examples, the pixel prediction processing unit 212 may include a motion compensation unit, an inter-prediction unit, an intra-prediction unit, a palette unit, an affine unit, a linear model (LM) unit, one or more other units configured to perform prediction, or a combination thereof.
In addition, the pixel prediction processing unit 212 generates a prediction block according to prediction information syntax elements that were entropy decoded by the bitstream parsing unit 210. For example, if the prediction information syntax elements indicate that the current block is inter-predicted, a motion compensation unit (not shown) may generate the prediction block. In this case, the prediction information syntax elements may indicate a reference picture in the reference picture buffer 218 from which to retrieve a reference block, as well as a motion vector identifying a location of the reference block in the reference picture relative to the location of the current block in the current picture.
As another example, if the prediction information syntax elements indicate that the current block is intra-predicted, an intra-prediction unit (not shown) may generate the prediction block according to an intra-prediction mode indicated by the prediction information syntax elements. The pixel prediction processing unit 212 may retrieve data of neighboring samples to the current block from the reference picture buffer 218.
The pixel prediction processing unit 212 may also determine to decode blocks of video data using an intra block copy (IBC) mode. In general, in IBC mode, the video decoder 124 may determine predictive blocks for a current block, where the predictive blocks are in the same frame as the current block. The predictive blocks may be identified by a block vector (e.g., a motion vector) and limited to the locations of blocks that have already been decoded.
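Both inter prediction and IBC can be viewed as copying a predictive block located by a vector; a minimal sketch (integer-sample vectors only, with interpolation and bounds checks omitted for brevity) is shown below:

```python
import numpy as np

# Sketch covering both prediction modes described above: for inter prediction
# the source is a previously decoded reference picture; for IBC it is the
# already-decoded region of the current frame itself.

def predict_block(source_picture, cur_y, cur_x, vector, bh, bw):
    dy, dx = vector                      # motion vector or IBC block vector
    ry, rx = cur_y + dy, cur_x + dx      # location of the predictive block
    return source_picture[ry:ry + bh, rx:rx + bw].copy()

ref = np.arange(64, dtype=np.uint8).reshape(8, 8)
print(predict_block(ref, cur_y=4, cur_x=4, vector=(-2, -2), bh=2, bw=2))
```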
The inverse transform processing unit 214 may apply one or more inverse transforms to the transform coefficient block from the bitstream parsing unit 210 to generate a residual block associated with the current block. For example, the inverse transform processing unit 214 may apply an inverse discrete cosine transform (DCT), an inverse integer transform, an inverse Karhunen-Loeve transform (KLT), an inverse rotational transform, an inverse directional transform, or another inverse transform to the coefficient block.
The pixel reconstruction and inloop filtering unit 216 may reconstruct the current block using the prediction block and the residual block. For example, the pixel reconstruction and inloop filtering unit 216 may add samples of the residual block to corresponding samples of the prediction block to reconstruct the current block.
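Together, the inverse transform and reconstruction steps can be sketched as follows; scipy's inverse DCT stands in for whichever inverse transform the bitstream signals, and the 4×4 block size and 8-bit sample range are assumptions:

```python
import numpy as np
from scipy.fft import idctn

# Sketch of inverse transform plus reconstruction: dequantized coefficients
# become a residual block via an inverse DCT, and the residual is added to
# the prediction block, then clipped to the 8-bit sample range.

def reconstruct_block(coeff_block, prediction_block):
    residual = idctn(coeff_block, norm="ortho")           # inverse transform
    samples = prediction_block.astype(np.int32) + np.rint(residual).astype(np.int32)
    return np.clip(samples, 0, 255).astype(np.uint8)      # reconstruction

pred = np.full((4, 4), 128, dtype=np.uint8)
coeffs = np.zeros((4, 4)); coeffs[0, 0] = 40.0            # DC-only residual
print(reconstruct_block(coeffs, pred))                    # each sample -> 138
```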
The pixel reconstruction and inloop filtering unit 216 may perform one or more filter operations on reconstructed blocks. For example, the pixel reconstruction and inloop filtering unit 216 may access reconstructed blocks and perform deblocking operations to reduce blockiness artifacts along edges of the reconstructed blocks. Operations of the pixel reconstruction and inloop filtering unit 216 are not necessarily performed in all examples.
The video decoder 124 may store the reconstructed blocks in the reference picture buffer 218, which may be implemented as, coupled to, or include the cache 150, the off-chip memory 110, or both, for larger storage capacity. The reference picture buffer 218 generally stores decoded pictures, illustrated as a reference frame 230 (e.g., the decoded frame 130), which the video decoder 124 may output, use as reference video data when decoding subsequent data or pictures of the encoded video bitstream, or both. As discussed above, the reference picture buffer 218 may provide reference information, such as samples of a current picture for intra-prediction and previously decoded pictures for subsequent motion compensation, to the pixel prediction processing unit 212. Such reference information may be provided from the memory 110 to the pixel prediction processing unit 212. In implementations in which reference information is stored at the cache 150, the reference information may be provided from the cache 150 to the pixel prediction processing unit 212 via a path indicated by the dotted line.
The inline downscaler 126 is configured to process pixels stored in the reference picture buffer 218 to generate downscaled versions that are saved as downscaled frames (e.g., the downscaled frame 132) in the display picture buffer 220. In a particular implementation, the inline downscaler 126 includes a combination of a four-tap filter-based downscaler and a bilinear 2:1 downscaler for larger downscaling ratios. Alternatively, or in addition, in other implementations one or more other types of downscalers may be used in the inline downscaler 126 and may be selected based on one or more power/area and quality criteria.
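An illustrative (not hardware-accurate) sketch of the two stages, in one dimension for clarity, is shown below; the tap weights and phase handling are assumptions rather than the actual hardware coefficients:

```python
import numpy as np

# Sketch of the two downscaler stages described above: a bilinear 2:1 stage
# for large ratios, followed by a four-tap filter stage for the remaining
# non-power-of-two ratio. The (-1, 9, 9, -1)/16 kernel is an assumed
# interpolation filter, not the actual inline downscaler coefficients.

def bilinear_2to1(row):
    row = row.astype(np.float32)
    return (row[0::2] + row[1::2]) / 2.0          # average adjacent pairs

def four_tap_resample(row, out_len, taps=(-1.0, 9.0, 9.0, -1.0)):
    taps = np.asarray(taps, dtype=np.float32) / np.sum(taps)
    padded = np.pad(row.astype(np.float32), 2, mode="edge")
    out = np.empty(out_len, dtype=np.float32)
    for i, p in enumerate(np.linspace(0, len(row) - 1, out_len)):
        c = int(p) + 2                             # center index in padded row
        out[i] = padded[c - 1:c + 3] @ taps        # weigh 4 neighboring samples
    return out

row = np.arange(8192, dtype=np.float32)
half = bilinear_2to1(row)                          # 8192 -> 4096 samples
final = four_tap_resample(half, 3120)              # 4096 -> 3120 samples
```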
The video decoder 124 may output decoded pictures from the display picture buffer 220 for subsequent presentation on a display device. For example, the downscaled frame 132 is output from the display picture buffer 220 to the cache 150, and is retrieved from the cache 150 by the display unit 140 to generate the video data output 142 provided to the display device 104 of
Optionally, the downscaled frame 132 may be removed from the cache 150 and stored in the memory 110, such as based on the usage and/or configuration of the cache 150, and the downscaled frame 132 may be retrieved from the memory 110 responsive to a request from the display unit 140. Such transfers of downscaled pixels (e.g., the downscaled frame 132) between the cache 150 and the memory 110 are illustrated as dotted lines.
Optionally, the video decoder 124 may select whether to store the downscaled frame 132 into the cache 150 or into the memory 110. For example, as described above with reference to
Optionally, the video decoder 124 may select whether to store the decoded frame 130 into the memory 110 based on whether the decoded frame 130 is a reference frame. For example, as described above, the bitstream parsing unit 210 may extract information from the bitstream 222 indicating whether the decoded frame 130 is a reference frame. After the decoded frame 130 has been processed by the inline downscaler 126 to generate the downscaled frame 132, if the decoded frame 130 is not a reference frame, the video decoder 124 may overwrite or erase the decoded frame 130 from the reference picture buffer 218 without outputting the decoded frame 130 to the cache 150 or to the memory 110.
In some implementations, the video decoder 124 is configured to always bypass the cache 150, such as in accordance with a configuration parameter indicating that the video decoder 124 is implemented at a value-tier chip that has a smaller cache 150 as compared to a premium-tier chip. In other implementations, a determination of whether to bypass the cache 150 can be made occasionally or periodically, such as on a frame-by-frame basis, and may be based on usage of the cache 150 by other processes, an amount of available storage capacity in the cache 150, a relative priority of the video playback operation as compared to other ongoing processes that may use the cache 150, one or more other factors, or a combination thereof.
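A sketch of such a per-frame decision is shown below; all inputs and the priority comparison are hypothetical illustrations of the factors just listed:

```python
# Sketch of a frame-by-frame cache-bypass decision as described above. A
# value-tier configuration could simply force bypass unconditionally.

def bypass_cache(force_bypass, cache_free_bytes, frame_bytes,
                 playback_priority, highest_competing_priority):
    if force_bypass:                      # e.g., value-tier configuration
        return True
    if cache_free_bytes < frame_bytes:    # insufficient available capacity
        return True
    return highest_competing_priority > playback_priority
```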
Optionally, the video decoder 124 may select whether to store the decoded frame 130 into the memory 110 or to discard the decoded frame 130 based on whether the decoded frame 130 is a reference frame, such as described above with reference to
The video decoder 124 operates to perform video decoding using the inline downscaler 126 during playback of video data that is decoded by the video decoder 124 and played back via a display device 1204. In some implementations, the vehicle 1202 is manned (e.g., carries a pilot, one or more passengers, or both), the display device 1204 is internal to a cabin of the vehicle 1202, and the video decoding using the inline downscaler 126 is performed during playback to a pilot or a passenger of the vehicle 1202. In another implementation, the vehicle 1202 is unmanned, the display device 1204 is mounted to an external surface of the vehicle 1202, and the video decoding using the inline downscaler 126 is performed during video playback to one or more viewers external to the vehicle 1202. For example, the vehicle 1202 may move (e.g., circle an outdoor audience during a concert) while playing out video such as advertisements or streaming video of the concert stage, and the one or more processors 116 (e.g., including the video decoder 124) may perform video decoding using the inline downscaler 126 to generate the video from an encoded video stream.
The method 1400 includes, at block 1402, obtaining, at a video decoder, an input frame of video data. For example, the first input frame 162 of
The method 1400 includes, at block 1404, decoding, at the video decoder, the input frame to generate a first video frame. For example, the video decoder 124 decodes the first input frame 162 to generate the decoded frame 130.
The method 1400 includes, at block 1406, generating, at an inline downscaler of the video decoder, a second video frame corresponding to the first video frame downscaled for display output. For example, the downscaled frame 132 is generated at the inline downscaler 126 of the video decoder 124.
In some implementations, the method 1400 includes outputting the first video frame and the second video frame in parallel. For example, the video decoder 124 may output the downscaled frame 132 in parallel with outputting the decoded frame 130.
In some implementations, the method 1400 includes storing the first video frame into a memory and storing the second video frame into a cache. For example, as illustrated in
In some implementations, the method 1400 includes selecting between storing the second video frame into a memory or into a cache. For example, as explained with reference to
In some implementations, the method 1400 includes storing a first portion of the second video frame into a memory and storing a second portion of the second video frame into a cache. For example, as explained with reference to
In some implementations, the method 1400 includes receiving the second video frame at a display unit and generating an output to a display device, such as the display unit 140 receiving the downscaled frame 132 and generating the video data output 142 to the display device 104. For example, the second video frame may be received from a cache, such as the cache 150. As another example, the second video frame may be received from a memory, such as the memory 110. According to some aspects, a first portion of the second video frame is received from a memory and a second portion of the second video frame is received from a cache. For example, the display unit 140 may receive the first portion 428A of the downscaled frame 132 from the memory 110 and may receive the second portion 428B of the downscaled frame 132 from the cache 150.
The method 1400 of
Referring to
In a particular implementation, the device 1500 includes a processor 1506 (e.g., a CPU). The device 1500 may include one or more additional processors 1510 (e.g., one or more DSPs). In a particular implementation, the one or more processors 116 of
The device 1500 may include a memory 1586 and a CODEC 1534. The memory 1586 may include instructions 1556 that are executable by the one or more additional processors 1510 (or the processor 1506) to implement the functionality described with reference to the video decoder 124. In a particular example, the memory 1586 corresponds to the memory 110 and the instructions 1556 correspond to the instructions 112 of
The device 1500 may include a display 1528, such as the display device 104, coupled to a display controller 1526. One or more speakers 1592, one or more microphones 1590, or a combination thereof, may be coupled to the CODEC 1534. The CODEC 1534 may include a digital-to-analog converter (DAC) 1502 and an analog-to-digital converter (ADC) 1504. In a particular implementation, the CODEC 1534 may receive analog signals from the microphones 1590, convert the analog signals to digital signals using the analog-to-digital converter 1504, and send the digital signals to the speech and music codec 1508. In a particular implementation, the speech and music codec 1508 may provide digital signals to the CODEC 1534. The CODEC 1534 may convert the digital signals to analog signals using the digital-to-analog converter 1502 and may provide the analog signals to the speakers 1592.
In a particular implementation, the device 1500 may be included in a system-in-package or system-on-chip device 1522. In a particular implementation, the memory 1586, the processor 1506, the processors 1510, the display controller 1526, the CODEC 1534, and the modem 118 are included in a system-in-package or system-on-chip device 1522. In a particular implementation, an input device 1530 (e.g., a keyboard, a touchscreen, or a pointing device) and a power supply 1544 are coupled to the system-in-package or system-on-chip device 1522. Moreover, in a particular implementation, as illustrated in
The device 1500 may include a smart speaker, a speaker bar, a mobile communication device, a smart phone, a cellular phone, a laptop computer, a computer, a tablet, a personal digital assistant, a display device, a television, a gaming console, a music player, a radio, a digital video player, a digital video disc (DVD) player, a tuner, a camera, a navigation device, a vehicle, a headset, an augmented reality headset, a mixed reality headset, a virtual reality headset, an aerial vehicle, a home automation system, a voice-activated device, a wireless speaker and voice activated device, a portable electronic device, a car, a computing device, a communication device, an internet-of-things (IoT) device, a virtual reality (VR) device, a base station, a mobile device, or any combination thereof.
In conjunction with the described techniques, an apparatus includes means for obtaining an input frame of video data. In an example, the means for obtaining an input frame of video data includes video decoder 124, the one or more processors 116, the device 102, the system 100, the bitstream parsing unit 210, one or more other circuits or devices to obtain an input frame of video data, or a combination thereof.
The apparatus includes means for decoding the input frame to generate a first video frame, the means for decoding including means for inline downscaling to generate a second video frame corresponding to the first video frame downscaled for display output. In an example, the means for decoding the input frame includes the video decoder 124, the one or more processors 116, the device 102, the system 100, the bitstream parsing unit 210, the pixel prediction processing unit 212, the inverse transform processing unit 214, the pixel reconstruction and inloop filtering unit 216, the reference picture buffer 218, one or more other circuits or devices to decode the video frame to generate a first video frame, or a combination thereof. In an example, the means for inline downscaling includes the inline downscaler 126, one or more other circuits or devices to generate a second video frame corresponding to the first video frame downscaled for display output, or a combination thereof.
In some implementations, a non-transitory computer-readable medium (e.g., a computer-readable storage device, such as the memory 110) includes instructions (e.g., the instructions 112) that, when executed by one or more processors that include a video decoder (e.g., the one or more processors 116 that include the video decoder 124), cause the one or more processors to perform operations corresponding to at least a portion of any of the techniques described with reference to
Particular aspects of the disclosure are described below in the following sets of interrelated Examples:
According to Example 1, a device includes a memory configured to store video data; and a video decoder coupled to the memory and to a cache, the video decoder configured to decode an input frame of the video data to generate a first video frame and including an inline downscaler configured to generate a second video frame corresponding to the first video frame downscaled for display output.
Example 2 includes the device of Example 1, wherein the video decoder is configured to output the first video frame and the second video frame in parallel.
Example 3 includes the device of Example 1 or Example 2, wherein the video decoder is configured to select between storing the second video frame into the memory or into the cache.
Example 4 includes the device of any of Examples 1 to 3, wherein the first video frame is stored into the memory, and wherein the second video frame is stored into the cache.
Example 5 includes the device of any of Examples 1 to 4, wherein the first video frame is stored into the memory based on a determination that the first video frame is a reference frame, and wherein storage of full-resolution non-reference frames to the memory is skipped.
Example 6 includes the device of Example 1 or Example 2, wherein the video decoder is configured to store a first portion of the second video frame into the memory and to store a second portion of the second video frame into the cache.
Example 7 includes the device of any of Examples 1 to 6, and further includes a display unit configured to receive the second video frame and to generate an output to a display device.
Example 8 includes the device of Example 7, wherein the display unit is configured to receive the second video frame from the cache.
Example 9 includes the device of Example 7 or Example 8, wherein the display unit is configured to receive the second video frame from the memory.
Example 10 includes the device of any of Examples 7 to 9, wherein the display unit is configured to selectively receive the second video frame from the memory or from the cache.
Example 11 includes the device of Example 7, wherein the display unit is configured to receive a first portion of the second video frame from the memory and a second portion of the second video frame from the cache.
Example 12 includes the device of any of Examples 1 to 10, and further includes a display device configured to display an output based on the second video frame.
Example 13 includes the device of any of Examples 1 to 12, and further includes one or more processors that include the video decoder.
Example 14 includes the device of Example 13, and further includes a modem coupled to the one or more processors and configured to receive the video data.
Example 15 includes the device of Example 13 or Example 14, wherein the one or more processors are integrated in an extended reality headset device that is configured to display an output based on the second video frame.
Example 16 includes the device of Example 13 or Example 14, wherein the one or more processors are integrated in at least one of a mobile phone, a tablet computer device, or a wearable electronic device.
Example 17 includes the device of Example 13 or Example 14, wherein the one or more processors are integrated in a mobile phone.
Example 18 includes the device of Example 13 or Example 14, wherein the one or more processors are integrated in a tablet computer device.
Example 19 includes the device of Example 13 or Example 14, wherein the one or more processors are integrated in a wearable electronic device.
Example 20 includes the device of Example 13 or Example 14, wherein the one or more processors are integrated in a vehicle, the vehicle further including a display device configured to display an output based on the second video frame.
Example 21 includes the device of any of Examples 13 to 20, wherein the one or more processors are included in an integrated circuit.
According to Example 22, a method of processing video data includes obtaining, at a video decoder, an input frame of video data; decoding, at the video decoder, the input frame to generate a first video frame; and generating, at an inline downscaler of the video decoder, a second video frame corresponding to the first video frame downscaled for display output.
Example 23 includes the method of Example 22, and further includes outputting the first video frame and the second video frame in parallel.
Example 24 includes the method of Example 22 or Example 23, and further includes selecting between storing the second video frame into a memory or into a cache.
Example 25 includes the method of any of Examples 22 to 24, further including storing the first video frame into a memory and storing the second video frame into a cache.
Example 26 includes the method of any of Examples 22 to 25, wherein the first video frame is stored into a memory based on a determination that the first video frame is a reference frame, and wherein storage of full-resolution non-reference frames to the memory is skipped.
Example 27 includes the method of Example 22 or Example 23, and further includes storing a first portion of the second video frame into a memory and storing a second portion of the second video frame into a cache.
Example 28 includes the method of any of Examples 22 to 27, and further includes receiving the second video frame at a display unit and generating an output to a display device.
Example 29 includes the method of Example 28, wherein the second video frame is received from a cache.
Example 30 includes the method of Example 28, wherein the second video frame is received from a memory.
Example 31 includes the method of Example 28, wherein a first portion of the second video frame is received from a memory and a second portion of the second video frame is received from a cache.
According to Example 32, a non-transitory computer-readable medium includes instructions that, when executed by one or more processors that include a video decoder, cause the one or more processors to: decode, at the video decoder, an input frame of video data to generate a first video frame; and generate, at an inline downscaler of the video decoder, a second video frame corresponding to the first video frame downscaled for display output.
Example 33 includes the non-transitory computer-readable medium of Example 32, wherein the instructions, when executed by the one or more processors, further cause the one or more processors to output the first video frame and the second video frame in parallel.
Example 34 includes the non-transitory computer-readable medium of Example 32 or Example 33, wherein the instructions, when executed by the one or more processors, further cause the one or more processors to select between storing the second video frame into a memory or into a cache.
Example 35 includes the non-transitory computer-readable medium of any of Examples 32 to 34, wherein the instructions, when executed by the one or more processors, further cause the one or more processors to store the first video frame into a memory and store the second video frame into a cache.
Example 36 includes the non-transitory computer-readable medium of any of Examples 32 to 35, wherein the first video frame is stored into a memory based on a determination that the first video frame is a reference frame, and wherein storage of full-resolution non-reference frames to the memory is skipped.
Example 37 includes the non-transitory computer-readable medium of Example 32 or Example 33, wherein the instructions, when executed by the one or more processors, further cause the one or more processors to store a first portion of the second video frame into a memory and store a second portion of the second video frame into a cache.
Example 38 includes the non-transitory computer-readable medium of any of Examples 32 to 37, wherein the instructions, when executed by the one or more processors, further cause the one or more processors to receive the second video frame at a display unit and generate an output to a display device.
Example 39 includes the non-transitory computer-readable medium of Example 38, wherein the second video frame is received from a cache.
Example 40 includes the non-transitory computer-readable medium of Example 38, wherein the second video frame is received from a memory.
Example 41 includes the non-transitory computer-readable medium of Example 38, wherein a first portion of the second video frame is received from a memory and a second portion of the second video frame is received from a cache.
According to Example 42, an apparatus includes means for obtaining an input frame of video data; and means for decoding the input frame to generate a first video frame, the means for decoding including means for inline downscaling to generate a second video frame corresponding to the first video frame downscaled for display output.
Example 43 includes the apparatus of Example 42, and further includes means for outputting the first video frame and the second video frame in parallel.
Example 44 includes the apparatus of Example 42 or Example 43, and further includes means for selecting between storing the second video frame into a memory or into a cache.
Example 45 includes the apparatus of any of Examples 42 to 44, and further includes means for storing the first video frame into a memory and storing the second video frame into a cache.
Example 46 includes the apparatus of any of Examples 42 to 45, wherein the first video frame is stored into a memory based on a determination that the first video frame is a reference frame, and wherein storage of full-resolution non-reference frames to the memory is skipped.
Example 47 includes the apparatus of Example 42 or Example 43, and further includes means for storing a first portion of the second video frame into a memory and storing a second portion of the second video frame into a cache.
Example 48 includes the apparatus of any of Examples 42 to 47, and further includes means for receiving the second video frame at a display unit and generating an output to a display device.
Example 49 includes the apparatus of Example 48, wherein the second video frame is received from a cache.
Example 50 includes the apparatus of Example 48, wherein the second video frame is received from a memory.
Example 51 includes the apparatus of Example 48, wherein a first portion of the second video frame is received from a memory and a second portion of the second video frame is received from a cache.
Those of skill would further appreciate that the various illustrative logical blocks, configurations, circuits, and algorithm steps described in connection with the implementations disclosed herein may be implemented as electronic hardware, computer software executed by a processing device such as a hardware processor, or combinations of both. Various illustrative components, blocks, configurations, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or executable software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
The steps of a method or algorithm described in connection with the implementations disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in a memory device, such as random access memory (RAM), magnetoresistive random access memory (MRAM), spin-torque transfer MRAM (STT-MRAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disk, a removable disk, a compact disc read-only memory (CD-ROM), or any other form of non-transient storage medium known in the art. An exemplary memory device is coupled to the processor such that the processor can read information from, and write information to, the memory device. In the alternative, the memory device may be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC). The ASIC may reside in a computing device or a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a computing device or a user terminal.
The previous description of the disclosed implementations is provided to enable a person skilled in the art to make or use the disclosed implementations. Various modifications to these implementations will be readily apparent to those skilled in the art, and the principles defined herein may be applied to other implementations without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the implementations shown herein but is to be accorded the widest scope possible consistent with the principles and novel features as defined by the following claims.
Related Application Data: Parent application No. 17219637, filed March 2021 (US); child application No. 18460149 (US).