The present invention relates generally to the field of video coding and, more specifically, to video coding based on motion compensated temporal filtering.
For storing and broadcasting purposes, digital video is compressed so that the resulting, compressed video can be stored in a smaller space than the original, uncompressed video content.
Digital video sequences, like ordinary motion pictures recorded on film, comprise a sequence of still images, the illusion of motion being created by displaying the images one after the other at a relatively fast frame rate, typically 15 to 30 frames per second. A common way of compressing digital video is to exploit the redundancy between these sequential images (i.e. temporal redundancy). At any given moment in a typical video, there is slow or no camera movement combined with some moving objects. Since consecutive images have similar content, it is advantageous to transmit only the difference between consecutive images. The difference frame, called the prediction error frame En, is the difference between the current frame In and the reference frame Pn. The prediction error frame is thus given by
En(x,y)=In(x,y)−Pn(x,y),
where n is the frame number and (x, y) represents pixel coordinates. The prediction error frame is also called the prediction residue frame. In a typical video codec, the difference frame is compressed before transmission. Compression is achieved by means of the Discrete Cosine Transform (DCT) and Huffman coding, or similar methods.
Since video to be compressed contains motion, subtracting two consecutive images does not always result in the smallest difference. For example, when the camera is panning, the whole scene is changing. To compensate for the motion, a displacement (Δx(x, y), Δy(x, y)), called a motion vector, is added to the coordinates of the previous frame. The prediction error thus becomes
En(x,y)=In(x,y)−Pn(x+Δx(x, y), y+Δy(x, y)).
In practice, the frame in the video codec is divided into blocks, and only one motion vector is transmitted for each block, so that the same motion vector is used for all the pixels within one block. The process of finding the best motion vector for each block in a frame is called motion estimation. Once the motion vectors are available, the process of calculating Pn(x+Δx(x, y), y+Δy(x, y)) is called motion compensation, and Pn(x+Δx(x, y), y+Δy(x, y)) is called the motion compensated prediction.
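For illustration, the block-based motion estimation and motion compensation described above can be sketched as follows. This is a minimal sketch assuming integer-pixel motion, an exhaustive SAD search, and frame dimensions that are multiples of the block size; the function names and parameters are illustrative and not taken from any standard.

```python
import numpy as np

def motion_estimate(cur, ref, block_size=16, search_range=8):
    """Find one motion vector per block by exhaustive SAD search
    (a minimal sketch; real codecs use faster search strategies)."""
    h, w = cur.shape
    mvs = {}
    for by in range(0, h, block_size):
        for bx in range(0, w, block_size):
            block = cur[by:by + block_size, bx:bx + block_size].astype(int)
            best_sad, best_mv = None, (0, 0)
            for dy in range(-search_range, search_range + 1):
                for dx in range(-search_range, search_range + 1):
                    y, x = by + dy, bx + dx
                    if y < 0 or x < 0 or y + block_size > h or x + block_size > w:
                        continue  # candidate block falls outside the frame
                    cand = ref[y:y + block_size, x:x + block_size].astype(int)
                    sad = np.abs(block - cand).sum()
                    if best_sad is None or sad < best_sad:
                        best_sad, best_mv = sad, (dx, dy)
            mvs[(bx, by)] = best_mv
    return mvs

def motion_compensate(ref, mvs, block_size=16):
    """Build the motion compensated prediction Pn(x+dx, y+dy)."""
    pred = np.zeros_like(ref)
    for (bx, by), (dx, dy) in mvs.items():
        pred[by:by + block_size, bx:bx + block_size] = \
            ref[by + dy:by + dy + block_size, bx + dx:bx + dx + block_size]
    return pred

# The prediction error frame is then En = In - motion_compensate(Pn, mvs).
```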
In the coding mechanism described above, the reference frame Pn can be one of the previously coded frames. In this case, Pn is known at both the encoder and decoder. Such coding architecture is referred to as closed-loop.
Pn can also be one of the original frames. In this case, the coding architecture is referred to as open-loop. Since the original frame is available only at the encoder but not at the decoder, the decoder still has to use one of the previously coded frames as the reference frame. This may result in drift in the prediction process. Drift refers to the mismatch (or difference) of the prediction Pn(x+Δx(x, y), y+Δy(x, y)) between the encoder and the decoder due to different frames being used as reference. Nevertheless, the open-loop structure is increasingly used in video coding, especially in scalable video coding, because the open-loop structure makes it possible to obtain a temporally scalable representation of video by using lifting steps to implement motion compensated temporal filtering (MCTF).
FIGS. 1a and 1b show the basic structure of MCTF using lifting steps.
The lifting process consists of two steps: a prediction step and an update step, denoted by P and U, respectively:
H=In+1−P(In)
L=In+U(H)
In fact, the prediction step P can be considered as motion compensation. The output of P, i.e. P(In), is the motion compensated prediction. Therefore, H is the prediction residue, also referred to as the high-pass frame, while L is referred to as the low-pass frame.
In the composition process, the operations are reversed:
I′n=L−U(H)
I′n+1=H+P(I′n)
If the signals L and H remain unchanged between the decomposition and composition processes, the original frames are perfectly reconstructed, i.e. I′n=In and I′n+1=In+1.
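The perfect-reconstruction property of the lifting structure can be illustrated with a short sketch. The operators P and U below are placeholders (zero-motion prediction and a halved update), chosen only to show that any pair of operators cancels out between decomposition and composition:

```python
import numpy as np

def decompose(i_n, i_np1, P, U):
    """One MCTF lifting stage: H = I(n+1) - P(I(n)), L = I(n) + U(H)."""
    H = i_np1 - P(i_n)
    L = i_n + U(H)
    return L, H

def compose(L, H, P, U):
    """Inverse lifting: I'(n) = L - U(H), I'(n+1) = H + P(I'(n))."""
    i_n = L - U(H)
    i_np1 = H + P(i_n)
    return i_n, i_np1

# Any P and U cancel out exactly, because composition subtracts the same
# U(H) and adds back the same P(I'(n)). Placeholder operators:
P = lambda f: f          # zero-motion "prediction" (illustrative only)
U = lambda h: h // 2     # halved update (illustrative only)
a = np.arange(16).reshape(4, 4)
b = a + 3
L, H = decompose(a, b, P, U)
ra, rb = compose(L, H, P, U)
assert (ra == a).all() and (rb == b).all()   # perfect reconstruction
```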
In MCTF, the prediction step is essentially a general motion compensation process, except that it is based on an open-loop structure. In this process, a compensated prediction for the current frame is produced based on best-estimated motion vectors for each macroblock. Because motion vectors usually have sub-pixel precision, sub-pixel interpolation is needed in motion compensation.
In both the AVC standard and the current SVC reference software (HHI JSVM software version 1.0, provided for the JVT meeting, January 2005, Hong Kong, China), motion vectors have a precision of ¼ pixel. In this case, interpolation is needed at half-pixel and quarter-pixel positions.
In the AVC standard, values at half-pixel positions are obtained by using a 6-tap filter with impulse response (1/32, −5/32, 20/32, 20/32, −5/32, 1/32). The filter operates on integer pixel values, along both the horizontal and the vertical direction as appropriate. For decoder simplification, the 6-tap filter is not used to interpolate quarter-pixel values. Instead, the quarter positions are obtained by averaging an integer-position value with an adjacent half-pixel value, or by averaging two adjacent half-pixel values, as follows:
b=(A+c)/2, d=(c+E)/2, f=(A+k)/2, g=(c+k)/2, h=(c+m)/2, i=(c+o)/2, j=(E+o)/2, l=(k+m)/2, n=(m+o)/2, p=(U+k)/2, q=(k+w)/2, r=(m+w)/2, s=(w+o)/2, t=(Y+o)/2, v=(w+U)/2, x=(Y+w)/2
For the convenience of description, such interpolation method will be hereafter referred to as AVC standard interpolation.
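As a one-dimensional illustration of AVC standard interpolation, consider the following sketch. It is a simplification: vertical filtering, boundary padding, and the clipping to the valid sample range defined by the standard are omitted, and the function names are illustrative.

```python
import numpy as np

def avc_half_pel_1d(row):
    """Half-pixel values from the 6-tap filter (1, -5, 20, 20, -5, 1)/32.
    Boundary padding and clipping to the valid sample range are omitted."""
    taps = np.array([1, -5, 20, 20, -5, 1])
    half = []
    for i in range(2, len(row) - 3):   # need 3 integer pixels on each side
        v = int(np.dot(taps, row[i - 2:i + 4]))
        half.append((v + 16) >> 5)     # division by 32 with rounding
    return half

def avc_quarter_pel(a, b):
    """Quarter-pixel value as the rounded average of two neighboring
    values (an integer pixel and a half pixel, or two half pixels)."""
    return (a + b + 1) >> 1
```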
An example of motion prediction is shown in
In the update step, the prediction residue of the predicted block Bn+1 is added to the reference block along the reverse direction of the motion vectors used in the prediction step. Accordingly, if block Bn+1 was predicted from block An with motion vector (Δx, Δy), the prediction residue of Bn+1 is added to block An according to the reverse motion vector (−Δx, −Δy).
In fact, the update process is performed only for integer pixels in frame In. If An is located at a sub-pixel position, its nearest integer-position block A′n is actually updated according to the motion vector (−Δx, −Δy). There is a partial pixel difference between the pixel locations of blocks An and A′n. In this case, because of the motion vector (−Δx, −Δy), the reference block for A′n in the update step, denoted as B′n+1, is not located at an integer pixel position either; there is the same partial pixel difference between block Bn+1 and block B′n+1. For that reason, interpolation is needed to obtain the prediction residue for block B′n+1. Generally, interpolation is needed in the update step whenever the motion vector (−Δx, −Δy) does not have an integer pixel displacement in both the horizontal and vertical directions. In the current SVC reference model, the AVC standard interpolation method is used for sub-pixel interpolation in both the prediction step and the update step.
Instead of dealing with a block A′n that can be located anywhere in the frame to be updated, in the current SVC reference software the update step is performed block by block with a block size of 4×4 in this frame. Such a rectangular block used as a coding unit is hereafter referred to as a coding block for ease of description. In the current SVC reference software, all the motion vectors used in the prediction step are scanned to derive the best motion vector for updating a coding block. Such a motion vector is called an update motion vector in the following description. By doing so, the regular block-based motion compensation process used in the prediction step can be applied directly to the update step, which simplifies the implementation of the update process.
In the prediction step, block Bn+1 is predicted from block An, as shown in
It should be noted that when bi-directional predicted frames (or B-frames) are used in video coding, it is common for a coding block in the updated frame to have two update motion vectors. The two vectors should come from different update directions. An example is shown in
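The update operation for a single coding block can be sketched as follows. For simplicity the sketch assumes that the update motion vectors have integer-pixel displacements (so no interpolation is needed), that the low-pass frame is stored as a float array, and that all indexing stays in bounds; the names and the update strength parameter are illustrative.

```python
import numpy as np

def update_block(low_frame, residue_frames, bx, by, update_mvs, w=0.5, bs=4):
    """Update one bs x bs coding block at (bx, by) of the low-pass frame.
    update_mvs: list of (residue_index, (dx, dy)) pairs, where (dx, dy) is an
    update motion vector, i.e. the reverse of a prediction motion vector."""
    if not update_mvs:
        return  # no update motion vector: the block is not updated
    residues = []
    for ridx, (dx, dy) in update_mvs:
        res = residue_frames[ridx]
        residues.append(res[by + dy:by + dy + bs, bx + dx:bx + dx + bs])
    e = np.mean(residues, axis=0)               # average over update directions
    low_frame[by:by + bs, bx:bx + bs] += w * e  # feed the residue back, scaled
```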
It is found that the update process in MCTF is helpful in improving coding performance in terms of the objective quality of the coded video. However, it may also bring unwanted coding artifacts, which can degrade the subjective quality of the coded video. In order to avoid such artifacts, adaptive trade-off mechanisms have been created and used. One method is to measure the energy level of the prediction residue block that is to be used for the update operation. If the energy is too high, the update operation is more likely to produce unwanted visual artifacts. In this case the update strength needs to be lowered. For that reason, another weight factor, w2, can be derived based on the energy of the prediction residue block used for the update operation and can be used to control the update strength. In cases where the energy is higher than a predetermined threshold, the update step is not performed.
Weight factors w1 and w2 can be used jointly to determine the final update strength for a coding block. Assume En+1 is the prediction residue block used for the update operation; then, instead of using En+1 directly, w1*w2*En+1 should be used for the update in order to avoid possible coding artifacts. It should be noted that a weight factor based on other criteria, e.g. the quantization parameter qp, which indicates how fine the quantization step is, may also be used to control the update strength. Generally, a weight factor is an indicator of how reliable or safe the current update operation is.
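A sketch of how the final update strength might be computed is given below. The exact mapping from block energy to w2 is implementation-specific; the linear mapping used here is an assumption for illustration, as are the optional qp-based factor and all names.

```python
import numpy as np

def update_strength(residue_block, w1, energy_threshold, qp_factor=1.0):
    """Combine weight factors into the final update strength.
    w1: weight factor from other criteria (e.g. motion vector reliability).
    The block is excluded from the update when its energy is too high."""
    energy = float(np.mean(residue_block.astype(float) ** 2))
    if energy >= energy_threshold:
        return 0.0                         # update step is not performed
    w2 = 1.0 - energy / energy_threshold   # assumed linear mapping
    return w1 * w2 * qp_factor             # scales En+1 as w1*w2*En+1
```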
Although the MCTF technique is found to be useful in improving coding performance, complexity has always been a major concern. In large part, the complexity is related to the update step, because the prediction step is needed even without using MCTF. Therefore, for this technique to be widely adopted and used, reducing the complexity of the update step is important.
In the current SVC reference model (JSVM software version 1.0, provided for the JVT meeting, January 2005, Hong Kong, China), the update step interpolation is the same as that for the prediction step, i.e. AVC standard interpolation. To derive update motion vectors, all motion vectors, including those for 4×4 blocks, are considered. As a result, an update motion vector has to be found for each 4×4 coding block. In estimating the energy of a block that is to be used for the update operation, the block is first interpolated using AVC standard interpolation if it is not located at integer pixel positions, and the energy of the block is calculated based on the interpolated pixels.
It is advantageous and desirable to simplify both the update step interpolation process and the update motion derivation process. It is also advantageous and desirable to simplify the energy estimation process, so that the weight factor calculation becomes less complex.
The present invention aims to provide a method and device to reduce the complexity in the update step without significantly affecting the coding performance. In particular, the present invention provides simple but efficient methods for performing the update step in motion compensated temporal filtering for video coding.
The first aspect of the present invention provides a method for use in motion compensated temporal filtering of video frames, wherein the filtering of video frames comprises an update operation in which prediction residue is interpolated and fed back to the low-pass frame, and wherein the interpolation of the prediction residue block is based at least on an interpolation filter. The filter is adaptively selected from a set of filters comprising at least a short filter and a long filter. A short filter refers to a filter with a relatively small number of filter taps, such as two, and a long filter refers to a filter having more filter taps than the short filter. For example, a long filter may have four or more filter taps.
Thus, the method comprises:
adaptively selecting an interpolation filter from a set of filters comprising at least a shorter filter and a longer filter; and
obtaining an update signal through interpolation of the prediction residue based on said interpolation filter.
Advantageously, the interpolation filter is selected on a block basis from the set of filters based at least on a weight factor calculated for a block in a video frame comprising multiple blocks, and the method further comprises:
estimating an energy level of a prediction residue block corresponding to the block, wherein the estimating can be based on prediction residues at the nearest integer pixel locations relative to the prediction residue block position in case the prediction residue block is located at a partial pixel location; and
calculating the weight factor based at least on the estimated energy level.
The selection of the interpolation filter can also be based on the number of update motion vectors available for a block in a video frame comprising multiple blocks, such that:
if the number is one, comparing the weight factor of the block to a first predetermined threshold, such that if the weight factor is larger than the first predetermined threshold, the longer filter is selected as the interpolation filter, otherwise the shorter filter is selected as the interpolation filter; and
if the number is greater than one, comparing the weight factor of the block to a second predetermined threshold, such that if the weight factor is larger than the second predetermined threshold, the longer filter is selected as the interpolation filter, otherwise the shorter filter is selected as the interpolation filter.
The method further comprises deriving, for each block in a video frame, update motion vectors based on motion vectors used for blocks of at least a certain size in the prediction process of motion compensated temporal filtering of video frames.
The method further comprises:
comparing the weight factor of the block to a predetermined threshold;
selecting the longer filter as the interpolation filter if the weight factor is larger than the predetermined threshold; and
selecting the shorter filter as the interpolation filter if the weight factor is smaller than or equal to the predetermined threshold.
The second aspect of the present invention provides an electronic module which can be used in an encoder or a decoder, the electronic module having all the necessary blocks to carry out the update operation of motion compensated temporal filtering of video frames according to the method of the present invention.
The third aspect of the present invention provides an encoder for use in motion compensated temporal filtering of video frames, the encoder having a module for carrying out the update method of the present invention.
The fourth aspect of the present invention provides a decoder for use in motion compensated temporal filtering of video frames, the decoder having a module for carrying out the update method of the present invention.
The fifth aspect of the present invention provides an electronic device, such as a mobile terminal, comprising one or both of the encoder and the decoder having a module for carrying out the update method of the present invention.
The sixth aspect of the present invention provides a software application product having a storage medium for storing program code for carrying out the update method of the present invention.
The present invention will become apparent upon reading the description taken in conjunction with FIGS. 8 to 16.
The present invention provides simple but efficient methods for performing the update operation in motion compensated temporal filtering (MCTF) for video coding in order to reduce the complexity in the update operation without significantly affecting the coding performance.
In estimating the energy level of a block that is to be used for update operation, if the block is located at a sub-pixel position, the nearest integer position pixels are used instead of the interpolated pixels of the block.
In the update step, instead of using AVC standard interpolation, a simple adaptive filter is used in interpolating the prediction residue block for the update operation. The adaptive filter is an adaptive combination of a shorter filter (i.e. a filter with fewer filter taps) and a longer filter (i.e. a filter with more filter taps). For instance, the shorter filter can be a bilinear filter and the longer filter can be a 4-tap FIR (finite impulse response) filter. The switching between the shorter filter and the longer filter is based on one of three criteria, each of which is described in detail below.
Motion vectors that are used for the update step are derived from the motion vectors obtained from the prediction step in MCTF. According to the present invention, a further simplification mechanism for the MCTF update step is that only the motion vectors corresponding to larger block sizes obtained from the prediction step are considered in deriving the motion vectors for the update step. For example, if the block size is limited to a minimum of 8×8 and a motion vector in the prediction step corresponds to a block size smaller than 8×8 (such as 8×4, 4×8 and 4×4), then that motion vector and its associated residue block are not used in the update step. In other words, only motion vectors for 8×8 or larger macroblock partitions are considered in deriving update motion vectors. In this case, the update step can be performed on an 8×8 block basis instead of 4×4.
Block Energy Estimation Based on Integer Pixels
As explained above, depending on the update motion vector, interpolation may be needed to obtain sub-pixel values in the update step if the motion vector points to a sub-pixel location in the prediction residue frame.
According to the present invention, in order to obtain the weight factor w2, block energy estimation is performed on the nearest integer pixels and the result is used as an approximation of the real energy level of the interpolated block. With this approach, the computational complexity of the energy estimation itself remains the same as in the prior-art approach. However, there is an advantage in performing energy estimation based on integer pixels: when the estimated energy level is so high that the block should be excluded from the update process (i.e. with a weight factor w2=0), interpolation for the current block can be omitted entirely. This would not be possible if energy estimation were done on interpolated pixels.
Another advantage is that such a mechanism makes it possible to use different interpolation methods for the current block based on its block energy level or the correspondingly derived weight factor, which would likewise not be possible if energy estimation were done on interpolated pixels.
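Integer-pixel block energy estimation can be sketched as follows, assuming the residue frame is addressed at an integer base position (x, y) plus a fractional displacement (dx, dy); names are illustrative and bounds checking is omitted.

```python
import numpy as np

def block_energy_integer(residue, x, y, dx, dy, bs=4):
    """Approximate the energy of a residue block at sub-pixel position
    (x + dx, y + dy) by reading the nearest integer-pixel block, so that
    no interpolation is needed for the estimate."""
    xi = x + int(round(dx))  # snap the fractional displacement
    yi = y + int(round(dy))  # to the nearest integer pixel
    block = residue[yi:yi + bs, xi:xi + bs].astype(float)
    return float(np.mean(block ** 2))

# If the estimate exceeds the threshold (weight factor w2 = 0), the
# interpolation of this block can be skipped altogether.
```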
Adaptive Interpolation for Update Step Based on Weight Factor
According to the present invention, interpolation for the update step is greatly simplified compared with the method that uses AVC standard interpolation.
In the AVC standard, the adoption of the 6-tap filter is a trade-off between complexity and coding performance. It has been found that using a short filter, especially a bilinear filter, for interpolation in motion estimation and motion compensation in AVC can degrade the coding performance. The same conclusion holds for the prediction step of MCTF when it is used in video coding. However, in the update step of MCTF, interpolation is performed on the prediction residue. It has been found that using a short filter for interpolation in the update step does not introduce noticeable coding performance degradation. For example, when a 4-tap filter is used for interpolation in the update step, there is virtually no coding performance degradation compared with AVC standard interpolation.
According to the present invention, a 4-tap filter can be used for interpolation in the MCTF update step. The filter has different filter coefficients for different interpolation positions.
Use of the interpolation filters defined above in the calculation of sub-pixel values will now be described in detail.
In a pixel array having a horizontal row including pixels A1, A2, A3 and A4, the sub-pixel values to be interpolated in the horizontal row are denoted by x1/4, x2/4 and x3/4, respectively. The sub-pixel value x1/4 is calculated by applying interpolation filter (1/4), defined above, to pixel values A1, A2, A3 and A4. Thus, x1/4 is given by:
x1/4=((−2A1)+(14A2)+(5A3)+(−1A4))/16
Sub-pixel x2/4 is calculated in an analogous manner by applying interpolation filter (2/4) to pixel values A1, A2, A3 and A4 and similarly, sub-pixel x3/4 is calculated by applying interpolation filter (3/4), as shown below:
x2/4=((−2A1)+(10A2)+(10A3)+(−2A4))/16
x3/4=((−1A1)+(5A2)+(14A3)+(−2A4))/16
Likewise, in a pixel array having a vertical column including pixels A1, A2, A3 and A4, the sub-pixel values to be interpolated in the vertical direction are denoted by y1/4, y2/4 and y3/4, respectively. The sub-pixel values y1/4, y2/4 and y3/4 are calculated by applying interpolation filters (1/4), (2/4) and (3/4), respectively, to the integer-location pixel values A1, A2, A3 and A4:
y1/4=((−2A1)+(14A2)+(5A3)+(−1A4))/16
y2/4=((−2A1)+(10A2)+(10A3)+(−2A4))/16
y3/4=((−1A1)+(5A2)+(14A3)+(−2A4))/16
Interpolation filter (0/4) is included in the set of interpolation filters for completeness and is purely notional, as it represents the calculation of a sub-pixel value coincident with, and having the same value as, a pixel at an integer location. The coefficients of the other 4-tap interpolation filters (1/4), (2/4) and (3/4) are chosen empirically, for example so as to provide the best possible subjective quality of the interpolated sub-pixel values. It is possible, for example, to interpolate rows of sub-pixel values in the horizontal direction first and then interpolate column by column in the vertical direction. In this way, a value for each sub-pixel position between integer-location pixels can be obtained.
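The 4-tap interpolation defined by the coefficients above can be written compactly as follows. This is a sketch only: the rounding and fixed-point precision handling of an actual implementation are omitted, and the names are illustrative.

```python
import numpy as np

# 4-tap filters from the text, indexed by quarter-pixel phase; phase 0 is
# the notional identity filter for integer positions.
FILTERS = {
    0: np.array([0, 16, 0, 0]),
    1: np.array([-2, 14, 5, -1]),
    2: np.array([-2, 10, 10, -2]),
    3: np.array([-1, 5, 14, -2]),
}

def interp_4tap(a1, a2, a3, a4, phase):
    """Interpolate the sub-pixel value at the given quarter phase (0..3)
    between a2 and a3, from four consecutive integer-position values."""
    return int(np.dot(FILTERS[phase], (a1, a2, a3, a4))) / 16

# Separable 2-D use: interpolate along rows first, then column by column,
# as described above.
```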
To further simplify the interpolation process, a bilinear filter can also be used in interpolating the prediction residue in the MCTF update step. Denoting by p1, p2, p3 and p4 the four integer pixels surrounding a sub-pixel position, and by dx and dy the fractional offsets of that position in the horizontal and vertical directions, the bilinearly interpolated value q is given by
q=(1−dx)*(1−dy)*p1+dx*(1−dy)*p2+(1−dx)*dy*p3+dx*dy*p4
According to the above equation, bilinear interpolation of the quarter-pixel positions can be calculated directly. For example, for position g:
g=(3*3*A+3*1*E+1*3*U+1*1*Y)/16.
Compared with AVC standard interpolation, bilinear interpolation has a much lower complexity.
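Bilinear interpolation as used here follows directly from the equation above. In the small check below, A, E, U and Y are assumed to be the top-left, top-right, bottom-left and bottom-right integer pixels surrounding position g, which is consistent with the weights in the formula for g; the pixel values are arbitrary.

```python
def bilinear(p1, p2, p3, p4, dx, dy):
    """q = (1-dx)(1-dy)p1 + dx(1-dy)p2 + (1-dx)dy p3 + dx dy p4,
    where p1..p4 are the four surrounding integer pixels and
    (dx, dy) the fractional offsets in units of one pixel."""
    return ((1 - dx) * (1 - dy) * p1 + dx * (1 - dy) * p2
            + (1 - dx) * dy * p3 + dx * dy * p4)

# Quarter-pixel position g from the text, with dx = dy = 1/4:
# g = (3*3*A + 3*1*E + 1*3*U + 1*1*Y) / 16
A, E, U, Y = 10, 20, 30, 40
assert bilinear(A, E, U, Y, 0.25, 0.25) == (9 * A + 3 * E + 3 * U + Y) / 16
```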
For the MCTF update step, bilinear interpolation also gives good coding performance, with only slight degradation compared to 4-tap or AVC standard interpolation. In order to keep the low complexity advantage of bilinear interpolation while still maintaining high coding performance, the present invention uses an adaptive interpolation approach based on switching between bilinear and 4-tap filters for the update step interpolation. In the adaptive interpolation approach, the switching between bilinear interpolation and 4-tap interpolation is based on a weight factor of the current block to be interpolated.
As explained above, a weight factor is used to control the update strength. The weight factor is an indicator of how reliable the update motion vector is and how unlikely it is that the update operation will cause coding artifacts. If the weight factor is large, it indicates that it is relatively safe to perform the update operation on the associated block. When choosing an interpolation filter, we would like to use a relatively long filter, e.g. the 4-tap filter, for blocks with a larger weight factor, because these blocks are more important in maintaining the coding performance. For blocks with a lower weight factor, a short filter, e.g. the bilinear filter, is sufficient and preferable.
Before interpolation of the corresponding prediction residue for a block in the update step, the final weight factor for the block is first calculated. Assume the final weight factor is w and that it is a normalized value, so that w is in the range [0, 1]. Th is a predetermined threshold in the range [0, 1]. The adaptive interpolation mechanism is that if w>Th, the long filter, e.g. the 4-tap filter, is used in the interpolation for the current block; otherwise, the short filter, e.g. the bilinear filter, is used. The threshold Th can be determined through a testing procedure, the result of which provides a trade-off between complexity and coding performance. When Th is low, more blocks are interpolated with the long filter. When Th is high, the short filter is used more often. The two extreme cases are: when Th=0, the long filter is always selected; when Th=1, the short filter is always selected. Generally, Th=0.5 is a good trade-off value; in this case it causes no obvious coding performance degradation.
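The selection rule reduces to a one-line comparison, as in the following illustrative sketch (the returned labels stand for the interpolation routines sketched earlier):

```python
def select_filter(w, th=0.5):
    """Long (4-tap) filter when the normalized weight factor w exceeds
    the threshold Th, short (bilinear) filter otherwise."""
    return "4-tap" if w > th else "bilinear"

# Th = 0: the long filter is used for every updated block (w > 0);
# Th = 1: the short filter is always used; Th = 0.5 is a good trade-off.
```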
Adaptive Interpolation for Update Step Based on Block Update Type (or Number of Update Motion Vectors)
In another embodiment of the present invention, adaptive interpolation can be controlled based on block update type, or in other words, by the number of update motion vectors for the current block. As explained above, it is possible for a block to have two update motion vectors. One such example is shown in
Based on the number of update motion vectors, a block in the frame to be updated can be classified into three categories, and a different interpolation method is applied accordingly to interpolate the corresponding prediction residue for that block:
if a block has no update motion vector, no update operation and hence no interpolation is performed for the block;
if a block has one update motion vector, the longer filter, e.g. the 4-tap filter, is used for the interpolation; and
if a block has two update motion vectors, the shorter filter, e.g. the bilinear filter, is used for the interpolation.
As mentioned above, when a block has two update motion vectors, the compensated residue from each side is averaged and the result is used to update that block. Since the interpolation result is subsequently averaged, there is no need to use a long filter for the interpolation in this case.
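A sketch of the update-type-based selection, following the classification above (None indicates that the block is not updated at all; names are illustrative):

```python
def select_filter_by_type(num_update_mvs):
    """Choose the interpolation filter from the block update type only."""
    if num_update_mvs == 0:
        return None        # no update motion vector: block is not updated
    if num_update_mvs == 1:
        return "4-tap"     # unidirectional update: use the longer filter
    return "bilinear"      # bi-directional update: the residues are averaged
                           # afterwards, so the shorter filter suffices
```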
Adaptive Interpolation for Update Step Based on Both Block Update Type and Weight Factor
In a further embodiment of the present invention, adaptive interpolation can be controlled based on both the block update type and the weight factor in the update step. The control mechanism used in this method is a combination of the above two methods.
In this method, the block update type is first checked and the final weight factor is also calculated for a block before interpolation of the corresponding prediction residue block. Two threshold values, Th1 and Th2, are predetermined for unidirectional update blocks and bi-directional update blocks, respectively. To determine the interpolation method for a block, first the block update type is checked:
if the block has one update motion vector (a unidirectional update block), the weight factor of the block is compared to Th1: if the weight factor is larger than Th1, the longer filter is selected, otherwise the shorter filter is selected; and
if the block has two update motion vectors (a bi-directional update block), the weight factor of the block is compared to Th2: if the weight factor is larger than Th2, the longer filter is selected, otherwise the shorter filter is selected.
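The combined control can be sketched as follows, merging the two mechanisms above (Th1 and Th2 are the per-type thresholds; all names and defaults are illustrative):

```python
def select_filter_combined(num_update_mvs, w, th1=0.5, th2=0.5):
    """Choose the filter from both the block update type and the final
    normalized weight factor w of the block."""
    if num_update_mvs == 0:
        return None                        # block is not updated
    th = th1 if num_update_mvs == 1 else th2
    return "4-tap" if w > th else "bilinear"
```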
Update Motion Vector Derivation Based on Larger Block Size
In the present invention, a method is also used in which update motion vectors are derived based only on coding blocks with a larger block size, e.g. a minimum of 8×8.
Taking a minimum block size of 8×8 as an example, motion vectors corresponding to a block size smaller than 8×8 (such as 8×4, 4×8 and 4×4, as specified in the AVC standard) are excluded from, and therefore not used in, the update step. The main procedure for the update step as described above remains the same, except that in this method everything is performed on an 8×8 block basis.
For example, each block in the frame to be updated has a size of 8×8. All the motion vectors with a block size of at least 8×8 in the prediction step are scanned in the derivation of update motion vectors. With this method, the situation as shown in
Generally, only a small percentage of motion vectors correspond to block sizes smaller than 8×8, and these motion vectors may not be reliable enough to be used in the update process. Excluding them from the update process therefore does not significantly affect coding performance. For that reason, update motion vectors can be derived simply on an 8×8 block basis and the entire process can be greatly simplified.
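The size-based restriction when scanning the prediction-step motion vectors can be sketched as follows. The full derivation of one update motion vector per 8×8 coding block (e.g. reversing the vectors and resolving overlaps) is omitted, and the tuple layout is an assumption for illustration.

```python
def usable_for_update(prediction_mvs, min_size=8):
    """Keep only motion vectors whose partition is at least
    min_size x min_size; sub-8x8 partitions (8x4, 4x8, 4x4) and their
    residue blocks are excluded from the update step.
    prediction_mvs: iterable of (block_w, block_h, (dx, dy)) tuples."""
    return [(w, h, mv) for (w, h, mv) in prediction_mvs
            if w >= min_size and h >= min_size]
```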
Advantages
In terms of interpolation, both the 4-tap filter and the bilinear filter are simpler than AVC standard interpolation. In particular, the use of the bilinear filter can dramatically reduce the interpolation complexity of the update process. Furthermore, the present invention uses a long filter, e.g. the 4-tap filter, and a short filter, e.g. the bilinear filter, adaptively, so that performance degradation is minimized while the filtering process is greatly simplified.
In terms of update motion vector derivation, the present invention provides a method in which update motion vectors are derived based on larger block size, e.g. 8×8 blocks. As such, the process for update motion vector derivation is greatly simplified.
In terms of block energy estimation, it is found that estimation based on integer pixels gives a result very close to that based on sub-pixels (or interpolated pixels), while there is an obvious advantage in performing the estimation on integer pixels. For instance, when the estimated energy level is so high that the block should be excluded from the update process (i.e. with a weight factor w2=0), interpolation for the current block is no longer needed. Another advantage is that this makes it possible to select a different interpolation method for the current block based on its block energy level or the correspondingly derived weight factor.
It should be noted that at least some of the MCTF composition and decomposition processes, according to the present invention, are carried out by software programs.
The mobile device 1 may communicate over a voice network and/or a data network, such as any public land mobile network (PLMN) in the form of, e.g., a digital cellular network, especially GSM (global system for mobile communication) or UMTS (universal mobile telecommunications system). Typically, the voice and/or data communication is operated via an air interface, i.e. a cellular communication interface subsystem cooperating with further components (see above) to reach a base station (BS) or Node B (not shown) that is part of a radio access network (RAN) of the infrastructure of the cellular network.
The cellular communication interface subsystem as depicted illustratively in
In case communications of the mobile device 1 through the PLMN occur at a single frequency or a closely-spaced set of frequencies, a single local oscillator (LO) 123 may be used in conjunction with the transmitter (TX) 122 and the receiver (RX) 121. Alternatively, if different frequencies are utilized for voice/data communications or for transmission versus reception, a plurality of local oscillators can be used to generate a plurality of corresponding frequencies.
Although the mobile device 1 depicted in
After any required network registration or activation procedures, which may involve the subscriber identification module (SIM) 210 required for registration in cellular networks, have been completed, the mobile device 1 may then send and receive communication signals, including both voice and data signals, over the wireless network. Signals received by the antenna 129 from the wireless network are routed to the receiver 121, which provides for such operations as signal amplification, frequency down conversion, filtering, channel selection, and analog to digital conversion. Analog to digital conversion of a received signal allows more complex communication functions, such as digital demodulation and decoding, to be performed using the digital signal processor (DSP) 120. In a similar manner, signals to be transmitted to the network are processed, including modulation and encoding, for example, by the digital signal processor (DSP) 120 and are then provided to the transmitter 122 for digital to analog conversion, frequency up conversion, filtering, amplification, and transmission to the wireless network via the antenna 129.
The microprocessor/microcontroller (μC) 100, which may also be designated as a device platform microprocessor, manages the functions of the mobile device 1. Operating system software 149 used by the processor 100 is preferably stored in a persistent store such as the non-volatile memory 140, which may be implemented, for example, as a Flash memory, battery backed-up RAM, any other non-volatile storage technology, or any combination thereof. In addition to the operating system 149, which controls low-level functions as well as (graphical) basic user interface functions of the mobile device 1, the non-volatile memory 140 includes a plurality of high-level software application programs or modules, such as a voice communication software application 142, a data communication software application 141, an organizer module (not shown), or any other type of software module (not shown). These modules are executed by the processor 100 and provide a high-level interface between a user and the mobile device 1. This interface typically includes a graphical component provided through the display 135, controlled by a display controller 130, and input/output components provided through a keypad 175 connected via a keypad controller 170 to the processor 100, an auxiliary input/output (I/O) interface 200, and/or a short-range (SR) communication interface 180.
The auxiliary I/O interface 200 comprises especially a USB (universal serial bus) interface, a serial interface, an MMC (multimedia card) interface and related interface technologies/standards, and any other standardized or proprietary data communication bus technology, whereas the short-range communication interface 180 is a radio frequency (RF) low-power interface including especially WLAN (wireless local area network) and Bluetooth communication technology, or an IrDA (Infrared Data Association) interface. The RF low-power interface technology referred to herein should especially be understood to include any IEEE 802.xx standard technology, descriptions of which are obtainable from the Institute of Electrical and Electronics Engineers. Moreover, the auxiliary I/O interface 200 and the short-range communication interface 180 may each represent one or more interfaces supporting one or more input/output interface technologies and communication interface technologies, respectively.
The operating system, specific device software applications or modules, or parts thereof, may be temporarily loaded into a volatile store 150 such as a random access memory (typically implemented on the basis of DRAM (dynamic random access memory) technology for faster operation). Moreover, received communication signals may also be temporarily stored in the volatile memory 150 before being permanently written to a file system located in the non-volatile memory 140 or in any mass storage, preferably detachably connected via the auxiliary I/O interface, for storing data. It should be understood that the components described above represent typical components of a traditional mobile device 1 embodied herein in the form of a cellular phone. The present invention is not limited to these specific components, whose implementation is depicted merely for illustration and for the sake of completeness.
An exemplary software application module of the mobile device 1 is a personal information manager application providing PDA functionality, typically including a contact manager, calendar, task manager, and the like. Such a personal information manager is executed by the processor 100, may have access to the components of the mobile device 1, and may interact with other software application modules. For instance, interaction with the voice communication software application allows for managing phone calls, voice mails, etc., and interaction with the data communication software application enables managing SMS (short message service), MMS (multimedia messaging service), e-mail communications and other data transmissions. The non-volatile memory 140 preferably provides a file system to facilitate permanent storage of data items on the device, including particularly calendar entries, contacts, etc. The ability for data communication with networks, e.g. via the cellular interface, the short-range communication interface, or the auxiliary I/O interface, enables upload, download, and synchronization via such networks.
The application modules 141 to 149 represent device functions or software applications that are configured to be executed by the processor 100. In most known mobile devices, a single processor manages and controls the overall operation of the mobile device as well as all device functions and software applications. Such a concept is applicable to today's mobile devices. The implementation of enhanced multimedia functionalities includes, for example, the reproduction of video streaming applications, the manipulation of digital images, and the capturing of video sequences by integrated or detachably connected digital camera functionality. The implementation may also include gaming applications with sophisticated graphics and the corresponding demand for computational power. One way to deal with the requirement for computational power, which has been pursued in the past, is to implement powerful and universal processor cores. Another approach is to implement two or more independent processor cores, which is a well-known methodology in the art. The advantages of several independent processor cores can be immediately appreciated by those skilled in the art. Whereas a universal processor is designed to carry out a multiplicity of different tasks without specialization to a pre-selection of distinct tasks, a multi-processor arrangement may include one or more universal processors and one or more specialized processors adapted for processing a predefined set of tasks. Nevertheless, the implementation of several processors within one device, especially a mobile device such as mobile device 1, traditionally requires a complete and sophisticated re-design of its components.
In the following, the present invention provides a concept which allows the simple integration of additional processor cores into an existing processing device implementation, avoiding an expensive, complete and sophisticated redesign. The inventive concept will be described with reference to system-on-a-chip (SoC) design. System-on-a-chip (SoC) is a concept of integrating numerous (or all) components of a processing device into a single highly integrated chip. Such a system-on-a-chip can contain digital, analog, mixed-signal, and often radio-frequency functions, all on one chip. A typical processing device comprises a number of integrated circuits that perform different tasks. These integrated circuits may include especially a microprocessor, memory, universal asynchronous receiver-transmitters (UARTs), serial/parallel ports, direct memory access (DMA) controllers, and the like. A universal asynchronous receiver-transmitter (UART) translates between parallel bits of data and serial bits. Recent improvements in semiconductor technology have allowed very-large-scale integration (VLSI) circuits to grow significantly in complexity, making it possible to integrate numerous components of a system in a single chip.
Additionally, the device 1 is equipped with a module for scalable encoding 105 and a module for scalable decoding 106 of video data according to the inventive operation of the present invention. Said modules 105 and 106 may be used individually by means of the CPU 100, the device 1 thereby being adapted to perform video data encoding and decoding, respectively. The video data may be received by means of the communication modules of the device, or it may be stored in any suitable storage means within the device 1. Video data can be conveyed in a bitstream between the device 1 and another electronic device in a communications network.
Although the invention has been described with respect to one or more embodiments thereof, it will be understood by those skilled in the art that the foregoing and various other changes, omissions and deviations in the form and detail thereof may be made without departing from the scope of this invention.
The present invention is based on and claims priority to U.S. Provisional Patent Application No. 60/670,315, filed Apr. 11, 2005, and U.S. Provisional Patent Application No. 60/671,156, filed Apr. 13, 2005.