1. Field of the Invention
The present invention relates generally to the simultaneous processing of a plurality of video inputs provided in different image and video formats, and to the compositing of these inputs into a single composite output subject to blending parameters and scaling.
2. Prior Art
The display of multiple sources of video and computer graphics in real-time as a composite picture is required in applications ranging from consumer video devices, such as Set-Top-Boxes and DVD players, to embedded automotive displays. In the alpha blending method of compositing images, described by Porter and Duff in “Compositing Digital Images” (Computer Graphics, pp. 253-259, vol. 18 no. 3, July 1984), a foreground image F, described as a vector of Red, Green and Blue pixels, is combined with a background image B using a predefined alpha channel α describing the extent of foreground coverage of the background, or the extent of opacity of the foreground pixels. An implementation of this method of blending using a general purpose CPU and frame buffer memory is shown in U.S. Pat. No. 5,651,107. The blending computation requires multiplying the Red, Green and Blue pixel values of the foreground image by the α value for each pixel, and adding the product of (1−α) and the Red, Green and Blue pixel values of the background image. This requires integer and floating point multiplications, which are costly operations for a general purpose CPU (“Compositing, Part 2: Practice”, IEEE Computer Graphics and Applications, pp. 78-82, vol. 14, no. 6, November 1994).
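For illustration, the per-pixel computation may be sketched in C as follows; the type and function names here are hypothetical, as the cited references publish no code:

    typedef struct { unsigned char r, g, b; } rgb_t;

    /* out = alpha*fg + (1 - alpha)*bg, computed per color channel;
       alpha is in [0.0, 1.0] and may differ for every pixel. */
    static rgb_t alpha_blend(rgb_t fg, rgb_t bg, float alpha)
    {
        rgb_t out;
        out.r = (unsigned char)(alpha * fg.r + (1.0f - alpha) * bg.r);
        out.g = (unsigned char)(alpha * fg.g + (1.0f - alpha) * bg.g);
        out.b = (unsigned char)(alpha * fg.b + (1.0f - alpha) * bg.b);
        return out;
    }

The two multiplications per color channel, six per pixel, illustrate why this operation is costly on a general purpose CPU.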
Graphics processors with instructions for these operations can efficiently perform alpha blending. Processors capable of vector processing are able to process the alpha blending of a set of foreground and background pixels, together with the alpha channel, in parallel, as shown in U.S. Pat. Nos. 5,767,867 and 7,034,849. Color images are more efficiently represented using a Color Lookup Table (CLUT) than by pure RGB. Alpha blending of CLUT images requires generation of the blended CLUT, and mapping of the output image to the blended CLUT, as shown in U.S. Pat. No. 5,831,604. Images being blended may require scaling and resolution conversion to an output resolution and format, followed by blending in that resolution and representation (α,RGB or α,CLUT), as shown in U.S. Pat. No. 5,914,725.
Consumer applications require blending of multiple images and video into a single output image, with each input source potentially having a format different from the output format. The multiple image and video sources can be treated as different image layers, each converted to a common output format and blended with a background for output. Each layer is then blended with the other layers to produce the output layer, as shown in U.S. Pat. No. 6,157,415. A hardware implementation of alpha blending is feasible through the combination of multiplier and adder circuits, as shown in European Patent Application Publication No. 0588522.
Graphics chips have implemented the compositing of multiple graphics layers with video input, including format conversion from YUV format to RGB format and video scaling. Such chips use a memory controller to access the graphics images and video frame buffer in memory, combine the graphics layers sequentially in a graphics pipeline, based on the alpha values, using a graphics blending module with line buffers, and then combine the blended graphics image with the video input using a video compositor module, as shown in U.S. Pat. Nos. 6,570,579, 6,608,630 and 6,700,588. Recent advances in semiconductor technology have made the integration of multiple pipelines for graphics and video compositing feasible, as shown, for example, in U.S. Patent Application Publication No. 2007/0252912. However, while format conversion and scaling of the layers may be done in parallel, alpha blending necessitates sequential processing of the image layers, typically following a rendering order from the background to the highest foreground layer.
Methods for image format conversion from YUV to RGB are well known. The YUV format is a common format for video, providing lossy compression with preservation of video quality. The conversion of YUV to RGB uses a sequence of interpolation and standard conversion steps, as described by Poynton in “Merging computing with studio video: Converting between R′G′B′ and 4:2:2”, 2004, http://www.poynton.com/papers/Discreet_Logic/index.html, and further in U.S. Pat. Nos. 5,124,688 and 5,784,050. Conversion between YUV formats, upsampling from YUV 4:2:0 to YUV 4:2:2 and YUV 4:4:4, has been implemented in hardware circuits, as shown in U.S. Pat. No. 6,674,479.
Current and future applications for compositing multiple sources of video and graphics into high resolution video output will require the blending of an increasing number of input sources, at increasing resolution, such as HDTV. As these factors increase, memory access and bus bandwidth for the input sources and for frame buffer memory become critical resources. Hence, it would be advantageous to provide a solution that overcomes the limitations of the prior art.
3. Summary of the Invention
Apparatus and methods for video processing are disclosed that integrate multiple processing modules executing methods for simultaneous format conversion, scaling and image blending of a plurality of video sources, resulting in a video output ready for display. The modules use methods that are optimized for integration in a pipeline architecture, enabling the processor to increase the number of video input sources while minimizing the required access to external memory. The processor combines multiple such pipelines, enabling it to simultaneously process a plurality of video inputs and combine these inputs into a single video output. This architecture is implemented as a hardware video processing apparatus.
Graphics controllers for many applications must receive input streams of images and video in different formats, and composite these inputs to a single output video stream. The throughput of such controllers may be limited by the system memory bus access overhead of the image and video processing. Accordingly, the preferred embodiment of the present invention is a system-on-chip (SoC) architecture consisting of a plurality of processing pipelines, driven by the output module. This architecture and pipeline-based methods greatly reduce the memory bus access, increasing throughput and scalability in the number and resolution of the video inputs.
4. Detailed Description of the Invention
Reference is now made to FIG. 1, which shows a block diagram of the video processor in accordance with an embodiment of the present invention.
The sprite layer modules 140-1 to 140-M provide the input of small graphical objects, for example, without limitation, a mouse cursor, converting these objects to RGB alpha format outputs 112-1 to 112-M. The modules 140 are directly accessible through the advanced peripheral bus (APB) and use internal memory to store the sprite data. The control module 150 controls the conversion modules 130 for the input layers, the sprite modules 140, the generic blending module (GBM) 110, as well as the video module 160.
The input video layers, following conversion to RGB alpha format, are blended by the generic blending module 110. The resulting blended composite video layer is output as RGB 118 and provided to the video module 160. The video module 160 provides output video data in the required format, e.g., YUV or RGB, and with the appropriate horizontal and vertical synchronization. The video can be converted by an RGB2YUV module 165. The video output can also be provided to memory over a bus interface 170, using, for example, an AHB master 115.
Reference is now made to FIG. 2, which shows the data flow through the video processor.
The Video Module 160 requests the video layer in RGB from the Video Interface 231, which requests data from FIFO 232. The FIFO requests composited video from the Pixel Management unit 230. The Pixel Management unit processes the alpha blending for the plurality of layers in parallel, and requests a plurality of converted video layers via bus master (AHB MST) interfaces for each of the plurality of conversion modules and sprite layers. The parallel data flow through the plurality of conversion modules 130-1 to 130-N is shown for a typical conversion module layer 130-1. The conversion module provides converted video data in RGB alpha to the Generic Alpha Blending unit through the bus slave AHB SLV1 213-1. The converted data is provided from the FIFO OUT 212-1. FIFO OUT 212-1 requests the converted video layer from the CONV module 211-1, and the CONV module requests the video layer from FIFO IN 210-1, which is interfaced to the video source memory through AHB MST1 131-1. The conversion modules 130 are managed by the control module 150, and sprite data and other parameters are stored in internal memory 220.
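This request-driven chain can be modeled in C as follows; this is a conceptual sketch with hypothetical names, in which each function call stands for a FIFO handshake or an AHB bus transaction, and in which the N conversion pipelines of the hardware operate in parallel rather than sequentially:

    #include <stdint.h>
    #include <stdio.h>

    static uint32_t fifo_in_pull(void)        /* FIFO IN 210-1 */
    {
        static uint32_t addr = 0;
        return addr++;                        /* stands in for an AHB MST1 131-1 read */
    }

    static uint32_t conv_pull(void)           /* CONV 211-1 */
    {
        uint32_t yuv = fifo_in_pull();
        return yuv | 0xFF000000u;             /* stands in for YUV-to-RGB alpha conversion */
    }

    static uint32_t fifo_out_pull(void)       /* FIFO OUT 212-1, feeding AHB SLV1 213-1 */
    {
        return conv_pull();
    }

    int main(void)
    {
        /* The Pixel Management unit 230 pulls converted pixels on demand;
           each downstream request propagates upstream, so no intermediate
           frame is ever written to external memory. */
        for (int i = 0; i < 4; i++)
            printf("pixel %d = 0x%08X\n", i, (unsigned)fifo_out_pull());
        return 0;
    }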
Reference is now made to FIG. 3.
Reference is now made to FIG. 4, which shows the YUV2RGB conversion unit 410-i of the conversion module.
Reference is now made to FIG. 5, which shows the stages of conversion from YUV 4:2:0 input to RGB 8:8:8 output.
The YUV 4:2:0 to YUV 4:2:2 conversion 510 consists of the duplication of the chrominance pixels. In the YUV 4:2:0 video format, chrominance pixels U and V are shared by 4 luminance Y pixels, whereas in the YUV 4:2:2 video format, chrominance pixels U and V are shared by only 2 luminance Y pixels. The YUV 4:2:2 to YUV 4:4:4 conversion 520 consists of interpolating 2 chrominance U and V pixels to create the “missing” chrominance U and V pixel. The YUV 4:4:4 to RGB 8:8:8 conversion 530 is performed using well-known conversion equations. An exemplary and non-limiting set of conversion equations is given as follows:
R=Y+351*(V−128)/256=Y+1.371*(V−128)
G=Y−179*(V−128)/256−86*(U−128)/256=Y−0.699*(V−128)−0.336*(U−128)
B=Y+443*(U−128)/256=Y+1.73*(U−128)
The coefficients used by the YUV2RGB unit 410-i can be configured through the control module 150.
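A fixed-point C sketch of the 530 conversion stage, using the /256 scaled coefficients of the equations above as configurable parameters in the manner of the control module 150; the struct and function names are illustrative, not taken from the hardware:

    #include <stdint.h>

    /* Coefficients scaled by 256, matching the equations above. */
    typedef struct { int rv, gv, gu, bu; } yuv2rgb_coef_t;
    static const yuv2rgb_coef_t default_coef = { 351, 179, 86, 443 };

    static uint8_t clamp8(int v) { return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v; }

    static void yuv444_to_rgb888(uint8_t y, uint8_t u, uint8_t v,
                                 const yuv2rgb_coef_t *c,
                                 uint8_t *r, uint8_t *g, uint8_t *b)
    {
        int du = u - 128, dv = v - 128;
        *r = clamp8(y + (c->rv * dv) / 256);                  /* R = Y + 351*(V-128)/256 */
        *g = clamp8(y - (c->gv * dv) / 256
                      - (c->gu * du) / 256);                  /* G = Y - 179*(V-128)/256 - 86*(U-128)/256 */
        *b = clamp8(y + (c->bu * du) / 256);                  /* B = Y + 443*(U-128)/256 */
    }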
Reference is now made to FIG. 6, which shows the video module 160.
The following equations are used to perform standard definition conversion:
Y=(77*R+150*G+29*B)/256=0.301*R+0.586*G+0.113*B
Cb=(−43*R−85*G+128*B)/256+128=−0.168*R−0.332*G+0.5*B+128
Cr=(128*R−107*G−21*B)/256+128=0.5*R−0.418*G−0.082*B+128
The following equations are used to perform high definition conversion:
Y=(47*R+157*G+16*B)/256+16=0.184*R+0.613*G+0.063*B+16
Cb=(−26*R−87*G+112*B)/256+128=−0.101*R−0.34*G+0.438*B+128
Cr=(112*R−102*G−10*B)/256+128=0.438*R−0.398*G−0.039*B+128
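Both coefficient sets may be captured in a single fixed-point C sketch; the names are illustrative, and all integer coefficients are scaled by 256 as in the equations above:

    #include <stdint.h>

    typedef struct {
        int yr, yg, yb, y_off;    /* luma row and offset  */
        int cbr, cbg, cbb;        /* Cb row (offset +128) */
        int crr, crg, crb;        /* Cr row (offset +128) */
    } rgb2yuv_coef_t;

    static const rgb2yuv_coef_t sd_coef = { 77, 150, 29, 0,  -43, -85, 128,  128, -107, -21 };
    static const rgb2yuv_coef_t hd_coef = { 47, 157, 16, 16, -26, -87, 112,  112, -102, -10 };

    static uint8_t clamp8(int v) { return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v; }

    static void rgb_to_ycbcr(uint8_t r, uint8_t g, uint8_t b,
                             const rgb2yuv_coef_t *c,
                             uint8_t *y, uint8_t *cb, uint8_t *cr)
    {
        *y  = clamp8((c->yr * r + c->yg * g + c->yb * b) / 256 + c->y_off);
        *cb = clamp8((c->cbr * r + c->cbg * g + c->cbb * b) / 256 + 128);
        *cr = clamp8((c->crr * r + c->crg * g + c->crb * b) / 256 + 128);
    }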
The blended video may also be passed directly as RGB to the video data generation module 650, bypassing the RGB2YUV conversion unit 165, when RGB video output is required.
The control unit 620 enables programmable management of the screen display area and position as well as other display parameters. The window management unit 630 is used to define the area of display and the position of the composite video layer on the display screen. The window management unit 630 sets the horizontal and vertical delay of the video output.
The synchro detection unit 640 uses the video HSync, VSync and pixel clock signals to synchronize the window management unit 630 and the output of the pixel data via the video data generation unit 650. The video data generation unit 650 produces properly synchronized video output.
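The window parameters can be pictured as a small register set; the field names below are hypothetical, standing for registers programmed through the control unit 620:

    /* Position and size of the composite layer on the display screen.
       The delays are counted from the HSync and VSync pulses. */
    typedef struct {
        unsigned h_delay;   /* pixel clocks from HSync to the first active pixel */
        unsigned v_delay;   /* lines from VSync to the first active line */
        unsigned width;     /* active width of the composite layer, in pixels */
        unsigned height;    /* active height of the composite layer, in lines */
    } window_cfg_t;

Increasing h_delay or v_delay moves the composite layer to the right or downward on the display screen.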
Reference is now made to FIG. 7.
Reference is now made to FIG. 8, which shows the alpha blending of two layers for different values of the alpha coefficient.
Pr=α*P1+(1−α)*P0
where P0 is the pixel value from layer 0, P1 is the pixel value from layer 1 and Pr is the output blended pixel. In the case of alpha=100% for the entire layer 1, layer 1 is fully opaque and occludes layer 0, as shown by 830. In the case of alpha=0% for the entire layer 1, layer 1 is fully transparent and only layer 0 is seen, as shown by 850. In the case of alpha=50% for the entire layer 1, layer 1 is partially transparent, as shown by 840.
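As a worked example, with alpha=50%, a layer 1 pixel value of 200 and a layer 0 pixel value of 100 blend to Pr=0.5*200+0.5*100=150, the midpoint of the two pixel values.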
With respect to FIG. 8, the alpha coefficient need not be uniform over the layer: as noted above, the alpha channel may define an alpha value for each individual pixel.
Reference is now made to FIG. 9, which shows the blending of a plurality of layers using a chain of mixer units 910.
Mixer unit 910-1 receives the pixel value of the layer of order 0, the pixel value of the layer of order 1 and the alpha coefficient of the layer of order 1 as input, and produces the alpha blended output. Mixer unit 910-2 receives the output of mixer 910-1, the pixel value of the layer of order 2 and the alpha coefficient of the layer of order 2, and produces the alpha blended output. The chain is repeated using N mixing units 910 to blend the layers of order 0 through N, where the layer of order N is mixed by unit 910-N, which receives the output of mixing unit 910-(N−1), the pixel value of layer N and the alpha coefficient of layer N.
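A C sketch of the chain follows; the function names are illustrative, and in the hardware each loop iteration corresponds to a physical, pipelined mixer unit 910-i rather than a sequential step:

    /* Pr = a*P1 + (1 - a)*P0, one mixer stage. */
    static float mix(float p_prev, float p_cur, float alpha)
    {
        return alpha * p_cur + (1.0f - alpha) * p_prev;
    }

    /* Blend layers of order 0..n-1; layer 0 is the background. */
    static float blend_chain(const float *pixel, const float *alpha, int n)
    {
        float acc = pixel[0];
        for (int i = 1; i < n; i++)
            acc = mix(acc, pixel[i], alpha[i]);   /* mixer unit 910-i */
        return acc;
    }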
Reference is now made to FIG. 10, which shows the structure of a mixer unit 910.
Pr=α*P1+(1−α)*P0
The module implements this function using multipliers 1010 and 1030, adder 1040 and subtraction unit 1020. The module performs the alpha arithmetic on pixels in 32-bit RGB alpha format.
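In 8-bit fixed point, one channel of the mixer may be sketched as follows; this is a sketch only, in which alpha is an integer in 0..255 and each unit of the figure is noted in the comments:

    /* One 8-bit channel of the mixer. */
    static unsigned mixer_channel(unsigned p1, unsigned p0, unsigned a)
    {
        unsigned one_minus_a = 255u - a;   /* subtraction unit 1020 */
        unsigned t1 = a * p1;              /* multiplier 1010: a*P1 */
        unsigned t0 = one_minus_a * p0;    /* multiplier 1030: (1-a)*P0 */
        return (t1 + t0 + 127u) / 255u;    /* adder 1040, renormalized to 0..255 */
    }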
Thus the processor described herein is operative using external memory solely for receiving the video input and for storage of the blended output video.
While the disclosed invention is described hereinabove with respect to specific exemplary embodiments, it is noted that other implementations are possible that provide the advantages described hereinabove and that do not depart from the spirit of the invention disclosed herein. Such embodiments are specifically included as part of this invention disclosure, which should be limited only by the scope of its claims. Furthermore, the apparatus disclosed in the invention may be implemented as a semiconductor device on a monolithic semiconductor.