This application claims the priority of Korean Patent Application No. 10-2008-0131607 filed on Dec. 22, 2008, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein by reference.
1. Field of the Invention
The present application relates to a technique that effectively manages frame memory in an image processing apparatus that accesses the frame memory by blocks (i.e., in units of blocks) and, more particularly, to an image processing apparatus, which uses a DRAM (embedded DRAM, SDR, and DDR SDRAM, etc.) as a frame memory, capable of using an overall bandwidth provided by the frame memory without a loss, and a method for managing the frame memory for image processing.
2. Description of the Related Art
Recently, with the development of networks, improved storage capacities, and effective displays, the amount of multimedia data has been increasing rapidly. In the case of video (i.e., moving pictures, moving images, etc.), video with SD-grade resolution (480p) was conventionally the mainstream, but video at or beyond the HD grade (720p), including Full HD video (1080p), is now becoming common.
Full HD video has a resolution of 1920×1080. However, because it is internally processed as 1920×1088, so that each dimension is a multiple of the macroblock size (16×16), a frame memory capable of storing 1920×1088 pixels is required.
In the case of storing data in the YCbCr 4:2:0 format, which is commonly used in image compression or decompression because the amount of data per frame is the smallest, a frame memory of about 24 Mbits per frame is required. For video compression or decompression, at least two frame memories are required, including one or more sheets of reference memory and one sheet of reconfiguration memory. That is, the use of an external memory is requisite.
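The figure of about 24 Mbits can be checked with a short calculation. The following Python sketch is only an illustration; the function names are hypothetical, and 8 bits per sample is assumed.

```python
MB = 16  # macroblock edge in pixels


def padded(size: int, unit: int = MB) -> int:
    """Round size up to the next multiple of unit (integer arithmetic)."""
    return -(-size // unit) * unit


def frame_bits_420(width: int, height: int, bits_per_sample: int = 8) -> int:
    """Bits needed for one YCbCr 4:2:0 frame after macroblock padding."""
    w, h = padded(width), padded(height)
    luma = w * h                       # one luma sample per pixel
    chroma = 2 * (w // 2) * (h // 2)   # Cb and Cr, each at half width and half height
    return (luma + chroma) * bits_per_sample


bits = frame_bits_420(1920, 1080)
print(padded(1080), bits, round(bits / 2**20, 1))
```

For a 1920×1080 input, the height is padded to 1088 and the frame occupies 25,067,520 bits, which is about 23.9 Mbits when 1 Mbit is taken as 2^20 bits.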
Currently, a dynamic random access memory (DRAM) having a smaller area and being lower-priced than a static random access memory (SRAM) is used as the frame memory.
Here, a detailed description of the DRAM used as the frame memory will be omitted. In general, a DRAM includes two or more banks, and each bank is organized in rows and columns. A memory unit having the same row address is called a page. Single memory devices are manufactured with a page size of 1024 bytes or 2048 bytes, and module-type memories formed by combining single memory devices are manufactured with a page size of 4096 bytes or larger, depending on the configuration. Continuous accesses within a single page can be performed without delay, but accessing a different page incurs a delay for a precharge.
This delay may be concealed if frame data is stored across two or more banks and the banks are accessed in turn, or if DRAM access commands are issued in an overlapping manner.
In case of an H.264/AVC image codec having the best compression efficiency so far, the macroblock of 16×16 pixel size is defined and processing and data accessing are performed by macroblocks.
Besides the H.264/AVC, most of the currently used video codecs define macroblocks and process compression and decompression based on the macroblocks.
Image processing devices mostly have an interface for their connection to an image inputting and outputting device, and image inputting/outputting devices mostly have a structure in which data is inputted or outputted in the raster scan order.
The Full HD image includes a total of 8,160 macroblocks (120 in width×68 in length, and each macroblock includes 16×16 pixels), and is stored in the frame memory according to various methods as shown in
Among the methods, a method of storing the Full HD image while increasing addresses according to the scanning order as shown in
The method as shown in
With the method as shown in
The methods as shown in
For example, assume that a DRAM storing 8-bit pixels, having a 6-clock delay for each row address change and a 32-bit interface, is used as a frame memory. A luminance (LUMA) macroblock having a pixel size of 16×16 used in H.264/AVC is then accessed as follows.
Because 16 row address changes must be performed to access the macroblock, a delay time of 96 (6×16) clocks is required. Also, because the data of 4 pixels (4×8 bits) is output at each clock, a total of 64 clocks (16/4×16) is taken to output the data of the 16×16 macroblock. That is, a total of 160 clocks, including the delay and the data transmission time, is used to access the macroblock, and the 64 data clocks account for only about 40% of the bandwidth the DRAM can offer.
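The clock counts above can be reproduced with the following Python sketch. It is a minimal illustration under the stated assumptions (one row address change per macroblock line, 8-bit pixels, 32-bit interface); the function name and parameters are hypothetical.

```python
def raster_macroblock_clocks(mb_size=16, bytes_per_pixel=1,
                             bus_bytes=4, row_change_delay=6):
    """Return (delay clocks, data clocks, bandwidth utilization)."""
    delay = mb_size * row_change_delay                  # one row change per line
    words_per_line = mb_size * bytes_per_pixel // bus_bytes
    data = mb_size * words_per_line                     # one bus word per clock
    return delay, data, data / (delay + data)


delay, data, utilization = raster_macroblock_clocks()
print(delay, data, delay + data, utilization)
```

With the default values the sketch yields 96 delay clocks, 64 data clocks, 160 clocks in total, and a utilization of 0.4.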
In the H.264/AVC, for motion compensation of a chrominance signal (i.e., chroma signal) of a 4×4 block, a 3×3 block must be accessed.
Thus, in an effort to solve the problem, a method of sequentially storing each macroblock in a single column address of a frame memory and performing accessing by macroblocks has been proposed, as shown in
In order to solve the degradation of performance in accessing the block at the boundary of the macroblock, a frame memory structure allowing a multi-bank interleaving by storing an adjacent macroblock in a different bank or dividing one macroblock into several partitions and storing an adjacent partition in a different bank has been proposed. However, this frame memory structure also makes the address computation more complicated and still requires the data realignment for a screen image display.
An aspect of the present application provides a method and apparatus for managing a frame memory capable of removing the necessity of realignment of frame data for a screen image display, simplifying an address computation in accessing the frame memory by blocks, having an intuitive memory structure, and successively accessing block data without delay.
Another aspect of the present application provides a method and apparatus for managing a frame memory capable of accessing the frame memory by selecting a frame/field by macroblocks to effectively support interlace scanning.
Another aspect of the present application provides a method and apparatus for managing a frame memory capable of automatically generating a frame memory structure suitable for a configuration with reference to settings of an image processing device and an external memory.
Another aspect of the present application provides a method and apparatus for managing a frame memory capable of facilitating management by integrating frame memory-related functions which have been generally distributed to be managed.
According to an aspect of the present invention, there is provided a method for managing a frame memory, including: determining a frame memory structure with reference to memory configuration information and image processing information; configuring a frame memory such that a plurality of image signals can be stored in each page according to the frame memory structure; and computing signal storage addresses by combining image acquiring information by bits, and accessing a frame memory map to write or read an image signal by pages.
In determining the frame memory structure, the maximum number of frames of the frame memory, the number of image lines per page, a frame offset, and a chrominance signal offset may be determined with reference to the memory configuration information including information about a page size, a bus width, the number of banks, and the number of rows, and the image processing information including information about a width and height of an image.
In determining the frame memory structure, the maximum number of frames and the number of image lines per page may be determined by the equations shown below:
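\[
\begin{aligned}
\text{Image width} &= \text{pixel width of a macroblock unit} \times 16 && (1)\\
\text{Image height} &= \text{pixel height of a macroblock unit} \times 16 && (2)\\
\text{Frame access line distance} &= 2^{\operatorname{ceil}(\log_2(\text{image width}))} && (3)\\
\text{Field access line distance} &= \text{frame access line distance} \times 2 && (4)\\
\text{Number of image lines per page} &= \text{page size} / \text{frame access line distance} && (5)\\
\text{Maximum number of frames} &= \operatorname{floor}(\text{number of memory rows} / \text{frame offset}) && (6)
\end{aligned}
\]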
In determining the frame memory structure, the chrominance signal offset may be determined as the image height divided by the number of image lines per page and by the number of banks (namely, chrominance signal offset=image height/image lines per page/number of banks) or may be input by a user, and the frame offset may be determined by multiplying the chrominance signal offset by 3/2 (frame offset=chrominance signal offset×3/2) or may be input by a user.
In configuring the frame memory, the number of frames may be determined according to the maximum number of frames, a single bank may be divided into a plurality of subbanks according to the number of image lines per page, and a luminance signal and a chrominance signal may be separately stored according to the frame offset and the chrominance signal offset.
In configuring the frame memory, when an image signal includes a luminance signal and first and second chrominance signals, a start address of rows for storing the luminance signal may be determined according to the frame offset, and a start address of rows for storing the first and second chrominance signals may be determined according to the frame offset and the chrominance signal offset.
In configuring the frame memory, when an image signal includes a luminance signal and first and second chrominance signals, a plurality of luminance signals or the first and second chrominance signals may be stored together in a single page.
In writing and reading, when an image signal includes a luminance signal and first and second chrominance signals and a frame offset is 2^n, a luminance signal address and first and second chrominance signal addresses may be acquired from image acquisition information including a frame index, a signal type, and X and Y coordinates according to the following equations: 1) luminance pixel address={frame index, luminance pixel, Y coordinate, X coordinate}={row address, bank address, column address, byte address}, 2) first chrominance pixel address={frame index, chrominance pixel, Y coordinate, X coordinate, chrominance pixel type}={row address, bank address, column address, byte address}, and 3) second chrominance pixel address=first chrominance pixel address+1.
In writing and reading, when an image signal includes a luminance signal and first and second chrominance signals and a frame offset is not 2^n, a luminance signal address and first and second chrominance signal addresses may be acquired from image acquisition information including a frame index, a signal type, and X and Y coordinates according to the following equations: 1) luminance pixel address=frame index×frame offset+{Y coordinate, X coordinate}={row address, bank address, column address, byte address}, 2) first chrominance pixel address=frame index×frame offset+chrominance offset+{Y coordinate>>1, X coordinate>>1, chrominance pixel type}={row address, bank address, column address, byte address}, and 3) second chrominance pixel address=first chrominance pixel address+1.
In writing and reading, accessing is performed in a bank interleaving manner, and in this case, an access unit may be changed by correcting a line distance, and a field access line distance may be double a frame access line distance.
According to another aspect of the present invention, there is provided an apparatus for managing a frame memory, including: a stream controller that interprets an image data stream provided from a host system; a stream processing unit that reads an image signal of a region corresponding to a motion vector provided from the stream controller, from a frame memory to configure a motion compensation screen image, and configures a predicted screen image and a residual screen image based on data provided from the stream controller; a screen image reconfiguring unit that configures an original screen image by adding the predicted screen image or the motion compensation screen image and the residual screen image in a screen image; a deblocking filter that reads a screen image of a neighbor block from the frame memory, filters the read screen image together with the original screen image, and restores the same in the frame memory; and a frame memory controller that provides control to simultaneously store a plurality of image signals in each page of the frame memory, and acquires a signal storage address from image acquisition information through a bit unit combining method and accesses the frame memory to write or read an image signal by pages when the stream processing unit or the deblocking filter requests accessing.
The frame memory controller may determine the maximum number of frames of the frame memory, the number of image lines per page, a frame offset, and a chrominance signal offset with reference to the memory configuration information including information about a page size, a bus width, the number of banks, and the number of rows, and the image processing information including information about a width and height of an image.
The frame memory controller may determine the number of frames according to the maximum number of frames, divide a single bank into a plurality of subbanks according to the number of image lines per page, and separately store a luminance signal and a chrominance signal according to a frame offset and a chrominance signal offset.
When an image signal includes a luminance signal and first and second chrominance signals and a frame offset is 2^n, the frame memory controller may acquire a luminance signal address and first and second chrominance signal addresses from image acquisition information including a frame index, a signal type, and X and Y coordinates, such that 1) luminance pixel address={frame index, luminance pixel, Y coordinate, X coordinate}={row address, bank address, column address, byte address}, 2) first chrominance pixel address={frame index, chrominance pixel, Y coordinate, X coordinate, chrominance pixel type}={row address, bank address, column address, byte address}, and 3) second chrominance pixel address=first chrominance pixel address+1.
When an image signal includes a luminance signal and first and second chrominance signals and a frame offset is not 2^n, the frame memory controller may acquire a luminance signal address and first and second chrominance signal addresses from image acquisition information including a frame index, a signal type, and X and Y coordinates, such that 1) luminance pixel address=frame index×frame offset+{Y coordinate, X coordinate}={row address, bank address, column address, byte address}, 2) first chrominance pixel address=frame index×frame offset+chrominance offset+{Y coordinate>>1, X coordinate>>1, chrominance pixel type}={row address, bank address, column address, byte address}, and 3) second chrominance pixel address=first chrominance pixel address+1.
The frame memory controller performs accessing in a bank interleaving manner, and in this case, an access unit may be changed by correcting a line distance, and a field access line distance may be double a frame access line distance.
The above and other aspects, features and other advantages of the present application will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings, in which:
Exemplary embodiments of the present application will now be described in detail with reference to the accompanying drawings. The invention may however be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. In the drawings, the shapes and dimensions may be exaggerated for clarity, and the same reference numerals will be used throughout to designate the same or like components.
Unless explicitly described to the contrary, the word “comprise” and variations such as “comprises” or “comprising,” will be understood to imply the inclusion of stated elements but not the exclusion of any other elements.
With reference to
The host system 200, which includes a processor and peripheral devices in which an application program is executed, may be included in a codec device such as an H.264/AVC codec or may be an external system. A frame memory 400, a memory device having two or more banks, may be included in the codec device or may be an external memory. For example, if the frame memory 400 is included in the codec device, it may be implemented as an embedded DRAM, and if the frame memory 400 is mounted outside the codec device, it may be implemented as a single data rate (SDR) SDRAM or a dual data rate (DDR) SDRAM.
The functions of each element will now be described.
The host interface bus 110 transmits initialization information regarding each function module and an image data stream provided from the host system 200, or transmits an image data stream outputted from the image output unit 190 to the host system 200.
The stream buffer 121 acquires and buffers an image data stream transmitted from the host interface bus 110 and provides the image data stream to the stream controller 122. The stream controller 122 interprets the received image data stream and distributes the interpreted data to each module.
The inter-screen image prediction unit 130 reads, from the frame memory 400, data of a region corresponding to a motion vector received from the stream controller 122 to configure a motion compensation screen image, and transmits the same to the screen image reconfiguring unit 160.
The intra-screen image prediction unit 140 configures a predicted screen image based on data received from the stream controller 122, and transfers the configured image to the screen image reconfiguring unit 160.
The inverse transform/inverse quantization unit 150 configures a residual screen image based on data received from the stream controller 122 and transmits the configured residual screen image to the screen image reconfiguring unit 160.
The screen image reconfiguring unit 160 adds the predicted screen image or a motion compensation screen image and the residual screen image in a screen according to a mode to reconfigure an original screen image and transmits the reconfigured original screen image to the deblocking filter 170.
The deblocking filter 170 reads a screen image of neighbor blocks from the frame memory 400, performs filtering on the read screen image of the neighbor blocks together with the reconfigured screen image to remove a block distortion appearing at the boundary of the blocks, and stores the same in the frame memory 400.
When a request for reading operation is received from the inter-screen image prediction unit 130, the deblocking filter 170, or the image output unit 190, the frame memory controller 180 reads the corresponding data from the frame memory 400 and transmits it to a corresponding module, or when a request for writing operation is received from the deblocking filter 170, the frame memory controller 180 stores the corresponding data in the frame memory 400. In this case, data transmission with respect to the frame memory is made in units of blocks.
The image output unit 190 reads the screen image stored in the frame memory 400, converts the stored image into an RGB format, and transmits the converted RGB format to the host system 200.
The host system 200 displays the data received from the image output unit 190 on a screen image display device 300.
The H.264/AVC supports interlace scanning that scans pixel lines by dividing them into two fields (even number lines and odd number lines), and includes a picture-adaptive frame/field (AFF) coding scheme that selects a frame/field in units of pictures (i.e., by pictures) and a macroblock (MB)-AFF coding scheme that selects a frame/field in units of macroblocks (i.e., by macroblocks).
Thus, in order to effectively support the interlace scanning, when a field/frame is written in or read from the frame memory, the field/frame needs to be accessed by macroblocks.
As an image format used in H.264/AVC, the YCbCr 4:2:0 format, in which a chroma signal (i.e., chrominance signal) has 1/2 the resolution of a luma signal (i.e., luminance signal) in both width and length, is largely used.
Y is the luma component, Cb is a blue-difference chroma component, and Cr is a red-difference chroma component. A single 16×16 macroblock includes a 16×16 luma signal, an 8×8 Cb signal, and an 8×8 Cr signal, which are independently processed.
If an actual image width is not 2^n, data from the actual image width to 2^n is not used. If each pixel has N bytes, the original image with a W×H resolution is stored in the frame memory as a two-dimensional array of H lines of N×W bytes each.
Thus, the interval between lines (or rows) constituting the original image is N×W bytes, which is defined as a line distance (LD). If the horizontal resolution of a block to be transmitted is W1, the amount of data corresponding to one line of the block is N×W1 bytes, and the vertical resolution of the block may be defined as an image height (IH).
Accordingly, a frame memory with a W×H resolution in which each pixel has N bytes requires parameters such as N, W, W1, and IH to define an arbitrary image block with a W1×IH resolution.
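As a rough illustration of how these parameters locate a block, the following Python sketch lists the byte span of each line of a W1×IH block. The function name, the block origin (x0, y0), and the base address are hypothetical additions for the example; only N, W, W1, IH, and the line distance come from the description above.

```python
def block_line_spans(n, w, x0, y0, w1, ih, base=0):
    """Return (start byte, length in bytes) for each line of a W1 x IH block.

    n  : bytes per pixel (N)
    w  : stored image width in pixels (W), so the line distance is n * w
    x0, y0 : top-left pixel of the block (assumed parameters)
    """
    line_distance = n * w
    return [(base + (y0 + k) * line_distance + x0 * n, n * w1)
            for k in range(ih)]


# Two lines of a 4-pixel-wide block at (32, 15) in a 2048-byte-wide frame.
print(block_line_spans(n=1, w=2048, x0=32, y0=15, w1=4, ih=2))
```

Consecutive lines of the block start exactly one line distance apart, which is why the line distance alone is enough to step through a block.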
In an exemplary embodiment of the present invention, a frame memory structure suitable for a configuration is automatically generated with reference to settings of an image processing device and an external memory.
Namely, an optimized frame memory structure (image lines stored in a single page, a chrominance signal offset, a frame offset, a line distance, the maximum number of frames, etc.) as shown in
The memory configuration information is inputted when the image processing device is initialized, and the image processing information may be extracted from a stream provided by the host system 200 when it is decoded, and may be extracted from an encoding parameter when it is encoded.
Each frame's configuration information is calculated according to Equation 1 and stored in an internal register, and the stored configuration information may be read from the exterior and used for memory accessing.
[Equation 1]
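\[
\begin{aligned}
\text{Image width} &= \text{pixel width of a macroblock unit} \times 16 && (1)\\
\text{Image height} &= \text{pixel height of a macroblock unit} \times 16 && (2)\\
\text{Frame access line distance} &= 2^{\operatorname{ceil}(\log_2(\text{image width}))} && (3)\\
\text{Field access line distance} &= \text{frame access line distance} \times 2 && (4)\\
\text{Number of image lines per page} &= \text{page size} / \text{frame access line distance} && (5)\\
\text{Chroma signal offset} &= \text{image height} / \text{image lines per page} / \text{number of banks} && (6)\\
\text{Frame offset} &= \text{chroma signal offset} \times 3/2 && (7)\\
\text{Maximum number of frames} &= \operatorname{floor}(\text{number of memory rows} / \text{frame offset}) && (8)
\end{aligned}
\]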
The ceil is a round-up value (i.e., the closest integer larger than or the same as this number), and the floor is a round-down value (i.e., the closest integer smaller than or the same as this number).
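The calculation of Equation 1 can be sketched in Python as follows. This is only an illustration under several assumptions: one byte per pixel, a frame access line distance equal to the image width rounded up to the next power of two, integer division in place of the divisions, and hypothetical function and field names. The example values (4096-byte page, 4 banks, 4096 rows) are likewise assumed for the sake of the example.

```python
from dataclasses import dataclass


@dataclass
class FrameMemoryStructure:
    image_width: int
    image_height: int
    frame_line_distance: int
    field_line_distance: int
    lines_per_page: int
    chroma_offset: int   # counted in memory rows
    frame_offset: int    # counted in memory rows
    max_frames: int


def derive_structure(mb_cols, mb_rows, page_size, num_banks, num_rows):
    image_width = mb_cols * 16
    image_height = mb_rows * 16
    # Equivalent to 2 ** ceil(log2(image_width)) without floating point.
    frame_ld = 1 << (image_width - 1).bit_length()
    field_ld = frame_ld * 2
    lines_per_page = page_size // frame_ld
    chroma_offset = image_height // lines_per_page // num_banks
    frame_offset = chroma_offset * 3 // 2
    max_frames = num_rows // frame_offset          # floor
    return FrameMemoryStructure(image_width, image_height, frame_ld, field_ld,
                                lines_per_page, chroma_offset, frame_offset,
                                max_frames)


print(derive_structure(mb_cols=120, mb_rows=68,
                       page_size=4096, num_banks=4, num_rows=4096))
```

For a Full HD image (120×68 macroblocks) these assumed settings give a line distance of 2048 bytes, 2 image lines per page, a chroma signal offset of 136 rows, a frame offset of 204 rows, and room for 20 frames. A user-supplied power-of-two frame offset, which the description also allows, would trade some of that capacity for simpler address generation.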
In
Luma y0_x0 includes four luminance pixels of y=0, x=0, 1, 2, 3, and chroma y0_x0 includes two chrominance pixels of y=0 and x=0, 1.
The No. 0 macroblock (MB#0) having the storage addresses as shown in
With reference to
The frame memory map is configured with reference to the above-described frame memory structure. Namely, the number of frames is determined according to the maximum number of frames calculated in recognizing the frame memory structure, a single bank is divided into a plurality of subbanks according to the number of image lines per page, and luma and chroma signals are separately stored according to a frame offset and a chroma signal offset.
For example, if the maximum number of frames in recognizing the frame memory structure is 8, the number of image lines per page is 2, a frame offset is 2^n, and a chroma signal offset is calculated as 26′h0400000, then the frame memory map has such a form as shown in
With reference to
In
The memory used in
In an exemplary embodiment of the present invention, frame/field accessing is performed to support interlace scanning by adjusting the line distance. When accessing is performed by frames, the line distance is a one-line size of an image, and when field accessing is performed, the line distance is a two-line size of an image.
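The effect of the line distance on frame and field access can be sketched as follows. The function name and the `bottom` flag, which selects the field starting on the second image line, are hypothetical; the description only states that field access uses a line distance of two image lines.

```python
def line_offsets(num_lines, frame_line_distance, field=False, bottom=False):
    """Byte offsets of successive lines read in frame or field mode."""
    step = frame_line_distance * 2 if field else frame_line_distance
    start = frame_line_distance if (field and bottom) else 0
    return [start + k * step for k in range(num_lines)]


print(line_offsets(3, 2048))                            # frame access
print(line_offsets(3, 2048, field=True))                # field of lines 0, 2, 4
print(line_offsets(3, 2048, field=True, bottom=True))   # field of lines 1, 3, 5
```

Only the step between lines changes, so the same address generator can serve both access modes.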
In
When a single bank is divided into several subbanks to be used as shown in
The addresses of the frame memory map configured as shown in
First, when the frame offset is 2^n, an address at which a desired image signal is stored can be simply obtained through the bit unit combining according to Equation 2 shown below:
[Equation 2]
Luminance pixel address={frame index, luminance pixel, Y coordinate, X coordinate}={row address, bank address, column address, byte address} 1)
First chrominance pixel address={frame index, chrominance pixel, Y coordinate, X coordinate, chrominance pixel type}={row address, bank address, column address, byte address} 2)
Second chrominance pixel address=first chrominance pixel address+1 3)
Taking as an example the pixel of the first frame whose X coordinate is 32 and whose Y coordinate is 15, the storage address of the luma signal is calculated to be 26′h0807820 as represented by Equation 3 shown below, and the signal is stored in the first byte of column 10′h208 (8th in a subbank 31) of row 12′h201 of the third bank of the DRAM.
The first chroma signal is stored in the first byte of column 10′h208 (8th in a subbank 31) of row 12′h300 of the third bank of the DRAM, and the second chroma signal is stored in the second byte of the same position.
[Equation 3]
Frame index [2:0]=3′b001 1)
Y coordinate [10:0]=11′d15=11′h00F=11′b000_0000_1111 2)
X coordinate [10:0]=11′d32=11′h020=11′b000_0010_0000 3)
Luminance address={frame index [2:0], 1′b0, Y coordinate [10:0], X coordinate [10:0]} 4)
={3′b001, 1′b0, 11′b000_0000_1111, 11′b000_0010_0000}
=26′h0807820
={12′h201, 2′h3, 10′h208, 2′h0}
={row address [11:0], bank address [1:0], column address [9:0], byte address [1:0]}
In this case, 1′b0 means a luma signal.
First chrominance address={frame index [2:0], 2′b10, Y coordinate [10:1], X coordinate [10:1], 1′b0} 5)
={3′b001, 2′b10, 10′b000_0000_111, 10′b000_0010_000, 1′b0}
=26′h0C03820
={12′h300, 2′h3, 10′h208, 2′b00}
={row address [11:0], bank address [1:0], column address [9:0], byte address [1:0]}
In this case, 2′b10 means a chroma signal, and 1′b0 means a first chroma signal.
Second chrominance address={frame index [2:0], 2′b10, Y coordinate [10:1], X coordinate [10:1], 1′b1} 6)
={3′b001, 2′b10, 10′b000_0000_111, 10′b000_0010_000, 1′b1}
=26′h0C03821
={12′h300, 2′h3, 10′h208, 2′b01}
={row address [11:0], bank address [1:0], column address [9:0], byte address [1:0]}
=first chrominance address+1
In this case, 2′b10 means a chroma signal, and 1′b1 means a second chroma signal.
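The bit unit combining of Equation 3 can be checked with a short Python sketch. The function names are hypothetical; the field widths (3-bit frame index, 11-bit coordinates, 12-bit row, 2-bit bank, 10-bit column, 2-bit byte address) are those of the worked example, not a general rule.

```python
def luma_address(frame, y, x):
    # {frame index [2:0], 1'b0, Y [10:0], X [10:0]}
    return ((frame & 0x7) << 23) | ((y & 0x7FF) << 11) | (x & 0x7FF)


def chroma_address(frame, y, x, second=False):
    # {frame index [2:0], 2'b10, Y [10:1], X [10:1], chroma pixel type}
    return (((frame & 0x7) << 23) | (0b10 << 21)
            | (((y >> 1) & 0x3FF) << 11)
            | (((x >> 1) & 0x3FF) << 1)
            | int(second))


def split_address(addr):
    # {row [11:0], bank [1:0], column [9:0], byte [1:0]}
    return (addr >> 14) & 0xFFF, (addr >> 12) & 0x3, (addr >> 2) & 0x3FF, addr & 0x3


for addr in (luma_address(1, 15, 32),
             chroma_address(1, 15, 32),
             chroma_address(1, 15, 32, second=True)):
    row, bank, col, byte = split_address(addr)
    print(f"{addr:07X} -> row {row:03X}, bank {bank}, column {col:03X}, byte {byte}")
```

For frame index 1, X=32, Y=15 the luma address evaluates to 0807820 (row 201, bank 3, column 208, byte 0), and the two chroma addresses evaluate to 0C03820 and 0C03821 (row 300, bank 3, column 208, bytes 0 and 1). No multiplication or addition is needed, only shifts and ORs.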
When the frame offset is not 2^n, an address at which a desired image signal is stored can be obtained by Equation 4 shown below:
[Equation 4]
Luminance address=frame index×frame offset+{Y coordinate, X coordinate} 1)
={row address, bank address, column address, byte address}
First chrominance address=frame index×frame offset+chrominance offset+{Y coordinate>>1, X coordinate>>1, 1′b0} 2)
={row address, bank address, column address, byte address}
In this case, 1′b0 means a first chroma signal.
Second chrominance address=frame index×frame offset+chrominance offset+{Y coordinate>>1, X coordinate>>1, 1′b1} 3)
={row address, bank address, column address, byte address}
=first chrominance address+1
In this case, 1′b1 means a second chroma signal.
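A possible reading of Equation 4 is sketched below in Python. It rests on assumptions that the description does not state explicitly: the frame offset and chrominance offset are counted in memory rows, the row address occupies bits [25:14] of the address, and the coordinates are 11 bits wide as in the earlier example. The function names and the example offsets (204 and 136 rows) are hypothetical.

```python
ROW_SHIFT = 14  # assumed position of the row address within the address


def luma_address_np2(frame, y, x, frame_offset_rows):
    base = (frame * frame_offset_rows) << ROW_SHIFT
    return base + ((y << 11) | x)                    # {Y, X}


def chroma_address_np2(frame, y, x, frame_offset_rows, chroma_offset_rows,
                       second=False):
    base = (frame * frame_offset_rows + chroma_offset_rows) << ROW_SHIFT
    return base + (((y >> 1) << 11) | ((x >> 1) << 1) | int(second))


luma = luma_address_np2(1, 15, 32, frame_offset_rows=204)
cb = chroma_address_np2(1, 15, 32, 204, 136)
cr = chroma_address_np2(1, 15, 32, 204, 136, second=True)
print(luma >> ROW_SHIFT, cb >> ROW_SHIFT, cr - cb)
```

Compared with the power-of-two case, one multiplication and one or two additions are needed per address, which is the cost of packing frames more densely.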
In the above, (frame index×frame offset) may be substituted by (previous frame offset+frame offset) as in Equation 5 shown below, or a previously calculated value may be used.
[Equation 5]
frame 0 offset=frame buffer base address 1)
frame 1 offset=frame 0 offset+frame offset 2)
. . .
frame n offset=frame n−1 offset+frame offset n)
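The running sum of Equation 5 can be sketched as follows, so that the multiplication is replaced by a table of precomputed offsets. The function name and the example values are hypothetical.

```python
from itertools import accumulate


def frame_base_offsets(base, frame_offset, num_frames):
    """frame 0 offset = base; frame k offset = frame k-1 offset + frame offset."""
    return list(accumulate([base] + [frame_offset] * (num_frames - 1)))


offsets = frame_base_offsets(base=0, frame_offset=204, num_frames=4)
print(offsets)
```

Each entry equals base + k × frame offset, so a lookup in this table can stand in for (frame index × frame offset).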
In the configuration according to an exemplary embodiment of the present invention, although continuous data accessing is performed on different rows of the frame memory, data can be continuously accessed without delay for changing rows of the frame memory, except for an initial data delay.
In
Accordingly, in order to read a total of 81 pixels, 27 data in units of words are required, and a total of 33 cycles obtained by adding 27 data cycles and the initial delay 6 cycles is required.
Because the first and second chrominance pixels are stored in the same region as illustrated in
In case of frame accessing or field accessing, six cycles of the initial data delay and six cycles of the data transmission cycles are required, so the first and second chrominance pixels can be transmitted during a total of 12 cycles.
As shown in
As for motion compensation, when an extreme case that all 4×4 blocks of macroblocks are coded is taken as an example, in order to compensate motion with respect to a single macroblock, 16 times of luma 9×9 transmissions and 16 times of chroma 3×3 transmissions are continuously made.
In this case, as shown in Table 1 below, the initial delay cycles (6) plus 16 instances of luminance data transmission cycles (27) plus 16 instances of chrominance data transmission cycles (6), totaling 534 transmission cycles, are required.
As a result, the delay cycle accounts for merely 1.1 percent of the overall data transmission cycles, so 98.9 percent of the bandwidth provided by the memory can be used for actual data transmissions.
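The cycle counts quoted above can be reproduced with the following Python sketch. It assumes a 4-byte bus word, one word per cycle, 8-bit pixels, and first and second chrominance pixels interleaved byte by byte; the function names are hypothetical.

```python
def words_per_line(x0, width_bytes, bus_bytes=4):
    """Bus words touched by width_bytes bytes starting at byte offset x0."""
    first = x0 // bus_bytes
    last = (x0 + width_bytes - 1) // bus_bytes
    return last - first + 1


def motion_comp_cycles(blocks=16, initial_delay=6):
    luma = 9 * words_per_line(0, 9)       # 9x9 luma block
    chroma = 3 * words_per_line(0, 6)     # 3x3 Cb and Cr, interleaved
    total = initial_delay + blocks * (luma + chroma)
    return luma, chroma, total, initial_delay / total


luma, chroma, total, delay_share = motion_comp_cycles()
print(luma, chroma, total, round(delay_share * 100, 1))
```

The sketch gives 27 luma cycles and 6 chroma cycles per 4×4 block and 534 cycles per macroblock, of which the 6-cycle initial delay is about 1.1 percent.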
The above-described functions are all performed by the frame memory controller 180 provided in the image processing device in
As set forth above, the image processing apparatus and the frame memory management method for image processing according to exemplary embodiments of the invention have the advantages in that, in accessing the frame memory, a data realignment for a display screen is not required, an address computation in accessing the frame memory by blocks is simple, the memory structure is intuitive, and frame data can be successively accessed by blocks without delay.
In addition, when the frame memory in a single frame memory structure is accessed, frames/fields can be selectively accessed by macroblocks by correcting a line distance, effectively supporting the interlace scanning.
Also, a frame memory structure appropriate for a configuration can be automatically generated with reference to the image processing apparatus and external memory settings.
Moreover, frame memory-related functions, which are generally distributed to be managed, can be integrated to be managed more simply and effectively.
While the present application has been shown and described in connection with the exemplary embodiments, it will be apparent to those skilled in the art that modifications and variations can be made without departing from the spirit and scope of the invention as defined by the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2008-0131607 | Dec 2008 | KR | national |