Information
-
Patent Grant
-
6061474
-
Patent Number
6,061,474
-
Date Filed
Friday, June 21, 199628 years ago
-
Date Issued
Tuesday, May 9, 200024 years ago
-
Inventors
-
Original Assignees
-
Examiners
- Lee; Thomas D.
- Chen; Wenpeng
Agents
- Fitzpatrick, Cella, Harper & Scinto
-
CPC
-
US Classifications
Field of Search
US
- 382 234
- 382 238
- 382 239
- 382 260
- 382 282
- 348 394
- 348 390
- 348 397
- 348 400
- 348 409
- 348 411
- 348 2612
- 348 430
- 341 55
- 341 107
- 358 430
- 375 250
- 375 252
- 375 241
-
International Classifications
-
Abstract
An encoding error by the encoding and the decoding of an image is controlled within a predetermined range by an image processing apparatus which includes an input unit for inputting image data, a prediction unit for predicting pixel values indicated by the image data, a generation unit for generating pixel values derived by converting the pixel values of the image data within a range of .+-.e based on the prediction result by said prediction unit and an encoding unit for encoding parameters relating to the pixel values generated by said generation unit based on the prediction result by said prediction unit. An image having a gradient in the values of block pixels is encoded at a high efficiency by an image processing apparatus which includes a division unit for dividing an image into block pixels to be encoded, a generation unit for generating parameters defining a plane based on values of first block pixels and values indicating positions of the block pixels to be encoded and a plane encoding unit for encoding difference values between the values of the block pixels indicated by the plane defined by the parameters and the values of the encoded block pixels.
Description
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to digital video processing.
2. Description of the Related Art
In U.S. Pat. No. 5,450,500 and UK Patent Application 9407155.2 there is disclosed a digital colour processor (DCP) for colour correction in which only pixels that are specifically selected to be modified are processed by the digital circuitry. The pixels that are not to be modified are passed through the DCP without any processing that could create rounding or other errors. The processor comprises:
input means for receiving a stream of digital pixel data representative of pixels in a video picture;
selecting means for testing digital pixel data in said stream and selecting said data for modification if and only if it meets predetermined selection criteria;
modifying means for modifying said selected pixel data according to predetermined modification parameters to generate modified data;
first combining means for combining said modified data from said modifying means with unmodified data from said input means to generate output data; and
output means for supplying said output data to further equipment.
In contrast, in a conventional architecture, all of the pixels in the picture would be processed through the same signal modification path, possibly being converted from red, green and blue (RGB) to hue, saturation and luminance (HSL), and then back again to RGB, causing errors.
An advantage of the above system is that pixels to be modified can be selected not just in accordance with colour parameters but in accordance with other parameters such as position. Thus, only pixels within a certain area might be modified.
Pixel selection advantageously is carried out by using the architecture referred to below as the "pixel identification table" or alternatively as the "cache tag RAM". The pixel identification table stores digital bits which define which pixels will be selected from the pixel stream for modification. Pixels may be selected as a function of their color (hue) as in prior systems, and/or as a function of other criteria, such as saturation, luminance, (X,Y) pixel coordinates, sharpness, and texture, alone or in any combination.
Further, after a pixel or region to be changed has been isolated, other parameters besides (H,S,L) color attributes can be changed. For example, the sharpness or even the (X,Y) coordinates of a region can be changed. Modifying the (x,y) coordinates of a region would be useful, for example, for special effects such as moving an object in the picture. Detecting pixels according to their (X,Y) coordinates could also be useful for copying pixels at a given x,y from one frame to another for scratch concealment. The latter process might be carried out simply by, for the given X,Y, controlling the frame store of the DCP (discussed below), so that those specific pixels are not overwritten from frame to frame.
SUMMARY OF THE INVENTION
It has now been determined that it is possible to use such a system for more advanced handling of objects. Thus, it is possible to identify an object and to track it as it moves from frame to frame. It is also possible to store information relating to the object so that manipulations other than colour correction can be carried out.
According to one inventive aspect disclosed herein there is provided a method of digitally processing a sequence of video frames wherein an object which appears in the frames and undergoes relative motion or transformation is selected in a first frame by an operator; the pixels relating to the object are tagged in that frame by means of information including at least one colour or appearance attribute; corresponding pixels relating to the object are located automatically in subsequent frames, by means of said information including at least one colour or appearance attribute and by means of information indicating the expected position or shape of the object in the subsequent frames; and the pixels relating to the object in each of the frames are processed.
An appearance attribute could include texture or sharpness. A transformation could be in the form of morphing or the like or could be simply rotation.
Processing of the pixels could consist of colour correction, or could consist of storing information about the object so that it could be copied to another sequence, or modified in size, or replaced by a different object.
The information indicating the expected position of the object in each frame could be derived in various ways including edge detection or another motion vector detection system. In a simple system an operator could draw e.g. a rectangle around the object in the first frame and a second rectangle around the object at a different position in the last frame. The system would then calculate by extrapolation the expected position of the rectangle in the intermediate frame, assuming constant speed of movement. By positioning the rectangle manually in a few intermediate frames more accurate results could be obtained or changing speed or direction detected and accounted for.
It may be desired to identify more accurately the boundary of an object and according to another inventive aspect disclosed herein, there is provided a method of identifying the picture elements within a particular object, comprising the steps of marking a plurality of points on the boundary of the object, defining vectors joining the points so as to define the boundary of the object, carrying out a scanning operation so as to identify the pixels within the area, and storing the locations of the pixels.
Additionally, account may be taken of a change in shape, for example as a car turns a corner. Starting and finishing shapes may be defined, and the system will estimate intermediate shapes.
Providing all the pixels within the originally defined object are selected in accordance with the selected colour parameters, by selecting in subsequent frames only pixels which have those parameters and are within the defined boundarys, the system can exhibit considerable selectivity. For example, a slight turning of the object might require the boundaries to be narrowed slightly, but even if this is not done, it will be accounted for since the object colours will remain the same and only pixels with those colour parameters will be selected. Pixels within the defined boundary which do not belong to the object colour parameters will not be selected.
Another inventive aspect disclosed herein relates to an efficient system for storing the properties of pixels or for storing the required criteria for pixels to be selected. In this system, at least two tables are established, each table having a plurality of locations representing respective possible values of a property of a pixel. Appropriate values are stored in each table, and by simple AND logic it is possible to identify the properties of a chose pixel. The tables could respectively contain H,S,L values, x,y co-ordinates and other information as described herein. Such an arrangement permits efficient storage of the information.
In one system, each table contains information relating to a number of pixels and each location can have a plurality of values. Thus it is possible, by using different values and having corresponding values in two or more tables, to tie together the total information relating to a particular pixel.
The system described herein permit sophisticated editing techniques. In particular, object can be tracked and their properties altered, or they can be lifted from a video.
It will be appreciated that protection may be sought for any of the inventive aspects discussed above, whether along or in combination.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 shows a flow of a process in a first form of implementation of the present invention;
FIGS. 2A and 2B show types of an encoding method and a positional relation of pixels;
FIG. 3 shows a portion of image data in the encoded data;
FIG. 4 illustrates a Huffman table to encode c0';
FIG. 5 shows a method for scanning run length in a #8 encoding method;
FIG. 6 show a manner of dividing a block to be processed into sub-blocks;
FIG. 7 shows a flow of an S3 process;
FIG. 8 which is composed of FIGS. 8A to 8C shows a flow of S21, S25 and S32 processes;
FIG. 9 shows a maximum block division method and encoded data format;
FIG. 10 shows a manner of converting a difference value to a corrected difference value;
FIG. 11 shows a flow of an S10 encoding process;
FIG. 12 shows a circuit configuration to implement the present invention;
FIG. 13 shows a circuit configuration to implement the first form of implementation;
FIG. 14 shows a block diagram of an encoding apparatus of a first embodiment of the present invention;
FIG. 15 shows position of surrounding pixels a, b and c for a pixel under consideration;
FIG. 16 illustrates block generation; and
FIG. 17 shows a block diagram of a plane encoding circuit.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
First Embodiment
FIG. 1 shows a flow of a process of a first embodiment of the present invention. In FIG. 1, S0 denotes a step to set a reference Huffman table to be described later, S1 denotes a step to read image data, S2 denotes a step to read block data from the image data at a maximum block size (16.times.16 pixels in the present embodiment), S3 denotes a step to determine a block division method and an encoding method, S4 denotes a step to store the block division method and the encoding method determined in the step S3 as reference information, S5 denotes a step to calculate statistics of frequency of occurrence of difference value (prediction error) to be described later which is outputted based on the reference information, S6 denotes a step to update the block data read in the step S2 to virtual block data to be described later, S7 denotes a step to determine an end point of an image, S8 denotes a step to generate an optimal Huffman table based on the difference value statistics calculated in the step S5, S9 denotes a step to determine a loop end and S10 denotes a step to encode the entire image based on the reference information.
The present embodiment is now briefly explained. The encoding in the present embodiment is not a non-reversible encoding but an encoding method in which an error between a pixel value of an actual image and a pixel value of an encoded image is admitted within a range of .+-.e at maximum. This is referred to as a near lossless encoding.
It is assumed here that a maximum size of block to be encoded is 16.times.16 pixels and the encoding is performed at one of 16.times.16 pixels, 8.times.8 pixels and 4.times.4 pixels. For the encoding method for each block, six types of prediction encoding methods (#0 to #5), two types of plane encoding methods (#6 and #7) and a run length encoding method (#8) which may be used only when the block is represented in binary form may be selected (see FIGS. 2A and 2B).
The nine types of encoding methods used for the encoding the block are briefly explained. FIG. 3 shows formats of code data by the respective encoding methods.
The #0 to #5 are for the prediction encoding method in which the predicted value is calculated using surrounding pixels A, B and C shown in FIG. 2A by the formulas of A+B-C, B, A, B+(A-C)/2, A+(B-C)/2, (A+B+C)/3, and a difference between the pixel under consideration and the predicted value is Huffman-encoded. The Huffman table is prepared for each block size and it is encoded by using a separate Huffman table for each block size.
The #6 and #7 are for the plane encoding method. In the #6, a plane which approximates a pixel value of the block is determined by P(x, y)=ax+by+c0, where c0 is a left upper corner pixel value in the block, and the values of a, b and c0 and a difference between the approximate plane and the pixel value in the block are Huffman-encoded.
If a and b are below a predetermined level, a difference c0' from the previously appeared c0 is Huffman-encoded for c0, and if they are above the predetermined level, the c0 value itself is encoded. A flag is inserted after a, b and c0 so that the difference data is not added when all of the reference values in the block are zero.
The #7 is used when all pixel values in the block are equal, that is, when it is a plane of a=b=c0, and the c0' described above is Huffman-encoded. FIG. 4 shows a Huffman table which is used when the c0' is encoded in the #6 and #7. In FIG. 4, S denotes a sign bit which is 0 or 1 depending on the sign of the difference value c0. When the c0 is no smaller than 14 or no larger than -14, the difference value is not used and the value of c0 is added after the ESC code.
The #8 is for the run length encoding method and it is used only when there are only two pixel values in the block. For example, assuming that a higher brilliance pixel represent white and a lower brilliance pixel represents black, the pixels in the block are scanned in the order shown in FIG. 5, converted to a run length code and it is encoded. For the encoding, predetermined white (high brilliance) run and black (low brilliance) run Huffman table are used and the last run in the block is substituted by the EOB code.
Referring to FIG. 1, a flow of process in the present embodiment is now explained in detail.
In the step S0, the reference Huffman table to be used in encoding the block is set. A Huffman table prepared based on statistical information of predicted errors generated by prediction encoding determined for a plurality of images or by other method is read and a 16.times.16 pixel prediction coding Huffman table, a 8.times.8 pixel prediction encoding Huffman table, a 4.times.4 pixel prediction encoding Huffman table and a plane encoding Huffman table are set.
Four counters to be used when those Huffman tables are newly prepared in the step S8, that is, a 16.times.16 prediction encoding counter, a 8.times.8 pixel prediction encoding counter, a 4.times.4 pixel prediction encoding counter and a plane encoding counter are all initialized to zero.
A Huffman table for Huffman-encoding the information indicating the encoding method (#1 to #8) is prepared for each block size, an encoding counter to be used in newly preparing the Huffman tables is provided and it is also initialized to zero.
In the step S1, the image data to be encoded is read onto an encoding memory.
In the step S2, the block data (maximum block size 16.times.16 pixels) are sequentially read from the image to be encoded. The upper and left lines of the read block are also read as reference areas for the prediction encoding.
In the step S3, for the maximum size block (16.times.16 pixels) read in the step S1, the block division method to further dividing the block to sub-block (blocks to be encoded) appropriate for the encoding and the encoding method (#0 to #8) appropriate for the each block to be encoded are determined and the pixel values in the maximum block are updated within an error of .+-.e from the actual pixel values. In this manner, the encoding can be performed with a small difference value (prediction error). By updating the pixel value within the range of .+-.e so that the difference values have only several values, the amount of information of the difference values can be reduced. The process of the step S3 is described below in detail.
The routine of the step S3 has a reversible structure and a total amount of codes when the maximum block or a cubic block derived by dividing the maximum block several times (which is refereed to as the block to be processed and the size thereof differs depending on the flow of process) is sub-divided (into four blocks) is compared with an amount of codes when such division is not made to determine whether the block is to be divided or not (see FIG. 6), and the reference information including the block division information and the encoding method as well as the amount of codes when the block selected by the reference information is encoded are outputted.
FIG. 7 shows a detailed flow of the iterative process used when the encoding method of the maximum block (the block to be processed) and the division method are selected in the step S3.
In FIG. 7, S30 denotes a step to determine the size of the block to be processed. If the block to be processed is the minimum block size (4.times.4 pixels), the process proceeds to S32, other wise the process proceeds to S20 and S23.
S20 denotes a step to copy the block to be processed to generate a block A. The block to be processed is copied to the block A in the memory area for the iterative process. S21 denotes a step to select the optimal encoding method to minimize the amount of codes LA when the block A is encoded. Detail of the selection method will be described later.
S22 denotes a step to output the amount of codes LA generated when the encoding method selected in S21 is used.
S23 denotes a step to copy the block to be processed to generate a block B. S24 denotes a step to divide the block B into four sub-blocks and S25 denotes a step to select an optimal encoding method for each sub-block as is done in the step S21.
S26 denotes a step to output the total amount of codes which is a sum of amounts of codes when the four sub-blocks are encoded by using the encoding methods selected in the step S25.
In S27, the LA outputted in the step S22 is compare with the LB outputted in the step S29 and If LA<LB, the process proceeds to S28, and if LA.gtoreq.LB, the process proceeds to S29.
S28 denotes a step to determine the size of the block to be processed in the block A as the size of the block to be processed, and the iterative process to determine the block division method for the range of the block to be processed is terminated. In S28 of the iterative process, the block A, that is, the block to be encoded is updated to the pixel values (within the error range of .+-.e) of a virtual block to be described later generated in selecting the encoding method in the step S21. The process proceeds to S31 to generate the block division information and the encoding method selected in the step S21 and the amount of codes LA as the reference information.
S29 denotes a step to copy th e block B comprising four sub-blocks to the range of the block to be processed use d at the beginning of the iterative process. The four sub-blocks are substituted by the blocks to be processed and the iterative process to determine the block division method is iterated from the beginning.
In S32, an optimal encoding method of the block to be processed having the minimum block size is selected. In S33, the amount of codes L is outputted. In S34, the block to be processed is updated by the pixel values of the virtual block to be generated later which is generated in the step S32. In S35, the optimal encoding method and the amount of codes L for the block are generated.
So far, the flow of the iterative process used when the encoding method of the maximum block (or the block to be processed) and the block division method are selected in the step S3 has been described.
The selection of the optimal encoding method in the steps S21, S25 and S32 is now explained in detail.
FIGS. 8A to 8C show a flow to select the optimal encoding method. The amounts of codes of all selectable encoding methods are estimated to conduct the selection of the optimal encoding method.
Prior to the estimation of the amounts of codes, the block data is updated to a form appropriate for each method. In FIG. 8A, S40 denotes a step to initialize the amount of codes Ln(0) generated by the encoding method #0, S41 denotes a step to determine a predicted value A+B-C, S42 denotes a step to update the pixel values in the block within the error range of .+-.e, S43 denotes a step to add one pixel of Huffman code length for the difference value to Ln(0), and S44 denotes a step to determine an end point of the block. When the amount of codes in encoding the block is generated, the process proceeds to S61.
The above precess is referred to mode 0 and mode 1 to mode 5 correspond to the processes for the encoding methods #1 to #5 which are identical to the above process.
S45 denotes a step to initialize the encoding methods #6 and #7. A flag (FLAG) indicating whether the approximate plane and the pixel values in the block completely match or not is prepared and it is initially set to 1 (complete match). The parameter Ln(6) indicating a code length for #6 is set to 0 and a parameter Ln(7) indicating a code length for #7 is set to a value which is not minimum of the Ln(0) to Ln(8). S46 denotes a step to determine whether the entire block is rendered to the same pixel value or not by changing the pixel values in the block within the error range .+-.e, S47 denotes a step to determine the block approximate plane P(x, y)=ax+by+c0 of the #6 described above, S48 denotes a step for each pixel to update the pixel value within the error range .+-.e to generate a pixel value for forming the virtual block and S49 denotes a step to add the Huffman code length for the difference value between the generated pixel value and the approximate plane P(x, y) to the Ln(6). S50 denotes a step to determine whether the difference value (prediction error) is 0 or not. When there is even one pixel value which is different from that of the approximate plane, the FLAG is set to 0 in S51. In S52, the end point of the block is determined.
S53 denotes a step to determine the value of the FLAG. If FLAG=0, the amount of codes of the difference value for each pixel generated in S49 and the amount of codes parameters a, b and c0 (c0') indicating the approximate plane P(x, y) are added to produce Ln(6) (S54), and the process proceeds to S61.
If FLAG=1 (the approximate plane and the pixel values of the block completely match), only the information of the approximate plane need be encoded and the amount of codes of the parameters a, b, c0 (c0') are substituted by Ln(6) (S55), and the process proceeds to S61.
When the value of Ln(6) is determined, Ln(7) is set to a value which is not minimum of Ln(0) to Ln(8).
In S56, the amount of codes of c0 (c0') is set to the value of Ln(7) and Ln(6) is set to a value which is not minimum of Ln(0) to Ln(8). In S64, the block data is updated (to generate the virtual block) within the range of .+-.e such that all pixel values in the block are rendered to c0 and the process proceeds to S61.
S57 denotes a step to determine whether the block is represented by two pixel values or not. If it is represented by the two pixel values, the process proceeds to S58, otherwise Ln(8) is set to a value which is not minimum of Ln(0) to Ln(8), and the process proceeds to S61.
S58 denotes a step to convert the pixel values to a run length. The amount of codes Ln(8) generated when the run length is encoded is determined and the process proceeds to S61.
S61 denotes a step to compare the values of Ln(0) to Ln(8) to select a minimum value and the corresponding optimal encoding method. S62 denotes a step to store the virtual block corresponding to the encoding method selected in S61 of the virtual blocks generated by updating the pixel values within the range of .+-.e in the respective modes (mode 0 to mode 8), and S63 denotes a step to output the optimal encoding method and the amount of codes.
The process before the step S61 determines the amounts of codes for the respective methods. This process is conducted after the block to be processed (or the sub-block) is copied in the separate memory area. Accordingly, the updating of the block data when the optimal encoding method is selected does not affect to the original block to be processed.
The steps S40 to S44 are now explained in detail. The steps S40 to S44 determine the amount of codes Ln(0) generated by the #0.
S41 to S44 denote a loop to raster-process the pixels in the block sequentially from the left upper and the processed pixels (the pixels updated in S42) are substituted by the original pixel value as the reference pixels of the pixels under consideration. When the reference pixels, (A, B, C) are out of the area of the 16.times.16 pixel block read in S2, the pixel values of the block data updated in S6 before the loop precess (S2 to S7) of the previous maximum block is used.
In the step S40, Ln(0) is set to 0. In the step S41, the predicted value P=A+B-C is generated based on the pixel value X under consideration and the surrounding pixel values A, B and C and the difference value D=X-P is calculated. In the step S42, a corrected difference value D' is determined by a formula (1) and the pixel under consideration is updated to D'+P. In the formula (1), .dwnarw.x.dwnarw. represents omission of digits after the decimal point.
D'=.dwnarw.(D+.alpha.)/(2e+1).dwnarw.x(2e+1) (1)
.alpha.=(D/.vertline.D.vertline.)xe (2)
FIG. 11 illustrates a corrected difference value D' when e=3 and the difference value D=4. As shown in FIG. 11, since the number of the difference values D' (-7, 0, 7) is smaller than the number of difference values D(-10 to 10), the information indicating the difference values is significantly reduced. Further, the possibility of the complete matching of the difference values and the predicted values is high.
In the present embodiment, the pixel value to be corrected (the corrected pixel value) X'=X+3. In the step S43, the difference between the updated corrected pixel value X' and the predicted value P, that is, the corrected difference value D' is Huffman-coded by the Huffman table for the size of the block to be processed (or the sub-block) and the generated code length is added to Ln(0). In the step S44, whether the process has been completed to the end point of the block or not is determined, and if it is completed, the process proceeds to a step S61, and if it is not completed, the next pixel is considered and the process starting from the step S41 is repeated. In this manner, the amount of codes Ln(0) generated when the encoding method #0 is used is determined. For the encoding methods #1 to #5, only the prediction methods are different and the methods for determining the amount of generated codes Ln(1 to 5) are same as that for the #0 and the detailed description thereof is omitted.
The steps S45 to S56 and S64 are now explained in detail. S45 denotes a step to initialize the variables for the #6 and #7. The amount of codes Ln(6) generated by the #6 is set to 0 and the amount of codes Ln(7) generated by the #7 is set to a sufficiently large value. A variable FLAG which is used to determine whether all difference values are 0 or not in the #6 is set to 1. In a step S46, a maximum value (max) and a minimum value (min) of the pixel values in the block are determined, and if max-min is smaller than 2e+1, the process branches to a process for the encoding method for the #7 (steps S56 et seq), otherwise the process proceeds to a process for the encoding method #6 (steps S47 et seq).
In a step S47, an approximate plane is determined from the pixel values in the block. The steps S48 to S52 sequentially process the pixel values in the block. In a step S48, the difference value D=X-P is determined from the pixel values X under consideration in the block and the value P corresponding to the position of the pixel under consideration on the approximate plane determined in the step S47, the corrected difference value D' is calculated in the same manner as that in the step S42 and the pixel under consideration is updated to D'+P. In a step S49, the difference value between the updated corrected pixel value X' and the predicted value P is Huffman-encoded by the Huffman table for the plane encoding and the code length thereof is added to Ln(6).
In a step S50, whether the difference value between the corrected pixel value X' and the predicted value P is 0 or not, and if it is not 0, the FLAG is set to 0 in a step S51. In a step S52, the end point of the block is determined, and if it is not the end point, the process starting from the step S48 is continued for the next pixel in the block. S53 denotes a step to determine the value of the FLAG. If it is 0, the process proceeds to S54, otherwise the process proceeds to S55. In the step S54, the amount of codes of a, b and c0 plus one bit for the FLAG is added to the Ln(6). In the step S55, the amount of codes of a, b and c0 plus one bit for the FLAG is substituted by a new Ln(6).
If it is determined that the encoding method #7 is available in the step S46, a quotient of (max-min)/2 is set to c0 in a step S56 and the amount of codes when c0 is encoded is substituted for the Ln(7) and a sufficiently large value is set to the Ln(6).
In a step S64, all pixel values in the block are updated. By this process, the amounts of codes Ln(6) and Ln(7) when the encoding methods #6 and #7 are used are determined.
The steps S57 to S60 are now explained in detail. S57 denotes a step to determine whether the encoding method #8 is available or not. The pixel values in the block are examined and if two pixel values appear, the process proceeds to S58; otherwise the process proceeds to S60.
In S60, a sufficiently large value is set to the Ln(8). In a step S58, the block is scanned in the order shown in FIG. 5 while a higher brilliance value of the two pixel values of the block is set to white and a lower brilliance value is set to black to convert it to white-and-black run length data. In a step S59, the amount of codes to represent the two appearing pixel values and the amount of codes when the run length data is Huffman-encoded by using the Huffman table exclusively prepared for the #8 are added to determine the amount of generated codes Ln(8) for the #8.
In this manner, the amounts of generated codes Ln(0) to Ln(8) for the encoding methods #0 to #8 are determined, and in a step S61, a minimum amount of codes which is the minimum of the above is determined and a sum of the minimum amount of codes and amount of codes indicating the encoding method is stored as the amount of code I of the block (one of the four amounts of codes comprising the amount of code LA or the total amount of code LB in FIG. 7, or the amount of code L) and the information representing the encoding method number (i) is also stored.
In a step S62, the virtual block generated in using the encoding method #i is stored. The virtual block is used to determine the block to be encoded (update the original image values) in the steps S28 and S34, and it is finally used to update the block data in the step S6.
In a step S63, the number i and the amount of codes I are outputted and the optimal encoding method selection process (corresponding to S21, S25 and S32 of FIG. 7) is terminated. The description of the selection of the optimal encoding method in the steps S21, S25 and S32 is thus completed.
In this manner, the block division information and the optimal encoding method in the step S3 have been determined.
FIG. 9 shows an example of the encoding method for each of the divided blocks to be encoded. The reference information based on FIG. 5 is shown in FIG. 10. This completes the detailed description of the step S3.
In a step S4, the reference information generated in S3 is stored in the memory.
In a step S5, for the encoding method number i selected for each of the blocks to be encoded, the numbers of times of occurrence of the encoding methods #0 to #8 are counted by encoding number counters provided one for each block size, and if it is one of #0 to #6, the number of times of occurrence of the corrected difference value D' when encoded by that method is added to the counter corresponding to the encoding method (the prediction encoding counter for #0 to #5, and the plane encoding counter for #6).
In a step S6, the pixel values of the block read in the step S2 are updated by using the data of the block to be encoded which were updated in the virtual block in the step S3 (based on S28 and S34).
In a step S7, the end point of the image is determined, and if it is not the end point of the image, the process staring from the step S2 is conducted to the next block, and if it is the end point, the process proceeds to a step S8.
The loop precess of S2 to S7 is the process to update the block data (virtual block) for each block to generate the image data. When the reference pixels (A, B, C or c0) used in the selection process of the encoding method in the step S3 are out of the area of the 16.times.16 pixel block read in the step S2, the pixel values of the block data updated in the step S6 before the previous loop precess are used.
In a step S8, based on the values of the 16.times.16 pixel prediction encoding counter, the 8.times.8 pixel prediction encoding counter, the 4.times.4 pixel prediction encoding counter and the plane prediction counter, the 16.times.16 pixel prediction encoding Huffman table, the 8.times.8 pixels prediction encoding Huffman table, the 4.times.4 pixel prediction encoding Huffman table and the plane encoding Huffman table are reconstructed. Further, the numbers of times of occurrence of the optimal encoding methods #0 to #8 are counted by the encoding method number counters provided for the respective block sizes, and the Huffman table for encoding the encoding method numbers are reconstructed in accordance with the statistics of the counts.
In a step S9, the end of the optimization processing loop (S0 to S8) is determined, and when a predetermined number of loops is completed, the process proceeds to the step S9, otherwise the process proceeds to the step S0. In this case, the S0 in the initial optimization processing loop and the step S0 in the second and subsequent optimization steps are different. In the S0 of the second and subsequent loops, the optimal Huffman table prepared in the previous step S8 is substituted for the reference Huffman table.
In a step S10, the image data updated in the step S6 is actually encoded by referring the stored reference information. FIG. 12 shows a flow of the encoding process in the step S10. In FIG. 12, S80 denotes a step to encode the Huffman table, S81 denotes a step to read block data of the maximum size (16.times.16 pixels) from the image, S82 denotes a step to read the reference data corresponding to the block data, S83 denotes a step to encode the block data in accordance with the reference information and S84 denotes a step to determine the end point of the block. Detailed description follows.
In the step S80, for the 16.times.16 pixel encoding number Huffman table, the 8.times.8 pixel encoding number Huffman table, the 4.times.4 encoding number Huffman table, the 16.times.16 pixel prediction encoding Huffman table, the 8.times.8 prediction encoding Huffman table, the 4.times.4 prediction encoding Huffman table and the plane encoding Huffman table, the information indicating the structures of the respective Huffman tables is outputted as the Huffman table information.
In the step S81, the block data are sequentially read with the maximum block size from the image data. In the step S82, the reference information for the block data is read from the memory.
In the step S83, the reference information and the data derived by encoding the block data of the maximum block size in accordance with the reference information are outputted as the encoded data sequence.
In the step S84, the end point of the image is determined, and if an unprocessed block is present, the process starting from the step S81 is conducted for the next block.
In this manner, the encoded data sequence for the image data is generated and outputted.
This completes the description of the first embodiment.
[Modification of First Embodiment]
The present invention is not limited to the above embodiment. For example, while the block is divided with the maximum block size of 16.times.16 pixels in the above embodiment, the block size may be set to 128.times.128 or larger pixel.
Further, while the block is divided into four same size sub-blocks in the above embodiment, the block size may be variable.
In the first embodiment, the block is divided based on the amount of codes of the block to be processed and the amounts of codes of the sub-blocks derived by dividing the block to determine the block to be encoded. Alternatively, the amount of codes of all blocks from the maximum block (16.times.16 pixels) to the minimum block (4.times.4 block) may be determined and then the block may be divided based on the amount of all codes.
Further, the amount of codes of the minimum block may be first determined, this block is regarded as the sub-block, a temporary block division method and the amount of codes are determined based on the total amount of codes of the four sub-blocks (4.times.4 pixels) as shown in FIG. 6 and the amount of codes of the block (8.times.8 pixels) of the combined area, a new temporary block division method is determined based on the total amount of codes of the four determined areas (8.times.8 pixels) and the amount of codes of a larger block (16.times.16 pixels) comprising the determined area (8.times.8 pixels), and it is repeated until the maximum block size is reached to determine the block to be encoded.
Further, while the prediction encoding method, the plane encoding method and the run length encoding method are used as the selectable block encoding method in the above embodiment, the application method may be changed by changing the prediction method or other encoding method such as the Marcof model encoding method may be used. While the Huffman encoding is uses as the entropy encoding method in the above embodiment, other entropy encoding method such as arithmetic encoding method may be used.
In S6, the block data need not be updated but the value of the prediction error (generated in S3) when the image by the updated block data is encoded may be stored. In this case, in S10, instead of reencoding by using the updated block data, the encoding may be made by using the prediction error (corrected difference value) determined in the step S3. However, when the difference values for the respective pixels are stored, the circuit configuration and the process to generate the reference pixel values are complex compared to the process in which the prediction encoding is conducted by referring the pixel values (A, B and C) in S41 of FIG. 8A. Accordingly, in the first embodiment, the method of storing the corrected pixel value is used rather than the method for storing the corrected difference value.
FIG. 13 shows an example of the apparatus for conducting the encoding process of the first embodiment. Numeral 1 denotes a CPU control unit which can conduct the encoding and decoding of the image data by the software processing, numeral 2 denotes an image input unit for reading the image data, numeral 3 denotes a printer unit for printing the image data, numeral 5 denotes a communication control unit for transmitting the data encoded by the CPU control unit 1 or the encoded data transmitted from an external equipment and numeral 6 denotes an image memory for storing the image information.
Second Embodiment
FIG. 14 shows a block diagram of a second embodiment of the present invention. In FIG. 14 numeral 100 denotes an image input unit for generating image data such as an image reader, a TV camera or a host computer, numeral 101 denotes a block generation circuit for blocking image data for each pixel, numeral 103 denotes a plane encoding circuit, numeral 104 denotes a buffer for adjusting timing, numeral 105 denotes a sub-block generation circuit for dividing the block in the block generation circuit into sub-blocks, numerals 106, 107, 108 and 109 denote prediction encoding circuit for conducting the prediction encoding in different methods, numeral 110 denotes a selector, numeral 111 denotes an information addition circuit, numeral 112 denotes a buffer for adjusting the timing, numeral 113 denotes a selector, numeral 114 denotes an information addition circuit, numerals 115, 116, 117 and 118 denote signal lines and numeral 119 denotes an output unit for outputting the encoded data to an external memory and an external equipment.
FIG. 15 shows positions of surrounding pixels a, b and c used by the prediction encoding circuits 106, 107, 108 and 109. In FIG. 15, x represents a pixel under consideration and a, b and c are defined by the relative positions to x.
First, the image data to be encoded is stored in the memory 101 from the image input unit 100. The block generation circuit 102 sequentially reads the image data by 16.times.16 pixel block and stores them. As shown in FIG. 3, the data of the left and upper pixels of the 16.times.16 pixel block are also read as shown in FIG. 3 and they are stored. The plane encoding circuit 103 calculates a plane .alpha.x+.beta.y+.gamma. (x, y=0, . . . , 15) which proximates the 16.times.16 pixel block stored in the block generation circuit 102 and Huffman-encodes the values of the parameters .alpha., .beta. and .gamma. and the difference value between the proximate plane and the pixels values in the block and stores them in the buffer 104. Detail of the plane encoding circuit 103 will be described later.
On the other hand, the sub-block generation circuit 105 sequentially reads the 8.times.8 pixel blocks from the 16.times.16 pixel block stored in the block generation circuit 102 and stores them. The data of the left and upper pixels of the 8.times.8 pixel block are also read and stored. The prediction encoding circuit 106 calculates x-a for each pixel in the 8.times.8 pixel sub-block stored in the sub-block generation circuit 105 and Huffman-encodes it. Similarly, the prediction encoding circuit 107 Huffman-encodes x-b, the prediction encoding circuit 108 Huffman-encodes x-c and the prediction encoding circuit 109 Huffman-encodes x-a-b+c. The selector 110 selects the encoded data of the smallest amount of codes of the encoded data outputted from the prediction encoding circuits 106, 107, 108 and 109, outputs the code sequence thereof to the signal line 116 and outputs 2-bit identification data I to the signal line 115 to identify the selected prediction encoding circuit. The information addition circuit 111 adds the identification data I from the signal line 115 to the head of the code sequence of the signal line 116 and stores them in the buffer 112. This process is conducted for the four pixel sub-blocks so that one block of codes of the 16.times.16 pixels are stored in the buffer 112.
The selector 113 compares the amount of codes of the buffer 104 and the buffer 112, selects the smaller one and outputs it to the signal line 118. It also outputs one-bit identification data II for identifying the selected buffer 104 or 112 to the signal line 117. The information addition circuit 114 adds the identification data II from the signal line 117 to the head of the code sequence and outputs them to the output unit 119.
The above process is repeated until the last block stored in the memory 101 is reached to generate the code.
FIG. 17 shows a configuration of the plane encoding circuit 103.
In FIG. 17, numeral 201 denotes a plane calculation circuit, numeral 202 denotes a plane data generation circuit, numeral 203 denotes a buffer, numeral 204 denotes a differentiation circuit, numeral 205 denotes a parameter encoding circuit, numeral 206 denotes a Huffman encoding circuit and numeral 207 denotes a multiplexor.
In the present embodiment, the image data to be encoded is an 8-bit/pixel tonality image.
The encoding is conducted block by block. The 16.times.16 pixel block data from the block generation circuit 201 is sent to the plane calculation circuit 201 and the buffer 203 through the signal lines 208 and 209.
The plane calculation circuit 201 calculates the parameters a, b, c.theta. and m defining the plane in a manner to be described later and sends them to the plane generation circuit 202 and the parameter encoding circuit 205 through 210, 211 and 212. The plane generation circuit 202 generates the plane data in accordance with the received parameters and outputs it to 214. A difference between it and the input block image data from 213 is calculated for each pixel in the block by the differentiation circuit 204 and the plane prediction error value is sent to the Huffman encoding circuit 206 for each pixel. The Huffman encoding circuit 206 allocates the Huffman code in accordance with the built-in Huffman table and outputs the code data to 217.
On the other hand, the parameter inputted to the parameter encoding circuit 205 from the signal line 211 is encoded by an appropriate number of bits and it is sent to the multiplexor 207 from 216. The multiplexor 207 rearrange the parameter code sent from 216 and the difference code sent from 217 in a predetermined sequence to generate a series of bit stream which is outputted to the buffer 104.
An operation of the plane calculation circuit 201 is now explained. It is assumed that the pixel value of the left upper pixel of the input N.times.N (N=16 in the present embodiment) pixels is c.theta.. When the pixel value in the pixel coordinate (x, y) in the block is d.times.y, the following statistics is taken. ##EQU1## By using those formulas, a=(BD-2AC)/(4A.sup.2 -B.sup.2)
b=(BC-2AD)/(4A.sup.2 -B.sup.2)
are calculated and an integer m=min(.left brkt-top.max(.vertline.a.vertline., .vertline.b.vertline.).right brkt-top., 16) is determined, where .vertline. .vertline. indicates an absolute value and .left brkt-top. .right brkt-top. indicates round-up, max() indicates a larger one and min() indicates a smaller one. Thus, m assumes 1 to 16. Further, a and b are normalized and quantized by using m.
a=.left brkt-bot.(a/m+1)128.right brkt-bot.
b=.left brkt-bot.(b/m+1)128.right brkt-bot.
where .left brkt-bot. .right brkt-bot. indicates omission.
Thus, a and b are quantized to values of 0 to 255.
When the initially calculated m is 16, a and/or b can assume a value smaller than 0 or larger than 255 and hence it is clipped to keep the range of 0 to 255. After a and b are rendered to 8-bit values, 1 is subtracted from m so that m is rendered to a 4-bit value of 0 to 15.
The result from the above operation is a parameter which defines a best fit plane based on a minimum square error when the plane passing through the left upper pixel of the block is given by Z=ax+by+C.theta. and the position of the left top pixel is (0, 0, C.theta.).
When all parameters have been calculated, the plane calculator 201 outputs 8-bit values for a, b and C.theta. and a 4-bit value for m.
An operation of the plane generation circuit 202 is now explained.
The plane generation unit 202 receives the plane parameters a, b, C.theta. and m explained above from 106 to generate the prediction block. Namely, a' and b' are calculated by
a'=(m+1)(a-128)256
b'=(m+1)(b-128)256
and the pixel value d.sub.ij of the coordinate (i, j) (where i and j are integers between 0 and N-1) is generated by
d.sub.ij =.left brkt-bot.(C.theta.+a' i+b' j+0.5).right brkt-bot.
The block data generated in this manner is outputted to 214.
While all pixels in the block are encoded in the above embodiment, (1) the block may be divided into sub-blocks and the code indicating whether it is the sub-block to be encoded or not may be added for each sub-block, or (2) a plane corresponding to one-bit block may be provided to determine whether it is the sub-block to be encoded or not. The method briefly described in this supplement is particularly effective to a relatively complex portion in the image.
In accordance with the plane encoding described above, the prediction method which does least cause the reduction of the encoding efficiency is attained for the simple brilliance gradient portion bearing noise which occupies a large portion of the actual image, and the encoding efficiency to the entire image is significantly improved over the prior art.
[Modification of Second Embodiment]
The present invention is not limited to the above embodiment. For example, while four values a, b, c and a+b-c are recited as the prediction values used in the prediction encoding in the above embodiment, any other combination which is constructed from the referable pixels may be used. While the Huffman encoding is used together with the plane encoding and the prediction encoding in the above embodiment, other entropy encoding system such as the arithmetic encoding may be used. The plane encoding is not limited to the embodiment in which .alpha., .beta. and .gamma. of the proximate plane ax+py+y are encoded as the plane defining information but any values which can specify the plane may be used. Alternatively, instead of the plane, a relatively simple curved surface may be used.
The image encoding apparatus of the second embodiment comprises block division means for dividing the image data into blocks, plane encoding means for determining the proximate plane for each divided block and encoding the information specifying the proximate plane and the difference value between the proximate plane and the block, means for sub-dividing the block into sub-blocks, prediction encoding means for selecting the optimal prediction system for each sub-block and encoding the information specifying the selected prediction system and the error by the selected prediction system, means for selecting one of the plane encoding means and the prediction encoding means for each block and means for adding the information for specifying the selected encoding system. Accordingly, the plane encoding for the block having the planar nature and the prediction encoding for each sub-block for other block may be selectively used. Thus, the amount of generated codes can be suppressed.
By making the block size of the plane encoding to be larger than the block size of the prediction encoding, the efficient encoding is attained by utilizing the characteristic of the plane encoding.
By extracting the information specifying the proximate plane as the coefficients of the operation formula, the efficiency of the encoding by block by block is enhanced.
In accordance with the present invention, by converting the pixel values in the image or the block, the image data can be encoded with a relatively high efficiency while controlling the error of the pixel values by using the relatively simple circuit configuration and process.
Further, the multi-value data can be encoded at the high efficiency while controlling the error of the pixel values.
Further, in accordance with the present invention, since the image data to be encoded is updated prior to the encoding within the range of +e, the encoding unit can confirm the decoded image. In this case, if it is after S6, the updated image data may be read and monitored on the display to confirm the decoded image.
In accordance with the present invention, the image is divided into blocks, the formula of the proximate plane which proximate the block is calculated for each block and the error from the proximate plane is encoded. Thus, when the original image free from the actual noise is sufficiently proximated by the plane, the dispersion of the noise bearing on the prediction value can be suppressed to the same degree as that of the noise originally included in the original image and the reduction of the encoding efficiency is prevented.
In accordance with the present invention, the encoding by using the proximate plane for each block and the encoding by the prediction method selected from the plurality of prediction methods are provided so that the high efficiency encoding is attained even when the multi-value data is reversibly encoded.
The present invention may be modified without departing from the scope of the claims.
Claims
- 1. An image processing apparatus comprising:
- division means for dividing an image into blocks of pixels to be encoded;
- generation means for generating parameters defining a plane based on values of first blocks of pixels and values indicating positions of the blocks of pixels to be encoded;
- plane encoding means for encoding difference values between the values of the blocks of pixels indicated by the plane defined by the parameters and the values of the encoded blocks of pixels; and
- output means for outputting the encoded difference values obtained by said plane encoding means and the parameters, as final encoded data.
- 2. An image processing apparatus according to claim 1, wherein said plane is a three-dimensional plane.
- 3. An image processing apparatus according to claim 1, wherein said plane is a proximate plane based on the values of the encoded pixels.
- 4. An image processing apparatus according to claim 1, wherein the parameters are coefficients of a function representing the plane.
- 5. An image processing apparatus according to claim 3, wherein said plane is determined based on statistics derived by calculating the values of the encoded block pixels.
- 6. An image processing apparatus according to claim 1, further comprising prediction encoding means different from said plane encoding means, wherein said plane encoding means and said prediction means are selectively used.
- 7. An image processing method comprising the steps of:
- dividing an image into blocks of pixels to be encoded;
- generating parameters defining a plane based on values of first blocks of pixels and values indicating positions of the blocks of pixels to be encoded;
- encoding difference values between the values of the blocks of pixels indicated by the plane defined by the parameters and the values of the encoded blocks of pixels; and
- outputting the encoded difference values, obtained in said encoding step and the parameters, as final encoded data.
- 8. An image processing method comprising:
- an input step of inputting original image data;
- a first encoding step of rewriting the input original image data within a range from -e to e of a change amount into first image data suitable for high compression rate encoding by a first encoding method and for encoding the first image data by the first encoding method to generate first encoded data;
- a second encoding step of rewriting the input original image data within the range from -e to e of the change amount into second image data suitable for high compression rate encoding by a second encoding method and for encoding the second image data by the second encoding method to generate second encoded data; and
- a selection step of selecting one of the first encoded data and the second encoded data by comparing an amount of the first encoded data and an amount of the second encoded data.
- 9. A method according to claim 8, further comprising a third encoding step of rewriting the input original image data within the range from -e to e of the change amount into third image data suitable for high compression rate encoding by a third encoding method and for encoding the third image data by the third encoding method to generate third encoded data.
- 10. A method according to claim 9, wherein said selection step selects one of the first, second and third encoded data by comparing an amount of the first encoded data, an amount of the second encoded data and an amount of the third encoded data.
- 11. A method according to claim 8, wherein the first and second encoding methods are prediction encoding methods which are different from each other.
- 12. A method according to claim 8, wherein ranges of an allowable change amount used respectively in said first encoding step and said second encoding step are the same.
- 13. An image processing apparatus comprising:
- input means for inputting original image data;
- first encoding means for rewriting the input original image data within a range from -e to e of a change amount into first image data suitable for high compression rate encoding by a first encoding method and for encoding the first image data by the first encoding method to generate first encoded data;
- second encoding means for rewriting the input original image data within the range from -e to e of the change amount into second image data suitable for high compression rate encoding by a second encoding method and for encoding the second image data by the second encoding method to generate second encoded data; and
- selection means for selecting one of the first encoded data and the second encoded data by comparing an amount of the first encoded data and an amount of the second encoded data.
- 14. An image processing method comprising:
- an input step of inputting a block image comprising a plurality of pixels;
- a calculation step of calculating a three dimensional plane function Z based on coordinates of each pixel of the block image and values of the plurality of pixels;
- a generation step of generating a difference between a pixel value represented by the three dimensional plane function Z and the values of the plurality of pixels; and
- an output step of outputting the three dimensional plane function Z and the generated difference as final encoded data.
- 15. A method according to claim 14, wherein the block image is formed by N.times.N pixels.
- 16. A method according to claim 14, wherein said generation step Huffman-encodes the difference.
- 17. A method according to claim 14, wherein the three dimensional plane function Z includes coordinates x and y of each pixel of the block image as parameters.
- 18. A method according to claim 17, wherein the three dimensional plane function Z is represented by Z=ax+by+c, where a, b and c are constants, respectively.
- 19. An image processing apparatus comprising:
- input means for inputting a block image comprising a plurality of pixels;
- calculation means for calculating a three dimensional plane function based on coordinates of each pixel of the block image and values of the plurality of pixels;
- generation means for generating a difference between a pixel value represented by the three dimensional plane function and the values of the plurality of pixels; and
- output means for outputting the three dimensional plane function and the generated difference as final encoded data.
Priority Claims (3)
Number |
Date |
Country |
Kind |
7-156146 |
Jun 1995 |
JPX |
|
7-156147 |
Jun 1995 |
JPX |
|
7-178708 |
Jul 1995 |
JPX |
|
US Referenced Citations (3)
Number |
Name |
Date |
Kind |
4369463 |
Anastassiou et al. |
Jan 1983 |
|
5020121 |
Rosenberg |
May 1991 |
|
5122873 |
Golin |
Jun 1992 |
|
Foreign Referenced Citations (1)
Number |
Date |
Country |
0103733A1 |
Mar 1984 |
EPX |