This application claims the benefit of Chinese Application No. 201110159926.1, filed Jun. 15, 2011, the disclosure of which is incorporated herein by reference.
The present embodiments generally relate to the field of image processing, particularly to a method for removing from a captured image another object beyond the boundary of a specific object.
When an image of a thick document is captured with a camera or a scanner, a user tends to press both sides of the document by hand, or to hold it otherwise, to keep the document flat. The hands present in the captured image may then cause, for example, misjudgment of corner points on the boundary of the document.
Traditional hand detection methods fall into two general categories. The first is to create a skin color model from a large amount of training data and to classify the pixels of an image with it, thereby detecting the region of the hands. The second is to create a statistical model from a large amount of training data, to detect the hands with it, and to further position the boundary of the hands accurately using a skin color model after the region of the hands is detected. The first category has the drawback of being heavily influenced by the training data: it is very likely to fail if the color of the hands in the image cannot be well modeled by the skin color model, or if there are many regions other than the hands whose color is similar to that of skin. The second category has the drawback of requiring a large amount of training data of hands to learn a classifier with a strong classifying capability, and its effectiveness may not be guaranteed due to the great diversity of appearances of hands.
In view of the foregoing problems, the embodiments propose to reserve the region of an image inside the boundary of a specific object, thereby indirectly removing from the image the region of the hands or of another object other than the specific object. According to the embodiments, the boundary of a document is fitted from the current image itself, without requiring a large amount of offline training data for creating a model, so the process is convenient, prompt and widely applicable. Furthermore, the efficiency and precision of subsequent image processing can be improved significantly by removing the region of the other object.
According to an embodiment, there is provided an image processing method including: determining an edge map of a foreground object in an image; obtaining candidates for a boundary line from the edge map and determining the boundary line among the candidates for the boundary line, the boundary line defining the boundary of a specific object in the foreground object; and removing the foreground object beyond the boundary line other than the specific object.
According to another embodiment, there is provided an image processing device including: edge map determining means for determining an edge map of a foreground object in an image; boundary line determining means for obtaining candidates for a boundary line from the edge map and determining the boundary line among the candidates for the boundary line, the boundary line defining the boundary of a specific object in the foreground object; and removing means for removing the foreground object beyond the boundary line other than the specific object.
According to still another embodiment, there is provided a program product including machine-readable instruction codes stored thereon which, when read and executed by a machine, perform the foregoing image processing method.
According to a further embodiment, there is provided a storage medium including the foregoing program product according to the embodiment.
The foregoing and other objects and advantages of the invention will be further described below in conjunction with the embodiments thereof and with reference to the drawings in which identical or like technical features or components will be denoted with identical or like reference numerals. In the drawings:
a) is a schematic diagram illustrating pressing both sides of a book by hands to keep the book flat;
b) is an illustrative diagram illustrating misjudgment of corner points due to presence of the hands on the boundary of the book;
a) is an illustrative diagram illustrating taking a region of a specific width on the two sides of a fitted line.
The embodiments will be described below with reference to the drawings. It shall be noted that only those device structures and/or processes closely relevant to the solutions of the embodiments are illustrated in the drawings, while other details less relevant to the embodiments are omitted so as not to obscure the embodiments with unnecessary details. Identical or like constituent elements or components will be denoted with identical or like reference numerals throughout the drawings.
According to an embodiment, there is provided an image processing method including: determining an edge map of a foreground object in an image (S210); obtaining candidates for a boundary line from the edge map and determining the boundary line among the candidates for the boundary line, the boundary line defining the boundary of a specific object in the foreground object (S220); and removing the foreground object beyond the boundary line other than the specific object (S230).
The embodiments will be detailed below with reference to the drawings, by way of an example in which a non-specific object in the image, i.e., the hands beyond the boundary of the book, is removed.
In this embodiment, it is assumed that the left and right boundaries of the book need to be determined and that the region of the hands beyond the left and right boundaries is to be removed. However, those skilled in the art shall appreciate that the image processing method according to the embodiment is similarly applicable when the hands or another object lie on the upper and lower boundaries of the book in the image.
Various implementations of the respective processes will be described below.
Firstly, a specific implementation of the process at S210, where an edge map is determined, will be described.
Given an input image f(x, y) (0 ≤ x ≤ w−1 and 0 ≤ y ≤ h−1, where w and h represent the width and the height of the input image respectively), the average color f_background of the background region of the image is estimated from the boundary region of the image. Here the background region of the image is assumed to be a uniform texture region, so its average color can be estimated easily from the boundary region of the image. Then a distance image can be calculated from the raw image according to the estimated average background color, as indicated in equation (1):

dist(x, y) = |f(x, y) − f_background|  (1)
Here |·| represents the L1 distance (block distance) between two vectors. The L1 distance is commonly used in the field to represent the difference between two vectors, and its concept and calculation are well known to those skilled in the art, so a repeated description of details thereof is omitted here. For the distance image dist(x, y) (0 ≤ x ≤ w−1 and 0 ≤ y ≤ h−1), a threshold T can be obtained by the Otsu algorithm, and the input image is binarized with this threshold, as indicated in equation (2):
mask(x, y) = 1 if dist(x, y) > T, and mask(x, y) = 0 otherwise  (2)

where mask(x, y) = 1 denotes a foreground pixel and mask(x, y) = 0 denotes a background pixel.
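As a concrete illustration of equations (1) and (2), the following is a minimal sketch assuming a NumPy/OpenCV environment; the border width and the function name are our own placeholder choices, not part of the embodiment.

```python
import cv2
import numpy as np

def binarize_foreground(img, border=10):
    """img: H x W x 3 uint8 input image f(x, y); returns a 0/1 foreground mask."""
    # Estimate the average background colour f_background from the border region.
    strips = [img[:border], img[-border:], img[:, :border], img[:, -border:]]
    f_background = np.concatenate([s.reshape(-1, 3) for s in strips]).mean(axis=0)
    # Equation (1): L1 (block) distance between each pixel and f_background.
    dist = np.abs(img.astype(np.float32) - f_background).sum(axis=2)
    dist8 = cv2.normalize(dist, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    # Equation (2): threshold T obtained by the Otsu algorithm, then binarize.
    _, mask = cv2.threshold(dist8, 0, 1, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return mask  # 1 = foreground, 0 = background
```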
Referring back to the flow of determining the edge map, the edge map can be determined from the binary mask as follows: a foreground pixel in a region on one side of the center of the binary-masked image is selected, and if a pixel adjacent to the foreground pixel and farther from the center of the foreground object than the foreground pixel is a background pixel, then the foreground pixel is determined as a pixel of the edge map.
A method for obtaining an edge map from a raw document image is not limited to what is described above but can be performed otherwise. For example, for obtaining the left edge map, the difference in luminance between a foreground pixel of the raw image and its left neighboring pixel is calculated, and if the difference is above a preset threshold, then the pixel is determined as a pixel of the left edge map; similarly, the difference in luminance between each foreground pixel of the raw image and its right neighboring pixel is calculated, and if the difference is above a preset threshold, then the pixel is determined as a pixel of the right edge map. The threshold can be set depending upon practical conditions (for example, the difference in luminance between foreground and background pixels of the raw image), or experimentally or empirically.
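A minimal sketch of the two edge-map variants just described, assuming the 0/1 mask from the previous sketch and a grayscale luminance image; the threshold value of 30 is an arbitrary placeholder for the preset threshold.

```python
import numpy as np

def left_right_edge_maps(gray, mask, lum_thresh=30):
    """gray: H x W luminance image; mask: 0/1 foreground mask of the same shape."""
    left_edge = np.zeros(mask.shape, dtype=bool)
    right_edge = np.zeros(mask.shape, dtype=bool)
    fg = mask.astype(bool)
    # Mask-based variant: a foreground pixel whose left (right) neighbour is
    # background is a pixel of the left (right) edge map.
    left_edge[:, 1:] = fg[:, 1:] & ~fg[:, :-1]
    right_edge[:, :-1] = fg[:, :-1] & ~fg[:, 1:]
    # Luminance-based variant: a foreground pixel whose luminance differs from
    # that of its left (right) neighbour by more than the preset threshold.
    g = gray.astype(np.int32)
    left_edge[:, 1:] |= fg[:, 1:] & (np.abs(g[:, 1:] - g[:, :-1]) > lum_thresh)
    right_edge[:, :-1] |= fg[:, :-1] & (np.abs(g[:, :-1] - g[:, 1:]) > lum_thresh)
    return left_edge, right_edge
```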
Next, the process at S220 will be described. Obtaining candidates for a boundary line from the edge map and determining the boundary line among the candidates (S220) can include firstly obtaining the candidates for the boundary line from the edge map through fitting, and then selecting the boundary line among the fitted candidates.
Firstly, a specific example of a process of obtaining the candidates for the boundary line from the edge map through fitting will be described. In this example, the number of foreground pixels is counted taking a region of a predetermined size as a unit on the obtained edge map, a region with the number of foreground pixels above a predetermined threshold is selected, and the foreground pixels contained in the selected region are fitted to obtain the candidates for the boundary line.
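A minimal sketch of this fitting step, under our assumption that the boundary lines are near-vertical; the strip width and the pixel-count threshold (the "second threshold") are placeholder values.

```python
import numpy as np

def fit_candidate_lines(edge_map, strip_width=16, min_pixels=50):
    """edge_map: H x W boolean array; returns candidate lines (a, b) with x = a*y + b."""
    h, w = edge_map.shape
    candidates = []
    # Count foreground pixels taking a fixed-width column strip as the unit,
    # and keep only strips whose count exceeds the threshold.
    for x0 in range(0, w, strip_width):
        ys, xs = np.nonzero(edge_map[:, x0:x0 + strip_width])
        if len(xs) < min_pixels:
            continue
        # Least-squares fit of x as a linear function of y over the kept pixels.
        a, b = np.polyfit(ys, xs + x0, 1)
        candidates.append((a, b))
    return candidates
```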
Next, an example of a process of selecting the boundary line from the fitted lines will be described. Specifically, at S650, a region of a specific width is taken respectively on each side of each fitted line, i.e., each candidate for the boundary line.
The difference between the feature representations of the regions on the two sides is calculated at S660. In a specific example, a color histogram is used as the feature representation: each of the regions on the two sides is divided into several sub-regions, a color histogram is counted in each sub-region, and the histograms of the sub-regions are concatenated to obtain the feature representation of the region. The candidate for the boundary line with the largest difference between the feature representations of its two sides is then selected as the boundary line.

Furthermore, if a determination is made at S680 that the largest difference is above a preset threshold, then a further benefit can be obtained: the rate of false determinations can be lowered to a minimum. A false determination refers to mistaking, e.g., the central line of the book for the boundary line, which would cause contents that would otherwise not be removed, e.g., the contents of the book itself, to be removed in a subsequent process.
Although a color histogram feature is used above to calculate the difference between feature representations, any other appropriate feature representation can be applied so long as it is sufficient to represent the difference between the specific regions (rectangular regions of a specific width in this example) on the two sides of a candidate for the boundary line. Furthermore, the region for calculating the difference between feature representations need not be rectangular; any appropriate shape can be used.
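A minimal sketch of the histogram-based difference score described above, simplified (our assumption) to axis-aligned strips around a given column x_line; per-channel 8-bin histograms are counted in each vertical sub-region, concatenated, and compared with the L1 distance.

```python
import numpy as np

def region_feature(img, x_lo, x_hi, n_sub=4, bins=8):
    """Concatenated per-sub-region colour histograms of the strip img[:, x_lo:x_hi]."""
    h = img.shape[0]
    feats = []
    for i in range(n_sub):
        sub = img[i * h // n_sub:(i + 1) * h // n_sub, x_lo:x_hi]
        hist = [np.histogram(sub[..., c], bins=bins, range=(0, 256))[0] for c in range(3)]
        hist = np.concatenate(hist).astype(np.float64)
        feats.append(hist / max(hist.sum(), 1.0))  # normalise each sub-histogram
    return np.concatenate(feats)

def side_difference(img, x_line, width=20):
    """L1 distance between the features of the strips on the two sides of x_line."""
    w = img.shape[1]
    f_left = region_feature(img, max(x_line - width, 0), x_line)
    f_right = region_feature(img, x_line, min(x_line + width, w))
    return np.abs(f_left - f_right).sum()
```

The candidate whose score is the largest (and, per the check at S680, above a preset threshold) would then be selected as the boundary line.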
As is apparent, in a specific example, if only one candidate for the boundary line is obtained in the foregoing process, then this candidate can be taken directly as the final boundary line.
Referring back to the overall flow, after the boundary lines are determined, the foreground object beyond the boundary lines other than the specific object, e.g., the region of the hands, is removed at S230, so that only the region inside the boundaries of the book is reserved in the image.
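A minimal sketch of this removal step, assuming the left and right boundary lines have been selected in the (a, b) form used above; filling the removed area with the estimated background colour is our own choice, since the text only specifies that the foreground beyond the lines is removed.

```python
import numpy as np

def remove_outside(img, left_line, right_line, fill):
    """left_line, right_line: (a, b) with x = a*y + b; fill: background colour."""
    h, w = img.shape[:2]
    ys = np.arange(h)
    x_left = (left_line[0] * ys + left_line[1]).astype(int)
    x_right = (right_line[0] * ys + right_line[1]).astype(int)
    cols = np.arange(w)[None, :]  # 1 x W column indices
    # A pixel is outside if it lies left of the left line or right of the right line.
    outside = (cols < x_left[:, None]) | (cols > x_right[:, None])
    out = img.copy()
    out[outside] = fill  # blank out the hands or other objects beyond the boundary
    return out
```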
Note that if, in the foregoing process, the book as the specific object is significantly tilted, for example beyond a specific range relative to the vertical direction of the image (e.g., −15° to 15°), then in a preferable embodiment the direction of the image to be processed is estimated and corrected as in the prior art, so that the book in the image is tilted within the predetermined range, thereby further improving the precision of the image processing described above. For details of estimating and correcting the direction of an image in the prior art, reference can be made to, for example, "Skew and Slant Correction for Document Images Using Gradient Direction" by Sun Changming and Si Deyi, in the 4th International Conference on Document Analysis and Recognition.
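For illustration only, the following is a minimal deskew sketch; the cited paper estimates the direction from gradient directions, whereas here cv2.minAreaRect on the foreground mask stands in as a simple substitute of our own.

```python
import cv2
import numpy as np

def deskew(img, mask):
    """Rotate img so that the foreground in mask (0/1 uint8) is roughly upright."""
    pts = cv2.findNonZero(mask)                # coordinates of foreground pixels
    (cx, cy), _, angle = cv2.minAreaRect(pts)  # angle of the minimal bounding rectangle
    # Normalise for the differing angle conventions of OpenCV versions.
    if angle > 45:
        angle -= 90
    elif angle < -45:
        angle += 90
    M = cv2.getRotationMatrix2D((cx, cy), angle, 1.0)
    h, w = img.shape[:2]
    return cv2.warpAffine(img, M, (w, h), flags=cv2.INTER_LINEAR)
```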
Although the foregoing detailed description has been presented by way of an example in which the image of hands beyond the boundary of the image of a book is removed, those skilled in the art shall appreciate that this image processing solution can equally be applied to removing images of various other objects beyond the boundary of the image of a book, which will not be enumerated here.
Furthermore, although in the respective embodiments and specific examples described above the region in the image between the left and right boundaries of the specific object (e.g., a book) is determined so as to remove the region of the hands beyond those boundaries, the processes according to the embodiments can equally be applied to determining the region in the image between the upper and lower boundaries of the specific object so as to remove the region of another object beyond those boundaries. The processes are similar, so a repeated description thereof is omitted here for conciseness.
In correspondence to the image processing method according to the embodiment, an embodiment further provides an image processing device, which includes:
Edge map determining means 901 for determining an edge map of a foreground object in an image;
Boundary line determining means 902 for obtaining candidates for a boundary line from the edge map and determining the boundary line among the candidates for the boundary line, the boundary line defining the boundary of a specific object in the foreground object; and
Removing means 903 for removing the foreground object beyond the boundary line other than the specific object.
In a specific example, the edge map can be determined with a binary mask; in this case the edge map determining means 901 includes:
Binary mask determining means 9011 for determining a binary mask for the captured image in which a background object is distinguished from the foreground object; and the edge map determining means 901 determines the edge map according to the binary mask determined by the binary mask determining means 9011.
In another specific example, the edge map can alternatively be determined from the difference in luminance; in this case the edge map determining means 901 includes:
Luminance difference calculating means 9012 for calculating the difference in luminance between a foreground pixel of the foreground object in the image and a neighboring pixel which is adjacent to the foreground pixel on one side thereof and farther from the center of the foreground object than the foreground pixel, the foreground pixel being determined as a pixel of the edge map if the difference is above a predetermined first threshold.
In another specific example, the boundary line determining means 902 includes:
Region obtaining means 9021 for obtaining the number of foreground pixels taking a region of a predetermined size as a unit on the obtained edge map, the number of foreground pixels being obtained by counting the foreground pixels of the edge map contained in the region of the predetermined size, and for selecting a region with the number of foreground pixels above a predetermined second threshold;
Candidate fitting means 9022 for fitting the foreground pixels contained in the selected region to obtain the candidates for the boundary line; and
Feature representation obtaining means 9023 for obtaining, from the raw image, feature representations of regions of a specific width on two sides of each candidate for the boundary line adjacent to the candidate for the boundary line, and determining the difference between the feature representations of the regions on the two sides, and for selecting the candidate for the boundary line with the largest difference between the feature representations as the boundary line.
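A minimal sketch, assuming Python, of one way the means 901 to 903 could be composed in software; the embodiment itself does not prescribe this structure, and the injected callables stand for implementations such as the per-step sketches given earlier.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

import numpy as np

Line = Tuple[float, float]  # a boundary line x = a*y + b

@dataclass
class ImageProcessingDevice:
    edge_map_determining: Callable[[np.ndarray], np.ndarray]             # means 901
    boundary_line_determining: Callable[[np.ndarray, np.ndarray], Line]  # means 902
    removing: Callable[[np.ndarray, Line], np.ndarray]                   # means 903

    def process(self, img: np.ndarray) -> np.ndarray:
        edge_map = self.edge_map_determining(img)            # S210
        boundary = self.boundary_line_determining(img, edge_map)  # S220
        return self.removing(img, boundary)                  # S230
```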
The image processing device according to the embodiment described above performs processes corresponding to the respective processes of the foregoing image processing method, so reference can be made to the foregoing description of the method for details of its operation; moreover, such an image processing device can be included in, for example, a scanner.
The foregoing detailed description has set forth a variety of implementations of the device and/or the method according to the embodiment(s) in block diagrams, flow charts and/or embodiments. Those skilled in the art shall appreciate that the respective functions and/or operations in these block diagrams, flow charts and/or embodiments can be performed separately and/or collectively by a variety of hardware, software, firmware or essentially any combination thereof. In an embodiment, several aspects of the subject matter described in this specification can be embodied in an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a Digital Signal Processor (DSP) or another integrated form.
However, those skilled in the art shall appreciate that some aspects of the implementations described in this specification can, wholly or partially, be equivalently embodied in the form of one or more computer programs running on one or more computers (e.g., on one or more computer systems), in the form of one or more programs running on one or more processors (e.g., on one or more microprocessors), in the form of firmware, or in the form of essentially any combination thereof, and that, in light of the disclosure in this specification, they are capable of designing the circuits and/or writing the code of the software and/or firmware for this disclosure.
For example, the respective processes in the flow charts of the processes for removing the hands beyond the boundary of the book described above can be embodied as such computer programs, which can be executed on, e.g., a general-purpose computer.
In such a computer, a Central Processing Unit (CPU) 1201 performs various processes according to a program stored in a Read-Only Memory (ROM) 1202 or loaded from a storage part 1208 into a Random Access Memory (RAM) 1203, in which data required when the CPU 1201 performs the various processes is also stored as needed. The CPU 1201, the ROM 1202 and the RAM 1203 are connected to one another via a bus 1204, to which an input/output interface 1205 is also connected.
The following components are also connected to the input/output interface 1205: an input part 1206 (including a keyboard, a mouse, etc.), an output part 1207 (including a display, e.g., a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker, etc.), a storage part 1208 (including a hard disk, etc.) and a communication part 1209 (including a network interface card, e.g., a LAN card, a modem, etc.). The communication part 1209 performs a communication process over a network, e.g., the Internet. A drive 1210 can also be connected to the input/output interface 1205 as needed. A removable medium 1211, e.g., a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., can be installed on the drive 1210 as needed, so that a computer program read therefrom can be installed into the storage part 1208 as required.
In the case that the foregoing series of processes is performed in software, a program constituting the software is installed from a network, e.g., the Internet, or from a storage medium, e.g., the removable medium 1211.
Those skilled in the art shall appreciate that such a storage medium will not be limited to the removable medium 1211 described above, in which the program is stored and which is distributed separately from the device to provide the program to the user; for example, it can alternatively be the ROM 1202 or a hard disk contained in the storage part 1208, in which the program is stored and which is distributed to the user together with the device containing it.
Therefore, the embodiments further propose a program product including machine-readable instruction codes stored therein which, when read and executed by a machine, perform the foregoing image processing method according to the embodiments. Correspondingly, the various storage media listed above in which such a program product is embodied also come into the scope of the embodiments.
In the foregoing description of the embodiments, a feature described and/or illustrated in an embodiment can be used identically or similarly in one or more other embodiments in combination with or in place of a feature in the other embodiment(s).
It shall be noted that the terms "include/comprise" as used in this context refer to the presence of a feature, an element, a process or a component, but do not preclude the presence or addition of one or more other features, elements, processes or components. The terms "first", "second", etc., relating to ordinal numbers do not denote any execution order or degree of importance of the features, elements, processes or components defined with these terms, but are merely intended to identify these features, elements, processes or components for the sake of clarity of the description.
Furthermore, the methods according to the respective embodiments are not limited to being performed in the temporal sequences described in the specification or illustrated in the drawings, but can also be performed in other temporal sequences, concurrently or separately. Therefore, the temporal sequence in which a method is described in the specification does not limit the scope of the embodiments.
The following annexes are also disclosed in connection with implementations including the foregoing embodiments.
Annex 1. An image processing method, including:
determining an edge map of a foreground object in an image;
obtaining candidates for a boundary line from the edge map and determining the boundary line among the candidates for the boundary line, the boundary line defining the boundary of a specific object in the foreground object; and
removing the foreground object beyond the boundary line other than the specific object.
Annex 2. A method according to Annex 1, wherein the process of determining the edge map of the foreground object in the image includes: obtaining a binary mask for the captured image in which a background object is distinguished from the foreground object; and then determining the edge map according to the binary mask.
Annex 3. A method according to Annex 2, wherein the process of determining the edge map according to the binary mask includes: selecting a foreground pixel in a region on one side of the center of the binary masked image, and if a pixel farther from the center of the foreground object than the foreground pixel and adjacent to the foreground pixel is a background pixel, then determining the foreground pixel as a pixel of the edge map.
Annex 4. A method according to Annex 1, wherein the process of determining the edge map of the foreground object in the image includes: calculating the difference in luminance between a foreground pixel of the foreground object and a neighboring pixel of the foreground pixel on one side of the foreground pixel, adjacent to the foreground pixel and farther from the center of the foreground object than the foreground pixel; and if the difference is above a predetermined first threshold, then determining the foreground pixel as a pixel of the edge map.
Annex 5. A method according to any one of Annexes 1 to 4, wherein the process of obtaining the candidates for the boundary line from the edge map includes:
obtaining the number of foreground pixels taking a region of a predetermined size as a unit on the obtained edge map, the number of foreground pixels obtained by counting foreground pixels in the edge map contained in the region of the predetermined size, and selecting a region with the number of foreground pixels above a predetermined second threshold value; and
fitting the foreground pixels contained in the selected region to get the candidates for the boundary line.
Annex 6. A method according to any one of Annexes 1 to 5, wherein the process of determining the boundary line among the candidates for the boundary line includes:
for each candidate for the boundary line, obtaining, from the raw image, feature representations of regions of a specific width on two sides of the candidate for the boundary line adjacent to the candidate for the boundary line; and
determining the difference between the feature representations of the regions on the two sides, and selecting the candidate for the boundary line with the largest difference between the feature representations as the boundary line.
Annex 7. A method according to Annex 6, wherein the candidate with the largest difference between the feature representations above a preset threshold is selected as the boundary line.
Annex 8. A method according to Annex 6, wherein the feature representations include color histograms or gray-level histograms corresponding respectively to the regions on the two sides, and wherein each of the regions on the two sides is divided into several sub-regions, color histograms or gray-level histograms are obtained by counting in the respective sub-regions, and the histograms of these sub-regions are then concatenated to obtain the feature representation of the region.
Annex 9. A method according to any one of Annexes 1 to 8, wherein the edge map includes the left edge map and the right edge map, and the boundary line includes the left boundary line and the right boundary line.
Annex 10. An image processing device for performing the method according to any one of Annexes 1 to 9, including:
edge map determining means for determining an edge map of a foreground object in an image;
boundary line determining means for obtaining candidates for a boundary line from the edge map and determining the boundary line among the candidates for the boundary line, the boundary line defining the boundary of a specific object in the foreground object; and
removing means for removing the foreground object beyond the boundary line other than the specific object.
Annex 11. An image processing device according to Annex 10, wherein the edge map determining means includes binary mask determining means for determining a binary mask for the captured image in which a background object is distinguished from the foreground object.
Annex 12. An image processing device according to Annex 10 or 11, wherein the edge map determining means further includes luminance difference calculating means for calculating the difference in luminance between a foreground pixel of the foreground object in the image and a neighboring pixel of the foreground pixel on one side of the foreground pixel, adjacent to the foreground pixel and farther from the center of the foreground object than the foreground pixel; and if the difference is above a predetermined first threshold, then determining the foreground pixel as a pixel of the edge map.
Annex 13. An image processing device according to any one of Annexes 10 to 12, wherein the boundary line determining means includes:
region obtaining means for obtaining the number of foreground pixels taking a region of a predetermined size as a unit on the obtained edge map, the number of foreground pixels obtained by counting foreground pixels in the edge map contained in the region of the predetermined size, and for selecting a region with the number of foreground pixels above a predetermined second threshold value;
candidate fitting means for fitting the foreground pixels contained in the selected region to obtain the candidates for the boundary line; and
feature representation obtaining means for obtaining, from the raw image, feature representations of regions of a specific width on two sides of each candidate for the boundary line adjacent to the candidate for the boundary line, for determining the difference between the feature representations of the regions on the two sides, and for selecting the candidate for the boundary line with the largest difference between the feature representations as the boundary line.
Annex 14. A scanner, including the image processing device according to any one of Annexes 10 to 13.
Annex 15. A program product, including machine-readable instruction codes which, when read and executed by a machine, perform the foregoing image processing method.
Annex 16. A storage medium, including the program product according to Annex 15.
Although the embodiments have been disclosed in the foregoing description, it shall be appreciated that those skilled in the art can devise various modifications, adaptations or equivalents of the embodiments without departing from the spirit and scope thereof. Such modifications, adaptations or equivalents shall also be construed as coming into the scope of the embodiments.