This application claims the priority benefit of Chinese Patent Application No. 201610037593.8, filed on Jan. 20, 2016 in the Chinese State Intellectual Property Office, the disclosure of which is incorporated herein in its entirety by reference.
1. Field
The disclosure relates to the field of image processing, and in particular to a method and a device for correcting a document image captured by an image pick-up device.
2. Description of the Related Art
In recent years, image pick-up devices have become very common. An image pick-up device may be integrated into a mobile phone, a personal computer or a tablet computer. People often capture large numbers of paper documents with their image pick-up devices to help them record information. Owing to factors such as the shooting angle, a captured document may undergo a perspective transformation, so that the information in the document image is hard for a human to read and even harder for a computer to read. For this reason, perspective correction of document images has attracted increasing attention.
Currently, there are some methods for correcting the captured document into a rectangular document. However, with these methods, the aspect ratio of the original document cannot be recovered from only one captured document image.
It is desired to provide a method and a device for correcting a document image captured by the image pick-up device conveniently.
Additional aspects and/or advantages will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the embodiments.
A brief summary of the disclosure is set forth below to provide a basic understanding of some aspects of the disclosure. It is to be understood that this summary is not exhaustive. It is intended neither to identify critical or important parts of the disclosure nor to define the scope of the disclosure. Its only purpose is to present some concepts in a simplified form as a prelude to the detailed description given later.
It is the major object of the present disclosure to provide a method for correcting a document image captured by an image pick-up device. The method includes: determining world coordinates of four vertices of the document image; calculating an original aspect ratio of the document image based on a correspondence between the world coordinates of the four vertices and projective coordinates of the four vertices in a projective space, and an intrinsic matrix and characteristics of an extrinsic matrix of the image pick-up device; determining a projective transformation matrix based on the world coordinates of the four vertices and the aspect ratio; and obtaining a corrected document image based on the determined projective transformation matrix and the document image.
In an aspect of the present disclosure, a device for correcting a document image captured by an image pick-up device is provided. The device includes a vertex coordinate determining unit, an aspect ratio calculating unit, a projective transformation matrix determining unit and a correcting unit. The vertex coordinate determining unit is configured to determine world coordinates of four vertices of the document image. The aspect ratio calculating unit is configured to calculate an original aspect ratio of the document image based on a correspondence between the world coordinates of the four vertices and projective coordinates of the four vertices in a projective space, and an intrinsic matrix and characteristics of an extrinsic matrix of the image pick-up device. The projective transformation matrix determining unit is configured to determine a projective transformation matrix based on the world coordinates of the four vertices and the aspect ratio. The correcting unit is configured to obtain a corrected document image based on the determined projective transformation matrix and the document image.
Further, a computer program for implementing the above method is provided in an embodiment of the disclosure.
Further, a computer program product is provided, at least in the form of a computer readable medium on which computer program code for implementing the above method is recorded.
These and other advantages of the disclosure will be more apparent through the detailed description of the preferred embodiments of the disclosure given in conjunction with the drawings.
The above and other objects, features and benefits of the disclosure will be understood more easily with reference to the description of the embodiments of the disclosure given in conjunction with the drawings. The components in the drawings are only for showing the principle of the disclosure. In the drawings, identical or similar technical features or components are represented by identical or similar numeral references.
Reference will now be made in detail to the embodiments, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. The embodiments are described below by referring to the figures.
Hereinafter, exemplary embodiments of the disclosure will be described in conjunction with the drawings. For clarity and conciseness, not all features of a practical embodiment are described in the specification. However, it is to be understood that, in developing any practical embodiment, many embodiment-specific decisions need to be made to achieve the developer's specific objectives, such as complying with system-related and service-related constraints, and these constraints vary from one embodiment to another. Moreover, it is to be understood that, although such development work may be complex and time-consuming, it is only a routine task for those skilled in the art having the benefit of the disclosure.
Here, it is to be noted that, in the drawings, only device structures and/or process steps closely related to the solution of the disclosure are shown, and other details less related to the disclosure are omitted, in order to avoid obscuring the disclosure with unnecessary details.
The disclosure provides a method for correcting a perspective transformation based on only one captured image and for recovering the original image with its original aspect ratio.
In the method according to the present disclosure, the input is a captured image of a rectangular document lying in a single plane, that is, a document image shot by an image pick-up device as shown in
The method and the device for correcting the document image captured by the image pick-up device according to the embodiments of the present invention are described in detail in conjunction with the drawings in the following. The description hereinafter is performed in the following order.
First, in step S202, the positions (that is, the world coordinates) in the image of the four vertices of the document image captured by the image pick-up device are determined.
As shown in
In the method according to the present disclosure, the world coordinates of the four vertices of the document image may also be inputted in advance as known parameters.
Next, in step S204, an original aspect ratio of the document image is calculated based on a correspondence between the world coordinates of the four vertices and projective coordinates of the four vertices in a projective space, and an intrinsic matrix and characteristics of an extrinsic matrix of the image pick-up device.
Specifically,
A shape of an original document shown in
It is assumed that a projective transformation matrix from a plane x1x2x3x4 to a plane m1m2m3m4 is H, then xi and mi satisfy a relationship in formula (1) as follows:
where H is a 3×3 matrix, xi and mi (i=1, 2, 3, 4) are 3×1 vectors, and si (i=1, 2, 3) is a real number coefficient.
The following formula can be obtained from the formula (1):
The matrix [x1, x2, x3] is invertible in a case that the aspect ratio r of the original document is not 0, then
Then the following formula can be obtained by substituting the formula (2) into the formula (1):
Since
Assuming that H=[h1 h2 h3], the following formula can be obtained from the formula (2):
Thus, following relationships can be obtained between h1, h2 and the aspect ratio r and the world coordinates mi:
In the formula (4), only r is unknown in h1 and h2 since mi is known and si can be calculated based on mi.
In another aspect, from the perspective of the parameters of the image pick-up device, the projective transformation H satisfies H=A·R, where A is an intrinsic matrix of the image pick-up device, and R is a rotation matrix (which is also referred to as an extrinsic matrix) of the image pick-up device.
If the i-th column of the rotation matrix R is represented by ri, then
$H = A \cdot R = A \cdot [r_1\ r_2\ r_3\ t]$ (5)
Based on the properties of the extrinsic matrix that $r_1^T \cdot r_2 = 0$ and $|r_1| = |r_2|$, the following formula (6) and formula (7) can be obtained.
$h_1^T A^{-T} A^{-1} h_2 = 0$ (6)
$h_1^T A^{-T} A^{-1} h_1 = h_2^T A^{-T} A^{-1} h_2$ (7)
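The step from the formula (5) to the formulas (6) and (7) can be spelled out as follows. This is a short worked derivation that uses only H=A·R, with h1 and h2 denoting the first two columns of H, and the two stated properties of r1 and r2; it introduces no new assumptions.

```latex
\begin{aligned}
h_1 &= A\,r_1, \quad h_2 = A\,r_2
  \;\Longrightarrow\; r_1 = A^{-1}h_1, \quad r_2 = A^{-1}h_2 \\
r_1^{T} r_2 = 0
  &\;\Longrightarrow\; (A^{-1}h_1)^{T}(A^{-1}h_2) = h_1^{T}A^{-T}A^{-1}h_2 = 0 \\
|r_1|^{2} = |r_2|^{2}
  &\;\Longrightarrow\; h_1^{T}A^{-T}A^{-1}h_1 = h_2^{T}A^{-T}A^{-1}h_2
\end{aligned}
```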
The aspect ratio r of the document can be obtained according to the formula (4) and the formula (7) in a case that the intrinsic matrix A is known.
The intrinsic matrix A of the image pick-up device is
where fx is the focal length of the image pick-up device along the horizontal axis in pixels, fy is the focal length of the image pick-up device along the vertical axis in pixels, and (x0, y0) is the coordinate of the principal point. According to EXIF (exchangeable image file format) information, in a case that the focal length of the image pick-up device is f, the resolution is w×h and the size of the sensor is a×b, then the intrinsic matrix A is:
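As an illustration of how the quantities named above fit together, the following is a minimal sketch. It assumes the standard pinhole conversion fx = f·w/a and fy = f·h/b with the principal point at the image centre (w/2, h/2); these expressions are an assumption of this sketch, since the matrix itself is not reproduced in the text here. The function name `intrinsic_matrix` and the sample values (a 4.0 mm lens on a 4.8 mm × 3.6 mm sensor) are hypothetical.

```python
def intrinsic_matrix(f_mm, width_px, height_px, sensor_w_mm, sensor_h_mm):
    """Build a 3x3 pinhole intrinsic matrix from EXIF-style values.

    Assumption: fx = f*w/a, fy = f*h/b, principal point at the image centre.
    """
    fx = f_mm * width_px / sensor_w_mm    # horizontal focal length in pixels
    fy = f_mm * height_px / sensor_h_mm   # vertical focal length in pixels
    x0 = width_px / 2.0                   # assumed principal point (centre)
    y0 = height_px / 2.0
    return [[fx, 0.0, x0],
            [0.0, fy, y0],
            [0.0, 0.0, 1.0]]


# Hypothetical example values; a real application would read them from EXIF.
A = intrinsic_matrix(4.0, 3264, 2448, 4.8, 3.6)
```

With these sample values both focal lengths come out at roughly 2720 pixels, which reflects square pixels on a 4:3 sensor.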
Then assuming that $h_1 = (h_{11}, h_{21}, h_{31})^T$ and $h_2 = \frac{1}{r}(h_{12}, h_{22}, h_{32})^T$, the following formula can be obtained from the formula (7):
In this way, the original aspect ratio r is calculated.
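The calculation can be illustrated with a short sketch. Because the closed-form expressions of formulas (2) to (4) and the result derived from formula (7) are not reproduced in the text here, the sketch below uses a well-known equivalent formulation from the whiteboard-scanning literature rather than the exact expressions of this disclosure: the directions of h1 and h2 are recovered from the four image vertices, and the equal-norm constraint of formula (7) then gives r. The vertex order (top-left, top-right, bottom-left, bottom-right), the function name `aspect_ratio` and a zero-skew intrinsic matrix are assumptions.

```python
import math


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def aspect_ratio(corners, fx, fy, x0, y0):
    """Estimate width/height of a planar rectangle from its four image vertices.

    corners: [(u, v)] in the assumed order top-left, top-right,
             bottom-left, bottom-right (pixel coordinates).
    fx, fy, x0, y0: entries of a zero-skew intrinsic matrix A.
    """
    m1, m2, m3, m4 = [(float(u), float(v), 1.0) for u, v in corners]
    # Relative projective depths of vertices 2 and 3 with respect to vertex 1.
    k2 = _dot(_cross(m1, m4), m3) / _dot(_cross(m2, m4), m3)
    k3 = _dot(_cross(m1, m4), m2) / _dot(_cross(m3, m4), m2)
    # n2 is proportional to width * h1, n3 to height * h2 (same scale factor).
    n2 = tuple(k2 * b - a for a, b in zip(m1, m2))
    n3 = tuple(k3 * b - a for a, b in zip(m1, m3))

    def a_inv(v):
        # Multiply by A^-1 for A = [[fx, 0, x0], [0, fy, y0], [0, 0, 1]].
        return ((v[0] - x0 * v[2]) / fx, (v[1] - y0 * v[2]) / fy, v[2])

    p2, p3 = a_inv(n2), a_inv(n3)
    # |A^-1 h1| = |A^-1 h2| implies width/height = |A^-1 n2| / |A^-1 n3|.
    return math.sqrt(_dot(p2, p2) / _dot(p3, p3))
```

The function needs only the four vertices and the intrinsic parameters, which matches the claim that a single captured image is sufficient when the intrinsic matrix is known.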
Next, in step S206, the projective transformation matrix is determined based on the world coordinates of the four vertices and the aspect ratio.
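One way to picture step S206 is the following sketch, which solves for a 3×3 projective transformation that maps the four detected vertices onto an upright rectangle whose width-to-height ratio equals r. It is a standard four-point linear solve and is not claimed to be the exact expression used in this disclosure. The output height `out_height`, the vertex order (top-left, top-right, bottom-left, bottom-right), the normalisation h33 = 1 and all function names are assumptions.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x


def homography(src, dst):
    """3x3 matrix (h33 fixed to 1) mapping four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def rectifying_homography(vertices, r, out_height):
    """Map the detected vertices onto a rectangle of size (r*out_height) x out_height."""
    out_width = r * out_height
    target = [(0.0, 0.0), (out_width, 0.0),
              (0.0, out_height), (out_width, out_height)]
    return homography(vertices, target)


def apply_homography(h, point):
    """Apply a 3x3 projective transformation to a 2D point."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

For example, `rectifying_homography(vertices, r, 200.0)` yields a matrix that sends the top-left vertex to (0, 0) and the bottom-right vertex to (200·r, 200).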
Finally, in step S208, a corrected document image may be obtained based on the determined projective transformation matrix and the captured document image.
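Step S208 can be sketched as an inverse-mapping warp: every pixel of the corrected image is mapped back through the inverse of the projective transformation matrix and the captured image is sampled there. This is a minimal illustration with nearest-neighbour sampling on an image stored as a list of rows; the function names and the sampling scheme are assumptions, and a practical implementation would normally use interpolation.

```python
def invert_3x3(m):
    """Invert a 3x3 matrix using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]


def warp(image, h_inv, out_w, out_h, fill=0):
    """Build the corrected image by inverse mapping.

    image: list of rows of pixel values (the captured image).
    h_inv: 3x3 matrix mapping a corrected pixel (u, v) to a source pixel (x, y).
    """
    src_h, src_w = len(image), len(image[0])
    out = [[fill] * out_w for _ in range(out_h)]
    for v in range(out_h):
        for u in range(out_w):
            w = h_inv[2][0] * u + h_inv[2][1] * v + h_inv[2][2]
            if w == 0:
                continue  # point at infinity; keep the fill value
            x = (h_inv[0][0] * u + h_inv[0][1] * v + h_inv[0][2]) / w
            y = (h_inv[1][0] * u + h_inv[1][1] * v + h_inv[1][2]) / w
            xi, yi = int(round(x)), int(round(y))
            if 0 <= xi < src_w and 0 <= yi < src_h:
                out[v][u] = image[yi][xi]  # nearest-neighbour sample
    return out
```

Given a projective transformation matrix `H` that maps captured-image coordinates to corrected-image coordinates, the corrected image would be obtained as `warp(image, invert_3x3(H), out_w, out_h)`.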
The following special cases arise when the method according to the present disclosure is applied to images shot with a camera module.
Some mobile phones have a square mode. For example, the resolution of a normal image is 3264×2448 and the resolution of an image shot in the square mode is 2448×2448, that is, the original image is cropped. In a case that the method according to the present disclosure is applied to an image shot in the square mode, it is only required to input the resolution of the image before cropping, that is, w=3264 and h=2448.
In addition, for an image shot in a zooming mode, the digital zoom ratio of the mobile phone can be read from the EXIF information, and the focal length f after zooming is obtained by multiplying the original focal length foriginal by the digital zoom ratio.
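The two special cases can be illustrated with a small sketch. It multiplies the original focal length by the digital zoom ratio and passes the resolution before cropping through unchanged for use in the intrinsic matrix. The function names and the sample focal length of 4.0 mm are hypothetical; treating a recorded ratio of 0 as "no digital zoom" follows the EXIF convention for the digital zoom ratio tag.

```python
def effective_focal_length(f_original_mm, digital_zoom_ratio):
    """Focal length after digital zoom: f = f_original * zoom ratio."""
    # EXIF records a ratio of 0 when digital zoom was not used; treat it as 1x.
    if not digital_zoom_ratio or digital_zoom_ratio <= 0:
        zoom = 1.0
    else:
        zoom = float(digital_zoom_ratio)
    return f_original_mm * zoom


def capture_parameters(f_original_mm, digital_zoom_ratio, full_resolution):
    """Return (f, w, h) for the intrinsic matrix.

    full_resolution is the resolution before cropping, e.g. (3264, 2448)
    even when the stored square-mode image is 2448 x 2448.
    """
    w, h = full_resolution
    return effective_focal_length(f_original_mm, digital_zoom_ratio), w, h


# Hypothetical example: 4.0 mm lens, 2x digital zoom, square-mode photo.
f, w, h = capture_parameters(4.0, 2.0, (3264, 2448))
```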
In the method according to the present disclosure, the aspect ratio of the original document can be recovered based on the geometry of the projective space and the properties of the image pick-up device. With the method according to the present disclosure, the document image can be corrected from only one captured image, which is convenient for users.
[2. A Device for Correcting a Document Image Captured by an Image Pick-Up Device]
As shown in
The vertex coordinate determining unit 502 is configured to determine world coordinates of four vertices of the document image.
The aspect ratio calculating unit 504 is configured to calculate an original aspect ratio of the document image based on a correspondence between world coordinates of the four vertices and projective coordinates of the four vertices in a projective space, and an intrinsic matrix and characteristics of an extrinsic matrix of the image pick-up device.
The projective transformation matrix determining unit 506 is configured to determine a projective transformation matrix based on the world coordinates of the four vertices and the aspect ratio.
The correcting unit 508 is configured to obtain a corrected document image based on the determined projective transformation matrix and the document image.
The edge detecting sub-unit 5022 is configured to detect an edge of the document image.
The binarizing sub-unit 5024 is configured to binarize the detected edge.
The coordinate determining sub-unit 5026 is configured to determine the world coordinates of the four vertices based on the binarized edge.
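The last of these sub-units can be pictured with a minimal sketch that picks four vertices out of an already binarized edge map. The corner-selection rule used here (extreme values of x+y and x−y over the edge pixels) is a simple heuristic chosen for illustration and is not stated in this disclosure; it assumes the document is roughly upright in the image. The function name `find_vertices` and the output order (top-left, top-right, bottom-left, bottom-right) are also assumptions.

```python
def find_vertices(binary_edges):
    """Pick four document vertices from a binarized edge map.

    binary_edges: list of rows, non-zero where an edge pixel was detected.
    Returns [(x, y)] in the order top-left, top-right, bottom-left, bottom-right.
    """
    points = [(x, y)
              for y, row in enumerate(binary_edges)
              for x, value in enumerate(row) if value]
    if not points:
        raise ValueError("no edge pixels found")
    top_left = min(points, key=lambda p: p[0] + p[1])
    bottom_right = max(points, key=lambda p: p[0] + p[1])
    top_right = max(points, key=lambda p: p[0] - p[1])
    bottom_left = min(points, key=lambda p: p[0] - p[1])
    return [top_left, top_right, bottom_left, bottom_right]
```

The returned coordinates are in pixel units with the origin at the top-left of the edge map.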
The projective transformation matrix H satisfies:
where H is a 3×3 matrix, mi and xi are 3×1 vectors, mi is a world coordinate of each of the four vertices, xi is a projective coordinate of each of the four vertices in the projective space, and si is a real number coefficient depending on mi.
Assuming that the projective transformation matrix is H=[h1 h2 h3], then relationships between h1, h2 and the aspect ratio r and the world coordinates mi are:
The intrinsic matrix A of the image pick-up device is:
where f is a focal length of the image pick-up device, w and h are resolutions, and a and b are sizes of a sensor.
The characteristics of the extrinsic matrix of the image pick-up device are as follows: $r_1^T \cdot r_2 = 0$ and $|r_1| = |r_2|$, where the extrinsic matrix is represented as $R = [r_1\ r_2\ r_3\ t]$.
A relationship between h1, h2 and the intrinsic matrix A is obtained as follows based on the intrinsic matrix A of the image pick-up device and the characteristics of the extrinsic matrix of the image pick-up device:
$h_1^T A^{-T} A^{-1} h_1 = h_2^T A^{-T} A^{-1} h_2$.
The aspect ratio r is determined based on the relationships between h1, h2 and the aspect ratio r and the world coordinates mi and the relationship between h1, h2 and the intrinsic matrix A.
The projective transformation matrix H is determined as follows based on the world coordinates of the four vertices and the aspect ratio:
For details of operation and functions of each part of the device 500 for correcting the document image captured by the image pick-up device, reference may be made to the embodiments of the method according to the present disclosure for correcting the document image captured by the image pick-up device described in conjunction with
It should be noted here that, the devices as shown in
A method and a device for correcting a document image captured by an image pick-up device are provided in the present disclosure. Compared with the conventional method, the present method has the following advantages.
The basic principle of the disclosure has been described above in conjunction with specific embodiments. However, it is to be understood by those skilled in the art that all or any of the steps or components of the method and apparatus of the disclosure may be implemented in hardware, firmware, software or a combination thereof in any computing apparatus (including a processor, a storage medium and the like) or in a network of computing apparatuses, which those skilled in the art can implement using their basic programming skills upon reading the description of the disclosure.
Thus, the objects of the disclosure may be implemented by a program or a group of programs running on any computing apparatus. The computing apparatus may be a well-known common apparatus. Thus, the objects of the disclosure may also be implemented by providing a program product containing program code for implementing the method or apparatus. That is to say, such a program product is also a part of the disclosure, and so is the storage medium in which such a program product is stored. Obviously, the storage medium may be any well-known storage medium or any storage medium that will be developed in the future.
In a case that the embodiment of the disclosure is implemented in software and/or firmware, programs constituting the software are installed onto a computer having a dedicated hardware structure, such as the general purpose computer 700 as shown in
In
Linked to the input/output interface 705 are: an input portion 706 (including the keyboard, the mouse and the like), an output portion 707 (including a display, such as a cathode ray tube (CRT) or a liquid crystal display (LCD), a speaker and the like), the storage portion 708 (including a hard disk and the like) and a communication portion 709 (including a network interface card, such as a LAN card, and a modem). The communication portion 709 performs communication processes via a network, such as the Internet. A drive 710 may also be linked to the input/output interface 705 as required. A removable medium 711, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, may be mounted on the drive 710 as required, so that the computer program read out from the removable medium 711 is installed onto the storage portion 708.
In a case that the embodiment of the disclosure is implemented in software, programs constituting the software are installed from a network, such as the Internet, or from a storage medium, such as the removable medium 711.
It is to be understood by those skilled in the art that, the storage medium is not limited to the removable medium 711 shown in
A program product having machine readable instruction codes stored therein is further proposed in the disclosure. The instruction codes, when read out and executed by a machine, perform the above method according to the embodiment of the disclosure.
Accordingly, the storage medium carrying the above program product having machine readable instruction codes stored therein is included in the disclosure. The storage medium includes but is not limited to a floppy disk, an optical disk, a magneto-optical disk, a storage card, a storage stick and the like.
It is to be understood by those of ordinary skill in the art that the listed embodiments are exemplary, and the disclosure is not limited thereto.
In the specification, expressions such as “a first”, “a second” and “an n-th” are meant to distinguish the described features literally, so as to describe the disclosure clearly. Thus, these expressions should not be considered as limitations.
As an example, various steps of the above method and various integral modules and/or units of the above apparatus may be implemented as software, firmware, hardware or a combination thereof, and may be used as a part of a corresponding apparatus. The various integral modules and units of the above apparatus, when being configured in a form of software, firmware, hardware or a combination thereof, may be implemented in a means or manner well-known to those skilled in the art, which is not described in detail here.
As an example, in a case of software or firmware, programs constituting the software are installed onto a computer having a dedicated hardware structure (such as the general purpose computer 700 as shown in
A feature described and/or illustrated for one embodiment in the above description of the specific embodiments of the disclosure may be applied in one or more other embodiments in a same or similar manner, may be combined with features in other embodiments, or may be used to replace features in other embodiments.
It is to be emphasized that the term “include/comprise” used herein refers to the presence of a feature, an element, a step or an assembly, but does not exclude the presence or addition of other features, elements, steps or assemblies.
Further, the method according to the disclosure is not limited to being performed in the chronological order described in the specification, and may also be performed sequentially, in parallel or separately. Thus, the order described herein in which the method is performed is not meant to limit the technical scope of the disclosure.
The disclosure and the advantages thereof have been described above. It is to be understood that various variations, alterations and transformations may be made without deviating from the spirit and scope of the disclosure defined in the appended claims. The scope of the disclosure is not limited to the specific embodiments of the process, device, means, method and step described in the specification. It can be understood by those of ordinary skill in the art from the disclosure that processes, devices, means, methods and steps that exist now or are to be developed in the future, and that perform substantially the same functions and obtain substantially the same results as the corresponding embodiments herein, can be used. Thus, the appended claims are intended to include such processes, devices, means, methods and steps in their scope.
It can be seen from the above illustration that, at least the following technical solutions are disclosed.
Appendix 1. A method for correcting a document image captured by an image pick-up device, comprising:
Appendix 2. The method according to appendix 1, wherein the determining world coordinates of four vertices of the document image comprises:
Appendix 3. The method according to appendix 1, wherein the projective transformation matrix H satisfies:
where H is a 3×3 matrix, mi and xi are 3×1 vectors, mi is a world coordinate of each of the four vertices, xi is a projective coordinate of each of the four vertices in the projective space, and si is a real number coefficient depending on mi.
Appendix 4. The method according to appendix 3, wherein when the projective transformation matrix is H=[h1 h2 h3], relationships between h1, h2 and the aspect ratio r and the world coordinate mi are:
Appendix 5. The method according to appendix 4, wherein the intrinsic matrix A of the image pick-up device is:
where f is a focal length of the image pick-up device, w and h are resolutions, and a and b are sizes of a sensor.
Appendix 6. The method according to appendix 5, wherein the characteristic of the extrinsic matrix of the image pick-up device is: $r_1^T \cdot r_2 = 0$ and $|r_1| = |r_2|$ in a case that the extrinsic matrix is represented as $R = [r_1\ r_2\ r_3\ t]$.
Appendix 7. The method according to appendix 6, wherein a relationship between h1, h2 and the intrinsic matrix A is obtained as follows based on the intrinsic matrix A of the image pick-up device and the characteristics of the extrinsic matrix of the image pick-up device:
$h_1^T A^{-T} A^{-1} h_1 = h_2^T A^{-T} A^{-1} h_2$.
Appendix 8. The method according to appendix 7, wherein the aspect ratio r is determined based on the relationships between h1, h2 and the aspect ratio r and the world coordinates mi and the relationship between h1, h2 and the intrinsic matrix A.
Appendix 9. The method according to appendix 8, wherein the projective transformation matrix H is determined as follows based on the world coordinates of the four vertices and the aspect ratio:
Appendix 10. A device for correcting a document image captured by an image pick-up device, comprising:
Appendix 11. The device according to appendix 10, wherein the vertex coordinate determining unit comprises:
Appendix 12. The device according to appendix 10, wherein the projective transformation matrix H satisfies:
where H is a 3×3 matrix, mi and xi are 3×1 vectors, mi is a world coordinate of each of the four vertices, xi is a projective coordinate of each of the four vertices in the projective space, and si is a real number coefficient depending on mi.
Appendix 13. The device according to appendix 12, wherein when the projective transformation matrix is H=[h1 h2 h3], relationships between h1, h2 and the aspect ratio r and the world coordinate mi are:
Appendix 14. The device according to appendix 13, wherein the intrinsic matrix A of the image pick-up device is:
where f is a focal length of the image pick-up device, w and h are resolutions, and a and b are sizes of a sensor.
Appendix 15. The device according to appendix 14, wherein the characteristic of the extrinsic matrix of the image pick-up device is: $r_1^T \cdot r_2 = 0$ and $|r_1| = |r_2|$ in a case that the extrinsic matrix is represented as $R = [r_1\ r_2\ r_3\ t]$.
Appendix 16. The device according to appendix 15, wherein a relationship between h1, h2 and the intrinsic matrix A is obtained as follows based on the intrinsic matrix A of the image pick-up device and the characteristics of the extrinsic matrix of the image pick-up device:
$h_1^T A^{-T} A^{-1} h_1 = h_2^T A^{-T} A^{-1} h_2$.
Appendix 17. The device according to appendix 16, wherein the aspect ratio r is determined based on the relationships between h1, h2 and the aspect ratio r and the world coordinates mi and the relationship between h1, h2 and the intrinsic matrix A.
Appendix 18. The device according to appendix 17, wherein the projective transformation matrix H is determined as follows based on the world coordinates of the four vertices and the aspect ratio:
Although a few embodiments have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the embodiments, the scope of which is defined in the claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
2016 1 0037593 | Jan 2016 | CN | national |
Number | Date | Country | |
---|---|---|---|
20170208207 A1 | Jul 2017 | US |