1. Field of the Invention
The present invention relates to an image processing apparatus, an image processing method, and a non-transitory computer-readable medium.
2. Description of the Related Art
Recently, there has been a proliferation of mobile terminals having an information processing function and a communication function, such as smartphones and tablets. These mobile terminals generally include cameras and have shooting functions (camera functions). There are increasing chances of shooting a document as a paper medium by using such a camera function and storing and using the shot image as image data in the memory of the mobile terminal.
When shooting paper documents with such a camera function, the user basically does not shoot from directly in front of the documents, and paper sheets are not held down and hence are not flat, unlike in the acquisition of images by a conventional scanner. For this reason, a shot document becomes a distorted image. In addition, the shooting range is not constant, and regions outside the document are often shot as well.
Under these circumstances, application software is provided that has a function of simultaneously performing distortion correction using a document frame and trimming processing, so that only the document range can easily be extracted and corrected in a mobile terminal or the like.
The above application software for performing distortion correction using a document frame and trimming processing is based on the premise that the user accurately designates the position of a document frame (for example, the four corners of a document) before distortion correction/trimming processing. For example, if a designated position is not accurate, end portion data existing in the original image before distortion correction/trimming processing is lost by distortion correction/trimming processing. If the user finds a problem like an end portion loss at the stage of checking after distortion correction/trimming processing, he/she needs to return to the document frame designation screen before distortion correction/trimming processing in order to redo the processing for fine adjustment of the designated position. This leads to an increase in the amount of rework including the designation of a document frame, the recalculation of a distortion amount, and re-correction of the overall image based on the distortion amount. In addition, the user needs to go back in the procedure. That is, this system is poor in usability.
On the other hand, Japanese Patent Laid-Open No. 2005-115711 has proposed a technique in which after the user is allowed to designate coordinate points at the four corners and distortion correction for an overall image is executed, a corrected image is recalculated by adjusting distortion parameters in a longitudinal direction with up and down cursors, and adjusting distortion parameters in a lateral direction with left and right cursors. This technique has an advantage of manually adjusting a distortion correction amount with a proper value after distortion correction and being capable of adjusting the distortion amount again. However, the step size of the proper value used for the adjustment of distortion correction is not decided based on a shooting target. In addition, every time the distortion amount is changed, the overall image is recalculated. For these reasons, fine adjustment is not easy to perform.
The present invention therefore provides a technique of facilitating fine adjustment of a document frame used for distortion correction or trimming processing after distortion correction/trimming processing, with a low recalculation load and high usability.
According to one aspect of the present invention, there is provided an image processing apparatus comprising: a specification unit configured to specify a plurality of points located on edges of an object region included in an image; a first calculation unit configured to calculate a first parameter for performing distortion correction from a plurality of points specified by the specification unit; a first correction unit configured to perform distortion correction of the object region by using the first parameter; a display unit configured to display an object region after correction by the first correction unit and a plurality of points located on edges of an object region after the correction; an adjustment unit configured to adjust, based on an instruction from a user, positions of the plurality of points displayed by the display unit; and a second correction unit configured to, when the adjustment unit adjusts the positions of the plurality of displayed points so as to extend the object region after the correction, perform distortion correction of the extension region by using the first parameter, wherein when the adjustment unit adjusts the positions of the plurality of displayed points so as to extend the object region after the correction, the display unit displays the object region after the correction and the extension region after the correction.
According to another aspect of the present invention, there is provided an image processing method comprising: a specification step of specifying a plurality of points located on edges of an object region included in an image; a first calculation step of calculating a first parameter for performing distortion correction from a plurality of points specified in the specification step; a first correction step of performing distortion correction of the object region by using the first parameter; a display step of displaying an object region after correction in the first correction step and a plurality of points located on edges of an object region after the correction; an adjustment step of adjusting, based on an instruction from a user, positions of the plurality of points displayed in the display step; and a second correction step of, when the positions of the plurality of displayed points are adjusted in the adjustment step so as to extend the object region after the correction, performing distortion correction of the extension region by using the first parameter, wherein when the positions of the plurality of displayed points are adjusted in the adjustment step so as to extend the object region after the correction, the object region after the correction and the extension region after the correction are displayed in the display step.
According to another aspect of the present invention, there is provided a non-transitory computer-readable medium storing a program for causing a computer to function as a specification unit configured to specify a plurality of points located on edges of an object region included in an image, a first calculation unit configured to calculate a first parameter for performing distortion correction from a plurality of points specified by the specification unit, a first correction unit configured to perform distortion correction of the object region by using the first parameter, a display unit configured to display an object region after correction by the first correction unit and a plurality of points located on edges of an object region after the correction, an adjustment unit configured to adjust, based on an instruction from a user, positions of the plurality of points displayed by the display unit, and a second correction unit configured to, when the adjustment unit adjusts the positions of the plurality of displayed points so as to extend the object region after the correction, perform distortion correction of the extension region by using the first parameter, wherein when the adjustment unit adjusts the positions of the plurality of displayed points so as to extend the object region after the correction, the display unit updates display by using the object region after the correction and the extension region after the correction.
It is possible to provide a distortion correction/trimming correction function with high usability.
Further features of the present invention will become apparent from the following description of exemplary embodiments (with reference to the attached drawings).
The embodiments of the present invention will be described below with reference to the accompanying drawings.
Note that the present invention is not limited to a mobile terminal and can be applied to an apparatus which performs distortion correction/trimming processing for the document image shot by an apparatus having a camera function. For example, the present invention may be applied to a system with any of the arrangements of a smartphone, cellular phone, notebook PC, desktop PC, and digital camera.
In step S501, the CPU 301 activates the camera 104 to shoot the document 201 or reads the image obtained by shooting the document 201, which is stored in the RAM 302. If the resolution of the read image is higher than a predetermined value, the resolution conversion unit 403 may convert the resolution into an arbitrary lower resolution. In step S502, the CPU 301 extracts edges from the read image. For example, the CPU 301 performs edge extraction by performing image processing using an edge detection algorithm based on the Canny method. Note that the present invention can use a known edge extraction method and is not limited to any specific method. It is possible to omit edge extraction in this step and acquire information concerning edges or four corner points directly designated by the user.
In step S503, the CPU 301 detects and specifies document frame points (corners) at the four corners located on the edges from the extracted edge information. If the detection accuracy is insufficient or no edge can be detected, the operation unit 308 which has received a user instruction via the display unit 102 capable of touch operation performs coordinate adjustment of document frame points (corners) at the four corners.
In step S504, the CPU 301 sets information concerning a mapping destination after projective transformation of the document frame points (corners) decided in step S503.
In addition to the setting of the output size 702, a margin 706 is set in a temporary buffer setting 705 used for mapping setting for a temporary buffer (the RAM 302 in this case). The temporary buffer setting 705 is used as margin information for deciding margins based on a shooting target. This function is one of the features of the present invention. With the margin 706, the user selects an arbitrary designated ratio with respect to an output size from the list. Although described in detail later, providing a margin region around a document region in the temporary buffer makes it possible to easily handle document frame adjustment after projective transformation. Note that in this embodiment, the same set value is used with respect to an output size in the vertical and horizontal directions. However, the present invention is not limited to this. For example, with the margin 706, it is possible to make different settings in the vertical and horizontal directions.
In this embodiment, the size of the temporary buffer and the mapping destinations of the document frame points (corners) P1 to P4 are determined by using the output size and margin amount set on the mapping destination setting screen 701. Note that a checkbox 703 is used to instruct, upon adjustment of document frame points (lines) (to be described later), whether to set the range after the adjustment as an output size. This operation will be described in detail later. Other settings of the margin 706 and settings of a folding process 704 will be described in the second and third embodiments. In the first embodiment, the mapping destination setting screen 701 shown in
In step S505, the CPU 301 causes the projective transformation coefficient calculation unit 401 to calculate projective transformation coefficients to be stored in the temporary buffer based on the coordinate information of the document frame points (corners) P1 to P4 decided in step S503 and the mapping settings decided in step S504. Note that each step after step S505 will be described in detail after the description of this procedure.
In step S506, the CPU 301 executes projective transformation (distortion correction/trimming processing) for the image within the document frame points (corners) on the preview image by using the projective transformation coefficients calculated by the projective transformation unit 402 in step S505.
In step S507, the CPU 301 updates the display content on the display unit 102 into a preview image based on the result of performing distortion correction/trimming processing by projective transformation in step S506. Note that the display on the display unit 102 is executed by storing the display range in the temporary buffer into the display buffer (not shown) of the display unit 306. Note that if the display unit 102 differs in resolution from the display range in the temporary buffer, the resolution conversion unit 403 performs resolution conversion to convert the resolution into the one matching the display unit 102. Thereafter, the display unit 102 performs display. This operation will be described in detail with reference to
In step S508, the CPU 301 determines whether document frame points are confirmed. If it is determined that the image having undergone distortion correction/trimming processing by projective transformation displayed on the display unit 102 is the one desired by the user and document frame points are determined based on user instructions (YES in step S508), the process advances to step S511. If it is determined that document frame points are not confirmed, upon receiving, for example, a user instruction to perform fine adjustment in accordance with the occurrence of an end portion loss or the like caused by trimming (NO in step S508), the process advances to step S509.
In step S509, the CPU 301 adjusts the document frame points on the image after correction, whose display has been updated in step S507, based on user instructions and the like, thereby avoiding image loss arising from trimming and enabling fine adjustment of the distortion correction position.
In step S510, when the document frame points are adjusted from the original positions toward the extension side in step S509, the CPU 301 performs projective transformation with respect to the regions secured as margins by using the projective transformation coefficients obtained by the projective transformation unit 402 in step S505. The process then returns to step S507, in which the CPU 301 updates the display. Note that projective transformation for the region within the document frame points in step S506 is defined as the first correction, and projective transformation with respect to an extension region (margin region) in step S510 is defined as the second correction. The corrected image obtained by the first correction is combined with the image obtained by the second correction to update the display by using the composite image. Note that when the document frame points are adjusted from the original positions toward the reduction side in step S509, the image of the reduced portion may be deleted from the corrected image obtained by the first correction to update the display in step S507.
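The display-update rule above can be sketched as follows. This is a minimal illustration (the function and variable names are assumptions, not the patent's code): the first-correction image covers the document region, and when frame points are moved outward, only the newly exposed pixels obtained by the second correction are merged in.

```python
# Hypothetical sketch: composite the first-correction image with the
# second-correction (extension) pixels. The mask marks which pixels
# belong to the extension (margin) region and should come from the
# second correction; all other pixels keep the first-correction value.
def composite(first, second, extension_mask):
    return [[s if m else f
             for f, s, m in zip(f_row, s_row, m_row)]
            for f_row, s_row, m_row in zip(first, second, extension_mask)]
```

Because only the masked pixels are replaced, the first-correction result never needs to be recalculated when the frame points are adjusted.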
If the document frame points are determined (YES in step S508), the CPU 301 causes the projective transformation coefficient calculation unit 401 to calculate projective transformation coefficients in accordance with the adjustment result on the document frame points, thereby updating the projective transformation coefficients in step S511. Note that the projective transformation coefficients calculated in step S505 are defined as the first parameters, and the projective transformation coefficients calculated in step S511 are defined as the second parameters.
In step S512, the CPU 301 performs projective transformation for generating a final output image with respect to the document region desired by the user which is included in the read image by using the projective transformation coefficients updated by the projective transformation unit 402 in step S511.
[Calculation of Projective Transformation Coefficients]
The projective transformation coefficient calculation in step S505 will be described in detail. In this embodiment, projective transformation coefficients are parameters for performing distortion correction for a document region.
Coordinate data can represent coordinates before and after transformation in the following manner. In this case, let P be coordinates before transformation, and P′ be coordinates after transformation.
P1(x1, y1) → P1′(x1′, y1′) = (LeftMargin, TopMargin)
P2(x2, y2) → P2′(x2′, y2′) = (LeftMargin + tmpWidth, TopMargin)
P3(x3, y3) → P3′(x3′, y3′) = (LeftMargin + tmpWidth, TopMargin + tmpHeight)
P4(x4, y4) → P4′(x4′, y4′) = (LeftMargin, TopMargin + tmpHeight)
As described above, a feature of the present invention is that a margin region is provided on the assumption that information outside the frame is calculated, while four points as document frame points (corners) are designated, and a projective transformation matrix is calculated by using a mapping relationship. Assume that a margin region indicates a region in a predetermined range from each edge of the document to an outside.
Note that a document region (tmpWidth, tmpHeight) in the temporary buffer corresponds to an aspect ratio based on “A4 size” set with the output size 702. In addition, since this region is a preview display in the process of deciding a final projective transformation matrix, a resolution of about 72 dpi is sufficient. In this case,
(tmpWidth, tmpHeight) = (595, 847) [pix]
In addition, it is possible to obtain a margin amount in accordance with “10% margin of output size” set with the margin 706. Assume that in this case, the margin amount is a value which is obtained by dropping the fractional portion and is divisible by 2.
TopMargin = BottomMargin = 0.10*tmpHeight ≈ 84 [pix]
LeftMargin = RightMargin = 0.10*tmpWidth ≈ 58 [pix]
As described above, the size of the temporary buffer is given by
(ExtensionWidth, ExtensionHeight) = (tmpWidth + LeftMargin + RightMargin, tmpHeight + TopMargin + BottomMargin)
= (595 + 58*2, 847 + 84*2) = (711, 1015) [pix]
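The margin and buffer-size computation above can be sketched in a few lines. This is an illustrative sketch with assumed function names, not the patent's implementation; it reproduces the stated rules (drop the fractional portion, keep the margin divisible by 2).

```python
# Hypothetical sketch of the temporary-buffer sizing described above:
# margins are a ratio of the output size, truncated and made even.
def buffer_geometry(tmp_width, tmp_height, margin_ratio=0.10):
    def margin(extent):
        m = int(margin_ratio * extent)  # drop the fractional portion
        return m - (m % 2)              # keep the value divisible by 2
    left = right = margin(tmp_width)
    top = bottom = margin(tmp_height)
    extension = (tmp_width + left + right, tmp_height + top + bottom)
    return extension, left, top

# A4 preview at 72 dpi with a 10% margin:
# buffer_geometry(595, 847) -> ((711, 1015), 58, 84)
```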
In addition, mapping coordinates P1′, P2′, P3′, and P4′ after projective transformation are given by
P1′(x1′, y1′) = (LeftMargin, TopMargin) = (58, 84)
P2′(x2′, y2′) = (LeftMargin + tmpWidth, TopMargin) = (653, 84)
P3′(x3′, y3′) = (LeftMargin + tmpWidth, TopMargin + tmpHeight) = (653, 931)
P4′(x4′, y4′) = (LeftMargin, TopMargin + tmpHeight) = (58, 931)
Since the coordinates of the document frame points (corners) before projective transformation have been decided in step S503, projective transformation coefficients (planar projective transformation coefficients) can be obtained from four sets of coordinates before and after projective transformation. For example, when the read image has 1200×1600 [pix], document frame point coordinates are given below
P1(x1,y1)=(200,50)
P2(x2,y2)=(1100,200)
P3(x3,y3)=(50,1400)
P4(x4,y4)=(1150,1500)
Note that in order to obtain projective transformation coefficients, z-axis information is added to the coordinate data to handle P(x, y) as P(x, y, 1), and calculation is performed as follows:
s·P′ = H·P
where s is a scale, H is a homography matrix, P is pre-projective transformation coordinates, and P′ is post-projective transformation coordinates.
In this case, although intermediate expressions are omitted, equation (2) is arranged into equation (3).
In this case, solving the simultaneous equations by substituting the coordinate information of four sets of corresponding points will obtain coefficients of H. More specifically, the following state is obtained by substituting the information of the four sets of corresponding points.
When the resultant data are arranged for each matrix, A·H = B, where H is the unknown quantity. Applying the inverse matrix of A to both sides of the equation will obtain
H = A⁻¹·B
The following is an example of obtaining H by arranging the above data and mathematical expressions.
Therefore, obtaining H = A⁻¹·B will produce
The inverse matrix of H is obtained and substituted into
P = H⁻¹·P′
This makes it possible to obtain information indicating the pixel values of specific coordinates of the preview image which should be referred to for the pixel values of all the coordinates constituting the temporary buffer in
Letting P be pre-projective transformation coordinates and P′ be post-projective transformation coordinates, a homography matrix H1 of P′=H1·P is obtained.
Letting P′ be pre-projective transformation coordinates and P be post-projective transformation coordinates, a homography matrix H2 of P=H2·P′ is obtained.
The viewpoint of method 1 described above is changed to that of method 2 to back-calculate the coordinates of P from P′. This obviates the necessity to obtain the inverse matrix required for back calculation in method 1. In the above description, the expressions to be used are switched. However, substituting the coordinate data of P into P′, and the coordinate data of P′ into P, will obviate the necessity to change the expression to be used. Table 3 corresponds to Table 1, and Table 4 corresponds to Table 2.
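Although the text derives H via A·H = B and matrix-inversion notation, the same coefficients can be computed with a small linear solver. The following is an illustrative sketch (function and variable names are assumptions), using the method-2 convention: H2 is solved so that it maps post-transformation coordinates P′ directly back to pre-transformation coordinates P, avoiding the inverse-matrix step.

```python
# Hypothetical sketch: solve for a 3x3 homography H with dst = H * src
# from four point correspondences, fixing h22 = 1, via plain
# Gauss-Jordan elimination on the resulting 8x8 linear system.
def solve_homography(src, dst):
    # Rows follow X = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), likewise for Y.
    A, b = [], []
    for (x, y), (X, Y) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -X * x, -X * y]); b.append(X)
        A.append([0, 0, 0, x, y, 1, -Y * x, -Y * y]); b.append(Y)
    n = 8
    M = [row + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for c in range(n):                          # elimination with partial pivoting
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * q for a, q in zip(M[r], M[c])]
    h = [M[i][n] / M[i][i] for i in range(n)] + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def apply_homography(H, pt):
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

# Method 2: H2 maps P' (post-transformation) back to P (pre-transformation),
# so no inverse matrix is needed when looking up source pixels.
P_pre  = [(200, 50), (1100, 200), (50, 1400), (1150, 1500)]
P_post = [(58, 84), (653, 84), (653, 931), (58, 931)]
H2 = solve_homography(P_post, P_pre)   # H2: P' -> P
```

Applying `apply_homography(H2, ...)` to each of the four mapped corners recovers the original document frame point coordinates, which is exactly the back calculation method 2 performs per destination pixel.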
Therefore, obtaining H = A⁻¹·B will produce
[Projective Transformation within Document Frame Points]
The projective transformation within the document frame points in step S506 will be described in detail below. The respective pixel values in the temporary buffer in
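The per-pixel lookup this section describes can be sketched as an inverse-warping loop. This is a hypothetical illustration (names assumed, nearest-neighbor sampling chosen for brevity): for every coordinate in the temporary buffer, the backward homography H2 of method 2 gives the source coordinate in the read image whose pixel value should be referred to.

```python
# Hypothetical sketch: fill the temporary buffer by back-calculating,
# for each destination pixel, the source coordinate via H2 (P' -> P)
# and sampling the read image with nearest-neighbor lookup.
def warp(src_img, src_w, src_h, H2, dst_w, dst_h, background=255):
    dst = [[background] * dst_w for _ in range(dst_h)]
    for dy in range(dst_h):
        for dx in range(dst_w):
            w = H2[2][0] * dx + H2[2][1] * dy + H2[2][2]
            sx = (H2[0][0] * dx + H2[0][1] * dy + H2[0][2]) / w
            sy = (H2[1][0] * dx + H2[1][1] * dy + H2[1][2]) / w
            ix, iy = int(round(sx)), int(round(sy))
            # coordinates falling outside the read image keep the background value
            if 0 <= ix < src_w and 0 <= iy < src_h:
                dst[dy][dx] = src_img[iy][ix]
    return dst
```

Because each destination pixel is computed independently, the same loop can later be run over only the extension (margin) pixels, which is what makes the second correction in step S510 cheap.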
[Display Update]
The display update in step S507 will be described in detail below.
[Document Frame Point Adjustment]
The document frame point adjustment in step S509 will be described in detail below.
The user operates the return button 1011 to issue an instruction to return to the immediately preceding step. When the user issues an instruction by operating the return button 1011, the processing is redone from the processing (step S503) of adjusting the positions of the document frame points (corners) in a state before projective transformation. The screen shown in
The user operates the document frame point confirm button 1013 to issue an instruction to confirm document frame points on a preview image and execute projective transformation with a formal image size by using the document frame point information. The user operates the cancel button 1014 to issue an instruction to cancel projective transformation of a target image.
The adjustment of the document frame points shown in
[Projective Transformation of Extension Region (Margin Region)]
The projective transformation of an extension region in step S510 will be described in detail below. The tile pattern and hatched portions in
Obtaining projective transformation coefficients by using a buffer having a margin region in this manner makes it possible, when performing frame point adjustment on an image after projective transformation, to process only the portion for which an instruction to newly perform display is issued, by using the same projective transformation coefficients (that is, the projective transformation coefficients used when the first projective transformation is performed). This makes it possible to handle such an instruction with high responsiveness.
The process then returns to step S507 again to update the display. At this time, the display range changes in accordance with the setting of the checkbox 703 in
The processing to be performed when the document frame points are confirmed in step S508 will be described in detail next.
[Update of Projective Transformation Coefficients]
The update of the projective transformation coefficients in step S511 will be described in detail below. When a document frame point is adjusted in step S509, the coordinates of the adjusted document frame point (corner) before projective transformation are obtained, and projective transformation coefficients are recalculated by using the coordinate point. Note that the mapping destination coordinates and the output image size are changed in accordance with document frame points (lines) and the ON/OFF state of the checkbox 703.
If the checkbox 703 is ON, the output image size is changed in the following manner. When, for example, the document frame point (line) LP4 is adjusted by lp4_ax in the x-axis direction, the scale is adjusted to set tmpWidth+lp4_ax as an output size. In addition, the margin amount is adjusted by using the scale.
scaleW = tmpWidth/(tmpWidth + lp4_ax)
LeftMargin = lp4_ax*scaleW
TopMargin = BottomMargin = RightMargin = 0
tmpWidth = tmpWidth*scaleW
(dstWidth, dstHeight) = (tmpWidth + LeftMargin + RightMargin, tmpHeight + TopMargin + BottomMargin)
P1′(x1′, y1′) = (LeftMargin, TopMargin)
P2′(x2′, y2′) = (LeftMargin + tmpWidth, TopMargin)
P3′(x3′, y3′) = (LeftMargin + tmpWidth, TopMargin + tmpHeight)
P4′(x4′, y4′) = (LeftMargin, TopMargin + tmpHeight)
If the checkbox 703 is OFF, calculation is performed with the scale being 1 (S=1), and the output size varies in accordance with the adjustment of document frame points.
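The checkbox-ON rescaling above can be illustrated numerically. The following sketch uses assumed names and covers only the x-direction case described for LP4: extending the frame line by lp4_ax keeps the configured output width by scaling the document region down and converting the extension into a left margin.

```python
# Hypothetical sketch of the checkbox-703-ON case: the output width is
# preserved by rescaling the document region with scaleW and turning
# the lp4_ax extension into a (scaled) left margin.
def rescale_after_line_adjust(tmp_width, lp4_ax):
    scale_w = tmp_width / (tmp_width + lp4_ax)
    left_margin = lp4_ax * scale_w
    new_tmp_width = tmp_width * scale_w
    dst_width = new_tmp_width + left_margin  # right margin is 0 in this case
    return scale_w, left_margin, new_tmp_width, dst_width
```

For example, with tmpWidth = 595 and lp4_ax = 50, dstWidth evaluates back to exactly 595, so the configured output size is preserved regardless of how far the line is extended.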
[Projective Transformation of Output Data]
The projective transformation of the output data in step S512 will be described in detail below. Projective transformation is executed for a region decided as a read image output target (the region within the document frame points) by using the final projective transformation coefficients obtained in step S511.
As described above, in the first embodiment, it is possible to easily adjust document frame points on an image after projective transformation by ensuring a buffer having a margin region and using projective transformation coefficients.
The first embodiment has exemplified the method of facilitating the adjustment of document frame points on an image after projective transformation by ensuring a buffer having a margin region and using projective transformation coefficients. The second embodiment has a feature that when a document is a folded document and projective transformation is applied to the document upon division, it is possible to apply document frame point adjustment to an image after projective transformation by applying projective transformation upon also dividing a margin region.
The following is a case in which twofold is selected from a list of folding processes 704 in
Referring to
Pixel values within the surface A′ including the margin region are calculated by using Ha, and pixel values within the surface B′ are calculated by using Hb. With this operation, pixel values in the margin region are calculated in accordance with projective transformation coefficients on the respective surfaces with respect to a document divided into regions, thereby easily coping with the adjustment of document frame points corresponding to an image after projective transformation. Note that document frame points (fold) can be adjusted in the same manner as for document frame points (corners). Note, however, that the manner of dividing the temporary buffer into regions is not basically changed regardless of the adjustment of document frame points (fold).
If a region is to be changed by the adjustment of document frame points (fold), virtual points VP1 and VP2 are set by extending the straight line connecting points OP1′ and OP2′ as the mapping destinations of document frame points (fold). The overall temporary buffer including the margin region is divided into the surfaces A′ and B′ at the straight line connecting VP1 and VP2.
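The division of the temporary buffer at the fold line can be sketched with a simple side-of-line test. This is an illustrative sketch (names assumed): each buffer pixel is assigned to surface A′ or B′ according to the sign of the 2D cross product against the line through VP1 and VP2, and then warped with that surface's coefficients (Ha or Hb).

```python
# Hypothetical sketch: classify a temporary-buffer pixel onto surface A'
# or B' by which side of the fold line (through VP1 and VP2) it lies on.
def on_surface_a(vp1, vp2, pt):
    (x1, y1), (x2, y2), (x, y) = vp1, vp2, pt
    # sign of the 2D cross product (VP2-VP1) x (pt-VP1)
    return (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1) >= 0
```

Pixels for which this predicate holds would be calculated with Ha, and the remaining pixels with Hb, so the margin region is divided by the same line as the document region.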
In the second embodiment, a temporary buffer including a margin region is divided into regions, and pixel values belonging to the respective regions are calculated by using projective transformation coefficients obtained on the respective region surfaces. This makes it possible to adjust document frame points even on a folded document after projective transformation, which has a plurality of projective transformation matrices. Note that in this embodiment, it is determined, based on the set value of the folding process 704, whether division is performed. However, the present invention is not limited to this. For example, edges may be extracted to determine, from the state of the edges, whether to perform division.
The first embodiment has exemplified the method of facilitating the adjustment of document frame points on an image after projective transformation by ensuring a buffer having a margin region and using projective transformation coefficients. In this case, a margin region is decided by a ratio to an output size.
The third embodiment has a feature that a margin region is decided from the resolution of a display image.
When visually adjusting document frame points on a display unit 102, errors vary greatly depending on the scale at which the display image is displayed. In this embodiment, a maximum scale is obtained from a resolution dispimg_resolution of the display image by frame line projective transformation, and a margin is decided from the scale.
If, for example, it is assumed that a visual error of about 10 mm occurs on a display image of dispimg_resolution=72[dpi], an error of about (10 [mm]/25.4 [mm])*72 [dot]=28 [pix] is assumed.
When performing projective transformation like that shown in
S12 = tmpdispWidth/|L12|
S23 = tmpdispHeight/|L23|
S34 = tmpdispWidth/|L34|
S14 = tmpdispHeight/|L14|
The size of the margin region is then given as 28 [pix]×Smax, where Smax is the largest scale of the above scales.
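The third embodiment's margin sizing can be sketched as follows. This is an illustrative sketch with assumed names: the assumed 10 mm visual error is converted to pixels at the display resolution, then multiplied by the largest of the four frame-line scale factors.

```python
# Hypothetical sketch of the display-resolution-based margin: convert the
# assumed visual error (mm) to pixels, then scale by Smax, the largest of
# the four frame-line scales S12, S23, S34, S14.
def display_margin(dispimg_resolution, tmp_w, tmp_h,
                   len12, len23, len34, len14, error_mm=10.0):
    err_pix = int((error_mm / 25.4) * dispimg_resolution)  # e.g. 28 pix at 72 dpi
    s_max = max(tmp_w / len12, tmp_h / len23, tmp_w / len34, tmp_h / len14)
    return err_pix * s_max
```

When the displayed frame lines are already at the buffer's scale (all four scales equal 1), this reduces to the 28-pixel figure computed above; longer scales enlarge the margin proportionally.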
With the above operation, a margin region can be decided in accordance with a display state. This makes it possible to prevent a wastefully large margin region from being ensured or an adjustment range from being insufficient because of a lack in margin.
Embodiment(s) of the present invention can also be realized by a computer of a system or apparatus that reads out and executes computer executable instructions (e.g., one or more programs) recorded on a storage medium (which may also be referred to more fully as a ‘non-transitory computer-readable storage medium’) to perform the functions of one or more of the above-described embodiment(s) and/or that includes one or more circuits (e.g., application specific integrated circuit (ASIC)) for performing the functions of one or more of the above-described embodiment(s), and by a method performed by the computer of the system or apparatus by, for example, reading out and executing the computer executable instructions from the storage medium to perform the functions of one or more of the above-described embodiment(s) and/or controlling the one or more circuits to perform the functions of one or more of the above-described embodiment(s). The computer may comprise one or more processors (e.g., central processing unit (CPU), micro processing unit (MPU)) and may include a network of separate computers or separate processors to read out and execute the computer executable instructions. The computer executable instructions may be provided to the computer, for example, from a network or the storage medium. The storage medium may include, for example, one or more of a hard disk, a random-access memory (RAM), a read only memory (ROM), a storage of distributed computing systems, an optical disk (such as a compact disc (CD), digital versatile disc (DVD), or Blu-ray Disc (BD)™), a flash memory device, a memory card, and the like.
While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.
This application claims the benefit of Japanese Patent Application No. 2013-241271, filed Nov. 21, 2013, which is hereby incorporated by reference herein in its entirety.