This application is based upon and claims the benefit of priority from Japanese patent application No. 2020-053665 filed on Mar. 25, 2020, the disclosure of which is incorporated herein in its entirety by reference.
The present invention relates to an image processing device, an image processing method, and a program.
There is a demand for a technology that recognizes, with higher accuracy, the state of a recognition object captured in an image. Japanese Unexamined Patent Application, First Publication No. 2009-75848 (hereinafter referred to as Patent Document 1) discloses, as a related technology, a technology for reliably reading an indicated value of a meter.
It is necessary to properly specify a feature range in an image to recognize a recognition object with high accuracy.
Therefore, an example object of the present invention is to provide an image processing device, an image processing method, and a program capable of solving the problem described above.
According to a first example aspect of the present invention, an image processing device includes: at least one memory configured to store instructions; and at least one processor configured to execute the instructions to: calculate a feature amount of each of feature amount calculation ranges of a plurality of subdivisions set in a processing target range of a captured image obtained by capturing an image of a recognition object; and specify a plurality of feature amount calculation ranges capable of constituting vertices of a polygon among the feature amount calculation ranges as feature ranges based on the feature amount.
According to a second example aspect of the present invention, an image processing method includes: calculating a feature amount of each of feature amount calculation ranges of a plurality of subdivisions set in a processing target range of a captured image obtained by capturing an image of a recognition object; and specifying a plurality of feature amount calculation ranges capable of constituting vertices of a polygon among the feature amount calculation ranges as feature ranges based on the feature amount.
Moreover, in a third example aspect of the present invention, a non-transitory computer-readable recording medium stores a program which causes a computer of an image processing device to execute: calculating a feature amount of each of feature amount calculation ranges of a plurality of subdivisions set in a processing target range of a captured image obtained by capturing an image of a recognition object; and specifying a plurality of feature amount calculation ranges capable of constituting vertices of a polygon among the feature amount calculation ranges as feature ranges based on the feature amount.
According to some example embodiments of the present invention, it is possible to properly specify a feature range in an image and thereby recognize a recognition object with high accuracy.
In the following description, an image processing device according to an example embodiment of the present invention will be described with reference to the drawings.
The image processing device 1 provides the functions of a control unit 11, a processing target range specifying unit 12, a feature range specifying unit 13, an image transformation unit 14, and a recognition processing unit 15 by executing an image processing program.
The control unit 11 controls other functional units.
The processing target range specifying unit 12 specifies a processing target range related to a recognition object in an image obtained by capturing an image of the recognition object.
The feature range specifying unit 13 specifies a plurality of feature ranges that can form vertices of a polygon when the feature ranges are regarded as points on the basis of a feature amount (feature, feature value) of feature ranges included in the processing target range.
The image transformation unit 14 performs a projective transformation that brings the image of the recognition object closer to a regular image captured from in front of the recognition object.
The recognition processing unit 15 performs recognition processing for a state of the recognition object using a result of the projective transformation on the image obtained by capturing the recognition object.
In the present example embodiment, an example in which the image processing device 1 is a mobile terminal has been described; however, the image processing device 1 may also be a PC, a computer server, or the like. In that case, an image capturing device may generate the captured image of the recognition object 2 described below, and the PC or computer server may acquire the captured image from the image capturing device and perform the following processing. The details of the processing of the image processing device are described below.
First, a user operates the image processing device 1 and captures an image of the recognition object 2. The camera 106 of the image processing device 1 generates a captured image in a range including the recognition object 2 on the basis of the image capturing operation of the user, and records it in a storage unit such as the SSD 104 (step S101). The user instructs the image processing device 1 to start image processing of the recognition object 2. Then, the control unit 11 reads the captured image of the recognition object 2 and outputs the captured image to the processing target range specifying unit 12.
The processing target range specifying unit 12 acquires the captured image (step S102). Here, the processing target range specifying unit 12 may acquire a captured image obtained by the image transformation unit 14 first performing a projective transformation (first projective transformation) of this captured image into an image of the recognition object 2 as visually recognized from the front. For example, a square mark is typed or printed on the recognition object 2. In this case, a new captured image may be obtained by generating a projective transformation matrix such that the shape of the mark shown in the captured image becomes square, and performing the projective transformation on the captured image using this projective transformation matrix. This projective transformation matrix is calculated by, for example, a known calculation method for a homography transformation matrix using the deviations or correlation values between the four coordinate points of the corners of the mark shown in the captured image and the four predetermined coordinate points of the corresponding corners of the mark.
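As a minimal sketch of this first projective transformation, the following Python code (using OpenCV; the helper name rectify_by_mark and the side length mark_side_px are hypothetical, and detection of the mark's corners is assumed to be done elsewhere) maps the four detected corners of the mark to an ideal square and warps the captured image with the resulting homography. The patent does not prescribe a specific implementation.

```python
import cv2
import numpy as np

def rectify_by_mark(captured, mark_corners_px, mark_side_px=100):
    """Warp the captured image so the square mark becomes a true square.

    mark_corners_px: 4x2 array of the mark's corner coordinates detected
    in the captured image, ordered top-left, top-right, bottom-right,
    bottom-left (detection is assumed to be done elsewhere).
    """
    src = np.asarray(mark_corners_px, dtype=np.float32)
    # Destination: the corners of an ideal square of a known side length.
    s = float(mark_side_px)
    dst = np.array([[0, 0], [s, 0], [s, s], [0, s]], dtype=np.float32)
    # Projective transformation matrix (homography) from the 4 point pairs.
    H = cv2.getPerspectiveTransform(src, dst)
    h, w = captured.shape[:2]
    return cv2.warpPerspective(captured, H, (w, h))
```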
The regular image is an image of the recognition object 2 captured from the front. It is assumed that the captured image and the regular image are images obtained by capturing the image of the recognition object 2 from positions having substantially the same distances from the recognition object 2. In this case, a size of the recognition object 2 shown in the captured image and a size of the recognition object 2 shown in the regular image are substantially the same.
The processing target range specifying unit 12 specifies a processing target range of the recognition object 2 shown in the captured image (step S103). As an example, the processing target range is a range obtained by excluding, from the captured image, an exclusion range other than the recognition object 2 shown in the captured image. When the recognition object 2 is, for example, an instrument such as a clock, an analog meter, or a digital meter, the processing target range specifying unit 12 may specify only the inside of a board surface (such as a character board surface or a liquid crystal board surface) of the clock or the instrument as the processing target range. In addition, the processing target range specifying unit 12 may specify a pointer of the clock or the instrument as the exclusion range. The processing target range specifying unit 12 may specify, as the processing target range, a range that matches the regular image by pattern matching or the like. The processing target range specifying unit 12 may also specify the processing target range using a method other than those described above; for example, it may specify the processing target range using a machine learning method. The processing target range may be an image obtained by masking foreign matter (such as the mark described above) on the recognition object 2.
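The pattern-matching option can be sketched as follows, assuming grayscale images and OpenCV template matching, with the regular image of the board surface serving as the template (the function name specify_processing_target_range is hypothetical). The exclusion range, such as the pointer, could subsequently be masked out of the returned region.

```python
import cv2
import numpy as np

def specify_processing_target_range(captured_gray, board_template_gray):
    """Locate the board surface by pattern matching against the regular
    image and return the matched region as the processing target range."""
    result = cv2.matchTemplate(captured_gray, board_template_gray,
                               cv2.TM_CCOEFF_NORMED)
    # For TM_CCOEFF_NORMED, the best match is at the maximum location.
    _, _, _, top_left = cv2.minMaxLoc(result)
    th, tw = board_template_gray.shape[:2]
    x, y = top_left
    return captured_gray[y:y + th, x:x + tw]
```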
The feature range specifying unit 13 sets feature amount calculation ranges r of a plurality of subdivisions in the processing target range, and calculates a feature amount of each feature amount calculation range r (step S104).
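As one possible realization of step S104, the following sketch subdivides the processing target range into a grid of feature amount calculation ranges r and computes a feature amount per cell. Using the mean gradient magnitude as the feature amount is an assumption made for illustration; the patent does not fix a specific feature.

```python
import cv2
import numpy as np

def feature_amounts_per_subdivision(target_bgr, grid=(8, 8)):
    """Subdivide the processing target range into a grid of feature amount
    calculation ranges r and compute a feature amount for each cell."""
    gray = cv2.cvtColor(target_bgr, cv2.COLOR_BGR2GRAY)  # assumes BGR input
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    mag = cv2.magnitude(gx, gy)
    h, w = gray.shape
    rows, cols = grid
    ch, cw = h // rows, w // cols
    amounts = np.zeros((rows, cols), dtype=np.float32)
    for i in range(rows):
        for j in range(cols):
            cell = mag[i * ch:(i + 1) * ch, j * cw:(j + 1) * cw]
            amounts[i, j] = float(cell.mean())
    return amounts  # amounts[i, j] is the feature amount of range r(i, j)
```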
The feature range specifying unit 13 initially sets a threshold value to the largest feature amount among the feature amounts calculated for the feature amount calculation ranges r and, while gradually lowering the threshold value, specifies a plurality of feature amount calculation ranges r whose feature amounts exceed the threshold value as feature ranges (step S105). The feature range specifying unit 13 may instead specify the feature ranges in the above-described captured image in which the recognition object 2 is shown on the basis of a learning model obtained by machine learning of a relationship between an input and an output, where an image is the input and the feature ranges in the image are the output.
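The threshold-lowering selection of step S105 might look like the following sketch; the step size step_ratio is an assumed tuning parameter, since the patent only specifies that the threshold starts at the largest feature amount and is gradually lowered.

```python
import numpy as np

def select_by_lowering_threshold(amounts, needed=4, step_ratio=0.05):
    """Start the threshold at the largest feature amount and lower it
    gradually until at least `needed` calculation ranges exceed it."""
    threshold = float(amounts.max())
    step = threshold * step_ratio
    selected = np.argwhere(amounts > threshold)
    while len(selected) < needed and threshold > 0:
        threshold -= step
        selected = np.argwhere(amounts > threshold)
    return selected  # (row, col) indices of candidate feature ranges
```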
As an example, the feature range specifying unit 13 specifies four feature ranges. Alternatively, the feature range specifying unit 13 may specify three feature ranges in descending order of the feature amounts. Among the feature amount calculation ranges r, the feature range specifying unit 13 specifies, as the feature ranges, a plurality of feature amount calculation ranges r that can constitute the vertices of a polygon. In specifying such feature ranges, the feature range specifying unit 13 specifies, as the feature ranges, three or more feature amount calculation ranges r located as close to the outer side of the processing target range as possible.
For example, when the feature range specifying unit 13 specifies four feature amount calculation ranges r as feature ranges, it may specify the feature ranges one by one from each of areas A, B, C, and D set in the processing target range. Alternatively, for each of the areas A, B, C, and D, the feature range specifying unit 13 may specify a plurality of feature amount calculation ranges r in descending order of the feature amounts, compare the center coordinates of these feature amount calculation ranges r, and specify, as a feature range, the one feature amount calculation range r whose center x coordinate or y coordinate is closest to a coordinate on the outer frame of the processing target range. The feature range specifying unit 13 outputs the specified feature ranges to the image transformation unit 14.
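If areas A, B, C, and D are taken to be the four quadrants of the grid of feature amount calculation ranges (an assumption made for illustration; the patent defines the areas with reference to its drawings), the per-area selection could be sketched as:

```python
import numpy as np

def one_feature_range_per_area(amounts):
    """Pick the calculation range with the largest feature amount in each
    of the four areas A, B, C, and D (here: quadrants of the grid), so the
    four feature ranges can form the vertices of a quadrilateral."""
    rows, cols = amounts.shape
    hr, hc = rows // 2, cols // 2
    quadrants = {
        "A": (slice(0, hr), slice(0, hc)),        # upper left
        "B": (slice(0, hr), slice(hc, cols)),     # upper right
        "C": (slice(hr, rows), slice(0, hc)),     # lower left
        "D": (slice(hr, rows), slice(hc, cols)),  # lower right
    }
    picks = {}
    for name, (rs, cs) in quadrants.items():
        sub = amounts[rs, cs]
        i, j = np.unravel_index(np.argmax(sub), sub.shape)
        picks[name] = (i + rs.start, j + cs.start)  # grid indices
    return picks
```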
On the basis of the relationship between the feature amount of each feature amount calculation range r and a threshold value, the feature range specifying unit 13 may output a predetermined number of feature amount calculation ranges r, each having a feature amount equal to or greater than the threshold value, to a display device (for example, a display of the image processing device 1) or the like, receive from a user a designation (selection information) of the feature amount calculation ranges r to be specified as feature ranges, and thereby specify the feature ranges. This processing is one example aspect of processing in which the feature range specifying unit 13 outputs the feature amount calculation ranges to the display device, receives an input of selection information indicating the feature amount calculation ranges selected by the user among the output feature amount calculation ranges, and specifies, on the basis of the selection information, a plurality of feature amount calculation ranges capable of constituting the vertices of a polygon as the feature ranges. The feature range specifying unit 13 may also output a plurality of feature amount calculation ranges close to the outer side of the processing target range to the display device, receive an input of selection information indicating the feature amount calculation ranges selected by the user among the output feature amount calculation ranges, and specify, on the basis of the selection information, a plurality of feature amount calculation ranges capable of constituting the vertices of a polygon as the feature ranges.
In the following description, description proceeds on the assumption that four feature ranges (a, b, c, and d) have been specified.
The image transformation unit 14 acquires coordinate information of the four feature ranges. The image transformation unit 14 acquires the regular image of the recognition object 2 from the storage unit. For each of the four feature ranges, the image transformation unit 14 specifies an amount of deviation between the image pattern included in the feature range specified in the processing target range of the captured image and the image pattern included in the range at the corresponding position in the regular image (step S106). For example, it is assumed that the processing target range of the recognition object 2 is a board surface on which characters are typed, and that characters are printed in the feature range. In this case, the image transformation unit 14 assumes that a part of the characters (an image pattern) appears in the feature range. The image transformation unit 14 calculates the amount of deviation between the part of the characters appearing in the specified feature range in the processing target range of the captured image and the part of the characters appearing in the corresponding range of the regular image. This amount of deviation may be expressed by an amount of deviation in the horizontal direction (an amount of deviation in the x-coordinate direction), an amount of deviation in the vertical direction (an amount of deviation in the y-coordinate direction), a rotational angle, and the like.
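One way to measure such a deviation, sketched below under the assumption of grayscale images of substantially the same scale, is to search for the feature range patch of the captured image within a neighborhood of the corresponding position in the regular image by template matching (the helper name and the search margin m are hypothetical; rotation is not handled in this sketch):

```python
import cv2
import numpy as np

def deviation_for_feature_range(captured_gray, regular_gray, cell_rect):
    """Estimate the (dx, dy) deviation of one feature range between the
    captured image and the regular image."""
    x, y, w, h = cell_rect  # feature range location in the captured image
    patch = captured_gray[y:y + h, x:x + w]
    m = 20  # search margin in pixels (an assumed value)
    y0, x0 = max(0, y - m), max(0, x - m)
    search = regular_gray[y0:y + h + m, x0:x + w + m]
    result = cv2.matchTemplate(search, patch, cv2.TM_CCOEFF_NORMED)
    _, _, _, best = cv2.minMaxLoc(result)  # best match within the search area
    dx = (x0 + best[0]) - x  # deviation in the x-coordinate direction
    dy = (y0 + best[1]) - y  # deviation in the y-coordinate direction
    return dx, dy
```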
The image transformation unit 14 calculates a projective transformation matrix by a known calculation method for a homography transformation matrix using the amounts of deviation calculated for each of the four feature ranges (step S107). The image transformation unit 14 may instead calculate the transformation matrix by a known calculation method for an affine transformation matrix using the amounts of deviation of the features (for example, numerals) in any three of the four feature ranges dispersed in the recognition object 2.
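With the per-range deviations in hand, the homography of step S107 can be recovered from the four point correspondences, as in the following sketch (hypothetical helper; OpenCV assumed). For the three-point affine alternative, cv2.getAffineTransform could be applied to three of the correspondences instead.

```python
import cv2
import numpy as np

def projective_matrix_from_deviations(centers, deviations):
    """Build the projective transformation matrix from the four feature
    range centers and their measured deviations.

    centers:    4x2 array of (x, y) centers of the feature ranges in the
                captured image.
    deviations: 4x2 array of (dx, dy) deviations toward the regular image.
    """
    src = np.asarray(centers, dtype=np.float32)
    dst = src + np.asarray(deviations, dtype=np.float32)
    H, _ = cv2.findHomography(src, dst)  # homography from 4 point pairs
    return H
```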
The image transformation unit 14 generates a projective transformation image obtained by performing a projective transformation on the captured image showing the recognition object 2 using the projective transformation matrix (step S109). When the first projective transformation using the mark described above has been performed, this transformation becomes a second projective transformation. The image transformation unit 14 outputs the projective transformation image to the recognition processing unit 15. It is assumed that the recognition object 2 shown in the captured image is an analog meter having a pointer. The recognition processing unit 15 calculates, on the basis of the position of the scale indicated by the pointer in the projective transformation image, a numerical value stored in correspondence with that position of the scale, using interpolation calculation or the like. The recognition processing unit 15 outputs the numerical value corresponding to the position of the scale indicated by the pointer. For example, when an output destination is a liquid crystal display, the recognition processing unit 15 outputs the numerical value of the scale indicated by the pointer to the liquid crystal display.
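The interpolation calculation can be illustrated as follows, assuming the pointer position has already been extracted from the projective transformation image and expressed as an angle (the angle extraction itself is outside this sketch, and the scale angles and values shown are illustrative):

```python
import numpy as np

def value_from_pointer_angle(pointer_angle_deg, scale_angles_deg, scale_values):
    """Convert the pointer position (expressed as an angle read from the
    projective transformation image) into a meter value by linear
    interpolation between the stored scale positions."""
    # np.interp requires the angle samples to be in increasing order.
    return float(np.interp(pointer_angle_deg, scale_angles_deg, scale_values))

# Usage sketch: a meter whose scale runs from 0 at 225 deg to 100 at -45 deg.
# value_from_pointer_angle(90.0, [-45.0, 225.0], [100.0, 0.0])  # -> 50.0
```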
According to the processing described above, the image processing device 1 performs a projective transformation such that the recognition object shown in the captured image is brought into a state of being captured from the front, in the same manner as the recognition object shown in the regular image. The image processing device 1 can specify, on the basis of the feature amounts of the recognition object shown in the captured image, feature ranges suitable for calculating the projective transformation matrix for this projective transformation. As these feature ranges, a plurality of feature amount calculation ranges capable of constituting the vertices of a polygon are specified. Therefore, it is possible to specify three or more feature ranges, capable of constituting the vertices of a polygon, for calculating the projective transformation matrix used for the projective transformation.
An image processing device according to a minimum configuration of the present invention includes a feature amount calculation unit 71 and a feature range specifying unit 72.
The feature amount calculation unit 71 calculates a feature amount of each of feature amount calculation ranges of a plurality of subdivisions set in a processing target range of a captured image of a recognition object (step S701).
The feature range specifying unit 72 specifies a plurality of feature amount calculation ranges that are capable of constituting the vertices of a polygon among the feature amount calculation ranges as feature ranges on the basis of the feature amount (step S702).
The image processing device described above includes a computer system therein. Each of the processes described above is stored in a computer-readable recording medium in the form of a program, and the processing described above is performed by a computer reading and executing this program. Here, the computer-readable recording medium refers to a magnetic disk, a magneto-optical disc, a CD-ROM, a DVD-ROM, a semiconductor memory, or the like. In addition, this computer program may be distributed to a computer via a communication line, and the computer receiving this distribution may execute the program.
In addition, the program described above may be a program for realizing some of the functions described above.
Furthermore, it may also be a so-called difference file (a difference program) that can realize the functions described above in combination with a program already recorded in the computer system.
Number | Date | Country | Kind
---|---|---|---
2020-053665 | Mar. 25, 2020 | JP | national