The present invention relates to an image processing device and the like.
In order to effectively utilize satellite images and the like, various automatic analyses are performed. Automatic analysis of an image, development of an analysis method, and performance evaluation all require image data in which a correct answer is prepared. Setting the correct answer for data is also referred to as annotation. In order to improve the accuracy of the automatic analysis, it is desirable that there be many pieces of image data in which the correct answer is set. However, it is often difficult to determine the content of a satellite image, particularly image data generated by a synthetic aperture radar, and a great deal of complicated work is therefore required to prepare the image data in which the correct answer is set. In view of such a background, a system that makes the work of setting the correct answer to image data efficient is desirable.
The transfer reading system of PTL 1 is a system that determines, by image processing, whether an object has been lost. The transfer reading system of PTL 1 generates correct answer data indicating that there is no house in an image, based on a comparison result between two pieces of image data captured at different times.
However, in PTL 1, in a case where the object to be annotated is an object that is difficult to determine, the accuracy of setting the correct answer may deteriorate.
In order to solve the above problems, an object of the present invention is to provide an image processing device and the like capable of improving accuracy while efficiently performing annotation.
In order to solve the above problem, an image processing device of the present invention includes a region setting means that sets, as a candidate region, a region in which an object to be annotated is likely to be present in an annotation target image, a standard image extraction means that extracts, from an annotated image, a standard image that is an image in which an object identical to the object to be annotated is imaged, a data generation means that generates, as annotation data, data in which the annotation target image, a reference image obtained by capturing a region including the candidate region at a time different from that of the annotation target image, and the standard image are associated with each other, and an output means that outputs the annotation data generated by the data generation means.
An image processing method of the present invention includes setting, as a candidate region, a region in which an object to be annotated is likely to be present in an annotation target image, extracting, from an annotated image, a standard image that is an image in which an object identical to the object to be annotated is imaged, generating, as annotation data, data in which the annotation target image, a reference image obtained by capturing a region including the candidate region at a time different from that of the annotation target image, and the standard image are associated with each other, and outputting the generated annotation data.
A recording medium of the present invention records an image processing program for causing a computer to execute the steps of setting, as a candidate region, a region in which an object to be annotated is likely to be present in an annotation target image, extracting, from an annotated image, a standard image that is an image in which an object identical to the object to be annotated is imaged, generating, as annotation data, data in which the annotation target image, a reference image obtained by capturing a region including the candidate region at a time different from that of the annotation target image, and the standard image are associated with each other, and outputting the generated annotation data.
According to the present invention, it is possible to improve accuracy while efficiently performing annotation.
The first example embodiment of the present invention will be described in detail with reference to the drawings.
The image processing system of the present embodiment is a system that performs processing related to annotation on an image. The processing related to the annotation refers to, for example, outputting an annotation target image and, based on a result input by an operation of an operator, associating with the image information for identifying an object present in the image and information about a region where the object is present. The information associated with the image may be either the information for identifying the object present in the image or the information about the region where the object is present, and is not limited thereto. The image processing system performs the processing related to the annotation on an image acquired using, for example, a synthetic aperture radar (SAR). Data generated using the image processing system can be used, for example, as teacher data in machine learning.
A configuration of the image processing device 10 will be described.
The storage unit 20 includes a target image storage unit 21, a reference image storage unit 22, a region information storage unit 23, and an annotation result storage unit 24.
The region setting unit 11 sets, as a candidate region, a region in which the object to be annotated may be present in an annotation target image. In the following description, the annotation target image, that is, the image to be subjected to the processing related to the annotation is also referred to as a target image.
The region setting unit 11 sets, as a candidate region, a region where an object to be annotated may be present in the target image. The region setting unit 11 reads, for example, a target image to be processed from the target image storage unit 21. The region setting unit 11 stores the range of the candidate region on the target image in the region information storage unit 23. The region setting unit 11 represents the range of the candidate region on the target image by, for example, coordinates in the target image and stores the range in the region information storage unit 23. For example, information about an imaged location and date and time is added to the target image.
The region setting unit 11 sets, for example, a region in which the state of the reflected wave is different from that of the surroundings as a candidate region in which the object to be annotated may be present. For example, the region setting unit 11 identifies a region having luminance different from that of the surroundings in the target image, and sets a rectangular region including the identified region as a candidate region.
The region setting unit 11 identifies all locations where the objects to be annotated may be present in one target image, and sets the identified locations as candidate regions. For example, the region setting unit 11 sets a plurality of candidate regions by sliding the candidate region in the target image. For example, the region setting unit 11 sets a plurality of candidate regions in such a way as to cover the entire region of the candidate region existing in the target image.
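As an illustration of the candidate region setting described above, the following is a minimal sketch in Python, assuming the target image is a single-band SAR amplitude image held as a NumPy array; the function name and the luminance threshold rule are assumptions for illustration only, not part of the embodiment.

```python
# A minimal sketch of candidate region setting (cf. region setting unit 11),
# assuming the target image is a single-band SAR amplitude image as a NumPy
# array. The threshold rule and function name are illustrative assumptions.
import numpy as np
from scipy import ndimage

def set_candidate_regions(target_image):
    """Return bounding boxes (x, y, w, h) of regions whose luminance
    differs from the surroundings."""
    # Treat pixels well above the mean background level as candidates.
    threshold = target_image.mean() + 3.0 * target_image.std()
    mask = target_image > threshold
    # Group adjacent candidate pixels and enclose each group in a rectangle,
    # covering every location where an object may be present.
    labels, _ = ndimage.label(mask)
    boxes = []
    for sl in ndimage.find_objects(labels):
        y, x = sl[0].start, sl[1].start
        boxes.append((x, y, sl[1].stop - x, sl[0].stop - y))
    return boxes
```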
The region setting unit 11 may compare the position where the target image is acquired with map information, and set the candidate region within a region set in advance for the target image. For example, when the object to be annotated is a ship, the candidate region may be set in a region where a ship may be present, such as a sea, a river, or a lake. In such a case, the region setting unit 11 refers to the map information and sets candidate regions only in the regions of seas, rivers, and lakes, for example.
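As one way to realize such map-based narrowing, the following is a minimal sketch, assuming the map information has already been rasterized into a boolean water mask aligned with the coordinate system of the target image; the function name and the overlap threshold are illustrative assumptions.

```python
# A hedged sketch of restricting candidate regions to preset areas (sea,
# river, lake). It assumes the map information has been rasterized into a
# boolean water mask aligned with the target image; min_overlap is assumed.
def filter_by_map(boxes, water_mask, min_overlap=0.5):
    kept = []
    for (x, y, w, h) in boxes:
        region = water_mask[y:y + h, x:x + w]
        # Keep the candidate only if enough of it lies on water.
        if region.size > 0 and region.mean() >= min_overlap:
            kept.append((x, y, w, h))
    return kept
```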
The region extraction unit 12 extracts, as a candidate image, an image of the candidate region from the target image, and extracts, as a related image, an image of a region related to the candidate region from the reference image. The reference image is an image used as a comparison target for determining whether the object to be annotated exists in the target image, and is an image of a region including the region of the target image, acquired at a time different from that of the target image. There may be a plurality of reference images related to one target image.
The reference image is, for example, an image obtained by imaging a region including the region of the target image by the same method as the target image, at a time different from the time when the target image is imaged. For example, among images captured at the same time every day at the identical location, one image is set as the target image, and an image captured on another day is used as a reference image. The imaging cycle and the imaging time do not have to be constant. For example, information about the imaged location and the date and time is added to the reference image. The region extraction unit 12 reads the reference image from, for example, the reference image storage unit 22.
Based on the information of the candidate region stored in the region information storage unit 23, the region extraction unit 12 identifies a region related to the candidate region on the reference image. The region extraction unit 12 extracts an image of a region related to the candidate region from the reference image as a related image.
The region extraction unit 12 may set the target image including the candidate region as the candidate image without extracting the candidate image from the target image. Similarly, without extracting the related image from the reference image, the region extraction unit 12 may refer to the position information added to the image and set the reference image including the candidate region as the related image related to the candidate region.
For example, the region extraction unit 12 extracts an image of the region related to the candidate region from each of two reference images captured at times different from that of the target image. For example, for a candidate image G1 extracted from the candidate region of the target image, the region extraction unit 12 extracts a related image G2 and a related image G3. For example, the region extraction unit 12 extracts the related image G2 from a reference image A acquired by the synthetic aperture radar one day before the day on which the target image is acquired, and extracts the related image G3 from a reference image B acquired two days before. The region extraction unit 12 then associates the candidate image G1, the related image G2, and the related image G3 with each other. The number of related images associated with one candidate image is not limited to two, and is set according to the number of reference images, which can be set as appropriate.
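As an illustration, a minimal sketch of this extraction is shown below, assuming the target image, the reference images, and the candidate region coordinates are available as NumPy arrays and a tuple; all names are illustrative.

```python
# A minimal sketch of the extraction by the region extraction unit 12: the
# same candidate-region coordinates are applied to the target image and to
# each reference image. Variable and function names are illustrative.
def extract_images(target_image, reference_images, box):
    x, y, w, h = box  # candidate region read from the region information storage
    candidate_image = target_image[y:y + h, x:x + w]        # e.g. G1
    related_images = [ref[y:y + h, x:x + w]                 # e.g. G2, G3
                      for ref in reference_images]
    return candidate_image, related_images

# Usage with reference images A and B (one day and two days earlier):
# g1, (g2, g3) = extract_images(target, [reference_a, reference_b], box)
```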
The standard image extraction unit 13 extracts, from an annotated image, a standard image that is an image in which an object identical to the object to be annotated is imaged. The standard image extraction unit 13 searches the annotated image data stored as annotation completion data in the annotation result storage unit 24, and extracts, as the standard image, an image in which an object identical to the object to be annotated is imaged. Here, identical objects include similar objects. As the standard image, for example, an image whose result has been determined to be correct in verification of the result of the annotation is used among the annotated images. For example, in a case where the target image is captured by the synthetic aperture radar, the verification of the result of the annotation is made using an optical image.
An image in which the determination result at the time of performing the annotation is incorrect may be associated with the annotated image. The image in which the determination result is incorrect is, for example, an image for which the type of the object determined in the annotation is identified as incorrect when the annotated image is verified using an image captured by another method. For example, it is assumed that, when an object on an image acquired using a synthetic aperture radar is annotated, the annotation operator determines that the object present in the candidate region is a ship. If, in the subsequent verification of the result of the annotation, the operator of the annotation or another operator identifies the object existing in the candidate region as a tank, the determination result at the time of the annotation is determined to be incorrect. The verification of the result of the annotation is made, for example, by an operator identifying the object present in the candidate region using an optical image obtained by imaging the location identical to that of the target image.
When an incorrect image is associated with the annotation completion data, the standard image extraction unit 13 extracts, as the standard image, an annotated image and an incorrect image associated with the annotated image. In such a case, the standard image extraction unit 13 sets, for example, an annotated image as a correct image and extracts a set of a correct image and an incorrect image as a standard image.
The standard image extraction unit 13 compares the candidate image with the annotated image, and determines that the candidate image and the annotated image are similar when the similarity between them is equal to or greater than a criterion.
The standard image extraction unit 13 determines whether the object of the candidate image and the object of the annotated image are identical based on, for example, similarity of map coordinates and similarity of image feature amounts. The standard image extraction unit 13 may determine whether the object of the candidate image and the object of the annotated image are identical based on items other than the above.
For example, for an annotated image whose imaging position is determined to be identical to that of the candidate image, the standard image extraction unit 13 determines whether the imaged object is identical. The standard image extraction unit 13 determines whether the imaging positions of the candidate image and the annotated image are identical based on, for example, the distance between the center coordinates of the candidate image and the center coordinates of the annotated image. When the distance between the center coordinates of the candidate image and the center coordinates of the annotated image is equal to or less than the reference, the standard image extraction unit 13 determines that the imaging positions are identical.
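A minimal sketch of this imaging-position check follows, assuming each image carries the map coordinates of its center; the distance threshold is an assumed parameter, not a value from the embodiment.

```python
# A sketch of the imaging-position check, assuming each image carries the
# map coordinates of its center; the threshold is an assumed parameter.
import math

def same_imaging_position(center_a, center_b, distance_threshold=50.0):
    """True when the distance between the center coordinates of the
    candidate image and the annotated image is within the reference."""
    return math.hypot(center_a[0] - center_b[0],
                      center_a[1] - center_b[1]) <= distance_threshold
```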
When determining whether the object of the candidate image and the object of the annotated image are identical based on the similarity of the image feature amounts, the standard image extraction unit 13 calculates the similarity of the image feature amounts between the images using, for example, feature point matching. In the feature point matching, for example, the standard image extraction unit 13 extracts feature points from the candidate image and the annotated image, and determines that the two images are images obtained by imaging the identical object when the similarity of the feature points satisfies the criterion. A method for determining similarity of image feature amounts using feature point matching is disclosed, for example, in P. F. Alcantarilla, J. Nuevo and A. Bartoli, “Fast Explicit Diffusion for Accelerated Features in Nonlinear Scale Spaces”, Proceedings British Machine Vision Conference 2013, pp. 13.1-13.11. The standard image extraction unit 13 may also calculate the similarity of the image feature amounts using a method other than feature point matching, for example, luminance histogram comparison or template matching.
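As an illustration, the following sketch computes a feature-amount similarity with OpenCV's AKAZE detector, which implements the cited method of Alcantarilla et al.; it assumes 8-bit grayscale inputs, and the ratio-test constant and the similarity definition are assumptions for illustration.

```python
# A sketch of feature-amount similarity using OpenCV's AKAZE detector, which
# implements the cited method of Alcantarilla et al. Inputs are assumed to be
# 8-bit grayscale arrays; the ratio-test constant and the similarity
# definition below are assumptions for illustration.
import cv2

def feature_similarity(candidate_image, annotated_image):
    akaze = cv2.AKAZE_create()
    kp1, des1 = akaze.detectAndCompute(candidate_image, None)
    kp2, des2 = akaze.detectAndCompute(annotated_image, None)
    if des1 is None or des2 is None:
        return 0.0
    # AKAZE produces binary descriptors, so Hamming distance is appropriate.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    matches = matcher.knnMatch(des1, des2, k=2)
    # Ratio test: keep a match only when it clearly beats the runner-up.
    good = [m for m in matches
            if len(m) == 2 and m[0].distance < 0.8 * m[1].distance]
    # Similarity as the fraction of feature points with a good match.
    return len(good) / max(1, min(len(kp1), len(kp2)))
```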
When the image in which the location identical to the candidate region is imaged is not stored in the annotation result storage unit 24, the standard image extraction unit 13 compares each image of the annotation completion data stored in the annotation result storage unit 24 with the annotation target image to extract the standard image. For example, in a case where there is no image having the identical imaging position, the standard image extraction unit 13 extracts, as the standard image, an image that satisfies the criterion of similarity of the image feature amounts among all the annotated images stored in the annotation result storage unit 24.
The standard image extraction unit 13 may determine whether the object of the candidate image and the object of the annotated image are identical by further using the similarity of the sizes of the objects existing in the image. In such a case, for example, the relationship between the number of pixels and the actual distance is set in advance for each of the two images. For example, the standard image extraction unit 13 determines the similarity of the sizes of the objects on the two images based on the ratio or difference of the areas of the objects present in the respective images. In a case where the criterion of the similarity in size is set based on, for example, the ratio of the areas of the objects respectively present in the two images, the standard image extraction unit 13 determines that the sizes of the objects in the two images are identical when the ratio of the areas is within the reference range. In a case where the criterion of the similarity in size is set based on, for example, a difference in area between objects present on the two images, the standard image extraction unit 13 determines that the sizes of the objects on the two images are identical when the difference in area is within the reference.
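A minimal sketch of the area-ratio variant of this size comparison follows, assuming the ground distance per pixel (meters per pixel) is known for both images; the reference range for the ratio is an assumption.

```python
# A sketch of the area-ratio size comparison, assuming the ground sampling
# distance (meters per pixel) is known for both images; the reference range
# for the ratio is an assumption.
def similar_size(area_px_a, gsd_a, area_px_b, gsd_b, ratio_range=(0.8, 1.25)):
    area_a = area_px_a * gsd_a ** 2   # object area in square meters
    area_b = area_px_b * gsd_b ** 2
    ratio = area_a / area_b
    return ratio_range[0] <= ratio <= ratio_range[1]
```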
The data generation unit 14 generates, as annotation data, data in which the annotation target image, the reference image, and the standard image are associated with each other. The reference image is an image obtained by capturing the region including the candidate region at a time different from that of the annotation target image. For example, the data generation unit 14 generates annotation data in which the candidate image obtained by extracting the candidate region from the target image, the related image obtained by extracting the region related to the candidate region from the reference image, and the standard image are associated with each other. The data generation unit 14 may generate, as annotation data, the candidate image, the related image, and an image obtained by enlarging the vicinity of the candidate region in the standard image in association with each other. The data generation unit 14 outputs the generated annotation data to the terminal device 30 via the output unit 15, for example.
The data generation unit 14 may generate, as annotation data, display data for displaying the candidate image, the related image, and the standard image in a comparable manner. The display data for display in a comparable manner refers to, for example, display data in which the images to be compared are arranged side by side so that an operator can compare the two images. The data generation unit 14 may output the generated display data to a display device (not illustrated) connected to the image processing device 10.
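As an illustration, display data for such side-by-side comparison could be assembled as in the following sketch, assuming the images are NumPy arrays of equal height and type; the function name is illustrative.

```python
# A minimal sketch of display data for comparison in a lateral direction,
# assuming the images are NumPy arrays of equal height and type.
import cv2

def make_comparison_view(candidate, related_images, standard):
    # Place the candidate image, the related images, and the standard image
    # side by side so the operator can compare them at a glance.
    return cv2.hconcat([candidate, *related_images, standard])
```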
In a case where an incorrect image is associated with the standard image extracted by the standard image extraction unit 13, the data generation unit 14 may generate annotation data by using the standard image as a set of a correct image and an incorrect image.
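The annotation data could take, for example, a form like the following sketch; the field names and values are illustrative assumptions, not taken from the embodiment, and the incorrect image entry is present only when an incorrect image is associated with the standard image.

```python
# A hedged sketch of one piece of annotation data assembled by the data
# generation unit 14; all field names are illustrative assumptions. The
# "incorrect" entry is present only when an incorrect image is associated.
annotation_data = {
    "candidate_image": "g1.png",              # extracted from the target image
    "related_images": ["g2.png", "g3.png"],   # extracted from the reference images
    "standard_image": {
        "correct": "standard_correct.png",
        "incorrect": "standard_incorrect.png",
    },
    "candidate_region": {"x": 120, "y": 340, "width": 32, "height": 48},
}
```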
The data generation unit 14 generates annotation completion data based on the information about the annotation. The information about the annotation is input to the terminal device 30 as annotation information by the operation of the operator. The annotation information is, for example, information for identifying the type of the object to be annotated on the annotation target image, that is, the target image, and information for identifying the region where the object exists in the image. For example, the data generation unit 14 acquires the information identifying the region where the object exists as a rectangular region surrounding the object on the candidate image. For example, the data generation unit 14 generates, as the annotation completion data, data in which the type of the object on the candidate image and the information about the region where the object exists are associated with the candidate image based on the annotation information. The region indicated by the annotation information is also referred to as an annotation region. The data generation unit 14 stores the generated annotation completion data in the annotation result storage unit 24. The setting of the annotation region is not limited to a method of surrounding the region with a rectangular line. For example, the annotation region may be set by filling in the region.
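One annotation completion data record could likewise be represented as in the following sketch; again, the field names and values are illustrative assumptions.

```python
# A hedged sketch of one annotation completion data record generated from
# the operator's annotation information; field names are illustrative.
annotation_completion_record = {
    "candidate_image": "g1.png",
    "object_type": "ship",          # type of the object identified by the operator
    "annotation_region": {          # rectangle surrounding the object
        "x": 4, "y": 10, "width": 20, "height": 36,
    },
}
```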
The output unit 15 outputs the annotation data generated by the data generation unit 14 to the terminal device 30. The output unit 15 may output the display data generated based on the annotation data to a display device (not illustrated) connected to the image processing device 10.
The input unit 16 receives an input of information related to the annotation of the object to be annotated as annotation information with respect to the annotation target image. The input unit 16 acquires the annotation information input to the terminal device 30 by the operation of the operator from the terminal device 30.
For example, the input unit 16 acquires, as the annotation information, information about the range of the annotation region and information identifying the type of the object on the image. The input unit 16 may acquire, as the annotation information, either the information about the range of the annotation region or the information identifying the type of the object on the image. The input unit 16 may acquire information about items other than the above items as annotation information. The input unit 16 may acquire the annotation information from an input device (not illustrated) connected to the image processing device 10.
The target image storage unit 21 of the storage unit 20 stores the image data of the annotation target image as the target image. The target image storage unit 21 stores, for example, the imaging date and time and the imaging position information in association with the target image. The reference image storage unit 22 stores the image data of the reference image. The reference image storage unit 22 stores, for example, the reference image in association with the imaging date and time and the imaging position information. The reference image storage unit 22 may store the reference image in association with information about the target image related to the reference image. The information associated with the target image and the reference image is not limited to these examples. The region information storage unit 23 stores information about the range of the candidate region set by the region setting unit 11. The annotation result storage unit 24 stores, as annotation completion data, the annotation target image and the annotation information in association with each other. The annotation result storage unit 24 may store information about the imaging position of the image in association with the image included in the annotation completion data. The annotation result storage unit 24 may store an incorrect image in association with the image included in the annotation completion data.
Each piece of the above-described data related to the annotation stored in the storage unit 20 is input to the image processing device 10 by, for example, an operator. Each piece of data related to the annotation stored in the storage unit 20 may be acquired from the terminal device 30 or a server connected via a network.
The storage unit 20 includes, for example, a hard disk drive. The storage unit 20 may include, for example, another storage device such as a nonvolatile semiconductor storage device. The storage unit 20 may be configured by combining a plurality of types of storage devices such as a nonvolatile semiconductor storage device and a hard disk drive. Part or all of the storage unit 20 may be included in an external device connected to the image processing device 10 via a network.
The terminal device 30 is a terminal device for operation by an operator, and includes an input device and a display device (not illustrated). The terminal device 30 acquires annotation data from the image processing device 10. The terminal device 30 outputs a display screen on which the annotation work is performed to a display device (not illustrated) based on the annotation data. For example, the terminal device 30 displays a display screen in which the candidate image, the related image, and the standard image are associated with each other on the display device. The terminal device 30 may display both the correct image and the incorrect image for the standard image.
The terminal device 30 receives annotation information input by an operation of an operator. The terminal device 30 outputs the acquired annotation information to the image processing device 10. The number of terminal devices 30 may be plural. The number of terminal devices can be appropriately set.
An operation of the image processing system of the present embodiment will be described.
The region setting unit 11 of the image processing device 10 reads the target image that is the annotation target image from the target image storage unit 21 of the storage unit 20.
When the target image is read, the region setting unit 11 sets, as a candidate region, a region in which the object to be annotated may be present in the target image (step S11). For example, the region setting unit 11 identifies a region where an object may be present based on the luminance value of each pixel in the image. When the region where the object may be present is identified, the region setting unit 11 sets a rectangular region including the identified region as a candidate region. The region setting unit 11 sets, for example, a region smaller than the entire target image as the candidate region.
When the candidate region is set, the region setting unit 11 stores information about the set candidate region in the region information storage unit 23. The region setting unit 11 stores, for example, coordinates for identifying the outer peripheral portion of the candidate region on the target image in the region information storage unit 23 as information about the candidate region.
The region setting unit 11 sets a plurality of candidate regions in such a way as to cover the entire region of the candidate region existing in the target image. The region setting unit 11 slides the candidate region in the target image, for example, and sets a region where an object may be present as the candidate region.
When the candidate region is set, the region extraction unit 12 selects a candidate region to be annotated from the candidate regions stored in the region information storage unit 23 (step S12). For example, the region extraction unit 12 selects, as a candidate region to be annotated, a candidate region that has been stored earliest as a candidate region among candidate regions for which the annotation has not been completed. The method of selecting the candidate region may be another method.
When the candidate region is selected, the region extraction unit 12 extracts an image of a portion of the candidate region from the target image as the candidate image. The region extraction unit 12 reads the reference image related to the target image from the reference image storage unit 22 to extract an image of a portion of the candidate region from the reference image as the related image (step S13). For example, the region extraction unit 12 extracts an image of a portion in the candidate region from each of the two reference images as a related image.
When the candidate image and the related image are extracted, the standard image extraction unit 13 searches the annotation completion data stored in the annotation result storage unit 24 and extracts, as the standard image, an image in which an object identical to the object in the candidate image exists (step S14). For example, the standard image extraction unit 13 extracts, as the standard image, an image whose similarity to the candidate image satisfies the criterion, based on the similarity between the images stored as the annotation completion data and the candidate image.
When the standard image is extracted, the data generation unit 14 generates, as annotation data, data in which the annotation target image, the reference image obtained by capturing the region including the candidate region at a time different from that of the annotation target image, and the standard image are associated with each other (step S15). For example, the data generation unit 14 generates, as the annotation data, data in which the candidate image, the related image, and the standard image are associated with each other.
When the annotation data is generated, the output unit 15 outputs the generated annotation data to the terminal device 30 (step S16).
When the annotation data is acquired, the terminal device 30 outputs display data based on the annotation data to a display device (not illustrated). When the annotation information is input by the operation of the operator while the display data based on the annotation data is displayed, the terminal device 30 outputs the input annotation information to the image processing device 10.
The input unit 16 of the image processing device 10 acquires the annotation information from the terminal device 30 (step S17). When the annotation information is acquired, the data generation unit 14 generates annotation completion data in which the data of the candidate image and the annotation information are associated with each other (step S18). The data generation unit 14 stores the generated annotation completion data in the annotation result storage unit 24.
When the annotation completion data is saved, in a case where the processing of the annotation has been completed for all the candidate regions (Yes in step S19), the image processing device 10 ends the process related to the annotation. When there is a candidate region for which the processing of the annotation has not been completed (No in step S19), the image processing device 10 executes processing from the operation of selecting the candidate region in step S12.
The annotation completion data generated by the above method can be used, for example, as teacher data when a machine learning model for identifying an image is generated in an image recognition device.
The above description has been made for the example in which the annotation is made to the target image acquired by the synthetic aperture radar, but the target image may be an image acquired by a method other than the synthetic aperture radar. For example, the target image may be an image acquired by an infrared camera.
In the above description, an example in which the annotation is made with reference to images acquired by the same method as the target image has been described. In addition to such a configuration, the determination result in the annotation may be verified with reference to another type of image. For example, an annotated image acquired by the synthetic aperture radar and an optical image captured at the identical location by an optical camera that images the visible light region may be displayed side by side to verify whether the type of the object determined in the annotation is correct. By verifying correctness in such a manner, it is also possible to generate the correct image and the incorrect image to be used as the standard image.
The image processing device 10 of the image processing system according to the present embodiment outputs, as annotation data, a candidate image obtained by extracting a region where an object may be present from the target image that is the annotation target image, a related image obtained by extracting a region related to the candidate image from the reference image, and a standard image in association with each other. In other words, the image processing device 10 outputs, as annotation data, the target image that is the annotation target image, an image captured at a time different from that of the target image, and a standard image for which the annotation of an object identical to the object to be annotated has been completed, in association with each other. By displaying the images in a comparable manner using the annotation data, the operator who makes the annotation can perform the annotation while referring to the presence or absence of a change in the object to be annotated and to past annotation results, and can more easily identify the object and its region. By displaying the reference image and the standard image at the time of making the annotation, it is possible to suppress variations in determination by the same operator and between operators. As a result, by using the image processing device of the present embodiment, it is possible to improve accuracy while efficiently making annotation.
In a case where the image processing device 10 outputs an incorrect image in a past annotation result as the standard image, the operator can refer to an example of a past mistake when making the annotation, so that, for example, the type of the object to be annotated can be determined more easily. Therefore, in a case where the image processing device 10 outputs an incorrect image in a past annotation result as the standard image, the accuracy of the annotation can be further improved.
The second example embodiment of the present invention will be described in detail with reference to the drawings.
The region setting unit 11 is an example of the region setting unit 101. The region setting unit 101 is an aspect of a region setting means. The standard image extraction unit 13 is an example of the standard image extraction unit 102. The standard image extraction unit 102 is an aspect of a standard image extraction means. The data generation unit 14 is an example of the data generation unit 103. The data generation unit 103 is an aspect of a data generation means. The output unit 15 is an example of the output unit 104. The output unit 104 is an aspect of an output means.
The operation of the image processing device 100 will be described.
The image processing device 100 according to the present embodiment outputs, as annotation data, data in which an annotation target image, a reference image obtained by capturing a region including a candidate region at a time different from that of the annotation target image, and a standard image that is an annotated image are associated with each other. Therefore, by using the image processing device 100, the operator can perform the annotation while comparing the images. As a result, by using the image processing device 100 of the present embodiment, it is possible to improve accuracy while efficiently performing the annotation processing.
Each processing in the image processing device 10 of the first example embodiment and the image processing device 100 of the second example embodiment can be performed by causing a computer to execute a computer program.
The CPU 201 reads a computer program for executing each processing from the storage device 203 and executes it. The CPU 201 may be configured by a combination of a CPU and a graphics processing unit (GPU). The memory 202 includes a dynamic random access memory (DRAM) or the like, and temporarily stores the computer program executed by the CPU 201 and data being processed. The storage device 203 stores the computer program executed by the CPU 201. The storage device 203 includes, for example, a nonvolatile semiconductor storage device. The storage device 203 may include another storage device such as a hard disk drive. The input/output I/F 204 is an interface that receives input from an operator and outputs display data and the like. The communication I/F 205 is an interface that transmits and receives data to and from each device constituting the image processing system. The terminal device 30 can have a similar configuration.
The computer program used for executing each processing can also be stored in a non-transitory recording medium and distributed. The recording medium may include, for example, a magnetic tape for data recording or a magnetic disk such as a hard disk. The recording medium may include an optical disk such as a compact disc read only memory (CD-ROM). A nonvolatile semiconductor storage device may also be used as the recording medium.
The present invention has been described above by taking the above example embodiments as examples. However, the present invention is not limited to the above-described example embodiments. That is, it will be understood by those of ordinary skill in the art that the present invention can take various aspects without departing from the spirit and scope of the present invention as defined by the claims.
This application claims priority based on Japanese Patent Application No. 2021-158568 filed on Sep. 29, 2021, the entire disclosure of which is incorporated herein.
Foreign application priority data: Japanese Patent Application No. 2021-158568, filed Sep. 29, 2021, JP (national).
International filing document: PCT/JP2022/032697, filed Aug. 31, 2022 (WO).