This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2021-15337, filed on Feb. 2, 2021, the entire contents of which are incorporated herein by reference.
The embodiment discussed herein is related to a determination technology.
In a software system using machine learning such as deep learning, an operation result depends on properties of input data, and it is difficult to define a correct answer and test the system. Furthermore, a phenomenon is known in which the result output by the system fluctuates largely when a change that humans can hardly recognize is applied to input data of a machine learning model, such as an image. Therefore, it is necessary to ensure robustness so that the results are not affected by changes in data to which the same correct answer label is attached. There is a data augmentation technique that increases the variety of input data by adding a change (perturbation) to it in order to improve the robustness.
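As a minimal illustration of data augmentation, assuming grayscale images represented as NumPy arrays (the helper name `augment` is hypothetical, not part of the embodiment), perturbed variants of a single input image can be generated while keeping the same correct answer label:

```python
import numpy as np

def augment(image):
    """Generate perturbed variants of one input image: a horizontal flip
    and two brightness shifts. Each variant keeps the same label."""
    return [np.fliplr(image), image + 10.0, image - 10.0]

original = np.arange(9, dtype=np.float64).reshape(3, 3)
variants = augment(original)  # three perturbed copies of the original
```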
U.S. Patent Application Publication No. 2018/0108162, Japanese Laid-open Patent Publication No. 2018-055516, Japanese Laid-open Patent Publication No. 2020-112967, and Japanese Laid-open Patent Publication No. 2019-153057 are disclosed as related art.
According to an aspect of the embodiments, a non-transitory computer-readable recording medium stores a determination program for causing a computer to execute processing including: specifying a difference between feature amounts of a plurality of first images that is captured in chronological order or of which the difference between the feature amounts is equal to or less than a threshold; referring to information in which the difference is associated with a data augmentation processing type and determining one or a plurality of data augmentation processing types used for processing of generating machine learning data on the basis of the specified difference between the feature amounts; and outputting a result of the determination processing.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
For example, with the traditional data augmentation technique, a developer selects a type of a perturbation operation in advance. Therefore, the reality and completeness of the input data depend on the selected type of the perturbation operation.
In one aspect, an object is to provide a determination program, a determination method, and a determination device that further improve robustness of a machine learning model.
Hereinafter, examples of a determination program, a determination method, and a determination device according to the present embodiment will be described in detail with reference to the drawings. Note that the present embodiment is not limited by the examples. Furthermore, the examples can be appropriately combined within a range without inconsistency.
First, the traditional data augmentation technique will be described.
At this time, all perturbation operations are not applied to the original image 100, and the perturbation operation and its parameter range are selected by a developer or the like on the basis of specific criteria. For example, in
In this way, it is possible to generate the image 110 and increase the input data to the machine learning model. However, the selected and applied perturbation operations may have little effect on the input data to the machine learning model or may not include necessary operations. More specifically, for example, in a case where a selected and applied perturbation operation does not occur in the application destination of the machine learning model, if all parameters of all perturbation operations are set as targets, the search space in machine learning may increase more than necessary. Furthermore, when a perturbation operation other than the selected and applied perturbation operations occurs in the application destination of the machine learning model, the search in the machine learning may be performed insufficiently.
Therefore, in the present embodiment, a determination device estimates and selects a perturbation operation with conditions close to an application domain for an image used for input data to a machine learning model and further improves accuracy and robustness of the machine learning model.
[Functional Configuration of Determination Device 10]
Next, a functional configuration of the determination device 10 illustrated in
The storage unit 20 is an example of a storage device that stores various types of data and a program executed by the control unit 30, and is, for example, a memory, a hard disk, or the like. The storage unit 20 includes an image database (DB) 21, perturbation operation data 22, a machine learning model DB 23, and the like.
The image DB 21 stores an original image used to estimate a perturbation operation, for example, a set of still images or a moving image that are captured in advance in a practical environment, an environment close to the practical environment, or the like. The practical environment is, for example, a place near a stage in a conference hall, and the environment close to the practical environment is, for example, a place near a platform in a classroom. Furthermore, the still image or the moving image may be captured in an environment in which the perturbation operation that may occur in the practical environment can be reproduced, for example, an environment in which changes in vibration and illumination can be reproduced indoors. Note that the conference hall, the classroom, indoors, and the like are examples of the environment, and the practical environment or the like naturally includes outdoors. Furthermore, although the original image is exemplified as the image DB 21, the image DB 21 may include an image obtained by applying a perturbation operation to the original image.
The perturbation operation data 22 stores a type of perturbation operation that may occur, for example, types of the perturbation operations indicated as a set of the perturbation operations in
The machine learning model DB 23 stores, for example, a parameter used to construct a machine learning model generated through machine learning that uses an image generated using the data augmentation technique as a feature amount and a person or an object included in the image as a correct answer label. Furthermore, the machine learning model DB 23 stores training data for the machine learning model. In the present embodiment, training data that further improves robustness of the machine learning model is generated.
Note that the data described above stored in the storage unit 20 is merely an example, and the storage unit 20 can store various types of data other than the data described above.
The control unit 30 is a processing unit that controls the entire determination device 10 and is, for example, a processor or the like. The control unit 30 includes an image selection unit 31, a perturbation operation estimation unit 32, a format shaping unit 33, a data augmentation unit 34, and a machine learning model generation and training unit 35. Note that each processing unit is an example of an electronic circuit included in a processor and an example of a process performed by the processor.
The image selection unit 31 compares similarities between still images captured in the practical environment or the like and selects an image that conforms with similarity criteria. Furthermore, the image selection unit 31 selects an image corresponding to a frame that is temporally close from among the moving images captured in the practical environment or the like, for example, images captured in chronological order.
The perturbation operation estimation unit 32 specifies a difference between feature amounts of the respective plural images selected by the image selection unit 31. Furthermore, the perturbation operation estimation unit 32 determines one or a plurality of data augmentation processing types under conditions close to an application domain, for example, the type of perturbation operation on the basis of the specified difference between the feature amounts. Furthermore, the perturbation operation estimation unit 32 aggregates parameters in the respective images for each type of the determined perturbation operation and derives a range of the parameter of the perturbation operation. In this way, the perturbation operation estimation unit 32 estimates a perturbation operation that may occur.
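For instance, a brightness perturbation can be estimated from the difference between the feature amounts (here, simply the raw pixel intensities) of two similar images. This is a sketch under the assumption that images are NumPy arrays; the function name is hypothetical:

```python
import numpy as np

def estimate_brightness_perturbation(img_a, img_b):
    """Estimate a brightness-shift perturbation between two similar images.
    The mean intensity difference serves as one observed parameter value
    for a 'brightness' perturbation operation."""
    return float(np.mean(img_b.astype(np.float64) - img_a.astype(np.float64)))

# Two hypothetical frames: the second is the first brightened by 12 levels.
frame_a = np.full((4, 4), 100.0)
frame_b = frame_a + 12.0
delta = estimate_brightness_perturbation(frame_a, frame_b)  # 12.0
```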
The format shaping unit 33 shapes the type of the perturbation operation estimated by the perturbation operation estimation unit 32 and the range of its parameter into a format that can be used for post-processing such as data augmentation processing and outputs the result.
The data augmentation unit 34 applies the perturbation operation to an image using the type of the perturbation operation and the range of the parameter shaped by the format shaping unit 33 and generates input data to the machine learning model.
The machine learning model generation and training unit 35 sets the image generated by the data augmentation unit 34, to which the perturbation operation has been applied, as a feature amount, and generates and trains the machine learning model using a person or an object included in the image as a correct answer label.
[Details of Functions]
Next, a perturbation operation estimation method mainly performed by the determination device 10 according to the present embodiment will be described in detail.
In the image selection from the still image 200, for example, similarities between the still images 200 are compared, and a still image 200 that conforms with the similarity criteria is selected. Note that it is not necessary for all the selected still images 200 to conform with the similarity criteria, for example, as illustrated in
Then, as illustrated in
Moreover, for example, the determination device 10 aggregates the parameter of the perturbation operation for each of the selected still images 200 and estimates the range of the parameter that may occur. Then, for example, by applying the estimated perturbation operation to any one of the still images 200 while changing the parameter within the estimated range of the parameter, the determination device 10 generates the input data to the machine learning model.
Furthermore, in
In the image selection from the images 300, for example, images corresponding to frames that are temporally close in the images 300 are selected. Note that it is not necessary for all the selected images 300 to correspond to temporally close frames, and for example, as illustrated in
Next, the perturbation operation estimation processing will be described in detail with reference to the flowcharts illustrated in
First, as illustrated in
Next, the determination device 10 selects an image from among the still images 200 or the images 300 input in step S1 as described with reference to
Next, the determination device 10 estimates the type of the perturbation operation that may occur and the range of the parameter from the difference between the feature amounts of the still images 200 or the images 300 selected in step S2 (step S3). Details of step S3 will be described later.
Next, the determination device 10 shapes the type of the perturbation operation and the range of the parameter estimated in step S3 into a format that can be used for the post-processing such as the data augmentation processing (step S4). After step S4 is executed, the perturbation operation estimation processing illustrated in
Next, image selection processing in step S2 in the entire processing illustrated in
First, as illustrated in
Next, the determination device 10 determines whether or not the similarity of each pair of the plurality of still images 200 complies with predetermined similarity criteria (step S102). Here, the predetermined similarity criteria are, for example, criteria based on a threshold condition or a search result of a solution for an optimization problem using a calculation result of a correlation value, an information amount, or the like according to a pixel value and the like in a window region. Alternatively, the predetermined similarity criteria are criteria based on the similarity between the detected features, such as conspicuous points, lines, or regions, in an image. However, the predetermined similarity criteria are not limited to these criteria.
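As one concrete (hypothetical) instance of such a criterion, the similarity of an image pair can be scored with a normalized cross-correlation of pixel values and compared against a threshold; the function names and the threshold value are illustrative assumptions:

```python
import numpy as np

def similarity(img_a, img_b):
    """Normalized cross-correlation between two equal-size grayscale images."""
    a = img_a.astype(np.float64).ravel()
    b = img_b.astype(np.float64).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    # Treat two constant images as identical (an assumption for the sketch).
    return float((a * b).sum() / denom) if denom > 0 else 1.0

def conforms(img_a, img_b, threshold=0.9):
    """Check a pair against a threshold-based similarity criterion."""
    return similarity(img_a, img_b) >= threshold

imgs = np.arange(16, dtype=np.float64).reshape(4, 4)
ok = conforms(imgs, imgs * 2)  # linearly related images correlate perfectly
```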
In a case where the similarity of none of the pairs of the plurality of still images 200 conforms with the predetermined similarity criteria (step S102: No), the image selection processing illustrated in
The image selection processing in a case where the plurality of still images 200 is input in step S1 has been described with reference to
First, as illustrated in
Next, the determination device 10 selects frames that are temporally close from among the frames extracted in step S201 (step S202). In the frame selection in step S202, for example, a pair of frames temporally close to each other may be selected from among the frames in the entire moving image. However, frames may be comprehensively selected such as acquiring pairs from the beginning, middle, and end of the moving image. After the execution of step S202, the image selection processing illustrated in
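The selection described above can be sketched as follows; the function name, the frame gap, and the sampling positions (beginning, middle, end) are illustrative assumptions rather than the embodiment's specification:

```python
def select_frame_pairs(num_frames, gap=1, positions=(0.0, 0.5, 1.0)):
    """Pick pairs of temporally close frame indices near the beginning,
    middle, and end of a moving image with `num_frames` frames."""
    pairs = []
    for p in positions:
        # Clamp so the second frame of the pair stays inside the video.
        i = min(int(p * (num_frames - 1)), num_frames - 1 - gap)
        pairs.append((i, i + gap))
    return pairs

pairs = select_frame_pairs(100)  # [(0, 1), (49, 50), (98, 99)]
```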
Next, the perturbation operation estimation processing in step S3 in the entire processing illustrated in
First, as illustrated in
Next, the determination device 10 determines a perturbation operation, for example, using an evaluation function from the pair of images extracted in step S301 (step S302). As illustrated in
In a case where determination of all perturbation operations is not completed for the pair of extracted images (step S303: No), the procedure returns to step S302, and the determination device 10 performs determination regarding the remaining perturbation operations. Note that a perturbation operation that obviously does not occur in the pair of extracted images, such as a change in sky color with respect to an indoor image, may be excluded from all the perturbation operations.
On the other hand, in a case where the determination regarding all the perturbation operations is completed for the pair of extracted images (step S303: Yes), the determination device 10 holds a determination result for the pair of extracted images (step S304). To hold the determination result, for example, as illustrated in
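As a hedged sketch of such an evaluation function for a brightness perturbation (the pixel arrays, the candidate list, and the tolerance are illustrative assumptions), the parameter range P0 can be held as the span of candidate shifts under which perturbing one image of the pair reproduces the other within a tolerance:

```python
import numpy as np

def brightness_param_range(img_a, img_b, candidates, tol=2.0):
    """Evaluate candidate brightness shifts c; hold the range P0 = (min, max)
    of the shifts for which img_a + c matches img_b within a mean-error
    tolerance. Returns None when no candidate matches."""
    matches = [c for c in candidates
               if np.mean(np.abs((img_a + c) - img_b)) <= tol]
    return (min(matches), max(matches)) if matches else None

pair_a = np.full((4, 4), 100.0)
pair_b = np.full((4, 4), 110.0)  # the second image is brighter by 10
p0 = brightness_param_range(pair_a, pair_b, range(-20, 21))  # (8, 12)
```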
In a case where the determination regarding the perturbation operation is not completed for all the pairs of extracted images (step S305: No), the procedure returns to step S302, and the determination device 10 determines perturbation operations for remaining pairs of images.
On the other hand, in a case where the determination regarding the perturbation operation is completed for all the pairs of extracted images (step S305: Yes), the determination device 10 aggregates the parameter range P0 held for each type of perturbation operation (step S306). More specifically, for example, for all the pairs of extracted images and for each type of perturbation operation, the determination device 10 (1) retains only the values included within the average ± one standard deviation, or (2) calculates the average and the maximum and minimum values for each of the upper limit and the lower limit of the parameter range P0. Note that the aggregation method may include, for example, other statistical processing.
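The aggregation in step S306 might be sketched as follows for one perturbation type, computing the average, minimum, and maximum of the held lower and upper limits (the dictionary layout is an assumption made for the sketch):

```python
import statistics

def aggregate_ranges(held_ranges):
    """Aggregate the parameter ranges P0 held for each pair of extracted
    images into summary statistics of the lower and upper limits."""
    lowers = [lo for lo, _ in held_ranges]
    uppers = [hi for _, hi in held_ranges]
    def summary(xs):
        return {"mean": statistics.mean(xs), "min": min(xs), "max": max(xs)}
    return {"lower": summary(lowers), "upper": summary(uppers)}

# Three held ranges for one perturbation type, one per image pair.
agg = aggregate_ranges([(8, 12), (6, 10), (10, 14)])
```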
Then, as illustrated in
Next, the format shaping processing in step S4 in the entire processing illustrated in
First, as illustrated in
Next, the determination device 10 acquires a format template as illustrated in
Next, the determination device 10 shapes the parameter range for each perturbation operation acquired in step S401 using the template acquired in step S402 and outputs the parameter range (step S403). More specifically, for example, as illustrated in
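A minimal sketch of this shaping step, assuming a template with hypothetical `Min`/`Max`/`Step` fields (only the notion of a Step value is echoed by the embodiment; the field names themselves are assumptions):

```python
def shape_format(param_ranges, steps):
    """Fill a format template with the aggregated range for each
    perturbation operation so that post-processing can consume it."""
    return {op: {"Min": lo, "Max": hi, "Step": steps[op]}
            for op, (lo, hi) in param_ranges.items()}

shaped = shape_format({"translation": (0, 9), "brightness": (8, 12)},
                      {"translation": 1, "brightness": 0.5})
```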
Then, input data to a machine learning model is generated according to the data augmentation technique using the parameter range for each perturbation operation generated by executing the entire processing illustrated in
In the example in
First, the determination device 10 calculates, for example, a plurality of different parameters for each perturbation operation by dividing the parameter range of the perturbation operation by its Step value. In the example in
Next, the determination device 10 applies the combinations of the calculated parameters to an original image, that is, 100 combinations (10 parameters for the translation × 10 parameters for the brightness) in the example in
By applying the combinations of the parameters of each perturbation operation to the still image 200 or the image 300 that is the original image, as illustrated in
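The generation of the 100 parameter combinations described above can be sketched as follows; the concrete ranges and Step values are illustrative:

```python
import itertools

def parameter_grid(spec):
    """Divide each perturbation's range by its Step into discrete values,
    then form every combination across the perturbation operations."""
    values = {op: [lo + i * step
                   for i in range(int(round((hi - lo) / step)) + 1)]
              for op, (lo, hi, step) in spec.items()}
    ops = list(values)
    return [dict(zip(ops, combo))
            for combo in itertools.product(*(values[op] for op in ops))]

# 10 translation values x 10 brightness values = 100 combinations.
grid = parameter_grid({"translation": (0, 9, 1), "brightness": (0, 18, 2)})
```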
In this way, the determination device 10 generates the input data to the machine learning model, sets each of the generated images as a feature amount, and generates and trains a machine learning model using the feature amount and its correct answer label.
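As a toy stand-in for this generation-and-training step (not the embodiment's actual model; a nearest-centroid classifier over flattened images is used purely for illustration), augmented images serve as feature amounts and persons or objects serve as correct answer labels:

```python
import numpy as np

def train_centroids(images, labels):
    """Toy training: compute one mean-image centroid per correct answer label."""
    return {lab: np.mean([im for im, l in zip(images, labels) if l == lab],
                         axis=0)
            for lab in set(labels)}

def predict(centroids, image):
    """Classify an image by its nearest centroid."""
    return min(centroids,
               key=lambda lab: float(np.linalg.norm(image - centroids[lab])))

# Augmented variants of two classes of images with their labels.
train_x = [np.full(4, 0.0), np.full(4, 1.0), np.full(4, 10.0), np.full(4, 11.0)]
train_y = ["person", "person", "object", "object"]
model = train_centroids(train_x, train_y)
label = predict(model, np.full(4, 0.5))  # "person"
```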
[Effects]
As described above, the determination device 10 specifies a difference between feature amounts of a plurality of first images that is captured in chronological order or of which the difference between the feature amounts is equal to or less than a threshold, refers to information in which the difference is associated with a data augmentation processing type, determines one or a plurality of data augmentation processing types used for processing of generating machine learning data on the basis of the specified difference between the feature amounts, and outputs a result of the determination processing.
The determination device 10 can estimate and select a perturbation operation with conditions close to an application domain for the image used for the input data to the machine learning model. Then, because the machine learning data is generated using the perturbation operation, the determination device 10 may further improve robustness of the machine learning model.
Furthermore, the determination device 10 further generates machine learning data using the result of the determination processing.
As a result, the determination device 10 may further improve the robustness of the machine learning model.
Furthermore, the determination device 10 further trains the machine learning model on the basis of the machine learning data generated using the result of the determination processing.
As a result, the determination device 10 may further improve the robustness of the machine learning model.
[System]
Pieces of information including a processing procedure, a control procedure, a specific name, various types of data, and parameters described above or illustrated in the drawings can be changed in any way unless otherwise specified. Furthermore, the specific examples, distributions, numerical values, and the like described in the embodiments are merely examples and can be changed in any way.
Furthermore, each component of each device illustrated in the drawings is functionally conceptual and does not necessarily have to be physically configured as illustrated in the drawings. For example, specific forms of distribution and integration of each device are not limited to those illustrated in the drawings. For example, all or a part thereof can be configured by being functionally or physically distributed or integrated in optional units according to various types of loads, usage situations, or the like. Moreover, all or any part of individual processing functions performed in each device may be implemented by a central processing unit (CPU) and a program analyzed and executed by the CPU, or may be implemented as hardware by wired logic.
[Hardware]
The communication interface 10a is a network interface card or the like and communicates with another server. The HDD 10b stores programs and databases for activating the functions illustrated in
The processor 10d is a hardware circuit that reads a program that executes processing similar to the processing of each processing unit illustrated in
In this way, the determination device 10 operates as an information processing device that executes operation control processing by reading and executing the program that executes similar processing to each processing unit illustrated in
Furthermore, a program that executes similar processing to each processing unit illustrated in
All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind
---|---|---|---
2021-015337 | Feb 2021 | JP | national