The present invention relates to a training data generation apparatus, a training data generation method, and a program.
Studies have been conducted on techniques for estimating the condition (steps, slopes, etc.) of the surface of a road such as a pavement or a roadway on which a moving body such as an automobile, a pedestrian, or a wheelchair moves, by using sensors mounted on the moving body (for example, see NPL 1 and NPL 2).
[NPL 1] Akihiro Miyata, Iori Araki, Tongshun Wang, Tenshi Suzuki, “A Study on Barrier Detection Using Sensor Data of Unimpaired Walkers”, IPSJ journal (2018)
The condition of a road surface as described above is often estimated using a model that has been built through machine learning performed using training data. However, machine learning performed using training data is problematic in that sufficient learning accuracy cannot be acquired, and in that a large amount of training data is required for machine learning, which results in an increase in costs, for example.
An object of the present invention made in view of the problems above is to provide a training data generation apparatus, a training data generation method, and a program that are capable of generating training data that realizes learning with high accuracy, while suppressing an increase in costs.
To solve the above-described problems, a training data generation apparatus according to the present invention includes a noise determination unit that determines whether or not training data that is to be used in machine learning includes noise, and a noise addition unit that generates new training data by adding noise to training data that has been determined by the noise determination unit as not including noise.
Also, to solve the above-described problems, a training data generation method according to the present invention is a training data generation method that is to be carried out by a training data generation apparatus, comprising the steps of: determining whether or not training data that is to be used in machine learning includes noise; and generating new training data by adding noise to training data that has been determined as not including noise.
Also, to solve the above-described problems, a program according to the present invention enables a computer to function as the above-described training data generation apparatus.
With the training data generation apparatus, the training data generation method, and the program according to the present invention, it is possible to generate training data that realizes learning with high accuracy, while suppressing an increase in costs.
Hereinafter, an embodiment for carrying out the present invention will be described with reference to the drawings. In the drawings, the same reference numerals indicate the same or equivalent constituent elements.
The training data generation apparatus 10 shown in
Training data that includes road surface data detected by sensors (such as an acceleration sensor, a gyro sensor, and a gravity sensor) mounted on the moving body is input to the noise determination unit 11 as determination-target training data. Road surface data consists of sensor values detected during a period in which the moving body moves on the road surface, and is time-series data. Training data that is input to the noise determination unit 11 is, for example, data formed by attaching teacher labels to road surface data acquired during a predetermined period, the teacher labels indicating the condition of the road surface (whether or not the road surface is flat, whether or not there is a step, etc.) during the predetermined period. Teacher labels are manually attached, for example. The training data input to the noise determination unit 11 need not have teacher labels attached; teacher labels may be attached at any point in time after the noise determination unit 11 has performed the determination described below.
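As a minimal sketch of how such training data could be represented, the following Python fragment pairs per-sensor time series with an optional teacher label. The class name `TrainingSample`, the field names, and the label string "flat" are hypothetical and chosen only for illustration; the embodiment does not prescribe a data layout.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# One three-axis sensor reading (X, Y, Z). The layout is an assumption.
Vec3 = Tuple[float, float, float]


@dataclass
class TrainingSample:
    """Road surface data for one predetermined period (hypothetical layout)."""

    # Sensor name -> time series of three-axis readings.
    readings: Dict[str, List[Vec3]]
    # Teacher label; None until a label is attached.
    label: Optional[str] = None


# Road surface data may be input without a teacher label...
sample = TrainingSample(
    readings={"acceleration": [(0.0, 0.1, 9.8), (0.1, 0.0, 9.7)]}
)
# ...and the label may be attached at a later point in time.
sample.label = "flat"
```

Keeping the label optional mirrors the statement that teacher labels may be attached after the noise determination has been performed.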
The noise determination unit 11 determines whether or not the input determination-target training data (road surface data) includes noise. In general, values detected by the sensors when the moving body travels on a rough road surface fluctuate more widely than values detected by the sensors when the moving body travels on a smooth road surface. In other words, fluctuations in road surface data are small during a period in which the moving body travels on a smooth road surface, and fluctuations in road surface data are large during a period in which the moving body travels on a rough road surface. The noise determination unit 11 determines training data that includes road surface data obtained during a period in which fluctuations are large (larger than a predetermined value, for example), such as road surface data obtained during a period in which the moving body travels on a rough surface, as training data that includes noise. Similarly, the noise determination unit 11 determines training data that includes road surface data obtained during a period in which fluctuations are small (smaller than a predetermined value, for example), such as road surface data obtained during a period in which the moving body travels on a smooth surface, as training data that does not include noise. That is to say, the noise determination unit 11 determines whether or not training data includes noise based on the magnitude of fluctuations in the values of training data (the values of road surface data in the present embodiment).
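The determination described above can be sketched as a simple threshold test. The sketch below assumes that the magnitude of fluctuation is measured as the population standard deviation of each axis and that the predetermined value is a single threshold shared by all axes; neither choice is specified in the embodiment, and the function name `includes_noise` and the sample values are hypothetical.

```python
from statistics import pstdev
from typing import Sequence


def includes_noise(axes: Sequence[Sequence[float]], threshold: float) -> bool:
    """Return True if any axis fluctuates more than the threshold.

    Assumption: fluctuation is measured by the population standard
    deviation of the sensor values on each axis.
    """
    return any(pstdev(axis) > threshold for axis in axes)


# Illustrative single-axis series (made-up values, not measured data).
smooth = [[0.0, 0.01, -0.01, 0.0]]
rough = [[0.0, 1.5, -1.2, 0.9]]

smooth_is_noisy = includes_noise(smooth, threshold=0.5)
rough_is_noisy = includes_noise(rough, threshold=0.5)
```

Other fluctuation measures, such as the peak-to-peak range, could be substituted without changing the structure of the test.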
Upon determining that the determination-target training data includes noise, the noise determination unit 11 adds the determination-target training data to integrated training data stored in the integrated training data storage unit 13, as training data that includes noise (hereinafter referred to as “training data with noise”). Integrated training data is data formed by integrating pieces of training data corresponding to various states to be estimated (various conditions of a road surface in the present embodiment).
Upon determining that the determination-target training data does not include noise, the noise determination unit 11 adds the determination-target training data to the integrated training data as training data that does not include noise (hereinafter referred to as “training data without noise”). Also, the noise determination unit 11 outputs the determination-target training data (training data without noise) to the noise addition unit 12.
The noise addition unit 12 adds noise to the training data determined by the noise determination unit 11 as not including noise, and adds the resulting data to the integrated training data stored in the integrated training data storage unit 13, as training data with noise. In other words, the noise addition unit 12 generates new training data by adding noise to the training data determined as not including noise. Details of noise addition performed by the noise addition unit 12 will be described below.
The integrated training data storage unit 13 integrates and stores the training data with noise, output from the noise determination unit 11 and the noise addition unit 12, and the training data without noise, output from the noise determination unit 11, as integrated training data. Upon a predetermined amount of training data being stored, the integrated training data storage unit 13 outputs the integrated training data stored therein.
The estimation system 1 shown in
The learning apparatus 20 includes a learning unit 21. The learning unit 21 performs machine learning on a learning model 22, using the training data generated by the training data generation apparatus 10, and thus builds a trained model 23. Various models, such as a convolutional neural network or a support vector machine (SVM), may be used as the learning model 22.
The estimation apparatus 30 includes an estimation unit 31. Road surface data detected by sensors mounted on the moving body moving on a road surface is input to the estimation unit 31 as input data. The estimation unit 31 inputs the input data to the trained model 23 built by the learning apparatus 20, and outputs the output from the trained model 23 as the result of estimation of the condition of the road surface on which the moving body moves.
As described above, in the estimation system 1 shown in
Upon receiving input determination-target training data (step S11), the noise determination unit 11 determines whether or not the determination-target training data includes noise (step S12).
Upon determining that the determination-target training data includes noise (step S12: Yes), the noise determination unit 11 adds the determination-target training data to the integrated training data as training data with noise (step S13).
Upon determining that the determination-target training data does not include noise (step S12: No), the noise determination unit 11 adds the determination-target training data to the integrated training data as training data without noise (step S14).
The noise addition unit 12 adds noise to the training data determined by the noise determination unit 11 as not including noise (step S15), and adds the training data to which noise has been added, to the integrated training data as training data with noise (step S16).
Training data that does not include noise is, for example, training data that corresponds to a case in which the road surface on which the moving body moves is smooth. As shown in
Again, as shown in
Upon determining that at least the predetermined amount of integrated training data has been collected (step S17: Yes), the integrated training data storage unit 13 outputs the integrated training data stored therein (step S18) and terminates processing.
Upon determining that the predetermined amount of integrated training data has not yet been collected (step S17: No), processing returns to step S11, and new determination-target training data is input to the noise determination unit 11.
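The flow of steps S11 to S18 can be sketched as a single loop. In the sketch below, the function name `build_integrated_data`, the tag strings, and the two callables passed in are hypothetical; the stand-in callables in the usage part are deliberately simplistic and deterministic so that the control flow is easy to follow, and they are not the determination or noise addition of the embodiment.

```python
from typing import Callable, Iterable, List, Tuple

Series = List[float]


def build_integrated_data(
    inputs: Iterable[Series],
    includes_noise: Callable[[Series], bool],
    add_noise: Callable[[Series], Series],
    target_count: int,
) -> List[Tuple[str, Series]]:
    """Collect tagged training data until target_count items are stored."""
    integrated: List[Tuple[str, Series]] = []
    for data in inputs:                                         # S11
        if includes_noise(data):                                # S12
            integrated.append(("with_noise", data))             # S13
        else:
            integrated.append(("without_noise", data))          # S14
            integrated.append(("with_noise", add_noise(data)))  # S15, S16
        if len(integrated) >= target_count:                     # S17
            break
    # S18. If the inputs run out first, whatever was collected is returned.
    return integrated


# Stand-in callables for illustration only.
result = build_integrated_data(
    inputs=[[0.0, 0.1], [0.0, 2.0], [0.2, 0.2]],
    includes_noise=lambda d: max(d) - min(d) > 1.0,
    add_noise=lambda d: [v + 0.5 for v in d],
    target_count=3,
)
tags = [tag for tag, _ in result]
```

Here the first input yields two entries (the original and its noise-added copy) and the second input yields one, so the target of three is reached before the third input is read.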
Next, addition of noise performed by the noise addition unit 12 will be described with reference to
The noise addition unit 12 adds noise to the training data without noise, in all directions, instead of adding noise in only the vertical direction in which detection values fluctuate due to unevenness of the road surface, for example. That is to say, the noise addition unit 12 adds noise to road surface data in the directions of the three axes (the X, Y and Z axes) that are orthogonal to each other. As a result, it is possible to build a model for estimating the condition of the road surface from training data, regardless of the orientation of the device on which the sensors that detect the road surface data are mounted.
For each of the plurality of types of sensors, the noise addition unit 12 adds, to the values detected by the sensor, noise values that are distributed in a normal distribution with a mean of 0 and a variance that is the same as the variance of the values detected by the sensor, for example. That is to say, the noise addition unit 12 adds noise values according to Formula (1) shown below, where x denotes a value detected by the sensor to which a noise value has not been added, x′ denotes the value to which a noise value has been added, and std^2 denotes the variance of the values detected by the sensor.
x′ = x + N(0, std^2)   Formula (1)
Note that N(μ, σ^2) denotes random values that are distributed in a normal distribution with a mean of μ and a variance of σ^2.
By adding noise values that are distributed in a normal distribution as described above in each of the three axis directions that are orthogonal to each other, it is possible to prevent the mean and the variance from significantly changing before and after the addition of the noise values. Note that noise values to be added to training data may be noise values that are not distributed in a normal distribution as described above. Also, the variance of the normal distribution may be greater than the variance of the values detected by the sensor.
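A minimal sketch of Formula (1), applied independently to each of the three axes, might look as follows. It assumes that std is the population standard deviation of the values on each axis and that the noise values are drawn independently per sample; the function names, the `seed` parameter, and the sample values are hypothetical and not part of the embodiment.

```python
import random
from statistics import pstdev
from typing import List, Optional


def add_gaussian_noise(values: List[float], rng: random.Random) -> List[float]:
    """Apply x' = x + N(0, std^2) to every value of one axis."""
    # Assumption: std is the population standard deviation of this axis.
    std = pstdev(values)
    return [v + rng.gauss(0.0, std) for v in values]


def augment_xyz(
    axes: List[List[float]], seed: Optional[int] = None
) -> List[List[float]]:
    """Add noise on the X, Y and Z axes, not only on the vertical axis."""
    rng = random.Random(seed)
    return [add_gaussian_noise(axis, rng) for axis in axes]


# Illustrative three-axis series (made-up values, not measured data).
clean = [
    [0.00, 0.10, -0.10, 0.05],  # X
    [0.02, -0.03, 0.01, 0.00],  # Y
    [9.80, 9.70, 9.90, 9.80],   # Z
]
noisy = augment_xyz(clean, seed=0)
```

Passing a fixed seed makes the generated training data reproducible, which can be convenient when comparing trained models; omitting it gives different noise on each call.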
In the example shown in
As described above, in the present embodiment, the training data generation apparatus 10 includes a noise determination unit 11 that determines whether or not training data that is to be used in machine learning includes noise, and a noise addition unit 12 that generates new training data by adding noise to training data that has been determined by the noise determination unit 11 as not including noise.
By generating new training data by adding noise to training data that has been determined as not including noise, it is possible to generate a sufficient amount of training data and realize learning with high accuracy, while suppressing an increase in costs.
Although the present embodiment has been described using an example in which the training data generation apparatus 10 generates training data from road surface data detected by sensors mounted on a moving body, the present invention is not limited to such an example. The training data generation apparatus 10 can generate training data from various kinds of data that may include noise.
While the training data generation apparatus 10 has been described above, a computer may be used so as to function as the training data generation apparatus 10. Such a computer can be realized by storing a program that describes the content of processing performed to realize the functions of the training data generation apparatus 10, in a storage unit of the computer, and causing a CPU of the computer to read out and execute the program.
The program may be recorded on a computer-readable recording medium. By using such a recording medium, it is possible to install the program to a computer. Here, the recording medium on which the program is recorded may be a non-transitory recording medium. A non-transitory recording medium is not specifically limited, and may be a recording medium such as a CD-ROM or a DVD-ROM, for example.
Although the above embodiment has been described as a representative example, it is obvious to a person skilled in the art that various modifications and replacements may be applied within the spirit and the scope of the present invention. Therefore, the present invention should not be construed as being limited by the above-described embodiment, and various modifications and changes may be made without departing from the scope of the claims. For example, it is possible to combine a plurality of constituent blocks described in the configuration diagram according to the embodiment into one block, or to divide one constituent block into a plurality of blocks.
Number | Date | Country | Kind |
---|---|---|---|
2018-174580 | Sep 2018 | JP | national |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2019/035167 | 9/6/2019 | WO | 00 |