This disclosure relates to a training data generation method, a training data generation program, a training data generation apparatus, and a product identification apparatus.
Patent Document 1 (JP-A No. 2017-27136) discloses a shop system that identifies products by image recognition. Such a system is expected to be applied at store checkout counters, for example.
When an image is captured in which there is more than one product, the products sometimes partially overlap each other. Such overlap is an obstacle for conventional image processing when distinguishing between the plural overlapping products. The same problem arises in image processing that uses machine learning, which has been receiving attention in recent years.
An object of this disclosure is to enable a product identification apparatus that identifies plural products to distinguish between overlapping products when machine learning is used to train a computing unit that computes the quantities of the products.
A training data generation method pertaining to a first aspect generates training data for a computing unit of a product identification apparatus that computes, from a group image in which there are one or more types of products, the quantity of each type of product included in the group image. The training data includes plural learning group images and labels assigned to each of the plural learning group images. The training data generation method comprises a first step of acquiring individual images, in each of which there is one product of one type, and a second step of generating the plural learning group images, each including one or more of the products, by randomly arranging the individual images. The plural learning group images generated in the second step include learning group images in which the individual images at least partially overlap each other.
According to this method, at least some of the learning group images are learning group images in which the individual images at least partially overlap each other. Consequently, training data for configuring a computing unit capable of identifying the overlapping products can be obtained.
A training data generation method pertaining to a second aspect is the training data generation method pertaining to the first aspect, further comprising a third step of assigning, as the labels to the learning group images, the quantity of each type of product included in the learning group images generated in the second step.
According to this method, the training data includes as the labels the quantities of each of the products. Consequently, the computing unit can be trained to be able to identify the quantities of the products.
A training data generation method pertaining to a third aspect is the training data generation method pertaining to the first aspect, further comprising a third step of assigning, as the labels to the learning group images, coordinates of centroids corresponding to each of the individual images included in the learning group images generated in the second step.
According to this method, the training data includes as the labels the coordinates of the centroids of the individual images. Consequently, the computing unit can be trained to not mistake plural products for a single product.
A training data generation method pertaining to a fourth aspect is the training data generation method pertaining to the first aspect, further comprising a third step of assigning, as the labels to the learning group images, replacement images in which each of the individual images included in the learning group images generated in the second step has been replaced with the corresponding representative image.
According to this method, the training data includes as the labels the replacement images in which the individual images have been replaced with the representative images.
A training data generation method pertaining to a fifth aspect is the training data generation method pertaining to the fourth aspect, wherein the representative images are pixels representing centroids of each of the individual images.
According to this method, the training data includes as the labels the replacement images in which the individual images have been replaced with their centroid pixels.
A training data generation method pertaining to a sixth aspect is the training data generation method pertaining to the fourth aspect, wherein the representative images are outlines of each of the individual images.
According to this method, the training data includes as the labels the replacement images in which the individual images have been replaced with their outlines.
A training data generation method pertaining to a seventh aspect is the training data generation method pertaining to any one of the first aspect to the sixth aspect, wherein, in the second step, an upper limit and a lower limit can be designated for an overlap ratio, which is defined as the ratio of the area of overlap to the area of an individual image.
According to this method, the degree of overlap between the individual images in the learning group images is designated. Consequently, the computing unit can learn in a manner suited to degrees of overlap that can realistically occur.
A training data generation method pertaining to an eighth aspect is the training data generation method pertaining to any one of the first aspect to the seventh aspect, wherein, in the second step, at least one of the following processes is performed on each individual image when arranging the individual images: a process that enlarges or reduces the individual image at a random rate, a process that rotates the individual image by a random angle, a process that changes the contrast of the individual image by a random degree, and a process that randomly inverts the individual image.
According to this method, the volume of the training data increases. Consequently, the recognition accuracy of the computing unit can be improved.
A training data generation method pertaining to a ninth aspect is the training data generation method pertaining to any one of the first aspect to the eighth aspect, wherein the products are food products.
According to this method, the recognition accuracy of the computing unit can be improved in regard to food products.
A training data generation program pertaining to a tenth aspect generates training data for a computing unit of a product identification apparatus that computes, from a group image in which there are one or more types of products, the quantity of each type of product included in the group image. The training data includes plural learning group images and labels assigned to each of the plural learning group images. The training data generation program causes a computer to function as an individual image acquisition unit that acquires individual images, in each of which there is one product of one type, and a learning group image generation unit that generates the plural learning group images, each including one or more of the products, by randomly arranging the individual images. Included among the learning group images are learning group images in which the individual images at least partially overlap each other.
According to this configuration, at least some of the learning group images are learning group images in which the individual images at least partially overlap each other. Consequently, training data for configuring a computing unit capable of identifying the overlapping products can be obtained.
A training data generation apparatus pertaining to an eleventh aspect generates training data for a computing unit of a product identification apparatus that computes, from a group image in which there are one or more types of products, the quantity of each type of product included in the group image. The training data includes plural learning group images and labels assigned to each of the plural learning group images. The training data generation apparatus comprises an individual image acquisition unit that acquires individual images, in each of which there is one product of one type, and a learning group image generation unit that generates the plural learning group images, each including one or more of the products, by randomly arranging the individual images. The learning group image generation unit causes the individual images to at least partially overlap each other.
According to this configuration, at least some of the learning group images are learning group images in which the individual images at least partially overlap each other. Consequently, training data for configuring a computing unit capable of identifying the overlapping products can be obtained.
A product identification apparatus pertaining to a twelfth aspect computes, from a group image in which there are one or more types of products, the quantity of each type of product included in the group image. The product identification apparatus comprises a camera and a neural network that processes output from the camera. The neural network is trained using training data. The training data includes plural learning group images and labels assigned to each of the plural learning group images. The plural learning group images include learning group images in which individual images of the products at least partially overlap each other.
According to this configuration, training data including individual images of plural products that overlap each other is used to train the neural network. Consequently, the recognition accuracy of the neural network is improved.
According to this disclosure, training data for configuring a computing unit capable of identifying overlapping products can be obtained.
Embodiments of the present invention will be described below with reference to the drawings. It will be noted that the following embodiments are specific examples of the present invention and are not intended to limit the technical scope of the present invention.
The product identification apparatus 10 has an imaging device 20 and an identification computer 30. The imaging device 20 and the identification computer 30 are connected to each other via a network N. The network N here may be a LAN or a WAN. The imaging device 20 and the identification computer 30 may be installed in locations remote from each other. For example, the identification computer 30 may be configured as a cloud server. Alternatively, the imaging device 20 and the identification computer 30 may also be directly connected to each other without the intervention of the network N.
The imaging device 20 has a base 21, a support 22, a light source 23, a camera 24, a display 25, and an input unit 26. The base 21 functions as a platform on which to place the tray T. The support 22 supports the light source 23 and the camera 24. The light source 23 is for illuminating the products placed on the tray T. The camera 24 is for imaging the products G placed on the tray T. The display 25 is for displaying the identification results of the products G. The input unit 26 is for inputting the names and so forth of the products G.
The product determination unit 35 has a computing unit X. The computing unit X is a function approximator capable of learning input/output relationships, and is typically configured as a multi-layered neural network. The computing unit X acquires a learned model M as a result of prior machine learning. The machine learning is typically performed as deep learning, but it is not limited to this.
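For illustration only, the following is a minimal sketch of such a computing unit in Python using PyTorch (the disclosure does not name a framework, and the class name, layer sizes, and architecture here are assumptions): a small convolutional network that maps a group image to an N-dimensional vector of predicted quantities.

```python
import torch
import torch.nn as nn

class QuantityCounter(nn.Module):
    """Illustrative multi-layered neural network: maps an RGB group image
    to an N-dimensional vector of per-type product quantities."""

    def __init__(self, num_types: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),  # collapse the spatial dimensions
        )
        self.head = nn.Linear(32, num_types)  # one predicted count per product type

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x).flatten(1))

# Usage sketch: QuantityCounter(num_types=3)(torch.rand(1, 3, 224, 224)) -> shape (1, 3)
```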
A learning phase, in which the computing unit X of the identification computer 30 acquires the learned model M, is performed by supervised learning. The supervised learning is executed using the training data 40.
In this embodiment, each learning group image 41 comprises a combination of individual images 43a to 43c, which correspond to the products G1 to G3, respectively.
The training data 40 is generated by a training data generation apparatus 50. The training data generation apparatus 50 comprises an individual image acquisition unit, a learning group image generation unit 62, and a label assignment unit 63.
The training data generation apparatus 50 generates the training data 40 by the following procedure. First, the individual image acquisition unit acquires individual images in each of which there is one product G1.
This acquisition of individual images is also performed in regard to the products G2 and G3.
Next, settings are input to the training data generation apparatus 50 (step 106). The settings include, for example, the number of learning group images 41 to be generated and the upper limit and lower limit of the overlap ratio.
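A sketch of how such settings might be represented in Python; every field name and default value here is an illustrative assumption, not a value taken from the disclosure.

```python
from dataclasses import dataclass

@dataclass
class GenerationSettings:
    """Illustrative settings for step 106 (names and defaults are assumptions)."""
    num_group_images: int = 10000  # how many learning group images 41 to generate
    overlap_lower: float = 0.0     # lower limit of the overlap ratio
    overlap_upper: float = 0.5     # upper limit of the overlap ratio
```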
Next, the learning group image generation unit 62 generates one learning group image 41 by randomly arranging the individual images (step 108). Specifically, before being arranged, each individual image is subjected to at least one of enlargement or reduction at a random rate, rotation by a random angle, a random change in contrast, and random inversion.
These processes are intended to reproduce individual differences that are often seen in food products. Individual differences are differences, such as in size, shape, and color (for example, the extent to which bread is baked), that arise within the same product. Moreover, variations in the directions in which the products G are arranged can be handled by the rotation process.
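A sketch of these per-image processes using the Pillow library, assuming the individual images are RGBA cutouts on a transparent background; the function name and parameter ranges are illustrative assumptions.

```python
import random
from PIL import Image, ImageEnhance

def randomize_individual_image(img: Image.Image, rng: random.Random) -> Image.Image:
    """Apply random enlargement/reduction, rotation, contrast change,
    and inversion (mirroring) to one RGBA product cutout."""
    scale = rng.uniform(0.8, 1.2)  # illustrative range
    img = img.resize((max(1, round(img.width * scale)),
                      max(1, round(img.height * scale))))
    img = img.rotate(rng.uniform(0.0, 360.0), expand=True)  # corners stay transparent
    img = ImageEnhance.Contrast(img).enhance(rng.uniform(0.7, 1.3))
    if rng.random() < 0.5:
        img = img.transpose(Image.Transpose.FLIP_LEFT_RIGHT)
    return img
```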
Moreover, some of the individual images are arranged so as to at least partially overlap each other, within the limits of the overlap ratio designated in the settings.
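A sketch of the arrangement in step 108 using Pillow, assuming RGBA cutouts smaller than the canvas; the function name, canvas size, and white background are illustrative assumptions. Because positions are drawn independently, pasted cutouts can partially cover one another, producing the overlapping learning group images described above.

```python
import random
from PIL import Image

def compose_learning_group_image(cutouts, canvas_size=(512, 512), rng=None):
    """Randomly arrange RGBA product cutouts on a blank canvas.

    Later cutouts may partially cover earlier ones.  Assumes each cutout
    is no larger than the canvas.
    """
    rng = rng or random.Random()
    canvas = Image.new("RGBA", canvas_size, (255, 255, 255, 255))
    placements = []  # record of (x, y) positions, kept for label generation
    for cut in cutouts:
        x = rng.randint(0, canvas_size[0] - cut.width)
        y = rng.randint(0, canvas_size[1] - cut.height)
        canvas.alpha_composite(cut, dest=(x, y))
        placements.append((x, y))
    return canvas.convert("RGB"), placements
```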
Next, the label assignment unit 63 generates the label 42 and assigns it to the learning group image 41 (step 110). Specifically, the label assignment unit 63 generates the label 42 from the record of the individual images arranged in the learning group image 41. The label 42 in this embodiment indicates the quantity of each of the products G1 to G3. The label 42 is assigned to the learning group image 41; that is, it is recorded in association with the learning group image 41.
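A sketch of this label generation, assuming the record of arranged images is kept as a list of product-type identifiers; the names are illustrative.

```python
from collections import Counter

def make_quantity_label(placed_types, all_types=("G1", "G2", "G3")):
    """Build the label 42 by counting, per product type, how many
    individual images were arranged in the learning group image 41.

    placed_types: e.g. ["G1", "G1", "G3"] -> {"G1": 2, "G2": 0, "G3": 1}
    """
    counts = Counter(placed_types)
    return {t: counts.get(t, 0) for t in all_types}
```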
The training data generation apparatus 50 repeats step 108 and step 110 until the number of learning group images 41 to which labels 42 have been assigned reaches the set number. In this way, numerous sets of learning group images 41 and labels 42 are generated.
(3-1)
At least some of the plural learning group images 41 are learning group images in which the individual images 43a1 to 43a6, 43b1 to 43b6, 43c1 to 43c6 at least partially overlap each other. Consequently, according to the method of generating the training data 40, the program for generating the training data 40, and the training data generation apparatus 50 of this disclosure, training data 40 for configuring a computing unit X capable of identifying the overlapping products G can be obtained.
(3-2)
The training data 40 includes as the labels the quantities of each of the products G. Consequently, the computing unit X can be trained to be able to identify the quantities of the products G.
(3-3)
The degree of overlap between the individual images 43a1 to 43a6, 43b1 to 43b6, 43c1 to 43c6 in the learning group images 41 is designated. Consequently, the computing unit X can learn in a manner suited to degrees of overlap that can realistically occur.
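One way the designated limits could be enforced during arrangement, assuming axis-aligned placement rectangles; the helper names and default limits are illustrative assumptions.

```python
def overlap_ratio(box_a, box_b):
    """Area of overlap divided by the area of box_a.

    Boxes are (x, y, width, height) in canvas coordinates.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0, min(ay + ah, by + bh) - max(ay, by))
    return (iw * ih) / (aw * ah)

def within_limits(new_box, placed_boxes, lower=0.0, upper=0.5):
    """True if every pairwise overlap with the new box stays in [lower, upper];
    a placement could be redrawn at random until this holds."""
    return all(lower <= overlap_ratio(new_box, b) <= upper for b in placed_boxes)
```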
(3-4)
Before being arranged in the learning group image 41, the individual images 43a1 to 43a6, 43b1 to 43b6, 43c1 to 43c6 are subjected to enlargement/reduction, rotation, changes in contrast, and inversion. Consequently, the volume of the training data 40 increases, so the recognition accuracy of the computing unit X can be improved.
(3-5)
The recognition accuracy of the computing unit X can be improved in regard to food products.
(3-6)
With the product identification apparatus 10 of this disclosure, the training data 40 including the individual images 43a1 to 43a6, 43b1 to 43b6, 43c1 to 43c6 of the plural products G that overlap each other is used to train the neural network. Consequently, the recognition accuracy of the neural network is improved.
In this embodiment, the label 42 includes the coordinates of the centroids of the individual images 43a1 to 43a6, 43b1 to 43b6, 43c1 to 43c6 arranged in the learning group image 41. In step 110, these coordinates are assigned to the learning group image 41 as the label 42.
The product identification apparatus 10 that has acquired the learned model M using this training data 40 first obtains the coordinates of the centroids of each of the products G in the inference phase. Conversion from the coordinates of the centroids to the quantities of the products G is performed by another dedicated program stored in the identification computer 30.
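A sketch of such a conversion program, assuming the network's centroid output is grouped by product type; the input format is an assumption.

```python
def quantities_from_centroids(centroids_by_type):
    """Convert predicted centroid coordinates into product quantities.

    centroids_by_type: e.g. {"G1": [(40, 52), (130, 88)], "G2": [], "G3": [(10, 10)]}
    -> {"G1": 2, "G2": 0, "G3": 1}
    """
    return {ptype: len(coords) for ptype, coords in centroids_by_type.items()}
```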
The training data 40 includes as the labels 42 the coordinates of the centroids of the individual images 43a1 to 43a6, 43b1 to 43b6, 43c1 to 43c6. Consequently, the computing unit X can be trained to not mistake plural products G for a single product.
In this embodiment, the label 42 is a replacement image in which the individual images 43a1 to 43a6, 43b1 to 43b6, 43c1 to 43c6 included in the learning group image 41 are replaced with representative images. In this embodiment, the representative images are centroid pixels P of the individual images 43a1 to 43a6, 43b1 to 43b6, 43c1 to 43c6. In step 110, this replacement image is assigned to the learning group image 41 as the label 42.
The format of the label 42 will be further described. The label 42 is, for example, an image of the same size as the learning group image 41. In a case where the learning group image 41 has X × Y pixels arrayed in X columns and Y rows, the label 42 also has X × Y pixels arrayed in X columns and Y rows. However, the pixels of the label 42 are not RGB values but N-dimensional vectors, where N is the number of types of products G registered in the training data generation apparatus 50 (e.g., N = 3 in a case where the products G1, G2, and G3 are registered). The pixel at the x-th column and y-th row is given as the following vector.

A(x, y) = (a_xy1, a_xy2, . . . , a_xyi, . . . , a_xyN)   [Formula 1]

Here, a_xyi is the number of products G of the i-th type at coordinate (x, y), that is, the number of centroid pixels P corresponding to products G of the i-th type existing at coordinate (x, y).
The product identification apparatus 10 that has acquired the learned model M using this training data 40 first obtains the replacement images in the inference phase. The replacement images are also configured by pixels given by the vector A. Conversion from the replacement images to the quantities of the products G is performed by another dedicated program stored in the identification computer 30. For example, the program finds the quantities Hi of the products G of the i-th type by summing a_xyi over all of the pixels:

Hi = Σ_{x=1}^{X} Σ_{y=1}^{Y} a_xyi   [Formula 2]
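In code, Formula 2 amounts to summing each channel of the replacement image. A NumPy sketch, assuming the label is held as a (Y, X, N) array:

```python
import numpy as np

def quantities_from_replacement_image(label_array: np.ndarray) -> np.ndarray:
    """Recover (H_1, ..., H_N) from a (Y, X, N) replacement image whose
    channel i counts centroid pixels P of the i-th product type."""
    return label_array.sum(axis=(0, 1))
```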
The training data 40 includes as the label 42 the replacement images in which the individual images 43a1 to 43a6, 43b1 to 43b6, 43c1 to 43c6 included in the learning group image 41 have been replaced with the centroid pixels P. Consequently, the computing unit X can be trained to not mistake plural products G for a single product.
(3-1)
In the third embodiment, one centroid pixel P is used as the representative image depicting one individual image. Instead of this, a region comprising plural pixels representing the centroid position may also be used as the representative image depicting one individual image. In this case, the above formula is appropriately modified, for example by applying a corrective coefficient, so that the quantities Hi of the products G of the i-th type can still be accurately calculated.
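A sketch of that modification, assuming each product is represented by a region of exactly k pixels, so each product contributes k to the per-channel sum:

```python
import numpy as np

def quantities_from_region_labels(label_array: np.ndarray, k: int) -> np.ndarray:
    """Recover H_i when each product is marked by a k-pixel region:
    divide the per-channel sum of the (Y, X, N) label array by k."""
    return label_array.sum(axis=(0, 1)) / k
```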
(3-2)
In the third embodiment, the centroid pixel P is used as the representative image. Instead of this, the representative image may be another pixel. For example, the representative image may be the pixel at the center point of a quadrangular region surrounding the individual image (where each of the four sides of the region passes through the top, bottom, right, or left endpoint of the individual image). Alternatively, the representative image may be the pixel at one vertex (e.g., the lower left vertex) of the quadrangular region surrounding the individual image.
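A sketch of computing these alternative representative pixels from a product's boolean mask using NumPy; image coordinates with y increasing downward are assumed, so the "lower left" vertex has the smallest x and the largest y.

```python
import numpy as np

def box_representatives(mask: np.ndarray):
    """Return the center pixel and the lower-left vertex of the
    axis-aligned quadrangular region enclosing a boolean mask."""
    ys, xs = np.nonzero(mask)
    left, right = int(xs.min()), int(xs.max())
    top, bottom = int(ys.min()), int(ys.max())
    center = ((left + right) // 2, (top + bottom) // 2)
    lower_left = (left, bottom)  # largest y = "lower" when y grows downward
    return center, lower_left
```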
In this embodiment, the label 42 is a replacement image in which the individual images 43a1 to 43a6, 43b1 to 43b6, 43c1 to 43c6 included in the learning group image 41 have been replaced with representative images. In this embodiment, the representative images are outline images O of the individual images 43a1 to 43a6, 43b1 to 43b6, 43c1 to 43c6. In step 110, this replacement image is assigned to the learning group image 41 as the label 42.
The product identification apparatus 10 that has acquired the learned model M using this training data 40 first obtains the replacement images in the inference phase. Conversion from the replacement images to the quantities of the products G is performed by another dedicated program stored in the identification computer 30.
The training data 40 includes as the labels 42 the replacement images in which the individual images 43a1 to 43a6, 43b1 to 43b6, 43c1 to 43c6 included in the learning group image 41 have been replaced with the outline images O of the individual images 43a1 to 43a6, 43b1 to 43b6, 43c1 to 43c6. Consequently, the computing unit X can be trained to not mistake plural products for a single product.
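A sketch of deriving an outline image O from a product's boolean mask using NumPy, under the assumption that the outline consists of mask pixels having at least one 4-connected background neighbor:

```python
import numpy as np

def outline_from_mask(mask: np.ndarray) -> np.ndarray:
    """One-pixel-wide outline: mask pixels minus the eroded interior."""
    padded = np.pad(mask, 1, constant_values=False)
    interior = (
        padded[:-2, 1:-1] & padded[2:, 1:-1] &   # up and down neighbors
        padded[1:-1, :-2] & padded[1:-1, 2:]     # left and right neighbors
    ) & mask
    return mask & ~interior
```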
This application claims priority to Japanese Patent Application No. 2018-212304, filed in Japan in November 2018.