The present invention relates to odor data analysis.
In recent years, odor analysis has been carried out using sets of odor data obtained from odor sensors. In such analysis, it is desirable to quantitatively express a meaningful relationship between the sets of odor data obtained from the odor sensors. For example, it is desirable to quantitatively express a relationship between odor data representing an odor of a wine A and odor data representing an odor of a wine B, both obtained from odor sensors. Moreover, even when odors belong to completely different categories, such as the odor of coffee and the odor of tires, it is desirable to quantitatively express their relationship in an integrated manner.
Patent Document 1 describes a technique called “word embedding” or “distributed representation of words” using, for example, word2vec, GloVe, or the like. Incidentally, “word embedding” or “distributed representation of words” is a technique for expressing the meaning of each word by a low-dimensional real-valued vector.
Patent Document 1: Japanese Laid-open Patent Publication No. 2017-151838
However, because the technique of Patent Document 1 represents semantic relationships in a natural language by vectors, it cannot quantitatively express a relationship between sets of odor data obtained from odor sensors.
It is one object of the present invention to quantitatively express the relationship between the sets of odor data obtained from the odor sensors.
In order to solve the above problems, according to an example aspect of the present invention, there is provided a data processing apparatus including:
an acquisition unit configured to acquire odor data; and
a prediction unit configured to predict a label of the acquired odor data in a label space by using a model in which a relationship between sets of odor data and labels in the label space expressing features of odors is learned.
According to another example aspect of the present invention, there is provided a learning apparatus including:
a learning data acquisition unit configured to acquire learning data in which odor data of each object and a label representing the object in a label space expressing features of odors are associated with each other; and
a learning unit configured to train a model for predicting a label of odor data in the label space from the odor data, by using the learning data.
According to still another example aspect of the present invention, there is provided an information processing method, including:
acquiring odor data; and
predicting a label of the acquired odor data in a label space by using a model in which a relationship between sets of odor data and labels in the label space expressing features of odors is learned.
According to a further example aspect of the present invention, there is provided a recording medium storing a program, the program causing a computer to perform a process including:
acquiring odor data; and
predicting a label of the acquired odor data in a label space by using a model in which a relationship between sets of odor data and labels in the label space expressing features of odors is learned.
According to the present invention, it is possible to quantitatively express each relationship among sets of odor data obtained from odor sensors.
[Principles]
First, a basic principle in example embodiments of the present invention will be described. A prediction apparatus in the example embodiments expresses odor data with vectors by predicting a label in a certain vector space with respect to input odor data.
A prediction apparatus 10 predicts a label in a vector space indicating odor features based on input odor data. Because the label indicates a vector quantity in that vector space, the odor data are expressed by a vector. By expressing odor data in the vector space, it becomes possible to quantitatively analyze the relationships among multiple odors, and to add and subtract odors.
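The following is a minimal sketch of what adding and subtracting odors could look like once each odor is expressed as a label vector. The three-dimensional label space, the numeric values, and the names `coffee`, `milk`, `add`, and `subtract` are illustrative assumptions and do not appear in the example embodiments.

```python
from typing import List

Vector = List[float]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two label vectors."""
    return [x + y for x, y in zip(a, b)]


def subtract(a: Vector, b: Vector) -> Vector:
    """Element-wise difference of two label vectors."""
    return [x - y for x, y in zip(a, b)]


# Hypothetical labels in an assumed 3-dimensional label space.
coffee: Vector = [0.75, 0.0, 0.25]
milk: Vector = [0.0, 0.5, 0.25]

# Adding two odors yields a new point in the same label space.
blend = add(coffee, milk)

# Subtracting one component from the blend recovers the other.
recovered = subtract(blend, milk)
```

Whether such arithmetic is meaningful depends on the label space that is chosen; the sketch only shows that the operations are well defined once odors are vectors.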
(Hardware Configuration)
The interface 12 communicates with an external apparatus. Specifically, the interface 12 is used to input odor data from the odor sensor 5 or from a device that stores sets of odor data, or to output a label obtained as a prediction result to the outside.
The processor 13 is a computer such as a CPU (Central Processing Unit), or a CPU combined with a GPU (Graphics Processing Unit), and controls the entire prediction apparatus 10 by executing a program prepared in advance. The memory 14 is formed by a ROM (Read Only Memory), a RAM (Random Access Memory), or the like. The memory 14 stores various programs to be executed by the processor 13. The memory 14 is also used as a work memory while the processor 13 executes various processes.
The recording medium 15 is a non-volatile and non-transitory recording medium such as a disk-shaped recording medium, a semiconductor memory, or the like, and is formed to be detachable from the prediction apparatus 10. The recording medium 15 records various programs executed by the processor 13. When the prediction apparatus 10 executes each of processes described later, such as a label prediction process and the like, a program recorded on the recording medium 15 is loaded into the memory 14 and executed by the processor 13.
The database 16 stores data necessary for processes performed by the prediction apparatus 10. Specifically, the database 16 stores learning data for use in a case of training a model for predicting a label. In addition to the above, the prediction apparatus 10 may include an input device such as a keyboard or a mouse, or a display device.
(Functional Configuration for Learning)
In the learning of the prediction apparatus 10, the predictive model 21 is trained using the learning data stored in the training DB 23. The learning data are data in which sets of the odor data for various objects are correlated with respective labels representing the objects in the label space. As will be described later, the labels are defined for each label space. During the learning of the prediction apparatus 10, odor data of an object are input to the predictive model 21, and the predictive model 21 predicts a label for the odor data, generates a prediction result (referred to as a “predicted label”), and outputs the prediction result to the parameter update unit 22. The parameter update unit 22 acquires a correct answer label for the odor data of the object from the training DB 23. After that, the parameter update unit 22 calculates an error between the predicted label and the correct answer label, and updates the parameters of the predictive model 21 so that the error is minimized. Thus, the predictive model 21 is trained using the learning data.
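As an illustration of the error calculation and parameter update described above, the following sketch performs a single update step. It assumes a linear model and a squared error with a gradient-descent update; the example embodiments do not specify the model type, the error function, or the update rule, and the names `predict`, `squared_error`, and `update_parameters` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def predict(weights: Matrix, odor: Vector) -> Vector:
    """Linear stand-in for the predictive model 21: label = weights x odor."""
    return [sum(w * x for w, x in zip(row, odor)) for row in weights]


def squared_error(predicted: Vector, correct: Vector) -> float:
    """Error between the predicted label and the correct answer label."""
    return sum((p - c) ** 2 for p, c in zip(predicted, correct))


def update_parameters(weights: Matrix, odor: Vector, correct: Vector,
                      lr: float = 0.1) -> Matrix:
    """One gradient-descent step (assumed role of the parameter update unit 22)."""
    predicted = predict(weights, odor)
    # Gradient of sum_i (p_i - c_i)^2 with respect to W_ij is 2 * (p_i - c_i) * x_j.
    return [
        [w - lr * 2.0 * (p - c) * x for w, x in zip(row, odor)]
        for row, p, c in zip(weights, predicted, correct)
    ]


# Hypothetical odor data and correct answer label.
odor = [1.0, 0.5]
correct = [0.5, -0.5]
weights = [[0.0, 0.0], [0.0, 0.0]]

before = squared_error(predict(weights, odor), correct)
weights = update_parameters(weights, odor, correct)
after = squared_error(predict(weights, odor), correct)
```

After the step, `after` is smaller than `before`, which is the sense in which the parameters are updated so as to reduce the error.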
(Label Space)
Next, the label space will be described. The label space is a vector space indicating odor features, and is the space in which each label obtained as a prediction result is defined. By expressing odors using labels that indicate vector quantities in the label space, the relationships among a plurality of odors are quantitatively expressed. For instance, labels located close to each other in the label space indicate similar odors in that label space, and labels pointing in opposite directions in the label space are considered to indicate odors of contrasting properties in that label space. However, even for the same odor, the vector expressing the odor differs from one label space to another. In the example embodiments, as described below, several label spaces are used to express odors.
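Closeness and opposition in a label space can be quantified, for example, with the Euclidean distance and the cosine similarity. The sketch below assumes these two measures and a two-dimensional label space with made-up label values; the example embodiments do not prescribe a particular distance measure.

```python
import math
from typing import List

Vector = List[float]


def euclidean_distance(a: Vector, b: Vector) -> float:
    """Small values mean the two labels lie close together."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Vector, b: Vector) -> float:
    """+1 for the same direction, -1 for opposite directions."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical labels in an assumed 2-dimensional label space.
wine_a = [3.0, 4.0]
wine_b = [3.0, 3.0]
contrasting = [-3.0, -4.0]

near = euclidean_distance(wine_a, wine_b)
far = euclidean_distance(wine_a, contrasting)
direction = cosine_similarity(wine_a, contrasting)
```

Here `near` is smaller than `far`, and `direction` is negative because `contrasting` points the opposite way from `wine_a`.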
(1) Space Expressing a Structure and Chemical Properties of a Substance
In a first example, a space that expresses the structure and chemical properties of a substance is used as the label space. Since the odor of a substance is thought to be determined by the structure and the chemical properties of the substance, it is considered effective to use such a space as the label space. Specifically, a vector space defined by indices that quantitatively express the structure and chemical properties of a molecule is used as the label space.
(2) Space Representing a Sensory Evaluation Index
In a second example, a space representing an index obtained in sensory tests by humans is used as the label space.
(3) word2vec Space
In a third example, a word2vec space is used as the label space. The “word2vec” is a method of expressing the meaning of a word with a vector (distributed representation), and in this example, the word2vec space is used as the label space. As illustrated in
However, since the nature of the word2vec space depends on the text (corpus) used to train it, the word2vec needs to be trained using text related to odors when the word2vec space is used as the label space. Thus, a word2vec space trained on odor-related text, for instance, research documents such as papers on the sense of smell, review comments on cosmetics, review articles in food catalogues and gourmet guides, and the like, is used as the label space.
(4) Space Representing a Reaction When Smelling
As a fourth example, a label space may be formed using biological reactions that occur in the human body when humans smell odors. For instance, brain waves, functional magnetic resonance imaging (fMRI) data, a heart rate interval (RRI: RR Interval), or the like measured when humans smell odors are used.
(5) Combination of First to Fourth Examples
As a fifth example, a combination of two or more of the first to fourth examples described above may be used. Specifically, a new label space may be created by simply combining two or more of the label spaces of the first to fourth examples. Alternatively, the sensory evaluation index space of the second example and the word2vec space of the third example may be used in two stages. In sensory evaluation, an odor is often expressed using nouns, adjectives, and onomatopoeia. For instance, hexane is described as having an “odor like kerosene” and hexanal as having an “odor of old rice”; that is, the odor is expressed by associating it with appropriate words. Accordingly, by first associating words with an odor and then using word2vec, which represents distances between words, it is possible to use a label space that is close to the sensation humans have when perceiving odors.
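The two-stage use of the sensory evaluation index and the word2vec space can be pictured with the following sketch. The descriptor words for hexane and hexanal come from the text above; the lookup tables, the three-dimensional word vectors, and the function `label_for` are hypothetical stand-ins, since an actual word2vec space would be trained on odor-related text and the first stage would be a learned association rather than a fixed table.

```python
from typing import Dict, List

# Stage 1: associate each odor with a descriptive word (sensory evaluation).
SENSORY_DESCRIPTOR: Dict[str, str] = {
    "hexane": "kerosene",
    "hexanal": "old rice",
}

# Stage 2: map each word to a vector. These toy values are assumptions;
# they are not taken from any trained word2vec model.
WORD_VECTOR: Dict[str, List[float]] = {
    "kerosene": [1.0, 0.0, 0.5],
    "old rice": [0.0, 1.0, 0.25],
}


def label_for(substance: str) -> List[float]:
    """Return the label of a substance via its descriptive word."""
    word = SENSORY_DESCRIPTOR[substance]
    return WORD_VECTOR[word]


hexane_label = label_for("hexane")
hexanal_label = label_for("hexanal")
```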
(Learning Process)
Next, a learning process performed by the prediction apparatus 10 will be described.
First, odor data are input to the prediction apparatus 10 (step S11). In this case, an output of the odor sensor 5 may be directly input to the prediction apparatus 10, or odor data stored in a storage apparatus or the like may be input to the prediction apparatus 10. The prediction apparatus 10 predicts a label of the odor data using the predictive model 21, and outputs the predicted label (step S12). Next, the parameter update unit 22 compares the predicted label obtained from the predictive model 21 with the correct answer label of the odor data obtained from the training DB 23, and updates the parameters of the predictive model 21 based on the error (step S13).
Next, the prediction apparatus 10 determines whether or not a predetermined end condition is satisfied (step S14). When the end condition is not satisfied (step S14: No), the process returns to step S11, and steps S11 to S13 are repeated. On the other hand, when the end condition is satisfied (step S14: Yes), the process is terminated. Incidentally, the end condition is a predetermined condition relating to the number of repetitions of steps S11 to S13, the degree of variation in the error between the predicted label and the correct answer label, or the like.
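Steps S11 to S14 can be sketched as the loop below. For brevity the sketch assumes scalar odor data, a scalar label, a one-parameter linear model, and a squared error; the function `train`, its parameters `lr`, `max_iterations`, and `tolerance`, and the sample values are all hypothetical. Both kinds of end condition mentioned above are shown: a repetition count and a small variation in the error.

```python
from typing import List, Tuple


def train(samples: List[Tuple[float, float]], lr: float = 0.1,
          max_iterations: int = 1000,
          tolerance: float = 1e-9) -> Tuple[float, int]:
    """Return the learned weight and the number of passes performed."""
    weight = 0.0
    previous_error = float("inf")
    for iteration in range(1, max_iterations + 1):
        total_error = 0.0
        for odor, correct in samples:             # S11: input odor data
            predicted = weight * odor             # S12: predict a label
            error = predicted - correct
            weight -= lr * 2.0 * error * odor     # S13: update the parameter
            total_error += error ** 2
        # S14: end when the error has almost stopped changing ...
        if abs(previous_error - total_error) < tolerance:
            return weight, iteration
        previous_error = total_error
    # ... or when the repetition count is reached.
    return weight, max_iterations


# Hypothetical learning data: (odor data, correct answer label) pairs.
samples = [(1.0, 2.0), (2.0, 4.0), (0.5, 1.0)]
weight, iterations = train(samples)
```

With these samples the loop stops on the error-variation condition well before `max_iterations`, and `weight` ends up close to 2.0, the ratio between label and odor data in the samples.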
(Functional Configuration for Prediction)
Next, a configuration for performing prediction using a predictive model trained by the above-described learning process will be described.
(Predictive Process)
Next, a prediction process performed by the prediction apparatus 30 will be described.
Next, an application example of the process by the prediction apparatus 30 will be described.
(1) Distance Calculation Process
(2) Product Proposal Process
(3) Odor Data Proposal Process
An odor data proposal process is a process that proposes what form of odor data would produce an arbitrary label as output.
(Modification)
In the above example, a label is represented using the label space; however, a vector representation for odor data may be generated, instead of using the label space.
Next, a second example embodiment will be described.
(Data Processing Apparatus)
(Learning Apparatus)
A part or all of the example embodiments described above may also be described as the following supplementary notes, but are not limited thereto.
(Supplementary note 1)
A data processing apparatus comprising:
an acquisition unit configured to acquire odor data; and
a prediction unit configured to predict a label of the acquired odor data in a label space by using a model in which a relationship between sets of odor data and labels in the label space expressing features of odors is learned.
(Supplementary note 2)
The data processing apparatus according to supplementary note 1, wherein
the label space is a space expressing semantic relations of words as a distributed representation, and
the model is trained using sentences related to the odors.
(Supplementary note 3)
The data processing apparatus according to supplementary note 1, wherein
the label space is a space expressing a sensory evaluation index of the odors, and
the model is trained using sensory test results of the odors.
(Supplementary note 4)
The data processing apparatus according to supplementary note 1, wherein
the label space is a space expressing chemical properties of the odors, and
the model is trained using chemical properties of substances.
(Supplementary note 5)
The data processing apparatus according to supplementary note 1, wherein
the label space is a space expressing features of biological reactions, and
the model is trained using the features of the biological reactions when humans smell odors.
(Supplementary note 6)
The data processing apparatus according to any one of supplementary notes 1 through 5, wherein
the acquisition unit acquires two or more sets of odor data,
the prediction unit predicts labels of the two or more sets of odor data, and
the data processing apparatus further comprises a calculation unit configured to calculate each distance among the predicted labels for the two or more sets of odor data.
(Supplementary note 7)
The data processing apparatus according to any one of supplementary notes 1 through 5, further comprising:
a storage unit configured to store labels in the label space for a plurality of articles; and
an article presentation unit configured to determine an article whose distance is equal to or less than a predetermined value from among distances between the label predicted by the prediction unit and respective labels of the plurality of articles stored in the storage unit.
(Supplementary note 8)
The data processing apparatus according to any one of supplementary notes 1 through 5, further comprising a distance determination unit configured to determine whether or not a distance between the label predicted by the prediction unit and an arbitrary label is equal to or less than a predetermined value, and output a determination result.
(Supplementary note 9)
The data processing apparatus according to any one of supplementary notes 1 through 8, wherein the odor data are data indicating a feature amount of an odor waveform output from an odor sensor.
(Supplementary note 10)
A learning apparatus comprising:
a learning data acquisition unit configured to acquire learning data in which odor data of each object and a label representing the object in a label space expressing features of odors are associated with each other; and
a learning unit configured to train a model for predicting a label of odor data in the label space from the odor data, by using the learning data.
(Supplementary note 11)
An information processing method, comprising:
acquiring odor data; and
predicting a label of the acquired odor data in a label space by using a model in which a relationship between sets of odor data and labels in the label space expressing features of odors is learned.
(Supplementary note 12)
A recording medium storing a program, the program causing a computer to perform a process comprising:
acquiring odor data; and
predicting a label of the acquired odor data in a label space by using a model in which a relationship between sets of odor data and labels in the label space expressing features of odors is learned.
While the invention has been described with reference to the example embodiments and examples, the invention is not limited to the above example embodiments and examples. It will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the claims.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2019/036576 | 9/18/2019 | WO | |