The technology of the present disclosure relates to an information processing apparatus, an information processing method, a program, and a drug evaluation method.
In order to improve the efficiency of new drug development, a method of performing a toxicity evaluation using cells such as myocardial cells produced from induced pluripotent stem (iPS) cells has been developed (see, for example, WO2019/131806A). The toxicity evaluation is performed by evaluating the responsiveness of cells to a drug.
As a culture vessel for the myocardial cells, for example, a well plate in which a plurality of wells are formed is used. A microelectrode array (MEA) in which a plurality of microelectrodes are disposed is provided on a bottom surface of each well. Such a well plate is called an MEA plate. A waveform (for example, a myocardial waveform indicating pulsation of the myocardial cells) indicating an electrophysiological change of cells cultured in the wells is output from each microelectrode of the microelectrode array. The toxicity evaluation is performed by measuring a change in waveform with respect to a drug.
Variations occur in the waveform output from each of the plurality of microelectrodes provided in the wells. Therefore, for each well, one microelectrode from which the most ideal waveform is output is selected as a target electrode to be a target for the toxicity evaluation. Such selection of the target electrode is also called selection of the golden channel (see, for example, Reference 1). For example, in a case of the myocardial cells, a waveform closest to the healthiest state is selected among the myocardial waveforms output from the microelectrodes based on a known myocardial waveform representing a healthy state (for example, a state without a disease such as arrhythmia) obtained in the past measurement.
Since there is no clear definition of the ideal waveform that serves as a reference for selecting the target electrode, and only its rough shape is determined, it is difficult to mechanically select the target electrode based on the waveform output from each of the microelectrodes. Therefore, in the current state of the art, the target electrode is selected by a sensory evaluation in which a human compares the waveform output from each of the microelectrodes with a known ideal waveform.
However, the selection of the target electrode based on such a sensory evaluation has problems in that the evaluation takes a long time and the result varies depending on the experience of the evaluator. Reference 1 described above discloses that the target electrode is selected based on a peak amplitude of the waveform output from each of the microelectrodes, but it is difficult to select a waveform close to an ideal waveform with high accuracy by such a simple mechanical method.
Therefore, it is desired to develop a method that enables selection of a waveform close to an ideal waveform in a short time while maintaining the selection accuracy achieved so far by the sensory evaluation. Such a problem regarding selection of a waveform close to an ideal waveform is not limited to the field of toxicity evaluation, and exists in various fields.
An object of the technology of the present disclosure is to provide an information processing apparatus, an information processing method, and a program that enable selection of a waveform close to an ideal waveform with high accuracy in a short time.
In order to achieve the above object, an information processing apparatus according to the present disclosure comprises: a processor, in which the processor acquires a plurality of unknown waveform data of which a determination result of superiority or inferiority based on similarity to an ideal waveform is unknown, performs a determination of the superiority or inferiority for each of the plurality of unknown waveform data based on a plurality of teacher waveform data to which the determination result of the superiority or inferiority is linked, and outputs the superiority or inferiority of the plurality of unknown waveform data in a comparable manner.
It is preferable that the processor performs clustering on a set including the plurality of teacher waveform data and the plurality of unknown waveform data, and performs the determination of the superiority or inferiority by obtaining, for each of clusters including at least one of the plurality of unknown waveform data, a probability that the unknown waveform data has a superior determination, as a result of the clustering.
It is preferable that the processor obtains the probability based on the number of the teacher waveform data with a superior determination and the number of the teacher waveform data with a superior determination and an inferior determination, for each of the clusters.
It is preferable that the probability is represented by a value obtained by dividing the number of the teacher waveform data with the superior determination by the number of the teacher waveform data with the superior determination and the inferior determination, and that the processor ranks and outputs the superiority or inferiority of the plurality of unknown waveform data based on the probability in a comparable manner.
It is preferable that the processor performs the clustering by a k-medoids method or a k-means method, but other clustering algorithms may be used.
It is preferable that the processor performs a filtering process of excluding unknown waveform data that does not satisfy an evaluation criterion from the set or lowering a rank in terms of the superiority or inferiority.
It is preferable that the processor inputs the unknown waveform data to a neural network that has been trained through machine learning based on the teacher waveform data, and performs the determination of the superiority or inferiority based on a result output from the neural network.
It is preferable that the processor inputs the unknown waveform data to an encoder of an auto-encoder that has been trained through machine learning based on the teacher waveform data with a superior determination, and then performs the determination of the superiority or inferiority based on a difference between waveform data restored by a decoder and the unknown waveform data input to the encoder.
It is preferable that the unknown waveform data is a pulse signal output by a cell.
Examples of the cell include a nerve cell, a myocardial cell, a skeletal muscle cell, and a smooth muscle cell, and the cell is preferably a myocardial cell.
A drug evaluation method according to the technology of the present disclosure is preferably a drug evaluation method of evaluating a drug based on information output from the information processing apparatus described above, in which the unknown waveform data with a high rank of superiority or inferiority is used for drug evaluation.
An information processing method according to the technology of the present disclosure comprises: acquiring a plurality of unknown waveform data of which a determination result of superiority or inferiority based on similarity to an ideal waveform is unknown; performing a determination of the superiority or inferiority for each of the plurality of unknown waveform data through machine learning based on a plurality of teacher waveform data to which the determination result of the superiority or inferiority is linked; and outputting the superiority or inferiority of the plurality of unknown waveform data in a comparable manner.
A program according to the technology of the present disclosure causes a computer to execute: an acquisition process of acquiring a plurality of unknown waveform data of which a determination result of superiority or inferiority based on similarity to an ideal waveform is unknown; a determination process of performing a determination of the superiority or inferiority for each of the plurality of unknown waveform data through machine learning based on a plurality of teacher waveform data to which the determination result of the superiority or inferiority is linked; and an output process of outputting the superiority or inferiority of the plurality of unknown waveform data in a comparable manner.
According to the technology of the present disclosure, it is possible to provide an information processing apparatus, an information processing method, and a program that enable selection of a waveform close to an ideal waveform with high accuracy in a short time.
Hereinafter, embodiments according to the technology of the present disclosure will be described with reference to the drawings.
An MEA plate 30 is used for culturing the cells. The cell culture apparatus 10 is provided with a culture chamber 11 that accommodates the MEA plate 30 therein. In addition, the cell culture apparatus 10 is provided with a slide-type lid 12 for opening and closing the culture chamber 11. The MEA plate 30 is attached to the culture chamber 11 in a state where the cells have been seeded. The culture chamber 11 functions as an incubator and enables the culture of the cells for a long period of time.
In the present embodiment, as the cells, myocardial cells produced from iPS cells are cultured by the cell culture apparatus 10. In addition, the cell culture apparatus 10 measures an extracellular potential representing a myocardial waveform of the myocardial cells seeded on the MEA plate 30 by a multipoint measurement method, and outputs waveform data obtained by the measurement to the information processing apparatus 20. The waveform data represents a pulse signal output by the myocardial cells.
The information processing apparatus 20 is configured of a general computer such as a personal computer. Software for analyzing the waveform data input from the cell culture apparatus 10 is installed in the information processing apparatus 20. The information processing apparatus 20 includes a display unit 21 and an input unit 22. The display unit 21 is a display device such as a liquid crystal display or an organic electro luminescence (EL) display. The input unit 22 is an input device such as a keyboard, a touch pad, or a mouse. The information processing apparatus 20 is connected to the cell culture apparatus 10 by wire or wirelessly. The display unit 21 and the input unit 22 may be configured as external devices connected to the information processing apparatus 20.
The information processing apparatus 20 calculates an extracellular potential duration (field potential duration (FPD)), an interspike interval (ISI), and the like based on the input waveform data. Since the FPD corresponds to a QT interval (time from a start of a Q wave to an end of a T wave) in an electrocardiogram, it is used as an index of arrhythmia. Prolongation of the QT interval indicates a potential for arrhythmia. A user can perform a toxicity evaluation for evaluating responsiveness of the cells to a drug based on the FPD or the like.
The potential measurement circuit 50 transmits the measured myocardial waveform to the information processing apparatus 20 as waveform data via the communication I/F 51. In a case in which the number of the wells 32 formed in the MEA plate 30 is 48 and the number of the electrodes 41 of the microelectrode array 40 provided in one well 32 is 16, 768 (= 48 × 16) pieces of the waveform data are transmitted from the potential measurement circuit 50 to the information processing apparatus 20.
The information processing apparatus 20 includes a processor 23, a memory 24, an input unit 22, a display unit 21, a communication I/F 25, a bus 26, and the like. The processor 23 is a computer that realizes various functions by reading out a program 28 and various types of data stored in the memory 24 and executing processing. The processor 23 is, for example, a central processing unit (CPU).
The memory 24 is a storage device that stores the program 28 and the various types of data used when the processor 23 executes processing. The memory 24 includes, for example, a random access memory (RAM), a read only memory (ROM), a storage, or the like. The RAM is, for example, a volatile memory used as a work area or the like of the processor 23. The ROM is, for example, a non-volatile memory, such as a flash memory, that holds the program 28 and the various types of data. The storage is, for example, a large-capacity storage device such as a hard disk drive (HDD) or a solid state drive (SSD), and stores an operating system (OS), various types of data, and the like. The memory 24 may be configured as an external device connected to the information processing apparatus 20.
In addition, the memory 24 stores teacher waveform data TD. The teacher waveform data TD is waveform data to which determination results of a determination of superiority or inferiority indicating whether or not a waveform is close to a known ideal waveform are linked. The determination of the superiority or inferiority is performed by a sensory evaluation such as comparison with a known ideal waveform by a human. The ideal waveform is a waveform representing a healthy state (for example, a state without a disease such as arrhythmia) obtained by the past measurement.
Variations occur in the output waveforms of the plurality of electrodes 41 included in the microelectrode array 40 provided in the well 32. For the toxicity evaluation, it is necessary to use a waveform close to the ideal waveform. As will be described below in detail, the information processing apparatus 20 performs a process of selecting the electrode 41 from which a waveform with the highest similarity to the ideal waveform is output from the plurality of electrodes 41 included in the microelectrode array 40, as a target electrode of the toxicity evaluation. Such selection of the target electrode is also referred to as selection of a golden channel (hereinafter, referred to as GC). The information processing apparatus 20 performs the GC selection process using a plurality of waveform data transmitted from the cell culture apparatus 10 and a plurality of teacher waveform data TD stored in advance in the memory 24.
The processor 23 functions as a data acquisition unit 60, a determination unit 61, and an output unit 62. The data acquisition unit 60 performs an acquisition process of acquiring the waveform data transmitted from the cell culture apparatus 10. Since similarity of the waveform data transmitted from the cell culture apparatus 10 to the ideal waveform is unknown, the waveform data acquired by the data acquisition unit 60 from the cell culture apparatus 10 will be referred to as unknown waveform data UD below. In the unknown waveform data UD, the determination result of the superiority or inferiority based on the similarity to the ideal waveform is unknown.
The determination unit 61 performs a determination process of determining the superiority or inferiority with respect to the similarity to the ideal waveform for each of the plurality of unknown waveform data UD acquired by the data acquisition unit 60 based on the plurality of teacher waveform data TD stored in advance in the memory 24. As will be described below in detail, in the present embodiment, the determination unit 61 determines the superiority or inferiority by using a method of clustering that is a kind of machine learning algorithm.
The output unit 62 performs an output process of outputting the superiority or inferiority of the plurality of unknown waveform data UD determined by the determination unit 61 in a comparable manner. For example, the output unit 62 causes the display unit 21 to display the superiority or inferiority of the plurality of unknown waveform data UD in a comparable manner.
The superiority or inferiority of the teacher waveform data TD is determined by a human evaluation using values of FPD, ISI, maximum potential of P1 (hereinafter, referred to as P1max), minimum potential of P1 (hereinafter, referred to as P1min), and A2, and the overall shape of the waveform as evaluation criteria. For example, a waveform that satisfies Equations (1) to (4) and satisfies an evaluation criterion that the overall shape is close to the ideal waveform is determined as being “superior” (that is, GC), and a waveform that does not satisfy the evaluation criterion is determined as being “inferior”.
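$$\mathrm{P1max} \ge 200\ \mu\mathrm{V} \tag{1}$$
$$\mathrm{P1min} \le -200\ \mu\mathrm{V} \tag{2}$$
$$\mathrm{A2} \ge 215\ \mu\mathrm{V} \tag{3}$$
$$\mathrm{FPDcF} \ge 340\ \mathrm{msec} \tag{4}$$
Here, FPDcF is a value obtained by dividing FPD by the cube root of ISI, that is, $\mathrm{FPDcF} = \mathrm{FPD} / \mathrm{ISI}^{1/3}$.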
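The evaluation criteria of Equations (1) to (4) can be sketched in code as follows. This is only an illustrative sketch, not the implementation of the embodiment: the function names are hypothetical, the potentials are assumed to be given in microvolts, FPD in milliseconds, and ISI in seconds (the text does not state the unit of ISI), and the divisor is read as the cube root of ISI.

```python
def fpdcf(fpd_ms: float, isi_s: float) -> float:
    """Corrected FPD: FPD divided by the cube root of ISI.

    Assumption: fpd_ms is in milliseconds and isi_s is in seconds.
    """
    if isi_s <= 0:
        raise ValueError("ISI must be positive")
    return fpd_ms / isi_s ** (1.0 / 3.0)


def satisfies_criteria(p1max_uv, p1min_uv, a2_uv, fpd_ms, isi_s) -> bool:
    """Return True when Equations (1) to (4) all hold.

    Assumption: the three potentials are given in microvolts.
    """
    return (
        p1max_uv >= 200.0                     # Equation (1)
        and p1min_uv <= -200.0                # Equation (2)
        and a2_uv >= 215.0                    # Equation (3)
        and fpdcf(fpd_ms, isi_s) >= 340.0     # Equation (4)
    )
```

Note that these numerical criteria cover only part of the human evaluation; the judgment of whether the overall shape is close to the ideal waveform is not captured by this check.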
The above determination result is added to the teacher waveform data TD (see
The teacher waveform data TD is linked with the determination result of the superiority or inferiority based on a human evaluation. “1” indicates that the waveform is determined as being similar to the ideal waveform (that is, GC). “0” indicates that the waveform is determined as being dissimilar to the ideal waveform (that is, not GC). That is, “1” indicates that the determination result of the superiority or inferiority based on the similarity to the ideal waveform is “superior determination”, and “0” indicates that the determination result of the superiority or inferiority based on the similarity to the ideal waveform is “inferior determination”.
Next, the determination unit 61 combines the plurality of unknown waveform data UD and the plurality of teacher waveform data TD included in the created set, and performs clustering by a k-medoids method (step S11). As an example, as shown in
A known k-medoids method is used for the clustering. In the k-medoids method, first, k points are randomly selected as medoids in the n-dimensional space. Next, each point is assigned to the cluster of its closest medoid. Then, in each cluster, a new medoid is set such that the total of the distances to all the other points in the cluster is minimized. After that, the assignment and the medoid update are repeated until the medoids no longer change. In the present embodiment, since k=3, three clusters CL1 to CL3 are generated. It is also possible to use a k-means method instead of the k-medoids method. In the k-means method, the centroid of the points in each cluster is calculated instead of a medoid, and the processing is repeated until the centroids no longer change.
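The k-medoids procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the embodiment: the function name `k_medoids`, the use of the Euclidean distance, the fixed random seed, and the iteration cap are all choices made here for the example.

```python
import math
import random


def k_medoids(points, k, seed=0, max_iter=100):
    """Cluster points (sequences of floats) around k medoids.

    Returns (medoid_indices, labels), where labels[i] is the position in
    medoid_indices of the medoid closest to points[i].
    Assumption: Euclidean distance is used as the dissimilarity.
    """
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)  # random initial medoids

    def assign(current):
        # Each point joins the cluster of its closest medoid.
        return [
            min(range(k), key=lambda c: math.dist(p, points[current[c]]))
            for p in points
        ]

    labels = assign(medoids)
    for _ in range(max_iter):
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # empty cluster: keep the previous medoid
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance
            # to all other members of the same cluster.
            best = min(
                members,
                key=lambda m: sum(math.dist(points[m], points[j]) for j in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:  # no medoid changed: stop
            break
        medoids = new_medoids
        labels = assign(medoids)
    return medoids, labels
```

In the context of the embodiment, each point would be one waveform (teacher or unknown) expressed as a vector of n sampled values, and the returned labels indicate which cluster each waveform falls into.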
Next, the determination unit 61 specifies a cluster including the unknown waveform data UD from the plurality of clusters generated by the clustering (step S12). In the example shown in
Next, the determination unit 61 determines the superiority or inferiority for each of the unknown waveform data UD included in the clusters specified in step S12 (step S13). Specifically, the determination unit 61 determines the superiority or inferiority by obtaining a probability (hereinafter, referred to as a score SC) that the unknown waveform data UD is superior determination “1” for each of the specified clusters. More specifically, the determination unit 61 obtains the score SC based on the number N1 of the teacher waveform data with the superior determination “1” and the number NT of the teacher waveform data with the superior determination “1” and the inferior determination “0”, for each of the specified clusters. NT corresponds to the number of the teacher waveform data TD included in the cluster.
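The score SC described above is the ratio N1/NT computed per cluster. The following sketch shows one way to compute it; the function name `cluster_scores`, the data layout, and the handling of clusters that contain no teacher waveform data are assumptions made for illustration only.

```python
from collections import Counter


def cluster_scores(labels, teacher_results):
    """Compute SC = N1 / NT for every cluster that contains teacher data.

    labels[i] is the cluster of item i; teacher_results maps the index of a
    teacher waveform to 1 (superior) or 0 (inferior). Items absent from
    teacher_results are treated as unknown waveform data.
    """
    n1 = Counter()  # N1: teacher waveforms with the superior determination
    nt = Counter()  # NT: all teacher waveforms in the cluster
    for idx, result in teacher_results.items():
        nt[labels[idx]] += 1
        n1[labels[idx]] += result
    # Assumption: clusters without any teacher data receive no score.
    return {c: n1[c] / nt[c] for c in nt}
```

For example, a cluster holding two teacher waveforms that are both "1" yields SC = 1.0, and every unknown waveform in that cluster inherits this score.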
As shown in
Next, the determination unit 61 ranks the unknown waveform data UD based on the determination result of the superiority or inferiority (step S14).
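The ranking in step S14 can be illustrated as below. This is a sketch under assumptions: the function name `rank_unknown` and the data layout are hypothetical, and unknown waveform data whose clusters have the same score are given the same rank, which is consistent with the later check for a plurality of first-ranked data.

```python
def rank_unknown(unknown_labels, scores):
    """Rank unknown waveforms by the score SC of their cluster, highest first.

    unknown_labels: {waveform id: cluster label}; scores: {cluster label: SC}.
    Returns a list of (rank, waveform id, SC); equal scores share a rank.
    """
    ordered = sorted(
        unknown_labels, key=lambda w: scores[unknown_labels[w]], reverse=True
    )
    ranking, rank, previous = [], 0, None
    for w in ordered:
        sc = scores[unknown_labels[w]]
        if sc != previous:  # a new, lower score starts the next rank
            rank += 1
            previous = sc
        ranking.append((rank, w, sc))
    return ranking
```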
Next, the determination unit 61 determines whether or not a plurality of the first-ranked unknown waveform data UD exist (step S15). In a case in which the determination unit 61 determines that a plurality of the first-ranked unknown waveform data UD exist (step S15: YES), the determination unit 61 re-clusters the cluster including the first-ranked unknown waveform data UD (step S16). A method of the clustering is the same as in step S11. In the example shown in
After step S16, the determination unit 61 returns the processing to step S12. After that, the determination unit 61 executes each of the processes of steps S12 to S15 on a plurality of subclusters obtained by performing the clustering on the cluster CL3. In a case in which the determination unit 61 determines that a plurality of the first-ranked unknown waveform data UD do not exist (that is, there is only one first-ranked unknown waveform data UD) (step S15: NO), the determination process is ended.
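The loop over steps S12 to S16 can be sketched as follows. This is an illustrative outline rather than the embodiment itself: `select_first_ranked` and `cluster_fn` are hypothetical names, the clustering step is supplied by the caller, and the stopping rules for the case in which re-clustering makes no progress (`max_rounds`, and stopping when the set does not shrink) are assumptions added so that the loop always terminates.

```python
def select_first_ranked(items, teacher_results, cluster_fn, max_rounds=10):
    """Narrow a set down to the first-ranked unknown item(s).

    items: list of item ids; teacher_results: {item id: 1 or 0} for teacher
    data (every other id is treated as unknown waveform data);
    cluster_fn(ids) -> {id: cluster label} is a caller-supplied clustering step.
    """
    current = list(items)
    best = []
    for _ in range(max_rounds):
        labels = cluster_fn(current)  # clustering (step S11 or S16)
        scores = {}
        for c in set(labels.values()):
            teachers = [
                i for i in current if labels[i] == c and i in teacher_results
            ]
            if teachers:  # score SC = N1 / NT for this cluster (step S13)
                scores[c] = sum(teacher_results[i] for i in teachers) / len(teachers)
        unknown = [
            i for i in current if i not in teacher_results and labels[i] in scores
        ]
        if not unknown:
            break
        top = max(scores[labels[i]] for i in unknown)  # ranking (step S14)
        best = [i for i in unknown if scores[labels[i]] == top]
        if len(best) == 1:  # step S15: NO
            break
        # step S15: YES, so keep only the cluster(s) holding first-ranked data
        top_clusters = {labels[i] for i in best}
        narrowed = [i for i in current if labels[i] in top_clusters]
        if len(narrowed) == len(current):  # assumption: stop if nothing shrinks
            break
        current = narrowed
    return best
```

The function returns a single id when the loop ends through "step S15: NO", and returns the remaining tied ids if one of the added stopping rules is reached first.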
The output unit 62 makes it possible to compare the superiority or inferiority of the plurality of unknown waveform data UD by, for example, causing the display unit 21 to display the table shown in
As described above, according to the technology of the present disclosure, the determination of the superiority or inferiority is performed for each of the plurality of unknown waveform data based on the plurality of teacher waveform data to which the determination result of the superiority or inferiority is linked, and the superiority or inferiority of the plurality of unknown waveform data is output in a comparable manner. Accordingly, it is possible to select a waveform close to the ideal waveform with high accuracy in a short time.
In addition, it is possible to evaluate a drug based on the determination information of the superiority or inferiority output from the information processing apparatus 20. Among the plurality of unknown waveform data, the unknown waveform data with a high rank of the superiority or inferiority determined by the information processing apparatus 20 may be used for the drug evaluation.
Hereinafter, examples of the determination process of the superiority or inferiority will be described.
In this example, a MAESTRO768PRO manufactured by Axion BioSystems was used as the cell culture apparatus 10, and a multi-well plate provided with 24 wells 32 was used as the MEA plate 30. Then, the myocardial waveform was measured while culturing the myocardial cells in each well 32 of the cell culture apparatus 10. Waveform data was generated by extracting data of 250 points in a negative direction and 7000 points in a positive direction from a point with the highest voltage as a zero point on a time axis of waveform data measured between 150 seconds and 135 seconds after a start of the measurement. This waveform data represents a myocardial waveform including one interspike interval.
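The extraction of a fixed window around the highest-voltage point can be sketched as follows. The function name `extract_window` is hypothetical, and whether the zero point itself is counted among the extracted points is not stated in the text; this sketch assumes it is kept, so the result has 250 + 1 + 7000 samples.

```python
def extract_window(samples, before=250, after=7000):
    """Cut one waveform around the highest-voltage sample.

    Returns samples[peak - before : peak + after + 1], so the peak sits at
    offset `before`. Raises ValueError if the recording is too short on
    either side of the peak.
    """
    peak = max(range(len(samples)), key=samples.__getitem__)
    start, stop = peak - before, peak + after + 1
    if start < 0 or stop > len(samples):
        raise ValueError("not enough samples around the peak")
    return samples[start:stop]
```

Aligning every waveform on its peak in this way gives all waveforms the same length and the same time origin, which is what allows them to be compared point by point as vectors during clustering.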
Based on the waveform data acquired by the cell culture apparatus 10, teacher waveform data TD for three plates were prepared, to which determination results of the superiority or inferiority had been linked through past evaluation by a human. That is, the number of the prepared teacher waveform data TD is 1152 (3 plates × 24 wells × 16 electrodes). In addition, based on the waveform data acquired by the cell culture apparatus 10, unknown waveform data UD for one well, for which the superiority or inferiority was to be newly determined, were prepared. That is, the number of the prepared unknown waveform data UD is 16.
Next, the plurality of unknown waveform data UD and the plurality of teacher waveform data TD included in the set were combined, and clustering (first clustering) was performed by the k-medoids method using MATLAB (registered trademark), which is numerical analysis software manufactured by MathWorks (registered trademark). Here, k=10.
Next, a score SC was calculated for each of the clusters including the unknown waveform data UD, and, based on the calculated score SC, the unknown waveform data UD were ranked in descending order of the score SC.
As described above, it was confirmed that, by using the method of the clustering, it is possible to select a waveform close to the ideal waveform from the plurality of unknown waveform data with high accuracy in a short time.
Next, various modification examples of the above embodiment will be described.
In the above-described embodiment, the teacher waveform data TD is linked with binary data of “1” or “0” as the determination result of the superiority or inferiority, but data of three or more values may be linked with the teacher waveform data TD. That is, the determination result of the superiority or inferiority is not limited to the one represented by the binary value, and may be represented by the three or more values.
In the above-described embodiment, the determination unit 61 determines the superiority or inferiority of the unknown waveform data UD by the clustering, but the determination unit 61 is not limited to the clustering, and may determine the superiority or inferiority of the unknown waveform data UD using a neural network that has been trained through machine learning.
In the learning phase, in a case in which the time-series data t1 to tn are input to an input layer of the neural network 70, an output value from an output layer is input to an adjustment unit 71. In addition, the determination result of the superiority or inferiority is input to the adjustment unit 71 as a label. The adjustment unit 71 compares the output value with the determination result given as the label, and adjusts the weight and the bias of the neural network 70 based on a difference between the two values.
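The adjustment of the weight and the bias based on the difference between the output value and the label can be illustrated with a deliberately small stand-in. The sketch below trains a single sigmoid unit by gradient descent; it is not the neural network 70 of the embodiment, whose architecture is not specified here, and the function names, learning rate, and number of epochs are assumptions.

```python
import math


def train_single_unit(waveforms, labels, lr=0.1, epochs=500):
    """Train one sigmoid unit on time-series inputs t1..tn.

    waveforms: list of equal-length lists of floats; labels: 1 or 0.
    Returns (weights, bias).
    """
    n = len(waveforms[0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in zip(waveforms, labels):
            out = predict(weights, bias, x)  # output value
            diff = out - y                   # difference from the label
            # Adjust the weights and the bias in proportion to the difference.
            weights = [w - lr * diff * xi for w, xi in zip(weights, x)]
            bias -= lr * diff
    return weights, bias


def predict(weights, bias, x):
    """Output of the sigmoid unit, a value between 0 and 1."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))
```

After training, an output close to 1 for an unknown waveform would be read as a superior determination and an output close to 0 as an inferior determination.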
As shown in
In this modification example as well, the teacher waveform data TD used for training the neural network 70 is linked with binary data of “1” or “0” as the determination result of the superiority or inferiority, but data of three or more values may be linked with the teacher waveform data TD. That is, the determination result of the superiority or inferiority is not limited to the one represented by the binary value, and may be represented by the three or more values. In this case, the output value from the neural network 70 is represented by three or more values.
In addition, the determination unit 61 may determine the superiority or inferiority of the unknown waveform data UD using an auto-encoder that has been trained through machine learning.
As shown in
Specifically, as shown in
As shown in
In this modification example, the comparison unit 81 outputs binary data of “1” or “0” depending on the difference value, but the present invention is not limited to this, and may be configured to output data of three or more values.
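The comparison between the restored waveform and the input waveform can be sketched as follows. This is only an illustration: the function name, the use of the mean squared difference, the threshold value, and the rule that a small difference maps to "1" are assumptions. The last of these rests on the auto-encoder having been trained only on teacher waveform data with a superior determination, so that it is expected to restore similar waveforms well.

```python
def compare_reconstruction(waveform, reconstruct, threshold):
    """Return 1 if the reconstruction difference is below threshold, else 0.

    reconstruct: callable standing in for the trained encoder and decoder.
    threshold: hypothetical cut-off on the mean squared difference.
    """
    restored = reconstruct(waveform)
    if len(restored) != len(waveform):
        raise ValueError("restored waveform must match the input length")
    mse = sum((a - b) ** 2 for a, b in zip(waveform, restored)) / len(waveform)
    return 1 if mse < threshold else 0
```

Returning the difference value itself, or mapping it to several bands, would give an output of three or more values instead of a binary one.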
Next, a modification example of the clustering will be described. The ranking of the unknown waveform data UD by the clustering described in the above-described embodiment is performed based on the similarity of the shape to the ideal waveform. In order to improve the accuracy of discriminating minute differences in peak values in the waveform, it is preferable to perform, before executing the clustering, a filtering process that uses the evaluation criteria represented by Equations (1) to (4) to exclude unknown waveform data UD that does not satisfy the evaluation criteria from the plurality of unknown waveform data UD to be determined.
In this way, by performing the filtering process of excluding unknown waveform data UD that does not satisfy the evaluation criteria before executing the clustering, the unknown waveform data UD that does not satisfy the evaluation criteria is prevented from being ranked higher by the clustering. Accordingly, the accuracy of the determination process by the determination unit 61 is improved.
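The filtering before clustering can be sketched as a simple split of the unknown waveform data. The function name `split_by_criteria` and the record layout are hypothetical; the predicate `passes` stands for whatever check of the evaluation criteria is applied.

```python
def split_by_criteria(unknown, passes):
    """Split unknown waveform records into (kept, excluded) before clustering.

    unknown: list of records; passes: callable returning True when a record
    satisfies the evaluation criteria.
    """
    kept, excluded = [], []
    for record in unknown:
        (kept if passes(record) else excluded).append(record)
    return kept, excluded
```

Only the records in `kept` would then be combined with the teacher waveform data TD and clustered.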
In the modification example shown in
For example, as shown in
In the example shown in
The filtering process in step S30 is not limited to immediately after step S14, and may be executed after it is determined in step S15 that a plurality of the first-ranked unknown waveform data UD exist. In this case, the filtering process may be performed only on the plurality of first-ranked unknown waveform data UD.
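When the filtering is applied after the ranking, one possible way of lowering the rank is shown below. This is an assumption-laden sketch: the function name is hypothetical, and the text does not specify by how much the rank is lowered, so here every waveform that fails the criteria is simply moved below all waveforms that pass.

```python
def demote_failing(ranked_ids, passes):
    """Move waveforms that fail the criteria below all that pass.

    ranked_ids: waveform ids ordered best first; the relative order inside
    each group is preserved.
    """
    passing = [w for w in ranked_ids if passes(w)]
    failing = [w for w in ranked_ids if not passes(w)]
    return passing + failing
```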
The evaluation criteria used in the filtering process are not limited to Equations (1) to (4) and can be appropriately changed.
In the above-described embodiment, the myocardial cell is used as the cell, but it is also possible to use a cell such as a nerve cell instead of the myocardial cell.
In the above-described embodiment, for example, the hardware structure of a processing unit that executes various kinds of processing, such as the data acquisition unit 60, the determination unit 61, and the output unit 62, can be any of the various processors described below.
Various processors include a CPU, a programmable logic device (PLD), a dedicated electric circuit, and the like. As is well known, the CPU is a general-purpose processor that executes software (program) and functions as various processing units. The PLD is a processor whose circuit configuration can be changed after manufacturing, such as a field programmable gate array (FPGA). The dedicated electric circuit is a processor that has a dedicated circuit configuration designed to perform a specific process, such as an application specific integrated circuit (ASIC).
One processing unit may be configured of one of these various processors, or may be configured of a combination of two or more processors of the same type or different types (for example, a combination of a plurality of FPGAs or a combination of a CPU and an FPGA). In addition, a plurality of processing units may be configured of one processor. As an example in which the plurality of processing units are configured of one processor, first, one processor is configured of a combination of one or more CPUs and software and this processor functions as the plurality of processing units. Second, as typified by a system on chip (SoC) or the like, a processor that realizes the functions of the entire system including the plurality of processing units by using one IC chip is used. As described above, the various processing units are configured using one or more of the various processors as a hardware structure.
More specifically, the hardware structure of these various processors is an electric circuit (circuitry) in which circuit elements such as semiconductor elements are combined.
The present invention is not limited to the above-described embodiment, and it is needless to say that various configurations can be adopted without departing from the scope of the present invention. In addition to the program, the present invention extends to a computer-readable storage medium that stores the program in a non-transitory manner.
Number | Date | Country | Kind |
---|---|---|---|
2021-023760 | Feb 2021 | JP | national |
This application is a continuation application of International Application No. PCT/JP2021/043955, filed on Nov. 30, 2021, the disclosure of which is incorporated herein by reference in its entirety. Further, this application claims priority from Japanese Patent Application No. 2021-023760, filed on Feb. 17, 2021, the disclosure of which is incorporated herein by reference in its entirety.
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/JP2021/043955 | Nov 2021 | US |
Child | 18352213 | | US |