The present disclosure generally relates to a device and method for predicting a neoantigen. More specifically, some embodiments of the present disclosure relate to a system for implementing a sequence transduction neural network for transducing an input sequence, and a training method using the same.
Recently, various concepts and models such as neural networks have been developed in the field of artificial intelligence technology, and research on data prediction using the same has been actively conducted.
However, when predicting data based on an artificial intelligence-based neural network, it is necessary to develop training or prediction algorithms for the models to derive more accurate results, that is, results with a higher prediction probability.
Various embodiments disclosed in the present disclosure may provide an algorithm that derives more accurate results when predicting data based on or using an artificial intelligence-based neural network, and may provide a device and method for extracting a neoantigen candidate using an artificial intelligence-based neoantigen prediction model. By facilitating prediction of a neoantigen with the algorithm, immune anticancer vaccine treatment may be provided to a patient.
Objects of the present disclosure are not limited to the above-described object, and other objects that are not mentioned will be clearly understood by those skilled in the art from the following description.
A neural network implementation device for implementing, on at least one computer, a sequence transduction neural network for transducing an input sequence having each network input according to an embodiment of the present disclosure may include at least one memory, and at least one processor configured to communicate with the at least one memory, wherein the at least one processor may be configured to receive first input data, receive second input data corresponding to the first input data, train a sequence transduction neural network by performing an attention operation on the basis of the first input data and second input data labeled with predetermined label information, and determine output data that is output by the sequence transduction neural network trained on the basis of the first input data, the second input data, and the label information.
In addition, a training method for transducing an input sequence performed by a sequence transduction neural network on a computer device according to an embodiment of the present disclosure may include performing a predetermined pre-training operation on the basis of first input data to determine a first key and a first value, matching position information to each sequence of second input data when the second input data corresponding to the first input data is input, performing a predetermined self attention operation on the basis of a second key, a second value, and a second query that correspond to the second input data to generate a first query, and performing a predetermined attention operation on the basis of the first key, the first value, and the first query to determine output data corresponding to the first input data and the second input data.
In addition, a computer program stored in a computer-readable recording medium may be provided for implementing some embodiments of the present disclosure.
Additionally, a computer-readable recording medium configured to record a computer program may be provided for implementing certain embodiments of the present disclosure.
According to some embodiments of the present disclosure, an algorithm that can derive more accurate results when predicting data based on or using an artificial intelligence-based neural network can be provided, and immune anticancer vaccine treatment can be provided to a patient by facilitating prediction of a neoantigen using the algorithm.
Effects of the present disclosure are not limited to the above effect, and other effects that are not mentioned will be clearly understood by those skilled in the art from the following description.
Advantages and features of the present disclosure and methods of achieving them will become clear by referring to the embodiments described below in detail in conjunction with the accompanying drawings. However, the present disclosure is not limited to the embodiments disclosed below and can be implemented in various different forms, and the embodiments are merely provided to make the present disclosure complete and to fully inform those skilled in the art to which the present disclosure pertains of the scope of the present disclosure, and the present disclosure is only defined by the scope of the appended claims.
The terms used in the present disclosure are for describing the embodiments and are not intended to limit the present disclosure. In the present disclosure, the singular form also includes the plural form unless specifically stated in the phrase. The terms “comprises” and/or “comprising” that are used in the present specification do not exclude the presence or addition of one or more other components in addition to the components mentioned. The same reference numerals refer to the same components throughout the specification, and the phrase “and/or” includes each and any combination of one or more of the components mentioned. Although “first,” “second,” and the like are used to describe various components, it goes without saying that these components are not limited by these terms. These terms are merely used to distinguish one component from another. Therefore, it goes without saying that a first component mentioned below may also be a second component within the technical idea of the present disclosure.
Unless otherwise defined, all terms (including technical and scientific terms) used in the present specification may be used with meanings commonly understood by those skilled in the art to which the present disclosure pertains. In addition, terms defined in dictionaries that are commonly used are not ideally or excessively interpreted unless explicitly specifically defined otherwise.
The same reference numerals refer to the same components throughout the present disclosure. The present disclosure does not describe all elements of the embodiments, and common content in the art to which the present disclosure pertains or content that overlaps between the embodiments is omitted. Terms “unit,” “module,” “member,” and “block” used in the specification may be implemented as software or hardware, and according to embodiments, a plurality of “units,” “modules,” “members,” and “blocks” may be implemented as one component, or one “unit,” “module,” “member,” or “block” may also include a plurality of components.
Throughout the specification, when a first component is described as being “connected” to a second component, this includes not only a case in which the first component is directly connected to the second component but also a case in which the first component is indirectly connected to the second component, and the indirect connection includes connection through a wireless communication network.
In addition, when a certain portion is described as “including” a certain component, this means that it may further include other components rather than precluding other components unless especially stated otherwise.
Throughout the specification, when a first member is described as being positioned “on” a second member, this includes both a case in which the first member is in contact with the second member and a case in which a third member is present between the two members.
Terms such as first and second are used to distinguish one component from another, and the components are not limited by the above-described terms.
A singular expression includes plural expressions unless the context clearly dictates otherwise.
In each operation, identification symbols are used for convenience of description, and the identification symbols do not describe the sequence of each operation, and each operation may be performed in a different sequence from the specified sequence unless a specific sequence is clearly described in context.
The terms used in the following description are defined as follows.
A device for implementing a sequence transduction neural network according to some embodiments of the present disclosure may include various devices that can perform computational processing and provide results to a user. For example, a device for implementing a sequence transduction neural network may include one or more of a computer device, a server device, and a portable terminal.
Here, the computer device may include, for example, but not limited to, a notebook, a desktop, a laptop, a tablet personal computer (PC), a slate PC, etc., which are provided with a web browser.
The server device may be a server configured to process information, data, and signals in communication with an external device. For instance, the server device may include an application server, a computing server, a database server, a file server, a game server, a mail server, a proxy server, and a web server.
The portable terminal is, for example, a wireless communication device ensuring portability and mobility and may include all kinds of handheld-based wireless communication devices such as a personal communication system (PCS), a global system for mobile communications (GSM), a personal digital cellular (PDC), a personal handyphone system (PHS), a personal digital assistant (PDA), international mobile telecommunication-2000 (IMT-2000), code division multiple access-2000 (CDMA-2000), w-code division multiple access (W-CDMA), a wireless broadband internet (WiBro) terminal, and a smart phone, and wearable devices such as a watch, a ring, a bracelet, an anklet, a necklace, glasses, contact lenses, or a head-mounted device (HMD).
‘Antigen’ in the present disclosure is a substance that induces an immune response.
A neoantigen is a new protein formed in a cancer cell when a specific mutation occurs in tumor deoxyribonucleic acid (DNA).
The neoantigen is characterized by being caused by the mutation and expressed only in the cancer cell.
The neoantigen may include, for instance, but not limited to, a polypeptide sequence or a nucleotide sequence. The mutation may include a frameshift or non-frameshift indel, a missense or nonsense substitution, a splice site alteration, a genomic rearrangement or gene fusion, or any genomic or expression alteration causing a new open reading frame (ORF). The mutation may also include a splice variant. A post-translational modification specific to a tumor cell may include an abnormal phosphorylation. The post-translational modification specific to the tumor cell may also include a proteasome-generated spliced antigen.
‘Epitope’ in the present specification may refer to a specific portion of an antigen to which an antibody or a T-cell receptor is normally bound.
‘Major Histocompatibility Complex (MHC)’ in the present specification may be a protein that presents a ‘peptide’ synthesized in a specific cell on a surface of the cell, thereby enabling a T-cell to identify the cell.
‘Peptide’ in the present specification may be a polymer of amino acids. For convenience of explanation, hereinafter, the term ‘peptide’ refers to an amino acid polymer or an amino acid sequence presented on a surface of the cancer cell.
‘MHC-peptide complex’ in the present specification may be presented on the surface of the cancer cell and may be a complex structure of the MHC and the peptide. The T-cell recognizes the MHC-peptide complex to perform the immune response.
Hereinafter, the operation principles and embodiments of the present disclosure will be described with reference to the accompanying drawings.
Cancer is a disease in which a mutation arises in a normal cell, causing cells to proliferate indefinitely. Conventionally, surgery, radiation, and chemotherapy are common treatment methods for cancer. However, recently, research on immune anticancer vaccine treatment that utilizes the immune system of a human body has been actively conducted.
The cancer cell creates the neoantigen. The epitope of the neoantigen is presented on major histocompatibility complexes (MHC) located on the surface of the cancer cell. The T-cell recognizes the MHC-epitope complex to cause the immune response.
Therefore, it is necessary to predict MHC-peptide binding to identify the neoantigen created by the cancer cell.
Specifically, the cancer cell fragments an abnormal protein created by a gene mutation, and the abnormal peptide fragments are presented on the surface of the cell by the MHC. When the T-cell recognizes the abnormal peptide fragments, an immune response that destroys the corresponding cell occurs.
Therefore, when an abnormal peptide fragment that can move to the surface of the cell and bind to the T-cell is found among the abnormal substances generated by the cancer cell, directly synthesized, and injected into a patient, cancer can be treated because an antigen-presenting cell accelerates the maturation and proliferation of immature T-cells that can recognize the neoantigen.
Some embodiments of the present disclosure are intended to implement a sequence transduction neural network through training and to predict, based on the implemented sequence transduction neural network, whether the MHC binds to a peptide sequence and whether the T-cell is activated. A series of operations or algorithms therefor may be performed by a computer device, and a specific exemplary configuration of the computer device will be described below with reference to
Referring to
Specifically, an MHC feature 110 is input as first input data, and a peptide feature 120 is input as second input data. The sequence transduction neural network NN is trained regarding whether MHC binding is possible and whether the T-cell is activated based on the input data.
Therefore, the sequence transduction neural network NN for transducing an input sequence may be implemented.
Here, the pre-stored open data used for training may include, for each peptide, data on whether the MHC binding is possible and whether the T-cell is activated depending on the type of the MHC. Detailed descriptions thereof will be given below.
Referring to
Next, at operation 230, based on the first key, the first value, and the first query, the computer device performs a scaled dot product attention operation and concatenates each attention head to output a matrix in which each sequence is transduced into a vector. Next, at operation 240, the computer device performs a convolutional neural network (CNN) aggregation operation according to a peptide length, performs an embedding to transduce the matrix into a vector, and generates and outputs output data that has passed through a linear layer.
In an embodiment of the present disclosure, the output data may be data that extracts (or filters) only at least one MHC candidate that may activate the T-cell through binding.
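Purely as an illustration of how operations 230 and 240 might be realized, the sketch below aggregates per-residue attention outputs of a variable-length peptide with a 1-D CNN and a linear output layer; the module name, dimensions, mean pooling, and sigmoid output are assumptions, not part of the disclosure.

```python
import torch
import torch.nn as nn

class PeptideAggregator(nn.Module):
    """Hypothetical sketch: aggregate per-residue attention outputs of a
    variable-length peptide into a fixed-size vector, then apply a
    linear output layer (cf. operations 230 and 240)."""
    def __init__(self, d_model: int = 64, n_out: int = 1):
        super().__init__()
        # 1-D convolution along the sequence axis, one way of reading
        # the "CNN aggregation according to peptide length"
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=3, padding=1)
        self.linear = nn.Linear(d_model, n_out)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, peptide_length, d_model), the concatenated attention heads
        h = self.conv(x.transpose(1, 2))      # (batch, d_model, length)
        h = h.mean(dim=2)                     # pool over the peptide length
        return torch.sigmoid(self.linear(h))  # e.g. a binding probability

# Usage: a batch of two 9-mer peptides embedded into 64-dim vectors
scores = PeptideAggregator()(torch.randn(2, 9, 64))
print(scores.shape)  # torch.Size([2, 1])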
Referring to
Here, the self attention extracts the relationship among three elements, a key, a value, and a query, and may be a scaled dot product attention operation.
In an embodiment of the present disclosure, the computer device calculates an attention score between the first query for each sequence of the first input data and every first key, and obtains a probability distribution whose total becomes 1 by applying a softmax function.
This is referred to as an attention distribution, and each value is referred to as an attention weight value.
The computer device may calculate an attention value by performing a weighted sum of the attention weight value and a hidden state for each sequence and create a single vector by concatenating the attention value with a hidden state at time t.
That is, the computer device may perform an operation of combining a plurality of attention heads generated through a preset operation, for instance, the softmax function based on the first key, the first value, and the first query that correspond to each sequence of the first input data and the second input data.
In an embodiment of the present disclosure, the attention weight value corresponds to attention energy for the second input data corresponding to the first input data.
Therefore, an attention value matrix having all attention values for each sequence of the first input data and the second input data is calculated as a result. This may be represented as shown in Equation 1 below.

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V \tag{Equation 1}$$

Equation 1 is an example of the attention operation, and in Equation 1, Q denotes a query that performs the attention operation, K denotes a key, V denotes a value corresponding to the query and the key, and d_k denotes the dimension of the key vector.
The softmax function is an activation function used for multiple classes and may be a function that receives an N-dimensional vector and estimates the probability that the input belongs to each of the N classes. For example, the softmax function may be defined as shown in Equation 2 below.

$$\mathrm{softmax}(z)_k = \frac{e^{z_k}}{\sum_{i=1}^{n} e^{z_i}} \tag{Equation 2}$$

In Equation 2, n denotes the number of neurons in the output layer, and k denotes the index of the class.
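For concreteness, Equations 1 and 2 can be exercised with the short NumPy sketch below; the array shapes and the numerically stabilized softmax are illustrative assumptions.

```python
import numpy as np

def softmax(z: np.ndarray, axis: int = -1) -> np.ndarray:
    # Equation 2: exponentiate and normalize so each row sums to 1
    e = np.exp(z - z.max(axis=axis, keepdims=True))  # shift for stability
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Equation 1: scores between every query and every key, scaled by
    # sqrt(d_k), turned into attention weights by the softmax
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # attention score matrix
    weights = softmax(scores)         # attention distribution
    return weights @ V, weights       # attention values and weights

# Toy usage: 4 query positions attending over 6 key/value positions
rng = np.random.default_rng(0)
values, weights = scaled_dot_product_attention(
    rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 8)))
print(weights.sum(axis=-1))  # each attention distribution sums to 1
```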
The first key and the first value for the first input data are determined from the first input data, but the first query may be generated from the second input data, which will be described in detail below with reference to
The output data may be implemented as data that indicates the degree of matching between sequences according to the attention score value calculated by performing the self attention in the attention algorithm of the sequence transduction neural network.
Specifically, the output data may be generated such that the T-cell immunogenicity is matched to each sequence of the output data corresponding to the first input data, and a normalization operation may be performed on the T-cell immunogenicity corresponding to each sequence of the output data to generate label information corresponding to the second input data.
Referring to
Thereafter, a second key, a second value, and a second query that correspond to the second input data, that is, to each sequence of the second input data, are generated through input embedding and an operation with the attention weights, and, at operation 221, a predetermined self attention operation is performed based on the second key, the second value, and the second query. In this embodiment, the attention weights may be obtained through pre-training.
Here, the self attention is multi-head attention: num_heads parallel attention operations are performed on the second key, the second value, and the second query, each having a dimension obtained by dividing the dimension d_model by num_heads, and all of the attention heads (e.g., each attention value matrix) are then concatenated.
In this case, different attention weights W_Q, W_K, and W_V are imparted to each attention head. Thereafter, a value obtained by multiplying the matrix concatenating all of the attention heads by another weight matrix W_O is output as a final result value for the multi-head attention.
Thereafter, the output values are input into and pass through a linear layer to generate the first query for the first input data.
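A minimal sketch of this query-generating branch follows, assuming PyTorch's nn.MultiheadAttention (which internally applies the per-head W_Q, W_K, and W_V and the output projection W_O) and illustrative dimensions; the class name and sizes are hypothetical.

```python
import torch
import torch.nn as nn

d_model, num_heads = 64, 8  # d_model is split across num_heads sub-spaces

class QueryGenerator(nn.Module):
    """Hypothetical sketch: multi-head self attention over the peptide
    embedding, then a linear layer emitting the first query used to
    attend over the MHC representation."""
    def __init__(self):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.linear = nn.Linear(d_model, d_model)  # final linear layer

    def forward(self, peptide: torch.Tensor) -> torch.Tensor:
        # self attention: query, key, and value all come from the peptide
        out, _ = self.attn(peptide, peptide, peptide)
        return self.linear(out)  # the first query for the cross attention

peptide = torch.randn(2, 9, d_model)  # (batch, peptide length, d_model)
first_query = QueryGenerator()(peptide)
print(first_query.shape)              # torch.Size([2, 9, 64])
```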
Referring to
For example, the first input data may include at least one of pieces of information about the type of the MHC and the structure of the MHC.
Specifically, the first input data may be an MHC feature (e.g. a three-dimensional structure), and the MHC feature may use 181 sequence positions close to a binding site; as an example, this may be changed to 360 sequence positions depending on the type of the MHC. Meanwhile, the first input data may be provided as a sequence corresponding to a predetermined range based on a point at which the peptide sequence is bound to the MHC sequence.
Referring to
Referring to
The memory 810 stores data and/or various types of information that support or are associated with various functions of the computer device 800. The memory 810 may store a plurality of application programs (or applications) that are driven on or executed by the computer device 800, and data and commands for operating the computer device 800. At least some of the application programs may be downloaded from an external server via wireless communication. Meanwhile, the application program may be stored in the processor 820, installed on the computer device 800, and executed or driven to perform an operation (or a function) by the processor 820.
Specifically, the memory 810 stores at least one machine learning model and at least one process or operation to implement the sequence transduction neural network for transducing the input sequence.
Meanwhile, the memory 810 may include at least one type of storage medium such as a flash memory type, a hard disk type, a multimedia card micro type, a card-type memory (e.g., an SD or XD memory), a random access memory (RAM), a static RAM (SRAM), a read-only memory (ROM), an electrically erasable programmable ROM (EEPROM), a programmable ROM (PROM), a magnetic memory, a magnetic disk, or an optical disk. In addition, the memory may store information temporarily, permanently, or semi-permanently and may be provided as a built-in or detachable type.
The processor 820 may control some or all components in the computer device 800 to process signals, data, information, etc. that are input or output, or execute commands, algorithms, and application programs that are stored in the memory 810 to perform various processes or operations, and provide appropriate information or functions to each user or process the same by implementing the sequence transduction neural network for transducing the input sequence.
The processor 820 may perform a predetermined pre-training operation on the basis of first input data to determine a first key and a first value, receive the second input data corresponding to the first input data, match position information to each sequence of the second input data, perform a predetermined self attention operation on the basis of a second key, a second value, and a second query that correspond to the second input data to generate a first query value, and perform the predetermined attention operation on the basis of the first key, the first value, and the first query to determine output data corresponding to the first input data and the second input data.
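Putting the described flow together, a hedged sketch of the cross attention that consumes the first key, the first value, and the first query might read as follows; the batch sizes, the 181-position MHC length, and the use of nn.MultiheadAttention are assumptions for illustration, not the disclosed implementation.

```python
import torch
import torch.nn as nn

d_model = 64

class CrossAttention(nn.Module):
    """Hypothetical sketch of the claimed flow: the first key and first
    value are derived from the (pre-trained) MHC encoder, the first query
    from the peptide self attention, and scaled dot-product attention over
    them yields the joint representation behind the output data."""
    def __init__(self):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)

    def forward(self, first_query: torch.Tensor, mhc_repr: torch.Tensor):
        # key and value are both taken from the MHC representation
        out, _ = self.attn(first_query, mhc_repr, mhc_repr)
        return out

first_query = torch.randn(2, 9, d_model)   # from the peptide branch
mhc_repr = torch.randn(2, 181, d_model)    # e.g. 181 positions near the binding site
print(CrossAttention()(first_query, mhc_repr).shape)  # torch.Size([2, 9, 64])
```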
In addition, the processor 820 may perform an operation of combining a plurality of attention heads generated through a predetermined operation on the basis of the first key, the first value, and the first query that correspond to each sequence of the first input data and the second input data.
The operation of combining the attention heads may be defined as a concatenation operation.
The concatenation operation may refer to an operation that joins a plurality of sequences into a single sequence; here, it joins the outputs of the attention heads.
The processor 820 may be configured to output the output data including the attention energy for the second input data corresponding to the first input data.
The output data may be generated as output data in which T-cell immunogenicity is matched to each sequence of the output data corresponding to the first input data.
Meanwhile, the neural network may perform a normalization operation on the T-cell immunogenicity corresponding to the sequence of the output data to generate label information corresponding to the second input data.
When the neural network predicts the binding information of the peptide sequence and the MHC, the output data may include the binding information corresponding to the first input data and the second input data.
The binding information may include at least one of binding probability, the presence or absence of binding, binding strength, dissociation constant, or IC50.
On the other hand, when the neural network predicts the activation of the T-cell corresponding to the peptide sequence, the output data may include the probability of the activation of the T-cell corresponding to the first input data and the second input data.
The first input data according to an embodiment of the present disclosure may include at least one of pieces of information about the type of the MHC and/or the structure of the MHC.
In addition, the second input data includes a plurality of peptide sequences.
When the second input data includes at least one peptide sequence that is not bound to the MHC, the sequence transduction neural network implemented in the neural network implementation device may predict the binding information of the MHC and the peptide sequence.
On the other hand, when the second input data includes at least one peptide sequence that does not activate the T-cell, the sequence transduction neural network may be implemented to predict the probability of the activation of the T-cell.
Specifically, referring to Table 1 below, when the second input data includes a peptide sequence that is negative for MHC binding, the sequence transduction neural network may predict the binding information of the peptide sequence and the MHC.
On the other hand, referring back to Table 1, when the second input data includes the peptide sequence that is negative for the activation of the T-cell, the sequence transduction neural network may predict the probability of the activation of the T-cell.
Table 1 shows an example of data used for neural network training according to an embodiment of the present disclosure. The data used for training may include the peptide sequence matched with Kd. Kd denotes the dissociation constant of an antigen binding site. A lower Kd value may mean a higher affinity of the antibody for the antigen, and a higher Kd value may mean a lower affinity of the antibody for the antigen.
When the binding information of the peptide sequence and the MHC corresponding to the first input data, which is included in the output data, is less than a predetermined value, the processor 820 may modify the variables of the sequence transduction neural network using a loss function determined based on the binding information.
Here, the binding information, which will be predicted, may be determined based on the Kd value.
Kd may be matched in the form of a specific number, but may also be matched as a specific range (an inequality).
Each Kd can be used as label information when the neural network operates as a model that predicts the binding information of the MHC and the peptide sequence, and the neural network may reflect the range of the label information in the loss function.
In particular, when the training is performed using a peptide sequence whose Kd is given as a specific range, and the Kd of the output data corresponding to the input data is within the corresponding range, it may be determined that the model has no loss, and the training may be performed.
For example, in the case that the training is performed with the peptide sequence ‘GQIVTMFEA’ and the MHC type ‘A*0101,’ when Kd output by the neural network is 97 nM, a corresponding Kd is included in the label information ‘<100 nM,’ and thus it may be determined that the neural network has no loss, and the training may be performed.
According to another embodiment, in the case that the training is performed with the peptide sequence ‘MGQIVTMFE’ and MHC type ‘B*0101,’ when Kd output by the neural network is 20003 nM, a corresponding Kd is included in the label information ‘>20000 nM,’ and thus it may be determined that the neural network has no loss, and the training may be performed.
In summary, the label information of training data used for training in an embodiment of the present disclosure may include a specific value and a specific range. When the label information is a specific range and a value predicted by the neural network is included in the specific range, it may be determined that the neural network has no loss, and the training may be performed.
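A hedged sketch of such a range-aware loss is shown below; the function name, the squared-distance-to-boundary form, and the mapping of inequality labels to intervals (e.g. '<100 nM' to (0, 100) and '>20000 nM' to (20000, inf)) are assumptions, since the disclosure only states that a prediction inside the labeled range incurs no loss.

```python
import torch

def kd_range_loss(pred_kd: torch.Tensor, lo: float, hi: float) -> torch.Tensor:
    """Hypothetical range-aware loss: zero when the predicted Kd falls
    inside the labeled range, otherwise the squared distance to the
    nearest range boundary."""
    below = torch.clamp(lo - pred_kd, min=0.0)  # distance below the range
    above = torch.clamp(pred_kd - hi, min=0.0)  # distance above the range
    return (below + above) ** 2                 # zero loss inside the range

print(kd_range_loss(torch.tensor(97.0), 0.0, 100.0))                 # tensor(0.), no loss
print(kd_range_loss(torch.tensor(20003.0), 20000.0, float("inf")))   # tensor(0.), no loss
print(kd_range_loss(torch.tensor(150.0), 0.0, 100.0))                # tensor(2500.)
```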
Meanwhile, the above-described operation is only one embodiment of the present disclosure, and there is no limitation on the form of preprocessed data used by the neural network for training.
Meanwhile, the second input data may include at least one of an amino acid substitution matrix (BLOSUM) encoding and a physicochemical property (AAindex) encoding that correspond to the peptide sequence.
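As an illustration only, encoding a peptide by substitution-matrix rows could look like the sketch below; the two-letter toy matrix and the function name are hypothetical, and a real model would load the full 20x20 BLOSUM matrix and AAindex property tables instead.

```python
import numpy as np

# Hypothetical two-residue excerpt of a substitution matrix; real values
# would come from the full BLOSUM table.
TOY_MATRIX = {
    "A": {"A": 4, "G": 0},
    "G": {"A": 0, "G": 6},
}

def encode_peptide(peptide: str, matrix: dict) -> np.ndarray:
    """Encode each residue as its row of the substitution matrix,
    giving a (length, alphabet) feature array for the second input data."""
    alphabet = sorted(matrix)
    return np.array([[matrix[aa][b] for b in alphabet] for aa in peptide])

print(encode_peptide("GAG", TOY_MATRIX))
# [[0 6]
#  [4 0]
#  [0 6]]
```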
In addition, the first input data may be provided as a sequence corresponding to a predetermined range based on the point at which the peptide sequence is bound to the MHC sequence.
In addition, the at least one processor may generate the output data such that the T-cell immunogenicity is matched to each sequence of the output data corresponding to the first input data, and may perform a normalization operation on the T-cell immunogenicity corresponding to each sequence of the output data to generate label information corresponding to the second input data.
In addition, the processor 820 may generate the output data by additionally combining information corresponding to an experimental result with an embedding vector determined by performing the predetermined attention operation on the basis of the first key, the first value, and the first query.
Meanwhile, the computer device 800 may include one or more components (e.g. a transceiver, a receiver, and/or a transmitter) configured to communicate with an external device. For example, the computer device 800 may have the communication interface 830 for wireless communication and the input and output interface 840 for wired communication.
Specifically, the communication interface 830 may transmit and receive signals to and from the external device via a network 600 through wireless communication. To this end, the communication interface 830 may include at least one wireless communication module, short range communication module, and the like.
First, the wireless communication module may include one or more wireless communication modules configured to perform or support various wireless communication methods such as global system for mobile communication (GSM), code division multiple access (CDMA), wideband code division multiple access (WCDMA), universal mobile telecommunications system (UMTS), time division multiple access (TDMA), wireless local area network (WLAN), digital living network alliance (DLNA), wireless broadband (WiBro), worldwide interoperability for microwave access (WiMAX), high-speed downlink packet access (HSDPA), high-speed uplink packet access (HSUPA), long-term evolution (LTE), 4G, 5G, and 6G in addition to a Wi-Fi module and a wireless broadband module.
In addition, the short range communication module is configured for short range communication and may perform or support short range communication using at least one of Bluetooth™, radio frequency identification (RFID), infrared data association (IrDA), ultra-wideband (UWB), ZigBee, near field communication (NFC), wireless-fidelity (Wi-Fi), Wi-Fi direct, and wireless universal serial bus (Wireless USB) techniques.
Meanwhile, the input and output interface 840 may be connected to the input and output devices 250 in a wired manner, that is, through wired communication, to transmit and receive signals. To this end, the input and output interface 840 may include at least one wired communication module, and the wired communication module may include not only various wired communication modules such as a local area network (LAN) module, a wide area network (WAN) module, or a value-added network (VAN) module, but also various cable communication modules such as universal serial bus (USB), high definition multimedia interface (HDMI), digital visual interface (DVI), recommended standard 232 (RS-232), powerline communication, or plain old telephone service (POTS).
However, since one or some of the components shown in
Referring to
Thereafter, the attention layer 823 performs self attention based on the first key, the first value, and the first query, and the output layer 824 generates and outputs output data using the output value of the attention layer 823.
When the neural network NN operates as a model that predicts the binding of the MHC and the peptide sequence, the output layer 824 may output the output data including the MHC binding information corresponding to the first input data and the second input data.
That is, the output data may include the binding information between the MHC input as the first input data and the peptide sequence input as the second input data.
The loss function may be determined based on the difference between the binding information included in the output data and the label corresponding to specific first input data and second input data.
Based on this, training for predicting the peptide sequence that may be bound to the MHC may be performed by modifying each variable that constitutes the neural network NN in the neural network implementation device.
Meanwhile, when the neural network NN operates as a model that predicts the activation of the T-cell corresponding to the MHC and the peptide sequence, the output data output from the output layer 824 may include information or data regarding whether the T-cell corresponding to the first input data and the second input data is activated.
Specifically, the output layer 824 may output the output data in the form of a positive value when the T-cell is activated or a negative value when the T-cell is not activated.
According to an embodiment of the present disclosure, when the neural network NN operates as a model that predicts the activation of the T-cell corresponding to the MHC and the peptide sequence, the label information of the corresponding data may be generated using a beta distribution.
In order to label the data required for training, the processor 820 may perform a plurality of tests to output information or data regarding whether the T-cell corresponding to the first input data and the second input data is activated.
Meanwhile, the processor 820 according to an embodiment of the present disclosure may perform the plurality of tests based on or using the same first input data and second input data.
The neural network NN may output a value representing whether the T-cell corresponding to the first input data and the second input data is activated.
The processor 820 in which the neural network NN is implemented may determine the number of tests and the number of times the T-cell corresponding to the first input data and the second input data is activated.
The processor 820 implementing the neural network NN may generate the beta distribution corresponding to the first input data and the second input data on the basis of the number of times the T-cell is activated out of the number of tests.
That is, the processor 820 may output the activation of the T-cell for the first input data and the second input data as the beta distribution.
The beta distribution may be a continuous probability distribution defined on the interval [0, 1] according to two parameters α and β, which determine the shape of the distribution.
The probability density function of the beta distribution may be defined as shown in Equation 3 below.

$$f(x; \alpha, \beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)} \tag{Equation 3}$$

where f denotes the probability density function of the beta distribution, and B(α, β) denotes the beta function, which may be defined as shown in Equation 4 below.

$$B(\alpha, \beta) = \int_{0}^{1} t^{\alpha-1}(1-t)^{\beta-1}\,dt = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha+\beta)} \tag{Equation 4}$$
The processor 820 executing or driving the neural network may determine the mean and variance of the beta distribution.
In addition, the processor 820 may determine the label information corresponding to the first input data and the second input data on the basis of the mean and variance of the beta distribution. Specifically, the processor 820 may determine that the difference between the mean and the variance of the beta distribution is label information corresponding to the first input data and the second input data.
Typically, when a large number of tests are performed, the variance of the distribution may decrease, and when the number of tests is small, the variance may increase.
Based on this, the label corresponding to the first input data and the second input data, that is, the MHC and the peptide sequence, may be determined as shown in Equation 5 below.

$$L = E - V \tag{Equation 5}$$

where L denotes the label information determined by the processor 820, E denotes the mean of the beta distribution, and V denotes the variance of the beta distribution.
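A minimal sketch of this labeling rule follows; the Laplace-style parameterization α = activations + 1, β = failures + 1 is an assumption, as the disclosure does not fix how the beta distribution is fitted to the test counts.

```python
def beta_label(n_activated: int, n_tests: int) -> float:
    """Label from Equation 5: L = E - V, where E and V are the mean and
    variance of a beta distribution fitted to repeated activation tests.
    Assumes alpha = activations + 1 and beta = failures + 1 (one common
    choice; the disclosure does not fix the parameterization)."""
    a = n_activated + 1
    b = (n_tests - n_activated) + 1
    mean = a / (a + b)                             # E
    var = a * b / ((a + b) ** 2 * (a + b + 1))     # V
    return mean - var                              # L = E - V

# Many tests: small variance, so the label stays near the activation rate
print(round(beta_label(90, 100), 4))  # 0.8912
# Few tests: larger variance, so the label is pulled down further
print(round(beta_label(9, 10), 4))    # 0.8226
```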
Meanwhile, the operation of the layer constituting each of the neural networks illustrated in
Referring to
Referring to
Next, the second key, the second value, and the second query that correspond to the second input data, that is, to each sequence of the second input data, are generated through input embedding and an operation with the attention weights, and a predetermined self attention operation is performed based on the second key, the second value, and the second query (S1130).
Next, the output values are processed by or pass through the linear layer to generate the first query for the first input data (S1140).
Referring to
Next, the CNN aggregation operation is performed according to a length of the second input data such as a length of the peptide sequence (S1240), and embedding vectors for the first input data and the second input data are generated (S1250).
Next, the output data that is processed by or has passed through the linear layer is generated and output (S1260).
As described above, the output data may include the binding information of the MHC and the peptide sequence or the probability of the activation of the T-cell.
Referring to
In addition, the second input data may include the peptide sequence. When the neural network predicts the binding information of the MHC and the peptide sequence as in
In the embodiment of
Thereafter, the neural network that has received the first input data and the second input data may perform the attention operation using the query, the key, and the value that are derived from each data and predict the binding information of the MHC and the peptide sequence (S1330).
Referring to
In the embodiment of
At operation S1420, the second input data may include both of the peptide sequence that activates the T-cell and the peptide sequence that does not activate the T-cell.
The neural network that has received the first input data and the second input data may perform the attention operation using the query, the key, and the value that are derived from each data and predict the probability of the activation of the T-cell (S1430).
Meanwhile, disclosed embodiments may be implemented in the form of a recording medium in which computer-executable commands are stored. The commands may be stored in the form of program code, and when executed by a processor, program modules are generated to perform operations of the disclosed embodiments. The recording medium may be implemented as a computer-readable recording medium.
The computer-readable recording medium includes all types of recording media in which instructions decodable by a computer are stored. For example, there may be a ROM, a RAM, a magnetic tape, a magnetic disk, a flash memory, an optical data storage device, and the like.
As described above, the disclosed embodiments have been described with reference to the accompanying drawings. Those skilled in the art to which the present disclosure pertains will understand that the present disclosure may be implemented in different forms from the disclosed embodiments without departing from the technical spirit or essential features of the present disclosure. The disclosed embodiments are illustrative and should not be construed as limiting.
| Number | Date | Country | Kind |
|---|---|---|---|
| 10-2022-0105457 | Aug 2022 | KR | national |
This application is a continuation of International Application No. PCT/KR2023/012469, filed on Aug. 23, 2023, which claims priority from and the benefit of Korean Patent Application No. 10-2022-0105457, filed on Aug. 23, 2022, which are all hereby incorporated by reference in their entireties.
| Number | Date | Country | |
|---|---|---|---|
| Parent | PCT/KR2023/012469 | Aug 2023 | WO |
| Child | 19060594 | US |