The disclosure herein generally relates to the field of Electroencephalogram (EEG) classification, and, more particularly, to a method and system for EEG motor imagery classification.
Brain-computer interface (BCI) research has vast medical applications such as stroke rehabilitation and neural prosthesis. The BCI systems decode the patterns in brain activity to control electro-mechanical devices. One such BCI system is Electroencephalogram (EEG) based Motor Imagery. EEG is a popular method to record the brain's electrical activity due to its high temporal resolution and portability. Motor imagery (MI) can be defined as a subject's mental process of motor activity without actual movement. During the process of motor movement or imagery, the underlying functional connectivity within the sensory motor cortex changes resulting in an amplitude suppression or Event Related Desynchronization (ERD) or an amplitude increase or Event Related Synchronization (ERS) of central mu (8-12 Hz) and beta (18-25 Hz) rhythms. It is widely accepted that the process of MI involves similar cortical regions involved in the preparation or planning of movements. Hence, the motor imagery and movement preparation can be theorized as similar functional activities. Accurate classification of the neural activity is of great importance to enable these EEG based MI systems.
Spatio-temporal analysis of networks between lobes of the brain helps in learning about patterns of neuronal activity associated with motor functions. The most intuitive way to represent and analyze these networks is to use a graph, which is a topographical representation consisting of a set of nodes and edges. In graph-based studies, the nodes are typically defined in one of two ways. The first approach uses the source-space (the EEG signal source, i.e., the brain) to determine the nodes, usually captured from brain source imaging techniques or source reconstruction using EEG. Here, anatomical regions of the brain are used to generate the cortical networks or connectomes. The second approach uses the sensor-space (EEG electrode positions, e.g., the 10-20 electrode system) of the EEG acquisition system to define the nodes. The EEG signals acquired at sensor level are described as a linear combination of the active brain and non-brain electrical sources whose activities are volume conducted on the scalp. Due to volume conduction, many correlations exist between electrodes, which reduces the spatial resolution and affects the detection of various functional interactions between brain regions. Existing deep learning works employ the sensor-space for EEG graph representations, wherein the channels of the EEG are considered as nodes and the connections between the nodes are either predefined or based on certain heuristics. However, these representations are ineffective and fail to accurately capture the brain's underlying functional networks.
One recent work, EEG-GAT: Graph Attention Networks for Classification of Electroencephalogram (EEG) Signals by Andac Demir et al., discloses a method of generating EEG representations using a graph shift operator to learn the optimal connectivity pattern between EEG channels. However, this method does not learn the importance of each edge (i.e., the edge weight). Also, the graph embedding or representation is obtained using readout functions such as sum or mean, which are static functions and may not accurately represent the graph. Another recent work, EEG-GNN: Graph Neural Networks for Classification of Electroencephalogram (EEG) Signals by Andac Demir et al., represents the EEG signal as a graph, where nodes represent the EEG channels and the node connectivity (edges) is based on a policy formulated by a neuroscientist (for example, based on the distance between electrodes). The accuracy of the graph representation therefore depends largely on the expertise of the neuroscientist.
Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one embodiment, a method for EEG motor imagery classification is provided. The method includes receiving one or more Electroencephalogram (EEG) signals and corresponding ground truth labels comprising a ground truth graph label, a ground truth edge label, and a ground truth node label. Each of the one or more EEG signals comprises a plurality of channels. The method further includes extracting a plurality of temporal embeddings corresponding to each of the plurality of channels of each of the one or more EEG signals using a temporal feature extractor. Further, the method includes constructing one or more graphs corresponding to each of the one or more EEG signals. Each of the one or more graphs comprises a plurality of nodes corresponding to the plurality of channels and a weighted adjacency matrix defining connectivity between each pair of nodes among the plurality of nodes. Each of the plurality of nodes is associated with the plurality of temporal embeddings of the corresponding channel.
Furthermore, the method includes iteratively training a Graph Neural Network (GNN), the weighted adjacency matrix, a graph classifier, a node classifier, and an edge classifier for a plurality of pre-defined number of iterations by generating a plurality of node embeddings corresponding to each of the one or more EEG signals by the Graph Neural Network (GNN) using the one or more graphs; classifying each of the one or more EEG signals based on the corresponding plurality of node embeddings using the graph classifier, the node classifier, and the edge classifier to obtain a graph label, a node label and an edge label, wherein the graph label provides motor imagery classification, the node label provides quality of the EEG signal, and the edge label determines affinity between a pair of nodes among the plurality of nodes; and updating the GNN, the weighted adjacency matrix, the graph classifier, the node classifier, and the edge classifier based on a total loss obtained as a sum of a graph classification loss, a node classification loss, and an edge classification loss.
In another aspect, a system for EEG motor imagery classification is provided. The system includes a memory storing instructions; one or more communication interfaces; and one or more hardware processors coupled to the memory via the one or more communication interfaces, wherein the one or more hardware processors are configured by the instructions to receive one or more Electroencephalogram (EEG) signals and corresponding ground truth labels comprising a ground truth graph label, a ground truth edge label, and a ground truth node label. Each of the one or more EEG signals comprises a plurality of channels. The one or more hardware processors are further configured to extract a plurality of temporal embeddings corresponding to each of the plurality of channels of each of the one or more EEG signals using a temporal feature extractor. Further, the one or more hardware processors are configured to construct one or more graphs corresponding to each of the one or more EEG signals. Each of the one or more graphs comprises a plurality of nodes corresponding to the plurality of channels and a weighted adjacency matrix defining connectivity between each pair of nodes among the plurality of nodes. Each of the plurality of nodes is associated with the plurality of temporal embeddings of the corresponding channel.
Furthermore, the one or more hardware processors are configured to iteratively train a Graph Neural Network (GNN), the weighted adjacency matrix, a graph classifier, a node classifier, and an edge classifier for a plurality of pre-defined number of iterations by generating a plurality of node embeddings corresponding to each of the one or more EEG signals by the Graph Neural Network (GNN) using the one or more graphs; classifying each of the one or more EEG signals based on the corresponding plurality of node embeddings using the graph classifier, the node classifier, and the edge classifier to obtain a graph label, a node label and an edge label, wherein the graph label provides motor imagery classification, the node label provides quality of the EEG signal, and the edge label determines affinity between a pair of nodes among the plurality of nodes; and updating the GNN, the weighted adjacency matrix, the graph classifier, the node classifier, and the edge classifier based on a total loss obtained as a sum of a graph classification loss, a node classification loss, and an edge classification loss.
In yet another aspect, there are provided one or more non-transitory machine-readable information storage mediums comprising one or more instructions which, when executed by one or more hardware processors, cause a method for EEG motor imagery classification to be performed. The method includes receiving one or more Electroencephalogram (EEG) signals and corresponding ground truth labels comprising a ground truth graph label, a ground truth edge label, and a ground truth node label. Each of the one or more EEG signals comprises a plurality of channels. The method further includes extracting a plurality of temporal embeddings corresponding to each of the plurality of channels of each of the one or more EEG signals using a temporal feature extractor. Further, the method includes constructing one or more graphs corresponding to each of the one or more EEG signals. Each of the one or more graphs comprises a plurality of nodes corresponding to the plurality of channels and a weighted adjacency matrix defining connectivity between each pair of nodes among the plurality of nodes. Each of the plurality of nodes is associated with the plurality of temporal embeddings of the corresponding channel.
Furthermore, the method includes iteratively training a Graph Neural Network (GNN), the weighted adjacency matrix, a graph classifier, a node classifier, and an edge classifier for a plurality of pre-defined number of iterations by generating a plurality of node embeddings corresponding to each of the one or more EEG signals by the Graph Neural Network (GNN) using the one or more graphs; classifying each of the one or more EEG signals based on the corresponding plurality of node embeddings using the graph classifier, the node classifier, and the edge classifier to obtain a graph label, a node label and an edge label, wherein the graph label provides motor imagery classification, the node label provides quality of the EEG signal, and the edge label determines affinity between a pair of nodes among the plurality of nodes; and updating the GNN, the weighted adjacency matrix, the graph classifier, the node classifier, and the edge classifier based on a total loss obtained as a sum of a graph classification loss, a node classification loss, and an edge classification loss.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments.
Existing EEG based MI classification methods construct graphs using static adjacency matrices that are either pre-defined or based on certain heuristics. Hence, the constructed graph does not accurately represent the EEG signal. Embodiments of the present disclosure generate a representation (embedding) of the EEG signal by feeding a graph as input to a trained Graph Neural Network (GNN). The graph comprises nodes corresponding to EEG channels, and connectivity between the nodes is defined by a weighted adjacency matrix. Unlike state of the art methods, the GNN and the weighted adjacency matrix are trained together using training samples. Thus, the embedding generated by the trained GNN accurately represents the EEG signal and can therefore be used for accurate motor imagery classification (i.e., graph classification). In addition, the embedding can also be used for node and edge classification by using appropriate classifiers.
Referring now to the drawings, and more particularly to
The I/O interface device(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like and can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. In an embodiment, the I/O interface device(s) 106 receives an EEG signal as input and gives classification of the EEG signal as output. The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. Functions of the components of system 100 are explained in conjunction with flow diagrams depicted in
In an embodiment, the system 100 comprises one or more data storage devices or the memory 102 operatively coupled to the processor(s) 104 and is configured to store instructions for execution of steps of the method (200) depicted in
comprising a Graph Neural Network and classifiers for EEG signal classification, according to some embodiments of the present disclosure. The steps of method 200 are explained in conjunction with functional block diagram illustrated in
Once the one or more EEG signals are received, at step 204 of the method 200 the one or more hardware processors are configured to extract a plurality of temporal embeddings (alternatively referred to as temporal features) from each of the plurality of channels of each of the one or more EEG signals via a temporal feature extractor illustrated in
$$h^{0} = f(x, \theta_f) \qquad (1)$$
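As an illustration of equation 1, the sketch below maps each EEG channel to a small temporal embedding. The architecture of the temporal feature extractor is not reproduced in this text, so the choice of a 1D convolution followed by ReLU and mean pooling over time, and the names `conv1d_valid`, `temporal_embeddings` and `kernels`, are assumptions made purely for illustration.

```python
def conv1d_valid(signal, kernel):
    """Valid-mode 1D cross-correlation of one channel with one kernel."""
    k = len(kernel)
    return [sum(signal[t + i] * kernel[i] for i in range(k))
            for t in range(len(signal) - k + 1)]


def temporal_embeddings(x, kernels):
    """Map a trial x (channels x samples) to h0 (channels x len(kernels)).

    Each kernel stands in for one learnable filter of theta_f (assumed form).
    """
    h0 = []
    for channel in x:
        features = []
        for kernel in kernels:
            response = conv1d_valid(channel, kernel)
            activated = [max(0.0, r) for r in response]        # ReLU
            features.append(sum(activated) / len(activated))   # mean over time
        h0.append(features)
    return h0


# Toy trial: 2 channels, 6 samples each (made-up values).
x = [[0.0, 1.0, 0.0, -1.0, 0.0, 1.0],
     [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]]
kernels = [[1.0, -1.0], [0.5, 0.5]]
h0 = temporal_embeddings(x, kernels)
```

Every channel is processed with the same filters, so the embedding dimension is the same for all nodes of the graph built in the next step.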
Once the temporal features are extracted, one or more graphs (G) are constructed corresponding to each of the one or more EEG signals at step 206 of the method 200. Each graph comprises a plurality of nodes corresponding to each of the plurality of channels of the corresponding EEG signal among the one or more EEG signals and a weighted adjacency matrix ($A_W$) defining connectivity (alternatively referred to as edges) between each pair of nodes among the plurality of nodes. Each of the plurality of nodes is associated with the plurality of temporal embeddings ($h^{0}$) of the corresponding channel.
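The graph described above can be held in a very small data structure: one feature vector per channel plus an N x N matrix of edge weights. The sketch below is a minimal illustration; the class name `EEGGraph`, the helper `build_graph`, and the constant initial edge weight are assumptions, since the text does not state how the weighted adjacency matrix is initialised.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EEGGraph:
    node_features: List[List[float]]  # h0: one temporal embedding per channel
    adjacency: List[List[float]]      # A_W: trainable edge weights, N x N


def build_graph(h0, init_weight=1.0):
    """Create a fully connected graph over the channels.

    The constant initial weight is an illustrative assumption; the weights
    are meant to be updated during training together with the GNN.
    """
    n = len(h0)
    adjacency = [[init_weight for _ in range(n)] for _ in range(n)]
    return EEGGraph(node_features=[list(row) for row in h0],
                    adjacency=adjacency)


graph = build_graph([[0.4, 0.3], [0.0, 1.0], [0.2, 0.2]])
```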
Once the one or more graphs are constructed, at step 208 of the method 200, the one or more hardware processors are configured to iteratively train a Graph Neural Network (GNN), the weighted adjacency matrix, a graph classifier, a node classifier, and an edge classifier for a plurality of pre-defined number of iterations by steps 208A to 208C. At step 208A, a plurality of node embeddings ($h^{n}$) corresponding to each of the one or more EEG signals are generated by the GNN using the one or more graphs according to equation 2. As understood by a person skilled in the art, each node embedding is a feature vector that represents a node in each of the one or more graphs. Thus, the plurality of node embeddings together represent the entire graph and the corresponding EEG signal. In the first iteration of training, the temporal embeddings associated with the nodes of the one or more graphs are considered as $h^{old}$ in equation 2. In the subsequent iterations, the node embeddings generated at the previous iteration are considered as $h^{old}$. $h^{new}$ represents the updated node embedding generated by the GNN at the current iteration. In general, the i-th node embedding is given by equation 3, wherein $N_i$ represents the set of nodes that are in the neighborhood of node i in the one or more graphs, $W$ represents a transformation matrix which is used to transform $h^{old}$ to $h^{new}$ and can be updated during training of the GNN, $b$ represents the bias, and $\sigma$ represents a non-linearity (for example, ReLU). The node embedding generated at the end of the plurality of iterations is given by equation 4, wherein $h^{0}$ is the temporal embedding, $A_W$ is the weighted adjacency matrix, and $\theta_{GNN}$ denotes the weights of the GNN.
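$$h^{new} = GNN(h^{old}, A_W; \theta_{GNN}) \qquad (2)$$
$$h_i^{new} = \sigma\Big(\sum_{j \in N_i} A_{j,i}\, h_j^{old}\, W + b\Big) \qquad (3)$$
$$h^{n} = GNN(h^{0}, A_W; \theta_{GNN}) \qquad (4)$$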
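The node update of equation 3 can be sketched in a few lines of plain Python: each node sums its neighbours' old embeddings weighted by the adjacency entries, applies the transformation matrix and bias, and passes the result through ReLU. The function name `gnn_layer`, the optional `neighbors` argument, and the toy numbers are illustrative assumptions, not the implementation used in the experiments.

```python
def gnn_layer(h_old, A, W, b, neighbors=None):
    """One node update: h_i_new = ReLU(sum_j A[j][i] * h_j_old @ W + b)."""
    n, d_in, d_out = len(h_old), len(h_old[0]), len(W[0])
    h_new = []
    for i in range(n):
        # Default neighbourhood: every node (fully connected graph).
        nbrs = neighbors[i] if neighbors is not None else range(n)
        # Aggregate neighbour embeddings weighted by the adjacency entries.
        agg = [0.0] * d_in
        for j in nbrs:
            for k in range(d_in):
                agg[k] += A[j][i] * h_old[j][k]
        # Linear transformation followed by the non-linearity.
        out = [sum(agg[k] * W[k][m] for k in range(d_in)) + b[m]
               for m in range(d_out)]
        h_new.append([max(0.0, v) for v in out])
    return h_new


h_old = [[1.0, 2.0], [3.0, -1.0]]
A = [[0.5, 1.0], [0.0, 2.0]]   # A[j][i]: weight of the edge from node j to node i
W = [[1.0, 0.0], [0.0, 1.0]]   # identity transformation for readability
b = [0.0, -0.5]
h_new = gnn_layer(h_old, A, W, b)
```

Because the adjacency entries enter the update as ordinary multiplicative weights, gradients of the loss flow to them in the same way as to `W` and `b`, which is what allows the adjacency matrix to be learnt jointly with the GNN.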
Once the node embeddings are generated, each of the one or more EEG signals is classified, at the step 208B, based on the corresponding plurality of node embeddings using the graph classifier, the node classifier, and the edge classifier to obtain a graph label, a node label, and an edge label. The graph label provides motor imagery classification, the node label provides the quality of the EEG signal (for example, good or poor), and the edge label determines the affinity between a pair of nodes among the plurality of nodes (for example, strong or weak). In an embodiment, the node embeddings are vectorized or flattened before being fed into the graph classifier. Similarly, the embeddings of a pair of nodes are concatenated before being fed into the edge classifier, while each node embedding is fed as is into the node classifier. Example architectures of the graph classifier, the node classifier and the edge classifier are given in tables 2, 3 and 4, respectively. Here, N represents the number of channels in the one or more EEG signals, T represents the length of the one or more EEG signals expressed in terms of samples, hin and hout represent the number of input and output neurons of the fully connected (FC) layer, nodes is the number of node categories, and edge is the number of edge categories.
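The three classifiers consume the same node embeddings in different shapes. The short sketch below only shows how the inputs could be prepared; the function names and the ordering of the concatenated pair (node i first, then node j) are assumptions for illustration.

```python
def graph_classifier_input(h):
    """Flatten all node embeddings into one vector of length N * d."""
    return [value for row in h for value in row]


def edge_classifier_input(h, i, j):
    """Concatenate the embeddings of the two end nodes of edge (i, j)."""
    return list(h[i]) + list(h[j])


def node_classifier_input(h, i):
    """A single node embedding is passed through unchanged."""
    return list(h[i])


h = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # 3 nodes, embedding size 2
flat = graph_classifier_input(h)
pair = edge_classifier_input(h, 0, 2)
single = node_classifier_input(h, 1)
```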
The graph label $\hat{y}_{graph}$ is predicted according to equation 5, wherein $h$ is the vectorized node embedding, $\theta_{graph}$ represents the parameters of the graph classifier, and $Cls_{graph}$ represents the graph classifier. Once the graph label is predicted, a graph classification loss is calculated based on the predicted graph label and the ground truth graph label. An example calculation is given by equation 6, wherein CE is the cross-entropy loss, which quantifies how far the predicted classification $\hat{y}_{graph}$ is from the ground truth $y_{graph}$. As understood by a person skilled in the art, other loss functions such as variants of cross-entropy loss, hinge loss, supervised contrastive loss, etc. can be used instead in alternate embodiments.
The node label $\hat{y}_{node}$ for the i-th node in a graph among the one or more graphs is predicted according to equation 7, wherein $h_i$ is the node embedding of the i-th node, $Cls_{node}$ represents the node classifier, and $\theta_{node}$ represents the parameters of the node classifier. Once the node label is predicted, a node classification loss is calculated based on the predicted node label and the ground truth node label. An example calculation is given by equation 8, wherein N is the total number of nodes in a graph among the one or more graphs and CE is the cross-entropy loss between the ground truth $y_{node}$ and the predicted classification $\hat{y}_{node}$.
The edge label $\hat{y}_{edge}$ for an edge connecting a node i and a node j in a graph among the one or more graphs is predicted according to equation 9, wherein $h_i$ and $h_j$ are the node embeddings of node i and node j respectively, $Cls_{edge}$ represents the edge classifier, and $\theta_{edge}$ represents the parameters of the edge classifier. All the edges in the graph can be classified in a similar way to obtain the edge labels. Once the edges are classified, an edge classification loss is calculated based on the predicted edge label and the ground truth edge label. An example calculation is given in equation 10, wherein |E| is the total number of edges in a graph among the one or more graphs and CE is the cross-entropy loss between the ground truth $y^{i,j}_{edge}$ and the predicted classification $\hat{y}^{i,j}_{edge}$ for the edge $e_{i,j}$ connecting node i and node j.
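Equations 5 to 10 are referenced above but are not reproduced in this text. One form that is consistent with the symbol definitions given above is sketched here. The averaging over nodes and over edges, and the superscript indexing, are assumptions inferred from the statements that N is the total number of nodes and |E| the total number of edges; they are not the original notation.

```latex
\begin{align}
\hat{y}_{graph} &= Cls_{graph}(h;\ \theta_{graph}) \tag{5}\\
\mathcal{L}_{graph} &= CE\big(y_{graph},\ \hat{y}_{graph}\big) \tag{6}\\
\hat{y}^{\,i}_{node} &= Cls_{node}(h_i;\ \theta_{node}) \tag{7}\\
\mathcal{L}_{node} &= \frac{1}{N}\sum_{i=1}^{N} CE\big(y^{\,i}_{node},\ \hat{y}^{\,i}_{node}\big) \tag{8}\\
\hat{y}^{\,i,j}_{edge} &= Cls_{edge}(h_i, h_j;\ \theta_{edge}) \tag{9}\\
\mathcal{L}_{edge} &= \frac{1}{|E|}\sum_{e_{i,j}\in E} CE\big(y^{\,i,j}_{edge},\ \hat{y}^{\,i,j}_{edge}\big) \tag{10}
\end{align}
```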
Once the one or more EEG signals are classified and classification losses are calculated, a total loss is determined as sum of graph classification loss, node classification loss and the edge classification loss according to equation 11. Further, the GNN, the weighted adjacency matrix, the graph classifier, the node classifier, and the edge classifier are updated based on the total loss at the step 208C. Unlike state of the art methods where the weighted adjacency matrix is pre-defined or based on certain heuristics, in the method of present disclosure, it is learnt along with the GNN to accurately represent the EEG signals.
$$\mathcal{L}_{total} = \mathcal{L}_{graph} + \mathcal{L}_{node} + \mathcal{L}_{edge} \qquad (11)$$
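A small numerical sketch of equation 11 follows. It computes a cross-entropy loss from raw classifier scores (logits) for one graph, two nodes and one edge, and then sums the three terms. The logits and labels are made-up values, and averaging the node and edge losses over their counts is an assumption rather than something stated in the text.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def mean_cross_entropy(logits_list, targets):
    """Average cross-entropy over a list of items (nodes or edges)."""
    losses = [cross_entropy(l, t) for l, t in zip(logits_list, targets)]
    return sum(losses) / len(losses)


graph_loss = cross_entropy([2.0, 0.0], 0)                        # one graph
node_loss = mean_cross_entropy([[0.0, 0.0], [1.0, 1.0]], [0, 1])  # two nodes
edge_loss = mean_cross_entropy([[0.0, 0.0]], [1])                 # one edge
total_loss = graph_loss + node_loss + edge_loss                   # equation 11
```

Since the three terms are added without weights, each task contributes to the gradients of the shared GNN and adjacency matrix in proportion to the size of its own loss.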
For updating the GNN, the weighted adjacency matrix, the graph classifier, the node classifier, and the edge classifier a first, a second, a third, a fourth and a fifth gradient of the total loss with respect to parameters of the GNN, the weighted adjacency matrix, the graph classifier, the node classifier, and the edge classifier, respectively are calculated. Then, the GNN, the weighted adjacency matrix, the graph classifier, the node classifier, and the edge classifier are updated using an optimizer (for example, Adam optimizer) based on the computed first, second, third, fourth and fifth gradient.
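The update step described above can be sketched as one Adam step applied to each of the five parameter groups, using the gradient of the same total loss with respect to each group. The flattened parameter lists, the gradient values and the helper names `adam_step` and `new_state` are hypothetical; in practice the gradients would come from automatic differentiation.

```python
import math


def new_state(n):
    """Fresh Adam state (step counter and moment estimates) for n parameters."""
    return {"t": 0, "m": [0.0] * n, "v": [0.0] * n}


def adam_step(params, grads, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update in place to a flat list of parameters."""
    state["t"] += 1
    t = state["t"]
    for k, g in enumerate(grads):
        state["m"][k] = beta1 * state["m"][k] + (1 - beta1) * g
        state["v"][k] = beta2 * state["v"][k] + (1 - beta2) * g * g
        m_hat = state["m"][k] / (1 - beta1 ** t)   # bias-corrected first moment
        v_hat = state["v"][k] / (1 - beta2 ** t)   # bias-corrected second moment
        params[k] -= lr * m_hat / (math.sqrt(v_hat) + eps)


# Hypothetical flattened parameter groups and gradients of the total loss.
groups = {
    "gnn": [1.0, 1.0],
    "adjacency": [0.5],
    "graph_classifier": [0.0],
    "node_classifier": [0.0],
    "edge_classifier": [0.0],
}
grads = {
    "gnn": [0.5, -2.0],
    "adjacency": [1.0],
    "graph_classifier": [-1.0],
    "node_classifier": [0.0],
    "edge_classifier": [3.0],
}
states = {name: new_state(len(p)) for name, p in groups.items()}
for name in groups:
    adam_step(groups[name], grads[name], states[name], lr=0.1)
```

On the first step Adam moves every parameter with a non-zero gradient by approximately the learning rate, in the direction opposite to the sign of its gradient.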
Once the GNN, the weighted adjacency matrix and the classifiers are trained by the step 208, they can be used for classifying any input EEG signal as illustrated in
The experimental results are provided to highlight the effectiveness of the method of the present disclosure on the EEG MI classification task (i.e., graph classification). PyTorch with the Deep Graph Library (DGL) is used for the experiments. The term ‘model’ is used to refer collectively to the GNN, the temporal feature extractor, and the graph classifier.
Dataset: The dataset used is the PhysioNet EEG motor movement and imagery dataset (EEG-MMIDB) (G. Schalk et al., “BCI2000: a general-purpose brain-computer interface (BCI) system,” IEEE Transactions on Biomedical Engineering, vol. 51, no. 6, pp. 1034-1043, 2004; A. L. Goldberger et al., “PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals,” Circulation, vol. 101, no. 23, pp. e215-e220, 2000). The dataset contains EEG signals recorded from 109 subjects. The EEG signals were obtained via the BCI2000 system using 64 channels at a sampling rate of 160 Hz. Each subject participated in 3 trials of left fist, right fist, both fists and both feet motor imagery and movement. The EEG signals are pre-processed by performing average re-referencing and z-scoring before performing the method of the present disclosure, and no other artifact removal techniques are used.
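The two pre-processing steps named above can be sketched as follows. Average re-referencing subtracts, at every time sample, the mean over all channels; z-scoring is shown here per channel across time, which is an assumption since the text does not say along which axis it is applied. The function names and the toy data are illustrative.

```python
from statistics import mean, pstdev


def average_rereference(x):
    """Subtract the across-channel mean from every channel at each sample."""
    n_channels, n_samples = len(x), len(x[0])
    reference = [sum(x[c][t] for c in range(n_channels)) / n_channels
                 for t in range(n_samples)]
    return [[x[c][t] - reference[t] for t in range(n_samples)]
            for c in range(n_channels)]


def zscore_channels(x, eps=1e-8):
    """Standardise each channel to zero mean and unit variance over time."""
    out = []
    for channel in x:
        mu, sd = mean(channel), pstdev(channel)
        out.append([(v - mu) / (sd + eps) for v in channel])  # eps avoids /0
    return out


# Toy recording: 3 channels, 4 samples (made-up values).
x = [[1.0, 2.0, 3.0, 4.0],
     [3.0, 2.0, 1.0, 0.0],
     [2.0, 5.0, 2.0, 5.0]]
rereferenced = average_rereference(x)
normalised = zscore_channels(rereferenced)
```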
Following are the details of the dataset used for experiments: (i) Two class subset: For the two-class classification task, EEG signals of subjects performing the following activities are considered—(a) imagine open and close of left fist, and (b) imagine open and close of right fist; (ii) Four class subset: For the four-class classification task, EEG signals of subjects performing the following activities are considered—(a) imagine open and close of left fist, (b) imagine open and close of right fist, (c) imagine open and close of both feet; and (d) resting. Two versions of the EEG trial data are taken for training and testing of the models. They are: (i) 6 seconds duration: The duration of each EEG trial is 6 seconds. The trial data consists of 4 seconds of imagery activity, and 1 second of resting-state before and after the start of imagery activity; (ii) 3 seconds duration: The duration of each EEG trial is 3 seconds. The trial data consists of first 3 seconds of imagery activity.
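At the stated sampling rate of 160 Hz, a 6-second trial contains 960 samples and a 3-second trial contains 480 samples. The sketch below cuts such windows around the onset of an imagery cue. The layout assumed for the 6-second version (1 second before onset, 4 seconds of imagery, 1 second afterwards) is one reading of the description above, and the function names are hypothetical.

```python
SAMPLING_RATE_HZ = 160


def trial_window(onset_sample, version):
    """Return (start, stop) sample indices for a trial around the cue onset."""
    if version == "6s":
        start = onset_sample - 1 * SAMPLING_RATE_HZ   # 1 s of rest before onset
        stop = onset_sample + 5 * SAMPLING_RATE_HZ    # 4 s imagery + 1 s after
    elif version == "3s":
        start = onset_sample                           # first 3 s of imagery
        stop = onset_sample + 3 * SAMPLING_RATE_HZ
    else:
        raise ValueError("version must be '6s' or '3s'")
    return start, stop


def extract_trial(recording, onset_sample, version):
    """Slice every channel of a (channels x samples) recording."""
    start, stop = trial_window(onset_sample, version)
    if start < 0 or stop > len(recording[0]):
        raise ValueError("trial window falls outside the recording")
    return [channel[start:stop] for channel in recording]


recording = [[0.0] * 2000, [1.0] * 2000]   # toy 2-channel recording
trial_6s = extract_trial(recording, 1000, "6s")
trial_3s = extract_trial(recording, 1000, "3s")
```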
Evaluation: The performance of the proposed method is evaluated in a cross-subject setting wherein the model is evaluated on EEG signals from subjects that are not part of the training set, i.e., disjoint sets of subjects are used for training and testing. The mean accuracy of the model for 5-fold cross-validation is reported and compared with the following methods: (i) Park et al. (“Augmented complex common spatial patterns for classification of noncircular EEG from motor imagery tasks,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 22, no. 1, pp. 1-10, 2013): The method uses a complex-valued common spatial pattern (CSP) algorithm with the Strong Uncorrelating Transform (SUT) for feature extraction. A Support Vector Machine (SVM) is trained on the extracted features for the motor imagery classification task; (ii) Dose et al. (“A deep learning MI-EEG classification model for BCIs,” in EUSIPCO, 2018, pp. 1676-1679): This method represents EEG signals as a 2D grid, where each row represents the signal of a particular EEG lead. 2D convolution layers are used to extract the spatio-temporal features. These features are fed to a fully connected layer to obtain the final prediction. This method considers raw EEG signals for feature extraction. The experimental setup and evaluation protocol used to evaluate the method of the present disclosure are similar to those of Dose et al.
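The cross-subject protocol can be sketched as a 5-fold split over subject identifiers, so that no subject contributes trials to both the training and the test set of a fold. The shuffling, the seed and the function name `subject_folds` are assumptions; the text does not specify how subjects were assigned to folds.

```python
import random


def subject_folds(subject_ids, n_folds=5, seed=0):
    """Return a list of (train_subjects, test_subjects) pairs, one per fold."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)           # reproducible shuffle
    folds = [ids[k::n_folds] for k in range(n_folds)]
    splits = []
    for k in range(n_folds):
        test = set(folds[k])
        train = [s for s in ids if s not in test]
        splits.append((train, sorted(test)))
    return splits


# 109 subjects, numbered 1 to 109 as in the dataset description.
splits = subject_folds(range(1, 110))
fold_sizes = [len(test) for _, test in splits]
```

Splitting by subject rather than by trial is what makes the reported accuracy a measure of generalisation to unseen subjects.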
Results: Table 5 shows the performance of the models on the two-class dataset. The accuracy of the model in the subject independent setting is shown. The method of present disclosure achieves a mean accuracy of 82.92%. Further, it can be observed that the performance of the method of present disclosure is better than the existing approaches. Table 6 shows the performance of the models on the four-class dataset. The models are trained and tested on EEG signals with different durations. The method of present disclosure achieves a mean accuracy of 64.38% and 68.60% on the cross-validation set containing EEG signals with a duration of 3 and 6 seconds, respectively.
Ablation study: The following ablation experiments are performed to highlight the effectiveness of the method of the present disclosure. (i) Ablation-1: Temporal convolution and GNN with a fixed adjacency matrix. In this experiment, the temporal feature for each channel is obtained by the temporal convolution block. Then the GNN with a fixed adjacency matrix is used to aggregate channel features. This experiment is performed to highlight the need for a GNN with a trainable adjacency matrix. Table 7 shows the result for this experiment. It can be observed that the performance of the model drops from 68.60% to 59.22%. (ii) Ablation-2: Temporal convolution with pooling-based channel feature aggregation. In this experiment, the temporal features of each channel are obtained by the temporal convolution block. Instead of the GNN, a pooling layer (such as mean, sum, or max) is used to aggregate the channel features. This experiment is performed to highlight the importance of the GNN with a trainable weighted adjacency matrix. Table 7 shows the result for this experiment. It can be observed that the performance of the model drops from 68.60% to 56.97%, 57.86%, and 57.71% for the mean, max, and sum pooling layers, respectively.
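The pooling baselines of Ablation-2 replace the GNN with a fixed readout over the channel axis. A minimal sketch of the three readouts follows; the function name `pool_channels` and the toy features are illustrative.

```python
def pool_channels(h, mode):
    """Aggregate channel features (channels x d) into one vector of size d."""
    columns = list(zip(*h))   # one tuple per feature dimension
    if mode == "mean":
        return [sum(c) / len(c) for c in columns]
    if mode == "sum":
        return [sum(c) for c in columns]
    if mode == "max":
        return [max(c) for c in columns]
    raise ValueError("mode must be 'mean', 'sum' or 'max'")


h = [[1.0, 4.0], [3.0, 2.0]]   # 2 channels, 2 features each
pooled_mean = pool_channels(h, "mean")
pooled_sum = pool_channels(h, "sum")
pooled_max = pool_channels(h, "max")
```

These readouts have no trainable parameters and treat every channel identically, so they cannot express channel-to-channel relationships the way a learnt weighted adjacency matrix can.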
The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g., any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g., hardware means like e.g., an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g., an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g., using a plurality of CPUs.
The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.
Number | Date | Country | Kind |
---|---|---|---|
202221053592 | Sep 2022 | IN | national |
This U.S. patent application claims priority under 35 U.S.C. § 119 to: Indian Patent Application No. 202221053592, filed on Sep. 19, 2022. The entire contents of the aforementioned application are incorporated herein by reference.