Automated diagnostic systems utilizing, for example, machine learning have been playing an increasingly important role in healthcare. Over the last few years, machine learning techniques (especially neural networks or deep neural networks) have been successfully applied to medical image classification. Classification modules may be used to provide an indication of the presence of a certain anatomy, pathology, object and/or organ in an image, but do not provide information with respect to the spatial location of the identified classification. Some techniques for generating visual explanations associated with the output of a classifier (e.g., a deep neural network classifier) have been proposed; these methods provide a means for measuring the impact of individual input voxels on the classifier decision. In some cases, however, these methods are limited in their practical applicability, as the resulting attribution heat maps may be diffuse and difficult to interpret.
The exemplary embodiments are directed to a computer-implemented method of training a machine learning module to provide classification and localization information for an image study, comprising: receiving a current image study; applying the machine learning module to the current image study to generate a classification result including a prediction for one or more class labels for the current image study using a classification module of the machine learning module; receiving, via a user interface, a user input indicating a spatial location corresponding to a predicted class label; and training a localization module of the machine learning module using the user input indicating the spatial location corresponding to the predicted class label.
The exemplary embodiments are directed to a system of training a machine learning module to provide classification and localization information for an image study, comprising: a non-transitory computer readable storage medium storing an executable program; and a processor executing the executable program to cause the processor to: receive a current image study; apply the machine learning module to the current image study to generate a classification result including a prediction for one or more class labels for the current image study using a classification module of the machine learning module; receive, via a user interface, a user input indicating a spatial location corresponding to a predicted class label; and train a localization module of the machine learning module using the user input indicating the spatial location corresponding to the predicted class label.
The exemplary embodiments are directed to a non-transitory computer-readable storage medium including a set of instructions executable by a processor, the set of instructions, when executed by the processor, causing the processor to perform operations, comprising: receiving a current image study; applying a machine learning module to the current image study to generate a classification result including a prediction for one or more class labels for the current image study using a classification module of the machine learning module; receiving, via a user interface, a user input indicating a spatial location corresponding to a predicted class label; and training a localization module of the machine learning module using the user input indicating the spatial location corresponding to the predicted class label.
The exemplary embodiments may be further understood with reference to the following description and the appended drawings, wherein like elements are referred to with the same reference numerals. The exemplary embodiments relate to systems and methods for machine learning and, in particular, relate to systems and methods for dynamically extending and/or modifying a machine learning module. The machine learning module comprises a pre-trained classification module, which identifies a class label for a particular image study, and an untrained or partially trained localization module, which is to be trained using relevant spatial information provided by a user based on the identified class label and/or the image study. Thus, once the localization module has been trained to a stable state, the machine learning module may autonomously provide both a class label and a relevant spatial location for an image study. The classification module may also be configured to continually adapt based on other user input such as, for example, the addition of new classes and/or corrections to an identified class label. It will be understood by those of skill in the art that although the exemplary embodiments are shown and described with respect to X-ray images or image studies, the systems and methods of the present disclosure may be similarly applied to any of a variety of medical imaging modalities in any of a variety of medical fields for any of a variety of different pathologies and/or target areas of the body.
As shown in
In some embodiments, the classification module 112 of the machine learning module 110 has been pre-trained, during manufacturing, with training data including image studies (e.g., x-ray images or image studies) that have corresponding classification information so that the machine learning module 110 is delivered to a clinical site (e.g., hospital) with classification capabilities. Thus, the classification module 112 is trained to provide a medical image classification (e.g., class label) based on an image being analyzed. Image classifications provide, for example, an indication of a presence of a particular anatomy, pathology, object, organ, etc. Classes may include, for example, the presence of effusion, fractures, nodules, support devices, etc. Although the classification module 112 has been pre-trained, the classification module 112 may be configured to continually adapt by learning new user inputs such as, for example, new classes and/or classification corrections. In some embodiments, the classification module 112 may include an internal module such as, for example, an image classification module.
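By way of a non-limiting sketch (the label names, backbone choice and layer sizes below are illustrative assumptions rather than details of the described embodiments), a multi-label classification module of this kind may be expressed in Python/PyTorch as follows:

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Illustrative label set; the actual classes depend on the clinical application.
CLASS_LABELS = ["effusion", "fracture", "nodule", "support_device"]

class ClassificationModule(nn.Module):
    """Multi-label classifier: one independent probability per possible finding."""

    def __init__(self, num_classes: int = len(CLASS_LABELS)):
        super().__init__()
        # A publicly pre-trained backbone stands in for the factory-trained weights.
        self.backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, num_classes)

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        # Sigmoid (rather than softmax) because several findings may be present at once.
        return torch.sigmoid(self.backbone(image))
```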
While the classification module 112 is pre-trained, the localization module 114 may be manufactured and delivered to the clinical site in an untrained state. Thus, with each use of the machine learning module 110, user input including spatial location information may be used to train the localization module 114 so that once the localization module is trained to a stable state, the localization module 114 will be capable of identifying a relevant spatial location of an identified class for a particular image study. In some embodiments, user inputs indicating relevant spatial information may include, for example, a bounding box drawn over a relevant portion of the image study. In some embodiments, the localization module 114 may include an internal module for bounding box detection. It will be understood by those of skill in the art that although the localization module 114 is described as being manufactured and delivered in an untrained state, the localization module 114 may also be delivered in a partially trained state using, for example, testing data acquired during a testing stage. With the acquisition of sufficient data and subsequent training, the machine learning module 110 may eventually be a fully trained, autonomous decision making system.
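A minimal sketch of how such a user-drawn bounding box might be captured and stored for later training of the localization module 114 is shown below; the field names and coordinate convention are assumptions for illustration only:

```python
from dataclasses import dataclass

@dataclass
class BoundingBoxAnnotation:
    """One user-provided spatial annotation for a predicted class label."""
    study_id: str     # identifier of the image study being reviewed
    class_label: str  # the predicted label being localized, e.g. "effusion"
    x_min: float      # box corners in image pixel coordinates
    y_min: float
    x_max: float
    y_max: float

def store_annotation(database: list, annotation: BoundingBoxAnnotation) -> None:
    """Append the user input to the training data for the localization module."""
    database.append(annotation)
```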
The user may input any relevant information via the user interface 104, which may include any of a variety of input devices such as, for example, a mouse, a keyboard and/or a touch screen via the display 106. User inputs may be stored to the database 120 for training of the classification module 112 and/or the localization module 114.
As shown in
The system 100 may keep track of labels for which the classification module 112 or localization module 114 is in a stable state. To determine whether a module is considered stable for a certain label, the system 100 may rely on a set of predefined performance requirements and/or rules. An exemplary rule may be that at least 500 images containing the label were seen during on-site module adaptation. However, it should be understood that this is just one example of a predefined requirement/rule, and other requirements and/or rules may also be used. Classification or localization results related to stable classes are forwarded to the user interface, while classification or localization results related to labels which are not yet considered stable may not be directly displayed to the user.
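A minimal sketch of such a stability rule, assuming simple per-label bookkeeping and the 500-image threshold mentioned above, might look as follows:

```python
# Example threshold from the exemplary rule above; other rules may also be used.
MIN_IMAGES_PER_LABEL = 500

def is_label_stable(images_seen_per_label: dict, label: str,
                    min_images: int = MIN_IMAGES_PER_LABEL) -> bool:
    """Return True if results for this label may be forwarded to the user interface."""
    return images_seen_per_label.get(label, 0) >= min_images
```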
Where, however, the localization module 114 has been trained to a stable state for an identified class, the results 122 will show localization results along with the classification results. Localization results may include the spatial location via, for example, a bounding box over the relevant portion of the current image study 118.
It will be understood by those of skill in the art that the user interfaces described and shown in
Any user inputs such as, for example, relevant spatial location, edits, additions or corrections may be stored to the database 120 to be used by the training engine 116 to train the classification module 112 and/or localization module 114 accordingly.
Those skilled in the art will understand that the classification module 112 and the localization module 114 of the machine learning module 110 along with the training engine 116 may be implemented by the processor 102 as, for example, lines of code that are executed by the processor 102, as firmware executed by the processor 102, as a function of the processor 102 being an application specific integrated circuit (ASIC), etc. It will also be understood by those of skill in the art that although the system 100 is shown and described as comprising a computing system comprising a single processor 102, user interface 104, display 106 and memory 108, the system 100 may be comprised of a network of computing systems, each of which includes one or more of the components described above. In one example, the classification module 112 and the localization module 114 of the machine learning module 110 along with the training engine 116 may be executed via a central processor of a network, which is accessible via a number of different user stations. Alternatively, one or more of the classification module 112 and the localization module 114 of the machine learning module 110 along with the training engine 116 may be executed via one or more processors. Similarly, the database 120 may be stored to a central memory 108. The current image study 118 may be acquired from any of a plurality of imaging devices networked with or otherwise connected to the system 100 and stored to a central memory 108 or, alternatively, to one or more remote and/or network memories 108.
As described above, however, for earlier iterations of the machine learning module 110, while the classification module 112 is pre-trained to be able to provide classifications (e.g., identify class labels) for the current image study 118, the localization module 114 may be untrained or partially trained so that the machine learning module 110 is not yet trained to show relevant spatial location information. Thus, in these cases, a user interface may show the current image study 118 along with the classification results so that the user input may include relevant spatial information via, for example, a bounding box drawn over a relevant portion of the current image study 118.
In later iterations, where the localization module 114 has been trained to a stable state, the classification/localization result 122 will identify relevant class labels and show relevant spatial locations for corresponding identified classes. In these embodiments, the user input may include editing of spatial information by, for example, adjusting a location and/or size of a displayed bounding box. Regardless of whether the localization module 114 is in a stable state, however, user inputs may also include other data such as, for example, adding findings (e.g., addition of class labels) and/or corrections to findings (e.g., removing findings or class labels).
In 250, all the user inputs are stored to the database 120 so that, in 260, the training engine 116 trains the machine learning module 110 to incorporate the data from the database 120. In particular, the classification module 112 is trained with user inputs corresponding to classification results, while the localization module 114 is trained with user inputs corresponding to spatial locations. The classification module 112 and the localization module 114 may, however, implement transfer-learning techniques (e.g., sharing of module components, sharing of feature maps) in order to exploit the commonalities of the localization and classification tasks. For example, certain feature extractors or convolutional filters may be shared between the classification module 112 and the localization module 114.
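One possible realization of such sharing (a sketch only; the single-channel input, layer shapes and per-class box regression are assumptions, not requirements of the described embodiments) is a single convolutional backbone with two task-specific heads:

```python
import torch
import torch.nn as nn

class SharedBackboneModel(nn.Module):
    """Classification head and localization head on top of shared convolutional features."""

    def __init__(self, num_classes: int):
        super().__init__()
        # Shared feature extractor exploited by both tasks.
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((8, 8)),
        )
        self.classification_head = nn.Linear(64 * 8 * 8, num_classes)
        # Localization head regresses one (x_min, y_min, x_max, y_max) box per class.
        self.localization_head = nn.Linear(64 * 8 * 8, num_classes * 4)

    def forward(self, image: torch.Tensor):
        features = self.backbone(image).flatten(1)
        class_logits = self.classification_head(features)
        boxes = self.localization_head(features).view(-1, class_logits.shape[1], 4)
        return class_logits, boxes
```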
In some embodiments, the classification module 112 and the localization module 114 are deep neural networks and share the same layers as a backbone for an object detector. In other embodiments, only certain layers of the classification network and object detector backbone may be shared. In further embodiments, it is possible to implement the training setup in such a way that the classification and localization modules 112, 114 are updated in an alternating fashion. If the classification and localization modules 112, 114 share components, the training process may be configured in such a way that during the retraining of individual modules, certain layers/components (e.g., neural network convolutional filter weights) may be frozen. For example, during a gradient step with respect to the classification loss, a latter half of the layers of a shared deep neural network may be frozen while during a gradient step with respect to an object localization loss, a first half of the layers may be frozen. In other embodiments, it is also possible to implement the training setup in such a way that the classification and localization modules 112, 114 are jointly updated (e.g., by combining the classification and localization loss functionals and performing a joint backpropagation).
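The alternating and joint update schemes described above might be sketched as follows, assuming a model like the shared-backbone example given earlier; the choice of loss functions and the half-and-half freezing split are illustrative assumptions only:

```python
import torch
import torch.nn.functional as F

def alternating_step(model, optimizer, images, labels, boxes, phase: str):
    """One update step; freezes part of the shared backbone depending on the task."""
    layers = list(model.backbone.children())
    half = len(layers) // 2
    # Freeze the latter half for classification steps, the first half for localization.
    frozen = layers[half:] if phase == "classification" else layers[:half]
    for layer in layers:
        requires_grad = layer not in frozen
        for p in layer.parameters():
            p.requires_grad = requires_grad

    optimizer.zero_grad()
    class_logits, pred_boxes = model(images)
    if phase == "classification":
        # labels: multi-hot float tensor of shape (batch, num_classes)
        loss = F.binary_cross_entropy_with_logits(class_logits, labels)
    else:
        loss = F.smooth_l1_loss(pred_boxes, boxes)
    loss.backward()
    optimizer.step()
    return loss.item()

def joint_step(model, optimizer, images, labels, boxes, loc_weight: float = 1.0):
    """One update step combining both losses in a single backward pass."""
    optimizer.zero_grad()
    class_logits, pred_boxes = model(images)
    loss = (F.binary_cross_entropy_with_logits(class_logits, labels)
            + loc_weight * F.smooth_l1_loss(pred_boxes, boxes))
    loss.backward()
    optimizer.step()
    return loss.item()
```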
It will be understood by those of skill in the art that the method 200 may be continuously repeated so that the machine learning module 110 is dynamically expanded and modified with each use thereof. In particular, since the localization module 114 is continuously trained with new localization data provided by the user, the localization module 114 will eventually be trained to a stable state so that the machine learning module 110 may provide a fully autonomous classification and localization result for an image study. Even when the machine learning module 110 is capable of providing a fully autonomous result, however, user input may be utilized to continually adapt and modify the machine learning module 110 to overcome shifts in data distribution ("domain bias") and to mitigate the effect of catastrophic forgetting. On-site adaptation may continue to be triggered based on a set of pre-defined rules (e.g., 1000 new images containing at least 10000 foreground/positive labels are available).
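Such a retraining trigger could be sketched, using the example counts mentioned above as hypothetical thresholds, as:

```python
def should_trigger_adaptation(new_images: int, new_positive_labels: int,
                              min_images: int = 1000,
                              min_positive_labels: int = 10000) -> bool:
    """Return True once enough new data has accumulated to start on-site adaptation."""
    return new_images >= min_images and new_positive_labels >= min_positive_labels
```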
Those skilled in the art will understand that the above-described exemplary embodiments may be implemented in any number of manners, including, as a separate software module, as a combination of hardware and software, etc. For example, the machine learning module 110, classification module 112, localization module 114 and training engine 116 may be programs including lines of code that, when compiled, may be executed on the processor 102.
Although this application described various embodiments each having different features in various combinations, those skilled in the art will understand that any of the features of one embodiment may be combined with the features of the other embodiments in any manner not specifically disclaimed or which is not functionally or logically inconsistent with the operation of the device or the stated functions of the disclosed embodiments.
It will be apparent to those skilled in the art that various modifications may be made to the disclosed exemplary embodiments and methods and alternatives without departing from the spirit or scope of the disclosure. Thus, it is intended that the present disclosure cover the modifications and variations provided that they come within the scope of the appended claims and their equivalents.
| Filing Document | Filing Date | Country | Kind |
| --- | --- | --- | --- |
| PCT/EP2021/086676 | 12/18/2021 | WO | |

| Number | Date | Country |
| --- | --- | --- |
| 63199301 | Dec 2020 | US |