The present invention relates, generally, to systems and methods for determining whether particular medical treatments are appropriate and, more particularly, to the application of machine learning techniques to the evaluation of such treatments and interventions.
Determining whether a particular medical intervention is appropriate for a given patient continues to be challenging. Such determinations are important, however, as they can have a profound impact on patient health outcomes, healthcare costs, and other individual and societal factors.
In the context of healthcare insurance providers and other similarly situated entities, it is particularly desirable to avoid false-positives, i.e., instances in which a patient is incorrectly classified as a candidate and/or subjected to unnecessary medical interventions. Toward that end, health insurance providers often carry out a “utilization review” in which the insurer evaluates the medical necessity of a requested medical procedure for the purpose of providing preauthorization.
Even given recent advances in medical care, insurance case management techniques, and data analysis, healthcare costs (and consequently insurance premiums) continue to rise in an unsustainable fashion. This is due in part to the difficulty of determining whether a requested medical intervention is appropriate for a particular individual under the circumstances.
Systems and methods are thus needed which overcome the limitations of the prior art. Various features and characteristics will also become apparent from the subsequent detailed description and the appended claims, taken in conjunction with the accompanying drawings and this background section.
Various embodiments of the present invention relate to systems and methods for, inter alia: i) using machine learning techniques to determine a set of confidence levels associated with a set of patient outcomes; ii) utilizing heterogeneous forms of aggregated data (such as imaging, lab studies, exam findings, survey information, and the like) as inputs to a machine learning system as described herein; iii) improving insurer utilization reviews using the machine learning systems described herein; iv) using multiple pre-trained artificial neural networks to implement the machine learning systems described herein; and v) utilizing the machine learning systems described herein to determine whether a particular health care provider or physician is appropriate given the desired medical intervention.
Various other embodiments, aspects, and features are described in greater detail below.
Exemplary embodiments will hereinafter be described in conjunction with the following drawing figures, wherein like numerals denote like elements, and:
The following detailed description of the invention is merely exemplary in nature and is not intended to limit the invention or the application and uses of the invention. Furthermore, there is no intention to be bound by any theory presented in the preceding background or the following detailed description.
Various embodiments of the present invention relate to systems and methods for applying machine learning techniques to the problem of determining the appropriateness of particular medical interventions. The disclosed techniques provide systems and methods for considering a wide range of heterogeneous data types (e.g., digital images, radiological reports, lab studies, exam findings, survey information, and the like) that are normalized for use as inputs to one or more machine learning systems. This normalization itself may leverage one or more machine learning modules, such as convolutional neural networks and the like.
While systems and methods are often described herein in the context of surgery and surgical procedures, the invention is not so limited, and may be used to predict the necessity of a wide range of invasive and non-invasive treatments. The term “medical intervention” therefore comprehends any form of treatment, ranging from medication to various non-invasive and/or invasive diagnostic procedures, performed to treat one or more health conditions.
The various data files accumulated from data sources 110 are suitably normalized (as described further below) to produce normalized data 120. This data may then be further processed to produce machine learning (ML) inputs 130. These inputs 130 are then provided to previously trained machine learning system 140 to produce a classification output 150 which, in various embodiments, corresponds to a probability level (e.g., within the range 0.0-1.0, inclusive) that a particular medical intervention is appropriate. As further described below, machine learning system 140 may be trained using outputs derived from a jury of experts (e.g., medical professionals qualified to make such determinations). Stated another way, system 100, as a whole, is trained to simulate the judgment of an expert panel with respect to whether a particular medical intervention is appropriate.
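By way of non-limiting illustration, one way in which a jury of experts could yield a training target within the range 0.0-1.0 is to take the fraction of panel members who judged the intervention appropriate. The following is a minimal sketch under that assumption; the function name `jury_label` and the vote encoding (1 for appropriate, 0 for not appropriate) are hypothetical and are not taken from the present disclosure.

```python
from typing import Sequence


def jury_label(votes: Sequence[int]) -> float:
    """Return the fraction of experts who judged the intervention appropriate.

    Each vote is assumed to be 1 (appropriate) or 0 (not appropriate).
    """
    if not votes:
        raise ValueError("at least one expert vote is required")
    if any(v not in (0, 1) for v in votes):
        raise ValueError("votes must be 0 or 1")
    return sum(votes) / len(votes)


# Hypothetical panel of five experts reviewing a single case.
target = jury_label([1, 1, 0, 1, 1])
```

A soft target of this kind preserves disagreement among the experts, whereas a majority vote would collapse it to a single binary label; either choice may be suitable depending on the embodiment.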
With continued reference to
Lab studies 114 may include, for example, data regarding serum, urine, cerebrospinal fluid, microbacteriological culture, and other bodily fluids. Data sources 110 may include other functional data such as ultrasound data, cardiac stress tests (chemical or exertional), pulmonary function data, renal function tests, electroencephalogram data, myelographical data, angiographical data, bone density data, and the like.
Images provided by data sources 110 may be stored and transferred in any convenient format, such as the standard DICOM format. Radiological reports 113 may be in the form of mixed numerical and text data, a PDF, or may be in the form of a fax print-out format. Lab studies 114 may also be in PDF, fax, or mixed numerical and text format. Office notes 117 may be in the form of structured or unstructured text, and exam findings 115 may be in the form of audio files (e.g., heart or lung sounds), or scalar values such as range of motion measurements.
As illustrated in
The normalized data 120 is then processed to produce a set of ML inputs 130 for machine learning system 140. In the illustrated embodiment, for example, normalized X-ray and CT/MRI data 121, 122 may be processed by a computer vision or convolutional neural network system 131 to extract features from those images (e.g., anatomical dimensions, etc.). Similarly, normalized medical history data 123 and radiology reports 124 may be processed by a natural language processing system 132. Finally, physical data 125 and survey data 126 may be encoded by respective encoding modules 133 and 134.
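By way of non-limiting illustration, the outputs of the feature extraction, natural language processing, and encoding modules might be combined into a single input vector as sketched below. The function names, the scaling range, the survey choices, and the feature values are all hypothetical and serve only to show one possible encoding of scalar and categorical data.

```python
from typing import List, Sequence


def min_max_scale(value: float, low: float, high: float) -> float:
    """Map a scalar (e.g., a range-of-motion measurement) into [0, 1]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    clipped = min(max(value, low), high)  # clamp out-of-range readings
    return (clipped - low) / (high - low)


def one_hot(answer: str, choices: Sequence[str]) -> List[float]:
    """Encode a multiple-choice survey answer as a one-hot vector."""
    if answer not in choices:
        raise ValueError(f"unknown answer: {answer!r}")
    return [1.0 if answer == choice else 0.0 for choice in choices]


def build_ml_input(
    image_features: Sequence[float],
    text_features: Sequence[float],
    physical: Sequence[float],
    survey: Sequence[float],
) -> List[float]:
    """Concatenate the per-source feature vectors into one input vector."""
    return [*image_features, *text_features, *physical, *survey]


# Hypothetical values standing in for the outputs of each module.
image_features = [0.42, 0.13]      # e.g., from a computer vision system
text_features = [1.0, 0.0, 2.0]    # e.g., from a natural language system
physical = [min_max_scale(90.0, 0.0, 120.0)]
survey = one_hot("moderate", ["none", "mild", "moderate", "severe"])
ml_input = build_ml_input(image_features, text_features, physical, survey)
```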
Referring briefly to
While
In some embodiments, the outputs 250 may be fed into a secondary machine learning module 160 that is trained to determine whether a particular medical intervention is indicated given the expected outcomes.
In general, systems in accordance with various embodiments are able to predict categorical surgical outcomes of infection, persistent pain, poor functional outcome, death, repeat surgery, excessive cost, continued use of narcotics, and the like.
In the case of total hip surgery, for example, the patient's demographics, selected lab studies, surveys, functional organ tests (electrocardiogram, cardiac stress test, pulmonary function tests), co-morbidities, and hip x-ray information may be relevant. Features are extracted using computer vision and/or machine learning techniques, and the population of patients used for this model includes people who have already had surgery. This is in contrast to the embodiment illustrated in
In addition, the system may use this model to decide if a new patient should have surgery by determining the optimal classification thresholds. As a part of the random forest creation, for example, the algorithm plots the variables with the highest Gini coefficients against the categorical outcome.
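By way of non-limiting illustration, tree-based models such as random forests commonly rank variables by the decrease in Gini impurity obtained when a variable is used to split the data. Assuming that this is what is meant by the Gini coefficient referred to above, the following sketch computes that decrease for one candidate split. The function names and the toy data are hypothetical.

```python
from collections import Counter
from typing import Sequence


def gini_impurity(labels: Sequence[int]) -> float:
    """Gini impurity of a set of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def gini_decrease(
    values: Sequence[float], labels: Sequence[int], threshold: float
) -> float:
    """Impurity decrease from splitting on `values <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) + (
        len(right) / n
    ) * gini_impurity(right)
    return gini_impurity(labels) - weighted


# Toy data: a lab value for six patients and whether each had the outcome.
lab_values = [5.0, 6.0, 7.0, 12.0, 13.0, 14.0]
outcome = [0, 0, 0, 1, 1, 1]
decrease = gini_decrease(lab_values, outcome, threshold=10.0)
```

A variable whose best split produces a large decrease separates the outcome classes well and would therefore be a natural candidate for plotting against the categorical outcome.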
For example, consider the case in which the system has data on 10,000 people who have had a total hip replacement. The system generates a random forest in which one of the leaves indicates that the patient developed an infection. Assume that the system determines from its analysis that the variable with the highest Gini coefficient is the laboratory value of WBC (white blood cell count) before surgery, a continuous variable between 0.0 and 20.0. The system could effectively plot WBC on the X axis against the number of patients who developed an infection on the Y axis. The system could then calculate the true positive and false positive rates for thresholds from 1 to 20, and plot a receiver operating characteristic (ROC) curve to determine the optimal threshold WBC count. The optimal threshold would be the WBC count that maximizes the area under the ROC curve.
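By way of non-limiting illustration, the threshold sweep in the WBC example may be sketched as follows. Because the area under an ROC curve summarizes the curve as a whole rather than any single threshold, this sketch selects the operating point using Youden's J statistic (true positive rate minus false positive rate); that criterion is a substitution assumed here for illustration and is not stated in the present disclosure. The function names and the eight-patient data set are likewise hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

RocPoint = Tuple[float, float, float]  # (threshold, FPR, TPR)


def roc_points(
    values: Sequence[float],
    infected: Sequence[int],
    thresholds: Iterable[float],
) -> List[RocPoint]:
    """Compute false and true positive rates at each threshold.

    A patient is predicted positive when the value is >= the threshold.
    """
    positives = sum(infected)
    negatives = len(infected) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes must be present")
    points = []
    for t in thresholds:
        tp = sum(1 for v, y in zip(values, infected) if v >= t and y == 1)
        fp = sum(1 for v, y in zip(values, infected) if v >= t and y == 0)
        points.append((t, fp / negatives, tp / positives))
    return points


def best_threshold(points: Sequence[RocPoint]) -> float:
    """Pick the threshold maximizing Youden's J = TPR - FPR (assumed)."""
    return max(points, key=lambda p: p[2] - p[1])[0]


# Hypothetical pre-surgery WBC values and infection outcomes.
wbc = [4.0, 5.5, 6.0, 8.5, 9.0, 11.0, 12.5, 14.0]
infected = [0, 0, 0, 0, 1, 0, 1, 1]
points = roc_points(wbc, infected, range(1, 21))
threshold = best_threshold(points)
```

Plotting the false positive rate against the true positive rate for each entry in `points` yields the ROC curve for this variable.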
In general, empirical testing of machine learning systems in accordance with the present subject matter has shown that such systems exhibit predictive accuracies that meet and often exceed those of providers utilizing heuristic and other traditional techniques.
The machine learning systems of
As discussed briefly above, one of the advantages of systems in accordance with the present invention is the use of heterogeneous data—i.e., a wide range of data types, ranging from images, sounds, lab studies, and the like—which is then normalized in a way that can be used to train the relevant machine learning models.
In accordance with another embodiment, the system includes an AI system configured to engage in an interactive conversation with a patient. In this way, the AI system administers a survey, which then serves as the input for another neural network system. Thus, the AI module acts as the agent that conducts the survey.
In the illustrated embodiment, user interface 512 is a web page or collection of web pages displayed in a web browser operating on device 510 and provided by a survey module 521 (e.g., a web service with associated back end databases, software, etc.) located at a remote server 520. Server 520 may be associated with, for example, an insurance provider, a health care provider, or an individual surgeon.
Interaction of patient 500 with survey user interface 512 causes survey results 530 to be generated and transmitted over network 140 (e.g., the Internet) to server 520, whereupon they are stored within a database 525. Survey results 530 are preferably transmitted in a secure fashion, e.g., via the HTTPS protocol.
In some embodiments, data entered by patient 500 may be transformed to produce survey results 530 that are better configured for use by a machine learning system. For example, one of the questions 511 may be an open-ended question such as, “how would you describe your back pain right now.” In response, the patient may be asked to type (or speak) a response, which is then provided to a speech recognition system and/or natural language processing system as illustrated in
It will be appreciated that the particular architecture illustrated in
The arrows in
The number of nodes in each layer (n, k, and j) may vary depending upon the application, and in fact may be modified dynamically by the system itself to optimize its performance. In some embodiments (e.g., deep learning systems), multiple hidden layers 702 may be incorporated into ANN 700.
Each of the layers 702 and 703 receives input from a previous layer via a network of weighted connections (illustrated as arrows in
ANN 700 is trained via a learning rule and “cost function” that are used to modify the weights of the connections in response to the input patterns provided to input layer 701 and the training set provided at output layer 703, thereby allowing ANN 700 to learn by example through a combination of backpropagation and gradient descent optimization. Such learning may be supervised (with known examples of past survey inputs and surgery outcomes provided to input layer 701 and output layer 703), unsupervised (with uncategorized examples provided to input layer 701), or involve reinforcement learning, where some notion of “reward” is provided during training.
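By way of non-limiting illustration, the modification of connection weights through gradient descent on a cost function may be sketched using a single sigmoid node, for which backpropagation reduces to one application of the chain rule. The cross-entropy cost, the learning rate, the number of epochs, and the toy supervised data set are assumptions chosen for this sketch and do not describe the training of ANN 700.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(x: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Forward pass of a single sigmoid node."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_neuron(
    inputs: Sequence[Sequence[float]],
    targets: Sequence[int],
    lr: float = 0.5,
    epochs: int = 2000,
) -> Tuple[List[float], float]:
    """Batch gradient descent on the mean cross-entropy cost."""
    n_features = len(inputs[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(inputs)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(inputs, targets):
            # For sigmoid + cross-entropy, d(cost)/d(pre-activation)
            # simplifies to (prediction - target).
            error = predict(x, weights, bias) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        # Step each weight against its averaged gradient.
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


# Toy supervised examples: one normalized input, one binary outcome.
samples = [[0.0], [0.2], [0.8], [1.0]]
labels = [0, 0, 1, 1]
weights, bias = train_neuron(samples, labels)
```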
Once ANN 700 is trained to a satisfactory level (e.g., without overtraining), it may be used as an analytical tool to make predictions and perform “classification” of the input 701. That is, new inputs are presented to input layer 701, where they are processed by the middle layer 702 and, via forward propagation through the weights associated with each of the edges, produce an output 703. As described above, output layer 703 will typically include a set of confidence levels or probabilities associated with a corresponding number of different classes, such as the appropriateness of a particular medical intervention.
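By way of non-limiting illustration, forward propagation through one hidden layer to a set of class probabilities may be sketched as follows. The rectified linear activation, the softmax output, the layer sizes, and the weight values are assumptions made for this sketch; in practice the weights would result from training.

```python
import math
from typing import List, Sequence


def dense(
    inputs: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> List[float]:
    """Weighted sum per node; weights[j][i] connects input i to node j."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values: Sequence[float]) -> List[float]:
    return [max(0.0, v) for v in values]


def softmax(values: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, hidden_w, hidden_b, out_w, out_b) -> List[float]:
    """Input layer -> hidden layer -> output probabilities."""
    hidden = relu(dense(x, hidden_w, hidden_b))
    return softmax(dense(hidden, out_w, out_b))


# Hypothetical weights for 3 inputs, 2 hidden nodes, and 2 output classes.
x = [0.5, 1.0, 0.25]
hidden_w = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]]
hidden_b = [0.0, 0.1]
out_w = [[1.0, -1.0], [-1.0, 1.0]]
out_b = [0.0, 0.0]
probabilities = forward(x, hidden_w, hidden_b, out_w, out_b)
```

Each entry of `probabilities` plays the role of a confidence level for one class, such as whether or not a particular medical intervention is appropriate.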
As shown in
In general, CNN 800 implements a convolutional phase 822, followed by feature extraction 820 and classification 830. Convolutional phase 822 uses an appropriately sized convolutional filter that produces a set of feature maps 821 corresponding to smaller tilings of input image 810. As is known, convolution as a process is translationally invariant—i.e., features of interest (bone geometry, X-ray features, etc.) can be identified regardless of their location within image 810.
Subsampling 824 is then performed to produce a set of smaller feature maps 823 that are effectively “smoothed” to reduce sensitivity of the convolutional filters to noise and other variations. Subsampling might involve taking an average or a maximum value over a sample of the inputs 821. Feature maps 823 then undergo another convolution 828, as is known in the art, to produce a large set of smaller feature maps 825. Feature maps 825 are then subsampled to produce feature maps 827.
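By way of non-limiting illustration, a single convolution followed by subsampling with a maximum value may be sketched as follows. The 2x2 filter, the 5x5 toy image, and the function names are hypothetical; as in most CNN libraries, the filter is applied without flipping (i.e., as a cross-correlation) and without padding.

```python
from typing import List

Matrix = List[List[float]]


def convolve2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image and sum the elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map: Matrix, size: int = 2) -> Matrix:
    """Subsample by taking the maximum over non-overlapping size x size tiles."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Toy image with a vertical edge between the second and third columns.
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
edge_filter = [[-1.0, 1.0], [-1.0, 1.0]]  # responds to dark-to-bright edges
feature_map = convolve2d(image, edge_filter)  # 4x4 feature map
pooled = max_pool(feature_map)                # 2x2 feature map
```

The filter responds only where the edge occurs, and the subsampled map retains that response at reduced resolution.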
During the classification phase (830), the feature maps 827 are processed to produce a first layer 831, followed by a fully-connected layer 833, from which outputs 840 are produced. For example, output 841 might correspond to the likelihood that a particular feature has been recognized.
In general, the CNN illustrated in
While the above discussion often focuses on the use of artificial neural networks, the range of embodiments is not so limited. Any of the various modules described herein may be implemented as one or more machine learning models that undergo supervised, unsupervised, semi-supervised, or reinforcement learning and perform classification (e.g., binary or multiclass classification), regression, clustering, dimensionality reduction, and/or other such tasks. Examples of such models include, without limitation, artificial neural networks (ANN) (such as recurrent neural networks (RNN) and convolutional neural networks (CNN)), decision tree models (such as classification and regression trees (CART)), ensemble learning models (such as boosting, bootstrapped aggregation, gradient boosting machines, and random forests), Bayesian network models (e.g., naive Bayes), principal component analysis (PCA), support vector machines (SVM), clustering models (such as K-nearest-neighbor, K-means, expectation maximization, hierarchical clustering, etc.), and linear discriminant analysis models.
In accordance with various embodiments, the output 150 of
In other embodiments, the outputs 150, 250 may be used to determine which surgeons perform a particular procedure at a level satisfactory to remain within a contracted (network) group of physicians. In another embodiment, the outputs 150, 250 may be used to determine which health care facilities perform a particular procedure with the best outcomes. In yet another embodiment, the outputs 150, 250 are used to determine which facilities perform a particular procedure at a level that is sufficient to remain within a contracted group.
In summary, a machine learning system for determining the appropriateness of a selected medical intervention generally includes a plurality of health-related data sources, the health-related data sources providing at least one data file of a first type, and a second data file of a second type; a normalization module configured to receive the first and second data files and perform a normalization procedure on at least one of the first and second data files; and a previously trained machine learning model configured to receive the normalized data files and produce a prediction output, wherein the prediction output includes a confidence level associated with an appropriateness of the selected medical intervention.
The machine learning model may include, for example, an artificial neural network, a probabilistic neural network, a convolutional neural network, or a decision tree.
In various embodiments, the first data file is a two-dimensional image file, and the normalization procedure includes producing an input vector based on the two-dimensional image file. In various embodiments, the two-dimensional image file is selected from the group comprising an X-ray image, a computed tomography (CT) image, and a magnetic resonance image (MRI). In various embodiments, the first data file is a time-varying real value parameter, and the normalization procedure produces an input vector based on the time-varying real value parameter.
In one embodiment, the time-varying real value parameter is a heart-beat audio file. In another embodiment, the time-varying real value parameter is a spoken utterance.
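By way of non-limiting illustration, a time-varying real value parameter of arbitrary length may be reduced to a fixed-length input vector by dividing it into equal frames and computing one summary value per frame. The sketch below uses root-mean-square (RMS) energy as that summary; this choice, the function name, and the toy signal are assumptions, and a practical embodiment might instead use spectral or learned features.

```python
import math
from typing import List, Sequence


def frame_rms(signal: Sequence[float], n_frames: int) -> List[float]:
    """Return the RMS energy of each of `n_frames` equal-length frames.

    Trailing samples that do not fill a whole frame are ignored.
    """
    if n_frames <= 0 or len(signal) < n_frames:
        raise ValueError("signal must contain at least n_frames samples")
    frame_len = len(signal) // n_frames
    features = []
    for k in range(n_frames):
        frame = signal[k * frame_len:(k + 1) * frame_len]
        features.append(math.sqrt(sum(s * s for s in frame) / len(frame)))
    return features


# Toy signal: four samples of silence followed by four samples of a tone.
signal = [0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 1.0, -1.0]
input_vector = frame_rms(signal, n_frames=2)
```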
In one embodiment, the first data file is a text file, and the normalization procedure includes producing an input vector by applying natural language processing (NLP) to the text file.
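By way of non-limiting illustration, one of the simplest natural language processing techniques for producing an input vector from text is a bag-of-words count over a fixed vocabulary. The vocabulary, the tokenization rule, and the sample note below are hypothetical, and other embodiments may use more sophisticated language models.

```python
import re
from collections import Counter
from typing import List, Sequence


def bag_of_words(text: str, vocabulary: Sequence[str]) -> List[int]:
    """Count how often each vocabulary term appears in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())  # lowercase word tokens
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


# Hypothetical vocabulary and free-text note.
vocabulary = ["pain", "sharp", "dull", "numbness"]
note = "Sharp pain in the lower back; pain worsens when sitting."
text_vector = bag_of_words(note, vocabulary)
```

The position of each count in `text_vector` corresponds to the position of the term in `vocabulary`, so every text file maps to a vector of the same length.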
In one embodiment, the prediction output is further processed to determine a selected health-care provider for the selected medical intervention.
In one embodiment, the data sources are selected from the group consisting of diagnostic image sources, radiological reports, lab studies, exam findings, survey results, and office notes.
A machine learning system for predicting outcomes of a selected medical intervention includes, in one embodiment: a plurality of health-related data sources, the health-related data sources providing at least one data file of a first type, and a second data file of a second type; a normalization module configured to receive the first and second data files and perform a normalization procedure on at least one of the first and second data files; and a previously trained machine learning model configured to receive the normalized data files and produce a prediction output including a set of confidence levels associated with a respective set of patient outcomes.
The set of patient outcomes may be selected from the group consisting of: infection, pain level, functional outcome, death, requirement for repeat surgery, cost, and continued use of narcotics.
In one embodiment, the system includes a recommendation module configured to take as its input the prediction output relating to patient outcomes and produce a recommendation output comprising a confidence level associated with whether the patient should undergo the selected medical intervention. The recommendation module may be a previously trained tree-based machine learning model, such as a random forest model.
The recommendation module may further utilize, to produce the recommendation output, a receiver operating characteristic (ROC) curve applied to a plot of at least one of the outcomes as a function of data from at least one of the health-related data sources.
The first data file may be a two-dimensional image file, and the normalization procedure includes producing an input vector based on the two-dimensional image file.
The two-dimensional image file may be selected from the group comprising an X-ray image, a computed tomography (CT) image, a positron emission tomography (PET) image, an ultrasound image, and a magnetic resonance image (MRI).
The first data file may be a time-varying real value parameter, and the normalization procedure produces an input vector based on the time-varying real value parameter.
The time-varying real value parameter may be a heart-sound audio file, a lung sound audio file, a carotid artery audio file, or a spoken utterance.
In one embodiment, the data file is a text file, and the normalization procedure includes producing an input vector by applying natural language processing (NLP) to the text file.
In one embodiment, the prediction output is further processed to determine a selected health-care provider for the selected medical intervention.
In one embodiment, the data sources are selected from the group consisting of diagnostic image sources, radiological reports, lab studies, exam findings, survey results, co-morbidities, ICD10 data, and office notes.
The various systems, modules, and methods described above may be implemented in software using any convenient general-purpose programming language. Suitable languages include, without limitation, WebAssembly (Wasm), Python, C++, C#, Java, PHP, and the like. In addition, various standard machine learning libraries and linear algebra libraries may be employed.
Embodiments of the present disclosure may be described herein in terms of functional and/or logical block components and various processing steps. It should be appreciated that such block components may be realized by any number of hardware, software, and/or firmware components configured to perform the specified functions. For example, an embodiment of the present disclosure may employ various integrated circuit components, e.g., memory elements, digital signal processing elements, logic elements, look-up tables, or the like, which may carry out a variety of functions under the control of one or more microprocessors or other control devices. In addition, those skilled in the art will appreciate that embodiments of the present disclosure may be practiced in conjunction with any number of systems, and that the systems described herein are merely exemplary embodiments of the present disclosure. Further, the connecting lines shown in the various figures contained herein are intended to represent example functional relationships and/or physical couplings between the various elements. It should be noted that many alternative or additional functional relationships or physical connections may be present in an embodiment of the present disclosure.
As used herein, the term “module” refers to any hardware, software, firmware, electronic control component, processing logic, and/or processor device, individually or in any combination, including without limitation: application specific integrated circuits (ASICs), field-programmable gate-arrays (FPGAs), dedicated neural network devices (e.g., Google Tensor Processing Units), electronic circuits, processors (shared, dedicated, or group) configured to execute one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that provide the described functionality.
As used herein, the word “exemplary” means “serving as an example, instance, or illustration.” Any implementation described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other implementations, nor is it intended to be construed as a model that must be literally duplicated.
While the foregoing detailed description will provide those skilled in the art with a convenient road map for implementing various embodiments of the invention, it should be appreciated that the particular embodiments described above are only examples, and are not intended to limit the scope, applicability, or configuration of the invention in any way. To the contrary, various changes may be made in the function and arrangement of elements described without departing from the scope of the invention.
This application claims priority, as a continuation-in-part, to U.S. patent application Ser. No. 16/122,100, filed Sep. 5, 2018, the entire contents of which are hereby incorporated by reference.
| | Number | Date | Country |
|---|---|---|---|
| Parent | 16122100 | Sep 2018 | US |
| Child | 16190874 | | US |