Aspects of embodiments of the present invention relate to systems for collecting and authoring medical diagnosis information, systems for diagnosing patients based on the medical diagnosis information, and methods of operating such systems.
In the field of medical diagnosis, medical professionals such as doctors and nurses generally diagnose a patient's disease by conducting patient interviews, performing physical inspections, obtaining samples for chemical or biological analysis, and classifying the patient's symptoms into a disease based on the medical professional's knowledge and experience and in conjunction with medical reference materials.
Medical reference materials generally group diseases based on common characteristics. For example, all urinary tract infections may be grouped together. However, urinary tract infections have a wide variety of root causes and may have different presentations or symptoms based on the sex, age, and causes of the particular infection. As such, in many circumstances, standard medical reference materials do not provide sufficient granularity to provide a precise diagnosis of a patient's disease.
Embodiments of the present invention are related to a system and a method for collecting and authoring medical diagnosis information for performing precise determinations of patient diseases.
In an authoring phase, medical diagnosis information can be used to match symptoms presented by a patient to potential diagnoses. Broadly, this medical diagnosis information is generated by collecting and structuring clinical case data from a wide variety of patients, correlating the resulting diagnoses with the recorded symptoms to cluster related diseases together, and generating predictive models and disease models from the collected and analyzed data.
In a diagnosis phase, a medical practitioner can supply the symptoms presented by a patient to a computer system, and the computer system utilizes the disease models to find the most likely matching disease for the supplied symptoms.
Embodiments of the present invention are directed to systems and methods related to the authoring and development of disease models from clinical data.
According to one embodiment of the present invention, a method for authoring clinical diagnosis data includes: parsing, on a processing device, clinical data regarding components and diagnoses of diseases, the components including at least one of signs, symptoms, and factors to generate structured clinical data; correlating, on the processing device, the structured clinical data by at least one of the components; determining, on the processing device, clusters of the components that are related to the diseases; identifying, on the processing device, one or more predictive components of the clusters of components related to the diseases to generate a diagnosis predictive model; and generating, on the processing device, a disease model using the diagnosis predictive model, the disease model being for diagnosing a patient in accordance with the identified one or more predictive components.
The parsing the clinical data may include: identifying a measurement associated with a symptom using semantic mapping; and extracting the identified measurement.
The measurement may include a temperature, a pulse rate, a blood pressure, or an O2 saturation.
The method may further include: displaying the structured clinical data in a user interface; receiving a request to modify the structured clinical data via the user interface; and modifying the structured clinical data in accordance with the request.
The method may further include: displaying the diagnosis predictive model in a user interface, the display including an indication of the strength of correlation between signs, symptoms, and factors; receiving a request to modify the diagnosis predictive model; and modifying the diagnosis predictive model in accordance with the request.
The request may include one of adding, excluding, and locking in a sign, a symptom, or a factor.
The identifying one or more principal components may include performing principal component analysis on the structured data.
The method may further include displaying a diagnosis summary, wherein the diagnosis summary displays a frequency with which particular signs, symptoms, and factors are correlated with a diagnosis.
The method may further include computing, on the processing device, a symptoms ontology, the symptoms ontology relating symptoms and alternate phrasings to the same concept and mapping concepts to observed values.
According to one embodiment of the present invention, a system for authoring clinical diagnosis data may include a processing device including a processor and a memory storing instructions, the instructions configuring the processor to: parse clinical data regarding components and diagnoses of diseases, the components including at least one of signs, symptoms, and factors; correlate the clinical data by at least one of the components; determine clusters of the components that are related to the diseases; identify one or more predictive components of the clusters of components related to the diseases to generate a diagnosis predictive model; and generate a disease model from the diagnosis predictive model, the disease model being for diagnosing a patient in accordance with the identified one or more predictive components.
The instructions may configure the processor to parse clinical data by: identifying a measurement associated with a symptom using semantic mapping; and extracting the identified measurement.
The measurement may include a temperature, a pulse rate, a blood pressure, or an O2 saturation.
The instructions may further configure the processor to: display the structured clinical data in a user interface; receive a request to modify the structured clinical data via the user interface; and modify the structured clinical data in accordance with the request.
The instructions may further configure the processor to: display the diagnosis predictive model in a user interface, the display including an indication of the strength of correlation between signs, symptoms, and factors; receive a request to modify the diagnosis predictive model; and modify the diagnosis predictive model in accordance with the request.
The request may include one of adding, excluding, and locking in a sign, a symptom, or a factor.
The instructions may further configure the processor to identify one or more principal components by performing principal component analysis on the structured data.
The instructions may further configure the processor to display a diagnosis summary, wherein the diagnosis summary displays a frequency with which particular signs, symptoms, and factors are correlated with a diagnosis.
The instructions may further configure the processor to compute a symptoms ontology, the symptoms ontology relating symptoms and alternate phrasings to the same concept and mapping concepts to observed values.
The accompanying drawings, together with the specification, illustrate exemplary embodiments of the present invention, and, together with the description, serve to explain the principles of the present invention.
In the following detailed description, only certain exemplary embodiments of the present invention are shown and described, by way of illustration. As those skilled in the art would recognize, the invention may be embodied in many different forms and should not be construed as being limited to the embodiments set forth herein. Like reference numerals designate like elements throughout the specification.
Various embodiments of the present invention can be performed on one or more computing devices (or “computers”), each of which includes one or more processors executing computer program instructions and interacting with other system components for performing the various functionalities described herein. For example, the web server 10, the electronic databases 18, and the end user terminals 12a and 12e may be various types of computing devices. For the sake of convenience herein, the term “computing device” will be used to refer to one or more such devices in which program instructions for performing various functions can be performed by a single device, by multiple devices performing the same functions in parallel, or by multiple devices in which some devices are configured to perform different functions from other devices.
The computer program instructions are stored in a memory implemented using a standard memory device, such as, for example, a random access memory (RAM). The computer program instructions may also be stored in other non-transitory computer readable media such as, for example, a CD-ROM, flash drive, or the like. Also, although the functionality of each of the servers is described as being provided by the particular server, a person of skill in the art should recognize that the functionality of various servers may be combined or integrated into a single server, or the functionality of a particular server may be distributed across one or more other servers without departing from the scope of the embodiments of the present invention.
Each of the various servers in the system may be a process or thread, running on one or more processors, in one or more computing devices 500 (e.g.,
The central processing unit 521 is any logic circuitry that responds to and processes instructions fetched from the main memory unit 522. It may be implemented, for example, in an integrated circuit, in the form of a microprocessor, microcontroller, or graphics processing unit (GPU), or in a field-programmable gate array (FPGA) or application-specific integrated circuit (ASIC). Main memory unit 522 may be one or more memory chips capable of storing data and allowing any storage location to be directly accessed by the central processing unit 521. In the embodiment shown in
A wide variety of I/O devices 530 may be present in the computing device 500. Input devices include one or more keyboards 530a, mice, trackpads, trackballs, microphones, and drawing tablets. Output devices include video display devices 530c, speakers, and printers. An I/O controller 523, as shown in
Referring again to
The removable media interface 516 may for example be used for installing software and programs. The computing device 500 may further include a storage device 528, such as one or more hard disk drives or hard disk drive arrays, for storing an operating system and other related software, and for storing application software programs. Optionally, a removable media interface 516 may also be used as the storage device. For example, the operating system and the software may be run from a bootable medium, for example, a bootable CD.
In some embodiments, the computing device 500 may include or be connected to multiple display devices 530c, which each may be of the same or different type and/or form. As such, any of the I/O devices 530 and/or the I/O controller 523 may include any type and/or form of suitable hardware, software, or combination of hardware and software to support, enable or provide for the connection to, and use of, multiple display devices 530c by the computing device 500. For example, the computing device 500 may include any type and/or form of video adapter, video card, driver, and/or library to interface, communicate, connect or otherwise use the display devices 530c. In one embodiment, a video adapter may include multiple connectors to interface to multiple display devices 530c. In other embodiments, the computing device 500 may include multiple video adapters, with each video adapter connected to one or more of the display devices 530c. In some embodiments, any portion of the operating system of the computing device 500 may be configured for using multiple display devices 530c. In other embodiments, one or more of the display devices 530c may be provided by one or more other computing devices, connected, for example, to the computing device 500 via a network. These embodiments may include any type of software designed and constructed to use the display device of another computing device as a second display device 530c for the computing device 500. One of ordinary skill in the art will recognize and appreciate the various ways and embodiments that a computing device 500 may be configured to have multiple display devices 530c.
A computing device 500 of the sort depicted in
The computing device 500 may be any workstation, desktop computer, laptop or notebook computer, server machine, handheld computer, mobile telephone or other portable telecommunication device, media playing device, gaming system, mobile computing device, or any other type and/or form of computing, telecommunications or media device that is capable of communication and that has sufficient processor power and memory capacity to perform the operations described herein. In some embodiments, the computing device 500 may be a virtualized computing device and the virtualized computing device may be running in a networked or cloud based environment. In some embodiments, the computing device 500 may have different processors, operating systems, and input devices consistent with the device.
In other embodiments the computing device 500 is a mobile device, such as a Java-enabled cellular telephone or personal digital assistant (PDA), a smart phone, a digital audio player, or a portable media player. In some embodiments, the computing device 500 includes a combination of devices, such as a mobile phone combined with a digital audio player or portable media player.
As shown in
In some embodiments, a central processing unit 521 provides single instruction, multiple data (SIMD) functionality, e.g., execution of a single instruction simultaneously on multiple pieces of data. In other embodiments, several processors in the central processing unit 521 may provide functionality for execution of multiple instructions simultaneously on multiple pieces of data (MIMD). In still other embodiments, the central processing unit 521 may use any combination of SIMD and MIMD cores in a single device.
A computing device may be one of a plurality of machines connected by a network, or it may include a plurality of machines so connected.
The computing device 500 may include a network interface 518 to interface to the network 504 through a variety of connections including, but not limited to, standard telephone lines, local-area network (LAN), or wide area network (WAN) links, broadband connections, wireless connections, or a combination of any or all of the above. Connections may be established using a variety of communication protocols. In one embodiment, the computing device 500 communicates with other computing devices 500 via any type and/or form of gateway or tunneling protocol such as Secure Socket Layer (SSL) or Transport Layer Security (TLS). The network interface 518 may include a built-in network adapter, such as a network interface card, suitable for interfacing the computing device 500 to any type of network capable of communication and performing the operations described herein. An I/O device 530 may be a bridge between the system bus 550 and an external communication bus.
Still referring to
Still referring to
In some circumstances, the clinical case history is entered as free form text. An exemplary case history entered as free form text is shown in Table 1.
In other circumstances, clinical case histories contributed for the contribution operation 231 may be entered via forms in a dedicated software application or in a web browser-based application provided by web server 10. For example,
The supplied clinical case history 251, utilization data 252, and disease pathway 253 information are processed to generate structured clinical case data 254, prevalence and correlation frequency data 255, symptoms ontologies 256, predictive models 257, disease models 258, and diagnosis dialogs 259, as described in more detail below.
In operation 305, the prevalence and correlation frequency information 255 can then be used to identify key predictive components using “principal component analysis” to generate a symptoms ontology 256, classifying symptoms and demographics into categories of diseases. Principal component analysis is a mathematical procedure that converts a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called “principal components.” Here, the symptoms, demographics, and other factors are possibly correlated variables that relate to various diagnoses. By applying principal component analysis, the principal (or predictive) components, that is, the symptoms, demographics, and other factors that best correlate with particular diagnoses, can be identified. Continuing the previous example, different types of urinary tract infections may have different symptoms, and principal component analysis identifies which of the variables (or combinations thereof) are the most reliable predictors of, and distinguishers between, the various types of urinary tract infection. Leveraging these unique variables, the system can help to distinguish between these various types of urinary tract infection and provide more targeted diagnoses.
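By way of non-limiting illustration, the following Python sketch finds the first principal component of a small matrix of case observations by power iteration on the sample covariance matrix, and then ranks the variables by the magnitude of their loadings. The function names, the example column names, and the choice of power iteration (rather than a full eigendecomposition) are assumptions made for this example only and are not taken from the embodiments described herein.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return the unit vector of loadings for the first principal component.

    rows: list of equal-length numeric observations (at least two rows).
    Power iteration is an illustrative choice, not the described method.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered observations.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Asymmetric start vector; a start orthogonal to the dominant
    # eigenvector would not converge to it (assumed not to occur here).
    v = [float(j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v


def rank_by_loading(names, rows):
    """Pair each variable name with its loading, largest magnitude first."""
    loadings = first_principal_component(rows)
    return sorted(zip(names, loadings), key=lambda pair: -abs(pair[1]))


# Hypothetical data: columns are fever, flank_pain, cough (1 = present).
cases = [[0, 0, 1], [1, 1, 1], [0, 0, 1], [1, 1, 1]]
ranking = rank_by_loading(["fever", "flank_pain", "cough"], cases)
```

In this toy data set, the variables that vary together across cases receive the largest loadings, while a variable that never varies receives a loading of zero and therefore contributes nothing to distinguishing the cases.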
In addition, the symptoms ontology 256 can be used with the prevalence and correlation frequency information 255 to generate diagnosis predictive models 257 in operation 307. The resulting symptoms ontology 256 and diagnosis predictive models 257 are used to generate disease models in operation 309.
In some embodiments of the present invention, a general computing device is configured to perform the automatic generation of disease models from a large collection of supplied clinical case data. In other embodiments of the present invention, at least some operations of the method may be performed by dedicated hardware, such as a field programmable gate array (FPGA) or an application specific integrated circuit (ASIC).
The parsing operation 301 generally corresponds to the role of the system in the contribution stage 231 in the lifecycle 230, where data contributed by various actors is initially processed by the system. The correlation and clustering of the patient data in operation 303 generally corresponds to the coupling stage 233 and the consensus stage 235. The generation of diagnosis predictive models in operation 307 corresponds to the component stage 237 in the lifecycle 230, and the generation of disease models in operation 309 corresponds to the content stage 239 in the lifecycle 230.
Each of the operations in the method for generating statistical models (one embodiment of which is depicted in
Referring again to operation 301, the clinical case history data may be parsed to generate structured clinical case data 254 from the freeform text or text entered into a more structured form.
For example, as seen in
In various embodiments of the present invention, the parsing of the text can be performed using a variety of natural language processing (NLP) techniques to identify the various words within the text and semantic mapping techniques to map the words to clinical concepts.
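By way of non-limiting illustration, the following Python sketch extracts a temperature, a pulse rate, a blood pressure, and an O2 saturation from a free form note. Regular expressions stand in here for the natural language processing and semantic mapping techniques mentioned above; the patterns, the concept names, and the sample note are hypothetical and are chosen only for this example.

```python
import re

# Hypothetical semantic map: surface phrasings -> clinical concept.
# Each pattern captures the first number that follows the phrasing.
SINGLE_VALUE_PATTERNS = {
    "temperature": r"\b(?:temperature|temp)\b\D{0,15}?(\d{2,3}(?:\.\d+)?)",
    "pulse_rate": r"\b(?:pulse|heart rate|hr)\b\D{0,15}?(\d{2,3})",
    "o2_saturation": r"\b(?:o2 sat(?:uration)?|spo2)\b\D{0,15}?(\d{2,3})",
}
# Blood pressure is written as a systolic/diastolic pair.
BLOOD_PRESSURE_PATTERN = (
    r"\b(?:bp|blood pressure)\b\D{0,15}?(\d{2,3})\s*/\s*(\d{2,3})"
)


def extract_measurements(note):
    """Return a dict mapping each recognized concept to its measured value."""
    found = {}
    for concept, pattern in SINGLE_VALUE_PATTERNS.items():
        match = re.search(pattern, note, flags=re.IGNORECASE)
        if match:
            found[concept] = float(match.group(1))
    match = re.search(BLOOD_PRESSURE_PATTERN, note, flags=re.IGNORECASE)
    if match:
        found["blood_pressure"] = (int(match.group(1)), int(match.group(2)))
    return found


note = "Pt reports dysuria. Temp 101.2 F, pulse 88, BP 120/80, O2 sat 97%."
structured = extract_measurements(note)
```

A pattern-based extractor of this kind handles only phrasings that were anticipated in advance, which is one reason the parsed data may still be reviewed and corrected by a medical professional.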
The parsed structured data may be quickly reviewed by a medical professional, and the data can be easily annotated and edited to correct errors and to add information. In addition, the organization of data in the display can be standardized, allowing medical professionals to quickly and easily understand the case history without having to read and understand the freeform notes originally entered. New factors, signs, symptoms, and findings can be automatically imported from external data sources and/or manually entered (e.g., using a drag and drop interface).
The data from the various sources can then be coupled together to generate prevalence and correlation frequency data 255 using, e.g., mathematical clustering (“cluster analysis” and “principal component analysis”), as described in more detail above with respect to operations 303 and 305.
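By way of non-limiting illustration, the following Python sketch computes, from a small set of structured cases, the prevalence of each diagnosis and the frequency with which each sign, symptom, or factor appears in cases having that diagnosis. The function name, the data layout, and the example cases are assumptions made for this example, and the sketch covers only simple counting, not the clustering operations referred to above.

```python
from collections import Counter, defaultdict


def prevalence_and_frequency(cases):
    """cases: iterable of (diagnosis, components) pairs.

    Returns (prevalence, frequency), where prevalence[d] is the share of
    cases with diagnosis d and frequency[d][c] is the share of cases with
    diagnosis d in which component c was recorded.
    """
    diagnosis_counts = Counter()
    co_counts = defaultdict(Counter)
    for diagnosis, components in cases:
        diagnosis_counts[diagnosis] += 1
        for component in set(components):
            co_counts[diagnosis][component] += 1
    total = sum(diagnosis_counts.values())
    prevalence = {d: n / total for d, n in diagnosis_counts.items()}
    frequency = {
        d: {c: k / n for c, k in co_counts[d].items()}
        for d, n in diagnosis_counts.items()
    }
    return prevalence, frequency


# Hypothetical structured cases.
example_cases = [
    ("cystitis", {"dysuria", "urgency"}),
    ("cystitis", {"dysuria"}),
    ("pyelonephritis", {"dysuria", "fever", "flank pain"}),
    ("pyelonephritis", {"fever"}),
]
prevalence, frequency = prevalence_and_frequency(example_cases)
```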
Using the interface shown in
By identifying strong correlations (e.g., correlations having a metric exceeding a threshold) between particular signs, symptoms, lab results and correctly identified diseases, key predictive components of the clinical observations can be identified and used to create a symptoms ontology 256, classifying symptoms and demographics into categories of diseases. Systems and methods of the present invention identify the correlations and set initial weights to specify relevance and value, which may be later adjusted and edited by an author who interprets the correlations identified by the system.
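By way of non-limiting illustration, the following Python sketch sets initial weights by keeping only those components whose correlation with a diagnosis exceeds a threshold. The phi coefficient for two binary variables is used as the correlation metric; this metric, the threshold of 0.5, the function names, and the example cases are assumptions made for this example, since no particular metric is specified above.

```python
import math


def phi_coefficient(cases, component, diagnosis):
    """Phi coefficient between 'component present' and 'diagnosis given'."""
    n11 = n10 = n01 = n00 = 0
    for case_diagnosis, components in cases:
        has_component = component in components
        has_diagnosis = case_diagnosis == diagnosis
        if has_component and has_diagnosis:
            n11 += 1
        elif has_component:
            n10 += 1
        elif has_diagnosis:
            n01 += 1
        else:
            n00 += 1
    denominator = math.sqrt(
        (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    )
    if denominator == 0:
        return 0.0  # Undefined when a row or column total is zero.
    return (n11 * n00 - n10 * n01) / denominator


def initial_weights(cases, diagnosis, threshold=0.5):
    """Keep components whose correlation with the diagnosis exceeds threshold."""
    components = set().union(*(comps for _, comps in cases))
    weights = {}
    for component in sorted(components):
        phi = phi_coefficient(cases, component, diagnosis)
        if phi > threshold:
            weights[component] = phi
    return weights


# Hypothetical structured cases.
uti_cases = [
    ("cystitis", {"dysuria", "urgency"}),
    ("cystitis", {"dysuria", "urgency"}),
    ("pyelonephritis", {"dysuria", "fever"}),
    ("pyelonephritis", {"fever", "flank pain"}),
]
weights = initial_weights(uti_cases, "pyelonephritis")
```

The resulting weights are only a starting point; as noted above, an author may later adjust them after interpreting the identified correlations.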
The multiple diagnosis predictive models corresponding to different stages over the course or pathway of the disease (e.g., incubation, early, developed, progression, and waning) for particular demographics (e.g., male, female, young, old, history, allergies, and combinations thereof) can be combined to develop a disease model 258.
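By way of non-limiting illustration, the following Python sketch represents a disease model as a collection of predictive models keyed by disease stage and demographic. The class name, the method names, the additive scoring rule, and the example weights are hypothetical and serve only to show one possible way of combining the models.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Tuple


@dataclass
class DiseaseModel:
    """Predictive models for one disease, keyed by (stage, demographic)."""

    disease: str
    predictive_models: Dict[Tuple[str, str], Dict[str, float]] = field(
        default_factory=dict
    )

    def add_model(self, stage: str, demographic: str,
                  weights: Dict[str, float]) -> None:
        """Register the component weights for one stage and demographic."""
        self.predictive_models[(stage, demographic)] = dict(weights)

    def score(self, stage: str, demographic: str,
              observed: Iterable[str]) -> float:
        """Sum the weights of the observed components (assumed scoring rule)."""
        weights = self.predictive_models.get((stage, demographic), {})
        return sum(weights.get(component, 0.0) for component in set(observed))


# Hypothetical weights for two stages of one disease.
uti_model = DiseaseModel("urinary tract infection")
uti_model.add_model("early", "female", {"dysuria": 0.8, "urgency": 0.5})
uti_model.add_model("developed", "female", {"fever": 0.9, "flank pain": 0.7})
early_score = uti_model.score("early", "female", ["dysuria", "cough"])
```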
Currently, symptoms (e.g., back pain, lower back, upper back, type of pain) are classified under an organization system that makes sense to a medical professional, but may not make sense to a lay person (e.g., because it uses too much terminology). By contrast, a “complaints” ontology attempts to understand how people experience and report their symptoms and to create logical bridges between how a patient describes their symptoms and how these diseases are mapped out by the medical profession. Other systems may attempt to guide lay users to currently medically standard terms, but lay users may not fully understand the meanings of these individual terms. As such, a community-created system can provide guides that help patients answer the right questions, e.g., simple physical tests to perform or “tips” on how to report medically useful information.
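By way of non-limiting illustration, the following Python sketch relates several lay phrasings of a complaint to the same clinical concept. The phrasings, the concept names, and the simple substring matching are hypothetical and are used only to show the shape of such a mapping.

```python
# Hypothetical complaints ontology: clinical concept -> lay phrasings.
COMPLAINT_PHRASINGS = {
    "dysuria": [
        "burning when i pee",
        "pain while urinating",
        "hurts to urinate",
    ],
    "urinary_frequency": [
        "peeing all the time",
        "urinate more frequently",
    ],
}


def build_phrase_index(ontology):
    """Invert the ontology so each phrasing points at its concept."""
    return {
        phrase: concept
        for concept, phrases in ontology.items()
        for phrase in phrases
    }


def map_complaint(text, phrase_index):
    """Return the sorted concepts whose phrasings occur in the text."""
    lowered = text.lower()
    return sorted(
        {concept for phrase, concept in phrase_index.items()
         if phrase in lowered}
    )


phrase_index = build_phrase_index(COMPLAINT_PHRASINGS)
concepts = map_complaint(
    "It hurts to urinate and I'm peeing all the time", phrase_index
)
```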
A diagnosis dialog 259 (e.g., a wizard) can be generated using the disease models, the diagnosis predictive models, and the symptom ontology in which specific questions are sequentially asked and lab tests may be suggested based on the predictive abilities of particular tests (e.g., tests associated with more strongly correlated factors would be more clearly indicative of particular diseases).
For example, when diagnosing whether an individual has a urinary tract infection, a lay person may be asked: “1. Are you experiencing any pain or discomfort while urinating?” “2. Do you find you are having to urinate more frequently or with increased urgency?” “3. How many times have you been diagnosed with a urinary tract infection before?” “4. Is there pain or discomfort when you press on your abdomen?” and “5. Is your urine discolored or cloudy?” in that order. In some embodiments, the questions may be ordered in a way that optimizes the predictiveness of the tests and questions while minimizing the invasiveness of the tests performed. Questions that have already been answered or that would have no predictive value given already known information (e.g., answers to previous questions) would not be asked.
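By way of non-limiting illustration, the following Python sketch selects the next question of a diagnosis dialog by trading off predictiveness against invasiveness and by skipping questions that have already been answered. The class, the scoring rule, the penalty value, and the numeric scores are hypothetical; in particular, the sketch treats predictiveness as fixed, whereas a fuller implementation would recompute it in view of earlier answers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Question:
    key: str
    text: str
    predictiveness: float  # Assumed score in [0, 1]; higher is more useful.
    invasiveness: float = 0.0  # Assumed score in [0, 1]; higher is worse.


def next_question(questions: List[Question], answers: Dict[str, object],
                  penalty: float = 0.5) -> Optional[Question]:
    """Pick the unanswered question with the best predictiveness trade-off."""
    candidates = [
        q for q in questions
        if q.key not in answers and q.predictiveness > 0
    ]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda q: q.predictiveness - penalty * q.invasiveness,
    )


# Hypothetical scores for three of the example questions.
uti_questions = [
    Question("dysuria",
             "Are you experiencing any pain or discomfort while urinating?",
             predictiveness=0.9),
    Question("abdominal_pain",
             "Is there pain or discomfort when you press on your abdomen?",
             predictiveness=0.8, invasiveness=0.6),
    Question("cloudy_urine",
             "Is your urine discolored or cloudy?",
             predictiveness=0.6),
]
first = next_question(uti_questions, {})
second = next_question(uti_questions, {"dysuria": True})
```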
Data viewed or entered at any given stage may be modified (e.g., by adding and removing signs, symptoms, and factors) by medical professionals based on their observations. These changes may be reviewed or aggregated with other entries to allow medical professionals to collaboratively refine the quality of the information stored in the system as medical professionals verify or identify problems with the stored information.
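By way of non-limiting illustration, the following Python sketch applies a request to add, exclude, or lock in a sign, symptom, or factor of a predictive model. The function name, the data layout, and the example weights are hypothetical. The sketch also assumes that “locking in” means protecting a component from later exclusion, which is one possible reading and is not defined above.

```python
def apply_request(weights, locked, action, component, weight=0.0):
    """Return updated (weights, locked) after one modification request.

    weights: dict mapping component -> weight.
    locked: set of components protected from exclusion (assumed meaning
    of "locking in").
    """
    weights = dict(weights)  # Copy so the caller's data is not mutated.
    locked = set(locked)
    if action == "add":
        weights[component] = weight
    elif action == "exclude":
        if component in locked:
            raise ValueError(f"{component!r} is locked in")
        weights.pop(component, None)
    elif action == "lock":
        if component not in weights:
            raise KeyError(component)
        locked.add(component)
    else:
        raise ValueError(f"unknown action: {action!r}")
    return weights, locked


# Hypothetical model and sequence of author requests.
model, locked_components = {"dysuria": 0.8, "urgency": 0.5}, set()
model, locked_components = apply_request(
    model, locked_components, "add", "fever", 0.9
)
model, locked_components = apply_request(
    model, locked_components, "lock", "dysuria"
)
model, locked_components = apply_request(
    model, locked_components, "exclude", "urgency"
)
```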
While the present invention has been described in connection with certain exemplary embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, but, on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims, and equivalents thereof.
This application claims the benefit of U.S. Provisional Patent Application No. 61/700,309, “CLINICAL DIAGNOSIS OBJECTS AUTHORING,” filed in the United States Patent and Trademark Office on Sep. 12, 2012, the entire disclosure of which is incorporated herein by reference and the benefit of U.S. Provisional Patent Application No. 61/719,766, “CLINICAL DIAGNOSIS OBJECTS INTERACTION,” filed in the United States Patent and Trademark Office on Oct. 29, 2012, the entire disclosure of which is incorporated herein by reference.
Number | Date | Country
---|---|---
61700309 | Sep 2012 | US
61719766 | Oct 2012 | US