The present description relates generally to generating and delivering healthcare insights.
In the Information Age, the amount of information stored as digital data is growing exponentially over time. For example, the emergence of electronic health records (EHRs) of patients, as well as an unprecedented amount of drug-related information and disease-related information from pharmaceutical and medical research and development respectively, has resulted in a vast amount of healthcare information being digitally available to researchers and other end-users. Further, increasingly sophisticated techniques have emerged in recent decades to attempt to extract knowledge from the vast amount of digital information. However, such data is typically stored in various structured and unstructured formats across different platforms and systems.
The inventors have recognized the above issues and have devised several approaches to address them. In particular, systems and methods for generating and delivering personalized healthcare insights are provided. In one embodiment, a method comprises determining, with a processor, healthcare insights based on a knowledge graph constructed from data from a heterogeneous plurality of data sources, generating, with the processor and a machine learning model, a plurality of healthcare recommendations for a user based on the healthcare insights, selecting, with the processor, at least one healthcare recommendation from the plurality of healthcare recommendations based on user behavior, and outputting, to a user device for display to the user, the at least one healthcare recommendation. In this way, a large number of healthcare insights and recommendations may be determined for users, but only a subset of such healthcare insights and recommendations may be selectively provided to and personalized for users.
The above summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to identify key features or essential features of the subject matter, nor is it intended to be used to limit the scope of the subject matter. Furthermore, the subject matter is not limited to implementations that solve any or all of the disadvantages noted above or in any part of this disclosure.
The present description relates to systems and methods for healthcare insights with knowledge graphs. In particular, systems and methods are provided for constructing a knowledge graph from data obtained from a plurality of data sources, and deriving healthcare insights and inferences from the knowledge graph. A knowledge graph comprises a directed heterogeneous graph-structured data model that encodes a network of entities and their relationships. Healthcare insights automatically derived from such knowledge graphs may be used by healthcare providers, such as nurses, to enhance the provision of care, or may be directly shared with patients or other users. A computing environment, such as the computing environment or system depicted in
Server 101 may be a computing device configured to construct knowledge graphs from data obtained from heterogeneous healthcare data sources, generate healthcare insights and inferences from the knowledge graphs, and use such insights to improve various applications such as recommendation systems and search engines. In different embodiments, server 101 may take the form of a mainframe computer, server computer, desktop computer, laptop computer, tablet computer, network computing device, mobile computing device, mobile communication device, and so on.
Server 101 includes a logic subsystem 103 and a data-holding subsystem 104. Server 101 may optionally include a display subsystem 105, communication subsystem 106, and/or other components not shown in
Logic subsystem 103 may include one or more physical devices configured to execute one or more instructions. For example, logic subsystem 103 may be configured to execute one or more instructions that are part of one or more applications, services, programs, routines, libraries, objects, components, data structures, or other logical constructs. Such instructions may be implemented to perform a task, implement a data type, transform the state of one or more devices, or otherwise arrive at a desired result.
Logic subsystem 103 may include one or more processors that are configured to execute software instructions. Additionally or alternatively, the logic subsystem 103 may include one or more hardware or firmware logic machines configured to execute hardware or firmware instructions. Processors of the logic subsystem 103 may be single or multi-core, and the programs executed thereon may be configured for parallel or distributed processing. The logic subsystem 103 may optionally include individual components that are distributed throughout two or more devices, which may be remotely located and/or configured for coordinated processing. One or more aspects of the logic subsystem 103 may be virtualized and executed by remotely accessible networked computing devices configured in a cloud computing configuration.
Data-holding subsystem 104 may include one or more physical, non-transitory devices configured to hold data and/or instructions executable by the logic subsystem 103 to implement the herein described methods and processes. When such methods and processes are implemented, the state of data-holding subsystem 104 may be transformed (for example, to hold different data).
In one example, the server 101 includes a healthcare knowledge system 111 configured as executable instructions in the data-holding subsystem 104. The healthcare knowledge system 111 may import or extract data from a plurality of data sources 141, transform and structure the data from the plurality of data sources 141 as a knowledge graph, and generate insights from the knowledge graph. To effectively predict or assess the potential risk of a member, for example, one or more models of the healthcare knowledge system 111 may process one or more healthcare claims associated with the member. To that end, one or more healthcare databases 112 comprising one or more knowledge graphs storing aggregated data from a plurality of data sources may be stored in the data-holding subsystem 104 and accessible to the healthcare knowledge system 111. That is, the one or more healthcare databases 112 comprise graph databases rather than relational databases. The one or more healthcare databases 112 may include healthcare data and other data stored locally or remotely, for example in one or more databases or data sources 141 communicatively coupled to the server 101 via the network 115.
Data-holding subsystem 104 may include removable media and/or built-in devices. Data-holding subsystem 104 may include optical memory (for example, CD, DVD, HD-DVD, Blu-Ray Disc, etc.), and/or magnetic memory devices (for example, hard drive disk, floppy disk drive, tape drive, MRAM, etc.), and the like. Data-holding subsystem 104 may include devices with one or more of the following characteristics: volatile, nonvolatile, dynamic, static, read/write, read-only, random access, sequential access, location addressable, file addressable, and content addressable. In some embodiments, logic subsystem 103 and data-holding subsystem 104 may be integrated into one or more common devices, such as an application-specific integrated circuit or a system on a chip.
It is to be appreciated that data-holding subsystem 104 includes one or more physical, non-transitory devices. In contrast, in some embodiments aspects of the instructions described herein may be propagated in a transitory fashion by a pure signal (for example, an electromagnetic signal) that is not held by a physical device for at least a finite duration. Furthermore, data and/or other forms of information pertaining to the present disclosure may be propagated by a pure signal.
When included, display subsystem 105 may be used to present a visual representation of data held by data-holding subsystem 104. As the herein described methods and processes change the data held by the data-holding subsystem 104, and thus transform the state of the data-holding subsystem 104, the state of display subsystem 105 may likewise be transformed to visually represent changes in the underlying data. Display subsystem 105 may include one or more display devices utilizing virtually any type of technology. Such display devices may be combined with logic subsystem 103 and/or data-holding subsystem 104 in a shared enclosure, or such display devices may be peripheral display devices.
When included, communication subsystem 106 may be configured to communicatively couple server 101 with one or more other computing devices, such as client device 121. Communication subsystem 106 may include wired and/or wireless communication devices compatible with one or more different communication protocols. As non-limiting examples, communication subsystem 106 may be configured for communication via a wireless telephone network, a wireless local area network, a wired local area network, a wireless wide area network, a wired wide area network, etc. In some embodiments, communication subsystem 106 may allow server 101 to send and/or receive messages to and/or from other devices via a network such as the public Internet. For example, communication subsystem 106 may communicatively couple server 101 with client device 121 via network 115. In some examples, network 115 may be the public Internet. In other examples, network 115 may be regarded as a private network connection and may include, for example, a virtual private network or an encryption or other security mechanism employed over the public Internet.
Further, the server 101 provides a network service that is accessible to a plurality of users through a plurality of client systems such as the client device 121 communicatively coupled to the server 101 via the network 115. As such, computing environment 100 may include one or more devices operated by users, such as client device 121. Client device 121 may be any computing device configured to access a network such as network 115, including but not limited to a personal desktop computer, a laptop, a smartphone, a tablet, and the like. While one client device 121 is shown, it should be appreciated that any number of user devices may be communicatively coupled to the server 101 via the network 115.
Client device 121 includes a logic subsystem 123 and a data-holding subsystem 124. Client device 121 may optionally include a display subsystem 125, communication subsystem 126, a user interface subsystem 127, and/or other components not shown in
Logic subsystem 123 may include one or more physical devices configured to execute one or more instructions. For example, logic subsystem 123 may be configured to execute one or more instructions that are part of one or more applications, services, programs, routines, libraries, objects, components, data structures, or other logical constructs. Such instructions may be implemented to perform a task, implement a data type, transform the state of one or more devices, or otherwise arrive at a desired result.
Logic subsystem 123 may include one or more processors that are configured to execute software instructions. Additionally or alternatively, the logic subsystem 123 may include one or more hardware or firmware logic machines configured to execute hardware or firmware instructions. Processors of the logic subsystem 123 may be single or multi-core, and the programs executed thereon may be configured for parallel or distributed processing. The logic subsystem 123 may optionally include individual components that are distributed throughout two or more devices, which may be remotely located and/or configured for coordinated processing. One or more aspects of the logic subsystem 123 may be virtualized and executed by remotely accessible networked computing devices configured in a cloud computing configuration.
Data-holding subsystem 124 may include one or more physical, non-transitory devices configured to hold data and/or instructions executable by the logic subsystem 123 to implement the herein described methods and processes. When such methods and processes are implemented, the state of data-holding subsystem 124 may be transformed (for example, to hold different data).
Data-holding subsystem 124 may include removable media and/or built-in devices. Data-holding subsystem 124 may include optical memory (for example, CD, DVD, HD-DVD, Blu-Ray Disc, etc.), and/or magnetic memory devices (for example, hard drive disk, floppy disk drive, tape drive, MRAM, etc.), and the like. Data-holding subsystem 124 may include devices with one or more of the following characteristics: volatile, nonvolatile, dynamic, static, read/write, read-only, random access, sequential access, location addressable, file addressable, and content addressable. In some embodiments, logic subsystem 123 and data-holding subsystem 124 may be integrated into one or more common devices, such as an application-specific integrated circuit or a system on a chip.
When included, display subsystem 125 may be used to present a visual representation of data held by data-holding subsystem 124. As the herein described methods and processes change the data held by the data-holding subsystem 124, and thus transform the state of the data-holding subsystem 124, the state of display subsystem 125 may likewise be transformed to visually represent changes in the underlying data. Display subsystem 125 may include one or more display devices utilizing virtually any type of technology. Such display devices may be combined with logic subsystem 123 and/or data-holding subsystem 124 in a shared enclosure, or such display devices may be peripheral display devices.
In one example, the client device 121 may include executable instructions 131 in the data-holding subsystem 124 that when executed by the logic subsystem 123 cause the logic subsystem 123 to perform various actions as described further herein. As one example, the client device 121 may be configured, via the instructions 131, to provide data regarding a patient to the server 101, receive one or more healthcare recommendations from the server 101 generated based on the data regarding the patient and a knowledge graph stored at the server 101, and display the one or more healthcare recommendations via a graphical user interface on the display subsystem 125 to a user such as a healthcare provider or the patient. The client device 121 may be further configured to receive feedback regarding the one or more healthcare recommendations via the user interface subsystem 127.
When included, communication subsystem 126 may be configured to communicatively couple client device 121 with one or more other computing devices, such as server 101. Communication subsystem 126 may include wired and/or wireless communication devices compatible with one or more different communication protocols. As non-limiting examples, communication subsystem 126 may be configured for communication via a wireless telephone network, a wireless local area network, a wired local area network, a wireless wide area network, a wired wide area network, etc. In some embodiments, communication subsystem 126 may allow client device 121 to send and/or receive messages to and/or from other devices, such as server 101, via a network 115 such as the public Internet.
Client device 121 may further include a user interface subsystem 127 comprising user input devices such as keyboards, mice, game controllers, cameras, microphones, and/or touch screens. A user of client device 121 may input feedback regarding a healthcare recommendation, for example, via the user interface subsystem 127. As discussed further herein, client device 121 may stream, via communication subsystem 126, user input received via the user interface subsystem 127 to the server 101 over the network 115. In this way, the server 101 may update one or more models for generating healthcare recommendations and/or one or more knowledge graphs based on the user feedback.
Thus server 101 and client device 121 may each represent computing devices which may generally include any device that is configured to perform computation and that is capable of sending and receiving data communications by way of one or more wired and/or wireless communication interfaces. Such devices may be configured to communicate using any of a variety of network protocols. For example, client device 121 may be configured to execute a browser application that employs HTTP to request information from server 101 and then displays the retrieved information to a user on a display such as the display subsystem 125.
The insights module 205 is configured to generate healthcare insights from a knowledge graph. To that end, the insights module 205 may be configured to generate knowledge graph embeddings from one or more knowledge graphs. Knowledge graph embeddings comprise lower-dimensional representations of the entities and relationships depicted in a knowledge graph. To that end, the insights module 205 may represent each entity and relation of a given knowledge graph with a vector. For example, given a knowledge graph KG=(E, R, W), where E is a set of entities, R is a set of edges between entities, and W is a set of edge weights, the insights module 205 may generate a vector for each entity in E and each relation in R. Such vectors preserve the knowledge graph structure at a reduced dimensionality relative to the graph. The insights module 205 may use one or more knowledge graph embedding models to generate the embeddings from the knowledge graphs. For example, the insights module 205 may be configured to use a translation-based embedding model, a matrix factorization model, or a neural network model. The insights module 205 may then infer new knowledge or derive new insights using the knowledge graph embeddings of the knowledge graph, as described further herein.
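As an illustrative, non-limiting sketch of how a translation-based embedding model may score triples of a knowledge graph, the following Python fragment assumes a handful of hypothetical entity and relation names and randomly initialized vectors standing in for trained embeddings; it is a minimal illustration rather than the implementation of the insights module 205.

```python
# Minimal TransE-style sketch: score a triple (h, r, t) by how closely the
# head embedding translated by the relation embedding lands on the tail
# embedding. Entity and relation names here are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
DIM = 32  # embedding dimensionality, far smaller than the graph itself

entities = ["patient_1", "disease_flu", "medicine_oseltamivir"]
relations = ["diagnosed_with", "prescribed"]

entity_vecs = {e: rng.normal(size=DIM) for e in entities}
relation_vecs = {r: rng.normal(size=DIM) for r in relations}

def score(head: str, relation: str, tail: str) -> float:
    """Lower scores indicate more plausible triples under TransE: ||h + r - t||."""
    h, r, t = entity_vecs[head], relation_vecs[relation], entity_vecs[tail]
    return float(np.linalg.norm(h + r - t))

# In practice the vectors would be trained (e.g., with a margin-based ranking
# loss) so that observed triples score lower than corrupted ones.
print(score("patient_1", "diagnosed_with", "disease_flu"))
```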
The user behavior module 210 is configured to receive and aggregate user feedback regarding recommendations provided by the healthcare knowledge system 200. For example, the user behavior module 210 may receive indications of user engagement with a healthcare recommendation, such as reviewing the recommendation, accepting the recommendation, rejecting the recommendation, and so on. Such user behavior may be encoded in the knowledge graph(s) described herein as user feedback, which in turn may enable further refinement of the knowledge graph and subsequent healthcare recommendations. For example, based on user behavior, recommendations may be ranked and filtered.
The knowledge module 215 is configured to aggregate data from a plurality of data sources into a knowledge graph. For example, the knowledge module 215 may import data from a plurality of data sources, perform extraction, transformation, and loading (ETL) of the data, and represent the data as entities and relationships in a knowledge graph. An example knowledge graph is described further herein with regard to
The queueing module 220 is configured to manage data updates to be integrated into the knowledge graph managed by the knowledge module 215, for example in a data update queue, as well as to manage the output of recommendations determined based on insights, for example in a recommendation queue. For example, while an initial knowledge graph may be constructed from initial data sources, the initial knowledge graph may be regularly updated by incrementally integrating or consolidating additional datasets into the knowledge graph. The queueing module 220 may therefore organize and queue updates to the knowledge graph.
The recommendation module 225 is configured to generate recommendations such as healthcare recommendations for users based on the insights determined by the insights module 205, for example, from the knowledge graph constructed and updated by the knowledge module 215. A healthcare recommendation may comprise, as an illustrative and non-limiting example, a recommendation for a medicine to prescribe to a patient based on one or more symptoms experienced by the patient.
The personalization module 230 is configured to coordinate personalized output for a given user based on data from the insights module 205, the user behavior module 210, the knowledge module 215, the queueing module 220, and the recommendation module 225. For example, the personalization module 230 facilitates interactions between modules of the healthcare knowledge system 200 to provide output that is personalized to a user as well as personalized for a given patient.
The modeling module 235 is configured to construct and train machine learning model(s), for example, to provide artificially intelligent services to users. For example, the modeling module 235 may include one or more machine learning models trained as a chatbot, which may utilize the knowledge graph of the knowledge module 215 and/or insights of the insights module 205, for example, to automatically converse via text with a user. Each model of the modeling module 235 may comprise a machine learning model, as an illustrative and non-limiting example. For example, one or more of the models may comprise a machine learning model trained via supervised or unsupervised learning. To that end, a model of the modeling module 235 may comprise, as a non-limiting and illustrative example, one or more of an artificial neural network, a linear regression model, a logistic regression model, a linear discriminant analysis model, a classification or regression tree model, a naïve Bayes model, a k-nearest neighbors model, a learning vector quantization model, a support vector machine, a random forest model, a boosting model, and so on. In particular, the models may comprise different types of machine learning models.
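As a hedged, minimal sketch of how a modeling module might instantiate and train one of several interchangeable model types via supervised learning, the following Python fragment uses scikit-learn with synthetic placeholder features and labels; the registry keys and data are assumptions for illustration only, not part of the modeling module 235 itself.

```python
# Hedged sketch: a modeling module that can swap among several model types.
# The features and labels are synthetic placeholders, not real healthcare data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

MODEL_REGISTRY = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100),
    "k_nearest_neighbors": KNeighborsClassifier(n_neighbors=5),
}

def train_model(name: str, features: np.ndarray, labels: np.ndarray):
    """Train the requested model type via supervised learning."""
    model = MODEL_REGISTRY[name]
    model.fit(features, labels)
    return model

# Synthetic example: 100 samples, 8 features, binary labels.
rng = np.random.default_rng(1)
X, y = rng.normal(size=(100, 8)), rng.integers(0, 2, size=100)
clf = train_model("random_forest", X, y)
print(clf.predict(X[:3]))
```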
The user interface module 240 is configured to generate a user interface for transmission to a user device, such as a client device 121, which may be displayed as a graphical user interface via a display subsystem 125, for example. The user interface module 240 may further receive data from the client device 121, for example, and provide the received data to one or more modules of the healthcare knowledge system 200. In this way, the user interface module 240 manages interactions between a user and the healthcare knowledge system 200.
To that end, the healthcare knowledge system 310 includes a traffic flow module 312 configured to effectively connect user requests to infrastructure of the healthcare knowledge system 310. For example, while a user may attempt to access the healthcare knowledge system 310 via a set domain name (e.g., www.example.com), the traffic flow module 312 dynamically routes the user to application endpoints.
For example, user requests from the user device 305 may be dynamically routed via the traffic flow module 312 to an application load balancer 314. The application load balancer 314 distributes the incoming traffic across multiple targets, such as multiple elastic compute cloud (EC2) instances. For example, the application load balancer 314 may distribute the traffic to an edge and service proxy 316, which in turn directs the traffic to a scalable container service module 320.
The scalable container service module 320 enables the execution of modules contained therein in a managed cluster of EC2 instances. To that end, the scalable container service module 320 may include a knowledge graph core 325 and an application platform 330. The user of the user device 305 thus interacts with the application platform 330, which may provide services such as a specialty normalization 332, a specialty recommendation 334, a symptom checker 336, and so on, which are powered by the knowledge graph core 325. The knowledge graph core 325 may comprise, for example, the insights module 205 and knowledge module 215 as described hereinabove. To provide dynamic knowledge graph functionality for the application platform 330, a graph service module 340 includes a search module 342 and a graph database module 344. The search module 342 may enable search via elastic search, for example, while the graph database module 344 provides graph data storage and operation functionality.
A plurality of databases or data sources 410 includes, for example, one or more disease databases 412, one or more patient databases 414, and one or more medicine databases 416. For example, the one or more disease databases 412 may comprise a dataset of an ICD-9 ontology which maps diagnostic codes (e.g., ICD-9 codes) to related terms. The one or more patient databases 414 may include, for example, a database such as MIMIC-III (“Medical Information Mart for Intensive Care”) which comprises a large, single-center database including information relating to patients admitted to critical care units at a large tertiary care hospital. Such a database may include de-identified data including but not limited to vital signs, medications, laboratory measurements, observations and notes charted by care providers, fluid balance, procedure codes, diagnostic codes, imaging reports, hospital length of stay, survival data, and so on. The one or more medicine databases 416 may include, for example, the DrugBank database which includes detailed drug (i.e., chemical, pharmacological, and pharmaceutical) data with comprehensive drug target (i.e., sequence, structure, and pathway) information.
Method 400 thus constructs 418 a heterogeneous knowledge graph 420 from the data stored in the plurality of databases 410. The knowledge graph 420, as depicted, includes a plurality of entities and edges connecting the entities. For example, the knowledge graph 420 includes a plurality of disease entities 422 with a plurality of directed disease edges indicating relationships between the disease entities 422, a plurality of patient entities 424, a plurality of patient-disease edges 425 indicating relationships between the patient entities 424 and the disease entities 422, a plurality of medicine entities 426 with a plurality of directed medicine edges indicating relationships between the medicine entities 426, and a plurality of patient-medicine edges 427 indicating relationships between the patient entities 424 and the medicine entities 426.
Method 400 then trains knowledge graph embeddings 430 with the knowledge graph 420 based on a knowledge graph embedding model, such as a translation-based embedding model. The knowledge graph embeddings 430 comprise jointly-learned embeddings, including, as depicted: disease embeddings 432, 436, and 440; patient embeddings 438; medicine embeddings 442, 444, and 448; and relation embeddings 434 and 446. As discussed hereinabove, the knowledge graph embeddings 430 comprise a lower-dimensional representation of the knowledge graph 420.
Method 400 receives new patient data 450 comprising, for example, a patient entity linked to one or more disease entities as depicted. Based on the new patient data 450, method 400 then generates 460 a medicine recommendation 470, for example, based on the knowledge graph embeddings 430 and thus on the knowledge graph 420. For example, method 400 identifies one or more medicine entities linked to a patient entity which is in turn linked to the same set of disease entities as indicated in the new patient data 450, and returns the one or more medicine entities as the medicine recommendation 470, as depicted.
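The following Python fragment is a hedged sketch of the generation step 460 under the assumption that medicine recommendations are produced by nearest-neighbor lookup in the embedding space: medicines linked to the existing patients most similar to the new patient are returned. The patient identifiers, vectors, and medicine links are hypothetical placeholders.

```python
# Hedged sketch of step 460: recommend medicines linked to the existing
# patients whose embeddings are most similar to the new patient's embedding.
# All identifiers and vectors below are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(2)
DIM = 16

patient_embeddings = {f"patient_{i}": rng.normal(size=DIM) for i in range(5)}
patient_to_medicines = {
    "patient_0": ["medicine_a"],
    "patient_1": ["medicine_b", "medicine_c"],
    "patient_2": ["medicine_a", "medicine_d"],
    "patient_3": ["medicine_e"],
    "patient_4": ["medicine_b"],
}

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def recommend(new_patient_vec: np.ndarray, top_k: int = 2) -> list[str]:
    """Return medicines linked to the top_k most similar existing patients."""
    ranked = sorted(
        patient_embeddings,
        key=lambda p: cosine(new_patient_vec, patient_embeddings[p]),
        reverse=True,
    )
    medicines: list[str] = []
    for patient in ranked[:top_k]:
        medicines.extend(patient_to_medicines[patient])
    return medicines

print(recommend(rng.normal(size=DIM)))
```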
Although the knowledge graph 420 described hereinabove includes disease, patient, and medicine entities, the knowledge graphs of the present disclosure may include additional information. As an illustrative example,
The knowledge graph 500 further includes one or more procedure code entities 520 associated with the diagnosis code entity 505 via one or more diagnosis code-procedure code edges 522. Similarly, the diagnosis code entity 505 is associated with a disease entity 525 for a disease via a diagnosis code-disease edge 527. The disease entity 525 is further linked to a treatment entity 530 for a treatment of the disease 525 via a disease-treatment edge 532, and a diagnostic method entity 535 for a diagnostic method of diagnosing the disease 525 via a disease-diagnostic method edge 537.
The knowledge graph 500 further includes a member entity 540 associated with the diagnosis code 505 via a diagnosis code-member edge 542. The member 540 is further linked to a feedback entity 545 via a member-feedback edge 547, wherein the feedback entity 545 includes user feedback for the member 540. In this way, members 540 or patients may be linked, via the knowledge graph 500, to diagnosis codes 505 and thus to symptoms 510, diseases 525, and so on.
The knowledge graph 500 further includes a medication information entity 550 associated with the diagnosis code 505 via a diagnosis code-medication edge 552. The medication information 550 may in turn be associated with one or more side effect entities 555 via a medication-side effect edge 557.
Similarly, the diagnosis code 505 is associated with at least one specialty entity 560 via a diagnosis code-specialty edge 562. The specialty or specialty entity 560 indicates a healthcare specialty associated with the diagnosis code 505. Further, one or more providers 565 are associated with the specialty 560 via one or more corresponding specialty-provider edge(s) 567. The specialty 560 is further associated with an entity 570 indicating the specialty in layman's terms via a specialty-laymen edge 572.
Thus, by organizing data as depicted in the knowledge graph 500, the different entities such as symptoms, specialties, providers, diseases, treatments, diagnostic methods, medication, side effects, patients or members, and even user feedback may be associated.
At 605, method 600 performs data extraction. For example, method 600 performs web crawling of biomedical literature in a network-connected archive or database such as PubMed. At 610, method 600 performs entity linking. For example, method 600 may use Unified Medical Language System (UMLS) identifiers (IDs) with natural language processing techniques, including neural language models trained for medical entity linking (e.g., MedLinker), to map text in the extracted data to entities in the knowledge base or knowledge graph. To that end, entity linking includes constructing a knowledge graph and/or updating a knowledge graph with the extracted data, for example to connect the identified entities to other entities of the knowledge graph. Entity linking may be performed via a trained neural network model and/or other natural language processing (NLP) models, for example.
At 615, method 600 performs knowledge inference. To that end, method 600 performs relationship extraction, for example by detecting and classifying relationships between entities. Further, method 600 generates new knowledge candidates based on the extracted relationships, for example by using a rule-based technique for generating the new candidates and then evaluating whether the candidates match entities in the knowledge graph. For example, a score may be computed for each new candidate, and candidates may be added and linked to the knowledge graph if the score is above a threshold. A ranking and filtering strategy may be used, for example to take the domain relevance and trustworthiness of the origin of the data into account.
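As a hedged sketch of the candidate scoring and thresholding described above, the following Python fragment combines a relation-extraction confidence with a source-trustworthiness weight and keeps only candidates above a threshold; the candidate triples, trust weights, and threshold value are hypothetical.

```python
# Hedged sketch of knowledge inference at 615: score candidate triples and
# keep only those above a threshold, weighting the trustworthiness of the
# data source. Candidate contents and weights are hypothetical.
SOURCE_TRUST = {"peer_reviewed_journal": 1.0, "preprint": 0.7, "web_forum": 0.3}
THRESHOLD = 0.6

candidates = [
    {"triple": ("drug_x", "treats", "disease_y"), "model_score": 0.8,
     "source": "peer_reviewed_journal"},
    {"triple": ("drug_z", "treats", "disease_y"), "model_score": 0.9,
     "source": "web_forum"},
]

def score_candidate(candidate: dict) -> float:
    """Combine a relation-extraction confidence with source trustworthiness."""
    return candidate["model_score"] * SOURCE_TRUST[candidate["source"]]

accepted = sorted(
    (c for c in candidates if score_candidate(c) >= THRESHOLD),
    key=score_candidate,
    reverse=True,
)
for c in accepted:
    print("add to knowledge graph:", c["triple"])
```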
At 620, method 600 performs publishing and serving of the knowledge graph. To that end, method 600 trains knowledge graph embeddings based on the updated knowledge graph, and further performs embedding-based validation. The knowledge graph embeddings may be used for generating new recommendations based on new patient data, for example as described hereinabove with regard to
Method 600 then obtains feedback and evaluation at 625, for example by joint learning with text, and bootstrapping the extraction pipeline. That is, the extraction pipeline comprising the method 600 may be a stepwise or iterative learning process in which the knowledge graph published and served at 620 is treated as the initial knowledge to be updated at 605.
Method 600 may rely on medical or healthcare-related ontologies, such as UMLS and the Systematized Nomenclature of Medicine (SNOMED), as depicted at 630 for example, to perform both the data extraction at 605 and the publishing and serving at 620.
Method 700 begins at 705. At 705, method 700 imports data from a plurality of data sources. For example, the server 101 may import data from the plurality of data sources 141 into the data-holding subsystem 104. Further, at 710, method 700 extracts, transforms, and loads the imported data to obtain an integrated data set. That is, method 700 extracts the imported data, converts or transforms the extracted data into a different format or structure, and stores or loads the transformed data into a graph structure. In this way, the extract, transform, load (ETL) process converts the data that may have been previously stored in relational database structures or other files (e.g., raw text) into a graph format. Further, the transformation step may include converting the relations encoded in relational database structures into entities and edges or links therebetween.
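As a minimal, hedged sketch of the transformation step, the following Python fragment converts rows of a hypothetical relational claims table into (entity, relation, entity) triples suitable for loading into a graph structure; the column and relation names are assumptions for illustration only.

```python
# Hedged sketch of the ETL transformation at 710: flatten relational rows
# into (head entity, relation, tail entity) triples for loading into a
# graph structure. Column and relation names are hypothetical.
claims_rows = [
    {"member_id": "M001", "diagnosis_code": "J10", "drug_code": "D123"},
    {"member_id": "M002", "diagnosis_code": "E11", "drug_code": "D456"},
]

def rows_to_triples(rows: list[dict]) -> list[tuple[str, str, str]]:
    """Convert each relational row into edges among member, diagnosis, and drug entities."""
    triples = []
    for row in rows:
        member = f"member:{row['member_id']}"
        diagnosis = f"diagnosis_code:{row['diagnosis_code']}"
        drug = f"drug:{row['drug_code']}"
        triples.append((member, "has_diagnosis", diagnosis))
        triples.append((member, "prescribed", drug))
        triples.append((diagnosis, "treated_with", drug))
    return triples

for triple in rows_to_triples(claims_rows):
    print(triple)
```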
At 715, method 700 identifies data in the integrated data set as entities and relationships. As mentioned hereinabove, during the ETL process some of the data is already transformed into entities and relationships. Method 700 further identifies additional entities and links therebetween, for example in order to connect data from disparate data sources. For example, a first data source may include patient or member data with diagnosis codes, a second data source may include provider data and specialties, and a third data source may include medication information. As an illustrative example, the data sources may include specialty layman's terms, provider specialty, provider types, diagnosis codes, symptoms data, Medicare specialty, drug data, disease information, treatment data, member data from medical claims, user interaction data, and so on. While the ETL process may identify entities and edges for data within each data source, at 720 method 700 may identify connections between such data sources in order to link the entities as depicted in
At 725, method 700 determines insights based on the knowledge graph. For example, method 700 may use a trained machine learning model (e.g., trained and stored in the modeling module 235) to automatically identify new connections within the knowledge graph. For example, while a given member or patient may be linked to a diagnosis code and a particular medication in the knowledge graph because the patient is prescribed the particular medication, the knowledge graph may indicate that a large plurality of similar patients (i.e., other patients linked to similar symptoms, diagnosis code, disease, and so on) are prescribed a different medication with a lower cost. Method 700 may thus identify, as an insight, the potential for prescribing the lower-cost medication to the patient as an alternative. Similarly, healthcare providers may be associated with certain symptoms and diseases, for example, thereby providing a healthcare insight that may be useful for users searching for providers who treat a given disease.
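The following Python fragment is a hedged sketch of the alternative-medication insight described above: among records sharing the patient's diagnosis code, it looks for a commonly prescribed medication with a lower cost than the patient's current prescription. The records, costs, and field names are hypothetical.

```python
# Hedged sketch of an insight at 725: for a patient's diagnosis code, find a
# medication that similar patients commonly receive at a lower cost than the
# patient's current prescription. All records and costs are hypothetical.
from collections import Counter

records = [  # (diagnosis_code, medication, monthly_cost)
    ("E11", "drug_brand", 320.0),
    ("E11", "drug_generic", 35.0),
    ("E11", "drug_generic", 35.0),
    ("E11", "drug_generic", 35.0),
]
patient = {"diagnosis_code": "E11", "medication": "drug_brand", "cost": 320.0}

def alternative_medication(patient: dict, records: list[tuple]):
    """Return a cheaper medication commonly prescribed for the same diagnosis, if any."""
    same_dx = [r for r in records if r[0] == patient["diagnosis_code"]]
    counts = Counter(r[1] for r in same_dx)
    for med, _ in counts.most_common():
        cost = next(r[2] for r in same_dx if r[1] == med)
        if med != patient["medication"] and cost < patient["cost"]:
            return med
    return None

print(alternative_medication(patient, records))  # -> "drug_generic"
```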
At 730, method 700 stores the insights for output. Such healthcare insights may be stored separately, for example in the insights module 205, or may be stored in the knowledge graph itself. The stored insights and the knowledge graph may be used to serve personalized recommendations in various applications and systems, for example in personalized provider searches, personalized recommendations based on symptoms, healthcare decision support systems, and so on. Method 700 then returns.
Data associated with a user may be captured and transmitted by and via a user interface 810, for example, of a user device (not shown) such as the client device 121. Such data may include a click stream (e.g., a series of mouse clicks made while interacting with a network service), a member or user profile, reactions to recommendations, user interaction with a chatbot, user interaction with a care guide, claims history, and so on. Such data may be captured and organized as events by the event dispatch 812, for example, which routes the events to the event store 814 for offline computation 802 and/or the recommendation engine 850 for online computation 804 and real-time provision of recommendations to the user interface 810, for example. Events comprise units of information or data to be processed, and are routed by the event dispatch 812 to trigger one or more subsequent actions or processes, such as updating a knowledge graph 840.
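As a hedged, minimal sketch of the routing performed by the event dispatch 812, the following Python fragment persists every event for offline computation and forwards selected event types for online computation; the event types and payloads are hypothetical.

```python
# Hedged sketch of the event dispatch 812: route incoming user-interface
# events to the event store (offline path) and, for latency-sensitive event
# types, also to the recommendation engine (online path). Types are hypothetical.
from collections import deque

event_store = deque()       # stands in for the offline event store 814
online_requests = deque()   # stands in for real-time recommendation requests

REALTIME_EVENT_TYPES = {"click_stream", "recommendation_reaction"}

def dispatch(event: dict) -> None:
    """Persist every event offline; forward selected event types for online serving."""
    event_store.append(event)
    if event["type"] in REALTIME_EVENT_TYPES:
        online_requests.append(event)

dispatch({"type": "click_stream", "user_id": "M001", "target": "provider_search"})
dispatch({"type": "profile_update", "user_id": "M001"})
print(len(event_store), len(online_requests))  # 2 events stored, 1 routed online
```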
For example, the events may be routed via the event dispatch 812 to the event store 814 comprising a database storing events. The extract-transform-load (ETL) linkage 816 uses an ETL approach to extract, transform, and load data from the event store 814 into an appropriate format or structure for downstream processing, and passes the data to the data dispatch 820. The data dispatch 820 dispatches the ETL data from the ETL linkage 816 as well as offline data 822 to one or more of a model retrain module 825 and an offline computation module 830.
The model retrain module 825 uses the data from the data dispatch 820 to train or retrain one or more machine learning models 827 configured to generate healthcare recommendations, as one example. The offline computation module 830 may comprise an analytics engine, as described further herein, configured to understand users and items, extract insights regarding such users and items, and generate recommendations. To that end, the offline computation module 830 may retrieve data including other healthcare recommendations from one or more third-party APIs 832, a knowledge graph 840 as described hereinabove, the models 827, and data from the data dispatch 820. The offline computation module 830 may thus update the knowledge graph 840 with new data and insights, and new insights may further be transmitted to the insights dispatch 835.
In order to deliver results in real-time, a serving data store 845 of the online computation 804 stores results of offline computation 802 including insights determined by the offline computation module 830 and dispatched by the insights dispatch 835, as well as insights and/or other knowledge stored in the knowledge graph 840. The recommendation engine 850 of the online computation 804 retrieves results from the serving data store 845 and may perform ranking and filtering of results to provide personalized results to the user interface 810.
The analytics engine 905 understands users and items, and extracts insights regarding the users and items. Items may comprise any entity recommended to a user, including but not limited to providers, specialties, alerts, medications, and so on. The analytics engine 905 may be configured to understand the topics of conversations by a user, understand the relationship between a symptom and a specialty, and understand the health journey a user is undergoing based on claims data associated with the user. The analytics engine 905 may therefore comprise one or more machine learning models, including supervised and/or unsupervised models configured to tag the users and items. The analytics engine 905 may further comprise various embeddings to measure the similarities of the users and the items. The analytics engine 905 may further comprise business rules to tag a user or an item. The analytics engine 905 may further interact with external services (e.g., external services 151) via one or more APIs, for example to tag claims data and users as candidates. The analytics engine 905 may thus receive factual data (e.g., member profile, claim history, and so on), records about user behavior (e.g., how the user interacts with the network services), and data from third-party APIs as inputs. The analytics engine 905 processes the data inputs to generate and output tags about users and items. Tags may comprise entities and relationships as described hereinabove.
The knowledge engine 910 stores the knowledge learned with the analytics engine 905, infers new knowledge from knowledge learned with the analytics engine 905, and serves aggregated knowledge. The knowledge engine 910 may thus comprise the knowledge graph systems described hereinabove such as the knowledge module 215. For example, the knowledge engine 910 may store knowledge and insights in a knowledge graph, similar to the knowledge graph 500 described hereinabove. Such knowledge and insights may include tags regarding a health plan of a user (e.g., indicating if the user has telehealth benefits), tags regarding communications of the user (e.g., indicating if the user is a frequent care guide communicator), the questions of greatest concern to the user, mappings of specialty terms to layman's terms, mappings of symptoms to specialties, and so on.
The user behavior engine 915 collects data regarding how a user interacts with the network service(s) of the server 101, for example, so that the user behavior may be modeled. Such data may include, as illustrative examples, a click stream of the user, recommendations and reactions to the recommendations by the user, user interaction with a chatbot, user interaction with a care guide, and so on. The user behavior engine 915 may use the analytics engine 905 to tag users, for example, based on the user behavior collected by the user behavior engine 915.
The recommendation engine 920 serves recommendation requests by retrieving previously computed or recomputed results, as well as by computing results in real time for real-time serving (e.g., provider ranking). The recommendation engine 920 may thus comprise a serving data store (e.g., the serving data store 845) storing offline computation results. The recommendation engine 920 may further comprise one or more machine learning models trained to determine dynamic behaviors based on when or where an API of the recommendation engine 920 is called. The recommendation engine 920 may further comprise a log collector to record a snapshot of when the recommendation was generated.
The queueing engine 925 manages data updates within the personalization system 900. For example, the queueing engine 925 may comprise and/or manage a data update queue, a recommendation generation queue, and a notification queue. The data update queue may determine when data updates are used for updating a knowledge graph and/or retraining one or more machine learning models, for example. The recommendation generation queue may determine when new recommendations are generated based on updated data. The notification queue may determine when notifications of new healthcare recommendations or healthcare insights are served to the user.
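The following Python fragment is a minimal, hedged sketch of a queueing engine maintaining the three queues described above as in-memory queues; the message types and payloads are hypothetical placeholders rather than the actual implementation of the queueing engine 925.

```python
# Hedged sketch of the queueing engine 925: three in-memory queues that
# decouple data updates, recommendation generation, and user notification.
# Message payloads are hypothetical.
from queue import Queue

class QueueingEngine:
    def __init__(self) -> None:
        self.data_update_queue: Queue = Queue()
        self.recommendation_generation_queue: Queue = Queue()
        self.notification_queue: Queue = Queue()

    def write(self, message: dict) -> None:
        """Insert a message into the queue matching its type."""
        queues = {
            "data_update": self.data_update_queue,
            "recommendation_generation": self.recommendation_generation_queue,
            "notification": self.notification_queue,
        }
        queues[message["type"]].put(message)

engine = QueueingEngine()
engine.write({"type": "data_update", "payload": {"member_id": "M001"}})
print(engine.data_update_queue.qsize())  # 1
```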
The personalization service 1005 also manages interactions with the knowledge service 1050, which may comprise for example the knowledge graph core 325 described hereinabove. The knowledge service 1050 stores knowledge in a knowledge graph and serves knowledge and/or insights stored in the knowledge graph responsive to requests. The knowledge service 1050 further interacts with a graph search module 1055, which may comprise the search module 342, as well as a managed graph database module 1060, which may comprise the graph database module 344.
The structure of the system 1000 ensures that direct linking, direct reading of data stores, shared memory, back doors, and so on may be avoided, as all communication is facilitated by the personalization service 1005. Example methods for how the personalization service 1005 facilitates communication in the system 1000 are described further herein with regard to
Method 1100 begins at 1105. At 1105, method 1100 receives a message from the data updates queue. Responsive to receiving the message from the data updates queue, at 1110, method 1100 retrieves data through API(s) and/or database(s) based on the message. Further, at 1115, method 1100 receives or retrieves related existing insights from the knowledge engine.
At 1120, method 1100 inputs the retrieved data and the existing insights to one or more machine learning model(s) to generate new insights. Alternatively, in some examples, method 1100 applies rules to the retrieved data and the existing insights to generate new insights. At 1125, method 1100 outputs the new insights back to the knowledge engine. As new insights are generated, at 1130, method 1100 sends a message to the recommendation generation queue to generate new recommendations based on the new insights. Method 1100 then returns.
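As a hedged, end-to-end sketch of method 1100, the following Python fragment consumes a data-update message, combines retrieved data with existing insights (a placeholder rule stands in for the machine learning model(s)), writes new insights back to a knowledge store, and enqueues a recommendation-generation message; the functions and payloads are hypothetical.

```python
# Hedged sketch of method 1100. A placeholder rule stands in for model
# inference; all functions and payloads are hypothetical.
from queue import Queue

data_updates_queue: Queue = Queue()
recommendation_generation_queue: Queue = Queue()
knowledge_engine: dict = {"M001": ["existing_insight"]}

def retrieve_data(message: dict) -> dict:
    return {"member_id": message["member_id"], "new_claims": 2}  # placeholder fetch

def generate_insights(data: dict, existing: list) -> list:
    # Placeholder rule standing in for model inference or rule application.
    if data["new_claims"] > 0:
        return existing + [f"{data['member_id']}_has_recent_claims"]
    return existing

def handle_data_update() -> None:
    message = data_updates_queue.get()                          # 1105
    data = retrieve_data(message)                               # 1110
    existing = knowledge_engine.get(message["member_id"], [])   # 1115
    new_insights = generate_insights(data, existing)            # 1120
    knowledge_engine[message["member_id"]] = new_insights       # 1125
    recommendation_generation_queue.put(                        # 1130
        {"type": "recommendation_generation", "member_id": message["member_id"]}
    )

data_updates_queue.put({"member_id": "M001"})
handle_data_update()
print(knowledge_engine["M001"])
```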
Method 1200 begins at 1205. At 1205, method 1200 receives a message from the recommendation generation queue. The message from the recommendation generation queue indicates that a new recommendation should be generated.
In response to receiving the message, method 1200 continues to 1210. At 1210, method 1200 collects data to generate recommendations. For example, method 1200 may gather the data needed to generate the recommendations by rules or by a model from one or more databases and/or APIs. At 1215, method 1200 receives or retrieves existing insights from the knowledge module or knowledge service 1050, for example. Further, at 1220, method 1200 receives or retrieves existing recommendations from a database of recommendation records, such as the recommendation records 1025.
At 1225, method 1200 inputs the retrieved data and the existing insights into one or more machine learning model(s) to generate new recommendations. Alternatively, method 1200 applies rules to the retrieved data and the existing insights to generate new recommendations.
At 1230, method 1200 filters the recommendations through the user behavior engine. In particular, by running the recommendations through the user behavior engine, recommendations which users might not be interested in are filtered out. For example, as discussed hereinabove, the user behavior engine evaluates user behavior with respect to recommendations. The user behavior engine may then filter and rank recommendations according to how other users interact with recommendations. For example, if users consistently ignore a particular recommendation, the user behavior engine may remove the recommendation, down-weight the recommendation, or assign a lower rank to the recommendation for serving to the user. Conversely, if users consistently interact with a particular recommendation over other recommendations, the user behavior engine may up-weight or assign a higher rank to the recommendation for serving to the user.
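The following Python fragment is a hedged sketch of the ranking and filtering at 1230: it computes an engagement rate per recommendation from aggregated user behavior, drops recommendations below a threshold, and ranks the remainder; the interaction counts and the threshold are hypothetical.

```python
# Hedged sketch of the ranking and filtering at 1230: drop recommendations
# that users consistently ignore and rank the rest by engagement rate.
# The interaction counts and threshold are hypothetical.
IGNORE_THRESHOLD = 0.05  # minimum engagement rate to keep a recommendation

behavior_stats = {  # recommendation_id -> (times_engaged, times_shown)
    "rec_telehealth": (120, 400),
    "rec_flu_shot": (3, 500),
    "rec_generic_switch": (150, 300),
}

def rank_and_filter(recommendations: list) -> list:
    """Filter out low-engagement recommendations, then rank by engagement rate."""
    def engagement(rec: str) -> float:
        engaged, shown = behavior_stats.get(rec, (0, 1))
        return engaged / shown
    kept = [r for r in recommendations if engagement(r) >= IGNORE_THRESHOLD]
    return sorted(kept, key=engagement, reverse=True)

print(rank_and_filter(["rec_flu_shot", "rec_telehealth", "rec_generic_switch"]))
# -> ['rec_generic_switch', 'rec_telehealth']
```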
At 1240, method 1200 stores the final recommendations in a database for delivery. For example, the recommendations may be stored, along with any ranking or filtering of the recommendations, in the recommendation records database for subsequent retrieval.
If there are recommendations that warrant push notifications or other direct notifications, at 1245, method 1200 sends a message with the recommendations to the notification queue, so that subscribers of the queue can deliver the notifications. Method 1200 then returns.
Method 1300 begins at 1310. At 1310, the user device 1302 transmits one or more message(s) to the personalization service 1304. At 1315, the personalization service 1304 receives the message(s). The personalization service 1304 then validates the message(s) at 1320, for example by checking whether the message contains valid identifiers (IDs).
At 1325, the personalization service 1304 determines whether the message(s) are invalid. If the message(s) are invalid ("YES" at 1325), method 1300 continues to 1330. At 1330, the personalization service 1304 transmits an insertion fail message to the user device 1302. At 1335, the user device 1302 receives the insertion fail message. Method 1300 then returns. Thus, invalid message(s) are not transmitted through to internal data sources such as the queueing engine 1306.
However, referring again to 1325, if the message(s) are valid ("NO" at 1325), method 1300 continues to 1340. At 1340, the personalization service 1304 transmits the validated message(s) to the queueing engine 1306.
At 1345, the queueing engine 1306 receives the validated message(s). At 1350, the queueing engine 1306 writes the validated message(s). For example, the queueing engine 1306 inserts the message to one or more of the data updates queue, the recommendation generation queue, the notification queue, and so on depending on the type of message. Upon successfully writing the validated message(s), at 1355, the queueing engine 1306 transmits an insertion complete message to the personalization service 1304. At 1360, the personalization service 1304 receives the insertion complete message, and at 1365, the personalization service 1304 transmits the insertion complete message to the user device 1302. At 1370, the user device 1302 receives the insertion complete message. Method 1300 then returns. Thus, the personalization service 1304 facilitates the writing of valid messages to internal data sources such as the queueing engine 1306.
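As a hedged sketch of the validation-and-insertion flow of method 1300, the following Python fragment checks a message for required identifiers before writing it to a queueing engine and returns an insertion complete or insertion fail result; the required fields and the queue write are hypothetical placeholders.

```python
# Hedged sketch of method 1300: the personalization service validates that a
# message carries expected identifiers before forwarding it to the queueing
# engine, and relays an insertion-complete or insertion-fail response.
from queue import Queue

queueing_engine: Queue = Queue()
REQUIRED_IDS = ("member_id", "message_id")

def validate(message: dict) -> bool:
    """Check that the message contains non-empty values for all required IDs (1320)."""
    return all(message.get(field) for field in REQUIRED_IDS)

def handle_message(message: dict) -> str:
    if not validate(message):             # 1325: message invalid ("YES")
        return "insertion fail"           # 1330/1335
    queueing_engine.put(message)          # 1340-1350: write validated message
    return "insertion complete"           # 1355-1370

print(handle_message({"member_id": "M001", "message_id": "evt-42"}))  # insertion complete
print(handle_message({"member_id": "M001"}))                          # insertion fail
```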
Method 1400 begins at 1410. At 1410, the user device 1402 transmits an indication of the user of the user device 1402 to the personalization service 1404. At 1415, the personalization service 1404 receives the indication of the user from the user device 1402.
At 1420, the personalization service 1404 transmits, to the recommendation engine 1406, a request for generated recommendations for the indicated user. At 1425, the recommendation engine 1406 receives the request for the generated recommendations for the indicated user from the personalization service 1404.
At 1430, the recommendation engine 1406 retrieves the generated recommendations for the indicated user based on the request, and transmits the generated recommendations for the indicated user back to the personalization service 1404. At 1435, the personalization service 1404 receives the generated recommendations from the recommendation engine 1406.
At 1440, the personalization service 1404 transmits the generated recommendations for the user to the user behavior engine 1408. At 1445, the user behavior engine 1408 receives the generated recommendations. At 1450, the user behavior engine 1408 ranks and filters the recommendations for the user, for example, based on the user behavior of the user and/or the user behavior of users similar to the user. At 1455, the user behavior engine 1408 transmits the ranked and filtered recommendations to the personalization service 1404.
At 1460, the personalization service 1404 receives the ranked and filtered recommendations from the user behavior engine 1408. At 1465, the personalization service 1404 transmits the ranked and filtered recommendations to the user device 1402. At 1470, the user device 1402 receives the ranked and filtered recommendations. The ranked and filtered recommendations, which are personalized for the user of the user device 1402, may thus be displayed to the user of the user device 1402 via a graphical user interface, as an illustrative example. The personalization of the recommendations occurs due to the methods described herein for generating healthcare insights and recommendations, as well as the ranking and filtering of the recommendations based at least on user behavior. Method 1400 then returns.
Method 1400 may also be implemented with a notification queue, such as the notification queue 1014, in place of the user device 1402. For example, the notification queue may transmit a list of recommendation IDs for retrieval and pushing of the recommendations. The personalization service 1404 similarly interacts with the recommendation engine 1406 and the user behavior engine 1408 to return ranked and filtered recommendations to the notification queue 1014.
A technical effect of the present disclosure includes the automatic generation of a recommendation based on disparate data from a plurality of data sources. Another technical effect of the present disclosure includes the output, such as the display, of a healthcare recommendation automatically generated based on a knowledge graph containing data from a plurality of data sources. Yet another technical effect includes the transformation of raw data from a plurality of data sources into a unified knowledge graph that links information from different data sources of the plurality of data sources. Another technical effect of the present disclosure includes the automatic selection of a healthcare recommendation from a plurality of healthcare recommendations based on user behavior.
In one embodiment, a method comprises determining, with a processor, healthcare insights based on a knowledge graph constructed from data from a heterogeneous plurality of data sources, generating, with the processor and a machine learning model, a plurality of healthcare recommendations for a user based on the healthcare insights, selecting, with the processor, at least one healthcare recommendation from the plurality of healthcare recommendations based on user behavior; and outputting, to a user device for display to the user, the at least one healthcare recommendation.
In a first example of the method, the method further comprises predicting the user behavior based on at least one machine learning model trained on user behavior for a plurality of users. In a second example of the method optionally including the first example, the method further comprises generating the plurality of healthcare recommendations for the user in a batch process responsive to data updates relating to one or more of the user, the heterogeneous plurality of data sources, and the user behavior. In a third example of the method optionally including one or more of the first and second examples, the plurality of healthcare recommendations are generated during offline computations, and the method further comprises storing the plurality of healthcare recommendations in a recommendation database. In a fourth example of the method optionally including one or more of the first through third examples, the method further comprises selecting the at least one healthcare recommendation and outputting the at least one healthcare recommendation in real-time responsive to the user interacting with a network service. In a fifth example of the method optionally including one or more of the first through fourth examples, the method further comprises selecting the at least one healthcare recommendation and outputting the at least one healthcare recommendation in real-time responsive to a network service retrieving the plurality of recommendations for the user according to a notification queue. In a sixth example of the method optionally including one or more of the first through fifth examples, selecting the at least one healthcare recommendation comprises ranking and filtering the plurality of healthcare recommendations based on the user behavior.
In another embodiment, a computer-readable storage medium includes an executable program stored thereon, the program configured to cause a computer processor to: determine healthcare insights based on a knowledge graph constructed from data from a heterogeneous plurality of data sources; generate, with a machine learning model, a plurality of healthcare recommendations for a user based on the healthcare insights; select at least one healthcare recommendation from the plurality of healthcare recommendations based on user behavior; and output, to a user device for display to the user, the at least one healthcare recommendation.
In a first example of the computer-readable storage medium, the program is further configured to cause the computer processor to predict the user behavior based on at least one machine learning model trained on user behavior for a plurality of users. In a second example of the computer-readable storage medium optionally including the first example, the program is further configured to cause the computer processor to generate the plurality of healthcare recommendations for the user in a batch process responsive to data updates relating to one or more of the user, the heterogeneous plurality of data sources, and the user behavior. In a third example of the computer-readable storage medium optionally including one or more of the first and second examples, the plurality of healthcare recommendations are generated during offline computations, and the program is further configured to cause the computer processor to store the plurality of healthcare recommendations in a recommendation database. In a fourth example of the computer-readable storage medium optionally including one or more of the first through third examples, the program is further configured to cause the computer processor to select the at least one healthcare recommendation and output the at least one healthcare recommendation in real-time responsive to the user interacting with a network service. In a fifth example of the computer-readable storage medium optionally including one or more of the first through fourth examples, the program is further configured to cause the computer processor to select the at least one healthcare recommendation and output the at least one healthcare recommendation in real-time responsive to a network service retrieving the plurality of recommendations for the user according to a notification queue. In a sixth example of the computer-readable storage medium optionally including one or more of the first through fifth examples, the program is further configured to cause the computer processor to select the at least one healthcare recommendation by ranking and filtering the plurality of healthcare recommendations based on the user behavior. In a seventh example of the computer-readable storage medium optionally including one or more of the first through sixth examples, the program is further configured to cause the computer processor to train knowledge graph embeddings based on the knowledge graph, receive new patient data, and generate the healthcare recommendation based on the knowledge graph embeddings and the new patient data.
In yet another embodiment, a system comprises a user device configured for a user, and a server communicatively coupled to the user device, the server configured with executable instructions in non-transitory memory of the server that when executed cause a processor of the server to: determine healthcare insights based on a knowledge graph constructed from data from a heterogeneous plurality of data sources; generate, with a machine learning model, a plurality of healthcare recommendations for the user based on the healthcare insights; select at least one healthcare recommendation from the plurality of healthcare recommendations based on user behavior; and output, to the user device for display to the user, the at least one healthcare recommendation.
In a first example of the system, the server is further configured with executable instructions in non-transitory memory of the server that when executed cause the processor of the server to predict the user behavior based on at least one machine learning model trained on user behavior for a plurality of users. In a second example of the system optionally including the first example, the server is further configured with executable instructions in non-transitory memory of the server that when executed cause the processor of the server to generate the plurality of healthcare recommendations for the user in a batch process responsive to data updates relating to one or more of the user, the heterogeneous plurality of data sources, and the user behavior. In a third example of the system optionally including one or more of the first and second examples, the server is further configured with executable instructions in non-transitory memory of the server that when executed cause the processor of the server to generate the plurality of healthcare recommendations during offline computations, and store the plurality of healthcare recommendations in a recommendation database. In a fourth example of the system optionally including one or more of the first through third examples, the server is further configured with executable instructions in non-transitory memory of the server that when executed cause the processor of the server to select the at least one healthcare recommendation and output the at least one healthcare recommendation in real-time responsive to the user interacting with a network service of the server.
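As a non-limiting illustration of predicting user behavior with a model trained on behavior pooled across a plurality of users, the following sketch fits a small logistic-regression model by gradient descent on synthetic interaction data. The two features (past click rate and recommendation relevance) and the synthetic labels are illustrative assumptions, not a description of any particular training set or model architecture.

```python
# Minimal sketch of a user-behavior model trained on interaction data pooled
# across many users (logistic regression fit by gradient descent).
import math
import random

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, epochs=200, lr=0.1):
    """samples: list of ((feature1, feature2), label) across a plurality of users."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in samples:
            p = sigmoid(w[0] * x1 + w[1] * x2 + b)
            err = p - y
            w[0] -= lr * err * x1
            w[1] -= lr * err * x2
            b -= lr * err
    return w, b

def predict_engagement(w, b, past_click_rate: float, relevance: float) -> float:
    """Probability that the user engages with a recommendation."""
    return sigmoid(w[0] * past_click_rate + w[1] * relevance + b)

if __name__ == "__main__":
    random.seed(0)
    # Synthetic interactions: engagement is more likely when both features are high.
    data = []
    for _ in range(500):
        click_rate, relevance = random.random(), random.random()
        label = 1 if random.random() < 0.2 + 0.6 * click_rate * relevance else 0
        data.append(((click_rate, relevance), label))
    w, b = train(data)
    print(round(predict_engagement(w, b, 0.8, 0.9), 3))
```

The predicted probability from such a model may then serve as the ranking and filtering signal applied to the stored plurality of healthcare recommendations.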
In one representation, a method comprises constructing, with a processor, a knowledge graph with data from a heterogeneous plurality of data sources, generating, with the processor, healthcare insights from the knowledge graph, and outputting, to a user device for display to a user, a healthcare recommendation based on the healthcare insights.
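As a non-limiting illustration of constructing such a knowledge graph, the sketch below merges records from several hypothetical sources into a single set of directed (head, relation, tail) triples. The record shapes, entity identifiers, and relation names are assumptions for illustration only.

```python
# Minimal sketch of merging records from heterogeneous sources into a single
# directed, typed knowledge graph stored as (head, relation, tail) triples.
from collections import defaultdict
from typing import List, Tuple

Triple = Tuple[str, str, str]

def build_knowledge_graph(patient_records, provider_records, medicine_records) -> List[Triple]:
    triples: List[Triple] = []
    for rec in patient_records:       # e.g., from an EHR extract
        triples.append((rec["patient"], "diagnosed_with", rec["condition"]))
        triples.append((rec["patient"], "prescribed", rec["medication"]))
    for rec in provider_records:      # e.g., from a provider directory
        triples.append((rec["provider"], "treats", rec["condition"]))
    for rec in medicine_records:      # e.g., from a drug reference source
        triples.append((rec["medication"], "indicated_for", rec["condition"]))
    return triples

def neighbors(triples: List[Triple], entity: str):
    """Outgoing edges for one entity, for simple traversal queries."""
    index = defaultdict(list)
    for h, r, t in triples:
        index[h].append((r, t))
    return index[entity]

if __name__ == "__main__":
    kg = build_knowledge_graph(
        [{"patient": "patient:001", "condition": "condition:hypertension",
          "medication": "drug:lisinopril"}],
        [{"provider": "provider:dr_smith", "condition": "condition:hypertension"}],
        [{"medication": "drug:lisinopril", "condition": "condition:hypertension"}],
    )
    print(neighbors(kg, "patient:001"))
```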
In a first example of the method, the heterogeneous plurality of data sources includes a provider data source, a patient data source, a medicine data source, a disease database, and at least one medical ontology data source. In a second example of the method optionally including the first example, the method further comprises training knowledge graph embeddings based on the knowledge graph, receiving new patient data, and generating the healthcare recommendation based on the knowledge graph embeddings and the new patient data. In a third example of the method optionally including one or more of the first and second examples, the method further comprises updating the knowledge graph with user feedback and user behavior regarding the healthcare recommendation, and ranking subsequent healthcare recommendations based on the updated knowledge graph. In a fourth example of the method optionally including one or more of the first through third examples, the healthcare insights include one or more newly-identified edges between entities in the knowledge graph, including one or more of a relationship between a patient and a medication, a relationship between a symptom and a diagnostic code, and a relationship between a healthcare provider and a disease. In a fifth example of the method optionally including one or more of the first through fourth examples, the method further comprises updating the knowledge graph to include entities relating plain language terminology to medical terminology, and performing, with the knowledge graph, a search for one or more of healthcare providers and symptoms responsive to a user query including the plain language terminology.
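As a non-limiting illustration of the second and fourth examples above, the following sketch trains simple translation-style (TransE-like) knowledge graph embeddings and scores a candidate edge for a newly observed patient. The disclosure does not prescribe a particular embedding method, so this specific model, the simplified training objective without negative sampling, and the toy triples are assumptions made purely for illustration.

```python
# Illustrative sketch only: a TransE-style translation model (h + r ≈ t) stands in
# for the knowledge graph embeddings, trained with a simplified loss
# (positive triples only, no negative sampling) on tiny vectors.
import math
import random

DIM = 8
random.seed(0)

def vec() -> list:
    return [random.uniform(-0.5, 0.5) for _ in range(DIM)]

def train_embeddings(triples, epochs=300, lr=0.05):
    ent = {e: vec() for t in triples for e in (t[0], t[2])}
    rel = {r: vec() for _, r, _ in triples}
    for _ in range(epochs):
        for h, r, t in triples:
            # Gradient step pulling h + r toward t (simplified TransE objective).
            for i in range(DIM):
                diff = ent[h][i] + rel[r][i] - ent[t][i]
                ent[h][i] -= lr * diff
                rel[r][i] -= lr * diff
                ent[t][i] += lr * diff
    return ent, rel

def score(ent, rel, h, r, t) -> float:
    """Higher is more plausible: negative distance between h + r and t."""
    return -math.sqrt(sum((ent[h][i] + rel[r][i] - ent[t][i]) ** 2 for i in range(DIM)))

if __name__ == "__main__":
    kg = [
        ("patient:001", "diagnosed_with", "condition:hypertension"),
        ("patient:001", "prescribed", "drug:lisinopril"),
        ("drug:lisinopril", "indicated_for", "condition:hypertension"),
        ("patient:002", "diagnosed_with", "condition:hypertension"),
    ]
    ent, rel = train_embeddings(kg)
    # New patient data: patient:002 shares the diagnosis; score a candidate edge
    # (a possible newly-identified relationship between a patient and a medication).
    print(score(ent, rel, "patient:002", "prescribed", "drug:lisinopril"))
```

Highly scored candidate edges of this kind may be surfaced as healthcare insights or used to generate a recommendation for the new patient.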
In another representation, a computer-readable storage medium includes an executable program stored thereon, the program configured to cause a computer processor to: construct a knowledge graph with data from a heterogeneous plurality of data sources; generate healthcare insights from the knowledge graph; and output, to a user device for display to a user, a healthcare recommendation based on the healthcare insights.
In a first example of the computer-readable storage medium, the heterogeneous plurality of data sources includes a provider data source, a patient data source, a medicine data source, a disease database, and at least one medical ontology data source. In a second example of the computer-readable storage medium optionally including the first example, the program is further configured to cause the computer processor to train knowledge graph embeddings based on the knowledge graph, receive new patient data, and generate the healthcare recommendation based on the knowledge graph embeddings and the new patient data. In a third example of the computer-readable storage medium optionally including one or more of the first and second examples, the program is further configured to cause the computer processor to update the knowledge graph with user feedback and user behavior regarding the healthcare recommendation, and rank subsequent healthcare recommendations based on the updated knowledge graph. In a fourth example of the computer-readable storage medium optionally including one or more of the first through third examples, the healthcare insights include one or more newly-identified edges between entities in the knowledge graph, including one or more of a relationship between a patient and a medication, a relationship between a symptom and a diagnostic code, and a relationship between a healthcare provider and a disease. In a fifth example of the computer-readable storage medium optionally including one or more of the first through fourth examples, the program is further configured to cause the computer processor to update the knowledge graph to include entities relating plain language terminology to medical terminology, and perform, with the knowledge graph, a search for one or more of healthcare providers and symptoms responsive to a user query including the plain language terminology.
In yet another representation, a system comprises a user device configured for a user, and a server communicatively coupled to the user device, the server configured with executable instructions in non-transitory memory of the server that when executed cause a processor of the server to: construct a knowledge graph with data from a heterogeneous plurality of data sources; generate healthcare insights from the knowledge graph; and output, to the user device for display to the user, a healthcare recommendation based on the healthcare insights.
In a first example of the system, the heterogeneous plurality of data sources includes a provider data source, a patient data source, a medicine data source, a disease database, and at least one medical ontology data source, the heterogeneous plurality of data sources communicatively coupled to the server via a network. In a second example of the system optionally including the first example, the server is further configured with executable instructions in non-transitory memory of the server that when executed cause a processor of the server to train knowledge graph embeddings based on the knowledge graph, receive new patient data, and generate the healthcare recommendation based on the knowledge graph embeddings and the new patient data. In a third example of the system optionally including one or more of the first and second examples, the server is further configured with executable instructions in non-transitory memory of the server that when executed cause a processor of the server to update the knowledge graph with user feedback and user behavior regarding the healthcare recommendation, and rank subsequent healthcare recommendations based on the updated knowledge graph. In a fourth example of the system optionally including one or more of the first through third examples, the healthcare insights include one or more newly-identified edges between entities in the knowledge graph, including one or more of a relationship between a patient and a medication, a relationship between a symptom and a diagnostic code, and a relationship between a healthcare provider and a disease. In a fifth example of the system optionally including one or more of the first through fourth examples, the server is further configured with executable instructions in non-transitory memory of the server that when executed cause a processor of the server to update the knowledge graph to include entities relating plain language terminology to medical terminology, and perform, with the knowledge graph, a search for one or more of healthcare providers and symptoms responsive to a user query including the plain language terminology. In a sixth example of the system optionally including one or more of the first through fifth examples, the server is further configured with executable instructions in non-transitory memory of the server that when executed cause a processor of the server to transmit search results of the search obtained based on the knowledge graph to the user device for display to the user in a graphical user interface. In a seventh example of the system optionally including one or more of the first through sixth examples, the server is further configured with executable instructions in non-transitory memory of the server that when executed cause a processor of the server to iteratively update the knowledge graph with new and additional data from the heterogeneous plurality of data sources.
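As a non-limiting illustration of the plain-language search of the fifth and sixth examples above, the sketch below maps lay terms in a user query to medical-terminology entities in the graph, collects related providers and symptoms, and serializes the results for display in a graphical user interface. The synonym map, relation names, and JSON payload shape are assumptions made for illustration only.

```python
# Illustrative sketch of the plain-language search path: map lay terms in a user
# query onto medical terminology nodes in the knowledge graph, traverse edges to
# find related providers and symptoms, and serialize results for a GUI.
import json

# Plain-language terminology linked to medical terminology in the graph.
PLAIN_TO_MEDICAL = {
    "high blood pressure": "condition:hypertension",
    "heart doctor": "specialty:cardiology",
}

# Tiny knowledge-graph fragment as (head, relation, tail) triples.
KG = [
    ("provider:dr_smith", "treats", "condition:hypertension"),
    ("condition:hypertension", "has_symptom", "symptom:headache"),
    ("provider:dr_lee", "has_specialty", "specialty:cardiology"),
]

def search(query: str) -> str:
    """Resolve plain-language terms, collect connected providers and symptoms,
    and return a JSON payload suitable for display in a graphical user interface."""
    query = query.lower()
    results = {"providers": [], "symptoms": []}
    for plain, medical in PLAIN_TO_MEDICAL.items():
        if plain not in query:
            continue
        for h, r, t in KG:
            if t == medical and r in ("treats", "has_specialty") and h.startswith("provider:"):
                results["providers"].append(h)
            if h == medical and r == "has_symptom":
                results["symptoms"].append(t)
    return json.dumps(results)

if __name__ == "__main__":
    print(search("Find a doctor for high blood pressure"))
```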
As used herein, an element or step recited in the singular and preceded with the word “a” or “an” should be understood as not excluding plural of said elements or steps, unless such exclusion is explicitly stated. Furthermore, references to “one embodiment” of the present invention are not intended to be interpreted as excluding the existence of additional embodiments that also incorporate the recited features. Moreover, unless explicitly stated to the contrary, embodiments “comprising,” “including,” or “having” an element or a plurality of elements having a particular property may include additional such elements not having that property. The terms “including” and “in which” are used as the plain-language equivalents of the respective terms “comprising” and “wherein.” Moreover, the terms “first,” “second,” and “third,” etc. are used merely as labels, and are not intended to impose numerical requirements or a particular positional order on their objects.
This written description uses examples to disclose the invention, including the best mode, and also to enable a person of ordinary skill in the relevant art to practice the invention, including making and using any devices or systems and performing any incorporated methods. The patentable scope of the invention is defined by the claims, and may include other examples that occur to those of ordinary skill in the art. Such other examples are intended to be within the scope of the claims if they have structural elements that do not differ from the literal language of the claims, or if they include equivalent structural elements with insubstantial differences from the literal languages of the claims.
The present application claims priority to U.S. Provisional Application No. 63/068,959, entitled “SYSTEMS AND METHODS FOR GENERATING AND DELIVERING PERSONALIZED HEALTHCARE INSIGHTS”, and filed on Aug. 21, 2020. The entire contents of the above-listed application are hereby incorporated by reference for all purposes.
Number | Date | Country
---|---|---
63/068,959 | Aug. 21, 2020 | US