1. Technical Field
This application relates generally to data-driven analytics for case management tools used in the healthcare industry.
2. Brief Description of the Related Art
Today, many healthcare organizations, such as health insurance plans, managed care organizations, integrated delivery networks, and the like, hire case managers (or nurses) to navigate and guide patients so as to improve the quality of their care delivery and lower the costs of their care paths. These case managers take a list of patients, call or message them, and track their statuses, typically using spreadsheets or Electronic Health Record (EHR) systems. Given the complex nature of patients' conditions and care paths, the problem of addressing the specific needs of patients has been extremely challenging. Moreover, the current practice of case management has resulted in alarm fatigue for many patients, decreasing the customer satisfaction level for the health plan. Case management tools have often been incorporated into Population Health Management Systems (PHMS), and representative case management systems include, for example, Athena Health's athenaCoordinator, AllScripts Care Coordination tool, and many others.
Forecasting next medical events and costs is an extremely challenging task. Traditionally, predicting major medical events has involved aggregating medical and pharmacy claims to feed into either generalized linear models or tree-based prediction models. The performance of such approaches, however, often is far from being useful in practice. The reasons for this poor performance are severalfold. In the first instance, most medical and pharmacy claims are item- (or service-) based, as opposed to being episode-based. Further, when such claims are preprocessed to feed into predictive algorithms, episodic information is often lost in the process. For example, although it is easy to derive certain features (such as how many particular services have been performed during the period of observation), there is not always sufficient information to determine what types of events (or episodes) have occurred. Another problem is that the patient population is often heterogeneous. For example, diabetic and non-diabetic patients can react very differently to a certain type of intervention. Furthermore, people under an HMO plan often go through different types of care paths than, say, people under a PPO plan. Yet another problem is that predictive models are typically built based on an aggregated snapshot of data, e.g., claims from a certain period of time (e.g., an entire year) might be aggregated in order to build static features for predicting medical events. These coarse aggregations, however, mask how those features are changing over time.
These and other deficiencies in prior art medical event forecasting are addressed by the techniques of this disclosure.
Accordingly, this disclosure provides for improvements to case management tools and systems through the use of a high-precision event forecast engine, which greatly enhances the case management process. The forecast engine preferably leverages a flexible and extensible form of data structure that combines diverse formats of claims (e.g., both medical and pharmacy) and that highlights “episodes” rather than items, where an episode is a collection of claims that happened within a specified time window. An “episode array” is an array of episodes that are ordered by time. In this approach, disparate claims are combined and summarized by member (patient), and the data structure makes it easier to discover episodic progression of medical events. As a further aspect, and to address the problem of patient heterogeneity that can bias results, the forecast engine preferably works with respect to “cohorts” or groups. In this aspect, the patient population is divided into multiple cohorts, where the definitions of cohorts typically involve various types of information, such as comorbidity conditions, geographic information, types of plans, logistical information, and combinations thereof. Cohorts preferably are defined using both static and dynamic features. “Static features” are features that remain relatively stable over time, such as gender, date of birth, address, and eligibility information. “Dynamic features” are features that change their values based on observation periods. These features include, for example, chronic conditions, medication profiles, risk scores, and utilization patterns. Preferably, multiple definitions of cohorts are implemented, and optimal cohort definitions are then estimated through iterative validations. In particular, the forecast engine uses rolling time window processing to extract dynamic features in the data sets, and then one or more machine learning algorithms are applied to the extracted data. Preferably, predictions are blended to reduce bias, and cohort-wise machine learning models preferably learn on dynamic features, which are then put together for final predictions. Subsequently, a validation step is applied when outcomes are later observed. The results of the validation operation may then be used to update the cohort definitions as well as the model parameters for the machine learning algorithms.
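By way of illustration only, the following Python sketch shows one way that cohort definitions combining static and dynamic features might be represented and applied. The field names (e.g., `plan_type`, `chronic_conditions`, `risk_score`) and the thresholds are hypothetical and are not prescribed by this disclosure.

```python
# Hypothetical sketch: assigning a member to cohorts based on static features
# (plan type) and dynamic features tied to an observation period (chronic
# conditions, risk score). All field names and cutoffs are illustrative.

def assign_cohorts(member, observation_period):
    """Return the list of cohort labels that a member falls into."""
    cohorts = []

    # Static features: remain relatively stable over time.
    if member["plan_type"] == "HMO":
        cohorts.append("hmo")
    elif member["plan_type"] == "PPO":
        cohorts.append("ppo")

    # Dynamic features: depend on the observation period.
    dynamic = member["dynamic_features"].get(observation_period, {})
    if "diabetes" in dynamic.get("chronic_conditions", []):
        cohorts.append("diabetic")
    if dynamic.get("risk_score", 0.0) >= 0.8:
        cohorts.append("high_risk")

    return cohorts


member = {
    "member_id": "M001",
    "plan_type": "HMO",
    "dynamic_features": {
        "2015-Q1": {"chronic_conditions": ["diabetes"], "risk_score": 0.85},
    },
}

print(assign_cohorts(member, "2015-Q1"))  # ['hmo', 'diabetic', 'high_risk']
```

Because cohort membership is recomputed per observation period, the same member may move between cohorts over time, which is what allows the iterative validation step to refine the cohort definitions.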
The predictions generated by the forecast engine provide for an improved case management system that facilitates case management on a per-patient basis. In particular, the machine learning (ML) algorithms predict, for example, which events are coming next, when those events will happen, and the like, and they enable the case manager to obtain or provide other useful care information, e.g., to simulate risk trajectories for different care path options, prioritize a list of patients that need help, track prior interventions (phone call, text message, mail, etc.), share intervention histories with different case managers, analyze the effectiveness of the interventions, and analyze the performance of case managers.
The foregoing has outlined some of the more pertinent features of the subject matter. These features should be construed to be merely illustrative. Many other beneficial results can be attained by applying the disclosed subject matter in a different manner or by modifying the subject matter as will be described.
For a more complete understanding of the subject matter and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
By way of background, the techniques of this disclosure may be implemented in a case management system or tool. A typical case management system is a collection of various computing machines or infrastructure that is network-accessible, and that supports various display interfaces. Further details of a case management system user interface that is enhanced by the forecast engine of this disclosure are provided below.
As is well-known, case management is a managed care technique within the health care coverage system of the United States. Typically, and from a health care perspective, case management is defined as a collaborative process of assessment, planning, facilitation, care coordination, evaluation, and advocacy for options and services to meet an individual's and family's comprehensive health needs through communication and available resources to promote quality, cost-effective outcomes.
A representative case management tool 100 is depicted in
As depicted, in addition to data provided from the document database 206, the event forecast engine 204 receives various input data including medical/pharmacy claim data from a medical/pharmacy claim database 208, as well as patient intervention data 210 that is provided from the case management tool. This data may be input to the event forecast engine, or it may be provided programmatically (via an API), or in any other convenient manner. Medical claims typically include claims from inpatient, outpatient, skilled nursing, home health, and other non-pharmacy events. Pharmacy claims typically include prescription medication fill events and typically have service periods of 30, 60, or 90 days. Typically, medical and pharmacy claims are grouped separately. These data may be provided from other sources, and these data sources may comprise part of the event forecast engine.
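A minimal sketch of the two claim types, assuming simplified, hypothetical record layouts (the actual claim schemas are not specified here), is as follows.

```python
# Illustrative claim records; field names are hypothetical and simplified.
claims = [
    {"member_id": "M001", "type": "medical", "setting": "inpatient",
     "date": "2015-03-01", "procedure": "ICD9-81.54"},
    {"member_id": "M001", "type": "pharmacy", "fill_date": "2015-03-05",
     "drug": "metformin", "days_supply": 90},
]

# Medical and pharmacy claims are typically grouped separately before being
# merged into the member-level view described below.
medical = [c for c in claims if c["type"] == "medical"]
pharmacy = [c for c in claims if c["type"] == "pharmacy"]
print(len(medical), len(pharmacy))
```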
In general, and as will be seen, the event forecast engine operates in one or more domains (or use cases). A primary application is patient targeting. In this use case, the engine is used to identify a subset of a population who will likely go through a certain medical event, e.g., surgery or disease, at some point in the future. Preferably, the engine provides different forecasting results calibrated to the user's demographic and medical history information. In this use case, the engine forecasts the next events and the “time” of those events.
The data transformation component (or “data transformer”) 212 preferably takes multiple sources of claims data and member demographics data to construct a member-level view of the data. Extracting a member-level view of the data is non-trivial. Each claim source typically has a different list of variables, and even the definitions of the variables are different. For example, inpatient claims may use ICD-9 codes to record procedures that are performed, while outpatient claims may use HCPCS codes. Moreover, inpatient claims may have additional variables, such as length of stay and Diagnosis Related Group codes, whereas outpatient claims have no such variables. Putting multiple sources of claims into a single table can be difficult, as a system would need to create an unmanageable number of variables (or columns in a table view); indeed, even after doing so, analyzing such data may be quite difficult. To address these issues, the data transformer 212 merges such multiple sources to construct a member-level view of the data, preferably by using a flexible data representation (such as a JSON format), together with the notion of an “episode” that combines multiple claim items into one.
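The following is one possible sketch of how such a data transformer might merge heterogeneous claim sources into a member-level document. The record layouts, source names, and codes are assumptions made for illustration; the point is that source-specific variables are preserved inside a flexible document rather than forced into one wide table.

```python
import json
from collections import defaultdict

# Hypothetical records from two claim sources with different variables:
# inpatient claims carry an ICD-9 procedure code, length of stay, and DRG,
# while outpatient claims carry an HCPCS code.
inpatient_claims = [
    {"member_id": "M001", "date": "2015-03-01", "icd9_proc": "81.54",
     "length_of_stay": 3, "drg": "470"},
]
outpatient_claims = [
    {"member_id": "M001", "date": "2015-02-10", "hcpcs": "G0283"},
]

def to_member_view(*sources):
    """Merge claim sources into one member-level document per member."""
    members = defaultdict(lambda: {"claims": []})
    for source_name, records in sources:
        for rec in records:
            doc = members[rec["member_id"]]
            doc["member_id"] = rec["member_id"]
            # Keep the source-specific variables intact rather than forcing
            # every source into a single wide table of columns.
            doc["claims"].append({"source": source_name, **rec})
    return members

views = to_member_view(("inpatient", inpatient_claims),
                       ("outpatient", outpatient_claims))
print(json.dumps(views["M001"], indent=2))
```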
Thus, the event forecast engine 204 operates with data that is based on “episodes,” as opposed to events. As used herein, an “episode” is a collection of claims that happen within a specified time window, e.g., a day, three days, etc. The time window is configurable. An “episode array” is an array of episodes that are ordered by time. Preferably, the document database 206 associated with the event forecast engine organizes data in a manner that can take advantage of the data points that comprise an episode. While multiple tables and complex relations (e.g., those provided in relational database management systems) often are necessary to represent different types of claims, organizing data relationally obfuscates episodic information. Because the event forecast engine preferably operates over episode data, the document database 206 preferably is structured to contain different types of events in a concise view. A preferred implementation of the document database 206 thus is a class of NoSQL databases and, even more preferably, one that stores data in so-called JSON (JavaScript Object Notation) formats.
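A minimal sketch of episode construction, assuming a simple gap-based grouping rule (claims whose dates fall within the configurable window of one another form one episode), is shown below. The grouping heuristic and field names are illustrative, not the disclosed implementation.

```python
from datetime import date, timedelta

def build_episode_array(claims, window_days=3):
    """Group claims that occur within `window_days` of each other into
    episodes, and return the episodes ordered by time (an episode array)."""
    claims = sorted(claims, key=lambda c: c["date"])
    episodes, current = [], []
    for claim in claims:
        if current and (claim["date"] - current[-1]["date"]) > timedelta(days=window_days):
            episodes.append({"start": current[0]["date"], "claims": current})
            current = []
        current.append(claim)
    if current:
        episodes.append({"start": current[0]["date"], "claims": current})
    return episodes

claims = [
    {"date": date(2015, 3, 1), "code": "ICD9-81.54"},
    {"date": date(2015, 3, 2), "code": "HCPCS-G0283"},
    {"date": date(2015, 4, 15), "code": "NDC-metformin"},
]
print(len(build_episode_array(claims, window_days=3)))  # 2 episodes
```

The resulting episode array serializes naturally to a JSON document per member, which is what makes a document-oriented (NoSQL) store a convenient fit.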
JSON is a lightweight, text-based, language-independent data-interchange format that is used with Asynchronous JavaScript and XML (collectively referred to as AJAX), which are well-known technologies that allow user interaction with Web pages to be decoupled from the Web browser's communications with a Web server. AJAX is built upon dynamic HTML (DHTML) technologies including: JavaScript, a scripting language commonly used in client-side Web applications; the Document Object Model (DOM), a standard object model for representing HTML or XML documents; and Cascading Style Sheets (CSS), a style sheet language used to describe the presentation of HTML documents. In AJAX, client-side JavaScript updates the presentation of a Web page by dynamically modifying a DOM tree and a style sheet. In addition, asynchronous communication, enabled by additional technologies, allows dynamic updates of data without the need to reload the entire Web page. These additional technologies include an application programming interface (API) that allows client-side JavaScript to make HTTP connections to a remote server and to exchange data, and JSON. JSON syntax is a text format defined with a collection of name/value pairs and an ordered list of values. By structuring the document database to support JSON formats, the event forecast system 202 is organized to interact with the data sources programmatically and over typical network request-response flows.
The document database includes various data structures that are populated by the data received from the data sources. Thus, for example,
As can be seen, the member/patient data object 500 combines multiple input sources into a “member-level view,” which as used herein refers to a collection of episodes that are related to a specific member. The merging of multiple data sources into a single data set (such as provided herein) has not been possible or efficient in prior art relational-based schemes, primarily due to the highly disparate nature of the variables used to define each of the data sources. In contrast, by gathering different sources of inputs and summarizing them at the member level as provided herein, data can be stored much more efficiently (in the document database 206) and acted upon by the feature extraction function 216 to identify dynamic features that have real predictive value.
The data objects are shown in
As will be seen, the forecast engine operates generally among numerous cohort definitions (e.g., definitions that include Cohorts A, B, and C) to find a set of optimal cohort definitions 616, preferably (as will be described) by iteratively validating prediction results. To this end, cohort information is first processed by the feature extractor 618. As depicted, preferably there is a feature extractor instance for each cohort, although a single instance may be used. The output of the feature extractor 618 for each cohort 616 is then supplied to one or more machine learning (ML) algorithms 620, and typically there will be multiple ML instances, with one instance per cohort processing thread as depicted. In general, the forecast engine operates as follows. Preferably, cohorts are defined using both static and dynamic features. Preferably, multiple definitions of cohorts are implemented, such as depicted. As will be explained in more detail below, the feature extractor 618 extracts static features, dynamic features, and dynamic features with rolling windows, and then one or more of the machine learning algorithms 620 are applied to the extracted data. Preferably, and to reduce bias, predictions output from the ML algorithms 620 are blended by a blender component 622. In operation, the cohort-wise machine learning models 620 preferably learn on dynamic features, which are then put together by the blender component 622 for final predictions. Subsequently, a validation step 624 is applied (when outcomes are later observed in the future).
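One possible sketch of the cohort-wise model plus blender arrangement is shown below. It assumes scikit-learn as the modeling library and simple averaging as the blending strategy; neither choice is mandated by the disclosure, and the feature dimensions and labels are synthetic placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical per-cohort training data: extracted features (e.g., rolling-
# window utilization counts) and observed event labels for each cohort.
cohort_data = {
    "diabetic":     (np.random.rand(50, 4), np.random.randint(0, 2, 50)),
    "non_diabetic": (np.random.rand(50, 4), np.random.randint(0, 2, 50)),
}

# One model instance per cohort processing thread.
cohort_models = {cohort: LogisticRegression().fit(X, y)
                 for cohort, (X, y) in cohort_data.items()}

def blended_prediction(member_features, member_cohorts):
    """Blend the cohort-wise predictions for one member by simple averaging
    (one possible blending strategy; the disclosure does not fix the method)."""
    scores = [cohort_models[c].predict_proba([member_features])[0, 1]
              for c in member_cohorts if c in cohort_models]
    return float(np.mean(scores)) if scores else None

print(blended_prediction(np.random.rand(4), ["diabetic"]))
```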
As depicted in
Thus, according to the techniques herein, disparate claims preferably are combined and summarized by member. The data formats make it easier to discover the episodic progression of medical events. A collection of member-level JSON documents is stored in a NoSQL database. Static features, dynamic features, and dynamic features with rolling windows are extracted. Dynamic features with rolling windows have strong predictive power. Optimal cohorts are iteratively learned, preferably by updating machine learning model parameters as well as cohort definitions.
Enabling technologies for the machine learning algorithms 620 of the forecast engine include, without limitation, vector autoregressive modeling (e.g., Autoregressive Integrated Moving Average (ARIMA)), state space modeling (e.g., using a Kalman filter), Hidden Markov Models (HMM), recurrent neural network (RNN) modeling, RNN with long short-term memory (LSTM), Random Forests, Generalized Linear Models, Extreme Gradient Boosting, Extreme Random Trees, and others. By applying these modeling techniques, new types of features are extracted, e.g.: model parameters (e.g., coefficients for dynamics, noise variance, etc.), latent states, and predicted values for the next several observation periods. These features have strong predictive power.
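As a hedged illustration of turning model parameters and forecasts into features, the sketch below fits a simple AR(1) model by least squares. This stands in for the richer techniques named above (ARIMA, Kalman filtering, LSTM, etc.) and is not the disclosed implementation; the utilization series is synthetic.

```python
import numpy as np

def ar1_features(series):
    """Fit an AR(1) model by least squares and return model-based features:
    the AR coefficient, the noise variance, and forecasts for the next two
    observation periods."""
    y, x = series[1:], series[:-1]
    phi = float(np.dot(x, y) / np.dot(x, x))             # AR(1) coefficient
    resid = y - phi * x
    noise_var = float(np.var(resid))                     # noise variance
    forecasts = [phi * series[-1], phi**2 * series[-1]]  # next two periods
    return {"ar_coef": phi, "noise_var": noise_var, "forecast": forecasts}

# Hypothetical per-member utilization counts per observation period.
utilization = np.array([2.0, 3.0, 2.0, 4.0, 5.0, 4.0])
print(ar1_features(utilization))
```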
The predictions may be validated using a validation function 818. As described above with respect to
The cohort data extracted by this feature extraction process may then be used to update the member data structure object and, in particular, by adding the extracted cohort(s) as new features to the member object.
The dynamic feature extraction using rolling windows (DFRW) technique is now described in further detail. The technique relies upon the notion that dynamic features change their values depending on observation periods. For example, and as depicted in
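A minimal sketch of the rolling-window idea is given below: the same dynamic feature (here, an episode count) is recomputed over observation windows that slide backward in time, so the model sees how the feature is changing rather than a single aggregate. Window length, step size, and the counted feature are assumptions for illustration only.

```python
from datetime import date, timedelta

def rolling_window_features(episodes, window_days=90, step_days=30, n_windows=4):
    """Compute a dynamic feature (episode count) over rolling observation
    windows that slide backward from the most recent episode."""
    end = max(ep["start"] for ep in episodes)
    features = []
    for i in range(n_windows):
        win_end = end - timedelta(days=i * step_days)
        win_start = win_end - timedelta(days=window_days)
        count = sum(1 for ep in episodes if win_start < ep["start"] <= win_end)
        features.append({"window_end": win_end, "episode_count": count})
    return features

episodes = [{"start": date(2015, m, 1)} for m in (1, 2, 2, 3, 4)]
for f in rolling_window_features(episodes):
    print(f)
```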
Generalizing, the forecast engine extracts two types of signals from the multiple data sources that are supplied, namely, a before-pattern (a temporal pattern that leads to an event of interest), and an after-pattern (a temporal pattern that happens after an event of interest). Preferably, and as has been described, the patterns are customized to cohorts, such as demographic cohorts. For a new patient, both before and after patterns preferably are calibrated to match the new patient's demographic and medical history information. Typically, the patterns have different application domains. For example, before-patterns are used to target or predict a specific group of patients, and after-patterns are used to evaluate future outcomes and utilizations.
As has been described, extracting before-patterns typically involves a multi-step process. The time series data is first aligned, e.g., by setting the event of interest as a reference point. The time series are then clustered based on similarities. In an example scenario, the similarity is calculated based on multiple factors: similarity in previous medical events, similarity in the sequence of medical events, and similarity in the timestamps of medical events. For the extracted clusters, the patterns are generalized, preferably by being expressed as a function of demographic and medical history information. For a new patient, the engine can scan the extracted patterns and find a best match. The engine can also calculate the probability of following this pattern.
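A toy sketch of the alignment and similarity steps follows. The event names, histories, and the particular similarity function are hypothetical; a full implementation would also weigh the order of events, as described above.

```python
def align_to_event(history, event_of_interest):
    """Re-express a member's event history relative to the event of interest
    (time zero), keeping only the events that precede it."""
    anchor = next(t for t, e in history if e == event_of_interest)
    return [(t - anchor, e) for t, e in history if t < anchor]

def similarity(seq_a, seq_b):
    """Toy similarity: shared event types, weighted by how close their
    (relative) timestamps are."""
    score = 0.0
    for ta, ea in seq_a:
        for tb, eb in seq_b:
            if ea == eb:
                score += 1.0 / (1.0 + abs(ta - tb))
    return score

# Hypothetical histories of (day, event) pairs, with "knee_surgery" as the
# event of interest.
h1 = [(0, "pcp_visit"), (30, "mri"), (60, "knee_surgery")]
h2 = [(10, "pcp_visit"), (45, "mri"), (80, "knee_surgery")]
print(similarity(align_to_event(h1, "knee_surgery"),
                 align_to_event(h2, "knee_surgery")))
```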
Extracting after-patterns also is a multi-step process. The time series data is first aligned, e.g., by setting the event of interest as a reference point (but this time right to left). The next step is removing noisy patterns (or non-causal patterns). The non-causal patterns are identified by comparing the likelihoods of after-patterns between the target population and the overall population. The extracted patterns are clustered, e.g., into a tree structure. Similar events (similar in both time and event type) may be grouped as a branch, and this process is recursively applied, resulting in a tree diagram. After constructing a tree, the strengths of branches and event times are expressed as a function of a patient's demographic and medical history information.
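The noise-removal step can be sketched as a likelihood comparison between the target and overall populations; the pattern labels, counts, and lift threshold below are assumptions for illustration.

```python
def filter_noisy_after_patterns(target_counts, overall_counts,
                                target_n, overall_n, min_lift=1.5):
    """Keep after-patterns whose likelihood in the target population is
    substantially higher than in the overall population; patterns roughly
    as common everywhere are treated as non-causal noise."""
    kept = {}
    for pattern, cnt in target_counts.items():
        p_target = cnt / target_n
        p_overall = overall_counts.get(pattern, 0) / overall_n
        lift = p_target / p_overall if p_overall > 0 else float("inf")
        if lift >= min_lift:
            kept[pattern] = lift
    return kept

# Hypothetical pattern frequencies observed after a knee surgery event.
target = {"pt_visit": 80, "flu_shot": 20}      # among 100 target members
overall = {"pt_visit": 500, "flu_shot": 4000}  # among 20,000 members overall
print(filter_noisy_after_patterns(target, overall, target_n=100, overall_n=20000))
```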
An improved case management system uses the above-described forecast engine as a predictive processing component. As noted above, the notion of case management does not imply or require any such prediction; thus, the inclusion of the forecast engine provides an improvement to another technology (the case management system or tool itself) or technical field (automated case management). In particular, the forecast engine provides for an improvement to the case management system by combining the multiple disparate input sources into a member-level view from which feature extraction is performed, preferably using the rolling window technique. The outputs of the modeling include various additional features (such as latent representations of the data, predicted values for future events, and model parameters (such as transition probabilities, drift momentum, etc.)), which may then be used by the case management system.
As noted, the forecast engine is used to provide for an improved case management system. It is assumed that the forecast engine is executing or has been executed to generate predictions, additional model parameters, etc., as has been described. The forecast engine may operate continuously, periodically, in response to a condition or occurrence, or otherwise. As noted, the engine transforms multiple sources into a member-level view and extracts dynamic features using rolling time windows to provide enhanced predictions. In one aspect, the case management system uses such information to enable case managers (or other administrators) to generate a “chase list” of patients who are anticipated (predicted) to have future events.
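One minimal sketch of chase-list generation from the engine's per-member outputs is shown below. The output schema, event names, and threshold are hypothetical; the only point illustrated is ordering members by predicted risk for the coming prediction period.

```python
# Hypothetical per-member output from the forecast engine: the most likely
# next event and its probability score for the coming prediction period.
forecasts = [
    {"member_id": "M001", "next_event": "inpatient_admission", "probability": 0.82},
    {"member_id": "M002", "next_event": "er_visit",            "probability": 0.35},
    {"member_id": "M003", "next_event": "inpatient_admission", "probability": 0.64},
]

def build_chase_list(forecasts, threshold=0.5):
    """Order members by predicted risk and keep those above a threshold,
    giving the case manager a prioritized list of patients to contact."""
    at_risk = [f for f in forecasts if f["probability"] >= threshold]
    return sorted(at_risk, key=lambda f: f["probability"], reverse=True)

for row in build_chase_list(forecasts):
    print(row["member_id"], row["next_event"], row["probability"])
```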
A representative case manager user interface is depicted in
It is not required that the case management UI be web-based. The case manager tool may operate with a mobile device or other Internet-of-Things (IoT) appliance or device to provide the case management user interface. Thus, for example, a case management system may provide a mobile app that is installed on a case manager's or member's mobile device and by which one or more information displays may then be provided based on the information generated by the forecast engine.
The techniques of this disclosure provide significant advantages. They enable case management tools and systems to operate more efficiently and accurately by providing much more useful information to the case manager or other user. Using the member-level view and the feature extraction based on rolling windows, the system predicts major events (e.g., inpatient events) with high precision. The engine outputs a list of likely events in a next prediction period (e.g., with probability scores). By ordering members by these scores, the case manager can stratify at-risk members. Health plans can use this system to prevent potential hospitalizations, e.g., by contacting and proactively managing risky members, and thus to prevent readmissions. By using facility-based or physician-based cohorts, the system can also simulate and compare performance, e.g., via complication rate simulation. Such cohorts also can be used to find better matches between patients and physicians (or facilities).
The forecast engine can be applied to various domains. As noted, one direct application is patient targeting. Using the engine, the case manager can identify a subset of the population who will likely go through a certain medical event, e.g., surgery or disease. The engine also provides different forecasting results calibrated to the user's demographic and medical history information, and it forecasts the next events and the “time” of those events.
The above identified use cases are merely representative.
Each above-described process preferably is implemented in computer software as a set of program instructions executable in one or more processors, as a special-purpose machine.
Representative machines on which the subject matter herein is provided may be Intel Pentium-based computers running a Linux or Linux-variant operating system and one or more applications to carry out the described functionality. One or more of the processes described above are implemented as computer programs, namely, as a set of computer instructions, for performing the functionality described.
While the above describes a particular order of operations performed by certain embodiments of the invention, it should be understood that such order is exemplary, as alternative embodiments may perform the operations in a different order, combine certain operations, overlap certain operations, or the like. References in the specification to a given embodiment indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic.
While the disclosed subject matter has been described in the context of a method or process, the subject matter also relates to apparatus for performing the operations herein. This apparatus may be a particular machine that is specially constructed for the required purposes, or it may comprise a computer otherwise selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including an optical disk, a CD-ROM, a magneto-optical disk, a read-only memory (ROM), a random access memory (RAM), a magnetic or optical card, or any type of media suitable for storing electronic instructions, each coupled to a computer system bus. The functionality may be built into the case management or forecast engine code, or it may be executed as an adjunct to that code. A machine implementing the techniques herein comprises a processor and computer memory holding instructions that are executed by the processor to perform the above-described methods.
While given components of the system have been described separately, one of ordinary skill will appreciate that some of the functions may be combined or shared in given instructions, program sequences, code portions, and the like.
Preferably, the functionality is implemented in an application layer solution, although this is not a limitation, as portions of the identified functions may be built into an operating system or the like.
The functionality may be implemented with any application layer protocols, or any other protocol having similar operating characteristics.
There is no limitation on the type of computing entity that may implement the client-side or server-side of the connection. Any computing entity (system, machine, device, program, process, utility, or the like) may act as the client or the server.
Any application or functionality described herein may be implemented as native code, by providing hooks into another application, by facilitating use of the mechanism as a plug-in, by linking to the mechanism, and the like.
More generally, the techniques described herein are provided using a set of one or more computing-related entities (systems, machines, processes, programs, libraries, functions, or the like) that together facilitate or provide the functionality described above. In a typical implementation, a representative machine on which the software executes comprises commodity hardware, an operating system, an application runtime environment, and a set of applications or processes and associated data that provide the functionality of a given system or subsystem. As described, the functionality may be implemented in a standalone machine, or across a distributed set of machines.
As noted, the functionality may be co-located, or various parts/components may be separated and run as distinct functions, in one or more locations (over a distributed network).
The techniques herein generally provide for the above-described improvements to a technology or technical field (namely, event-based forecasting methods), as well as the specific technological improvements to other industrial/technological processes (e.g., existing case management systems, tools and devices).
Related application data: Application No. 62144062, Apr 2015, US.