The present techniques provide systems and methods for enhancing and/or identifying an enhanced set of patterns for the identification of clinical conditions and/or diagnoses within a patient population, as well as sets of less enhanced patterns that provide a continuum of several statistical metrics with which patient data sets and/or subsets of patient data sets can be characterized.
For purposes of summarizing the disclosure, certain aspects, advantages and novel features of the present techniques have been described herein. It is to be understood that not necessarily all such advantages can be achieved in accordance with any particular embodiment of the techniques disclosed herein. Thus, the techniques disclosed herein can be embodied or carried out in a manner that achieves or enhances one advantage or group of advantages as taught herein without necessarily achieving other advantages as can be taught or suggested herein.
Various embodiments will be described hereinafter with reference to the accompanying drawings. These embodiments are illustrated and described by example only, and are not intended to limit the scope of the disclosure. In the drawings, similar elements may have similar reference numerals.
The relationships of time series patterns described herein may also be detected and identified using time series objectification pattern analysis technologies as discussed in U.S. Pat. No. 7,081,095, U.S. Pat. No. 7,758,503, and U.S. patent application Ser. No. 13/102,307, the entire contents of each of which are incorporated by reference as if completely disclosed herein. The disclosed techniques may be applied to the pattern analysis of these disclosures or to other analysis systems.
In some embodiments, patterns can be encapsulated within a software element. A software element, as referred to herein, can include computer instructions written using any suitable human-readable computer language. In some examples, the software element can be written as a script using a Domain Specific Language (DSL), such as the Pattern Definition Language (also referred to herein as PDL), among others. In some embodiments, the software elements can be applied to a data region (also referred to herein as a region). In some embodiments, a data region may include a set of time series related to physiological data from a patient within a healthcare facility, and a start time and end time corresponding with the set of time series. PDL scripts (also referred to herein as scripts) 102 of
Once a script 102 is executed with data from a particular region, the system can determine whether any occurrences were identified for the anchor occurrence definition. For example, the anchor occurrence definition may include various physiological factors associated with clinical conditions, such as sepsis, among others. If any occurrences are identified based on the anchor occurrence definition, the script 102 is considered to be positive for the given region. If there are not any occurrences identified based on the anchor occurrence definition, the script 102 is considered negative for the given region.
In some embodiments, classifying regions as positive or negative is part of a binary classification test that measures sensitivity and specificity. Sensitivity, as described herein, refers to the proportion of actual positive occurrences for a particular condition that are correctly identified as positive. For example, sensitivity can measure the percentage of sick people who are correctly identified as having a particular condition. Specificity, as described herein, refers to the proportion of actual negative occurrences that are correctly identified as negative. For example, specificity may refer to the percentage of healthy people who are correctly identified as not having a particular condition. This binary classification test provides the ability to match the predictive power of an executed script 102 against a known “gold standard” to determine sensitivity and specificity. A gold standard can refer to a diagnostic test or benchmark that produces the most accurate results under certain conditions.
In one embodiment, a tag is used to mark a region with a user-defined classification. For example, a tag may be applied to a region to indicate that the region is associated with a “Sepsis” case. In some examples, a “Sepsis” tag is created and applied to all regions which correspond with sepsis cases. In one embodiment, a set of regions is tagged for the enhancement process using any suitable number of tags. In some examples, two tags may be used to tag a set of regions for the enhancement process. A marked region universe 106 may include two tagged regions. One tagged region can indicate the target region universe 108, which may include all of the physiological data points indicated with white and black circles in the marked region universe 106. A second tagged region, the known region set 110, can include a subset of the target region universe 108 and is indicated with the black circles that represent physiological data points in the target region universe 108. The known region set 110 can be tagged to be “known” to have a condition which the scripts 102 are attempting to identify.
In some embodiments, the identification of the target region universe 108 and the known region set 110 can enable the identification of the accuracy of a script 102. For example, a script 102 can be executed and the results of the script 102 can be compared with the target region universe 108 and the known region set 110. In some examples, the script 102 may be configured to identify a particular physiological condition, also referred to herein as a target condition. The comparison of the results of the script 102 with the target region universe 108 and the known region set 110 can determine the sensitivity and specificity of the script 102 in relation to identifying the target condition.
In some embodiments, the sensitivity and specificity can be derived for any suitable number of scripts 102 along with four sets of distinguished regions. In some embodiments, the four sets of distinguished regions can include a true positive set (also referred to herein as a TP set), a true negative set (also referred to herein as a TN set), a false positive set (also referred to herein as an FP set), and a false negative set (also referred to herein as an FN set). The TP set can include regions that are identified as positive for the target condition by the PDL script 102 and are also marked as positive in the target region universe 108. The TN set can include regions that are identified as negative for the target condition by the PDL script 102 and are also marked as negative in the target region universe 108. The FP set can include regions identified as positive for the target condition by the PDL script 102, but marked as negative in the target region universe 108. The FN set can include regions identified as negative for the target condition by the PDL script 102, but marked as positive in the target region universe 108.
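In some examples, the derivation of the four sets and the two ratios can be sketched in Python as follows. The sketch is illustrative only: the names `ScriptResult` and `score_script`, the field names, and the region identifiers are assumptions introduced here and are not part of PDL or the PSP Engine 104. The sketch assumes that the regions for which a script is positive, the target region universe, and the known region set are each available as sets of region identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScriptResult:
    """Hypothetical container for a script result (names are assumptions)."""
    script: str
    tp: frozenset
    tn: frozenset
    fp: frozenset
    fn: frozenset

    @property
    def sensitivity(self) -> float:
        # Proportion of known-positive regions that the script flagged.
        actual_positive = len(self.tp) + len(self.fn)
        return len(self.tp) / actual_positive if actual_positive else 0.0

    @property
    def specificity(self) -> float:
        # Proportion of known-negative regions that the script did not flag.
        actual_negative = len(self.tn) + len(self.fp)
        return len(self.tn) / actual_negative if actual_negative else 0.0


def score_script(script, positive_regions, universe, known):
    """Split the universe into TP, FP, FN and TN sets for one script."""
    universe = frozenset(universe)
    known = frozenset(known) & universe
    flagged = frozenset(positive_regions) & universe
    return ScriptResult(
        script=script,
        tp=flagged & known,
        fp=flagged - known,
        fn=known - flagged,
        tn=universe - flagged - known,
    )


# Hypothetical data: six regions, three of which are known to have the condition.
example = score_script(
    "WBC_Below2",
    positive_regions={"r1", "r2", "r5"},
    universe={"r1", "r2", "r3", "r4", "r5", "r6"},
    known={"r1", "r2", "r3"},
)
```

In this hypothetical data, two of the three known regions are flagged and two of the three remaining regions are not flagged, so both `example.sensitivity` and `example.specificity` are two thirds.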
For each marked region universe 106 to which a specific PDL Script 102 is applied, there may exist a PDL result which combines a reference to the script, the sensitivity and specificity results, and the four sets (TP, TN, FP, FN), which are identified by comparing the script results to the target region universe 108 and the known regions 110. In some examples, the PDL result can be stored in a results field 112. In some embodiments, a PDL result (also referred to herein as a script result) may be acquired without execution of the PSP Engine 104.
Script results can be compared to determine their relative predictive power within the target region universe 108. In some embodiments, the comparison of script results may not be binary. For example, one script result may be more accurate than another script result at predicting various physiological conditions, such as sepsis, among others. In one embodiment, the accuracy of the scripts 102 may be categorized into three groups: a high accuracy group, a low accuracy group, and an inconclusive group. A high accuracy group may indicate a higher sensitivity and specificity than a low accuracy group. A low accuracy group may indicate that neither sensitivity nor specificity is higher than the high accuracy group. All other comparisons can be categorized as inconclusive. In an alternative embodiment, time is also considered such that the comparison would also take into consideration the earliest possible time at which the specified pattern could be identified within the region.
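In some examples, the three-way comparison can be sketched in Python as follows. The function name `compare_results` and the representation of a script result as a (sensitivity, specificity) pair are assumptions made for illustration. The sketch also assumes one possible reading of the grouping: a result is treated as higher in accuracy only when it is at least as good on both measures and strictly better on at least one, and exact ties are treated as inconclusive. The alternative embodiment that also considers time is not modeled.

```python
def compare_results(a, b):
    """Compare result `a` with result `b`.

    Each argument is a (sensitivity, specificity) pair. Returns "higher",
    "lower", or "inconclusive", describing `a` relative to `b`.
    """
    sens_a, spec_a = a
    sens_b, spec_b = b
    a_not_worse = sens_a >= sens_b and spec_a >= spec_b
    b_not_worse = sens_b >= sens_a and spec_b >= spec_a
    if a_not_worse and not b_not_worse:
        return "higher"
    if b_not_worse and not a_not_worse:
        return "lower"
    # Either a trade-off (one measure better, the other worse) or a tie.
    return "inconclusive"


print(compare_results((0.9, 0.8), (0.7, 0.6)))
print(compare_results((0.9, 0.5), (0.6, 0.8)))
```

Because a trade-off between sensitivity and specificity yields no winner, more than one script result can remain after all pairwise comparisons have been made.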
In the current embodiment, the inconclusive nature of script result comparison causes the result of the enhancement process to be a set of scripts rather than a single script and therefore the result of the iterative enhancement process includes an enhanced script set 114. An enhanced script set 114 can refer to an aggregation of script instances that represent the results from an enhancement. In some examples, an enhanced script set 114 may exclude scripts 102 that provide a lower accuracy result. In some embodiments, a line graph may be created (as shown in
PDL scripts 102 may be composable, that is to say that a PDL script 102 can be composed of two or more different PDL scripts 102. In some embodiments, PDL scripts 102 can be combined in various ways. For example, two PDL Scripts 102 that describe events as their anchor occurrence definition can be combined to form a script with a binary as its anchor occurrence definition. In some examples, the script result generated from the composition of other scripts 102 can be different from the script results of the individual scripts 102 which make up the composition. Further, the script result of the composition of two individual scripts 102 may provide better predictive value than any individual script.
In the current embodiment, the process of combining two scripts 102 into a single script 102 is called a script consolidation move (also referred to as a move). In an alternative embodiment, moves can create a new script using techniques other than the combination of individual scripts 102. In some examples, a move can create a new script by modifying an existing script. For example, a move can create a new script by altering the where clause and/or event correction criteria contained in an existing script. A move creates a new script and a new script result. Moves can be characterized by their comparative effect on the script result. In some embodiments, a safe move is characterized as a move that does not add any results to either the false positive set or the false negative set within the script result (as compared to the scripts 102 used by the move to generate a final script). In some examples, safe moves are considered valuable because they do not decrease either the sensitivity or the specificity of the result. If a safe move adds to either the number of true positives or the number of true negatives, then the safe move increases sensitivity and/or specificity without any “cost” (i.e., without decreasing either sensitivity or specificity).
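In some examples, the safe move test can be sketched in Python as follows. The function name `is_safe_move` and the dictionary layout are assumptions made for illustration. The sketch further assumes that "as compared to the scripts used by the move" means the new false positive and false negative sets must each be contained in the corresponding set of every source script; other readings are possible.

```python
def is_safe_move(source_results, new_result):
    """Return True if the move added no region to the FP or FN set.

    Each result is a dict with "fp" and "fn" entries holding sets of
    region identifiers (a hypothetical layout).
    """
    for source in source_results:
        if not new_result["fp"] <= source["fp"]:
            return False  # the move introduced a new false positive
        if not new_result["fn"] <= source["fn"]:
            return False  # the move introduced a new false negative
    return True


# Hypothetical results for two source scripts and two candidate moves.
script_a = {"fp": {"r5"}, "fn": {"r2", "r3"}}
script_b = {"fp": {"r5", "r6"}, "fn": {"r3"}}
combined = {"fp": {"r5"}, "fn": {"r3"}}
risky = {"fp": {"r4", "r5"}, "fn": set()}

print(is_safe_move([script_a, script_b], combined))
print(is_safe_move([script_a, script_b], risky))
```

Here `combined` is safe because its error sets are subsets of those of both sources, while `risky` is not safe because region `r4` is a false positive that neither source produced.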
Since the outcome of a move can be a valid PDL script 102, the result of a move can be determined by executing the resultant script through the PSP Engine 104. In some cases, the outcome of a move can be derived without executing the PSP Engine 104. For example, if all of the occurrence instances are available as represented by occurrence thumbnails (lightweight objects, each of which contains a representation of an occurrence instance and the type of the occurrence instance, region association, start time, and end time, among others), an examination of these occurrence thumbnails can determine the script results of a binary move if the move does not include a “where clause.”
In one embodiment, moves may be split into three groups: a deterministic move, a relational move, and an experimental move. A deterministic move can be a move in which the result can be derived by looking at the respective script results of the scripts which can make up the final composition. A relational move is a move for which the result can be derived by looking at a comprehensive set of occurrence thumbnails (also referred to herein as the occurrence bin) 116 of
In some embodiments, deterministic moves, relational moves, and experimental moves can indicate a “computational cost.” For example, the computational cost may include factors such as execution time, processing power, and memory usage. In some embodiments, an experimental move can be more computationally expensive than a relational move. Additionally, a relational move can be more computationally expensive than a deterministic move.
In one embodiment, the iterative pattern enhancement engine (also referred to herein as IPE engine) 118 can be broken down into two phases as shown in
In some examples, desirable predictive characteristics may be designated as better sensitivity and/or specificity. Alternatively, better predictive characteristics may be designated as “filling the gaps” within the overall continuum of predictability. For example, the IPE engine 118 may generate a broad distribution of scripts 102 that have a very high granularity of predictive behavior. In some examples, the IPE engine 118 may generate scripts 102 that have a broad distribution of specificity. For example, if scripts 102 are identified that have specificity below 70% and specificity above 80%, the IPE engine 118 may identify scripts 102 that have specificity between 70% and 80%. In these examples, the IPE engine 118 can fill in the gaps within a predictive continuum in addition to finding “better” predictive results.
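In some examples, locating gaps in the specificity continuum can be sketched in Python as follows. The function name `specificity_gaps`, the `max_gap` parameter, and the sample values are assumptions made for illustration; the disclosure does not state how wide an interval must be before it is treated as a gap.

```python
def specificity_gaps(specificities, max_gap):
    """Return (low, high) intervals between neighboring specificity values
    that are wider than `max_gap` (a hypothetical tuning parameter)."""
    ordered = sorted(set(specificities))
    return [
        (low, high)
        for low, high in zip(ordered, ordered[1:])
        if high - low > max_gap
    ]


# Hypothetical specificities of the scripts currently in the results field.
gaps = specificity_gaps([0.55, 0.62, 0.68, 0.83, 0.90], max_gap=0.10)
print(gaps)
```

With these values the only interval wider than 0.10 lies between 0.68 and 0.83, which would be a candidate range for additional scripts.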
In some embodiments, scripts 102 can be based on physiological streams. For example, one script may be based on White Blood Count (also referred to herein as WBC). Another script may be based on Platelets and Bicarbonate. In one embodiment, every script has a finite set of at least one physiological stream on which the script is based. The set of physiological streams on which a script is based is called the dependent point stream set. In some embodiments, the IPE engine 118 may use a broad range of dependent point stream sets. For example, an IPE engine 118 may identify a script that is highly sensitive and specific to sepsis, along with additional scripts 102 that are highly sensitive and specific to sepsis but are based on different dependent point stream sets. In these examples, the IPE engine 118 can identify a correlation between patients that have sepsis using various different scripts 102. In some embodiments, the IPE engine 118 may not have access to several physiological streams. In these embodiments, access to a wide range of scripts 102 can increase the robustness of the overall system.
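In some examples, the benefit of scripts with different dependent point stream sets can be sketched in Python as follows. The function name `usable_scripts` and the script names are assumptions made for illustration; only the stream names WBC, Platelets, and Bicarbonate come from the examples above.

```python
def usable_scripts(dependencies, available_streams):
    """Return the names of scripts whose dependent point stream set is
    fully covered by the physiological streams that are available."""
    available = set(available_streams)
    return sorted(
        name
        for name, streams in dependencies.items()
        if set(streams) <= available
    )


# Hypothetical script names mapped to their dependent point stream sets.
dependencies = {
    "wbc_script": {"WBC"},
    "platelet_bicarb_script": {"Platelets", "Bicarbonate"},
}

print(usable_scripts(dependencies, {"WBC", "Platelets"}))
print(usable_scripts(dependencies, {"Platelets", "Bicarbonate"}))
```

When a stream such as Bicarbonate is unavailable, only the scripts that do not depend on it remain usable, which is why a wider range of dependent point stream sets can make the overall system more robust.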
In one embodiment, seeding the field 120 is the first phase of the IPE engine 118. Seeding the field 120 can refer to generating a set of results 110 (also referred to herein as a result set) for a script 102 based on the target region universe 108. In some embodiments, seeding the field 120 within the IPE engine 118 may be automated by a processor or adaptively applied.
In some embodiments, the process of seeding the field 120 can use a target region universe 108 that includes a marked region 108. In some examples, the target region universe 108 can include an indication as to which data regions within the target region universe 108 have a condition (e.g. sepsis, among others) and which data regions do not have a condition. This marked set is referred to as the marked region universe 106. Alternatively, the process of seeding the field 120 can use multiple marked region universes 106.
In some embodiments, additional input can be supplied. For example, the process of seeding the field can also include scripts 102 and hints 124. In one embodiment, the scripts 102 can include a set of predetermined PDL scripts 102. In some examples, the predetermined PDL scripts 102 may be translated from scripts 102 written with another tool or in a simple text editor. In some embodiments, the scripts 102 may comply with the PDL language format and the scripts 102 may be expected to execute without error. In some examples, invalid scripts 102 can be identified and/or ignored without stopping the seeding the field 120 process.
In one embodiment, scripts 102 are executed through the PSP engine 104 and matched against the marked region universe 106 to generate predictive characteristics such as sensitivity, specificity, a true positive set, a true negative set, a false positive set, and a false negative set, among others. Once predictive characteristics have been generated, a result object is created and put into the results field 112. In some embodiments, the IPE engine 118 may decompose the scripts 102 and seed the field 120 with individual elements of the script 102.
For example, the script 102 may contain the following language:
In this example, the script 102 may indicate for the IPE engine 118 to generate 3 separate scripts 102: one script for platelet_fall, one script for bicarb_fall, and one script for destabilization. In some embodiments, the IPE engine 118 can analyze the accuracy of each of the three scripts 102 separately. In some examples, each script may produce different results, which can affect the size of the results field 112. In one embodiment, portions of a script 102 may not produce results that are included in the results field 112. For example, portions of a script 102, such as a portion related to platelet_fall, among others, may not produce results if the portion of the script 102 failed to execute properly. In some examples, the results from a portion of a script 102 may be excluded from the results field 112 if the results do not meet other criteria. In some embodiments, portions of a script 102 that do execute and meet certain criteria may have results included in the results field 112.
In some embodiments, the IPE engine process 114 can proceed with a script 102 if the script 102 produces more than 1 result. In some examples, the IPE engine process 114 can proceed without seeding the field 120 with any additional scripts 102.
In one embodiment, the IPE engine 118 can provide one or more mechanisms to seed the field 120. In some examples, scripts 102 can be generated to seed the field 120 using a variety of automated techniques. In one embodiment, the IPE engine 118 can generate scripts 102 by starting with a script template and generating a set of scripts from the script template. For example, a script template may include the following language:
“identify @Name as {value<@X} in WBC;”
In this example, the script template can indicate a threshold event within the PSP engine 104. In some embodiments, the script template may or may not be a valid PDL script, but the script template is preferably in the shape of a valid PDL Script. In some embodiments, the script template is a PDL Script with elements set as variables. If the variables are replaced with valid script elements, then a valid PDL script can be produced. In the above example, the script template contains two variables: @Name and @X. In some examples, the two variables can be set to particular values. For example, the two variables may be set with the statements @Name=“WBC_Below2” and @X=2. In some embodiments, setting the variables can result in a new script. In the example above, a new script may be generated with the following language:
identify WBC_Below2 as {value<2} in WBC;
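In some examples, the substitution of variables into a script template can be sketched in Python as follows. The function name `fill_template` is an assumption made for illustration, and the sketch assumes that every token of the form `@Word` in the template is a variable. The sketch performs text substitution only; it does not check that the output is a valid PDL script.

```python
import re


def fill_template(template, values):
    """Replace each @Variable token in `template` with its value.

    Raises KeyError if the template uses a variable that has no value.
    """
    return re.sub(r"@(\w+)", lambda match: str(values[match.group(1)]), template)


template = "identify @Name as {value<@X} in WBC;"
script = fill_template(template, {"Name": "WBC_Below2", "X": 2})
print(script)
```

Applying further sets of variable values to the same template produces further scripts in the same way.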
In some embodiments, a template along with a set of variable values can represent a set of scripts. For example, the following template and variables can produce three scripts:
The IPE engine 118 can use the scripts 102 produced from templates and variables to provide results for the seed the field process 116. Once an executable script has been generated, the executable script can be analyzed through the PSP engine 104 and placed, along with the predictive characteristics for the executable script, into the results field 112.
In some embodiments, the IPE engine 118 can generate any suitable number of scripts 102 provided a template and a range of variable values. In some examples, if a WBC variable is considered to have a particular range, then the IPE engine 118 can determine the number of scripts 102 to create. For example, if the WBC variable has a range between 0 and 30, the IPE engine 118 may generate a set of 10 scripts 102 using the following WBC variable values:
In some embodiments, the IPE engine 118 may detect results from any suitable number of scripts 102 generated by the IPE engine 118. For example, ten scripts 102 may produce ten results, which can be placed in the results field 112. In some examples, the IPE engine 118 can repeat the process of generating scripts 102 and placing results from the scripts 102 in the results field 112 for any number of physiological streams.
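In some examples, the generation of a set of scripts from a template and a variable range can be sketched in Python as follows. The function names and the generated script names are assumptions made for illustration. In particular, the sketch assumes that the range is divided into equal windows and that the upper edge of each window is used as the threshold; the values actually chosen in a given embodiment may differ.

```python
def equal_window_values(low, high, count):
    """Divide [low, high] into `count` equal windows and return the upper
    edge of each window (one possible choice of threshold values)."""
    step = (high - low) / count
    return [low + step * (index + 1) for index in range(count)]


def threshold_scripts(stream, values):
    """Build one threshold script per value, following the shape of the
    template "identify @Name as {value<@X} in WBC;"."""
    return [
        f"identify {stream}_Below{value:g} as {{value<{value:g}}} in {stream};"
        for value in values
    ]


values = equal_window_values(0, 30, 10)
scripts = threshold_scripts("WBC", values)
for line in scripts:
    print(line)
```

For a WBC range between 0 and 30 and a count of 10, this choice yields thresholds spaced three units apart, and each resulting script can then be executed to produce one entry in the results field 112.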
In some embodiments, the IPE engine 118 automates the creation of scripts 102 in the seed the field 120 phase and generates higher level objects such as binaries, images, classifications, and repeating occurrences, among others, in the subsequent phase referred to as make moves 122. In some examples, the final input into the seeding the field process 116 is the hints 124. The hints 124 inform, constrain and/or direct both the seeding the field process 116 and the make moves 122 process. For example, the hints 124 may contain any suitable number of elements that can be analyzed within the seeding the field process 116. In some examples, the hints 124 may include elements that indicate physiological streams from which scripts 102 should be constructed. The hints 124 may also include specific event correction script elements used to create templates and parameters by which templates can be constructed. Additionally, the hints 124 may include parameters to constrain and/or direct the choice of variable values within particular ranges that are used with templates to generate any suitable number of scripts 102. In some embodiments, the hints 124 represent configuration entries and values and can be detected through a tool, such as a graphical user interface or through an automated process. In one embodiment, this configuration can be persisted, transmitted and retrieved.
Seeding the Field with Property Distribution Analysis
In some embodiments, script generation can use data regions in the target region universe 108 as a source for variable values in script templates. In some examples, the scripts 102 that are generated based on the variable values and the script templates can produce results that are included in the results field 112. For example, instead of choosing arbitrary ranges for variable values for a script template, the IPE engine 118 can analyze the target region universe 108 to determine variable values that provide predictive separation within a particular target region universe 108.
In some examples, the variable values may be selected for a template, such as “identify @Name as {value<@X} in WBC,” by taking a range of variable values of WBC and dividing the range of variable values into equal windows. In some embodiments, selecting variable values by dividing a range of variable values into equal windows can be used by the IPE engine 118. In some examples, if the target region universe 108 includes patients that have been marked as having a particular condition, such as sepsis, then data from the target region universe 108 can be used to guide the selection of values for the variables. For example, determining variable values by equally splitting a range of variable values can provide a high predictive granularity with the least number of scripts 102.
In some embodiments, the IPE engine 118 can use a process called property distribution analysis to determine the appropriate number of variable values to select from a range of variable values. In some examples, the property distribution analysis process can determine the number of variable values to select for any number of physiological factors, such as WBC, among others. The property distribution analysis process can prevent the selection of too few variable values from a range of variable values, which can obscure predictive characteristics. The property distribution analysis process can also prevent the selection of too many variable values from a range of variable values, which can result in the generation of too many scripts 102 with the same predictive characteristics. If the property distribution analysis process generates too many scripts 102, the IPE engine process 114 can become inefficient and the results may be overly complicated. The property distribution analysis process of the IPE engine 118 can identify various numbers of variable values to select based on the data, which can ensure the generation of scripts 102 with different predictive characteristics.
In one embodiment, the process of property distribution analysis is based on the process of using a sliding property value to find statistical separation. In some examples, the following template script, referred to herein as Script 1, can be used:
identify RiseInWBC as rise in WBC where {Candidate.Magnitude>@X}
In Script 1, Candidate.Magnitude indicates the magnitude for particular candidate values. In some examples, magnitude can refer to an increase in a physiological measurement in relation to time. Additionally, @X indicates that the variable X takes a particular value, such as a non-negative integer, among others. In some examples, the following script, referred to herein as Script 2, is generated when the value of X equals 0:
identify RiseInWBC as rise in WBC where {Candidate.Magnitude>0}
In some examples, Script 2 may identify at least one occurrence of a rise in WBC in 100% of the sepsis cases and also at least one occurrence of a rise in WBC in 100% of the non-sepsis cases. In these examples, every patient within the target region universe 108 had some rise in WBC during a hospital stay. Therefore, Script 2 may provide very little statistical separation between sepsis and non-sepsis patients because the magnitude threshold in Script 2 is equal to zero. As the magnitude increases in the chart 200, the chart depicts some degree of statistical separation. For example, when the magnitude equals 1, a higher percentage of sepsis patients than non-sepsis patients is identified. In another example, when the magnitude equals 2, the statistical separation between sepsis patients and non-sepsis patients becomes greater. According to chart 200, 98% of sepsis patients have a rise in WBC of magnitude 2 or greater, while 75% of non-sepsis patients have a rise in WBC of this magnitude. According to chart 200, a magnitude of 3 or greater corresponds with a larger statistical separation: 85% of sepsis patients have a rise in WBC of at least magnitude 3, while 15% of non-sepsis patients have a rise in WBC of at least magnitude 3. In some embodiments, regions of statistical separation emerge as the magnitude of the rise in WBC increases. In one embodiment, the process of analyzing the increase in magnitude of the rise in WBC is modeled with the property distribution analysis in the IPE engine 118.
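In some examples, the effect of a sliding magnitude threshold can be sketched in Python as follows. The function name `capture_rates`, the region identifiers, and all magnitudes are hypothetical and are not taken from chart 200. The sketch assumes that the largest rise observed in each region is already known and that a region is captured when that rise is greater than or equal to the threshold.

```python
def capture_rates(max_magnitude, known, threshold):
    """Return (known_rate, other_rate): the fraction of known-positive and
    of remaining regions whose largest rise is >= `threshold`."""
    def rate(regions):
        if not regions:
            return 0.0
        hits = sum(max_magnitude[region] >= threshold for region in regions)
        return hits / len(regions)

    positives = [region for region in max_magnitude if region in known]
    negatives = [region for region in max_magnitude if region not in known]
    return rate(positives), rate(negatives)


# Hypothetical largest WBC rise per region; s* are known sepsis regions.
max_magnitude = {
    "s1": 4.0, "s2": 3.5, "s3": 2.0, "s4": 1.0,
    "n1": 0.5, "n2": 1.5, "n3": 2.5, "n4": 0.2,
}
known = {"s1", "s2", "s3", "s4"}

separation = {t: capture_rates(max_magnitude, known, t) for t in (0, 1, 2, 3)}
for threshold, (sepsis_rate, other_rate) in separation.items():
    print(threshold, sepsis_rate, other_rate)
```

In this hypothetical data a threshold of zero captures every region in both groups, while larger thresholds capture the two groups at different rates, which is the kind of statistical separation the sliding property value is meant to expose.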
It should be noted that
In one embodiment, the IPE engine 118 simulates the process of identifying statistical separation by using a sliding property value illustrated in
In this example, Script 3 contains two variables: @Name and @X. In some embodiments, the IPE engine 118 may have several functions that can generate unique names. In these embodiments, the IPE engine 118 may attempt to identify the variable @X. In the above example, the template can represent a property distribution analysis regarding the PercentChange property.
The use of the operator “>=” provides the process with a sliding characteristic such that the scripts may range from identifying a majority of the patients in the target region universe 108 with a condition, to identifying fewer patients with a condition. The sliding characteristic is depicted in relation to the statistical separation analysis of chart 200 of
In some embodiments, the property distribution analysis process uses a template script, a property variable, and an operator. In some examples, the distribution analysis process includes creating a broadly defined script to obtain a superset of occurrences from which properties can be analyzed. For example, the property distribution analysis process may use the following script, also referred to herein as Script 4:
In some examples, Script 4 may be an executable PDL script derived from the template script. In this example, Script 4 may use template Script 3, but can remove the filter “Where {Candidate.PercentChange>=@X}” such that all fall occurrences are identified.
In one embodiment, the property distribution analysis process also includes executing Script 4 with data from the 23 patients and aggregating the results into a table 400 of
In one embodiment, the property distribution analysis process also includes sorting the results by the property value. In some examples, the property distribution analysis process may focus on results that are greater than or equal to a particular magnitude. In these examples, the results may be sorted in table 500 of
In some embodiments, the property distribution analysis process may also include grouping the results in order to verify that duplicate values do not exist. As seen in
In some embodiments, the table 800 depicts a complete set of values 806 for a particular variable @X that corresponds with a certain magnitude. In some examples, any suitable number of scripts can be generated from the identified values in table 800. Each identified value may have a different statistical characteristic and the possible statistical characteristics for this template/property combination are known for the given marked region universe 106.
In one embodiment, one form of the seeding the field mechanism uses the property distribution analysis process. In some examples, the property distribution analysis process may rely on various assumptions. For example, the process may assume that a marked region universe 106 exists, that a template script with a single variable for a property value exists, and that an appropriate accumulative expression (e.g. >=) is used. Based on these assumptions, the property distribution analysis process may include creating a broadly defined executable script to obtain a superset of occurrences; executing the script to obtain occurrences and creating a set that maps the region identifier with the value of the isolated property under consideration; sorting the results by the property value; grouping the results by the property value to deal with duplicates; determining which regions are aggregated by a “>=@X” condition for each row; calculating the value range for each change in the count of regions captured; removing all rows for which there is no change in the region count; and generating scripts from the identified values and calculating the predictive characteristics.
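In some examples, the core of the property distribution analysis process can be sketched in Python as follows. The function names, the region identifiers, and the property values are assumptions made for illustration. The sketch assumes that the broadly defined script has already been executed and that its occurrences are available as (region identifier, property value) pairs; it covers the sorting, grouping, aggregation under a ">=" condition, removal of rows with no change in region count, and calculation of predictive characteristics. The value range calculation and the generation of PDL text are omitted.

```python
def distinct_thresholds(occurrences):
    """Return (value, captured_regions) rows, highest value first, keeping
    only rows where ">= value" captures more regions than the row before."""
    by_value = {}
    for region, value in occurrences:
        # Grouping by value deals with duplicate property values.
        by_value.setdefault(value, set()).add(region)

    rows, captured = [], set()
    for value in sorted(by_value, reverse=True):
        count_before = len(captured)
        captured |= by_value[value]
        if len(captured) != count_before:
            rows.append((value, frozenset(captured)))
    return rows


def characteristics(rows, universe, known):
    """Return (value, sensitivity, specificity) for each threshold row."""
    universe, known = set(universe), set(known)
    negatives = universe - known
    results = []
    for value, flagged in rows:
        sens = len(flagged & known) / len(known) if known else 0.0
        spec = len(negatives - flagged) / len(negatives) if negatives else 0.0
        results.append((value, sens, spec))
    return results


# Hypothetical occurrences: (region identifier, property value).
occurrences = [
    ("r1", 12.0), ("r1", 30.0), ("r2", 30.0),
    ("r3", 18.0), ("r2", 18.0), ("r4", 5.0),
]
rows = distinct_thresholds(occurrences)
stats = characteristics(rows, universe={"r1", "r2", "r3", "r4", "r5"}, known={"r1", "r3"})
print([value for value, _ in rows])
```

In this hypothetical data the value 12.0 is dropped because the only region it would add, `r1`, is already captured at a higher value, so a script built from it would have the same predictive characteristics as the script built from 18.0.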
In one embodiment, alterations and additional logic exist to direct/constrain the process. For example, very small differences in the value of a property can change the number of regions identified. These tiny changes can be known to be physiologically insignificant. Constraints can be, and in one embodiment are, added to the process to not allow these tiny differentiations. For example, the process may use a minimum percent change variable that indicates when a new script may be generated.
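In some examples, a minimum percent change constraint can be sketched in Python as follows. The function name `apply_min_percent_change` and the sample values are assumptions made for illustration. The sketch assumes that the percent change is measured relative to the most recently kept value, which is only one way such a constraint could be defined.

```python
def apply_min_percent_change(values, min_percent_change):
    """Keep a candidate value only if it differs from the last kept value
    by at least `min_percent_change` percent (values taken highest first)."""
    kept = []
    for value in sorted(values, reverse=True):
        if not kept:
            kept.append(value)
            continue
        last = kept[-1]
        if last == 0:
            # Percent change is undefined relative to zero; keep any change.
            changed = value != 0
        else:
            changed = abs(value - last) / abs(last) * 100 >= min_percent_change
        if changed:
            kept.append(value)
    return kept


# Hypothetical candidate values, two of which differ only slightly
# from a neighboring value.
kept = apply_min_percent_change([30.0, 29.9, 18.0, 17.95, 5.0], min_percent_change=1.0)
print(kept)
```

With a minimum of one percent, the values 29.9 and 17.95 are discarded, so no new script would be generated for them.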
Seeding the field 120 can be performed with any combination of the processes described above in relation to
In one embodiment, once the field has been “seeded” and more than one script has been placed in the field, then the process of executing combinatorial moves can begin.
In one embodiment, during the process of seeding the field 120 another accumulation of data, the occurrence bin 116, is created as shown in
The results field 112 provides a list of scripts 102 with varying predictive characteristics with respect to a given marked region universe 106. The predictive characteristics, in one embodiment, include sensitivity, specificity and four region sets: True Positive (TP), False Positive (FP), True Negative (TN) and False Negative (FN). The scripts 102 can also contain physiological stream dependencies. Further, in some embodiments, the seeding the field process 120 creates an occurrence bin 116 of lightweight objects representing the occurrences identified for the scripts 102 in the results field 112 for the regions in the target region universe 108. As shown in
In some embodiments, the process of enhancement can be an iterative process of examining scripts 102 in the results field 112 and finding moves that will create desired effects in the predictive characteristics and/or the physiological stream dependencies. In some examples, the process of making moves simulates the process of a researcher attempting to find more predictive patterns. For example, a directional event (e.g. a fall trend) may be created that is expected to have a high correlation to sepsis. In some embodiments, the directional event may be incorporated into a PDL script and the PDL script may be executed using data related to a set of patients as input. The PDL script may produce results categorized into groups. For example, a group of patients that fall into the false negative group can represent cases in which the directional event (e.g. fall event) did not identify patients that are known to have sepsis. In some embodiments, the false negative group may be analyzed to determine how patients that have sepsis have been included in the false negative group. For example, analysis of the false negative group may determine that the fall event was too strict (e.g. that the magnitude expectation was too high) or that the fall event was not exhibited within a sub-group of sepsis patients.
Some embodiments may address the case in which the fall event was too strict. For example, the fall event may be modified so that the fall event is broader (e.g. by lowering the magnitude expectation). As a result of broadening the fall event, the size of the false positive group may expand beyond a predetermined threshold. If the size of the false positive group grows beyond a particular threshold, a relational solution can be used. For example, by using the fall event in isolation the size of the false positive group may be large. To reduce the size of the false positive group, the fall event may be coupled with another event that correlates to sepsis. For example, the fall event may be considered over a limited time distance in conjunction with another event that is correlated to sepsis. In some examples, considering two events associated with sepsis can identify more patients with sepsis and reduce the size of the false negative group. The combination of two events associated with sepsis can be referred to as a binary move. In some embodiments, a binary move can include combining two events that have a limited correlation to sepsis within the target universe, which results in a third new pattern that has a better correlation to sepsis within the target universe.
In some embodiments, analysis may address results in which a subset of sepsis patients do not exhibit the fall event. In some examples, an alternate pattern may be generated that targets the subset of sepsis patients that did not exhibit the fall event. The alternate pattern may be combined with the fall event to create a classify move. In some embodiments, the classify move includes results from the alternate pattern or the fall event without reference to time.
The IPE engine 118 automates this approach by using searches, set manipulation and other mechanisms to identify moves that will create new scripts with useful predictive characteristics (also referred to herein as an enhanced result set), such as an enhanced sensitivity and specificity range 128. The process of creating new scripts with predictive characteristics can be automated because the results of the moves can be determined either through a mathematical function (e.g. deterministically) or through actually creating the new script and executing the PSP engine 104 to see the results (e.g. through experimentation). In some embodiments, any combination of the techniques described above can be implemented.
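As an illustration of determining a result through a mathematical function rather than by executing a script, the sketch below derives the four region sets of a classify move directly from the region sets of two existing scripts. It assumes that the classify move flags a region when either script flags it (the "or" combination without reference to time); the `Quadrants` container and the function name `classify_move` are hypothetical, and sensitivity and specificity use their standard definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quadrants:
    """Region sets for one script (hypothetical container, not from the disclosure)."""
    tp: frozenset
    fp: frozenset
    tn: frozenset
    fn: frozenset

    @property
    def sensitivity(self) -> float:
        positives = len(self.tp) + len(self.fn)
        return len(self.tp) / positives if positives else 0.0

    @property
    def specificity(self) -> float:
        negatives = len(self.tn) + len(self.fp)
        return len(self.tn) / negatives if negatives else 0.0


def classify_move(a: Quadrants, b: Quadrants) -> Quadrants:
    """Combine two scripts so that a region is flagged when either script flags it.

    Assumption: both scripts were evaluated against the same marked region universe.
    """
    return Quadrants(
        tp=a.tp | b.tp,  # positives caught by either script
        fp=a.fp | b.fp,  # negatives wrongly flagged by either script
        tn=a.tn & b.tn,  # negatives that neither script flags
        fn=a.fn & b.fn,  # positives that both scripts miss
    )
```

Because only set operations are involved, the predictive characteristics of the combined script can be evaluated without running the PSP engine 104.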
An example of a deterministic move is the classify move, which is illustrated in
In
In some embodiments, the IPE engine 118 can determine when to execute a classify move 1102. In some embodiments, the classify move 1102 can enhance the outcome (by increasing sensitivity) if the following conditions exist:
When both of these conditions are true then the classify move 1102 can increase sensitivity without decreasing specificity, which is also referred to as a safe move. In some embodiments, an effective move can include finding the cases in which AFP Δ BFP=Ø and then finding within that set the maximum size of ATP Δ BTP. The following formula describes the effective move:
AFP Δ BFP=Ø and |ATP Δ BTP|=maximum |ATP Δ BTP|
In some examples, if an unsafe move is allowed (e.g. if Specificity can be relaxed), then the maximum gain in sensitivity can be achieved at the minimum cost of specificity with the following:
A second example of a deterministic move is the global image move. The global image move is illustrated in
The region CTP 1208 (e.g. the set of patients that make up the True Positive set for the Result Script C) can be derived by taking the intersection of 1210 ATP and 1212 BTP. The region CFP 1214 can be derived by taking the intersection of AFP 1216 and BFP 1218. The region CFN 1220 can be derived by taking the union of AFN 1222 and BFN 1224. The region CTN 1226 can be derived by taking the union of ATN 1228 and BTN 1230.
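The four derivations above can be restated compactly, together with the standard definitions of sensitivity and specificity. The subscripted notation is an assumption introduced here for readability and is not taken from the figures.

```latex
\begin{aligned}
C_{TP} &= A_{TP} \cap B_{TP}, & C_{FP} &= A_{FP} \cap B_{FP},\\
C_{FN} &= A_{FN} \cup B_{FN}, & C_{TN} &= A_{TN} \cup B_{TN},\\
\text{sensitivity}(C) &= \frac{|C_{TP}|}{|C_{TP}| + |C_{FN}|}, &
\text{specificity}(C) &= \frac{|C_{TN}|}{|C_{TN}| + |C_{FP}|}.
\end{aligned}
```

Since $C_{FP} \subseteq A_{FP}$ and $C_{TN} \supseteq A_{TN}$, the specificity of Result Script C is never lower than that of script A; since $C_{FN} \supseteq A_{FN}$, its sensitivity is never higher.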
In some embodiments, the IPE engine 118 can use these equations to determine when to execute a global image move. The global image move can enhance the outcome (by increasing specificity) if the following conditions exist:
In some embodiments, a global image move can be identified by finding all cases in which AFN Δ BFN=Ø and then finding within that set the maximum size of ATN Δ BTN, as described in the following conditions:
AFN Δ BFN=Ø and |ATN Δ BTN|=maximum |ATN Δ BTN|
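A search for the global image partner that satisfies the condition above might look like the following sketch. It assumes each script's region sets are held in a dictionary keyed by "FN" and "TN"; the function name `best_safe_global_image` and the data layout are hypothetical.

```python
def best_safe_global_image(seed, candidates):
    """Pick the candidate whose FN set matches the seed's and whose TN set differs most.

    seed       -- dict with keys "FN" and "TN" mapping to sets of region ids
    candidates -- dict of script name -> dict with the same keys
    Returns (name, size of the TN symmetric difference), or (None, 0) if no safe partner exists.
    """
    best_name, best_size = None, 0
    for name, cand in candidates.items():
        # `^` is the symmetric difference; a non-empty result violates AFN Δ BFN = Ø.
        if seed["FN"] ^ cand["FN"]:
            continue
        size = len(seed["TN"] ^ cand["TN"])
        if size > best_size:
            best_name, best_size = name, size
    return best_name, best_size
```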
If an unsafe move is allowed (e.g. if sensitivity can be relaxed), then the maximum gain in specificity can be achieved at the minimum cost of sensitivity with the following:
In some embodiments, classify moves 1102 and global image moves represent two deterministic moves. Another category of move is a relational move, which includes a binary move. In some examples, a binary move is constructed by combining two occurrence types into a relational binary. For example, the following scripts may be combined into a binary script.
These two scripts can be combined into the following binary script:
The results of this combined script can be determined by executing the PSP engine 104. Further, if the occurrence results of the platelet_fall and bicarb_fall scripts are in the occurrence bin 116, then the IPE engine 118 can instead determine the result by querying the occurrence bin 116. This is possible because the relationship described in the destabilization script can be based on the coincidence in time of occurrences in the platelet_fall and bicarb_fall streams.
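Determining a binary result by querying stored occurrences, rather than re-executing a script, can be sketched as follows. The `Occurrence` type, the dictionary layout of the occurrence bin, and the `max_gap` parameter are all assumptions made for illustration; in particular, "coincidence in time" is interpreted here as two occurrences in the same region whose time intervals overlap or lie within `max_gap` of each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """Lightweight record of one occurrence (hypothetical layout)."""
    region_id: str
    start: float  # e.g. hours from the start of the region
    end: float


def binary_regions(occurrence_bin, first, second, max_gap=0.0):
    """Return the regions in which occurrences of both scripts coincide in time.

    occurrence_bin -- dict of script name -> list of Occurrence
    max_gap        -- largest allowed time distance between the two occurrences
    """
    hits = set()
    for a in occurrence_bin.get(first, []):
        for b in occurrence_bin.get(second, []):
            if a.region_id != b.region_id:
                continue
            # Gap between the intervals; zero or negative when they touch or overlap.
            gap = max(a.start, b.start) - min(a.end, b.end)
            if gap <= max_gap:
                hits.add(a.region_id)
    return hits
```

For example, `binary_regions(bin, "platelet_fall", "bicarb_fall")` would return the regions in which the two fall occurrences overlap, which can then be compared against the marked region universe to obtain the predictive characteristics of the combined script.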
Many other moves are available to the IPE engine 118, such as a Time-Limited Image and a Repeating Occurrence, among others. In some embodiments, many language features in PDL and IronPython can be represented by moves in the IPE engine 118.
In some embodiments, the IPE engine 118 can incorporate the following operations:
The selection of the seed can be based on the desired results of the move. For example, if the goal is to maximize sensitivity, then the process may select a seed with the highest sensitivity and search for moves to increase the sensitivity. In some embodiments, other factors may apply. For example, the selection may be limited to scripts 102 with specific physiological stream dependencies.
In one embodiment, the list of seeds drives the process and when the list of seeds is empty then the process ends and the results can be reported. Since moves can create results that may be good seeds, the list of seeds is dynamic and will grow and shrink during the process.
Finding an enhanced move can be dependent on the type of moves available and the desired results. This process may use the 4 quadrant sets (TP, FP, TN, FN), as in the case of a classify move 1102 of
In one embodiment, once a set of positive moves has been found, those moves are validated. Validation may include pruning moves that represent equivalent paths or may be directed and/or constrained by hints 124. Validation can include more “costly” processing than the filters used within the “Find Enhanced Moves” operation or can involve looking at the set of moves found as a whole.
Once valid moves have been selected, the process can determine if any good moves have been found. If no good moves have been found, the process starts a new iteration with the next seed in the list. If valid moves are found, then the valid moves are executed and log records are generated and placed into the enhancement log 126 of
Once moves have been executed, the results are placed into the results field 112 and optionally the occurrence bin 116 of
At this point in the process, the IPE engine 118 can determine whether any seeds still exist in the seed list. If seeds still exist in the seed list, then a new iteration is executed at operation 2.
If no seeds exist in the seed list, then the process ends and the results are presented. In one embodiment, the results consist of a set of enhanced scripts 114 and the enhancement log 126 of
Alternatively, the entire results field 112 and the occurrence bin 116 can be output for examination by a researcher, persisted and/or transmitted as input into another software component.
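The seed-driven iteration described above can be outlined as the following sketch. It is not the IPE engine 118 itself: the callables `score`, `find_moves`, `validate` and `execute` are hypothetical placeholders for seed selection, the search for enhanced moves, validation, and move execution, and the `max_iterations` guard is an added safety assumption.

```python
def enhance(seeds, score, find_moves, validate, execute, max_iterations=1000):
    """Run moves until the seed list is empty; return (results field, enhancement log).

    score(script)               -- ranks seeds, e.g. by sensitivity (assumed)
    find_moves(seed, results)   -- candidate moves for this seed (assumed)
    validate(moves)             -- prunes the candidates (assumed)
    execute(move)               -- produces the new script for a move (assumed)
    """
    results = list(seeds)   # stands in for the results field
    pending = list(seeds)   # the dynamic list of seeds
    log = []                # stands in for the enhancement log
    iterations = 0
    while pending and iterations < max_iterations:
        iterations += 1
        seed = max(pending, key=score)  # select the most promising seed
        pending.remove(seed)
        moves = validate(find_moves(seed, results))
        if not moves:
            continue  # no good moves: start a new iteration with the next seed
        for move in moves:
            new_script = execute(move)
            log.append((seed, move, new_script))
            results.append(new_script)
            pending.append(new_script)  # a result may itself be a good seed
    return results, log
```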
In some situations, scripts 102 may be considered and/or created that are based on statistical anomalies rather than true physiological phenomena. One embodiment of the IPE engine 118 provides a plurality of exemplary mechanisms which may be used alone or in combination to identify and eliminate these cases.
In one example, the process avoids approaches that would engage or aggregate anomalies. For example, the use of the “>=” operator in the property distribution analysis process rather than the “=” operator or a range (e.g. both “>=” and “<=”) can help to avoid fine tuning.
In another example, the feedback mechanism (also referred to herein as evaluate and configure) 1302 supplied through the enhancement log 1304 of
As well, the IPE engine 1308 may provide the ability to execute the results against subsequent marked region universes 106 of
In one embodiment, the IPE engine 1308 is utilized to identify scripts 1310 to support a visualization that includes a plane representing a fixed set of physiological sub-systems as large rows or ranks. On top of this background, individual pixels (or small shapes) are placed to indicate the existence of a pattern. In one embodiment, each pixel in the plane is a separate pattern. In an alternative embodiment, a pattern (or set of patterns) is represented by a single row of pixels on the plane and the x-axis of the plane represents time. In this way, the evolution of a condition over time can be visualized in a single image. Alternatively, substantially all pixels on the plane represent a pattern or set of patterns and animation is used to demonstrate the evolution of a condition over time. Each pixel in this visualization can further be differentiated by color. Additionally, iconic or textual elements may be overlaid to further communicate features of the condition or the evolution of the condition. The color displayed for each pixel can be chosen by the count of instances of the patterns represented, severity, correlativity metrics of the pattern, or features of the pattern, to name a few. In an alternative embodiment, the field represents one portion of the visualization and a pattern catalog represents another area, in such a way that the selection of pixels can drive the display of individual patterns (in textual, parametric or diagrammatic form) or the selection of patterns and/or their individual elements can indicate which pixel or pixel row is associated. The IPE process 1308 and user interface may be used to support this visualization by providing aid in creating and identifying patterns, ordering or placing those patterns into a layout for the plane, and/or otherwise categorizing patterns.
The processor 1402 may be connected through a system interconnect 1406 (e.g., PCI, ISA, PCI-Express, HyperTransport®, NuBus, etc.) to an input/output (I/O) device interface 1408 adapted to connect the computing system 1400 to one or more I/O devices 1410. The I/O devices 1410 may include, for example, a keyboard and a pointing device, wherein the pointing device may include a touchpad or a touchscreen, among others. The I/O devices 1410 may be built-in components of the computing system 1400, or may be devices that are externally connected to the computing system 1400.
The processor 1402 may also be linked through the system interconnect 1406 to a display interface 1412 adapted to connect the computing system 1400 to a display device 1414. The display device 1414 may include a display screen that is a built-in component of the computing system 1400. The display device 1414 may also include a computer monitor, television, or projector, among others, that is externally connected to the computing system 1400. In addition, a network interface card (NIC) 1416 may be adapted to connect the computing system 1400 through the system interconnect 1406 to a network (not depicted). The network (not depicted) may be a wide area network (WAN), local area network (LAN), or the Internet, among others.
The storage device 1418 can include a hard drive, an optical drive, a USB flash drive, an array of drives, or any combinations thereof. The storage device 1418 may include an enhanced result set generator 1420 that can generate an enhanced result set by generating an enhanced script using a first script and a set of data, such as the marked region universe 106.
It is to be understood that the block diagram of
The various software components discussed herein may be stored on the tangible, non-transitory, computer-readable medium 1500, as indicated in
Conditional language used herein, such as, among others, “can,” “may,” “might,” “could,” “e.g.,” and the like, unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments include, while other embodiments do not include, certain features, elements and/or operations. Thus, such conditional language is not generally intended to imply that features, elements and/or operations are in any way required for one or more embodiments or that one or more embodiments necessarily include logic for deciding, with or without author input or prompting, whether these features, elements and/or operations are included or are to be performed in any particular embodiment.
While the above detailed description has shown, described, and pointed out features as applied to various embodiments, it will be understood that various omissions, substitutions, and changes in the form and details of the device or process illustrated can be made without departing from the spirit of the disclosure. As will be recognized, certain embodiments of the techniques described herein can be embodied within a form that does not provide all of the features and benefits set forth herein, as some features can be used or practiced separately from others. The scope of the techniques is indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.
This application claims the benefit of U.S. Provisional Patent Application No. 61/629,164 filed Nov. 14, 2011 and of U.S. Provisional Patent Application No. 61/629,147 filed Nov. 14, 2011, the disclosures of which are hereby incorporated by reference in their entirety for all purposes.
Number | Date | Country
---|---|---
61629164 | Nov 2011 | US
61629147 | Nov 2011 | US