The present disclosure relates to systems and methods for rating computer model output for a dynamic system relative to empirical results for the dynamic system.
Computer aided engineering (CAE) has become a vital tool in reducing vehicle prototype tests and shortening product development time. One goal of CAE is to reduce or eliminate the extensive physical prototype testing currently relied upon for various types of certifications, such as safety certifications for automotive systems. Before utilizing computer models in product development for various vehicle dynamic systems, the quality, reliability, and predictive capabilities of the computer models must be assessed quantitatively and systematically. In addition, one of the key difficulties in model validation of dynamic systems is that most of the responses are functional responses that may be represented by time history curves, for example. This calls for an objective metric that can assess the differences between the time histories of empirical test curves and model predictions with respect to key features, such as phase shift, magnitude, and slope.
A previous metric, “Error Assessment of Response Time Histories (EARTH),” provides three independent measures to evaluate the predicted results of a computer model relative to empirical data associated with the key features of the functional responses, such as phase error, magnitude error, and slope error, that represent the physical characteristics of the response. This metric uses dynamic time warping to reduce the interactions among the three types of errors that measure the discrepancy between time histories of empirical data relative to model predictions, and has a smaller number of metric tuning parameters relative to many other metrics. Because the ranges of the three errors may be quite different and there is no single rating that can provide a quantitative assessment alone, the initial EARTH metric employs a linear regression method to combine the three errors into one score. A numerical optimization method is employed to identify the linear coefficients so that the resulting EARTH rating can match closely with subjective ratings of experts in the field for a specific application. However, the linear combination of the component errors in the EARTH metric is mainly numerically based and application dependent, and therefore may not be scalable to other applications. In addition, a sensitivity study of the EARTH metric indicates that the EARTH metric does not provide desired robustness for some applications with respect to the number of samples used in the evaluation. In particular, the magnitude and slope errors change significantly based on the number of samples used in the analysis.
A computer system and computer-implemented method executed on a computer system for determining an objective metric for a computer model of a dynamic system based on an analysis of computer generated data relative to empirical test data stored in a computer readable storage device include time-shifting the computer generated data relative to the empirical test data and computing an associated cross-correlation for each time shifted data set, determining a phase error and phase score based on the time shifted data set that provides a maximum cross-correlation, performing dynamic time warping on the maximum cross-correlation time shifted data set using a cost function based only on distance between associated data points of the time shifted data set and test data and determining an associated magnitude error and magnitude score, determining a slope error and slope score based on the maximum correlation time shifted data set and the test data, and combining the phase score, the magnitude score, and the slope score to determine the objective metric for the computer model. In various embodiments, the system and method may also include auto-calibration of metric parameters. The auto-calibration may include comparison of subjective ratings stored in a corresponding database in a computer readable storage device that includes data representing similarity between representative empirical data sets and computer generated data sets. Metric parameters may be tuned or optimized so that the objective metric corresponds to subjective ratings by subject matter experts.
In one embodiment a computer system and computer-implemented method executed on a computer system perform dynamic time warping on test data and computer generated data to determine a magnitude error and magnitude score using a cost function that includes only zero order derivatives, i.e. does not rely on the slope or topology of the test data curve and computer generated data curve. A slope error is determined by dividing the time (phase) shifted computer generated data into multiple intervals each having a plurality of data points and calculating the average slopes of each interval to generate slope curves without using dynamic time warping. A slope score is determined using metric parameters to assign a score between zero and unity or equivalent percentages.
Various embodiments according to the present disclosure provide associated advantages. For example, systems and methods according to embodiments of the present disclosure may be used to quantitatively assess the accuracy and predictive capacity of a computer model of a dynamic system with multiple responses. The systems and methods quantify error associated with phase, magnitude, and shape (slope) independently using dynamic time warping to minimize the effect of localized phase and topology while measuring magnitude and topological error. Magnitude error is calculated using a cost function that is robust with respect to the number of samples used. The different error measures are combined to provide an overall error measure and a single intuitive score for the computer model relative to the selected application. The metric uses a small set of parameters that have associated physical corollaries to facilitate subject matter experts' subjective analysis through a parameter calibration process to determine thresholds, and is scalable to different applications.
As required, detailed embodiments of the present invention are disclosed herein; however, it is to be understood that the disclosed embodiments are merely exemplary of the invention that may be embodied in various and alternative forms. The figures are not necessarily to scale; some features may be exaggerated or minimized to show details of particular components. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a representative basis for teaching one skilled in the art to variously employ the present invention.
The present inventors recognized that the prior art EARTH metric used to evaluate computer models relative to empirical data was not robust, with results varying for different numbers of samples in the empirical data set. A number of other robustness issues were also identified. For example, the linear fitting used in the EARTH metric to calculate slope curves may introduce approximation error. The Dynamic Time Warping (DTW) path is sensitive to differences in data interpolation because it uses both distance and slope components in the cost function calculation. In addition, applying DTW to the slope curves may reduce or eliminate local shape differences, the slope error calculation is too sensitive to the number of data points, and the slope score does not correlate well with subjective evaluations of subject matter experts.
Embodiments according to the present disclosure provide systems and methods for rating a computer model relative to empirical results for dynamic systems that maintain the advantages of the EARTH metric while providing a number of additional advantages. In addition to the previously described advantages, the enhanced EARTH (EEARTH) metric is more robust and provides consistent magnitude and slope ratings with better correlation to subjective ratings provided by subject matter experts.
The EARTH metric is divided into two categories: global response error and target point response error. The global response error is defined as the error associated with the complete time history with equal weight on each point. The three main components of the global response error are phase error, magnitude error, and topology (or slope) error. The target point error is defined as the error associated with a certain localized phenomenon of interest, such as peak error and time-to-peak error. The target point error represents the characteristic of a part of the time history, but does not indicate an overall performance of the entire time history. In addition, the target point error is generally application dependent and therefore it is not described in detail.
Separately quantifying the errors associated with phase, magnitude and topology/slope is challenging because they are not independent and have significant interactions. For example, to quantify the error associated with magnitude, the presence of a phase difference between the time histories may result in a misleading measurement. A unique feature of the EARTH metric is employing a known technique of dynamic time warping (DTW) to separate the interaction of phase, magnitude, and topology/slope errors. DTW is an algorithm for measuring discrepancy between time histories. It aligns peaks and valleys as much as possible by expanding and compressing the time axis according to a cost (distance) function. As recognized by the present inventors, the cost function specified for the DTW algorithm used in the EARTH metric, together with the method employed to calculate the magnitude and slope errors, may contribute to a lack of robustness, particularly with respect to sensitivity to the number of samples.
A block diagram illustrating operation of a system or method for rating computer model data relative to empirical data for dynamic systems according to embodiments of the present disclosure is shown in
Block 20 represents the original empirical or test data (T), while block 22 represents the original computer model data or CAE data (C). The empirical data is collected from sensors or transducers, such as accelerometers or force sensors, for example, with the signals from the sensors gathered during an experiment or test. For example, crash test data may include data from multiple sensors collected during a crash test to measure force/acceleration for head, neck, and chest of a crash test dummy. A computer model of a corresponding simulated crash is used to generate CAE data 22. The data is pre-processed so that both empirical data 20 and model data 22 have similar measurement characteristics, such as sampling rate, filtering, etc. In one embodiment, both data sets are represented as non-ambiguous curves (e.g., time-history curves), and both signals are synchronized with respect to the physical meanings of the signals' characteristics so that both signals are aligned by physical meanings and timing. In addition, for each time step of the reference signal, a value of the analyzed signal should be provided, with both signals assessed at their common sampling points. The signals should also use the same system of units.
In one embodiment, a sampling rate of 10 kHz is used for the data signals 20, 22 for analysis using the algorithm described herein. Signals of higher or lower sampling rates may be re-sampled to this rate as part of the pre-processing. Those of ordinary skill in the art will recognize that the EEARTH metric may also work with other sampling rates. However, the tuning parameters may need to be adjusted accordingly, and the score interpretation may be affected.
Because the metric calculations could be difficult when using very noisy signals, data collection and/or pre-processing may include filtering of the signals. In addition, the assessment of the correlation should be focused on the relevant parts of the given signals. For automotive safety applications, such as vehicle crash tests, signals may include pre-crash and post-crash phases that are usually not of interest and should be excluded from the metric. Therefore, an interval of evaluation should be selected that describes the part of the signals of interest to be assessed.
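By way of non-limiting illustration, the re-sampling and interval selection described above may be sketched as follows in Python. The function name `resample_linear`, the use of linear interpolation, and the argument names are assumptions made for this sketch and are not taken from the disclosure. The sketch further assumes that the sample times are strictly increasing and that the interval of evaluation lies within the recorded time range.

```python
def resample_linear(times, values, rate_hz, t_start, t_end):
    """Linearly interpolate (times, values) onto a uniform grid.

    The grid runs from t_start to t_end (seconds, inclusive) at rate_hz,
    which also restricts the signal to the interval of evaluation.
    Assumes `times` is strictly increasing and covers [t_start, t_end].
    """
    n = int(round((t_end - t_start) * rate_hz)) + 1
    grid = [t_start + k / rate_hz for k in range(n)]
    out = []
    j = 0
    for t in grid:
        # Advance to the segment [times[j], times[j + 1]] that contains t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        v0, v1 = values[j], values[j + 1]
        out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return grid, out
```

Applying the same grid to both the test signal and the CAE signal provides both signals at common sampling points, for example `resample_linear(times, values, 10_000, t_start, t_end)` for the 10 kHz rate of the embodiment described above.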
With continuing reference to
When the original C curve is moved to the right by m time steps, the number of overlap points n after the time shift is reduced to (N−m) and the corresponding cross correlation value ρR(m) is calculated according to:
This is repeated to determine the maximum or best cross correlation between test curve 20 and computer model curve 22 as generally represented by block 28. The maximum cross correlation ρE is the maximum of all ρL(m) and ρR(m). The number of the time shifting steps that yields the maximum cross correlation ρE is defined as the phase error nε as represented by block 30. The corresponding shifted and truncated CAE curve C is recorded as Cts and represented by block 40, and the corresponding truncated test curve is recorded as Tts and represented by block 42.
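By way of non-limiting illustration, the time-shift search described above may be sketched as follows in Python. The cross-correlation expression is not reproduced in this text, so the sketch assumes the Pearson correlation coefficient computed over the overlapping points. The names `pearson`, `best_time_shift`, and `max_shift` are hypothetical, as is the sign convention (a positive shift means the CAE curve is moved to the right).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # undefined for a constant segment; treated as 0 here
    return sxy / sqrt(sxx * syy)


def best_time_shift(test, cae, max_shift):
    """Return (rho_max, shift, cae_ts, test_ts) for the best alignment.

    shift > 0: CAE curve moved right by `shift` steps.
    shift < 0: CAE curve moved left by `-shift` steps.
    Both returned curves are truncated to the N - |shift| overlap points.
    """
    n = len(test)
    best = None
    for m in range(-max_shift, max_shift + 1):
        if m >= 0:
            # CAE moved right: cae[i] is compared with test[i + m].
            c, t = cae[: n - m], test[m:]
        else:
            # CAE moved left: cae[i + |m|] is compared with test[i].
            c, t = cae[-m:], test[: n + m]
        rho = pearson(c, t)
        if best is None or rho > best[0]:
            best = (rho, m, c, t)
    return best
```

In this sketch the absolute value of the returned shift plays the role of the phase error nε, and the returned truncated curves correspond to Cts and Tts.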
The phase error determined at block 30 is then used to calculate the phase score as represented by block 32. The phase score may be calculated or determined according to the following:
where the allowable time shift threshold parameters and corresponding representative values for a typical application are represented by:
As also represented in
The magnitude error is a measure of discrepancy in the amplitude of the time histories of the test curve 20 and computer model curve 22. The magnitude error is defined as the difference in amplitude of the two time histories when there is no time lag between them. Before calculating the magnitude error as represented by block 46, the difference between the time histories caused by error in phase and topology/slope are minimized by using dynamic time warping as represented by block 44. The initial EARTH metric magnitude scores changed significantly when the number of the sample points was reduced from 2000 to 250. Further investigation identified that the local cost function of the dynamic time warping involving both the distance and the slope, was the main cause of this significant change in the magnitude score. As such, the EEARTH calculation according to embodiments of the present disclosure uses the following local cost function, which is less sensitive or more robust to the number of samples used in the calculation:
$$d(i,j) = \left(C_{ts}(i) - T_{ts}(j)\right)^{2}$$
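By way of non-limiting illustration, dynamic time warping with this distance-only local cost may be sketched as follows in Python. The function name `dtw_path`, the step pattern (horizontal, vertical, and diagonal moves), and the absence of a warping window constraint are assumptions made for the sketch and are not taken from the disclosure.

```python
def dtw_path(cts, tts):
    """Dynamic time warping with a distance-only (zero-order) local cost.

    Returns (total_cost, path), where path is a list of (i, j) index pairs
    aligning cts[i] with tts[j] along the minimum accumulated cost.
    """
    n, m = len(cts), len(tts)
    inf = float("inf")
    # Local cost matrix: d(i, j) = (Cts(i) - Tts(j))^2, no slope term.
    d = [[(cts[i] - tts[j]) ** 2 for j in range(m)] for i in range(n)]
    # Accumulated cost matrix.
    acc = [[inf] * m for _ in range(n)]
    acc[0][0] = d[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            prev = min(
                acc[i - 1][j] if i > 0 else inf,
                acc[i][j - 1] if j > 0 else inf,
                acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            acc[i][j] = d[i][j] + prev
    # Backtrack from the last pair to the first along the cheapest predecessors.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((acc[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return acc[n - 1][m - 1], path
```

Because the local cost depends only on the values of the two curves and not on their slopes, the sketch involves no derivative estimate that would change with the sampling density.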
The cost function is used to generate a local cost matrix that is stored in the computer readable storage device. The cost matrix is used by the DTW algorithm to find the alignment path that runs through the low-cost areas in the cost matrix. This alignment path defines the correspondence of elements of both Cts (i) and Tts (j) that will lead to the minimum accumulated cost function. The magnitude error εmag is then calculated as represented by block 46 according to:
The magnitude error is then used to calculate the magnitude score as represented by block 48 according to:
The magnitude score is represented by EM, where εm* is the maximum allowable magnitude error, and KE
The topological or slope error is a measure of discrepancy in topology/slope of the test curve 20 and computer model curve 22. The topology/slope of a time history is defined by the slope at each point. To ensure that the effect of global time shift is minimized, the slope is calculated from the truncated time shifted histories Tts and Cts as generally represented by blocks 40 and 42 of
The inventors of the present disclosure recognized that the EARTH slope scores were also affected by, or sensitive to, different sampling rates. This sensitivity was determined to be due to the implementation of the slope curve calculation and the use of dynamic time warping on these slope curves before calculating the EARTH slope error. In the initial EARTH metric, a polynomial fitting was first employed to smooth the time shifted histories Tts and Cts, and then the derivative curves (Cts+d and Tts+d) were calculated from the polynomial fitting curves. The polynomial fitting is an approximation method, so it can introduce variation into the metric. In addition, dynamic time warping was performed on the resulting slope curves before calculating the slope error. The inventors noted that the DTW used here could reduce the slope differences, so that the EARTH slope score may not be able to differentiate between good and poor correlations.
In the EEARTH metric according to embodiments of the present disclosure, the time shifted histories Tts and Cts are first divided into multiple intervals with pre-defined length/time based on the sampling rate (e.g. 1 ms) so that each interval includes multiple data points. Next, the average slope is calculated in each interval to generate slope curves (Cts+d and Tts+d) as represented by blocks 60 and 62. The slope curves are then used to calculate the slope error directly without performing dynamic time warping.
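By way of non-limiting illustration, the interval-based slope curve may be sketched as follows in Python. The function name `interval_slopes` and its arguments are hypothetical. The sketch assumes consecutive, non-overlapping intervals and drops any incomplete trailing interval, neither of which is specified in the disclosure. The average of the point-to-point slopes within an interval equals the slope between the first and last points of that interval, which is what the sketch computes.

```python
def interval_slopes(values, dt, points_per_interval):
    """Average slope of each consecutive, non-overlapping interval.

    values: uniformly sampled curve (e.g. Cts or Tts)
    dt: sampling period in seconds
    points_per_interval: number of data points in each interval (>= 2)
    """
    slopes = []
    last_start = len(values) - points_per_interval
    for start in range(0, last_start + 1, points_per_interval):
        end = start + points_per_interval - 1
        # Mean of the point-to-point slopes telescopes to the end-to-end slope.
        span = (points_per_interval - 1) * dt
        slopes.append((values[end] - values[start]) / span)
    return slopes
```

For a 10 kHz signal and 1 ms intervals, `dt` would be 1e-4 s and `points_per_interval` would be 10 under the assumptions above.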
The slope error is then calculated based on the slope curves as represented by block 64 according to:
The slope error is then used to calculate the slope score as represented by block 66 according to:
The slope score is determined in a similar manner as the magnitude score as previously described. The maximum allowable slope error defines the order of the regression. In this way, the best EEARTH slope score is 100%, which means there is no difference between the two slope curves. If the slope error is equal to or greater than the maximum allowable slope error threshold or constraint, then the EEARTH slope score is 0%. For values in between, the EEARTH slope score is calculated by the regression method shown. Similar to the magnitude and phase scores, the metric calibration parameters, thresholds, or constraints may be obtained from a database of values stored in the computer readable storage device. The database may contain metric parameter values as determined during a calibration process using subjective evaluations by subject matter experts (SMEs) as described in greater detail below.
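By way of non-limiting illustration, a score mapping with the endpoint behavior described above may be sketched as follows in Python. The score expression itself is not reproduced in this text, so the polynomial form used here is an assumption chosen only because it yields 100% at zero error, 0% at or above the maximum allowable error, and a regression of a selectable order in between. The names `threshold_score`, `max_error`, and `order` are hypothetical.

```python
def threshold_score(error, max_error, order):
    """Map an error to a score in [0, 1].

    Assumed form: ((max_error - error) / max_error) ** order below the
    threshold, and 0 at or above it. Not taken from the disclosure.
    """
    if error >= max_error:
        return 0.0
    return ((max_error - error) / max_error) ** order
```

With this assumed form, a larger order penalizes intermediate errors more heavily while leaving the two endpoints unchanged.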
As such, the EEARTH metric according to embodiments of the present disclosure improves robustness by reducing sensitivity of the slope error and slope score to the number of samples by (1) dividing the phase shifted curves into multiple intervals of pre-defined length, each having multiple data points, (2) calculating the average slope of each interval to generate slope curves, and (3) calculating the slope error without the use of dynamic time warping. Analysis reveals that the EEARTH metric slope ratings are not significantly affected by changes in the sampling rates and better correspond with subjective ratings of subject matter experts as compared with the original EARTH metric.
The three EEARTH sub-scores for the phase 32, magnitude 48, and slope 66 are combined using associated weighting factors as represented by block 68 according to:
$$E = w_{P} \cdot E_{P} + w_{M} \cdot E_{M} + w_{S} \cdot E_{S}$$
The weighting factors may vary depending on the particular application and may be determined in a similar fashion as other metric calibration parameters by subject matter experts for a particular application. In one representative embodiment, equal weighting factors of ⅓ are applied to the sub-scores to generate a single EEARTH score metric as represented by block 70. Depending on the particular application, the single EEARTH score metric may be further combined with one or more other metrics to rate the computer model performance relative to empirical data for a particular dynamic system.
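By way of non-limiting illustration, the weighted combination of the three sub-scores may be sketched as follows in Python, with the equal weighting factors of ⅓ as defaults. The function and argument names are hypothetical.

```python
def combined_score(e_p, e_m, e_s, w_p=1 / 3, w_m=1 / 3, w_s=1 / 3):
    """Weighted sum of the phase, magnitude, and slope sub-scores."""
    return w_p * e_p + w_m * e_m + w_s * e_s
```

For example, sub-scores of 0.9, 0.6, and 0.3 with equal weights give a combined score of approximately 0.6.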
The auto-tuning or auto-calibration process begins with generating a representative dynamic response database as represented by block 210. A set of representative dynamic responses with test data and computer model data is stored in the database with the database being stored in one or more computer readable storage devices as described with reference to
Block 220 of
An optimization goal with corresponding constraints for the EEARTH metric auto calibration is formulated as generally represented by block 240. This may include defining an optimization objective and design variables and ranges for a particular application. Once a metric calibration goal is formulated, an optimization algorithm is employed to find the optimal values of the EEARTH metric parameters as represented by blocks 240, 250, and 260. The metric parameter values are calculated using the SME ratings database, with each result evaluated and determined to be acceptable or not for the particular application as represented by block 250. If the objective ratings of the EEARTH metric are not an acceptable match to the subjective ratings of the SMEs, as determined at block 250, then the parameter values are adjusted or updated as represented by block 260. This optimization loop continues until an acceptable set of parameter values is obtained. When acceptable parameter values are determined as represented by block 250, then the EEARTH metric parameter values are finalized as represented by block 270 and used in subsequent determination of the EEARTH metric score as described above.
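By way of non-limiting illustration, the calibration loop described above may be sketched as follows in Python for a single sub-score. The disclosure does not name a particular optimization algorithm; an exhaustive grid search is substituted here purely for simplicity. The score form, the sum-of-squares mismatch objective, the acceptance tolerance, and all function and argument names are assumptions made for the sketch.

```python
from itertools import product


def score(error, max_error, order):
    """Assumed score form: 1 at zero error, 0 at or above max_error."""
    if error >= max_error:
        return 0.0
    return ((max_error - error) / max_error) ** order


def calibrate(errors, sme_ratings, max_error_grid, order_grid, tolerance):
    """Grid search for (max_error, order) that best matches SME ratings.

    Returns (mismatch, max_error, order) for the best candidate found.
    The search stops early once the mismatch is within `tolerance`.
    """
    best = None
    for max_error, order in product(max_error_grid, order_grid):
        # Objective: squared difference between metric scores and SME ratings.
        mismatch = sum(
            (score(e, max_error, order) - r) ** 2
            for e, r in zip(errors, sme_ratings)
        )
        if best is None or mismatch < best[0]:
            best = (mismatch, max_error, order)
        if mismatch <= tolerance:
            break  # acceptable match: finalize these parameter values
    return best
```

In this sketch, each candidate evaluation corresponds to the acceptability check, moving to the next grid point corresponds to updating the parameter values, and the returned tuple corresponds to the finalized parameter values.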
In one embodiment, system 300 includes a computer 310 configured to execute instructions and process data stored in computer readable storage devices 312 to determine an objective metric for a computer model of a dynamic system based on an analysis of computer generated data relative to empirical test data. Computer 310 includes software and/or hardware configured to time-shift the computer generated data relative to the empirical test data and compute an associated cross-correlation for each time shifted data set, determine a phase error and phase score based on the time shifted data set that provides a maximum cross-correlation, and perform dynamic time warping on the maximum cross-correlation time shifted data set using a cost function based only on distance between associated data points of the time shifted data set and test data and determine an associated magnitude error and magnitude score. Computer 310 may also be configured to determine a slope error and slope score based on the maximum correlation time shifted data set and the test data, and to combine the phase score, the magnitude score, and the slope score to determine the objective metric for the computer model.
As demonstrated by the representative embodiments according to the present disclosure, an objective metric such as the EEARTH metric provides various associated advantages relative to previous metrics used to evaluate computer generated test data. For example, systems and methods according to embodiments of the present disclosure may be used to quantitatively assess the accuracy and predictive capacity of a computer model of a dynamic system with multiple responses. The systems and methods quantify error associated with phase, magnitude, and shape (slope) independently using dynamic time warping to minimize the effect of localized phase and topology while measuring magnitude and topological error. Magnitude error is calculated using a cost function that is robust with respect to the number of samples used. The different error measures are combined to provide an overall error measure and a single intuitive score for the computer model relative to the selected application. The metric uses a small set of parameters that have associated physical corollaries to facilitate subject matter experts' subjective analysis through a parameter calibration process to determine thresholds, and is scalable to different applications.
While exemplary embodiments are described above, it is not intended that these embodiments describe all possible forms of the invention. Rather, the words used in the specification are words of description rather than limitation, and it is understood that various changes may be made without departing from the spirit and scope of the invention. Additionally, the features of various implementing embodiments may be combined to form further embodiments of the invention. While various embodiments may have been described as providing advantages or being preferred over other embodiments with respect to one or more desired characteristics, as one skilled in the art is aware, one or more characteristics may be compromised to achieve desired system attributes, which depend on the specific application and implementation. These attributes include, but are not limited to: cost, strength, durability, life cycle cost, marketability, appearance, packaging, size, serviceability, weight, manufacturability, ease of assembly, etc. The embodiments discussed herein that are described as less desirable than other embodiments or prior art implementations with respect to one or more characteristics are not outside the scope of the disclosure and may be desirable for particular applications.