This invention relates to processing of digital flight data that have been recorded on aircraft during flight operations.
On a typical day, as many as 25,000 aircraft flights occur within the United States, and several times that number occur throughout the world. Most of these flights are safe. A few might exhibit safety issues. Many aircraft are equipped with instrumentation that collects from a few dozen parameters to a few thousand parameters every second for the full duration of the flight. This provides an opportunity to analyze these data to identify portions of flights that exhibit safety issues. Aviation experts review these flights and recommend appropriate actions as a result.
Flight data, recorded during aircraft flight, consist of a series of parameter values. Each parameter describes a particular aspect of flight. Some parameters relate to continuous data such as altitude and airspeed. Other parameters assume a relatively small number of discrete values (e.g., two or three), such as thrust reverser position, flight guidance or autopilot command mode. Parameter measurements are usually made once per second although they may be recorded more or less frequently. Hundreds or even thousands of parameters may be collected for each second of an entire flight. These data are recorded for thousands of flights. The resulting data for even a modest-sized set of flights are voluminous.
These types of data have long been used for crash investigations but can also be used for routine monitoring of flight operations. The subject invention relates to the latter activity. The features of interest in routinely monitored flight data include specified exceedences (excessive speed, g-forces, and other characteristics that differ from standard operating procedures), unusual events, and statistical patterns and/or trends.
Digital flight data are passed through a series of processing steps to convert the massive quantities of raw data, collected during routine flight operations, into useful information such as that described above. The raw data are progressively reduced using both deterministic and statistical methods. In the final stages of processing, statistical methods are used to identify flights to be reviewed by aviation experts, who infer key safety and operational information about the flights described in the data. These flight data processing methods are imbedded in software.
Conventional methods of finding anomalous flights in bodies of digital flight data require users to pre-define the operational patterns that constitute unwanted performances. This can be a hit-or-miss process, requiring the experience and knowledge of experts in aviation operations, and it only identifies occurrences that specifically match the pre-defined condition. A conventional flight data analysis tool will find the patterns it is told to look for in flight data, but the tool is blind to newly emergent patterns for which the tool has not been programmed to look. The invention overcomes this deficiency because it does not require any pre-specification of what to look for in bodies of flight data.
Most flights are typical and exhibit no safety issues. A very few flights stand out as atypical based on values displayed by the data. These flights may be atypical due to one flight parameter being very unusual or multiple parameters being moderately unusual. These unusual flights often exhibit safety issues and thus are of interest to identify and refer to aviation safety experts for review. Additionally, these atypical flights might display safety issues in a manner never envisioned by safety experts, and hence are impossible to find using pre-defined exceedences, as is done in the current state of the practice.
What is needed is a system for identifying and displaying results for atypical phases of aircraft flights that provides individual and collective information on the flight phases that are determined to be atypical according to one or more criteria. Preferably, the display system should allow graphic and tabular display and comparison of relevant details that contribute to a specified phase atypicality and collective phase information for which atypical behavior occurs.
These needs are met by the invention, which displays quantitative collective information and information on individual aircraft flights that have been determined to be “atypical,” according to one or more specified criteria disclosed in a co-pending patent application, “Identification of Atypical Flight Patterns,” (U.S. Ser. No. 10/857,376, sometimes referred to as “IATP” herein) which is incorporated by reference herein. Conditions that contributed to one or more atypical phases for each specified flight are displayed in graphical and tabular format, and additional information is optionally displayed on relevant details that may have contributed to atypicality.
The IATP analysis allows identification of the most important flight parameters, capture and characterization of the dynamic values of these important parameters, and application of a consistent analysis to identify aircraft flights that exhibit atypical characteristics, without limiting the nature of the atypicalities to envisionable or pre-defined conditions. A flight may be atypical because one or more of these parameters exhibits atypical values with respect to a set of flights that collectively define “typical,” or because individual parameters are each only marginally atypical but are collectively atypical. The analysis is extendable to a larger or smaller number of “important” parameters and should not depend upon choice of a fixed number of such parameters. This analysis, in order to be useful, should provide the resulting information in textual and graphical formats for review by a user.
The IATP analysis provides an approach: (1) to provide a set of time varying flight parameters that are “relevant;” (2) to transform this set of flight parameters into a minimal orthogonal set of transformed flight parameters; (3) to analyze values of each of these transformed flight parameters within a time interval associated with the flight phase; (4) to apply these analyses to the data for each aircraft flight; and (5) to identify flights in which the multivariate nature of these transformed flight parameters is atypical, according to a consistently applied procedure.
The IATP always begins with a selected subset of relevant flight parameters, each of which is believed to potentially characterize the nature of a selected aircraft's flight (q), for a selected phase (ph) of the flight (e.g., pre-takeoff taxi, pre-takeoff position, takeoff, low altitude ascent, high altitude ascent, cruise, high altitude descent, low altitude descent, runway approach, touchdown and post-touchdown taxi). Application of this criterion often reduces the number of flight parameters from a few thousand to a number as low as about 100, or lower if desired, referred to herein as underlying flight parameters (“FPs”). The data value for each record and for each FP is inspected to determine if the data are reasonable and should be used to characterize the nature of the aircraft's flight or if it is “bad” data that has been corrupted. If the data value is deemed “bad” that value is removed from the analysis process for those records where it is deemed “bad”.
The (remaining) sequence of received FP values is analyzed separately for parameters that are interval ratio continuous numbers and for parameters that are ordinal or categorical parameters, sometimes referred to as discrete value parameters. A continuous value parameter value is approximated in each of a sequence of overlapping time intervals as a polynomial (e.g., quadratic or cubic), plus an error term. Each of the sequence of approximation coefficients for the sequence of time intervals is characterized by a first order statistic, a second order statistic, a minimum value and a maximum value, and, optionally, by at least one of a beginning value and an ending value for the sequence. The discrete value parameters are analyzed and characterized in terms of proportion of time at each discrete value and number of transitions between discrete values. The continuous value and discrete value characterization parameters are combined as an M×1 vector E for each flight. The set of flights is combined to form a matrix for which a covariance matrix F is computed.
An eigenvalue equation, FV(λ)=λV(λ), is solved. The data matrix formed by combining the M×1 vectors E for the set of flights is transformed by a data matrix to form a new matrix G. The set of all eigenvalues can be, and preferably will be, replaced by a reduced set of eigenvalues having the largest values.
A cluster analysis is performed on the new matrix G, with each flight being assigned to one of the clusters. The Mahalanobis distance for the flight with respect to the mean of all the flights (based on the G matrix) forms an estimate of the atypicality score for each flight, (q), in each phase, (ph). This atypicality score for flight (q) and phase (ph) is combined with the proportion of flights in the cluster with which flight q/phase ph is associated, to calculate a new atypicality value, referred to as a Global Atypicality Score (GAS).
The Global Atypicality Scores for all the flights are ranked in decreasing order. The flights in the top portion (typically 5%) are labeled “atypical” (“Level 2” and “Level 3”) and the most atypical of these flights are identified as “Level 3”. These flights are brought to the user's attention in a list. The user can select any of these flights and drill down to get additional information about the flight, including comparison of its parameter values to the values of other flights. These procedures are part of the IATP analysis.
The display system receives the results of intermediate and completed calculations and displays, in alphanumeric format and/or graphically, several quantities, such as: number of level 1, level 2 and level 3 atypical phase flights; specific flight attributes that contributed to the phase atypicality, including (optionally) identification of the flight and aircraft; comparison of a time varying trace of an atypical-phase flight with traces for a collection of similar but non-atypical-phase flights; and aircraft corrective actions, if any, taken in response to the observed phase atypicality.
In the IATP analysis, a sequence of values for each of a selected set of P relevant flight parameters FP is received, and unacceptable values are removed according to one or more of the following: (1) each value un of a sequence is compared with a range of acceptable values, U1≦u≦U2, and if the parameter value un lies outside this range, this value is removed from the received sequence; and (2) a first difference of two consecutive values, un−1 and un, is compared with a range of acceptable first differences, Δ1U1≦un−un−1≦Δ1U2, and if the computed first difference lies outside this range, at least one of the values, un−1 and un, is removed from the received sequence.
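The two removal tests can be illustrated with a minimal Python sketch. The function name, the use of None to mark a removed value, and the numeric bounds are all hypothetical. The sketch also assumes that, when a first difference is out of range, the later value is the one removed, and that differences are taken against the most recent accepted value; the text leaves both choices open.

```python
from typing import List, Optional


def remove_bad_values(
    values: List[float],
    lo: float,
    hi: float,
    d_lo: float,
    d_hi: float,
) -> List[Optional[float]]:
    """Return a copy of `values` with unacceptable entries replaced by None.

    Test (1): a value outside [lo, hi] is removed.
    Test (2): if the difference between a value and the last accepted value
    lies outside [d_lo, d_hi], the later value is removed (an assumption of
    this sketch; the text permits removing either value).
    """
    cleaned: List[Optional[float]] = []
    prev: Optional[float] = None  # last value that passed both tests
    for u in values:
        if not (lo <= u <= hi):
            cleaned.append(None)  # fails the range test
            continue
        if prev is not None and not (d_lo <= u - prev <= d_hi):
            cleaned.append(None)  # fails the first-difference test
            continue
        cleaned.append(u)
        prev = u
    return cleaned


# Hypothetical altitude samples (feet) with one spike and one jump.
altitude_ft = [1000.0, 1010.0, 99999.0, 1030.0, 1500.0, 1050.0]
cleaned = remove_bad_values(altitude_ft, lo=0.0, hi=45000.0, d_lo=-50.0, d_hi=50.0)
```

Marking removed values rather than deleting them keeps the record index aligned with time, which the later interval-based analysis relies on.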
For continuous value parameters, each such parameter is analyzed by applying a time-based function over each of a sequence of partly overlapping time intervals (tn0, tn0+N−1) of substantially constant temporal length (N values) to develop, for each such time interval and for each FP, a polynomial approximation in a time variable t (e.g., quadratic or cubic), plus an error coefficient. For example, the polynomial may be a quadratic sum, such as
$$p(t;\mathrm{app}) \approx p_0(n_0) + p_1(n_0)\,(t - t_{n_0}) + p_2(n_0)\,(t - t_{n_0})^2 + e(n_0) \tag{1A}$$
For the sequence of time intervals in the selected phase for the selected FP, each of the sequence of coefficients {p0(n0)}n0, {p1(n0)}n0, {p2(n0)}n0 and {d(n0)}n0, considered as a vector v of entries, is represented by characterization parameters, which include a first order statistic m1(v) (e.g., weighted mean, weighted median, mode), a second order statistic m2(v) (e.g., standard deviation), a minimum value min(v), a maximum value max(v), and optionally a beginning value begin(v) and/or an ending value end(v) for that coefficient sequence. The collection of these characterization parameters is formatted and stored as an M1×1 vector E1, representing the collection of time intervals for that phase (ph) for that flight parameter for that flight (q).
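The windowed quadratic fit and the characterization parameters can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the window width and step, the unweighted mean, and the population standard deviation are hypothetical choices, and the optional begin and end values are omitted.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def solve3(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in range(2, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - tail) / m[r][r]
    return x


def fit_quadratic(window: Sequence[float]) -> Tuple[float, float, float, float]:
    """Least-squares fit of p0 + p1*s + p2*s**2, with s = t - t_n0 = 0..N-1.

    Returns (p0, p1, p2, d), where d is the residual sum of squares
    divided by (N - 3).
    """
    n = len(window)
    s = range(n)
    power_sums = [sum(si ** k for si in s) for k in range(5)]
    moment_sums = [sum((si ** k) * y for si, y in zip(s, window)) for k in range(3)]
    normal = [[float(power_sums[i + j]) for j in range(3)] for i in range(3)]
    p0, p1, p2 = solve3(normal, moment_sums)
    rss = sum((y - (p0 + p1 * si + p2 * si * si)) ** 2 for si, y in zip(s, window))
    return p0, p1, p2, rss / (n - 3)


def characterize(values: Sequence[float], width: int, step: int) -> List[float]:
    """Summarize each coefficient sequence by mean, std. dev., min and max."""
    fits = [
        fit_quadratic(values[start:start + width])
        for start in range(0, len(values) - width + 1, step)
    ]
    entries: List[float] = []
    for coeff_seq in zip(*fits):  # sequences of p0, p1, p2 and d over the windows
        entries += [mean(coeff_seq), pstdev(coeff_seq), min(coeff_seq), max(coeff_seq)]
    return entries


# Hypothetical parameter trace that is exactly quadratic in time.
samples = [2.0 + 3.0 * t + 0.5 * t * t for t in range(12)]
e1 = characterize(samples, width=6, step=3)  # step < width gives overlapping windows
```

With four coefficient sequences and four statistics each, this sketch yields 16 entries per flight parameter; adding begin and end values would raise that to 24.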
Each ordinal or categorical parameter (sometimes referred to as a discrete-valued parameter), numbered k2=1, . . . , K2 and having L(k2) discrete states, is analyzed by forming a square transition matrix, with each row and each column representing one of the possible states or values of the parameter. Each data point from the full flight phase is processed by counting the number of transitions Ni,i+1 from a state Si on record i to an immediately subsequent state Si+1 on record i+1, including the number of transitions of a state to itself. Each diagonal entry in this transition matrix is divided by the sum of the original diagonal values, and the matrix is converted to an L(k2)²×1 vector E2k2, where L(k2) is the number of distinct values for this parameter, k2. The set of vectors E2k2 for all the discrete parameters of the phase for this flight are concatenated into a vector E2 that is L×1, where L is the sum of L(k2)² over all k2=1, . . . , K2.
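A minimal Python sketch of the transition-matrix characterization for one discrete-valued parameter follows. The function name and the encoding of states as integers 0 to L(k2)−1 are assumptions made for illustration, as is the row-by-row flattening order.

```python
from typing import List, Sequence


def transition_vector(states: Sequence[int], n_states: int) -> List[float]:
    """Flatten a transition-count matrix into a vector of length n_states**2.

    Off-diagonal entries keep the raw transition counts; each diagonal entry
    is divided by the sum of the original diagonal entries.
    """
    counts = [[0.0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1.0  # includes a state followed by itself
    diag_sum = sum(counts[i][i] for i in range(n_states))
    if diag_sum > 0:
        for i in range(n_states):
            counts[i][i] /= diag_sum
    return [entry for row in counts for entry in row]


# Hypothetical two-state parameter (e.g., 0 = stowed, 1 = deployed), one value per record.
e2_k = transition_vector([0, 0, 0, 1, 1, 0], n_states=2)
```

In this example the diagonal entries become the share of self-transitions spent in each state, and the off-diagonal entries count the changes between states.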
The discrete parameter vector E2 for the phase ph is combined with the M1×1 vector E1 for continuous value parameters to form an M×1 vector E (M=M1+L) that includes the contributions of continuous and discrete value parameters. The E vectors from each of the Q flights in the set selected to be studied are combined to form a matrix, denoted as DM. Optionally, vectors E for adjacent phases can be combined to perform a multiple phase analysis, if desired.
An M×M covariance matrix
F=cov(E) (2)
is formed, which is symmetric and non-negative definite, and an eigenvalue equation
F·V(λ)=λV(λ) (3)
is solved to determine a sequence of M=M1+L eigenvalues λi with λ1≧λ2≧ . . . ≧λM≧0. The eigenvalue equation (3) can be solved in a straightforward manner, or a singular value decomposition (SVD) approach can be used, as described by Kennedy and Gentle in Statistical Computing, Marcel Dekker, Inc., 1980 pp 278–286, or in any other suitable numerical analysis treatment. (The method used is equivalent to what is known as principal component analysis.) One works with a selected subset {λ′i} of these eigenvalues, which may be a proper subset of M′ eigenvalues (M′<M), where
$$\sum_{i=1}^{M'} \lambda'_i \ge f \cdot \sum_{i=1}^{M} \lambda_i \tag{4}$$
and f is a selected fraction satisfying 0<f≦1; for example, f=0.8 or 0.9.
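One way to choose the reduced eigenvalue set can be sketched in Python. The sketch assumes the common criterion that the largest eigenvalues are retained until their sum reaches the fraction f of the sum of all eigenvalues; the function name and the sample eigenvalues are hypothetical.

```python
from typing import List, Sequence


def select_eigenvalues(eigenvalues: Sequence[float], f: float) -> List[float]:
    """Keep the largest eigenvalues until they account for a fraction f of the total."""
    if not 0.0 < f <= 1.0:
        raise ValueError("f must satisfy 0 < f <= 1")
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    kept: List[float] = []
    running = 0.0
    for lam in ordered:
        kept.append(lam)
        running += lam
        if running >= f * total:
            break
    return kept


# Hypothetical eigenvalues of the covariance matrix F.
lams = [0.3, 6.0, 1.0, 2.5, 0.2]
kept_80 = select_eigenvalues(lams, f=0.8)
kept_90 = select_eigenvalues(lams, f=0.9)
```

Raising f retains more eigenvalues and therefore more of the variance in the data, at the cost of a larger number M′ of transformed parameters.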
A transformed matrix
G=DM·F (5)
is then computed. Preferably, the matrix G is normalized by subtraction of a first order statistic of each column and by division of the difference by a second order statistic associated with that column.
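The column normalization of G can be sketched in Python as follows, assuming the mean as the first order statistic and the population standard deviation as the second order statistic. Both choices, and the function name, are illustrative assumptions.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def normalize_columns(g: Sequence[Sequence[float]]) -> List[List[float]]:
    """Subtract each column's mean and divide by that column's standard deviation."""
    columns = list(zip(*g))
    means = [mean(col) for col in columns]
    sds = [pstdev(col) for col in columns]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, sds)]
        for row in g
    ]


# Hypothetical G matrix: three flights (rows) by two transformed parameters (columns).
g_norm = normalize_columns([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
```

A column with zero spread is mapped to zeros in this sketch so that a constant parameter cannot cause a division by zero.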
An atypicality score, also referred to as a Mahalanobis distance,
is computed for each flight (q) and each phase (ph).
The atypicality scores for the selected set of flights can be compared using a histogram of reference atypicality scores for a collection of reference flights. An atypical flight will often appear as a statistical outlier, as illustrated in
A p-value, corresponding to an atypicality score Aq, the selected flight q and the selected phase ph, is defined using the Wishart probability density distribution as defined in Anderson, An Introduction to Multivariate Statistical Analysis, 2nd Edition, John Wiley & Sons, 1984, pg 244–255.
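$$p(q;ph) = \frac{F_1 \cdot F_2}{F_3 \cdot F_4 \cdot F_5} \tag{7A}$$
where
$$F_1 = |A_q|^{(R-M-1)}, \tag{7B}$$
$$F_2 = \exp\!\left(-\tfrac{1}{2}\,\mathrm{trace}(\Sigma^{-1} A_q)\right), \tag{7C}$$
$$F_3 = 2^{-MR} \cdot \pi^{M(M-1)/4}, \tag{7D}$$
$$F_4 = |\Sigma|^{(1/2)R}, \tag{7E}$$
$$F_5 = \prod_{i=1}^{M} \Gamma\!\left(\tfrac{1}{2}(R+1-i)\right), \tag{7F}$$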
Γ(x) is an incomplete gamma function.
A cluster analysis is applied to a collection of observed values G (from Eq. (5)) for the same phase and for the full set of selected flight(s). A preferred cluster analysis is K-means analysis, as set forth in any of a number of statistics and data mining books, including Kennedy, Lee, Roy, Reed and Lippman, Solving Data Mining Problems Through Pattern Recognition, Prentice Hall PTR, 1995–1997, page 10–50 through 10–53. The clustering is performed for each phase (or aggregated group of phases) separately.
The initialization step requires selection of the number K of clusters and the setting of the initial seed values. There are a number of ways to set these seeds, including (i) a random selection of K flight vectors U from the full set of flight vectors, (ii) a random selection of dimension values for each of the K flight vectors, and (iii) setting each seed to zero in all dimensions but one, where the remaining value is the maximum or minimum of that dimension among all flight vectors. There are many other ways as well. The first method is a preferred method. These seeds serve as the initial values of the cluster centers or centroids.
The next step requires that the distance from each cluster centroid to each flight vector be calculated. A flight vector is associated with the cluster that has the minimum flight vector-to-center distance. There are numerous methods to calculate distance, including Euclidean distance, Manhattan distance and cosine methods. A preferred distance is the Euclidean distance.
After associating every flight vector U with a cluster, the centroid for each cluster k is calculated as the mean or first order statistic in each dimension of the flight vectors that are associated with cluster k.
These last two steps are repeated until the number of flight vectors changing cluster membership is below some threshold, or until an upper limit of number of iterations is reached.
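The K-means steps described above can be sketched in Python. The sketch uses the first seeding method (a random selection of K flight vectors) and the Euclidean distance; the function names, the fixed random seed, and the stopping threshold are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance; it has the same minimizer as the distance itself."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(
    vectors: Sequence[Sequence[float]],
    k: int,
    max_iter: int = 100,
    min_changes: int = 1,
    seed: int = 0,
) -> Tuple[List[int], List[List[float]]]:
    """Assign each flight vector to one of k clusters; return (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(list(vectors), k)]  # seeding method (i)
    labels = [-1] * len(vectors)
    for _ in range(max_iter):
        changes = 0
        for i, v in enumerate(vectors):
            best = min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            if best != labels[i]:
                labels[i] = best
                changes += 1
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if changes < min_changes:  # stop once membership has stabilized
            break
    return labels, centroids


# Hypothetical rows of G for six flights forming two well-separated groups.
flights = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = k_means(flights, k=2)
```

The loop ends either when fewer than `min_changes` vectors change cluster in a pass or when `max_iter` passes have been made, mirroring the two stopping conditions in the text.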
A second preferred cluster analysis method is hierarchical clustering, which works with partitions of the collection of observations that are built up (agglomerations) or that are divided more finely (divisions) at each stage. Hierarchical methods are discussed by B. S. Everitt, Cluster Analysis, Halsted Press, New York, Third Ed., 1993, pp. 55–89. Other cluster analysis can also be performed using any of the approaches set forth in B. S. Everitt, ibid, pp 37–140.
Hierarchical clustering initially assigns each flight, q=1, . . . , Q, to its own cluster, c=1, . . . , C. Then the “distance” between all possible pairs of flight vectors is calculated using the G matrix, and the two flight vectors with the minimum distance are identified. There are numerous methods to calculate distance, including Euclidean distance, Manhattan distance and cosine methods. A preferred method is the Euclidean distance. These two flight vectors are associated with a cluster. The cluster's centroid is calculated based on all its members; the clusters are denoted by cc=1, . . . , CC.
After the first cluster is formed, calculate the distance between all possible pairs from Q-1 objects (Q-2 flight vectors and 1 cluster), find the pair with the minimum distance and assign them to a cluster. This may be a pair of flight vectors or a flight vector with a cluster (and if there are multiple clusters, as there inevitably will be, this could be two clusters joined to form one larger cluster). Continue this process of calculating distances, finding the minimum distance and assigning flights or clusters to form bigger clusters until all have been aggregated to one global cluster.
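The agglomerative procedure can be sketched in Python as follows. The sketch measures the distance between two objects as the Euclidean distance between their centroids and allows the merging to stop at a requested number of clusters; the function names and that stopping parameter are assumptions for illustration.

```python
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def agglomerate(vectors: Sequence[Sequence[float]], n_clusters: int = 1) -> List[List[int]]:
    """Merge the closest pair of clusters repeatedly until n_clusters remain.

    Each cluster is a list of row indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]  # every flight starts alone
    centroids = [list(v) for v in vectors]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sq_dist(centroids[i], centroids[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        # Recompute the centroid from all members of the merged cluster.
        centroid = [sum(col) / len(merged) for col in zip(*(vectors[m] for m in merged))]
        del clusters[j], centroids[j]  # j > i, so index i is unaffected
        clusters[i], centroids[i] = merged, centroid
    return clusters


# Hypothetical rows of G: two tight pairs and one isolated flight.
rows = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (20.0, 20.0)]
three = agglomerate(rows, n_clusters=3)
two = agglomerate(rows, n_clusters=2)
```

Calling `agglomerate(rows)` with the default carries the merging through to one global cluster, as the text describes.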
A cluster membership score CMS(q;ph), equal to a monotonic function of a ratio, which is the number of observations in that cluster, divided by the total number of observations (0<CMS<1), is then computed for the selected flight (q) and the selected phase (ph). A larger value of CMS corresponds to a less atypical set of observed values for the selected flight (q) and the selected phase (ph), and inversely.
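A cluster membership score can be sketched in Python as follows, taking the monotonic function to be the identity so that the score is simply the cluster's share of all observations. The function name and the sample labels are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def cluster_membership_scores(labels: Sequence[int]) -> List[float]:
    """Return, for each flight, the fraction of all flights that share its cluster."""
    counts = Counter(labels)
    total = len(labels)
    return [counts[label] / total for label in labels]


# Hypothetical cluster labels for four flights; the last flight sits in a small cluster.
cms = cluster_membership_scores([0, 0, 0, 1])
```

The flight in the small cluster receives the lower score, consistent with a smaller CMS indicating a more atypical flight.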
A Global Atypicality Score GAS for a selected flight (q) and selected phase (ph) is then defined as
$$\mathrm{GAS}(q;ph) = -\log_z\{p(q;ph)\} - \log_z\{\mathrm{CMS}(q;ph)\}, \tag{8}$$
where z is a selected real number greater than 1. According to the definition in Eq. (8), a Global Atypicality Score GAS increases with decreasing p-values and with decreasing CMS values. A probability value Pr can be assigned to each GAS value that decreases with an increase in the GAS value. The logarithm functions in Eq. (8) can be replaced by another function Fn that is monotonic in the argument, such as
$$\mathrm{GAS}(q;ph) = w \cdot \mathrm{Fn}\{p(q;ph)\} + (1-w) \cdot \mathrm{Fn}\{\mathrm{CMS}(q;ph)\}, \tag{9}$$
where w is a number lying in the range 0≦w≦1.
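Eqs. (8) and (9) can be illustrated with a short Python sketch. The function names, the default base z=10, and the choice of the negative base-10 logarithm as the monotonic function Fn are assumptions made for the example.

```python
import math
from typing import Callable


def global_atypicality_score(p_value: float, cms: float, z: float = 10.0) -> float:
    """Eq. (8): GAS = -log_z(p) - log_z(CMS), for a base z > 1."""
    if z <= 1.0:
        raise ValueError("z must be greater than 1")
    if not (0.0 < p_value <= 1.0 and 0.0 < cms <= 1.0):
        raise ValueError("p_value and cms must lie in (0, 1]")
    return -math.log(p_value, z) - math.log(cms, z)


def neg_log10(x: float) -> float:
    """One possible monotonic function Fn for Eq. (9)."""
    return -math.log10(x)


def weighted_score(p_value: float, cms: float, w: float, fn: Callable[[float], float]) -> float:
    """Eq. (9): GAS = w*Fn(p) + (1 - w)*Fn(CMS), with 0 <= w <= 1."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * fn(p_value) + (1.0 - w) * fn(cms)


# Hypothetical inputs: a rare flight in a small cluster and a common flight in a large one.
gas_rare = global_atypicality_score(p_value=0.01, cms=0.1)
gas_common = global_atypicality_score(p_value=0.5, cms=0.8)
gas_weighted = weighted_score(0.01, 0.1, w=0.5, fn=neg_log10)
```

The rare flight receives the larger score, since the GAS grows as either the p-value or the CMS shrinks.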
In step 2, applicable to a parameter with continuous values, polynomial coefficients p0(n0), p1(n0) and p2(n0) and an error coefficient e(n0) are determined for a polynomial approximation $p(t;\mathrm{app}) \approx p_0(n_0) + p_1(n_0)(t - t_{n_0}) + p_2(n_0)(t - t_{n_0})^2 + e(n_0)$, where the coefficients p0, p1 and p2 are chosen to minimize the magnitude of e. The collections of coefficients {p0(n0)}n0, {p1(n0)}n0, {p2(n0)}n0 and {d(n0)}n0, with $d(n_0) = (N-3)^{-1} \sum e(n_0)^2$, are treated as entries for the respective vectors v=A, B, C and D, for the selected flight (q) and the selected phase (ph). A first order statistic m1(v), a second order statistic m2(v), a minimum value min(v) and a maximum value max(v), and optionally at least one of a beginning value begin(v) and an ending value end(v), are computed for each of the vectors v=A, B, C and D. An M1×1 vector E1 is formed, including the entries of the vectors A, B, C and D.
In step 3, for each of the overlapping time intervals, an L(k2)×L(k2) matrix is formed whose entries are the number of transitions from one of L(k2) discrete values to another of these discrete values of an FP; each of the original diagonal values of the L(k2)×L(k2) matrix is divided by the sum of the original diagonal values so that the sum of the diagonal entries of this modified L(k2)×L(k2) matrix has the value 1. An L×1 vector E2 is formed from the entries of the modified L(k2)×L(k2) matrices, where L is the sum of the squares L(k2)².
In step 4, an M×1 vector E, including the entries of the vectors E1 and E2, is formed, where M=M1+L. In step 5, an M×M covariance matrix F=cov(E) is computed.
In step 6, eigenvalues λ for an eigenvalue equation, FV(λ)=λV(λ), are obtained, where λ1≧λ2≧ . . . ≧λM≧0, and a selected subset of these eigenvalues, λ′1≧λ′2≧ . . . ≧λ′M′≧0, is provided, where M′≦M.
In step 7, a transformed matrix G=DM·F is provided, where DM is a selected data matrix.
In step 8, an atypicality score, Aq, is calculated based on the M′ variables for the selected set of flights and the selected phase (ph), as set forth in Eq. (6).
In step 9 (optional), the computed atypicality score, Aq, for the selected flight is compared with a reference histogram of corresponding atypicality scores for a reference collection of similar flights with the same phase (ph), and an estimate is provided of a probability associated with the computed atypicality score relative to the reference collection. Step 9 is a simplified alternative to cluster analysis, which is covered in steps 10–15.
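The comparison in step 9 can be sketched in Python as an empirical tail probability, that is, the fraction of reference flights whose atypicality score is at least as large as that of the selected flight. This is one simple reading of the reference histogram comparison; the function name and the sample scores are hypothetical.

```python
from typing import Sequence


def empirical_tail_probability(score: float, reference_scores: Sequence[float]) -> float:
    """Fraction of reference scores that are greater than or equal to `score`."""
    if not reference_scores:
        raise ValueError("reference_scores must not be empty")
    exceed = sum(1 for r in reference_scores if r >= score)
    return exceed / len(reference_scores)


# Hypothetical reference atypicality scores for ten similar flights in the same phase.
reference = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
tail = empirical_tail_probability(9.5, reference)
```

A small tail probability indicates that the selected flight lies far out in the reference distribution, which is the outlier behavior the histogram comparison is meant to expose.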
In step 10, a p-value corresponding to the computed atypicality score is provided for the selected flight and/or for one or more similar flights with the same phase (ph), as determined by Aq.
In step 11, an initial collection of M′-dimensional clusters is provided for the atypicality scores, Aq.
In step 12, a selected cluster analysis, such as K-means analysis or hierarchical analysis, is performed for the cluster collection provided. Each atypicality score is assigned to one of the clusters, and a selected cluster metric value or index is computed.
In step 13, membership in the clusters is iterated upon to determine a substantially optimum cluster collection that provides an extremum value (minimum or maximum) for the selected cluster metric value or index.
In step 14, a cluster membership score (CMS) is computed for each cluster, equal to a monotonic function of a ratio, the number of observations (atypicality scores) associated with each cluster, divided by the total number of observations in all the clusters.
In step 15, a global atypicality score GAS is computed as a linear combination of a selected monotonic function Fn applied to the p-value and the selected function Fn applied to the CMS, for the selected flight(s) and the selected phase (ph).
A collection of one or more atypicality scores is received by a p-value module 38, which calculates a p-value for the collection, as in step 10 (
A GAS value for a selected flight (q) and selected phase(s) (ph) may be compared with a spectrum of GAS values for a collection of reference flights for the same phase(s) to estimate a probability associated with the GAS for the selected flight. A GAS value for a selected flight may, for example, be placed in the most atypical 1 percent of all flights, in the next 4 percent of all flights, in the next 15 percent of all flights, or in the more typical remaining 80 percent of all flights.
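The placement of flights into bands by ranked GAS value can be sketched in Python. The cumulative cut-off fractions are parameters; the defaults of 1, 5 and 20 percent, the numbering of the bands as levels 3, 2, 1 and 0, and the function name are illustrative assumptions.

```python
from typing import List, Sequence


def assign_levels(
    gas_scores: Sequence[float],
    cuts: Sequence[float] = (0.01, 0.05, 0.20),
) -> List[int]:
    """Rank flights by decreasing GAS and assign a level to each flight.

    Level 3 is the most atypical band, level 0 the typical remainder.
    `cuts` holds the cumulative fractions that end levels 3, 2 and 1.
    """
    n = len(gas_scores)
    order = sorted(range(n), key=lambda i: gas_scores[i], reverse=True)
    levels = [0] * n
    for rank, idx in enumerate(order):
        fraction = rank / n  # share of flights ranked strictly above this one
        if fraction < cuts[0]:
            levels[idx] = 3
        elif fraction < cuts[1]:
            levels[idx] = 2
        elif fraction < cuts[2]:
            levels[idx] = 1
    return levels


# Hypothetical GAS values for 100 flights; larger means more atypical.
scores = [float(i) for i in range(100)]
levels = assign_levels(scores)
```

With 100 flights and the default cut-offs, one flight falls in the top band, four in the next, fifteen in the third, and eighty in the typical remainder.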
Assume that the selected flight atypicality score is assigned to a given cluster, SFC. The GAS value for that selected flight will decrease as the CMS for the cluster SFC increases, and inversely. An increased CMS value for the SFC corresponds to enlargement of the SFC. The logarithm function −logz(x) manifests increased sensitivity to change of the argument x as x approaches 0.
One embodiment of the display system begins with relevant data for a large collection of flights (preferably at least 100) that, optionally, use a particular model of aircraft, where the flights were made in a specified time interval (e.g., a particular N-day interval) and identifies flights that fall into one of two or more levels of atypicality; for example, three levels, including the most atypical 1 percent, the next most atypical 4 percent and the next most atypical 15 percent of the original collection. Optionally, each atypical flight is identified by the atypicality attribute(s) and flight phase where the atypicality occurred and by one or more of (i) the tail number of the aircraft, (ii) the aircraft departure time, (iii) the departure airport, and (iv) the (original) aircraft destination airport. These data are illustrated for a group of 30 flights in a table in
The level of flight atypicality may be determined, for example, by procedures disclosed in the IATP application, where a system (1) provides a set of time varying flight parameters that are “relevant;” (2) transforms this set of flight parameters into a minimal orthogonal set of transformed flight parameters; (3) analyzes values of each of these transformed flight parameters within a time interval associated with the flight phase; (4) applies these analyses to the data for each aircraft flight; and (5) identifies flights in which the multivariate nature of these transformed flight parameters is atypical, according to a consistently applied procedure.
For the identified atypical phases of flights, a display shown in
For example, in the table shown in
Some operationally interesting attributes, or groups of attributes that contribute to atypicality include, but are not limited to:
takeoff anomalies,
non-normal aircraft ascent patterns,
TCAS RA with escape maneuver(s),
turbulence and aircraft accommodation,
high energy arrivals,
non-normal descent patterns, and
landing rollout anomalies,
among others. The attribute groups that contribute most often to atypicality for a given group of flights are optionally identified and displayed in text format by the system, and the percentage of flights for which this attribute group causes or contributes to an atypical flight phase is optionally displayed.
Additional information on one or more of the atypicality attributes set forth above is available and is optionally displayed in one or more additional “screens.” For example, a high energy arrival occurs when: (1) the arriving aircraft has an unusually high speed (above 200 knots) as the aircraft approaches 2500 feet altitude from above and/or (2) the aircraft has an above-standard glide path angle during low speed descent and final approach to landing. Any of at least three outcomes can result from a high energy arrival: (1) the aircraft is subsequently controlled and stabilized so that a normal approach and landing is subsequently executed (e.g., all parameters are within the desired envelope at and below 1000 feet altitude above touchdown altitude); (2) the aircraft pulls up and executes a go-around to approach the landing in a more stabilized configuration; and (3) the aircraft continues its landing approach in an unstable configuration. A high energy arrival has been identified through atypicality analysis in at most 1–2 percent of aircraft arrivals.
However, a graph of a parameter value for each of these five rationales can be quickly displayed and viewed to determine which, if any, of the corresponding parameter values are likely contributors. Data recorded by a flight recorder during the flight, plus accumulated data for the “normal” band, are used to construct each of the graphs for the rationales.
The approach of the designated flight would need to be studied in more detail to determine which, if any, of these rationales were operationally significant, contributing, causative, correlated or consequential.
This application is a Continuation in Part of prior application Ser. No. 10/857,376. U.S. Pat. No. 6,937,924, filed May 21, 2004, issued Aug. 30, 2005.
The invention described herein was made by employees of the United States Government and its contractors under Contract No. NAS2-99091 and may be manufactured and used by or for the Government for governmental purposes without the payment of any royalties thereon or therefor.
Number | Name | Date | Kind |
---|---|---|---|
4235104 | Hoadley et al. | Nov 1980 | A |
4729102 | Miller et al. | Mar 1988 | A |
5796612 | Palmer | Aug 1998 | A |
5991691 | Johnson | Nov 1999 | A |
6389333 | Hansman et al. | May 2002 | B1 |
6449573 | Amos | Sep 2002 | B1 |
6480770 | Wischmeyer | Nov 2002 | B1 |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 10857376 | May 2004 | US |
Child | 10923156 | | US |