This invention relates generally to the field of information theory. More particularly, it relates to an information theory method that uses the Fano equality with the Data Processing Inequality in a Markovian channel construct for the study of component-level uncertainty and information loss within a system.
Recent innovations in radio frequency (RF) sensing component technology, particularly in the area of remote target signature measurement and exploitation, include multi-channel spatially diverse antennas, sensitive receivers, fast analog-to-digital converters, adaptive transmit waveforms, and sparse sampling approaches. These innovations support new signature information sensing functions such as calibrated target measurements, feature processing, and inference-based decision algorithms. The ability to characterize target information extraction while under the effects of system uncertainties is critical to the full application of the scientific method in the expanding trade space of the new functional capabilities, particularly regarding waveform design and the analysis of radar signatures and radar systems. Regardless of the application, the success of any information systems theory model will largely depend on its ability to address several challenges: (1) characterizing the performance of modular systems within critical regions of the input space while under the effects of various sources of uncertainty; (2) propagating the effects of these uncertainty sources acting on individual components within the system to the predicted system performance measures; (3) effectively minimizing the overall loss in the information flow while trading costs associated with component design; and (4) operating effectively within the nonlinear, high-dimensional spaces inherent in many systems, such as signature sensor systems.
A variety of information theoretic approaches have been formulated and applied to the area of RF sensing, particularly to the analysis and design of waveforms and radar systems such as the new radar architecture referred to as MIMO (Multiple Input Multiple Output) radar. For example, information theory-based frameworks employing a variety of techniques have been presented in the field of radar analysis, including application of the Fano bound to train and develop target classifiers in automatic target recognition (ATR) systems and use of mutual information (MI) as a similarity measure for evaluating the suitability of radar signature training surrogates. Other approaches, such as the information bottleneck approach, have represented the radar system in terms of a Markov Chain within a channel configuration and characterized the information flow from source to sink in order to, for example, study information loss. However, existing systems theory prototypes frequently fall short in their ability to fully characterize the flow of information through the components of a sensing system while that system is subjected to the effects of system uncertainty. The ability to isolate the effects of uncertainty within the components of the system allows for component design trade methods that lead to optimal information flow.
In engineering scenarios, the error associated with system parameters is of interest. For example, the tolerances of machined components in a mechanical system are a key consideration in the manufacturing process, impacting the amount of testing and measurement needed to ensure compliance as well as contributing to overall system assembly expense (generally, the more stringent the fabrication requirements, the more expensive the end product). Similarly, the confidence a user has in a meter reading value output by a system is also of critical importance. For example, a pilot needs to know whether the fuel gauge in an airplane cockpit indicates that the aircraft can reach its destination with 50%, 90%, or 100% confidence.
The uncertainty associated with a system parameter is typically due to many sources. In traditional linear signal processing models with additive Gaussian noise, sources of uncertainty (noise) are assumed to be statistically independent. Because the sum of Gaussians is a Gaussian, the final overall uncertainty for a system output value is easily tabulated from the individual component uncertainties. Real-life systems, however, often exhibit nonlinear behavior. In addition, the noise may not be Gaussian, additive, or statistically independent. These deviations from the linear, additive, independent Gaussian noise model quickly make uncertainty and error estimation analytically intractable. As a recourse, engineers frequently resort to numerical simulation methods such as Monte Carlo-based techniques. However, real-life systems have a large number of degrees of freedom, and numerical simulation in such situations must be carefully addressed. Hence the need arises for accurate, analytically based methods for uncertainty estimation and propagation analysis.
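By way of a simplified illustration of the numerical recourse described above, the following sketch propagates two independent Gaussian inputs through a hypothetical nonlinear component by Monte Carlo sampling. The component function, input distributions, and noise levels are invented purely for illustration and are not part of the disclosure.

```python
import math
import random

def propagate_monte_carlo(f, draw_inputs, n_samples=10_000, seed=0):
    """Estimate the mean and standard deviation of a system output by
    pushing sampled inputs through a (possibly nonlinear) map f."""
    rng = random.Random(seed)
    outputs = [f(*draw_inputs(rng)) for _ in range(n_samples)]
    mean = sum(outputs) / n_samples
    var = sum((y - mean) ** 2 for y in outputs) / (n_samples - 1)
    return mean, math.sqrt(var)

# Hypothetical nonlinear component: the output depends on a product of
# inputs, so independent Gaussian input noise does NOT remain Gaussian.
def component(a, b):
    return a * b + 0.1 * a ** 2

def draw(rng):
    # Illustrative input uncertainties: a ~ N(1, 0.2^2), b ~ N(2, 0.1^2)
    return rng.gauss(1.0, 0.2), rng.gauss(2.0, 0.1)

mean, std = propagate_monte_carlo(component, draw)
```

Because the component is nonlinear, the output standard deviation cannot be obtained by simply summing the input variances, which is the difficulty the analytic methods of the present disclosure are intended to address.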
The description of the illustrative embodiments can be read in conjunction with the accompanying figures. It will be appreciated that for simplicity and clarity of illustration, elements illustrated in the figures have not necessarily been drawn to scale. For example, the dimensions of some of the elements are exaggerated relative to other elements. Embodiments incorporating teachings of the present disclosure are shown and described with respect to the figures presented herein, in which:
The present disclosure includes theoretical models and methods for identifying and quantifying information loss in a system due to uncertainty and analyzing the impact on the reliability of system performance. These models and methods join Fano's equality with the Data Processing Inequality in a Markovian channel construct in order to characterize information flow within a multi-component nonlinear system and allow the determination of risk and characterization of system performance upper bounds based on the information loss attributed to each component. The present disclosure additionally includes methods for estimating the sampling requirements and for relating sampling uncertainty to sensing uncertainty. The present disclosure further includes methods for determining the optimal design of components of a nonlinear system in order to minimize information loss, while maximizing information flow and mutual information.
The present disclosure includes a method for identifying and characterizing component-level information loss in a nonlinear system comprising a plurality of components, one or more of which are subject to at least one source of uncertainty that each comprises a plurality of system uncertainty parameters. The method comprises the steps of: a) determining discrete decision states for the nonlinear system that comprise a true object state H and a decision state Q, with the discrete decision states being characterized in a Markovian channel model comprising a plurality of links that each correspond to one component of the nonlinear system; b) modeling the system uncertainty parameters to create a plurality of distributions that each comprise a plurality of values ranging from a theoretical maximum entropy to a theoretical minimum entropy for one system uncertainty parameter, in which at least one of the system uncertainty parameters is unknown; c) calculating an entropy at each component, H(H), H(X), H(Y), . . . H(Q), that is directly related to an amount of uncertainty at each component; d) computing an amount of mutual information between H and Q, I(H;Q), in which I(H;Q) is used to characterize a total system performance and the one or more sources of uncertainty increases a total amount of entropy in the nonlinear system, thereby decreasing I(H;Q) and degrading the total system performance; e) calculating an amount of cumulative component information loss from H to Q, ILX, ILY, . . . ILQ, in which ILQ is equal to a sum of the component-level information loss that occurs at each component, ILXΔ, ILYΔ, . . . 
ILQΔ, and component-level information loss occurs only within the Markovian channel model; f) correlating, using Fano's equality, at least one of I(H;Q) and ILQ to the total amount of entropy to generate at least one overall probability of error, Pe, for the nonlinear system; g) estimating, using the Data Processing Inequality together with Fano's equality, a component-level probability of error, PeX, PeY, . . . PeQ; and h) correlating the component-level probability of error to the component-level information loss.
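The channel construct of steps (a) through (e) can be sketched numerically if each link of the Markovian channel is idealized as a binary symmetric channel, an assumption made here solely for illustration (the disclosed system is neither binary nor symmetric); the flip probabilities below are invented stand-ins for the per-component uncertainty sources.

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def cascade(p, q):
    """Effective flip probability of two binary symmetric channels in series."""
    return p * (1 - q) + (1 - p) * q

# Hypothetical three-link Markov chain H -> X -> Y -> Q.
flips = [0.05, 0.02, 0.01]   # flip probabilities for links H->X, X->Y, Y->Q
H_H = 1.0                    # equiprobable binary hypothesis: H(H) = 1 bit

mi = []                      # I(H;X), I(H;Y), I(H;Q)
eff = 0.0                    # effective end-to-end flip probability so far
for f in flips:
    eff = cascade(eff, f)
    mi.append(H_H - h2(eff))

# Cumulative loss to each stage, IL_X, IL_Y, IL_Q, and the per-link
# (component-level) losses that sum to the end-to-end loss IL_Q:
cum_loss = [H_H - m for m in mi]
link_loss = [cum_loss[0]] + [cum_loss[i] - cum_loss[i - 1] for i in (1, 2)]
```

The decreasing sequence of mutual informations along the chain is exactly the Data Processing Inequality: information can only be lost, never gained, as the signal passes through successive components.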
The present disclosure further includes a method for computing a component-level performance reliability and attributing a contribution of each system uncertainty parameter to the component-level performance reliability by: a) determining a real world statistical variation of the system uncertainty parameters; b) performing a Monte-Carlo simulation of a plurality of the statistical uncertainty parameters for a plurality of settings by iteratively performing, according to the present disclosure, the step of modeling the system uncertainty parameters through the step of correlating the component-level probability of error to the component-level information loss; c) calculating a component-level probability of error statistical distribution at each component; d) determining the component-level performance reliability based on a standard deviation of each component-level probability of error statistical distribution; and e) correlating the contribution of each system uncertainty parameter to the component-level performance reliability. In some embodiments, the step of performing the Monte-Carlo simulation further comprises determining a proper ensemble sample size.
The present disclosure further includes a method for determining at least one component-level ensemble sampling requirement comprising the steps of: a) determining a set of test criteria for a maximum allowable sampling uncertainty of the component-level information loss relative to the component-level probability of error statistical distributions; b) determining a sample ensemble size NM for the component-level information loss using a phase transition method; and c) computing the component-level performance reliability using a numerical simulation method on the sample ensemble size NM. In some embodiments, the numerical simulation method comprises Monte Carlo modeling.
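One simple way to arrive at a sample ensemble size NM, admittedly cruder than the phase transition method named above, is to grow the ensemble until the entropy estimate stabilizes to within a tolerance. The tolerance, doubling schedule, and Bernoulli test process below are all illustrative assumptions.

```python
import math
import random

def plugin_entropy(samples):
    """Plug-in (histogram) entropy estimate in bits."""
    n = len(samples)
    counts = {}
    for s in samples:
        counts[s] = counts.get(s, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def required_ensemble_size(draw, tol=0.01, start=64, max_n=2 ** 16, seed=0):
    """Double the ensemble size until the entropy estimate changes by
    less than `tol` bits -- a crude stand-in for locating the knee
    (phase transition) where the estimate becomes sample-stable."""
    rng = random.Random(seed)
    n = start
    prev = plugin_entropy([draw(rng) for _ in range(n)])
    while n < max_n:
        n *= 2
        cur = plugin_entropy([draw(rng) for _ in range(n)])
        if abs(cur - prev) < tol:
            return n, cur
        prev = cur
    return n, prev

# Illustrative source: a Bernoulli(0.3) process, true entropy ~0.881 bits.
n_m, h_est = required_ensemble_size(lambda rng: rng.random() < 0.3)
```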
The present disclosure further includes a method for determining an optimal component design for a nonlinear system comprising a plurality of components, one or more of which are subject to at least one source of uncertainty that each comprises a plurality of system uncertainty parameters. The method comprises the steps of: a) establishing an information loss budget comprising a desired PeQ; b) calculating, according to the present disclosure, the component-level information loss, ILXΔ, ILYΔ, . . . ILQΔ; c) calculating, according to the present disclosure, component probability of error, PeX, PeY, . . . PeQ, to generate a calculated PeQ; d) comparing the calculated PeQ with the desired PeQ; e) identifying at least one source of information reduction that comprises component-level information loss and/or information flow reduction; f) determining the optimal component design to minimize the calculated PeQ that includes at least one tradeoff between information flow and component design, in which the tradeoff decreases the at least one source of information reduction; and g) repeating the step of calculating component-level information loss through the step of determining the optimal component design until the calculated PeQ is equal to or less than the desired PeQ. In some embodiments, the method further comprises identifying at least two sources of information reduction that comprise component-level information loss and/or information flow reduction, ranking the two or more sources of information reduction according to impact on the calculated PeQ to identify at least one dominant source of information reduction, and determining the optimal component design to minimize the calculated PeQ, in which the optimal component design includes at least one tradeoff between information flow and component design that reduces the at least one dominant information loss source.
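The budget-driven design loop of steps (a) through (g) might be sketched as follows, again idealizing each component as a binary symmetric channel and assuming, purely for illustration, that redesigning a component halves its flip probability; the initial designs and the Pe budget are invented values.

```python
import math

def cascade(flips):
    """End-to-end flip probability of binary symmetric channels in series;
    for equiprobable binary hypotheses this equals the end-to-end Pe."""
    eff = 0.0
    for f in flips:
        eff = eff * (1 - f) + (1 - eff) * f
    return eff

flips = [0.08, 0.03, 0.02]   # initial per-component designs (flip probs)
desired_pe = 0.05            # information loss budget expressed as Pe_Q

pe = cascade(flips)
iterations = 0
while pe > desired_pe and iterations < 50:
    # Rank the sources of information reduction and pick the dominant one.
    worst = flips.index(max(flips))
    # Design trade: tighten the dominant component (halving is assumed).
    flips[worst] *= 0.5
    pe = cascade(flips)
    iterations += 1
```

Each pass repeats the loss calculation, re-ranks the dominant source, and trades component design against information flow until the calculated Pe meets the budget.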
The present disclosure includes theoretical models and methods for identifying and quantifying information loss in a system due to uncertainty and analyzing the impact on the reliability of (or confidence in) system performance. These models and methods join Fano's equality, which is derived from Fano's inequality, with the Data Processing Inequality in a Markovian channel construct. In particular, the presently disclosed invention allows for the study of information flow and the effects of uncertainty on the information flow within the various components of a system. The present disclosure allows the determination of risk and characterization of system performance upper bounds based on the information loss attributed to each component. Taking an information theoretic view, degrading effects are considered as sources of entropy, which may be used to represent propagating uncertainty within an information channel. Treating the system as an information flow pipeline from input to output, the propagating effects of various sources of uncertainty (i.e. entropy) degrade the mutual information (MI) between the input and output. Development and application of a systems theory model allows for performing component-level design trades within the information sensing application based on a component-level information loss budget (Bits). Demonstration of the max flow in conjunction with the Data Processing Inequality further identifies information flow bottlenecks and provides analysis of these bottlenecks in the information flow pipeline.
The presently disclosed models and methods may be particularly useful within radar signature exploitation systems, and as such, key attributes of the presently disclosed theoretical models and methods are demonstrated under the constraints of a radar high range resolution (HRR) sensor system example. Simplified target scattering models are used to illustrate the value of component-level analysis under the effects of various sources of uncertainty in sensing systems. While the present disclosure is often referenced throughout with relation to radar and radar systems, one of ordinary skill in the art will appreciate that these models and methods may be employed in the design and analysis of a wide variety of systems and structures, including production/assembly lines, communications systems, and virtually any other multi-component system containing sources of uncertainty.
The use of information theoretic principles in the presently disclosed models and methods affords several advantages in dealing with the challenges associated with a variety of systems, particularly those in the areas of information sensing and exploitation. First, information theory prototypes enable the study of the propagating effects of various sources of uncertainty on system performance at the point of noise infiltration. For example, using Fano's inequality, the max flow criteria bounds the optimal Bayes error. Entropy and MI are analytically connected to the probability of error (Pe), and more generally the Neyman Pearson criteria, allowing for the rate of noise infiltration to be related to the rate of entropy growth and ultimately to the rate of degradation of system performance. The information loss associated with uncertainty sources can then be characterized in terms of a confidence interval about the predicted system performance at each component of the system. The Data Processing Inequality affords a method to determine information loss points and maximize information flow via component trades within a system information loss budget.
Second, the convexity of MI yields a unique solution and enables rapid numerical convergence (low computational complexity) to maximum MI configurations. MI affords the optimization of a scalar quantity, while classical Bayes likelihood ratio techniques involve optimizing on non-convex surfaces over high dimensional signature spaces. On a convex surface, highly efficient search algorithms such as the Conjugate Gradient method converge in a number of iterations on the order of the problem dimension. While entropy-based methods operate non-parametrically, such that the probability density does not have to be estimated, complicating factors can include numerical computation issues that occur within high dimensional processes (Bellman's Curse of Dimensionality). It can be shown, however, that computing the entropy of the multivariate sensor signature processes is of comparable computational order. As a consequence of the law of large numbers, the asymptotic equipartition property asserts that there are large regions within the entropic signature subspace that will never occur under the decision hypotheses. Thus, the information theoretic approach holds the potential to exploit entropy-based methods operating within this "typical" signature subspace.
Third, classical statistical pattern recognition approaches use the maximum likelihood (ML) decision criteria, which include only the second order statistics present in the training process. The use of MI in nonlinear processing affords advantages over linear processing in that it accounts for higher-order statistics within the design of nonlinear optimal decision rules and in the optimization of features. In the context of radar systems, nonlinear scattering phenomena resulting from the interaction of individual target mechanisms can also reduce the effectiveness of second order techniques in the optimization of diverse transmit waveforms. The use of MI as a nonlinear signal processing method for optimizing waveform design addresses these phenomena. It is these inherent benefits that distinguish the presently disclosed information theoretic models and methods over traditional statistical pattern recognition methods.
The present disclosure additionally includes methods for estimating the sampling requirements for entropic quantities based, for example, on a characterization of the typical set underlying the sufficient statistics of a random signature process. Interdependencies among multivariate target signatures can significantly impede information extraction, and the expansion of the signature statistical support is related to incremental increases in uncertainty. Baseline statistical support (in the native coordinate system) associated with the resolved radio frequency target scattering is characterized for specified states of certainty. The performance estimate variance associated with lower sample counts within a Monte Carlo experiment may be scaled (via central limit theorem) to the estimate variance associated with higher sample counts.
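The central-limit scaling mentioned above can be checked numerically: the standard deviation of an estimator measured at a low sample count, scaled by the square root of the count ratio, should predict the standard deviation at a higher count. The estimator, distribution, and counts below are illustrative choices, not those of the disclosure.

```python
import math
import random

def estimator_std(draw, n, trials=400, seed=0):
    """Empirical standard deviation of a mean estimator built from n samples,
    measured over repeated Monte Carlo trials."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        means.append(sum(draw(rng) for _ in range(n)) / n)
    mu = sum(means) / trials
    return math.sqrt(sum((m - mu) ** 2 for m in means) / (trials - 1))

draw = lambda rng: rng.gauss(0.0, 1.0)

std_small = estimator_std(draw, 50)             # measured at low sample count
predicted = std_small * math.sqrt(50 / 800)     # CLT scaling to N = 800
measured = estimator_std(draw, 800, seed=1)     # direct measurement at N = 800
```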
The present disclosure further includes methods for relating sampling uncertainty to sensing uncertainty to better understand the entropic effects within the information sensing system and to ensure confidence estimates are of sufficiently low variance. Referring to radar signature analysis, both sensor uncertainty and model training uncertainty are propagated into a classifier algorithm where uncertain decisions are inferred from uncertain observations. The uncertainty (i.e. the increase in entropy) is ultimately realized in the form of confidence or reliability intervals about the estimated system performance. A sensitivity analysis is performed to study the relative significance of various “unknown” operating conditions to the reliability of the performance estimate at each component of the system. The effects of sampling uncertainty are contrasted to reliability of performance estimates in order to study the variance effects in performance estimation within high dimensional signature processes subject to unknown operating conditions.
Uncertainty Analysis: In the sensor measurement community, “accuracy” generally refers to the agreement between a measured value and the true or correct value, while “precision” generally refers to the repeatability of a measurement. “Error” refers to the disagreement between the measured value and the true or accepted value. The “uncertainty” in a stated measurement is the interval of confidence around the measured value, such that the true value is expected to lie within this stated interval. This use of the term “uncertainty” implies that the true or correct value may not be known and can be stated along with a probability, which recognizes the deterministic nature of error and the stochastic nature of uncertainty. However, this definition is often insufficient to address the full range of issues within an information sensing system containing multiple sources of uncertainty.
For example, radar systems produce signature measurements that when combined with the effects of various system uncertainties, are realized as a random signature process. Conclusions are inferred by applying instances taken from this random measured signature process to a decision rule. The “unknowable” nature of parameters affecting the measured signature process leads to challenges in developing a signature process model that will generate the optimal decision rule for inferring information. The combined effects of these uncertainties limit the exploitation of physics-based features and result in a loss in information that can be extracted from target signature measurements. The resulting decision uncertainty is driven by both the distorted measurements and the degree of agreement between the signature process under measurement and the process used to train the optimal decision rule.
As a specific example, measurement of airborne moving objects using high range resolution (HRR) waveforms is complicated by several sources of uncertainty. As shown in Table 1, two classes of system uncertainty are introduced into the system: sensing uncertainty and uncertainty resulting from decision rule training limitations. Sensing uncertainty is further divided into three subcategories: (a) signature measurement uncertainty due to sensor design/limitations; (b) object tracking position and motion uncertainty; and (c) uncertainty due to interference.
The object under measurement by the sensing radar system can be viewed as a collection of scattered field sources filling an electrically large volume in space. The system measurement of this object is subject to the uncertainty identified in source 1(a), generating the statistical support underlying a random signature process at a fixed position in time. Target fixed body motion within the measurement interval induces scintillation within the scattering sources, resulting in an additional increase in entropy. Imperfect knowledge of target position, velocity, and aspect also alters the statistical characterization of the random signature process (source 1(b)), and the random signature process interacts with an external environment (source 1(c)) to further impact the statistical nature of the measured signature process.
These sources of uncertainty, along with limitations within the training process, result in a decision rule design that is less than optimal with respect to system performance. The exploitation of this signature process using a decision algorithm requires the training (generally via supervised learning) of an optimal decision rule that operates within the entropy produced by sources 1(a)-(c), but only a subset of the phenomena (parameters) underlying source 1 can be modeled and/or characterized within the statistical decision rule training process. While uncertainty source 1(a) is generally epistemic and may be modeled and characterized, uncertainty sources 1(b), 1(c), and 2 are aleatoric in nature and are generally considered “unknowable.” As such, uncertainty sources 1(b), 1(c), and 2 may generally only be characterized statistically and may result in a reduction in certainty from the highest certainty state.
The sources of uncertainty associated with source 2 in Table 1 are traceable to the corresponding effect within the decision rule subspace in the classical statistical pattern recognition approach to the binary hypothesis testing. The decision rule design (threshold d) is based on statistical training support resulting from the uncertainties in Table 1. If the sensing uncertainties within source 1 are adequately represented in the statistics of the training process, the decision rule design should provide optimal performance; however, the effects due to many of the uncertainties in Table 1 are unavoidable. For example, realizations are often formed through the integration of many sequential measurements. Intra-measurement object motion can cause distortion and induce uncertainty in the decision rule subspace that is not accounted for in the decision rule training process. In another example, the object under measurement may be configured differently than that represented in the training data (extra fuel tanks, wing flaps up, or damaged surface for example).
Information Theoretic Decision Rule Subspace: Referring now to the drawings, like reference numerals may designate like or corresponding parts throughout the several views. One approach to viewing the decision rule subspace is shown in
Information Theoretic Radar Channel Model: The concept of uncertainty introduced in
Referring to
One or more of the components of the system comprising links or stages within the Markovian channel model, referred to here as X, Y, . . . , is subject to at least one source of uncertainty. In Step 310, these sources of uncertainty, each of which may comprise a variety of governing system uncertainty parameters or variables, are modeled. Modeling the parameters creates a series of distributions, each of which represents a set of values ranging from the theoretical maximum value of entropy to the theoretical minimum value for each parameter. For example, in a radar system, these variables may include values that are constantly changing and/or that are unknowable or aleatoric such as the target aspect angle, the leading edge location, thermal noise, the presence of jamming frequencies, etc. These aleatoric variables must generally be characterized statistically, and these characterizations may be in the form of statistical distributions or in the case of radar systems, range bins.
Continuing with the radar system example, X in
The multivariate sample feature {right arrow over (Y)}i is extracted from the ith instance test sample {right arrow over (X)}i to support the desired function of the exploitation system. Given the random nature of {right arrow over (X)}, the extracted signature feature {right arrow over (Y)} is also random. The training feature process {right arrow over (Y)}′ is developed from the set of typical signatures within a decision rule training process {right arrow over (X)}′ (not separately shown). {right arrow over (X)}′ (and thus {right arrow over (Y)}′) is developed offline using a surrogate process and is used to determine the ‘optimal’ decision rule d. The decision algorithm applies {right arrow over (Y)}i to the decision rule d, yielding the decision Q (instance of Q) declaring which of the hypotheses has occurred.
Referring still to
Following determination of the sample ensemble size NM (Step 315), the next step is calculating an amount of entropy at each component of the system (Step 320). An instance of each parameter is drawn from the distributions created in Step 310, and based on the NM ensemble of samples calculated in Step 315, the entropy is determined for each component of the system, H(H), H(X), H(Y), . . . H(Q), with a total amount of entropy for the system taken as the sum of the entropies at each component. The next step is computing the amount of MI between H and all other components, including the output of the system at Q, I(H;X), I(H;Y), . . . I(H;Q) (Step 325). The one or more sources of uncertainty may cause a degradation of performance by increasing the amount of entropy in the nonlinear system and thus decreasing I(H;Q). I(H;Q) may be mathematically related to the total system performance as described herein, thereby allowing a correlation between increases in entropy, decreases in MI, and changes in system performance.
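Steps 320 and 325 can be illustrated by estimating I(H;Q) from joint counts accumulated over a simulated end-to-end ensemble; the binary hypothesis and the 10% decision-flip rate below are assumed stand-ins for the combined uncertainty sources, not parameters of the disclosed system.

```python
import math
import random

def mutual_information(joint):
    """I(H;Q) in bits computed from a joint count table {(h, q): count}."""
    n = sum(joint.values())
    ph, pq = {}, {}
    for (h, q), c in joint.items():
        ph[h] = ph.get(h, 0) + c
        pq[q] = pq.get(q, 0) + c
    mi = 0.0
    for (h, q), c in joint.items():
        p = c / n
        mi += p * math.log2(p / ((ph[h] / n) * (pq[q] / n)))
    return mi

# Hypothetical ensemble: binary truth H pushed through a noisy decision
# pipeline that flips the declared hypothesis 10% of the time.
rng = random.Random(1)
joint = {}
for _ in range(50_000):
    h = rng.randint(0, 1)
    q = h if rng.random() > 0.10 else 1 - h
    joint[(h, q)] = joint.get((h, q), 0) + 1

i_hq = mutual_information(joint)   # near 1 - h2(0.1), i.e. about 0.53 bits
```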
Following calculation of MI in Step 325, the next step is determining an amount of cumulative component information loss from H to each component, including the output component at Q, ILX, ILY, . . . ILQ, as well as component-level information loss ILXΔ, ILYΔ, . . . ILQΔ (Step 330). ILQ is equal to an end-to-end sum of the component-level information loss that is occurring at each component. Information cannot be gained and can only be lost within the Markovian channel. In the context of the radar system example, these sources of component information loss may include, for example, loss due to uncertainty in the sensing and/or feature extraction processes, as well as loss occurring from the decision process due to imperfect training. ILXΔ, ILYΔ, . . . ILQΔ may be determined by apportioning the ILQ determined in Step 330 among each component. Because ILQ is equal to a sum of the component-level information loss i.e. the information loss that occurs at each link or component within the system, ILX, ILY, . . . ILQ may be used to determine ILXΔ, ILYΔ, . . . ILQΔ.
The next step in the method 300 is calculating a predicted overall system performance Pe and the predicted link performance i.e. the component probability of error PeX, PeY, . . . PeQ (Step 335). Using Fano's equality, cumulative information loss, which may be, for example, 1−I(H;Q) and/or ILQ, is correlated with the total amount of entropy associated with system uncertainties as determined in Step 320, with this correlation being characterized as at least one overall probability of error Pe for the nonlinear system i.e. the overall system performance. Using Fano's equality and the Data Processing Inequality together, Pe may be used to estimate a component-level probability of error PeX, PeY, . . . PeQ for each component in the system. In Step 337, PeX, PeY, . . . PeQ may then be correlated to ILXΔ, ILYΔ, . . . ILQΔ.
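For the special case of binary, equiprobable hypotheses, Fano's equality reduces to h2(Pe) = H(H|Q) = H(H) − I(H;Q), which can be inverted numerically to recover a probability of error from the mutual information surviving at each stage; the per-stage MI values below are illustrative only.

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def pe_from_mi(mi, bits=1.0):
    """Invert the binary Fano relation h2(Pe) = H(H) - I(H;Q) for
    Pe in [0, 1/2] by bisection (binary, equiprobable hypotheses)."""
    target = bits - mi           # conditional entropy H(H|Q)
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = (lo + hi) / 2
        if h2(mid) < target:     # h2 is increasing on [0, 1/2]
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Illustrative mutual information surviving at stages X, Y, Q of the chain;
# the Data Processing Inequality makes the sequence non-increasing.
mi_stages = [0.714, 0.642, 0.611]
pe_stages = [pe_from_mi(m) for m in mi_stages]
```

As expected, the per-component probability of error grows monotonically along the channel as information is lost.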
Steps 310 to 335 may be repeated until the desired number of instances of the statistically distributed uncertainty parameters modeled in Step 310 has been realized. In one embodiment, this iterative sampling process may comprise a Monte Carlo method in which L draws are taken from the distributions of the modeled uncertainty parameters (Step 310).
Once a sufficient number of samples (L) have been obtained, the method 300 continues with determining the statistical distribution of each component probability of error due to the various sources of parameter uncertainty (Step 340). The variance and mean of the cumulative component information loss ILX, ILY, . . . ILQ calculated in Step 330 are used to determine the variance on the predicted performance at each link or component. A distribution is created for each source of uncertainty, providing the random mapping to the performance estimate Pe at each component PeX, PeY, . . . PeQ.
The statistical distribution of the component probability of error is then used to compute the reliability of the component probability of error (Step 345), followed by termination of the method. The standard deviation σPe of each component-level probability of error distribution provides a measure of the reliability of (or confidence in) the predicted performance at that component.
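Steps 340 and 345 amount to summarizing the Monte Carlo distribution of each component's Pe as a mean and a reliability interval. The sketch below assumes, purely for illustration, Gaussian-distributed Pe draws clipped to the valid range; the mean, spread, and draw count L are invented values.

```python
import math
import random

def pe_reliability(pe_samples, k=2.0):
    """Summarize a Monte Carlo distribution of component-level Pe as a
    mean and a +/- k-sigma reliability (confidence) interval."""
    n = len(pe_samples)
    mean = sum(pe_samples) / n
    std = math.sqrt(sum((p - mean) ** 2 for p in pe_samples) / (n - 1))
    return mean, (max(0.0, mean - k * std), min(1.0, mean + k * std))

# Hypothetical draws: Pe at one component re-estimated under L = 1000
# random settings of the uncertainty parameters.
rng = random.Random(7)
pe_draws = [min(0.5, max(0.0, rng.gauss(0.08, 0.015))) for _ in range(1000)]

mean_pe, (lo, hi) = pe_reliability(pe_draws)
```

A narrower interval corresponds to higher reliability of the performance estimate at that component; comparing intervals across components localizes where uncertainty most degrades confidence.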
In addition, the presently disclosed method allows the determination of the relative contribution of each system uncertainty parameter to the component-level performance reliability, as well as a comparison of the performance reliability estimates determined in Step 345 to real world uncertainty sources. This determination can be very helpful in the traceability of the effects of uncertainty on the reliability of performance. The disclosed method of decomposition will allow designers to identify where the uncertainty is having the most detrimental effect on performance reliability. The ability to perform this traceability at the component level will further allow component designers to design for the minimum effects of uncertainty.
In one embodiment, a method for computing component-level performance reliability and attributing the contribution of each system uncertainty parameter to the component-level performance reliability may begin with determining a real world statistical variation of the system uncertainty parameters. In this context, “real world” refers to information obtained using actual events and/or experiments. For example, continuing with the radar system example, a variety of uncertainties exist, many of which occur due to chance and are hence unknowable. Samples obtained under real world conditions are subject to a variety of these system uncertainty parameters (known and unknown), and the statistical variation of the system uncertainty parameters may be calculated as described above. Following determination of real world statistical variation of the system uncertainty parameters, the method continues with performing Monte-Carlo modeling of a plurality of the statistical uncertainty parameters for a plurality of settings to determine component-level information loss. This step may occur, for example, through many iterations of Steps 310 to 335 in
The contribution of each system uncertainty parameter is then correlated to the component-level performance reliability. This calculation may be performed, for example, using Eq. (34) described herein. The independent nature of the individual system uncertainty parameters allows the effects of each parameter to be seen. Using the Data Processing Inequality, a decomposition of reliability may be obtained so that reliability effects may be seen at each component. These calculations may be used to determine how well the information sensing system performs when real world data is used and to determine the acceptability of performance reliability with respect to real world uncertainty sources.
The presently disclosed invention further includes methods for determining the optimal design of components of a nonlinear system in order to minimize information loss, while maximizing information flow and MI. Referring to
However, if the calculated PeQ exceeds the information loss budget ("No"), the method may continue with identifying one or more sources of information loss and information flow reduction, i.e., bottlenecks (Step 370). In some embodiments, there are two or more sources of information loss and information flow reduction, and the method further includes the step of identifying the dominant source(s) of information loss and information flow reduction, i.e., bottlenecks (Step 370). These dominant sources may be identified by ranking the various sources of uncertainty at each link/component (for example, ILXΔ, ILYΔ) based on their individual impact on cumulative component information loss and performance of the system.
The next step is determining the optimal component design to minimize PeQ and ILQ, while maximizing I(H;Q) (Step 375) within the information budget via one or more tradeoffs between information flow and component design (described in more detail herein). Following determination of the optimal component design in Step 375, the method returns to Step 355 to continue component design iterations guided by relative levels of component information loss within a system component/link loss budget until a component design is determined that keeps the calculated PeQ within the desired information loss budget established in Step 350.
Fano-Based Information Theoretic Method (FBIT) and Data Processing Inequality: Fano's Inequality and the Data Processing Inequality, both of which are theorems from information theory, may be used in Step 335 of the method in
Fano's Inequality provides a mathematical means to relate the MI between H and Q, I(H;Q), to a lower bound on Pe. Fano's Inequality may be written as an equality as in Equation (Eq.) (1):
H(Pe)=δ−Pe·log(Nc−1)+H(H/Q) (1)
In Eq. (1), Pe is a real random variable between 0 and 0.5 representing the probability of error of the decision algorithm. Nc is the discrete size of the alphabet of H and Q. H(H) is the Shannon entropy of the discrete random variable H. δ is a bias offset derived from asymmetries in the data and decision algorithm. Typically, δ is small and, to a first approximation, may be neglected.
Theorem I: For Nc=2, Fano's equality can be written as H(Pe)=1−I(H;Q)+I(Q;V), where V is the binary discrete random variable representing the probability that the decision rule makes a correct decision. Using I(H;Q)=H(H)−H(H/Q) and Eq. (1), Eq. (2) may be obtained:
H(Pe)=δ−Pe·log(Nc−1)+H(H)−I(H;Q) (2)
The asymmetry factor in Eq. (2) may be computed directly from the output of the decision algorithm. Let δ=I(Q;V) for Nc=2, where V is the binary discrete random variable representing the probability that the decision rule makes a correct decision. V=1 when H=Q; otherwise V=0. Eq. (2) can then be written more completely for Nc=2 as in Eq. (3):
H(Pe)=1−I(H;Q)+I(Q;V) (3)
Eq. (3) may be written in terms of the inverse entropy function, F, as shown in Eq. (4):

Pe=F(H(H)−I(H;Q)+I(Q;V)) (4)
In Eq. (4), F is a deterministic, strictly monotonically increasing function that maps information theoretic quantities into the Pe at the corresponding operating point. The relationship of Pe to F(x) where x∈[0, 0.5] is shown in
The cumulative information loss at the classifier output, ILQ, is then given by Eq. (5):

ILQ=H(H)−I(H;Q)+I(Q;V) (5)
In general, minimizing the information loss minimizes the system Pe. The entropic quantity H(H) is determined by the a priori probabilities of the outcomes of the random variable H corresponding to the different target classes. δ is fixed by architectural considerations. Since F is a known function, the deterministic relation Pe=F(H(H)−I(H;Q)+I(Q;V)), for fixed H(H) and δ, determines the MI, I(H;Q), needed to achieve a specified Pe. For example, for an equiprobable binary hypothesis scenario, H(H)=1 Bit and I(Q;V)≈0, an approximation for Pe can be written as Eq. (6):
Pe≈F(1−I(H;Q)) (6)
Specifying a desired Pe determines the amount of allowed ILQ. How the ILQ budget is “spent” as information cascades from the input space at H to the classifier output space at Q can be traded off via component (link) design.
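The inverse entropy function F and the loss-budget reading of Eq. (6) can be sketched numerically. The operating-point values below (in bits) are assumed purely for illustration.

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def F(x):
    """Numerically invert h2 on [0, 0.5] by bisection, so F(h2(p)) = p."""
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if h2(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Eq. (4) at an assumed operating point.
H_H, I_HQ, I_QV = 1.0, 0.75, 0.01
Pe = F(H_H - I_HQ + I_QV)
print(f"Pe = {Pe:.4f}")

# Conversely, specifying a desired Pe fixes the allowed loss budget,
# since inverting F gives IL_Q = h2(Pe).
print(f"allowed IL_Q for Pe = 0.05: {h2(0.05):.4f} bits")
```

Bisection is used because the binary entropy function is transcendental and has no closed-form inverse; any root-finder over [0, 0.5] would serve.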
The Data Processing Inequality states that information can only be lost in the channel as shown in Eq. (7):
I(H;{right arrow over (X)})≥I(H;{right arrow over (Y)})≥I(H;Q) (7)
Using the relationship in Eqs. (4) and (5), the loss associated with each link within the channel can be characterized as in Eq. (8):
H(H)−I(H;{right arrow over (X)})≤H(H)−I(H;{right arrow over (Y)})≤H(H)−I(H;Q) (8)
The approximation to the cumulative information loss at each link in the channel can then be written as below applying Eq. (5):
IL{right arrow over (X)}≈H(H)−I(H;{right arrow over (X)}); {right arrow over (X)}∈|χ| (9.a)

IL{right arrow over (Y)}≈H(H)−I(H;{right arrow over (Y)}); {right arrow over (Y)}∈|Y| (9.b)

ILQ≈H(H)−I(H;Q); Q∈|Q| (9.c)
Theorem II: The respective information loss due to each link within a Markov chain H→X→Y→Q can then be approximated using Eqs. (10a-10c):
Loss due to Sensing≡ILSΔ≈H(H)−I(H;{right arrow over (X)}) (10.a)
Loss due to Feature Extraction≡ILFΔ≈I(H;{right arrow over (X)})−I(H;{right arrow over (Y)}) (10.b)
Loss due to Decision Rule≡ILDΔ≈I(H;{right arrow over (Y)})−I(H;Q) (10.c)
Thus, the probability of error can be estimated at various points in the channel using the approximation in Eq. (6):
PeX≈F(H(H)−I(H;{right arrow over (X)})) (11.a)

PeY≈F(H(H)−I(H;{right arrow over (Y)})) (11.b)

PeQ≈F(H(H)−I(H;Q)) (11.c)
Uncertainty In the Information Channel: The feature extraction ƒ and decision rule d in
Referring to the radar system example, the loss at {right arrow over (X)}, ILSΔ, is due solely to the sensing process (source 1 in Table 1). The sensing uncertainty inherently alters the statistical support associated with {right arrow over (X)}n, generating statistical independence between {right arrow over (X)}n and {right arrow over (X)}, thus degrading the performance of the signature sensing process as quantified by PeX in Eq. (11a). The loss in information due to sensing uncertainty is then realized at {right arrow over (X)} as ILSΔ in Eq. (10a) and is quantified by the entropy H(PeX):
H(PeX)≈H(H)−I(H;{right arrow over (X)}) (12)
The level of statistical agreement between {right arrow over (X)} and {right arrow over (X)}′ will directly affect the loss in the channel due solely to the decision process (source 2 in Table 1), which is closely tied to the surrogate training process {right arrow over (X)}′. The sensing uncertainty sources in Table 1 are to some degree reproducible in the decision rule training process {right arrow over (X)}′. However, sources 1(b) and 1(c) in Table 1 are not fully reproducible in {right arrow over (X)}′. The dissimilarity between {right arrow over (X)} and {right arrow over (X)}′ results in a decision rule d that is less than optimal. The application of d to the feature process {right arrow over (Y)} induces a loss in the channel due to imperfect training. The effects of decision uncertainty within the decision rule subspace are realized at Q as ILDΔ as illustrated in
H(PeQ)≈H(H,Q)−H(Q)=H(H)−I(H;Q) (13)
The resulting H(PeQ) provides the best possible performance for a given component design (radar sensor design, feature selection, algorithm design, and decision rule design). As stated above, {right arrow over (X)} is often not completely observable and a training surrogate {right arrow over (X)}′ is used to develop ƒ and d. Under conditions such as those listed in uncertainty source 2 in Table 1, the surrogate representation {right arrow over (X)}′ used in the training of the decision rule results in a non-optimal d. This is represented by the altered entropic quantity H(Q′) and more importantly I(H;Q′). The alternate Markov chain H→{right arrow over (X)}→{right arrow over (Y)}′→Q′ is shown as the dotted subspace H(Q′) in
H(Pe′)=1−I(H;Q′)+I(Q′;V) (14)
Therefore since H(Pe′)≥H(Pe), I(H;Q′)−I(Q′;V)≤I(H;Q)−I(Q;V).
Corollary I: Information loss due to imperfect training, ILTΔ, is then mathematically quantified in terms of the increase in entropy ΔH(Pe) resulting from a non-optimal design of ƒ and d:
If it can be shown that I(Q;V)≅I(Q′;V) and that I(Q;V)<<H(H)−I(H;Q) and I(Q′;V)<<H(H)−I(H;Q′), then:
Imperfect Training Loss≡ILTΔ≅I(H;Q)−I(H;Q′) (16)
The decrease in information flow due to imperfect training is illustrated in
Theorem III: The total loss in the channel is equal to the sum of link information loss:
ILTotal=ILSΔ+ILFΔ+ILDΔ+ILTΔ (17)
Definition 1: Any phenomenon producing an increase in I(H;Q) and a subsequent reduction in H(Pe) can be defined as a “system information gain” within the information channel. Any phenomenon producing a decrease in I(H;Q) resulting in an increase in H(Pe) is defined as a “system information loss.”
Propagating Effects of Uncertainty: Uncertainty propagation is the study of how uncertainty in the output of a model (numerical or otherwise) can be allocated to different sources of uncertainty in the model inputs, which are used in Step 310 of
The distributions associated with the input parameters in {right arrow over (V)}E and {right arrow over (V)}t are estimated from experimental data. The estimated parameters become factors within a Monte Carlo simulation. The cumulative link information loss as quantified within Eq. (5) and approximated in Eqs. (9.a), (9.b), (9.c) then become random variables as shown:
IL{right arrow over (X)}≈H(H)−I(H;{right arrow over (X)}(,{right arrow over (V)}E,{right arrow over (V)}t)); (18.a)

IL{right arrow over (Y)}≈H(H)−I(H;{right arrow over (Y)}(,{right arrow over (V)}E,{right arrow over (V)}t)); (18.b)

ILQ≈H(H)−I(H;Q(,{right arrow over (V)}E,{right arrow over (V)}t)); (18.c)
Similarly, the link information losses ILSΔ, ILFΔ, and ILDΔ in Eqs. (10.a), (10.b), and (10.c) also become random variables.
The unknowable characteristics of the observed signature process {right arrow over (X)} are realized within the input variables to the modeled training process {right arrow over (X)}′(′, {right arrow over (V)}E′, {right arrow over (V)}t′). If it is assumed that ′≠, {right arrow over (V)}E′≠{right arrow over (V)}E, {right arrow over (V)}t′≠{right arrow over (V)}t, then the mapping to the non-optimal decision rule will be d(, {right arrow over (V)}E′, {right arrow over (V)}t′), which will be written as d for brevity. The decision rule d is applied to {right arrow over (Y)}(, {right arrow over (V)}E, {right arrow over (V)}t) generating Q′(′, {right arrow over (V)}E′, {right arrow over (V)}t′), written as Q′, while the optimal decision rule dopt generates Q(, {right arrow over (V)}E, {right arrow over (V)}t). Each realization of d and dopt resulting from each ensemble {right arrow over (X)}′(, {right arrow over (V)}E′, {right arrow over (V)}t′) and {right arrow over (X)}(, {right arrow over (V)}E, {right arrow over (V)}t), respectively, in the Monte Carlo simulation will result in the randomization of the imperfect training loss function in Eq. (19) and the randomization of the cumulative loss function in Eq. (20):
ILTΔ≡I(H;Q)−I(H;Q′) (19)

ILQ′≈H(H)−I(H;Q′) (20)
In Eq. (19), in the special case of {right arrow over (V)}E′={right arrow over (V)}E and {right arrow over (V)}t′={right arrow over (V)}t, optimal training with d=dopt yields ILTΔ=0 and ILQ′=ILQ. To narrow the focus of analysis, the training space (′, {right arrow over (V)}E′, {right arrow over (V)}t′) will be considered fixed and thus will become a component of the system control parameter . Therefore, d becomes fixed by design as d.
Independent Sources of Uncertainty Loss: Loss due to isolated sources of uncertainty within the channel can be computed to provide a means to characterize the relative impacts to information flow at various points in the channel. The various sources of sensing uncertainty induce information loss in the channel as characterized by the random link loss functions ILSΔ, ILFΔ, ILDΔ, and ILTΔ. The prior distributions on the random parameters within {right arrow over (V)}E and {right arrow over (V)}t are propagated to the respective loss functions using Monte Carlo simulation.
Definition 2: The expected value of the link information loss can be written as the expected values of the individual random loss components as in Eqs. (21.a)-(21.d):
The sensing uncertainty factors within {right arrow over (V)}E and {right arrow over (V)}t are assumed to be independent. Given that the total loss function ILTotal can account for multiple independent sources of uncertainty within the parameter space of (, {right arrow over (V)}E, {right arrow over (V)}t), the variance on ILTotal is the sum of the individual variances within the components of ILTotal.
Corollary II: Assuming ne factors within {right arrow over (V)}E and nt factors within {right arrow over (V)}t, the link loss variance can be decomposed as given in Eqs. (22.a)-(22.d):
Definition 3: The expected value of the cumulative link information loss can then be written as the expected values of the individual random cumulative loss components as in Eqs. (23.a)-(23.d), which may be used to obtain the cumulative component information loss as in Step 330 in
μIL{right arrow over (X)}=E[ILSΔ] (23.a)

μIL{right arrow over (Y)}=E[ILSΔ]+E[ILFΔ] (23.b)

μILQ=E[ILSΔ]+E[ILFΔ]+E[ILDΔ] (23.c)

μILQ′=E[ILSΔ]+E[ILFΔ]+E[ILDΔ]+E[ILTΔ] (23.d)
Corollary III: Assuming ne factors within {right arrow over (V)}E and nt factors within {right arrow over (V)}t, the cumulative link loss variance can be decomposed as given in Eqs. (24.a)-(24.d):
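The variance decomposition of Corollaries II and III can be illustrated with a Monte Carlo sketch under an assumed additive loss model; the per-factor standard deviations and the loss offset are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical additive loss model: each independent uncertainty factor in
# V_E and V_t contributes an independent random term to the link loss, so
# the loss variance decomposes as the sum of per-factor variances.
L = 200_000
sig_E = np.array([0.02, 0.05, 0.01])   # assumed per-factor std devs in V_E (bits)
sig_t = np.array([0.03, 0.04])         # assumed per-factor std devs in V_t (bits)

terms_E = rng.normal(0.0, sig_E, (L, sig_E.size))
terms_t = rng.normal(0.0, sig_t, (L, sig_t.size))
IL = 0.2 + terms_E.sum(axis=1) + terms_t.sum(axis=1)   # random cumulative link loss

var_sum = float((sig_E**2).sum() + (sig_t**2).sum())
print(f"empirical Var[IL] = {IL.var():.6f}, sum of per-factor variances = {var_sum:.6f}")
```

The agreement between the empirical variance and the sum of per-factor variances is what licenses the per-source ranking used in Step 370.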
Propagating Link Loss to Link Performance: The variance and mean of the random cumulative loss components IL{right arrow over (X)}, IL{right arrow over (Y)}, ILQ and ILQ′ are used directly to determine the variance on the performance at the random link performance components Pe{right arrow over (X)}, Pe{right arrow over (Y)}, PeQ, and PeQ′. The Maximum Likelihood Estimate (MLE) of Pe is inferred at each realization of the sufficient statistical support about (, {right arrow over (V)}E, {right arrow over (V)}t), providing the random mapping to performance Pe at each link as in Step 335 in
Corollary IV: Given sufficient sampling of the space of {right arrow over (V)}E and {right arrow over (V)}t within the finite alphabet |χ| and |Y|, the environmental and position estimate uncertainty factors result in the respective random performance at {right arrow over (X)} and {right arrow over (Y)} given by functions PeX(, {right arrow over (V)}E, {right arrow over (V)}t) and PeY(, {right arrow over (V)}E, {right arrow over (V)}t) as in Eqs. (25) and (26):
Pe{right arrow over (X)}≡PeX(,{right arrow over (V)}E,{right arrow over (V)}t)≈F(IL{right arrow over (X)}) (25)

Pe{right arrow over (Y)}≡PeY(,{right arrow over (V)}E,{right arrow over (V)}t)≈F(IL{right arrow over (Y)}) (26)
If the conditions of Corollary IV hold and perfect training conditions are assumed where ′=, {right arrow over (V)}E′={right arrow over (V)}E, {right arrow over (V)}t′={right arrow over (V)}t, then the mapping to the decision rule dopt will be optimal.
Corollary V: The output of the discrete random variable Q (from the finite alphabet |Q|) is driven by the inferred decision out of the application of each realization of {right arrow over (Y)} to dopt. The random performance function PeQ(, {right arrow over (V)}E, {right arrow over (V)}t) can be expressed as a random realization of the information loss in the channel, ILQ in Eq. (18.c). Using the approximation form of Eq. (13) (assume I(Q;V)≈0), the random performance function PeQ is given by Eq. (27).
PeQ≡PeQ(,{right arrow over (V)}E,{right arrow over (V)}t)≈F{ILQ} (27)
The approximation in Eq. (27) can be replaced by an equality using the full representation in Eq. (4):
PeQ=F{ILQ+I(Q;V)} (28)
In Eqs. (27) and (28), the relaxation of the constraint {right arrow over (V)}E′={right arrow over (V)}E and {right arrow over (V)}t′={right arrow over (V)}t expands the study of the effects of uncertainty to the loss due to the non-optimal training of d.
Corollary VI: The output of the discrete random variable Q′ (from the finite alphabet) is driven by the inferred decision out of the application of each realization of {right arrow over (Y)} to d. The random performance function PeQ′(, {right arrow over (V)}E, {right arrow over (V)}t) can be expressed as a random realization of the information loss in the channel, H(H)−I(H;Q′). Fixing the suboptimal decision rule d(=βc, {right arrow over (V)}E′=βE, {right arrow over (V)}t′=βt) and using the approximation form of Eq. (4) (assume I(Q;V′)≈0), the random performance function PeQ′ is given by Eq. (29):
PeQ′≡PeQ′(,{right arrow over (V)}E,{right arrow over (V)}t)≈F{ILQ′}=F{H(H)−I(H;Q′)} (29)
The approximation in Eq. (29) is replaced by an equality using the full representation in Eq. (4):
PeQ′≡PeQ′(,{right arrow over (V)}E,{right arrow over (V)}t)=F{H(H)−I(H;Q′)+I(Q′;V′)} (30)
Definition 4: The expected link performance under control parameters and in the presence of sensing uncertainty ({right arrow over (V)}E, {right arrow over (V)}t) is defined as the expectation of the random link performance components Pe{right arrow over (X)}, Pe{right arrow over (Y)}, PeQ, and PeQ′.
Given a sufficient number of Monte Carlo samples over the random parameters in {right arrow over (V)}E and {right arrow over (V)}t, the standard deviation of the random link component performance function is used as a measure of reliability. Reliability is interpreted as 95% confidence that any estimate would fall within the bounds of one standard deviation.
Definition 5: Reliability in predicted link performance is defined as the standard deviation (σP
Uncertainty in Performance: The independent sources of uncertainty contributing to σIL{right arrow over (X)} are propagated to the corresponding set of variances that combine to equal the variance on Pe{right arrow over (X)}.
It is possible to approximate the inverse entropy function (F) by a linear relationship about the mean of IL{right arrow over (X)}: F(IL{right arrow over (X)})=a+b·(IL{right arrow over (X)}). The mean and variance of the approximation are then
Using established approximation techniques, the first order Taylor expansion of F around the mean μIL
Using the Taylor Series expansion in Eq. (32), the approximation for E[F(IL{right arrow over (X)})] and Var[F(IL{right arrow over (X)})] are:
and F′(μIL
Assuming ne factors within {right arrow over (V)}E and nt factors within {right arrow over (V)}t, the cumulative link loss variance components given in Eq. (24.a) are applied to Eq. (34):
The variance on the performance estimate Pe{right arrow over (X)} is then decomposed into the individual sources of sensing uncertainty being propagated through the decision space at {right arrow over (X)}.
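The first-order (delta-method) propagation through F described above can be checked against Monte Carlo; the mean and standard deviation of the random loss are assumed values.

```python
import numpy as np

rng = np.random.default_rng(2)

def h2(p):
    """Binary entropy in bits (vectorized)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def F(x):
    """Inverse binary entropy via vectorized bisection on [0, 0.5]."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.zeros_like(x), np.full_like(x, 0.5)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        below = h2(mid) < x
        lo = np.where(below, mid, lo)
        hi = np.where(below, hi, mid)
    return 0.5 * (lo + hi)

# Assumed mean and standard deviation of the random link loss IL_X (bits).
mu, sigma = 0.40, 0.02
IL = rng.normal(mu, sigma, 100_000)

# First-order Taylor propagation: Var[F(IL)] ~= F'(mu)^2 * Var[IL],
# with F'(mu) obtained by a central finite difference.
eps = 1e-5
F_prime = float((F(mu + eps) - F(mu - eps)) / (2 * eps))
var_lin = F_prime**2 * sigma**2

Pe = F(IL)
print(f"Monte Carlo Var = {Pe.var():.3e}, linearized Var = {var_lin:.3e}")
```

Because sigma is small relative to the curvature of F at this operating point, the linearized variance tracks the Monte Carlo variance closely, which is the stability condition discussed below.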
Similar methods are applied to the independent contributions to the sensing uncertainty of {right arrow over (V)}E and {right arrow over (V)}t comprising the variances
σIL
Stability of the Linear Approximation: The validity of the linear approximation in Eq. (34) requires σIL
is plotted in
Dimensionality and Computing: The computation of the entropy of {right arrow over (X)} involves the joint probability mass function (PMF) of the random multivariate {right arrow over (X)} and is complicated by the large dimensional nature of the observation mapping H→{right arrow over (X)}. It is desired to compute the discrete entropy for {right arrow over (X)} absent any assumption regarding dependence between the respective dimensions of {right arrow over (X)}. If the {right arrow over (X)} space consists of K random variables (dependent or independent) and the random variable Xk; k∈{1,K} has nb distinct bins (statistical divisions), then the size of the alphabet of {right arrow over (X)}, |{right arrow over (χ)}|, is given in Eq. (37):

|{right arrow over (χ)}|=n1·n2· . . . ·nK=nb^K (37)
For example, if K=3 and nk=2=nb for all k, |{right arrow over (χ)}|=2·2·2=8.
The joint PMF of {right arrow over (X)}, p(xk
A high dimensional problem is one where the alphabet of {right arrow over (X)}, |{right arrow over (χ)}|, underlying the random process far exceeds the number of samples observed (N), i.e., |{right arrow over (χ)}|>>N. Sensing systems typically operate within this high dimensional signature data space of |{right arrow over (χ)}|. The high dimension arises due to factors within the space {right arrow over (X)}(, {right arrow over (V)}E, {right arrow over (V)}t). Hypothesis testing and inference within the high dimensional space of {right arrow over (X)} in turn leads to large sampling requirements to adequately determine the underlying statistical nature of the phenomenon under study. Without accurate determination of the underlying system statistics, poorly performing hypothesis tests and/or parameter estimation occur (Bias/Variance tradeoff).
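The exponential growth of the alphabet relative to a fixed sample budget can be made concrete with a small sketch (N = 10,000 is an assumed budget):

```python
# Curse of dimensionality per Eq. (37): the alphabet size grows as nb**K,
# quickly dwarfing any practical sample count N.
nb, N = 6, 10_000
for K in (3, 6, 10):
    size = nb ** K
    print(f"K={K:2d}: |X| = {size:>12,}  high dimensional (|X| >> N): {size > 100 * N}")
```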
The number of statistical bins, nb, within the discrete sampling of the K element joint PMF of {right arrow over (X)} also has a significant effect on |{right arrow over (χ)}| as well as the entropy computation of {right arrow over (X)}. An increase in size of nb in {right arrow over (X)} will result in an increase in the entropy of {right arrow over (X)}. However, in the limit, the value for I(H; {right arrow over (X)}) as a function of nb asymptotes to a constant value after one reaches the full intrinsic dimensionality of the subspace of I(H; {right arrow over (X)}). This will be true for I(H; {right arrow over (Y)}), I(H;Q), and I(H;Q′) as well. A method for determining the intrinsic dimensionality of {right arrow over (X)} is then needed to guide the selection of N.
Sample Size and Minimum Sampling Requirements: The methods used to determine the minimum sampling requirements for entropy estimation and the variance parameters of these entropy estimations (Step 315 in
The link performance variability estimate at each of the respective links, σPe, is written more precisely as in Eq. (38):
is defined as the N sample estimation variance or “sampling uncertainty” associated with the true variability
Eq. (38) can be written as
For the high dimensional problem, N must be large enough for:
The objective then is to produce link reliability estimates that are within this regime. The choice of N must be selected to ensure the uncertainty of the entropic estimate is much less than the reliability limits realized due to various factors within (, {right arrow over (V)}E, {right arrow over (V)}t) under study. That is, the ensemble size N of {right arrow over (X)}, {right arrow over (Y)}, Q, and Q′ should be sufficiently large to ensure that the variance of the estimate falls within three significant digits of the variability levels
σIL
As stated above, |{right arrow over (χ)}| in particular, can grow to large levels and as such the number of samples required will grow as well. Given that the sampling ensemble size N of {right arrow over (X)} is likewise imposed on {right arrow over (Y)} and Q, the following analysis is focused on the process at {right arrow over (X)}.
From
Phase Transitions and the Typical Set: The entropy computation requires the development of the joint mass function associated with the multi-variate {right arrow over (X)}, p(xkj); j∈{1:nb}, k∈{1:K}. The development of this mass function assumes no independence between the K indices of {right arrow over (X)} and is performed using a “linked list” approach to limit the memory requirements during computation. A doubly linked list implementation with a hash table search approach yields a computational complexity of O(). The Miller-Madow estimate provides a faster convergence over the MLE method for finite sample estimates.
Maximum Likelihood Estimate of H({right arrow over (X)}k):

Miller-Madow Estimate of H({right arrow over (X)}k) (note: M+=number of statistical bins for which p(xkj)>0):

ĤMM(Xk)=ĤMLE(Xk)+(M+−1)/(2N)
The N sample estimates for ĤMLE
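The two estimators can be compared on a deliberately undersampled synthetic source; the 64-symbol uniform alphabet and N = 200 are assumed for illustration. The standard Miller-Madow correction (M+−1)/(2N) is stated in nats and is converted to bits in the sketch.

```python
import numpy as np

rng = np.random.default_rng(3)

def H_mle(counts):
    """Plug-in (maximum likelihood) entropy estimate in bits."""
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def H_mm(counts):
    """Miller-Madow estimate: plug-in plus the (M+ - 1)/(2N) bias correction
    (nats, divided by ln 2 to work in bits), where M+ counts occupied bins."""
    N = counts.sum()
    M_plus = int((counts > 0).sum())
    return H_mle(counts) + (M_plus - 1) / (2 * N * np.log(2))

# Uniform source over 64 symbols (true entropy = 6 bits), deliberately
# undersampled at N = 200 to expose the downward bias of the plug-in estimate.
samples = rng.integers(0, 64, 200)
counts = np.bincount(samples, minlength=64)
print(f"MLE: {H_mle(counts):.3f} bits, Miller-Madow: {H_mm(counts):.3f} bits (true: 6)")
```

The Miller-Madow value sits above the plug-in value by construction, partially cancelling the finite-sample bias and giving the faster convergence noted above.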
Phase transitions within the growth trajectory of the estimated entropy with increasing N are useful in defining the alphabet size |{right arrow over (χ)}|. The following illustration demonstrates the usefulness of this approach. The signature process under evaluation will be constructed by design such that the actual entropy value is known. The multivariate random signature vector {right arrow over (X)} is modeled to be uniformly distributed (standard uniform {0,1}) with nb=6 (all indices of {right arrow over (X)}) and K=3. The theoretical maximum value of the entropy of {right arrow over (X)} is then log2(nb^K) or log2(6^3)=7.7549 Bits. In
Initially, the samples are filling the open high dimensional space of {right arrow over (X)} in a uniform fashion. The linear dashed line represents the log2(N) growth of the entropy associated with this uniform distribution. Note that the actual achieved entropy computation begins to diverge from a uniform distribution. Only after the samples of {right arrow over (X)} begin to accumulate in the bin space of the joint mass function of {right arrow over (X)} does this transition occur. This phase transition point represents the point at which the fundamental statistics of {right arrow over (X)} change.
The phase transition point is determined from the intersection of the line tangent to the linear portion of the typical set profile and the line tangent to the asymptotic portion of the profile. The number of samples coinciding with this phase transition point is NT. For the example here, NT is found to be approximately 250 as illustrated in
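The tangent-intersection estimate of NT can be reproduced for the uniform nb=6, K=3 example. A simple plug-in entropy estimator stands in for the linked-list joint-PMF computation described herein; the sample grid is an assumption.

```python
import numpy as np

rng = np.random.default_rng(4)

# X uniform over nb**K = 6**3 = 216 joint bins, so the true joint entropy
# is log2(216) ~= 7.7549 bits.
nb, K = 6, 3
n_cells = nb ** K

def plug_in_entropy(N):
    """Plug-in entropy (bits) of N samples drawn uniformly over the joint bins."""
    counts = np.bincount(rng.integers(0, n_cells, N), minlength=n_cells)
    p = counts[counts > 0] / N
    return float(-(p * np.log2(p)).sum())

Ns = np.unique(np.logspace(1, 4, 30).astype(int))
H_hat = np.array([plug_in_entropy(N) for N in Ns])

# For small N the estimate tracks the log2(N) "typical set filling" line;
# for large N it saturates near log2(216). The tangent-intersection estimate
# of the phase-transition sample size is where log2(N_T) equals the
# saturated entropy, i.e. N_T = 2**H_sat.
H_sat = float(H_hat[-1])
N_T = 2.0 ** H_sat
print(f"saturated entropy ~= {H_sat:.3f} bits, N_T ~= {N_T:.0f} samples")
```

The resulting NT lands near the alphabet size of 216, consistent with the value of roughly 250 quoted above for the full computation.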
Sampling Uncertainty for Probability of Error Estimate: Since the random estimation error variable is essentially the sum of many independently distributed random variables, the estimation error is Gaussian. The standard deviation of the Gaussian distribution of Î(H; {right arrow over (X)}), will then scale as a function of 1/N. Thus the variance on the estimate
can be scaled to a large sample size (σÎ(H;x)
{circumflex over (P)}eX≈H−1(H(H)−Î(H;{right arrow over (X)})) (42)
For the equal probable binary hypothesis case, H(H) is equal to 1 Bit. Therefore the sampling uncertainty
is a function only of σÎ(H;x)
As previously noted, the inverse entropy function in Eq. (42) is a transcendental function and as such the variance on the estimate {circumflex over (P)}eX,
can be very difficult to determine analytically. Following a similar line of analysis found in Eqs. (33) and (34), the mean and variance of {circumflex over (P)}eX can be calculated as:
The use of Eq. (44) requires an estimate of the mean of Î(H; {right arrow over (X)}), which is taken to be the sample mean μ{right arrow over (I)}(H;x)
from a low sample estimate of the mean of Î(H; {right arrow over (X)}),
Manipulating Eq. (44) above, σÎ(H;x)
To ensure
the relationship in Eq. (45) is essential.
The regime of interest is where Î(H; {right arrow over (X)}) is close to 1 and
and thus {circumflex over (P)}eX is small. The derivative of the estimate in this regime is on the order of 0.25 as illustrated in
Therefore, errors in the estimate of μÎ(H;x) can have a significant impact on the estimate of the number of samples required to reach a target sampling uncertainty of
This means that a conservative approach is needed to estimate E[Î(H; {right arrow over (X)})] based on a small number of samples. Instead of using the sample mean
as an estimate of the expectation E{Î(H; {right arrow over (X)})}, a value somewhat less than the sample mean should be chosen. Depending on the level of confidence required in the estimate of the number of samples N, a higher confidence estimate can be achieved by replacing
with
As discussed above, the variance on the estimate Î(H; {right arrow over (X)}),
can be scaled to large sample size (σÎ(H;x)
and the standard deviation,
can be estimated using the low number of samples (N=2NT).
Sampling Uncertainty versus Variability in Performance: The expression in Eq. (45) provides guidance on the level of sampling uncertainty associated with Î(H; {right arrow over (X)}) that is required to achieve the corresponding sampling uncertainty in {circumflex over (P)}eX. A more important question relevant to the study of uncertainty and performance estimation is the relationship introduced in Eq. (44) and written in general form:
The variable α may be set to limit the degree of sampling uncertainty to be realized in the performance confidence analysis. Using Eq. (44), Eq. (34), and the fact that σIL
The factor β(N,NT) in Eq. (47) is given as:
Thus, the expression in Eq. (48) may be used to test for conditions specified in Eq. (46):
The FBIT model provides a platform for the study and analysis of the relationship of the level of sampling uncertainty to the level of performance uncertainty. Incremental values for the ratio on the left side of Eq. (48) can be computed for increasing N. The point at which the inequality is obeyed is related to the phase transition minimum sample methods previously generated.
The following examples and methods are presented as illustrative of the present disclosure or methods of carrying out the invention, and are not restrictive or limiting of the scope of the invention in any manner.
An Information Flow Numerical Example: The application of the FBIT method to the study of uncertainty propagation is now illustrated within a simple radar sensor example. An information loss budget is constructed for a baseline design. Selected forms of uncertainty in Table 1 are introduced into the system to demonstrate the analysis of the effects of propagating uncertainty through the information sensing channel.
Observed Target Scattering Model: In the high frequency regime used to obtain HRR signatures, the target may be approximated as a collection of scattering centers valid over a limited aspect window and frequency band. These scattering centers may be considered to be localized to a point and may represent a variety of scattering phenomena ranging from specular reflection to diffraction phenomena such as edge and tip diffraction. The fields radiated by these point scatterers depend upon both temporal and spatial frequencies (angular dependence). Because the radar illuminating the target has finite bandwidth and is a one dimensional imaging system, the target is seen as a collection of contiguous swaths of range, with each range swath corresponding to a particular range. The extent of each range swath, range resolution, depends upon the signal bandwidth. For a typical extended target of interest, each range swath contains a number of scattering centers which can be widely spaced in cross-range.
The electromagnetic field obtained as a result of the interference of the scattered fields from the scattering centers appears as the signal corresponding to a particular range bin of the target signature. The target signature may be considered to be a one dimensional image of the reflectivity (or scattering) profile of the target for a given azimuth/elevation aspect angle (θ, ϕ) and bandwidth. The mathematical definition of the radar signature is developed from the normalized scattered field in Eq. (49), where {right arrow over (E)}s and {right arrow over (E)}i are the scattered field and the incident field, respectively:
Using scattering center modeling and the far field approximation, Eq. (49) can be written in terms of the target aspect angle and the transmitted wavelength as shown in Eq. (50):
In Eq. (50), SE is the band-limited frequency response of the target comprised of M scattering centers at respective ranges Rm. Conditioned on the target hypothesis H at a fixed aspect angle (θi, ϕi), {right arrow over (S)}E(θi, ϕi)=SE(θi, ϕi, λ), λ∈{λl, λl+1, . . . λƒ} defines the band-limited frequency response of the normalized scattered field measurements given in Eq. (50). Clusters of simple scattering centers are chosen for targets of interest at X-band frequencies (8-12 GHz) in the following development. The targets are electrically large with dimensions in range and cross-range of many wavelengths. The target cluster of M isotropic scatterers occupies the target volume within the radar sensor coordinate system illustrated in
The three-dimensional target scattering center configurations for the two targets examined in the following example occupy an approximate cubic volume of {x=2, y=3, z=2.5} meters and are positioned at a line-of-sight, {right arrow over (l)}OS, of (θt, ϕt)=(10°, 7.5°). Both targets are comprised of 100 scattering centers of unity amplitude and three strong localized scattering clusters of amplitude 5. Target 1 differs from target 2 in that target 1 is shorter than target 2 in the Y dimension by 0.5 meters. One of the localized scattering clusters is also displaced by (0.2, 0.2, 0) meters.
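The scattering-center signal model described above can be sketched numerically. The following is a minimal illustration, not the document's implementation: the band-limited frequency response is formed as the coherent sum of point scatterers, each delayed by its down-range distance along the line of sight. The scatterer positions drawn here are illustrative stand-ins for the target configurations in the example.

```python
import numpy as np

rng = np.random.default_rng(0)

c = 3e8                                  # speed of light (m/s)
freqs = np.linspace(8e9, 12e9, 256)      # X-band stepped frequencies (Hz)

def line_of_sight(theta_deg, phi_deg):
    """Unit vector toward the radar for aspect angle (theta, phi)."""
    th, ph = np.radians(theta_deg), np.radians(phi_deg)
    return np.array([np.cos(ph) * np.cos(th),
                     np.cos(ph) * np.sin(th),
                     np.sin(ph)])

def frequency_response(positions, amplitudes, theta_deg, phi_deg):
    """Band-limited response: coherent sum of M isotropic point scatterers."""
    los = line_of_sight(theta_deg, phi_deg)
    r = positions @ los                                  # down-range of each scatterer (m)
    phase = -1j * 4.0 * np.pi * np.outer(freqs, r) / c   # two-way propagation phase
    return (np.exp(phase) * amplitudes).sum(axis=1)

# Illustrative target: 100 unity scatterers in a 2 x 3 x 2.5 m volume, with
# three strong scatterers of amplitude 5 (positions are hypothetical).
pos = rng.uniform([0.0, 0.0, 0.0], [2.0, 3.0, 2.5], size=(100, 3))
amp = np.ones(100)
amp[:3] = 5.0

S_E = frequency_response(pos, amp, 10.0, 7.5)
print(S_E.shape)   # one complex sample per stepped frequency
```

Each column of the phase matrix corresponds to one scatterer, so widely spaced scatterers within a range swath interfere coherently at each frequency, as the text describes.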
Radar Sensor Model: Applying matched filter processing and the discrete Fourier transform to the observed signature {right arrow over (S)}E(θi, ϕi) in additive noise, the measured HRR signature can be modeled for a range of frequencies present in the transmitted waveform. The multidimensional encoded source {right arrow over (X)}Ei is defined here as the vector form of the time delay transformation of the band-limited frequency response {right arrow over (S)}E (θi, ϕi). The measured random signature process {right arrow over (X)}ni is then defined as in Eq. (51), where {right arrow over (n)} is additive white noise:
{right arrow over (X)}ni={right arrow over (X)}Ei+{right arrow over (n)}  (51)
The process {right arrow over (X)}ni is modeled at the output of a radar step frequency measurement sensor system for the specified target aspect angle (θi, ϕi). The additive noise process {right arrow over (n)} is modeled as the sum of thermal white noise and quantization noise components. The quantization error component is thought of as a random process uncorrelated with both the signal and the thermal noise. The complete radar step frequency measurement model system parameters are summarized in Table 3.
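The measurement model of Eq. (51) can be sketched as follows. This is a hedged illustration under simplifying assumptions: the encoded source is taken as the inverse DFT (time-delay transform) of a stand-in frequency response, and the additive term is complex white Gaussian thermal noise at an assumed 20 dB SNR; the quantization noise component is omitted here and treated later with Eq. (52).

```python
import numpy as np

rng = np.random.default_rng(1)

N_f = 256
# Stand-in band-limited frequency response (unit-amplitude random phases);
# in the document this would be S_E from the scattering-center model.
S_E = np.exp(-1j * 2.0 * np.pi * rng.uniform(size=N_f))

X_E = np.fft.ifft(S_E)                    # encoded source: HRR time-delay profile

snr_db = 20.0                             # assumed measurement SNR
noise_power = np.mean(np.abs(X_E) ** 2) / 10 ** (snr_db / 10)
n = np.sqrt(noise_power / 2) * (rng.standard_normal(N_f)
                                + 1j * rng.standard_normal(N_f))

X_n = X_E + n                             # measured signature process, Eq. (51)
```

Monte Carlo ensembles of {right arrow over (X)}ni are then obtained by redrawing the noise vector for each sample signature.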
The sensing of {right arrow over (X)}ni in a dynamic, real world environment is subject to the uncertainties listed in area 1 of Table 1 leading to the random signature process {right arrow over (X)} as previously outlined and as summarized in Table 2. Given the dynamic nature of the phenomenon underlying these uncertainties, the statistics associated with the dimensions of {right arrow over (X)} are often time varying. The target statistics are assumed to be stationary (constant with time), thus, the sample signatures associated with this random vector correspond to a stationary random process. Given the short measurement times associated with radar measurements of the nature under study, this assumption is appropriate.
Modeling Pose Angle Estimation Uncertainty: The observed object aspect angle estimate can be viewed as lying within a solid cone angle centered on the observed object aspect angle (θt, ϕt). The parameter σt is defined as the uncertainty associated with the sensor estimate of (θt, ϕt). The parameters σt and μt are elements of {right arrow over (V)}t and are the standard deviation and bias of the object aspect angle estimate, respectively.
The variation in measured signature phenomenology due to the uncertainties in target aspect angle is generated in the signal model in Eq. (50) through the introduction of distributions on θ and ϕ. The parameters θ and ϕ are both modeled as Gaussian random variables, each with variance σt2 and means θt+μt and ϕt+μt, respectively. The bias parameter μt is assumed to be unknown and is modeled as uniformly distributed over the interval [−1, 1] degrees.
Modeling Leading Edge Position Estimation Uncertainty: The leading edge location estimate will vary under real world sensing conditions. Thus, the range alignment (along {right arrow over (l)}OS) of the measured signature process {right arrow over (X)} to the decision rule training process {right arrow over (X)}′ is imperfect and can be modeled as an uncertainty source. The alignment of {right arrow over (X)} to {right arrow over (X)}′ is modeled through a positive bias applied to the phase center of the scattering cluster underlying {right arrow over (X)}. The bias parameter μr is assumed to be unknown and is modeled as uniformly distributed over [0, 0.2] meters. Note that μr is another element of {right arrow over (V)}t.
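The two uncertainty models above can be sketched as Monte Carlo draws. All distribution parameters below are taken from the text (σt=0.75 degrees, μt uniform on [−1, 1] degrees, μr uniform on [0, 0.2] meters); the ensemble size is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(2)

theta_t, phi_t = 10.0, 7.5      # true aspect angle (degrees)
sigma_t = 0.75                  # aspect estimate standard deviation (degrees)
N = 1000                        # illustrative ensemble size

# Pose angle uncertainty: Gaussian theta, phi about the true aspect plus an
# unknown bias mu_t drawn once per scenario from U[-1, 1] degrees.
mu_t = rng.uniform(-1.0, 1.0)
theta = rng.normal(theta_t + mu_t, sigma_t, N)
phi = rng.normal(phi_t + mu_t, sigma_t, N)

# Leading edge uncertainty: unknown positive range bias mu_r ~ U[0, 0.2] m
# applied to the phase center of the scattering cluster underlying X.
mu_r = rng.uniform(0.0, 0.2)
```

Each sampled (θ, ϕ) pair would drive one evaluation of the signal model in Eq. (50), producing the random signature process {right arrow over (X)}.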
Modeling Imperfect Training: The training process component {right arrow over (X)}′ in
Feature Discriminate and Decision Rule Design: The function ƒ used to compute the feature discriminate {right arrow over (Y)} in
The Maximum Likelihood estimator is used to determine the optimal decision rule d:
Assuming equally likely priors on each of the binary hypotheses H1 and H2 in {right arrow over (X)} and {right arrow over (Y)}, the samples drawn from {right arrow over (Y)} are applied to the decision rule d. Samples {right arrow over (Y)}<d are declared from H1 (denoted Q1) and samples {right arrow over (Y)}>d are declared from H2 (denoted Q2). The in-class and out-of-class scoring system is given by the conditional probabilities within α, β, γ, and κ as provided below:
The output of the decision algorithm Q as formed from the scoring system above can be summarized by the confusion matrix for the binary classifier given in Table 4:
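The threshold decision rule and confusion-matrix scoring can be sketched as follows. The Gaussian discriminants here are illustrative stand-ins for the feature statistic {right arrow over (Y)}; the conditional probabilities follow the convention above, with α and κ the in-class scores and β and γ the out-of-class scores.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative feature samples under each hypothesis (stand-ins for Y).
y1 = rng.normal(0.0, 1.0, 5000)   # samples from H1
y2 = rng.normal(2.0, 1.0, 5000)   # samples from H2
d = 1.0                           # ML threshold for equal priors, equal variance

alpha = np.mean(y1 < d)           # P(Q1 | H1): correct H1 declarations
beta = np.mean(y1 >= d)           # P(Q2 | H1): H1 samples declared H2
gamma = np.mean(y2 < d)           # P(Q1 | H2): H2 samples declared H1
kappa = np.mean(y2 >= d)          # P(Q2 | H2): correct H2 declarations

Pe = 0.5 * (beta + gamma)         # probability of error under equal priors
print(np.array([[alpha, beta], [gamma, kappa]]))   # binary confusion matrix
```

The 2×2 array printed at the end has the structure of the binary-classifier confusion matrix of Table 4, with rows conditioned on the true hypothesis.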
Certainty States: The most certain state achievable for the example HRR radar example presented here is the case of the observed deterministic multivariate signal in noise ({right arrow over (X)}ni) when accompanied by perfect training ({right arrow over (X)}′={right arrow over (X)}ni). Table 5 relates selected combinations of measurement and training uncertainty sources from Table 1. The cases 1-6 identified in Table 5 represent the certainty states of interest within the system. Unknown parameters are shown in bold.
Assuming sufficient sampling to completely determine the probability density function (pdf) associated with the additive noise, the resulting statistical characteristics of the random performance functions will resemble the delta function, and thus the reliability in predicted link performance (such as σPe) will be high.
Case 1 of Table 5 represents an observed process {right arrow over (X)}n of a stationary object of known aspect angle with perfect training. Case 1 conditions correspond to the highest certainty state possible. Case 2 corresponds to the observed process {right arrow over (X)} of an object that is moving slowly enough to appear stationary during the measurement interval. The aspect estimation uncertainty is σt=0.75 degrees with an unknown bias (μt), and again the training is perfect. Case 3 conditions are similar with an unknown leading edge position bias μr.
The signal-to-noise ratio (SNR) parameter is treated as an unknown parameter in Case 4. Case 5 is a combined condition of the unknown parameters in Cases 2, 3, and 4. In case 6, a form of imperfect training is presented where the measurement parameter uncertainty provided in Case 5 is combined with training level B (μr=0 and μt=0).
Sampling and FBIT Analysis: The amplitude response for the N sample ensemble of HRR signatures for a “baseline” set of conditions defined as Case 2 (μr=0 and μt=0) is provided in
Sampling Uncertainty Example: The sampling uncertainty previously defined is illustrated using the baseline uncertainty conditions and multiple target ensembles similar to those previously discussed. Using Monte Carlo simulation, the typical set for {right arrow over (X)}1, {right arrow over (X)}2, and {right arrow over (X)} is computed for increasing values of N. Multiple ensembles of each are simulated at each value of N to generate both the mean and variance of the entropy estimate within the typical set.
The sampling uncertainty associated with entropic estimation at {right arrow over (X)} is realized within the estimate Î(H; {right arrow over (X)}).
In Eq. (47), Corollaries IV and V are used to compute the sampling uncertainty associated with the estimate of the probability of error. The following figures demonstrate the accuracy of Corollaries IV and V using Eq. (44), which is applied at each link in the radar channel. Note that each application of Eq. (44) is conducted with 2×NT=6×10³ as the basis for the scaling. The approximation for the standard deviation of the probability of error is computed for the complete range of ensemble sizes out to N=3×10⁴.
The application of Eq. (44) at each draw of the Monte Carlo simulation will generate an approximation of the sampling uncertainty within the probability of error estimate.
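The 1/√N behavior of the sampling uncertainty can be illustrated numerically. Eq. (44) is not reproduced in this chunk, so the sketch below uses the standard binomial approximation instead: for a true error rate p, the Monte Carlo estimate over an N-sample ensemble has standard deviation √(p(1−p)/N), so the reliability of the estimate improves as the ensemble size grows.

```python
import numpy as np

rng = np.random.default_rng(4)

# For each ensemble size N, simulate 2000 independent Monte Carlo estimates
# of a true error rate p_true and compare the empirical spread of the
# estimates with the binomial standard deviation sqrt(p(1-p)/N).
p_true = 0.1
for N in (100, 1000, 10000):
    draws = rng.binomial(N, p_true, size=2000) / N
    print(N, draws.std(), np.sqrt(p_true * (1 - p_true) / N))
```

Each printed row shows the empirical and theoretical standard deviations converging, mirroring the reduction in sampling uncertainty the text describes as N increases.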
Eq. (48) provides the test for minimum sampling based on low sample ensemble sizes. In
The Fano Equality: It is important to demonstrate the validity of Theorem I as written in Eq. (3). Using the radar example,
Experiments: The experiments conducted are given in Table 6:
Information Flow and Design Trades within the Radar Channel: The value of the Data Processing Inequality is readily seen from
The signal-to-noise ratio of the signatures resulting from sensor measurements depends in part on the noise figure of the system. In
It is also of interest how the dynamic range of the sensor affects the information flow through the channel. Specifically, the sensitivity of I(H;Q) and ultimately Pe to the dynamic range in the sensor is of interest. The A/D conversion of the radar intermediate frequency (IF) signal to a digital representation must preserve the amplitude and phase information contained in the radar return with minimum error. The effects of quantization at each measurement point (quantization event) due to the twos-complement rounding error are assumed to be zero mean white noise processes. The A/D conversion and associated quantization noise are modeled as an additive noise component {right arrow over (e)} and added to the measured signature process.
{right arrow over (X)}ni={right arrow over (X)}Ei+{right arrow over (n)}+{right arrow over (e)}  (52)
The maximum dynamic range supportable by a “B-bit” quantizer is the ratio of the largest representable magnitude to the smallest nonzero representable magnitude. The dynamic range for two's-complement and magnitude encoding for a “B-bit” quantizer is
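The document's dynamic-range expression is elided here, so the following sketch uses the standard relationship for a B-bit two's-complement quantizer: the ratio of the largest representable magnitude (2^(B−1) steps) to the smallest nonzero magnitude (one step) gives a maximum dynamic range of 20·log10(2^(B−1)) ≈ 6.02·(B−1) dB.

```python
import math

def dynamic_range_db(bits):
    """Maximum dynamic range of a B-bit two's-complement quantizer, in dB."""
    return 20.0 * math.log10(2 ** (bits - 1))

for b in (4, 8, 12):
    print(b, round(dynamic_range_db(b), 1))
```

Note that B=4 yields roughly 18 dB, consistent with the ~20 dB dynamic-range operating points discussed in the design trades below.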
The dynamic range trade in
The analysis of the bandwidth trade in
One would then expect a ‘bump’ in information flow when the bandwidth reaches the level that supports the resolution necessary to resolve the peaks associated with these two scatterers. The theoretical bandwidth required to achieve this feature separation would be approximately 800 MHz using the fundamental bandwidth relationship;
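The bandwidth relationship referenced above is assumed here to be the standard range-resolution formula for a one-dimensional imaging radar, ΔR = c/(2·BW), which can be checked directly:

```python
# Fundamental bandwidth-resolution relationship (assumed standard form):
# delta_R = c / (2 * BW). An 800 MHz bandwidth resolves features separated
# by roughly 0.19 m in range.
c = 3e8  # speed of light (m/s)

def range_resolution(bandwidth_hz):
    return c / (2.0 * bandwidth_hz)

def required_bandwidth(delta_r_m):
    return c / (2.0 * delta_r_m)

print(range_resolution(800e6))   # resolution at 800 MHz, in meters
```

Inverting the relationship with `required_bandwidth` recovers the ~800 MHz figure for the scatterer separation in the example.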
In
In each figure, it can be seen that the MI decreases as links move further down the channel. With one Bit going into the channel (binary classification problem), Table 7 tabulates the information loss budget for each trade study at the selected baseline operating point.
The study of Table 7 reveals several key points. First, in this particular example problem, the targets separate very well at {right arrow over (X)}, and much of the loss occurs within the feature extraction and at the application of the decision rule. The loss at link {right arrow over (Y)} appears to be the dominant information-limiting component in the system: 0.3-0.4 Bits are lost at the feature extraction function at {right arrow over (Y)}, while the information loss associated with signature measurement and signature processing amounts to only 0.1 Bits. This is very important information for the effective optimization of system design for information sensing. Little gain in overall system performance can be expected from expanding the sensing degrees of freedom (DOF).
Also, the loss due to the decision component of the system is in the range of 0.1-0.2 Bits. Depending on the performance requirements of the system, improvements to the decision stage of the system may or may not be warranted. By the decision stage of the system, 0.4-0.5 Bits of loss have been sustained, resulting in an “upper bound” in performance on the order of Pe=0.1. No improvement to the classifier design within the decision component of the system can improve upon this performance level. Improvements appear to be best directed toward the feature extraction stage.
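The conversion from cumulative information loss to the Pe "upper bound" quoted above can be checked numerically. With one bit entering the channel and equal priors, the document's Fano equality in binary form is treated here as IL_Q = H(H) − I(H;Q) = h(Pe), where h is the binary entropy function; inverting h on [0, 0.5] maps a 0.4-0.5 Bit loss to an error probability near 0.1.

```python
import math

def binary_entropy(p):
    """Binary entropy h(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def pe_from_loss(loss_bits, tol=1e-9):
    """Invert h(p) = loss_bits on [0, 0.5] by bisection (h is increasing there)."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binary_entropy(mid) < loss_bits:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(pe_from_loss(0.4), 3), round(pe_from_loss(0.5), 3))
```

The two endpoints bracket Pe between roughly 0.08 and 0.11, consistent with the "area of Pe=0.1" stated in the text.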
An optimal design operating point may for example include the following component selections: (i) A/D converter with B=4 Bits; (ii) receiver design that achieves 20 dB SNR under tactically significant conditions; and (iii) transmit waveform with BW>800 MHz.
Information Flow and System Uncertainty: The study of the effects of sources of uncertainty on system performance confidence while under control parameters and in the presence of sensing uncertainty ({right arrow over (V)}E, {right arrow over (V)}t) is of particular interest. For a fully sampled signature process with negligible sampling uncertainty per Eq. (46), the FBIT method can be applied to study the independent sources of uncertainty. The effects of each independent source of uncertainty can be studied at each link in the channel. Eq. (36) is demonstrated for links {right arrow over (X)}, {right arrow over (Y)}, and Q under case 5 conditions defined in Table 5. Under these conditions, three independent sources of uncertainty are introduced in the system under perfect training conditions. An unknown bias in target aspect estimation and an unknown bias in leading edge range bias estimation are assumed. The target range is also unknown, and as such a third uncertainty is introduced in the SNR of the measured signature. All assumed statistics associated with the uncertainties are as defined under case 5 of Table 5 and as described previously.
Using Monte Carlo simulation, L independent draws of an NM sample ensemble from {right arrow over (X)} are generated. The FBIT method is applied at each draw to generate the decomposition of the performance estimate reliability in Eq. (36) at {right arrow over (X)}, {right arrow over (Y)}, and Q. In
The corresponding impacts to the reliability in link performance can be generated through the application of Corollary IV and V. In
In
The implications of imperfect training are realized in the final stage of the channel at Q′ as shown in
A summary of the expected link loss, expected link performance, reliability in link performance, and results of respective sampling uncertainty tests in
From Table 8 it can be seen that gains in performance due to component design trades must also take into account the reliability level associated with predicted performance. In this example problem, changes within two significant digits of the expected performance should be studied in the context of the reliability of the performance estimates based on uncertainty factors introduced in the system.
By virtue of the foregoing, a method is provided for identifying and characterizing component-level information loss in a nonlinear system comprising a plurality of components, wherein at least one of the components of the nonlinear system is subject to at least one source of uncertainty, each source of uncertainty comprising a plurality of system uncertainty parameters, the method comprising the steps of: (a) determining discrete decision states for the nonlinear system, wherein the discrete decision states comprise a true object state H and a decision state Q, the discrete decision states being characterized in a Markovian channel model comprising a plurality of links, wherein each link corresponds to one component of the nonlinear system; (b) modeling the system uncertainty parameters to create a plurality of distributions, wherein each distribution comprises a plurality of values ranging from a theoretical maximum entropy to a theoretical minimum entropy for one system uncertainty parameter, wherein at least one of the system uncertainty parameters is unknown; (c) calculating an entropy at each component, H(H), H(X), H(Y), . . . H(Q), wherein the entropy is directly related to an amount of uncertainty at each component; (d) computing an amount of mutual information between H and Q, I(H;Q), wherein I(H;Q) is used to characterize a total system performance and wherein the at least one source of uncertainty increases a total amount of entropy in the nonlinear system, thereby decreasing I(H;Q) and degrading the total system performance; (e) calculating an amount of cumulative component information loss from H to Q, ILX, ILY, . . . ILQ, wherein ILQ is equal to a sum of the component-level information loss that occurs at each component, ILXΔ, ILYΔ, . . . 
ILQΔ, and wherein component-level information loss occurs only within the Markovian channel model; (f) correlating, using Fano's equality, at least one of I(H;Q) and ILQ to the total amount of entropy to generate at least one overall probability of error Pe for the nonlinear system; (g) estimating, using the Data Processing Inequality together with Fano's equality, a component-level probability of error, PeX, PeY, . . . PeQ; and (h) correlating the component-level probability of error to the component-level information loss.
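Steps (c) through (e) above can be sketched for a binary hypothesis propagated through discretized channel links. The joint probability tables below are illustrative stand-ins, not the document's radar channel; they are chosen so that the Data Processing Inequality I(H;X) ≥ I(H;Y) ≥ I(H;Q) holds along the Markov chain.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(joint):
    """Mutual information between the two axes of a joint probability table."""
    px, py = joint.sum(axis=1), joint.sum(axis=0)
    return entropy(px) + entropy(py) - entropy(joint.ravel())

# Illustrative joint tables P(H, link) for successive links X, Y, Q.
p_HX = np.array([[0.48, 0.02], [0.03, 0.47]])
p_HY = np.array([[0.45, 0.05], [0.06, 0.44]])
p_HQ = np.array([[0.42, 0.08], [0.09, 0.41]])

H_H = 1.0  # one bit enters the channel (equal binary priors)
mis = []
for name, joint in [("X", p_HX), ("Y", p_HY), ("Q", p_HQ)]:
    mi = mutual_information(joint)
    mis.append(mi)
    print(name, round(mi, 3), round(H_H - mi, 3))  # MI and cumulative loss IL
```

Each printed row gives the mutual information at that link and the cumulative information loss IL = H(H) − I(H;link), the quantities that step (f) then maps to a probability of error via Fano's equality.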
In one or more embodiments, the method further comprises computing a component-level performance reliability and attributing a contribution of each system uncertainty parameter to the component-level performance reliability, the method comprising the steps of: (a) determining a real world statistical variation of the system uncertainty parameters; (b) performing a Monte-Carlo simulation of a plurality of the statistical uncertainty parameters for a plurality of settings through iteration of steps 1b) to 1h); (c) calculating a component-level probability of error statistical distribution at each component; (d) determining the component-level performance reliability based on a standard deviation of each component-level probability of error statistical distribution; and (e) correlating the contribution of each system uncertainty parameter to the component-level performance reliability.
In an exemplary embodiment, the step of performing the Monte-Carlo simulation further comprises determining a proper ensemble sample size. In an exemplary embodiment, the method further comprises determining at least one component-level ensemble sampling requirement for the method of claim 1, the method comprising the steps of: (a) determining a set of test criteria for a maximum allowable sampling uncertainty of the component-level information loss relative to the component-level probability of error statistical distributions; (b) determining a sample ensemble size NM for the component-level information loss using a phase transition method; and (c) computing the component-level performance reliability using a numerical simulation method on the sample ensemble size NM. In a particular embodiment, the numerical simulation method comprises Monte Carlo modeling.
In another aspect of the present disclosure, a method is provided for determining an optimal component design for a nonlinear system comprising a plurality of components, wherein at least one of the components of the nonlinear system is subject to at least one source of uncertainty, each source of uncertainty comprising a plurality of system uncertainty parameters, the method comprising the steps of: (a) establishing an information loss budget comprising a desired PeQ; (b) calculating component-level information loss, ILXΔ, ILYΔ, . . . ILQΔ, according to claim 1; (c) calculating component probability of error, PeX, PeY, . . . PeQ, according to claim 1 to generate a calculated PeQ; (d) comparing the calculated PeQ with the desired PeQ; (e) identifying at least one source of information reduction, wherein the at least one source of information reduction comprises at least one of component-level information loss and information flow reduction; (f) determining the optimal component design to minimize the calculated PeQ, wherein the optimal component design includes at least one tradeoff between information flow and component design, wherein the at least one tradeoff decreases the at least one source of information reduction; and (g) repeating steps 6b) to 6g) until the calculated PeQ is equal to or less than the desired PeQ.
In one or more embodiments, the method further comprises identifying at least two sources of information reduction, wherein the at least two sources of information reduction comprise at least one of component-level information loss and information flow reduction; ranking the at least two sources of information reduction according to impact on the calculated PeQ, wherein at least one dominant source of information reduction is identified; and determining the optimal component design to minimize the calculated PeQ, wherein the optimal component design includes at least one tradeoff between information flow and component design, wherein the at least one tradeoff decreases the at least one dominant source of information reduction.
Information Quantification and Isolation Device.
The following portion of the present disclosure makes reference to the following references:
1. Background. Information Theory has long been intertwined with statistical signal processing. Claude Shannon devised Information Theory to provide detailed design rules for transmitter/receiver protocols in communication systems, yielding long sought provably optimal communication links [1, 2]. The application of information theory to computation and hypothesis testing has been more elusive. In 1963, Winograd and Cowan examined applying information theoretic concepts to computation [3]. In a similar vein, Woodward in 1964 explored applying information theory to radar signal processing and target identification (hypothesis testing) [4]. Bejan in 1996 explored applying information theoretic ideas, specifically entropy minimization, to the optimization of hardware design [5]. More recently, Tishby, in a series of papers which started in 1999, proposed using information theory to advocate an information bottleneck approach to computation and hypothesis testing, whereby information theory is used in a hypothesis testing architecture to identify components which limit overall system performance [6]. The Fano Based Information Theoretic (FBIT) method proposed by Malas et al. is the first generic design methodology which utilizes information theory to provide detailed guidance for hypothesis testing systems such as radar signal processing. In many ways the FBIT design process may be considered to be a method which encompasses all of the aforementioned approaches. The FBIT method enables detailed tradeoff analyses among system criteria such as Size, Weight and Power (SWaP), as well as all other metrics of interest, while maintaining an overall system performance level such as probability of error. A detailed understanding of subsystem/component tradeoffs is critical to the efficient and cost-effective construction of hypothesis testing hardware and software, for example as used in radar equipment.
Design tradeoffs enabled by the FBIT approach are detailed below for a modern radar system architecture. The FBIT method described above outlines the mathematical concepts behind the generic design and optimization procedure. A numerical example is provided using high fidelity modeling and simulation of a radar system. In what follows below, the results of this numerical example are related to the implementation of the hardware design of the FBIT method: the Information Quantification and Isolation Device (IQID). The IQID is applied to a design, development, and test application that is widely used by radar developers in the industry. The HardWare-In-The-Loop (HWITL) application is the primary platform for hardware component design, test, and fault isolation. The radar system hardware and software components are operated on the ground in a laboratory bench arrangement. Surface-to-air radar measurements are conducted on aircraft targets of opportunity as well as on instrumented target aircraft. The results of the numerical radar system example are shown to be relevant to hardware and functional component design trades that lead to optimally performing hardware.
2. Introduction. The hardware implementation described above is referred to in this section as the Information Quantification Isolation Device (IQID). This Hardware Implementation Prototype section provides a specific application of the IQID to a radar hardware-in-the-loop testing system. 3. Summary: A high-fidelity numerical example is described above that demonstrated the application of the IQID (via the FBIT approach) to the design and optimization of the components of a radar system. The numerical radar example is given in the context of a radar designed to perform a target identification function on commercial aircraft. The radar must determine if the aircraft is a large commercial jet airliner or a small propeller plane. In this addition to the original material provided above, this section of the application is further explained in the context of a radar system design, development, and test facility, known in the art as a hardware-in-the-loop (HWITL) facility. The radar system hardware and software components are operated on the ground in a laboratory bench arrangement. Surface-to-air measurements are conducted on aircraft targets of opportunity as well as on instrumented target aircraft. Aircraft target measurements are made using instrumentation over an extended period of time, and radar measurements at various stages in the system are stored for later analysis. The results of the numerical radar system example are shown to be relevant to hardware and functional component design trades performed on the HWITL facility using the IQID that lead to optimally performing hardware.
The original numerical example trades given above were designed to demonstrate a significant increase in capability over the current art in applications such as that within the HWITL or any other system component design/development/test operation. In
Design and Development Trade Capability: Component design trade capability is described above for the optimal design of radar components in a system. System performance bottlenecks can be identified and component hardware modified.
The significant increase in capability over the current art at each probe point is explained below. In each example below, current art does not enable quantification and isolation of information and performance loss at each probe point. Current art allows for quantifying “signal power loss” relevant to target detection functionality but does not address “target signal information loss” relevant to target identification and statistical inference functionality.
With continued reference to
In one or more embodiments, the workstation 2610 includes an HWITL application 2668 that enables the controller 2612 to interface with and supervise the IQID 2604. The controller 2612 manages a Fano-based information theoretic method (FBIT) design and optimization of nonlinear systems component 2670. In one or more embodiments, IQID 2604 is a system of bench instrumentation that is communicatively coupled to probe points 2603a-2603f to obtain radar data 2618. In one or more embodiments, IQID 2604 is a handheld device that is selectively communicatively coupled to the probe points 2603a-2603f. Description of probe points 2603a-2603f:
a. Probe point H 2603a represents the various target aircraft labels that have been measured in the open-air tests. These labels can be entered into the IQID 2604.
b. Probe point S 2603b represents the output of the Analog-to-Digital (A/D) converter. Measurement of component information loss and system performance due to noise insertion and signal clipping is critical here. Significant information loss is possible depending on the A/D converter design, and limitations imposed here can severely constrain system performance. An example of the A/D converter trades designed to maximize information flow and system performance at probe point S is provided in
c. Probe point 2603c represents the data coming out of the digital in-phase and quadrature (I/Q) processor component. A data conversion function is performed here and frequency-sampled signatures are generated. Information loss due to signal conversion techniques is important and can be quantified.
d. Probe point 2603d represents the data coming out of the Fast Fourier Transform (FFT) converter component.
e. Probe point 2603e represents the data out of the feature extraction component of the system. A variety of feature design approaches can be applied here, and the information loss and resulting performance associated with each can be quantified. A great deal of expensive trial and error at the system level can be avoided by probing the system at the feature component.
f. Probe point Q 2603f represents the output of the aircraft target identification classifier component. The classifier component accounts for a large portion of the system design complexity and cost.
Radar System Design and Development in Uncertainty Capability: Absent various sources of system uncertainty, the performance we would expect to see at each component probed in the radar system is given by the expected link performance. In real world operation, the radar system is exposed to various sources of uncertainty that will impact the expected link performance. The variation in the expected performance is quantified by the reliability at each component. Details regarding the performance reliability are given above. The present disclosure provides for relating system performance reliability to the respective hardware component reliability contributions and isolating the major contributors to system reliability. The present disclosure further uses this insight to select specific hardware components for improvement in order to reach the system reliability performance specifications.
The numerical radar example provides a quantified degree of reliability at various components in the radar system using the process outlined in
4.4 System Performance Gains using the IQID: The advantages of using the IQID in the HWITL radar context can best be summarized through the study of several designs where a significant increase in system performance and reduced complexity/cost in hardware are realized. The following four design cases are taken from the numerical examples provided above and are directly applicable to the HWITL radar bench setting which is expanded upon herein. A radar system is designed to perform target identification on airborne aircraft and determine if the aircraft is a commercial passenger jet aircraft or a private small single engine propeller plane. The specifications for the performance of the radar include the important requirement that the probability of error in identification (Pe) be less than 0.05.
In Design Trade 1, the following radar hardware component design parameters are set in an ad hoc manner: system receiver bandwidth=400 MHz, system signal-to-noise ratio=20 dB, system dynamic range=20 dB. This results in marginal performance, with target identification error Pe=0.31 and a corresponding total system information loss of 0.9 Bits. In this design, information flow is choked off at the feature extraction component in the system, limiting Pe to greater than 0.35. This ad hoc design produces early choke points in the system flow, and the resulting marginal system performance is shown in the design tradeoff graphic in
In Design Trade 2, the engineering designer increases the system receiver bandwidth to 1 GHz, leaves the system signal-to-noise ratio at 20 dB, and increases the complexity of the A/D converter component to achieve a system dynamic range of 40 dB. The target ID error is now reduced to Pe=0.146 (approximately 15%), and system information loss is also reduced to 0.6 Bits. Acceptable performance is achieved by increasing component capability, as shown in
In Design Trade 3, the system designer uses the IQID. Key components are probed and the radar information loss budget is generated by the IQID. Using this information, the designer sets the system receiver bandwidth to 800 MHz, leaves the system signal-to-noise ratio at 20 dB, and leaves the system dynamic range at 20 dB. The target ID error remains at Pe=0.146 (approximately 15%) and the corresponding system information loss also remains at 0.6 Bits. This design results in acceptable performance, as in Design Trade 2, as shown in
In Design Trade 4, the engineering designer again uses the IQID. Increased performance over the acceptable level achieved in Design Trade 3 is desired. Key components are again probed and the radar information loss budget is generated by the IQID. Using this information, the designer leaves all three key design parameters the same as in Design Trade 3 (system receiver bandwidth=800 MHz, system signal-to-noise ratio=20 dB, system dynamic range=20 dB). The IQID information loss quantification at each component in the system hardware identifies the feature extraction software in the signal processor as the bottleneck point. Desired performance is achieved by building upon Design Trade 3, as shown in
5. Hardware Implementation Prototype IQID Device for Application to Radar Hardware in the Loop: The IQID hardware design including a component schematic and hardware chip specification is provided in the following sections.
5.1 Introduction to the Hardware: The mathematical theory of the Fano-Based Information Theoretic (FBIT) method has been described in previously submitted documentation. The material below describes the hardware and associated software needed to implement the FBIT method in order to permit optimization of a system such as a radar. The hardware is envisioned as having the form factor of a digital multimeter (DMM).
5.2 Functionality of FBIT Hardware: As described in the previously submitted patent documentation, the key to the FBIT optimization process is to determine the mutual information between two nodes of the system under study/optimization. For the sake of illustrating the principles involved, the pertinent electrical parameters may be voltage, current, impedance, or another quantity. The type of electrical variable at each node need not be the same; for instance, one node parameter may be voltage while the second node parameter is current. For the sake of illustration, the hardware description will use real-time voltages at two nodes. Later in this document, variations on the type of nodal inputs (e.g., voltage versus current, impedance, etc.) will be discussed. The multidimensional input case will also be addressed later.
Consider two continuous voltages X(t) and Y(t). If X(t) and Y(t) are Nyquist sampled at rates νX and νY respectively, one obtains a series of pairs {xk, yl} for k, l positive integers. One then fills the joint probability P(x, y) shown in
A total of NSamples pairs labeled {X(t=tk)≡xk, Y(t=tk)≡yk} for k=1, . . . , NSamples are binned and populate the two-dimensional grid representing the joint probability P(x, y). There are a total of M(X,Y) bins for P(X, Y), MX bins for P(X), and MY bins for P(Y), where M(X,Y)=MX*MY. Referring to
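The binning of sampled pairs into the joint grid can be sketched in Python. This is an illustrative sketch only, not the DSP implementation described below; the function name, the use of numpy's autoranging histogram, and the example grid size are assumptions.

```python
import numpy as np

def joint_counts(x, y, mx, my):
    """Bin paired samples {x_k, y_k} into an MX-by-MY grid whose
    normalized counts estimate the joint probability P(x, y).
    numpy autoranges each axis over its observed dynamic range."""
    counts, _, _ = np.histogram2d(np.asarray(x, float),
                                  np.asarray(y, float),
                                  bins=[mx, my])
    return counts

# Illustrative example: NSamples = 8 pairs on a 4-by-4 grid
# (MX = MY = 4, so M(X,Y) = MX*MY = 16 joint bins).
rng = np.random.default_rng(0)
x = rng.normal(size=8)
y = rng.normal(size=8)
c = joint_counts(x, y, 4, 4)
p_joint = c / c.sum()       # estimate of P(x, y)
p_x = p_joint.sum(axis=1)   # marginal P(x), MX bins
p_y = p_joint.sum(axis=0)   # marginal P(y), MY bins
```

Summing the joint grid along either axis recovers the marginal distributions, which is the property the entropy computations below rely on.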
The mutual information between X and Y, denoted I(X; Y), is to be computed. The mathematical definition of mutual information that will be used is
I(X;Y)=H(X)+H(Y)−H(X,Y).
The mutual information I(X; Y) is constructed from the entropic quantities H(X, Y), H(X) and H(Y). The definition of H(X, Y) is shown below.
H(X,Y)=−Σx,y P(x,y) log2 P(x,y)
The definitions of H(X) and H(Y) are similar.
For example, let NSamples=10, where 3 sample pairs fall in one bin, 2 in another, and the remaining 5 in a third bin. The entropies of interest are
H(X,Y)=H(X)=1.485 Bits
and
H(Y)=1.00 Bits.
Combining H(X, Y), H(X) and H(Y) yields the mutual information I(X; Y).
I(X;Y)=1.485+1.000−1.485=1 Bit.
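The worked example above can be checked with a short Python sketch. The assumed bin layout (the three occupied joint bins falling in three distinct X bins, with the Y marginal splitting the samples 5/5) is inferred from H(X,Y)=H(X) and H(Y)=1 Bit; it is one configuration consistent with the text, not the only one.

```python
import numpy as np

def entropy_bits(counts):
    """Plug-in entropy estimate, in Bits, from bin counts."""
    counts = np.asarray(counts, dtype=float)
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log2(p)))

# NSamples = 10: 3 pairs in one bin, 2 in another, 5 in a third.
h_xy = entropy_bits([3, 2, 5])   # H(X,Y) = 1.485 Bits
h_x = entropy_bits([3, 2, 5])    # H(X) = H(X,Y) under the assumed layout
h_y = entropy_bits([5, 5])       # H(Y) = 1.000 Bit
mi = h_x + h_y - h_xy            # I(X;Y) = H(X) + H(Y) - H(X,Y)
```

Running this reproduces the values in the example: H(X,Y)=H(X)=1.485 Bits, H(Y)=1.000 Bit, and I(X;Y)=1 Bit.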
5.2.1 Computing Information Theoretic Quantities for Inphase and Quadrature Signals: Radar systems typically have complex signals containing magnitude and phase which are mapped to Inphase and Quadrature signals I and Q respectively. (Note that the symbol for the Inphase signal, I, should not be confused with the symbol I for mutual information.) As a result, radar systems have a two-dimensional input signal for X, namely {IX, QX}, and a two-dimensional signal for Y, namely {IY, QY}. The tableau representing P(X, Y) shown in
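With two-dimensional inputs {IX, QX} and {IY, QY}, the joint tableau becomes a four-dimensional grid and each marginal is obtained by summing out the other signal pair. A hedged Python sketch of this extension follows; the function name, the fixed bin count, and the use of numpy's multidimensional histogram are assumptions for illustration.

```python
import numpy as np

def iq_mutual_information(i_x, q_x, i_y, q_y, bins=8):
    """I(X;Y) for two-dimensional inputs X = {I_X, Q_X} and
    Y = {I_Y, Q_Y}: the joint histogram is four-dimensional and
    each marginal is obtained by summing out the other pair."""
    samples = np.column_stack([i_x, q_x, i_y, q_y])
    c_joint, _ = np.histogramdd(samples, bins=bins)
    c_x = c_joint.sum(axis=(2, 3))  # marginal counts over {I_Y, Q_Y}
    c_y = c_joint.sum(axis=(0, 1))  # marginal counts over {I_X, Q_X}

    def h(counts):
        p = counts[counts > 0] / counts.sum()
        return -np.sum(p * np.log2(p))

    return float(h(c_x) + h(c_y) - h(c_joint))

# When Y duplicates X exactly, H(X,Y) = H(X) and I(X;Y) = H(X) > 0.
rng = np.random.default_rng(1)
i_x, q_x = rng.normal(size=2000), rng.normal(size=2000)
mi_same = iq_mutual_information(i_x, q_x, i_x, q_x)
```

The duplicated-signal case is a useful sanity check: the estimated mutual information then equals the marginal entropy of X.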
5.3 Hardware Description: The sampling portion of the computation is implemented using an analog low-pass filter with the upper frequency cutoff specified by the user or determined via an adaptive algorithm. This stage determines the Nyquist sampling frequency ƒNyquist. The total number of samples is determined by the user-specified time frame over which the samples are accumulated. The binning operation for the X(t) and Y(t) sampled signals may be specified by the user or implemented by an autoranging A/D chip. (The binning scheme is implemented as autoranging in the software pseudocode shown in
The clock oscillator chip is the Vectron Corporation PS-702. The Analog-to-Digital Converter (ADC) chips are the Analog Devices 9208. The Digital Signal Processing (DSP) chip is the Texas Instruments TMS320C6678. The clock oscillator outputs a one-gigahertz clock signal in a format which can be used as an input by the Analog Devices ADCs for sampling. Multiple Analog Devices ADCs can use the same clock to ensure synchrony of sampling. The TI DSP contains an internal Phase-Locked Loop (PLL) as a clock for timing. (See the functional block diagram for the TMS320C6678 shown in FIG. 1-1 on page 4 of the TMS320C6678 data sheet.) An estimated power budget is given in TABLE A3.
5.4 DSP Software: The Texas Instruments DSP is fully programmable. For the sake of simplicity, the pseudocode for the case where both X and Y are one dimensional is given in
Furthermore, without loss of generality, referring to
The DSP pseudocode uses the following approximation for computing the entropy H(X)≡HX of the discrete random variable X. Recall that MX is the number of discrete bins that the dynamic range of X is partitioned into. In the equations below, NX(k) is the number of samples of X(t) that fall into X bin number k, where k=0, . . . , MX. Note that Σk NX(k)=NSamples.
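A count-based form of this entropy approximation can be sketched as follows. The closed-form identity used here, H(X) = log2(NSamples) − (1/NSamples)·Σk NX(k)·log2 NX(k), is a reconstruction consistent with the definitions above (it is algebraically equal to the plug-in entropy with pk = NX(k)/NSamples), not a transcription of the DSP pseudocode itself.

```python
import math

def entropy_from_counts(n_x, n_samples):
    """H(X) in Bits from per-bin counts NX(k), using
        H(X) = log2(N) - (1/N) * sum_k NX(k) * log2 NX(k),
    which equals -sum_k p_k log2 p_k with p_k = NX(k)/N.
    Empty bins (NX(k) = 0) contribute nothing to the sum."""
    s = sum(n * math.log2(n) for n in n_x if n > 0)
    return math.log2(n_samples) - s / n_samples

# Reusing the earlier example counts: NX = [3, 2, 5], NSamples = 10.
h_x = entropy_from_counts([3, 2, 5], 10)   # 1.485 Bits
```

This form is convenient on a DSP because it works directly on integer bin counts, deferring the single division and logarithm of NSamples to the end.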
6. Summary: The computational components implementing the FBIT optimization procedure can be implemented in a handheld device with moderate power requirements. The FBIT hardware construction uses off-the-shelf chips. The resulting FBIT hardware device resembles a digital multimeter (DMM), although it implements different internal functionality.
Workstation system 3100 includes processors 3104 and 3106, chipset 3108, memory 3110, graphics interface 3112, a basic input and output system/extensible firmware interface (BIOS/EFI) module 3114, disk controller 3116, hard disk drive (HDD) 3118, optical disk drive (ODD) 3120, disk emulator 3122 connected to an external solid state drive (SSD) 3124, input/output (I/O) interface (I/F) 3126, one or more add-on resources 3128, a trusted platform module (TPM) 3130, network interface 3132, and power supply 3136. Processors 3104 and 3106, chipset 3108, memory 3110, graphics interface 3112, BIOS/EFI module 3114, disk controller 3116, HDD 3118, ODD 3120, disk emulator 3122, SSD 3124, I/O interface 3126, add-on resources 3128, TPM 3130, and network interface 3132 operate together to provide a host environment of workstation system 3100 that operates to provide the data processing functionality of an information handling system. The host environment operates to execute machine-executable code, including platform BIOS/EFI code, device firmware, operating system code, applications, programs, and the like, to perform the data processing tasks associated with workstation system 3100.
In a host environment, processor 3104 is connected to chipset 3108 via processor interface 3138, and processor 3106 is connected to the chipset 3108 via processor interface 3140. Memory 3110 is connected to chipset 3108 via a memory bus 3142. Graphics interface 3112 is connected to chipset 3108 via a graphics bus 3144, and provides a video display output 3146 to graphical display(s) 3148 that presents UI 3149. In a particular embodiment, workstation system 3100 includes separate memories that are dedicated to each of processors 3104 and 3106 via separate memory interfaces. An example of memory 3110 includes random access memory (RAM) such as static RAM (SRAM), dynamic RAM (DRAM), non-volatile RAM (NV-RAM), or the like, read only memory (ROM), another type of memory, or a combination thereof.
BIOS/EFI module 3114, disk controller 3116, and I/O interface 3126 are connected to chipset 3108 via an I/O channel 3150. An example of I/O channel 3150 includes a Peripheral Component Interconnect (PCI) interface, a PCI-Extended (PCI-X) interface, a high speed PCI-Express (PCIe) interface, another industry standard or proprietary communication interface, or a combination thereof. Chipset 3108 can also include one or more other I/O interfaces, including an Industry Standard Architecture (ISA) interface, a Small Computer Serial Interface (SCSI) interface, an Inter-Integrated Circuit (I2C) interface, a System Packet Interface (SPI), a Universal Serial Bus (USB), another interface, or a combination thereof. BIOS/EFI module 3114 includes BIOS/EFI code that operates to detect resources within workstation system 3100, to provide drivers for the resources, to initialize the resources, and to access the resources.
Disk controller 3116 includes a disk interface 3152 that connects the disk controller to HDD 3118, to ODD 3120, and to disk emulator 3122. An example of disk interface 3152 includes an Integrated Drive Electronics (IDE) interface, an Advanced Technology Attachment (ATA) such as a parallel ATA (PATA) interface or a serial ATA (SATA) interface, a SCSI interface, a USB interface, a proprietary interface, or a combination thereof. Disk emulator 3122 permits SSD 3124 to be connected to workstation system 3100 via an external interface 3154. An example of external interface 3154 includes a USB interface, an IEEE 1394 (Firewire) interface, a proprietary interface, or a combination thereof. Alternatively, solid-state drive 3124 can be disposed within workstation system 3100.
I/O interface 3126 includes a peripheral interface 3156 that connects the I/O interface to add-on resource 3128, to TPM 3130, and to network interface 3132. Peripheral interface 3156 can be the same type of interface as I/O channel 3150, or can be a different type of interface. As such, I/O interface 3126 extends the capacity of I/O channel 3150 when peripheral interface 3156 and the I/O channel are of the same type, and the I/O interface translates information from a format suitable to the I/O channel to a format suitable to the peripheral channel 3156 when they are of a different type. Add-on resource 3128 can include a data storage system, an additional graphics interface, a network interface card (NIC), a sound/video processing card, another add-on resource, or a combination thereof. Add-on resource 3128 can be on a main circuit board, on a separate circuit board or add-in card disposed within workstation system 3100, a device that is external to the information handling system, or a combination thereof.
Network interface 3132 represents a network interface controller (NIC) disposed within workstation system 3100, on a main circuit board of the information handling system, integrated onto another component such as chipset 3108, in another suitable location, or a combination thereof. Network interface 3132 includes network channel(s) 3158 that provide interfaces to devices that are external to workstation system 3100. In a particular embodiment, network channel(s) 3158 are of a different type than peripheral channel 3156 and network interface 3132 translates information from a format suitable to the peripheral channel to a format suitable to external devices. An example of network channel(s) 3158 includes InfiniBand channels, Fibre Channel channels, Gigabit Ethernet channels, proprietary channel architectures, or a combination thereof. Network channel(s) 3158 can be connected to external network resources such as IQID 3159.
Within memory 3110, HDD 3118, ODD 3120, or SSD 3124, one or more software and/or firmware modules and one or more sets of data can be stored that can be utilized during operations of workstation system 3100. These one or more software and/or firmware modules can be loaded into memory 3110 during operation of the workstation system 3100. Specifically, in one embodiment, memory 3110 can include therein a plurality of such modules, including FBIT design and optimization of nonlinear systems component application 3168, one or more other applications 3170, operating system (OS) 3172, radar data 3174, and optimized radar design data 3176. These software and/or firmware modules have varying functionality as disclosed herein when their corresponding program code is executed by processors 3104, 3106.
Although specific embodiments have been described in detail in the foregoing description and illustrated in the drawings, various other embodiments, changes, and modifications to the disclosed embodiment(s) will become apparent to those skilled in the art. All such other embodiments, changes, and modifications are intended to come within the spirit and scope of the appended claims.
While the disclosure has been described with reference to exemplary embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the scope of the disclosure. In addition, many modifications may be made to adapt a particular system, device or component thereof to the teachings of the disclosure without departing from the essential scope thereof. Therefore, it is intended that the disclosure not be limited to the particular embodiments disclosed for carrying out this disclosure, but that the disclosure will include all embodiments falling within the scope of the appended claims. Moreover, the use of the terms first, second, etc. do not denote any order or importance, but rather the terms first, second, etc. are used to distinguish one element from another.
In the preceding detailed description of exemplary embodiments of the disclosure, specific exemplary embodiments in which the disclosure may be practiced are described in sufficient detail to enable those skilled in the art to practice the disclosed embodiments. For example, specific details such as specific method orders, structures, elements, and connections have been presented herein. However, it is to be understood that the specific details presented need not be utilized to practice embodiments of the present disclosure. It is also to be understood that other embodiments may be utilized and that logical, architectural, programmatic, mechanical, electrical and other changes may be made without departing from the general scope of the disclosure. The preceding detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present disclosure is defined by the appended claims and equivalents thereof.
References within the specification to “one embodiment,” “an embodiment,” “embodiments,” or “one or more embodiments” are intended to indicate that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. The appearance of such phrases in various places within the specification does not necessarily refer to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Further, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments but not other embodiments.
It is understood that the use of specific component, device and/or parameter names and/or corresponding acronyms thereof, such as those of the executing utility, logic, and/or firmware described herein, are for example only and not meant to imply any limitations on the described embodiments. The embodiments may thus be described with different nomenclature and/or terminology utilized to describe the components, devices, parameters, methods and/or functions herein, without limitation. References to any specific protocol or proprietary name in describing one or more elements, features or concepts of the embodiments are provided solely as examples of one implementation, and such references do not limit the extension of the claimed embodiments to embodiments in which different element, feature, protocol, or concept names are utilized. Thus, each term utilized herein is to be given its broadest interpretation given the context in which that term is utilized.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The description of the present disclosure has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the disclosure in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope of the disclosure. The described embodiments were chosen and described in order to best explain the principles of the disclosure and the practical application, and to enable others of ordinary skill in the art to understand the disclosure for various embodiments with various modifications as are suited to the particular use contemplated.
The present application is a continuation-in-part of U.S. patent application Ser. No. 16/666,516, entitled “Fano-Based Information Theoretic Method (FBIT) for Design and Optimization of Nonlinear Systems”, filed 29 Oct. 2019, which in turn is a continuation-in-part of U.S. patent application Ser. No. 14/315,365, entitled “Fano-Based Information Theoretic Method (FBIT) for Design and Optimization of Nonlinear Systems”, filed 26 Jun. 2014, which in turn claims the benefit of priority under 35 U.S.C. § 119(e) to U.S. Provisional Application Ser. No. 61/914,429 entitled “Fano-Based Information Theoretic Method (FBIT) for Design and Optimization of Nonlinear Systems,” filed 11 Dec. 2013, the contents of all of which are incorporated herein by reference in their entirety.
The invention described herein was made by employees of the United States Government and may be manufactured and used by or for the Government of the United States of America for governmental purposes without the payment of any royalties thereon or therefor.
Number | Date | Country
---|---|---
61914429 | Dec 2013 | US

Relationship | Number | Date | Country
---|---|---|---
Parent | 16666516 | Oct 2019 | US
Child | 17019531 | | US
Parent | 14315365 | Jun 2014 | US
Child | 16666516 | | US