Priority is claimed to European Provisional Patent Application No. 22173344.7, filed on May 13, 2022, the entire disclosure of which is hereby incorporated by reference herein.
The present disclosure relates to a method, system, and computer-readable medium for a hyper network machine learning model for simulating physical systems.
Numerical simulations are used in various industries and technical specialties and can be used, for example, to design new cars, airplanes, molecules, and drugs, and even to predict the weather. While these numerical simulations can be extremely important, they also often require large amounts of computational power and fast adaptation to new conditions and hypotheses.
Physics-informed machine learning aims to build surrogate models for real-world physical systems governed by partial differential equations (PDEs). One of the more popular recently proposed approaches is the Fourier Neural Operator (FNO), which learns the Green's function operator for PDEs based only on observational data. These operators are able to model PDEs for a variety of initial conditions and show the ability of multi-scale prediction. However, this model class is not able to model a high variation of the parameters of some PDEs. For example, PDEs may be used to describe various physical systems, from large-scale dynamic systems such as weather systems, galactic dynamics, airplanes, or cars, to small-scale systems such as genes, proteins, or drugs. In traditional approaches, such as dynamic numerical simulations, the use of domain expertise is the basis for designing numerical solvers. However, such traditional approaches suffer from a host of disadvantages. For example, traditional approaches may suffer from numerical instabilities, long simulation times, and reduced adaptability for use with hybrid hardware applications involving Graphics Processing Units (GPUs) and vector computing. Traditional approaches may also lack clear ways to include direct numerical observations from instrumental measurements, making it particularly difficult to model noisy data or to incorporate sparse observational data into a numerical simulation. Traditional approaches may also require large computational resources, including large amounts of memory, and often require dedicated software for data and computational parallelization. Traditional approaches also struggle with generalization, making it difficult to apply a trained state-of-the-art (SOTA) machine learning model, such as an FNO model, to unseen data.
A method for operating a hyper network machine learning system, the method comprising training a hyper network configured to generate main network parameters for a main network and generating, using the trained hyper network, the main network with the main network parameters, the main network having a machine learning architecture that models a spatial domain and a frequency domain to simulate a physical system.
Subject matter of the present disclosure will be described in even greater detail below based on the exemplary figures. All features described and/or illustrated herein can be used alone or combined in different combinations. The features and advantages of various embodiments will become apparent by reading the following detailed description with reference to the attached drawings, which illustrate the following:
The present disclosure provides an improved machine learning architecture for simulating and making predictions about physical systems. According to an aspect of the present disclosure, a hyper network machine learning architecture is provided, which includes a hyper network and a main network. The hyper network is configured to learn the behavior of the main network and train and/or configure the main network. The main network, once trained, is configured to accurately model (simulate) a target physical system. The main network and/or the hyper network are configured with spatial components and frequency components; for example, the main network and/or the hyper network may use a Fourier Neural Operator (FNO) machine learning architecture.
Advantageously, machine learning systems configured according to aspects of the present disclosure accelerate the computation of numerical solutions of partial differential equations (PDEs) using data driven machine learning as compared to the state of the art. Aspects of the present disclosure also provide for a variety of advantages over traditional models performing numerical simulation methods, such as an increase in model accuracy for new parameter configurations, increased simulation speed for new configurations, and integrating models with observational data. Other advantages include enabling efficient initial parameter estimates for new system configurations, compatibility with hybrid hardware such as GPUs, and easy adaptation due to inference times that are proportional to the number of parameters in a model (e.g., an FNO model). The disclosed machine learning architecture provides these substantial improvements over the state of the art, while only adding a small additional memory requirement.
A first aspect of the present disclosure provides a method for operating a hyper network machine learning system, the method comprising training a hyper network configured to generate main network parameters for a main network, and generating, using the trained hyper network, the main network with the main network parameters, the main network having a machine learning architecture that models a spatial domain and a frequency domain to simulate a physical system.
According to a second aspect of the present disclosure, the main network of a method according to the first aspect may have a Fourier neural operator architecture comprising a plurality of Fourier layers each having a frequency and spatial component, and wherein the hyper network generating the main network parameters comprises generating parameters for the Fourier layers.
According to a third aspect of the present disclosure, during training of the hyper network in a method according to at least one of the preceding aspects, the hyper network modifies the Fourier layers based on a Taylor expansion around a learned configuration to determine updated parameters for the Fourier layers.
According to a fourth aspect of the present disclosure, the updated parameters are changed in both the frequency and spatial component in a method according to at least one of the preceding aspects.
According to a fifth aspect of the present disclosure, a method according to at least one of the preceding aspects may further comprise obtaining a dataset based on experimental or simulation data generated with different parameter configurations, the dataset comprising a plurality of inputs and a plurality of outputs corresponding to the inputs, wherein the hyper network is trained using the dataset.
According to a sixth aspect of the present disclosure, the training in a method according to at least one of the preceding aspects may comprise simulating, via the main network generated with the main network parameters, the physical system to determine a simulation result based on at least one input of the dataset, comparing the simulation result against at least one output corresponding to the at least one input from the dataset, and updating the main network parameters based on the comparison result.
According to a seventh aspect of the present disclosure, the training of the hyper network in a method according to at least one of the preceding aspects is iteratively conducted until the simulation result is within a predetermined tolerance threshold when compared to the at least one output.
According to an eighth aspect of the present disclosure, a method according to at least one of the preceding aspects may further comprise receiving system parameters by the hyper network, the system parameters corresponding to the physical system targeted for simulation, wherein generating the main network with the main network parameters comprises the hyper network generating the main network parameters based on the hyper network parameters and the system parameters.
According to a ninth aspect of the present disclosure, the hyper network in a method according to at least one of the preceding aspects may comprise Fourier layers each having a frequency and spatial component with corresponding hyper network parameters, and wherein the method further comprises receiving system parameters by the hyper network, the system parameters being configured to adapt the Fourier layers to the physical system targeted for simulation.
According to a tenth aspect of the present disclosure, the hyper network in a method according to at least one of the preceding aspects may comprise Fourier layers each having a frequency and spatial component with corresponding hyper network parameters, wherein the method further comprises adapting the Fourier layers to the physical system targeted for simulation based on system parameters, and wherein the system parameters are determined by learning a representation of the system parameters according to a bilevel problem.
According to an eleventh aspect of the present disclosure, the hyper network in a method according to at least one of the preceding aspects may comprise hyper network parameters corresponding to the spatial domain and the frequency domain, wherein training the hyper network comprises updating the hyper network parameters using stochastic gradient descent based on a training database comprising input and output pairs until a target loss threshold is reached, and wherein the generating of the main network is performed after completing the training of the hyper network and comprises receiving system parameters associated with the target physical system and generating the main network parameters based on the hyper network parameters and the system parameters.
According to a twelfth aspect of the present disclosure, a method according to at least one of the preceding aspects may comprise instantiating the main network on a computer system and operating the main network to simulate the target physical system.
According to a thirteenth aspect of the present disclosure, a method according to at least one of the preceding aspects may comprise receiving input data, simulating the physical system based on the input data to provide a simulation result and determining whether to activate an alarm or hardware control sequence based on the simulation result.
According to a fourteenth aspect of the present disclosure, a method according to at least one of the preceding aspects may comprise parameterizing a meta-learning network by modifying only system parameters.
According to a fifteenth aspect of the present disclosure, in a method according to at least one of the preceding aspects, the main network based on the main network parameters generated by the hyper network includes fewer parameters than the hyper network.
According to a sixteenth aspect of the present disclosure, a tangible, non-transitory computer-readable medium is provided having instructions thereon which, upon being executed by one or more hardware processors, alone or in combination, provide for execution of a method according to at least one of the first through fifteenth aspects.
According to a seventeenth aspect of the present disclosure, a system is provided comprising one or more hardware processors which, alone or in combination, are configured to provide for execution of the steps of training a hyper network configured to generate main network parameters for a main network and generating, using the trained hyper network, the main network with the main network parameters, the main network having a machine learning architecture that models a spatial domain and a frequency domain to simulate a physical system.
According to aspects of the present disclosure, a class of alternative operations for the generation of FNO parameters is disclosed, and the affine transformation in the hyper network is shown to be sufficient, thus reducing the number of additional network parameters.
According to an aspect of the present disclosure, a method is provided for use of a hyper network that generates a smaller network that is used to simulate a physical system after being trained on a large dataset corresponding to a configuration. The hyper network may have a limited number of parameters, with frequency and spatial layers being modified based on a Taylor expansion around a learned configuration, where the change is also learned. A machine learning architecture may be used that models the spatial and frequency domains, and the learned change in the parameters spans both domains and is driven by the parameters of the system. The external parameters may adapt the smaller network (which may be an FNO model) to the specific (i.e., target) configuration/environment/use case. If the external parameters are not known, a training procedure may be run that includes learning a representation of the parameters, described as a bi-level problem. The smaller network may be instantiated and used to make predictions based on inputs. When only a few samples are given, the generated smaller network may be individually trained.
According to an aspect of the present disclosure, a method is provided that includes: collecting experimental data and/or simulation data over different parameter configurations; training of a hyper network over the experimental and/or simulation dataset; querying the hyper network with specific parameters to obtain main network parameters; and using the main network parameters for a target configuration.
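By way of illustration only, the following is a minimal, self-contained sketch of this four-step workflow in Python (PyTorch). The names (HyperNet, main_forward), the layer sizes, and the synthetic random dataset are hypothetical assumptions of the sketch and are not part of the disclosed system.

```python
# Hypothetical sketch of the collect/train/query/use workflow (toy scale).
import torch
import torch.nn as nn

IN_DIM, HID, OUT_DIM, LAM_DIM = 8, 16, 8, 2

class HyperNet(nn.Module):
    """Maps system parameters lambda to the flattened weights of a small main network."""
    def __init__(self):
        super().__init__()
        n_main = IN_DIM * HID + HID + HID * OUT_DIM + OUT_DIM
        self.gen = nn.Sequential(nn.Linear(LAM_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, n_main))
    def forward(self, lam):
        return self.gen(lam)

def main_forward(phi, x):
    """Run the generated main network: two linear layers with a tanh in between."""
    i = 0
    W1 = phi[i:i + IN_DIM * HID].view(HID, IN_DIM); i += IN_DIM * HID
    b1 = phi[i:i + HID]; i += HID
    W2 = phi[i:i + HID * OUT_DIM].view(OUT_DIM, HID); i += HID * OUT_DIM
    b2 = phi[i:i + OUT_DIM]
    return torch.tanh(x @ W1.T + b1) @ W2.T + b2

# Step 1: collect a dataset of (parameter, input, output) triplets (synthetic here).
data = [(torch.randn(LAM_DIM), torch.randn(4, IN_DIM), torch.randn(4, OUT_DIM))
        for _ in range(32)]

# Step 2: train the hyper network end-to-end (its weights play the role of theta).
hyper = HyperNet()
opt = torch.optim.Adam(hyper.parameters(), lr=1e-3)
for epoch in range(5):
    for lam, x, y in data:
        loss = nn.functional.mse_loss(main_forward(hyper(lam), x), y)
        opt.zero_grad(); loss.backward(); opt.step()

# Step 3: query the trained hyper network with target system parameters.
phi_target = hyper(torch.zeros(LAM_DIM)).detach()

# Step 4: use the generated main network for the target configuration.
prediction = main_forward(phi_target, torch.randn(1, IN_DIM))
```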
According to an aspect of the present disclosure, a hyper network system architecture may include two networks that work together: a hyper network and a main network. The hyper network generates and/or reconfigures the main network. The main network—after being trained on a training dataset—is used to simulate a target physical system. As used in the present disclosure a “hyper network” and a “main network” are machine learning models, in particular neural networks using FNOs.
The hyper network may be configured to receive, as inputs, parameters (or representations of parameters) of the system and provide the parameters of the main network. The hyper network may then learn the behavior of the main network (e.g., during the training phase) and use that information to reconfigure the main network (e.g., by sending updated parameters to the main network to improve the performance (e.g., accuracy) of the main network). Additionally or alternatively, the hyper network may interpolate the configuration of the main network and assist in predicting the output of the main network in new configurations, whose parameters were not seen before (or during) training, after being trained in calibrated simulations.
According to an aspect of the present disclosure, the hyper network is trained by minimizing a loss function that includes parameters for the main network. The hyper network generates parameters for each layer of an FNO, each layer including spatial and frequency components. Additionally or alternatively, the hyper network parameters may be updated using stochastic gradient descent, as exemplified by the following formula:
θ′ = θ − ∇_θ(y − ŷ(ψ(θ, λ), x)) [Formula I]
A derivative of a parameter θ of the main network is thus determined based on a gradient comparing the dataset output y and a predicted output ŷ that is based on a function ψ of the main parameter θ and the system parameters λ, as well as a dataset input x. By repeated use of the foregoing formula and iterative updating of the hyper network parameters, ideal parameters of the hyper network can be determined, thereby "training" the hyper network and enabling the hyper network to provide optimized parameters θ to a main network. Additionally or alternatively, the hyper network may be trained together with the main network based on datasets used to compare predicted values with known calculated values.
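As a purely illustrative example of one such update step, the following Python (PyTorch) fragment stands in a toy scalar model for ψ and the main network and performs a single gradient update in the manner of Formula I; the model form, data values, and learning rate are assumptions of the sketch.

```python
# One illustrative hyper network gradient step in the spirit of Formula I.
import torch

theta = torch.randn(3, requires_grad=True)       # hyper network parameters theta
lam = torch.tensor([0.5])                        # system parameters lambda
x, y = torch.tensor([1.0]), torch.tensor([2.0])  # one (input, output) dataset pair
lr = 0.1

phi = theta[0] + theta[1] * lam                  # toy psi(theta, lambda): main parameter
y_hat = phi * x + theta[2]                       # toy main network prediction y_hat
loss = (y - y_hat).pow(2).sum()                  # squared error on the dataset pair
loss.backward()                                  # gradient with respect to theta
with torch.no_grad():
    theta -= lr * theta.grad                     # theta' = theta - lr * grad
```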
The main network is configured as the network that receives input data (e.g., physical simulation input data) and outputs one or more predictive results (e.g., the predictive result of the simulation). In the training phase, the main network may receive input training data from a training dataset, which includes the training input data and the complementary known training output data. The output predicted by the main network in the training phase may then be compared against the known training output data, and the parameters of the main network may be adjusted (e.g., by or with the assistance of the hyper network) based on that comparison. In an online phase, the main network may receive input data on which a prediction is to be made (i.e., no corresponding output data yet exists and the output is yet unknown to the system), run a simulation of the physical system based on the input data to predict an outcome, and generate output data corresponding to the predicted outcome.
The main network is “smaller” than the hyper network—e.g., the main network may have smaller architecture with fewer parameters or layers than the hyper network—making the main network more computationally efficient to run test simulations upon. On the other hand, the hyper network utilizes a large architecture (at least in part) to generate the smaller main network (or at least its parameters). The larger architecture of the hyper network may include a higher number of parameters to enable it to be trained efficiently. The smaller main network (once trained/updated) can be deployed for simulating the target physical system, and is generally more efficient (e.g., as far as the utilization of computational resources) at simulating the physical system as compared to not only the larger hyper network, but also to other machine learning models that were not generated and/or configured using a hyper network. The larger hyper network may not be used in this deployed simulation.
According to an aspect of the present disclosure, the main network may have parameters that are not generated by the hyper network, but are nevertheless trained with parameters generated by the hyper network. For example, one layer of an FNO may be generated by the hyper network, while other layers are not. In another example, while both frequency and spatial parameters are implemented in the main network, only the frequency parameters (or, conversely, only the spatial parameters) are generated by the hyper network.
The training dataset may be obtained by collecting experimental data and simulation data, e.g., over different parameter configurations, for the target physical system.
According to an aspect of the present disclosure, the hyper network and/or the main network may be configured with a Fourier Neural Operator (FNO) architecture. For example, the main network may include multiple layers of elements of the form:
x = Wx + ℱ^{-1}(Rℱ(x)) [Formula II]
In the foregoing Formula II, ℱ is the Fourier transform, x are the features of the network, and W and R are matrices representing the parameters of the layer. The hyper network generates the parameters for a Fourier layer of the main network according to the formula:
x_{l+1} = W_l(λ, θ_l)x_l + ℱ^{-1}(R_l(λ, θ_l)ℱ(x_l)) [Formula III]
An input 202 is received by a first parameter layer 206. The parameter layer 206 is then used in a first Fourier layer 208. A second Fourier layer 210 receives an output from the first Fourier layer 208. A second parameter layer 212 receives the output of the second Fourier layer 210 and outputs output 214. The first parameter layer 206 and second parameter layer 212 include projection operators P and Q for reducing dimensions and expanding and contracting the input 202 in the hyper-FNO 204. Projection operators P and Q can be generated by the hyper network 218.
Hyper network machine learning architectures implemented according to the present disclosure can be further understood to comprise additional features or modifications to the foregoing aspects, thereby realizing additional advantages over traditional machine learning models executing numerical simulations.
According to one aspect, estimation of the parameters can be accomplished in a model-agnostic manner. For example, a bi-level formulation and update rule can be implemented to jointly learn the representation of unknown parameters of a system. The bi-level formulation may include solving an optimization problem composed of two sub-problems that depend on each other. One problem is referred to as the outer problem and the other is referred to as the inner problem. The general form of the bi-level formulation is:

min_λ ƒ(x*(λ), λ) subject to x*(λ) = argmin_x g(x, λ),

where λ is the parameter of the PDE describing a particular model, which may not be known in advance, so jointly solving for the parameters may be required during training. In the bi-level formulation, ƒ and g are loss functions and x is the solution of the PDE. ƒ and g may be the same loss function, but computed on different datasets.
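For illustration, the following is a minimal runnable Python (PyTorch) sketch of one common way to realize such a bi-level scheme, unrolling a few inner gradient steps on toy quadratic losses so that the outer gradient can flow through the inner solution; the toy losses, step sizes, and unrolling depth are assumptions of the sketch rather than the disclosed algorithm.

```python
# Toy bi-level optimization: the outer problem tunes lambda, while the inner
# problem "solves the PDE" (here a quadratic proxy) for x given lambda.
import torch

lam = torch.zeros(1, requires_grad=True)      # outer variable: PDE parameter
opt = torch.optim.SGD([lam], lr=0.1)

def g(x, lam):   # inner loss: the solver pulls x toward lam (toy quadratic)
    return (x - lam).pow(2).sum()

def f(x):        # outer loss: match an observed target value of 1.0
    return (x - 1.0).pow(2).sum()

for step in range(50):
    x = torch.zeros(1, requires_grad=True)
    for _ in range(10):                        # unrolled inner minimization of g
        grad_x = torch.autograd.grad(g(x, lam), x, create_graph=True)[0]
        x = x - 0.3 * grad_x                   # keep the graph for the outer gradient
    opt.zero_grad()
    f(x).backward()                            # outer gradient flows through the inner loop
    opt.step()
# lam converges toward the observation-consistent value 1.0.
```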
According to one aspect, estimation of the parameters with new environments is accomplished. When a new environment is observed, the parameters of the system may not be known, and thus a few samples may be used to first detect the parameters of the system and then possibly use the same or additional samples to update the predictive model that is later used at test time. For example, the first ten samples of a solution may be observed and used to derive the parameter λ. The parameter λ may then be used to predict the rest of the solution.
According to one aspect, main network calibration can be carried out by using the hyper-FNO to calibrate the main network. For example, the hyper-FNO can be trained based on multiple configurations of the main network, and then the hyper-FNO can be used as a surrogate model. The optimal parameters for a desired condition or specific output can then be found. The main model with the new discovered parameters can be run to determine a more accurate prediction, if necessary.
According to an aspect of the present disclosure, a conditional network can be established to use a conditional neural network where the network receives as inputs the parameters of the system (via PDEs) and inputs of the main network (e.g., the initial condition, the forcing term, or other physics-related functions). During training, all the parameters of the system are learned. At test time, if data of the new environment is available, only the last layers are trained. In this manner, training efforts and resources are concentrated or limited to test time, thereby increasing simulation efficiency, but, as a trade-off, an advantage in reduced memory size may be lost.
According to an aspect of the present disclosure, a meta-learning network is provided, wherein the parameters of the main network are selected to work in all configurations, or a few samples are used to specialize the network to a specific scenario. In an embodiment, a Reptile approach is used, wherein the parameters of the meta-learning network are updated only after a few iterations of updating the main network on a new task or a new configuration. In an embodiment, a Model-Agnostic Meta-Learning (MAML) approach is used, wherein the meta-learning model is the same as the main network. In this embodiment, a few gradient descent steps are used based on a sample for the specific new task or new configuration.
In addition, the structure of the meta-learning network is parametrized by λ and in the adaptation phase only λ is modified, according to the formulas:
R_V(λ) = R_0 + (V_0λ, V_1λ) ⊙_{row,col} R_1 [Formula IV]

W_U(λ) = W_0 + (U_0λ, U_1λ) ⊙_{row,col} W_1 [Formula V]

or

R_V(λ) = r^{FT}_{ijl}(λ) = r^0_{ijl}(1 + v_{0ik}λ_k v_{1jk}v_{2lk}) [Formula VI]

W_U(λ) = w^{XT}_{ijl}(λ) = w^0_{ijl}(1 + u_{0ik}λ_k u_{1jk}u_{2lk}) [Formula VII]
According to an aspect of the present disclosure, the system parameters λ are modelled as a distribution and, at inference time, drawn from a distribution λ ∼ N(μ, Σ), where the parameters μ, Σ are learned in a variational approach using a variational trick. In this way, statistics of the results with error intervals can be built. Specifically, a variational trick may include sampling from a fixed distribution without parameters and subsequently transforming the sample with parameters that are trainable. For example, a variable e may be modeled as a normal distribution with a mean of zero and a variance of one. A new variable x can then be built such that:

x = αe + β, e ∼ N(0, 1),

where α and β are trainable parameters. A model is defined by

W(λ) = W_1(λ)e + W_0(λ), e ∼ N(0_d, 1_d),

where W_1(λ), W_0(λ) are modelled as in Formula VII and R_1(λ), R_0(λ) as in Formula VI.
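For illustration, the following is a minimal runnable Python (PyTorch) sketch of the described trick in its simplest scalar form x = αe + β; the target statistics and optimizer settings are assumptions of the sketch.

```python
# Reparameterization sketch: sample from a fixed, parameter-free N(0, 1) and
# transform with trainable alpha, beta so that gradients reach the parameters.
import torch

alpha = torch.ones(1, requires_grad=True)    # trainable scale
beta = torch.zeros(1, requires_grad=True)    # trainable shift
opt = torch.optim.Adam([alpha, beta], lr=0.05)

target_mean, target_std = 2.0, 0.5           # illustrative target statistics
for step in range(200):
    e = torch.randn(256)                     # fixed distribution, no parameters
    x = alpha * e + beta                     # transformed, differentiable sample
    loss = (x.mean() - target_mean).pow(2) + (x.std() - target_std).pow(2)
    opt.zero_grad(); loss.backward(); opt.step()
# alpha approaches 0.5 and beta approaches 2.0; sampling x now yields a
# statistic of the results from which error intervals can be read off.
```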
In an exemplary implementation of an aspect of the present disclosure, numerical simulations may be used to provide large-scale weather forecasts and simulation acceleration for high-performance computing (HPC). In such an implementation, the use of hyper-FNO is advantageous for accelerating the study of weather forecasts and supporting the government and research communities in performing simulations under various scenarios. Furthermore, the use of the hyper-FNO facilitates parameter estimation and inverse problem solving. Parameter estimation particularly benefits in that an infinite parameter space is drastically reduced with the help of hyper-FNO's efficient parameter estimation.
In traditional approaches, several numerical simulations are performed based on observational data, and statistical data is used to produce a forecast. However, the accuracy of a prediction degrades as prediction time increases because of the chaotic nature of PDEs. Some approaches combat this by increasing the sample of statistical data used to produce the forecast, which can be difficult because of significantly increased computational costs.
In an aspect of the present disclosure, traditional simulation results are combined with hyper-FNO's predictions. The hyper-FNO's predictions can be more quickly obtained (and thus a higher quantity of predictions obtained in a given time) in comparison to predictions by a traditional simulation thanks to efficient model calculation and the fact that the various parameters could be easily taken into account.
In an exemplary implementation, numerical simulations may be used to provide molecular simulation for new materials and new protein discovery. In traditional approaches, a numerical simulator uses a model for molecular and atomic interactions at a small scale and produces a prediction based on these smaller-scale models. Small errors and/or un-modelled dynamics can lead to a prediction that is not in line with real-world observations. However, in an aspect of the present disclosure, a hyper-FNO can be used within a machine learning model to model hyper-parameters of the numerical simulation and find the most appropriate configuration for the main network. In an embodiment, the hyper-FNO can also be trained on specific calibrated configurations and observational data, thereby predicting new outputs based on one or more new unseen configurations.
In an exemplary implementation, numerical simulations may be used for identification of blood flow in arteries and vessels and/or identification of blood coagulation. In traditional approaches, blood flow can be modelled using a complex system of PDEs, such as the Navier-Stokes equations, representing flow over a network of arteries and vessels of the human body. In an embodiment of the present disclosure, a hyper-FNO is used to model the flow in each arterial section and to adapt the model to observational data. For example, the hyper-FNO may be used to adapt the model based on changes in the form of blood vessels, and to detect problems of artificial blood vessels before they are implanted or otherwise utilized in surgery.
In an exemplary implementation, numerical simulations may be used for identification of gene regulatory networks from observational data. Gene regulatory networks describe the interaction, be it by promotion or inhibition, of gene activity, including the interactions between a gene and other genes, proteins, or other cell elements. Gene regulatory networks are used to model causal relationships among these elements. In traditional approaches, ordinary or partial differential equations can be used to describe such interactions. The final expression level of these interactions can be partially observed using different measurement techniques, such as gene sequencing.
In an embodiment of the present disclosure, observational data can be used for model training and to derive the structure and parameters of the ordinary or partial differential equations used to describe gene regulatory networks. Derived models are used to detect changes in the gene regulatory network and to measure the consistency of a gene expression with a specific gene regulatory network, thereby aiding detection of results that are outside of a modeled statistical distribution.
In an exemplary implementation, numerical simulations may be used to solve inverse problems for water contamination and/or oil exploration. Traditional approaches describe the propagation of pollution or of an acoustic wave with a PDE. In an aspect, a hyper-FNO is used in conjunction with numerical simulation to estimate the propagation profile of a pollutant or a wave.
Likewise, a hyper-FNO can be used in conjunction with numerical simulation to estimate porosity and topology of a domain based on acoustic wave propagation.
In an aspect of the present disclosure, the foregoing machine learning models are used in diagnostic applications such as, for example, pathology, to model progranulin (GRN) and/or neoantigen simulations. In some embodiments, digital twin simulation, whereby a virtual representation of an object or system that spans the object's lifecycle is created and updated using real-time data, is used and incorporates the foregoing numerical simulations and model creation methods. Such embodiments have significant advantages over traditional simulations and simulation methods, as a numerical simulation can be applied to a more specific population of people by adapting parameters for personalized treatment, which would otherwise be too time and/or resource intensive.
It will be readily appreciated that the foregoing simulation methods and machine learning models may also provide advantageous benefits when used and/or applied in a variety of fields or industries when combined with IPC solutions.
In some embodiments, it will be readily appreciated that the size of the main network (in terms of quantity of data, computational power required for execution, and/or memory usage) is smaller than that of a hyper network. In some embodiments, the presence of a hyper network may be determined based on a comparison of the size of the main network with the hyper network, thereby allowing a system to determine an association of a main network with a hyper network. It will be readily appreciated that hyper networks according to the above-described embodiments are typically larger than main networks due to their configuration to process and output parameters to the main network, which is generated based on parameter configurations set forth by the hyper network.
In some embodiments, a hyper network may be detected by checking whether additional information, such as external parameters, is used in a predictive model.
In some embodiments, a user interface (UI) is included in a simulation system or is displayed via instructions stored in a computer-readable medium. The user interface may display and/or allow for user input of parameters used as inputs by the hyper network. In some embodiments, user input is accomplished by manual entry and/or selection of parameters in the UI.
In connection with the foregoing aspects, further detail will be provided below regarding previously disclosed, additional, and/or related aspects of the present disclosure. Minor variations in wording and tone are not to be understood as delimiting aspects exclusively of one another. It will be readily understood that the presentation of the following disclosure, which includes formulas, data, and descriptions, elucidates aspects of the present disclosure. The following disclosure includes short-form citations to references, a full list of corresponding long-form citations of which is included in the List of References at the end of the disclosure herein.
As described previously, traditional FNO approaches modeling PDEs are not able to model a high variation of the parameters of some PDEs. To this end, hyper-FNO is an approach to extend FNOs using hyper networks so as to increase the models' extrapolation behavior to a wider range of PDE parameters using a single model. Hyper-FNO learns to generate the parameters of functions operating in both the original and the frequency domain. This architecture is evaluated using various simulation problems. The success of deep learning methods in various domains has recently been carried over to simulations of physical systems. For instance, neural networks are now commonly used to approximate the solution of a PDE or to approximate its Green's function (Thuerey, et al., 2021; Avrutskiy, 2020; Karniadakis, et al., 2021; Li, et al., 2021; Raissi, et al., 2019; Chen, et al., 2018; Raissi, 2018; Raissi, et al., 2018b). In applications such as vehicle aerodynamic design and prototyping, access to approximate solutions at a lower computational cost is often preferable over solutions with a known approximation error but prohibitive computational costs. In these contexts, machine learning models provide an approach to solving PDEs which complements traditional numerical solvers. Furthermore, data-driven methods are useful when observations are noisy or the underlying physical model is not fully known or defined (Eivazi, et al., 2021; Tipireddy, et al., 2019).
Neural Operators (NOs) (Li, et al., 2020), and in particular Fourier Neural Operators (FNOs) (Guibas et al., 2021; Li, et al., 2021), have shown impressive performance and can be applied in challenging scenarios such as weather forecasting (Pathak et al., 2022). In contrast to physics-informed neural networks (PINNs) (Raissi, et al., 2019), Neural Operators do not require knowledge of the physical model and can be applied whenever observations are available. As such, Neural Operators are fully data-driven methods. Neural Operators, however, work under the assumption that the governing PDE is fixed, that is, its parameters are static while the initial condition is what changes. If this assumption is not met, the performance of these approaches deteriorates (Mischaikow and Mrozek, 1995). Thus, when the interest is in a situation that requires evaluation over multiple physical model parametrizations, either (1) the Neural Operators should be re-trained for each of the parameter configurations, or (2) the parameter values should be included as input to the neural operator (Arthurs and King, 2021). Training over a large number of possible parametrizations is computationally demanding. On the other hand, increasing the number of parameters of the network increases the computational complexity of the model and would increase inference time, which takes away from the advantage surrogate models have over numerical solvers.
In the present disclosure, a meta-learning problem is formulated in which each possible set of parameter values of the PDE induces a separate task. At inference time, the learned meta-model is used to adapt to the current task, that is, the given inference time parameters of the PDE. A hyper-FNO is thus disclosed, as well as a method to adapt the Neural Operator over a wide range of parameter configurations, which uses hyper networks (Ha, et al., 2016a). Hyper-FNO learns to model the parameter space of a Green function operator that takes as input the parameters and produces as output the neural network that approximates the Green function operator associated with that parametrization. By separating the training and testing in two networks (the hyper network and the main network), complexity at inference time is reduced while maintaining the prediction power of the original model and without the need of a fine-tuning period.
A solution to a PDE is a vector-valued function u(t, x, λ) on some spatial domain X and temporal index T, parameterized over Λ. For example, in the heat diffusion equation, u could represent the temperature in the room at a location x∈X at a time t∈T, where the conductivity field is defined by λ: X → ℝ. A forward operator maps the solution at one instant of time to a future time step, F: v(t, x, λ) → v(t+1, x, λ). The forward operator is known, and the solution of the PDE for any time can be computed, given the initial conditions.
Thus, a general problem is that of learning a class of operators, which includes the forward operator G_λ: A×Λ → U between two infinite-dimensional spaces of functions A: ℝ^d → ℝ^p and U: ℝ^d → ℝ^q, on the space of parameters Λ, from a finite collection of observed data {λ_j, a_j, u_j}_{j=1}^N, λ_j ∈ Λ, a_j ∈ A, u_j ∈ U, composed of parameter-input-output triplets. For the forward operator, a_j is the solution of a given PDE conditioned on the PDE parameter λ_j at time t, while u_j is the solution at time t+1. The input a_j ∼ μ and the parameter λ_j ∼ ρ are drawn from two known probability distributions, μ over A and ρ over Λ. To solve this problem, a family of operators G_θ^λ: A×Λ×Θ → U is considered, which minimizes the expected cost

𝔼_{a∼μ, λ∼ρ} ℓ(G_θ^λ(a, λ), u),

with ℓ(u′, u) being a cost function measuring the difference between the true and predicted output.
A diffusion equation with no boundary conditions and diffusion coefficient D is defined by:
u_t(t,x) = D u_{xx}(t,x), t ∈ (0,1], x ∈ (−∞,∞),
u(t=0,x) = u_0(x), x ∈ (−∞,∞),

where u_t = ∂u/∂t and u_{xx} = ∂²u/∂x², while u_0(x) is the initial condition. The general solution of this equation can be written using Green's function as:

u(t,x) = ∫ G(t, x − x′)u_0(x′)dx′. (1)
The convolution can now be written in Fourier space as:

U(t,ω) = G(t,ω)U(0,ω), (2)

where U(t, ω) and G(t, ω) are the solution and the Green operator in Fourier space, with G(t,ω) = e^{−4ω²Dt}. The relation ℱ[u_{xx}](ω) = −4ω²U(t,ω) is used when performing the Fourier transformation. For a small change of Dt → Dt + ΔDt, the change in Green's function is given by:
∂_{Dt}G(t,ω) = −4ω²G(t,ω).
Thus, Green's function can be written as a function of the change in the parameters ΔDt as:
G(t,ω) + ∂_{Dt}G(t,ω)ΔDt = H(ω,z,Δz)G(t,ω), (3)
H(ω,z,Δz) = 1 − 4ω²Δz, (4)

where z ≡ Dt. This means that when the parameters of the diffusion equation are updated, the Green's function operator is multiplied by a function H(ω, z, Δz) in Fourier space, where H(ω, z, Δz) is linear in the change of parameters Δz. The advantage of working in the frequency domain is that the function can be written more compactly. Indeed, few frequencies are typically necessary to describe the behavior of Green's function.
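The first-order multiplier of Equations (3)-(4) can be checked numerically. The following Python (NumPy) snippet assumes the closed form G(t, ω) = e^{−4ω²z} with z = Dt, which is consistent with the derivative ∂_{Dt}G(t, ω) = −4ω²G(t, ω) used above; the frequency grid and the step Δz are illustrative choices.

```python
# Numerical check: G(z + dz) is approximated by H(w, z, dz) * G(z) to first order.
import numpy as np

w = np.linspace(0.0, 2.0, 9)        # a few frequencies omega
z, dz = 0.3, 0.01                   # current parameter z = D*t and a small change

G = np.exp(-4.0 * w**2 * z)         # assumed Green's function in Fourier space
G_new = np.exp(-4.0 * w**2 * (z + dz))
H = 1.0 - 4.0 * w**2 * dz           # first-order multiplier of Equation (4)

err = np.max(np.abs(G_new - H * G))
print(f"max first-order error: {err:.2e}")   # small for small dz
```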
Furthermore, the rate of change of the solution can be found as a function of the change in the PDE parameter. First, consider the difference between the original solution and the solution after the infinitesimal change Δλ, which is

∫_{T,Ω} ∥U′(t,ω) − U(t,ω)∥ dt dω = |Δλ| ∫_{T,Ω} ∥4ω²t G(t,ω)U(0,ω)∥ dt dω, (5)

with U′(t, ω) = (G(t, ω) + ∂_λG(t, ω)Δλ)U(0, ω). For Δλ ≠ 0, the difference increases with the square of the frequency. The implication is that if the parameter of the equation is changed, a change in frequency is induced that is proportional to |Δλ| ∫_{T,Ω} ∥4ω²t G(t, ω)U(0, ω)∥ dt dω. The original operator is thus no longer able to accurately predict the function at a later time, accumulating an error in time or frequency.
Interestingly, Green's function can also be implemented in the spatial domain, that is, the original, non-Fourier space, directly using Equation (1) and a convolutional neural network. Similarly to Fourier space, the variation of the Green function around the current parameters can be derived by considering that, from Equation (1),

u(t,x) = (G(t, ·; λ) ∗ u_0)(x), (6)

where ∗ is the convolution operator, and then using the Taylor expansion

G(t, x; λ + Δλ) ≈ G(t, x; λ) + ∂_λG(t, x; λ)Δλ. (7)

In the spatial domain, the change of Green's function with respect to the change in parameters can be described as the multiplication of the base function by a term that corresponds to the variation of the parameters. While the two approaches are mathematically equivalent, one might provide a more suitable inductive bias in the context of learning surrogate models. Moreover, the specific implementation, for example, the discretization of the domain, might also affect the final performance. This motivates the goal of generating the parameters of linear transformations in the frequency domain, the spatial domain, or both.
A hyper-FNO formula can be derived with the help of the finite volume method. First, a general form of the field equation may be considered with parameters:
∂tU(x,t)+∂x[F(x,t)+αG(x,t)]=βS(x,t), (9)
where the equation depends linearly on the parameters α and β. Assuming the finite volume method is used, Equation (9) reduces to:

U^{n+1}_j = U^n_j − (Δt/Δx)[(F^n_{j+1/2} − F^n_{j−1/2}) + α(G^n_{j+1/2} − G^n_{j−1/2})] + Δt βS^n_j, (10)

where the subscripts n, j are the time-step and cell number, respectively, and j±1/2 denotes the cell boundaries. Δt and Δx are the time-step and cell size, respectively. The above equation shows that the effect of a parameter value change always depends linearly on the parameter in the case of the finite volume method. This is true when Δt, Δx < 1.
On the other hand, in the case of a machine learning model, the above equation becomes:

U^{n+1}_j = 𝒩(U^n; α, β), (11)

where 𝒩 is the learned model. Because of the flexibility of a deep neural network (DNN), there are many degrees of freedom for incorporating the parameter information into the DNN. Here, it is natural for machine learning models to take the parameter dependence into account as in Equation (10):

U^{n+1}_j = 𝒩_F(U^n) + α𝒩_G(U^n) + β𝒩_S(U^n). (12)
A 1-layer model of this form can be rewritten as:

U^{n+1}_j = σ[(W_F + αW_G + βW_S)U^n]. (13)
This is equivalent to the hyper-FNO formula.
Equation (10) is valid independent of the absolute value of parameters α, β but depends on Δx, Δt. Hence, Equation (13) is also valid when Δx, Δt<1.
In Equation (6), the convolution function of the spatial representation of the Green's function has infinite domain, and its effective width is proportional to λ. When implemented using a finite convolution kernel, as in the disclosed machine learning frameworks, the convolution function is truncated, and the distortion of the operation increases as λ increases. On the other hand, in Equation (2), the Green's function in the frequency domain, while still affected by the parameter λ, is multiplied in frequency by the initial condition function. When the initial condition is limited in frequency, the distortion introduced by the frequency discretization and limit, as introduced in the FNO model, is less severe. Thus, even if the change in the parameter can be modeled in both the spatial and frequency domains, the latter could be more powerful and easier to model.
FNOs (Guibas, et al., 2021; Li, et al., 2021) are composed of initial and final projection networks parameterized by P and Q, Q′. These two networks transform the input signal into a latent space, adding and reducing features at each spatial location. After the initial feature expansion through a projection, the FNO consists of blocks of Fourier layers, each of which consists of two parallel spatial and frequency layers. The spatial layer, parameterized by a tensor W, is implemented using a 1-d convolutional network. The frequency layer is parameterized by a tensor R and operates in Fourier space. The prior transformation to Fourier space is implemented using the Fast Fourier Transform (FFT, ℱ):

z_{l+1} = σ(W_l z_l + ℱ^{-1}(R_l ℱ(z_l))), (14)
z_0 = Px, u = Q′σ(Q z_{L−1}), (15)

where the projection is implemented using two consecutive fully connected layers. Since the FNO operates in both the frequency and spatial domains, for the purpose of this disclosure, the former is called the Fourier domain and the latter the spatial domain (or original domain). In Equation (14) and Equation (15), the variables z, x, u are in the spatial domain.
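For illustration, the following is a minimal runnable Python (PyTorch) sketch of a single 1-d Fourier layer of the form of Equation (14), in the style commonly used for FNO implementations; the channel count, number of retained modes, and ReLU activation are assumptions of the sketch.

```python
# Minimal 1-d Fourier layer: sigma(W z + F^-1(R F(z))) with truncated modes.
import torch
import torch.nn as nn

class FourierLayer1d(nn.Module):
    def __init__(self, channels=4, modes=8):
        super().__init__()
        self.modes = modes
        # Spatial component W: a pointwise (kernel size 1) convolution.
        self.w = nn.Conv1d(channels, channels, kernel_size=1)
        # Frequency component R: complex weights on the retained low modes.
        self.r = nn.Parameter(
            torch.randn(channels, channels, modes, dtype=torch.cfloat) * 0.02)

    def forward(self, z):                     # z: (batch, channels, grid)
        z_ft = torch.fft.rfft(z, dim=-1)      # FFT along the spatial grid
        out_ft = torch.zeros_like(z_ft)
        # Apply R to the first `modes` frequencies (einsum over channels).
        out_ft[..., :self.modes] = torch.einsum(
            "bix,iox->box", z_ft[..., :self.modes], self.r)
        spectral = torch.fft.irfft(out_ft, n=z.shape[-1], dim=-1)
        return torch.relu(self.w(z) + spectral)

layer = FourierLayer1d()
u = layer(torch.randn(2, 4, 64))              # two samples on a 64-point grid
```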
Hyper networks (Ha, et al., 2016) are a meta-learning method comprised of two networks: the main network and the hyper network. The main network, with parameters φ, is used during inference, while training is performed on θ, the parameters of the hyper network. The hyper network is trained to generate the parameters φ of the main network. Hence, the parameters φ are generated through the hyper network as φ = h(θ, λ), where λ are the hyper-parameters. Typically, the hyper network generates all parameters of the main network. In this work, a hyper network is used to generate the weights of particular subnetworks of the main network.
The hyper-FNO network is built from a hyper network that produces the parameters for the main network, where the main network is an instance of the FNO architecture. If the FNO is written as the function ƒ(φ, x), then the hyper-FNO can be written as:

φ = h(θ, λ), û = ƒ(φ, x),

where û is the predicted solution given the PDE of parameters λ and initial condition x, while φ are the parameters of the main network, which are generated by the hyper network. The hyper network has parameters θ, which are learned end-to-end. The hyper network is trained by minimizing the loss function

L(θ) = 𝔼_{λ∼p(λ)} L^{tr}_λ(θ, λ),

where L^{tr}_λ(θ, λ) = 𝔼_{(x,u)∼D_λ} ℓ(ƒ(φ_λ, x), u) and φ_λ = h(θ, λ).
Hyper networks are used to generate the parameters of the main network, where the parameters are specific to the current task. In the typical scenario, the hyper network is a large network that produces a smaller network. In this way, the complexity of adaptation is off-loaded to the hyper network, while the prediction is performed by the smaller main network. This approach is particularly convenient for reducing the computational complexity of prediction, for example in the case of limited resources at inference time. An alternative approach aims at using a hyper network that only marginally increases the size of the main network, but still allows easy adaptation to new tasks. This second scenario can use a special class of hyper layer, which can then modularly build the main network.
In hyper-FNO, each layer of the FNO is generated by a Hyper Fourier Layer (HyperFL) and used in the main Fourier layer as

z_{l+1} = σ(W_U(λ)z_l + ℱ^{-1}(R_V(λ)ℱ(z_l))),
z_0 = P(λ)x, u = Q′(λ)σ(Q(λ)z_{L−1}),

where the hyper network may generate (1) the full parameter update, as derived in the diffusion example above; (2) in a simpler case, only a scaling quantity; or (3) in a case where changes with different strengths over the frequency or convolution entries are desired, a change parameterized as
R_V(λ) = R^l_0 + (V^l_0λ, V^l_1λ, V^l_2λ) ⊙_{row,col,depth} R^l_1, (16)
W_U(λ) = W^l_0 + (U^l_0λ, U^l_1λ) ⊙_{row,col} W^l_1, (17)

where ⊙_{row,col} and ⊙_{row,col,depth} represent the Hadamard product applied to the rows, columns, and depths of a tensor, using vectors whose sizes equal the number of rows, columns, and depths, respectively.
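For illustration, one plausible reading of the row/column Hadamard modulation of Equation (17) is sketched below in Python (PyTorch) for the spatial tensor W; the interpretation of ⊙_{row,col} as an outer product of the row and column scaling vectors, and all shapes, are assumptions of the sketch.

```python
# "Addition version" sketch: W(lambda) = W0 + (U0*lambda, U1*lambda) (.)row,col W1.
import torch

rows, cols, K = 6, 5, 2                    # tensor shape and dimension of lambda
W0 = torch.randn(rows, cols)               # base weights of the main network
W1 = torch.randn(rows, cols)               # weights modulated by the hyper network
U0 = torch.randn(rows, K)                  # maps lambda to a per-row scaling
U1 = torch.randn(cols, K)                  # maps lambda to a per-column scaling

def addition_weights(lam):                 # lam: (K,) system parameters
    row_scale = (U0 @ lam).unsqueeze(1)    # (rows, 1)
    col_scale = (U1 @ lam).unsqueeze(0)    # (1, cols)
    return W0 + row_scale * col_scale * W1 # Hadamard product over rows and columns

W_lam = addition_weights(torch.tensor([0.3, -0.7]))
print(W_lam.shape)                         # torch.Size([6, 5])
```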
This version is called the Addition version. Here, U^l = (U^l_0, U^l_1) and V^l = (V^l_0, V^l_1, V^l_2) are the parameters of the spatial and frequency tensors. The number of parameters of Equation (16) is about twice the number of parameters of the main network. In order to reduce the number of parameters, another, multiplicative formulation may be used. This choice is justified by the shape of the Taylor expansion. The parameters of the main network are generated by
R_V(λ) = r^{FT,l}_{ijm} = r^{0,l}_{ijm}(1 + λ_k v^l_{0ik}v^l_{1jk}v^l_{2mk}), (18)
W_U(λ) = w^{XT,l}_{ij} = w^{0,l}_{ij}(1 + λ_k u^l_{0ik}u^l_{1jk}), (19)

where r^{FT,l}_{ijm} and w^{XT,l}_{ij} are the frequency and spatial tensors used in the main network, written using Einstein notation. This is called the Taylor version. The initial expansion and final projection are also generated by the hyper network using
P_V(λ) = P_0 + (V_0λ, V_1λ) ⊙_{row,col} P_1, (20)
Q_U(λ) = Q_0 + (U_0λ, U_1λ) ⊙_{row,col} Q_1. (21)
The parameters λ can be encoded using an additional neural network of minimal size, λ′ = g(T, λ), with T being additional hyper-FNO parameters. The parameters of the hyper-FNO are θ = {V^l, U^l, R^l, W^l, T}_{l=0,…,L−1}, where R^l, W^l may contain one or two tensors, depending on the architecture choice.
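For illustration, the multiplicative Taylor version of Equation (18) can be sketched in Python (PyTorch) as follows for a single frequency tensor of one layer; the shapes, the use of real-valued weights, and the einsum contraction are assumptions of the sketch.

```python
# "Taylor version" sketch: r(lambda) = r0 * (1 + lambda_k v0_ik v1_jk v2_mk).
import torch

I, J, M, K = 4, 4, 8, 2                    # channels in/out, modes, dim of lambda
r0 = torch.randn(I, J, M)                  # base frequency tensor of the main network
v0 = torch.randn(I, K) * 0.1               # hyper parameters V = (v0, v1, v2)
v1 = torch.randn(J, K) * 0.1
v2 = torch.randn(M, K) * 0.1

def taylor_weights(lam):                   # lam: (K,) system parameters
    # Einstein-notation contraction: 1 + lambda_k * v0_ik * v1_jk * v2_mk
    mod = torch.einsum("k,ik,jk,mk->ijm", lam, v0, v1, v2)
    return r0 * (1.0 + mod)

r_lam = taylor_weights(torch.tensor([0.5, -1.0]))
print(r_lam.shape)                         # torch.Size([4, 4, 8])
```

Note that, in contrast to the Addition version, the added parameters (v0, v1, v2) grow only with the tensor side lengths rather than with the full tensor size, which is why this version adds only a negligible number of parameters.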
Equation (14) can be differentiated with respect to the parameter λ, leading, up to the derivative of the activation function, to the identity

∇_λz_{l+1} = ∇_λW z_l + W∇_λz_l + ℱ^{-1}(∇_λR ℱ(z_l)) + ℱ^{-1}(R ℱ(∇_λz_l)),

where the two terms ∇_λW and ∇_λR are the variations of the FNO parameters. In one approach, ∇_λR = V and ∇_λW = U, where the change is a linear transformation in the parameter λ.
The extension to new operators in the Fourier and spatial domains may also be considered. Specifically, various families of operations may be considered, in particular affine, rotation, polynomial, multilayer perceptron (MLP), and rank-1 operations. The generic operator is described as

Y = T(λ′, X), λ′ = ƒ_θ(λ), (24)

where Y is any of the FNO parameters R, W, P, Q, and X is the hyper-parameter. ƒ_θ is a generic transformation used to increase or reduce the number of parameters or to include non-linear transformations.
The first class can be written in the following ways using Einstein notation:
T(λ,X) = y^l_{ijm} = x^{0,l}_{ijm} + x^{1,l}_{ijm}(λ_k x^l_{0ik}x^l_{1jk}x^l_{2mk}), (25)
T(λ,X) = y^l_{ijm} = x^{0,l}_{ijm}(1 + λ_k x^l_{0ik}x^l_{1jk}x^l_{2mk}). (26)
For the rotation, the exponential operator may be used. Since a tensor is involved, the exponential map of a tensor can be defined as

exp{X} = Σ_{n=0}^∞ X^n/n!.

A rotation can then be written as exp{λX}. In order to restrict the number of parameters and the complexity, Rodrigues' formula

exp{λX} = I + sin(λ)X + (1 − cos(λ))X², (27)

may be used, with X being an anti-symmetric tensor (for a matrix, X = AB − BA; for a tensor, X = ½(X_{…ij…} − X_{…ji…})), thus leading to the rotation (exponentiation) transformation

Y = Π_k exp{λ_k X_{0,k}} X_1,

with X_{0,k}, X_1 being learnable parameters and the product with λ being implementable in a similar manner as in Equation (25), while α_k = ∥X_{0,k}∥.
An alternative is to use a polynomial over the tensor X:
T(λ,X) = poly_λ(X) = Σ_{n=0}^N λ^n X^n, (28)

where X^n denotes the n-fold application of X.
The most generic transformation is implemented using a standard MLP, in which

Y = g_X(λ), (29)

where g_X is an MLP with parameters X.
The rotation and polynomial operators are expensive in terms of the number of parameters, since they require full-rank operators. For example, rotations are invertible matrices, while the power operator will produce equal but scalar-scaled matrices, i.e., (vv^T)^n = (v^Tv)^{n−1}vv^T, when applied to rank-1 matrices. Thus, the use of rank-1 updates is considered, wherein for each parameter λ_k, a rank-1 vector transformation can be written in simplified form as:

Y = Π_k(I + λ_k x_{0k}x_{0k}^T)X_1, (30)

where x_{0k}, X_1 are trainable parameters.
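For illustration, the rank-1 update of Equation (30) can be applied without materializing the identity matrix, as in the following Python (PyTorch) sketch; the dimensions are illustrative.

```python
# Rank-1 update sketch: Y = prod_k (I + lambda_k x0k x0k^T) X1, applied cheaply.
import torch

d, K = 5, 3
x0 = torch.randn(K, d)                     # one direction vector per parameter k
X1 = torch.randn(d, d)                     # trainable base matrix

def rank1_transform(lam):                  # lam: (K,) system parameters
    Y = X1.clone()
    for k in range(K):
        v = x0[k].unsqueeze(1)             # (d, 1) column vector
        Y = Y + lam[k] * v @ (v.T @ Y)     # (I + lam_k v v^T) Y without forming I
    return Y

Y = rank1_transform(torch.tensor([0.1, -0.2, 0.05]))
print(Y.shape)                             # torch.Size([5, 5])
```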
In the effort to identify nonlinear dynamical systems from data, Multi-Step neural networks (Raissi, et al., 2018) use multi-step time-stepping schemes to learn the system dynamics. The PDE is expanded in the time dimension and expressed as an M-step equation, where the step hyper-parameters α, β define the scheme, while the system dynamics are captured by the neural network ƒ, whose parameters are learned by minimizing the mean square error with the observed data. This approach is thus limited to time-series data.

HyperPINN (Belbute-Peres, et al., 2021), a closely related work, introduces the use of hyper networks for Physics-Informed Neural Networks (PINNs). A hyper network generates the main network, which is then used to solve the specific PDE. This approach inherits the same limitations as PINNs, and thus requires running multiple iterations for each new initial condition, resulting in relatively long inference times.

Meta-learning (Chen, et al., 2019) has been used to help solve advection-diffusion-reaction (ADR) equations by optimizing the hyper-parameters of sPINN (O'Leary, et al., 2021), the stochastic version of PINN, using Bayesian optimization with the composite multi-fidelity neural network proposed in (Meng and Karniadakis, 2020). This approach allows estimating the PDE parameters and reducing the computation time, but it still requires multiple evaluations for every new initial condition, thus sharing similar limitations with PINNs, where the closed-form equation of the problem is known in advance.
In order to evaluate the performance of hyper-FNO, the following problems are considered: 1) the one-dimensional Burgers' equation, 2) a one-dimensional reaction-diffusion equation, and 3) a two-dimensional decaying flow problem. In contrast to (Li, et al., 2021), datasets allowing various parameter values are prepared, for instance for the diffusion coefficient.

The resource costs of hyper-FNO are evaluated in terms of the additional parameters needed by the respective architecture, since each choice has a varying impact on the number of parameters. Indeed, the number of parameters defines the memory and computational complexity of the resulting neural network. The Taylor version only adds a negligible number of parameters, and thus its complexity is similar to that of the original network. If the Addition version is used, the number of parameters doubles, while the fully connected version does not have any upper bound. In experiments, a fully connected network is used that leads to an increase of up to 9 to 10 times the original parameter count. The computational complexity of the Addition and Taylor versions is thus equal to that of the original network.

A further reduction of complexity could be achieved when a reduced-rank representation of the model tensors is used; for example, one could model R^l_0 = M^l_0N^l_0, with ρ(M^l_0) = ρ(N^l_0) ≪ ρ(R^l_0).
To illustrate the computational complexity of numerical simulators, the computational cost of a traditional numerical solver for field equations, such as hydrodynamic equations, may be considered. For simplicity, only the case of the explicit method is considered. First, the memory cost is approximately proportional to O(n_cN^d), where n_c is the number of variables, N is the resolution in a direction, and d is the number of dimensions along each axis. If using a method with n-th order temporal accuracy, the cost increases as O(n·n_cN^d) because n increments need to be performed. Such is the case, for example, when using an n-th order Runge-Kutta method. Next, the necessary number of calculations is considered. Approximately speaking, the number of calculations is proportional to the mesh size, i.e., O(N^d). Assuming the advection equation, the stability condition, known as the Courant-Friedrichs-Lewy (CFL) condition, demands that the upper limit of the time-step size be Δt ∝ Δx, where Δt, Δx are the time-step size and mesh size, respectively. Hence, the necessary number of temporal steps is T_fin/Δt ∝ N, where T_fin is the final time, so that the total number of calculations is proportional to O(N^{d+1}). If the diffusion process is included, the CFL condition becomes Δt ∝ Δx², and the total number of calculations is proportional to O(N^{d+2}) when Δt_diff/Δt_adv = v_cΔx/η < 1, where v_c is the characteristic velocity and η is the diffusion coefficient. This analysis shows that hyper-FNO becomes especially more effective than direct numerical simulation when considering large diffusion coefficients and high-resolution cases, because the numerical complexity of hyper-FNO is independent of the diffusion coefficient, and its accuracy depends only weakly on the resolution, as shown in (Li, et al., 2021).
In Zero-Shot learning, at training time, access to solutions of a PDE over different initial conditions and for a set of PDE parameters is provided. At inference time, the PDE parameters of the new environment are used as inputs to the hyper-FNO to generate the parameters of the main FNO network. This network is then used to predict the solutions for new initial conditions. To evaluate the performance of hyper-FNO, it can be compared in various numerical computational problems against the original FNO (Li, et al., 2020) and the U-Net (Ronneberger, et al., 2015).
In addition to zero-shot learning, a few-shot case may be considered wherein a set of training samples for a new environment is given, corresponding to a new parameter configuration of the PDE. In this case, the parameters of the new environment are used to generate the FNO main network, and the network is further trained with the additional samples. Finally, the fine-tuned network is tested with test samples. An additional case may be considered wherein the parameters of each environment are assumed and not known, but the method estimates the parameters based on held-out validation samples. The problem to be solved can be written as a bi-level problem:

min_θ Σ_e L^{tr}_e(θ, λ*_e(θ)) subject to λ*_e(θ) = argmin_λ L^{val}_e(θ, λ). (31)

At test time, some samples are used to predict the parameter of the dataset,

λ̂_e = argmin_λ L^{tr}_e(θ, λ), (32)

then a query of the hyper-FNO is used to obtain the main network parameters φ_e = h(θ, λ̂_e), which are used to predict the solution to a PDE, û = ƒ(φ_e, x). The loss functions are defined for each environment as
L^{tr}_e(θ, λ) = 𝔼_{(x,u)∼D^{tr}_e} ℓ(ƒ(h(θ, λ), x), u) and L^{te}_e(θ, λ) = 𝔼_{(x,u)∼D^{te}_e} ℓ(ƒ(h(θ, λ), x), u),

respectively.
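For illustration, the following Python (PyTorch) sketch estimates an unknown environment parameter λ from a handful of observed samples by gradient descent, with a toy linear model standing in for ƒ(h(θ, λ), x) and the hyper network held fixed; the true parameter, sample count, and optimizer settings are assumptions of the sketch.

```python
# Few-shot parameter estimation sketch: fit lambda to ten observed samples.
import torch

true_lam = torch.tensor([1.5])               # unknown environment parameter
xs = torch.randn(10, 1)                      # ten observed inputs
ys = true_lam * xs                           # their observed "solutions"

def f(lam, x):                               # toy stand-in for f(h(theta, lam), x)
    return lam * x

lam = torch.zeros(1, requires_grad=True)     # estimate of the environment parameter
opt = torch.optim.Adam([lam], lr=0.1)
for step in range(200):
    loss = torch.nn.functional.mse_loss(f(lam, xs), ys)
    opt.zero_grad(); loss.backward(); opt.step()
# lam converges toward 1.5 and can then be used to query the hyper-FNO for the
# main network parameters and to predict the rest of the solution.
```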
Meta-learning is the problem of learning meta-parameters from source tasks in a way that helps learn a model for a target task. Each task is defined by two sets of samples: training and test samples. During training, the training samples from the source tasks can be used to learn the meta-model, and the test samples (or validation samples) can be used to train the model.
The vector λ = [λ_τ]_{τ=1}^T is defined, and the meta-objective is L(λ, θ) = E_τ~p(τ)[L_τ(λ_τ, θ)]. At the inner solution λ = λ*(θ), the total derivative with respect to θ is given by

d_θL(λ, θ)|λ=λ*(θ) = ∇_θL(λ, θ)|λ=λ*(θ) − ∇_θ,λL(λ, θ)[∇_λ,λL(λ, θ)]^{-1}∇_λL(λ, θ)|λ=λ*(θ) (33)
The gradient can either be implemented directly or using an iterative loop, where an outer loop searches for the parameter λ_τ associated with each environment, while an inner loop solves for the hyper-FNO parameters. It is observed that the size of ∇_λL(λ, θ) is proportional to the number of tasks and the dimension of the PDE parameter representation. This dimension is typically low, and during training it is limited by the batch size, since only a limited number of tasks are sampled. The inverse [∇_λ,λL(λ, θ)]^{-1} therefore involves only a small matrix and is inexpensive to compute.
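For illustration, equation (33) can be implemented directly with automatic differentiation. The following PyTorch sketch exercises the formula on an assumed toy objective; it is not the disclosed training loop.

```python
# Implicit hypergradient per equation (33), for a low-dimensional lam.
import torch

def hypergradient(L, theta, lam):
    """d_theta L at lam = lam*(theta), via the implicit function theorem."""
    loss = L(lam, theta)
    g_theta, g_lam = torch.autograd.grad(loss, (theta, lam), create_graph=True)
    # lam-lam Hessian, built row by row (dim(lam) is small, so this is cheap)
    H_ll = torch.stack([torch.autograd.grad(g, lam, retain_graph=True)[0]
                        for g in g_lam])
    v = torch.linalg.solve(H_ll, g_lam.detach())   # [grad_{lam,lam} L]^{-1} grad_lam L
    # mixed term grad_{theta,lam} L applied to v, via a vector-Jacobian product
    mixed = torch.autograd.grad(g_lam, theta, grad_outputs=v.detach())[0]
    return g_theta.detach() - mixed

# Toy check on an assumed objective, only to exercise the formula:
theta = torch.randn(3, requires_grad=True)
lam = torch.randn(3, requires_grad=True)
L = lambda l, t: ((l - t) ** 2).sum() + (l * t).sum()
print(hypergradient(L, theta, lam))
```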
The previous results follow from the publication by Domke, which provides the gradient of a loss function of the form L(ω) = ℓ(y*(ω), ω), with y*(ω) = argmin_y E(y, ω), which is given by

d_ωL(ω) = ∂_ωℓ − ∂_ω∂_yE[∂_y∂_yE]^{-1}∂_yℓ,

where the first term is present only when ℓ depends explicitly on ω, i.e., ℓ(y, ω). (Domke, 2012).
At test time, new target tasks D_τ are used. For each task, a training set D_τ^tr can be used to adapt the meta-model to the specific task. The performance on the test set D_τ^te of the target task can then be measured.
The Burgers' equation is a PDE modeling the non-linear behavior and diffusion process of fluid dynamics as:

∂_tu(t,x) + u(t,x)∂_xu(t,x) = ν∂_xxu(t,x),

where ν is the viscosity (diffusion) coefficient.
In an exemplary dataset, the dataset consists of 10,000 initial conditions drawn from various distributions. The dataset is tested over two time horizons, t = [5, 10], where t also indicates the time step of the simulation. Table 1 and Table 2 show performance on the Burgers datasets. As observed in the results, the largest gain is obtained with the longest horizon. This is due to the effect of the parameter change: close to the initial condition, the change in the solution as a function of the PDE parameters is relatively small; for a very large horizon, the difference in the solution is also small, because the source term forces the system toward a steady state independent of the initial condition, so the effect of the parameter change is negligible; for an intermediate time horizon, the change is most evident, and hyper-FNO has the largest advantage.
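For context, Burgers data of this kind can be generated with a conventional explicit solver. The following minimal sketch uses an assumed grid, viscosity, and initial condition, and also illustrates the CFL-limited cost discussed above.

```python
# Minimal explicit finite-difference solver for the 1-D Burgers equation
# u_t + u u_x = nu u_xx with periodic boundaries (illustrative only).
import numpy as np

def burgers_step(u, dx, dt, nu):
    """One explicit step: upwind advection plus centered diffusion."""
    u_p, u_m = np.roll(u, -1), np.roll(u, 1)          # periodic neighbors
    adv = np.where(u > 0, u * (u - u_m), u * (u_p - u)) / dx
    diff = nu * (u_p - 2 * u + u_m) / dx**2
    return u + dt * (diff - adv)

N, nu = 256, 0.01
x = np.linspace(0.0, 1.0, N, endpoint=False)
u = np.sin(2 * np.pi * x)                             # one initial condition
dx = 1.0 / N
dt = 0.2 * min(dx, dx**2 / nu)                        # respects both CFL limits
for _ in range(1000):
    u = burgers_step(u, dx, dt, nu)
```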
Next, a one-dimensional reaction-diffusion type PDE is considered, which combines a diffusion process and a rapid evolution from a source term (Krishnapriyan, et al., 2021). The equation is expressed as:
∂_tu(t,x)−ν∂_xxu(t,x)−ρu(1−u)=0, x∈(0,1), t∈(0,1], (39)
u(0,x)=u0(x),x∈(0,1). (40)
Tables 3 and 4 show the results of hyper-FNO on the reaction-diffusion dataset for the time horizons t = [5, 10]. As with the Burgers equation, hyper-FNO shows improved performance on the reaction-diffusion equation and can adapt to the change in the parameters.
In some experiments, the steady-state solution of the 2-d Darcy flow over the unit square is considered, whose viscosity term a(x) is an input of the system. The steady-state solution is defined by the following equations:
−∇·(a(x−λ)∇u(x))=ƒ(x), x∈(0,1)² (41)
u(x)=0, x∈∂(0,1)² (42)
where the viscosity term is shifted by the parameters λ = [λ_x, λ_y]^T.
Table 5 shows the performance of hyper-FNO in modeling the change in the parameters of the steady-state solution. The performance gain is somewhat limited in this case. The effect of the change in the parameters of the Darcy flow is to shift the viscosity term in the 2-d coordinates. The limited improvement is related to the limited capacity of the FNO to capture this type of parameter change, as indicated by the difference in test error between U-Net2d and FNO being smaller than in the other PDE cases.
Hyper-FNO is a method that improves the adaptability of an FNO to the various parameters of a physical system being modeled. Furthermore, the disclosed hyper-FNO is agnostic to the actual system and can be adapted to a variety of fields and uses for positive societal impact.
Through hyper-FNO, a method is provided to adapt the FNO architecture over a wide range of parameters of the PDE. Significant improvement is gained across different physical systems, such as the Burgers equation, the reaction-diffusion equation, and the Darcy flow. Meta-learning for physics-informed machine learning is an important direction of research, and a method in this direction that allows a model to adapt to new environments is disclosed. In some embodiments, the parameters of the PDE may be automatically learned using Bayesian optimization.
A Navier-Stokes equation is considered; in compressible form, the governing equations may be written as

∂_tρ + ∇·(ρv) = 0,
ρ(∂_tv + (v·∇)v) = −∇p + η∇²v + (ζ + η/3)∇(∇·v), with pressure p = c_s²ρ,
where c_s is the sound velocity, and η and ζ are the shear and bulk viscosity, respectively. The above equations have more parameters than the incompressible Navier-Stokes equations, namely the bulk viscosity ζ and the Mach number v_c/c_s, where v_c is the characteristic velocity in the system. In this case, the next-step value can be recursively predicted after observing the first t_0 = 10 samples, allowing predictions for t_0 < t ≤ T.
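The recursive prediction scheme can be sketched as follows; here `model` stands in for the generated main network, and the frame shapes are illustrative assumptions.

```python
# Autoregressive rollout: observe t0 ground-truth frames, then repeatedly
# feed the model's own prediction back in until T frames exist.
import torch

@torch.no_grad()
def rollout(model, observed, T):
    """observed: tensor [t0, H, W, C] of known frames; returns [T, H, W, C]."""
    frames = [f for f in observed]          # the first t0 frames are given
    while len(frames) < T:
        nxt = model(frames[-1].unsqueeze(0)).squeeze(0)   # predict one step
        frames.append(nxt)                  # feed the prediction back in
    return torch.stack(frames)
```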
In an exemplary implementation, the rate of change of a solution as a function of a change in a PDE parameter can be determined. Specifically, the difference between the original solution and the solution after an infinitesimal change Δλ is computed. The computed difference is
∫T,Ω∥U′(t,ω)−U(t,ω)∥dtdω = ∫T,Ω∥−4ω²tG(t,ω)ΔλU(0,ω)∥dtdω = |Δλ|∫T,Ω∥4ω²tG(t,ω)U(0,ω)∥dtdω (46)
with U′(t, ω) = (G(t, ω) + ∂_λG(t, ω)Δλ)U(0, ω). For Δλ ≠ 0, the difference increases with the square of the frequency. This implies that if the parameter of the equation is changed, a change in the solution is induced that is proportional to |Δλ|∫T,Ω∥4ω²tG(t, ω)U(0, ω)∥dtdω. The original operator is thus no longer able to accurately predict the function at a later time, accumulating error in time and frequency.
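This scaling can be checked numerically. The following sketch assumes, for illustration, a diffusion-type Green's function G(t, ω) = exp(−4ω²λt) in Fourier space (so that ∂_λG = −4ω²tG); the frequencies and parameter values below are arbitrary.

```python
# Numerical check that the per-mode difference |U' - U| follows the
# first-order term 4 w^2 t G(t, w) |d_lam| for small parameter changes.
import numpy as np

lam, d_lam, t = 0.1, 1e-3, 1.0
w = np.array([1.0, 2.0, 4.0, 8.0])                 # sample frequencies
U0 = np.ones_like(w)                               # flat initial spectrum

G  = np.exp(-4 * w**2 * lam * t)
Gp = np.exp(-4 * w**2 * (lam + d_lam) * t)         # perturbed parameter
diff = np.abs(Gp - G) * np.abs(U0)
first_order = np.abs(-4 * w**2 * t * G * d_lam) * np.abs(U0)
print(diff / first_order)   # ratios near 1, approaching 1 as d_lam -> 0
```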
Processors 1302 can include one or more distinct processors, each having one or more cores. Each of the distinct processors can have the same or different structure. Processors 1302 can include one or more central processing units (CPUs), one or more graphics processing units (GPUs), circuitry (e.g., application specific integrated circuits (ASICs)), digital signal processors (DSPs), and the like. Processors 1302 can be mounted to a common substrate or to multiple different substrates.
Processors 1302 are configured to perform a certain function, method, or operation (e.g., are configured to provide for performance of a function, method, or operation) at least when one of the one or more of the distinct processors is capable of performing operations embodying the function, method, or operation. Processors 1302 can perform operations embodying the function, method, or operation by, for example, executing code (e.g., interpreting scripts) stored on memory 1304 and/or trafficking data through one or more ASICs. Processors 1302, and thus processing system 1300, can be configured to perform, automatically, any and all functions, methods, and operations disclosed herein. Therefore, processing system 1300 can be configured to implement any of (e.g., all of) the protocols, devices, mechanisms, systems, and methods described herein.
For example, when the present disclosure states that a method or device performs task “X” (or that task “X” is performed), such a statement should be understood to disclose that processing system 1300 can be configured to perform task “X”. Processing system 1300 is configured to perform a function, method, or operation at least when processors 1302 are configured to do the same.
Memory 1304 can include volatile memory, non-volatile memory, and any other medium capable of storing data. Each of the volatile memory, non-volatile memory, and any other type of memory can include multiple different memory devices, located at multiple distinct locations and each having a different structure. Memory 1304 can include remotely hosted (e.g., cloud) storage.
Examples of memory 1304 include non-transitory computer-readable media such as RAM, ROM, flash memory, EEPROM, any kind of optical storage disk such as a DVD, a Blu-Ray® disc, magnetic storage, holographic storage, an HDD, an SSD, any medium that can be used to store program code in the form of instructions or data structures, and the like. Any and all of the methods, functions, and operations described herein can be fully embodied in the form of tangible and/or non-transitory machine-readable code (e.g., interpretable scripts) saved in memory 1304.
Input-output devices 1306 can include any component for trafficking data such as ports, antennas (i.e., transceivers), printed conductive paths, and the like. Input-output devices 1306 can enable wired communication via USB®, DisplayPort®, HDMI®, Ethernet, and the like. Input-output devices 1306 can enable electronic, optical, magnetic, and holographic communication with suitable memory 1304. Input-output devices 1306 can enable wireless communication via WiFi®, Bluetooth®, cellular (e.g., LTE®, CDMA®, GSM®, WiMax®, NFC®), GPS, and the like. Input-output devices 1306 can include wired and/or wireless communication pathways.
Sensors 1308 can capture physical measurements of an environment and report the same to processors 1302. User interface 1310 can include displays, physical buttons, speakers, microphones, keyboards, and the like. Actuators 1312 can enable processors 1302 to control mechanical forces.
Processing system 1300 can be distributed. For example, some components of processing system 1300 can reside in a remote hosted network service (e.g., a cloud computing environment) while other components of processing system 1300 can reside in a local computing system. Processing system 1300 can have a modular design where certain modules include a plurality of the features/functions shown in
The attached paper “Appendix” forms a part of this disclosure and is hereby incorporated by reference herein in its entirety, including each of the references cited therein.
While subject matter of the present disclosure has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive. Any statement made herein characterizing the invention is also to be considered illustrative or exemplary and not restrictive as the invention is defined by the claims. It will be understood that changes and modifications may be made, by those of ordinary skill in the art, within the scope of the following claims, which may include any combination of features from different embodiments described above.
The terms used in the claims should be construed to have the broadest reasonable interpretation consistent with the foregoing description. For example, the use of the article “a” or “the” in introducing an element should not be interpreted as being exclusive of a plurality of elements. Likewise, the recitation of “or” should be interpreted as being inclusive, such that the recitation of “A or B” is not exclusive of “A and B,” unless it is clear from the context or the foregoing description that only one of A and B is intended. Further, the recitation of “at least one of A, B and C” should be interpreted as one or more of a group of elements consisting of A, B and C, and should not be interpreted as requiring at least one of each of the listed elements A, B and C, regardless of whether A, B and C are related as categories or otherwise. Moreover, the recitation of “A, B and/or C” or “at least one of A, B or C” should be interpreted as including any singular entity from the listed elements, e.g., A, any subset from the listed elements, e.g., A and B, or the entire list of elements A, B and C.
The following references provide additional background information which may be helpful in understanding aspects of the present disclosure. The entire contents of each of the following references are incorporated by reference herein.
| Number | Date | Country | Kind |
|---|---|---|---|
| 22173344.7 | May 2022 | EP | regional |