The present application claims priority to Provisional Patent Application Ser. No. 63/009,282, entitled “CFDNET: A Deep Learning-Based Accelerator for Fluid Simulations” and filed on Apr. 13, 2020, the entirety of which is incorporated by reference herein.
Physics-based modeling represents real world processes or physical systems using numerical solutions of physics equations that describe physical processes in the real world. Examples of the physics equations that are used in physics-based models include Newton's equations of motion, Maxwell's equations of electrodynamics, Einstein's relativistic equations of motion, Schrödinger's equations for quantum mechanics, Navier-Stokes equations of fluid dynamics, and the like. These equations, or combinations thereof, are typically represented as partial differential equations that are solved iteratively to determine values of variables that represent the physical state of cells in a discretized geometry such as a grid or a mesh. Solutions to the physics equations used in physics-based modeling are typically constrained to satisfy physical conservation laws such as conservation of mass, conservation of energy, conservation of momentum, conservation of charge, and the like, e.g., by including a corresponding continuity equation in the physics-based model. The solutions include static solutions that converge to time-independent values of the variables in the cells or dynamic solutions that produce time-dependent values of the variables in the cells. For example, computational fluid dynamics (CFD) is used to solve the Navier-Stokes equations in static geometries such as fluid flow past a fixed object and dynamic geometries such as weather systems. However, physics-based modeling of equations such as the Navier-Stokes equations is computationally expensive and typically subject to a trade-off between accuracy and computational costs.
The present disclosure may be better understood, and its numerous features and advantages made apparent to those skilled in the art by referencing the accompanying drawings. The use of the same reference symbols in different drawings indicates similar or identical items.
Machine learning methods, including deep learning (DL), are widely used to perform modeling/classification tasks such as computer vision, natural language processing, and high-performance computing. Conventional DL algorithms are implemented using neural networks such as deep neural networks (DNNs), convolutional neural networks (CNNs), recurrent neural networks (RNNs), and the like. A CNN architecture includes a stack of layers that implement functions to transform an input volume (such as a digital image) into an output volume (such as labeled features detected in the digital image). A DNN performs deep learning on tasks that contain multiple hidden layers. For example, a DNN that is used to implement computer vision includes explicit functions (such as orientation maps) and multiple hidden functions in the hierarchy of vision flow. For example, the layers in a CNN are separated into convolutional layers, pooling layers, and fully connected layers. In some embodiments, multiple sets of convolutional, pooling, and fully connected layers are interleaved to form a complete CNN. The functions implemented by the layers in a CNN are explicit (i.e., known or predetermined) or hidden (i.e., unknown). An RNN is a type of artificial neural network that forms a directed graph of connections between nodes along a temporal sequence and exhibits temporal dynamic behavior.
Attempts have been made to accelerate physics-based models by incorporating DL algorithms. For example, DL algorithms have been applied to accelerate computational fluid dynamics (CFD) simulations. However, DL algorithms are not constrained by the physical requirements of the relevant equations or conservation laws, such as the Navier-Stokes equations that govern fluid flows. Instead, DL algorithms are trained to recognize patterns of physical variables using data from previous physics-based models in related contexts. Neglecting the physical requirements of the situation leads to several drawbacks. First, techniques based on DL algorithms typically do not satisfy the relevant conservation laws. Second, a conventional DL algorithm typically predicts a partial flow field that includes a subset of the flow variables that provide incomplete information about the physical context. For example, DL algorithms are not applied to turbulent flows that are common in most industrial applications. Furthermore, the DL algorithms are trained based on training input such as a training geometry that is the same as (or similar to) a test geometry that the DL algorithm is attempting to model. Thus, the DL algorithms are not easily generalized to other geometries.
The processing system 100 includes at least one graphics processing unit (GPU) 115 that renders images for presentation on a display 120. For example, the GPU 115 renders objects to produce values of pixels that are provided to the display 120, which uses the pixel values to display an image that represents the rendered objects. Some embodiments of the GPU 115 are used to implement DL operations including CNNs, DNNs, and RNNs, as well as performing other general-purpose computing tasks. In the illustrated embodiment, the GPU 115 implements multiple processing elements 116, 117, 118 (collectively referred to herein as “the processing elements 116-118”) that execute instructions concurrently or in parallel. In the illustrated embodiment, the GPU 115 communicates with the memory 105 over the bus 110. However, some embodiments of the GPU 115 communicate with the memory 105 over a direct connection or via other buses, bridges, switches, routers, and the like. The GPU 115 executes instructions stored in the memory 105 and the GPU 115 stores information in the memory 105 such as the results of the executed instructions. In the illustrated embodiment, the memory 105 stores a copy of instructions from program code that represents a physics-based solver 125 and a copy of instructions from program code that represents a DL algorithm 128.
The processing system 100 also includes at least one central processing unit (CPU) 130 that implements multiple processing elements 131, 132, 133, which are collectively referred to herein as “the processing elements 131-133.” The processing elements 131-133 execute instructions concurrently or in parallel. The CPU 130 is connected to the bus 110 and therefore communicates with the GPU 115 and the memory 105 via the bus 110. The CPU 130 executes instructions such as program code 135 stored in the memory 105 and the CPU 130 stores information in the memory 105 such as the results of the executed instructions. The CPU 130 is also able to initiate graphics processing by issuing draw calls to the GPU 115. Some embodiments of the CPU 130 execute portions of the copy of the program code for a physics-based solver 125, portions of the copy of the program code for the DL algorithm 128, or a combination thereof.
An input/output (I/O) engine 140 handles input or output operations associated with the display 120, as well as other elements of the processing system 100 such as keyboards, mice, printers, external disks, and the like. The I/O engine 140 is coupled to the bus 110 so that the I/O engine 140 communicates with the memory 105, the GPU 115, or the CPU 130. In the illustrated embodiment, the I/O engine 140 reads information stored on an external storage component 145, which is implemented using a non-transitory computer readable medium such as a compact disk (CD), a digital video disc (DVD), and the like. The I/O engine 140 also writes information to the external storage component 145, such as the results of processing by the GPU 115 or the CPU 130.
The physics-based solver 125 is used to solve a set of one or more physical equations and, in some embodiments, corresponding conservation laws (e.g., conservation laws that are represented by one or more continuity equations) that determine the values of physical variables that represent a state of a physical system. To illustrate, some embodiments of the physics-based solver 125 are configured to solve the Navier-Stokes equations:
The variables in the Navier-Stokes equations are the mean velocity (Ū), the kinematic mean pressure (
These equations form a system of four partial differential equations in two dimensions (2D) and five partial differential equations in three dimensions (3D). The physics-based solver 125 solves discretized forms of these equations on a structured grid with corresponding boundary conditions using numerical finite difference techniques.
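The finite difference approach on a structured grid can be illustrated with a much simpler equation than the Navier-Stokes equations. The following minimal sketch assumes a two-dimensional Laplace equation with fixed boundary values as a stand-in; the function names `jacobi_step` and `solve_laplace`, the grid size, and the tolerance are hypothetical and are not taken from the physics-based solver 125.

```python
def jacobi_step(grid):
    """One Jacobi sweep of the 5-point finite-difference Laplace stencil.

    Boundary cells are left untouched, so they act as fixed boundary
    conditions. Returns the new grid and the largest change in any cell.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    max_change = 0.0
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
            max_change = max(max_change, abs(new[i][j] - grid[i][j]))
    return new, max_change


def solve_laplace(grid, tol=1e-8, max_iters=10_000):
    """Repeat Jacobi sweeps until the largest change falls below tol."""
    iteration = 0
    for iteration in range(1, max_iters + 1):
        grid, change = jacobi_step(grid)
        if change < tol:
            break
    return grid, iteration


# Hypothetical 5x5 grid: top edge held at 1.0, the other edges at 0.0.
n = 5
initial = [[1.0 if i == 0 else 0.0 for _ in range(n)] for i in range(n)]
solution, iterations = solve_laplace(initial)
```

In this toy configuration the value at the center cell settles near 0.25, which follows from the symmetry of the four edges.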
The DL algorithm 128 is implemented using one or more artificial neural networks, such as a CNN, DNN, or RNN, which are represented as program code that is configured using a corresponding set of parameters. The artificial neural network is therefore executed on one or more GPUs 115, one or more CPUs 130, or other processing units including field programmable gate arrays (FPGA), application-specific integrated circuits (ASIC), and the like, or combinations of such processing units. If the artificial neural network implements a known function that is trained using a corresponding known dataset, the artificial neural network is trained (i.e., the values of the parameters that define the artificial neural network are established) by providing input values of the known training data set (e.g., the training input) to the artificial neural network executing on the GPU 115 or the CPU 130 and then comparing the output values of the artificial neural network to labeled output values in the known training data set. Error values are determined based on the comparison and back propagated to modify the values of the parameters that define the artificial neural network. This process is iterated until the values of the parameters satisfy a convergence criterion, e.g., as represented by one or more thresholds that are compared to values of the parameters. Some embodiments of the DL algorithm 128 are implemented using other input representations such as a 1-D representation.
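The training procedure described above (compare outputs to labeled values, propagate the error back to the parameters, and repeat until a convergence criterion is met) can be sketched with a single linear unit in place of a full neural network. The model, learning rate, threshold, and data below are assumptions chosen for illustration only; here the convergence test compares the size of the parameter update to a threshold.

```python
def train_linear(inputs, targets, lr=0.1, tol=1e-10, max_epochs=10_000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    count = len(inputs)
    for _epoch in range(max_epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(inputs, targets):
            error = (w * x + b) - y          # compare output to the label
            grad_w += 2.0 * error * x / count
            grad_b += 2.0 * error / count
        step_w, step_b = lr * grad_w, lr * grad_b
        w -= step_w                          # propagate the error to the parameters
        b -= step_b
        if max(abs(step_w), abs(step_b)) < tol:
            break                            # parameter updates are below the threshold
    return w, b


# Hypothetical training set generated from y = 2x + 1.
w, b = train_linear([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```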
Some embodiments of the DL algorithm 128 are trained using results of previous simulations performed by the physics-based solver 125. For example, CFD simulations can be performed for a set of training flow configurations. Images of the physical variables that represent the flow field at intermediate iterations (e.g., prior to convergence of the physics-based solver 125) are stored and used as inputs to the DL algorithm 128 during training. An image of the physical variables that represent the flow field after convergence of the physics-based solver 125 is stored and labeled as the target output for the DL algorithm 128 during training. However, in some embodiments, an intermediate output is stored prior to convergence of the physics-based solver 125 and labeled as the target output for the DL algorithm 128 during training. The DL algorithm 128 is then trained to produce the image of the converged values of the physical variables (or the intermediate output) in response to input representing the flow field at any of the intermediate iterations.
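One way to assemble such a training set is to pair every stored intermediate snapshot with the converged snapshot from the same simulation. The sketch below assumes that the snapshots are kept in iteration order and that the last entry is the converged field; the helper name `build_training_pairs` and the sample values are hypothetical.

```python
def build_training_pairs(snapshots):
    """Pair each intermediate snapshot with the converged (last) snapshot."""
    if len(snapshots) < 2:
        raise ValueError("need at least one intermediate and one converged snapshot")
    target = snapshots[-1]
    return [(intermediate, target) for intermediate in snapshots[:-1]]


# Hypothetical solver history for a two-cell field, in iteration order.
history = [[0.0, 0.0], [0.4, 0.1], [0.5, 0.2]]
pairs = build_training_pairs(history)
```

Each simulation with N stored snapshots therefore contributes N-1 input/target pairs that share a single target.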
In operation, the GPU 115, the CPU 130, or a combination thereof execute the DL algorithm 128 on input values of the physical variables that represent a state of a physical system that is governed by one or more physical equations and one or more corresponding conservation laws. Some embodiments of the input values are determined by executing the physics-based solver 125 for one or more initial (or warm-up) iterations, although the physics-based solver 125 does not perform warm-up iterations in other embodiments. The values of the physical variables determined by the physics-based solver 125 during the warm-up iterations are provided as input values to the DL algorithm 128. The DL algorithm 128 infers estimated values of the physical variables that represent a state of the physical system. The estimated values of the physical variables are then provided as input to the physics-based solver 125, which is executed to modify the estimated values based on the one or more physical equations and conservation laws. In some embodiments, the physics-based solver 125 performs iterations until one or more convergence criteria and the corresponding conservation laws are satisfied within a tolerance. For example, the convergence criterion can be determined in terms of a rate of change of the physical variables between iterations and the conservation law is considered satisfied if the relevant quantity (e.g., mass, energy, momentum) is conserved within a predetermined tolerance.
In some embodiments, the values of the variables in the images 301-304 are non-dimensionalized. For example, the values of the variables in the pixels 305 are divided by a flow configuration-specific reference value corresponding to the variable. Non-dimensionalizing the variables in the images 301-304 addresses the (potentially large) differences in the scales or ranges of the values for the different variables. Non-dimensionalizing the variables in the images 301-304 also reduces the number of free parameters. If certain non-dimensionalized parameters are significantly smaller than others, they are negligible in certain areas of the flow.
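The scaling step can be sketched as a per-channel division. The channel names, the nested-list image layout, and the reference values in the following example are assumptions made for illustration; the actual reference quantities depend on the flow configuration.

```python
def non_dimensionalize(channels, reference):
    """Divide every pixel of each channel by that channel's reference value."""
    return {
        name: [[value / reference[name] for value in row] for row in image]
        for name, image in channels.items()
    }


# Hypothetical one-row images for two variables with very different ranges.
channels = {"velocity_x": [[10.0, 20.0]], "pressure": [[200.0, 400.0]]}
reference = {"velocity_x": 20.0, "pressure": 400.0}
scaled = non_dimensionalize(channels, reference)
```

After scaling, both channels in this example span the same range even though the raw values differ by more than an order of magnitude.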
The convolution layer 401 receives one or more input images 415 such as the input 300 shown in
An output image 435 of the same size as the input image 415 is reconstructed using the subsequent deconvolution layers 411-413. In the illustrated embodiment, the output 430 is provided to the deconvolution layer 411 that generates a corresponding output 440, which is provided to the deconvolution layer 412. Output 445 is generated by the deconvolution layer 412 and provided to the deconvolution layer 413, which uses the output 445 as an input to produce the output image 435. The PReLU activation functions implemented in the convolution layer 401 and the deconvolution layer 413 capture negative values present in the intermediate field represented by the input image 415 and predict final, real valued variables for the output image 435.
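The reason a parametric rectified linear unit (PReLU) preserves negative values, whereas a plain rectified linear unit does not, can be seen from a scalar sketch. The slope of 0.25 used below is a hypothetical fixed value; in a trained network the slope is a learned parameter.

```python
def relu(x):
    """Plain rectified linear unit: negative inputs are discarded."""
    return x if x >= 0.0 else 0.0


def prelu(x, slope=0.25):
    """Parametric ReLU: negative inputs are scaled by the slope, not zeroed."""
    return x if x >= 0.0 else slope * x


# A negative field value (e.g., reversed flow) survives PReLU but not ReLU.
negative_kept = prelu(-2.0)
negative_lost = relu(-2.0)
```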
In some embodiments, the input 505 is provided to an instance of a physics-based solver 515 that performs one or more iterations of a numerical solution of the set of equations that determines values of the physical variables in the physical system. For example, the physics-based solver 515 can use the input 505 as initial values of the variables and then perform one or more iterations to modify the initial values based on the set of equations and corresponding conservation laws. The number of iterations is determined adaptively based on a residual drop of the values of the physical variables from the initial values. For example, a residual drop of one order of magnitude is sufficient for the physical variables near the boundaries of the physical system and the object 510 to capture the geometry and flow conditions. The physics-based solver 515 produces an intermediate image 520 of the values of the physical variables.
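The adaptive choice of the number of warm-up iterations can be sketched as a loop that stops once the residual has dropped by a target number of orders of magnitude relative to the first iteration. The `step` callable, its `(state, residual)` return convention, and the toy relaxation used to exercise it are assumptions, not the interface of the physics-based solver 515.

```python
import math


def orders_dropped(initial_residual, residual):
    """Number of orders of magnitude by which the residual has dropped."""
    return math.log10(initial_residual / residual)


def warm_up(step, state, target_orders=1.0, max_iters=1000):
    """Run step(state) -> (state, residual) until the residual has dropped
    by target_orders orders of magnitude from its first value."""
    state, first = step(state)
    residual = first
    iterations = 1
    while (residual > 0.0
           and orders_dropped(first, residual) < target_orders
           and iterations < max_iters):
        state, residual = step(state)
        iterations += 1
    return state, iterations


def relax(x):
    """Hypothetical solver iteration: move x halfway toward the solution 1.0."""
    new_x = x + 0.5 * (1.0 - x)
    return new_x, abs(new_x - x)


state, iterations = warm_up(relax, 0.0)
```

In this toy case the residual halves at every iteration, so a drop of one order of magnitude is reached after five iterations.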
The intermediate image 520 is provided to a trained DL algorithm 525 that performs inference on the values of the physical variables in the intermediate image 520 to determine an estimated image 530 that indicates estimated values of the physical variables. As discussed herein, the inference process implemented by the DL algorithm 525 does not explicitly account for the constraints imposed by the set of physical equations or the corresponding conservation laws. Thus, the estimated values of the physical variables in the estimated image 530 do not necessarily satisfy either the physical equations or the conservation laws for the physical system. In some embodiments, the inference loss of the trained DL algorithm 525 is less than an error tolerance (for the physical equations or the conservation laws) compared to ground truth data. In that case, the estimated image 530 can be returned as an output tensor that represents the final values of the physical variables. The trained DL algorithm 525 would therefore act as a surrogate of the set of equations and conservation laws that govern the physical system. However, there are several drawbacks to relying exclusively on the results of the trained DL algorithm 525. First, the convergence criteria for the DL algorithm 525 are based on error metrics that lack physical meaning and can be ill-defined. Second, satisfying the conservation laws can be imperative in some situations. Third, ground truth data is typically not available and it may be difficult or impossible to evaluate the accuracy of the results produced by the DL algorithm 525 without ground truth data for comparison.
In the illustrated embodiment, the estimated image 530 is provided to an instance of the physics-based solver 515, which performs one or more additional iterations of the numerical solution to the set of equations to refine the values of the physical variables in the estimated image 530. The physics-based solver 515 applies convergence criteria determined based on changes in the values of the physical variables between iterations and the constraints imposed by the conservation laws. For example, the physics-based solver 515 determines that the numerical solution has converged in response to a residual of the physical variables dropping by 4-5 orders of magnitude. In addition, the physics-based solver 515 requires that the relevant conservation laws be satisfied to within a predetermined tolerance. In response to convergence of the solution, the physics-based solver 515 generates a final image 535 of the final values of the physical variables.
At block 605, initial values of the physical variables that represent a state of a physical system are provided to a physics-based solver that implements a numerical technique for solving the equations subject to one or more conservation laws. For example, the physics-based solver can generate a numerical solution of discretized versions of the equations and conservation laws for values of the physical variables on a grid or mesh of cells.
At block 610, the physics-based solver performs one or more iterations of the numerical solution based on the physical equations and the conservation laws. The values of the physical variables are updated after each iteration. In some embodiments, the number of iterations of the numerical solution performed by the physics-based solver is determined dynamically based on an amplitude or rate of change of the physical variables, the conservation laws, or a combination or subset thereof.
At block 615, an input image of the physical variables is generated based on the values of the physical variables determined by the physics-based solver. In some embodiments, the input image is used to provide values of the physical variables via different channels, as shown in
At block 620, the input image generated by the physics-based solver is provided to a DL algorithm that has been trained on data from other simulations. The DL algorithm performs inference on the input image to generate an output image of the physical variables. The output image includes estimated values of the physical variables that do not necessarily satisfy the requirements of the equations that govern the physical system or the relevant conservation laws. The output image of the estimated values is therefore provided to the physics-based solver for a refinement stage.
At block 625, the physics-based solver performs one or more iterations of the physics model beginning with the estimated values of the physical variables inferred by the trained DL algorithm. The physics-based solver continues to perform iterations of the numerical solution of the set of equations until the relevant convergence criteria and conservation laws are satisfied. In response to determining that the convergence criteria and conservation laws are satisfied, the physics-based solver returns a final image of the values of the physical variables that represent the final state of the physical system.
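The sequence of blocks 605-625 (warm-up iterations, inference, and refinement) can be sketched end to end with a scalar toy problem. Everything in the following example is an assumption made for illustration: `relax` stands in for one solver iteration, the lambda returning 0.999 stands in for a trained DL algorithm that produces a good but inexact estimate, and the tolerance is arbitrary.

```python
def relax(x):
    """Hypothetical solver iteration: move x halfway toward the solution 1.0."""
    new_x = x + 0.5 * (1.0 - x)
    return new_x, abs(new_x - x)


def run_hybrid(initial, solver_step, surrogate,
               warm_up_iters=2, tol=1e-6, max_iters=10_000):
    """Warm-up iterations, one surrogate inference, then solver refinement."""
    state = initial
    for _ in range(warm_up_iters):                 # blocks 605-610: warm-up
        state, _residual = solver_step(state)
    state = surrogate(state)                       # blocks 615-620: inference
    refine_iters = 0
    for refine_iters in range(1, max_iters + 1):   # block 625: refinement
        state, residual = solver_step(state)
        if residual < tol:
            break
    return state, refine_iters


# Stand-in surrogate that jumps close to the solution, versus no surrogate.
final_dl, iters_dl = run_hybrid(0.0, relax, lambda x: 0.999)
final_plain, iters_plain = run_hybrid(0.0, relax, lambda x: x)
```

In this toy setting the refinement stage needs fewer iterations when it starts from the surrogate's estimate, and the final value is still determined by the solver's own convergence test rather than by the surrogate.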
At block 705, the physics-based solver accesses values of physical variables that represent a physical system. In some embodiments, the values of the physical variables are initial values or values that are generated by a trained DL algorithm, as discussed herein.
At block 710, the physics-based solver modifies values of the physical variables based on the physical equations that represent the state of the physical system. In some embodiments, the physics-based solver modifies the values by solving discretized forms of the equations on a structured grid with corresponding boundary conditions using numerical finite difference techniques.
At decision block 715, the physics-based solver determines whether the numerical solution has converged. Some embodiments of the physics-based solver determine whether the numerical solution has converged based on changes in the values of the physical variables between iterations and the constraints imposed by the conservation laws. For example, the physics-based solver determines that the numerical solution has converged in response to a residual of the physical variables dropping by 4-5 orders of magnitude. In addition, the physics-based solver requires that the relevant conservation laws be satisfied to within a predetermined tolerance. If the numerical solution has not converged, the method 700 flows back to the block 705. If the numerical solution has converged, the method 700 flows to block 720.
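The decision at block 715 combines two tests: a residual drop and a conservation check. The sketch below assumes a one-dimensional periodic diffusion update as the hypothetical solver iteration (it conserves the sum of the cell values), a residual drop of four orders of magnitude, and an arbitrary tolerance on the conserved total.

```python
def diffuse(cells, alpha=0.25):
    """One explicit diffusion update on a periodic 1-D grid (conserves the sum)."""
    n = len(cells)
    return [cells[i] + alpha * (cells[i - 1] - 2.0 * cells[i] + cells[(i + 1) % n])
            for i in range(n)]


def iterate_until_converged(cells, orders=4, mass_tol=1e-9, max_iters=100_000):
    """Iterate until the residual has dropped by `orders` orders of magnitude
    and the total of the cell values is conserved to within mass_tol."""
    initial_mass = sum(cells)
    first_residual = None
    for iteration in range(1, max_iters + 1):
        updated = diffuse(cells)
        residual = max(abs(a - b) for a, b in zip(updated, cells))
        cells = updated
        if first_residual is None:
            first_residual = residual
        converged = residual <= first_residual * 10.0 ** (-orders)
        conserved = abs(sum(cells) - initial_mass) <= mass_tol
        if converged and conserved:
            return cells, iteration
    raise RuntimeError("solution did not converge")


# Hypothetical initial state: all of the conserved quantity in one cell.
final_cells, iterations = iterate_until_converged([1.0] + [0.0] * 7)
```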
At block 720, the physics-based solver stores the final values of the physical variables. For example, the final values of the physical variables can be stored in a memory such as the memory 105 shown in
A computer-readable storage medium includes any non-transitory storage medium, or combination of non-transitory storage media, accessible by a computer system during use to provide instructions and/or data to the computer system. Such storage media can include, but are not limited to, optical media (e.g., compact disc (CD), digital versatile disc (DVD), Blu-Ray disc), magnetic media (e.g., floppy disc, magnetic tape, or magnetic hard drive), volatile memory (e.g., random access memory (RAM) or cache), non-volatile memory (e.g., read-only memory (ROM) or Flash memory), or microelectromechanical systems (MEMS)-based storage media. The computer readable storage medium can be embedded in the computing system (e.g., system RAM or ROM), fixedly attached to the computing system (e.g., a magnetic hard drive), removably attached to the computing system (e.g., an optical disc or Universal Serial Bus (USB)-based Flash memory), or coupled to the computer system via a wired or wireless network (e.g., network accessible storage (NAS)).
In some embodiments, certain aspects of the techniques described above are implemented by one or more processors of a processing system executing software. The software includes one or more sets of executable instructions stored or otherwise tangibly embodied on a non-transitory computer readable storage medium. The software can include the instructions and certain data that, when executed by the one or more processors, manipulate the one or more processors to perform one or more aspects of the techniques described above. The non-transitory computer readable storage medium can include, for example, a magnetic or optical disk storage device, solid state storage devices such as Flash memory, a cache, random access memory (RAM) or other non-volatile memory device or devices, and the like. The executable instructions stored on the non-transitory computer readable storage medium can be in source code, assembly language code, object code, or other instruction format that is interpreted or otherwise executable by one or more processors.
Note that not all of the activities or elements described above in the general description are required, that a portion of a specific activity or device may not be required, and that one or more further activities may be performed, or elements included, in addition to those described. Still further, the order in which activities are listed is not necessarily the order in which they are performed. Also, the concepts have been described with reference to specific embodiments. However, one of ordinary skill in the art appreciates that various modifications and changes can be made without departing from the scope of the present disclosure as set forth in the claims below. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the present disclosure.
Benefits, other advantages, and solutions to problems have been described above with regard to specific embodiments. However, the benefits, advantages, solutions to problems, and any feature(s) that may cause any benefit, advantage, or solution to occur or become more pronounced are not to be construed as a critical, required, or essential feature of any or all the claims. Moreover, the particular embodiments disclosed above are illustrative only, as the disclosed subject matter may be modified and practiced in different but equivalent manners apparent to those skilled in the art having the benefit of the teachings herein. No limitations are intended to the details of construction or design herein shown, other than as described in the claims below. It is therefore evident that the particular embodiments disclosed above may be altered or modified and all such variations are considered within the scope of the disclosed subject matter. Accordingly, the protection sought herein is as set forth in the claims below.
Number | Date | Country
---|---|---
63009282 | Apr 2020 | US