The present disclosure relates to machine learning techniques and, more specifically, to using a simulation model built with one machine learning technique to train another model that uses a different machine learning technique.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description section. This summary does not identify required or essential features of the claimed subject matter. The innovation is defined by the claims, and to the extent this Summary conflicts with the claims, the claims should prevail.
In general, some technologies described herein describe training a learning model using an optimizer.
In embodiments, a computer-enabled learning model training system is disclosed. The system comprises: a processor; a memory in operable communication with the processor; computing code associated with the processor configured to create a simulator trainer; an optimizer that determines initial node values for a simulator, the simulator comprising nodes with values; the simulator, which accepts an input time series from time t=(−n) to time t=(0) as input and outputs, for the nodes, an output time series from time t=(−n) to time t=(0); a reverser that reverses the input time series to run from time t=(0) to time t=(−n), producing a reversed input time series, and reverses the output time series to run from time t=(0) to time t=(−n); and a learning model that uses the reversed input time series as training input and uses selected values of the output time series at time t=(−n) as a ground truth for a cost function associated with the learning model.
In embodiments, the simulator is a heterogenous neural network.
In embodiments, the system further comprises a cost function determiner that uses selected node values from the output time series as an input into a cost function, wherein the cost function also uses the ground truth as input.
In embodiments, a cost derived from the cost function is used by the optimizer to determine subsequent initial node values.
In embodiments, the system further comprises an iterator that iteratively runs the optimizer, the simulator, and the learning model until a stop state is reached.
In embodiments, when the stop state is reached, the initial node values are used as input into a starting state estimation simulation.
In embodiments, the starting state estimation simulation is run from time t=(−n) to time t=(0); wherein a state simulation is then run from time t=(0) to time t=(m); the state simulation produces an output that can be used to produce a control sequence, and wherein the control sequence is used to run a device modeled by the state simulation.
In embodiments, the learning model is a neural network.
In embodiments, the neural network is a Recurrent Neural Network.
In embodiments, a computer-enabled method to train a learning model using an optimizer model implemented in a computing system is disclosed, the computing system comprising one or more processors and one or more memories coupled to the one or more processors, the one or more memories comprising computer-executable instructions for causing the computing system to perform operations comprising: running an optimizer to determine initial simulator node values; running a simulator using inputs and the initial simulator node values, producing simulator outputs; comparing selected node values from the simulator outputs to desired node values, producing a cost; reversing the selected node values, producing reversed selected node values; reversing the inputs of the simulator, producing a reversed simulator input; using the reversed selected node values and the reversed simulator input as training input into a learning model; and running the learning model.
In embodiments, running the learning model produces a reversed time series as learning model output.
In embodiments, the learning model output at time t=(−n) is compared with the initial simulator node values in a cost function.
In embodiments, the cost is derived from the cost function, and wherein the cost is used for backpropagation within the learning model.
In embodiments, the simulator is a heterogenous neural network.
In embodiments, the inputs comprise weather data over time.
In embodiments, the selected node values are temperatures of areas inside a space that the simulator is modeling.
In embodiments, reversing the inputs of the simulator comprises reversing a time series originally running from time t=(−n) to time t=(0) so that it runs from time t=(0) to time t=(−n), producing a reversed time series.
In embodiments, a computer-readable storage medium configured with instructions is disclosed, the instructions upon execution by one or more processors causing the one or more processors to perform a method for training a simulator, the method comprising: running an optimizer to determine initial simulator node values; running a simulator using inputs and the initial simulator node values, producing simulator outputs; comparing selected node values from the simulator outputs to desired node values, producing a cost; reversing the selected node values, producing reversed selected node values; reversing the inputs of the simulator, producing a reversed simulator input; using the reversed selected node values and the reversed simulator input as training input into a learning model; and running the learning model.
In embodiments, the learning model output at time t=(−n) is compared with the simulator output at time t=(0) for a learning model cost function, and wherein a cost derived from the learning model cost function is used for backpropagation within the learning model.
In embodiments, the learning model is a Recurrent Neural Network.
These, and other, aspects of the invention will be better appreciated and understood when considered in conjunction with the following description and the accompanying drawings. The following description, while indicating various embodiments and numerous specific details thereof, is given by way of illustration and not of limitation. Many substitutions, modifications, additions, or rearrangements may be made within the scope of the embodiments, and the embodiments include all such substitutions, modifications, additions, and rearrangements.
Non-limiting and non-exhaustive embodiments of the present embodiments are described with reference to the following FIGURES, wherein like reference numerals refer to like parts throughout the various views unless otherwise specified.
Corresponding reference characters indicate corresponding components throughout the several views of the drawings. Skilled artisans will appreciate that elements in the FIGURES are illustrated for simplicity and clarity and have not necessarily been drawn to scale. For example, the dimensions of some of the elements in the figures may be exaggerated relative to other elements to help to improve understanding of various embodiments. Also, common but well-understood elements that are useful or necessary in a commercially feasible embodiment are often not depicted in order to facilitate a less obstructed view of these various embodiments.
Disclosed below are representative embodiments of methods, computer-readable media, and systems having particular applicability to systems and methods for warming up a simulation. Described embodiments implement one or more of the described technologies.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present embodiments. It will be apparent, however, to one having ordinary skill in the art that these specific details need not be employed to practice the present embodiments. In other instances, well-known materials or methods have not been described in detail in order to avoid obscuring the present embodiments. Reference throughout this specification to “one embodiment”, “an embodiment”, “one example”, or “an example” means that a particular feature, structure, or characteristic described in connection with the embodiment or example is included in at least one embodiment of the present embodiments. Thus, appearances of the phrases “in one embodiment”, “in an embodiment”, “one example” or “an example” in various places throughout this specification are not necessarily all referring to the same embodiment or example. Modifications, additions, or omissions may be made to the systems, apparatuses, and methods described herein without departing from the scope of the disclosure. For example, the components of the systems and apparatuses may be integrated or separated. Moreover, the operations of the systems and apparatuses disclosed herein may be performed by more, fewer, or other components, and the methods described may include more, fewer, or other steps. Additionally, steps may be performed in any suitable order.
For convenience, the present disclosure may be described using relative terms including, for example, left, right, top, bottom, front, back, upper, lower, up, and down, as well as others. It is to be understood that these terms are merely used for illustrative purposes and are not meant to be limiting in any manner.
In addition, it is appreciated that the figures provided herewith are for explanation purposes to persons ordinarily skilled in the art and that the drawings are not necessarily drawn to scale. To aid the Patent Office and any readers of any patent issued on this application in interpreting the claims appended hereto, applicants may wish to note that they do not intend any of the appended claims or claim elements to invoke 35 U.S.C. 112(f) unless the words “means for” or “step for” are explicitly used in the particular claim.
Embodiments in accordance with the present embodiments may be implemented as an apparatus, method, or computer program product. Accordingly, the present embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.), or an embodiment combining software and hardware aspects that may be referred to as a “system.” Furthermore, the present embodiments may take the form of a computer program product embodied in any tangible medium of expression having computer-usable program code embodied in the medium.
Any combination of one or more computer-usable or computer-readable media may be utilized. For example, a computer-readable medium may include one or more of a portable computer diskette, a hard disk, a random access memory (RAM) device, a read-only memory (ROM) device, an erasable programmable read-only memory (EPROM or Flash memory) device, a portable compact disc read-only memory (CDROM), an optical storage device, and a magnetic storage device. Computer program code for carrying out operations of the present embodiments may be written in any combination of one or more programming languages.
The flowchart and block diagrams in the FIGURES illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present embodiments. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It will also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, may be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or combinations of special-purpose hardware and computer instructions. These computer program instructions may also be stored in a computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks.
“Optimize” means to improve, not necessarily to perfect. For example, it may be possible to make further improvements in a value or an algorithm which has been optimized.
As used herein, the terms “comprises,” “comprising,” “includes,” “including,” “has,” “having,” or any other variation thereof, are intended to cover a non-exclusive inclusion. For example, a process, article, or apparatus that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such process, article, or apparatus.
Further, unless expressly stated to the contrary, “or” refers to an inclusive or and not to an exclusive or. For example, a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present). “Program” is used broadly herein, to include applications, kernels, drivers, interrupt handlers, firmware, state machines, libraries, and other code written by programmers (who are also referred to as developers) and/or automatically generated. “Determine” means to get a good idea of, not necessarily to achieve the exact value. For example, it may be possible to make further improvements in a value or algorithm which has already been determined.
Additionally, any examples or illustrations given herein are not to be regarded in any way as restrictions on, limits to, or express definitions of any term or terms with which they are utilized. Instead, these examples or illustrations are to be regarded as being described with respect to one particular embodiment and as being illustrative only. Those of ordinary skill in the art will appreciate that any term or terms with which these examples or illustrations are utilized will encompass other embodiments which may or may not be given therewith or elsewhere in the specification and all such embodiments are intended to be included within the scope of that term or terms. Language designating such nonlimiting examples and illustrations includes, but is not limited to: “for example,” “for instance,” “e.g.,” and “in one embodiment.”
A “cost function,” generally, is a function that determines how close a simulation model answer is to the desired answer—the ground truth. That is, it quantifies the error between the predicted value and the desired value. This cost function returns a cost. The cost function may use a least squares function, a Mean Error (ME), Mean Squared Error (MSE), Mean Absolute Error (MAE), a Categorical Cross Entropy Cost Function, a Binary Cross Entropy Cost Function, and so on, to arrive at the answer. In some implementations, the cost function is a loss function. In some implementations, the cost function is a threshold, which may be a single number that indicates the simulated truth curve is close enough to the ground truth. In other implementations, the cost function may be a slope. The slope may also indicate that the simulated truth curve and the ground truth are of sufficient closeness. When a cost function is used, it may be time variant. It also may be linked to factors such as user preference, or changes in the physical model. The cost function applied to the simulation engine may comprise models of any one or more of the following: energy use, primary energy use, energy monetary cost, human comfort, the safety of building or building contents, the durability of building or building contents, microorganism growth potential, system equipment durability, system equipment longevity, environmental impact, and/or energy use CO2 potential. The cost function may utilize a discount function based on discounted future value of a cost. In some embodiments, the discount function may devalue future energy as compared to current energy such that future uncertainty is accounted for, to ensure optimized operation over time. The discount function may devalue the future cost function of the control regimes, based on the accuracy or probability of the predicted weather data and/or on the value of the energy source on a utility pricing schedule, or the like. A cost may be derived from a cost function. This cost may be a single number. A “goal function” may read in a cost (a value from a cost function) and determine whether that cost meets criteria such that a goal has been reached, such that the simulation iterations stop. Such criteria may be the cost reaching a certain value, being higher or lower than a certain value, being between two values, etc. A goal function may also look at the time spent running the simulation model overall and/or how many iterations have been made to determine if the goal function has been met.
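By way of illustration, the following is a minimal Python sketch of a cost function and a goal function of the kinds described above. The function names, the choice of Mean Squared Error, and the threshold and iteration-budget values are illustrative assumptions, not requirements of the embodiments.

import numpy as np

def mse_cost(simulated_curve: np.ndarray, ground_truth: np.ndarray) -> float:
    """Mean Squared Error between the simulated truth curve and the ground truth."""
    return float(np.mean((simulated_curve - ground_truth) ** 2))

def goal_reached(cost: float, threshold: float = 0.25,
                 iterations: int = 0, max_iterations: int = 10_000) -> bool:
    """Goal function: stop when the cost is low enough or the iteration budget is spent."""
    return cost <= threshold or iterations >= max_iterations

Other cost functions listed above (ME, MAE, cross entropy, and so on) could be substituted for the MSE without changing the surrounding flow.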
A “machine learning algorithm” or “optimization method” is used to determine the next set of inputs after running a simulation model. These machine learning algorithms or optimization methods may include Gradient Descent, methods based on Newton's method, inversions of the Hessian using conjugate gradient techniques, evolutionary computation such as Swarm Intelligence, Bee Colony optimization, self-organizing migrating algorithm (SOMA), Particle Swarm, non-linear optimization techniques, and other methods known by those of skill in the art. A “state” as used herein may be Air Temperature, Radiant Temperature, Atmospheric Pressure, Sound Pressure, Occupancy Amount, Indoor Air Quality, CO2 concentration, Light Intensity, or another state that can be measured and controlled.
The deep physics networks used herein are a type of structure similar to neural networks. But unlike the homogeneous activation functions of neural nets, each neuron comprises unique physical equations (for the equipment model) or resistance/capacitance values (for the building model). Once configured, known sensors are fed into their corresponding nodes in the network. Once the network is trained, any location in the thermodynamic system can be queried to extract data about the model at that point. The figure “Possible Equipment Model Implementation” shows one portion of a database structure that might hold queryable data for an equipment model. Querying a model can also be called introspecting. Similar data structures exist for building models. This process provides powerful generalized data fusion, data synthesis, and quality assessment through inference even where no sensors exist—for any thermodynamic system. The same mechanism enables model optimization, and time series generated from the models can then be used for real-time sequence generation and fault detection. To automate a structure, a digital twin version of the structure (the structure simulation) is created. A matching digital twin version of the equipment in the building (the equipment simulation) is created as well. The building model comprises nodes that represent the individual material layers of the building and their resistance and capacitance. These are formed into parallel and branchless neural network strings that propagate heat (or other state values) through them. The equipment model comprises nodes that represent equipment, their connections, and outside influences on the equipment, such as weather. Nodes have physics equations that describe equipment state change. Equipment nodes may also have state input(s) and state output(s), state parameters with values, allowable state parameter values, state input location data, and state output location data. The location data can be cross-referenced to the thermodynamic building model locations. In embodiments, the equipment nodes may form control loops. These nodes' inputs and outputs, along with the connections between the equipment, form a heterogenous neural network. State information flows through the model following physical rules.
Conceptually, running a structure simulation model comprises inserting some amount of some measurable state into the building. This can be temperature, humidity, acoustic, vibration, magnetic, light, pressure, moisture, etc. This state then propagates through the different layers (brick, insulation, drywall, etc.), affecting the structure and heating up rooms, as represented by inside nodes. This illustrative example uses temperature to describe aspects of the systems and methods. An outside node is associated with a time heat curve T. The curve value at time T1 is injected into one or more outside nodes. This temperature is propagated through the building (by the values of the nodes of the simulation neural nets). The outside of the building (say, brick) has its temperature modified by the outside node and known aspects of how heat transfers through brick, as found in the brick node. The brick then heats up the next layer, perhaps insulation. This continues throughout the building until an inside node is reached. At inside nodes, other functions can be applied, such as those that represent lighting and people (warmth) within the zone. State information continues to propagate until another outside layer is reached, with individual node parameter values representing the heating present in the building at time T1. In some embodiments, each outer surface has its own time temperature curve. In some embodiments a building is deconstructed into smaller subsystems (or zones), so rather than propagating temperature through the entire structure, only a portion of the structure is affected by a given input. In some implementations, the digital twin models are built on a controller that is associated with the building being controlled. In some instances, the controller is embedded in the controlled building and is used to automate the building. A controller may comprise a simulation engine that itself comprises a model of the controlled system the controller is in, or a model of the equipment in the controlled system the controller is in. This model may be called the “physical model.” This physical model may itself comprise past regressions and a cost function. The past regressions are instances of these models being run in the past and their results. The controlled system has at least one sensor whose value can be used to calibrate the physical model(s) by checking how close the model value at the sensor location is to the simulated sensor value equivalent in the physical models. A cost function may be used to determine the distance between the sensor value and the simulated sensor value equivalent. This information can then be used to refine the physical models. This controller-controlled system loop may be implemented without use of the internet. The controller may control and/or run a Local Area Network (LAN) with which it talks to sensors and other resources. The controller may be hardwired into the sensors and other resources, or there may be a combined system, with some resources hardwired and other resources which connect to the LAN. A “simulation model” may be a resource model or a building model.
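As a rough illustration of the layer-by-layer propagation described above, the following Python sketch treats a single one-dimensional wall string (brick, insulation, air gap, drywall) as four nodes, each with an assumed thermal resistance R and heat capacity C. All names and numeric values are illustrative assumptions, not values from the disclosure.

import numpy as np

R = np.array([0.05, 0.50, 2.00, 0.30])      # thermal resistance into each layer node
C = np.array([1.0e5, 8.0e4, 4.0e4, 6.0e4])  # heat capacity of each layer node
dt = 60.0                                   # simulation time step, seconds
temps = np.full(4, 20.0)                    # layer temperatures at the start, deg C

def step(temps, outside_temp):
    """Advance one time step: heat flows between adjacent nodes through R and is stored in C."""
    t = np.concatenate(([outside_temp], temps))
    flow_in = (t[:-1] - t[1:]) / R                   # heat flowing into each node
    flow_out = np.concatenate((flow_in[1:], [0.0]))  # heat passed on to the next node
    return temps + dt * (flow_in - flow_out) / C

for outside_temp in (5.0, 4.8, 4.5):        # a few points of the time heat curve T
    temps = step(temps, outside_temp)

The last node's temperature over time would serve as an inside node value; in a full model, each wall, window, floor, and ceiling would contribute its own string of nodes.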
The technical character of embodiments described herein will be apparent to one of ordinary skill in the art, and will also be apparent in several ways to a wide range of attentive readers. Some embodiments address technical activities that are rooted in computing technology, such as determining more efficient ways to perform simulations. These simulations may simulate energy flow in a building or other structure. The energy flow simulation may produce control sequences that allow a building to run with much more energy efficiency. The simulations themselves may run much more quickly by being warmed up to a reasonable starting value, and by performing that warmup very quickly, making more efficient use of the computer processor and memory. Other advantages based on the technical characteristics of the teachings will also be apparent to one of skill from the description provided.
To perform an informative building simulation starting at t=0 and going into the future, a simulation of the building itself cannot be in a random state. To see why, consider the following example. Imagine starting a simulation with every node in the building at −100 degrees C. Rather than a simulation of such a building being predictive of future reality, any reasonably long simulation would spend the entire simulation time heating up to ambient temperature. Furthermore, the output of the nodes representing zone temperatures would be useless lines starting at −100 degrees C., and then steadily heating up. The actual building, from time t=0 on, will consist of precise oscillations of temperature based on weather inputs, building inter-dynamics and human interaction.
Setting all the nodes to ambient temperature at t=0 also fails as the dynamics of a building consist of complex relationships, which include resistance of materials, capacitance of masses, solar absorption, etc. The various zones in a building are never all the same, and their relationships are complex. Starting all nodes in a model at the same temperature is too simplified to get meaningful simulator output from t=0 on. Again, it might take the entire simulation before the dynamics of the building have had time to stabilize and reflect reality.
Even when historical temperature readings exist, only a small subset of nodes represent an area in the building with a temperature sensor. Empirical historical data may only exist for two or three percent of the locations in a building represented by nodes. Therefore, an estimate from such sparse data may lead to most nodes having values very far off from their corresponding actual values. Again, it might take the entire simulation before the dynamics of the building have had time to overwrite the estimation errors and reflect reality. Running the simulation for a longer time may be very time- and resource-consuming. Such models are by their very nature users of massive amounts of computer resources and time. The techniques disclosed herein use the little empirical data available to infer correct temperatures at t=0 for the unknown nodes in a building (or similar structure) while reducing use of valuable computer resources. The state estimation simulation methods and systems taught here run much more quickly to achieve a reasonable starting point than running a simulation for an extended period of time.
Learning models are very powerful at solving difficult problems. However, such models must be trained with many sets of training data before they are able to offer a reasonable solution. Acquiring sufficient training examples is often difficult, and sometimes insurmountable. Developing a training set involves acquiring and/or generating many examples of input data that can then generate meaningful output data. One way to do this is to use as training data synthetic or actual examples of data from problems that the learning model is designed to solve. The learning model should preferentially be trained on real-world data, if possible, thus receiving a representative training set. When modeling buildings, however, data can only be gathered in real time; a single year-long data set takes a year to generate. Providing synthetic data produces its own sort of problems. When synthetic data is used, the learning model has a difficult time giving accurate answers to real-world data, as the real world includes examples that were not generated by the synthetic data generation procedure—the data is either overfitted (noise is misinterpreted as parameters) or underfitted (parameters are missed). This problem is so severe that “[i]t is quite common to invest days to months of time on hundreds of machines in order to solve even a single instance of the neural network training problem.” Goodfellow et al., “Deep Learning (Adaptive Computation and Machine Learning series)”, MIT Press, Nov. 18, 2016, p. 274. Another difficult-to-overcome problem is the large number of training sets required to generate usable solutions. As the number of parameters in a model rises, the amount of training data needed increases exponentially to achieve a stable solution. The training data must also be representative of the data the model will actually encounter, or the results will be biased in a way that is difficult to ascertain. These models are also computationally intensive, such that models can require very expensive equipment to run, require very large amounts of time, and so on. This is made more difficult by the nature of real-world data-gathering. For example, as mentioned earlier, a single year-long data set for a building takes a year to accumulate.
To train a learning model using an optimizer, we start by using an optimizer to iteratively determine reasonable node starting values for a building simulation. Thousands of simulations may be run during the process of optimization, each with an input and output set. Each of the optimizer input and output sets is then used for training a learning model. As the simulation being discussed uses physics and a thorough understanding of the modeled structure to produce its results, the input and output from the optimizer simulations can be used as real-world data sets. After training, ideally, the learning model would be able to produce outputs similar to those achieved by the optimizer-simulator optimization even when given unique problems. Even though the learning model can be considered trained, the optimizer-simulator, at a minimum, continues to be used to provide sanity checks on the results of the learning model, ensuring that the learning model continues to provide accurate answers. This prevents, among other things, over- and under-fitting engendered by, e.g., atypical scenarios. This also greatly reduces the computational power required to run any given model: training sets are generated automatically rather than using extra computing power, the trained learning model requires much less computational power and time to run than the optimizer, and so on.
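The following Python sketch illustrates the cycle just described at a high level, reusing step() and mse_cost() from the sketches above, with simple random search standing in for the optimizer. Every optimizer iteration doubles as one training example for the learning model. The synthetic curves, the random-search strategy, and all values are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
weather = 5.0 + 3.0 * np.sin(np.linspace(0.0, 6.0, 300))        # outside temps, t(-300)..t(0)
ground_truth = 21.0 + 0.5 * np.sin(np.linspace(0.0, 6.0, 300))  # measured room temps

def simulate(initial_temps):
    """Run the toy simulator; return the room-node curve from t(-300) to t(0)."""
    temps = initial_temps.copy()
    room_curve = []
    for w in weather:
        temps = step(temps, w)        # step() from the wall sketch above
        room_curve.append(temps[-1])  # the last node stands in for a room sensor
    return np.array(room_curve)

best_cost, best_init, training_sets = np.inf, None, []
for i in range(200):                           # optimizer iterations
    init = rng.uniform(10.0, 30.0, size=4)     # proposed initial node values
    room_curve = simulate(init)
    cost = mse_cost(room_curve, ground_truth)
    if cost < best_cost:
        best_cost, best_init = cost, init
    # every run becomes one training example: reversed curves in, initial values out
    training_sets.append((weather[::-1], room_curve[::-1], init))

A production embodiment would replace random search with one of the optimization methods listed earlier (gradient descent, particle swarm, and so on); the structure of the loop is the same.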
More specifically, in an embodiment, a state that can be measured (or at least determined) within a structure being modeled is chosen. In a building being modeled for HVAC control, this may be room temperature, which will be used in this embodiment. Rooms often have thermometers installed within them, and even when they do not, it is easy enough to measure temperature within a room over time. This chosen state (e.g., temperature) is then measured for some period of time in the rooms, giving a state-time curve. These temperature state-time curves are then used as the “ground truth” for an initial value simulation. That is, the optimizer modifies beginning values within the simulator in an attempt to match the “ground truth” (the temperature time curves within rooms in the building) throughout the simulation. Before going further with our example, the makeup of the simulation model will be addressed. A digital twin simulation may break down the structure being modeled into nodes (described with more specificity with reference to
To determine how a building naturally warms up, ideally, the temperatures of rooms should be taken at the same time that the outside weather is being measured. This measured weather may then be used as input into the optimizer simulation models. The optimizer chooses temperature values (in this embodiment) for each of the nodes in the simulator. The simulation is then run. Running the simulation consists of the simulation applying the weather state inputs to the outside, and then letting the weather state (e.g., temperature) percolate through the structure for the simulation time. For the first time through, the optimizer may choose its node values at random, may use an earlier model's node values, etc. At the end of the simulation, the nodes representing room temperature are checked against the desired room temperature, and the optimizer then chooses new values for the nodes in the simulation. The simulation is run multiple times, with the optimizer choosing new node values each time, until an optimal output is reached. This optimal output gives initial node values for a state estimation simulation to run, using the same simulator. At the end of the state estimation simulation, the simulation is considered to have reasonable temperatures within it, and so is ready to run a simulation.
At the same time this is happening, the learning model is being trained. The learning model may be thought of as running a backward version of the optimizer simulation: given the output desired (the desired temperature time curves for a structure) and outside state (weather) information, it should return initial value digital twin simulation node values. That is, when the simulation is run with the learning-model-chosen node values (for the desired time), the simulation model will be considered to have a reasonable state—it will be warmed up. How does it do that? The learning model uses the reversed output of the optimizer simulation as input, and then produces reversed node state values as output. Specifically, the learning model uses as input the ground truth (in our current example, room temperature time value curves) and the weather time value curves that the optimizer uses for its simulation runs. Only, the learning model flips these time curves around backward. If the curves initially ran from time −300 to 0, then they will run from 0 to −300, producing output from 0 to time −300.
More specifically, the weather state time series (reversed) and the desired node value subset time series (reversed) are fed into the learning model as input. The learning model is set up so that it has the same number of output nodes as the simulator has nodes. These node values at the end of the learning model run are then checked against the starting values of the optimizer simulation run, i.e., the desired values. The cost (the difference between the two values) is then used to update the learning model using backpropagation or a similar learning method. This creates, in effect, a reverse building simulator, in that it is given the end values desired and produces the starting values that will arrive at those end values. With each training set, the learning model gets slightly more accurate. As a single optimization cycle can produce thousands of training sets, the learning model may be able to be trained relatively quickly. Furthermore, as the data sets use not only actual data to be solved, but also actual solutions, the under- or over-fitting of training data is ameliorated.
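A single training step of the kind just described might look like the following sketch, which assumes PyTorch and a GRU as the recurrent learning model. The framework, the shapes, and the four-node output head are illustrative assumptions; the disclosure does not prescribe a particular learning model implementation.

import torch
import torch.nn as nn

model = nn.GRU(input_size=2, hidden_size=32, batch_first=True)  # recurrent learning model
head = nn.Linear(32, 4)                     # one output per simulator node (4 assumed)
opt = torch.optim.Adam(list(model.parameters()) + list(head.parameters()))
loss_fn = nn.MSELoss()

def train_step(rev_weather, rev_room_curve, initial_node_values):
    """Reversed curves in; predicted node values at t(-n) out; the cost drives backpropagation."""
    seq = torch.stack([rev_weather, rev_room_curve], dim=-1).unsqueeze(0)  # shape (1, T, 2)
    hidden, _ = model(seq)
    pred = head(hidden[:, -1])              # the end of the reversed run corresponds to t(-n)
    loss = loss_fn(pred, initial_node_values.unsqueeze(0))
    opt.zero_grad()
    loss.backward()                         # backpropagation updates the learning model
    opt.step()
    return loss.item()

One practical note: a reversed NumPy slice such as weather[::-1] has a negative stride, so it must be copied (e.g., weather[::-1].copy()) before being converted to a tensor.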
This method is highly counterintuitive relative to current uses of learning models. At the intuitive level, cause and effect are reversed here. Simulators attempt to represent an evolution of cause-and-effect dynamics going forward. Going backward is impossible (in some situations) as one cannot rely on causality. As an example, if pool balls are in the starting triangle, with the white ball moving towards them, a simulator can be written to tell how the collisions and movement will unfold, with a reasonable amount of accuracy. But if the balls are started in random locations, a simulator cannot be written that will tell the positions of the balls before the last shot. This is because they could have been in any number of positions, including the starting triangle formation, but there is no way to know which. It is easy to simulate forward, but the backward case cannot be simulated within our lifetimes. One of the reasons for this is the Second Law of Thermodynamics. Entropy in a system always increases until it reaches equilibrium. In some situations, once it has reached equilibrium, all information content about earlier system states has been erased. If that is true, nothing can be known about what the system looked like in the past with even the most clever simulator. For example, imagine two rods dipped into a single tank of room-temperature water. The rod on the left is hot, and the rod on the right is room-temperature. The rod on the left will bleed heat into the water (and the other rod). After a long enough wait, the two rods will be the same temperature. If someone walked into the room at that point and were asked which rod was initially hot, they would have no way to know. All information about earlier states has been erased. A reverse simulator that would tell you which rod was hot could not be written. As another example, imagine a room-temperature pot of water on a stove. Over a ten-minute period, the burner is turned on, for random spurts of time, to random heat settings, while a log of the burner actions is kept. At the end of the ten minutes, the temperature of the water is at a certain value. At this point, the entries in the log are unrecoverable. Only the total amount of energy injected into the water can be determined. The exact sequence and strength of ‘signals’ in the log cannot be reconstructed. That information has been lost, as though the evolution of the water's thermodynamics acted as a lossy low-pass filter.
Here, in spite of all odds, running backward works anyway. The learning model sees the simulated structure's behavior over and over, so it can learn to leverage what information it has much more effectively than a pure backwards-physics simulator, which would be severely hindered by the information-theory issues spelled out above. In the absence of the ability to deterministically simulate, it can learn from past experience, using the output from the optimizing simulator as training models.
To use a simulator 110 to develop training models for a learning model 125, the simulator 110 first runs the digital twin for a period of time—e.g., from a time prior to the simulation, such as t(−300) (representing 300 time units before where the actual simulation will take place), to t(0) (the time that the simulation starts). The digital twin may be a heterogenous node system. This heterogenous node system may be a heterogenous neural network. The inputs 105 may represent outside state that will be presented to the digital twin and used by the optimizer, such as weather, humidity, vibration, magnetic fields, light, pressure, moisture, etc., which affect a building over time. This state then may propagate through the different layers, affecting the structure over the time of the simulation. The inputs may be state curves that run for the same amount of time as the simulation itself; for our current example, from t(−300) to t(0). The simulator 110 itself may model a digital twin of the building described as nodes. These nodes may represent various building chunks. The constitution of the chunks depends on the specific implementation. In some cases, the chunks represent large-scale structures in a building such as rooms; in some implementations, the chunks represent building portions such as walls, windows, floors, and ceilings; in some representations, the chunks represent individual components of a building such as layers in a wall; e.g., a wall may be modeled as a specific type of drywall, followed by a specific type of insulation, followed by a specific layer of drywall, and so on. Some implementations may use both chunks representing large-scale structures and individual components, etc. The node values may then be used to determine building thermodynamic behavior. The nodes are heterogenous, as individual nodes may represent different building chunks, and so may have different properties. The properties may be state values. A “state” as used herein may be Air Temperature, Radiant Temperature, Atmospheric Pressure, Sound Pressure, Occupancy Amount, Indoor Air Quality, CO2 concentration, Light Intensity, or another state that can be measured and controlled. The properties may also be values associated with, e.g., the chunk of the building that the node represents, in some embodiments. Properties associated with equipment may also be used in the nodes, when appropriate, etc. For example, nodes may have properties such as heat capacity rates, efficiency, specific enthalpy, resistance, and so on. These properties may be used in equations to determine how state changes over time. An example of a node is shown with reference to
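As a data-structure sketch, a simulator node of the kind described might be represented as follows; the field names and default values are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class Node:
    name: str                          # e.g., "south-wall brick layer"
    resistance: float                  # thermal resistance toward the next chunk
    capacitance: float                 # heat capacity of the building chunk
    state: dict = field(default_factory=lambda: {"air_temperature": 20.0})
    location: tuple = (0.0, 0.0, 0.0)  # cross-reference into the building model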
The inputs 105 may be presented to the locations within the heterogenous node model that correspond to the areas they will affect in the building. So, e.g., weather inputs may be presented to nodes representing outside walls, sun inputs may be presented to nodes representing windows facing toward the sun, and so on. The optimizer produces starting node state values as inputs 105 for the initial simulation starting time. After the simulation has been run with the optimizer starting values, output 115 is produced. The output is the state of the nodes in the simulation for the simulation time—state time curves. Selected node values, such as nodes that represent the temperature inside rooms, may be measured against a set of ground truth state curves. The ground truth state curves may be actual temperatures measured over time in spaces that are represented by the simulator. For example, the temperature of rooms in a building may be measured over time, at the same time that the temperature and other state values of the location are also measured. At the end of the simulation, the selected node values may be measured against a ground truth vector. These selected node values may be used in a cost function along with the ground truth values. The cost function may measure an output (e.g., simulated room temperature values) against the ground truth (e.g., historical room temperature values) to produce a cost. The optimizer may then use the cost to improve the initial values, at which point the optimizer is rerun with new initial values for the nodes in the simulator. Selected node values from the outputs, along with state data, are then run through the learning model as a new training example. This improvement may continue until a stopping state is reached. Once a stopping state is reached (that is, the simulator 110 (e.g., a heterogenous node system) has an optimized solution, a certain number of cycles have run, the model has run for a maximum time, etc.), that solution may be used as starting values for a starting state estimation—a warmup run for the simulation. This is described in greater detail with reference to
When the simulator output 115 is used for learning model input, selected node values from the simulator—which may be for the simulation time (e.g., time t(−300) to t(0))—are reversed, such that they run from time t(0) to t(−300), i.e., backwards. This reversed output 130 is then used as input 135 for the learning model 125. Similarly, a portion of the simulation inputs becomes the ground truth to which the learning model outputs 140 are compared. The learning model may have a node that parallels each node (or most of the nodes) in the simulation 110. The simulation may be a heterogenous node simulation. The inputs used by the simulator model, which have been similarly reversed 120, may also be used as input into the learning model 125. The learning model may be set up such that it outputs 140 state over time, such as values that represent inside temperatures in rooms in a building. These state time curves are in reverse time order. The original output at the beginning of the simulation model run (e.g., in our example, at t(−300)) is then used as the ground truth. The ground truth is compared with the learning model output in a cost function to produce a cost. That cost may then be used for backpropagation within the learning model to improve the output of the model. According to the Wikipedia entry “Backpropagation”: “The back propagation algorithm works by computing the gradient of the loss function with respect to each weight by the chain rule, computing the gradient one layer at a time, iterating backward from the last layer to avoid redundant calculations of intermediate terms in the chain rule.” Backpropagation techniques are known by those of skill in the art. When the learning model is run with input and output from the optimizer simulation, the learning model gets incrementally better at producing an outcome closer to that generated by the optimizer—it is trained.
With reference to
A computing environment may have additional features. For example, the computing environment 300 includes storage 340, one or more input devices 350, one or more output devices 355, one or more network connections (e.g., wired, wireless, etc.) 360 as well as other communication connections 370. An interconnection mechanism (not shown) such as a bus, controller, or network interconnects the components of the computing environment 300. Typically, operating system software (not shown) provides an operating environment for other software executing in the computing environment 300, and coordinates activities of the components of the computing environment 300. The computing system may also be distributed, running portions of the software on different CPUs.
The storage 340 may be removable or non-removable, and includes magnetic disks, magnetic tapes or cassettes, CD-ROMs, CD-RWs, DVDs, flash drives, or any other medium which can be used to store information and which can be accessed within the computing environment 300. The storage 340 stores instructions for the software, such as software 385 to implement systems and methods of warming up simulation models that rely on state being at a reasonable value.
The input device(s) 350 may be a device that allows a user or another device to communicate with the computing environment 300, such as a touch input device (e.g., a keyboard, mouse, pen, or trackball), a video camera, a microphone, a digital camera, a scanning device such as a digital camera with a scanner, a touchscreen, a joystick controller, a Wii remote, or another device that provides input to the computing environment 300. For audio, the input device(s) 350 may be a sound card or similar device that accepts audio input in analog or digital form, or a CD-ROM reader that provides audio samples to the computing environment. The output device(s) 355 may be a display, a hardcopy-producing output device such as a printer or plotter, a text-to-speech voice-reader, a speaker, a CD-writer, or another device that provides output from the computing environment 300.
The communication connection(s) 370 enable communication over a communication medium to another computing entity. The communication medium conveys information such as computer-executable instructions, compressed graphics information, or other data in a modulated data signal. Communication connections 370 may comprise input devices 350, output devices 355, and input/output devices that allow a client device to communicate with another device over network 360. A communication device may include one or more wireless transceivers for performing wireless communication and/or one or more communication ports for performing wired communication. These connections may include network connections, which may be a wired or wireless network such as the Internet, an intranet, a LAN, a WAN, a cellular network, or another type of network. It will be understood that network 360 may be a combination of multiple different kinds of wired or wireless networks. The network 360 may be a distributed network, with multiple computers, which might be building controllers, acting in tandem. A communication connection 370 may be a portable communications device such as a wireless handheld device, a personal electronic device, etc.
Computer-readable media are any available non-transient tangible media that can be accessed within a computing environment. By way of example, and not limitation, with the computing environment 300, computer-readable media include memory 320, storage 340, communication media, and combinations of any of the above. Computer-readable storage medium 365, which may be used to store computer-readable media, comprises instructions 375 and data 380. Data sources may be computing devices, such as general hardware platform servers configured to receive and transmit information over the communications connections 370. The computing environment 300 may be an electrical controller that is directly connected to various resources, such as HVAC resources, and which has CPU 310, a GPU 315, Memory 320, input devices 350, communication connections 370, and/or other features shown in the computing environment 300. The computing environment 300 may be a series of distributed computers. These distributed computers may comprise a series of connected electrical controllers.
Although the operations of some of the disclosed methods are described in a particular, sequential order for convenient presentation, it should be understood that this manner of description encompasses rearrangement, unless a particular ordering is required by specific language set forth below. For example, operations described sequentially can be rearranged or performed concurrently. Moreover, for the sake of simplicity, the attached figures may not show the various ways in which the disclosed methods, apparatus, and systems can be used in conjunction with other methods, apparatus, and systems. Additionally, the description sometimes uses terms like “determine,” “build,” and “identify” to describe the disclosed technology. These terms are high-level abstractions of the actual operations that are performed. The actual operations that correspond to these terms will vary depending on the particular implementation and are readily discernible by one of ordinary skill in the art.
Further, data produced from any of the disclosed methods can be created, updated, or stored on tangible computer-readable media (e.g., tangible computer-readable media such as one or more CDs, volatile memory components (such as DRAM or SRAM), or nonvolatile memory components (such as hard drives)) using a variety of different data structures or formats. Such data can be created or updated at a local computer or over a network (e.g., by a server computer), or stored and accessed in a cloud computing environment.
The simulator 110 may be a heterogenous neural network. This neural network may have activation functions in the nodes that perform separate calculations to determine the weight values exiting the node. These separate calculations may be thermodynamic calculations that determine how state flows through the node. States may be characterized as weights entering and exiting nodes. An example of such a heterogenous neural network is described in patent application Ser. No. 17/009,713, “Neural Network Methods For Describing System Topologies”, filed Sep. 1, 2020 and incorporated herein by reference in its entirety. A typical neural network comprises inputs, outputs, and hidden layers connected by edges which have weights associated with them. The neural net sums the weights of all the incoming edges, applies a bias, and then uses an activation function to introduce non-linear effects, which basically squashes or expands the weight/bias value into a useful range, often deciding whether the node will, in essence, fire or not. This new value then becomes a weight used for connections to the next hidden layer of the network. In such a typical network, the activation functions do not do separate calculations depending on the node, do not have physics equations associated with them, etc.
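The contrast might be sketched as follows: a conventional neuron squashes a weighted sum, while a node in a heterogenous network applies its own physics equation. Both functions below are illustrative assumptions.

import math

def standard_neuron(weights, inputs, bias):
    """Conventional activation: squash the weighted sum into a useful range."""
    return math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)

def brick_node_activation(t_outer, t_inner, r_brick=0.5):
    """Heterogenous 'activation': a physics equation. The outgoing weight is the
    conductive heat flow through a brick layer with thermal resistance r_brick."""
    return (t_outer - t_inner) / r_brick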
In some heterogenous neural networks, which might be used with embodiments described herein, the fundamentals of physics are utilized to model single components or pieces of equipment on a one-to-one basis with neural net nodes. When multiple components are linked to each other in a schematic diagram, a neural net is created that models the components as nodes. The values between the objects flow between the nodes as weights of connected edges. These neural nets may model not only the real complexities of systems but also their emergent behavior and the system semantics. Therefore, they bypass two major steps of the conventional AI modeling approaches: determining the shape of the neural net, and training the neural net from scratch. The nodes are arranged in the order of an actual system (or set of equations), the nodes themselves comprise an equation or a series of equations that describe the function of their associated object, and certain relationships between the nodes are determined by their location in the neural net. Therefore, a huge portion of training is no longer necessary, as the neural net itself comprises location information, behavior information, and interaction information between the different objects represented by the nodes. Further, the values held by nodes in the neural net at given times represent real-world behavior of the objects so represented. The neural net is no longer a black box but itself contains important information. This neural net structure also provides much deeper information about the systems and objects being described. Since the neural network is physics- and location-based, unlike the conventional AI structures, it is not limited to a specific model, but can run multiple models for the system that the neural network represents without requiring separate creation or training.
In some embodiments, the heterogenous neural network shapes the location of the nodes to tell something about the physical nature of the system. It may also place actual equations into the activation function. The weights that move between nodes may be equation variables. Different nodes may have unrelated activation functions, depending on the nature of the model being represented. In an exemplary embodiment, each activation function in a neural network may be different. For example, a pump could be represented in a neural network as a series of network nodes, some of which represent efficiency, energy consumption, pressure, etc. The nodes will be placed such that one set of weights (variables) feeds into the next node (e.g., with an equation as its activation function) that uses those weights (variables). Now, two previously required steps, shaping the neural net and training the model, may already be performed, at least in part. Using embodiments discussed here, the neural net model need not be trained on information that is already known. It still needs to be trained on other information, such as is detailed here.
In some embodiments, the individual nodes represent physical representations of chunks of building material within a structure, equipment, etc. These individual nodes may hold parameter values that help define the physical representation. As such, when the neural net is run, the parameters helping define the physical representation can be tweaked to more accurately represent the given physical representation. This has the effect of pre-training the model with a qualitative set of guarantees, as the physics equations that describe objects being modeled are true, which saves having to find training sets and using huge amounts of computational time to run the training sets through the models to train them. A model does not need to be trained with information about the world that is already known. With objects connected in the neural net like they are connected in the real world, emergent behavior arises in the model that maps to the real world. This model behavior that is uncovered is otherwise too computationally complex to determine. Further, the nodes represent actual objects, not just black boxes. The behavior of the nodes themselves can be examined to determine behavior of the object, and can also be used to refine the understanding of the object behavior.
Conceptually, optimizing a structure simulation model comprises inserting some amount of some measurable state into a structure that may be modeled by the heterogenous network. This can be temperature, humidity, acoustic, vibration, magnetic, light, pressure, moisture, etc. This state then propagates through the different layers, affecting the structure. The input data 505 may be such state represented by a time curve. As an example, an outside node in the heterogenous network may be associated with a time heat curve T. The curve value at time t(−n) is injected into one or more outside nodes. This temperature is propagated through the building, e.g., through the represented layers (brick, insulation, drywall, etc.), by modifying values of the nodes of the simulation neural nets using physics representations. The outside of the building (say, brick) has its temperature modified by the outside node and known aspects of how heat transfers through brick, as found in the brick node. The brick then heats up the next layer, perhaps insulation. This continues throughout the building until an inside node is reached. This may represent the space inside a room. At inside nodes, other functions can be applied, such as those that represent lighting and people (warmth) within the zone of the node. State information continues to propagate until another outside layer is reached, with individual node parameter values representing the heating present in the building at time t(0). In some embodiments, each outer surface has its own time temperature curve. In some embodiments a structure is deconstructed into smaller subsystems (or zones), so rather than propagating temperature through the entire structure, only a portion of the structure is affected by a given input.
The optimizer 510 chooses the initial state values and then passes them as optimizer output 511 to the simulation. To choose the initial values, the optimizer uses machine learning algorithms that do not require training, or that are already trained. The optimizer runs iteratively, ideally choosing better initial node values with each iteration. The simulator runs for each (or some) iteration with the starting values chosen by the optimizer. As output 513, nodes in the simulator produce node values for the length of the simulation, as described in
A “cost function,” generally, compares the output of a simulation model with the ground truth, a time curve that represents the answer the model is attempting to match. This comparison yields the cost: the difference between the simulated truth curve values and the expected values (the ground truth). The cost function may use a least squares function, a Mean Error (ME), a Mean Squared Error (MSE), a Mean Absolute Error (MAE), a Categorical Cross Entropy cost function, a Binary Cross Entropy cost function, and so on, to arrive at the answer. In some implementations, the cost function is a loss function. In some implementations, the cost function is a threshold, which may be a single number indicating that the simulated truth curve is close enough to the ground truth. In other implementations, the cost function may be a slope; the slope may likewise indicate that the simulated truth curve and the ground truth are sufficiently close. When a cost function is used, it may be time variant. It may also be linked to factors such as user preference or changes in the physical model. The cost function applied to the optimizer may comprise models of any one or more of the following: energy use, primary energy use, energy monetary cost, human comfort, safety of the building or building contents, durability of the building or building contents, microorganism growth potential, system equipment durability, system equipment longevity, environmental impact, and/or energy use CO2 potential, or something else. The cost function may utilize a discount function based on the discounted future value of a cost.
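For illustration, a mean-squared-error cost and a threshold check might be sketched as follows; the sample curves are invented:

```python
# Hypothetical sketch: an MSE cost between a simulated truth curve and the
# ground truth, plus a threshold form of the cost function.

def mse_cost(simulated, ground_truth):
    assert len(simulated) == len(ground_truth)
    return sum((s - g) ** 2 for s, g in zip(simulated, ground_truth)) / len(simulated)

def close_enough(cost, threshold=0.1):
    """Threshold form: a single number indicating sufficient closeness."""
    return cost <= threshold

simulated    = [20.1, 20.6, 21.2, 21.9]   # simulated truth curve
ground_truth = [20.0, 20.5, 21.0, 21.5]   # expected values

cost = mse_cost(simulated, ground_truth)
print(cost, close_enough(cost))           # 0.055 True
```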
In some implementations, a goal function is used to determine whether the cost is such that the operation can stop. Some implementations also include a stop state: a secondary condition, such as another quit option, e.g., quit if the optimizer 510 or the simulator 512 has run for a certain amount of time or for a certain number of cycles. If continuation is indicated (by the stop state, the goal function, or the cost function), the machine learning algorithm continues. In some implementations, information from the output of the simulation, such as the simulation-predicted selected node values, and the cost function are used by the optimizer 510 to update the initial state values. This value updating may be implemented at least partially by back propagation 530. The optimizer 510, after updating, then chooses new state node values for the next round of simulation. The optimizer 510-simulation 512 cycle may run until a stop state is reached, the stop state being defined by some combination of one or more of: the cost function, the time the optimizer has run, the time the optimizer and simulator have run, the time the entire program has run, the number of cycles the program has run, the number of training runs taken by the learning model, etc.
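A hedged sketch of one such cycle with combined stop conditions (goal cost, wall-clock budget, cycle count) follows; the callables and toy usage are placeholders, not the disclosed optimizer:

```python
# Hypothetical sketch: the optimizer-simulation cycle with a combined stop state.

import time

def optimize(optimizer_step, run_simulation, cost_fn, ground_truth,
             goal_cost=0.05, max_seconds=60.0, max_cycles=1000):
    start = time.monotonic()
    values = optimizer_step(None, None)                # first guess
    cost = float("inf")
    for _ in range(max_cycles):                        # cycle-count stop state
        cost = cost_fn(run_simulation(values), ground_truth)
        if cost <= goal_cost:                          # goal function satisfied
            break
        if time.monotonic() - start > max_seconds:     # time-based stop state
            break
        values = optimizer_step(values, cost)          # choose better values
    return values, cost

# Toy usage: walk a single initial value toward a target of 10.0.
print(optimize(
    optimizer_step=lambda v, c: 0.0 if v is None else v + 0.5,
    run_simulation=lambda v: v,
    cost_fn=lambda out, gt: abs(out - gt),
    ground_truth=10.0,
))   # (10.0, 0.0)
```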
Turning now to
Initially, the temperature inside the rooms of a building that is to be modeled for a period of time, and the weather data (weather for the same period of time associated with the building location), are determined. An optimizer 810 determines starting temperature values as optimizer output 811 for the nodes in the heterogenous node model 812. The heterogenous node model 812 is then run, with weather data 805 (which may be temperature data) used as input at places in the heterogenous node model equivalent to the places weather values would enter an actual building, such as nodes representing outside walls. The heat values from the weather data 805 are propagated throughout the heterogenous model of the building 812 for the simulation run time, in this case from t(−300) to t(0). The heterogenous node model 812 returns the inside node temperature values 815 (for the simulation time) of the rooms as represented in the simulation. At the end of the simulation 835, the inside node room temperature values 815 are compared to the historical inside temperature state/time curves 820 for the length of the simulation, using a cost function 827 to determine a cost 825. The entire time curve (or a large section of it) is generally used because, if only the temperature for the last time value were used, the optimizer solution might not represent a rational temperature warmup. For example, the model could represent the temperature shooting up at the last time value, which would not give an accurate initial temperature, as is desired. In some embodiments, a portion of the time curve is used for the inside node values 815. The cost function 827 measures the chosen inside node values 815 against the ground truth historical node temperatures 820 to produce a cost 825. The cost 825 is then used by the optimizer 810 to modify the starting node temperature values that will be used as starting node values within the heterogenous node system. This set of actions is repeated 830 until a stopping state is reached, producing an optimized set of inside temperature node values for all (or most) of the nodes within the heterogenous node model 812. When the heterogenous node model 812 is run, the temperature values of its nodes 845 (which may have many state values, such as temperature, humidity, etc.) are reported for the simulation time, e.g., from time t(0) to time t(−300). In some embodiments, the heterogenous node system may report node values throughout a model run, rather than at the end. This is described in more detail with reference to
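The reason the whole curve, rather than only the last value, is compared can be shown with a small invented example:

```python
# Hypothetical sketch: a run whose temperature "shoots up" at the end matches
# the final sample perfectly while being wrong everywhere else.

ground_truth = [18.0, 18.5, 19.0, 19.5, 20.0]
shoot_up     = [10.0, 10.0, 10.0, 10.0, 20.0]   # irrational warmup

last_value_cost = (shoot_up[-1] - ground_truth[-1]) ** 2
whole_curve_cost = sum(
    (s - g) ** 2 for s, g in zip(shoot_up, ground_truth)
) / len(ground_truth)

print(last_value_cost)    # 0.0  -- looks perfect
print(whole_curve_cost)   # 61.5 -- correctly penalized
```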
These output values 930 are then used in a cost function 945 with the heterogenous node model node temperature values 840, 920 that are sampled at the beginning of the warmup period (e.g., t(−300)) to determine a cost 940. This cost 940 is then used for backpropagation 935 to train the network. This process, in which the RNN is trained using the input and output from an optimizer, continues until the RNN is considered trained. Backpropagation, short for "backward propagation of errors," is widely used to incrementally train neural networks. According to the Wikipedia entry "Backpropagation": "The back propagation algorithm works by computing the gradient of the loss function with respect to each weight by the chain rule, computing the gradient one layer at a time, iterating backward from the last layer to avoid redundant calculations of intermediate terms in the chain rule."
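For illustration only, the following PyTorch sketch trains a small RNN on a reversed input series against starting node values as ground truth; all shapes and data are random placeholders, not the disclosed system:

```python
# Hypothetical sketch: one backpropagation step of an RNN trained on the
# reversed time series, scored against node values at t(-n).

import torch
import torch.nn as nn

n_steps, n_inputs, n_nodes = 300, 4, 16

class WarmupRNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.RNN(input_size=n_inputs, hidden_size=32, batch_first=True)
        self.head = nn.Linear(32, n_nodes)

    def forward(self, x):
        out, _ = self.rnn(x)
        return self.head(out)       # predicted node values at every time step

model, loss_fn = WarmupRNN(), nn.MSELoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-ins for real data: simulator inputs t(-n)..t(0) and the optimizer's
# starting node values, which serve as ground truth at t(-n).
inputs = torch.randn(1, n_steps, n_inputs)
start_node_values = torch.randn(1, n_nodes)

reversed_inputs = torch.flip(inputs, dims=[1])   # series now runs t(0)..t(-n)
pred_at_t_minus_n = model(reversed_inputs)[:, -1, :]

loss = loss_fn(pred_at_t_minus_n, start_node_values)
loss.backward()                                  # backward propagation of errors
opt.step()
print(loss.item())
```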
In some embodiments, the computer-enabled method 1100A, 1100B may be implemented in one or more processing devices (e.g., a digital processor, an analog processor, a digital circuit designed to process information, an analog circuit designed to process information, a state machine, a multiprocessor system, and/or other mechanisms for electronically processing information). The one or more processing devices may include one or more devices executing some or all of the operations of method 1100A, 1100B in response to instructions stored electronically on an electronic storage medium. The one or more processing devices may include one or more devices configured through hardware, firmware, and/or software to be specifically designed for execution of one or more of the operations of method 1100A, 1100B.
At operation 1105, an optimizer is run to determine (some) initial node values for a simulator that simulates a structure, a process, etc. These node values may be, for each of multiple nodes, an initial state value, such as temperature. The nodes in the simulator may have physics equations associated with them, as described, e.g., with reference to
At operation 1130, the learning model is run, producing a reversed time series, from t(0) to t(−n), as learning model output; these outputs map to a portion (or all) of the nodes used in the simulation. At operation 1135, the learning model output values at time t(−n) are used as input into a cost function, e.g., 737, 945, producing a cost 735, 940. The optimizer simulation output beginning node values (620, 720) are used as ground truth for the cost function 737. This is explained in greater detail with reference to
At operation 1149, the optimizer output is saved. At operation 1150, a state simulation is run using the warmed-up simulation. Specifically, the state simulation is run from time t=0 forward for an appropriate time, such as t(m), where m is some positive number. The state simulation may be an extension of the starting state estimation simulation run. At operation 1155, after the state simulation is run, results of the state simulation may be used to modify the behavior of the structure that the simulation is modeling. One way the structure behavior may be modified is by using simulation output to eventually direct device behavior. The structure may contain various devices that can be controlled, such as scientific instruments, HVAC equipment, and so on. These devices may be modeled in a separate device simulation 1160. The results of this device simulation may be control sequences for the devices that were modeled. The simulation may optimize the device behavior to achieve certain results; for example, a building simulation may seek to optimize energy usage. As such, devices such as air conditioners, moveable vents, heaters, boilers, etc., may be modeled within a building structure. At operation 1165, the state simulation may produce outputs, such as control sequences, that may modify the behavior of the structure being modeled by the simulator. At operation 1170, these control sequences may then be used to run at least one device. For example, an air conditioner may be run using the control sequence generated by the state simulation.
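By way of illustration, a control sequence might be derived from predicted room temperatures roughly as follows; the setpoint, deadband, and command names are hypothetical:

```python
# Hypothetical sketch: deriving an air-conditioner control sequence from
# predicted room temperatures for t(0)..t(m).

def make_ac_schedule(predicted_temps, setpoint=23.0, deadband=0.5):
    schedule = []
    for t, temp in enumerate(predicted_temps):
        if temp > setpoint + deadband:
            schedule.append((t, "COOL_ON"))
        elif temp < setpoint - deadband:
            schedule.append((t, "COOL_OFF"))
    return schedule

predicted = [22.0, 22.8, 23.7, 24.2, 23.4, 22.4]
print(make_ac_schedule(predicted))
# [(0, 'COOL_OFF'), (2, 'COOL_ON'), (3, 'COOL_ON'), (5, 'COOL_OFF')]
```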
The model outputs, e.g., selected node values 515, may not be a specific layer, but rather may be a collection of nodes 1240 deep within the model. In the example shown, N3 (the third node from the left in row N), P10, P21, Q7, and Q15 may all be considered selected node values. Other output nodes may exist that are not shown. There may be a reason this specific set of nodes is considered output nodes; for example, they may represent a state that has an actual measurement in a corresponding real-world structure. For instance, selected nodes may represent the inside of a room, and a node value may hold a temperature that represents the temperature of the room represented by that node. The selected nodes may then be the nodes representing the insides of rooms in a building, with the selected node values being the temperatures of those nodes.
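A small sketch of extracting such selected node values, using hypothetical node names matching the example above:

```python
# Hypothetical sketch: selected nodes are those with real-world measurements
# (e.g., room-interior temperatures), pulled from deep within the model.

node_values = {
    "N3": 21.4, "N7": 35.0,
    "P10": 22.1, "P21": 20.9, "P4": 48.2,
    "Q7": 21.7, "Q15": 22.3, "Q2": 19.0,
}
selected_nodes = ["N3", "P10", "P21", "Q7", "Q15"]   # rooms with real sensors
print({name: node_values[name] for name in selected_nodes})
```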
With reference to
With reference to
Block 1805 discloses an optimizer that determines initial node values for a simulator, the simulator comprising nodes with values. The simulator is shown with reference to
Block 1810 discloses the simulator. The simulator uses an input time series from time t(−n) to time t(0) as input. Output for the simulator and outputs for the nodes may be an output time series from time t(−n) to time t(0). Inputs with the series t(−n) to t(0) are described, e.g., with reference to
Block 1815 discloses a reverser that reverses the input time series to time t(0) to t(−n), to produce a reversed input time series. An example of this can be seen with reference to
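For illustration, reversing a series indexed from t(−n) to t(0) so it runs from t(0) to t(−n) is straightforward; the labels are hypothetical:

```python
# Hypothetical sketch: the reverser flips a time series end-for-end.

series = [("t(-3)", 18.0), ("t(-2)", 19.0), ("t(-1)", 20.0), ("t(0)", 21.0)]
reversed_series = list(reversed(series))
print(reversed_series)   # [('t(0)', 21.0), ..., ('t(-3)', 18.0)]
```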
Block 1820 discloses a learning model that uses the reversed input time series as training input and uses selected values of the output time series at t(−n) as a ground truth for a cost function associated with the learning model. The learning model is shown with reference to
Block 1825 discloses a cost function determiner. With reference to
Block 1850 discloses an iterator. Optimizers, such as the optimizers disclosed herein, learn through an iterative process: a series of possible values is chosen as input; the chosen inputs are used by the simulation in a simulation run; then the output is checked, using a cost function, to see how close it is to a desired output, as explained elsewhere. If a stop state has not been reached, the optimizer chooses a new set of values for a simulation, and the process continues until a stop state is reached. In some embodiments, the iterator controls these iterations. For example, the iterator may run the optimizer, pass the optimized node values to a simulator, run the cost function at the end of the simulation, and determine whether a stop state has been reached. If not, the iterator may have the optimizer determine a subsequent set of optimized node values using the cost. This cycle continues until a stop state is reached. When the stop state is reached, the initial values of the last iteration, e.g., 840 at
With reference to
Some embodiments provide or utilize a computer-readable storage medium 365 configured with software 385 which, upon execution by at least a central processing unit 310, performs the methods and implements the systems described herein.
In view of the many possible embodiments to which the principles of the disclosed invention may be applied, it should be recognized that the illustrated embodiments are only examples of the invention and should not be taken as limiting the scope of the invention. Rather, the scope of the invention is defined by the following claims. We therefore claim as our invention all that comes within the scope and spirit of these claims.