Control systems used for controlling heating, cooling, or other types of variables include complicated lower-level control mechanisms. Often these lower-level control mechanisms are calibrated and set by technicians responsible for the maintenance of the control systems. The settings for the lower-level controls based on such calibrations may not result in efficient operation because of the variable demands on the systems being controlled or unexpected changes in their operating environment.
Moreover, not only do the control systems have to comply with safety and security measures, but they must also comply with regulatory frameworks, including environmental regulations. Effective control of the systems in an uncertain operating environment and a rapidly changing regulatory framework requires continued improvements to the systems and methods used for controlling such systems.
In one example, the present disclosure relates to autonomous control of supervisory setpoints using artificial intelligence. An example method includes collecting historical and state data associated with a system (e.g., a heating, ventilation, and cooling (HVAC) system). The collected data may be filtered and rearranged, as needed, to create operational data. The method may further include, using a measurable attribute associated with the system, segmenting the operational data into a first bin, a second bin, a third bin, and a fourth bin. The method may further include preparing a first data model associated with the first bin, a second data model associated with the second bin, a third data model associated with the third bin, and a fourth data model associated with the fourth bin. The method may further include, using deep reinforcement learning, training a first brain based on the first data model, a second brain based on the second data model, a third brain based on the third data model, and a fourth brain based on the fourth data model. The method may further include, using the first brain, the second brain, the third brain, and the fourth brain, generating predicted supervisory control suggestions and collating the predicted supervisory control suggestions into a single data structure.
In another example, the present disclosure relates to systems for implementing various autonomous control methods, including the above method.
In yet another example, the present disclosure relates to a system, including at least one processor, where the system is configured to, using a measurable attribute associated with a system, segment operational data associated with the system into at least a first bin and a second bin. The system may further be configured to train a first brain based on a first data model associated with the first bin and train a second brain based on a second data model associated with the second bin. The system may further be configured to, using the first brain and the second brain, implemented by at least one processor, automatically generate predicted supervisory control suggestions for a plurality of supervisory setpoints associated with the system.
In another example, the present disclosure relates to a method including using a measurable attribute associated with a system, segmenting operational data associated with the system into at least a first bin and a second bin. The method may further include training a first brain based on a first data model associated with the first bin and training a second brain based on a second data model associated with the second bin. The method may further include using the first brain and the second brain, implemented by at least one processor, automatically generating predicted supervisory control suggestions for a plurality of supervisory setpoints associated with the system.
In yet another example, the present disclosure relates to a method including using a measurable attribute associated with a system, segmenting operational data associated with the system into at least a first bin and a second bin, where the segmenting the operational data associated with the system into the first bin and the second bin further comprises determining a transition boundary between the first bin and the second bin. The method may further include using deep reinforcement learning, training a first brain based on a first data model associated with the first bin and training a second brain based on a second data model associated with the second bin. The method may further include using the first brain and the second brain, implemented by at least one processor, automatically generating predicted supervisory control suggestions for a plurality of supervisory setpoints associated with the system.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
The present disclosure is illustrated by way of example and is not limited by the accompanying figures, in which like references indicate similar elements. Elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale.
Examples described in this disclosure relate to autonomous control of supervisory setpoints using artificial intelligence. Certain examples relate to autonomously controlling supervisory setpoints using deep reinforcement learning and machine teaching as applied to, but not limited to, “HVAC-like” systems for smart building operations. The described examples are directed to supervisory control and thus are compute-light. In addition, unlike some traditional artificial intelligence (AI) systems that take direct and intrusive control, the examples described herein are not disruptive to the existing operations and product lines of entities that may deploy the supervisory control systems and methods described herein. Finally, the present disclosure leverages a binning strategy, which allows the supervisory control to be effective even in a sparse and uncertain data environment.
In one example, a trained “brain” using simulations and deep reinforcement learning (DRL) may be used to provide the control at the supervisory level.
With continued reference to
Stage 320 may relate to filtering and splitting datasets into bins. The filtering step may include ignoring the temporal dependency in the historical state and action data related to the HVAC system. After the filtering step, in this example, the dataset may be segmented into separate bins 330 by the outside air temperature (OAT). Thus, in this example, the dataset may be segmented into four bins: bin 332 (for the data related to OAT<40 degrees Fahrenheit), bin 334 (for the data related to 40 degrees Fahrenheit<OAT<50 degrees Fahrenheit), bin 336 (for the data related to 50 degrees Fahrenheit<OAT<60 degrees Fahrenheit), and bin 338 (for the data related to 60 degrees Fahrenheit<OAT). Binning, however, need not be based on the outside air temperature alone. Other attributes, including, for example, the wet bulb temperature, which is a measure of the relative humidity, may also be used. Indeed, for other autonomous supervisory control situations, other attributes, such as pressure, power, or other measurable attributes, may be used. In addition, the bin sizes need not be equal. Moreover, there is no restriction on the number of bins.
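The segmentation described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation; the column names and sample values are hypothetical, and the four OAT ranges mirror the example bins 332–338.

```python
import pandas as pd

# Hypothetical operational records; column names and values are illustrative only.
df = pd.DataFrame({
    "oat_f": [35.2, 44.8, 55.1, 67.3, 49.9, 61.0],          # outside air temp (F)
    "chw_tonnage": [210, 280, 350, 430, 300, 400],           # chilled water tonnage
})

# Segment by OAT into the four example ranges: OAT<40, 40-50, 50-60, and OAT>60 F.
boundaries = [-float("inf"), 40, 50, 60, float("inf")]
df["bin"] = pd.cut(df["oat_f"], bins=boundaries,
                   labels=["bin_332", "bin_334", "bin_336", "bin_338"])

# One segmented dataset per bin, each used later to build a per-bin data model.
bins = {name: group.drop(columns="bin")
        for name, group in df.groupby("bin", observed=True)}
```

The same pattern would apply to binning by wet bulb temperature, pressure, or any other measurable attribute; only the column and boundary list change.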
This type of structured binning may be accomplished by selecting the boundaries for the bins based on transition functions. Assuming there are no other constraints in terms of business requirements, operational realities, or legacy workflow, the bins may simply be based on transition boundaries. Thus, historical data may be evaluated to understand the evolution of the various states as a function of time; and the bin boundaries may be placed at, or close to, the transition boundaries. In one example, the transition boundaries may be determined qualitatively. As an example, if the evolution of the states makes a transition by a factor of 2 from one instance of time to another, then that may be viewed as an abrupt transition. The bin boundary may be selected based on this observed abrupt transition in one of the states. The abruptness of the transition may be a factor of 2 or even 20 and may depend on the control system and its context.
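The qualitative boundary-selection rule above (an "abrupt transition" when a state changes by some factor between consecutive samples) can be sketched as a small scan over the historical data. The factor of 2, the variable names, and the sample values are illustrative assumptions, not prescribed by the disclosure.

```python
def find_transition_boundaries(attr, state, factor=2.0):
    """Return attribute values where the state jumps by more than `factor`
    between consecutive samples -- a qualitative 'abrupt transition'."""
    boundaries = []
    for prev, curr, a in zip(state, state[1:], attr[1:]):
        if prev > 0 and curr > 0 and (curr / prev >= factor or prev / curr >= factor):
            boundaries.append(a)
    return boundaries

# Illustrative data: demand roughly doubles as OAT crosses ~50 F.
oat = [38, 42, 46, 50, 54, 58]
demand = [100, 110, 120, 260, 270, 280]
print(find_transition_boundaries(oat, demand))  # → [50]
```

A bin boundary would then be placed at, or close to, each returned attribute value, subject to any business or legacy-workflow constraints.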
Stage 340 may relate to building data models for each bin. As an example, build data model 342 may relate to building a segmented data model corresponding to bin 332. Build data model 344 may relate to building a segmented data model corresponding to bin 334. Build data model 346 may relate to building a segmented data model corresponding to bin 336. Build data model 348 may relate to building a segmented data model corresponding to bin 338. As an example, as part of stage 340, the segmented dataset from each bin may be rearranged to prepare the dataset for a Markov decision process type of model. Any modeling approach may be used to model the Markov decision process, which then acts as a simulator for building and training the brain for each bin. As an example, the Markov decision process is used to rearrange the segmented dataset from each bin such that the future is only dependent on the present and is independent of the past. In these examples, modeling the historical data to arrive at the model allows the autonomous control system to be trained without the large amounts of training data required by traditional machine learning algorithms. In one example, the Markov decision process can be characterized by a tuple (S, A, T, R), where: (1) S is a finite set of states; (2) A is a finite set of actions; (3) T is a state-transition function such that T(s, a, s′)=p(s′|s, a); and (4) R is a local reward function. Thus, in the context of HVAC system environment 100 of
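The rearrangement into Markov decision process form can be sketched as turning a time-ordered log into (state, action, next-state, reward) tuples, so that each sample depends only on the present. The state layout (power and cooling load), setpoint values, and reward function here are illustrative assumptions.

```python
from collections import namedtuple

Transition = namedtuple("Transition", ["state", "action", "next_state", "reward"])

def to_mdp_transitions(states, actions, reward_fn):
    """Rearrange a time-ordered log into (s, a, s', r) tuples: under the
    Markov assumption, each tuple stands alone -- the future depends only
    on the present state and action."""
    transitions = []
    for t in range(len(states) - 1):
        s, a, s_next = states[t], actions[t], states[t + 1]
        transitions.append(Transition(s, a, s_next, reward_fn(s, a, s_next)))
    return transitions

# Illustrative: state = (power_kw, cooling_load_tons); actions are hypothetical
# chilled water temperature setpoints; reward penalizes power use.
states = [(500, 300), (480, 310), (470, 305)]
actions = [44.0, 45.0]
ts = to_mdp_transitions(states, actions, lambda s, a, s2: -s2[0])
```

A model fit to such tuples can then act as the per-bin simulator used to train the corresponding brain.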
Stage 360 may relate to building and training the brain for each bin. As an example, build and train the brain 362 may correspond to the data model derived from build data model 342. Build and train the brain 364 may correspond to the data model derived from build data model 344. Build and train the brain 366 may correspond to the data model derived from build data model 346. Build and train the brain 368 may correspond to the data model derived from build data model 348. Each of these brains may be built and trained using machine teaching. As used herein, the term “brain” includes, but is not limited to, one or more neural networks, one or more neural network layers, or any other trainable artificial intelligence. The trained brains may be deployed using a client/server architecture or any other architecture that allows the trained “brain” to respond to control and/or data in relation to a control system (e.g., the HVAC system).
Any autonomous control system similar to the HVAC system that has a defined start state, iterates over time, and responds to external inputs may be implemented using the workflow 300. A specific start state, such as a certain temperature setpoint, may be required to allow the brain to learn from a wide array of conditions. The machine teaching using deep reinforcement learning may result in the brain being able to take a set of discrete actions to affect the state even in uncharted territory. In sum, each of the brains for a respective bin should be able to predict in contexts and scenarios that it has not explicitly encountered in the dataset corresponding to the respective bin. The brain may learn using any of several learning algorithms, including Distributed Deep Q Network, Proximal Policy Optimization, or Soft Actor Critic. Once the brains for each bin have been trained, the next stage is prediction. In the case of the HVAC system, the states are the power usage of the HVAC system and the cooling load, and thus the goal of learning by the brain is to maximize energy efficiency while meeting the building's cooling demand.
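A reward that balances energy efficiency against meeting the cooling demand could be shaped along the following lines. This is a hypothetical sketch: the penalty coefficient and the linear form are assumptions for illustration, not parameters from the disclosure.

```python
def hvac_reward(power_kw, delivered_tons, demand_tons, penalty=10.0):
    """Hypothetical reward shaping: reward lower power usage, and penalize
    any shortfall against the building's cooling demand. The penalty
    coefficient is illustrative only."""
    unmet = max(0.0, demand_tons - delivered_tons)
    return -power_kw - penalty * unmet

# Meeting demand at modest power scores higher than saving power but
# missing the cooling demand.
r_efficient = hvac_reward(power_kw=450, delivered_tons=300, demand_tons=300)
r_shortfall = hvac_reward(power_kw=430, delivered_tons=280, demand_tons=300)
```

Under deep reinforcement learning (e.g., Distributed Deep Q Network, Proximal Policy Optimization, or Soft Actor Critic), maximizing such a reward pushes the brain toward setpoints that serve the load at the lowest power.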
Thus, in stage 370, the suggested actions for each bin may be combined in one lookup table or a similar data structure. The suggested actions may be based on predictions for each brain associated with a respective bin. Table 1 below shows an example of the lookup table.
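The collation step in stage 370 can be sketched as merging each brain's suggestions into one table keyed by the bin's OAT range. The setpoint names and values below are hypothetical placeholders, not values from Table 1.

```python
def collate_suggestions(per_bin_suggestions):
    """Collate each bin's predicted setpoint suggestions into a single
    lookup table keyed by the bin's OAT range (structure is illustrative)."""
    table = {}
    for oat_range, setpoints in per_bin_suggestions.items():
        table[oat_range] = dict(setpoints)
    return table

# Hypothetical per-brain outputs; setpoint names and values are examples only.
suggestions = {
    "OAT < 40 F": {"chw_stpt_f": 46.0, "cdw_stpt_f": 65.0},
    "40-50 F":    {"chw_stpt_f": 45.0, "cdw_stpt_f": 68.0},
}
lookup = collate_suggestions(suggestions)
```

At runtime, the supervisory layer would look up the row matching the current OAT and apply (or surface) the suggested setpoints.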
In one example, the lookup table may be integrated with a dashboard associated with the HVAC system. In this manner, the lookup table may allow for the autonomous supervisory control of the HVAC system. The integration using software may be such that an operator of the HVAC system may override some of the recommendations of the brains associated with the respective bins. The lookup table may be integrated via other means, including by using a hardware controller (e.g., a field programmable gate array (FPGA) or a programmable logic controller (PLC)).
With continued reference to
An example LSTM network may comprise a sequence of repeating RNN layers or other types of layers. Each layer of the LSTM network may consume an input at a given time step, e.g., a layer's state from a previous time step, and may produce a new set of outputs or states. In the case of using the LSTM, a single chunk of content may be encoded into a single vector or multiple vectors. As an example, a word or a combination of words (e.g., a phrase, a sentence, or a paragraph) may be encoded as a single vector. Each chunk may be encoded into an individual layer (e.g., a particular time step) of an LSTM network. An example LSTM layer may be described using a set of equations, such as the ones below:
ft=σ(Wf·[ht-1, xt]+bf)
it=σ(Wi·[ht-1, xt]+bi)
c̃t=tanh(Wc·[ht-1, xt]+bc)
ct=ft∘ct-1+it∘c̃t
ot=σ(Wo·[ht-1, xt]+bo)
ht=ot∘tanh(ct)
In the above equations, σ is the element-wise sigmoid function and ∘ represents the Hadamard (element-wise) product. In this example, ft, it, and ot are the forget, input, and output gate vectors, respectively, and ct is the cell state vector. In this example, inside each LSTM layer, the inputs and hidden states may be processed using a combination of vector operations (e.g., dot-product, inner product, or vector addition) or non-linear operations, if needed.
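The gate equations above can be sketched as a single LSTM step in plain NumPy. This is a minimal illustration of the math, not the network used in the disclosure; the hidden and input sizes and random parameters are arbitrary.

```python
import numpy as np

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step implementing the gate equations above.
    W and b hold the four gate parameters keyed 'f', 'i', 'c', 'o'."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])       # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])       # input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])   # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde       # Hadamard products
    o_t = sigmoid(W["o"] @ z + b["o"])       # output gate
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

# Tiny example: hidden size 2, input size 3, random parameters.
rng = np.random.default_rng(0)
H, X = 2, 3
W = {k: rng.standard_normal((H, H + X)) for k in "fico"}
b = {k: np.zeros(H) for k in "fico"}
h, c = lstm_step(rng.standard_normal(X), np.zeros(H), np.zeros(H), W, b)
```

Chaining `lstm_step` over time steps, feeding each step's (h, c) into the next, yields the repeating-layer structure described above.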
Still referring to
With continued reference to
Deployment/monitoring 870 may interface with a sensor API that may allow sensors to receive and provide information via the sensor API. Software configured to detect or listen to certain conditions or events may communicate via the sensor API any conditions associated with devices that are being monitored by deployment/monitoring 870. Remote sensors or other telemetry devices may be incorporated within the data centers to sense conditions associated with the components installed therein. Remote sensors or other telemetry may also be used to monitor other adverse signals in the data center and feed the information to deployment/monitoring 870. As an example, if fans that are cooling a rack stop working then that may be sensed by the sensors and reported to the deployment/monitoring 870. Although
Still referring to
Instructions corresponding to various parts of platform 700 may be stored in memory 906 or another memory. These instructions when executed by processor(s) 902, or other processors, may provide the functionality associated with platform 700. The instructions corresponding to platform 700, and related components, could be encoded as hardware corresponding to an A/I processor. In this case, some or all of the functionality associated with the learning-based analyzer may be hard-coded or otherwise provided as part of an A/I processor. As an example, A/I processor may be implemented using a field programmable gate array (FPGA) with the requisite functionality. Other types of hardware such as ASICs and GPUs may also be used. The functionality associated with platform 600 may be implemented using any appropriate combination of hardware, software, or firmware. Although
Step 1020 may include using a measurable attribute associated with the system, segmenting the operational data into a first bin, a second bin, a third bin, and a fourth bin. As explained earlier, with respect to
Step 1030 may include preparing a first data model associated with the first bin, a second data model associated with the second bin, a third data model associated with the third bin, and a fourth data model associated with the fourth bin. As explained earlier, with respect to
Step 1040 may include using deep reinforcement learning, training a first brain based on the first data model, a second brain based on the second data model, a third brain based on the third data model, and a fourth brain based on the fourth data model. As explained earlier, with respect to
Step 1050 may include using at least the first brain, the second brain, the third brain, and the fourth brain, generating predicted supervisory control suggestions and then collating the predicted supervisory control suggestions into a single data structure. As explained earlier, with respect to
In another example, using the HVAC system as an example, the bin boundaries based on the outside air temperature (OAT) may be identified by detecting changes in the dynamics of the HVAC system in relation to predicted values of a state associated with the HVAC system. In this example, the tonnage of the chilled water, which is viewed as a demand state used by the HVAC system, is the state that is predicted by the underlying deep neural network (DNN) model. In one example, the changes in the dynamics of the HVAC system may be identified by determining a difference in the predicted values of the demand state in the forward direction (D+) and in the backward direction (D−). To accomplish this, a forward data model may be created that captures the dynamic behavior of the HVAC system forward in time (e.g., time=0 to time=t+1). In one example, the forward data model may relate to the organization of the neural network training data such that the neural network processing includes receiving: (1) values for one or more current states of the system, and (2) the current inputs, and providing values for one or more of the next states. In terms of an equation, the forward data model may be expressed as: S(t+1)=NNModel(S(t), a(t)), where S(t+1) corresponds to the next state(s), S(t) corresponds to the current state(s), and a(t) corresponds to the action(s).
Table 3 above shows an example of a forward data model. In this example, the states include chilled water tonnage and other state(s). The action(s) may correspond to any one or more of: (1) chilled water temperature setpoint (CHW SWS) relative to the outside air temperature, (2) condenser water temperature setpoint (CDW SWS) relative to the wet bulb temperature (a measure of the ambient relative humidity), (3) chilled water flow GPM STPT, (4) differential pressure (DPSP), and (5) chilled water setpoint relative to the return water temperature from the building. With continued reference to Table 3 above, in one example, at time t, values of TON(t), OS(t) and a(t) are processed as inputs and the predicted values for TON(t+1) and OS(t+1) are generated.
A backward data model may be created that captures the inverse dynamic behavior of the HVAC system backward in time (e.g., time=t+1 to time=0). In one example, the backward data model may relate to the organization of the neural network training data such that the neural network processing includes receiving: (1) values for one or more current states of the system, and (2) the current inputs, and providing values for one or more of the previous states. In terms of an equation, the backward data model may be expressed as: S(t−1)=NNModel(S(t), a(t−1)), where S(t−1) corresponds to the previous state(s), S(t) corresponds to the current state(s), and a(t−1) corresponds to the action(s).
Table 4 above shows an example of a backward data model. In this example, the states include chilled water tonnage and other state(s). The action(s) may correspond to any one or more of (1) chilled water temperature setpoint (CHW SWS) relative to the outside air temperature, (2) condenser water temperature setpoint (CDW SWS) relative to the wet bulb temperature (a measure of the ambient relative humidity), (3) chilled water flow GPM STPT, (4) differential pressure (DPSP), and (5) chilled water setpoint relative to the return water temperature from the building. With continued reference to Table 4 above, in one example, at time t+1, values of TON(t+1), OS(t+1) and a(t) are processed as inputs and the predicted values for TON(t) and OS(t) are generated.
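Locating bin boundaries from the disagreement between the forward (D+) and backward (D−) predictions of the demand state can be sketched as follows. The threshold, the predicted tonnage values, and the simple absolute-difference criterion are illustrative assumptions.

```python
def boundary_candidates(oat, d_forward, d_backward, threshold):
    """Flag OAT values where the forward and backward model predictions of
    the demand state disagree by more than `threshold` -- an illustrative
    criterion for a change in the system's dynamics."""
    return [a for a, dp, dm in zip(oat, d_forward, d_backward)
            if abs(dp - dm) > threshold]

# Hypothetical predicted chilled-water tonnage from the two models.
oat = [38, 42, 46, 50, 54]
d_plus = [210, 230, 250, 320, 330]    # forward-model predictions (D+)
d_minus = [212, 228, 252, 270, 328]   # backward-model predictions (D-)
print(boundary_candidates(oat, d_plus, d_minus, threshold=20))  # → [50]
```

A large D+ versus D− disagreement suggests the dynamics change near that OAT value, making it a candidate transition boundary between bins.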
Step 1420 may include training a first brain based on a first data model associated with the first bin and training a second brain based on a second data model associated with the second bin. As explained earlier, each of the first brain and the second brain is trained using a Markov decision process model characterized by a tuple comprising: (1) a finite set of states associated with the system, (2) a finite set of actions associated with the system, (3) a state transition function associated with the system, and (4) a reward function associated with the system. In one example, neither the finite set of states associated with the system nor the finite set of actions associated with the system may include the measurable attribute associated with the system. As explained earlier, with respect to
Step 1430 may include using the first brain and the second brain, implemented by at least one processor, automatically generating predicted supervisory control suggestions for a plurality of supervisory setpoints associated with the system. As explained earlier, with respect to
In conclusion, the present disclosure relates to a system, including at least one processor, where the system is configured to, using a measurable attribute associated with a system, segment operational data associated with the system into at least a first bin and a second bin. The system may further be configured to train a first brain based on a first data model associated with the first bin and train a second brain based on a second data model associated with the second bin. The system may further be configured to, using the first brain and the second brain, implemented by at least one processor, automatically generate predicted supervisory control suggestions for a plurality of supervisory setpoints associated with the system.
The system may further be configured to determine a transition boundary between the first bin and the second bin as part of segmenting the operational data associated with the system into the first bin and the second bin. Each of the first brain and the second brain may be trained using a Markov decision process model characterized by a tuple comprising: (1) a finite set of states associated with the system, (2) a finite set of actions associated with the system, (3) a state transition function associated with the system, and (4) a reward function associated with the system. Neither the finite set of states associated with the system nor the finite set of actions associated with the system may include the measurable attribute associated with the system.
The transition boundary may relate to a transition in predicted values of at least one state associated with the system. The transition in the predicted values of the at least one state may be determined by a first set of training data corresponding to a forward data model and a second set of training data corresponding to a backward data model, where the forward data model relates to a dynamic behavior of the system forward in time and the backward data model relates to a dynamic behavior of the system backward in time.
In another example, the present disclosure relates to a method including using a measurable attribute associated with a system, segmenting operational data associated with the system into at least a first bin and a second bin. The method may further include training a first brain based on a first data model associated with the first bin and training a second brain based on a second data model associated with the second bin. The method may further include using the first brain and the second brain, implemented by at least one processor, automatically generating predicted supervisory control suggestions for a plurality of supervisory setpoints associated with the system.
The segmenting the operational data associated with the system into the first bin and the second bin further comprises determining a transition boundary between the first bin and the second bin. Each of the first brain and the second brain may be trained using a Markov decision process model characterized by a tuple comprising: (1) a finite set of states associated with the system, (2) a finite set of actions associated with the system, (3) a state transition function associated with the system, and (4) a reward function associated with the system. Neither the finite set of states associated with the system nor the finite set of actions associated with the system may include the measurable attribute associated with the system.
The transition boundary may relate to a transition in predicted values of at least one state associated with the system. The transition in the predicted values of the at least one state may be determined by using a first set of training data corresponding to a forward data model and a second set of training data corresponding to a backward data model, where the forward data model relates to a dynamic behavior of the system forward in time and the backward data model relates to a dynamic behavior of the system backward in time. The transition in the predicted values of the at least one state may be determined by determining differences between a first set of predicted values of the at least one state based on the forward data model and a second set of predicted values of the at least one state based on the backward data model.
The system may comprise a heating, ventilation, and cooling (HVAC) system and where the measurable attribute comprises an air temperature outside a structure being heated or cooled by the HVAC system.
In yet another example, the present disclosure relates to a method including using a measurable attribute associated with a system, segmenting operational data associated with the system into at least a first bin and a second bin, where the segmenting the operational data associated with the system into the first bin and the second bin further comprises determining a transition boundary between the first bin and the second bin. The method may further include using deep reinforcement learning, training a first brain based on a first data model associated with the first bin and training a second brain based on a second data model associated with the second bin. The method may further include using the first brain and the second brain, implemented by at least one processor, automatically generating predicted supervisory control suggestions for a plurality of supervisory setpoints associated with the system.
Each of the first brain and the second brain may be trained using a Markov decision process model characterized by a tuple comprising: (1) a finite set of states associated with the system, (2) a finite set of actions associated with the system, (3) a state transition function associated with the system, and (4) a reward function associated with the system. Neither the finite set of states associated with the system nor the finite set of actions associated with the system may include the measurable attribute associated with the system.
The transition in the predicted values of the at least one state may be determined by using a first set of training data corresponding to a forward data model and a second set of training data corresponding to a backward data model, where the forward data model relates to a dynamic behavior of the system forward in time and the backward data model relates to a dynamic behavior of the system backward in time. The transition in the predicted values of the at least one state may be determined by determining differences between a first set of predicted values of the at least one state based on the forward data model and a second set of predicted values of the at least one state based on the backward data model.
It is to be understood that the methods, modules, and components depicted herein are merely exemplary. Alternatively, or in addition, the functionality described herein can be performed, at least in part, by one or more hardware logic components. For example, and without limitation, illustrative types of hardware logic components that can be used include Field-Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Application-Specific Standard Products (ASSPs), System-on-a-Chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), etc. In an abstract, but still definite sense, any arrangement of components to achieve the same functionality is effectively “associated” such that the desired functionality is achieved. Hence, any two components herein combined to achieve a particular functionality can be seen as “associated with” each other such that the desired functionality is achieved, irrespective of architectures or inter-medial components. Likewise, any two components so associated can also be viewed as being “operably connected,” or “coupled,” to each other to achieve the desired functionality.
The functionality associated with some examples described in this disclosure can also include instructions stored in a non-transitory media. The term “non-transitory media” as used herein refers to any media storing data and/or instructions that cause a machine to operate in a specific manner. Exemplary non-transitory media include non-volatile media and/or volatile media. Non-volatile media include, for example, a hard disk, a solid-state drive, a magnetic disk or tape, an optical disk or tape, a flash memory, an EPROM, NVRAM, PRAM, or other such media, or networked versions of such media. Volatile media include, for example, dynamic memory such as DRAM, SRAM, a cache, or other such media. Non-transitory media is distinct from, but can be used in conjunction with transmission media. Transmission media is used for transferring data and/or instruction to or from a machine. Exemplary transmission media include coaxial cables, fiber-optic cables, copper wires, and wireless media, such as radio waves.
Furthermore, those skilled in the art will recognize that boundaries between the functionality of the above described operations are merely illustrative. The functionality of multiple operations may be combined into a single operation, and/or the functionality of a single operation may be distributed in additional operations. Moreover, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be altered in various other embodiments.
Although the disclosure provides specific examples, various modifications and changes can be made without departing from the scope of the disclosure as set forth in the claims below. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the present disclosure. Any benefits, advantages, or solutions to problems that are described herein with regard to a specific example are not intended to be construed as a critical, required, or essential feature or element of any or all the claims.
Furthermore, the terms “a” or “an,” as used herein, are defined as one or more than one. Also, the use of introductory phrases such as “at least one” and “one or more” in the claims should not be construed to imply that the introduction of another claim element by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim element to inventions containing only one such element, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an.” The same holds true for the use of definite articles.
Unless stated otherwise, terms such as “first” and “second” are used to arbitrarily distinguish between the elements such terms describe. Thus, these terms are not necessarily intended to indicate temporal or other prioritization of such elements.
This application claims the benefit of U.S. Provisional Application No. 63/091,384, filed Oct. 14, 2020, entitled “AUTONOMOUS CONTROL OF SUPERVISORY SETPOINTS USING ARTIFICIAL INTELLIGENCE,” the entire contents of which is hereby incorporated herein by reference.