This invention relates generally to heating, ventilation, and air conditioning (HVAC) systems, and more particularly to controlling HVAC systems to reduce energy consumption.
It is important to control a heating, ventilation, and air conditioning (HVAC) system so that energy consumption can be reduced. To control the HVAC system, outside and inside conditions are considered. The outside conditions can be due to the time of day, the seasons, and the weather, and the inside conditions can be due to the time of day, the day of the week, machinery, office equipment, lighting, occupants, and the building thermal mass. All these conditions vary dynamically, and often in an unpredictable manner.
Therefore, HVAC systems typically use input signals from timers, and from sensors inside and outside of the building, to determine heating, ventilation, and cooling demands relative to temperature set-points. Over time, the set-points form a trajectory. Generally, the object is to determine an optimal trajectory of set-points, which maintains a comfortable temperature while reducing energy consumption.
One control strategy is the Night Set-up Strategy (NSS). With this strategy, the HVAC system is used only when needed, and is turned off at night as much as possible: the set-points for the heating systems are reduced at night in the winter, and the set-points for the cooling systems are increased at night in the summer. The set-points are selected such that the system can essentially be turned off except when the set-points are exceeded.
A number of methods for solving this problem are known, such as dynamic optimization, genetic algorithms, and nonlinear optimization. However, those methods simulate using a generalized building thermal model. Some methods rely on an approximated model that provides no guarantee on the performance of the system.
The embodiments of the invention provide a method for controlling a heating, ventilation, and air conditioning (HVAC) system to reduce energy consumption. The method uses a Markov decision problem (MDP), and associated solving techniques.
A building thermal model is converted to an MDP model using Delaunay triangulation and action discretization.
Specifically, a method controls a heating, ventilation, and air conditioning (HVAC) system for a building. The system is modeled with a state space model, wherein the state space includes a set of states. A set of suitable actions is defined for each state, wherein the system changes from a current state to a next state based on the current state, and a selected action.
A set of samples is selected in the state space, and triangulated to discretize the state space into simplices, wherein each simplex has a set of nodes. For each state and a corresponding simplex, a cost-to-go for each node is obtained, and then a trajectory of set-points of temperatures for the system is generated based on the computed costs-to-go.
The embodiments of our invention provide a method for controlling a heating, ventilation, and air conditioning (HVAC) system in a building to reduce energy consumption. More specifically, we use a Markov decision process (MDP) to solve this problem.
Markov Decision Problem Model for Optimizing Set-Point Trajectories
Introduction to MDP
MDP provides a framework for solving sequential decision problems. A typical MDP for a system has a set of states and a corresponding set of actions for each state. The system changes from a current state to a next state based on the current state and a selected action. In other words, the transition process of the MDP is memoryless. For example, if the current state of a component of the system is OFF and the action is TURN ON, then the next state is ON; or, if a component has a current state of 21° and the action is INCREASE 5°, then in the next state the component operates at 26°. It is noted that buildings are often partitioned into zones, and the heating, ventilation, and air conditioning in the zones are controlled independently.
For a given pair of state and action, the next state is usually not deterministic; the system transitions to one of a number of possible next states according to transition probabilities. These properties make the MDP a useful framework for modeling dynamic systems and decision processes.
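By way of a minimal sketch (the state names, actions, and probabilities below are illustrative only, and are not part of the model described herein), such a memoryless transition process can be represented as:

```python
import random

# P[state][action] -> list of (next_state, probability); the next state
# depends only on the current state and the selected action (memoryless).
P = {
    "OFF": {"TURN_ON": [("ON", 1.0)]},
    "ON":  {"TURN_OFF": [("OFF", 1.0)],
            # a stochastic action: the set-point may or may not be reached
            "INCREASE_5": [("ON_26C", 0.9), ("ON_24C", 0.1)]},
}

def step(state, action):
    """Sample the next state according to the transition probabilities."""
    r, acc = random.random(), 0.0
    for nxt, p in P[state][action]:
        acc += p
        if r <= acc:
            return nxt
    return P[state][action][-1][0]
```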
A description of a common finite MDP is a four-tuple (T, X, U, P), where T is the set of decision epochs, X is the set of states, U is the set of actions, and P is the set of transition probabilities

P = {p_ij(u) | ∀ x_i, x_j ∈ X, u ∈ U},

where p_ij(u) is the probability that the system changes from state x_i to state x_j when action u is selected.
The MDP is solved using backward dynamic programming when the time interval T is finite, and by value iteration or policy iteration when the time interval T is infinite.
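As a minimal sketch of the infinite-horizon case, value iteration repeatedly applies the Bellman backup until the values converge. The two-state transition probabilities and stage costs below are made-up numbers for illustration, not the building model:

```python
import numpy as np

# Value iteration for an infinite-horizon discounted MDP with two states
# and two actions.
P = np.array([[[0.9, 0.1],            # P[u, i, j]: probability of moving
               [0.2, 0.8]],           # from state i to state j under u
              [[0.5, 0.5],
               [0.1, 0.9]]])
g = np.array([[1.0, 4.0],             # g[u, i]: stage cost of action u
              [2.0, 0.5]])            # taken in state i
gamma = 0.95                          # discount factor

V = np.zeros(2)
for _ in range(10000):
    V_new = (g + gamma * P @ V).min(axis=0)   # Bellman backup
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new
policy = (g + gamma * P @ V).argmin(axis=0)   # greedy policy
```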
Building Thermal Model
The MDP based trajectory is generated and simulated via an example thermal circuit as shown in
where ROz is the thermal resistance between an office zone and other zones, RWin is the thermal resistance between the office zone and an outside environment through windows, REo is the thermal resistance of the outside wall surface, CEo is the thermal capacitance of the outside wall surface, REm is the thermal resistance between the outside wall surface and an inner wall surface, CEi is the thermal capacitance of the inner wall surface, REi is the thermal resistance between the inner wall surface and the zone capacitance, CZ is the thermal capacitance of the zone, and TZ is the zone temperature.
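The dynamics of such an RC network can be simulated by forward-Euler integration of the node temperatures. The sketch below uses a simplified two-node network with illustrative resistance and capacitance values, not those of the figure:

```python
# Forward-Euler simulation of a simplified two-node RC thermal network:
# a zone node (capacitance C_z) coupled to the outside through a window
# resistance, and a wall node (C_w) between the zone and the outside.
R_win, R_wz, R_wo = 2.0, 5.0, 5.0     # thermal resistances
C_z, C_w = 10.0, 50.0                 # thermal capacitances
dt, T_out, q_hvac = 0.1, 5.0, 0.0     # time step, outside temp, HVAC heat

T_z, T_w = 21.0, 15.0                 # initial zone / wall temperatures
for _ in range(1000):
    dTz = ((T_out - T_z) / R_win + (T_w - T_z) / R_wz + q_hvac) / C_z
    dTw = ((T_z - T_w) / R_wz + (T_out - T_w) / R_wo) / C_w
    T_z += dt * dTz
    T_w += dt * dTw
# With the HVAC off (q_hvac = 0), both temperatures relax toward T_out.
```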
Continuous State Continuous Action MDP
The MDP problem can be solved with equations (1) to (4) using backward dynamic programming. However, in the HVAC control problem, the temperature values at every capacitance in the thermal circuit lie in a continuous interval instead of a discrete set. The situation is the same for the actions, as the actions determine the temperatures, which are also continuous.
Thus, to make the discrete dynamic programming framework applicable to this problem, discretization is needed for both the temperatures and the actions. The terminologies and notations used are listed as follows:
We apply Delaunay triangulation to the set of samples of the state space to discretize the state space into simplices. Each simplex has a set of nodes in the state space, where the number of nodes is 1+N for an N-dimensional state space. Thus, every state within the continuous state space belongs to one and only one simplex.
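Assuming SciPy is available, the triangulation and point-location steps can be sketched as follows (the 2-D sample grid and query state are illustrative):

```python
import numpy as np
from scipy.spatial import Delaunay

# Triangulate a 2-D state space (e.g., two node temperatures) so that
# every state lies in exactly one simplex of 1 + N = 3 nodes.
samples = np.array([[x, y]
                    for x in np.linspace(15.0, 30.0, 4)
                    for y in np.linspace(15.0, 30.0, 4)])
tri = Delaunay(samples)

state = np.array([21.3, 24.7])
simplex = tri.find_simplex(state)  # index of the simplex containing state
nodes = tri.simplices[simplex]     # indices of its 3 nodes
```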
For a state x and the corresponding simplex s including the nodes, equation (5) is applied for obtaining V(x) for values of the nodes in the simplex, where
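The interpolation itself reduces to solving for barycentric weights; a minimal 2-D sketch with illustrative node positions and node values (not equation (5) itself):

```python
import numpy as np

# V(x) interpolated inside one 2-D simplex as a convex combination of the
# node values: V(x) = sum_k w_k V(node_k), with w_k >= 0 and sum w_k = 1.
nodes = np.array([[20.0, 20.0], [25.0, 20.0], [20.0, 25.0]])
V_nodes = np.array([3.0, 1.0, 2.0])

def interp(x):
    # Columns of A are the edge vectors from node 0; solving A w = x - n0
    # gives the weights of nodes 1 and 2, and node 0 gets the remainder.
    A = (nodes[1:] - nodes[0]).T
    w12 = np.linalg.solve(A, x - nodes[0])
    w = np.array([1.0 - w12.sum(), *w12])
    return w @ V_nodes

V_x = interp(np.array([21.0, 21.0]))  # weights (0.6, 0.2, 0.2)
```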
The action is discretized into different levels. For example, if a comfort temperature range is [21° C.-26° C.], then actions for the set-points can be 21°, 22°, . . . , 26°, depending on the required accuracy.
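For example, with a 1° step the discretized action set is simply:

```python
# Discretizing the continuous set-point range [21 C, 26 C] in 1-degree
# steps; a finer step can be used when higher accuracy is required.
set_points = list(range(21, 27))
```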
Another special situation for the problem is that the outside temperature is changing, which leads to changing AC coefficient of performance (COP) values, and building thermal behavior. Thus, the time interval also needs to be discretized.
The same state space exists at every time step, and the system state changes from the current state at the current time step to the next state at the next time step.
The Bellman equation, also known as a dynamic programming equation, is a necessary condition for optimality in dynamic programming. The equation expresses the value of the decision problem at a certain instance in time in terms of the payoff from some initial choices, and the value of the remaining decision problem that results from those initial choices. This reduces a dynamic optimization problem to simpler subproblems.
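A minimal finite-horizon sketch of this recursion, with made-up deterministic transitions and stage costs (not the building model):

```python
import numpy as np

# Backward dynamic programming: the value at time t is the immediate cost
# plus the value of the remaining problem at time t + 1.
n_states, n_actions, T = 3, 2, 5
cost = np.array([[1.0, 2.0, 0.5],      # cost[u, i]: stage cost of action u
                 [0.2, 3.0, 1.0]])     # taken in state i
nxt = np.array([[1, 2, 0],             # nxt[u, i]: deterministic next state
                [0, 0, 2]])

V = np.zeros((T + 1, n_states))        # terminal values V[T] = 0
for t in range(T - 1, -1, -1):
    for i in range(n_states):
        V[t, i] = min(cost[u, i] + V[t + 1, nxt[u, i]]
                      for u in range(n_actions))
```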
Trajectory Generation Procedure
Thus, as shown in
Sampling.
A set of samples 311 in the state space 301 is selected 310. There can be different ways of sampling. In one embodiment, we apply uniform sampling along each dimension, including the boundary nodes, to make sure all states are covered by the simplices.
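A minimal sketch of such uniform sampling over a hypothetical 2-D state space, where the endpoints of each dimension are included as boundary nodes:

```python
import numpy as np

# Uniform sampling along each dimension; the endpoints are included so the
# boundary nodes are among the samples and the simplices cover all states.
lo, hi, n = 15.0, 30.0, 6
axes = [np.linspace(lo, hi, n) for _ in range(2)]
grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 2)
```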
State Space Triangulation.
Delaunay triangulation is applied 320 to the state space samples to discretize the state space into simplices, wherein each simplex has a set of nodes.
Simplex Node Optimal Value Evaluation.
A Bellman equation is applied to obtain 330 the optimal value of each node of every simplex.
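Combining the pieces, a one-dimensional sketch of the per-node evaluation: for each node and each discrete action, the next state is simulated and its value is read off by interpolation between nodes. The dynamics and costs below are toy stand-ins for the building thermal model:

```python
import numpy as np

# One Bellman backup over the nodes of a discretized 1-D state space.
nodes = np.linspace(21.0, 26.0, 11)   # node temperatures
actions = np.arange(21.0, 27.0)       # discrete set-point actions
V_next = np.abs(nodes - 23.0)         # node values at the next time step

def backup(V_next):
    V = np.empty_like(nodes)
    for k, T in enumerate(nodes):
        best = np.inf
        for sp in actions:
            # Toy dynamics: the temperature moves halfway to the set-point.
            T_new = np.clip(T + 0.5 * (sp - T), nodes[0], nodes[-1])
            cost = 0.1 * abs(sp - T)                 # toy energy cost
            best = min(best, cost + np.interp(T_new, nodes, V_next))
        V[k] = best
    return V

V = backup(V_next)
```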
The potential savings from applying the MDP based trajectory can be greater than 50% when compared with conventional methods, such as the NSS, which needs to be re-optimized every time it is applied in a different environment.
In contrast, our MDP based approach can generate the set-point trajectory adaptively for different outside weather conditions and inside building thermal properties.
The process of state space triangulation and set-point trajectory generation can be parallelized.
Our MDP based approach can yield a frequently changing trajectory, which is equivalent in cost to smoother trajectories. A smoother trajectory can be obtained by changing the order in which the different actions are evaluated during the trajectory generation process.
To speed up the evaluation of potential actions, a number of actions can be aggregated when the aggregated actions lead to the same next state with the same cost.
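A minimal sketch of such aggregation, with hypothetical quantized dynamics under which nearby set-points collapse to the same outcome:

```python
# Group actions by their (next_state, cost) outcome so that each
# equivalence class of actions is evaluated only once.
def outcome(state, action):
    # Hypothetical dynamics: the next state is rounded to the node grid,
    # so several set-points can lead to the same next state and cost.
    nxt = round(state + 0.5 * (action - state))
    return nxt, abs(nxt - state)

state = 22
groups = {}
for a in [21, 22, 23, 24, 25, 26]:
    groups.setdefault(outcome(state, a), []).append(a)
# len(groups) distinct outcomes remain instead of six separate evaluations.
```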
Although the invention has been described by way of examples of preferred embodiments, it is to be understood that various other adaptations and modifications may be made within the spirit and scope of the invention. Therefore, it is the object of the appended claims to cover all such variations and modifications as come within the true spirit and scope of the invention.