Deep Reinforcement Learning on HVAC Control

Ivars Namatēvs


Owing to the growth of computing power and to end-to-end reinforcement learning (RL) approaches that learn directly from high-dimensional sensory inputs, it is now feasible to combine RL and deep learning in Smart Building Energy Control (SBEC) systems. Deep reinforcement learning (DRL) extends the classical Q-learning algorithm to Deep Q-learning (DQL) by using artificial neural networks: a deep neural network (DNN) is trained to approximate the Q-function. To create a comprehensive SBEC system, it is crucial to choose an appropriate mathematical foundation and to benchmark candidate frameworks for model-based predictive control of the building's heating, ventilation, and air conditioning (HVAC) system. The main contribution of this paper is an exploration of state-of-the-art DRL methodology for smart building control.
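
As a rough illustration of the Q-learning update that DQL builds on, the following minimal sketch trains a tabular Q-function on a toy thermostat. The environment (five temperature levels, three actions, the reward weights) and all names in it are hypothetical and are not taken from the paper. In DQL the table `q` would be replaced by a deep neural network that approximates the Q-function.

```python
import numpy as np

# Hypothetical toy thermostat; none of these values come from the paper.
TEMPS = 5                        # discretised indoor temperature levels 0..4
TARGET = 2                       # comfort set-point (index of the desired level)
ACTIONS = np.array([-1, 0, 1])   # cool, off, heat
ENERGY_COST = 0.5                # penalty for running the HVAC unit
GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.3


def step(state, action_idx):
    """Deterministic toy dynamics: the action shifts the temperature level."""
    a = ACTIONS[action_idx]
    next_state = int(np.clip(state + a, 0, TEMPS - 1))
    # Reward trades off discomfort (distance to set-point) against energy use.
    reward = -abs(next_state - TARGET) - ENERGY_COST * abs(a)
    return next_state, reward


def train(episodes=2000, horizon=20, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = np.random.default_rng(seed)
    q = np.zeros((TEMPS, len(ACTIONS)))
    for _ in range(episodes):
        s = int(rng.integers(TEMPS))          # random start level
        for _ in range(horizon):
            if rng.random() < EPSILON:
                a = int(rng.integers(len(ACTIONS)))   # explore
            else:
                a = int(np.argmax(q[s]))              # exploit
            s2, r = step(s, a)
            # Q-learning target: reward plus discounted best next-state value.
            td_target = r + GAMMA * np.max(q[s2])
            q[s, a] += ALPHA * (td_target - q[s, a])
            s = s2
    return q


q_table = train()
greedy_policy = [int(ACTIONS[i]) for i in np.argmax(q_table, axis=1)]
print(greedy_policy)
```

Under these assumptions the learned greedy policy should heat below the set-point, stay off at the set-point, and cool above it. A table is workable here only because the state space is tiny; with high-dimensional sensor inputs a function approximator such as a DNN takes its place.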


Keywords: deep reinforcement learning; deep Q-learning; deep neural network; energy management system




DOI: 10.7250/itms-2018-0004



Copyright (c) 2018 Ivars Namatēvs

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.