Efficient learning of power grid voltage control strategies via model-based deep reinforcement learning

Published in: Machine Learning (2024)

Abstract

This article proposes a model-based deep reinforcement learning (DRL) method for designing emergency control strategies that address short-term voltage stability problems in power systems. Recent advances show promising results for model-free DRL methods in power system control problems, but these methods suffer from long training (wall-clock) times and poor sample efficiency, both of which are critical for making state-of-the-art DRL algorithms practically applicable. A DRL agent learns an optimal policy by trial and error while interacting with the real-world environment; given the safety-critical nature of the power grid, it is desirable to minimize the agent's direct interaction with it. Moreover, state-of-the-art DRL policies are mostly trained using physics-based grid simulators whose dynamic simulations are computationally intensive, which lowers training efficiency. We propose a novel model-based DRL framework in which a deep neural network (DNN)-based dynamic surrogate model, rather than a real-world power grid or a physics-based simulation, is used within the policy-learning loop, making the process faster and more sample efficient. However, stable training in model-based DRL is challenging because of the complex system dynamics of large-scale power systems. We address these challenges by incorporating imitation learning to warm-start policy learning, reward shaping, and a multi-step loss in surrogate-model training. On the IEEE 300-bus test system, the proposed framework achieves a 97.5% reduction in required samples and an 87.7% reduction in training time.
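
The abstract names three stabilizing ingredients: an imitation-learning warm start, reward shaping, and a multi-step loss for surrogate-model training. As a rough illustration of how two of these could look in code (the full article is behind the paywall here, so none of this is the authors' implementation), the following PyTorch sketch unrolls a DNN surrogate of grid dynamics over a short horizon and behavior-clones demonstrations to initialize a policy; every name, architecture, and dimension in it is an assumption.

```python
# A minimal sketch (assumptions, not the authors' code) of two ingredients
# from the abstract: (1) a DNN surrogate of grid dynamics trained with a
# multi-step rollout loss, and (2) an imitation-learning warm start for the
# policy. GridSurrogate, multi_step_loss, bc_warm_start, and the B/H/S/A
# dimensions are all illustrative.
import torch
import torch.nn as nn


class GridSurrogate(nn.Module):
    """Surrogate dynamics model: (state, action) -> next state."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state, action):
        # Predict a state delta; the residual form tends to keep
        # multi-step rollouts numerically stable.
        return state + self.net(torch.cat([state, action], dim=-1))


def multi_step_loss(model, states, actions, horizon):
    """Unroll the surrogate for `horizon` steps from states[:, 0] and
    penalize deviation from the recorded trajectory at every step, so
    compounding one-step errors are trained away directly."""
    s = states[:, 0]
    loss = 0.0
    for t in range(horizon):
        s = model(s, actions[:, t])  # surrogate rollout step
        loss = loss + nn.functional.mse_loss(s, states[:, t + 1])
    return loss / horizon


def bc_warm_start(policy, expert_states, expert_actions, lr=1e-3, steps=100):
    """Imitation-learning warm start: behavior-clone demonstrations
    (e.g., from a rule-based load-shedding controller) before RL."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(policy(expert_states), expert_actions)
        loss.backward()
        opt.step()
    return policy


if __name__ == "__main__":
    B, H, S, A = 32, 5, 10, 3  # batch, rollout horizon, state/action dims
    states, actions = torch.randn(B, H + 1, S), torch.randn(B, H, A)

    surrogate = GridSurrogate(S, A)
    opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
    opt.zero_grad()
    loss = multi_step_loss(surrogate, states, actions, H)
    loss.backward()
    opt.step()
    print(f"multi-step surrogate loss: {loss.item():.4f}")

    policy = nn.Sequential(nn.Linear(S, 64), nn.ReLU(), nn.Linear(64, A))
    bc_warm_start(policy, states[:, 0], actions[:, 0])
```

The residual (delta) prediction in GridSurrogate is a common design choice for learned dynamics models because it keeps long rollouts closer to the data manifold; the reward-shaping component is omitted here because it depends on grid-specific voltage-recovery criteria defined in the article.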

Data availability

Data are available upon request.

Code availability

Code is available upon request.

Funding

This work is supported by funding from the U.S. Department of Energy (DOE) Advanced Research Projects Agency-Energy (ARPA-E) OPEN 2018 program. Pacific Northwest National Laboratory (PNNL) is operated by Battelle for DOE under Contract DE-AC05-76RL01830.

Author information

Contributions

Conceptualization: QH, RH, JT; Methodology: QH, RH, JT, WY, TY, RRH; Formal analysis and investigation: RRH, TY, YD; Writing—original draft preparation: RRH, TY; Writing—review and editing: QH, JT, WY, RH, YD, YL; Funding acquisition: QH, RH; Resources: QH, YL, JT; Supervision: QH, RH.

Corresponding author

Correspondence to Tianzhixi Yin.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Additional information

Editors:  Yuxi Li, Emma Brunskill, Minmin Chen, Omer Gottesman, Lihong Li, Yao Liu, Zongqing Lu, Niranjani Prasad, Zhiwei (Tony) Qin, Csaba Szepesvari, Matthew E. Taylor

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Yan Du, Renke Huang, and Qiuhua Huang were with PNNL when this study was conducted.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hossain, R.R., Yin, T., Du, Y. et al. Efficient learning of power grid voltage control strategies via model-based deep reinforcement learning. Mach Learn 113, 2675–2700 (2024). https://doi.org/10.1007/s10994-023-06422-w
