Efficient learning of power grid voltage control strategies via model-based deep reinforcement learning

Published in: Machine Learning (2024)

Abstract

This article proposes a model-based deep reinforcement learning (DRL) method for designing emergency control strategies that address short-term voltage stability problems in power systems. Recent advances show promising results for model-free DRL methods in power system control problems, but these methods suffer from long training (wall-clock) times and poor sample efficiency, both of which are critical for making state-of-the-art DRL algorithms practically applicable. A DRL agent learns an optimal policy by trial and error while interacting with the real-world environment; given the safety-critical nature of the power grid, it is desirable to minimize the agent's direct interaction with it. Moreover, state-of-the-art DRL policies are mostly trained using physics-based grid simulators whose dynamic simulations are computationally intensive, which lowers training efficiency. We propose a novel model-based DRL framework in which a deep neural network (DNN)-based dynamic surrogate model, rather than a real-world power grid or a physics-based simulation, is used within the policy-learning loop, making the process faster and more sample efficient. However, stable training in model-based DRL is challenging because of the complex system dynamics of large-scale power systems. We address these challenges by incorporating imitation learning to warm-start policy learning, reward shaping, and a multi-step loss in surrogate-model training. On the IEEE 300-bus test system, the proposed framework achieves a 97.5% reduction in required samples and an 87.7% reduction in training time.
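
The abstract names three stabilizing ingredients: an imitation-learning warm start, reward shaping, and a multi-step loss for surrogate-model training. As a rough illustration of how two of these could look in code (the full article is behind the paywall here, so none of this is the authors' implementation), the following PyTorch sketch unrolls a DNN surrogate of grid dynamics over a short horizon and behavior-clones demonstrations to initialize a policy; every name, architecture, and dimension in it is an assumption.

```python
# A minimal sketch (assumptions, not the authors' code) of two ingredients
# from the abstract: (1) a DNN surrogate of grid dynamics trained with a
# multi-step rollout loss, and (2) an imitation-learning warm start for the
# policy. GridSurrogate, multi_step_loss, bc_warm_start, and the B/H/S/A
# dimensions are all illustrative.
import torch
import torch.nn as nn


class GridSurrogate(nn.Module):
    """Surrogate dynamics model: (state, action) -> next state."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state, action):
        # Predict a state delta; the residual form tends to keep
        # multi-step rollouts numerically stable.
        return state + self.net(torch.cat([state, action], dim=-1))


def multi_step_loss(model, states, actions, horizon):
    """Unroll the surrogate for `horizon` steps from states[:, 0] and
    penalize deviation from the recorded trajectory at every step, so
    compounding one-step errors are trained away directly."""
    s = states[:, 0]
    loss = 0.0
    for t in range(horizon):
        s = model(s, actions[:, t])  # surrogate rollout step
        loss = loss + nn.functional.mse_loss(s, states[:, t + 1])
    return loss / horizon


def bc_warm_start(policy, expert_states, expert_actions, lr=1e-3, steps=100):
    """Imitation-learning warm start: behavior-clone demonstrations
    (e.g., from a rule-based load-shedding controller) before RL."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(policy(expert_states), expert_actions)
        loss.backward()
        opt.step()
    return policy


if __name__ == "__main__":
    B, H, S, A = 32, 5, 10, 3  # batch, rollout horizon, state/action dims
    states, actions = torch.randn(B, H + 1, S), torch.randn(B, H, A)

    surrogate = GridSurrogate(S, A)
    opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
    opt.zero_grad()
    loss = multi_step_loss(surrogate, states, actions, H)
    loss.backward()
    opt.step()
    print(f"multi-step surrogate loss: {loss.item():.4f}")

    policy = nn.Sequential(nn.Linear(S, 64), nn.ReLU(), nn.Linear(64, A))
    bc_warm_start(policy, states[:, 0], actions[:, 0])
```

The residual (delta) prediction in GridSurrogate is a common design choice for learned dynamics models because it keeps long rollouts closer to the data manifold; the reward-shaping component is omitted here because it depends on grid-specific voltage-recovery criteria defined in the article.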

Data availability

Data are available upon request.

Code availability

Code is available upon request.

Funding

This work is supported by funding from the U.S. Department of Energy (DOE) Advanced Research Projects Agency-Energy (ARPA-E) OPEN 2018 program. Pacific Northwest National Laboratory (PNNL) is operated by Battelle for DOE under Contract DE-AC05-76RL01830.

Author information

Contributions

Conceptualization: QH, RH, JT; Methodology: QH, RH, JT, WY, TY, RRH; Formal analysis and investigation: RRH, TY, YD; Writing—original draft preparation: RRH, TY; Writing—review and editing: QH, JT, WY, RH, YD, YL; Funding acquisition: QH, RH; Resources: QH, YL, JT; Supervision: QH, RH.

Corresponding author

Correspondence to Tianzhixi Yin.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Additional information

Editors:  Yuxi Li, Emma Brunskill, Minmin Chen, Omer Gottesman, Lihong Li, Yao Liu, Zongqing Lu, Niranjani Prasad, Zhiwei (Tony) Qin, Csaba Szepesvari, Matthew E. Taylor

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Yan Du, Renke Huang, and Qiuhua Huang were with PNNL when this study was conducted.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hossain, R.R., Yin, T., Du, Y. et al. Efficient learning of power grid voltage control strategies via model-based deep reinforcement learning. Mach Learn 113, 2675–2700 (2024). https://doi.org/10.1007/s10994-023-06422-w
