Learning Towards Failure Prediction of High Performance Computing Clusters by Employing LSTM
Kamaljit Kaur1, Kuljit Kaur2

1Kamaljit Kaur, Department of Computer Engineering & Technology, Guru Nanak Dev University, Amritsar, Punjab, India.
2Kuljit Kaur, Department of Computer Science, Guru Nanak Dev University, Amritsar, Punjab, India.
Manuscript received on July 20, 2019. | Revised Manuscript received on August 10, 2019. | Manuscript published on August 30, 2019. | PP: 1829-1838 | Volume-8 Issue-6, August 2019. | Retrieval Number: F7885088619/2019©BEIESP | DOI: 10.35940/ijeat.F7885.088619
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Failure prediction of high-performance computing clusters (HPCCs) has been a crucial and actively studied problem for many years. Previous works have not provided a robust method for real-time failure prediction of HPCCs. The available techniques are dated and unrealistic, and they provide low accuracy. This paper presents an efficient technique that provides robust failure prediction with good accuracy using state-of-the-art models. We employ long short-term memory (LSTM) together with reinforcement learning to correct the predictions in real time and to provide the industry with a solution that gives reliable results.
Keywords: LSTM, Rainbow, HPCCs, failure prediction.