Optimal replacement and reactivation in warm standby systems performing random duration missions

https://doi.org/10.1016/j.cie.2020.106791

Highlights

  • Warm standby systems with preventive replacements are considered.

  • Elements may be activated multiple times during the mission.

  • Mission succeeds if no elements fail in operation or before activation.

  • The optimal activation scheduling problem is formulated and solved.

  • Schedules for a homogeneous and a heterogeneous system are obtained.

Abstract

Reliability modeling and optimization of warm standby systems have attracted considerable research attention in the past decade due to their wide use in diverse critical applications. Existing works have mostly assumed that the activation of a standby element is triggered by the failure of the online operating element. This paper advances the state of the art by modeling a new type of warm standby system subject to preventive replacements (PR) and element reactivation during an uncertain mission time. According to a pre-determined PR schedule and element activation sequence, the online element is replaced with a standby element before its failure, and any element may be activated or reactivated multiple times during the mission. The mission succeeds if none of the activated elements fails before its activation (i.e., while in the standby mode) or during its operation. A probabilistic method is proposed to evaluate the mission success probability for the considered warm standby system. The optimal PR scheduling problem is then formulated and solved to maximize the mission success probability. The impacts of different system model parameters on the mission success probability and on the optimization solutions are examined through two examples: a homogeneous and a heterogeneous warm standby system.

Introduction

In many industrial and technological systems, e.g., power grids (Zhong, Pantelous, Goh, & Zhou, 2019), aviation (Zhang, Li, & Liu, 2020), healthcare (Maktoubian & Ansari, 2019), railway (Lin et al., 2019), and chemical plants (Jia & Cui, 2011), the failure of an operating element may cause the failure of the system’s mission and result in considerable damage to the system and financial loss. Therefore, it is common practice to perform preventive replacement (PR) of the operating element with the aim of reducing or removing the element’s accumulated deterioration, thus improving the overall system reliability. The PR can be triggered by a pre-determined schedule or by the presence of some system condition. In this work, we focus on scheduled PR for systems with non-repairable standby elements.

When the mission time is fixed, the optimal PR schedule is achieved when the failure probability is the same for different elements, which avoids bottlenecks that increase the mission failure probability. When the mission time is uncertain, a schedule that is optimal for a shorter mission can become far from optimal as the mission time increases. If all of the elements have already operated and the mission still continues, it may be beneficial to reactivate some elements that have already been replaced and put into standby mode. In this work, we develop a model for evaluating the mission success probability (MSP) of warm standby systems that perform missions of uncertain duration with the possibility of reactivating already operated elements, and we further determine the optimal PR schedule maximizing the MSP.

Missions with uncertain duration abound in practice. Examples include auxiliary gas turbine power stations operating during periods of increased demand with uncertain duration, cooling/heating systems whose mission durations depend on weather conditions, and continuous production systems operating according to uncertain market demand.

Considerable research efforts have been dedicated to the optimal PR scheduling or general preventive maintenance (PM) planning problem. For example, in Zhong et al. (2019), a fuzzy multi-objective non-linear programming model was formulated and solved to determine the PM schedule for offshore wind farms that operate under uncertainty. In Lin et al. (2019), a 0–1 programming model was formulated and solved to plan PM for high-speed trains. In Wang, Li, and Xie (2020), an unpunctual PM policy was optimized for warranted items, which allows item owners to postpone or advance scheduled PM within an acceptable range. In Yang, Ye, Lee, Yang, and Peng (2019), a two-phase PM policy was examined for a single-component system, where inspections are conducted in the first phase to reveal the system defective state followed by imperfect maintenance and PRs are conducted during the scheduled window for the second phase. In Mizutani, Dong, Zhao, and Nakagawa (2020), the optimal PR policy was studied for systems with product update announcements. In Hu, Shen, and Shen (2020), based on a continuous-time Markov process, a periodic PM policy was suggested for a single-component system working under time-varying operating conditions. In Sheu, Liu, and Zhang (2019), the PR policy was optimized for systems with random working cycles and undergoing random shocks. In Eryilmaz (2017), a matrix-based model was suggested for determining the optimal PR time of systems under different shock models with phase-type shock inter-arrival time. In Eryilmaz and Kan (2019), the optimal PR policy was studied for systems undergoing extreme shocks and a sudden change point in the distribution of the magnitudes of the external shocks. In Zhao, Chen, and Nakagawa (2020), the PR policy was examined taking into consideration the excess cost (induced by too early replacement) and the shortage cost (induced by too late replacement).
In Panagiotidou (2019), the PR schedule was co-optimized with the spare parts ordering policy for parallel systems with multiple identical components. In Mirahmadi and Taghipour (2019), a stochastic model was formulated and solved to determine the joint optimal production planning and PM policy for manufacturing systems. This co-optimization of production planning and PM policies has also been addressed for multi-state systems (Fitouhi & Nourelfath, 2014) and multi-product production systems (Liu, Wang, & Peng, 2015).

There also exists a limited body of work addressing the PM policy for standby systems. There are three types of standby systems: cold, hot, and warm (Xing, Levitin, & Wang, 2019). A cold standby element is initially unpowered and fully isolated from operating stresses; a hot standby element operates in parallel with the primary online element and thus is fully exposed to the operating stresses; a warm standby element is partially exposed to the operating stresses and thus has a reduced failure rate. Cold and hot standby systems are special cases of the general warm standby system model. A rich body of literature on the reliability of standby systems is available. For example, in Kumar and Jain (2020), a Markov model-based method was applied to evaluate the reliability of a warm standby system considering service interruption, imperfect switching, and rebooting behaviors. In Levitin, Xing, and Xiang (2020), an event transition-based method was proposed to evaluate the reliability of a series phased-mission system with heterogeneous warm standby components. In Jia, Ding, Peng, Liu, and Song (2020), a multi-state decision diagram-based method was presented for the reliability analysis of a warm standby power generation system. In Peiravi, Ardakan, and Zio (2020), an exact Markov model-based method was investigated for evaluating the reliability of a standby system with mixed redundancy types. In Levitin, Xing, Ben Haim, and Huang (2019), a universal generating function-based method was proposed for reliability analysis of a linear consecutive multi-state sliding window system with warm standby components. Examples of recent works on the PM policy for standby systems include Ma, Liu, Yang, Peng, and Zhang (2020), where the condition-based PM policy was investigated for a warm standby two-unit cooling system.
In Luo, Alkhaleel, Liao, and Pascual (2020), the preventive switching strategy was proposed for a one-component deteriorating system with one standby spare part that can also deteriorate. In Barak, Neeraj, and Kumari (2018), the PM rate was investigated for a two-unit cold standby system performing under different weather conditions. In Ruiz-Castro and Dawabsha (2020), based on a Markovian arrival process model with marked arrivals, the condition-based PM was investigated for a multi-state warm standby system subject to both internal failures and external shocks. In Levitin, Finkelstein, and Dai (2018), a shock-based PR policy was optimized for a heterogeneous 1-out-of-N warm standby system subject to both internal failures and external shocks. In Levitin, Xing, and Dai (2017), the PR schedule was co-optimized with the periodic backup policy for a real-time warm standby system that must accomplish a certain amount of work by a hard deadline. In Levitin, Xing, and Dai (2018), the joint PR and backup policy was co-optimized with the standby element sequencing for warm standby systems performing missions without deadlines.

Despite the rich body of work on PM modeling and optimization, to the best of the authors’ knowledge, none of the existing works has modeled the possibility of reactivating already operated elements during the mission when planning the PR schedule for warm standby systems. We make new contributions by evaluating the MSP of such warm standby systems subject to element reactivation and further optimizing the PR schedule to maximize the MSP.

The rest of the paper is arranged as follows: Section 2 presents the standby system model and the problem to be addressed. Section 3 presents the method of deriving the MSP of the considered warm standby system subject to PR and reactivation and the optimization method. Section 4 presents the analysis of an example power supply system of reserve gas turbine generators under two cases (homogeneous and heterogeneous) to illustrate the proposed method and examines the effects of different model parameters on the system MSP and optimization solutions. Section 5 concludes the paper and indicates a few directions of future research.

Section snippets

Standby system operation model

A system consisting of n non-repairable elements with different reliability characteristics has a mission task that requires continuous operation during a random time T. The probability density function (pdf) of T is given as q(t). While one element executes the mission, the remaining elements wait in a warm standby mode. The standby elements do not undergo repair or rehabilitation actions during the mission. Based on a predetermined schedule, PRs are performed to replace the operating element
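
As a hedged sketch of how the random mission time enters the evaluation (the notation R(t) is an assumption introduced here, not taken from the truncated text above): if R(t) denotes the probability that the mission succeeds given that its duration equals t, for a fixed PR schedule and activation sequence, then by the law of total probability the MSP is the expectation of R(T) under the pdf q(t).

```latex
\mathrm{MSP} \;=\; \int_{0}^{\infty} q(t)\, R(t)\, \mathrm{d}t ,
\qquad \text{where} \quad \int_{0}^{\infty} q(t)\, \mathrm{d}t = 1 .
```

How R(t) is assembled from the individual element failure probabilities depends on the schedule and is not specified in this snippet.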

Element failure probability evaluation

An element in the considered system may undergo three different modes before its failure or mission completion: standby, activation process, and operation. The influence of these modes on the element’s failure behavior is addressed through the cumulative exposure model (CEM) (Amari, Misra, & Pham, 2008). Under the CEM, the failure probability or unreliability of an element is a function of a cumulative exposure time t*, which is the summation of the duration of each mode multiplied by a
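
The cumulative exposure idea can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration only: the per-mode multipliers (called `factors` here), the numerical values, and the choice of a Weibull failure time distribution are hypothetical and are not taken from the paper.

```python
import math

def cumulative_exposure_time(durations, factors):
    """Sum of time spent in each mode, weighted by an assumed per-mode factor."""
    return sum(factors[mode] * time_in_mode for mode, time_in_mode in durations.items())

def weibull_failure_probability(t_star, scale, shape):
    """Unreliability F(t*) = 1 - exp(-(t*/scale)^shape); Weibull is an assumed choice."""
    if t_star <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((t_star / scale) ** shape))

# Hypothetical history of one element, in hours spent in each mode.
durations = {"standby": 40.0, "activation": 0.5, "operation": 20.0}

# Hypothetical multipliers: operation is the reference mode (factor 1),
# warm standby ages the element more slowly (factor between 0 and 1).
warm_factors = {"standby": 0.2, "activation": 1.5, "operation": 1.0}
cold_factors = {"standby": 0.0, "activation": 1.5, "operation": 1.0}  # no standby aging
hot_factors = {"standby": 1.0, "activation": 1.5, "operation": 1.0}   # full standby aging

t_star_warm = cumulative_exposure_time(durations, warm_factors)
t_star_cold = cumulative_exposure_time(durations, cold_factors)
t_star_hot = cumulative_exposure_time(durations, hot_factors)

# Assumed Weibull parameters (scale in hours).
p_fail_warm = weibull_failure_probability(t_star_warm, scale=200.0, shape=1.5)
p_fail_cold = weibull_failure_probability(t_star_cold, scale=200.0, shape=1.5)
p_fail_hot = weibull_failure_probability(t_star_hot, scale=200.0, shape=1.5)
```

Under these assumptions, setting the standby multiplier to 0 or 1 recovers cold and hot standby behavior, with the warm case lying in between.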

Examples

Consider a system of reserve gas turbine generators that provide power supply to customers when the demand exceeds the cumulative capacity of coal power stations and renewable energy sources in a region (Billinton & Chowdhury, 1988). The duration of the demand peak depends on the weather conditions and is random. The mission time of the reserve generation system follows a truncated normal distribution with tmin = 60 h, tmax = 100 h, mean μ = 70 h, and standard deviation σ = 10 h. When the loaded turbine
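
The mission time pdf described above can be written down directly. The following minimal Python sketch uses only the standard library and assumes that μ and σ refer to the underlying (untruncated) normal distribution; the function names and the trapezoidal integrator are illustrative choices, not part of the paper.

```python
import math

def standard_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def standard_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mission_time_pdf(t, mu=70.0, sigma=10.0, t_min=60.0, t_max=100.0):
    """pdf q(t) of a normal distribution truncated to [t_min, t_max] (hours)."""
    if t < t_min or t > t_max:
        return 0.0
    lower = (t_min - mu) / sigma
    upper = (t_max - mu) / sigma
    # Probability mass of the untruncated normal inside the truncation interval.
    mass = standard_normal_cdf(upper) - standard_normal_cdf(lower)
    return standard_normal_pdf((t - mu) / sigma) / (sigma * mass)

def trapezoid(f, lo, hi, steps=2000):
    """Composite trapezoidal rule for integrating f over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return total * h

# Sanity check: q(t) integrates to one over the truncation interval.
total_probability = trapezoid(mission_time_pdf, 60.0, 100.0)

# Mean mission time under truncation (generally differs from mu = 70).
mean_mission_time = trapezoid(lambda t: t * mission_time_pdf(t), 60.0, 100.0)
```

The same integrator can be used to average any conditional success probability over q(t) when evaluating the MSP numerically.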

Conclusions

Many studies have shown that the warm standby system design can effectively enhance system reliability while conserving limited system resources (Hadipour et al., 2019; Liu et al., 2020; Xing et al., 2019). Different from the existing works where a standby element’s activation is triggered by the failure of the online operating element, this paper models a new type of warm standby system where the online element is preventively replaced by a standby element according to the

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
