Maintenance reliability estimation for a cloud computing network with nodes failure
Highlights
► An estimated maintenance reliability is proposed to evaluate the performance of a cloud computing network.
► A maintenance model is proposed to guarantee that the network keeps sufficient capacity.
► The node failure case is considered in this paper.
► Data are transmitted through multiple paths to shorten the transmission time.
Introduction
In recent decades, information technology has developed rapidly. Because of the growing need for large amounts of resources (computing resources, storage capacity, or network bandwidth), cloud computing has been developed to meet these requirements. In a cloud computing paradigm, information is processed or stored by servers on the internet and cached temporarily on clients (Hewitt, 2008). All the resources are provided by powerful servers, which are depicted as the “cloud” in a cloud computing network (CCN). For a stable network environment, internet service providers must guarantee that the CCN maintains a good quality of service (QoS) and satisfies their customers/clients at all times.
For a practical CCN, the capacity of each edge (physical lines, fiber optics, or coaxial cables) and node (servers or switches) should be regarded as stochastic due to failure, partial failure, or maintenance. Thus, each edge/node has several possible capacities or states (Chen and Lin, 2009; Lin, 2004; Lin, 2007; Lin, 2010; Jane et al., 1993; Xue, 1985). To guarantee that the CCN keeps a stable QoS, it should be maintained when it falls to a state in which the cloud cannot provide enough capacity to fulfill the client’s demand d. Thus, the maintenance budget should be considered. According to Yeh’s (2004) definition, the maintenance cost is the overall cost of restoring a network from its failed state back to its original state, where the failed state is one in which the network sends less than the given d units of data. That is, the edges/nodes in the CCN should be recovered to their highest capacities when fewer than d units of data can be sent.
Furthermore, the transmission time needed to send data from the cloud to the client is another important issue. When data are transmitted through a CCN, it is necessary to select the path with the shortest delay to minimize the transmission time (Bodin et al., 1982; Golden and Magnanti, 1977). However, the flow of data transmission is not considered in these works. In order to find a path that sends the given amount of data from the cloud to the client with minimum transmission time, Chen and Chin (1990) proposed a version of the shortest path problem called the quickest path problem. In this problem, each edge has both a capacity and a lead time, and both are assumed to be deterministic (Chen and Chin, 1990; Hung and Chen, 1992; Martins and Santos, 1997). Variants of the quickest path problem, such as the constrained quickest path problem (Chen and Hung, 1994; Chen and Tang, 1998), the first k quickest path problem (Clímaco et al., 2007; Pascoal et al., 2005), and the all-pairs quickest path problem (Chen and Hung, 1993; Lee, 1993), were subsequently proposed. To shorten the transmission time, the data can be transmitted through k (k ⩾ 2) disjoint minimal paths (MPs) simultaneously, where an MP is a path whose proper subsets are no longer paths. However, these studies assume perfect nodes.
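The trade-off between lead time and capacity in the quickest path problem can be illustrated with a small sketch. It assumes the commonly used transmission-time model, total lead time plus the demand divided by the bottleneck capacity (rounded up here); the path data and the function names `transmission_time` and `quickest_path` are hypothetical and not taken from the paper.

```python
from math import ceil

def transmission_time(path, demand):
    """Time to send `demand` units through one path.

    `path` is a list of (capacity, lead_time) pairs, one per edge.
    Assumed model: total lead time + ceil(demand / bottleneck capacity).
    """
    bottleneck = min(cap for cap, _ in path)
    if bottleneck <= 0:
        return float("inf")  # a failed edge blocks the whole path
    total_lead = sum(lead for _, lead in path)
    return total_lead + ceil(demand / bottleneck)

def quickest_path(paths, demand):
    """Name of the candidate path with the smallest transmission time."""
    return min(paths, key=lambda name: transmission_time(paths[name], demand))

# Hypothetical candidate paths: (capacity, lead_time) per edge.
paths = {
    "P1": [(5, 2), (4, 3)],  # bottleneck 4, total lead time 5
    "P2": [(2, 1), (3, 1)],  # bottleneck 2, total lead time 2
}

small = quickest_path(paths, 4)   # short lead time dominates
large = quickest_path(paths, 20)  # high capacity dominates
```

With these numbers the low-lead-time path is quickest for a small demand, while the high-capacity path is quickest for a large one, which is why the quickest path depends on the amount of data and not only on the network.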
In the CCN, nodes act as servers/switches and, like edges, may fail due to unexpected malfunctions. Therefore, failure, maintenance action, and transmission time need to be considered on nodes as well. Aggarwal, Gupta, and Misra (1975) proposed the concept that the failure of a node implies the failure of the edges incident from it. Based on this concept, further related works transformed the original network with node failure into a conventional network with perfect nodes (Lin, 2001; Lin, 2004; Lin, 2007). With node failure, we present an algorithm to estimate the probability that the CCN can send d units of data from the cloud to the client under both the maintenance budget B and the time constraint T. This probability is named the maintenance reliability herein. A bounding approach is first proposed to generate two sets of capacity vectors, {UB-MPs} and {LB-MPs}, where a UB-MP is a minimal capacity vector satisfying d and T while an LB-MP is a minimal capacity vector satisfying d, B, and T. The estimate of the maintenance reliability is then derived in terms of the UB-MPs and LB-MPs by the Recursive Sum of Disjoint Products (RSDP) algorithm. The remainder of this paper is organized as follows. Notations and assumptions are described in Section 2. The CCN model and the maintenance reliability are described in Section 3. An algorithm to generate the UB-MPs and LB-MPs is proposed in Section 4. An example presented in Section 5 illustrates the algorithm and how the maintenance reliability may be calculated.
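To make the role of a set of minimal capacity vectors concrete, the following sketch computes the probability that the random capacity vector X dominates at least one vector in a given set. It uses plain inclusion-exclusion instead of the RSDP algorithm named above, and it assumes that components are statistically independent with known capacity distributions; the distributions, the vectors, and the function names are hypothetical.

```python
from itertools import combinations

def prob_at_least(vector, dists):
    """Pr{X >= vector} componentwise, assuming independent components.

    dists[i] maps each capacity of component i to its probability.
    """
    p = 1.0
    for need, dist in zip(vector, dists):
        p *= sum(prob for cap, prob in dist.items() if cap >= need)
    return p

def union_probability(vectors, dists):
    """Pr{X >= v for at least one v in vectors} by inclusion-exclusion."""
    total = 0.0
    for size in range(1, len(vectors) + 1):
        sign = 1 if size % 2 else -1
        for subset in combinations(vectors, size):
            # X dominates every vector in the subset exactly when it
            # dominates their componentwise maximum.
            joint = [max(col) for col in zip(*subset)]
            total += sign * prob_at_least(joint, dists)
    return total

# Hypothetical two-component system with three capacity levels each.
dists = [{0: 0.1, 1: 0.3, 2: 0.6}, {0: 0.1, 1: 0.3, 2: 0.6}]
vectors = [(2, 0), (1, 1)]
reliability = union_probability(vectors, dists)  # 0.6 + 0.81 - 0.54
```

Inclusion-exclusion grows exponentially with the number of vectors, which is one reason disjoint-product methods such as RSDP are preferred for larger sets.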
Section snippets
Notations and assumptions
Let G = (N, E, W, C, L) denote a CCN with a cloud Scloud and a client Sclient, where E = {ei∣i = 1, 2, …, n} represents the set of edges, N = {ei∣i = n + 1, n + 2, …, n + r} represents the set of nodes, W = {Wi∣i = 1, 2, …, n + r} with Wi the maximal capacity of ei, C = {ci∣i = 1, 2, …, n + r} with ci the unit maintenance cost of ei, and L = {li∣i = 1, 2, …, n + r} with li the lead time of ei. Suppose the data can be transmitted through P1, P2, …, Pk simultaneously, where Pm is the mth MP for m = 1, 2, …, k. The capacity vector X = (x1, x2, …, xn+r) is
The CCN model and maintenance reliability
The maintenance cost is calculated in terms of the amount of capacity that each edge/node needs to have restored. The total cost to recover the edges/nodes in a CCN from the state X is the sum of the terms ci(Wi − xi) over the edges/nodes ei on the MPs, where ci(Wi − xi) is the maintenance cost for ei on any MP to recover from the current capacity xi to its highest capacity Wi. For instance, given the current capacity vector X = (1, 0, 1, 1, 0, 0, 1, 1), the maximal capacity vector W = (3, 3, 3, 1, 2, 4, 5, 4), and the unit maintenance cost C = (25, 15, 25, 40,
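The per-component cost ci(Wi − xi) can be summed directly, as in the minimal sketch below. The vectors X and W are the ones quoted above; the unit costs are a placeholder (10 per unit for every component) and not the paper's cost vector, and the sketch assumes that all eight components lie on the MPs. The budget value `B_hypothetical` is likewise an assumption.

```python
def maintenance_cost(current, maximal, unit_cost):
    """Total cost of restoring every component from x_i to W_i."""
    return sum(c * (w - x) for x, w, c in zip(current, maximal, unit_cost))

X = (1, 0, 1, 1, 0, 0, 1, 1)   # current capacity vector (from the text)
W = (3, 3, 3, 1, 2, 4, 5, 4)   # maximal capacity vector (from the text)
C_placeholder = (10,) * 8      # placeholder unit costs, NOT the paper's values

total = maintenance_cost(X, W, C_placeholder)

B_hypothetical = 250           # assumed maintenance budget
within_budget = total <= B_hypothetical
```

Here 20 units of capacity must be restored in total, so the placeholder cost is 200, which fits the assumed budget.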
The algorithm to generate UB-MPs and LB-MPs
All UB-MPs and LB-MPs can be generated by the following steps.
- Step 0. [Initialization] Set ΦUB = ∅, ΦLB = ∅, and j = 0.
- Step 1. Find the largest assigned demand such that .
- Step 2. [Generation of feasible demand vector d] Generate all non-negative integer solutions of where .
- Step 3. [Generation of UB-MPs] For each demand vector d, do the following steps.
  - 3.1 Find the minimal capacity vm of Pm such that dm units of data can be sent through Pm under T, m = 1, 2, …, k. That is, find the
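Steps 2 and 3.1 can be sketched under explicit assumptions, because the formulas are not recoverable from the steps as given. The sketch assumes that a feasible demand vector is a non-negative integer split (d1, …, dk) of d with each dm bounded by a per-path maximum, and that sending dm units through a path with total lead time l and capacity v takes l + ⌈dm / v⌉ time units, with integer l and T. The function names and sample numbers are hypothetical.

```python
from itertools import product
from math import ceil

def demand_vectors(d, max_demand):
    """All non-negative integer (d_1, ..., d_k) with sum d and d_m <= max_demand[m]."""
    ranges = (range(min(cap, d) + 1) for cap in max_demand)
    return [v for v in product(*ranges) if sum(v) == d]

def minimal_capacity(d_m, lead_time, T):
    """Smallest integer capacity v with lead_time + ceil(d_m / v) <= T.

    Returns 0 when no data is assigned and None when no capacity suffices.
    Assumes integer lead_time and T.
    """
    if d_m == 0:
        return 0
    slack = T - lead_time  # time left for pushing data through the path
    if slack < 1:
        return None
    return ceil(d_m / slack)

# Hypothetical instance: d = 5 over two paths with maximum demands 3 and 4.
vectors = demand_vectors(5, (3, 4))

# Hypothetical path with lead time 2 under time constraint T = 4.
v_min = minimal_capacity(3, 2, 4)
```

Enumerating with `product` is adequate for a handful of paths; a recursive generator would avoid visiting infeasible combinations when k or d is large.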
An illustrative example
A random network with 12 edges and 6 nodes subject to failure, shown in Fig. 1, is used to illustrate the solution process. In this example, each edge consists of several OC-18 (Optical Carrier 18) lines, and each line provides two capacities, 1 Gbps (gigabits per second) and 0 bps. Since the lines are provided by different suppliers, the edges’ capacities follow different probability distributions. The capacity, lead time, and per unit maintenance cost of each edge are shown in Table 1.
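One way such an edge capacity distribution can arise is sketched below. It assumes that the lines of an edge fail independently and share the same reliability, so the number of working lines (and hence the capacity in Gbps) is binomial; the line count and reliability are hypothetical and do not come from Table 1.

```python
from math import comb

def edge_capacity_distribution(num_lines, line_reliability):
    """Pr{edge capacity = j Gbps} for j = 0..num_lines.

    Assumes independent lines, each providing 1 Gbps with probability
    `line_reliability` and 0 bps otherwise.
    """
    p = line_reliability
    return {
        j: comb(num_lines, j) * p**j * (1 - p) ** (num_lines - j)
        for j in range(num_lines + 1)
    }

# Hypothetical edge: 3 lines, each working with probability 0.9.
dist = edge_capacity_distribution(3, 0.9)
```

Edges built from lines with different reliabilities would have different distributions, in line with the remark about suppliers above.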
In this example,
Summary
For a practical CCN, the capacity of each edge and node should be regarded as stochastic due to failure, partial failure, or maintenance. That is, failure, maintenance action, and transmission time need to be considered on nodes as well as on edges. In this paper, when the CCN falls to the failed state where it cannot provide enough capacity to satisfy the client’s requirements, maintenance action should be taken on each edge/node to keep a good QoS. Moreover, the transmission time that
References (27)
- et al. The quickest path problem. Computers and Operations Research (1990)
- et al. On the quickest path problem. Information Processing Letters (1993)
- et al. On performance evaluation of ERP systems with fuzzy mathematics. Expert Systems with Applications (2009)
- et al. Minimum time paths in a network with mixed time constraints. Computers and Operations Research (1998)
- et al. Internet packet routing: Application of a K-quickest path algorithm. European Journal of Operational Research (2007)
- et al. Distributed algorithms for the quickest path problem. Parallel Computing (1992)
- et al. The all-pairs quickest path problem. Information Processing Letters (1993)
- A simple algorithm for reliability evaluation of a stochastic-flow network with node failure. Computers and Operations Research (2001)
- A novel algorithm to evaluate the performance of stochastic transportation systems. Expert Systems with Applications (2010)
- et al. An algorithm for the quickest path problem. Operations Research Letters (1997)
- An algorithm for ranking quickest simple paths. Computers and Operations Research
- Multistate network reliability evaluation under the maintenance cost constraint. International Journal of Production Economics
- A simple method for reliability evaluation of a communication system. IEEE Transactions on Communications