A self-validating control system based approach to plant fault detection and diagnosis

https://doi.org/10.1016/S0098-1354(00)00661-X

Abstract

An approach is proposed in which fault detection and diagnosis (FDD) tasks are distributed to separate FDD modules associated with each control system located throughout a plant. Intended specifically for those control systems that inherently eliminate steady state error, it is modular, steady state based, requires very little process specific information and therefore should be attractive to control systems implementers who seek economies of scale. The approach is applicable to virtually all types of process plant, whether they are open loop stable or not, have a type or class number of zero or not and so on. Based on qualitative reasoning, the approach is founded on the application of control systems theory to single and cascade control systems with integral action. This results in the derivation of cause–effect knowledge and fault isolation procedures that take into account factors like interactions between control systems, and the availability of non-control-loop-based sensors.

Introduction

For the sake of both economy and safety, online process monitoring, fault detection and fault diagnosis have received significant attention in recent years. Becraft, Guo, Lee and Newell (1991) have surveyed a number of methods pertaining to fault diagnosis and have pointed out that each method manages to capture or model some subset of the features of diagnostic reasoning about operating conditions, and thus may be more suitable than other techniques for a particular class of problem. Although a lot of approaches have since been developed, still none can be viewed as having general applicability. Heuristic methods are fast and do not require a plant model, but are comparatively brittle because they cannot handle situations that are not explicitly anticipated. Model-based techniques are less brittle, but pose other problems: most industrial chemical processes are unique, it is expensive to build high-fidelity first-principles models of these processes and very difficult to anticipate all the abnormal situations that might arise; neural network based methods require considerable data and long training times and might have difficulty in diagnosing novel faults.

The main motivation for the development of the method described here is to provide a distributed scheme, because virtually all the other methods take a centralised view in that design and analysis are normally carried out from the top, say from a plant schematic, and FDD tasks are implemented centrally on something like a DCS Supervisor. This appears to be contrary to current developments: to exploit economies of scale, “economic pressures are dispersing machine intelligence away from centralized computers toward distributed (Fieldbus) devices” (Clarke, 1995). If one looks at the measurements collected from a process plant, a large proportion relate to the control loops; the rest are largely collected to ensure that operation is within allowable constraints. Since the hardware modules associated with these control loops are distributed throughout the plant, it seems sensible to distribute associated detection and diagnosis tasks in a similar manner. The role of each of these detection and diagnosis modules would not be confined to the validation of the performance of the closed loops; each module would also monitor the performance of the process located in the proximity of these loops. The boundaries specified for individual module responsibilities are likely to overlap one another, so their union should encompass the entire plant. Economies of scale would be achieved if minimal use could be made of mathematical models. This would facilitate common software that could be configured at the same time as individual loops were tuned. Clearly these economies of scale would be diminished if the algorithms were too plant specific and required knowledge not readily available from the plant.

Such a vision will not be realised easily. Considerable research and debate are required before practicable implementations evolve. The method described here is a contribution towards this goal. Its novelty lies in its focus on the control system and on how it responds, in the steady state, to faults and disturbances both in the control system and in the local plant. Thus the focus is on distribution. The method has many limitations, and issues of implementation have yet to be addressed. However, it might be appropriate for many process plants, particularly those under standard single or cascade PID control. Some of the limitations are as follows:

  1. the controllers themselves must perform to specification and all control loops must guarantee zero steady state error, e.g. because they have integral action;

  2. the fault must persist until a new steady state is reached, and this change in steady state must be detected;

  3. multiple faults can only be diagnosed if they are separated either spatially, i.e. in different parts of the plant, or temporally, i.e. a new steady state is reached before the next fault occurs;

  4. reasoning is performed qualitatively and hence no quantitative results are obtained;

  5. controller outputs, together with measurements of the control variables, must be available as observations.
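The first and last limitations can be illustrated with a minimal simulation sketch. It is not taken from the paper: the first-order process, the PI tuning, the Euler integration and every name below (`simulate_pi_loop`, `sensor_bias`, `disturbance`, and so on) are assumptions chosen only to show how integral action hides a fault from the measurement while leaving a trace in the controller output.

```python
def simulate_pi_loop(setpoint=1.0, sensor_bias=0.0, disturbance=0.0,
                     gain=2.0, tau=1.0, kc=1.0, ti=1.0,
                     dt=0.01, t_end=30.0):
    """Simulate a hypothetical first-order process under PI control.

    Process (assumed): tau * dy/dt = -y + gain * u + disturbance
    Sensor (assumed):  measured = y + sensor_bias
    Returns the final true value, measured value and controller output.
    """
    y = 0.0          # true controlled variable
    integral = 0.0   # integral of the control error
    u = 0.0          # controller output
    for _ in range(int(t_end / dt)):
        measured = y + sensor_bias
        error = setpoint - measured
        integral += error * dt
        u = kc * (error + integral / ti)                 # PI control law
        y += dt * (-y + gain * u + disturbance) / tau    # explicit Euler step
    return {"true": y, "measured": y + sensor_bias, "output": u}


nominal = simulate_pi_loop()
biased = simulate_pi_loop(sensor_bias=0.2)
disturbed = simulate_pi_loop(disturbance=0.2)

for label, result in (("nominal", nominal),
                      ("sensor bias", biased),
                      ("disturbance", disturbed)):
    print(label, {k: round(v, 3) for k, v in result.items()})
```

In this sketch the measured value returns to the setpoint in all three cases, so the measurement alone reveals nothing in the steady state. The controller output shifts in both fault cases, and only the true (unmeasured) controlled variable distinguishes the sensor bias from the disturbance, which is why controller outputs must be available as observations and why additional knowledge is needed for isolation.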

A likely communications architecture is shown in Fig. 1. The detection and diagnosis modules are called SEVACS to highlight a possible relationship with SEVA components as described by Henry and Clarke (1993) and by Clarke (1995). Note that the approach has two components, a distributed component (SEVACS) and a central component (the FDD Supervisor). In the distributed component, candidate sets of faults and disturbances would be hypothesised by reasoning qualitatively about how steady state deviations, observed in the control system, might have been caused. This reasoning process is based on qualitative equations derived for that particular form of control system. In the central component, the candidate sets generated by the various SEVACS are then fused by applying various isolation procedures, all of which take into account known interactions between control systems and sign information output from the SEVACS.
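The division of labour between the SEVACS modules and the FDD Supervisor can be sketched as a simple consistency filter. This is only an illustration under stated assumptions, not the isolation procedures of the paper: the report layout, the idea that each candidate carries a sign it would export to interacting loops, and the names `fuse_candidates`, `reports` and `interaction_signs` are all hypothetical.

```python
def fuse_candidates(reports, interaction_signs):
    """Keep the candidates whose predicted effects match what other loops saw.

    reports: {loop: {"observed": sign, "candidates": [(fault, exported_sign)]}}
        "observed" is the sign (-1, 0, +1) of the steady state deviation the
        module reported; "exported_sign" is the (assumed) sign of the deviation
        the candidate would pass on to interacting loops.
    interaction_signs: {(source_loop, destination_loop): +1 or -1}
        Known signs of the steady state gains between interacting loops.
    """
    surviving = []
    for loop, report in reports.items():
        for fault, exported_sign in report["candidates"]:
            # Test the hypothesis against every loop this one is known to affect.
            consistent = all(
                exported_sign * gain_sign == reports[dst]["observed"]
                for (src, dst), gain_sign in interaction_signs.items()
                if src == loop and dst in reports
            )
            if consistent:
                surviving.append((loop, fault))
    return surviving


# Hypothetical reports from two modules after a change in steady state.
reports = {
    "level": {"observed": -1,
              "candidates": [("sensor bias", -1), ("local disturbance", 0)]},
    "temperature": {"observed": -1,
                    "candidates": [("sensor bias", +1)]},
}
interaction_signs = {("level", "temperature"): +1}

print(fuse_candidates(reports, interaction_signs))
```

Here the "local disturbance" hypothesis in the level loop is rejected because it predicts no deviation in the temperature loop, whereas one was observed. Candidates in loops with no known outgoing interactions cannot be tested this way and are simply retained.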

It is envisioned that this approach would be implemented in two stages, as part of the (offline) design stage and then, online, during commissioning. At the offline stage the plant would be decomposed into manageable compartments and each control system would be considered in turn. The following characteristics would then be identified/hypothesised for each control system:

  • its structure, i.e. whether it has a standard form like a single loop or double, cascade loops or whether a new special form has to be recognised;

  • the process Type Number (Dorf & Bishop, 1995) (if known);

  • open loop stability (if known);

  • steady state gains between interacting loops (if known).

Based on these characteristics, an appropriate configuration would then be downloaded to each of the detection and diagnosis modules. During commissioning, online procedures would then be executed to obtain those items above that were still unknown. In addition, the modules would be configured to detect changes in steady state and the FDD supervisor would be specified.
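The configuration downloaded to each module could be represented as a small record in which unknown items are left empty until commissioning. The sketch below is an assumption about how such a record might look; the class names, field names and loop tags (`LoopConfig`, `Structure`, `TC-1`, `LC-1`) are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Structure(Enum):
    SINGLE_LOOP = "single loop"
    CASCADE = "cascade"
    SPECIAL = "special form"


@dataclass
class LoopConfig:
    """Offline configuration of one detection and diagnosis module."""
    name: str
    structure: Structure
    type_number: Optional[int] = None          # None means "not yet known"
    open_loop_stable: Optional[bool] = None    # None means "not yet known"
    # Steady state gain towards each interacting loop; None if still unknown.
    interaction_gains: Dict[str, Optional[float]] = field(default_factory=dict)

    def unknown_items(self) -> List[str]:
        """List the items that online commissioning procedures must supply."""
        missing = []
        if self.type_number is None:
            missing.append("type_number")
        if self.open_loop_stable is None:
            missing.append("open_loop_stable")
        for other, gain in self.interaction_gains.items():
            if gain is None:
                missing.append(f"interaction_gain:{other}")
        return missing


config = LoopConfig(
    name="TC-1",
    structure=Structure.SINGLE_LOOP,
    type_number=0,
    interaction_gains={"LC-1": None},
)
print(config.unknown_items())
```

Keeping "unknown" explicit (rather than defaulting to a guess) makes it straightforward to see which items remain to be obtained during commissioning.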

In the approach proposed here fault isolation is achieved by reasoning about steady state deviations in measured variables by referring to cause–effect knowledge of individual control systems. Section 2 analyses how standard control systems respond to faults and disturbances, in general, by referring to linear control systems theory and to signed-directed-graph (SDG) representations. Both linear and weakly non-linear (i.e. linearisable) processes are considered. Section 3 describes the cause–effect knowledge that can be generated from this analysis, and which can be downloaded to the various modules. Issues of generality, in terms of the diversity of disturbances and non-linearities, are discussed in Section 7.

It is envisioned that the SEVACS modules would output analyses every time the process reached a new steady state. By applying a choice of reasoning processes that are based on a search and test strategy described in Sections 4 and 5, the Supervisor would then fuse these analyses to isolate the cause. The level of isolation depends on the knowledge available and SDGs can facilitate this. The procedure described in Section 4 is somewhat specific; a more general, but more complicated, procedure is offered in Section 5 as an alternative. The approach is demonstrated using a simple CSTR example in Section 6.

The approach depends on the availability of procedures to detect a change in steady state. Although this problem is not new (see for instance, Cao & Rhinehart, 1995, Cao & Rhinehart, 1997, Theilliol, Weber, Ghetie & Noura, 1995), it is not clear whether steady state identifiers have been implemented successfully on a large scale plant. Albers (1997) has discussed the application of steady state identifiers to data reconciliation and error detection, whilst workers on real-time optimisers (see for instance, Pierucci, Brandani, Ranzi & Sogaro 1996) and PCA (see for instance, Vedam & Venkatasubramanian, 1999) do not comment on the issue. Fig. 2b shows the kind of output that is sought: the temperature data of Fig. 2a has been converted into a time series of R-statistics (Cao & Rhinehart, 1995, Cao & Rhinehart, 1997), which in turn has been analysed by applying a hypothesis test to detect if a change has occurred; this has then been automatically interpreted into the more meaningful form shown. The key to Fig. 2b is as follows: level 3 occurs when it is not possible to make a decision because of insufficient data at that time, level 2 arises when no-decision is available because the plant is not in a steady state, levels ±1 arise when a new steady state is deemed to have been obtained (the sign denotes the direction), level 0 is obtained when this new steady state is ‘accepted’ and the test procedure is reset. Thus the change in steady state, which can be observed in Fig. 2a, is detected about 1 h later (Fig. 2b) and the deviation is negative in direction.
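As a rough illustration of the first step in that chain, the sketch below computes a filter-based ratio statistic in the form commonly attributed to Cao and Rhinehart: an exponentially filtered estimate of the variance about the filtered mean is divided by an estimate based on successive differences. The filter factors and the function name `r_statistics` are assumptions, and the subsequent hypothesis test and the mapping to the levels of Fig. 2b are not reproduced here.

```python
import random


def r_statistics(samples, lam1=0.2, lam2=0.1, lam3=0.1):
    """Return a time series of R-statistics for a sequence of measurements.

    R stays near 1 while the signal is noise about a constant mean and grows
    when the mean is moving. The filter factors lam1..lam3 are assumed values.
    """
    filtered = samples[0]   # exponentially filtered measurement
    nu2 = 0.0               # filtered squared deviation from the filtered value
    delta2 = 0.0            # filtered squared successive difference
    previous = samples[0]
    ratios = []
    for x in samples[1:]:
        # The deviation uses the filtered value from the previous sample.
        nu2 = lam2 * (x - filtered) ** 2 + (1.0 - lam2) * nu2
        filtered = lam1 * x + (1.0 - lam1) * filtered
        delta2 = lam3 * (x - previous) ** 2 + (1.0 - lam3) * delta2
        previous = x
        ratios.append((2.0 - lam1) * nu2 / delta2 if delta2 > 0 else float("nan"))
    return ratios


rng = random.Random(0)
steady = [50.0 + rng.gauss(0.0, 0.5) for _ in range(2000)]   # noise about a constant mean
ramp = [50.0 + 0.05 * i for i in range(500)]                 # steadily drifting signal

r_steady = r_statistics(steady)
r_ramp = r_statistics(ramp)
print(sum(r_steady[-1000:]) / 1000, r_ramp[-1])
```

A detector built on this series would compare R against critical values to decide whether the plant is in a steady state; choosing those values, and handling the start-up period when too few samples are available, are the parts left out of this sketch.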

The steady state requirement is quite demanding. Fortunately this requirement can be loosened, at least in one aspect, that of long settling times, which have the potential to render the approach impracticable. An early decision on responses with long settling times can be made, provided asymptotic trajectories can be detected as such. Another issue is that of time delays. Being steady state based, the concept is independent of any time delays in the plant. However time delays are likely to make steady state change detection more difficult, increase the time before a new steady state can be identified and cause confusions, for instance if one part is deemed to be in steady state whilst another is not. Although important, these issues were outside the scope of the work described here.

A considerable amount of literature has been published about fault detection and diagnosis. The focus here is on the two most relevant aspects: on distributing diagnostic tasks to control systems and on those non-distributed methods that might be viewed as having some similarities with the approach described here.

Little has been written about distributing diagnostic tasks presumably because traditionally, the diagnostic engineer's view of feedback control is that it complicates, rather than aids, diagnostic reasoning. Feedback control adds to the complexity of fault detection in process plants by masking measurement deviations that might indicate a fault, and by making it difficult to distinguish between a sensor, actuator, or plant failure (Henry & Clarke, 1993). Control systems offer little decision-making assistance to an operator during the occurrence of process faults or abnormal disturbances, and in many cases, the actions of the control system can mask manifestations of the fault that would aid the operator in determining the cause of the process fault (Wilcox & Himmelblau, 1994a, Wilcox & Himmelblau, 1994b). A similar research activity is that of control loop performance monitoring (Harris, 1989, Desborough & Harris, 1992, Stanfelj, Marlin & MacGregor, 1993, Tyler & Morari, 1996, Kesavan & Lee, 1997, Thornhill, Oettinger & Fedenczuk, 1999). This differs from that proposed here because it focuses solely on the control loop. Other researchers seek to partition FDD tasks as opposed to distribute them. See for instance, Finch and Kramer (1988) and Prasad, Davis, Jirapinyo, Josephson and Bhalodia (1998) who examine ways in which diagnostic knowledge can be structured for large-scale process systems. 
It is difficult to see how ‘distributed’ versions of many techniques could be obtained: for instance multivariable statistical process control (MSPC), gross error detection and data reconciliation (Rollins & Davis, 1992, MacGregor & Kourti, 1995, Tong and Crowe, 1995, Albuquerque and Biegler, 1996, Crowe, 1996, Heyen, Marechal & Kalitventzeff, 1996, Schraa & Crowe, 1996, Bagajewicz and Jiang, 1997, Bakshi, 1998, Dunia & Qin, 1998, Luo, Misra, Qin, Barton & Himmelblau, 1998, Jiang, Sanchez & Bagajewicz, 1999, Martin, Morris & Kiparissides, 1999, Nounou and Bakshi, 1999, Sanchez, Romagnoli, Jiang & Bagajewicz, 1999, Shao, Jia, Martin and Morris, 1999). An alternative avenue of research that might be worth pursuing is based on neural networks and wavelets (see for instance non-distributed approaches by Rengaswamy and Venkatasubramanian, 1995, Vedam & Venkatasubramanian, 1997, Chen, Wang, Yang & McGreavy, 1999, Wang, Chen, Yang & McGreavy, 1999).

Based on qualitative reasoning, the proposed approach has been developed, in part, by referring to SDG representations of control systems. Although the application of SDG-based reasoning to fault diagnosis is not new, previous work has focused on the process plant, with its associated control and sensory systems, as a single entity (e.g. Shiozaki, Matsuyama, Tano & O'Shima, 1984, Tsuge et al., 1984a, Tsuge et al., 1984b, Kramer & Palowitch, 1987, Kutsuwa, Kojima & Matsuyama, 1988, Mo, Lee, Nam, Yoon & Yoon, 1997). Although not directly relevant because the approach cannot be distributed, it is interesting that Vedam and Venkatasubramanian (1999) have developed a hybrid approach based on PCA and SDG because PCA is a steady-state based approach. Wang, Yang, Veloso, Lu and McGreavy (1995) have applied a fuzzy qualitative reasoning method to assess process plants, whilst Lunze and Schiller (1999) have explored fault diagnosis based on qualitative and probabilistic logic models. Finally, the reader is referred to Chantler, Coghill, Shen and Leitch (1998), which is considered noteworthy because it outlines various implementations that have been examined in realistic situations.

Looking into the future, Fig. 3 summarises the overall procedure that might be applied if the approach was to be implemented on a large-scale plant.

Representational issues for SEVACS knowledge generation

This section examines various ways that cause–effect knowledge can be represented to facilitate its generation. The first step is to introduce nomenclature relating to block diagram representations of two standard control systems (Section 2.1). These block diagrams are then analysed in Section 2.2 to produce equations that can generate cause–effect knowledge. An alternative representation (based on SDGs) is then considered in Section 2.3.
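For readers unfamiliar with SDGs, the following sketch shows the basic idea of sign propagation in a signed directed graph: the predicted sign of a deviation at a node is the product of the edge signs along the path from the root cause. The graph, the node names and the function `propagate` are hypothetical illustrations and are not the SDGs derived in the paper.

```python
from collections import deque


def propagate(sdg, root, root_sign):
    """Propagate a qualitative deviation through a signed directed graph.

    sdg: {node: [(child, edge_sign), ...]} with edge_sign equal to +1 or -1.
    Returns (signs, conflicts): signs holds the first sign found for each
    reachable node; conflicts holds nodes reached with contradictory signs.
    """
    signs = {root: root_sign}
    conflicts = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child, edge_sign in sdg.get(node, []):
            predicted = signs[node] * edge_sign
            if child not in signs:
                signs[child] = predicted
                queue.append(child)      # each node is expanded only once
            elif signs[child] != predicted:
                conflicts.add(child)     # two paths disagree: qualitatively ambiguous
    return signs, conflicts


# Hypothetical cause-effect structure, for illustration only.
sdg = {
    "coolant_flow": [("temperature", -1)],
    "temperature": [("reaction_rate", +1)],
    "reaction_rate": [("concentration", -1)],
}

signs, conflicts = propagate(sdg, "coolant_flow", +1)
print(signs, conflicts)
```

Because each node is expanded only once, the search also terminates on graphs that contain cycles, such as those introduced by feedback loops. Nodes listed in `conflicts` are ones where qualitative reasoning alone cannot fix the direction of the deviation.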

SEVACS cause–effect knowledge

Results from the previous section can now be applied to generate tables of cause–effect knowledge, which can be downloaded to the SEVACS. The contents of the tables differ depending on whether or not the process has a Type Number of zero. Here the focus is on controlled processes with a Type Number of zero; the other case is given in Appendix C. , , , , , , were referred to extensively when deriving this knowledge. Tables 2 and 3 describe the various effects that individual faults would have

Fault isolation

It is very easy to isolate faults like a dead sensor or a large exogenous fault/disturbance or a sticking valve, by applying appropriate simple heuristic rules. It is more difficult to isolate faults like a sensor bias, or a small exogenous fault/disturbance. In these circumstances, the controlled variable, and its effect on descendants, are arguably the key to fault isolation: with a sensor bias, the controlled variable will deviate from its nominal value and descendants of the controlled

An alternative fault isolation method for interacting control systems

The procedures described in the last section require different knowledge or rules for different processes. This section describes an alternative approach that is easier to realise as an auto-reasoning algorithm in a real-time fault diagnosis system. It does not rely on a large number of rules. The approach is to modify the SDG representation of interacting control systems by breaking the interaction between two control systems, and then to use the knowledge relating to the single or cascade

A CSTR example

The CSTR process is shown in Fig. 21: there are two outlets, the flow rate, F1, is manipulated to regulate level, L, whilst a nominally constant flow rate, F, is drawn for a separate purpose; the reactor temperature, T, is maintained by varying the flow rate, FJ, through a heat exchanger installed in the CSTR and, in addition, concentration CA is measured. Its design is somewhat contrived to enable bi-directional interaction between two of the control systems. A detailed description, including

Conclusions

A self-validating control system (SEVACS) based approach to plant fault detection and diagnosis has been proposed that enables the distribution of these tasks throughout a plant. The approach itself is targeted at control systems that inherently eliminate steady state error; it is modular, steady state based, requires very little process specific information and should therefore be attractive to control systems implementers who seek economies of scale. Considerable effort has been expended to

References (59)

  • J.F. MacGregor et al.

    Statistical process control of multivariate processes

    Control Engineering Practice

    (1995)
  • E.B. Martin et al.

    Manufacturing performance enhancement through multivariate statistical process control

    Annual Reviews in Control

    (1999)
  • S. Mohindra et al.

    A distributed fault diagnosis method based on digraph models: steady-state analysis

    Computers & Chemical Engineering

    (1993)
  • P.R. Prasad et al.

    Structuring diagnostic knowledge for large-scale process systems

    Computers & Chemical Engineering

    (1998)
  • R. Rengaswamy et al.

    Syntactic pattern-recognition approach for process monitoring and fault diagnosis

    Engineering Applications of Artificial Intelligence

    (1995)
  • M. Sanchez et al.

    Simultaneous estimation of biases and leaks in process plants

    Computers and Chemical Engineering

    (1999)
  • O.J. Schraa et al.

    Numerical solution of bilinear data reconciliation problems using unconstrained optimization methods

    Computers & Chemical Engineering

    (1996)
  • R. Shao et al.

    Wavelets and non-linear principal components analysis for process monitoring

    Control Engineering Practice

    (1999)
  • J. Shiozaki et al.

    An improved algorithm for diagnosis of system failures in the chemical process

    Computers & Chemical Engineering

    (1985)
  • N.F. Thornhill et al.

    Refinery-wide control loop performance assessment

    Journal of Process Control

    (1999)
  • M.L. Tyler et al.

    Performance monitoring of control systems using likelihood methods

    Automatica

    (1996)
  • T. Umeda et al.

    A graphical approach to cause and effect analysis of chemical processing systems

    Chemical Engineering Science

    (1980)
  • H. Vedam et al.

    PCA-SDG based process monitoring and fault diagnosis

    Control Engineering Practice

    (1999)
  • X.Z. Wang et al.

    Application of wavelets and neural networks to diagnostic system development, 2, an integrated framework and its application

    Computers and Chemical Engineering

    (1999)
  • N.A. Wilcox et al.

    The possible cause and effect graphs (PCEG) model for fault diagnosis — I. Methodology

    Computers & Chemical Engineering

    (1994)
  • N.A. Wilcox et al.

    The possible cause and effect graphs (PCEG) model for fault diagnosis — II. Applications

    Computers & Chemical Engineering

    (1994)
  • J.E. Albers

    Online data reconciliation and error detection

    Hydrocarbon Processing

    (1997)
  • J.S. Albuquerque et al.

    Data reconciliation and gross-error detection for dynamic systems

    AIChE Journal

    (1996)
  • M.J. Bagajewicz et al.

    Integral approach to plant linear dynamic reconciliation

    AIChE Journal

    (1997)