Automating the addition of fault tolerance with discrete controller synthesis

Formal Methods in System Design

Abstract

Discrete controller synthesis (DCS) is a formal approach based on the same state-space exploration algorithms as model checking. Its appeal lies in automatically producing systems that satisfy, by construction, formal properties specified a priori. In this paper, we demonstrate the feasibility of this approach for fault tolerance. We start with a fault-intolerant program, modeled as the synchronous parallel composition of finite labeled transition systems; we formally specify a fault hypothesis; we state fault tolerance requirements; and we use DCS to automatically obtain a program that behaves like the initial fault-intolerant one in the absence of faults and satisfies the fault tolerance requirements under the fault hypothesis. Our original contribution is to demonstrate that DCS can be used elegantly to design fault-tolerant systems, with guarantees on key properties of the resulting system, such as the fault tolerance level and the satisfaction of quantitative constraints. Using numerous examples taken from case studies, we show that our method can address different kinds of failures (crash, value, or Byzantine) affecting different kinds of hardware components (processors, communication links, actuators, or sensors). We also show that our method offers an optimality criterion that is very useful for synthesizing fault-tolerant systems that comply with the constraints of embedded systems, such as power consumption.
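As a rough illustration of the workflow the abstract describes, the Python sketch below runs a textbook safety synthesis on a toy model. It is not the authors' tool, models, or algorithm. Everything in it is an assumption made for the example: one task placed on one of two processors, a fault hypothesis of at most one crash, the controllable choices `stay` and `move`, and the helper names `faults`, `step`, `safe`, and `synthesize`. The requirement is that the task never runs on a crashed processor.

```python
# Toy safety synthesis on a finite transition system (illustrative only).
# A state is (loc, failed): the processor hosting the task and the set of
# crashed processors. All names here are hypothetical.

PROCS = (1, 2)
CHOICES = ("stay", "move")  # controllable inputs


def faults(state):
    """Uncontrollable inputs permitted by the fault hypothesis (at most one crash)."""
    _, failed = state
    return ("ok",) if failed else ("ok", "fail1", "fail2")


def step(state, fault, choice):
    """One synchronous reaction: the fault and the controller's choice apply together."""
    loc, failed = state
    if fault != "ok":
        failed = failed | {int(fault[-1])}
    if choice == "move":
        loc = 3 - loc  # switch to the other processor
    return (loc, failed)


def safe(state):
    """Fault tolerance requirement: the task is not on a crashed processor."""
    loc, failed = state
    return loc not in failed


def synthesize(states):
    """Greatest fixpoint: keep states from which every fault has a safe answer."""
    win = {s for s in states if safe(s)}
    changed = True
    while changed:
        changed = False
        for s in list(win):
            # s is lost if some fault leaves no choice that stays inside win
            if any(all(step(s, u, c) not in win for c in CHOICES) for u in faults(s)):
                win.discard(s)
                changed = True
    # Maximally permissive controller: every choice that stays inside win
    controller = {
        (s, u): {c for c in CHOICES if step(s, u, c) in win}
        for s in win
        for u in faults(s)
    }
    return win, controller


STATES = [(loc, frozenset(f)) for loc in PROCS for f in ((), (1,), (2,))]
win, controller = synthesize(STATES)
init = (1, frozenset())
```

In this sketch the initial state `init` is winning, so a controller exists. When no fault occurs it leaves both choices open, which mirrors the idea of preserving the original behavior in the absence of faults. When processor 1 crashes while hosting the task, the only choice it allows is `move`.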

Author information

Corresponding author

Correspondence to Alain Girault.

Additional information

Research of A. Girault was supported by a Marie Curie International Outgoing Fellowship within the 7th European Community Framework Programme.

About this article

Cite this article

Girault, A., Rutten, É. Automating the addition of fault tolerance with discrete controller synthesis. Form Methods Syst Des 35, 190–225 (2009). https://doi.org/10.1007/s10703-009-0084-y

  • DOI: https://doi.org/10.1007/s10703-009-0084-y
