Software Fault Tolerance

  • Chapter
Failsafe Control Systems

Abstract

Programmable electronic systems such as microprocessors or transputers offer high computational power, high reliability and low power consumption at a low cost. Their use in industry has increased significantly in recent years, particularly in embedded applications such as real-time instrumentation and control systems.

Copyright information

© 1991 Unicorn Seminars Ltd

About this chapter

Cite this chapter

Holding, D.J. (1991). Software Fault Tolerance. In: Warwick, K., Tham, M.T. (eds) Failsafe Control Systems. Springer, Dordrecht. https://doi.org/10.1007/978-94-009-0429-3_2

  • DOI: https://doi.org/10.1007/978-94-009-0429-3_2

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-010-6677-8

  • Online ISBN: 978-94-009-0429-3

  • eBook Packages: Springer Book Archive
