An empirical research agenda for understanding formal methods productivity

https://doi.org/10.1016/j.infsof.2014.11.005

Abstract

Context

Formal methods, and particularly formal verification, is becoming more feasible to use in the engineering of large, highly dependable software-based systems, but so far has had little rigorous empirical study. Its artefacts and activities are different to those of conventional software engineering, and the nature and drivers of productivity for formal methods are not yet understood.

Objective

To develop a research agenda for the empirical study of productivity in software projects using formal methods, and in particular formal verification. To this end we aim to identify research questions about productivity in formal methods, survey existing literature on these questions to establish their face validity, and identify metrics and data sources relevant to these questions.

Method

We define a space of GQM goals as an investigative framework, focusing on productivity from the perspective of managers of projects using formal methods. We then derive questions for these goals using Easterbrook et al.’s (2008) taxonomy of research questions. To establish face validity, we document the literature to date that reflects on these questions and then explore possible metrics related to these questions. Extensive use is made of literature concerning the L4.verified project completed within NICTA, as it is one of the few projects to achieve code-level formal verification for a large-scale industrially deployed software system.

Results

We identify more than thirty research questions on the topic in need of investigation. These questions arise not just from the new type of project context, but also from the different artefacts and activities in formal methods projects. Prior literature supports the need for research on the questions in our catalogue, but as yet provides little evidence about them. We identify the metrics that would be needed to investigate the questions. Thus, although at the highest level concepts such as size, effort and rework are common to all software projects, measurement of these concepts at the micro level will differ significantly for formal methods.

Conclusions

Empirical software engineering for formal methods is a large open research field. For the empirical software engineering community our paper provides a view into the entities and research questions in this domain. For the formal methods community we identify some of the benefits that empirical studies could bring to the effective management of large formal methods projects, and list some basic metrics and data sources that could support empirical studies. Understanding productivity is important in its own right for efficient software engineering practice, but can also support future research on cost-effectiveness of formal methods, and on the emerging field of Proof Engineering.

Introduction

Formal methods is the mathematical specification, design and verification of computer systems. It can provide a much higher level of assurance than traditional code-and-test approaches to software engineering. Software engineering researchers have aspired to see the widespread use of formal methods since the 1970s, but only recently have technologies and techniques developed enough for it to become practical for use in non-trivial systems development projects. Increasingly, software systems are also safety- or security-critical and so could benefit from formal methods verification to provide direct evidence about system dependability. Nonetheless, broadening its reach requires that its costs be better understood and, where possible, reduced. In particular, as noted by Klein [29], verification of low level implementations has been considered prohibitively expensive until recently, but this view is changing with the development of newer tools and methods. Studies such as that by King et al. [27] demonstrate the benefits of formal specification and verification: for the SHOLIS system, they showed that the use of a Z proof of properties at the requirements and design levels was “substantially more efficient at finding faults than the most efficient testing phase”, and they further concluded that verification of SPARK Ada code was “more efficient at error detection than unit testing”. However, studies such as this are still rare, and the insights they provide into formal methods productivity remain partial and inconsistent. Thus we need a deeper understanding of productivity in this context.

Formal methods involves different kinds of development artefacts than traditional software engineering, and can provide qualitatively different kinds of assurance. As will be discussed in this paper, some of the traditional metrics used in empirical software engineering do not apply in projects using formal methods. New metrics will be needed. In a previous paper [1], we claimed that there is not yet a good understanding of what to measure in projects using formal methods. There is a need for research on metrics, cost models and estimation methods for such projects. This echoes much earlier statements [15], [48], indicating a lack of progress in this field over many years.

In this paper we define a space of research questions about the productivity of formal methods, and define a collection of metrics that bear on these research questions. For each question, evidence provided in the literature pertaining to that question is shown. Our paper provides a research agenda and a call to researchers in Empirical Software Engineering to study formal methods projects, and for researchers and practitioners in formal methods to collaborate on the opportunity provided by these empirical studies.

The questions in this paper are relevant whether we want to (a) reduce the cost of formal methods, (b) have it scale better, or (c) provide evidence to compare its cost-effectiveness with conventional software engineering. For any of these goals we need to be able to characterize the cost of formal methods and uncover appropriate metrics for this context. We can then use these metrics to develop an understanding of task size and effort drivers, and the empirical relationships between these and effort and schedule.

The existing literature is used in three ways in this paper. Firstly, we motivate the work using published papers showing the current lack of empirical evidence concerning cost, effort and quality of formal methods application in industry. Secondly, we use literature to test the face validity of the research questions proposed. Finally, we summarize literature that addresses guidelines and management issues concerning the application of formal methods. Together, these provide a summary of the state of practice and a picture of the need for careful empirical research on the use and characteristics of formal methods in industry.


Background

It is significant for this work that formal methods may use different activities to produce different artefacts as part of a different lifecycle compared with traditional software engineering. An illustration of this is shown in a formal verification process lifecycle model in Fig. 1. In this model major artefact differences include the proofs themselves and the set of invariants of the system, which would not be present in conventional software engineering. The different activities that create

Research goals

Our investigation uses the Goal Question Metric (GQM) approach [2], [37]. GQM is a framework for the specification of a measurement system. It has three levels: a conceptual level of high-level measurement goals, an operational level of questions that bear on each goal, and a quantitative level of data and metrics that can be used to answer the defined questions. We first define a space of goals for empirical research on productivity in formal methods projects. In later sections we identify
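As a hypothetical illustration of the three GQM levels described above (the field names and the example goal and question are ours, not taken from the paper), the conceptual, operational and quantitative levels can be sketched as a small data model:

```python
from dataclasses import dataclass, field

@dataclass
class Metric:
    """Quantitative level: a datum that helps answer a question."""
    name: str
    scale: str  # nominal, ordinal, interval, ratio, or absolute

@dataclass
class Question:
    """Operational level: a question that bears on a goal."""
    text: str
    metrics: list[Metric] = field(default_factory=list)

@dataclass
class Goal:
    """Conceptual level: a measurement goal, stated from a viewpoint."""
    purpose: str
    object_of_study: str
    focus: str
    viewpoint: str
    questions: list[Question] = field(default_factory=list)

# A hypothetical goal from the manager's-perspective focus used in the paper.
goal = Goal(
    purpose="characterize",
    object_of_study="formal verification projects",
    focus="productivity",
    viewpoint="project manager",
    questions=[
        Question(
            text="How much proof rework does a specification change trigger?",
            metrics=[Metric("rework effort (person-hours)", "ratio")],
        )
    ],
)
```

The point of the sketch is only that each goal fans out into questions, and each question into metrics, so a catalogue of research questions (as in the next section) sits at the middle level of this hierarchy.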

Research questions

For the space of GQM goals given above, we identify a list of research questions. This will not be an exhaustive list, but we use a theoretical categorization to systematically cover a range of question types. We then establish the face validity of the research questions by showing how existing literature identifies or provides evidence about them. Although our paper is not a systematic mapping study [41], our work also serves as a survey of the field.

We use the categories of research questions

Metrics and guidance on data

Following GQM, the previous sections defined our space of research Goals and research Questions that summarize the numerous research opportunities; this section identifies some of the related Metrics.

SEI’s method [39] notes that measurement requires: entities (objects of interest), attributes (characteristics of entities), and rules (for assigning values to attributes). Measures can be defined on various kinds of scale types (nominal, ordinal, interval, ratio, or absolute), and not all of these
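The scale-type distinction matters because it constrains which summary statistics are meaningful for an attribute. A minimal sketch of this constraint, using the standard measurement-theory rules rather than anything specific to the SEI method:

```python
# Admissible central-tendency statistics per scale type, per standard
# measurement theory (Stevens' scale types). This table is illustrative,
# not drawn from the SEI report itself.
ADMISSIBLE = {
    "nominal":  {"mode"},
    "ordinal":  {"mode", "median"},
    "interval": {"mode", "median", "mean"},
    "ratio":    {"mode", "median", "mean", "geometric_mean"},
    "absolute": {"mode", "median", "mean", "geometric_mean"},
}

def admissible(scale: str, statistic: str) -> bool:
    """Return True if the statistic is meaningful on the given scale."""
    return statistic in ADMISSIBLE[scale]
```

For example, a ratio-scale attribute such as proof size in lines supports a mean, whereas an ordinal attribute such as a ranked lemma-difficulty rating supports only a median or mode; averaging difficulty ranks would not be admissible.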

Related work

It may be thought that many research questions for conventional software engineering could apply directly to formal methods – after all, from one point of view, formal methods is just another technique to help the development of software solutions to user problems. However, our review of the research question catalogue from the SEI IPRC report [17] identified few that are directly relevant to formal methods. Nonetheless, some of their research questions on conventional verification could apply for

Future work

The major opportunity for future work arising from this paper is to empirically study the research questions we have identified. Many of these could be investigated in retrospective project analyses provided that version control repositories regularly keep track of changes to artefacts, provided that “bug-tracking” databases are used to record the reasons for repository changes, and provided that base metrics such as effort and size are analysed along with appropriate context and cost driver
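As a hypothetical sketch of such retrospective mining (the paths are illustrative; `.thy` is the Isabelle/HOL theory format used on L4.verified), proof-script churn could be separated from implementation churn by parsing the output of `git log --numstat`:

```python
from collections import Counter

def churn_by_artefact(numstat: str) -> Counter:
    """Tally added+deleted lines per artefact type from `git log --numstat` output.

    Proof scripts (.thy files) are counted separately from other files so that
    proof rework can be measured on its own. Assumes the illustrative
    convention that anything not a .thy file is implementation code.
    """
    churn = Counter()
    for line in numstat.splitlines():
        parts = line.split("\t")
        # numstat lines are "added<TAB>deleted<TAB>path"; skip commit
        # headers, blank lines, and binary files (which report "-").
        if len(parts) != 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue
        added, deleted, path = int(parts[0]), int(parts[1]), parts[2]
        kind = "proof" if path.endswith(".thy") else "code"
        churn[kind] += added + deleted
    return churn

# Hypothetical excerpt of `git log --numstat` output:
sample = (
    "12\t3\tproofs/Invariants.thy\n"
    "40\t5\tsrc/kernel.c\n"
    "7\t0\tproofs/Refine.thy\n"
)
```

Running `churn_by_artefact(sample)` on this excerpt attributes 22 changed lines to proof artefacts and 45 to code, the kind of base size/rework datum the questions above would need, provided the repository's change reasons are recoverable from a bug-tracking database.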

Conclusions

Formal methods has long held the promise to radically improve software engineering, by qualitatively changing the nature of software quality and quality assurance. The use of formal methods is becoming feasible in practice, but there are barriers to its wider adoption. A major barrier is the lack of understanding about the cost-effectiveness of formal methods, and in particular a lack of understanding about cost, estimation, and productivity.

This paper has set out an empirical research agenda

Acknowledgments

NICTA is funded by the Australian Government through the Department of Communications and the Australian Research Council through the ICT Centre of Excellence Program.

References (51)

  • M.E. Bush et al., Software measurement: a conceptual framework, J. Syst. Softw. (1990)
  • T. Gorschek et al., On the use of software design models in software development practice: an empirical investigation, J. Syst. Softw. (2014)
  • J. Andronick et al., Large-scale formal verification in practice: a process perspective
  • V. Basili et al., The goal question metric approach
  • R.E. Bloomfield, D. Craigen, Formal Methods Diffusion: Past Lessons and Future Prospects, Adelard,...
  • A. Bollin, A. Tabareh, Predictive software measures based on Z specifications – a case study, in: Proceedings of..., 2012
  • J.P. Bowen, M.G. Hinchey, Ten commandments of formal methods, in: IEEE Computer, vol. 28, 1994, pp....
  • J.P. Bowen et al., Seven more myths of formal methods, IEEE Softw. (1995)
  • J.P. Bowen et al., The use of industrial-strength formal methods
  • J.P. Bowen, M.G. Hinchey, Ten commandments of formal methods... ten years later, IEEE Computer, 2012,...
  • E.M. Clarke et al., Formal methods: state of the art and future directions, ACM Comput. Surv. (1996)
  • D. Cock et al., Secure microkernels, state monads and scalable refinement
  • D. Craigen, S. Gerhart, T. Ralston, An international survey of industrial applications of formal methods, vol. 2, Case...
  • A. Degani et al., Modeling and formal analysis of human–machine interaction
  • S. Easterbrook et al., Selecting empirical methods for software engineering research
  • N.E. Fenton et al., An engineering theory of structure and measurement
  • J.S. Fitzgerald, P.G. Larsen, Formal specification techniques in the commercial development process, in:...
  • S. Gerhart et al., Observations on industrial practice using formal methods
  • C. Ghezzi et al., Towards agile verification
  • S. Gregor, The nature of theory in information systems, MIS Quart. (2006)
  • A. Hall, Seven myths of formal methods, IEEE Softw. (1990)
  • G.J. Holzmann, Economics of software verification, in: PASTE’01, ACM,...
  • A. Iliasov, E. Troubitsyna, L. Laibinis, A. Romanovsky, K. Varpaaniemi, D. Ilic, T. Latvala, Supporting reuse in Event...