An empirical research agenda for understanding formal methods productivity
Introduction
Formal methods is the mathematical specification, design and verification of computer systems. It can provide a much higher level of assurance than traditional code-and-test approaches to software engineering. Software engineering researchers have aspired to see the widespread use of formal methods since the 1970s, but only recently have technologies and techniques developed enough for it to become practical in non-trivial systems development projects. Increasingly, software systems are also safety- or security-critical, and so could benefit from formal methods verification to provide direct evidence about system dependability. Nonetheless, broadening its reach requires that its costs be better understood and, where possible, reduced. In particular, as noted by Klein [29], verification of low-level implementations was until recently considered prohibitively expensive, but this view is changing with the development of newer tools and methods. Studies such as that by King et al. [27] show the benefits of formal specification and verification: for the SHOLIS system, they found that the use of a Z proof of properties at the requirements and design levels was “substantially more efficient at finding faults than the most efficient testing phase”, and that verification of SPARK Ada code was “more efficient at error detection than unit testing”. However, such studies are still rare, and the insights they provide into formal methods productivity remain partial and inconsistent. Thus we need a deeper understanding of productivity in this context.
Formal methods involves different kinds of development artefacts than traditional software engineering, and can provide qualitatively different kinds of assurance. As will be discussed in this paper, some of the traditional metrics used in empirical software engineering do not apply in projects using formal methods, and new metrics will be needed. In a previous paper [1], we claimed that there is not yet a good understanding of what to measure in projects using formal methods, and that there is a need for research on metrics, cost models and estimation methods for such projects. This echoes much earlier statements [15], [48], indicating the lack of progress in this field over many years.
In this paper we define a space of research questions about the productivity of formal methods, and define a collection of metrics that bear on these research questions. For each question, we show the evidence in the literature that pertains to it. Our paper provides a research agenda and a call for researchers in Empirical Software Engineering to study formal methods projects, and for researchers and practitioners in formal methods to collaborate on the opportunity these empirical studies provide.
The questions in this paper are relevant whether we want to (a) reduce the cost of formal methods, (b) have it scale better, or (c) provide evidence to compare its cost-effectiveness with conventional software engineering. For any of these goals we need to be able to characterize the cost of formal methods and uncover appropriate metrics for this context. We can then use these metrics to develop an understanding of task size and effort drivers, and the empirical relationships between these and effort and schedule.
The existing literature is used in three ways in this paper. Firstly, we motivate the work using published papers that show the current lack of empirical evidence concerning the cost, effort and quality of formal methods application in industry. Secondly, we use the literature to test the face validity of the proposed research questions. Finally, we summarize literature that addresses guidelines and management issues concerning the application of formal methods. Together, these provide a summary of the state of practice and a picture of the need for careful empirical research on the use and characteristics of formal methods in industry.
Background
It is significant for this work that formal methods may use different activities to produce different artefacts as part of a different lifecycle compared with traditional software engineering. An illustration of this is shown in a formal verification process lifecycle model in Fig. 1. In this model major artefact differences include the proofs themselves and the set of invariants of the system, which would not be present in conventional software engineering. The different activities that create
Research goals
Our investigation uses the Goal Question Metric (GQM) approach [2], [37]. GQM is a framework for the specification of a measurement system. It has three levels: a conceptual level of high-level measurement goals, an operational level of questions that bear on each goal, and a quantitative level of data and metrics that can be used to answer the defined questions. We first define a space of goals for empirical research on productivity in formal methods projects. In later sections we identify
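The three GQM levels described above can be pictured as a simple hierarchy from goals through questions to metrics. The following sketch is purely illustrative; the class names and the example goal, question, and metrics are our own hypothetical choices, not content drawn from the paper.

```python
from dataclasses import dataclass, field

# Illustrative GQM hierarchy: a conceptual-level Goal, operational-level
# Questions, and quantitative-level Metrics. All names and example
# content below are hypothetical.
@dataclass
class Metric:
    name: str
    scale: str  # e.g. "nominal", "ordinal", "interval", "ratio", "absolute"

@dataclass
class Question:
    text: str
    metrics: list = field(default_factory=list)

@dataclass
class Goal:
    purpose: str
    questions: list = field(default_factory=list)

goal = Goal(purpose="Characterize proof effort in formal verification projects")
q = Question(text="How does proof effort relate to specification size?")
q.metrics += [
    Metric("person-weeks of proof effort", "ratio"),
    Metric("lines of formal specification", "ratio"),
]
goal.questions.append(q)

for question in goal.questions:
    print(question.text, "->", [m.name for m in question.metrics])
```

Traversing the structure top-down mirrors how GQM is applied: goals are refined into questions, and each question is answered by collecting the metrics attached to it.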
Research questions
For the space of GQM goals given above, we identify a list of research questions. This will not be an exhaustive list, but we use a theoretical categorization to systematically cover a range of question types. We then establish the face validity of the research questions by showing how existing literature identifies or provides evidence about them. Although our paper is not a systematic mapping study [41], our work also serves as a survey of the field.
We use the categories of research questions
Metrics and guidance on data
Following GQM, the previous sections defined our space of research Goals and research Questions that summarize the numerous research opportunities; this section identifies some of the related Metrics.
SEI’s method [39] notes that measurement requires: entities (objects of interest), attributes (characteristics of entities), and rules (for assigning values to attributes). Measures can be defined on various kinds of scale types (nominal, ordinal, interval, ratio, or absolute), and not all of these
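The entity/attribute/rule framing above can be made concrete as a small sketch. This is our own illustration of the idea, not code from the paper or the SEI method; the example measure (lines of a formal specification) is hypothetical.

```python
# Illustrative sketch of a measurement definition as
# (entity, attribute, rule, scale), following the framing above.
SCALES = {"nominal", "ordinal", "interval", "ratio", "absolute"}

def define_measure(entity, attribute, rule, scale):
    """Package a measurement definition; 'rule' maps an entity instance to a value."""
    assert scale in SCALES, f"unknown scale type: {scale}"
    return {"entity": entity, "attribute": attribute, "rule": rule, "scale": scale}

# Hypothetical example: the size of a formal specification, measured
# in lines of specification text, on a ratio scale.
spec_size = define_measure(
    entity="formal specification",
    attribute="size",
    rule=lambda spec_text: len(spec_text.splitlines()),
    scale="ratio",
)

sample_spec = "schema S\n  x : N\n  x > 0"
print(spec_size["rule"](sample_spec))  # counts the lines of the sample
```

Separating the rule from the entity and attribute makes explicit what is being measured and how, which is exactly the discipline the SEI framing asks for before any data collection begins.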
Related work
It may be thought that many research questions for conventional software engineering could apply directly to formal methods – after all, from one point of view, formal methods is just another technique to help the development of software solutions to user problems. However, our review of the research question catalogue from SEI IPRC report [17] identified few that are directly relevant to formal methods. Nonetheless, some of their research questions on conventional verification could apply for
Future work
The major opportunity for future work arising from this paper is to empirically study the research questions we have identified. Many of these could be investigated in retrospective project analyses provided that version control repositories regularly keep track of changes to artefacts, provided that “bug-tracking” databases are used to record the reasons for repository changes, and provided that base metrics such as effort and size are analysed along with appropriate context and cost driver
Conclusions
Formal methods has long held the promise to radically improve software engineering, by qualitatively changing the nature of software quality and quality assurance. The use of formal methods is becoming feasible in practice, but there are barriers to its wider adoption. A major barrier is the lack of understanding about the cost-effectiveness of formal methods, and in particular a lack of understanding about cost, estimation, and productivity.
This paper has set out an empirical research agenda
Acknowledgments
NICTA is funded by the Australian Government through the Department of Communications and the Australian Research Council through the ICT Centre of Excellence Program.
References
- et al., Software measurement: a conceptual framework, J. Syst. Softw. (1990)
- et al., On the use of software design models in software development practice: an empirical investigation, J. Syst. Softw. (2014)
- et al., Large-scale formal verification in practice: a process perspective
- et al., The goal question metric approach
- R.E. Bloomfield, D. Craigen, Formal Methods Diffusion: Past Lessons and Future Prospects, Adelard, ...
- A. Bollin, A. Tabareh, Predictive software measures based on Z specifications – a case study, in: Proceedings of ..., 2012
- J.P. Bowen, M.G. Hinchey, Ten commandments of formal methods, in: IEEE Computer, vol. 28, 1994, pp. ...
- et al., Seven more myths of formal methods, IEEE Softw. (1995)
- et al., The use of industrial-strength formal methods
- J.P. Bowen, M.G. Hinchey, Ten commandments of formal methods ... ten years later, IEEE Computer, 2012, ...