Information Systems

Volume 28, Issue 6, September 2003, Pages 619-650

Improving the quality of data models: empirical validation of a quality management framework

https://doi.org/10.1016/S0306-4379(02)00043-1

Abstract

This paper describes the results of a 5-year research programme into evaluating and improving the quality of data models. The theoretical base for this work was a data model quality management framework proposed by Moody and Shanks (In: P. Loucopoulos (Ed.), Proceedings of the 13th International Conference on the Entity Relationship Approach, Manchester, England, December 14–17, 1994). A combination of field and laboratory research methods (action research, laboratory experiments and systems development) was used to empirically validate the framework. This paper describes how the framework was used to: (a) quality assure a data model in a large application development project (product quality); (b) reengineer application development processes to build quality into the data analysis process (process quality); (c) investigate differences between data models produced by experts and novices; (d) provide automated support for the evaluation process (the Data Model Quality Advisor). The results of the research have been used to refine and extend the framework, to the point that it is now a stable and mature approach.

Introduction

The choice of an appropriate representation of data is one of the most crucial tasks in information systems development. Although data modelling represents only a small proportion of the total systems development effort, its impact on the quality of the final system is probably greater than that of any other phase [1]. The data model is a major determinant of system development costs [2], system flexibility [3], integration with other systems [4] and the ability of the system to meet user requirements [5].

The traditional thrust of software quality assurance has been to use “brute force” testing at the end of development [6]. However, Total Quality Management (TQM) approaches suggest that it is faster and cheaper to concentrate effort during the early development phases of a product, in order to detect and correct defects as early as possible [7]. According to Boehm [8], removing a defect during design costs on average 3.5 times more than removing it during the requirements stage, 50 times more at the implementation stage, and 170 times more after delivery (Fig. 1). Empirical studies have shown that shifting quality assurance effort to the early phases of development can be 33 times more cost effective than testing done at the end of development [9].
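To make these multipliers concrete, the following minimal sketch applies them to a hypothetical batch of defects; the defect count and unit cost are invented for illustration and do not come from the studies cited:

```python
# Relative defect-removal costs from Boehm [8], applied to a
# hypothetical batch of defects. The defect count and unit cost are
# invented for this illustration, not taken from the cited studies.

RELATIVE_COST = {          # cost multiplier vs. fixing the defect
    "requirements": 1.0,   # during the requirements stage
    "design": 3.5,
    "implementation": 50.0,
    "post-delivery": 170.0,
}

UNIT_COST = 100.0   # hypothetical cost of fixing one defect found
                    # during requirements
DEFECTS = 20        # hypothetical number of defects in the model

for stage, multiplier in RELATIVE_COST.items():
    total = DEFECTS * UNIT_COST * multiplier
    print(f"{stage:>15}: {total:>12,.0f}")

# The same 20 defects cost 2,000 if caught at requirements time but
# 340,000 after delivery -- the economic argument for quality
# assuring data models during analysis rather than testing at the end.
```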

This suggests that substantially more effort should be spent during early development phases to catch defects when they occur, or to prevent them from occurring altogether. However, it is during analysis that the notion of software development as a craft rather than an engineering discipline is strongest, and quality is therefore most difficult to assess. There are relatively few guidelines for evaluating the quality of data models, and little agreement even among experts as to what makes a “good” data model. As a result, the quality of data models produced in practice is almost entirely dependent on the competence of the data modeller [10], [11].

In the quality management literature, the distinction is frequently made between product and process quality [12]:

  • Product quality focuses on the characteristics of the product. Product quality criteria are used to carry out inspections of the finished product and detect and correct defects. This is the traditional approach to quality assurance.

  • Process quality focuses on the process used to produce the product. The objective is to build quality into the production process rather than trying to add it at the end through reviews and inspections of the finished product. Process quality emphasises defect prevention rather than detection, and aims to reduce reliance on mass inspection as a way of achieving quality [13]. This is the TQM approach to quality assurance.

In the context of data modelling, product quality is concerned with evaluating and improving the quality of the data model (the product), while process quality is concerned with improving the data analysis process (the production process) (see Fig. 2). Product quality matters most in the context of an individual project: it is important to ensure that the data model is free of defects so that a database can be built that meets user requirements. However, process quality is more important in the wider organisational context, where the goal is to improve the organisation's ability to deliver high quality information systems efficiently.

Previous research on data model quality has focused almost exclusively on product quality. A summary of approaches to quality in data modelling is shown in Table 1.

The simplest type of quality evaluation approach is where quality is defined as a list of desirable properties of a data model (e.g. [1], [14], [15], [16]). Such lists provide a useful starting point for understanding and evaluating quality in data models, but they are mostly unstructured, use imprecise definitions, often overlap, and frequently confuse properties of models with properties of languages and methods [17].

More comprehensive approaches to quality evaluation develop theoretical frameworks that define the key concepts underlying data model quality. Lindland et al. [17] propose a framework based on semiotic theory, which treats a conceptual model as a set of statements in a language. For each semiotic level (syntactic, semantic, pragmatic), the framework defines quality goals and means to achieve them. Krogstie et al. [10] extend the framework with a fourth semiotic level: the social level. These frameworks apply to conceptual models generally, not just data models. Kesh [18] develops a framework for evaluating data models based on ontological concepts, which defines criteria and metrics for evaluating the quality of data models.
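As a reading aid, the structure of the Lindland et al. framework can be sketched as a mapping from semiotic levels to goals and means. The encoding below is ours, not part of any cited approach, and the goal/means entries are a summary that should be checked against [17]:

```python
# A reading aid, not part of any cited approach: the three semiotic
# levels of the Lindland et al. framework [17], each paired with its
# quality goal and representative means. The goal/means entries are
# our summary of the framework and should be verified against [17].

from dataclasses import dataclass

@dataclass(frozen=True)
class SemioticLevel:
    name: str     # level at which the model is judged
    goal: str     # the quality goal defined at this level
    means: tuple  # example techniques for achieving the goal

LINDLAND_FRAMEWORK = (
    SemioticLevel("syntactic", "syntactic correctness",
                  ("formal syntax", "syntax checking")),
    SemioticLevel("semantic", "validity and completeness",
                  ("consistency checking",)),
    SemioticLevel("pragmatic", "comprehension",
                  ("visualisation", "explanation", "filtering")),
)

for level in LINDLAND_FRAMEWORK:
    print(f"{level.name:>10}: goal = {level.goal}; "
          f"means = {', '.join(level.means)}")
```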

The most serious deficiencies in the existing literature are:

  • None of the approaches has been empirically validated in practice: all are justified either on theoretical grounds or by the author(s)’ experience. Theoretical justification is limited because methods have no “truth” value—the validity of a method is an empirical rather than a theoretical question [19], [20]. Experiential justification is also limited because personal experience is subject to bias, and a method which works well for one person may not work for another [21].

  • None of the approaches adequately addresses the issue of process quality: they define criteria and, in some cases, measures for evaluating the quality of data models (defect detection), but not how to develop models in a high quality manner (defect prevention).

Both of these issues are addressed in this paper.

The structure of the paper is as follows:

  • Section 2 describes the quality management framework used as the theoretical basis for this research—this represents the a priori theory being tested.

  • Section 3 outlines the research methodology used to validate the framework.

  • Section 4 describes how the framework was used to quality assure a data model for an application development project as part of an action research study (product quality).

  • Section 5 describes how the framework was used to re-engineer the analysis process in an organisation as part of a longitudinal action research study (process quality).

  • Section 6 describes how the framework was used to analyse differences in the quality of models produced by expert and novice data modellers using a laboratory experiment.

  • Section 7 describes how the framework was used to provide automated support for the evaluation process (the Data Model Quality Advisor), and analyses its effectiveness using a laboratory experiment.

  • Section 8 summarises the research findings and their implications for research and practice.

Section snippets

Overview of the framework

The quality management framework used as the basis for this research is defined by the Entity Relationship model in Fig. 3 [11]. This represents the a priori theory being tested by this research. The purpose of the framework is to evaluate and improve the quality of application data models. The framework consists of five major constructs, each of which is shown as a separate entity in Fig. 3:

  • Quality factors define the characteristics of a data model that determine its overall quality. …
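Because the framework is itself defined as an Entity Relationship model, its structure can be sketched directly as data. In the sketch below, the construct and quality factor names follow the published Moody and Shanks framework [11]; the class design and the example stakeholder and weighting values are assumptions made for illustration, not part of the paper:

```python
# Illustrative sketch only: the framework's constructs encoded as
# Python dataclasses. Construct and factor names follow the published
# Moody and Shanks framework [11]; the class design and the example
# values are assumptions made for this sketch.

from dataclasses import dataclass, field

@dataclass
class QualityFactor:
    """A characteristic of a data model that determines its quality."""
    name: str

@dataclass
class QualityMetric:
    """A measure for evaluating a data model on one quality factor."""
    factor: QualityFactor
    description: str

@dataclass
class ImprovementStrategy:
    """A technique for improving a model on one or more factors."""
    name: str
    targets: list

@dataclass
class Stakeholder:
    """A person with an interest in the quality of the data model.

    The `weightings` mapping stands in for the framework's weighting
    construct: the relative importance this stakeholder assigns to
    each quality factor (values invented for the example).
    """
    role: str
    weightings: dict = field(default_factory=dict)

# The eight quality factors named in the framework.
FACTORS = [QualityFactor(name) for name in (
    "completeness", "integrity", "flexibility", "understandability",
    "correctness", "simplicity", "integration", "implementability")]

# A hypothetical stakeholder who weights two factors most heavily.
dba = Stakeholder("database administrator",
                  weightings={"integrity": 0.4, "implementability": 0.3})
```

Weightings are folded into the stakeholder here purely for compactness; in the framework itself they are modelled as a separate construct relating stakeholders to quality factors.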

Validation of IS design methods

The question of how to validate IS design methods has been a longstanding issue in the IS field (e.g. [19], [21], [24], [25], [26], [27], [28]). There are inherent problems in evaluating any methodology or design technique, since there is typically no theory, no hypotheses, no experimental design and no data analysis to which traditional evaluation criteria can be applied [28].

As a result, IS design research tends to emphasise the development of new design methods and frameworks, while addressing …

Action research study I: product quality

This section describes how the framework was used to evaluate and improve the quality of a data model in a large application development project. This was the first real world application of the framework. In this case, the framework is used to improve product quality.

Action research study II: process quality

This section describes how the framework was used to improve the process of developing data models as part of a longitudinal action research study in a single organisation [31]. One of the principles of TQM is that the most effective way to improve the quality of a product is to improve the process by which it is developed [13]. This was also one of the major findings from the first action research study.

Analysis of differences between expert and novice data modellers

This section describes how the framework was used to investigate differences in models produced by expert and novice data modellers. This study focuses on product quality, as the framework is used to evaluate the quality of models produced by experimental subjects.

Systems development as a research method

Systems development is a research method in which scientific knowledge is used to produce devices, systems or methods, including the design and development of prototypes [98]. In this approach, theory is used to develop a prototype system, which is then used to test the theory; it thus provides a way of linking basic and applied research [99]. According to Nunamaker et al. [37]:

The development of a method or system can provide a perfectly acceptable piece of evidence (an artifact) in support of a …

Conclusion

This paper has described how the data model quality evaluation framework proposed by Moody and Shanks [11] has been validated using a variety of research methods. Experiences in practice have been used to refine the framework using an action research approach. The paper describes how the framework has been used to:

  • (a) quality assure individual data models as part of application development projects (product quality);

  • (b) reengineer application development procedures to build quality into the data analysis process (process quality), …

References (110)

  • B.W. Boehm, Software Engineering Economics (1981)

  • C. Walrad et al., Measurement: the key to application development quality, IBM Systems J. (1993)

  • J. Krogstie, O.I. Lindland, G. Sindre, Towards a deeper understanding of quality in requirements engineering, ...

  • D.L. Moody, G.G. Shanks, What makes a good data model? Evaluating the quality of entity relationship models, in: P. ...

  • J.R. Evans, W.M. Lindsay, The Management and Control of Quality, 5th Edition, South-Western (Thomson Learning), ...

  • W.E. Deming, Out of the Crisis (1986)

  • B. von Halle, Data: asset or liability? Database Programming Design 4(7) (1991) ...

  • C. Batini et al., Conceptual Database Design: An Entity Relationship Approach (1992)

  • A. Levitin, T. Redman, Quality dimensions of a conceptual view, Inform. Process. Manage. 31(1) ...

  • O.I. Lindland et al., Understanding quality in conceptual modelling, IEEE Software (1994)

  • J. Ivari, Dimensions of information systems design: a framework for a long range research program, Inform. Systems J. (1986)

  • N. Rescher, Methodological Pragmatism: Systems-Theoretic Approach to the Theory of Knowledge (1977)

  • J.L. Wynekoop et al., Studying systems development methodologies: an examination of research methods, Inform. Systems J. (1997)

  • M. Gibbons et al., The New Production of Knowledge: The Dynamics of Science and Research in Contemporary Societies (1994)

  • D.L. Moody, Metrics for evaluating the quality of entity relationship models, in: T.W. Ling, S. Ram, M.L. Lee (Eds.), ...

  • T.W. Olle, H.G. Sol, A.A. Verrijn-Stuart (Eds.), Information Systems Design Methodologies: A Comparative Review, ...

  • T.W. Olle, H.G. Sol, C.J. Tully (Eds.), Information Systems Design Methodologies: A Feature Analysis, North-Holland, ...

  • T.W. Olle, H.G. Sol, A.A. Verrijn-Stuart (Eds.), Information Systems Design Methodologies: Improving the Practice, ...

  • G. Fitzgerald, in: H.E. Nissen, H.K. Klein, R. Hirschheim (Eds.), Validating New Information Systems Techniques: A ...

  • R.A. Weber, Ontological Foundations of Information Systems, Coopers and Lybrand Accounting Research Methodology ...

  • J.A. Bubenko, in: T.W. Olle, H.G. Sol, A.A. Verrijn-Stuart (Eds.), Information Systems Methodologies—A Research View, ...

  • B. Curtis, in: E. Soloway, S. Iyengar (Eds.), By The Way, Did Anyone Study Any Real Programmers? Empirical Studies of ...

  • D.L. Moody et al., Evaluating and improving the quality of entity relationship models: an action research programme, Aust. Comput. J. (1998)

  • C. Westrup, Information systems methodologies in use, J. Inform. Technol. (1993)

  • N. Rescher, Cognitive Systematization (1979)

  • R.L. Baskerville et al., A critical perspective on action research as a method for information systems research, J. Inform. Technol. (1996)

  • R.D. Galliers, in: H.E. Nissen, H.K. Klein, R. Hirschheim (Eds.), Choosing Information Systems Research Approaches, ...

  • R.D. Galliers, Information Systems Research: Issues, Methods and Practical Guidelines (1992)

  • J. Nunamaker et al., Systems development in information systems research, J. Manage. Inform. Systems (1991)

  • G. Shanks, A. Rouse, D. Arnott, A review of approaches to research and scholarship in information systems, Proceedings ...

  • T.D. Jick, Mixing qualitative and quantitative methods: triangulation in action, Administrative Sci. Q. (1979)

  • B. Kaplan et al., Combining qualitative and quantitative methods in information systems research: a case study, MIS Q. (1988)

  • A. Lee, Integrating positivist and interpretivist approaches to organisational research, Organ. Sci. (1991)

  • W.L. Neuman, Social Research Methods—Qualitative and Quantitative Approaches, 4th Edition, Allyn and Bacon, Needham ...

  • D. Avison et al., Action Research, Comm. ACM (1999)

  • T.L. Baker, Doing Social Research (1998)

  • J. McKernan, Curriculum Action Research: A Handbook of Methods and Resources for the Reflective Practitioner (1991)

  • E.T. Stringer, Action Research—A Handbook for Practitioners (1996)

  • J. Masters, The history of action research, in: I. Hughes (Ed.), Action Research Electronic Reader (on-line), ...

  • B. Dick, A beginner's guide to action research [On line], in: B. Dick, R. Passfield, P. Wildman (Eds.), Action ...

    Recommended by Professor P. Loucopoulos.
