A fuzzy-ontology-oriented case-based reasoning framework for semantic diabetes diagnosis

https://doi.org/10.1016/j.artmed.2015.08.003

Highlights

  • Propose a fuzzy ontology based semantic-CBR framework.

  • Propose a novel OWL2 fuzzy case-base ontology.

  • Propose a fuzzy semantic case retrieval algorithm using an SNOMED CT fragment.

  • Implement the fuzzy KI-CBR system using diabetes diagnosis as a case study.

  • Combining fuzzy logic and ontology semantics in CBR enhances its accuracy.

Abstract

Objective

Case-based reasoning (CBR) is a problem-solving paradigm that uses past knowledge to interpret or solve new problems. It is suitable for experience-based and theory-less problems. Building a semantically intelligent CBR system that mimics expert thinking can solve many problems, especially medical ones.

Methods

Knowledge-intensive CBR using formal ontologies is an evolution of this paradigm. Ontologies can be used for case representation and storage, and they can serve as background knowledge. Using standard medical ontologies, such as SNOMED CT, enhances interoperability and integration with health care systems. Moreover, utilizing vague or imprecise knowledge further improves the semantic effectiveness of CBR. This paper proposes a fuzzy ontology-based CBR framework. It proposes a fuzzy case-base OWL2 ontology and a fuzzy semantic retrieval algorithm that handles many feature types.

Material

This framework is implemented and tested on the diabetes diagnosis problem. The fuzzy ontology is populated with 60 real diabetic cases. The effectiveness of the proposed approach is illustrated with a set of experiments and case studies.

Results

The resulting system can answer complex medical queries related to semantic understanding of medical concepts and handling of vague terms. The resulting fuzzy case-base ontology has 63 concepts, 54 (fuzzy) object properties, 138 (fuzzy) datatype properties, 105 fuzzy datatypes, and 2640 instances. The system achieves an accuracy of 97.67%. We compare our framework with existing CBR systems and a set of five machine-learning classifiers; our system outperforms all of these systems.

Conclusion

Building an integrated CBR system can improve its performance. Representing CBR knowledge using the fuzzy ontology and building a case retrieval algorithm that treats different features differently improves the accuracy of the resulting systems.

Introduction

Diabetes is a complex, chronic illness requiring continuous medical care with multifactorial risk-reduction strategies beyond glycemic control. According to the World Health Organization (WHO), diabetes will be the seventh leading cause of death in 2030 [1]. Globally, about 336 million people are living with type 2 diabetes mellitus, and this figure is set to rise to over 552 million by 2030 [2]. In 2014, 9% of adults aged 18 years and older had diabetes [1]. There are three main types of diabetes. The first is type 1 diabetes mellitus, or insulin-dependent diabetes mellitus, which occurs when the pancreas cannot produce sufficient insulin. The second is type 2 diabetes mellitus, or non-insulin-dependent diabetes mellitus, which occurs when the body cannot effectively use the insulin it produces. The third is gestational diabetes, which occurs in pregnant women. A patient who shows diabetes symptoms but is not actually diabetic is called a pre-diabetes patient.

Early diagnosis of diabetes is critical to its care process because early care can prevent long-term microvascular complications, such as retinopathy, nephropathy, and neuropathy, as well as cardiovascular disease. Moreover, early diagnosis can prevent a pre-diabetes patient from becoming diabetic. At present, the results of early diabetes detection are not highly accurate. Therefore, there is a need to develop a diabetes diagnosis system with better accuracy. Clinical decision support systems (CDSS) can help in this regard. Existing rule-based diabetes diagnosis systems are mainly based on the A1C criteria or plasma glucose criteria, either the fasting plasma glucose (FPG) or the 2-h plasma glucose (2-h PG) value after a 75-g oral glucose tolerance test (OGTT). For example, they take decisions using rules such as if (A1C ≥ 6.5% or FPG ≥ 126 mg/dL or 2-h PG ≥ 200 mg/dL) then the patient is diabetic [3]. However, diabetes diagnosis is more complicated than these direct decisions. Diabetes is related to other diseases, including renal diseases, heart diseases, foot diseases, etc. Moreover, it has symptoms related to hyperglycemia or hypoglycemia. True-or-false decisions about these symptoms, e.g. thirst = true, are not enough.
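To make the limitation of such direct decisions concrete, the threshold rule can be sketched in a few lines of Python. This is only an illustration: the function and parameter names are hypothetical, and the comparisons are assumed to be "at or above" the thresholds of 6.5% (A1C), 126 mg/dL (FPG), and 200 mg/dL (2-h PG).

```python
def is_diabetic_crisp(a1c_percent=None, fpg_mg_dl=None, two_hour_pg_mg_dl=None):
    """Crisp threshold rule: True if any available measurement meets its cutoff.

    Names are hypothetical; cutoffs are assumed to be inclusive (>=).
    Missing measurements (None) are simply ignored.
    """
    checks = [
        (a1c_percent, 6.5),          # A1C in percent
        (fpg_mg_dl, 126.0),          # fasting plasma glucose in mg/dL
        (two_hour_pg_mg_dl, 200.0),  # 2-h plasma glucose after OGTT in mg/dL
    ]
    return any(value is not None and value >= cutoff for value, cutoff in checks)


# Two nearly identical patients fall on opposite sides of the boundary.
print(is_diabetic_crisp(a1c_percent=6.4, fpg_mg_dl=125))
print(is_diabetic_crisp(a1c_percent=6.5, fpg_mg_dl=125))
```

The two calls differ by 0.1 percentage points of A1C yet yield opposite answers, and the rule has no way to take related diseases or graded symptoms into account.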

Diabetes diagnosis is a theory-less and unstructured problem, and it depends on the physician's experience. For experience-based problem solving, case-based reasoning (CBR) is one of the most suitable AI techniques for decision support [4]. CBR imitates human reasoning, and it is suitable when we cannot formulate a problem as a set of generalized rules. It is appropriate in a medical context where symptoms represent the problem, and diagnosis and treatment represent the solution. The CBR paradigm has been successfully used in various medical fields from lung disease and eating disorders to diabetes and Alzheimer's disease [5]. Several studies have utilized CBR for diabetes diagnosis [6], [7], [8], [9]. Although any CBR system relies on a set of specific previous experiences, its reasoning power can be improved by general knowledge about the domain [10]. Ontologies can enhance the capabilities of CBR by creating knowledge-intensive CBR (KI-CBR) systems [11]. They can play many roles in CBR, such as background domain ontology, case-base ontology, semantic similarity measurement, and others [12]. Ontologies can enhance CBR systems in many dimensions, as shown in Fig. 1. In this figure, we suggest three types of KI-CBR paradigms. In part (a) of Fig. 1, the case-base is stored in a traditional database, and the domain knowledge is stored in an ontology. In part (b), the case-base is stored in a crisp ontology, and the domain knowledge is stored in an ontology. In part (c), the case-base is stored in a fuzzy ontology, and the domain knowledge is stored in an ontology. We have selected the most complicated and recent approach (part c). For diabetes diagnosis, researchers have made efforts toward diabetes ontology development [13]. Nevertheless, the literature on ontology-based CBR for diabetes is sparse [7], [8].

The most critical steps in the CBR paradigm are case representation and case retrieval. We concentrate on these two steps to improve the performance of medical CBR. Compared with rule-based systems, building a case base reduces the effort and time needed to build the system's knowledge base. No generalized knowledge is required to build a successful CBR system. However, collecting patient cases requires integration between the CDSS and the distributed electronic health record (EHR) environment. As a result, the standardization of CBR knowledge and data is critical to achieving interoperability. Interoperability between EHR systems and CDSS facilitates the automatic collection of knowledge from patients’ EHRs, supports the integration of CDSS in the healthcare environment, and eases the physician's querying process. For data storage and exchange, EHRs use standards such as the Health Level 7 reference information model (HL7 RIM) [14] and the Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) [15] ontology, SCT for short; both can be utilized in CBR. RIM can be used as a standard case-base structure, and SCT can be used as background knowledge to enhance semantic retrieval [16], [17]. El-Sappagh et al. [9] proposed a standard data model for a diabetes case-base. SCT is a huge ontology, and its size affects the performance of the CBR retrieval algorithm. Creating a reference set from SCT for diabetes is therefore required. El-Sappagh et al. [18] proposed a standard OWL2 diabetes diagnosis ontology from an SCT reference set. As far as we know, no studies utilize SCT reference sets in CBR systems for diabetes diagnosis, although doing so is required for semantic retrieval and for integrating CDSS in the EHR environment. Using the created SCT-based OWL2 ontology for semantic retrieval requires encoding the unstructured knowledge of the case-base with the same codes. The encoding process is not straightforward, and it requires a methodology. El-Sappagh et al. [19] proposed an encoding methodology and utilized it to encode the case-base contents.
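As a rough illustration of how a terminology hierarchy can serve as background knowledge for semantic retrieval, the sketch below computes a path-based similarity between two concepts in a tiny is-a tree. The measure shown is the well-known Wu-Palmer similarity, chosen here only as an example and not necessarily the measure used in this framework. The hierarchy and concept labels are hypothetical placeholders, not actual SCT content or codes.

```python
# Hypothetical is-a hierarchy: each concept maps to its single parent.
PARENT = {
    "disease": None,
    "diabetes mellitus": "disease",
    "type 1 diabetes": "diabetes mellitus",
    "type 2 diabetes": "diabetes mellitus",
    "kidney disease": "disease",
}


def path_to_root(concept):
    """Return the chain of concepts from `concept` up to the root, inclusive."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(concept))


def wu_palmer(c1, c2):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(c1) + depth(c2))."""
    ancestors_of_c1 = set(path_to_root(c1))
    # Walking upward from c2, the first shared concept is the deepest common ancestor.
    lcs = next(c for c in path_to_root(c2) if c in ancestors_of_c1)
    return 2 * depth(lcs) / (depth(c1) + depth(c2))


print(wu_palmer("type 1 diabetes", "type 2 diabetes"))
print(wu_palmer("type 1 diabetes", "kidney disease"))
```

Sibling concepts under a deep common parent score higher than concepts that only share the root, which is the behaviour a string comparison of raw labels cannot provide. Working on a small reference set rather than the whole terminology keeps such traversals short.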

Physicians often describe patients using imperfect and linguistic data, and their knowledge and natural language have a great deal of imprecision and vagueness. As Zadeh [20] argued, much of the knowledge that humans acquire through experience is perception-based and thus subject to imprecision and inaccuracy. Such knowledge, when not treated in a way that can consider and convey its inherent imprecision, usually leads to poor effectiveness of the knowledge-based systems that use it. As a result, the KI-CBR paradigm must handle imprecise knowledge representation and reasoning [21]. The existing fuzzy CBR systems utilize imprecise knowledge through the use of fuzzy logic for case representation and relevant fuzzy pattern matching techniques for similarity assessment [22]. A survey of existing fuzzy CBR systems for diabetes diagnosis indicates that there are few works in this field. Moreover, the lack of an ontological representation of this knowledge restricts the effectiveness of these systems because they do not take advantage of the reasoning capabilities that ontologies provide. A fuzzy ontology focuses on assigning a meaning to the fuzziness of the ontology's components. This is an important characteristic as it makes the fuzzy ontology's imprecision explicit, thus facilitating more efficient knowledge acquisition and ontology reuse. Moreover, it enables the definition of more effective semantic similarity measures, which facilitate case retrieval. For diabetes, the existing fuzzy CBR systems have not used a fuzzy ontology or even a crisp ontology as background domain knowledge or as a case-base ontology [8]. On the other hand, ontologies and fuzzy logic have been utilized in diabetes in other reasoning methods such as rule-based expert systems [23].
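One common way to represent a vague linguistic term is a trapezoidal membership function, which maps a measured value to a degree between 0 and 1. The sketch below is a minimal example of that idea. The term names and all breakpoints are hypothetical and chosen only for illustration; they are neither clinical thresholds nor the fuzzy datatypes defined in this work.

```python
def trapezoid(x, a, b, c, d):
    """Membership degree of x in a trapezoidal fuzzy set with a <= b <= c <= d.

    The degree is 1 on [b, c], 0 outside (a, d), and linear in between.
    """
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (d - x) / (d - c)      # falling edge


# Hypothetical linguistic terms for a glucose reading in mg/dL.
FPG_TERMS = {
    "normal": (0, 0, 95, 105),
    "borderline": (95, 105, 120, 130),
    "high": (120, 130, 1000, 1000),
}


def fuzzify(value, terms):
    """Return the membership degree of `value` in every linguistic term."""
    return {name: trapezoid(value, *params) for name, params in terms.items()}


print(fuzzify(100, FPG_TERMS))
```

A reading of 100 belongs partly to "normal" and partly to "borderline" instead of being forced into a single class, which is the kind of graded description a physician's wording tends to carry.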

In this paper, we present a fuzzy KI-CBR framework that handles and exploits imprecise knowledge through the effective integration of fuzzy logic in the ontology-based CBR paradigm. A fuzzy case-base ontology and a fuzzy semantic retrieval algorithm are proposed and integrated to build an intelligent CBR system for diabetes diagnosis. This approach introduces fuzzy semantics to CBR in two places. The first is the representation of imprecise knowledge itself, and the second is case retrieval. In particular, our proposed framework is built using a fuzzy ontology that supports the representation of imprecise case-specific knowledge, while the retrieval of cases is enabled by a highly customizable fuzzy semantic similarity framework. As most CBR studies did not implement the entire cycle [12], [24], we concentrate on the most critical and most related steps (i.e., case representation and retrieval). Case adaptation, reuse, retention, and case-base maintenance will be handled in other works.
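The general shape of a retrieval step that treats different feature types differently can be sketched as a weighted sum of per-feature local similarities. This is a generic nearest-neighbour illustration under stated assumptions, not the retrieval algorithm proposed in this paper: the feature names, weights, value ranges, and case identifiers are all hypothetical.

```python
def numeric_sim(a, b, value_range):
    """Local similarity for numbers: 1 when equal, falling linearly to 0."""
    return 1.0 - min(abs(a - b) / value_range, 1.0)


def exact_sim(a, b):
    """Local similarity for symbols: 1 on an exact match, otherwise 0."""
    return 1.0 if a == b else 0.0


# Hypothetical feature table: name -> (weight, local similarity function).
FEATURES = {
    "fpg": (0.5, lambda a, b: numeric_sim(a, b, 200.0)),
    "age": (0.2, lambda a, b: numeric_sim(a, b, 80.0)),
    "sex": (0.3, exact_sim),
}


def global_sim(query, case):
    """Weighted average of local similarities over features present in both."""
    score, total_weight = 0.0, 0.0
    for name, (weight, local_sim) in FEATURES.items():
        if name in query and name in case:
            score += weight * local_sim(query[name], case[name])
            total_weight += weight
    return score / total_weight if total_weight else 0.0


def retrieve(query, case_base, k=3):
    """Return the k stored cases most similar to the query."""
    ranked = sorted(case_base, key=lambda c: global_sim(query, c["features"]), reverse=True)
    return ranked[:k]


case_base = [
    {"id": "case-A", "features": {"fpg": 130, "age": 50, "sex": "F"}},
    {"id": "case-B", "features": {"fpg": 90, "age": 30, "sex": "M"}},
]
query = {"fpg": 130, "age": 50, "sex": "F"}
print([c["id"] for c in retrieve(query, case_base, k=1)])
```

In a fuzzy semantic setting, the local similarity functions are the natural extension point: a fuzzy feature could compare membership degrees, and a coded concept could use an ontology-based measure instead of exact matching.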

Our system is implemented in six modules: case source preparation, case base ontology engineering, terminology server, fuzzy case-base ontology population, case retrieval engine, and case query parser. We implement and test the proposed framework on a real case-base. The system has a user-friendly interface; it supports the selection of standard medical concepts from an SCT dialog, and it implements the clinical distance in the case retrieval process.

As a result, the system achieves a high level of performance compared to traditional CBR systems, other CBR systems in the literature, and machine learning algorithms. The system's accuracy is 97.67%. Therefore, it is highly accurate and can be applied in a real medical environment.

The remainder of the paper is organized as follows: Section 2 reviews studies related to KI-CBR, especially for diabetes, and shows their limitations. Section 3 presents a set of preliminaries, including our dataset description. Section 4 illustrates the research methodology used in the study. Section 5 presents the proposed CBR framework. Implementation and evaluation are discussed in Section 6. Finally, Section 7 concludes the paper and highlights future work directions.

Section snippets

Related work

The physician can depend on clinical practice guidelines (CPG) to diagnose diabetes. However, CPGs are long plaintext documents. Some languages, such as Arden Syntax, can be used for representing and sharing this medical knowledge. They can convert CPGs into actionable rules to implement rule-based CDSS. Samwald et al. [25] proposed a development environment including a compiler and rule engine for Arden Syntax rules. However, diabetes diagnosis is an ill-formed, theory-less, and experience

Preliminaries

To make the article self-contained, in this section we define some concepts, definitions, and terminologies before discussing the proposed framework.

Research methodology

As shown in Fig. 6, we follow a specific methodology to finish this study. To accomplish the purpose of this study, we have utilized some existing technologies and studies. Moreover, we have utilized our previous research studies to complete some specific steps. In the figure, we make a clear cut between the current study goals and the other utilized works.

In the first step, the detailed understanding of the nature of diabetes mellitus disease and its diagnosis process requires deep interviews

The proposed fuzzy KI-CBR framework for diabetes diagnosis

This section describes our proposed fuzzy ontology-based CBR system for diabetes diagnosis. The architecture of this system is shown in Fig. 7. It has six modules: case source preparation, case base ontology engineering, terminology server, fuzzy case-base ontology population, case retrieval engine, and case query parser. The main steps of the framework are case-base preparation and case retrieval.

The case-base preparation step is achieved by the case source preparation,

System implementation

A CBR system was developed in Java by extending the APIs of the JCOLIBRI2 CBR framework [52]. As shown in Fig. 18, the proposed customization has three layers, and each layer has specific tasks. Due to space restrictions, we do not discuss this framework in detail. The persistence layer prepares the fuzzy case-base ontology. The CBR application layer is the core of the framework as it contains the whole CBR cycle. The interface layer accepts a query from the physician and returns the most

Conclusion

This paper proposes a fuzzy ontology-based semantic CBR system and its implementation for a decision support system for diabetes diagnosis. This system enhances the decision maker efficiency in the diagnosing process. The proposed approach has many contributions and novelties: (1) It builds a case-base fuzzy ontology compatible with the most famous CBR framework, i.e. JCOLIBRI, (2) It builds and uses a standard medical terminology subset for diabetes diagnosis from SCT, which is the most

Acknowledgments

This project was supported by King Saud University, Deanship of Scientific Research, College of Sciences, Research Centre. The authors would like to thank Dr. Farid Badria, Prof. of Pharmacognosy, Department and head of Liver Research Lab, Mansoura University, Egypt; and Dr. Hosam Zaghloul, Prof. at Clinical Pathology Department, Faculty of Medicine, Mansoura University, Egypt, for their efforts in this work.

References (107)

  • F. Ali et al.

    Type-2 fuzzy ontology-based semantic knowledge for collision avoidance of autonomous underwater vehicles

    Inf Sci

    (2015)
  • S. Harispe et al.

    A framework for unifying ontology-based semantic similarity measures: a study in the biomedical domain

    J Biomed Inf

    (2014)
  • T. Gruber

    Towards principles for the design of ontologies used for knowledge sharing

    Int J Hum Comput Stud

    (1995)
  • L. Zadeh

    Fuzzy sets

    Inf Control

    (1965)
  • F. Bobillo et al.

    Fuzzy ontology representation using OWL 2

    Int J Approx Reason

    (2011)
  • S. Akmal et al.

    Ontology-based similarity for product information retrieval

    Comput Ind

    (2014)
  • J. Qi et al.

    A case retrieval method combined with similarity measurement and multi-criteria decision making for concurrent design

    Expert Syst Appl

    (2009)
  • D. Sánchez et al.

    Ontology-based information content computation

    Knowledge-Based Syst

    (2011)
  • T. Bright et al.

    Development and evaluation of an ontology for guiding appropriate antibiotic prescribing

    J Biomed Inf

    (2012)
  • Y. Guo et al.

    A CBR system for injection mould design based on ontology: a case study

    Comput-Aided Des

    (2012)
  • R. Satter et al.

    Avatar-based simulation in the evaluation of diagnosis and management of mental health disorders in primary care

    J Biomed Inf

    (2012)
  • S. Petrovic et al.

    A novel case based reasoning approach to radiotherapy planning

    Expert Syst Appl

    (2011)
  • R. Fernandes et al.

    Semantic methods supporting engineering design innovation

    Adv Eng Inf

    (2011)
  • S. Montani et al.

    Diabetic patients management exploiting case-based reasoning techniques

    Comput Methods Programs Biomed

    (2000)
  • C. Fan et al.

    A hybrid model combining case-based reasoning and fuzzy decision tree for medical data classification

    Appl Soft Comput

    (2011)
  • S. Li et al.

    Predicting financial activity with evolutionary fuzzy case-based reasoning

    Expert Syst Appl

    (2009)
  • M. Han et al.

    An improved case-based reasoning method and its application in end point prediction of basic oxygen furnace

    Neurocomputing

    (2015)
  • World Health Organization (WHO)

    Diabetes

    (2015)
  • S. Ofori et al.

    Holistic approach to prevention and management of type 2 diabetes mellitus in a family setting

    Diabetes Metab Syndr Obes

    (2014)
  • A. AlJarullah

    Decision tree discovery for the diagnosis of type II diabetes

    International conference on innovations in information technology

    (2011)
  • S. Begum et al.

    Case-based reasoning systems in the health sciences: a survey of recent trends and developments

    IEEE Trans Syst Man Cybernet, C

    (2010)
  • C. Marling et al.

    Synergistic case-based reasoning in medical domains

    Expert Syst Appl

    (2014)
  • M. Jha et al.

    Diabetes detection and care applying CBR techniques

    Int J Soft Comput Eng (IJSCE)

    (2013)
  • A. Jaya et al.

    Role of ontology in case-based reasoning (CBR) for diagnosing diabetes

    J Inf Technol

    (2009)
  • J. Chen et al.

    Diabetes care decision support system

  • S. El-Sappagh et al.

    A CBR system for diabetes mellitus diagnosis: case-base standard data model

    Int J Med Eng Inf

    (2015)
  • N. Dendani et al.

    Use a domain ontology to develop knowledge intensive CBR systems for fault diagnosis

    International conference on information technology and e-Services (ICITeS)

    (2012)
  • B. Diaz-Agudo et al.
    (2000)
  • Health Level Seven International (HL7), 〈http://www.hl7.org/〉 (accessed: 19 August...
  • The International Health Terminology Standards Development Organization (IHTSDO), SNOMED CT: The Global Language of...
  • P. Jirathitikul et al.

    A similarity measuring service for SNOMED-CT structural analysis of concepts in ontology

  • S. El-Sappagh et al.

    A diabetes diagnostic domain ontology for CBR system from the conceptual model of SNOMED CT

    The second international conference on engineering and technology (ICET 2014)

    (2014)
  • S. El-Sappagh et al.

    A proposed SNOMED CT ontology-based encoding methodology for diabetes diagnosis case-base

    The ninth international conference on computer engineering and systems (ICCES 2014)

    (2014)
  • L. Zadeh
    (2003)
  • P. Alexopoulos et al.

    Utilizing imprecise knowledge in ontology-based CBR systems by means of fuzzy algebra

    Int J Fuzzy Syst

    (2010)
  • C. Lee et al.

    A fuzzy expert system for diabetes decision support application

    IEEE Trans Syst Man Cybernet, B: Cybernet

    (2011)
  • J. Zhao et al.

    Learning HAZOP expert system by case-based reasoning and ontology

    Comput Chem Eng

    (2009)
  • M. Gan et al.

    From ontology to semantic similarity: calculation of ontology-based semantic similarity

    Sci World J

    (2013)
  • P. Sherimon et al.

    Adaptive questionnaire ontology in gathering patient medical history in diabetes domain

    (2014)
  • Hayuhardhika W. Sugiyanto et al.

    Weighted ontology and weighted tree similarity algorithm for diagnosing diabetes mellitus
