Biosystems, Volume 111, Issue 2, February 2013, Pages 127-135

Theoretical formulation and analysis of the deterministic dendritic cell algorithm

https://doi.org/10.1016/j.biosystems.2013.01.001

Abstract

As one of the emerging algorithms in the field of artificial immune systems (AIS), the dendritic cell algorithm (DCA) has been successfully applied to a number of challenging real-world problems. However, one criticism is the lack of a formal definition, which could result in ambiguity in understanding the algorithm. Moreover, previous investigations have mainly focused on its empirical aspects. Therefore, it is necessary to provide a formal definition of the algorithm, as well as to perform runtime analyses to reveal its theoretical aspects. In this paper, we define the deterministic version of the DCA, named the dDCA, using set theory and mathematical functions. Runtime analyses of the standard algorithm and of a variant with additional segmentation are performed. Our analysis suggests that the standard dDCA has a worst-case runtime complexity of O(n²), where n is the number of input data instances. The introduction of segmentation changes the algorithm's worst-case runtime complexity to O(max(nN, nz)), for a DC population of size N and segment size z. Finally, two runtime variables of the algorithm are formulated based on the input data, to characterise its runtime behaviour and to provide guidelines for further development.

Introduction

Artificial immune systems (AIS) (de Castro and Timmis, 2002, Hart and Timmis, 2008) are computer systems inspired by both theoretical immunology and observed immune functions, principles and models, and are applied to real-world problems. The human immune system, from which AIS draw inspiration, has evolved to protect the host from a wealth of invading micro-organisms; AIS are developed to provide similar defensive properties within a computing context. Initially, AIS were based on simple models of the human immune system. As noted by Stibor et al. (2005), ‘first generation’ immune algorithms, such as negative selection and clonal selection, do not produce the same high-quality performance as the human immune system. These algorithms, negative selection in particular, are prone to problems with scalability and the generation of excessive false alarms when used to solve problems such as network-based intrusion detection. Recent AIS use more rigorous and up-to-date immunology and are developed in collaboration with modellers and immunologists. The resulting algorithms are believed to encapsulate the desirable properties of immune systems, including robustness, error tolerance, and self-organisation (de Castro and Timmis, 2002).

One such ‘second generation’ immune algorithm is the Dendritic Cell Algorithm (DCA) (Greensmith, 2007). The algorithm is inspired by the functions of dendritic cells (DCs) of the innate immune system, and incorporates principles of a key novel theory in immunology, the danger theory (Lutz and Schuler, 2002). An abstract model of natural DC behaviour is used as the foundation of the developed algorithm. The DCA has been successfully applied to numerous security-related problems, including port scan detection (Greensmith, 2007), botnet detection (Al-Hammadi et al., 2008) and as a classifier for robot security (Oates et al., 2007). These applications fall within the area of anomaly detection, which is essentially a particular type of binary classification with an ‘anomalous’ class and a ‘normal’ class. In these applications, the DCA has shown not only good performance in terms of detection rate, but also the ability to reduce the rate of false alarms in comparison with other systems, such as Self-Organising Maps (SOM) (Greensmith et al., 2008).

However, there are also issues concerning the DCA. One criticism is the lack of a formal definition, which could result in ambiguity in understanding the algorithm and thus lead to incorrect applications and implementations. It is pointed out in Stibor et al. (2009) that the DCA shares similarities with linear classifiers, since it employs a linear discriminant function for signal transformation. However, the DCA is not simply a collection of linear classifiers, as it performs classification based on the temporal correlation of a multi-agent DC population rather than on linear signal transformation alone. Signal transformation is used to identify whether any anomalies occurred in the past; whether the identified anomalies can be correctly correlated with their potential causes is determined by the effectiveness of the temporal correlation performed at the population level. As a first step, a formal definition should be provided to present the algorithm in a clear and accessible manner.
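To make this contrast concrete, the per-instance signal transformation is indeed a fixed linear map. A minimal sketch of the commonly cited dDCA form (taken from Greensmith and Aickelin (2008) rather than from the formal definitions in this paper, with s_D and s_S denoting the danger and safe signal values) is:

```latex
% Illustrative per-instance signal transformation (assumed dDCA weights)
csm \;=\; s_{D} + s_{S}, \qquad k \;=\; s_{D} - 2\, s_{S}
```

The anomaly decision, however, is not taken per instance: each cell accumulates csm and k over its lifetime and, on maturation, assigns the accumulated context to the antigens it sampled during that interval. It is this population-level temporal correlation, rather than the linear map itself, that produces the classification.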

Previous investigations of the DCA have mainly focused on its empirical aspects, as evidenced by experimental results across a range of problem domains. Apart from the geometric analysis of Stibor et al. (2009), later extended in Oates's thesis (Oates, 2010), theoretical analysis of the DCA has barely been performed, and most theoretical aspects of the algorithm have not yet been revealed. Other immune-inspired algorithms, such as negative and clonal selection algorithms, were theoretically presented in Timmis et al. (2008). Elberfeld and Textor (2011) theoretically analysed string-based negative selection algorithms, showing that the worst-case runtime complexity can be reduced from exponential to polynomial by compressing detectors. More recently, the work of Zarges (2008, 2009) theoretically analysed one of the vital components of clonal selection based algorithms, namely inversely proportional mutation rates. Janson and Zarges (2011) performed a theoretical analysis of immune-inspired somatic contiguous hypermutations for function optimisation. It is therefore important to conduct a similar theoretical analysis of the DCA, to determine its runtime complexity and other algorithmic properties, in line with other AIS.

In this paper, we extend the work presented in Gu et al. (2009), which gave formal specifications of a single-cell model at the behavioural level using interval temporal logic (Moszkowski, 1985). Note that the algorithm treated in this work is the deterministic DCA (dDCA) (Greensmith and Aickelin, 2008), created by removing stochastic components for ease of analysis; any subsequent statements about the DCA refer to the dDCA. The aim is to provide a clear and accessible definition of the DCA, as well as an initial theoretical analysis of the algorithm's runtime complexity and other algorithmic properties. As potential readers may not have a deep understanding of complicated formal methods such as the B-method (Jean-Raymond, 1996), we use set theory and mathematical functions to specify the algorithm. From the formal definitions, theoretical analyses of the runtime complexity are performed for the standard algorithm and for an extended system with segmentation. Moreover, the formulations of two important runtime variables are included, to present the algorithm's runtime behaviour and to provide guidelines for future development. The paper is organised as follows: an overview of the DCA is given in Section 2; the formal definition is presented in Section 3; runtime analyses are shown in Section 4; the formulation of two runtime variables is described in Section 5; and finally conclusions and future work are presented in Section 6.

Section snippets

Biological background

The DCA is inspired by functions of the dendritic cells (DCs) of the innate immune system, which forms part of the body's first line of defence against invaders. DCs exhibit the ability to combine a multitude of molecular information and to interpret this information for the T-cells of the adaptive immune system. This could lead to the induction of various immune responses against perceived pathogenic threats. Therefore, DCs are often seen as detectors responsible for policing different

Formalisation of the DCA

In this section, we formally define the data structures and procedural operations of the DCA at the population level. Unlike the specifications of a single DC at the behavioural level in Gu et al. (2009), here we focus on specifying the entire DC population using quantitative measures at the functional level. Instead of using the more advanced and possibly more complex interval temporal logic (Moszkowski, 1985), set theory and mathematical functions, e.g. addition, multiplication and recursion, are used for
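As an informal companion to the formal definitions, the following minimal Python sketch shows the kind of state the algorithm maintains. The two-signal representation and the field names (danger, safe, csm, k, migration_threshold) are illustrative assumptions based on published descriptions of the dDCA, not the set-theoretic definitions given in the paper.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DataInstance:
    """One item of the input stream: an antigen identifier plus its signal values."""
    antigen: str
    danger: float   # danger signal value
    safe: float     # safe signal value

@dataclass
class DendriticCell:
    """Per-cell state maintained while sampling the input stream."""
    migration_threshold: float                          # limit on accumulated csm before maturation
    csm: float = 0.0                                    # accumulated costimulatory signal
    k: float = 0.0                                      # accumulated context value
    antigens: List[str] = field(default_factory=list)   # antigens sampled in the current lifetime
```

The population is then simply a finite set of such cells, each with its own migration threshold, which is what the definitions capture quantitatively at the functional level.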

The standard DCA

By combining the procedural operations of the DCA with for loops, while loops and if statements, the algorithm can be presented as Algorithm 1. Previous applications of the DCA have shown that the runtime of the algorithm is relatively short and that its consumption of computational power is low (Greensmith and Aickelin, 2008). However, a theoretical analysis of the runtime complexity of the DCA, given a set of input data, has not yet been performed. Runtime analysis involves calculating the number of
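To ground the runtime discussion, the sketch below reconstructs the loop structure that gives rise to the quoted bounds: every input instance delivers its signals to all N cells, so a run over n instances performs on the order of nN cell updates, which matches the quadratic worst case quoted in the abstract when the population size is of the same order as n. This is an illustrative reconstruction based on Greensmith and Aickelin (2008), not a reproduction of Algorithm 1; the parameter names, the round-robin antigen assignment and the threshold spread are assumptions.

```python
def ddca(instances, num_cells=10, anomaly_threshold=0.0):
    """Illustrative dDCA run. instances: list of (antigen, danger, safe) tuples in time order."""
    thresholds = [5.0 * (i + 1) for i in range(num_cells)]  # illustrative migration thresholds
    csm = [0.0] * num_cells                   # accumulated costimulatory signal per cell
    k = [0.0] * num_cells                     # accumulated context value per cell
    sampled = [[] for _ in range(num_cells)]  # antigens held by each cell

    k_per_antigen = {}   # temporal correlation: accumulated context per antigen type
    presented = {}       # number of times each antigen type was presented

    for t, (antigen, danger, safe) in enumerate(instances):
        sampled[t % num_cells].append(antigen)   # each antigen is sampled by one cell
        for i in range(num_cells):               # signals are received by every cell: O(N) per instance
            csm[i] += danger + safe
            k[i] += danger - 2.0 * safe
            if csm[i] >= thresholds[i]:          # cell matures: present its antigens, then reset
                for a in sampled[i]:
                    k_per_antigen[a] = k_per_antigen.get(a, 0.0) + k[i]
                    presented[a] = presented.get(a, 0) + 1
                csm[i], k[i], sampled[i] = 0.0, 0.0, []

    # An antigen type is labelled anomalous if its mean accumulated context is positive;
    # antigens held by cells that never mature are simply dropped in this sketch.
    return {a: (k_per_antigen[a] / presented[a]) > anomaly_threshold for a in k_per_antigen}
```

With a fixed population size the loop itself is O(nN); the segmented variant additionally processes the accumulated per-antigen output every z instances, which is consistent with the O(max(nN, nz)) bound quoted in the abstract.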

Formulation of runtime properties

Two runtime variables of the DCA are assessed, as they can be used as quantitative indicators of changes in the algorithm's runtime behaviour. They are the number of matured DCs (those which reach the termination condition and are reset) and the number of processed antigens, respectively. The number of matured DCs indicates that the amount of processed information is related to signal instances. Conversely, the number of processed antigens implies that the amount of processed information is
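As a purely illustrative example of how such variables can be expressed in terms of the input data (the paper's own formulations are not reproduced here), the number of maturations can be estimated from the accumulated CSM signal, since a cell is reset each time its accumulated CSM crosses its migration threshold. The helper below and its argument names are hypothetical.

```python
def estimate_runtime_variables(instances, thresholds):
    """Rough estimates of the two runtime variables from the input data.

    instances:  list of (antigen, danger, safe) tuples.
    thresholds: per-cell migration thresholds (illustrative values).
    Assumes every cell receives every signal instance, as in the earlier sketch.
    """
    total_csm = sum(danger + safe for _, danger, safe in instances)
    # Each cell matures roughly once per 'threshold' units of accumulated CSM.
    matured_dcs = sum(int(total_csm // m) for m in thresholds)
    # Each antigen is sampled by one cell and presented at most once when that
    # cell matures, so presentations are bounded by the number of instances.
    processed_antigens = len(instances)
    return matured_dcs, processed_antigens
```

The first estimate depends only on the signal values and the thresholds, while the second depends only on the number of antigen-bearing instances, reflecting the distinction drawn above between signal-driven and antigen-driven processing.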

Conclusions and future work

In this paper, we provide formal definitions of the data structures and procedural operations of the deterministic version of the DCA, named the dDCA. The aim is to present the algorithm clearly, to prevent future misunderstanding and ambiguity that could result in inappropriate applications and implementations. Based on the formal definitions, a runtime analysis of the standard DCA is performed. The DCA achieves a worst-case runtime complexity bounded by O(n²), which is quadratic. The

References (32)

  • L.N. de Castro et al., Artificial Immune Systems: A New Computational Intelligence Approach (2002)
  • R.O. Duda et al., Pattern Classification (2000)
  • J. Greensmith, The Dendritic Cell Algorithm, Ph.D. Thesis, School of Computer Science, University of... (2007)
  • J. Greensmith et al., DCA for SYN scan detection
  • J. Greensmith et al., The deterministic dendritic cell algorithm
  • J. Greensmith et al., The DCA: SOMe comparison. A comparative study between two biologically-inspired algorithms, Evol. Intell. (2008)