Abstract

Existing application-layer distributed denial of service (AL-DDoS) attack detection methods are mainly targeted at specific attacks and cannot effectively detect other types of AL-DDoS attacks. This study presents an application-layer protocol communication model for AL-DDoS attack detection, based on the explicit duration recurrent network (EDRN). The proposed method includes model training and AL-DDoS attack detection. In the AL-DDoS attack detection phase, the output of each observation sequence is updated in real time. The observation sequences are based on application-layer protocol keywords and time intervals between adjacent protocol keywords. Protocol keywords are extracted after the application-layer protocol has been identified using regular expressions. Experiments are conducted using datasets collected from a real campus network and the CICDDoS2019 dataset. The results of the experiments show that EDRN is superior to several popular recurrent neural networks in accuracy, F1, recall, and loss values. The proposed model achieves an accuracy of 0.996, F1 of 0.992, recall of 0.993, and loss of 0.041 in detecting HTTP DDoS attacks on the CICDDoS2019 dataset. The results further show that our model can effectively detect multiple types of AL-DDoS attacks. In a comparison test, the proposed method outperforms several state-of-the-art approaches.

1. Introduction

As network infrastructure and network-layer defense technologies advance, attackers increasingly turn to Internet-based applications as their targets, resulting in the continuous emergence of application-layer attacks [1, 2]. These attacks are carried out using legitimate user requests and protocols at the application layer. Therefore, the data flow of application-layer attacks at the network and transport layers is not significantly different from that generated by normal users.

Distributed denial of service (DDoS) attacks are among the most dangerous attacks [3–6], especially application-layer DDoS (AL-DDoS) attacks, such as HTTP DDoS attacks [7, 8] and SMTP flood attacks [9]. HTTP DDoS attacks are usually implemented by a large number of bots sending a flood of page requests to a web server at the same time, thus consuming server resources, such as database cycles, CPU cycles, or memory. In August 2022, Google reported the largest HTTP DDoS attack recorded up to that time, which targeted a Google Cloud Armor customer and peaked at 46 million requests per second [10]. The complexity of AL-DDoS attacks is also expected to grow over time.

Existing AL-DDoS attack detection methods are mainly targeted at specific attacks but cannot effectively identify other types of AL-DDoS attacks. Therefore, to comprehensively detect AL-DDoS attacks, multiple detection methods need to be deployed in a network. However, the principles and parameter settings of each detection method are fundamentally different, which complicates network management. Moreover, deploying multiple AL-DDoS attack detection methods simultaneously will also lead to the degradation of network performance. Hence, it is necessary to design a detection method that can effectively detect various AL-DDoS attacks.

In this study, we reexamine the issue from the perspective of application-layer protocol communication. The key idea is to model the communication of the application-layer protocol through an explicit duration recurrent network (EDRN) to detect AL-DDoS attacks, taking observed application-layer protocol keywords and time intervals between adjacent protocol keywords as inputs. Application-layer protocol keywords refer to custom request commands and server response status codes, which can reflect the behavior of users when using the protocol.

Recurrent neural networks (RNNs) exploit cycles in network nodes to capture the dynamics of sequences, and they have been widely used in sequential data mining with outstanding performance results [11]. However, traditional RNNs have hidden states whose durations approximately follow a geometric or exponential distribution [12, 13]. As a result, it is difficult to use traditional RNNs to model the variable durations of hidden states.
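
As a brief illustration of why a memoryless state model implies geometric durations, consider the simplest case of a first-order Markov state with self-transition probability $a$ (a symbol introduced here for illustration only, not taken from the cited works). The probability that such a state lasts exactly $d$ steps, and its mean duration, are

```latex
$$
P(D = d) = a^{\,d-1}(1 - a), \qquad d = 1, 2, \ldots, \qquad
\mathbb{E}[D] = \frac{1}{1 - a}.
$$
```

This probability is largest at $d = 1$ and decreases monotonically, so a dwell-time pattern that peaks at a longer duration cannot be matched by tuning $a$ alone.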

During the communication process of the application-layer protocol, the behavior of users and time intervals between adjacent protocol keywords are determined by many factors, such as the request method, network transmission delay, and response processing time of servers. Thus, the duration of hidden states under a sequence of application-layer protocol communication may follow a relatively complex distribution, and not necessarily a geometric or exponential distribution. The EDRN is based on an extended hidden semi-Markov model (HSMM) and can describe hidden states of any duration distribution [14]. This study adopts EDRN to model application-layer protocol communication for AL-DDoS attack detection. To evaluate the model, experiments are conducted on the CICDDoS2019 dataset [15] and datasets collected in a real campus network. The experimental results show that the EDRN is superior to several popular RNNs, and the EDRN-based model can effectively detect multiple types of AL-DDoS attacks.

The main contributions of this study can be summarized as follows:
(i) We proposed an EDRN-based application-layer protocol communication model. The model uses EDRN to describe the communication process of the application-layer protocol and takes application-layer protocol keywords and time intervals between adjacent protocol keywords as inputs for the first time.
(ii) Based on the application-layer protocol communication model, we proposed an attack detection method that detects AL-DDoS attacks in real time by monitoring the application-layer protocol keywords that are used in the process of protocol communication.
(iii) We compared several RNNs based on the CICDDoS2019 dataset and a real campus network dataset, and the experimental results showed that the EDRN has the best performance. We also compared our proposed AL-DDoS attack detection method with several existing methods, and the test results confirmed the effectiveness and superiority of our method.

The remainder of this paper is organized as follows: Section 2 reviews recent studies on AL-DDoS detection. In Section 3, we describe the model for application-layer protocol communication. Section 4 presents the proposed AL-DDoS attack detection method. The experimental results are presented in Section 5 and discussed in Section 6. Section 7 concludes the paper.

2. Related Work

The detection of AL-DDoS attacks has attracted the attention of researchers [16–21]. Existing methods are mainly targeted at specific AL-DDoS attacks. For example, Xie and Yu [22] used HSMM, independent component analysis, and principal component analysis to mine web server logs to detect HTTP DDoS attacks. Wang et al. [23] used the Hellinger distance and sketch data structure to detect HTTP DDoS attacks. Zhou et al. [24] calculated the entropy of flash crowds and attacks for HTTP DDoS attack detection. Singh et al. [25] used four behavioral features and a support vector machine (SVM) to detect HTTP DDoS attacks. Praseed and Thilagam [26] used probabilistic timed automata (PTA) models to describe the behavior of legitimate users for HTTP DDoS attack detection. Lin et al. [27] used the rhythm matrix statistical model to capture the characteristics of user access trajectories to detect HTTP DDoS attacks. Zhao et al. [28] used URL access entropy to identify HTTP DDoS attacks. Praseed and Thilagam [29] used signatures based on HTTP request patterns to detect HTTP DDoS attacks. Raja Sree and Mary Saira Bhanu [30] used fuzzy bat clustering to analyze web server logs for HTTP DDoS attack detection in the cloud.

In terms of SMTP flood attack detection, Tudosi et al. [31] analyzed the traffic of SMTP flood attacks and used Snort (an open-source intrusion prevention system) to detect SMTP flood attacks. Schneider et al. [32] used the statistical characteristics of attack flows to detect SMTP flood attacks. Aziz and Okamura [33] adopted deep learning algorithms to detect SMTP flood attacks on software-defined networking (SDN) platforms. Gurusamy and Msk [34] detected SMTP flood attacks by monitoring all ports’ traffic statistics in the SDN.

In addition, Kasim [35] used a convolutional neural network (CNN) and long short-term memory (LSTM) to detect DNS flood attacks. Trejo et al. [36] used a visual platform and the K-nearest neighbor (KNN) classification algorithm to detect DNS flood attacks. Datta et al. [37] detected DNS flood attacks by monitoring the number of DNS queries per second in IoT networks. Bushart and Rossow [38] used an anomaly-based low-pass filter to detect DNS flood attacks.

Existing methods are mainly targeted at specific AL-DDoS attacks and do not consider the characteristics of application-layer protocol communication. In this study, we adopt EDRN to describe the communication process of the application-layer protocol, which can capture the suddenness, randomness, and volume of protocol communication, and then present an EDRN-based application-layer protocol communication model for AL-DDoS attack detection. This model can effectively detect multiple types of AL-DDoS attacks.

3. Application-Layer Protocol Communication Models

From the perspective of application-layer protocols, user behavior over a period of time is reflected in the interaction between a series of application-layer protocol keywords. Application-layer protocol keywords refer to custom request commands and server response status codes, which can reflect the behavior of users when using the application-layer protocol. For example, HTTP protocol keywords include the request commands “POST,” “GET,” and “HEAD,” and the server response status codes “100,” “200,” “304,” and “404,” while SMTP protocol keywords consist of the commands “MAIL FROM,” “HELO,” “RCPT TO,” “VRFY,” “QUIT,” “RSET,” “DATA,” “EXPN,” “HELP,” and “NOOP” and server response codes, such as “250” and “334.”

3.1. Application-Layer Protocol Communication Process

When regular users are using an application-layer protocol, the statistical characteristics of the protocol keywords and the time intervals between adjacent protocol keywords are quite different from those of AL-DDoS attacks. For example, when regular users are using the HTTP protocol, the speed at which they click pages, the time they spend on pages, and the way they browse pages are relatively stable. However, in the application-layer protocol keyword sequences generated by HTTP DDoS attacks, the protocol keyword “GET” appears very frequently, while other protocol keywords appear less frequently, and the time intervals between adjacent protocol keywords are small. Therefore, the application-layer protocol keywords and the time intervals between adjacent protocol keywords can be used as observations to describe the communication process of the application-layer protocol and enable the detection of AL-DDoS attacks.
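
The contrast described above can be made concrete with a small sketch that summarizes a keyword stream by the share of each keyword and the mean gap between adjacent keywords. The function name `keyword_profile` and the sample timings are hypothetical and serve only as an illustration; they are not taken from the paper's datasets.

```python
from collections import Counter


def keyword_profile(events):
    """Summarize a keyword stream.

    events: list of (keyword, arrival_time_seconds) in arrival order.
    Returns (share of each keyword, mean gap between adjacent keywords).
    """
    counts = Counter(keyword for keyword, _ in events)
    total = sum(counts.values())
    shares = {keyword: count / total for keyword, count in counts.items()}

    times = [arrival for _, arrival in events]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
    return shares, mean_gap


# Illustrative streams (made-up timings): a flood of GETs versus browsing.
flood = [("GET", i * 0.01) for i in range(100)]
browse = [("GET", 0.0), ("200", 0.2), ("GET", 8.0), ("304", 8.1)]
```

In the flood stream a single keyword dominates and the mean gap is tiny, whereas the browsing stream mixes keywords and has gaps of seconds.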

Figure 1 shows the communication process between users and a web server represented by a sequence of HTTP protocol keywords, wherein the HTTP protocol keyword sequence representing users’ behavior is as follows: “GET,” “POST,” “200,” “HEAD,” “304,” …, “200,” and “GET.”

3.2. Application-Layer Protocol Keyword Extraction

We first identify the application-layer protocol based on regular expressions, and then extract the protocol keywords. In this way, the number of protocol keywords to be matched each subsequent time can be reduced, thereby improving the speed of the protocol keyword extraction process. When identifying a TCP-based application-layer protocol, the first few data packets of each TCP connection are cached, then the application-layer data of the data packets are reassembled, and finally the protocol regular expression [39] is matched against the reassembled application-layer data. When identifying a UDP-based application-layer protocol, we use regular expressions to match the payload of each data packet. The identification process of TCP-based application-layer protocols is shown in Figure 2. This method can identify application-layer protocols in real time.
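
The two-step procedure (identify the protocol first, then match only that protocol's keywords) might be sketched as follows. The regular expressions here are simplified assumptions written for illustration; they are not the expressions from [39], and the function names are hypothetical.

```python
import re

# Hypothetical identification patterns, matched against the start of the
# reassembled application-layer data (assumptions, not the patterns of [39]).
PROTOCOL_PATTERNS = {
    "HTTP": re.compile(rb"^(GET|POST|HEAD) \S+ HTTP/\d\.\d\r\n|^HTTP/\d\.\d \d{3} "),
    "SMTP": re.compile(rb"^(HELO|EHLO) \S+\r\n|^220[ -]"),
}

# Per-protocol keyword patterns, applied only after identification so that
# each payload is matched against one small keyword set.
KEYWORD_PATTERNS = {
    "HTTP": re.compile(rb"^(?:(GET|POST|HEAD) |HTTP/\d\.\d (\d{3}))", re.MULTILINE),
    "SMTP": re.compile(
        rb"^(?:(MAIL FROM|RCPT TO|HELO|VRFY|QUIT|RSET|DATA|EXPN|HELP|NOOP)\b"
        rb"|(\d{3})[ -])",
        re.MULTILINE | re.IGNORECASE,
    ),
}


def identify_protocol(payload: bytes):
    """Return the first protocol whose pattern matches, or None."""
    for name, pattern in PROTOCOL_PATTERNS.items():
        if pattern.search(payload):
            return name
    return None


def extract_keywords(protocol: str, payload: bytes):
    """Return request commands and response codes found at line starts."""
    keywords = []
    for match in KEYWORD_PATTERNS[protocol].finditer(payload):
        token = match.group(1) or match.group(2)
        keywords.append(token.decode("ascii").upper())
    return keywords
```

A production extractor would also need to skip message bodies, which this sketch does not attempt.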

3.3. Protocol Communication Modeling

At the gateway of a network, we can obtain application-layer protocol keywords and their arrival times using the protocol keyword extraction method described in Section 3.2. Assume that the application-layer protocol has $W$ keywords, which can be digitized as $1, 2, \ldots, W$. When users are using the application-layer protocol, the communication process can be described as a series of observations $I_{1:t} = \{I_1, I_2, \ldots, I_t\}$, where $I_t$ ($t \geq 2$) is the observation at the $t$th time that a protocol keyword arrives at the gateway. The value of $I_t$ is based on the protocol keyword and the time interval between adjacent protocol keywords arriving at the gateway; that is, $I_t = (p_t, \tau_t)$, where $p_t$ is the digitized label of the $t$th protocol keyword arriving at the gateway, and $\tau_t$ is expressed by equation (1):

$$\tau_t = R_t - R_{t-1}. \tag{1}$$

In equation (1), $R_t$ denotes the time at which the $t$th protocol keyword arrives at the gateway, and $R_{t-1}$ denotes the time at which the $(t-1)$th protocol keyword arrives at the gateway. In this study, time is measured in seconds. $I_1 = (p_1, \tau_1)$ is the observation generated by the first protocol keyword arriving at the gateway, where $p_1$ is the digitized label of the first protocol keyword arriving at the gateway, and $\tau_1 = 0$.
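
A minimal sketch of how such observations could be assembled from extracted keywords and their arrival times is given below. The function name `build_observations` and the example label mapping are hypothetical; the interval of the first observation is set to 0, as stated above.

```python
def build_observations(events, label_of):
    """Turn (keyword, arrival_time_seconds) pairs into observations.

    events:   list of (keyword, arrival_time_seconds) in arrival order.
    label_of: mapping from keyword to its digitized label in 1..W
              (the mapping itself is an assumption of this sketch).
    Returns a list of (label, interval) pairs; the first interval is 0.
    """
    observations = []
    previous_time = None
    for keyword, arrival in events:
        # Interval to the previous keyword; 0 for the first keyword.
        interval = 0.0 if previous_time is None else arrival - previous_time
        observations.append((label_of[keyword], interval))
        previous_time = arrival
    return observations


# Example with a made-up three-keyword label mapping.
labels = {"GET": 1, "POST": 2, "200": 3}
events = [("GET", 10.0), ("200", 10.5), ("POST", 12.0)]
observations = build_observations(events, labels)
```

Here `observations` is `[(1, 0.0), (3, 0.5), (2, 1.5)]`.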

When using the application-layer protocol, users’ behaviors may change. For example, users may use the HTTP protocol for varied purposes, including browsing web pages, watching movies online, and shopping online. As a result, the protocol keywords and time intervals between adjacent protocol keywords arriving at the gateway will change over time. Therefore, the durations of hidden states in the observation sequences of an application-layer protocol communication process may follow a relatively complex distribution.

We used the EDRN to model the communication process of the application-layer protocol. The EDRN-based application-layer protocol communication model is shown in Figure 1, where xt denotes the predicted probabilities of the possible next states at the (t + 1)th time and yt denotes the probabilities of all possible states at the tth time. The unfolded unit structure of the EDRN is presented in Figure 3, where tanh denotes the hyperbolic tangent function and σ denotes the sigmoid function. Each unit contains four gates: the input, forget, output, and tanh gates.

Assume that the communication process of the application-layer protocol has K macrostates and that each macrostate has L substates. The four gates are calculated using the following equations:

In the above equations, A(1), A(2), A(3), and A(4) denote the probability matrices of state transition; b(1), b(2), b(3), and b(4) are bias parameters following a marginal distribution; and B(1), B(2), B(3), and B(4) denote the probability matrices of observations. A further matrix gives the probabilities of substate transition. In equations (4) and (5), “:” denotes all the states.

As in LSTM, each unit finally passes xt and yt to the next unit. They are calculated using equations (6) and (7), respectively, which use the element-wise product:

4. AL-DDoS Attack Detection

The AL-DDoS detection method proposed in this study involves two phases. In the first phase, we train the EDRN-based protocol communication model. In the second phase, every application-layer protocol communication process is monitored in real time. Once a protocol keyword arrives at the network gateway, the corresponding observation sequence is updated, where t denotes the number of observations. Then, we calculate the output η using the following equation:

In equation (8), () denotes the probability of pt under and denotes that protocol keywords will appear in state . The () is defined and expressed as follows:

In equation (10), denotes the interstate transition probability from to and is defined by the following equation:

In equation (10), is the probability of and is defined by the following equation:

In equation (10), and are defined by equations (13) and (14), where is the scaling factor as follows:

If η is larger than a predefined threshold, the network is considered normal. Otherwise, we consider that there is an AL-DDoS attack related to this protocol in the network. The detection architecture of our method is shown in Figure 4. Our method can detect AL-DDoS attacks in real time.
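
The online update-and-threshold step can be sketched as follows. This is only an illustration of the control flow: the class `OnlineDetector`, the key used to identify a communication process, the threshold value, and in particular `toy_score` are assumptions. `toy_score` is a placeholder and is not the EDRN output η defined by the equations above.

```python
from typing import Callable, Dict, Hashable, List, Tuple

Observation = Tuple[int, float]  # (keyword label, interval in seconds)


class OnlineDetector:
    """Keeps one observation sequence per monitored communication process."""

    def __init__(self, score_fn: Callable[[List[Observation]], float],
                 threshold: float):
        self.score_fn = score_fn    # stand-in for the trained model's output
        self.threshold = threshold  # predefined threshold (assumed value)
        self.sequences: Dict[Hashable, List[Observation]] = {}
        self.last_arrival: Dict[Hashable, float] = {}

    def on_keyword(self, key: Hashable, label: int, arrival: float) -> bool:
        """Update the sequence for `key`; return True if an attack is flagged."""
        previous = self.last_arrival.get(key)
        interval = 0.0 if previous is None else arrival - previous
        self.last_arrival[key] = arrival
        sequence = self.sequences.setdefault(key, [])
        sequence.append((label, interval))
        eta = self.score_fn(sequence)
        # A score above the threshold means normal; otherwise flag an attack.
        return not (eta > self.threshold)


def toy_score(sequence: List[Observation]) -> float:
    """Placeholder score, NOT the EDRN output: mean inter-keyword interval."""
    intervals = [interval for _, interval in sequence[1:]]
    return sum(intervals) / len(intervals) if intervals else 1.0
```

In a real deployment, `score_fn` would wrap the trained model, and the threshold would be chosen from validation data.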

5. Evaluation

In this section, we test our proposed AL-DDoS attack detection method using multiple datasets to evaluate the detection performance against HTTP DDoS and SMTP flood attacks.

5.1. Datasets
5.1.1. HTTP Datasets

At the gateway of the campus network, we collected the data generated by a large number of normal users when using the HTTP protocol. In addition, we adopted the method described in [16] and DDoS generators to generate three different types of HTTP DDoS attacks, namely, single-page, random-page, and top-five-page HTTP DDoS attacks. A single-page HTTP DDoS attack targets a specific page of a website, usually one that is frequently visited by users, while a random-page HTTP DDoS attack targets a random page from all potentially visited pages of a website. A top-five-page HTTP DDoS attack targets the top five most visited pages from a resource site. Subsequently, we extracted observation sequences from the collected data for training and testing. The time length of each observation sequence was 60 seconds. The HTTP datasets are summarized in Table 1.
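
One way to cut a keyword stream into 60-second observation sequences is sketched below. The text does not say how the windows are aligned, so this sketch assumes fixed, non-overlapping windows anchored at the first event and drops empty windows; the function name is hypothetical.

```python
def split_into_windows(events, window_seconds=60.0):
    """Split time-ordered (keyword, arrival_time_seconds) events into windows.

    Assumption of this sketch: windows are fixed-length, non-overlapping,
    anchored at the first event, and empty windows are skipped.
    """
    windows = []
    current = []
    window_start = None
    for keyword, arrival in events:
        if window_start is None:
            window_start = arrival
        # Advance the window until it contains the current arrival time.
        while arrival >= window_start + window_seconds:
            if current:
                windows.append(current)
            current = []
            window_start += window_seconds
        current.append((keyword, arrival))
    if current:
        windows.append(current)
    return windows


# Made-up example stream spanning several windows.
stream = [("GET", 0.0), ("200", 0.4), ("GET", 59.9), ("GET", 60.0), ("POST", 185.0)]
windows = split_into_windows(stream)
```

Each resulting window can then be converted into one observation sequence for training or testing.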

5.1.2. SMTP Dataset

Similar to HTTP data collection, we collected data generated by a large number of normal users when using the SMTP protocol. We adopted the method described in [18] to generate SMTP flood attacks. After that, we extracted observation sequences. The time length of each observation sequence was also 60 seconds. The SMTP dataset is summarized in Table 2.

5.1.3. CICDDoS2019 Dataset

The CICDDoS2019 dataset is a public dataset developed by the Canadian Institute for Cybersecurity (CIC) in 2019 [15]. It is widely used in the field of DDoS detection. The dataset contains 11 kinds of DDoS attacks, among which the AL-DDoS attack is the HTTP DDoS attack. The packets in the CICDDoS2019 dataset contain the application-layer payload. We use this dataset to test the performance of our method against HTTP DDoS attacks.

5.2. Estimation Criteria

In the recurrent neural network training phase of our proposed AL-DDoS detection method, we use accuracy and loss as evaluation metrics, while in the AL-DDoS attack detection phase, we use accuracy, F1, recall, and loss as evaluation metrics. In the comparison experiment with other methods, we use accuracy, F1, and recall as evaluation metrics. The loss is calculated using the following equation:

In equation (15), is the label value of the sample and is the predicted value of the recurrent neural network.
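
For reference, the accuracy, recall, and F1 metrics named above can be computed from binary labels as in the following sketch. It uses the standard definitions and assumes that attack sequences are the positive class (label 1); the text does not state which class is treated as positive, and the function name is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, recall, and F1 for binary labels (1 = attack, 0 = normal).

    Treating attacks as the positive class is an assumption of this sketch.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


# Made-up labels: 3 attack sequences and 5 normal sequences.
metrics = classification_metrics([1, 1, 1, 0, 0, 0, 0, 0],
                                 [1, 1, 0, 1, 0, 0, 0, 0])
```

With these made-up labels there are 2 true positives, 1 false negative, 1 false positive, and 4 true negatives.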

5.3. AL-DDoS Attack Detection Results

In this section, experiments are carried out on a computer with 64-bit Ubuntu OS (version: 20.04.1), TensorFlow (version: 1.14.0), Python (version: 3.6.2), and Keras (version: 2.2.5). To show that the EDRN can better model application-layer protocol communication, we compared it with other RNNs, including LSTM [12], GRU [40], PLSTM [13], IndRNN [41], and DSTP-RNN [42]. In the recurrent neural network training phase of our AL-DDoS detection method, the maximum number of epochs was set to 100.

5.3.1. Detection Results on HTTP Datasets

The training results of different RNNs on D1, D2 and D3 datasets, as the epoch changes, are shown in Figures 5(a) and 5(b), 6(a) and 6(b), 7(a) and 7(b), respectively. The EDRN had a higher training accuracy and lower training loss on the D1 dataset than the other RNNs. When the epoch reached 100, the training accuracy rates of LSTM, GRU, PLSTM, IndRNN, DSTP-RNN, and the EDRN were 0.9921, 0.9929, 0.9914, 0.9923, 0.9926, and 0.9981, respectively, while their training loss rates were 0.0274, 0.0211, 0.0247, 0.0230, 0.0206, and 0.0105, respectively.

On the D2 dataset, the EDRN had the lowest training loss and highest training accuracy. At the end of training, the training accuracy rates of LSTM, GRU, PLSTM, IndRNN, DSTP-RNN, and the EDRN were 0.9929, 0.9939, 0.9933, 0.9936, 0.9941, and 0.9979, respectively, while their training loss rates were 0.0207, 0.0174, 0.0202, 0.0188, 0.0156, and 0.0050, respectively.

On the D3 dataset, the training accuracy of EDRN was the highest. When the epoch was 100, the training accuracy rates of LSTM, GRU, PLSTM, IndRNN, DSTP-RNN, and the EDRN were 0.9928, 0.9940, 0.9933, 0.9938, 0.9943, and 0.9976, respectively. Conversely, the training loss of the EDRN was the lowest. At the end of training, the training loss rates of LSTM, GRU, PLSTM, IndRNN, DSTP-RNN, and the EDRN were 0.0225, 0.0184, 0.0209, 0.0197, 0.0168, and 0.0065, respectively.

The lower the training loss and the higher the training accuracy, the better the performance of the recurrent neural network. Therefore, the EDRN performed best on the D1, D2, and D3 datasets in the training phase. On the HTTP datasets, the average training accuracy and loss of the EDRN were 0.9978 and 0.0073, respectively.

After training, we used the corresponding testing sets to evaluate the EDRN and other RNNs. The test results are listed in Table 3, and as shown, the EDRN had the highest accuracy, F1, and recall, and the lowest loss on D1, D2, and D3 datasets. Hence, the EDRN had the best performance in the HTTP DDoS attack detection phase. On the HTTP datasets, the average test accuracy, F1, recall and loss of the EDRN were 0.995, 0.991, 0.992, and 0.042, respectively.

5.3.2. Detection Results on SMTP Dataset

The training results on the SMTP dataset are shown in Figures 8(a) and 8(b). At the end of training, the training accuracy rates of LSTM, GRU, PLSTM, IndRNN, DSTP-RNN, and the EDRN were 0.9926, 0.9937, 0.9931, 0.9936, 0.9941, and 0.9986, respectively; the training loss rates of LSTM, GRU, PLSTM, IndRNN, DSTP-RNN, and the EDRN were 0.0224, 0.0183, 0.0203, 0.0193, 0.0165, and 0.0062, respectively. In the training phase, the EDRN attained the lowest training loss and the highest training accuracy. That is, the EDRN achieved the best performance in the training phase. A comparison of test results is shown in Table 4. Compared with other RNNs, the EDRN performed better in detecting SMTP flood attacks.

5.3.3. Detection Results on CICDDoS2019 Dataset

We compared the EDRN with other RNNs on the CICDDoS2019 dataset. The test results, listed in Table 5, show that the EDRN achieved the best performance in detecting HTTP DDoS attacks on the CICDDoS2019 dataset.

5.3.4. Comparison with Existing Approaches

In this section, we compare our proposed AL-DDoS attack detection method with several existing state-of-the-art approaches. Accuracy, F1, and recall are adopted as evaluation metrics. The comparison results based on the HTTP, SMTP, and CICDDoS2019 datasets are presented in Tables 6–8, respectively.

Existing approaches use traditional statistical analysis or machine learning algorithms to detect HTTP DDoS attacks, while this study uses EDRN to construct an application-layer protocol communication model to detect HTTP DDoS attacks. The EDRN is a novel recurrent neural network that has better performance than traditional statistical analysis and machine learning algorithms in sequence data mining. Therefore, our method outperforms existing approaches in detecting HTTP DDoS attacks.

Existing approaches do not consider the characteristics of application-layer protocol communication when detecting SMTP flood or HTTP DDoS attacks. However, the communication process of the application-layer protocol can better reflect the users’ behavior. This study uses EDRN to describe the communication process of the application-layer protocol, which can capture the suddenness, randomness, and volume of protocol communication. Therefore, our method has better performance than existing approaches in detecting SMTP flood attacks.

6. Discussion

The accuracy of the application-layer protocol identification method has a great influence on the performance of the proposed AL-DDoS attack detection method. We conducted an online test of the application-layer protocol identification method at the gateway of a real campus network, as shown in Figure 9. The test lasted five hours, and accuracy and recall were selected as evaluation indicators. Table 9 presents the identification results of some common application-layer protocols. The test results show that for common application-layer protocols, the accuracy and recall of the method were both above 0.998. Therefore, the application-layer protocol identification method can meet the needs of AL-DDoS attack detection.

To improve the accuracy of the EDRN-based protocol communication model, we update the model parameters online. Specifically, we collect training observation sequences of normal and AL-DDoS attacks online, and then train the model parameters at regular intervals, as shown in Figure 10.

7. Conclusion

This study investigated the issue of AL-DDoS attack detection. An application-layer protocol communication model is proposed based on the EDRN. The model takes as input application-layer protocol keywords and time intervals between adjacent protocol keywords. Based on the application-layer protocol communication model, a new method is proposed for AL-DDoS attack detection. We evaluated the EDRN-based model and compared it with other RNNs. The experimental results show that the EDRN outperforms traditional RNNs, and our model can effectively detect multiple types of AL-DDoS attacks. For the datasets collected from a real campus network, our model achieves an overall accuracy of 0.995, F1 of 0.991, recall of 0.992, and loss of 0.043. For the CICDDoS2019 dataset, our model can effectively detect HTTP DDoS attacks, with an accuracy of 0.996, F1 of 0.992, recall of 0.993, and loss of 0.041. Our model can be used to detect AL-DDoS attacks in multiple networks, including the Internet of Vehicles, the Internet of Things, and software-defined networks.

However, it is difficult to define the protocol keywords of emerging application-layer protocols. Therefore, our model cannot effectively detect AL-DDoS attacks based on emerging application-layer protocols. In future work, we aim to automatically analyze emerging application-layer protocols and define their protocol keywords.

Data Availability

The CICDDoS2019 dataset used to support the findings of this study is a public dataset developed by the Canadian Institute for Cybersecurity (CIC) in 2019. The CICDDoS2019 data can be downloaded from https://www.unb.ca/cic/datasets/ddos-2019.html.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Guangdong Basic and Applied Basic Research Foundation (grant no. 2018A0303130045) and the Science and Technology Program of Guangzhou (grant no. 201904010334).