Abstract

Cyberattacks on Internet of Things (IoT) deployments of fully integrated servers, applications, and communication networks are increasing at an exponential rate. The longer problems in an IoT network remain undetected, the more the performance of sensitive devices degrades, harming end users, increasing cyber threats and identity misuse, raising costs, and reducing revenue. For effective safety and security, attacks on IoT interfaces must be detected in near real time. In this paper, a smart intrusion detection system suited to detecting IoT-based attacks is implemented. In particular, a deep learning algorithm is used to detect malicious IoT network traffic. The proposed solution ensures secure operation and supports interoperability among IoT connectivity protocols. An intrusion detection system (IDS) is one of the most widely used types of network security technology. According to our experimental results, the proposed architecture readily recognizes real-world intruders, and the use of a neural network to detect attacks works exceptionally well. In addition, there is an increasing focus on user-centric cybersecurity solutions, which require the collection, processing, and analysis of massive volumes of data traffic and network connections in 5G networks. After testing, the autoencoder model outperformed the alternatives, effectively reducing detection time while improving detection precision. Using the proposed technique, an accuracy of 99.76% was achieved.

1. Introduction

Deep learning frameworks have become an active field for network intrusion detection in cybersecurity. While many excellent surveys cover this growing research field, the literature lacks an impartial comparison of different deep learning models, especially on recent intrusion detection datasets, in a controlled setting. Cybersecurity is a critical issue in today’s world [1, 2]. Firewalls, for example, have long been used to keep data confidential [1]. An IDS analyses network traffic or a specific computer environment to detect signs of malicious activity [2]. The rapid growth of interest in artificial intelligence (AI) has resulted in major advances in mechanisms such as pattern recognition and anomaly detection.

Neural networks are a suitable option for these problems, so their use is no longer limited. This is largely due to the increase in the number of available computing resources. This situation has led researchers to modify neural network architectures to build or improve IDSs [35].

1.1. Internet of Things Devices Using 5G Networks

5G is a key enabler of the Internet of Things because it provides a faster network with greater capacity to meet connectivity needs. The 5G spectrum broadens the frequencies that mobile communication technologies use to transmit data. Because a wider range is available for use, the overall bandwidth of mobile networks rises, enabling more devices to connect. To resolve current issues, 5G certainly requires control over the network's response time and infrastructure. Note that the IoT model involves several interaction technologies, including wireless sensor networks; therefore, the role of fog computing is also evident here. Fog computing, or fogging, consists mainly of the effective distribution of data, transmission, storage, and applications between data sources and the cloud through a decentralized networking and computing framework. Various applications of the 5G network are depicted in Figure 1, which was reproduced from the article [5].

1.2. The Architecture of 5G Technology

5G has an integrated infrastructure that usually updates network modules and terminals to accommodate new scenarios. Service companies can often use this advanced technology to easily offer value-added services. This upgradability is based on cognitive radio technology, which has many important characteristics, such as the ability of devices to recognize their location, voice, sensors, health, energy, environment, temperature, and so on. In its working environment, cognitive radio equipment works as a transceiver that can perceive incoming radio signals and respond to them. Furthermore, it automatically detects changes in its environment and responds to them continuously.

As 4G becomes fully publicly accessible, service providers will be forced to adopt 5G networks [6]. To satisfy customer demands and resolve conflicts in the 5G environment, a fundamental change in the growth of 5G wireless cellular technology is needed. According to the researchers’ major findings in [7], the majority of wireless customers spend roughly 80% of their time indoors and 20% of their time outdoors. A tight blend of NFV and SDN innovations allows 5G networks to efficiently detect cyberattacks and mitigate them. To address this challenge, a new design technique for planning the 5G cellular architecture has emerged: distinguishing between outdoor and indoor setups [8]. With this design technique, the accessibility loss across a building’s boundaries is slightly decreased. In device-to-device communications, user details are relayed through other users' devices, so the anonymity of this information is a key concern. Closed access ensures the confidentiality of programs at the system stage. The system lists as trustworthy those devices belonging to consumers situated nearby or in one's place of business, or otherwise belonging to a trusted entity such as an organization; these can easily connect while maintaining a degree of confidentiality, whereas devices not on this list can interact only at the macrocell level.

One of the important issues in a 5G network lies in its components, at both the design and the deployment phase: every element needs to authenticate with all other elements in the network architecture before initiating any operation, and at the physical layer the network components must likewise be built from trustworthy components. As internet traffic grows continuously and the 5G and IoT domains keep evolving, many security breaches can result from intrusion-based attacks such as Distributed Denial-of-Service (DDoS) attacks, which affect not only the application layer of the Open Systems Interconnection (OSI) model but also the network layer. In this paper, the dataset used for implementation contains all such kinds of attacks, which are feasible against both 5G networks and IoT-based systems. Hence, a novel technique to detect such attacks is elaborated in Section 3 of this paper.

1.3. The Contribution of the Paper Is as Follows

(i) A novel autoencoder-based deep learning technique is implemented for the detection of network attacks
(ii) Several machine learning algorithms are used for implementation purposes
(iii) A recent benchmark dataset is used for the implementation
(iv) A comparative analysis of the work against existing frameworks is provided

1.4. The Structure of This Paper Is as Follows

Section 2 outlines the literature review based upon various intrusion detection systems. The third section reveals the research methodology by providing a theoretical description of IDS and deep neural network (DNN) concepts along with the details of the proposed model. Following that, the fourth section provides the results and discussion of intrusion detection with system configuration and comparative analysis. The fifth section provides the conclusion based upon the presented work along with its future scope.

2. Literature Survey

2.1. Intrusion Detection System-Based Detection Systems

To identify possible computer intrusions, intrusion detection calls for monitoring and analysing running networks and network traffic. An IDS is a collection of methods and mechanisms for this purpose. In general, most IDSs offer standard capabilities to secure the network [9]. An IDS starts with data collection from the observed incidents. It performs detailed logging and compares operations with event-related data from different networks. At the core of an IDS is the detector, which employs various methods and related techniques depending on the situation.

There may also be a mitigation capability. A system that both identifies and prevents intrusions is called an Intrusion Detection and Prevention System.

2.2. Anomaly-Based Detection Strategy

In anomaly-based recognition methods, different model structures, such as generative, discriminative, and hybrid structures, can be used. Classification is binary if only normal and anomalous behaviours are to be distinguished, whereas assigning each assault to one form of attack requires a multiclass setup; a hierarchical structure may be further broken down into many subclasses. Some studies using hierarchical datasets have been performed. The detector often operates online or offline to support real-time applications. In the last several years, various research articles have been published on intrusion detection techniques, many of which focus on detecting network attacks using machine learning and deep learning approaches. In Table 1, various existing frameworks are compared.

2.3. Summary of the Related Work

After reviewing various research papers on intrusion detection systems, it can be observed that the commonly used NSL-KDD and KDD99 datasets were utilized for implementation. Although both are benchmark datasets, they appear outdated, and hence work needs to be carried out on a newer dataset. Also, very few researchers have implemented systems based on the Internet of Things and 5G networks [20–24]. In this work, the UNSW-NB15 benchmark dataset, created with the IXIA PerfectStorm tool in the Cyber Range Lab of UNSW Canberra, was used [25, 26].

3. Proposed Methodology

3.1. Background of Deep Learning Architectures

Deep learning is one of the popular techniques of data mining. It is a valuable approach for modelling abstract concepts and relationships using more than two neural layers [27]. Deep learning is currently being studied in several areas, including image recognition, speech recognition, natural language processing, social network filtering, and so forth. Deep learning algorithms also differ in their ability to find correlations across vast data from different sources while carrying out attribute learning, classification, or clustering tasks at the same time. The various deep learning techniques used by researchers in the development of intrusion detection systems are discussed as follows:
(i) Generative Architectures. Algorithms are trained dynamically on unlabelled raw data to carry out different activities. This is the most general architecture in this class.
(ii) Autoencoder (AE). An AE [28] is a neural network widely used to reduce dimensionality by producing an improved data representation compared to the raw inputs. The AE contains input and output layers with the same number of feature vectors, in addition to a hidden layer with a low-dimensional representation. An AE trains an encoder and a decoder jointly with backpropagation. As the input is translated into a small abstraction, the encoder captures the salient characteristics and learns a representation of the data. Afterwards, the decoder receives the small representation and reconstructs the original features [29]. Several AE extensions, such as stacked AE (SAE), sparse AE, and denoising AE, are available.
(iii) SAE. A stacked AE cascades more than one hidden layer into a larger network. The features used to create the new data representation are thereby studied more thoroughly [30].
(iv) Sparse AE. In a sparse AE, only a small number of hidden units are active at a time. While many hidden units are available to learn data representations, the AE remains valuable: sparsity constraints ensure that most neurons are inactive, keeping the average output low [31].
(v) Denoising AE. A denoising AE is trained on corrupted data to produce a refined data view, so that the hidden layers use only robust feature vectors [32].
(vi) Restricted Boltzmann Machine (RBM). The Boltzmann machine (BM), a probabilistic neural network, was created by Hinton and Sejnowski. A BM consists of symmetrically connected binary units. The many interactions between units, however, make learning very slow [33]. The RBM, introduced by Smolensky in 1986, addresses this drawback: connections between neurons on the same layer are removed. An RBM contains a visible layer for the initial input variables and a hidden layer of latent variables; every visible unit is connected to every hidden unit with corresponding weights. The hidden units learn the feature distribution of the input variables. An RBM is typically employed as a preprocessing feature extractor or for initializing the parameters of another learning network, but it may also be used as a classification model in its own right: Larochelle and Bengio proposed the discriminative RBM (DRBM), a standalone nonlinear classifier. When several Boltzmann machines are cascaded, the result is called a deep Boltzmann machine (DBM).
(vii) Deep Belief Network (DBN). The DBN consists of stacked RBMs trained in a greedy layer-wise fashion, each RBM taking the hidden layer of the previous RBM as its input. This deep learning algorithm is efficient and fast to train. If an additional output layer is applied, the DBN generalizes to both dimensionality reduction and standalone classification in practical applications.
(viii) Recurrent Neural Network (RNN). The RNN, a dynamic feedback neural network, was proposed by Hopfield in 1982. Unlike a feedforward network, in which the output of each layer depends only on the layer before it, an RNN feeds the outputs of hidden layers back as inputs. Various memory-unit models, such as long short-term memory and the gated recurrent unit, extend this idea.
(ix) Long Short-Term Memory (LSTM). The LSTM addresses the vanishing-gradient problem of the RNN. It learns long-term dependencies by utilizing a gating scheme; every LSTM unit is equipped with a memory cell that holds past states.
(x) Gated Recurrent Unit (GRU). The GRU is a lightweight version of the LSTM: its architecture is streamlined, with gates merged and states integrated.
(xi) Classic Neural Network (NCN). The multilayer perceptron, also known as the fully connected network. The model must be adapted to binary inputs for simple records.
(xii) Linear Function (LF). As the name suggests, it is a single line: the input multiplied by a constant multiplier.
(xiii) Nonlinear Function (NLF). Nonlinear activation functions include the sigmoid, an S-shaped curve mapping inputs to values between 0 and 1, and the hyperbolic tangent (tanh), an S-shaped curve with a range of -1 to 1.
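As an illustration of the encoder–decoder idea described above, the following minimal sketch trains a one-hidden-layer linear autoencoder with plain NumPy on synthetic data lying on a low-dimensional subspace. All sizes, learning rates, and iteration counts here are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples of 8 features that actually live on a 3-D subspace.
latent = rng.normal(size=(200, 3))
X = latent @ rng.normal(size=(3, 8))

# Encoder weights We compress 8 -> 3; decoder weights Wd reconstruct 3 -> 8.
We = rng.normal(scale=0.1, size=(8, 3))
Wd = rng.normal(scale=0.1, size=(3, 8))

lr = 0.01
for _ in range(2000):
    H = X @ We                      # encode: low-dimensional representation
    X_hat = H @ Wd                  # decode: reconstruction of the input
    err = X_hat - X
    # Backpropagate the mean squared reconstruction error.
    grad_Wd = H.T @ err / len(X)
    grad_We = X.T @ (err @ Wd.T) / len(X)
    Wd -= lr * grad_Wd
    We -= lr * grad_We

mse = float(np.mean((X @ We @ Wd - X) ** 2))
print(f"final reconstruction MSE: {mse:.4f}")
```

Because the data genuinely occupy a 3-dimensional subspace, a 3-unit bottleneck can drive the reconstruction error far below the raw variance of the features, which is exactly the dimensionality-reduction property the AE-based detector relies on.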

3.2. Proposed Architecture

In this work, the UNSW-NB15 benchmark dataset was used. Initially, the dataset was analysed in the data preprocessing phase, where the columns based on null values were dropped. The updated dataset was then passed to the feature selection and feature scaling phase. In this phase, important features were selected using the Pearson correlation technique. The attack categories used for the analysis are depicted in Figure 2. After obtaining the important features, the categorical features were converted to numeric features using the one-hot encoding technique. For scaling, feature normalization and standardization techniques were used. Finally, various machine learning and deep learning algorithms were used to train the model. In the training phase, 80% of the dataset was utilized, while the remaining 20% was used for testing. The proposed autoencoder technique outperformed the other models. Initially, a deep neural network and the proposed model were implemented using 100 epochs. The proposed model achieved more promising results when trained for 5000 epochs. The generalized flow chart of the proposed model is depicted in Figure 3. The proposed system algorithm is represented in Algorithm 1 as follows:

Input: Complete dataset D = (a1, a2, …, an), where ai ∈ X
Output: Prediction result for the binary class label, denoted by variable b
Step 1: Data preprocessing(D) ⟶ S′
Step 2: Pearson correlation(S′) ⟶ correlated sample S″
Step 3: Training using AE ⟶ Cust_AE
Step 4: Apply AE(S′, Cust_AE) ⟶ Full_AE
Step 5: S″ = S′ restricted to the correlated features given by Full_AE
Step 6: Training using DNN(S″) ⟶ Cust_DNN
Step 7: Append Cust_DNN after Cust_AE to form Cust_AE + Cust_DNN
Step 8: Input testing data to generate class label b.

For identifying the highly correlated features in the dataset, Pearson correlation technique was used. The correlated features were analysed through heatmap as provided in Figure 4.
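The Pearson-correlation-based selection step can be sketched as follows with pandas. The column names and the 0.95 threshold below are hypothetical stand-ins for illustration, not the values used in the actual experiments:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 500
base = rng.normal(size=n)
# Toy frame standing in for the preprocessed UNSW-NB15 features.
df = pd.DataFrame({
    "dur": base,
    "sbytes": base * 2 + rng.normal(scale=0.1, size=n),  # nearly duplicates "dur"
    "dbytes": rng.normal(size=n),
    "rate": rng.normal(size=n),
})

# Absolute Pearson correlation matrix (also what a heatmap would visualize).
corr = df.corr(method="pearson").abs()
# Keep only the upper triangle so each pair is inspected once,
# then drop one feature from every pair whose |r| exceeds the threshold.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.95).any()]
reduced = df.drop(columns=to_drop)
print("dropped:", to_drop)
```

Dropping one member of each highly correlated pair removes redundant inputs before the scaling and training phases.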

For implementation, ensemble-based machine learning techniques were considered: XGBoost, AdaBoost, ExtraTreeClassifier, and Random Forest Classifier. Results obtained through these machine learning algorithms are depicted in Figures 5–8.
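A minimal sketch of this ensemble comparison using scikit-learn on synthetic data is given below. XGBoost is omitted because it lives in an external package, and the hyperparameters are library defaults rather than the tuned values from the experiments:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                              RandomForestClassifier)
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic binary-labelled traffic stand-in (the real runs used UNSW-NB15).
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
# Same 80/20 train/test split as described in the methodology.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

models = {
    "AdaBoost": AdaBoostClassifier(random_state=42),
    "ExtraTrees": ExtraTreesClassifier(random_state=42),
    "RandomForest": RandomForestClassifier(random_state=42),
}
scores = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    scores[name] = accuracy_score(y_te, model.predict(X_te))
    print(f"{name}: {scores[name]:.3f}")
```

The same fit/predict loop generalizes to any of the ensemble classifiers compared in the figures.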

Figures 9 and 10 represent the model summary of deep neural network and the proposed methodology, respectively.

4. Results and Discussions

4.1. Dataset Description and Preprocessing

In this work, the UNSW-NB15 dataset was used for the implementation. This dataset consists of a total of 175341 rows and 45 attributes. In the dataset preprocessing step, the null values were first dropped, which reduced the dataset to almost half its size. Further, encoding techniques were used to handle the categorical features. For feature scaling, the normalization technique was applied. Figure 2 provides the details of the attack categories available in the dataset. A sample of the feature description is depicted in Figure 11. The dataset distribution after splitting into training and testing samples is depicted in Figure 12. For generating efficient results with the proposed model, the implementation was done on a high-configuration architecture comprising an 8-core AMD RYZEN 9 processor, 64-bit Windows 10 OS, 16 GB of RAM, and a 6 GB GTX 1660 TI GPU. The complete model script was implemented in the Jupyter Notebook tool using the Python programming language.
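The preprocessing steps above can be sketched on a toy frame as follows; the column names are hypothetical stand-ins for the UNSW-NB15 attributes:

```python
import pandas as pd

# Tiny stand-in for the raw dataset (column names are illustrative only).
df = pd.DataFrame({
    "dur": [0.1, 0.5, None, 0.3],
    "proto": ["tcp", "udp", "tcp", "tcp"],
    "sbytes": [100, 2000, 50, 300],
    "label": [0, 1, 0, 1],
})

# 1) Drop rows containing null values.
df = df.dropna()

# 2) One-hot encode the categorical features.
df = pd.get_dummies(df, columns=["proto"])

# 3) Min-max normalize the numeric features to [0, 1].
for col in ["dur", "sbytes"]:
    lo, hi = df[col].min(), df[col].max()
    df[col] = (df[col] - lo) / (hi - lo)

print(df)
```

The same three steps (null removal, encoding, scaling) apply unchanged to the full 175341-row dataset; only the column lists differ.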

4.2. Evaluation Metrics

The main aim of the evaluation metrics is to depict the implications of enriching the IDS with the proposed model using the following important parameters:
(i) Maximize detection rate (DR)
(ii) Maximize accuracy (AC)
(iii) Minimize false alarm rate (FA)
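Assuming the standard confusion-matrix definitions of these quantities (DR = TP/(TP+FN), AC = (TP+TN)/total, FA = FP/(FP+TN)), they can be computed as:

```python
def ids_metrics(y_true, y_pred):
    """Detection rate, accuracy, and false alarm rate for binary labels
    (1 = attack, 0 = normal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    dr = tp / (tp + fn)            # detection rate: recall on attack traffic
    ac = (tp + tn) / len(y_true)   # overall classification accuracy
    fa = fp / (fp + tn)            # false alarm rate on normal traffic
    return dr, ac, fa

# Example: 8 flows, one missed attack and one false alarm.
print(ids_metrics([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 0, 1]))
```

A good IDS pushes DR and AC toward 1 while keeping FA near 0, which is precisely the trade-off the three parameters above express.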

Our model obtained higher accuracy than the existing models, as depicted in Table 2. The proposed model was tested on various performance metrics, and classification accuracy was used as the parameter for comparison with existing models.

Table 3 provides the details of the results obtained using various machine learning algorithms. Results obtained using the deep learning algorithm and the proposed methodology are provided in Table 2. In Table 2, only results from those researchers who used accuracy as their comparative parameter are considered.

In Figures 13 and 14, the accuracy results and loss results of the proposed methodology computed for 100 epochs are depicted. An appendix of all the acronyms is given in Table 4.

5. Conclusion and Future Scope

The proposed algorithm is trained using the SoftMax classifier to identify the attack types in the dataset. The benchmark dataset, UNSW-NB15, was used for training and testing the model. For training the hidden layers, there are many options, such as the linear, SoftMax, sigmoid, and rectified linear functions, which can be used as activation functions. A novel autoencoder technique was used for training and testing the model, and the proposed model achieved comparatively higher accuracy than the existing systems. Several machine learning and deep learning approaches were used for implementation purposes. To extend the work further, a stacked autoencoder technique can be used to reduce the computational resources required. Also, more focus can be given to optimizing the computational time.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.