DETECTION OF ATTACKS ON A COMPUTER NETWORK BASED ON THE USE OF NEURAL NETWORKS COMPLEX

Purpose. The article aims to develop a methodology for detecting attacks on a computer network. To achieve this goal, the following tasks were solved: to develop a methodology for detecting attacks on a computer network based on an ensemble of neural networks, using normalized data from the open KDD Cup 99 database; and, during training, to identify the optimal parameters of the neural network that provide a sufficiently high level of reliability of intrusion detection in the computer network. Methodology. As an architectural solution for the attack detection module, a two-level network system is proposed, based on an ensemble of five neural networks of the multilayer perceptron type. The first neural network determines the category of the attack class (DoS, R2L, U2R, Probe) or the fact that there was no attack; the other neural networks detect the type of attack, if any (each of these four neural networks corresponds to one attack class and can identify only the types that belong to this class). Findings. The created software model was used to study the parameters of the neural network of configuration 41–1–132–5, which determines the category of the attack class on the computer network. It was determined that the optimal training speed is 0.001. The ADAM algorithm proved to be the best for optimization. The ReLU function is the most suitable activation function for the hidden layer, and the hyperbolic tangent function for the output layer. Accuracy on the test and validation samples was 92.86 % and 91.03 %, respectively. Originality. The developed software model, which uses the Python 3.5 programming language, the PyCharm 2016.3 integrated development environment and the TensorFlow 1.2 framework, makes it possible to detect all types of attacks of the DoS, U2R, R2L and Probe classes. Practical value. Graphs of the accuracy of the neural networks as a function of several parameters were obtained: training speed, activation function, and optimization algorithm. The optimal parameters of the neural networks have been determined, which will ensure a sufficiently high level of reliability of intrusion detection in a computer network.


Introduction
The efficiency of modern information systems largely depends on how well the information processed in them is protected. According to the Verizon 2018 Data Breach Investigations Report [14], the problem of intrusion detection remains relevant.
The average cost of hacking increases by 6 % from year to year. There are many algorithms for classification and anomaly detection, each with its own advantages and disadvantages [11]. Anomaly-based intrusion detection systems are also used to detect new types of attacks: a model of normal behavior is built from a set of queries, and each subsequent query to the system is compared with it.
The capabilities of existing security systems do not ensure a sufficient level of security for the information system. The reason is that creating attack detection systems involves a number of unsolved scientific and technical problems. Existing attack detection systems use the simplest algorithms for processing incoming information, which leaves a significant number of attacks on information systems undetected. The main methods of detecting attacks are misuse detection and anomaly detection [11]. Misuse detection relies on attack signatures and is based on the simple notion of matching a sequence against a sample. Because the signature-based method is static, it is vulnerable to new types of attacks. Detecting them requires systems capable of self-training in real time [6]. Creating an efficient attack detection system therefore requires qualitatively new approaches to information processing, based on adaptive algorithms capable of self-training. Two main types of intrusion detection systems based on neural networks are known: IDS (Intrusion-Detection System) based on a combination of an expert system and a neural network, and IDS that use neural networks (NN) as autonomous systems [6]. The most promising direction in creating such attack detection systems is the use of artificial intelligence. Both large foreign commercial companies (Cisco, Computer Associates, ISS, Symantec and others) and research centers at various universities (Columbia University, Florida Institute of Technology, Purdue University, Ohio University, etc.) carry out research in this area.
Analysis of scientific sources. The most promising direction is IDS built on NN: Multi Layer Perceptron (MLP), Radial Basis Function Network (RBF), and Kohonen networks, or Self Organizing Maps (SOM). For example, in [10] only DDoS attacks on the TCP, UDP and ICMP protocols were analyzed because of their popularity among attackers. In [13], some threats in the network were detected by analyzing and processing the parameters of network connections that use the TCP/IP protocol stack, using a neural network of configuration 19-1-25-5 (19 is the number of input neurons, 1 is the number of hidden layers, 25 is the number of hidden neurons, and 5 is the number of output neurons); other types of attacks, however, still require research.
At the present stage, on the one hand, more and more works use a combined approach to the problem. For example, [5] proposed a new ensemble classifier that uses RBF and fuzzy clustering to increase detection accuracy, reduce the number of false positives, and provide a higher detection rate for infrequent attacks. In [1], an approach using neural networks, immune systems, neuro-fuzzy classifiers and their combinations is considered. The essence of hybrid approaches is to implement various schemes for combining basic classifiers, which eliminates the shortcomings the classifiers have when they operate separately. However, an important disadvantage of such methodologies is their lack of universality. In [4], to improve the efficiency of IDS, a coincidence method is proposed, based on the fact that NN with different topologies (MLP, RBF, SOM) can detect different attacks, while their false positives do not always occur on the same network packets. In addition, each type of neural network has its own advantages and disadvantages, which must be taken into account or studied further.
On the other hand, there are attempts to use NN at different levels. For example, [2] considers a new approach to constructing a multilevel network intrusion detection system. Groups of similar parameters of internetwork interaction are fed to the inputs of individual first-level modules, each of which is a hierarchical structure of several NN of different types and detects anomalies for its group of parameters. The results of the first-level modules are fed to the input of the second-level solver, which makes the final decision about the presence of an attack and its classification. With this approach, the probability of identifying known attacks was 91 %, and the probability of detecting intrusions that were not represented in the training data was 86 %. However, the developed prototype has a relatively significant probability of type II errors (18 %); analyzing and correcting the causes of these errors is a promising subject for further study.
In addition, large information systems (in particular, the information and telecommunication system of railway transport) face the problem of large volumes of network traffic: the traffic changes constantly, and it is difficult to establish any cyclical pattern in these changes. To detect situations related to possible intrusions into the computer network underlying various information systems more efficiently, modern data mining technologies have recently become widely used [8]. Network intrusion detection systems based on the anomaly detection paradigm have a high false alarm rate, which complicates their use. To address this weakness, [7] proposed smoothing the outputs of anomaly detectors using local adaptive multifactor smoothing.
We believe that, when a large amount of constantly changing network traffic is processed, using a multi-level MLP-based network system with machine training (especially deep training) to detect various categories of attacks leads to a large number of false positives and missed attacks. Therefore, one approach to solving this problem is to conduct additional research to determine rational parameters of the NN.

Purpose
Our study aims to develop a methodology for detecting attacks on a computer network. Achieving this goal involves solving the following tasks: to develop a method of detecting attacks on a computer network based on an ensemble of neural networks; and, during training, to find the optimal parameters of the neural network (MLP) that provide a sufficiently high level of reliability of intrusion detection in the computer network.

Methodology
Materials used in the modeling. An attack detection system based on neural networks of the multilayer perceptron type is considered. In [3], the authors investigated two approaches to detecting attacks on a computer network and showed that using an ensemble of neural networks is rational. In our work, the source of data for training and testing the neural networks is the KDD Cup 99 database [9], which contains more than four gigabytes of characteristics of TCP connections. The database covers the following categories of attacks: DoS, R2L, U2R, Probe. Each of these categories, in turn, is represented by several types.
DoS attacks are network attacks aimed at causing a denial of service in the attacked system. Such attacks are characterized by the generation of large amounts of traffic, which overloads and blocks the server. There are six types of DoS attacks: back, land, neptune, pod, smurf, teardrop.
R2L attacks are characterized by access by an unregistered user from a remote computer. There are eight types of R2L attacks: ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster.
U2R attacks involve obtaining a privilege of a local superuser (network administrator) by a registered user. There are four types of U2R attacks: buffer_overflow, loadmodule, perl, rootkit.
Probe attacks scan network ports in order to obtain confidential information. There are four types of Probe attacks: ipsweep, nmap, portsweep, satan.
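The taxonomy above can be captured in a small lookup structure. The sketch below is only an illustration: the names ATTACK_TYPES, TYPE_TO_CATEGORY and category_of are hypothetical, and the attack names are the type labels listed in the text.

```python
# Attack categories and their types, as listed in the text.
ATTACK_TYPES = {
    "DoS": ["back", "land", "neptune", "pod", "smurf", "teardrop"],
    "R2L": ["ftp_write", "guess_passwd", "imap", "multihop",
            "phf", "spy", "warezclient", "warezmaster"],
    "U2R": ["buffer_overflow", "loadmodule", "perl", "rootkit"],
    "Probe": ["ipsweep", "nmap", "portsweep", "satan"],
}

# Reverse index: attack type -> category (hypothetical helper structure).
TYPE_TO_CATEGORY = {t: c for c, types in ATTACK_TYPES.items() for t in types}


def category_of(label):
    """Return the category of a type label, or "normal" if it is not a listed attack."""
    return TYPE_TO_CATEGORY.get(label, "normal")
```

A lookup such as category_of("smurf") then yields the first-level target class, while the label itself is the second-level target.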
The input vector for the attack detection system is a set of 41 TCP connection parameters. Their full description is given in [9]; examples are shown in Table 1.
As an architectural solution for the attack detection module, five neural networks of the multilayer perceptron type are proposed: NN1 determines the category of the attack class (DoS, R2L, U2R, Probe) or the fact that there was no attack; NN2…NN5 detect the type of attack, if any (each of these four neural networks corresponds to one attack class and can identify only the types that belong to this class). Fig. 1 shows the structure of a hypothetical complex that uses this architectural solution.
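The two-level scheme can be sketched as a simple dispatcher. This is a minimal illustration, not the authors' implementation: the classifiers are passed in as plain callables, and the names CATEGORIES, argmax, detect, stub_nn1 and stub_level2 are hypothetical. The order of the categories in the output of the first network is also an assumption.

```python
# Assumed order of the first-level outputs: four attack classes plus a regular connection.
CATEGORIES = ["DoS", "R2L", "U2R", "Probe", "normal"]


def argmax(values):
    """Index of the largest value in a sequence."""
    return max(range(len(values)), key=lambda i: values[i])


def detect(features, first_level, second_level):
    """Route one connection through the two levels.

    first_level:  callable returning five scores, one per entry of CATEGORIES.
    second_level: dict mapping an attack class to a callable that returns
                  the attack type within that class.
    """
    category = CATEGORIES[argmax(first_level(features))]
    if category == "normal":
        return category, None  # no attack: the second level is not used
    return category, second_level[category](features)


# Usage with stand-in classifiers (stubs, not trained networks).
stub_nn1 = lambda x: [0.1, 0.0, 0.0, 0.8, 0.1]
stub_level2 = {c: (lambda x, c=c: c + "-type") for c in CATEGORIES[:4]}
result = detect([0.0] * 41, stub_nn1, stub_level2)
```

Only one second-level classifier runs per connection, which mirrors the idea that the first network selects which of the four class-specific networks to activate.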
The complex contains a module for detecting network attacks, which receives connection data from network sensors and passes the result to the response module. The signal from NN1, which represents the category of the attack class, uses the «key» switch to turn on one of the neural networks NN2…NN5, which then determines the type of attack within this class.
Neural network to identify the category of attack class or determine the fact of its absence. The input vector consists of 41 TCP connection parameters; the output is a vector of five values, four of which represent the attack classes and the fifth a regular connection. This is a so-called one-hot vector: a vector whose components are all equal to zero except for one, which is equal to one. This component indicates the attack class determined by the neural network, or a normal connection if there was no attack. The number of neurons in the hidden layer of the multilayer perceptron can be determined by a known formula, which is a consequence of the Kolmogorov-Arnold-Hecht-Nielsen theorem. We take the larger of the resulting values, 132 neurons, so NN1 has the configuration 41-1-132-5, which is presented in Fig. 2, where 41 is the number of input neurons, 1 is the number of hidden layers, 132 is the number of hidden neurons, and 5 is the number of output neurons.
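The hidden-layer estimate can be illustrated numerically. A commonly cited form of this estimate first bounds the number of synaptic weights N_w and then converts it to a number of hidden neurons N; the sketch below assumes that form, with n_x inputs, n_y outputs and n_p training examples. The function name hidden_neuron_bounds is hypothetical, and taking n_p = 1024 (the size of the training sample reported for NN1 in the Findings) is also an assumption.

```python
import math


def hidden_neuron_bounds(n_x, n_y, n_p):
    """Lower and upper estimates of the hidden-layer size.

    Assumed form of the estimate:
      n_y*n_p / (1 + log2(n_p)) <= N_w <= n_y*(n_p/n_x + 1)*(n_x + n_y + 1) + n_y
      N = N_w / (n_x + n_y)
    """
    w_min = n_y * n_p / (1 + math.log2(n_p))
    w_max = n_y * (n_p / n_x + 1) * (n_x + n_y + 1) + n_y
    return w_min / (n_x + n_y), w_max / (n_x + n_y)


low, high = hidden_neuron_bounds(n_x=41, n_y=5, n_p=1024)
print(int(low), int(high))
```

Under these assumptions the estimate ranges from about 10 to about 132 neurons, which agrees with the 132 hidden neurons chosen for NN1.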
The architectures of all the other NN are similar and differ only in the number of neurons in the layers; the results of the calculations are summarized in Table 2.
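For reference, the five configurations named in the conclusions of the article can be written down as data, which also makes it easy to count trainable parameters. This is a sketch: the dictionary keys and the function name parameter_count are hypothetical, and counting one bias per neuron is an assumption about the layers.

```python
# (inputs, hidden neurons, outputs) for each network, taken from the text.
CONFIGS = {
    "category": (41, 132, 5),
    "DoS": (41, 160, 7),
    "U2R": (41, 8, 5),
    "R2L": (41, 111, 9),
    "Probe": (41, 107, 5),
}


def parameter_count(n_in, n_hidden, n_out):
    """Weights plus biases of a perceptron with one hidden layer."""
    return (n_in * n_hidden + n_hidden) + (n_hidden * n_out + n_out)


counts = {name: parameter_count(*cfg) for name, cfg in CONFIGS.items()}
```

With these assumptions the first-level network has 6209 parameters, while the U2R network has only 381.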
General characteristics of the software model. The Python 3.5 programming language, the PyCharm 2016.3 integrated development environment and the TensorFlow 1.2 framework [12] were used to create the software model.
The software makes it possible to build a multilayer perceptron (add layers of a given size) and to train and test the neural model. In addition, the following parameters can be set: the convergence coefficient, the number of training epochs, and the size of the data portion for a training step. The ADAM optimizer is used as the optimization function. The program can calculate the difference between the responses of the neural network and the reference responses and obtain the accuracy values.
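To make the role of the optimizer concrete, the sketch below implements a single-parameter version of the Adam update in plain Python. It illustrates the published algorithm with its usual default constants (beta1 = 0.9, beta2 = 0.999, eps = 1e-8) and is not the TensorFlow code used in the software model; the function name adam_minimize and the toy objective are hypothetical.

```python
import math


def adam_minimize(grad, x0, learning_rate=0.001, steps=5000,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a one-dimensional function given its gradient.

    learning_rate plays the role of the training speed discussed in the text.
    """
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first-moment (mean) estimate
        v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= learning_rate * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = x**2 with gradient 2*x; the minimum is at x = 0.
x_final = adam_minimize(lambda x: 2 * x, x0=1.0)
```

Because each step is scaled by the running gradient statistics, the step size stays on the order of learning_rate regardless of the raw gradient magnitude, which is one reason the choice of this parameter matters in the experiments that follow.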
The TensorBoard visualization tool was used to analyze the neural models.

Findings
As an example, consider the analysis of NN1, which determines the category of the attack class. For NN1 of configuration 41-1-132-5, the training, test and validation samples contain 1024, 294 and 156 examples, respectively.
The graphs of accuracy and training error of NN1 of configuration 41-1-132-5 over the iterations for different training speeds (learning rates) are presented in Fig. 3. The total number of training iterations is 5000; the training data arrive in portions of 100 lines per iteration. The Smoothing parameter controls the smoothing of the curves on the graphs, which makes it easier to estimate their rate of growth or decline. Bright colors show the smoothed curves, pale colors their true form. As can be seen from Fig. 3, at a training speed of 0.1 the accuracy dropped sharply between the second and third thousand iterations, which may be the result of a poor sample. At a training speed of 0.0001 the result is satisfactory, and at a training speed of 0.001 the result is the best.
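The effect of the Smoothing parameter can be illustrated with an exponential moving average, a common way of smoothing training curves. The sketch below shows the general technique under that assumption and is not TensorBoard's exact code; the function name smooth, the weight value and the sample data are hypothetical.

```python
def smooth(values, weight=0.6):
    """Exponentially smooth a sequence; weight in [0, 1), higher means smoother."""
    if not values:
        return []
    smoothed = []
    last = values[0]  # start from the first raw point
    for value in values:
        last = weight * last + (1 - weight) * value
        smoothed.append(last)
    return smoothed


raw = [0.50, 0.90, 0.40, 0.95, 0.45]  # made-up noisy accuracy values
curve = smooth(raw)
```

The smoothed curve varies over a narrower range than the raw one, so the overall trend is easier to read.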
The obtained graphs of accuracy and training error of NN1 of configuration 41-1-132-5 by iterations for different activation functions are presented in Fig. 4. The training speed is 0.001, the number of training iterations is 5000, the length of the data portion on each iteration is 100 lines.
Legend to the graphs, [f1_][f2_]neurons_h1-132_lr-0,001_epochs-5000: f1, f2: hidden and output layer activation functions; by default: relu; all: f2 repeats f1.
It was determined that NN1 of configuration 41-1-132-5 provides the best accuracy at a training speed of 0.001 and requires the least training time when the semilinear ReLU activation function is used in the hidden layer and the hyperbolic tangent in the output layer. Compared with the gradient descent algorithm, the ADAM algorithm works faster and gives higher accuracy and lower error.
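The forward pass of such a network, with ReLU in the hidden layer and the hyperbolic tangent at the output, can be sketched in plain Python. The weights here are random placeholders rather than trained values, and the helper names (init_layer, dense, relu, forward) are hypothetical; the actual model was built with TensorFlow.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random placeholder weights (n_out rows of n_in values) and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer, activation):
    """Fully connected layer: activation(W x + b)."""
    weights, biases = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def relu(z):
    return max(0.0, z)


def forward(x, hidden, output):
    """41 inputs -> 132 ReLU neurons -> 5 tanh outputs."""
    return dense(dense(x, hidden, relu), output, math.tanh)


rng = random.Random(0)
hidden = init_layer(41, 132, rng)
output = init_layer(132, 5, rng)
scores = forward([0.5] * 41, hidden, output)
predicted = max(range(5), key=lambda i: scores[i])  # index of the winning output
```

The index of the largest output is taken as the predicted class, which corresponds to reading the result as a one-hot vector.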
The study of NN1 of configuration 41-1-132-5, which identifies the category of the attack class, determined the following parameters: training speed, 0.001; number of iterations, 5000; data portion length, 100; activation function in the hidden layer, ReLU; activation function in the output layer, hyperbolic tangent; optimization algorithm, ADAM. With these parameters, the accuracy on the test and validation samples was 92.86 % and 91.03 %, respectively.
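As a consistency check, accuracy is the share of correctly classified examples. Assuming the test and validation samples of 294 and 156 examples reported above, the stated percentages correspond to 273 and 142 correct answers; these counts are inferred here and are not given in the text. The function name accuracy_percent is hypothetical.

```python
def accuracy_percent(correct, total):
    """Share of correctly classified examples, in percent."""
    return 100.0 * correct / total


test_acc = round(accuracy_percent(273, 294), 2)        # inferred count, 294 test examples
validation_acc = round(accuracy_percent(142, 156), 2)  # inferred count, 156 validation examples
print(test_acc, validation_acc)  # 92.86 91.03
```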
The simulation results for the other neural networks (attack detection accuracy) are summarized in a table. The table shows that the best result is achieved when determining the type of attacks of the DoS and Probe classes, and a slightly worse one for the R2L class. For the U2R class, it was not possible to configure the NN4 neural network to obtain acceptable results. This is due to the small number of records (52 in total) in the KDD Cup 99 database that belong to the U2R class.

Originality and practical value
In our work, network attacks were detected using the apparatus of neural networks (multilayer perceptron), as in other works [10, 13]. This does not contradict the works [1, 4, 5], which use a hybrid approach (immune mechanisms and SOM; neural, immune and neuro-fuzzy classifiers) or a combined approach based on different types of neural networks (MLP and neuro-fuzzy network; multiple SOM and neuro-fuzzy network; MLP, RBF and SOM). Moreover, our work investigates all types of attacks of the DoS, U2R, R2L and Probe classes rather than individual ones, as in [10, 13].
We believe that the use of a multilayer perceptron as a mathematical apparatus is appropriate and sufficient. For example, although an RBF network trains faster than an MLP network, it requires determining the number of radial elements, the location of their centers and their deviation values. The RBF model also needs slightly more elements, so it runs slower and requires more memory than the MLP model.
Processing a large amount of constantly changing network traffic with an MLP and machine training (especially deep training) leads to a large number of false positives and missed attacks, which calls for additional research using DataMining technology [8]. In particular, our work on the software model determined that the ADAM optimization algorithm works faster than the gradient descent algorithm and gives higher accuracy and lower error. This does not contradict the use of other means (in particular, the local adaptive multifactor smoothing proposed in [7]).
The study used a multilevel (namely, two-level) approach to building a network system for detecting intrusions into a computer network: determining the category of the attack class (first level) and assigning the type of attack to the appropriate class (second level). This also does not contradict the modular approach in [2]. However, the probability of a type II error (the share of missed attacks) in our work is about 10 %, versus 18 % for the modular approach implemented in [2], that is, 1.8 times lower.

Conclusions
1. When processing a large amount of constantly changing network traffic, it is appropriate to use a two-level network system based on five neural networks of the following configurations: 41-1-132-5 to determine the category of the attack class at the first level, and 41-1-160-7, 41-1-8-5, 41-1-111-9 and 41-1-107-5 to detect the type of attack of the DoS (back, land, neptune, pod, smurf, teardrop), U2R (buffer_overflow, loadmodule, perl, rootkit), R2L (ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster) and Probe (ipsweep, nmap, portsweep, satan) classes, respectively, at the second level. The training data are taken from the open KDD Cup 99 database, which stores a large number of characteristics of TCP connections. The Google TensorFlow machine training framework was chosen to build all the neural networks because of its flexibility and speed.
2. We studied the parameters of the neural network of configuration 41-1-132-5, which determines the category of the attack class on the computer network. It was determined that the optimal training speed is 0.001. The ADAM algorithm proved to be the best for optimization. The ReLU function is the most suitable activation function for the hidden layer, and the hyperbolic tangent function for the output layer. The accuracy on the test and validation samples was 92.86 % and 91.03 %, respectively. The probability of a type II error is about 10 %.
3. The study showed that the available training sample is not sufficient for training the neural network of configuration 41-1-8-5, which determines the type of attack of the U2R class.
національного університету залізничного транспорту, 2020, №