Intrusion Detection System with Hierarchical Different Parallel Classification

International Journal of Computer Science & Network Solutions, Dec. 2015, Volume 3, No. 12, ISSN 2345-3397, http://www.ijcsns.com

Behrouz Safaiezadeh, Alireza Zebarjad, Amin Einipour

Department Of Computer, Andimeshk Branch, Islamic Azad University, Andimeshk, Iran co.safaie@gmail.com

Department Of Computer, Andimeshk Branch, Islamic Azad University, Andimeshk, Iran ali_reza_zebarjad@yahoo.com

Department Of Computer, Andimeshk Branch, Islamic Azad University, Andimeshk, Iran a.einipour@gmail.com

Abstract

Today, daily life is integrated with networks and the internet, and needed information is transmitted over networks. Attackers may therefore exploit network weaknesses to abuse or alter this information. An Intrusion Detection System (IDS) is a system capable of detecting some of these attacks. It detects attacks by constructing classifiers and examining IP traffic in the network. Recent research has shown that a single base classifier is not effective on its own because of its errors, whereas combining several classifiers gives better efficiency. The current study therefore designs blocks of three parallel classifiers, a support vector machine, a multilayer perceptron neural network and a fuzzy system, each trained on a dataset to distinguish two classes. The final decision about the type of attack is made by an intermediate network. The proposed system was tested on the KDD99 dataset and achieved an average accuracy of 99.71%.

Keywords: Intrusion Detection System, support vector machine, multilayer perceptron neural network, fuzzy system.

1. Introduction

In (Shanmugavadivu, 2012), a fuzzy-theory-based attack detection system was proposed. The system works by detecting anomalous behavior, and it uses sets of fuzzy rules for attacks and for normal traffic to obtain its results. An F-test is used in the attack detection phase. The average detection rate was 52.19%, which cannot be considered efficient compared with other methods. In (Mohamed, 2012), four hybrid hierarchical algorithms were compared. In the first step of all four algorithms, a neural network separates normal records from attacks; it is used first because it can make this separation precisely. Then each algorithm is used to separate one particular class of attack: multilayer neural networks, support vector machines, Naive Bayes and decision trees are applied in turn, each predicting the next attack class, and this continues until all attacks are detected.

2. Dataset KDD Cup 99

KDD Cup 99 is used to train and test the proposed algorithm. The dataset is derived from the DARPA 1998 evaluation data prepared by MIT Lincoln Laboratory and is used to study attacks on computer networks. KDD Cup 99 includes about five million records simulated over nine weeks in a local network of the U.S. Air Force. Because the dataset contains so many records, only the 10% subset is used for training and testing. Each record has 42 fields, and the 42nd field is the class label of the record. The training dataset consists of 494014 samples; the kinds and names of its attacks are presented in Table I. The test dataset consists of 311021 records, whose kinds and names of attacks are shown in Table II. The dataset contains one normal class and four attack classes: Probe, DoS, R2L and U2R.

Table I

KIND AND NAME OF ATTACKS IN TRAINING DATASET

| Attack name | Classification of attacks |
|---|---|
| Port-sweep, IP-sweep, Nmap, Satan | Probing |
| Neptune, Smurf, Pod, Teardrop, Land, Back, Apache2 | Denial of Service (DoS) |
| Buffer-overflow, Load-module, Perl, Rootkit, Spy | User to Root (U2R) |
| Guess-password, Ftp-write, Imap, Phf, Multihop, Warezmaster, Warezclient | Remote to Local (R2L) |

Table II

KIND AND NAME OF ATTACKS IN TEST DATASET

| Attack name | Classification of attacks |
|---|---|
| Port-sweep, IP-sweep, Nmap, Satan, Saint, Mscan | Probing |
| Neptune, Smurf, Pod, Teardrop, Land, Back, Apache2, Udpstorm, Process-table, Mail-bomb | Denial of Service (DoS) |
| Buffer-overflow, Load-module, Perl, Rootkit, Spy, Xterm, Ps, Http-tunnel, Sql-attack, Worm, Snmp-guess | User to Root (U2R) |
| Guess-password, Ftp-write, Imap, Phf, Multihop, Warezmaster, Warezclient, Snmpget-attack, Named, Xlock, Xsnoop, Send-mail | Remote to Local (R2L) |

3. Proposed method

The proposed method is based on the idea of combining hierarchical blocks of different parallel classifiers, with an intermediate network making the final decision about the kind of attack. The system uses ten blocks for decision making. Every block contains three classifiers, an SVM, an MLP (multilayer perceptron) neural network and a fuzzy system, which each give an opinion about an incoming packet; the majority then decides. Every block decides between the two classes it was trained on. The first block separates normal packets from attacks; if the packet is normal, processing ends. If an attack is detected, the other nine blocks work hierarchically to identify it, and their opinions are finally sent to the intermediate network, an MLP classifier that determines the kind of attack (Figure 1). Training consists of seven steps: normalizing the data, selecting features, creating the various datasets used to train the experts, training the first block to detect normal packets, adding the scarce data, training the other nine blocks to detect the attack classes, and training the intermediate network on the opinions it receives. After training, the whole dataset is tested and the results are evaluated. All classifiers are implemented in Matlab.
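
The majority decision inside one block can be sketched as follows. This is a minimal illustration in Python (the paper itself reports using Matlab), and the function name `block_decision` and the string labels are assumptions made for the example, not part of the original system.

```python
from collections import Counter

def block_decision(svm_vote, mlp_vote, fuzzy_vote):
    """Return the class chosen by the majority of a block's three experts.

    Each block decides between two classes, so three votes always
    produce a majority of at least two (hypothetical helper).
    """
    counts = Counter([svm_vote, mlp_vote, fuzzy_vote])
    label, _ = counts.most_common(1)[0]
    return label

# Example for the first block: two of the three experts report an attack.
decision = block_decision("attack", "normal", "attack")
```

With two classes and an odd number of experts a tie cannot occur, which may be one reason for using three classifiers per block.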

3.1 Normalize the data in dataset

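The features in the dataset have very different value ranges, and a larger range does not imply greater importance. So, given the maximum value X_{i,max} and minimum value X_{i,min} of feature X_i, Relation 1 is used to normalize the data: X_{i,min} is subtracted from each value and the result is divided by X_{i,max} - X_{i,min}. This converts the data to the range [0, 1] (Sadiq, 2010).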
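
A minimal Python sketch of this min-max normalization is given below. The function name `min_max_normalize` and the handling of constant features (mapped to 0.0) are assumptions for illustration; the paper does not say how constant columns are treated.

```python
def min_max_normalize(rows):
    """Scale every feature (column) of `rows` to the range [0, 1].

    rows: list of equal-length lists of numbers, one list per record.
    A feature whose minimum equals its maximum is mapped to 0.0
    (an assumption; the source does not cover this case).
    """
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]
    normalized = []
    for row in rows:
        normalized.append([
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, mins, maxs)
        ])
    return normalized

# Three records with three features; the third feature is constant.
scaled = min_max_normalize([[0, 10, 5], [5, 20, 5], [10, 30, 5]])
```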

3.2 Feature selection

Feature selection algorithms are used to eliminate features that have little effect on detecting attacks, which reduces redundant data. The Symmetrical Uncertainty feature selection algorithm is therefore applied, using the Weka software, to the normalized data from the previous step. Features 7, 8, 9, 15, 18, 19, 20 and 21, whose effect on detecting attacks is under 1% (a score below 0.01 in Table III), are eliminated from the dataset. The remaining 33 features are used to detect attacks and train the classifiers.

Table III

RANKING OF FEATURES BY SYMMETRICAL UNCERTAINTY IN WEKA SOFTWARE

| Rank score | Attribute | Rank score | Attribute | Rank score | Attribute |
|---|---|---|---|---|---|
| 0.61358 | 6 | 0.17095 | 40 | 0.02142 | 14 |
| 0.56344 | 37 | 0.16731 | 28 | 0.01954 | 17 |
| 0.55694 | 12 | 0.16311 | 41 | 0.01644 | 11 |
| 0.52063 | 5 | 0.14094 | 33 | 0.0142 | 22 |
| 0.43549 | 3 | 0.12901 | 30 | 0.0115 | 16 |
| 0.3746 | 32 | 0.12532 | 38 | 0.00924 | 18 |
| 0.3662 | 23 | 0.12384 | 34 | 0.00664 | 19 |
| 0.27997 | 36 | 0.11526 | 25 | 0.00456 | 8 |
| 0.27154 | 2 | 0.08411 | 29 | 0 | 20 |
| 0.26062 | 24 | 0.08096 | 1 | 0 | 21 |
| 0.23819 | 31 | 0.07225 | 39 | 0 | 7 |
| 0.1822 | 27 | 0.06446 | 26 | 0 | 15 |
| 0.18183 | 4 | 0.04422 | 10 | 0 | 9 |
| 0.17308 | 35 | 0.03058 | 13 | | |
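
For readers who want to see how such a ranking can be computed, the sketch below uses the standard definition of symmetrical uncertainty, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), and then keeps the features whose score reaches a threshold. It is only an illustration in Python: it assumes features that are already discrete, it is not Weka's implementation, and the names `symmetrical_uncertainty` and `select_features` are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def symmetrical_uncertainty(feature, labels):
    """SU = 2 * I(X; Y) / (H(X) + H(Y)) for two discrete sequences."""
    h_x, h_y = entropy(feature), entropy(labels)
    if h_x + h_y == 0:
        return 0.0
    h_xy = entropy(list(zip(feature, labels)))
    mutual_information = h_x + h_y - h_xy
    return 2.0 * mutual_information / (h_x + h_y)

def select_features(scores, threshold=0.01):
    """Keep the feature numbers whose score is at least `threshold`."""
    return sorted(f for f, s in scores.items() if s >= threshold)

# A few scores copied from Table III: features 18 and 9 fall below 0.01.
kept = select_features({6: 0.61358, 16: 0.0115, 18: 0.00924, 9: 0.0})
```

The 0.01 default mirrors the "under 1%" cut-off stated in the text.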

Relation 1 (normalization of feature X_i, Section 3.1):

$$X_i' = \frac{X_i - X_{i,\min}}{X_{i,\max} - X_{i,\min}} \qquad (1)$$

Figure 1. Proposed method

3.3 Making various datasets to train experts

To create diversity in both the inputs and the decision-making classifiers, ten new datasets are built from KDD Cup 99. Each dataset contains only two classes, which keeps training simple and decisions easy (Table IV).

No block needs to give an opinion about all five classes in KDD Cup 99. Each block is trained only to decide which of its two classes an incoming record belongs to; the record is classified accordingly and processing continues toward the final result.

3.4 Training the experts of the first block

next blocks to detect the kind of attack. The blocks of the following steps are trained only to distinguish attacks; normal records are not used, as Table IV shows.

Table IV

CREATED DATASETS OF KDD CUP 99

| Dataset | Class and content | Dataset | Class and content |
|---|---|---|---|
| Set1 | Class1: Normal; Class2: All attacks | Set6 | Class4: R2L; Class5: U2R |
| Set2 | Class1: DoS, Probe; Class2: R2L, U2R | Set7 | Class2: DoS; Class4: R2L |
| Set3 | Class1: DoS, R2L; Class2: Probe, U2R | Set8 | Class3: Probe; Class5: U2R |
| Set4 | Class1: DoS, U2R; Class2: Probe, R2L | Set9 | Class2: DoS; Class5: U2R |
| Set5 | Class2: DoS; Class3: Probe | Set10 | Class3: Probe; Class4: R2L |
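
One way to derive the ten two-class datasets of Table IV from labeled records is sketched below in Python. The record layout (a `(features, category)` pair), the 0/1 relabeling of the two sides of each set, and the names `SET_DEFINITIONS` and `build_binary_dataset` are assumptions made for illustration; the paper does not describe its data structures.

```python
# Two-class definitions transcribed from Table IV: each set maps to
# (categories on its first side, categories on its second side).
SET_DEFINITIONS = {
    1: ({"Normal"}, {"DoS", "Probe", "R2L", "U2R"}),
    2: ({"DoS", "Probe"}, {"R2L", "U2R"}),
    3: ({"DoS", "R2L"}, {"Probe", "U2R"}),
    4: ({"DoS", "U2R"}, {"Probe", "R2L"}),
    5: ({"DoS"}, {"Probe"}),
    6: ({"R2L"}, {"U2R"}),
    7: ({"DoS"}, {"R2L"}),
    8: ({"Probe"}, {"U2R"}),
    9: ({"DoS"}, {"U2R"}),
    10: ({"Probe"}, {"R2L"}),
}

def build_binary_dataset(records, set_id):
    """Return the records that belong to `set_id`, relabeled 0 or 1.

    records: list of (features, category) pairs (hypothetical layout).
    Records whose category is on neither side of the set are dropped.
    """
    first, second = SET_DEFINITIONS[set_id]
    subset = []
    for features, category in records:
        if category in first:
            subset.append((features, 0))
        elif category in second:
            subset.append((features, 1))
    return subset

# One toy record per category.
toy = [([0.1], "Normal"), ([0.2], "DoS"), ([0.3], "Probe"),
       ([0.4], "R2L"), ([0.5], "U2R")]
set5 = build_binary_dataset(toy, 5)  # only the DoS and Probe records remain
```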

3.5 Adding scarce sets

If the previous step decides that a record is not normal, it is treated as an attack and its kind must be determined. Attacks fall into four classes: U2R, R2L, Probe and DoS. However, the R2L and U2R classes contain too few records for training, so the classifiers that detect them are poorly trained and less precise. To remove this shortcoming, the records of the classes with few samples are repeated so that their training sets grow and the classes become balanced. The proposed method applies this to the two classes R2L and U2R.
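
The repetition of scarce records described above can be sketched in Python as follows. The paper does not state how many times the R2L and U2R records are repeated, so the `copies` parameter, like the function name `repeat_scarce_records` and the record layout, is a hypothetical choice for illustration.

```python
def repeat_scarce_records(records, scarce_classes, copies):
    """Return a new list in which every scarce-class record appears
    `copies` times in total; other records appear once.

    records: list of (features, category) pairs (hypothetical layout).
    copies:  total repetitions per scarce record (assumed value; the
             source gives no replication factor).
    """
    balanced = []
    for features, category in records:
        repeat = copies if category in scarce_classes else 1
        balanced.extend([(features, category)] * repeat)
    return balanced

training = [([0.1], "DoS"), ([0.2], "DoS"), ([0.3], "DoS"), ([0.9], "U2R")]
balanced = repeat_scarce_records(training, {"R2L", "U2R"}, copies=3)
```

Plain repetition adds no new information, so it balances the classes at the cost of a higher risk of overfitting the repeated records.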

3.6 Training the next experts

The experts of the next blocks are trained on Set2 to Set10. If an attack is reported for an incoming record, it is first examined by blocks 2, 3 and 4. The experts of each of these three blocks are trained on two classes that each combine two kinds of attack. For example, if the second block outputs class 1, the attack is either DoS or Probe, and the record is sent to block 5, which is trained to decide whether it is DoS or Probe; otherwise it is sent to block 6, which separates R2L from U2R (Table IV). Likewise, the opinion of block 3 sends the record to block 7 or 8, and the opinion of block 4 sends it to block 9 or 10. In other words, blocks 2, 3 and 4 each select one of the six remaining blocks (5, 6, 7, 8, 9 and 10), so only three of those blocks express an opinion about the kind of attack. These opinions may be class 5, 4, 3 or 2, denoting U2R, R2L, Probe and DoS respectively. The three final opinions are sent to the intermediate network.
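
The routing through the attack blocks can be sketched as below. This is an illustrative Python outline, not the authors' implementation: each block is assumed to be a callable that returns a class number, and the names `NEXT_BLOCK` and `route_attack` are hypothetical. The mapping itself is read off Table IV.

```python
# Block selected by blocks 2, 3 and 4, depending on whether they output
# their class 1 or class 2 (derived from the set contents in Table IV).
NEXT_BLOCK = {
    2: {1: 5, 2: 6},   # DoS/Probe -> block 5, R2L/U2R -> block 6
    3: {1: 7, 2: 8},   # DoS/R2L -> block 7, Probe/U2R -> block 8
    4: {1: 9, 2: 10},  # DoS/U2R -> block 9, Probe/R2L -> block 10
}

def route_attack(record, blocks):
    """Collect the three opinions that go to the intermediate network.

    blocks: dict mapping block number to a callable(record) -> class id
            (assumed interface). Blocks 2-4 return 1 or 2; blocks 5-10
            return an attack class: 2 = DoS, 3 = Probe, 4 = R2L, 5 = U2R.
    """
    opinions = []
    for first_stage in (2, 3, 4):
        group = blocks[first_stage](record)
        second_stage = NEXT_BLOCK[first_stage][group]
        opinions.append(blocks[second_stage](record))
    return opinions

# Stub blocks that behave as error-free experts would for a DoS record.
dos_blocks = {2: lambda r: 1, 3: lambda r: 1, 4: lambda r: 1,
              5: lambda r: 2, 6: lambda r: 4, 7: lambda r: 2,
              8: lambda r: 3, 9: lambda r: 2, 10: lambda r: 3}
opinions = route_attack(None, dos_blocks)
```

When the three opinions disagree, it is the intermediate network that resolves them.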

3.7 Training intermediate network

The intermediate network is an MLP classifier. Because only three opinions are sent to it, it has three input neurons; because it must decide among four classes of attack, it has four output neurons. The MLP has two hidden layers of 25 neurons each. The system is trained in two steps: first the ten blocks are trained on the created datasets; then the whole training dataset is passed through the blocks, and the results, three opinions for each record, are used to train the intermediate network. The intermediate MLP thus makes the final decision about the kind of attack from the opinions it receives and reports it. After the whole system is trained, the test dataset is run through it and the final results are evaluated.
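
To make the 3-25-25-4 layout concrete, the following Python sketch builds a network of that shape and runs a forward pass. Only the layer sizes come from the text; the tanh hidden activation, softmax output, random initial weights and all names are assumptions, and the weights here are untrained, so the predicted class is arbitrary.

```python
import math
import random

LAYER_SIZES = [3, 25, 25, 4]  # inputs, two hidden layers, outputs (from the text)
ATTACK_CLASSES = ["DoS", "Probe", "R2L", "U2R"]  # class ids 2, 3, 4, 5

def init_network(seed=0):
    """Create untrained (weights, biases) pairs for each layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers

def forward(layers, opinions):
    """Forward pass: tanh in hidden layers, softmax at the output
    (both activation choices are assumptions, not from the paper)."""
    activations = [float(o) for o in opinions]
    for index, (weights, biases) in enumerate(layers):
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            activations = [math.tanh(s) for s in sums]
        else:
            peak = max(sums)  # subtract the maximum for numerical stability
            exps = [math.exp(s - peak) for s in sums]
            total = sum(exps)
            activations = [e / total for e in exps]
    return activations

def predict(layers, opinions):
    """Return the attack class with the highest output score."""
    scores = forward(layers, opinions)
    return ATTACK_CLASSES[scores.index(max(scores))]

network = init_network()
scores = forward(network, [2, 2, 3])  # three opinions from the attack blocks
```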

4. Results

Table V

RESULTS OF THE PROPOSED METHOD COMPARED TO OTHER METHODS

Accuracy (%) by attack type:

| Attack type | DT | SVM | Hybrid decision tree-SVM | Ensemble approach | MLP | Proposed approach |
|---|---|---|---|---|---|---|
| Normal | 99.64 | 99.64 | 99.7 | 99.7 | 98.64 | 99.99 |
| Probe | 99.86 | 98.57 | 98.57 | 100 | 92.17 | 99.97 |
| DoS | 96.83 | 99.92 | 99.92 | 99.92 | 97.61 | 99.93 |
| U2R | 68 | 40 | 48 | 68 | 27.12 | 98.77 |
| R2L | 84.19 | 33.92 | 37.8 | 97.16 | 28.48 | 99.87 |

Figure 2. Comparison of the proposed method with other methods (bar chart of accuracy (%) by attack type: Normal, Probe, DoS, U2R, R2L; series: DT, SVM, Hybrid decision tree-SVM, Ensemble approach, MLP, Proposed approach)

Because training samples are scarce, most methods are unable to detect R2L and U2R attacks. In the proposed method, however, repeating the scarce data and combining the data in various ways creates enough variety to detect these attacks successfully. Each block considers only one particular part of the problem and does not deal with the whole dataset and all of its classes. The classifiers in the blocks therefore perform better than they would on the whole dataset. In addition, each block uses three different classifiers in parallel, which lets them compensate for each other's errors.

The proposed method achieves an average correct classification rate of 99.71%. In general, its success is due to the following points: 1. Different blocks receive different inputs, so they make different errors. 2. Each block has to separate fewer classes, so training is easier and there is less noise to cause errors. 3. The main dataset is split into smaller datasets (ten datasets are created); although the redundant data of the method as a whole increases, the redundant data in each block is reduced, which increases the efficiency and precision of the system. 4. Three different classifiers are used in each block, so that they compensate for each other's errors. 5. Scarce records are added to balance the attacks used in training and to make classification fair. 6. An intermediate network corrects the final result when an error occurs. All of these factors lead to better efficiency.
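
As a check, the 99.71% figure is consistent with the unweighted mean of the five per-class accuracies reported for the proposed approach in Table V. The text does not say how the average was computed, so the unweighted mean is an assumption here:

```latex
\frac{99.99 + 99.97 + 99.93 + 98.77 + 99.87}{5} = \frac{498.53}{5} = 99.706 \approx 99.71
```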

Acknowledgment

The work described in this paper was fully supported by research grants of Andimeshk Branch, Islamic Azad University, Andimeshk, Iran.

References

i. Denning, D. E., "An intrusion-detection model", IEEE Transactions on Software Engineering, 13, pp. 222-232, 1987.

ii. Kaliyamurthie, K. P., Parameswari, D., Suresh, R. M., "Intrusion Detection System using Memtic Algorithm Supporting with Genetic and Decision Tree Algorithms", IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 2, No. 3, pp. 508-514, 2012.

iii. Scarfone, K., Mell, P., "Guide to intrusion detection and prevention systems (IDPS)", NIST Special Publication 800-94, 2007.

v. Govindarajan, M., Chandrasekaran, RM., "Intrusion Detection using an Ensemble of Classification Methods", Proceedings of the World Congress on Engineering and Computer Science, WCECS, vol. 1, 2012.

vi. Shanmugavadivu, R., Nagarajan, N., "Learning of Intrusion Detector in Conceptual Approach of Fuzzy Towards Intrusion Methodology", International Journal of Advanced Research in Computer Science and Software Engineering, Vol. 2, Issue 5, pp. 246-250, 2012.

vii. Mohamed, M. A., "Development of Hybrid-Multi-Stages Intrusion Detection Systems", IJCSNS International Journal of Computer Science and Network Security, vol. 10, no. 3, 2012.

viii. KDD'99 Archive: The Fifth International Conference on Knowledge Discovery and Data Mining.

ix. URL: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html

x. Sadiq Ali Khan, M., Aqil Burney, S. M., "Feature Deduction and Ensemble Design of Parallel Neural Networks for Intrusion Detection System", IJCSNS International Journal of Computer Science and Network Security, vol. 10, no. 10, 2010.