An Approach for Optimizing Ensemble Intrusion Detection Systems

Abdullah A.H., Stiawan D., Rini D.P., Subroto I.M.I., Kerim B., Kurniabudi, Idris M.Y.B., Heryanto A., Bardadi A., Budiarto R.

Abstract

Intrusion detection systems (IDSs) remain an active research topic. Given the very large volume of traffic in real-time networks, feature selection techniques that can effectively find important and relevant features are required. The most important and relevant set of features is therefore key to improving the performance of an IDS. This study aims to find the most relevant selected features that can serve as important features in a new IDS dataset. To achieve this aim, an approach for generating optimized ensemble IDSs is developed. Six feature selection methods are used and compared: Information Gain (IG), Gain Ratio (GR), Symmetrical Uncertainty (SU), Relief-F (R-F), One-R (OR) and Chi-Square (CS). The feature selection techniques produce sets of selected features. The best number of selected features obtained from the feature ranking step of each technique is then used to classify attacks with four classification methods: Bayesian Network (BN), Naïve Bayes (NB), Decision Tree (J48) and Self-Organizing Map (SOM). Each feature selection technique, with its respective best features, is then combined with each classifier to generate ensemble IDSs. Lastly, the ensemble IDSs are evaluated using hold-out and k-fold validation, as well as F-measure and statistical validation. Experimental results using Weka on the ITD-UTM dataset show that the optimized ensemble IDSs using (SU and BN); using (CS and BN), (CS and SOM) or (IG and NB); and using (OR and BN), with ten, four and seven best selected features respectively, achieve accuracies of 81.0316%, 85.2593% and 80.8625%, respectively. In addition, the ensemble IDSs using (SU and BN) and (OR and J48), with ten and six best selected features respectively, achieve the best F-measure values of 0.853 and 0.830. An indirect comparison with other ensemble IDSs on different datasets is also discussed.
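As a rough illustration of three of the entropy-based ranking criteria named in the abstract (IG, GR and SU), the following minimal sketch scores discrete features against a class label using the standard textbook definitions. It is not the authors' Weka configuration; the function names (`entropy`, `rank_scores`, `rank_features`) and the toy feature names and values are hypothetical and are not taken from the ITD-UTM dataset.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(labels, feature):
    """H(labels | feature) for two equal-length discrete sequences."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def rank_scores(labels, feature):
    """Return IG, GR and SU of one discrete feature with respect to the labels."""
    h_y = entropy(labels)
    h_x = entropy(feature)
    ig = h_y - conditional_entropy(labels, feature)       # Information Gain
    gr = ig / h_x if h_x > 0 else 0.0                     # Gain Ratio
    su = 2 * ig / (h_x + h_y) if (h_x + h_y) > 0 else 0.0  # Symmetrical Uncertainty
    return {"IG": ig, "GR": gr, "SU": su}


def rank_features(labels, features, criterion="IG"):
    """Sort feature names from highest to lowest score under one criterion."""
    scores = {name: rank_scores(labels, col)[criterion] for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: 'flag' separates the classes, 'proto' does not.
labels = ["attack", "attack", "normal", "normal"]
features = {
    "proto": ["tcp", "udp", "tcp", "udp"],
    "flag": ["S0", "S0", "SF", "SF"],
}
print(rank_features(labels, features, criterion="SU"))
```

In a pipeline of the kind the abstract describes, the top-ranked features under each criterion would then be passed to each classifier; continuous traffic features would first need to be discretized, which this sketch does not do.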

Journal
IEEE Access
Page Range
6930-6947
Publication date
2021
