Anomaly-based Network Intrusion Detection System with R

An anomaly-based Network Intrusion Detection System (NIDS) is a software application that monitors a network. It is composed of several parts, as shown in the figure:

figure.

  1. Sniff the packets that cross the network.
  2. Build a prediction model.
    • Find a dataset.
    • Train the model with the dataset.
  3. Notify the attack.
  4. Make a decision in order to block the attack.

IMPORTANT: this project only covers the creation of the prediction model. It trains five different models, so it is a simple code base that shows how different models can be trained in R. The models are:

  • Support Vector Machine
  • Random Forest
  • XGBoost
  • NeuralNet
  • Keras

Furthermore, the available computing power limits the size of the models, and profiling metrics are not considered.

In the future, I will provide better models, such as an improved Keras model, when sufficient computing power is available to me.

I advise you to try changing the models in order to exploit the benefits of each one. In particular, I advise improving the Keras model in order to obtain a better result. If you obtain better results, I will be happy to consider them and to work on a new branch or fork.

The code is written in R and uses the KDD-Cup 1999 and NSL-KDD datasets.

For more information, you can read my [master degree thesis] or contact me by e-mail at [email protected].

Getting Started

You can clone or download the project. Remember that the main goal of the project is to show how to train different Machine Learning and Deep Learning models. The trained models can then be used to implement the "PREDICTION" phase of a Network Intrusion Detection System.
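As a rough sketch of how a trained model could later serve as the prediction phase, the example below fits a model, saves the resulting R object, reloads it, and classifies new records. Everything in it is an assumption made for illustration: the toy data, the column values, the file name `nids_model.rds`, and the logistic regression, which only stands in for the project's own models so that the example runs with base R.

```r
# Toy data standing in for a prepared dataset (synthetic values, for illustration only).
set.seed(42)
train <- data.frame(duration = rexp(200), src_bytes = rexp(200))
train$attack <- as.integer(train$duration + train$src_bytes + rnorm(200, sd = 0.5) > 2)

# A logistic regression is used here only as a stand-in for the project's models.
model <- glm(attack ~ duration + src_bytes, data = train, family = binomial)

# Save the fitted model object to disk and load it back.
path <- file.path(tempdir(), "nids_model.rds")  # hypothetical file name
saveRDS(model, path)
restored <- readRDS(path)

# Use the restored model to score new, unseen records.
new_traffic <- data.frame(duration = c(0.1, 3.0), src_bytes = c(0.2, 2.5))
probs <- predict(restored, newdata = new_traffic, type = "response")
labels <- ifelse(probs > 0.5, "attack", "normal")
print(data.frame(new_traffic, probability = probs, label = labels))
```

The same `saveRDS()` / `readRDS()` pattern applies to any fitted model that is an ordinary R object; models backed by external runtimes may need their own save functions.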

The R language was chosen to reach this goal. Furthermore, R models can be integrated into other programming languages, such as Java. I advise importing, or copying, the project using the RStudio Community IDE.

However, the project uses external resources: two datasets, KDD-Cup 1999 and NSL-KDD. Both are widely used in academia.

For further information, you can read my [master degree thesis] or contact me by e-mail at [email protected].

Prerequisites

The libraries used in the project are:

  • caret
  • keras
  • neuralnet
  • e1071
  • xgboost
  • dplyr

These libraries must be installed before running the project. In R, a library can be installed with the command:

install.packages("caret")

The previous command installs the caret library.
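To avoid repeating the command for each library, a small helper can report which of the listed libraries are still missing. This is only a sketch: the function name `missing_packages` is hypothetical, and the actual installation call is left commented out so that nothing is downloaded by accident.

```r
# Libraries listed in the Prerequisites section.
required <- c("caret", "keras", "neuralnet", "e1071", "xgboost", "dplyr")

# Return the entries of `pkgs` that are not installed in the current R library.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install them from CRAN
} else {
  message("All required packages are installed.")
}
```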

For keras, it is necessary to resolve additional dependencies, such as Anaconda. The dependencies vary with the operating system used. I advise following the instructions on the official Keras site.

Installing

It is only necessary to download or clone the project and run it.

Remember to set the working directory by running the command:

setwd("relativepath")

in the R console. Alternatively, use the tab panel in the bottom right of RStudio: click More -> Set As Working Directory.

Running the tests

It is sufficient to click "Run" in RStudio. Alternatively, the code can be run in the R console.

Other information

The project is divided into five directories:

  • "adjusting dataset": R scripts that prepare the datasets. They add the column names and group the attack records into five classes (DoS, Probe, U2R, R2L, normal).
  • "dataset reduction": reduces the dimensions of a dataset by analyzing the correlation between features, ranking feature importance with Random Forest, and scaling the values, in order to get better performance.
  • "exploratory data analysis": helps to become familiar with the data. It also allows outliers, if present, to be removed using simple plots.
  • "models evaluation": contains the five different models. It is the core of the project: after a model has been trained with the KDD and NSL datasets, it can be saved (as the created R object) and will represent the prediction block of the Network Intrusion Detection System.
  • "python evaluation": collects four Python models (SVM, Random Forest, XGBoost and Keras).
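The preparation and reduction steps described above can be sketched in base R as follows. This is not the project's own code: the function names `group_attacks` and `drop_correlated`, the 0.9 correlation cutoff, and the toy data are assumptions made for illustration. The attack names are those of the KDD-Cup 1999 training set, and any other label is mapped to "unknown". The Random Forest importance step is not shown.

```r
# Attack names from the KDD-Cup 1999 training set, grouped by category.
attack_classes <- list(
  DoS   = c("back", "land", "neptune", "pod", "smurf", "teardrop"),
  Probe = c("ipsweep", "nmap", "portsweep", "satan"),
  R2L   = c("ftp_write", "guess_passwd", "imap", "multihop",
            "phf", "spy", "warezclient", "warezmaster"),
  U2R   = c("buffer_overflow", "loadmodule", "perl", "rootkit")
)

# Lookup vector: attack name -> class name.
lookup <- setNames(
  rep(names(attack_classes), lengths(attack_classes)),
  unlist(attack_classes, use.names = FALSE)
)
lookup["normal"] <- "normal"

# Map raw labels (with or without a trailing ".") to the five classes.
group_attacks <- function(labels) {
  clean <- sub("\\.$", "", as.character(labels))
  classes <- unname(lookup[clean])
  classes[is.na(classes)] <- "unknown"  # labels outside the list above
  factor(classes, levels = c("normal", "DoS", "Probe", "R2L", "U2R", "unknown"))
}

# Drop every numeric feature that is strongly correlated with an earlier column.
drop_correlated <- function(df, cutoff = 0.9) {
  cm <- abs(cor(df))
  cm[upper.tri(cm, diag = TRUE)] <- 0  # keep only pairs (later column, earlier column)
  redundant <- names(which(apply(cm, 1, function(r) any(r > cutoff))))
  df[, setdiff(names(df), redundant), drop = FALSE]
}

# Toy data: synthetic values, with dst_bytes built as a near copy of src_bytes.
set.seed(1)
toy <- data.frame(duration = rnorm(50), src_bytes = rnorm(50))
toy$dst_bytes <- 2 * toy$src_bytes + rnorm(50, sd = 0.01)
toy$label <- rep(c("smurf.", "normal.", "nmap.", "rootkit.", "guess_passwd."),
                 length.out = 50)

toy$class <- group_attacks(toy$label)
features <- drop_correlated(toy[, c("duration", "src_bytes", "dst_bytes")])
scaled <- as.data.frame(scale(features))  # zero mean, unit variance per column

print(table(toy$class))
print(names(scaled))
```

Dropping the later column of each correlated pair is an arbitrary choice that keeps the sketch short; a real reduction step would usually weigh feature importance before deciding which column to remove.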

Add Python scripts

The cleaned KDD and NSL-KDD datasets were also used to train models in Python. You can find the Python scripts in the "python evaluation" folder.

Authors

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use this software, please cite it as below:

@article{serinelli2020training,
  title={Training guidance with KDD cup 1999 and NSL-KDD data sets of ANIDINR: Anomaly-based network intrusion detection system},
  author={Serinelli, Benedetto Marco and Collen, Anastasija and Nijdam, Niels Alexander},
  journal={Procedia Computer Science},
  volume={175},
  pages={560--565},
  year={2020},
  publisher={Elsevier}
}
@article{serinelli2021analysis,
  title={On the analysis of open source datasets: validating IDS implementation for well-known and zero day attack detection},
  author={Serinelli, Benedetto Marco and Collen, Anastasija and Nijdam, Niels Alexander},
  journal={Procedia Computer Science},
  volume={191},
  pages={192--199},
  year={2021},
  publisher={Elsevier}
}