Hierarchical clustering analysis for network intrusion detection and attack similarity identification: a packet-level approach : a thesis in Computer Science
Thesis   Open access

Dnyaneshwari Shashikant Jagtap
Master of Science (MS), University of Massachusetts Dartmouth
2024
DOI: https://doi.org/10.62791/20356

Abstract

Network Intrusion Detection Systems (NIDS) are crucial for identifying and mitigating a wide range of malicious activities by monitoring incoming and outgoing network traffic. NIDS identify threats through signature-based and anomaly-based detection, often supported by Machine Learning (ML) and Deep Learning (DL) algorithms. The primary objective of this research is to identify the similarities between different cyber-attack classes by grouping network packets into distinct clusters based on the distances between data points. In this study, we propose a packet-level approach for NIDS using data derived from the Packet Capture (PCAP) files of two popular datasets, namely UNSW-NB15 and CIC-IDS 2017. The data contain packet-level information, including packet header and payload data, for analysis. We utilized preprocessed versions of both datasets produced by the Payload Byte tool, which had already been extracted and labeled. These are numerical datasets in CSV format that contain the key features extracted from network traffic at the packet level. The UNSW-NB15 dataset includes various types of cyber-attacks, while the CIC-IDS 2017 dataset contains fifteen types of cyber-attacks. Both datasets also include normal network traffic data. We applied the agglomerative hierarchical clustering algorithm with multiple cluster counts to both datasets to identify similarities and dissimilarities between different attack classes. The Ward linkage method was used to measure the similarity between clusters when merging them, and Euclidean distance was used to compute the distance matrix for the hierarchical clustering algorithm. The analysis also employs correlation matrices, heatmaps, and dendrograms to visualize and interpret the relationships between attack types. This research offers cybersecurity professionals a technique for identifying suspicious packets whose malicious patterns resemble those of other attacks, which supports the development of more effective detection strategies in cybersecurity.
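
The clustering step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the feature vectors below are invented toy values, the class labels are only borrowed as example names, and the helper names `ward_cost` and `ward_clustering` are hypothetical. The sketch merges clusters agglomeratively using the Ward criterion on squared Euclidean distance, then counts how packets of each class fall into each cluster, which is one simple way to see which attack classes group together.

```python
from collections import Counter
from itertools import combinations


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    (ca, na, _), (cb, nb, _) = a, b
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))  # squared Euclidean
    return na * nb / (na + nb) * sq_dist


def ward_clustering(points, n_clusters):
    """Naive agglomerative clustering with Ward linkage (toy sizes only)."""
    # Each cluster is (centroid, size, member indices); start with singletons.
    clusters = [(tuple(p), 1, [i]) for i, p in enumerate(points)]
    while len(clusters) > n_clusters:
        # Pick the pair whose merge increases total variance the least.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        (ca, na, ma), (cb, nb, mb) = clusters[i], clusters[j]
        centroid = tuple((na * x + nb * y) / (na + nb) for x, y in zip(ca, cb))
        merged = (centroid, na + nb, ma + mb)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    labels = [0] * len(points)
    for cluster_id, (_, _, members) in enumerate(clusters):
        for m in members:
            labels[m] = cluster_id
    return labels


# Toy packet features (assumed values, not taken from either dataset).
packets = [
    (10, 200, 15), (12, 198, 14), (11, 202, 16),   # Normal
    (240, 5, 250), (238, 7, 248), (242, 4, 251),   # DoS
    (235, 10, 245), (236, 9, 244),                 # Fuzzers
]
classes = ["Normal"] * 3 + ["DoS"] * 3 + ["Fuzzers"] * 2

labels = ward_clustering(packets, n_clusters=2)

# Class-by-cluster counts: classes sharing a cluster have similar packets.
table = Counter(zip(classes, labels))
for (cls, cluster_id), count in sorted(table.items()):
    print(f"{cls:8s} cluster {cluster_id}: {count}")
```

With two clusters, the toy DoS and Fuzzers packets share a cluster while the Normal packets form their own, mirroring the kind of attack-similarity reading the abstract describes. For real packet-level data, a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"` would normally replace this naive loop, which scales poorly.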
CC BY-NC-ND V4.0 Open Access
