Explainability of network intrusion detection using transformers: a packet-level approach : a thesis in Data Science

Pahalavan Rajkumar Dheivanayahi

doi:10.62791/20312

Thesis

Open access

Master of Science (MS), University of Massachusetts Dartmouth

2023

DOI: https://doi.org/10.62791/20312

Abstract

Network Intrusion Detection Systems (NIDS) are critical to the security of connected computer systems because they actively detect and prevent unauthorized activities and malicious attacks. NIDS models based on machine learning (ML) and deep learning (DL) learn from historical network traffic data to identify patterns and anomalies and to capture complex relationships. The primary objective of this research is to generate tags and descriptions for packets that the NIDS finds difficult to classify. Most publicly available NIDS datasets focus on flow data, which offers aggregated information about network connections, and they have played a crucial role in enabling researchers and network security professionals to design and develop flow-based NIDS solutions. While flow records provide valuable information for detecting network-level anomalies and attacks, they do not consider packet-level information and payload contents. In this research, we propose a packet-level approach for NIDS that combines flow information with packet header fields and payload. To facilitate this research, we have curated a comprehensive packet-level dataset constructed from the Packet Capture (PCAP) files of two widely used flow-level datasets, namely CIC-IDS2017 and UNSW-NB15. Recent advancements in Natural Language Processing (NLP) have demonstrated the effectiveness of Transformer-based models in handling sequence data in tasks such as token classification and text generation. We have adapted this technology to NIDS to extract key features and characteristics of the header and payload in the context of various attacks. Unlike traditional classification methods that assign predefined labels to network packets, this method generates tags, based on the packet signature, that explain the packet content and potential risks. The tags and descriptions offer network security professionals a tool to comprehend suspicious packets with an unfamiliar or potentially malicious signature, assess their nature, and make informed decisions promptly.

Files and links (1)

pdf

Rajkumar Dheivanayahi P. COE MS Thesis 2023, 1.36 MB

CC BY-NC-ND V4.0, Open Access

Details

Title: Explainability of network intrusion detection using transformers
Creators: Pahalavan Rajkumar Dheivanayahi
ORCID: 0009-0003-8142-2490
Contributors: Gokhan Kul (Advisor) - University of Massachusetts Dartmouth, Department of Computer and Information Science
Bharatendra K. Rai (Committee Member) - University of Massachusetts Dartmouth, Department of Decision and Information Sciences
Lance Fiondella (Committee Member) - University of Massachusetts Dartmouth, Department of Electrical and Computer Engineering
Scott E Field (Committee Member) - University of Massachusetts Dartmouth, Department of Mathematics
Number of pages: ix, 67 pages
Illustrations: illustrations (some color)
Table of contents: List of figures -- List of tables -- Abbreviations -- Chapter 1. Introduction -- Chapter 2. Data preparation -- Data collection -- Data extraction -- Data integration -- Data transformation -- Chapter 3. Methodology -- Overview -- Word embeddings -- BERTSimilar -- Model fine-tuning -- Cluster analysis -- Tag generation -- Text generation -- Chapter 4. Results -- Chapter 5. Conclusion -- Summary -- Limitations and future work -- References.
References: Includes bibliographical references (pages 54-58).
Awarding Institution: University of Massachusetts Dartmouth
Degree Awarded: Master of Science (MS)
Degree in: Data Science
Academic Unit: Department of Computer and Information Science
Language: English
Resource Type: Thesis
DOI: https://doi.org/10.62791/20312
Record Identifier: 9914424797801301