Large-Scale Realistic Network Data Generation on a Budget
Research in computer networking domains has been a mainstay for many decades. Often, a first step in performing such research involves acquiring or collecting relevant network data. Unfortunately, network datasets are not as plentiful in the real world as one would hope or assume. This may leave researchers with only two options: abandon the research, or generate the data required to perform it.
In this work, we set out to develop a method for realistic network trace data generation which can be applied in a network emulation setting. Network emulation enables the construction of very large-scale, real-world networks within a single physical host, providing an inexpensive testbed within a lab or personal environment, and without the abstractions often present in network simulators. Within such a network, we deploy our method, called eMews, which provides network dataset generation and monitoring, enabling the autonomous generation of realistic network trace data over potentially very large-scale networks. Client-side human behavior is abstracted to a set of behavioral models, which are then used to automate protocols which would normally require human interaction (such as SSH and HTTP/HTTPS). eMews is written with shared resource constraints in mind, allowing it to scale up for very large networks.
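To make the idea of a behavioral model concrete, the following is a minimal sketch of how client-side behavior might be abstracted: a small Markov chain chooses the next user state, and an exponentially distributed "think time" separates actions. It is not taken from eMews. The state names, transition probabilities, mean think times, and the `generate_session` function are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical two-state user model. The states, probabilities, and mean
# think times are illustrative assumptions, not values used by eMews.
TRANSITIONS = {
    "browse": [("browse", 0.7), ("idle", 0.3)],
    "idle": [("browse", 0.5), ("idle", 0.5)],
}
MEAN_THINK_SECONDS = {"browse": 5.0, "idle": 60.0}


def generate_session(steps, seed=None, start="browse"):
    """Return a list of (state, think_time_seconds) pairs for one client."""
    rng = random.Random(seed)
    state = start
    session = []
    for _ in range(steps):
        # Sample an exponentially distributed pause for the current state.
        think = rng.expovariate(1.0 / MEAN_THINK_SECONDS[state])
        session.append((state, think))
        # Pick the next state according to the transition probabilities.
        next_states, weights = zip(*TRANSITIONS[state])
        state = rng.choices(next_states, weights=weights, k=1)[0]
    return session


if __name__ == "__main__":
    for state, think in generate_session(5, seed=1):
        print(f"{state}: wait {think:.1f}s before the next action")
```

In an emulated network, a client process could walk such a schedule and issue a real HTTP or SSH request at each "browse" step, so that the traffic timing reflects the model rather than a fixed interval.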
Initial scalability results on a Dell Inspiron laptop (Intel Core i7-7500U CPU @ 2.70GHz [2C/4T], 8GB RAM, 8GB swap) show promise, with RAM being exhausted before CPU. Current work focuses on reducing memory usage to improve scalability on lower-end hardware, increasing the number of autonomous client-side protocols supported, and creating more expressive human behavioral models.
eMews Open-Source Software
Related Publications
- Brian Ricks, Patrick Tague, Bhavani Thuraisingham, and Sriraam Natarajan, "Utilizing Threat Partitioning for More Practical Network Anomaly Detection", Proceedings of the 29th ACM Symposium on Access Control Models and Technologies (SACMAT), May 2024. [pdf,bib]
- Brian Ricks, Patrick Tague, and Bhavani Thuraisingham, "DDoS-as-a-Smokescreen: Leveraging Netflow Concurrency and Segmentation for Faster Detection", IEEE Intl Conference on Trust, Privacy and Security in Intelligent Systems, and Applications (TPS), Dec 2021. [pdf,bib]
- Brian Ricks, Bhavani Thuraisingham, and Patrick Tague, "Mimicking Human Behavior in Shared-Resource Computer Networks", IEEE Intl Conference on Information Reuse and Integration for Data Science (IRI), Jul 2019. [pdf,bib]
- Brian Ricks, Bhavani Thuraisingham, and Patrick Tague, "Lifting the Smokescreen: Detecting Underlying Anomalies During a DDoS Attack", IEEE Intelligence and Security Informatics (ISI), Nov 2018. (best paper award) [pdf,bib]
- Brian Ricks, Patrick Tague, and Bhavani Thuraisingham, "Large-Scale Realistic Network Data Generation on a Budget", IEEE Intl Conference on Information Reuse and Integration for Data Science (IRI), Jul 2018. (best paper award) [pdf,bib]