Improved Flow Recovery from Packet Data

Date

2023

Publisher

arXiv preprint arXiv:2310.09834

Type

Article

Abstract

Typical event datasets, such as those used in network intrusion detection, comprise hundreds of thousands, and sometimes millions, of discrete packet events. These datasets tend to be high-dimensional, stateful, and time-series in nature, with complex local and temporal feature associations. Packet data can be abstracted into lower-dimensional summary data, such as packet flow records, which mitigates some of the temporal complexity of packet data and allows smaller, well-engineered feature subsets to be created. This summary data can be invaluable as training data for machine learning and cyber threat detection techniques. Data can be collected in real time or from historical packet trace archives. In this paper we focus on how flow records and summary metadata can be extracted from packet data with high accuracy and robustness. We identify limitations in current methods and examine how these flaws may affect datasets and, in turn, learning models. Finally, we propose methods to improve the state of the art and introduce proof-of-concept tools to support this work.
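
To make the idea of abstracting packets into flow records concrete, the following is a minimal illustrative sketch in Python. It is not the authors' tooling or method: the `Packet` and `FlowRecord` types, the `flow_key` and `aggregate_flows` helpers, and the 60-second idle timeout are all assumptions chosen for illustration. It groups packets by a direction-independent 5-tuple and starts a new record whenever a flow has been idle longer than the timeout.

```python
from collections import namedtuple
from dataclasses import dataclass

# Hypothetical, simplified packet representation (not from the paper).
Packet = namedtuple("Packet", "ts src dst sport dport proto length")


@dataclass
class FlowRecord:
    """Summary metadata for one bidirectional flow (illustrative fields only)."""
    first_seen: float
    last_seen: float
    packets: int = 0
    bytes: int = 0


def flow_key(pkt):
    """Build a direction-independent 5-tuple key by ordering the two endpoints."""
    a = (pkt.src, pkt.sport)
    b = (pkt.dst, pkt.dport)
    lo, hi = sorted((a, b))
    return (lo, hi, pkt.proto)


def aggregate_flows(packets, idle_timeout=60.0):
    """Collapse packets into flow records.

    A flow is closed once no packet has been seen for more than
    `idle_timeout` seconds (an assumed value); a later packet with
    the same key then opens a new record.
    """
    active = {}    # key -> FlowRecord currently being updated
    finished = []  # (key, FlowRecord) pairs that timed out
    for pkt in sorted(packets, key=lambda p: p.ts):
        key = flow_key(pkt)
        rec = active.get(key)
        if rec is not None and pkt.ts - rec.last_seen > idle_timeout:
            finished.append((key, rec))
            rec = None
        if rec is None:
            rec = FlowRecord(first_seen=pkt.ts, last_seen=pkt.ts)
            active[key] = rec
        rec.last_seen = pkt.ts
        rec.packets += 1
        rec.bytes += pkt.length
    # Flows still open at the end of the trace are emitted as-is.
    finished.extend(active.items())
    return finished


example_packets = [
    Packet(0.0, "10.0.0.1", "10.0.0.2", 1234, 80, "tcp", 100),
    Packet(1.0, "10.0.0.2", "10.0.0.1", 80, 1234, "tcp", 200),
    Packet(100.0, "10.0.0.1", "10.0.0.2", 1234, 80, "tcp", 50),
]
example_flows = aggregate_flows(example_packets)
```

In this sketch the first two packets share one record because they travel in opposite directions between the same endpoints, while the third packet arrives after the assumed idle timeout and opens a second record. Real flow exporters also handle details such as TCP state, active timeouts, and out-of-order packets, which are omitted here.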

Citation

Kenyon, A., Elizondo, D. and Deka, L. (2023) Improved Flow Recovery from Packet Data. arXiv:2310.09834
