Balanced Datasets for IoT IDS
As the Internet of Things (IoT) continues to grow, cyberattacks are becoming increasingly common. The security of IoT networks relies heavily on intrusion detection systems (IDSs). The development of an IDS that is accurate and efficient is a challenging task. As a result, this challenge is made more challenging by the absence of balanced datasets for training and testing the proposed IDS. In this study, four commonly used datasets are visualized and analyzed visually. Moreover, it proposes a sampling algorithm that generates a sample that represents the original dataset. In addition, it proposes an algorithm to generate a balanced dataset. Researchers can use this paper as a starting point when investigating cybersecurity and machine learning. The proposed sampling algorithms showed reliability in generating well-representing and balanced samples from NSL-KDD, UNSW-NB15, BotNetIoT-01, and BoTIoT datasets.
READ FULL TEXT