An Imbalance-Aware Approach for Cyber-Attack Detection in IoT-Enabled Cyber-Physical Systems using Autoencoders and Deep Learning
Keywords:
Cyber-attacks, Internet of Things, Water Subsystem, Machine Learning, Principal Component Analysis, Deep Neural Networks.
Abstract
IoT-enabled cyber-physical systems connect industrial equipment and operational IT so that data can be sent and received over the internet. The equipment is fitted with sensors that monitor its condition and transmit readings to a centralized server. Malicious users may target these sensors and alter their data; the manipulated readings are then sent to the server, where they can trigger erroneous actions. Equipment and production systems in many countries have failed because of such inaccurate data, and numerous algorithms have consequently been developed to detect attacks. These algorithms, however, often suffer from data imbalance: one class (for instance, NORMAL records) contains far more records than another (such as attacks, which may have only a few), so the detector fails to predict the minority class accurately. Existing approaches address the imbalance with oversampling, which generates new records for the minority class, or undersampling, which discards records from the majority class. We introduce a technique that uses neither. It comprises two components. First, an autoencoder is trained on the imbalanced dataset to extract features; these features are reduced with principal component analysis (PCA), and the reduced set is used to train a decision tree that predicts labels for both known and unknown attacks. Second, a deep neural network (DNN) is trained on both known and unknown attacks; when a record contains an attack signature, the DNN identifies the corresponding attack label or class.
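The feature-extraction stage described above can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes scikit-learn, uses synthetic data in place of real sensor records, and stands in for the autoencoder with a one-hidden-layer `MLPRegressor` trained to reconstruct its own input. The names `make_imbalanced_data`, `encode`, `fit_pipeline`, and `predict`, as well as the layer sizes, tree depth, and number of PCA components, are hypothetical choices made only for illustration. The DNN stage that assigns attack classes is omitted.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def make_imbalanced_data(n_normal=950, n_attack=50, n_features=12, seed=0):
    """Synthetic stand-in for sensor data: many normal rows, few attack rows."""
    rng = np.random.default_rng(seed)
    normal = rng.normal(0.0, 1.0, size=(n_normal, n_features))
    attack = rng.normal(2.5, 1.0, size=(n_attack, n_features))
    X = np.vstack([normal, attack])
    y = np.concatenate([np.zeros(n_normal, dtype=int),
                        np.ones(n_attack, dtype=int)])
    order = rng.permutation(len(y))
    return X[order], y[order]


def encode(autoencoder, X):
    """Hidden-layer (bottleneck) activations of a one-hidden-layer MLP."""
    hidden = X @ autoencoder.coefs_[0] + autoencoder.intercepts_[0]
    return np.maximum(hidden, 0.0)  # ReLU, matching activation="relu"


def fit_pipeline(X_train, y_train, bottleneck=6, n_components=3, seed=0):
    """Autoencoder features -> PCA -> decision tree, with no resampling."""
    scaler = StandardScaler().fit(X_train)
    Xs = scaler.transform(X_train)
    # Assumption: an MLP fitted to reproduce its input acts as the autoencoder.
    autoencoder = MLPRegressor(hidden_layer_sizes=(bottleneck,),
                               activation="relu", max_iter=500,
                               random_state=seed)
    autoencoder.fit(Xs, Xs)
    codes = encode(autoencoder, Xs)
    pca = PCA(n_components=n_components).fit(codes)
    tree = DecisionTreeClassifier(max_depth=4, random_state=seed)
    tree.fit(pca.transform(codes), y_train)
    return scaler, autoencoder, pca, tree


def predict(models, X):
    """Apply the fitted stages in the same order as during training."""
    scaler, autoencoder, pca, tree = models
    codes = encode(autoencoder, scaler.transform(X))
    return tree.predict(pca.transform(codes))


X, y = make_imbalanced_data()
split = 800  # first 800 rows for training, the rest held out
models = fit_pipeline(X[:split], y[:split])
y_pred = predict(models, X[split:])

# Recall on the minority (attack) class, the metric most affected by imbalance.
is_attack = y[split:] == 1
attack_recall = (y_pred[is_attack] == 1).sum() / max(is_attack.sum(), 1)
print(f"held-out attack recall: {attack_recall:.2f}")
```

The sketch reports recall on the attack class rather than overall accuracy, because on imbalanced data a detector that labels every record NORMAL can still score a high accuracy.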