PH.D. DEFENCE - PUBLIC SEMINAR

Cyber Attack Detection Using Machine Learning Techniques

Speaker

Mr Qi Panpan

Advisor

Dr Ng See Kiong, Professor, School of Computing

28 Apr 2022 Thursday, 04:00 PM to 05:30 PM

Zoom presentation

Abstract:

Cyber attack detection remains an urgent and challenging problem, as a rising number of cyber attack incidents has imposed substantial economic costs worldwide. As cyberspace continues to expand with the increasing adoption of digital technologies, the cyber threat landscape is also constantly changing. In particular, malicious software (malware) on computer systems remains the most common and most serious threat to cyber security, while cyber attacks on Cyber Physical Systems (CPSs) are a rising concern because many mission-critical systems are CPSs, especially with the rapid development of Internet of Things (IoT) systems. In this thesis, we therefore address cyber attack detection from two angles: supervised detection of malicious software on computer systems and unsupervised anomaly detection of attacks on CPSs.

To build a supervised malware detection model, one key phase is feature engineering, in which an input program is transformed via dynamic or static analysis into a set of features. Dynamic analysis executes each program in an isolated environment (e.g., a sandbox) to collect its run-time behaviour, while static analysis scans the binary byte stream of the program to create features. For dynamic analysis, existing works typically consider only the API name and ignore the API arguments, or require complex feature engineering and expert knowledge to process the arguments. To this end, we propose a novel feature extraction approach that encodes the API arguments, together with the API name and API category, into a homogeneous and low-cost representation, and we devise a deep neural network architecture to mine the sequential correlations among API calls. Static analysis is important for protection against malware because it allows malicious files to be detected prior to execution. Recent deep learning models for static malware detection do not rely on expert knowledge and classify binary files by reading them directly, but they treat all parts of the file equally and fail to make full use of the available information. To tackle these issues, we propose an end-to-end malware detection framework that learns features from multiple domains without feature engineering. Malware is also known to evolve rapidly, and detection models trained on the source domain (the training data) often fail to generalize to the target domain (the deployed environment) because of drift in the underlying data distribution. Recently, gradient boosting decision tree (GBDT) models, e.g., LightGBM, have shown outstanding performance for malware detection. To handle this distribution drift, we adapt the adversarial learning framework for unsupervised domain adaptation to GBDT to alleviate performance degradation in the target domain.
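As a rough illustration of the dynamic-analysis encoding described above, the sketch below folds an API call's name, category, and arguments into one fixed-length vector using feature hashing. Feature hashing is an assumption chosen here for its low cost; the abstract does not state which encoding the thesis uses. The function names, the bin count, and the sample trace are all hypothetical.

```python
import hashlib


def _bucket(token: str, n_bins: int) -> int:
    """Map a string to a stable bucket index in [0, n_bins)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "little") % n_bins


def encode_api_call(name, category, args, n_bins=8):
    """Encode one API call as a count vector of length 3 * n_bins.

    Layout (an assumption for illustration):
    [name bins | category bins | argument bins].
    """
    vec = [0.0] * (3 * n_bins)
    vec[_bucket("name:" + name, n_bins)] += 1.0
    vec[n_bins + _bucket("cat:" + category, n_bins)] += 1.0
    # Each key=value argument pair is hashed into the argument section.
    for key, value in sorted(args.items()):
        vec[2 * n_bins + _bucket(f"arg:{key}={value}", n_bins)] += 1.0
    return vec


def encode_trace(trace, n_bins=8):
    """Turn a list of API calls into a sequence of equal-length vectors."""
    return [
        encode_api_call(call["name"], call["category"], call["args"], n_bins)
        for call in trace
    ]


# Hypothetical two-call sandbox trace.
trace = [
    {"name": "CreateFileW", "category": "file",
     "args": {"path": "C:\\temp\\a.txt", "access": "0x40000000"}},
    {"name": "RegSetValueExW", "category": "registry",
     "args": {"key": "Run", "type": "1"}},
]
sequence = encode_trace(trace)
```

Every call maps to a vector of the same size regardless of how many arguments it carries, so the resulting sequence could be fed to a sequence model such as the deep neural network mentioned above.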

CPSs are large and complex data-intensive systems with a multitude of interconnected sensors and actuators. Given the lack of labelled attack data, supervised classification-based detection methods are usually infeasible for detecting attacks on CPSs. Unsupervised anomaly detection techniques for multivariate time series have been proposed for detecting CPS attacks and have shown promising performance. However, current deep learning-based unsupervised anomaly detection methods either are limited in how simultaneously and effectively their representation learning encodes temporal and spatial information, both of which are important for characterising CPS data, or cannot easily scale to other tasks without explicit knowledge of the internal relationships between the different variables or sensors. To this end, we propose MAD-SGCN, a novel unsupervised anomaly detection method for multivariate time series, in which the temporal and spatial correlations of each input sequence are captured by Long Short-Term Memory networks (LSTMs) and spectral-based Graph Convolutional Networks (GCNs), respectively. Furthermore, we design a self-supervised graph structure learning mechanism to minimize reliance on prior knowledge about the network structures of the CPSs.
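To make the spatial component above more concrete, the sketch below shows a single graph convolution step in the widely used first-order approximation of spectral graph convolution, H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W). This is not the MAD-SGCN implementation: the sensor graph, the per-sensor features (which in the thesis setting would presumably come from the temporal model), and the weight matrix are hypothetical placeholders, and the adjacency entries are assumed to be non-negative.

```python
import math


def normalized_adjacency(adj):
    """Return D^(-1/2) (A + I) D^(-1/2) for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each sensor keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj, features, weights):
    """One propagation step: ReLU(A_norm @ features @ weights)."""
    propagated = matmul(normalized_adjacency(adj), features)
    projected = matmul(propagated, weights)
    return [[max(0.0, x) for x in row] for row in projected]


# Hypothetical graph: sensors 0 and 1 are linked, sensor 2 is isolated.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]
features = [[1.0, 0.0],
            [0.0, 1.0],
            [1.0, 1.0]]
weights = [[1.0, 0.0],
           [0.0, 1.0]]  # identity, so only the graph mixing is visible
out = gcn_layer(adj, features, weights)
```

With identity weights, the two linked sensors end up with the average of their features while the isolated sensor is unchanged, which is the neighbourhood mixing a GCN layer contributes. In a setting without a known sensor graph, `adj` would have to be learned rather than given.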