Contents
Preface xiii
Acknowledgments xvii
Acronyms xix
1 Cybersecurity in the Era of Artificial Intelligence 1
1.1 Artificial Intelligence for Cybersecurity 2
1.1.1 Artificial Intelligence 2
1.1.2 Machine Learning 4
1.1.3 Data-Driven Workflow for Cybersecurity 6
1.2 Key Areas and Challenges 7
1.2.1 Anomaly Detection 8
1.2.2 Trustworthy Artificial Intelligence 10
1.2.3 Privacy Preservation 10
1.3 Toolbox to Build Secure and Intelligent Systems 11
1.3.1 Machine Learning and Deep Learning 12
1.3.2 Privacy-Preserving Machine Learning 14
1.3.3 Adversarial Machine Learning 15
1.4 Data Repositories for Cybersecurity Research 16
1.4.1 NSL-KDD 17
1.4.2 UNSW-NB15 17
1.4.3 EMBER 18
1.5 Summary 18
2 Cyber Threats and Gateway Defense 19
2.1 Cyber Threats 19
2.1.1 Cyber Intrusions 20
2.1.2 Distributed Denial of Services Attack 22
2.1.3 Malware and Shellcode 23
2.2 Gateway Defense Approaches 23
2.2.1 Network Access Control 24
2.2.2 Anomaly Isolation 24
2.2.3 Collaborative Learning 24
2.2.4 Secure Local Data Learning 25
2.3 Emerging Data-Driven Methods for Gateway Defense 26
2.3.1 Semi-Supervised Learning for Intrusion Detection 26
2.3.2 Transfer Learning for Intrusion Detection 27
2.3.3 Federated Learning for Privacy Preservation 28
2.3.4 Reinforcement Learning for Penetration Test 29
2.4 Case Study: Reinforcement Learning for Automated Post-Breach Penetration Test 30
2.4.1 Literature Review 30
2.4.2 Research Idea 31
2.4.3 Training Agent using Deep Q-Learning 32
2.5 Summary 34
3 Edge Computing and Secure Edge Intelligence 35
3.1 Edge Computing 35
3.2 Key Advances in Edge Computing 38
3.2.1 Security 38
3.2.2 Reliability 41
3.2.3 Survivability 42
3.3 Secure Edge Intelligence 43
3.3.1 Background and Motivation 44
3.3.2 Design of Detection Module 45
3.3.3 Challenges against Poisoning Attacks 48
3.4 Summary 49
4 Edge Intelligence for Intrusion Detection 51
4.1 Edge Cyberinfrastructure 51
4.2 Edge AI Engine 53
4.2.1 Feature Engineering 53
4.2.2 Model Learning 54
4.2.3 Model Update 56
4.2.4 Predictive Analytics 56
4.3 Threat Intelligence 57
4.4 Preliminary Study 57
4.4.1 Dataset 57
4.4.2 Environment Setup 59
4.4.3 Performance Evaluation 59
4.5 Summary 63
5 Robust Intrusion Detection 65
5.1 Preliminaries 65
5.1.1 Median Absolute Deviation 65
5.1.2 Mahalanobis Distance 66
5.2 Robust Intrusion Detection 67
5.2.1 Problem Formulation 67
5.2.2 Step 1: Robust Data Preprocessing 68
5.2.3 Step 2: Bagging for Labeled Anomalies 69
5.2.4 Step 3: One-Class SVM for Unlabeled Samples 70
5.2.5 Step 4: Final Classifier 74
5.3 Experiment and Evaluation 76
5.3.1 Experiment Setup 76
5.3.2 Performance Evaluation 81
5.4 Summary 92
6 Efficient Preprocessing Scheme for Anomaly Detection 93
6.1 Efficient Anomaly Detection 93
6.1.1 Related Work 95
6.1.2 Principal Component Analysis 97
6.2 Efficient Preprocessing Scheme for Anomaly Detection 98
6.2.1 Robust Preprocessing Scheme 99
6.2.2 Real-Time Processing 103
6.2.3 Discussions 103
6.3 Case Study 104
6.3.1 Description of the Raw Data 105
6.3.2 Experiment 106
6.3.3 Results 108
6.4 Summary 109
7 Privacy Preservation in the Era of Big Data 111
7.1 Privacy Preservation Approaches 111
7.1.1 Anonymization 111
7.1.2 Differential Privacy 112
7.1.3 Federated Learning 114
7.1.4 Homomorphic Encryption 116
7.1.5 Secure Multi-Party Computation 117
7.1.6 Discussions 118
7.2 Privacy-Preserving Anomaly Detection 120
7.2.1 Literature Review 121
7.2.2 Preliminaries 123
7.2.3 System Model and Security Model 124
7.3 Objectives and Workflow 126
7.3.1 Objectives 126
7.3.2 Workflow 128
7.4 Predicate Encryption based Anomaly Detection 129
7.4.1 Procedures 129
7.4.2 Development of Predicate 131
7.4.3 Deployment of Anomaly Detection 132
7.5 Case Study and Evaluation 134
7.5.1 Overhead 134
7.5.2 Detection 136
7.6 Summary 137
8 Adversarial Examples: Challenges and Solutions 139
8.1 Adversarial Examples 139
8.1.1 Problem Formulation in Machine Learning 140
8.1.2 Creation of Adversarial Examples 141
8.1.3 Targeted and Non-Targeted Attacks 141
8.1.4 Black-Box and White-Box Attacks 142
8.1.5 Defenses against Adversarial Examples 142
8.2 Adversarial Attacks in Security Applications 143
8.2.1 Malware 143
8.2.2 Cyber Intrusions 143
8.3 Case Study: Improving Adversarial Attacks Against Malware Detectors 144
8.3.1 Background 144
8.3.2 Adversarial Attacks on Malware Detectors 145
8.3.3 MalConv Architecture 147
8.3.4 Research Idea 148
8.4 Case Study: A Metric for Machine Learning Vulnerability to Adversarial Examples 149
8.4.1 Background 149
8.4.2 Research Idea 150
8.5 Case Study: Protecting Smart Speakers from Adversarial Voice Commands 153
8.5.1 Background 153
8.5.2 Challenges 154
8.5.3 Directions and Tasks 155
8.6 Summary 157