Anti-Spam Techniques Based on Artificial Immune System [Hardback]

(Peking University, China)
  • Format: Hardback, 264 pages, height x width: 234x156 mm, weight: 552 g, 45 Tables, black and white; 104 Illustrations, black and white
  • Publication date: 01-Dec-2015
  • Publisher: CRC Press Inc
  • ISBN-10: 149872518X
  • ISBN-13: 9781498725187
Email has become an indispensable communication tool in daily life. However, high volumes of spam waste resources, interfere with productivity, and present severe threats to computer system security and personal privacy. This book introduces research on anti-spam techniques based on the artificial immune system (AIS) to identify and filter spam. It provides a single source of all anti-spam models and algorithms based on the AIS that the author has proposed over the past decade in various journals and conferences.

Inspired by the biological immune system, the AIS is an adaptive system based on theoretical immunology and observed immune functions, principles, and models for problem solving. Among the variety of anti-spam techniques, the AIS has been highly effective and is becoming one of the most important methods to filter spam. The book also focuses on several key topics related to the AIS, including:
  • Extraction methods inspired by various immune principles
  • Construction approaches based on several concentration methods and models
  • Classifiers based on immune danger theory
  • The immune-based dynamic updating algorithm
  • Implementing AIS-based spam filtering systems

The book also includes several experiments and comparisons with state-of-the-art anti-spam techniques to illustrate the excellent performance of AIS-based anti-spam techniques.

Anti-Spam Techniques Based on Artificial Immune System gives practitioners, researchers, and academics a centralized source of detailed information on efficient models and algorithms of AIS-based anti-spam techniques. It also contains the most current information on the general achievements of anti-spam research and approaches, outlining strategies for designing and applying spam-filtering models.
List of Figures
List of Tables
List of Symbols
Preface
Acknowledgments
Author
1 Anti-Spam Technologies
1.1 Spam Problem
1.1.1 Definition of Spam
1.1.2 Scale and Influence of Spam
1.2 Prevalent Anti-Spam Technologies
1.2.1 Legal Means
1.2.2 E-Mail Protocol Methods
1.2.3 Simple Techniques
1.2.3.1 Address Protection
1.2.3.2 Keywords Filtering
1.2.3.3 Black List and White-List
1.2.3.4 Gray List and Challenge-Response
1.2.4 Intelligent Spam Detection Approaches
1.3 E-Mail Feature Extraction Approaches
1.3.1 Term Selection Strategies
1.3.2 Text-Based Feature Extraction Approaches
1.3.3 Image-Based Feature Extraction Approaches
1.3.3.1 Property Features of Image
1.3.3.2 Color and Texture Features of Image
1.3.3.3 Character Edge Features
1.3.3.4 OCR-Based Features
1.3.4 Behavior-Based Feature Extraction Approaches
1.3.4.1 Behavior Features of Spammers
1.3.4.2 Network Behavior Features of Spam
1.3.4.3 Social Network-Based Behavior Features
1.3.4.4 Immune-Based Behavior Feature Extraction Approaches
1.4 E-Mail Classification Techniques
1.5 Performance Evaluation and Standard Corpora
1.5.1 Performance Measurements
1.5.2 Standard Corpora
1.6 Summary
2 Artificial Immune System
2.1 Introduction
2.2 Biological Immune System
2.2.1 Overview
2.2.2 Adaptive Immune Process
2.2.3 Characteristics of BIS
2.3 Artificial Immune System
2.3.1 Overview
2.3.2 AIS Models and Algorithms
2.3.2.1 Negative Selection Algorithm
2.3.2.2 Clonal Selection Algorithm
2.3.2.3 Immune Network Model
2.3.2.4 Danger Theory Model
2.3.2.5 Immune Concentration
2.3.2.6 Other Models and Algorithms
2.3.3 Characteristics of AIS
2.3.4 Application Fields of AIS
2.4 Applications of AIS in Anti-Spam
2.4.1 Heuristic Methods
2.4.2 Negative Selection
2.4.3 Immune Network
2.4.4 Dynamic Algorithms
2.4.5 Hybrid Models
2.5 Summary
3 Term Space Partition-Based Feature Construction Approach
3.1 Motivation
3.2 Principles of the TSP Approach
3.3 Implementation of the TSP Approach
3.3.1 Preprocessing
3.3.2 Term Space Partition
3.3.3 Feature Construction
3.4 Experiments
3.4.1 Investigation of Parameters
3.4.2 Performance with Different Feature Selection Metrics
3.4.3 Comparison with Current Approaches
3.5 Summary
4 Immune Concentration-Based Feature Construction Approach
4.1 Introduction
4.2 Diversity of Detector Representation in AIS
4.3 Motivation of Concentration-Based Feature Construction Approach
4.4 Overview of Concentration-Based Feature Construction Approach
4.5 Gene Library Generation
4.6 Concentration Vector Construction
4.7 Relation to Other Methods
4.8 Complexity Analysis
4.9 Experimental Validation
4.9.1 Experiments on Different Concentrations
4.9.2 Experiments with Two-Element Concentration Vector
4.9.3 Experiments with Middle Concentration
4.10 Discussion
4.11 Summary
5 Local Concentration-Based Feature Extraction Approach
5.1 Introduction
5.2 Structure of Local Concentration Model
5.3 Term Selection and Detector Sets Generation
5.4 Construction of Local Concentration-Based Feature Vectors
5.5 Strategies for Defining Local Areas
5.5.1 Using a Sliding Window with Fixed Length
5.5.2 Using a Sliding Window with Variable Length
5.6 Analysis of Local Concentration Model
5.7 Experimental Validation
5.7.1 Selection of a Proper Tendency Threshold
5.7.2 Selection of Proper Feature Dimensionality
5.7.3 Selection of a Proper Sliding Window Size
5.7.4 Selection of Optimal Terms Percentage
5.7.5 Experiments of the Model with Three Term Selection Methods
5.7.6 Comparison between the LC Model and Current Approaches
5.7.7 Discussion
5.8 Summary
6 Multi-Resolution Concentration-Based Feature Construction Approach
6.1 Introduction
6.2 Structure of Multi-Resolution Concentration Model
6.2.1 Detector Sets Construction
6.2.2 Calculation of Multi-Resolution Concentrations
6.3 Multi-Resolution Concentration-Based Feature Construction Approach
6.4 Weighted Multi-Resolution Concentration-Based Feature Construction Approach
6.5 Experimental Validation
6.5.1 Investigation of Parameters
6.5.2 Comparison with the Prevalent Approaches
6.5.3 Performance with Other Classification Methods
6.6 Summary
7 Adaptive Concentration Selection Model
7.1 Overview of Adaptive Concentration Selection Model
7.2 Setup of Gene Libraries
7.3 Construction of Feature Vectors Based on Immune Concentration
7.4 Implementation of Adaptive Concentration Selection Model
7.5 Experimental Validation
7.5.1 Experimental Setup
7.5.2 Parameter Selection
7.5.3 Experiments of Proposed Model
7.5.4 Discussion
7.6 Summary
8 Variable Length Concentration-Based Feature Construction Method
8.1 Introduction
8.2 Structure of Variable Length Concentration Model
8.2.1 Construction of Variable Length Feature Vectors
8.2.2 Recurrent Neural Networks
8.3 Experimental Parameters and Setup
8.3.1 Proportion of Terms Selection
8.3.2 Dimension of Feature Vectors
8.3.3 Selection of Size of Sliding Window
8.3.4 Parameters of RNN
8.4 Experimental Results on the VLC Approach
8.5 Discussion
8.6 Summary
9 Parameter Optimization of Concentration-Based Feature Construction Approaches
9.1 Introduction
9.2 Local Concentration-Based Feature Extraction Approach
9.3 Fireworks Algorithm
9.4 Parameter Optimization of Local Concentration Model for Spam Detection by Using Fireworks Algorithm
9.5 Experimental Validation
9.5.1 Experimental Setup
9.5.2 Experimental Results and Analysis
9.6 Summary
10 Immune Danger Theory-Based Ensemble Method
10.1 Introduction
10.2 Generating Signals
10.3 Classification Using Signals
10.4 Self-Trigger Process
10.5 Framework of DTE Model
10.6 Analysis of DTE Model
10.7 Filter Spam Using the DTE Model
10.8 Summary
11 Immune Danger Zone Principle-Based Dynamic Learning Method
11.1 Introduction
11.2 Global Learning and Local Learning
11.3 Necessity of Building Hybrid Models
11.4 Multi-Objective Learning Principles
11.5 Strategies for Combining Global Learning and Local Learning
11.6 Local Trade-Off between Capacity and Locality
11.7 Hybrid Model for Combining Models with Varied Locality
11.8 Relation to Multiple Classifier Combination
11.9 Validation of the Dynamic Learning Method
11.9.1 Danger Zone Size
11.9.2 Effects of Threshold
11.9.3 Comparison Results
11.10 Summary
12 Immune-Based Dynamic Updating Algorithm
12.1 Introduction
12.2 Backgrounds of SVM and AIS
12.2.1 Support Vector Machine
12.2.2 Artificial Immune System
12.3 Principles of EM-Update and Sliding Window
12.3.1 EM-Update
12.3.2 Work Process of Sliding Window
12.3.3 Primary Response and Secondary Response
12.4 Implementation of Algorithms
12.4.1 Overview of Dynamic Updating Algorithm
12.4.2 Message Representation
12.4.3 Dimension Reduction
12.4.4 Initialization of the Window
12.4.5 Classification Criterion
12.4.6 Update of the Classifier
12.4.7 Purge of Out-of-Date Knowledge
12.5 Filtering Spam Using the Dynamic Updating Algorithms
12.6 Discussion
12.7 Summary
13 AIS-Based Spam Filtering System and Implementation
13.1 Introduction
13.2 Framework of AIS-Based Spam Filtering Model
13.3 Postfix-Based Implementation
13.3.1 Design of Milter-Plugin
13.3.2 Maildrop-Based Local Filter
13.4 User Interests-Based Parameter Design
13.4.1 Generation and Storage of Parameters
13.4.2 Selection of Parameters
13.5 User Interaction
13.6 Test and Analysis
13.6.1 Testing Method
13.6.2 Testing Results
13.6.3 Results Analysis
13.7 Summary
References
Index
Ying Tan, PhD, is a full professor and PhD advisor in the School of Electronics Engineering and Computer Science at Peking University, China. He is also director of the Computational Intelligence Laboratory at Peking University. He received his PhD from Southeast University in Nanjing, China. His research interests include computational intelligence, swarm intelligence, data mining, machine learning, fireworks algorithm, and intelligent information processing for information security. He has published more than 280 papers, has authored or coauthored six books and more than 10 book chapters, and holds three invention patents. He is editor in chief of the International Journal of Computational Intelligence and Pattern Recognition and is an associate editor of IEEE Transactions on Cybernetics and IEEE Transactions on Neural Networks and Learning Systems. He is the general chair of the ICSICCI 2015 joint conference and ICSI series conference and is a senior member of the IEEE.