Preface |
|
vii | |
|
Chapter 1 Basic Concepts of Document Analysis and Understanding |
|
|
1 | (54) |
|
|
1 | (3) |
|
1.2 Basic Model of Document Processing |
|
|
4 | (4) |
|
|
8 | (5) |
|
1.3.1 Strength of Structure |
|
|
8 | (1) |
|
1.3.2 Geometric Structure |
|
|
9 | (1) |
|
1.3.2.1 Geometric Complexity |
|
|
10 | (2) |
|
|
12 | (1) |
|
|
13 | (10) |
|
1.4.1 Hierarchical Methods |
|
|
13 | (1) |
|
1.4.1.1 Top-Down Approach |
|
|
14 | (2) |
|
1.4.1.2 Bottom-Up Approach |
|
|
16 | (1) |
|
1.4.2 Non-Hierarchical Methods |
|
|
17 | (1) |
|
1.4.2.1 Modified Fractal Signature |
|
|
18 | (2) |
|
1.4.2.2 Order Stochastic Filtering |
|
|
20 | (2) |
|
1.4.3 Web Document Analysis |
|
|
22 | (1) |
|
1.5 Document Understanding |
|
|
23 | (4) |
|
1.5.1 Document Understanding Based on Tree Transformation |
|
|
24 | (1) |
|
1.5.2 Document Understanding Based on Formatting Knowledge |
|
|
25 | (1) |
|
1.5.3 Document Understanding Based on Description Language |
|
|
26 | (1) |
|
1.6 Form Document Processing |
|
|
27 | (5) |
|
1.6.1 Characteristics of Form Documents |
|
|
27 | (1) |
|
1.6.2 Wavelet Transform Approach |
|
|
27 | (1) |
|
1.6.3 Approach Based on Form Description Language |
|
|
28 | (3) |
|
1.6.4 Form Document Processing Based on Form Registration |
|
|
31 | (1) |
|
1.6.5 Form Document Processing System |
|
|
31 | (1) |
|
1.7 Character Recognition and Document Image Processing |
|
|
32 | (11) |
|
1.7.1 Handwritten and Printed Character Recognition |
|
|
32 | (1) |
|
1.7.1.1 Extracting Multiresolution Features in Recognition of Handwritten Numerals with 2-D Haar Wavelet |
|
|
33 | (4) |
|
1.7.1.2 Recognition of Printed Kannada Text in Indian Languages |
|
|
37 | (1) |
|
1.7.1.3 Wavelet Descriptors of Handprinted Characters |
|
|
38 | (1) |
|
1.7.2 Document Image Analysis Based on Multiresolution Hadamard Representation (MHR) |
|
|
38 | (5) |
|
|
43 | (12) |
|
|
44 | (1) |
|
1.8.2 Techniques for Skew Detection |
|
|
45 | (1) |
|
1.8.3 Projection Profile Cuts |
|
|
46 | (1) |
|
1.8.4 Run-Length Smoothing Algorithm (RLSA) |
|
|
47 | (1) |
|
1.8.5 Neighborhood Line Density (NLD) |
|
|
48 | (1) |
|
1.8.6 Connected Components Analysis (CCA) |
|
|
49 | (1) |
|
|
50 | (1) |
|
1.8.8 Form Definition Language (FDL) |
|
|
50 | (1) |
|
1.8.9 Texture Analysis -- Gabor Filters |
|
|
51 | (1) |
|
|
52 | (1) |
|
1.8.11 Other Segmentation Techniques |
|
|
53 | (2) |
|
Chapter 2 Basic Concepts of Fractal Dimension |
|
|
55 | (40) |
|
2.1 Definitions of Fractals |
|
|
55 | (2) |
|
|
57 | (12) |
|
|
57 | (3) |
|
2.2.2 Hausdorff Dimension |
|
|
60 | (4) |
|
2.2.3 Examples of Computing Hausdorff Dimension |
|
|
64 | (5) |
|
2.3 Box Computing Dimension |
|
|
69 | (14) |
|
|
69 | (1) |
|
2.3.2 Box Computing Dimension |
|
|
70 | (5) |
|
2.3.3 Minkowski Dimension |
|
|
75 | (6) |
|
2.3.4 Properties of Box Counting Dimension |
|
|
81 | (2) |
|
2.4 Basic Methods for Calculating Dimensions |
|
|
83 | (12) |
|
Chapter 3 Basic Concepts of Wavelet Theory |
|
|
95 | (78) |
|
3.1 Continuous Wavelet Transforms |
|
|
95 | (29) |
|
3.1.1 General Theory of Continuous Wavelet Transforms |
|
|
95 | (16) |
|
3.1.2 The Continuous Wavelet Transform as a Filter |
|
|
111 | (3) |
|
3.1.3 Description of Regularity of Signal by Wavelet |
|
|
114 | (4) |
|
3.1.4 Some Examples of Basic Wavelets |
|
|
118 | (6) |
|
3.2 Multiresolution Analysis (MRA) and Wavelet Bases |
|
|
124 | (49) |
|
3.2.1 Multiresolution Analysis |
|
|
124 | (1) |
|
3.2.1.1 Basic Concept of MRA |
|
|
124 | (5) |
|
3.2.1.2 The Solution of Two-Scale Equation |
|
|
129 | (10) |
|
3.2.2 The Construction of MRAs |
|
|
139 | (7) |
|
3.2.2.1 The Biorthonormal MRA |
|
|
146 | (7) |
|
3.2.2.2 Examples of Constructing MRA |
|
|
153 | (10) |
|
3.2.3 The Construction of Biorthonormal Wavelet Bases |
|
|
163 | (5) |
|
3.2.4 S. Mallat Algorithms |
|
|
168 | (5) |
|
Chapter 4 Document Analysis by Fractal Dimension |
|
|
173 | (30) |
|
|
173 | (6) |
|
4.2 Document Analysis Based on Modified Fractal Signature (MFS) |
|
|
179 | (8) |
|
4.2.1 Basic Idea of Modified Fractal Signature (MFS) |
|
|
179 | (1) |
|
|
180 | (3) |
|
4.2.3 Blanket Technique to Extract Fractal Feature |
|
|
183 | (4) |
|
4.3 Algorithm of Modified Fractal Signature (MFS) |
|
|
187 | (7) |
|
4.3.1 Identification of Different Blocks of Document by Fractal Signature |
|
|
187 | (4) |
|
4.3.2 Modified Fractal Signature (MFS) |
|
|
191 | (3) |
|
|
194 | (9) |
|
Chapter 5 Text Extraction by Wavelet Decomposition |
|
|
203 | (28) |
|
|
203 | (1) |
|
5.2 Wavelet Decomposition of Pseudo-Motion Functions |
|
|
204 | (5) |
|
|
204 | (4) |
|
|
208 | (1) |
|
5.3 Segmentation of Different Areas of Document Image |
|
|
209 | (6) |
|
5.3.1 Segmentation of Areas of Different Frequency |
|
|
209 | (3) |
|
|
212 | (3) |
|
|
215 | (16) |
|
5.4.1 Position of License Plate |
|
|
215 | (1) |
|
5.4.1.1 Choice of the Bases |
|
|
215 | (4) |
|
5.4.1.2 Experimental Results |
|
|
219 | (1) |
|
5.4.2 Localization of Text Areas of Document Images |
|
|
220 | (11) |
|
Chapter 6 Rotation Invariant by Fractal Theory with Central Projection Transform (CPT) |
|
|
231 | (48) |
|
|
231 | (13) |
|
|
232 | (2) |
|
6.1.2 Rotation Invariants |
|
|
234 | (3) |
|
6.1.3 Rotation Invariant of Discrete Images |
|
|
237 | (4) |
|
6.1.4 Rotation Invariants in Pattern Recognition |
|
|
241 | (1) |
|
6.1.4.1 Boundary Curvature |
|
|
242 | (1) |
|
6.1.4.2 Fourier Descriptors |
|
|
242 | (1) |
|
|
243 | (1) |
|
|
243 | (1) |
|
6.2 Preprocessing and Central Projection Transform (CPT) |
|
|
244 | (12) |
|
|
244 | (2) |
|
6.2.2 Central Projection Transform (CPT) |
|
|
246 | (1) |
|
6.2.2.1 Basic Definitions of CPT |
|
|
246 | (4) |
|
6.2.2.2 Properties of CPT |
|
|
250 | (2) |
|
6.2.2.3 Parallel Algorithm for CPT |
|
|
252 | (1) |
|
6.2.2.4 Contour Unfolding |
|
|
253 | (3) |
|
6.3 Rotation Invariance Based on Box Computing Dimension |
|
|
256 | (11) |
|
6.3.1 Estimation of the 1-D Fractal Dimension |
|
|
256 | (2) |
|
6.3.2 Rotation Invariant Signature (RIS) |
|
|
258 | (9) |
|
|
267 | (12) |
|
6.4.1 Rotation Invariant Signature (RIS) Algorithm |
|
|
267 | (1) |
|
6.4.1.1 Estimation of the BCD |
|
|
267 | (2) |
|
6.4.1.2 Extraction of Feature with Rotation Invariant Property |
|
|
269 | (2) |
|
6.4.2 Experimental Procedure and Results |
|
|
271 | (8) |
|
Chapter 7 Wavelet-Based and Fractal-Based Methods for Script Identification |
|
|
279 | (30) |
|
|
280 | (2) |
|
7.2 Wavelet-Based Approach |
|
|
282 | (18) |
|
7.2.1 Image Decomposition by Multi-Scale Wavelet Transform |
|
|
284 | (3) |
|
7.2.2 Wavelet-Based Features |
|
|
287 | (1) |
|
7.2.2.1 Average Energy of Document Image |
|
|
287 | (2) |
|
7.2.2.2 Wavelet Energy Distribution Features (Fd) |
|
|
289 | (2) |
|
7.2.2.3 Wavelet Energy Distribution Proportion Features (Fdp) |
|
|
291 | (2) |
|
|
293 | (1) |
|
7.2.3.1 Distance Functions |
|
|
293 | (2) |
|
7.2.3.2 Experimental Results |
|
|
295 | (5) |
|
7.3 Fractal-Based Approach |
|
|
300 | (9) |
|
|
301 | (1) |
|
|
302 | (7) |
|
Chapter 8 Writer Identification Using Hidden Markov Model in Wavelet Domain (WD-HMM) |
|
|
309 | (28) |
|
|
309 | (1) |
|
8.2 Hidden Markov Model and Related Statistical Knowledge |
|
|
310 | (10) |
|
8.2.1 Expectation Maximization (EM) Algorithm |
|
|
310 | (2) |
|
8.2.2 Gaussian Mixture Model (GMM) and Expectation Maximization (EM) for Gaussian Mixture Model (GMM) |
|
|
312 | (4) |
|
8.2.3 Hidden Markov Model |
|
|
316 | (1) |
|
8.2.3.1 Basic Frame of HMM |
|
|
316 | (2) |
|
8.2.3.2 Three Basic Problems for HMM |
|
|
318 | (1) |
|
8.2.3.3 Important Assumptions for HMM |
|
|
319 | (1) |
|
8.3 Hidden Markov Models in Wavelet Domain |
|
|
320 | (4) |
|
8.3.1 GMM Model for a Single Wavelet Coefficient |
|
|
320 | (1) |
|
8.3.2 Independence Mixture Model |
|
|
320 | (1) |
|
8.3.3 WD-HMM and EM for WD-HMM |
|
|
321 | (3) |
|
8.4 Writer Identification Using WD-HMM |
|
|
324 | (5) |
|
8.4.1 The Whole Procedure |
|
|
324 | (1) |
|
|
325 | (1) |
|
8.4.3 Similarity Measurement |
|
|
326 | (3) |
|
8.4.4 Performance Evaluation |
|
|
329 | (1) |
|
|
329 | (8) |
Bibliography |
|
337 | (16) |
Index |
|
353 | |