
E-book: Text Information Retrieval Systems

2.79/5 (55 ratings by Goodreads)

Carol L. Barry, Charles T. Meadow, Donald H. Kraft, Bert R. Boyce

Format: 390 pages
Series: Library and Information Science
Publication date: 19-Dec-2006
Publisher: Academic Press Inc
ISBN-13: 9780080469034

Other books on this subject:

IT, Internet & electronic resources in libraries

Format: PDF+DRM
Price: 83,06 €*
* This price is final; no additional discounts will be applied.
This e-book is for personal use only. E-books are non-returnable.


DRM restrictions

Copying: not allowed
Printing: not allowed

E-book usage:

Digital rights management (DRM)
The publisher has supplied this book in encrypted form, which means you need to install free software to unlock and read it. To read this e-book, you must create an Adobe ID. The e-book can be downloaded to 6 devices (one user with the same Adobe ID).

Required software
To read this e-book on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

To read this e-book on a PC or Mac, you need Adobe Digital Editions (a free application designed specifically for e-books; it is not the same as Adobe Reader, which you probably already have on your computer).

You cannot read this e-book with an Amazon Kindle.

This will be the third edition of the highly successful "Text Information Retrieval Systems", a follow-up to the award-winning second edition. The book's purpose is to teach people who will be searching or designing text retrieval systems how the systems work. For designers, it covers the problems they will face and reviews currently available solutions to provide a basis for more advanced study. For searchers, it explains why such systems work as they do. The book is primarily about computer-based retrieval systems, but the principles apply to nonmechanized ones, and to any information-seeking context, as well. It covers the nature of information, how it is organized for use by a computer, how search functions are carried out, and some of the theory underlying these functions. It also discusses the interaction between user and system and how retrieved items, users, and complete systems are evaluated. A limited knowledge of mathematics and of computing is assumed. This third edition will be updated to include coverage of the WWW and current search engines. In many cases, examples of non-web searching will be replaced with web-based illustrations. Interfaces, the various features available to assist searchers, and areas in which search assistance is not available will also be covered. In addition, the book will have a web dimension that will include relevant material available online, to be used in conjunction with the text.

Dealing with computer-based retrieval systems, this book covers the nature of information, how it is organized for use by a computer, how search functions are carried out, and the theory underlying these functions. It also discusses the interaction between user and system and how retrieved items, users, and complete systems are evaluated.

Preface

1 Introduction

1.1 What Is Information?

1.2 What Is Information Retrieval?

1.3 How Does Information Retrieval Work?

1.3.1 The User Sequence

1.3.2 The Database Producer Sequence

1.3.3 System Design and Functioning

1.3.4 Why the Process Is Not Perfect

1.4 Who Uses Information Retrieval?

1.4.1 Information Specialists

1.4.2 Subject Specialist End Users

1.4.3 Non-Subject Specialist End Users

1.5 What Are the Problems in IRS Design and Use?

1.5.1 Design

1.5.2 Understanding User Behavior

1.6 A Brief History of Information Retrieval

1.6.1 Traditional Information Retrieval Methods

1.6.2 Pre-Computer IR Systems

1.6.3 Special Purpose Computer Systems

1.6.4 General Purpose Computer Systems

1.6.5 Online Database Services

1.6.6 The World Wide Web

Recommended Reading

2 Data, Information, and Knowledge

2.1 Introduction

2.2 Definitions

2.2.1 Data

2.2.2 Information

2.2.3 News

2.2.4 Knowledge

2.2.5 Intelligence

2.2.6 Meaning

2.2.7 Wisdom

2.2.8 Relevance and Value

2.3 Metadata

2.4 Knowledge Base

2.5 Credence, Justified Belief, and Point of View

2.6 Summary

3 Representation of Information

3.1 Information to Be Represented

3.2 Types of Representation

3.2.1 Natural Language

3.2.2 Restricted Natural Language

3.2.3 Artificial Language

3.2.4 Codes, Measures, and Descriptors

3.2.5 Mathematical Models of Text

3.3 Characteristics of Information Representations

3.3.1 Discriminating Power

3.3.2 Identification of Similarity

3.3.3 Descriptiveness

3.3.4 Ambiguity

3.3.5 Conciseness

3.4 Relationships Among Entities and Attribute Values

3.4.1 Hierarchical Codes

3.4.2 Measurements

3.4.3 Nominal Descriptors

3.4.4 Inflected Language

3.4.5 Full Text

3.4.6 Explicit Pointers and Links

3.5 Summary

4 Attribute Content and Values

4.1 Types of Attribute Symbols

4.1.1 Numbers

4.1.2 Character Strings: Names

4.1.3 Other Character Strings

4.2 Class Relationships

4.2.1 Hierarchical Classification

4.2.2 Network Relationships

4.2.3 Class Membership: Binary, Probabilistic, or Fuzzy

4.3 Transformations of Values

4.3.1 Transformation of Words by Stemming

4.3.2 Sound-Based Transformation of Words

4.3.3 Transformation of Words by Meaning

4.3.4 Transformation of Graphics

4.3.5 Transformation of Sound

4.4 Uniqueness of Values

4.5 Ambiguity of Attribute Values

4.6 Indexing of Text

4.7 Control of Vocabulary

4.7.1 Elements of Control

4.7.2 Dissemination of Controlled Vocabularies

4.8 Importance of Point of View

4.9 Summary

5 Models of Virtual Data Structure

5.1 Concept of Models of Data

5.2 Basic Data Elements and Structures

5.2.1 Scalar Variables and Constants

5.2.2 Vector Variables

5.2.3 Structures

5.2.4 Arrays

5.2.5 Tuples

5.2.6 Relations

5.2.7 Text

5.3 Common Structural Models

5.3.1 Linear Sequential Model

5.3.2 Relational Model

5.3.3 Hierarchical and Network Models

5.4 Applications of the Basic Models

5.4.1 Hypertext

5.4.2 Spreadsheet Files

5.5 Entity-Relationship Model

5.6 Summary

6 The Physical Structure of Data

6.1 Introduction to Physical Structures

6.2 Record Structures and Their Effects

6.2.1 Basic Structures

6.2.2 Space-Time and Transaction Rate

6.3 Basic Concepts of File Structure

6.3.1 The Order of Records

6.3.2 Finding Records

6.4 Organizational Methods

6.4.1 Sequential Files

6.4.2 Index-File Structures

6.4.3 Lists

6.4.4 Trees

6.4.5 Direct-Access Structures

6.5 Parsing of Data Elements

6.5.1 Phrase Parsing

6.5.2 Word Parsing

6.5.3 Word and Phrase Parsing

6.6 Combination Structures

6.6.1 Nested Indexes

6.6.2 Direct Structure with Chains

6.6.3 Indexed Sequential Access Method

6.7 Summary

7 Querying the Information Retrieval System

7.1 Introduction

7.2 Language Types

7.3 Query Logic

7.3.1 Sets and Subsets

7.3.2 Relational Statements

7.3.3 Boolean Query Logic

7.3.4 Ranked and Fuzzy Sets

7.3.5 Similarity Measures

7.4 Functions Performed

7.4.1 Connect to an IRS

7.4.2 Select a Database

7.4.3 Search the Inverted File or Thesaurus

7.4.4 Create a Subset of the Database

7.4.5 Search for Strings

7.4.6 Analyze a Set

7.4.7 Sort, Display, and Format Records

7.4.8 Handle the Unstructured Record

7.4.9 Download

7.4.10 Order Documents

7.4.11 Save, Recall, and Edit Searches

7.4.12 Current Awareness Search

7.4.13 Cost Summary

7.4.14 Terminate a Session

7.5 The Basis for Charging for Searches

8 Interpretation and Execution of Query Statements

8.1 Problems of Query Language Interpretation

8.1.1 Parsing Command Language

8.1.2 Parsing Natural Language

8.1.3 Processing Menu Choices

8.2 Executing Retrieval Commands

8.2.1 Database Selection

8.2.2 Inverted File Search

8.2.3 Set or Subset Creation

8.2.4 Truncation and Universal Characters

8.2.5 Left-Hand Truncation

8.3 Executing Record Analysis and Presentation Commands

8.3.1 Set Analysis Functions

8.3.2 Display, Format, and Sort

8.3.3 Offline Printing

8.4 Executing Other Commands

8.4.1 Ordering

8.4.2 Save, Recall, and Edit Searches

8.4.3 Current Awareness

8.4.4 Cost Summation and Billing

8.4.5 Terminate a Session

8.5 Feedback to Users and Error Messages

8.5.1 Response to Command Errors

8.5.2 Set-Size Indication

8.5.3 Record Display

8.5.4 Set Analysis

8.5.5 Cost

8.5.6 Help

9 Text Searching

9.1 The Special Problems of Text Searching

9.1.1 A Note on Terminology and Symbols

9.1.2 The Semantic Web

9.2 Some Characteristics of Text and Their Applications

9.2.1 Components of Text

9.2.2 Significant Words Indexing

9.2.3 Significant Sentences—Abstracting

9.2.4 Measures of Complete Texts

9.3 Command Language for Text Searching

9.3.1 Set Membership Statements

9.3.2 Word or String Occurrence Statements

9.3.3 Proximity Statements

9.3.4 Web Based Text Search

9.4 Term Weighting

9.4.1 Indexing with Weights

9.4.2 Automated Assignment of Weights

9.4.3 Improving Weights

9.5 Word Association Techniques

9.5.1 Dictionaries and Thesauri

9.5.2 Mini-Thesauri

9.5.3 Word Co-occurrence Statistics

9.5.4 Stemming and Conflation

9.6 Text or Record Association Techniques

9.6.1 Similarity Measures

9.6.2 Clustering

9.6.3 Signature Matching

9.6.4 Discriminant Methods

9.7 Other Processes with Words of a Text

9.7.1 Stop Words

9.7.2 Replacement of Words with Roots or Associated Words

9.7.3 Varying Significance as a Function of Frequency

9.7.4 Comments on the Computation of the Strength of Document Association

10 System-Computed Relevance and Ranking

10.1 The Retrieval Status Value (rsv)

10.2 Ranking

10.3 Methods of Evaluating the rsv

10.3.1 The Vector Space Model

10.3.2 The Probabilistic Model

10.3.3 The Extended Boolean Model

10.4 The rsv in Operational Retrieval

11 Search Feedback and Iteration

11.1 Basic Concepts of Feedback and Iteration

11.2 Command Sequences

11.3 Information Available as Feedback

11.3.1 File or Database Selection

11.3.2 Term Search or Browsing

11.3.3 Record Search and Set Formation

11.3.4 Record Display and Browsing

11.3.5 Record Acquisition

11.3.6 Requests for Information About the Retrieval System

11.3.7 Establishing Communications Parameters

11.3.8 Trends Over Sequences and Cycles

11.4 Adjustments in the Search

11.4.1 Improve Term Selection

11.4.2 Improve Set Formation Logic

11.4.3 Improve Final Set Size

11.4.4 Improve Precision, Recall, or Total Utility

11.5 Feedback from User to System

12 Multi-Database Searching and Mapping

12.1 Basic Concepts

12.2 Multi-Database Search

12.2.1 Nature of Duplicate Records

12.2.2 Detection of Duplicates

12.2.3 Scanning Multiple Databases

12.3 Mapping

12.4 Value of Mapping

13 Search Strategy

13.1 The Nature of Searching Reconsidered

13.1.1 Known Item Search

13.1.2 Specific Information Search

13.1.3 General Information Search

13.1.4 Exploration of the Database

13.2 The Nature of Search Strategy

13.2.1 Search Objective

13.2.2 General Plan of Operation

13.2.3 The Essential Information Elements of a Search

13.2.4 Specific Plan of Operation

13.3 Types of Strategies

13.3.1 Categorizing by Objective

13.3.2 Categorizing by Plan of Operation

13.4 Tactics

13.4.1 Monitoring Tactics

13.4.2 File Structure Tactics

13.4.3 Search Formulation Tactics

13.4.4 Term Tactics

13.5 Summary

14 The Information Retrieval System Interface

14.1 General Model of Message Flow

14.2 Sources of Ambiguity

14.3 The Role of a Search Intermediary

14.3.1 Establishing the Information Need

14.3.2 Development of a Search Strategy

14.3.3 Translation of the Need Statement into a Query

14.3.4 Interpretation and Evaluation of Output

14.3.5 Search Iteration within the Strategic Plan

14.3.6 Change of Strategy When Necessary

14.3.7 Help in Using an IRS

14.4 Automated Search Mediation

14.4.1 Early Development

14.4.2 Fully Automatic Intermediary Functions

14.4.3 Interactive Intermediary Functions

14.5 The User Interface as a Component of All Systems

14.6 The User Interface in Web Search Engines

15 A Sampling of Information Retrieval Systems

15.1 Introduction

15.2 Dialog

15.2.1 Command Language Using Boolean Logic

15.2.2 Target

15.2.3 DIALOGWeb: A Web Adaptation

15.3 Alta Vista

15.3.1 Default Query Entry Form

15.3.2 Advanced Search Form

15.4 Google

15.4.1 Web Crawler

15.4.2 Searching

15.4.3 Google Advanced Search

15.5 PubMed

15.6 EBSCO Host

15.7 Summary

16 Measurement and Evaluation

16.1 Basics of Measurement

16.1.1 The Data Manager

16.1.2 The Query Manager

16.1.3 The Query Composition Process

16.1.4 Deriving the Information Need

16.1.5 The Database

16.1.6 Users

16.2 Relevance, Value, and Utility

16.2.1 Relevance as Relatedness

16.2.2 Aspects of Value

16.2.3 Relevance as Utility

16.2.4 Retaining Two Separate Relevance Measures

16.2.5 The Relevance Measurement Scale

16.2.6 Taking the Measurements

16.2.7 Questions about Relevance as a Measure

16.3 Measures Based on Relevance

16.3.1 Precision (Pr)

16.3.2 Recall (Re)

16.3.3 Relationship of Recall and Precision

16.3.4 Overall Effectiveness Measures Based on Re and Pr

16.4 Measures of Process

16.4.1 Query Translation

16.4.2 Errors in a Query Statement

16.4.3 Average Time per Command or per User Decision

16.4.4 Elapsed Time of a Search

16.4.5 Number of Commands or Steps in a Search

16.4.6 Cost of a Search

16.4.7 Size of Final Set Formed

16.4.8 Number of Records Reviewed by the User

16.4.9 Patterns of Language Use

16.4.10 Measures of Rank Order

16.5 Measures of Outcome

16.5.1 Precision

16.5.2 Recall

16.5.3 Efficiency

16.5.4 Overall User Evaluation

16.6 Measures of Environment

16.6.1 Database Record Selection

16.6.2 Record Content

16.6.3 Measures of Users

16.7 Conclusion

Bibliography

Index

Charles T. Meadow is professor emeritus, University of Toronto, and has been visiting professor at the Universities of North Carolina and the West Indies. He edited the Journal of the American Society for Information Science and the Canadian Journal of Information Science and was president of the Canadian Association for Information Science. He received the Research Award and shared the Annual Information Science Book Award from ASIS&T.

Bert Boyce has been an Information System Research Analyst for the Information Systems Office at the Library of Congress; a faculty member and acting Dean of the School of Library and Information Science, University of Missouri, Columbia, Missouri; and Dean of the School of Library and Information Science, Louisiana State University, where he is now Professor and Dean Emeritus. He is currently Editor of the Academic Press Library and Information Science Series. He received the ASIS&T Outstanding Information Science Teacher Award in 1989 and has shared the Annual Information Science Book Award from ASIS&T.

Donald Kraft is professor at LSU and Distinguished Visiting Professor at the U.S. Air Force Academy. He is a fellow of IEEE and AAAS and editor of the Journal of the American Society for Information Science and Technology. He received the Research Award and the Watson Davis Award, and shared the Annual Information Science Book Award, from ASIS&T, and received the LSU Distinguished Faculty award.

Carol Barry is associate professor in the School of Library and Information Science, Louisiana State University. She has received the Best JASIS Paper Award, 1995; the LSU Alumni Association Teaching Award, 1995; and the American Society for Information Science Doctoral Forum Award, 1993. She is associate editor of JASIS&T, a member of the Board of ASIS&T, and a member of the LSU Faculty Senate, of which she was vice president in 2000-2001. She has authored or co-authored over 30 research papers.


Permanent link: https://www.kriso.lt/db/97800804690342e.html
