E-book: Natural Language Processing with Transformers

4.45/5 (99 ratings by Goodreads)
  • Format: 408 pages
  • Publication date: 26-Jan-2022
  • Publisher: O'Reilly Media
  • Language: English
  • ISBN-13: 9781098103217

DRM restrictions

  • Copying: not allowed
  • Printing: not allowed

  • E-book usage:

    Digital rights management (DRM)
    The publisher has supplied this book in encrypted form, which means that free software must be installed to unlock and read it. To read this e-book, you need to create an Adobe ID. The e-book can be downloaded to up to 6 devices (one user with the same Adobe ID).

    Required software
    To read this e-book on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android).

    To read this e-book on a PC or Mac, you need Adobe Digital Editions, a free application designed specifically for e-books. It is not the same as Adobe Reader, which you probably already have on your computer.

    You cannot read this e-book on an Amazon Kindle.

Since their introduction in 2017, transformers have quickly become the dominant architecture for achieving state-of-the-art results on a variety of natural language processing tasks. If you're a data scientist or coder, this practical book shows you how to train and scale these large models using Hugging Face Transformers, a Python-based deep learning library.

Transformers have been used to write realistic news stories, improve Google Search queries, and even create chatbots that tell corny jokes. In this guide, authors Lewis Tunstall, Leandro von Werra, and Thomas Wolf, among the creators of Hugging Face Transformers, use a hands-on approach to teach you how transformers work and how to integrate them in your applications. You'll quickly learn a variety of tasks they can help you solve.

  • Build, debug, and optimize transformer models for core NLP tasks, such as text classification, named entity recognition, and question answering (see the short example after this list)
  • Learn how transformers can be used for cross-lingual transfer learning
  • Apply transformers in real-world scenarios where labeled data is scarce
  • Make transformer models efficient for deployment using techniques such as distillation, pruning, and quantization
  • Train transformers from scratch and learn how to scale to multiple GPUs and distributed environments
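
The library at the center of the book, Hugging Face Transformers, exposes most of these tasks through a single pipeline API. As a taste of the hands-on style, the snippet below is a minimal sketch of the text-classification case from the first bullet: it assumes the transformers package and a PyTorch backend are installed, and it relies on the library's default checkpoint for the task (downloaded from the Hugging Face Hub on first use), so treat it as an illustration rather than the book's exact code.

    # Minimal sketch: text classification with the Hugging Face pipeline API.
    # Assumes `pip install transformers torch`; the default checkpoint for the
    # task is chosen by the library and downloaded on first use.
    from transformers import pipeline

    classifier = pipeline("text-classification")
    print(classifier("Transformers make state-of-the-art NLP surprisingly accessible."))
    # e.g. [{'label': 'POSITIVE', 'score': 0.99}]
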
Table of contents:

Foreword
Preface
1 Hello Transformers
  • The Encoder-Decoder Framework
  • Attention Mechanisms
  • Transfer Learning in NLP
  • Hugging Face Transformers: Bridging the Gap
  • A Tour of Transformer Applications
  • Text Classification
  • Named Entity Recognition
  • Question Answering
  • Summarization
  • Translation
  • Text Generation
  • The Hugging Face Ecosystem
  • The Hugging Face Hub
  • Hugging Face Tokenizers
  • Hugging Face Datasets
  • Hugging Face Accelerate
  • Main Challenges with Transformers
  • Conclusion
2 Text Classification
  • The Dataset
  • A First Look at Hugging Face Datasets
  • From Datasets to DataFrames
  • Looking at the Class Distribution
  • How Long Are Our Tweets?
  • From Text to Tokens
  • Character Tokenization
  • Word Tokenization
  • Subword Tokenization
  • Tokenizing the Whole Dataset
  • Training a Text Classifier
  • Transformers as Feature Extractors
  • Fine-Tuning Transformers
  • Conclusion
3 Transformer Anatomy
  • The Transformer Architecture
  • The Encoder
  • Self-Attention
  • The Feed-Forward Layer
  • Adding Layer Normalization
  • Positional Embeddings
  • Adding a Classification Head
  • The Decoder
  • Meet the Transformers
  • The Transformer Tree of Life
  • The Encoder Branch
  • The Decoder Branch
  • The Encoder-Decoder Branch
  • Conclusion
4 Multilingual Named Entity Recognition
  • The Dataset
  • Multilingual Transformers
  • A Closer Look at Tokenization
  • The Tokenizer Pipeline
  • The SentencePiece Tokenizer
  • Transformers for Named Entity Recognition
  • The Anatomy of the Transformers Model Class
  • Bodies and Heads
  • Creating a Custom Model for Token Classification
  • Loading a Custom Model
  • Tokenizing Texts for NER
  • Performance Measures
  • Fine-Tuning XLM-RoBERTa
  • Error Analysis
  • Cross-Lingual Transfer
  • When Does Zero-Shot Transfer Make Sense?
  • Fine-Tuning on Multiple Languages at Once
  • Interacting with Model Widgets
  • Conclusion
5 Text Generation
  • The Challenge with Generating Coherent Text
  • Greedy Search Decoding
  • Beam Search Decoding
  • Sampling Methods
  • Top-k and Nucleus Sampling
  • Which Decoding Method Is Best?
  • Conclusion
6 Summarization
  • The CNN/DailyMail Dataset
  • Text Summarization Pipelines
  • Summarization Baseline
  • GPT-2
  • T5
  • BART
  • PEGASUS
  • Comparing Different Summaries
  • Measuring the Quality of Generated Text
  • BLEU
  • ROUGE
  • Evaluating PEGASUS on the CNN/DailyMail Dataset
  • Training a Summarization Model
  • Evaluating PEGASUS on SAMSum
  • Fine-Tuning PEGASUS
  • Generating Dialogue Summaries
  • Conclusion
7 Question Answering
  • Building a Review-Based QA System
  • The Dataset
  • Extracting Answers from Text
  • Using Haystack to Build a QA Pipeline
  • Improving Our QA Pipeline
  • Evaluating the Retriever
  • Evaluating the Reader
  • Domain Adaptation
  • Evaluating the Whole QA Pipeline
  • Going Beyond Extractive QA
  • Conclusion
8 Making Transformers Efficient in Production
  • Intent Detection as a Case Study
  • Creating a Performance Benchmark
  • Making Models Smaller via Knowledge Distillation
  • Knowledge Distillation for Fine-Tuning
  • Knowledge Distillation for Pretraining
  • Creating a Knowledge Distillation Trainer
  • Choosing a Good Student Initialization
  • Finding Good Hyperparameters with Optuna
  • Benchmarking Our Distilled Model
  • Making Models Faster with Quantization
  • Benchmarking Our Quantized Model
  • Optimizing Inference with ONNX and the ONNX Runtime
  • Making Models Sparser with Weight Pruning
  • Sparsity in Deep Neural Networks
  • Weight Pruning Methods
  • Conclusion
9 Dealing with Few to No Labels
  • Building a GitHub Issues Tagger
  • Getting the Data
  • Preparing the Data
  • Creating Training Sets
  • Creating Training Slices
  • Implementing a Naive Bayesline
  • Working with No Labeled Data
  • Working with a Few Labels
  • Data Augmentation
  • Using Embeddings as a Lookup Table
  • Fine-Tuning a Vanilla Transformer
  • In-Context and Few-Shot Learning with Prompts
  • Leveraging Unlabeled Data
  • Fine-Tuning a Language Model
  • Fine-Tuning a Classifier
  • Advanced Methods
  • Conclusion
10 Training Transformers from Scratch
  • Large Datasets and Where to Find Them
  • Challenges of Building a Large-Scale Corpus
  • Building a Custom Code Dataset
  • Working with Large Datasets
  • Adding Datasets to the Hugging Face Hub
  • Building a Tokenizer
  • The Tokenizer Model
  • Measuring Tokenizer Performance
  • A Tokenizer for Python
  • Training a Tokenizer
  • Saving a Custom Tokenizer on the Hub
  • Training a Model from Scratch
  • A Tale of Pretraining Objectives
  • Initializing the Model
  • Implementing the Dataloader
  • Defining the Training Loop
  • The Training Run
  • Results and Analysis
  • Conclusion
11 Future Directions
  • Scaling Transformers
  • Scaling Laws
  • Challenges with Scaling
  • Attention Please!
  • Sparse Attention
  • Linearized Attention
  • Going Beyond Text
  • Vision
  • Tables
  • Multimodal Transformers
  • Speech-to-Text
  • Vision and Text
  • Where to from Here?
Index

Lewis Tunstall is a machine learning engineer at Hugging Face. His current work focuses on developing tools for the NLP community and teaching people to use them effectively. Leandro von Werra is a machine learning engineer in the open source team at Hugging Face, where he primarily works on code generation models and community outreach. Thomas Wolf is chief science officer and cofounder of Hugging Face. His team is on a mission to catalyze and democratize NLP research.