Table of Contents

Page numbers refer to the PDF version.

Preface ix
1. Introduction to Building AI Applications with Foundation Models 1
The Rise of AI Engineering 2
- From Language Models to Large Language Models 2
- From Large Language Models to Foundation Models 8
- From Foundation Models to AI Engineering 12
Foundation Model Use Cases 16
- Coding 20
- Image and Video Production 22
- Writing 22
- Education 24
- Conversational Bots 26
- Information Aggregation 26
- Data Organization 27
- Workflow Automation 28
Planning AI Applications 28
- Use Case Evaluation 29
- Setting Expectations 32
- Milestone Planning 33
- Maintenance 34
The AI Engineering Stack 35
- Three Layers of the AI Stack 37
- AI Engineering Versus ML Engineering 39
- AI Engineering Versus Full-Stack Engineering 46
Summary 47
2. Understanding Foundation Models 49
Training Data 50
- Multilingual Models 51
- Domain-Specific Models 56
Modeling 58
- Model Architecture 58
- Model Size 67
Post-Training 78
- Supervised Finetuning 80
- Preference Finetuning 83
Sampling 88
- Sampling Fundamentals 88
- Sampling Strategies 90
- Test Time Compute 96
- Structured Outputs 99
- The Probabilistic Nature of AI 105
Summary 111
3. Evaluation Methodology 113
Challenges of Evaluating Foundation Models 114
Understanding Language Modeling Metrics 118
- Entropy 119
- Cross Entropy 120
- Bits-per-Character and Bits-per-Byte 121
- Perplexity 121
- Perplexity Interpretation and Use Cases 122
Exact Evaluation 125
- Functional Correctness 126
- Similarity Measurements Against Reference Data 127
- Introduction to Embedding 134
AI as a Judge 136
- Why AI as a Judge? 137
- How to Use AI as a Judge 138
- Limitations of AI as a Judge 141
- What Models Can Act as Judges? 145
Ranking Models with Comparative Evaluation 148
- Challenges of Comparative Evaluation 152
- The Future of Comparative Evaluation 155
Summary 156
4. Evaluate AI Systems 159
Evaluation Criteria 160
- Domain-Specific Capability 161
- Generation Capability 163
- Instruction-Following Capability 172
- Cost and Latency 177
Model Selection 179
- Model Selection Workflow 179
- Model Build Versus Buy 181
- Navigate Public Benchmarks 191
Design Your Evaluation Pipeline 200
- Step 1. Evaluate All Components in a System 200
- Step 2. Create an Evaluation Guideline 202
- Step 3. Define Evaluation Methods and Data 204
Summary 208
5. Prompt Engineering 211
Introduction to Prompting 212
- In-Context Learning: Zero-Shot and Few-Shot 213
- System Prompt and User Prompt 215
- Context Length and Context Efficiency 218
Prompt Engineering Best Practices 220
- Write Clear and Explicit Instructions 220
- Provide Sufficient Context 223
- Break Complex Tasks into Simpler Subtasks 224
- Give the Model Time to Think 227
- Iterate on Your Prompts 229
- Evaluate Prompt Engineering Tools 230
- Organize and Version Prompts 233
Defensive Prompt Engineering 235
- Proprietary Prompts and Reverse Prompt Engineering 236
- Jailbreaking and Prompt Injection 238
- Information Extraction 243
- Defenses Against Prompt Attacks 248
Summary 251
6. RAG and Agents 253
RAG 253
- RAG Architecture 256
- Retrieval Algorithms 257
- Retrieval Optimization 268
- RAG Beyond Texts 273
Agents 275
- Agent Overview 276
- Tools 278
- Planning 281
- Agent Failure Modes and Evaluation 298
Memory 300
Summary 305
7. Finetuning 307
Finetuning Overview 308
When to Finetune 311
- Reasons to Finetune 311
- Reasons Not to Finetune 312
- Finetuning and RAG 316
Memory Bottlenecks 319
- Backpropagation and Trainable Parameters 320
- Memory Math 322
- Numerical Representations 325
- Quantization 328
Finetuning Techniques 332
- Parameter-Efficient Finetuning 333
- Model Merging and Multi-Task Finetuning 347
- Finetuning Tactics 357
Summary 361
8. Dataset Engineering 363
Data Curation 365
- Data Quality 368
- Data Coverage 370
- Data Quantity 372
- Data Acquisition and Annotation 377
Data Augmentation and Synthesis 380
- Why Data Synthesis 381
- Traditional Data Synthesis Techniques 383
- AI-Powered Data Synthesis 386
- Model Distillation 395
Data Processing 396
- Inspect Data 397
- Deduplicate Data 399
- Clean and Filter Data 401
- Format Data 401
Summary 403
9. Inference Optimization 405
Understanding Inference Optimization 406
- Inference Overview 406
- Inference Performance Metrics 412
- AI Accelerators 419
Inference Optimization 426
- Model Optimization 426
- Inference Service Optimization 440
Summary 447
10. AI Engineering Architecture and User Feedback 449
AI Engineering Architecture 449
- Step 1. Enhance Context 450
- Step 2. Put in Guardrails 451
- Step 3. Add Model Router and Gateway 456
- Step 4. Reduce Latency with Caches 460
- Step 5. Add Agent Patterns 463
- Monitoring and Observability 465
- AI Pipeline Orchestration 472
User Feedback 474
- Extracting Conversational Feedback 475
- Feedback Design 480
- Feedback Limitations 490
Summary 492
Epilogue 495
Index 497