Table of Contents

Page numbers refer to the PDF version.

Preface ix
1. Introduction to Building AI Applications with Foundation Models 1
The Rise of AI Engineering 2
- From Language Models to Large Language Models 2
- From Large Language Models to Foundation Models 8
- From Foundation Models to AI Engineering 12
Foundation Model Use Cases 16
- Coding 20
- Image and Video Production 22
- Writing 22
- Education 24
- Conversational Bots 26
- Information Aggregation 26
- Data Organization 27
- Workflow Automation 28
Planning AI Applications 28
- Use Case Evaluation 29
- Setting Expectations 32
- Milestone Planning 33
- Maintenance 34
The AI Engineering Stack 35
- Three Layers of the AI Stack 37
- AI Engineering Versus ML Engineering 39
- AI Engineering Versus Full-Stack Engineering 46
Summary 47
2. Understanding Foundation Models 49
Training Data 50
- Multilingual Models 51
- Domain-Specific Models 56
Modeling 58
- Model Architecture 58
- Model Size 67
Post-Training 78
- Supervised Finetuning 80
- Preference Finetuning 83
Sampling 88
- Sampling Fundamentals 88
- Sampling Strategies 90
- Test Time Compute 96
- Structured Outputs 99
- The Probabilistic Nature of AI 105
Summary 111
3. Evaluation Methodology 113
Challenges of Evaluating Foundation Models 114
Understanding Language Modeling Metrics 118
- Entropy 119
- Cross Entropy 120
- Bits-per-Character and Bits-per-Byte 121
- Perplexity 121
- Perplexity Interpretation and Use Cases 122
Exact Evaluation 125
- Functional Correctness 126
- Similarity Measurements Against Reference Data 127
- Introduction to Embedding 134
AI as a Judge 136
- Why AI as a Judge? 137
- How to Use AI as a Judge 138
- Limitations of AI as a Judge 141
- What Models Can Act as Judges? 145
Ranking Models with Comparative Evaluation 148
- Challenges of Comparative Evaluation 152
- The Future of Comparative Evaluation 155
Summary 156
4. Evaluate AI Systems 159
Evaluation Criteria 160
- Domain-Specific Capability 161
- Generation Capability 163
- Instruction-Following Capability 172
- Cost and Latency 177
Model Selection 179
- Model Selection Workflow 179
- Model Build Versus Buy 181
- Navigate Public Benchmarks 191
Design Your Evaluation Pipeline 200
- Step 1. Evaluate All Components in a System 200
- Step 2. Create an Evaluation Guideline 202
- Step 3. Define Evaluation Methods and Data 204
Summary 208
5. Prompt Engineering 211
Introduction to Prompting 212
- In-Context Learning: Zero-Shot and Few-Shot 213
- System Prompt and User Prompt 215
- Context Length and Context Efficiency 218
Prompt Engineering Best Practices 220
- Write Clear and Explicit Instructions 220
- Provide Sufficient Context 223
- Break Complex Tasks into Simpler Subtasks 224
- Give the Model Time to Think 227
- Iterate on Your Prompts 229
- Evaluate Prompt Engineering Tools 230
- Organize and Version Prompts 233
Defensive Prompt Engineering 235
- Proprietary Prompts and Reverse Prompt Engineering 236
- Jailbreaking and Prompt Injection 238
- Information Extraction 243
- Defenses Against Prompt Attacks 248
Summary 251
6. RAG and Agents 253
RAG 253
- RAG Architecture 256
- Retrieval Algorithms 257
- Retrieval Optimization 268
- RAG Beyond Texts 273
Agents 275
- Agent Overview 276
- Tools 278
- Planning 281
- Agent Failure Modes and Evaluation 298
Memory 300
Summary 305
7. Finetuning 307
Finetuning Overview 308
When to Finetune 311
- Reasons to Finetune 311
- Reasons Not to Finetune 312
- Finetuning and RAG 316
Memory Bottlenecks 319
- Backpropagation and Trainable Parameters 320
- Memory Math 322
- Numerical Representations 325
- Quantization 328
Finetuning Techniques 332
- Parameter-Efficient Finetuning 333
- Model Merging and Multi-Task Finetuning 347
- Finetuning Tactics 357
Summary 361
8. Dataset Engineering 363
Data Curation 365
- Data Quality 368
- Data Coverage 370
- Data Quantity 372
- Data Acquisition and Annotation 377
Data Augmentation and Synthesis 380
- Why Data Synthesis 381
- Traditional Data Synthesis Techniques 383
- AI-Powered Data Synthesis 386
- Model Distillation 395
Data Processing 396
- Inspect Data 397
- Deduplicate Data 399
- Clean and Filter Data 401
- Format Data 401
Summary 403
9. Inference Optimization 405
Understanding Inference Optimization 406
- Inference Overview 406
- Inference Performance Metrics 412
- AI Accelerators 419
Inference Optimization 426
- Model Optimization 426
- Inference Service Optimization 440
Summary 447
10. AI Engineering Architecture and User Feedback 449
AI Engineering Architecture 449
- Step 1. Enhance Context 450
- Step 2. Put in Guardrails 451
- Step 3. Add Model Router and Gateway 456
- Step 4. Reduce Latency with Caches 460
- Step 5. Add Agent Patterns 463
- Monitoring and Observability 465
- AI Pipeline Orchestration 472
User Feedback 474
- Extracting Conversational Feedback 475
- Feedback Design 480
- Feedback Limitations 490
Summary 492
Epilogue 495
Index 497