v0.6.0
DDPO for diffusion models
We are excited to welcome the first RLHF algorithm for diffusion models, which refines the generations of a diffusion model.
Read more about it directly in the docs.
Figure: sample generations before and after DDPO finetuning.
- Denoising Diffusion Policy Optimization by @metric-space in #508
Bug fixes and other enhancements
The release also comes with multiple bug fixes reported and/or led by the community. Check out the commit history below.
What's Changed
- Release: v0.5.0 by @younesbelkada in #607
- Set dev version by @younesbelkada in #608
- [`Modeling`] Add token support for `hf_hub_download` by @younesbelkada in #604
- Add docs explaining logged metrics by @vwxyzjn in #616
- [DPO] stack-llama-2 training scripts by @kashif in #611
- Use log_with argument in SFT example by @hitorilabs in #620
- Allow already tokenized sequences for `response_template` in `DataCollatorForCompletionOnlyLM` by @ivsanro1 in #622
- Improve docs by @lvwerra in #612
- Move repo by @lvwerra in #628
- Add score scaling/normalization/clipping by @zfang in #560
- Disable dropout in DPO Training by @NouamaneTazi in #639
- Add checks on backward batch size by @vwxyzjn in #651
- Resolve various typos throughout the docs by @tomaarsen in #654
- Update README.md by @Santosh-Gupta in #657
- Allow for ref_model=None in DPOTrainer by @vincentmin in #640
- Add more args to SFT example by @photomz in #642
- Handle potentially long sequences with DataCollatorForCompletionOnlyLM by @tannonk in #644
- [`sft_llama2`] Add check of arguments by @younesbelkada in #660
- Fix DPO blogpost thumbnail by @lvwerra in #673
- propagating eval_batch_size to TrainingArguments by @rahuljha in #675
- [`CI`] Fix unmutable `TrainingArguments` issue by @younesbelkada in #676
- Update sft_llama2.py by @msaad02 in #678
- fix PeftConfig loading from a remote repo. by @w32zhong in #649
- Simplify immutable TrainingArgs fix using `dataclasses.replace` by @tomaarsen in #682
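The last item above mentions `dataclasses.replace`. For readers unfamiliar with it, here is a minimal sketch of the general pattern. `ToyTrainingArgs` and `with_overrides` are hypothetical names made up for illustration. This is a toy frozen dataclass, not the real `TrainingArguments`, and it is not necessarily how #682 applies the fix.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class ToyTrainingArgs:
    """Hypothetical stand-in for an immutable arguments object."""

    output_dir: str
    per_device_eval_batch_size: int = 8
    remove_unused_columns: bool = True


def with_overrides(args, **overrides):
    """Return a copy of `args` with the given fields changed.

    dataclasses.replace builds a new instance instead of mutating
    the original, so it also works on frozen dataclasses.
    """
    return dataclasses.replace(args, **overrides)


original = ToyTrainingArgs(output_dir="out")
updated = with_overrides(original, remove_unused_columns=False)
```

Because `replace` returns a new object, `original` is left unchanged, and any code holding a reference to it keeps seeing the old values.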
New Contributors
- @hitorilabs made their first contribution in #620
- @ivsanro1 made their first contribution in #622
- @zfang made their first contribution in #560
- @NouamaneTazi made their first contribution in #639
- @Santosh-Gupta made their first contribution in #657
- @vincentmin made their first contribution in #640
- @photomz made their first contribution in #642
- @tannonk made their first contribution in #644
- @rahuljha made their first contribution in #675
- @msaad02 made their first contribution in #678
- @w32zhong made their first contribution in #649
Full Changelog: v0.5.0...v0.6.0