Migrate hf trainer #287

Merged
merged 10 commits into from
Dec 5, 2023

Conversation

gkumbhat
Collaborator

Changes

  • Replace custom training loop with HF Trainer

gkumbhat and others added 7 commits November 26, 2023 14:49
Co-authored-by: Alex-Brooks <[email protected]>
Signed-off-by: gkumbhat <[email protected]>
gkumbhat and others added 2 commits November 30, 2023 10:15
Signed-off-by: gkumbhat <[email protected]>
Co-authored-by: Alex-Brooks <[email protected]>
Signed-off-by: gkumbhat <[email protected]>
gkumbhat mentioned this pull request Dec 1, 2023
Collaborator

Ssukriti left a comment

From the changes, it seems fine-tuning was already using Trainer, and you have now added it to prompt tuning as well. I think you meant to reuse some functions, which is why you moved them to toolkit.
Prompt tuning uses launch_training and preprocessing from toolkit, but text_generation_local has its own definitions of both functions, such as _launch_training. The code looks the same in both launch-training functions, so maybe you forgot to change fine-tuning to use toolkit as well?
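
A minimal sketch of the de-duplication raised in this comment, where both training paths delegate to a single shared helper instead of each module keeping its own copy. Everything here is an assumption for illustration: `RecordingTrainer` is a stand-in and not the Hugging Face `Trainer`, and the `launch_training` signature, the `trainer_factory` parameter, and the two `train_*` wrappers are hypothetical and are not taken from caikit-nlp.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Protocol


class TrainerLike(Protocol):
    """The only surface this sketch relies on: an object with train()."""

    def train(self) -> Any: ...


@dataclass
class RecordingTrainer:
    """Stand-in trainer for illustration only (hypothetical, not the HF class)."""

    model: str
    train_dataset: List[str]
    training_args: Dict[str, Any] = field(default_factory=dict)
    trained: bool = False

    def train(self) -> str:
        # A real trainer would run the optimization loop here.
        self.trained = True
        return f"trained {self.model} on {len(self.train_dataset)} samples"


def launch_training(
    model: str,
    train_dataset: List[str],
    training_args: Dict[str, Any],
    trainer_factory: Callable[..., TrainerLike],
) -> TrainerLike:
    """Hypothetical shared helper: build a trainer, run it, return it."""
    trainer = trainer_factory(
        model=model, train_dataset=train_dataset, training_args=training_args
    )
    trainer.train()
    return trainer


def train_prompt_tuning(dataset: List[str]) -> TrainerLike:
    # Prompt-tuning path: only the arguments differ, the launch logic is shared.
    return launch_training("pt-model", dataset, {"num_train_epochs": 1}, RecordingTrainer)


def train_fine_tuning(dataset: List[str]) -> TrainerLike:
    # Fine-tuning path: calls the same helper instead of a private copy.
    return launch_training("ft-model", dataset, {"num_train_epochs": 3}, RecordingTrainer)
```

With this shape, a later change to how training is launched would only need to be made in one place; whether that matches the actual toolkit layout is not confirmed by this thread.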

Collaborator Author

gkumbhat commented Dec 4, 2023

@Ssukriti, I did this on purpose to avoid making too many refactors in one PR. I added a comment about that: https://github.com/caikit/caikit-nlp/pull/287/files#diff-3ca8e28141febc0ff5a812a8b7a2f92997f0098406b924bd4bdbcab680813cc3R73

Collaborator

Ssukriti left a comment

I tested PT (prompt tuning) before and after the change with a Llama model on some quick text samples, and the results are the same.

Ssukriti merged commit bc595c6 into caikit:main Dec 5, 2023
gkumbhat deleted the migrate_hf_trainer branch December 5, 2023 22:44
Ssukriti restored the migrate_hf_trainer branch December 13, 2023 00:27