OOM error when using "sequential" chain method on a GPU #899
Comments
Hi @fehiepsi I did some testing with the new master branch.
Self-contained script for reproducing the OOM error
Thanks, @PaoloRanzi81! I did see memory is increasing with
Congratulations @fehiepsi: you have caught the bug with 1 GPU! I have tested my custom toy model and the RAM OOM error does disappear setting

It is still not clear to me why such a small change made such a major improvement in RAM consumption... I thought that setting the progress_bar was just aesthetics...

Please remember to change the code/documentation etc. to make other people aware of it. This way we avoid them wasting their time troubleshooting it.

I do still have the OOM error with my actual model. Using

You can close the issue. Thanks again for your help so far!
Thanks, @PaoloRanzi81, for your effort to isolate the issue! We'll improve documentation to reflect this issue better.
As reported by @PaoloRanzi81 in #539, memory might not be freed after a chain finishes its run, which leads to OOM errors for complicated models that take hours to run.
Some observations so far:
progress_bar=False
In progress of testing
progress_bar=True