
Drop the support for PyTorch<2.0 #3272

Merged: 5 commits merged into dev on Oct 4, 2023
Conversation

@ordabayevy (Member) commented Oct 4, 2023

This PR drops support for torch<2.0:

  • refactor torch.set_default_tensor_type (deprecated in torch==2.1) into torch.set_default_dtype and torch.set_default_device (see the sketch after this list)
  • remove the workarounds for the max-recursion-depth error in torch<2.0
  • remove the workaround for the _torch_scheduler_base definition
  • pin pillow>=8.3.1, since the issue in 8.3 has been fixed in 8.3.1
  • pin torch>=2.0 and torchvision>=0.15.0
  • deprecate tensors_default_to and replace it with the torch.device context manager
  • skip the cvae example test (Bug in CVAE example #3273)
  • rename pytorch_lightning to lightning.pytorch (the preferred import path)
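A minimal sketch of the refactor in the first bullet (use_cuda here is a stand-in for the examples' args.cuda flag; the exact call sites vary across the PR):

    import torch

    use_cuda = torch.cuda.is_available()  # stand-in for args.cuda

    # Before (deprecated in torch==2.1):
    #     torch.set_default_tensor_type("torch.cuda.FloatTensor")

    # After (torch>=2.0): set the default device and dtype separately.
    if use_cuda:
        torch.set_default_device("cuda")
        torch.set_default_dtype(torch.float32)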

@review-notebook-app commented

Check out this pull request on ReviewNB to see visual diffs and provide feedback on the Jupyter notebooks.

@ordabayevy (Member, Author) commented

FAILED tests/test_examples.py::test_cpu[cvae/main.py --num-quadrant-inputs=1 --num-epochs=1] - subprocess.CalledProcessError: Command '['/opt/hostedtoolcache/Python/3.8.18/x64/bin/python', '/home/runner/work/pyro/pyro/examples/cvae/main.py', '--num-quadrant-inputs=1', '--num-epochs=1']' returned non-zero exit status 1.
= 1 failed, 148 passed, 97 skipped, 26558 deselected, 2 warnings in 1948.89s (0:32:28) =

I can't reproduce this failing test locally, and there isn't enough information in the logs to debug it. Does anyone have suggestions for how to fix it? The changes in this PR should be unrelated to this test.

@fritzo (Member) reviewed and left a comment

Thanks for fixing this! Just a few comments.

@@ -21,7 +21,8 @@ def random_mvn(batch_shape, dim, requires_grad=False):

 def main(args):
     if args.cuda:
-        torch.set_default_tensor_type("torch.cuda.FloatTensor")
+        torch.set_default_device("cuda")
+        torch.set_default_dtype(torch.float32)
@fritzo (Member) commented
Is the set_default_dtype needed here? I see it omitted in other changes.

tests/common.py (outdated)

-    new_module = "torch.cuda" if host == "cuda" else "torch"
-    torch.set_default_tensor_type("{}.{}".format(new_module, name))
+    old_host = torch.Tensor().device
+    torch.set_default_device(host)
@fritzo (Member) commented
nit: Could we move the torch.set_default_device(host) into the try block? I realize this was wrong before your PR, but seems like a good time to fix it.
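A minimal sketch of the suggested fix, assuming the context-manager shape of the helper in tests/common.py (the name default_device is illustrative, not the actual helper's name):

    import torch
    from contextlib import contextmanager

    @contextmanager
    def default_device(host):
        old_host = torch.Tensor().device
        try:
            # inside the try block, so the finally clause always restores
            torch.set_default_device(host)
            yield
        finally:
            torch.set_default_device(old_host)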

@ordabayevy (Member, Author) commented
Do we need this context manager at all? How is it different from with torch.device(device)? https://pytorch.org/docs/stable/generated/torch.set_default_device.html

> To only temporarily change the default device instead of setting it globally, use with torch.device(device): instead.
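For example (torch>=2.0; a minimal sketch of the behavior the linked docs describe, assuming a CUDA build):

    import torch

    with torch.device("cuda"):
        x = torch.ones(3)  # factory calls inside the block allocate on cuda
    y = torch.ones(3)      # back on the previous default device
    print(x.device, y.device)  # cuda:0 cpu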

@fritzo (Member) commented Oct 4, 2023

Oh, it would be great to replace this custom context manager with torch.device, as long as torch.device can be used as a context manager in our earliest supported torch version, 1.11.0.

@fritzo (Member) commented
ok, well let's keep the polyfill until we drop support for torch==1.11.

@ordabayevy (Member, Author) commented Oct 4, 2023

To clarify: this also applies to torch.set_default_device throughout this PR. If we want to keep torch 1.11 support, there will be a lot of if/else branching between torch.set_default_device and torch.set_default_tensor_type to set the device.
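A hypothetical sketch of that branching (not actual Pyro code; set_default_device here is an illustrative polyfill name):

    import torch

    def set_default_device(device):
        if hasattr(torch, "set_default_device"):  # torch>=2.0
            torch.set_default_device(device)
        elif device == "cuda":  # fallback for older torch
            torch.set_default_tensor_type("torch.cuda.FloatTensor")
        else:
            torch.set_default_tensor_type("torch.FloatTensor")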

@ordabayevy (Member, Author) commented

Hmm.. it looks like set_default_device is not exposed or not available even in torch 1.13

https://pytorch.org/docs/1.13/search.html?q=set_default&check_keywords=yes&area=default

@fritzo (Member) commented Oct 4, 2023

Pyro has always aimed at being more stable than torch, and we have historically implemented polyfills in Pyro to smooth over PyTorch's move-fast-and-break-things attitude. If I had time, I'd implement polyfills like a pyro.util.device() context manager, a pyro.util.set_default_device() helper, etc. But I don't have time, and maybe it's time to drop this aim as our maintenance resources dwindle. 🤷

The motivation for being very stable is to avoid giving people headaches. Every time we drop support for some version of some underlying library, some grad student wastes a day trying to install an old repo whose author has graduated and didn't pin versions. Every time we pin a library version, some software engineer at BigCo wastes a week solving dependency issues between conflicting libraries with non-overlapping version pins: spending a day committing to an upstream repo maintained by an overcommitted professor, building polyfills around another dependency (which doesn't explicitly pin versions but actually depends on a version outside our range, which took half a day to figure out), and replacing a third library that is no longer maintained.

If you do decide to drop torch 1.11 support, could you update the version pins everywhere and update the supported Python versions? Then we'll bump the minor version in our next release.

@ordabayevy (Member, Author) commented

> Hmm.. it looks like set_default_device is not exposed or not available even in torch 1.13
>
> https://pytorch.org/docs/1.13/search.html?q=set_default&check_keywords=yes&area=default

I confirmed this by installing it locally:

>>> import torch
>>> torch.__version__
'1.13.0+cu117'
>>> torch.set_default_device
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'torch' has no attribute 'set_default_device'

Can you confirm that it's OK to drop support for all of torch 1.11, 1.12, and 1.13?

@fritzo (Member) commented

Sure, let's just be sure to announce in our release notes and bump the minor version.

Comment on lines 859 to 860:

    if args.double:
        torch.set_default_dtype(torch.float64)
@fritzo (Member) commented

delete these two new lines (they are unnecessary given line 858 above)

@ordabayevy ordabayevy requested a review from fritzo October 4, 2023 17:01
@ordabayevy (Member, Author) commented

@fritzo, addressed your comments and added the description to the PR.

@ordabayevy ordabayevy changed the title torch.set_default_tensor_type has been deprecated in PyTorch 2.1 Drop the support for PyTorch<2.0 Oct 4, 2023
@fritzo fritzo added this to the 1.9 release milestone Oct 4, 2023
@fritzo (Member) reviewed and left a comment

Looks great!

@fritzo merged commit fa73d9c into dev on Oct 4, 2023
9 checks passed
@ordabayevy (Member, Author) commented

Thanks for reviewing, @fritzo!
