Fix: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. #1237

liuzhaoze · 2025-01-19T05:04:38Z

When I trained the agent on a Mac, the following error occurred:

Epoch #1:   0%|                                                                                                                                 | 0/10000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/Users/rocco/Documents/code/faas-resource-drl/run.py", line 290, in <module>
    result, ma_policy = train_agent(args)
                        ^^^^^^^^^^^^^^^^^
  File "/Users/rocco/Documents/code/faas-resource-drl/run.py", line 266, in train_agent
    ).run()
      ^^^^^
  File "/Users/rocco/Documents/code/tianshou-dev/tianshou/trainer/base.py", line 629, in run
    deque(self, maxlen=0)  # feed the entire iterator into a zero-length deque
    ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rocco/Documents/code/tianshou-dev/tianshou/trainer/base.py", line 334, in __next__
    train_stat, update_stat, self.stop_fn_flag = self.training_step()
                                                 ^^^^^^^^^^^^^^^^^^^^
  File "/Users/rocco/Documents/code/tianshou-dev/tianshou/trainer/base.py", line 483, in training_step
    training_stats = self.policy_update_fn(collect_stats)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rocco/Documents/code/tianshou-dev/tianshou/trainer/base.py", line 715, in policy_update_fn
    update_stat = self._sample_and_update(self.train_collector.buffer)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rocco/Documents/code/tianshou-dev/tianshou/trainer/base.py", line 651, in _sample_and_update
    update_stat = self.policy.update(sample_size=self.batch_size, buffer=buffer)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rocco/Documents/code/tianshou-dev/tianshou/policy/base.py", line 545, in update
    batch = self.process_fn(batch, buffer, indices)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rocco/Documents/code/tianshou-dev/tianshou/policy/multiagent/mapolicy.py", line 158, in process_fn
    results[agent] = policy.process_fn(tmp_batch, buffer, tmp_indice)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rocco/Documents/code/tianshou-dev/tianshou/policy/modelfree/dqn.py", line 148, in process_fn
    return self.compute_nstep_return(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rocco/Documents/code/tianshou-dev/tianshou/policy/base.py", line 715, in compute_nstep_return
    batch.returns = to_torch_as(n_step_return_IA, target_q_torch_IA)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rocco/Documents/code/tianshou-dev/tianshou/data/utils/converter.py", line 75, in to_torch_as
    return to_torch(x, dtype=y.dtype, device=y.device)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rocco/Documents/code/tianshou-dev/tianshou/data/utils/converter.py", line 48, in to_torch
    x = torch.from_numpy(x).to(device)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.

I printed some debugging information in converter.py:

if isinstance(x, np.ndarray) and issubclass(
    x.dtype.type,
    np.bool_ | np.number,
):  # most often case
    print(tmp := torch.from_numpy(x), tmp.dtype, tmp.device, dtype)
    x = torch.from_numpy(x).to(device)
    if dtype is not None:
        x = x.type(dtype)
    return x

The last output printed before the error is:

tensor([[-4.4213, -4.2664, -4.1114,  ...,  3.0171,  3.1721,  3.3271],
        [-9.0000, -8.6400, -8.2800,  ...,  8.2800,  8.6400,  9.0000],
        [-4.4213, -4.2664, -4.1114,  ...,  3.0171,  3.1721,  3.3271],
        ...,
        [-6.1470, -5.9108, -5.6746,  ...,  5.1905,  5.4267,  5.6628],
        [-4.4213, -4.2664, -4.1114,  ...,  3.0171,  3.1721,  3.3271],
        [-4.4213, -4.2664, -4.1114,  ...,  3.0171,  3.1721,  3.3271]],
       dtype=torch.float64) torch.float64 cpu torch.float32

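The tensor is float64 on the CPU while the requested dtype is float32, and the current code moves it to the device before casting. Therefore, I think the dtype should be converted before the tensor is moved to the device, as in the second most common case: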

if isinstance(x, torch.Tensor):  # second often case
    if dtype is not None:
        x = x.type(dtype)
    return x.to(device)
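The reordering proposed above can be sketched as follows. This is a minimal illustration, not the actual patch from this PR: the helper name `ndarray_to_torch` is hypothetical, and the device falls back to the CPU when MPS is not available so that the snippet runs anywhere.

```python
import numpy as np
import torch


def ndarray_to_torch(x, dtype=None, device="cpu"):
    """Hypothetical helper: cast on the CPU first, then move to the device."""
    t = torch.from_numpy(x)  # shares memory with x and stays on the CPU
    if dtype is not None:
        t = t.type(dtype)  # e.g. float64 -> float32 while still on the CPU
    return t.to(device)  # the device never sees the original float64 data


arr = np.linspace(-1.0, 1.0, 5)  # NumPy creates float64 by default
# Use MPS only when it is actually available; otherwise fall back to the CPU.
device = "mps" if torch.backends.mps.is_available() else "cpu"
out = ndarray_to_torch(arr, dtype=torch.float32, device=device)
```

With this order, the float64 array is never handed to the MPS backend, which is the step that raised the TypeError in the traceback.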

MischaPanch · 2025-01-23T14:03:55Z

Looks good, but for some reason the test on Mac is failing now. I can't see how the failure is related to this change; maybe it's a fluke or a mistake in the test itself. I'll look into it and get back to you.

Did changing the type prior to moving the tensor to the device solve your issue, or have you not tried with the adjusted code yet?

liuzhaoze · 2025-01-24T13:48:21Z

The adjusted code solved my issue. I also tested a small demo:

Python 3.11.11 (main, Dec 11 2024, 10:25:04) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'2.1.1'
>>> a = torch.rand(2, dtype=torch.float64)
>>> a.device
device(type='cpu')
>>> b = a.to('mps')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
>>> c = a.type(torch.float32)
>>> b = c.to('mps')
>>> b.device
device(type='mps', index=0)

It seems that MPS does not accept float64 tensors, so the dtype must be changed (for example, to float32) before the tensor is moved to MPS.

The same thing happens with the latest version of PyTorch (2.5.1):

Python 3.11.11 (main, Dec 11 2024, 10:25:04) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'2.5.1'
>>> a = torch.rand(2, dtype=torch.float64)
>>> a.device
device(type='cpu')
>>> b = a.to('mps')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
>>> b = a.type(torch.float32).to('mps')
>>> b
tensor([0.6099, 0.9129], device='mps:0')
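For code that does not receive an explicit target dtype, the same idea from the REPL sessions above could be wrapped in a small guard. The sketch below is only an assumption about how one might do this, not something this PR adds: `move_to_device` is a hypothetical name, and it downcasts only when the target is MPS and the tensor is float64.

```python
import torch


def move_to_device(t: torch.Tensor, device) -> torch.Tensor:
    """Hypothetical guard: downcast float64 to float32 only for MPS targets."""
    device = torch.device(device)
    if device.type == "mps" and t.dtype == torch.float64:
        t = t.to(torch.float32)  # cast while the data is still on its source device
    return t.to(device)


a = torch.rand(2, dtype=torch.float64)
# Fall back to the CPU when MPS is not available on this machine.
device = "mps" if torch.backends.mps.is_available() else "cpu"
b = move_to_device(a, device)
```

On other devices the tensor keeps its float64 dtype, so the precision loss is limited to the backend that cannot represent it.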

Commit c04cd1b: Fix: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64.

MischaPanch approved these changes Jan 26, 2025

MischaPanch merged commit 0a79016 into thu-ml:master Jan 26, 2025
4 checks passed
