TTNN Tensor Unpad support is needed for MultiDeviceHostStorage #16870

Open
wooseokTT opened this issue Jan 17, 2025 · 4 comments

@wooseokTT

While running the tt-mlir runtime ToLayout op on multi-device, the following failure occurs, mainly because unpad does not support MultiDeviceHostStorage.

} else if constexpr (std::is_same_v<StorageType, MultiDeviceHostStorage>) {

2025-01-17 19:52:45,122 - ERROR - ERROR: test=test_ttnn.ttnn experienced an error with exception=TT_THROW @ /proj_sw/user_dev/wooseoklee/tt-mlir/third_party/tt-metal/src/tt-metal/ttnn/cpp/ttnn/tensor/tensor_impl.cpp:1229: tt::exception
info:
Device storage isn't supported
backtrace:
--- /opt/ttmlir-toolchain/venv/lib/python3.10/site-packages/ttrt/runtime/_C.cpython-310-x86_64-linux-gnu.so(+0xca018) [0x7f059a8fb018]
--- /opt/ttmlir-toolchain/venv/lib/python3.10/site-packages/ttrt/runtime/_ttnn.so(+0x2733310) [0x7f05993d5310]
--- tt::tt_metal::Tensor tt::tt_metal::tensor_impl::unpad(tt::tt_metal::Tensor const&, tt::tt_metal::SimpleShape const&, tt::tt_metal::SimpleShape const&)
--- /opt/ttmlir-toolchain/venv/lib/python3.10/site-packages/ttrt/runtime/_ttnn.so(ZN2tt8tt_metal11tensor_impl8dispatchIZNS1_13unpad_wrapperIJRKNS0_6TensorERKNS0_11SimpleShapeES9_EEEDaDpOT_EUlTyDpOT0_E_JS6_S9_S9_EEEDaNS0_8DataTypeEOT_SF+0x84) [0x7f05993e1a64]
--- tt::tt_metal::tensor_ops::tensor_unpad(tt::tt_metal::Tensor const&, tt::tt_metal::SimpleShape const&, tt::tt_metal::SimpleShape const&)
--- tt::tt_metal::tensor_ops::tensor_unpad_from_tile(tt::tt_metal::Tensor const&, tt::tt_metal::SimpleShape const&)
--- tt::tt_metal::Tensor::unpad_from_tile(tt::tt_metal::SimpleShape const&) const
--- tt::tt_metal::Tensor ttnn::operations::core::detail::to_layout_impl<tt::tt_metal::v0::IDevice>(tt::tt_metal::Tensor const&, tt::tt_metal::Layout, std::optional<tt::tt_metal::DataType> const&, std::optional<tt::tt_metal::MemoryConfig> const&, tt::tt_metal::v0::IDevice*)
--- ttnn::operations::core::ToLayout::invoke(tt::tt_metal::Tensor const&, tt::tt_metal::Layout, std::optional<tt::tt_metal::DataType> const&, std::optional<tt::tt_metal::MemoryConfig> const&, tt::tt_metal::v0::IDevice*)
--- /opt/ttmlir-toolchain/venv/lib/python3.10/site-packages/ttrt/runtime/_C.cpython-310-x86_64-linux-gnu.so(+0x1fa895) [0x7f059aa2b895]
--- /opt/ttmlir-toolchain/venv/lib/python3.10/site-packages/ttrt/runtime/_C.cpython-310-x86_64-linux-gnu.so(+0x1f8326) [0x7f059aa29326]
--- /opt/ttmlir-toolchain/venv/lib/python3.10/site-packages/ttrt/runtime/_C.cpython-310-x86_64-linux-gnu.so(+0x1f7d14) [0x7f059aa28d14]
--- /opt/ttmlir-toolchain/venv/lib/python3.10/site-packages/ttrt/runtime/_C.cpython-310-x86_64-linux-gnu.so(+0x1f288b) [0x7f059aa2388b]
--- tt::runtime::ttnn::operations::layout::run(tt::target::ttnn::ToLayoutOp const*, tt::runtime::ttnn::ProgramContext&)
--- /opt/ttmlir-toolchain/venv/lib/python3.10/site-packages/ttrt/runtime/_C.cpython-310-x86_64-linux-gnu.so(+0xf0b1d) [0x7f059a921b1d]
--- tt::runtime::ttnn::runProgram(tt::tt_metal::distributed::MeshDevice&, tt::runtime::Binary, unsigned int, std::vector<tt::tt_metal::Tensor*, std::allocator<tt::tt_metal::Tensor*> > const&)
--- tt::runtime::ttnn::submit(tt::runtime::Device, tt::runtime::Binary, unsigned int, std::vector<tt::runtime::Tensor, std::allocator<tt::runtime::Tensor> > const&)
--- tt::runtime::submit(tt::runtime::Device, tt::runtime::Binary, unsigned int, std::vector<tt::runtime::Tensor, std::allocator<tt::runtime::Tensor> > const&)
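
The backtrace ends in `tensor_impl::unpad`, which dispatches on the storage type with `if constexpr`. One possible shape for the missing branch is to apply the single-buffer unpad to each host shard. The sketch below is self-contained and uses hypothetical stand-in types (`HostBuffer`, `OwnedStorageSketch`, `MultiDeviceHostStorageSketch`) rather than the real tt-metal classes. It also assumes 2D row-major buffers and that every shard has the same padded shape, which may not hold in general.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the tt-metal types.
struct HostBuffer {
    std::vector<float> data;  // row-major contents
    std::size_t rows;
    std::size_t cols;
};
struct OwnedStorageSketch { HostBuffer buffer; };
struct MultiDeviceHostStorageSketch { std::vector<HostBuffer> shards; };
using StorageSketch = std::variant<OwnedStorageSketch, MultiDeviceHostStorageSketch>;

// Copy the top-left out_rows x out_cols window of a row-major buffer.
inline HostBuffer unpad_buffer(const HostBuffer& in, std::size_t out_rows, std::size_t out_cols) {
    assert(out_rows <= in.rows && out_cols <= in.cols);
    HostBuffer out{{}, out_rows, out_cols};
    out.data.reserve(out_rows * out_cols);
    for (std::size_t r = 0; r < out_rows; ++r) {
        for (std::size_t c = 0; c < out_cols; ++c) {
            out.data.push_back(in.data[r * in.cols + c]);
        }
    }
    return out;
}

// Dispatch on the storage type, mirroring the if-constexpr pattern in the issue.
inline StorageSketch unpad_storage(const StorageSketch& storage, std::size_t out_rows, std::size_t out_cols) {
    return std::visit(
        [&](const auto& s) -> StorageSketch {
            using StorageType = std::decay_t<decltype(s)>;
            if constexpr (std::is_same_v<StorageType, OwnedStorageSketch>) {
                return OwnedStorageSketch{unpad_buffer(s.buffer, out_rows, out_cols)};
            } else {
                static_assert(std::is_same_v<StorageType, MultiDeviceHostStorageSketch>,
                              "unhandled storage type");
                // Assumption: each shard is unpadded independently to the same shape.
                MultiDeviceHostStorageSketch result;
                result.shards.reserve(s.shards.size());
                for (const auto& shard : s.shards) {
                    result.shards.push_back(unpad_buffer(shard, out_rows, out_cols));
                }
                return result;
            }
        },
        storage);
}
```

Calling `unpad_storage` on a two-shard storage whose shards are each 32x32 with a target of 30x30 would yield two 30x30 shards. Whether the real fix can treat shards uniformly depends on how tt-metal tracks per-shard shapes.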

@wooseokTT changed the title from "TTNN Tensor Pad/Unpad support is needed for MultiDeviceHostStorage" to "TTNN Tensor Unpad support is needed for MultiDeviceHostStorage" on Jan 17, 2025
@nsmithtt
Contributor

@cfjchu, can you help us triage?

@cfjchu
Collaborator

cfjchu commented Jan 18, 2025

I can take a look during the week. What's the priority on this issue?

@wooseokTT
Author

@cfjchu It seems that multi-device conversion to row_major fails for tensor sizes that are not 32x32-aligned because of this issue. The PJRT team reported it when they added more test cases.
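
The alignment condition mentioned here can be sketched as follows. This is a minimal illustration, assuming that tile-layout tensors are padded up to multiples of 32 in their last two dimensions, so that only non-aligned shapes need the unpad step on the way to row_major. The names `kTileDim`, `round_up_to_tile`, and `needs_unpad` are hypothetical and are not TTNN APIs.

```cpp
#include <cstddef>

// Tile edge length implied by "32x32" in the comment above.
constexpr std::size_t kTileDim = 32;

// Round a dimension up to the next multiple of the tile edge.
constexpr std::size_t round_up_to_tile(std::size_t dim) {
    return (dim + kTileDim - 1) / kTileDim * kTileDim;
}

// A shape needs unpadding when tile padding changed either dimension.
constexpr bool needs_unpad(std::size_t rows, std::size_t cols) {
    return round_up_to_tile(rows) != rows || round_up_to_tile(cols) != cols;
}

static_assert(round_up_to_tile(30) == 32, "30 pads up to one tile");
static_assert(!needs_unpad(64, 32), "aligned shapes skip unpad");
static_assert(needs_unpad(30, 64), "one non-aligned dimension is enough");
```

Under this assumption, a 64x32 tensor would never reach the unsupported branch, while a 30x30 tensor would.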

@nsmithtt
Contributor

@wooseokTT, this isn't blocking anything, right? We just found this via a test case? I think P2 for now.
