
Possible to update PyTorch build to support Torch 1.13.1 Rocm5.2? #27

Open
cmdrscotty opened this issue Mar 2, 2023 · 23 comments

@cmdrscotty

Not sure how difficult it is, but is there a chance we might be able to get an updated build of PyTorch for ROCm 5.2 with gfx803 enabled?

Currently the ROCm 5.2 PyTorch build leaves gfx803 out, and attempting to use xuhuisheng's build results in compatibility errors, since other libraries expect torch 1.13.1 and torchvision 0.14.1.

xuhuisheng's version is built on Torch 1.11.1 and TorchVision 0.12.0.

I'm certainly willing to try and build it myself if anyone has a good guide on how to compile both Torch with ROCm (so far I've only found guides for CUDA) and TorchVision.

cmdrscotty changed the title from "Possible to update PyTorch build to support Rocm5.2?" to "Possible to update PyTorch build to support Torch 1.13.1 Rocm5.2?" on Mar 2, 2023
@xuhuisheng
Owner

You can build it yourself.
Here are some sample build scripts for PyTorch on gfx803; you can give them a try:
https://github.com/xuhuisheng/rocm-build/tree/master/gfx803#pytorch-190-crashed-on-gfx803

@tsl0922

tsl0922 commented Apr 20, 2023

I've run stable-diffusion-webui with ROCm on Ubuntu 22.04.2 LTS successfully with unpatched rocm-5.4.3, pytorch built with PYTORCH_ROCM_ARCH=gfx803.

https://github.com/tsl0922/pytorch-gfx803
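
As a quick sanity check that an installed wheel actually targets gfx803, torch can be inspected from Python (torch.cuda.get_arch_list() and torch.version.hip are standard torch APIs; on ROCm builds the arch list reports gfx targets):

python3 - <<'EOF'
import torch
print(torch.__version__)           # wheel version, e.g. 2.0.0
print(torch.version.hip)           # ROCm/HIP version the wheel was built against
print(torch.cuda.is_available())   # True once the HIP runtime sees a usable GPU
print(torch.cuda.get_arch_list())  # 'gfx803' must appear here for an RX 470/480/570/580
EOF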

@xuhuisheng
Owner

> I've run stable-diffusion-webui with ROCm on Ubuntu 22.04.2 LTS successfully with unpatched rocm-5.4.3, pytorch built with PYTORCH_ROCM_ARCH=gfx803.
>
> https://github.com/tsl0922/pytorch-gfx803

Good job. I think I will give SD on gfx803 a try again.

@poVoq

poVoq commented Apr 24, 2023

I am failing to build pytorch myself on Fedora, with an endless stream of C errors. No idea what is going wrong. It would be much appreciated if you could give it another try on your Ubuntu system. Thanks!

@WeirdWood

Here is my build process on Ubuntu 22.04.2 for PyTorch 2.0.1-rc2 and Vision 0.15.2-rc2; both seem to work fine with the latest ROCm 5.5.0. All the steps are based on tsl0922's repository: https://github.com/tsl0922/pytorch-gfx803

Note that I'm not building MAGMA, so the UniPC sampler will fail to run.

Install cmake and sccache (available on the Snap Store; use the store app)

Install dependencies

sudo apt install libopenmpi3 libstdc++-12-dev libdnnl-dev ninja-build libopenblas-dev libpng-dev libjpeg-dev

Install ROCm

sudo -i   # open a root shell: a plain `sudo echo ... >> /etc/environment` would fail, because the redirection runs as your user
echo ROC_ENABLE_PRE_VEGA=1 >> /etc/environment
echo HSA_OVERRIDE_GFX_VERSION=8.0.3 >> /etc/environment
exit
# Reboot after this
wget https://repo.radeon.com/amdgpu-install/latest/ubuntu/jammy/amdgpu-install_5.5.50500-1_all.deb
sudo apt install ./amdgpu-install_5.5.50500-1_all.deb
sudo amdgpu-install -y --usecase=rocm,hiplibsdk,mlsdk

sudo usermod -aG video $LOGNAME
sudo usermod -aG render $LOGNAME

# verify
rocminfo
clinfo
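
If the install worked, the ROCm agent list should include the card; a quick filter over standard rocminfo output:

rocminfo | grep -i gfx   # an RX 470/480/570/580 should report gfx803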

Build MAGMA if you need it (I'm skipping this step)

git clone https://bitbucket.org/icl/magma.git
cd magma
# Set up make.inc first; check the README
make -j16
make install
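
For those who do need MAGMA (e.g. for the UniPC sampler), the HIP build is configured through make.inc, and MAGMA ships templates under make.inc-examples. This is a hedged sketch only: the template file name and the GPU_TARGET variable follow MAGMA's own examples, but they change between releases, so verify against the README of your checkout:

cp make.inc-examples/make.inc.hip-gcc-mkl make.inc   # HIP + gcc + MKL template (name per MAGMA's examples)
echo 'GPU_TARGET = gfx803' >> make.inc               # restrict kernels to gfx803
make -j16
sudo make install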

Build Torch

git clone --recursive https://github.com/pytorch/pytorch.git -b v2.0.1-rc2
cd pytorch
pip install cmake mkl mkl-include
pip install -r requirements.txt
sudo ln -s /usr/lib/x86_64-linux-gnu/librt.so.1 /usr/lib/x86_64-linux-gnu/librt.so
export PATH=/opt/rocm/bin:$PATH ROCM_PATH=/opt/rocm HIP_PATH=/opt/rocm/hip
export PYTORCH_ROCM_ARCH=gfx803
export PYTORCH_BUILD_VERSION=2.0.0 PYTORCH_BUILD_NUMBER=1
export USE_CUDA=0 USE_ROCM=1 USE_NINJA=1
python3 tools/amd_build/build_amd.py
python3 setup.py bdist_wheel
pip install dist/torch-2.0.0-cp310-cp310-linux_x86_64.whl

Build Vision

git clone https://github.com/pytorch/vision.git -b v0.15.2-rc2
cd vision
export BUILD_VERSION=0.15.1
FORCE_CUDA=1 ROCM_HOME=/opt/rocm/ python3 setup.py bdist_wheel
pip install dist/torchvision-0.15.1-cp310-cp310-linux_x86_64.whl

Afterwards, activate the venv environment in Automatic1111's webui and reinstall torch and torchvision from the wheels built above; a sketch follows.
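
A minimal sketch of that reinstall step (paths are examples; point pip at wherever your wheels ended up):

cd stable-diffusion-webui
source venv/bin/activate
pip uninstall -y torch torchvision
pip install ~/pytorch/dist/torch-2.0.0-cp310-cp310-linux_x86_64.whl
pip install ~/vision/dist/torchvision-0.15.1-cp310-cp310-linux_x86_64.whl
pip list | grep torch   # confirm both wheels are now active in the venv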

@agtian0

agtian0 commented Jun 10, 2023

> [quotes WeirdWood's build steps above in full]

GPU not detected in ROCm; sudo /opt/rocm-5.5.0/bin/rocm-smi prints:

======================= ROCm System Management Interface =======================
WARNING: No AMD GPUs specified
================================= Concise Info =================================
GPU Temp (DieEdge) AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU%

============================= End of ROCm SMI Log ==============================
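
A few generic checks when rocm-smi reports no GPUs (standard Linux/ROCm commands, independent of this build):

lspci | grep -iE 'vga|display'     # is the card visible on the PCI bus?
ls -l /dev/kfd /dev/dri/renderD*   # ROCm needs these device nodes
groups                             # your user should be in the video and render groups
sudo dmesg | grep -i amdgpu        # did the amdgpu kernel driver bind cleanly?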

@viebrix

viebrix commented Aug 4, 2023

@WeirdWood thanks for your build process!
I also have an RX 580 (8GB) and Linux Mint 21.2, and I tried to follow your description.
Unfortunately the latest ROCm version is now 5.6. I tried it with the 5.6 version and also with 5.5.3, but both led to a lot of trouble: webui stops with a 'memory' error. Because my errors were printed in German I don't know the exact English translation, but I think I ran into this one:
AUTOMATIC1111/stable-diffusion-webui#11712
It seems to be an error in ROCm itself: ROCm/clr#4

Therefore I tried exactly your version (5.5.0), and this finally worked. For others who want to follow the same build process, you only have to replace the line:
wget https://repo.radeon.com/amdgpu-install/latest/ubuntu/jammy/amdgpu-install_5.5.50500-1_all.deb
with:
wget https://repo.radeon.com/amdgpu-install/5.5/ubuntu/jammy/amdgpu-install_5.5.50500-1_all.deb
in WeirdWood's description.

Many thanks @WeirdWood!

@waltercool

@fidgety520 It would be great if you could describe your issue. Is your GPU a gfx803-generation card? What OS are you using? What are you experiencing?

Any information would be useful.

CPU rendering will always work, but it takes several minutes where the GPU usually takes seconds.

@viebrix

viebrix commented Sep 17, 2023

@fidgety520 Which Linux is running under your Docker? (I'm not familiar with Docker installations.)
What is the exact error message ./webui.sh returns after the start?
Are the torch and torchvision versions the self-compiled ones, as WeirdWood described in this thread: #27 (comment)?

@viebrix

viebrix commented Sep 17, 2023

@fidgety520 here is what I have done, at your own risk...

sudo apt autoremove rocm-core amdgpu-dkms
sudo apt install libopenmpi3 libstdc++-12-dev libdnnl-dev ninja-build libopenblas-dev libpng-dev libjpeg-dev
sudo -i   # open a root shell so the redirections below can write /etc/environment
echo ROC_ENABLE_PRE_VEGA=1 >> /etc/environment
echo HSA_OVERRIDE_GFX_VERSION=8.0.3 >> /etc/environment
exit
# Reboot after this

wget https://repo.radeon.com/amdgpu-install/5.5/ubuntu/jammy/amdgpu-install_5.5.50500-1_all.deb
sudo apt install ./amdgpu-install_5.5.50500-1_all.deb
sudo amdgpu-install -y --usecase=rocm,hiplibsdk,mlsdk

sudo usermod -aG video $LOGNAME
sudo usermod -aG render $LOGNAME

# verify
rocminfo
clinfo

#Build Torch

git clone --recursive https://github.com/pytorch/pytorch.git -b v2.0.1-rc2
cd pytorch
pip install cmake mkl mkl-include
pip install -r requirements.txt
sudo ln -s /usr/lib/x86_64-linux-gnu/librt.so.1 /usr/lib/x86_64-linux-gnu/librt.so
export PATH=/opt/rocm/bin:$PATH ROCM_PATH=/opt/rocm HIP_PATH=/opt/rocm/hip
export PYTORCH_ROCM_ARCH=gfx803
export PYTORCH_BUILD_VERSION=2.0.0 PYTORCH_BUILD_NUMBER=1
export USE_CUDA=0 USE_ROCM=1 USE_NINJA=1
python3 tools/amd_build/build_amd.py
python3 setup.py bdist_wheel
pip install dist/torch-2.0.0-cp310-cp310-linux_x86_64.whl --force-reinstall

cd ..
git clone https://github.com/pytorch/vision.git -b v0.15.2-rc2
cd vision
export BUILD_VERSION=0.15.1
FORCE_CUDA=1 ROCM_HOME=/opt/rocm/ python3 setup.py bdist_wheel
pip install dist/torchvision-0.15.1-cp310-cp310-linux_x86_64.whl --force-reinstall

# automatic
cd ..

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
python3 -m venv venv
source venv/bin/activate
python -m pip install --upgrade pip wheel
pip uninstall torch torchvision
pip3 install /home/xxxxxx/pytorch/dist/torch-2.0.0-cp310-cp310-linux_x86_64.whl
pip3 install /home/xxxxxx/vision/dist/torchvision-0.15.1-cp310-cp310-linux_x86_64.whl
pip list | grep 'torch'

# edit:  webui-user.sh
# change or add this lines:

# Commandline arguments for webui.py, for example: export COMMANDLINE_ARGS="--medvram --opt-split-attention"
export COMMANDLINE_ARGS="--no-half-vae --disable-nan-check --opt-split-attention --medvram --medvram-sdxl"

During compilation you have to confirm a few steps...
Credit for these steps goes to WeirdWood and xuhuisheng.

PS: xxxxxx in the paths stands for the account name of your home directory. If you compile in another folder, you have to change it accordingly.
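
Before launching webui, a one-line smoke test (standard torch API) confirms that the venv's torch actually sees the card:

source venv/bin/activate
python3 -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"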

@viebrix

viebrix commented Sep 17, 2023

Are there any other errors besides those warnings during the compilation process? On my computer the compilation took a very long time.
I tried to upload my whl files to GitHub (a fork of this repo), but there is a limit of 25MB, and torch-2.0.0-cp310-cp310-linux_x86_64.whl has a size of 165.4 MB.
Do you have a second GPU, maybe an integrated one, in your PC?
As I remember, the segmentation error can also be caused by a missing lib, but I haven't found the issue thread yet.
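
If a second GPU does turn out to be present, one hedged workaround is to pin ROCm to the discrete card with the standard HIP environment variable (the device index comes from rocminfo or rocm-smi):

export HIP_VISIBLE_DEVICES=0   # expose only device 0 to HIP/torch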

@viebrix

viebrix commented Sep 17, 2023

Regarding the whl file in your /pytorch/dist/ folder: is its date from after the compilation, or from (or before) download time?

@viebrix

viebrix commented Sep 18, 2023

The only differences I'm aware of are the removal of rocm/amdgpu beforehand and the installation of the libraries I was missing during the process under Linux Mint. The AMD download path is also different, but with the "old" one the download should simply fail, because the file is no longer available in that folder on the server.
I'm very happy that it worked for you @fidgety520

@viebrix

viebrix commented Jan 15, 2024

Here is an update for gfx803 (e.g. RX 580) with PyTorch v2.1.2 and Automatic1111 SD webui 1.7.


sudo apt autoremove rocm-core amdgpu-dkms
sudo apt install libopenmpi3 libstdc++-12-dev libdnnl-dev ninja-build libopenblas-dev libpng-dev libjpeg-dev
sudo -i   # open a root shell so the redirections below can write /etc/environment
echo ROC_ENABLE_PRE_VEGA=1 >> /etc/environment
echo HSA_OVERRIDE_GFX_VERSION=8.0.3 >> /etc/environment
exit
# Reboot after this

wget https://repo.radeon.com/amdgpu-install/5.5/ubuntu/jammy/amdgpu-install_5.5.50500-1_all.deb
sudo apt install ./amdgpu-install_5.5.50500-1_all.deb
sudo amdgpu-install -y --usecase=rocm,hiplibsdk,mlsdk

sudo usermod -aG video $LOGNAME
sudo usermod -aG render $LOGNAME

# verify
rocminfo
clinfo

# in your home directory, create and enter the build directory
mkdir pytorch2.1.2
# Build Torch
cd pytorch2.1.2
git clone --recursive https://github.com/pytorch/pytorch.git -b v2.1.2
cd pytorch
pip install cmake mkl mkl-include
pip install -r requirements.txt
sudo ln -s /usr/lib/x86_64-linux-gnu/librt.so.1 /usr/lib/x86_64-linux-gnu/librt.so
export PATH=/opt/rocm/bin:$PATH ROCM_PATH=/opt/rocm HIP_PATH=/opt/rocm/hip
export PYTORCH_ROCM_ARCH=gfx803
export PYTORCH_BUILD_VERSION=2.1.2 PYTORCH_BUILD_NUMBER=1
export USE_CUDA=0 USE_ROCM=1 USE_NINJA=1
python3 tools/amd_build/build_amd.py
python3 setup.py bdist_wheel
pip install dist/torch-2.1.2-cp310-cp310-linux_x86_64.whl --force-reinstall

cd ..
git clone https://github.com/pytorch/vision.git -b v0.16.2
cd vision
export BUILD_VERSION=0.16.2
FORCE_CUDA=1 ROCM_HOME=/opt/rocm/ python3 setup.py bdist_wheel
pip install dist/torchvision-0.16.2-cp310-cp310-linux_x86_64.whl --force-reinstall

# automatic
cd ..

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
python3 -m venv venv
source venv/bin/activate
python -m pip install --upgrade pip wheel
pip uninstall torch torchvision
pip3 install /home/*******/pytorch2.1.2/pytorch/dist/torch-2.1.2-cp310-cp310-linux_x86_64.whl
pip3 install /home/*******/pytorch2.1.2/vision/dist/torchvision-0.16.2-cp310-cp310-linux_x86_64.whl
pip list | grep 'torch'

see also: https://github.com/viebrix/pytorch-gfx803/tree/main
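
After the reinstall, a quick check (standard torch/torchvision attributes) that the pair matches, since version mismatches were the original problem in this issue:

python3 -c "import torch, torchvision; print(torch.__version__, torchvision.__version__, torch.cuda.is_available())"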

@cmdrscotty
Author

@viebrix
Thank you for the write up!

I needed to update my server to 22.04 (was on 20.04) but was able to follow your guide and get it working on my RX580 successfully!

@picarica

picarica commented Feb 7, 2024

It didn't work for me; I got 'unrecognized command line option' warnings from cc1plus, and the build failed with the errors below.

Building wheel torch-2.1.2
-- Building version 2.1.2
cmake --build . --target install --config Release -- -j 32
[13/725] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/ir/ir.cpp.o
FAILED: caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/ir/ir.cpp.o 
/opt/cache/bin/sccache /usr/bin/c++ -DAT_PER_OPERATOR_HEADERS -DBUILD_ONEDNN_GRAPH -DCAFFE2_BUILD_MAIN_LIB -DCPUINFO_SUPPORTED_PLATFORM=1 -DFMT_HEADER_ONLY=1 -DFXDIV_USE_INLINE_ASSEMBLY=0 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DNNP_CONVOLUTION_ONLY=0 -DNNP_INFERENCE_ONLY=0 -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DUSE_C10D_GLOO -DUSE_C10D_MPI -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DUSE_RPC -DUSE_TENSORPIPE -D_FILE_OFFSET_BITS=64 -Dtorch_cpu_EXPORTS -I/dockerx/pytorch/build/aten/src -I/dockerx/pytorch/aten/src -I/dockerx/pytorch/build -I/dockerx/pytorch -I/dockerx/pytorch/cmake/../third_party/benchmark/include -I/dockerx/pytorch/third_party/onnx -I/dockerx/pytorch/build/third_party/onnx -I/dockerx/pytorch/third_party/foxi -I/dockerx/pytorch/build/third_party/foxi -I/dockerx/pytorch/torch/csrc/api -I/dockerx/pytorch/torch/csrc/api/include -I/dockerx/pytorch/caffe2/aten/src/TH -I/dockerx/pytorch/build/caffe2/aten/src/TH -I/dockerx/pytorch/build/caffe2/aten/src -I/dockerx/pytorch/build/caffe2/../aten/src -I/dockerx/pytorch/torch/csrc -I/dockerx/pytorch/third_party/miniz-2.1.0 -I/dockerx/pytorch/third_party/kineto/libkineto/include -I/dockerx/pytorch/third_party/kineto/libkineto/src -I/dockerx/pytorch/aten/src/ATen/.. -I/dockerx/pytorch/third_party/FXdiv/include -I/dockerx/pytorch/c10/.. -I/dockerx/pytorch/third_party/pthreadpool/include -I/dockerx/pytorch/third_party/cpuinfo/include -I/dockerx/pytorch/third_party/QNNPACK/include -I/dockerx/pytorch/aten/src/ATen/native/quantized/cpu/qnnpack/include -I/dockerx/pytorch/aten/src/ATen/native/quantized/cpu/qnnpack/src -I/dockerx/pytorch/third_party/cpuinfo/deps/clog/include -I/dockerx/pytorch/third_party/NNPACK/include -I/dockerx/pytorch/third_party/fbgemm/include -I/dockerx/pytorch/third_party/fbgemm -I/dockerx/pytorch/third_party/fbgemm/third_party/asmjit/src -I/dockerx/pytorch/third_party/ittapi/src/ittnotify -I/dockerx/pytorch/third_party/FP16/include -I/dockerx/pytorch/third_party/tensorpipe -I/dockerx/pytorch/build/third_party/tensorpipe -I/dockerx/pytorch/third_party/tensorpipe/third_party/libnop/include -I/dockerx/pytorch/third_party/fmt/include -I/dockerx/pytorch/build/third_party/ideep/mkl-dnn/include -I/dockerx/pytorch/third_party/ideep/mkl-dnn/src/../include -I/dockerx/pytorch/third_party/flatbuffers/include -isystem /dockerx/pytorch/build/third_party/gloo -isystem /dockerx/pytorch/cmake/../third_party/gloo -isystem /dockerx/pytorch/cmake/../third_party/tensorpipe/third_party/libuv/include -isystem /dockerx/pytorch/cmake/../third_party/googletest/googlemock/include -isystem /dockerx/pytorch/cmake/../third_party/googletest/googletest/include -isystem /dockerx/pytorch/third_party/protobuf/src -isystem /dockerx/pytorch/third_party/gemmlowp -isystem /dockerx/pytorch/third_party/neon2sse -isystem /dockerx/pytorch/third_party/XNNPACK/include -isystem /dockerx/pytorch/third_party/ittapi/include -isystem /dockerx/pytorch/cmake/../third_party/eigen -isystem /opt/ompi/include -isystem /dockerx/pytorch/third_party/ideep/mkl-dnn/include/oneapi/dnnl -isystem /dockerx/pytorch/third_party/ideep/include -isystem /dockerx/pytorch/build/include -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor 
-Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-invalid-partial-specialization -Wno-unused-private-field -Wno-aligned-allocation-unavailable -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow -DHAVE_AVX2_CPU_DEFINITION -O3 -DNDEBUG -DNDEBUG -std=gnu++17 -fPIC -DTORCH_USE_LIBUV -DCAFFE2_USE_GLOO -DTH_HAVE_THREAD -Wall -Wextra -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-type-limits -Wno-array-bounds -Wno-strict-overflow -Wno-strict-aliasing -Wno-missing-braces -Wno-maybe-uninitialized -fvisibility=hidden -O2 -pthread -DASMJIT_STATIC -fopenmp -fopenmp -MD -MT caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/ir/ir.cpp.o -MF caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/ir/ir.cpp.o.d -o caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/ir/ir.cpp.o -c /dockerx/pytorch/torch/csrc/jit/ir/ir.cpp
/dockerx/pytorch/torch/csrc/jit/ir/ir.cpp: In member function ‘bool torch::jit::Node::hasSideEffects() const’:
/dockerx/pytorch/torch/csrc/jit/ir/ir.cpp:1191:16: error: ‘set_stream’ is not a member of ‘torch::jit::cuda’; did you mean ‘c10::cuda::set_stream’?
 1191 |     case cuda::set_stream:
      |                ^~~~~~~~~~
In file included from /dockerx/pytorch/torch/csrc/jit/ir/ir.h:18,
                 from /dockerx/pytorch/torch/csrc/jit/ir/ir.cpp:1:
/dockerx/pytorch/aten/src/ATen/core/interned_strings.h:228:11: note: ‘c10::cuda::set_stream’ declared here
  228 |   _(cuda, set_stream)                \
      |           ^~~~~~~~~~
/dockerx/pytorch/aten/src/ATen/core/interned_strings.h:353:35: note: in definition of macro ‘DEFINE_SYMBOL’
  353 |   namespace ns { constexpr Symbol s(static_cast<unique_t>(_keys::ns##_##s)); }
      |                                   ^
/dockerx/pytorch/aten/src/ATen/core/interned_strings.h:354:1: note: in expansion of macro ‘FORALL_NS_SYMBOLS’
  354 | FORALL_NS_SYMBOLS(DEFINE_SYMBOL)
      | ^~~~~~~~~~~~~~~~~
/dockerx/pytorch/torch/csrc/jit/ir/ir.cpp:1192:16: error: ‘_set_device’ is not a member of ‘torch::jit::cuda’; did you mean ‘c10::cuda::_set_device’?
 1192 |     case cuda::_set_device:
      |                ^~~~~~~~~~~
In file included from /dockerx/pytorch/torch/csrc/jit/ir/ir.h:18,
                 from /dockerx/pytorch/torch/csrc/jit/ir/ir.cpp:1:
/dockerx/pytorch/aten/src/ATen/core/interned_strings.h:227:11: note: ‘c10::cuda::_set_device’ declared here
  227 |   _(cuda, _set_device)               \
      |           ^~~~~~~~~~~
/dockerx/pytorch/aten/src/ATen/core/interned_strings.h:353:35: note: in definition of macro ‘DEFINE_SYMBOL’
  353 |   namespace ns { constexpr Symbol s(static_cast<unique_t>(_keys::ns##_##s)); }
      |                                   ^
/dockerx/pytorch/aten/src/ATen/core/interned_strings.h:354:1: note: in expansion of macro ‘FORALL_NS_SYMBOLS’
  354 | FORALL_NS_SYMBOLS(DEFINE_SYMBOL)
      | ^~~~~~~~~~~~~~~~~
/dockerx/pytorch/torch/csrc/jit/ir/ir.cpp:1193:16: error: ‘_current_device’ is not a member of ‘torch::jit::cuda’; did you mean ‘c10::cuda::_current_device’?
 1193 |     case cuda::_current_device:
      |                ^~~~~~~~~~~~~~~
In file included from /dockerx/pytorch/torch/csrc/jit/ir/ir.h:18,
                 from /dockerx/pytorch/torch/csrc/jit/ir/ir.cpp:1:
/dockerx/pytorch/aten/src/ATen/core/interned_strings.h:229:11: note: ‘c10::cuda::_current_device’ declared here
  229 |   _(cuda, _current_device)           \
      |           ^~~~~~~~~~~~~~~
/dockerx/pytorch/aten/src/ATen/core/interned_strings.h:353:35: note: in definition of macro ‘DEFINE_SYMBOL’
  353 |   namespace ns { constexpr Symbol s(static_cast<unique_t>(_keys::ns##_##s)); }
      |                                   ^
/dockerx/pytorch/aten/src/ATen/core/interned_strings.h:354:1: note: in expansion of macro ‘FORALL_NS_SYMBOLS’
  354 | FORALL_NS_SYMBOLS(DEFINE_SYMBOL)
      | ^~~~~~~~~~~~~~~~~
/dockerx/pytorch/torch/csrc/jit/ir/ir.cpp:1194:16: error: ‘synchronize’ is not a member of ‘torch::jit::cuda’; did you mean ‘c10::cuda::synchronize’?
 1194 |     case cuda::synchronize:
      |                ^~~~~~~~~~~
In file included from /dockerx/pytorch/torch/csrc/jit/ir/ir.h:18,
                 from /dockerx/pytorch/torch/csrc/jit/ir/ir.cpp:1:
/dockerx/pytorch/aten/src/ATen/core/interned_strings.h:230:11: note: ‘c10::cuda::synchronize’ declared here
  230 |   _(cuda, synchronize)               \
      |           ^~~~~~~~~~~
/dockerx/pytorch/aten/src/ATen/core/interned_strings.h:353:35: note: in definition of macro ‘DEFINE_SYMBOL’
  353 |   namespace ns { constexpr Symbol s(static_cast<unique_t>(_keys::ns##_##s)); }
      |                                   ^
/dockerx/pytorch/aten/src/ATen/core/interned_strings.h:354:1: note: in expansion of macro ‘FORALL_NS_SYMBOLS’
  354 | FORALL_NS_SYMBOLS(DEFINE_SYMBOL)
      | ^~~~~~~~~~~~~~~~~
At global scope:
cc1plus: warning: unrecognized command line option ‘-Wno-aligned-allocation-unavailable’
cc1plus: warning: unrecognized command line option ‘-Wno-unused-private-field’
cc1plus: warning: unrecognized command line option ‘-Wno-invalid-partial-specialization’
[44/725] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/jit/runtime/register_prim_ops_fulljit.cpp.o
ninja: build stopped: subcommand failed.

@viebrix

viebrix commented Feb 7, 2024

Is it possible that git clone --recursive https://github.com/pytorch/pytorch.git -b v2.1.2 had an error or stopped during download? I had to restart that line 4-5 times until everything was downloaded...

It could also be a problem with Docker; I'm not familiar with Docker.

@picarica

picarica commented Feb 7, 2024

> Is it possible that git clone --recursive https://github.com/pytorch/pytorch.git -b v2.1.2 had an error or stopped during download? I had to restart that line 4-5 times until everything was downloaded...
>
> It could also be a problem with Docker; I'm not familiar with Docker.

Not sure about Docker; I passed through all my hardware, and it's the ROCm Docker image. But yes, I did have to run it about three times for it to clone correctly. Do I have to rerun it from scratch every time, though? Would git submodule update do a similar thing?
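
For what it's worth, an interrupted recursive clone can usually be completed in place instead of re-cloning from scratch (standard git commands):

cd pytorch
git submodule sync --recursive
git submodule update --init --recursive   # fetches any submodules that are still missing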

@viebrix

viebrix commented Feb 8, 2024

Frankly speaking, I don't know. I restarted the whole of
git clone --recursive https://github.com/pytorch/pytorch.git -b v2.1.2
every time. It seemed to me that there is already something like a cache, so that not everything was downloaded again and again.

Did you reboot after the line # Reboot after this?

@viebrix

viebrix commented Feb 8, 2024

Docker ROCm: it's necessary to use the specific ROCm version I did, via the line: wget https://repo.radeon.com/amdgpu-install/5.5/ubuntu/jammy/amdgpu-install_5.5.50500-1_all.deb

@MarienvD

I followed the exact steps in #27 (comment)
The only difference is instead of compiling pytorch and pytorchvision from scratch, I used these precompiled binaries: https://github.com/tsl0922/pytorch-gfx803/releases
It works now, thanks so much @viebrix

@hectpz5

hectpz5 commented Jan 23, 2025

> Docker ROCm: it's necessary to use the specific ROCm version I did, via the line: wget https://repo.radeon.com/amdgpu-install/5.5/ubuntu/jammy/amdgpu-install_5.5.50500-1_all.deb

Excuse me friend, I replicated your process. While testing, I discovered that you can update the GPU driver for more stability.

For example:
# download the key
sudo mkdir --parents --mode=0755 /etc/apt/keyrings

echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/6.2.3/ubuntu jammy main" | sudo tee /etc/apt/sources.list.d/amdgpu.list

# update the driver for more stability
sudo apt-get update && sudo apt full-upgrade

All OK.

Linux Mint 21.2 MATE (low RAM usage)
kernel 5.19.0-50
ROCm 5.5
Python 3.10.12
AMDGPU driver (via "inxi -G" in a terminal): 6.8.5
XFX RX 580 8GB, Stable Diffusion with centumix

@abelfodil

I gave up trying to make ROCm work with my 5700 XT. Instead, I'm using projects that support Vulkan as a backend, such as https://github.com/ggml-org/llama.cpp, which work out of the box.
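
For reference, a hedged sketch of such a Vulkan build of llama.cpp (the GGML_VULKAN flag is taken from llama.cpp's build docs; older releases used LLAMA_VULKAN, so check your checkout):

git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release -j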
