
cudnn_home not valid during build #33

Closed
mfruhner opened this issue Mar 18, 2021 · 20 comments


@mfruhner

Description
I am not able to build the ONNX Runtime backend. I am following the build instructions in the README, but the build fails at Step 17.

Triton Information
Main branch for Triton version 21.02

To Reproduce

I am running DGX OS 5 (Ubuntu 20.04).

cmake -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install -DTRITON_BUILD_ONNXRUNTIME_VERSION=1.6.0 -DTRITON_BUILD_CONTAINER_VERSION=21.02 ..
make install

Output:

Step 17/24 : RUN ./build.sh ${COMMON_BUILD_ARGS} --update --build --use_cuda --cuda_home "/usr/local/cuda"
 ---> Running in 3360f12bb769
2021-03-18 11:01:00,463 build [ERROR] - cuda_home and cudnn_home paths must be specified and valid.
cuda_home='/usr/local/cuda' valid=True. cudnn_home='None' valid=False
The command '/bin/sh -c ./build.sh ${COMMON_BUILD_ARGS} --update --build --use_cuda --cuda_home "/usr/local/cuda"' returned a non-zero code: 1
make[2]: *** [CMakeFiles/ort_target.dir/build.make:81: onnxruntime/lib/libonnxruntime.so.1.6.0] Error 1
make[1]: *** [CMakeFiles/Makefile2:158: CMakeFiles/ort_target.dir/all] Error 2
make: *** [Makefile:149: all] Error 2

Expected behavior
I expect the build to succeed.

@CoderHam
Contributor

https://github.com/triton-inference-server/onnxruntime_backend/blob/main/tools/gen_ort_dockerfile.py#L93
The build relies on getting CUDNN_VERSION from the base containers here.
I checked that the variable is indeed present in the nvcr.io/nvidia/tritonserver:21.02-py3-min container. Can you share the dockerfile generated by gen_ort_dockerfile.py in your build?
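
For reference, a quick way to confirm that the variable is actually set in the base image is to echo it from inside the container (a minimal check, assuming Docker is installed locally and you can pull the 21.02 min image):

docker run --rm nvcr.io/nvidia/tritonserver:21.02-py3-min bash -c 'echo "CUDNN_VERSION=${CUDNN_VERSION}"'

If that prints an empty value, the generated Dockerfile.ort ends up without a usable cudnn_home, which would match the error above.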

@askhade
Contributor

askhade commented May 28, 2021

I have encountered this issue too... perhaps it would be a good idea to allow users to specify the path for cuDNN, similar to CUDA:
TRITON_BUILD_CUDNN_HOME (similar to TRITON_BUILD_CUDA_HOME)
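
In the backend's cmake invocation this could look roughly like the following (a sketch of the proposed option with illustrative paths; TRITON_BUILD_CUDNN_HOME did not exist at this point):

cmake -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install \
      -DTRITON_BUILD_ONNXRUNTIME_VERSION=1.6.0 \
      -DTRITON_BUILD_CONTAINER_VERSION=21.02 \
      -DTRITON_BUILD_CUDA_HOME=/usr/local/cuda \
      -DTRITON_BUILD_CUDNN_HOME=/usr/lib/x86_64-linux-gnu ..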

@askhade
Contributor

askhade commented May 28, 2021

@mfruhner: Were you able to find a workaround? I simply updated the gen_ort_dockerfile.py script to explicitly include the cudnn path.

@GowthamKudupudi

GowthamKudupudi commented Jun 8, 2021

@mfruhner, did you resolve it? If the build were failing in general, many more people would be reporting it, but only a few are in this thread; are we doing something wrong? I'm using the --no-container-build flag while building the Triton server, so why is the build trying to build ONNX Runtime inside a container?

@GowthamKudupudi

@CoderHam https://paste.ubuntu.com/p/nF3HCcYycR/ is the Dockerfile.ort found in the build folder.

@GowthamKudupudi

@askhade what should be the value of --cudnn-home?

@askhade
Contributor

askhade commented Jul 22, 2021

--cudnn_home should be set to the path of the directory containing the cuDNN libraries.
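
On Ubuntu, for example, cuDNN installed from the NVIDIA packages typically puts the libraries under /usr/lib/x86_64-linux-gnu; a generic way to locate them on your system (paths vary by install method) is:

ldconfig -p | grep libcudnn

so a typical value would be --cudnn_home /usr/lib/x86_64-linux-gnu.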

@chandrameenamohan

chandrameenamohan commented Aug 6, 2021

@askhade or @GowthamKudupudi Were you able to resolve it? I am trying this on CentOS 7. I can build the tensorflow1, tensorflow2, python, and pytorch backends, but I get an error when I try to build the onnxruntime backend.
I use this command to build:
./build.py --cmake-dir=/server/build --build-dir=/tmp/citritonbuild --no-container-build --endpoint=http --endpoint=grpc --repo-tag=common:r21.04 --repo-tag=core:r21.04 --repo-tag=backend:r21.04 --repo-tag=thirdparty:r21.04 --backend=onnxruntime:r21.04 --enable-logging --enable-stats --enable-tracing

This is the error:

Step 26/37 : RUN ./build.sh ${COMMON_BUILD_ARGS} --update --build --use_cuda --cuda_home "/usr/local/cuda" --use_tensorrt --tensorrt_home "/usr/src/tensorrt" --use_openvino CPU_FP32
 ---> Running in dc590d08a4e7
2021-08-06 06:22:20,248 tools_python_utils [INFO] - flatbuffers module is not installed. parse_config will not be available
2021-08-06 06:22:20,257 build [ERROR] - cuda_home and cudnn_home paths must be specified and valid.
cuda_home='/usr/local/cuda' valid=True. cudnn_home='None' valid=False
The command '/bin/sh -c ./build.sh ${COMMON_BUILD_ARGS} --update --build --use_cuda --cuda_home "/usr/local/cuda" --use_tensorrt --tensorrt_home "/usr/src/tensorrt" --use_openvino CPU_FP32' returned a non-zero code: 1
make[2]: *** [CMakeFiles/ort_target.dir/build.make:81: onnxruntime/lib/libonnxruntime.so.1.7.1] Error 1
make[2]: Leaving directory '/tmp/citritonbuild/onnxruntime/build'
make[1]: *** [CMakeFiles/Makefile2:158: CMakeFiles/ort_target.dir/all] Error 2
make[1]: Leaving directory '/tmp/citritonbuild/onnxruntime/build'
make: *** [Makefile:149: all] Error 2
error: make install failed

@chandrameenamohan

@mfruhner: Were you able to find a workaround? I simply updated the gen_ort_dockerfile.py script to explicitly include the cudnn path.

Can you please mention exactly what you edited and how you ran it?
When I run the build script, it automatically downloads onnxruntime from git. What did you do to use your locally downloaded onnxruntime backend codebase?

@mfruhner
Author

mfruhner commented Aug 9, 2021

I didn't try to solve this any further and went on to use something else, sorry.

@askhade
Contributor

askhade commented Sep 13, 2021

@chandrameenamohan: I suspect you are hitting this issue because of "--no-container-build"; can you remove it and test again?
@CoderHam : Can you pick this up?

@GowthamKudupudi

The comment by @CoderHam that was liked above is the key.

@aravindhank11

Are there any known fixes for passing cudnn_home with the --no-container-build option?

@aravindhank11

The following changes to build.py did the trick (in case somebody else has come across a similar issue and is looking for an easy fix):

diff --git a/build.py b/build.py
index 82754fa9..a06b42e9 100755
--- a/build.py
+++ b/build.py
@@ -640,7 +640,11 @@ def pytorch_cmake_args(images):
 def onnxruntime_cmake_args(images, library_paths):
     cargs = [
         cmake_backend_arg('onnxruntime', 'TRITON_BUILD_ONNXRUNTIME_VERSION',
-                          None, TRITON_VERSION_MAP[FLAGS.version][2])
+                          None, TRITON_VERSION_MAP[FLAGS.version][2]),
+        cmake_backend_arg('onnxruntime', 'TRITON_BUILD_CUDA_HOME',
+                          None, '/usr/local/cuda-11.7/'),
+        cmake_backend_arg('onnxruntime', 'TRITON_BUILD_CUDNN_HOME',
+                          None, '/usr/lib/x86_64-linux-gnu/')
     ]
 
     # TRITON_ENABLE_GPU is already set for all backends in backend_cmake_args()

I am not sure whether build.py exposes this as a generic run-time parameter. I would be more than happy to add support for it if needed.
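
As a possible alternative to patching, build.py also accepts per-backend CMake arguments via --extra-backend-cmake-arg, so the same values can be passed on the command line (the paths here are illustrative and should match your install):

./build.py -v --no-container-build --build-dir=`pwd`/build --enable-all \
    --extra-backend-cmake-arg=onnxruntime:TRITON_BUILD_CUDA_HOME=/usr/local/cuda-11.7/ \
    --extra-backend-cmake-arg=onnxruntime:TRITON_BUILD_CUDNN_HOME=/usr/lib/x86_64-linux-gnu/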

@changchengx

@askhade @GowthamKudupudi
I hit the same failure when building the Triton server r23.04 branch with the command below on a host running Ubuntu 20.04.5:

$ ./build.py -v --no-container-build --build-dir=`pwd`/build --enable-all

@changchengx

@aravindhank11
I'll apply your patch on the Triton server r23.04 branch and run the build command again to check:

$ ./build.py -v --no-container-build --build-dir=`pwd`/build --enable-all

@changchengx

@aravindhank11 It also works when building Triton/server/r23.04

@changchengx

@aravindhank11 Without the patch, it also works to build Triton server r23.04 without Docker using the command below:

./build.py -v --no-container-build --build-dir=`pwd`/build --enable-all --extra-backend-cmake-arg=onnxruntime:TRITON_BUILD_CUDA_HOME=/usr/local/cuda-12.1/ --extra-backend-cmake-arg=onnxruntime:TRITON_BUILD_CUDNN_HOME=/usr/lib/x86_64-linux-gnu/

@pultarmi

pultarmi commented Jun 24, 2024

I just want to report that this bug still exists in v23.11 and I solved it by changing
"RUN ./build.sh ${{COMMON_BUILD_ARGS}} --update --build {}"
to
"RUN ./build.sh ${{COMMON_BUILD_ARGS}} --cudnn_home=/usr/local/cudnn-8.9 --update --build {}"
in gen_ort_dockerfile.py. The build process runs inside a Docker image, so you should first check which cuDNN version the image actually contains.
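
A quick way to see which cuDNN the build image actually ships before hardcoding the path (a minimal check; adjust the image tag to your Triton release, and note that the install location differs between releases):

docker run --rm nvcr.io/nvidia/tritonserver:23.11-py3-min bash -c 'ls -d /usr/local/cudnn* 2>/dev/null; dpkg -l | grep -i cudnn'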

There is another issue where build.sh refuses to run as root, so you will want to use
"RUN ./build.sh ${{COMMON_BUILD_ARGS}} --allow_running_as_root --cudnn_home=/usr/local/cudnn-8.9 --update --build {}"

Then it compiles with no further problems.

@631068264

v24.09

#!/bin/bash

mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install \
      -DTRITON_BUILD_ONNXRUNTIME_VERSION=1.19.2 \
      -DTRITON_BACKEND_REPO_TAG=r24.09 \
      -DTRITON_CORE_REPO_TAG=r24.09 \
      -DTRITON_COMMON_REPO_TAG=r24.09 \
      -DTRITON_ENABLE_ONNXRUNTIME_TENSORRT=ON \
      -DTRITON_BUILD_CONTAINER_VERSION=24.09 ..

make install
Step 17/27 : RUN ./build.sh ${COMMON_BUILD_ARGS} --update --build --use_cuda --cuda_home "/usr/local/cuda" --use_tensorrt --use_tensorrt_builtin_parser --tensorrt_home "/usr/src/tensorrt" --allow_running_as_root
 ---> Running in 6d9e291eee48
2024-11-11 08:33:16,650 tools_python_utils [INFO] - flatbuffers module is not installed. parse_config will not be available
2024-11-11 08:33:16,655 build [DEBUG] - Command line arguments:
  --build_dir /workspace/onnxruntime/build/Linux --config Release --skip_submodule_sync --parallel --build_shared_lib --build_dir /workspace/build --cmake_extra_defines 'CMAKE_CUDA_ARCHITECTURES='"'"'60;61;70;75;80;86;90'"'"'' --update --build --use_cuda --cuda_home /usr/local/cuda --use_tensorrt --use_tensorrt_builtin_parser --tensorrt_home /usr/src/tensorrt --allow_running_as_root
2024-11-11 08:33:16,659 build [ERROR] - cuda_home and cudnn_home paths must be specified and valid.
cuda_home='/usr/local/cuda' valid=True. cudnn_home='None' valid=False
Namespace(build_dir='/workspace/build', config=['Release'], update=True, build=True, clean=False, parallel=0, nvcc_threads=-1, test=False, skip_tests=False, compile_no_warning_as_error=False, enable_nvtx_profile=False, enable_memory_profile=False, enable_training=False, enable_training_apis=False, enable_training_ops=False, enable_nccl=False, mpi_home=None, nccl_home=None, use_mpi=False, enable_onnx_tests=False, path_to_protoc_exe=None, fuzz_testing=False, enable_symbolic_shape_infer_tests=False, gen_doc=None, gen_api_doc=False, use_cuda=True, cuda_version=None, cuda_home='/usr/local/cuda', cudnn_home=None, enable_cuda_line_info=False, enable_cuda_nhwc_ops=False, enable_pybind=False, build_wheel=False, wheel_name_suffix=None, skip_keras_test=False, build_csharp=False, build_nuget=False, msbuild_extra_options=None, build_java=False, build_nodejs=False, build_objc=False, build_shared_lib=True, build_apple_framework=False, cmake_extra_defines=[["CMAKE_CUDA_ARCHITECTURES='60;61;70;75;80;86;90'"]], target=None, x86=False, rv64=False, arm=False, arm64=False, arm64ec=False, buildasx=False, riscv_toolchain_root='', riscv_qemu_path='', msvc_toolset=None, windows_sdk_version=None, android=False, android_abi='arm64-v8a', android_api=27, android_sdk_path='', android_ndk_path='', android_cpp_shared=False, android_run_emulator=False, use_gdk=False, gdk_edition='.', gdk_platform='Scarlett', ios=False, visionos=False, macos=None, apple_sysroot='', ios_toolchain_file='', visionos_toolchain_file='', xcode_code_signing_team_id='', xcode_code_signing_identity='', cmake_generator=None, osx_arch='x86_64', apple_deploy_target=None, enable_address_sanitizer=False, use_binskim_compliant_compile_flags=False, disable_memleak_checker=False, build_wasm=False, build_wasm_static_lib=False, emsdk_version='3.1.59', enable_wasm_simd=False, enable_wasm_threads=False, disable_wasm_exception_catching=False, enable_wasm_api_exception_catching=False, enable_wasm_exception_throwing_override=True, wasm_run_tests_in_browser=False, enable_wasm_profiling=False, enable_wasm_debug_info=False, wasm_malloc=None, emscripten_settings=None, use_extensions=False, extensions_overridden_path=None, cmake_path='cmake', ctest_path='ctest', skip_submodule_sync=True, use_mimalloc=False, use_dnnl=False, dnnl_gpu_runtime='', dnnl_opencl_root='', use_openvino=None, dnnl_aarch64_runtime='', dnnl_acl_root='', use_coreml=False, use_webnn=False, use_snpe=False, snpe_root=None, use_nnapi=False, use_vsinpu=False, nnapi_min_api=None, use_jsep=False, use_qnn=False, qnn_home=None, use_rknpu=False, use_preinstalled_eigen=False, eigen_path=None, enable_msinternal=False, llvm_path=None, use_vitisai=False, use_tvm=False, tvm_cuda_runtime=False, use_tvm_hash=False, use_tensorrt=True, use_tensorrt_builtin_parser=True, use_tensorrt_oss_parser=False, tensorrt_home='/usr/src/tensorrt', test_all_timeout='10800', use_migraphx=False, migraphx_home=None, use_full_protobuf=False, llvm_config='', skip_onnx_tests=False, skip_winml_tests=False, skip_nodejs_tests=False, enable_msvc_static_runtime=False, use_dml=False, dml_path='', use_winml=False, winml_root_namespace_override=None, dml_external_project=False, use_telemetry=False, enable_wcos=False, enable_lto=False, enable_transformers_tool_test=False, use_acl=None, acl_home=None, acl_libs=None, use_armnn=False, armnn_relu=False, armnn_bn=False, armnn_home=None, armnn_libs=None, build_micro_benchmarks=False, minimal_build=None, include_ops_by_config=None, enable_reduced_operator_type_support=False, disable_contrib_ops=False, 
disable_ml_ops=False, disable_rtti=False, disable_types=[], disable_exceptions=False, rocm_version=None, use_rocm=False, rocm_home=None, code_coverage=False, enable_lazy_tensor=False, ms_experimental=False, enable_external_custom_op_schemas=False, external_graph_transformer_path=None, enable_cuda_profiling=False, use_cann=False, cann_home=None, enable_rocm_profiling=False, use_xnnpack=False, use_azure=False, use_cache=False, use_triton_kernel=False, use_lock_free_queue=False, allow_running_as_root=True)
The command '/bin/sh -c ./build.sh ${COMMON_BUILD_ARGS} --update --build --use_cuda --cuda_home "/usr/local/cuda" --use_tensorrt --use_tensorrt_builtin_parser --tensorrt_home "/usr/src/tensorrt" --allow_running_as_root' returned a non-zero code: 1
make[2]: *** [CMakeFiles/ort_target.dir/build.make:74: onnxruntime/lib/libonnxruntime.so] Error 1
make[1]: *** [CMakeFiles/Makefile2:278: CMakeFiles/ort_target.dir/all] Error 2
make: *** [Makefile:136: all] Error 2
