I have been trying to make the exo project work on my Orin NX without success. Here is the error I get when running exo:
```
(exo) sgoudelis@jetson:~/projects/exo$ exo
Selected inference engine: None
 _____ _____
/ _ \ \/ / _ \
| __/> < (_) |
\___/_/\_\___/
Detected system: Linux
Inference engine name after selection: tinygrad
Traceback (most recent call last):
  File "/home/sgoudelis/miniconda3/envs/exo/bin/exo", line 33, in <module>
    sys.exit(load_entry_point('exo', 'console_scripts', 'exo')())
  File "/home/sgoudelis/miniconda3/envs/exo/bin/exo", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/importlib/metadata/__init__.py", line 205, in load
    module = import_module(match.group('module'))
  File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/importlib/__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 999, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "/home/sgoudelis/projects/exo/exo/main.py", line 106, in <module>
    inference_engine = get_inference_engine(inference_engine_name, shard_downloader)
  File "/home/sgoudelis/projects/exo/exo/inference/inference_engine.py", line 69, in get_inference_engine
    from exo.inference.tinygrad.inference import TinygradDynamicShardInferenceEngine
  File "/home/sgoudelis/projects/exo/exo/inference/tinygrad/inference.py", line 4, in <module>
    from exo.inference.tinygrad.models.llama import Transformer, TransformerShard, convert_from_huggingface, fix_bf16, sample_logits
  File "/home/sgoudelis/projects/exo/exo/inference/tinygrad/models/llama.py", line 2, in <module>
    from tinygrad import Tensor, Variable, TinyJit, dtypes, nn, Device
  File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/__init__.py", line 5, in <module>
    from tinygrad.tensor import Tensor  # noqa: F401
  File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/tensor.py", line 12, in <module>
    from tinygrad.device import Device, BufferSpec
  File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/device.py", line 226, in <module>
    class CPUProgram:
  File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/device.py", line 227, in CPUProgram
    helper_handle = ctypes.CDLL(ctypes.util.find_library('System' if OSX else 'kernel32' if sys.platform == "win32" else 'gcc_s'))
  File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/ctypes/__init__.py", line 379, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/sgoudelis/miniconda3/envs/exo/lib/libgcc_s.so: invalid ELF header
```
Looking into the .so file, I see this:
```
(exo) sgoudelis@jetson:~/projects/exo$ file /home/sgoudelis/miniconda3/envs/exo/lib/libgcc_s.so
/home/sgoudelis/miniconda3/envs/exo/lib/libgcc_s.so: ASCII text
(exo) sgoudelis@jetson:~/projects/exo$ more /home/sgoudelis/miniconda3/envs/exo/lib/libgcc_s.so
/* GNU ld script
   Use the shared library, but some functions are only in
   the static library. */
GROUP ( libgcc_s.so.1 -lgcc )
```
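That file is a GNU ld linker script, not an ELF shared object. Linker scripts are understood by `ld` at link time, but `dlopen()` (which `ctypes.CDLL` wraps) cannot parse them, hence the "invalid ELF header" error. A minimal sketch of a workaround, assuming a Linux system where the versioned `libgcc_s.so.1` is the real shared object referenced by the script:

```python
import ctypes
import ctypes.util

# find_library('gcc_s') can resolve to the conda env's libgcc_s.so, which
# here is a plain-text GNU ld script; dlopen() rejects it with
# "invalid ELF header". Loading the versioned real library directly
# sidesteps the script entirely.
try:
    libgcc = ctypes.CDLL("libgcc_s.so.1")  # the actual ELF shared object
    print("loaded:", libgcc._name)
except OSError as exc:
    print("could not load libgcc_s.so.1:", exc)
```

This is only a demonstration of the failure mode, not a patch for tinygrad; moving the linker script out of the way (as described below) achieves the same effect because the loader then falls back to the versioned library.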
Does anyone have an idea how to make exo work on the Jetson Orin?
UPDATE:
Moving the mentioned file (a linker script, not an actual shared object) out of the way lets exo get further. It then fails in a different way:
```
Traceback (most recent call last):
  File "/home/sgoudelis/miniconda3/envs/exo/bin/exo", line 33, in <module>
    sys.exit(load_entry_point('exo', 'console_scripts', 'exo')())
  File "/home/sgoudelis/projects/exo/exo/main.py", line 385, in run
    loop.run_until_complete(main())
  File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
  File "/home/sgoudelis/projects/exo/exo/main.py", line 349, in main
    await node.start(wait_for_peers=args.wait_for_peers)
  File "/home/sgoudelis/projects/exo/exo/orchestration/node.py", line 59, in start
    self.device_capabilities = await device_capabilities()
  File "/home/sgoudelis/projects/exo/exo/topology/device_capabilities.py", line 153, in device_capabilities
    return await linux_device_capabilities()
  File "/home/sgoudelis/projects/exo/exo/topology/device_capabilities.py", line 188, in linux_device_capabilities
    gpu_memory_info = pynvml.nvmlDeviceGetMemoryInfo(handle)
  File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/site-packages/pynvml.py", line 2934, in nvmlDeviceGetMemoryInfo
    _nvmlCheckReturn(ret)
  File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/site-packages/pynvml.py", line 979, in _nvmlCheckReturn
    raise NVMLError(ret)
pynvml.NVMLError_NotSupported: Not Supported
```
I am a complete noob when it comes to NVIDIA/CUDA, by the way. I am guessing this happens because the Orin uses shared (unified) memory, so NVML cannot report a separate VRAM amount.
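One way such a failure could be handled is to catch the NVML error and fall back to total system RAM, which is what the integrated GPU actually shares on Jetson-class boards. This is a hypothetical sketch, not exo's actual code; the function name and fallback policy are my own:

```python
import os

def total_gpu_memory_bytes() -> int:
    """Hypothetical helper: try NVML first; on unified-memory devices
    (e.g. Jetson Orin) where nvmlDeviceGetMemoryInfo raises
    NVMLError_NotSupported, report total system RAM instead, since the
    integrated GPU draws from the same pool."""
    try:
        import pynvml
        pynvml.nvmlInit()
        handle = pynvml.nvmlDeviceGetHandleByIndex(0)
        return pynvml.nvmlDeviceGetMemoryInfo(handle).total
    except Exception:
        # Fallback for unified memory (or no NVML at all):
        # total physical RAM = page size * number of pages.
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")

print(total_gpu_memory_bytes())
```

Reporting all of RAM overstates what is really available to the GPU, so a real fix would likely subtract some headroom for the OS and CPU-side allocations.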
ANOTHER UPDATE:
Exo does work on the Orin NX 16GB: bypassing the part of the code that queries the VRAM amount and feeding it a bogus number makes exo boot up just fine, with GPU-accelerated inference working too.
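Rather than hardcoding a bogus number, the bypass could be gated on detecting a Jetson board. A heuristic sketch (not part of exo; the device-tree path and name matching are my assumptions) that checks the device-tree model string exposed on Jetson systems:

```python
from pathlib import Path

def is_jetson() -> bool:
    # Heuristic: Jetson boards expose a device-tree model string such as
    # "NVIDIA Orin NX Developer Kit"; other Linux machines typically have
    # no /proc/device-tree/model at all.
    try:
        model = Path("/proc/device-tree/model").read_text()
    except OSError:
        return False
    return "NVIDIA" in model and any(
        name in model for name in ("Jetson", "Orin", "Xavier", "Nano")
    )

print(is_jetson())
```

With a check like this, the capabilities code could skip the unsupported NVML query on Jetson and use a unified-memory estimate instead.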
I would love some feedback from one of the developers of the Exo project on this. Please feel free to comment.