Docker container of the cloned task crashes or gets stuck. #189
Comments
Hey there,
Hi, can you include a full log of the task execution?
Thanks for the quick response.
I have changed more or less every parameter used in the config. I also tried a completely new setup on a different computer with the default config and got the same result. I tried an older version of the agent (1.6) as well, but that didn't work either.
From the looks of it, the execution inside the container cannot reach the ClearML Server - can you add
Sure thing. My first thought was that a proxy setting was causing the problems, but on a different machine without any proxies my logs and problems are the same.
Is this reachable from inside the container? It seems to me this won't resolve to anything...
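One way to answer that question is a small TCP check run from a shell inside the container. This is only a sketch: it assumes Python is available in the container image, the helper name `can_reach` is made up for illustration, and the URL is the one mentioned in this thread. It only tests whether the port accepts connections, not whether the ClearML API behind it responds correctly.

```python
import socket
from urllib.parse import urlparse


def can_reach(url, timeout=3.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical usage: replace with the api_server URL from your own config.
    url = "http://localhost:8008"
    print(url, "reachable" if can_reach(url) else "NOT reachable")
```

Running the same script on the host and inside the container shows whether the address only works from the host.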
You are correct, I can't reach http://localhost:8008. Which setting do you mean by the task container argument? Is this
Yes, that would work
Sadly, I still cannot reach the API.
Where is the server running?
The server runs on the same machine from which I try to execute my task. I also tried it on both Windows and Linux.
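When the server and the agent share a machine, the configured server URLs often point at `localhost`. Inside a container that name refers to the container itself rather than the Docker host (unless the container uses host networking), which would match the symptom above. A minimal sketch for spotting such URLs follows; the helper name `is_loopback_url` and the example addresses other than http://localhost:8008 are made up for illustration.

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_url(url):
    """Return True if the URL's host is 'localhost' or a loopback IP address."""
    host = urlparse(url).hostname or ""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name), so not a loopback address.
        return False


if __name__ == "__main__":
    # Hypothetical URLs; only the first one appears in this thread.
    for url in ("http://localhost:8008", "http://192.168.1.10:8008"):
        print(url, "-> loopback" if is_loopback_url(url) else "-> not loopback")
```

If a URL is flagged as loopback, an address of the host that is reachable from the container is the usual thing to try instead.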
Small update: I didn't change anything, but I tried again to start an agent in Docker mode and got different output. I now get the following output before nothing more happens:
1713784929586 DLB1:gpu1 DEBUG Successfully installed PyYAML-6.0.1 attrs-23.2.0 certifi-2024.2.2 charset-normalizer-3.3.2 clearml-agent-1.8.0 distlib-0.3.8 filelock-3.13.4 furl-2.1.3 idna-3.7 importlib-resources-6.4.0 jsonschema-4.21.1 jsonschema-specifications-2023.12.1 orderedmultidict-1.0.1 pathlib2-2.3.7.post1 pkgutil-resolve-name-1.3.10 platformdirs-4.2.0 psutil-5.9.8 pyjwt-2.8.0 pyparsing-3.1.2 python-dateutil-2.8.2 referencing-0.34.0 requests-2.31.0 rpds-py-0.18.0 six-1.16.0 urllib3-1.26.18 virtualenv-20.25.3 zipp-3.18.1
The following additional packages will be installed:
Hello everyone,
I installed ClearML Server and ClearML Agent locally with Docker on an Ubuntu Linux system, following the documentation guide. My problem is with cloning a task after its code has been written and logged. If I clone a task and assign it to a queue that uses virtual environment mode for execution, the clone executes all the code correctly. However, if I clone a task and assign it to a queue that uses Docker for execution, the container starts and downloads packages but does not execute the task code. Where am I going wrong?
PS: To be clear, the cloned task's container does not crash or die, because it is still possible to enter it with docker exec -it id_container /bin/bash. It is as if ClearML were merely creating the container.