clearml-agent is not able to execute task #219

Open
gunter-trooper opened this issue Dec 2, 2024 · 2 comments

Comments

@gunter-trooper

Description:
Hello everyone,

I am having an issue with a pipeline I rebuilt from the provided example. The pipeline registers correctly on ClearML, but the worker never executes it.

What I have done:

Rebuilt the pipeline using the example from the GitHub repository.
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_functions.py

Verified that the pipeline is registered on ClearML without errors.

The issue:
The worker does not execute the registered pipeline.

Attachments:
Logs from the worker.
task_5009c074b3b841e28d3e1855dcb5a06c (1).log

Screenshots of the ClearML platform showing the pipeline registration and worker status.
Screenshot 2024-12-01 142752
Screenshot 2024-12-01 142525

Expected behavior:
The pipeline should execute once it is registered on ClearML.

Environment:
Docker Compose deployment of ClearML Server:
https://hub.docker.com/layers/allegroai/clearml/latest/images/sha256-cbc6e5519edecc716112bb03ca291e937c38e6c33630907e557545148cf85d51?context=explore
[screenshot]

The agent (clearml-agent 1.9.2) was tested in Docker mode on both Windows and WSL. The logs are identical:
example_log_worker.log

Link to discussion in Slack:
https://clearml.slack.com/archives/CTK20V944/p1733059802636009

Any insights or suggestions would be greatly appreciated! Thank you in advance for your help.

@Ruhrozz

Ruhrozz commented Dec 4, 2024

afaik the pipeline is itself a task and it executes on your queue, but the pipeline creates other tasks, and you need another queue to handle those
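
To make the queueing argument concrete, here is a minimal pure-Python sketch. It is not ClearML code: `run_pipeline`, `STEPS`, and the tick-based scheduling are hypothetical names and simplifications, and the sketch assumes each agent runs one task at a time. It models only the queueing behaviour, not agent or Docker internals.

```python
from collections import deque

# Hypothetical step names, for illustration only.
STEPS = ["step_one", "step_two", "step_three"]


def run_pipeline(controller_queue, step_queue, agent_queues, max_ticks=20):
    """Return True if the simulated pipeline finishes, False if it stalls.

    agent_queues lists the queue each agent listens on.
    Assumption: every agent holds exactly one task at a time.
    """
    queues = {controller_queue: deque(), step_queue: deque()}
    queues[controller_queue].append("controller")
    busy = [None] * len(agent_queues)  # task currently held by each agent
    done = set()

    for _ in range(max_ticks):
        # Steps started on an earlier tick complete now; the controller
        # completes only after every step has completed.
        for i, task in enumerate(busy):
            if task in STEPS:
                done.add(task)
                busy[i] = None
            elif task == "controller" and done == set(STEPS):
                return True

        # Idle agents pull the next task from the queue they serve.
        for i, name in enumerate(agent_queues):
            if busy[i] is None and queues.get(name):
                busy[i] = queues[name].popleft()
                if busy[i] == "controller":
                    # The controller enqueues its steps and keeps its agent busy.
                    queues[step_queue].extend(STEPS)
    return False


# One agent, one shared queue: the controller occupies the only agent.
print(run_pipeline("default", "default", ["default"]))
# Separate queues, one agent each: the steps get picked up.
print(run_pipeline("controller_queue", "default", ["controller_queue", "default"]))
```

In this model the single-agent case never finishes, because the steps wait behind the controller that is waiting for them, while the two-queue case completes.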

@gunter-trooper

@Ruhrozz
Thanks for pointing that out! However, I'm still facing an issue: when I use two agents in Docker mode, the pipeline remains stuck with the same behavior. Interestingly, it works correctly when I switch to venv mode.

Task Worker -> clearml-agent daemon --gpus all --docker --queue default --detached
Pipeline Worker -> clearml-agent daemon --docker --queue controller_queue --detached
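
For reference, a sketch of how a pipeline script could target these two queues, assuming it follows the structure of the linked `pipeline_from_functions.py` example. The step function `double`, the pipeline name, and the version string are hypothetical placeholders. This only shows the queue assignment and is not a confirmed fix for the Docker-mode behaviour described here.

```python
CONTROLLER_QUEUE = "controller_queue"  # served by the "Pipeline Worker" agent
STEP_QUEUE = "default"                 # served by the "Task Worker" agent


def double(x):
    """Hypothetical stand-in step; not part of the linked example."""
    return x * 2


def build_and_start():
    # Imported here so the file can be loaded on a machine
    # where ClearML is not installed or configured.
    from clearml import PipelineController

    pipe = PipelineController(
        name="queue split demo",  # placeholder name
        project="examples",
        version="0.0.1",
    )
    # Step tasks created by the controller are enqueued here.
    pipe.set_default_execution_queue(STEP_QUEUE)
    pipe.add_function_step(
        name="double",
        function=double,
        function_kwargs=dict(x=21),
        function_return=["doubled"],
    )
    # The controller task itself is enqueued on its own queue.
    pipe.start(queue=CONTROLLER_QUEUE)
```

Calling `build_and_start()` on a machine with a configured `clearml.conf` would enqueue the controller on `controller_queue` and its step on `default`.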

[screenshot]

Now running a new pipeline:
[screenshot]

controller_queue picks up the task:
[screenshot]

The controller gets stuck:
[screenshot]

The default queue, served by the second worker, remains idle:
[screenshot]

I hope this provides a clearer description of the problem.
