Description
Summary
After installing the oneAPI Base Toolkit, I tried to run the Modin getting-started sample:
https://github.com/oneapi-src/oneAPI-samples/tree/master/AI-and-Analytics/Getting-Started-Samples/IntelModin_GettingStarted
The run fails with the following error:
(raylet) [2023-10-10 22:04:54,885 E 21639 21688] (raylet) agent_manager.cc:135: The raylet exited immediately because the Ray agent failed. The raylet fate shares with the agent. This can happen because the Ray agent was unexpectedly killed or failed. Agent can fail when
(raylet) - The version of `grpcio` doesn't follow Ray's requirement. Agent can segfault with the incorrect `grpcio` version. Check the grpcio version `pip freeze | grep grpcio`.
(raylet) - The agent failed to start because of unexpected error or port conflict. Read the log `cat /tmp/ray/session_latest/dashboard_agent.log`. You can find the log file structure here https://docs.ray.io/en/master/ray-observability/ray-logging.html#logging-directory-structure.
(raylet) - The agent is killed by the OS (e.g., out of memory).
Version
oneAPI toolkit version: 2023.2.0
Environment
OS: Ubuntu 22.04.2 LTS (Linux)
CPU: 13th Gen Intel(R) Core(TM) i9-13900
RAM: 32GB
Steps to reproduce
In a conda environment, run the Modin sample released with oneAPI:
https://github.com/oneapi-src/oneAPI-samples/tree/master/AI-and-Analytics/Getting-Started-Samples/IntelModin_GettingStarted
Observed behavior
The raylet fails and prints the following log:
(raylet) [2023-10-10 22:04:54,885 E 21639 21688] (raylet) agent_manager.cc:135: The raylet exited immediately because the Ray agent failed. The raylet fate shares with the agent. This can happen because the Ray agent was unexpectedly killed or failed. Agent can fail when
(raylet) - The version of `grpcio` doesn't follow Ray's requirement. Agent can segfault with the incorrect `grpcio` version. Check the grpcio version `pip freeze | grep grpcio`.
(raylet) - The agent failed to start because of unexpected error or port conflict. Read the log `cat /tmp/ray/session_latest/dashboard_agent.log`. You can find the log file structure here https://docs.ray.io/en/master/ray-observability/ray-logging.html#logging-directory-structure.
(raylet) - The agent is killed by the OS (e.g., out of memory).
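The raylet message points at two things to check: the installed grpcio version and the dashboard agent log. A minimal Python sketch for collecting both from the same conda environment might look like the following. The helper names `installed_version` and `tail` are hypothetical, and the package list (`grpcio`, `ray`, `modin`) is an assumption about what is relevant here; only the log path is taken from the message itself.

```python
from importlib import metadata
from pathlib import Path

# Log path quoted in the raylet message; it exists only after Ray has started a session.
AGENT_LOG = Path("/tmp/ray/session_latest/dashboard_agent.log")


def installed_version(package: str) -> str:
    """Return the installed version of a package, or 'not installed'."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


def tail(path: Path, lines: int = 20) -> list[str]:
    """Return the last `lines` lines of a text file, or [] if the file is missing."""
    if not path.is_file():
        return []
    return path.read_text(errors="replace").splitlines()[-lines:]


if __name__ == "__main__":
    # Package names are assumptions: the ones the log and the sample mention.
    for name in ("grpcio", "ray", "modin"):
        print(f"{name}: {installed_version(name)}")
    # Show the end of the agent log, if one was written.
    for line in tail(AGENT_LOG):
        print(line)
```

Running this on both the working and the failing machine and comparing the output could help narrow down whether the package versions differ or whether the agent log reports a port conflict or an out-of-memory kill.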
Expected behavior
The sample should run without the raylet failure. With the same setup it works on a Xeon system, but it fails on this Core system.