In distributed scenarios, how to configure k8s ports and which ports need to be opened? #3489

Open
markluofd opened this issue Feb 11, 2025 · 6 comments

@markluofd

I am deploying the DeepSeek R1 model on two H800 nodes.

I am deploying with k8s. If the pod exposes only the server port and the port specified in dist-init-addr, the worker reports an error and cannot connect to the master. Which ports do I need to open on the pod, and can they be configured?

@shuaills self-assigned this Feb 11, 2025
@shuaills
Collaborator

cc @ByronHsu

@shuaills
Collaborator

@markluofd
Author

Yes, I followed the instructions in the documentation. With docker --net host, a two-machine deployment works without problems. Now I want to use k8s, where two pods cannot share a network, so I need to specify the ports. If I open only 20000 and 40000, the service cannot start. The service appears to depend on many other ports, and those ports are random.

With docker --net host, I observe that the master node runs many sglang processes that occupy many ports, and the port numbers are random.
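
For anyone debugging a similar setup, one way to see which ports are actually reachable from the worker pod is a small TCP probe. The following is only a minimal sketch: the helper names `can_connect` and `probe_ports` are hypothetical, and the master address in the usage comment is a placeholder, not a value from this thread.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the full TCP handshake; the context
        # manager closes the socket again as soon as it succeeds.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe_ports(host: str, ports, timeout: float = 2.0) -> dict:
    """Map each port to whether a TCP connection to it could be opened."""
    return {port: can_connect(host, port, timeout) for port in ports}


# Example (placeholder master pod IP; 20000 and 40000 are the two ports
# mentioned in this thread):
#   probe_ports("10.0.0.1", [20000, 40000])
```

Run from the worker pod against the master pod IP, this shows whether a failure is a network restriction or simply a port that nothing is listening on yet. It only tests TCP reachability and says nothing about which ports sglang will pick.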

@markluofd
Author

Sorry, I misunderstood. I thought the communication ports between k8s pods need to be configured in advance.

In fact, the ports between k8s pods are fully open by default, so there's no problem.

@Hugh-yw

Hugh-yw commented Feb 13, 2025

> Sorry, I misunderstood. I thought the communication ports between k8s pods need to be configured in advance.
>
> In fact, the ports between k8s pods are fully open by default, so there's no problem.

Did you deploy it through a k8s Deployment resource? How are the --dist-init-addr and --node-rank 1 parameters defined?

@markluofd
Author

--dist-init-addr is set to the master pod's IP and port.
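
As a rough illustration of that answer, the per-node launch command could be assembled like this. It is a sketch under assumptions: only --dist-init-addr, --node-rank, and the ports 20000 and 40000 come from this thread. The other flags (--model-path, --tp, --nnodes, --trust-remote-code, --host, --port) follow the SGLang multi-node documentation as I understand it and should be checked against the installed version. The helper name `build_launch_args`, the model path, the IP, and `tp=16` (two nodes with 8 GPUs each) are all hypothetical.

```python
def build_launch_args(model_path, master_ip, node_rank,
                      dist_port=20000, serve_port=40000, nnodes=2, tp=16):
    """Build the argv list for one node of a multi-node sglang launch.

    Every node receives the same --dist-init-addr (master pod IP and port);
    only --node-rank differs between nodes.
    """
    if not 0 <= node_rank < nnodes:
        raise ValueError(f"node_rank must be in [0, {nnodes}), got {node_rank}")
    return [
        "python3", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--tp", str(tp),                                 # assumed: 2 nodes x 8 GPUs
        "--dist-init-addr", f"{master_ip}:{dist_port}",  # master pod IP and port
        "--nnodes", str(nnodes),
        "--node-rank", str(node_rank),                   # 0 on the master, 1 on the worker
        "--trust-remote-code",
        "--host", "0.0.0.0",
        "--port", str(serve_port),
    ]


# Placeholder values for illustration only.
master_cmd = build_launch_args("deepseek-ai/DeepSeek-R1", "10.0.0.1", node_rank=0)
worker_cmd = build_launch_args("deepseek-ai/DeepSeek-R1", "10.0.0.1", node_rank=1)
```

Because a pod IP changes when the pod is rescheduled, the master address has to be resolved at start time rather than hard-coded. How that is done depends on the deployment and is not covered in this thread.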
