Enable Multiple Traefik Replicas Across Different Availability Zones for Enhanced High Availability #920

Open
mcandylab opened this issue Dec 17, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@mcandylab

What problem will this feature address?

Currently, there can only be a single Dokploy service running a single instance of Traefik. While adding additional workers in different availability zones allows containers to be replicated across the cluster, all traffic routing still depends on the single Traefik instance located on the primary server. If the primary server becomes unavailable, the entire setup loses external accessibility, rendering all containers in other zones inaccessible despite their healthy state. This creates a single point of failure.

Describe the solution you'd like

I would like the ability to run multiple Traefik replicas within a Dokploy-managed Swarm environment. Each replica could be placed in a different availability zone, ensuring that if one zone or server goes down, another Traefik instance can continue to route traffic to the containers still running in other zones. Ideally, Dokploy would seamlessly configure and maintain these Traefik replicas, balancing traffic and ensuring continuous availability without manual intervention.
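For illustration, here is a rough sketch of what the requested placement could look like if done by hand with plain Docker Swarm today. It is not how Dokploy deploys Traefik. The helper name `traefik_service_args`, the node label `zone`, the image tag, and the choice of host-mode port publishing are all assumptions made for the example. It only builds the `docker service create` command line: one replica per zone, spread across a node label, with at most one task per node. How Traefik's routing configuration and certificates would be shared between replicas is not covered.

```python
from typing import List


def traefik_service_args(
    zones: List[str],
    zone_label: str = "zone",      # hypothetical node label, e.g. node.labels.zone=zone-a
    image: str = "traefik:v3.1",   # assumed image tag, adjust to the version in use
    name: str = "traefik",         # assumed service name
) -> List[str]:
    """Build a `docker service create` argv that spreads Traefik over zones.

    Nodes are assumed to be labelled beforehand, for example:
        docker node update --label-add zone=zone-a <node>
    """
    if not zones:
        raise ValueError("at least one zone is required")
    return [
        "docker", "service", "create",
        "--name", name,
        # One replica per availability zone.
        "--replicas", str(len(zones)),
        # Ask the scheduler to spread tasks evenly over the zone label.
        "--placement-pref", f"spread=node.labels.{zone_label}",
        # Avoid two replicas competing for the same host ports.
        "--replicas-max-per-node", "1",
        # Host-mode publishing (an assumption): each replica binds 80/443
        # directly on its own node instead of going through the routing mesh.
        "--publish", "published=80,target=80,mode=host",
        "--publish", "published=443,target=443,mode=host",
        image,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(traefik_service_args(["zone-a", "zone-b", "zone-c"])))
```

Even with replicas placed like this, something in front of them (DNS records or a load balancer pointing at more than one node) still has to send traffic to a surviving replica, which is why built-in support in Dokploy would be preferable to wiring it up manually.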

Describe alternatives you've considered

  • External Load Balancers:
    Using an external load balancer or DNS-based failover could distribute traffic across multiple zones. However, this adds complexity and external dependencies.

  • Manual Configuration of Traefik Instances:
    Manually deploying multiple Traefik services in the Swarm and managing their configurations independently, which can become cumbersome and error-prone.

  • Single Availability Zone Deployment:
    Accepting the current limitation and relying on a single zone, which increases downtime risk and may not be suitable for large or mission-critical projects.

Additional context

By enabling multiple Traefik replicas, Dokploy could better serve large-scale or high-availability environments. This feature would help ensure uninterrupted access to services, improve resilience against zone-level failures, and simplify traffic management in multi-zone infrastructures. It would also align Dokploy with modern best practices for fault tolerance and high availability in distributed applications.

Will you send a PR to implement it?

No

mcandylab added the enhancement label on Dec 17, 2024
@kamellperry

I came here to submit a similar issue. We've been having frequent problems with our Traefik container going down and needing to be restarted. While it is down, all of our services are unreachable, even though they stay up and healthy the whole time.

I'm not too knowledgeable in this area, but I'd love to learn enough to contribute to this issue, or to get people much smarter than me working on it.
