Enable Multiple Traefik Replicas Across Different Availability Zones for Enhanced High Availability #920

mcandylab · 2024-12-17T10:33:53Z

What problem will this feature address?

Currently, there can only be a single Dokploy service running a single instance of Traefik. While adding additional workers in different availability zones allows containers to be replicated across the cluster, all traffic routing still depends on the single Traefik instance located on the primary server. If the primary server becomes unavailable, the entire setup loses external accessibility, rendering all containers in other zones inaccessible despite their healthy state. This creates a single point of failure.

Describe the solution you'd like

I would like the ability to run multiple Traefik replicas within a Dokploy-managed Swarm environment. Each replica could be placed in a different availability zone, ensuring that if one zone or server goes down, another Traefik instance can continue to route traffic to the containers still running in other zones. Ideally, Dokploy would seamlessly configure and maintain these Traefik replicas, balancing traffic and ensuring continuous availability without manual intervention.
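To make the failure scenario concrete, here is a rough toy model of the idea. It is not Dokploy or Traefik code: the node names, zone names, and both helper functions (`spread_replicas`, `ingress_available`) are made up for illustration. It only shows why a proxy pinned to the primary server loses all ingress when that zone fails, while replicas spread across zones do not.

```python
from collections import defaultdict


def spread_replicas(nodes, replicas):
    """Place `replicas` proxy instances on nodes, cycling across zones first.

    `nodes` maps a (hypothetical) node name to its zone name.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    by_zone = defaultdict(list)
    for name, zone in nodes.items():
        by_zone[zone].append(name)
    zones = sorted(by_zone)
    placement = []
    for i in range(replicas):
        # Round-robin over zones so every zone gets a replica before any
        # zone gets a second one.
        zone = zones[i % len(zones)]
        zone_nodes = sorted(by_zone[zone])
        placement.append(zone_nodes[(i // len(zones)) % len(zone_nodes)])
    return placement


def ingress_available(placement, nodes, failed_zone):
    """Return True if at least one proxy replica sits outside the failed zone."""
    return any(nodes[name] != failed_zone for name in placement)


# Hypothetical three-zone cluster; the primary server lives in zone-a.
NODES = {"manager-1": "zone-a", "worker-1": "zone-b", "worker-2": "zone-c"}

single = ["manager-1"]             # today: one Traefik on the primary server
spread = spread_replicas(NODES, 3)  # requested: one replica per zone

print(ingress_available(single, NODES, "zone-a"))
print(ingress_available(spread, NODES, "zone-a"))
```

In a real Swarm, I assume this kind of spreading would be expressed with a placement preference such as `spread=node.labels.zone` on the Traefik service, with each node carrying a `zone` label (the label name is my assumption); the model above says nothing about how Dokploy would keep the replicas' configuration in sync.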

Describe alternatives you've considered

External Load Balancers: Using an external load balancer or DNS-based failover could distribute traffic across multiple zones. However, this adds complexity and external dependencies.
Manual Configuration of Traefik Instances: Manually deploying multiple Traefik services in the Swarm and managing their configurations independently, which can become cumbersome and error-prone.
Single Availability Zone Deployment: Accepting the current limitation and relying on a single zone, which increases downtime risk and may not be suitable for large or mission-critical projects.

Additional context

By enabling multiple Traefik replicas, Dokploy could better serve large-scale or high-availability environments. This feature would help ensure uninterrupted access to services, improve resilience against zone-level failures, and simplify traffic management in multi-zone infrastructures. It would also align Dokploy with modern best practices for fault tolerance and high availability in distributed applications.

Will you send a PR to implement it?

No

kamellperry · 2024-12-18T18:57:25Z

I came to submit an issue similar to this. We've been experiencing many issues with our Traefik container going down and needing to spin back up. Each time, all of our services become unreachable even though they stay up and healthy.

I'm not too knowledgeable in this realm, but I'd love to learn enough to contribute to this issue, or to get people much smarter than me on it.

mcandylab added the "enhancement" label (New feature or request) · Dec 17, 2024
