Cluster | Nomad Regions | Nomad Datacenters | Nomad Servers | Nomad Clients | Nomad Allocations | Nomad Total CPU | Nomad Total Memory | Consul Datacenters |
---|---|---|---|---|---|---|---|---|
Development | 1 | 1 | 3 | 2 | 27 | 9 GHz | 16 GiB | 1 |
Staging/UAT | 2 | 4 | 9 | 7 | 190 | 200 GHz | 240 GiB | 2 |
Production | 2 | 3 | 9 | 17 | 410 | 2655 GHz | 1873 GiB | 2 |
- OIDC auth
- Staging/UAT and Production clusters
- Consul WAN federated datacenters
- Nomad Multi-region federation
- Ubuntu (22.04)
- servers/masters -> Nomad + Consul + Unbound (service.consul) + Traefik + Keepalived (vm)
- keepalived for centralized UI and APP differentiation.
- example
- nomad-development.company.com IN A -> nomad-development-keepalived-VIP
- *.nomad-development.company.com IN A -> nomad-development-keepalived-VIP
- nomad-staging.company.com IN A -> nomad-staging-keepalived-VIP
- *.nomad-staging.company.com IN A -> nomad-staging-keepalived-VIP
- nomad-production.company.com IN A -> nomad-production-keepalived-VIP
- *.nomad-production.company.com IN A -> nomad-production-keepalived-VIP
- why?
- OIDC / OAuth providers expect a consistent callback URL, so when you log in to the Nomad UI with OIDC and a server fails, users can keep using the same login URL.
- in jobspecs, you can use templated Traefik rules, so that a staging job is automatically routed to jobname.nomad-staging.company.com
- if you use Let's Encrypt (natively supported by Traefik), there is less hassle with SSL certificates.
- of course, traefik also supports including custom certificates if preferred.
- developers can work without too much manual interference needed.
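
The per-environment wildcard records above make job hostnames predictable. As a rough sketch of how a templating step could derive the hostname and a Traefik router rule for a job: the helper names (`job_hostname`, `traefik_tags`) are hypothetical, and the tag layout assumes Traefik v2-style router tags on the registered service, which this document does not confirm.

```python
# Hypothetical helpers; only the domain names come from the DNS records above.
ENV_DOMAINS = {
    "development": "nomad-development.company.com",
    "staging": "nomad-staging.company.com",
    "production": "nomad-production.company.com",
}


def job_hostname(job_name: str, environment: str) -> str:
    """Return the hostname a job gets under the environment's wildcard record."""
    return f"{job_name}.{ENV_DOMAINS[environment]}"


def traefik_tags(job_name: str, environment: str) -> list:
    """Build service tags for a Traefik router (assumed v2-style tag syntax)."""
    host = job_hostname(job_name, environment)
    return [
        "traefik.enable=true",
        f"traefik.http.routers.{job_name}.rule=Host(`{host}`)",
    ]


if __name__ == "__main__":
    for tag in traefik_tags("myapp", "staging"):
        print(tag)
```

Because only the environment name changes, the same job definition can be promoted from staging to production without editing the routing rule by hand.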
- Clients/Workers -> Nomad/Consul + Dockerd
- DNS resolvers set to masters (service.consul)
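
With the resolvers pointed at the servers, clients can look up services through Consul's DNS interface. A minimal sketch of how such names are composed, following Consul's `[tag.]<service>.service[.datacenter].<domain>` lookup format; the function name and the example service, tag, and datacenter names are illustrative assumptions.

```python
from typing import Optional


def consul_dns_name(
    service: str,
    datacenter: Optional[str] = None,
    tag: Optional[str] = None,
    domain: str = "consul",
) -> str:
    """Compose a Consul service lookup name: [tag.]<service>.service[.dc].<domain>."""
    labels = []
    if tag:
        labels.append(tag)
    labels += [service, "service"]
    if datacenter:
        labels.append(datacenter)
    labels.append(domain)
    return ".".join(labels)


if __name__ == "__main__":
    # "grafana" and "dc2" are placeholder names, not taken from this setup.
    print(consul_dns_name("grafana"))
    print(consul_dns_name("grafana", datacenter="dc2"))
```

On a client whose resolver forwards `service.consul` queries, something like `socket.getaddrinfo(consul_dns_name("grafana"), 3000)` would then return the addresses of healthy instances, assuming a service is registered under that name and port.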
- Job deployments through GitLab CI/CD
- custom templating on the GitLab side and on the Nomad side with Levant
- may indicate jobs that do not run exactly on all environments.
- infrastructure/devops system jobs
- infrastructure/devops service jobs
- oauth2proxy
- custom traefik errorpage
- elastic logstash
- loki
- plugin-nfs-controller
- plugin-smb-controller
- nomad-autoscaler
- vault
- also used to template secrets: with the setup above, job specs can stay the same across environments, and the environment selects production / staging / development secrets (each environment has its own Vault)
- kv
- totp
- ssh
- database
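
To illustrate why environment-specific Vaults let job specs stay identical, here is a small sketch that keeps the secret path fixed and varies only the Vault address. The addresses are hypothetical (they simply follow the wildcard hostname pattern used for other jobs), as are the mount and path names; the `/v1/<mount>/data/<path>` layout is the KV version 2 read endpoint.

```python
# Hypothetical per-environment Vault addresses; adjust to the real deployment.
VAULT_ADDRS = {
    "development": "https://vault.nomad-development.company.com",
    "staging": "https://vault.nomad-staging.company.com",
    "production": "https://vault.nomad-production.company.com",
}


def kv_v2_read_url(environment: str, mount: str, secret_path: str) -> str:
    """Return the KV v2 read URL for a secret in the given environment's Vault."""
    base = VAULT_ADDRS[environment]
    return f"{base}/v1/{mount}/data/{secret_path.strip('/')}"


if __name__ == "__main__":
    # Same mount and path everywhere; only the environment differs.
    for env in VAULT_ADDRS:
        print(kv_v2_read_url(env, "kv", "myapp/database"))
```

The job spec only ever refers to `kv` and `myapp/database`; which values come back depends on the Vault the job's environment talks to.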
- grafana
- API backend for prometheus configuration
- Ubuntu mirror
- docker-registry-cache
- monitoring stack
- alertmanager
- karma - unified alertmanager dashboard
- ms-teams handlers
- prometheus
- thanos
- exporters
- blackbox
- SSH probing
- VNC probing
- Consul API critical/warning state
- SMTP response
- various TCP probes
- URL monitoring
- certificate expiry monitoring
- DNS monitoring
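
As an illustration of what the certificate expiry probe measures, the following sketch reads a server certificate and computes the days left before it expires. It is not how the blackbox exporter is implemented (the exporter reports expiry itself, for example through its `probe_ssl_earliest_cert_expiry` metric); the function names and defaults here are assumptions for illustration only.

```python
import socket
import ssl
import time
from typing import Optional


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect with TLS and return the certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Convert a notAfter string such as 'Jan  5 09:34:43 2018 GMT' to days left."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


if __name__ == "__main__":
    # Requires network access; the hostname follows the naming used above.
    not_after = fetch_not_after("nomad-production.company.com")
    print(f"certificate expires in {days_until_expiry(not_after):.1f} days")
```

An alert rule would typically fire when the remaining days drop below a chosen threshold, leaving time to renew before clients start failing.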
- snmp-exporter
- postgres-exporter
- mysql-exporter
- clickhouse-exporter
- idrac-exporter
- smokeping-prober
- solace-exporter
- vmware-exporter
- rundeck-exporter
- graphite-exporter
- custom exporters
- blackbox