Nomad server retry_join gives up after a single discovery failure
#24560
Labels
hcc/jira
stage/accepted
Confirmed, and intend to work on. No timeline commitment though.
theme/discovery
type/bug
Nomad version
Operating system and Environment details
Fedora 40
Issue
Nomad server retry_join gives up after a single discovery failure.
Reproduction steps
Use retry_join with provider=aws inside a VPC while the EC2 VPC endpoint is still provisioning.
Expected Result
The process retries until it succeeds or exhausts the configured number of retries.
Actual Result
The process gave up after a single discovery failure (see logs below).
We believe this is related to #18745 and the addition of a return in command/agent/retry_join.go. Prior to this change, such a failure would not cause it to give up (i.e. return).
Nomad Server config
Nomad Server logs