Add propagation_timeout option to cloudamqp_custom_domain #155

Open
Marcus-James-Adams opened this issue May 26, 2022 · 4 comments
Labels
enhancement New feature or request


@Marcus-James-Adams

Issue

Currently, cloudamqp_custom_domain validates that a CNAME record for the hostname exists before it creates the custom domain. This is a good security posture, since it prevents spoofing.

However, if your Terraform code looks like the following, the apply may fail because the CNAME record has not yet propagated.

Example code block

resource "azurerm_dns_cname_record" "instance" {
  name                = "mycluster"
  zone_name           = "mydomain.com"
  resource_group_name = data.azurerm_resource_group.example.name
  ttl                 = 60
  record              = cloudamqp_instance.instance.host
}
resource "cloudamqp_custom_domain" "settings" {
  instance_id = cloudamqp_instance.instance.id
  hostname    = "mycluster.mydomain.com"
}

Error received

cloudamqp_custom_domain.settings: Creating...
╷
│ Error: CreateCustomDomain failed, status: 400, message: map[errors:[map[error:Provided hostname does not point to this RabbitMQ cluster]]]
│
│   with cloudamqp_custom_domain.settings,
│   on cloudamqp.tf line 18, in resource "cloudamqp_custom_domain" "settings":
│   18: resource "cloudamqp_custom_domain" "settings" {
│

I would like to suggest adding an optional setting to cloudamqp_custom_domain that lets you specify a propagation timeout.
During this period, the provider would wait 10 seconds and then retry the hostname validation check:

  • If the hostname resolves, it continues on to create the SSL certificate and attach it to the server.
  • If it does not resolve, it waits another 10 seconds and tries again.
  • If the propagation timeout is exceeded, it fails with the error above.

Example resource setup

resource "cloudamqp_custom_domain" "settings" {
  instance_id = cloudamqp_instance.instance.id
  hostname    = "mycluster.mydomain.com"
  propagation_timeout = 600 // <--- Time in seconds; optional, defaults to 60 (1 minute) if not set
}

dentarg added the enhancement label May 27, 2022
@dentarg
Member

dentarg commented May 27, 2022

I'm hesitant to add this because I'm not sure it will properly address the issue due to negative caching: https://serverfault.com/questions/426807/how-long-does-negative-dns-caching-typically-last

The CloudAMQP backend powering the API might cache the NXDOMAIN response for your custom domain for a while, so the timeout you specify might not be enough. It would be better if the user ensured the record exists publicly before using the API/provider resource.

As an example, with AWS Route53 there's an API ChangeInfo that can be queried for any DNS change. (I tried looking for a corresponding API with Azure DNS but I did not find any.)

Another complicating factor is that the API can return multiple errors (e.g. lack of CAA record on your domain) and not all should be retried.
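
To illustrate that last point, here is a rough Go sketch of how a retry decision could tell the DNS error apart from others. It is an assumption-laden example, not provider code: the JSON shape is inferred from the `map[errors:[map[error:...]]]` message in the error output above, and `apiErrors`, `dnsNotReady`, and `shouldRetry` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// apiErrors mirrors the assumed response body {"errors":[{"error":"..."}]}.
type apiErrors struct {
	Errors []struct {
		Error string `json:"error"`
	} `json:"errors"`
}

// dnsNotReady is the message seen in the error output earlier in this issue.
const dnsNotReady = "Provided hostname does not point to this RabbitMQ cluster"

// shouldRetry reports whether every error in the body is the DNS message.
// Unparseable bodies, empty error lists, and any other error are not retried.
func shouldRetry(body []byte) bool {
	var resp apiErrors
	if err := json.Unmarshal(body, &resp); err != nil || len(resp.Errors) == 0 {
		return false
	}
	for _, e := range resp.Errors {
		if !strings.Contains(e.Error, dnsNotReady) {
			return false
		}
	}
	return true
}

func main() {
	dns := []byte(`{"errors":[{"error":"Provided hostname does not point to this RabbitMQ cluster"}]}`)
	other := []byte(`{"errors":[{"error":"some other validation failure"}]}`)
	fmt.Println(shouldRetry(dns))   // prints "true"
	fmt.Println(shouldRetry(other)) // prints "false"
}
```

Matching on message text is fragile; a stable error code from the API, if one exists, would be a better key.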

@Marcus-James-Adams
Author

Marcus-James-Adams commented May 27, 2022 via email

@dentarg
Member

dentarg commented Oct 13, 2023

We have many resources (https://github.com/search?q=repo%3Acloudamqp%2Fterraform-provider-cloudamqp%20retries&type=code) that have sleep and timeout attributes:

"sleep": {
  Type:        schema.TypeInt,
  Optional:    true,
  Default:     60,
  Description: "Configurable sleep time in seconds between retries for RabbitMQ configuration",
},
"timeout": {
  Type:        schema.TypeInt,
  Optional:    true,
  Default:     3600,
  Description: "Configurable timeout time in seconds for RabbitMQ configuration",
},

So sure, the cloudamqp_custom_domain could have that too

@arthmoeros
Contributor

arthmoeros commented May 3, 2024

I'll leave this here: I had the same issue and used HashiCorp's time_sleep resource like this:

resource "cloudamqp_custom_domain" "instance_custom_domain" {
  instance_id = cloudamqp_instance.instance.id
  hostname    = cloudflare_record.record.hostname

  depends_on = [ time_sleep.record_creation_delay ]
}

resource "time_sleep" "record_creation_delay" {
  depends_on = [cloudflare_record.record]

  create_duration = "30s"
}

resource "cloudflare_record" "record" {
  name    = local.mq_dns_record
  proxied = false
  type    = "CNAME"
  value   = cloudamqp_instance.instance.host
  zone_id = var.cloudflare_zone_id
  tags    = ["unable-to-proxy"]
}

It may be useful as a workaround.
