
[New Feature]: Preserve SPS S3 buckets throughout deployment lifecycle #272

Open
LucaCinquini opened this issue Jan 19, 2025 · 5 comments
Labels: enhancement (New feature or request), U-SPS

@LucaCinquini
Collaborator

LucaCinquini commented Jan 19, 2025

An SPS deployment stores data in 4 S3 buckets, for example:

  • unity-luca-1-dev-sps-airflowlogs
  • unity-luca-1-dev-sps-code
  • unity-luca-1-dev-sps-config
  • unity-luca-1-dev-sps-isl

We need to change the Terraform code so that, by default, these buckets are NOT destroyed when the cluster is taken down and are reused when the cluster is deployed again. Optionally, the user can set a variable to destroy and recreate the S3 buckets.
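
For illustration, the opt-in could be a boolean variable feeding the buckets' force_destroy flag. The variable name below is hypothetical, and force_destroy only controls whether a non-empty bucket may be deleted, so preserving the buckets across a destroy/recreate cycle still needs additional handling:

    # Hypothetical variable name, for illustration only.
    variable "destroy_s3_buckets" {
      description = "If true, the SPS S3 buckets (and their contents) are destroyed with the cluster; by default they are preserved."
      type        = bool
      default     = false
    }

    resource "aws_s3_bucket" "sps_code" {
      bucket = "unity-luca-1-dev-sps-code"
      # Only allow a non-empty bucket to be deleted when the user explicitly opts in.
      force_destroy = var.destroy_s3_buckets
    }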

@LucaCinquini LucaCinquini transferred this issue from unity-sds/unity-cs Jan 19, 2025
@LucaCinquini LucaCinquini added the enhancement New feature or request label Jan 19, 2025
@LucaCinquini LucaCinquini moved this from Todo to In Progress in Unity Project Board Jan 23, 2025
@LucaCinquini LucaCinquini assigned nikki-t and unassigned jpl-btlunsfo Jan 26, 2025
@nikki-t
Collaborator

nikki-t commented Jan 28, 2025

I dug into this issue a little bit and haven't really come up with a satisfactory solution. Here is what I explored:

Solution 1

"lifecycle" meta-argument: https://developer.hashicorp.com/terraform/language/meta-arguments/lifecycle

  • Set prevent_destroy to true: Terraform will then reject with an error any plan that would destroy the infrastructure object associated with the resource (see the sketch after this list).
  • This throws an error so the destroy would not be completed successfully. This seems like it will not work for our use case as we would want to complete a destroy action successfully AND retain the S3 bucket.
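
For reference, a minimal example of this meta-argument applied to one of the SPS buckets (using the bucket name from the issue description) would be:

    resource "aws_s3_bucket" "airflow_logs" {
      bucket = "unity-luca-1-dev-sps-airflowlogs"

      lifecycle {
        # Any plan that would destroy this bucket is rejected with an error,
        # so a full `terraform destroy` fails rather than skipping the bucket.
        prevent_destroy = true
      }
    }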

Solution 2

Run a script before destroy that creates a new temporary bucket and copies the data over. Then run a script after create that copies the data into the new S3 bucket and removes the old temporary bucket.

  • Have to be careful to manage temporary buckets with a reliable naming convention; otherwise we risk losing the bucket and/or generating a lot of temporary buckets. Also need to track individual deployments.
  • Might work best, as it preserves the data in a bucket outside of Terraform state, so Terraform does not need to manage it.
  • Terraform provisioners can be used to manage when the data is copied (see the sketch after this list).
  • May want to expose the temporary bucket name as an output to keep a record of it.
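
A rough sketch of the destroy-side copy, assuming a hypothetical naming convention that appends "-preserve" to the bucket name (destroy-time provisioners may only reference self, so the needed values are captured in triggers):

    resource "null_resource" "backup_airflow_logs" {
      # Destroy-time provisioners can only reference self, so capture what we need here.
      triggers = {
        source_bucket = aws_s3_bucket.airflow_logs.id
        temp_bucket   = "${aws_s3_bucket.airflow_logs.id}-preserve"
      }

      provisioner "local-exec" {
        when = destroy
        # Create the temporary bucket (ignore "already exists") and copy the data out.
        command = "aws s3 mb s3://${self.triggers.temp_bucket} || true; aws s3 sync s3://${self.triggers.source_bucket} s3://${self.triggers.temp_bucket}"
      }
    }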

Solution 3

Remove the S3 bucket from Terraform state before destroying, and import the S3 bucket into Terraform state when creating.

  • I tried to implement this as a proof of concept, since it could be modified to support solution 2:
    resource "aws_s3_bucket" "airflow_logs" {
    bucket = format(local.resource_name_prefix, "airflowlogs")
    force_destroy = true
    tags = merge(local.common_tags, {
    Name = format(local.resource_name_prefix, "airflowlogs")
    Component = "airflow"
    Stack = "airflow"
    })
    depends_on = [ null_resource.import_s3_bucket ]
    }
    resource "null_resource" "import_s3_bucket" {
    provisioner "local-exec" {
    command = "${path.module}/../../scripts/preserve_s3.sh"
    when = create
    environment = {
    PRESERVE_ACTION = "import"
    BUCKET_RESOURCE = "module.unity-sps-airflow.aws_s3_bucket.airflow_logs"
    BUCKET_NAME = format(local.resource_name_prefix, "airflowlogs")
    TF_VAR_airflow_webserver_password = var.airflow_webserver_password
    TF_VAR_kubeconfig_filepath = var.kubeconfig_filepath
    TF_VAR_venue = var.venue
    }
    }
    }
    resource "null_resource" "remove_s3_bucket" {
    provisioner "local-exec" {
    command = "${path.module}/../../scripts/preserve_s3.sh"
    when = destroy
    environment = {
    PRESERVE_ACTION = "remove"
    BUCKET_RESOURCE = "module.unity-sps-airflow.aws_s3_bucket.airflow_logs"
    }
    }
    depends_on = [ aws_s3_bucket.airflow_logs ]
    }
  • I initially ran into a race condition when the local-exec provisioners were defined directly on the S3 resource, but was able to define two null_resource objects to control when each local-exec provisioner is called.
  • The bucket was still deleted despite removing it from state, and it seems like bad practice to alter Terraform state during destroy operations. It does look like the S3 bucket was destroyed after the preserve_s3.sh script was called.

I don't think it would take too long to implement solution 2, and it seems like the best of the three. But I feel like I might be missing something obvious. @LucaCinquini and @jpl-btlunsfo, any ideas?

@LucaCinquini
Collaborator Author

Hi @nikki-t: thanks for investigating. I agree that solutions 1 and 3 are no-gos, while 2 seems the most reasonable. I wonder if we can leverage AWS backup and restoration (via Terraform, or manually). Let us investigate some more.
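
If AWS Backup turns out to be viable, a minimal Terraform wiring might look like the sketch below (names are illustrative, and the backup IAM role with the AWS-managed S3 backup policies is assumed to exist elsewhere):

    resource "aws_backup_vault" "sps" {
      name = "sps-s3-backup-vault"
    }

    resource "aws_backup_plan" "sps" {
      name = "sps-s3-backup-plan"

      rule {
        rule_name         = "daily-s3-backup"
        target_vault_name = aws_backup_vault.sps.name
        schedule          = "cron(0 5 * * ? *)"
      }
    }

    resource "aws_backup_selection" "sps_buckets" {
      name         = "sps-s3-buckets"
      plan_id      = aws_backup_plan.sps.id
      iam_role_arn = aws_iam_role.backup.arn # assumed to exist with the S3 backup policies attached
      resources    = [aws_s3_bucket.airflow_logs.arn]
    }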

@mike-gangl

mike-gangl commented Jan 28, 2025

I wonder if it makes more sense to use a single bucket + prefixes for most of these items (code, config, logs?) and have those be requirements on a deployment, e.g. "bring your own bucket" as a variable in a deployment? This is pretty much how AWS does its managed services (choose a bucket...) before deployment.

This way the bucket and its contents live outside of the airflow lifecycle itself.
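
A rough sketch of what a "bring your own bucket" input could look like on the Terraform side (variable and prefix names are illustrative):

    # The bucket is created and owned outside the SPS deployment; only its name is passed in.
    variable "sps_bucket_name" {
      description = "Pre-existing S3 bucket used for SPS code, config, and Airflow logs."
      type        = string
    }

    # Look up the externally managed bucket; it is never part of this deployment's destroy plan.
    data "aws_s3_bucket" "sps" {
      bucket = var.sps_bucket_name
    }

    locals {
      airflow_logs_prefix = "s3://${data.aws_s3_bucket.sps.id}/airflowlogs/"
      code_prefix         = "s3://${data.aws_s3_bucket.sps.id}/code/"
      config_prefix       = "s3://${data.aws_s3_bucket.sps.id}/config/"
    }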

@LucaCinquini
Collaborator Author

We certainly want to reduce the number of buckets, and BYOB might be a good idea, actually.

@rtapella

Yeah, I think having the buckets be pre-conditions rather than "in" the Airflow Terraform could help? I'm not sure if it's BYOB for the Airflow metadata… BYOB makes sense for the data itself. Is SPS-code for deployed apps, or is it something that manages the Airflow guts?

Would these buckets ever get large, like tens or hundreds of GB?
