diff --git a/docs/getting-started/getting-started-aws/connect-an-account/finish-and-connect.png b/docs/getting-started/getting-started-aws/connect-an-account/finish-and-connect.png old mode 100755 new mode 100644 index c99c3b52..aeb289c8 Binary files a/docs/getting-started/getting-started-aws/connect-an-account/finish-and-connect.png and b/docs/getting-started/getting-started-aws/connect-an-account/finish-and-connect.png differ diff --git a/docs/getting-started/getting-started-aws/connect-an-account/index.md b/docs/getting-started/getting-started-aws/connect-an-account/index.md index 418970c8..3b0a6445 100644 --- a/docs/getting-started/getting-started-aws/connect-an-account/index.md +++ b/docs/getting-started/getting-started-aws/connect-an-account/index.md @@ -22,7 +22,7 @@ Switch back to the Guardrails console **Account Import** browser tab you opened ## Step 2: Update account details -Paste the role ARN you obtained from step 7 in the previous guide into the **IAM Role ARN** field. Also, enter the AWS account ID into the **Account ID** field. +Paste the role ARN you obtained from step 7 in the previous guide into the **IAM Role ARN** field and enter the AWS account ID into the **Account ID** field.

ready-to-connect

diff --git a/docs/getting-started/getting-started-aws/connect-an-account/ready-to-connect.png b/docs/getting-started/getting-started-aws/connect-an-account/ready-to-connect.png index b5ef5325..b98cb704 100644 Binary files a/docs/getting-started/getting-started-aws/connect-an-account/ready-to-connect.png and b/docs/getting-started/getting-started-aws/connect-an-account/ready-to-connect.png differ diff --git a/docs/getting-started/getting-started-aws/connect-an-account/set-parent-resource.png b/docs/getting-started/getting-started-aws/connect-an-account/set-parent-resource.png index aa8b8b39..46fbb3c0 100644 Binary files a/docs/getting-started/getting-started-aws/connect-an-account/set-parent-resource.png and b/docs/getting-started/getting-started-aws/connect-an-account/set-parent-resource.png differ diff --git a/docs/getting-started/getting-started-aws/prepare-account/download-cloudformation-template.png b/docs/getting-started/getting-started-aws/prepare-account/download-cloudformation-template.png new file mode 100644 index 00000000..86aa6bec Binary files /dev/null and b/docs/getting-started/getting-started-aws/prepare-account/download-cloudformation-template.png differ diff --git a/docs/getting-started/getting-started-aws/prepare-account/index.md b/docs/getting-started/getting-started-aws/prepare-account/index.md index a63bee39..423c9a0c 100644 --- a/docs/getting-started/getting-started-aws/prepare-account/index.md +++ b/docs/getting-started/getting-started-aws/prepare-account/index.md @@ -24,13 +24,17 @@ Login to your Guardrails console and select the **CONNECT** option from the home

locate-top-level-connect

+Select **AWS**. + +

locate-top-level-connect

+ ## Step 2: Download the CloudFormation template Guardrails needs an IAM role that grants permission to discover [resources](/guardrails/docs/reference/glossary#resource) in your account and to monitor changes via event handlers. The CloudFormation template downloaded in this step has the minimum permissions necessary to create that role. Select **AWS Account** from the left navigation and then click the blue **Download CloudFormation Template** button to download the CloudFormation template you will use to create the required IAM role in your AWS account. -

initial-connect-screen

+

initial-connect-screen

> [!IMPORTANT] > Leave this browser tab open while we do the next steps in a different tab. Closing and reopening this page will cause a new random ExternalID to be generated. diff --git a/docs/getting-started/getting-started-aws/prepare-account/initial-connect-screen.png b/docs/getting-started/getting-started-aws/prepare-account/initial-connect-screen.png old mode 100755 new mode 100644 index 13660c55..e6013d62 Binary files a/docs/getting-started/getting-started-aws/prepare-account/initial-connect-screen.png and b/docs/getting-started/getting-started-aws/prepare-account/initial-connect-screen.png differ diff --git a/docs/getting-started/getting-started-aws/prepare-account/locate-top-level-connect.png b/docs/getting-started/getting-started-aws/prepare-account/locate-top-level-connect.png old mode 100755 new mode 100644 index d26a51ba..ab7fcd65 Binary files a/docs/getting-started/getting-started-aws/prepare-account/locate-top-level-connect.png and b/docs/getting-started/getting-started-aws/prepare-account/locate-top-level-connect.png differ diff --git a/docs/getting-started/getting-started-azure/connect-subscription/connect-1.png b/docs/getting-started/getting-started-azure/connect-subscription/connect-1.png index 56150244..e60902cf 100644 Binary files a/docs/getting-started/getting-started-azure/connect-subscription/connect-1.png and b/docs/getting-started/getting-started-azure/connect-subscription/connect-1.png differ diff --git a/docs/getting-started/getting-started-azure/connect-subscription/connect-2.png b/docs/getting-started/getting-started-azure/connect-subscription/connect-2.png index 81dcf65d..de7c8cd1 100644 Binary files a/docs/getting-started/getting-started-azure/connect-subscription/connect-2.png and b/docs/getting-started/getting-started-azure/connect-subscription/connect-2.png differ diff --git a/docs/getting-started/getting-started-azure/connect-subscription/connect-3.png 
b/docs/getting-started/getting-started-azure/connect-subscription/connect-3.png index 7713bd65..1e5f954e 100644 Binary files a/docs/getting-started/getting-started-azure/connect-subscription/connect-3.png and b/docs/getting-started/getting-started-azure/connect-subscription/connect-3.png differ diff --git a/docs/getting-started/getting-started-azure/connect-subscription/index.md b/docs/getting-started/getting-started-azure/connect-subscription/index.md index 14600a1a..fa275291 100644 --- a/docs/getting-started/getting-started-azure/connect-subscription/index.md +++ b/docs/getting-started/getting-started-azure/connect-subscription/index.md @@ -21,6 +21,10 @@ Login to your Guardrails console and select the **CONNECT** option from the home

login

+Select **Azure**. + +

login

+ ## Step 2: Select Azure Subscription

connect-1

diff --git a/docs/getting-started/getting-started-azure/connect-subscription/login.png b/docs/getting-started/getting-started-azure/connect-subscription/login.png index bf9ac43e..ab7fcd65 100644 Binary files a/docs/getting-started/getting-started-azure/connect-subscription/login.png and b/docs/getting-started/getting-started-azure/connect-subscription/login.png differ diff --git a/docs/getting-started/getting-started-azure/connect-subscription/select-azure.png b/docs/getting-started/getting-started-azure/connect-subscription/select-azure.png new file mode 100644 index 00000000..7d62abae Binary files /dev/null and b/docs/getting-started/getting-started-azure/connect-subscription/select-azure.png differ diff --git a/docs/getting-started/getting-started-gcp/connect-project/locate-top-level-connect.png b/docs/getting-started/getting-started-gcp/connect-project/locate-top-level-connect.png old mode 100755 new mode 100644 index ad1c0834..ab7fcd65 Binary files a/docs/getting-started/getting-started-gcp/connect-project/locate-top-level-connect.png and b/docs/getting-started/getting-started-gcp/connect-project/locate-top-level-connect.png differ diff --git a/docs/guides/configuring-guardrails/activity-retention/check-setting.png b/docs/guides/configuring-guardrails/activity-retention/check-setting.png new file mode 100644 index 00000000..aa87ec52 Binary files /dev/null and b/docs/guides/configuring-guardrails/activity-retention/check-setting.png differ diff --git a/docs/guides/configuring-guardrails/activity-retention/index.md b/docs/guides/configuring-guardrails/activity-retention/index.md new file mode 100644 index 00000000..623d3bf4 --- /dev/null +++ b/docs/guides/configuring-guardrails/activity-retention/index.md @@ -0,0 +1,67 @@ +--- +title: Workspace Activity Retention +sidebar_label: Workspace Activity Retention +--- + +# Configuring Workspace Activity Retention + +In this guide, you will: + +- Set up the Guardrails *Turbot > Workspace > Retention > Activity 
Retention* policy to manage the lifecycle of activity records associated with resources or controls. +- Understand how activity retention impacts storage usage and workspace performance. + +Guardrails' [Activity Retention](https://hub.guardrails.turbot.com/mods/turbot/policies/turbot/activityRetention) policy controls how long activity records, such as actions, events, or notifications, are kept within your workspace. Properly configured retention periods optimize storage, enhance performance, and satisfy compliance and auditing requirements. + +## Prerequisites + +- **Turbot/Admin** permissions at the Turbot resource level. +- Familiarity with the Guardrails console. + +## Step 1: Navigate to the Policy + +Log in to the Guardrails console using your local credentials or via SAML-based login. Select **Policies** from the top navigation menu, then search for the policy named `Turbot > Workspace > Retention > Activity Retention`. + +![Navigate to Policies](/images/docs/guardrails/guides/configuring-guardrails/activity-retention/navigate-to-policies.png) + +Click **New Policy Setting** in the top-right corner of the policy details page. + +## Step 2: Configure the Policy + +Select the **Resource** or **Folder** at which you wish to set the retention policy. + +> [!IMPORTANT] +> It is recommended to set this policy at the `Turbot` (root) level. Applying this policy at a lower level (e.g., an individual folder or resource) may result in errors such as: +> +> ``` +> Error creating policy setting +> Internal Error: Create failed: Resource (aaa - Customer Simulated Environments) is not valid. Aborting policy setting create. +> ``` + +Under **Settings**, choose the appropriate retention period based on your organization's operational needs. Refer to [Retention Options](#retention-options) for details. 
+ +![New Policy Setting](/images/docs/guardrails/guides/configuring-guardrails/activity-retention/new-policy-setting.png) + +Click **Update** to save the new policy setting. + +## Step 3: Review + +- [ ] Return to the **Policies** tab and confirm the policy has been correctly applied by verifying the **Current Setting**. + +![Verify Policy](/images/docs/guardrails/guides/configuring-guardrails/activity-retention/verify-activity-retention-policy.png) + +## Retention Options + +1. **30 days**: For maximum UI performance. +2. **60 days**: A balanced approach recommended for most environments. +3. **90 days (default)**: Default retention duration. +4. **180, 365 days**: Suitable for meeting compliance requirements or specific organizational needs. + +## Next Steps + +- Explore additional ways to [Configure Guardrails](/guardrails/docs/guides/configuring-guardrails). + +## Troubleshooting + +| Issue | Description | Guide | +| ------------------------------ | ------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------- | +| Permission Issues | Confirm that you have the necessary (`Turbot/Admin`) permissions to modify policies. | [Permissions & Roles](/guardrails/docs/concepts/iam/permissions#permissions) | +| Further Assistance | If you continue to encounter issues, please open a ticket with us and attach the relevant information to assist you more efficiently. 
| [Open Support Ticket](https://support.turbot.com) | diff --git a/docs/guides/configuring-guardrails/activity-retention/navigate-to-policies.png b/docs/guides/configuring-guardrails/activity-retention/navigate-to-policies.png new file mode 100644 index 00000000..4181bbcc Binary files /dev/null and b/docs/guides/configuring-guardrails/activity-retention/navigate-to-policies.png differ diff --git a/docs/guides/configuring-guardrails/activity-retention/new-policy-setting.png b/docs/guides/configuring-guardrails/activity-retention/new-policy-setting.png new file mode 100644 index 00000000..35364886 Binary files /dev/null and b/docs/guides/configuring-guardrails/activity-retention/new-policy-setting.png differ diff --git a/docs/guides/configuring-guardrails/activity-retention/verify-activity-retention-policy.png b/docs/guides/configuring-guardrails/activity-retention/verify-activity-retention-policy.png new file mode 100644 index 00000000..dddba558 Binary files /dev/null and b/docs/guides/configuring-guardrails/activity-retention/verify-activity-retention-policy.png differ diff --git a/docs/guides/configuring-guardrails/index.md b/docs/guides/configuring-guardrails/index.md index 9a95078a..54e8b252 100644 --- a/docs/guides/configuring-guardrails/index.md +++ b/docs/guides/configuring-guardrails/index.md @@ -8,6 +8,7 @@ This section provides how-to guides for common tasks that will help you effectiv | Guide | Description | | - | - +| [Workspace Activity Retention](guides/configuring-guardrails/activity-retention) | Learn how to set up Workspace activity retention in Guardrails console | [Install Mod](guides/configuring-guardrails/install-mod) | Learn how to install mod in Guardrails console | [Update Mod](guides/configuring-guardrails/update-mod) | Learn how to uninstall mod in Guardrails console | [Uninstall Windows](guides/configuring-guardrails/uninstall-mod) | Learn how to update mod in Guardrails console diff --git 
a/docs/guides/hosting-guardrails/disaster-recovery/architecture-options/index.md b/docs/guides/hosting-guardrails/disaster-recovery/architecture-options/index.md new file mode 100644 index 00000000..9ce0ec5b --- /dev/null +++ b/docs/guides/hosting-guardrails/disaster-recovery/architecture-options/index.md @@ -0,0 +1,102 @@ +--- +title: Architecture Options +sidebar_label: Architecture Options +--- + +# Architecture Options + +In this guide, you will: + +- Explore architectural considerations for deploying Turbot Guardrails. +- Understand different options available based on organizational risk and availability requirements. + + +Turbot Guardrails is a comprehensive governance platform that automates discovery, compliance, security, and operational remediation tasks across cloud environments. Due to its critical role as a security and compliance control plane, it's essential to configure Guardrails with high availability and disaster recovery in mind. + +This document outlines various architectural options to help you select an approach aligned with your organization's specific high availability (HA) and disaster recovery (DR) needs, based on your risk tolerance and operational requirements. 
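When comparing the availability targets in the tier table, it can help to translate a percentage into permitted downtime per year. The sketch below is illustrative only: `downtime_hours_per_year` is a hypothetical helper name, and it assumes a 365-day year (8,760 hours).

```shell
# Hypothetical helper: convert an availability percentage into the
# maximum downtime per year, assuming a 365-day year (8760 hours).
downtime_hours_per_year() {
  awk -v pct="$1" 'BEGIN { printf "%.2f\n", (100 - pct) / 100 * 8760 }'
}

downtime_hours_per_year 99     # Tier 1 availability target
downtime_hours_per_year 99.9   # Tier 2 and Tier 3 availability target
downtime_hours_per_year 99.99  # Tier 4 availability target
```

Under these assumptions, 99% corresponds to roughly 87.6 hours of downtime per year, 99.9% to about 8.76 hours, and 99.99% to under one hour.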
+ + +| Tier | Account | Region | Availability Zone | Availability | RTO | RPO | Use Cases | +|----------|---------------|-----------------|-------------------|--------------|-----|-----|----------------------------------------------| +| Tier 1 | Single-account | Single-region | Single-AZ | 99% | 4 hr | 4 hr | Development and non-prod environments | +| Tier 2 | Single-account | Single-region | Multi-AZ | 99.9% | 4 hr | 4 hr | Production without rapid DR requirements | +| Tier 3 | Single-account | Multi-region | Multi-AZ | 99.9% | 2 hr | 2 hr | Production requiring rapid DR | +| Tier 4 | Multi-account | Multi-region | Multi-AZ | 99.99% | 0 hr | 0 hr | Mandated zero downtime DR | + + + +## Tier 1: Development + +**Key Characteristics**: Single-account, single-region, single availability zone. + +This deployment option is appropriate for non-production and development workspaces, where high availability and disaster recovery are not important for the accounts monitored by Guardrails. + +This is the lowest-cost infrastructure deployment option available. + +![Tier 1 DR Architecture](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-1.png) + +This deployment uses one primary RDS instance without a failover configuration. Recovery can be performed from RDS point-in-time backups. + +## Tier 2: High Availability + +**Key Characteristics**: Single-account, single-region, multi-availability zone. + +This deployment option is appropriate for all production usage. It is the most cost-effective deployment option for production use cases and can achieve a 4-hour RPO/RTO in all circumstances except the loss of an entire AWS Region. + +![Tier 2 DR Architecture](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-2.png) + +The changes in this deployment compared with the **Tier 1** architecture are: + +1. The ECS compute cluster is deployed across multiple availability zones. +2. 
Lambda functions are deployed across multiple availability zones. +3. An RDS failover instance is deployed in a second availability zone. +4. An ElastiCache failover instance is deployed in a second availability zone. + +## Tier 3: Multi-Region + +**Key Characteristics**: Single-account, multi-region, multi-availability zone. + +This deployment option is appropriate when regulatory requirements demand that a multi-region solution be implemented, or when requirements call for an RTO/RPO of less than 4 hours. It has the benefit of being resilient to the loss of an entire AWS Region. + +![Tier 3 DR Architecture](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-3.png) + +The key difference in this deployment is that a second Turbot Guardrails deployment is created in the standby region. The compute cluster is kept dormant, and no inbound events are received by the cluster. On declaration of a disaster, DNS is changed to send events to this region, while the database is recovered from a cross-region RDS snapshot. Once the DB is recovered, the workspace is enabled, and events start processing from the queue. + +To use this pattern, [cross-region RDS backups](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReplicateBackups.html) must be configured in this account to ensure the DB can be restored in the target region without access to KMS in the primary region. This option also requires AWS API Gateway, a public DNS endpoint, and an SSL certificate to allow redirection of inbound real-time events between regions. + +## Tier 4: Multi-Account + +**Key Characteristics**: Multi-account, multi-region, multi-availability zone. + +The **Tier 4** deployment option should be considered for any organization with zero RTO/RPO requirements. This deployment option allows for instantaneous failover between two active Guardrails environments. 
The “Change Window” feature of Guardrails is used to prevent one of the environments from executing any enforcements. Upon declaration of an emergency, the standby environment's change window can be removed, allowing that environment to become the primary and enforce changes. + +In normal day-to-day operation, both environments consume cloud events and maintain independent CMDB databases. This pattern doubles both the infrastructure costs and the per-control usage costs for Guardrails. + +![Tier 4 DR Architecture](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-4.png) + +Care must be taken in this configuration to ensure that policy packs and account onboarding/offboarding are applied across both environments in tandem, using the Guardrails Terraform provider to maintain consistency between the deployments. Custom scripting may be necessary to periodically check that both environments are identical in configuration, to meet your organization's DR requirements. 
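The periodic consistency check mentioned above could be sketched as follows. This is a minimal illustration, not a Turbot-provided tool: it assumes the configuration of each environment has already been exported to a text file with one item per line (how the export is produced is out of scope here), and both `compare_exports` and the file names are hypothetical.

```shell
# Hypothetical helper: compare two configuration exports, ignoring line order.
# Returns 0 when the exports match and non-zero when they differ.
compare_exports() {
  a_sorted=$(mktemp) || return 2
  b_sorted=$(mktemp) || { rm -f "$a_sorted"; return 2; }
  sort "$1" > "$a_sorted"
  sort "$2" > "$b_sorted"
  status=0
  # diff prints the differing lines and exits non-zero when the files differ
  diff "$a_sorted" "$b_sorted" || status=$?
  rm -f "$a_sorted" "$b_sorted"
  return "$status"
}

# Example usage with assumed file names:
# compare_exports primary-export.txt standby-export.txt || echo "Environments have drifted"
```

Sorting before comparing keeps the check insensitive to the order in which items were exported, so only genuine differences are reported.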
diff --git a/docs/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-1.png b/docs/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-1.png new file mode 100644 index 00000000..719c37d5 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-1.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-2.png b/docs/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-2.png new file mode 100644 index 00000000..6232e2ac Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-2.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-3.png b/docs/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-3.png new file mode 100644 index 00000000..cf0327ca Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-3.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-4.png b/docs/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-4.png new file mode 100644 index 00000000..e728f4b3 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/architecture-options/tier-4.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/ec2-connect-bastion-host.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/ec2-connect-bastion-host.png new file mode 100644 index 00000000..df64b84c Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/ec2-connect-bastion-host.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/index.md b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/index.md new file mode 100644 index 00000000..80c87453 --- /dev/null +++ 
b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/index.md @@ -0,0 +1,550 @@ +--- +title: Database Upgrade +sidebar_label: Database Upgrade +--- + +# Database Upgrade + +In this guide, you will: + +- Resize and/or upgrade a database engine version with minimal downtime using AWS and PostgreSQL tools. + +[Turbot Guardrails Enterprise Database (TED)](/guardrails/docs/reference/glossary#turbot-guardrails-enterprise-database-ted) is an AWS Service Catalog product that provides automated configuration and management of the infrastructure needed to run the enterprise version of Turbot Guardrails in your AWS account. Efficient management of database resources ensures optimal storage utilization, minimizes costs, and enhances performance by reducing unused storage. This process also ensures seamless version upgrades with minimal disruption. + + +This guide outlines two main scenarios for database upgrades: + +1. **Storage Optimization**: Resizing storage allocation for improving efficiency and cost. +2. **Engine Version Upgrade**: Upgrading the PostgreSQL database engine version to access new features and security updates. + + +## Prerequisites + +The activities are performed in the Turbot Guardrails hosting AWS account. + +- Access to the Guardrails hosting AWS account with [Administrator Privileges](/guardrails/docs/enterprise/FAQ/admin-permissions). +- Familiarity with AWS RDS, EC2, Service Catalog and CloudFormation services. + +- Knowledge of the current database usage (storage and version). +- Awareness of the backup schedule to avoid disruptions during the process. 
+ + +### Required Down Time + +- Less than 1 minute to reboot the DB instance while enabling logical replication in the source DB (see [Step 1: Enable DB Logical Replication](#reboot-db-instance)). +- Approximately 5 to 10 minutes while renaming the databases (see [Step 14: Rename DB Instances](#step-14-rename-db-instances)). + +## Step 1: Enable DB Logical Replication + +Select the original (source) database instance. + +![Select Source Database](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-source-database.png) + +Navigate to the `Configuration` tab and select the **DB instance parameter group**. + +![Select DB Instance Parameter Group](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-dbinstance-parameter-group.png) + +Select **Edit**. + +![Edit DB Instance Parameter Group](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-parameter-group-edit.png) + +Set `rds.logical_replication` to **1**. Select **Save Changes**. + +![Set Logical Replication](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-set-logical-replication-group.png) + +### Pause Events + +[Pause the events](/guardrails/docs/guides/hosting-guardrails/troubleshooting/pause-events#pause-event-processing) to avoid losing events. During this time, the respective workspace remains available in `readonly` mode. 
+ +> [!TIP] +> Pausing events before database downtime is critical because: +> - While the database is unavailable, Guardrails continues to receive events from cloud providers. +> - If events are not paused, these events are lost because they cannot be written to the database. +> - Lost events mean missing state changes in your infrastructure, leading to inaccurate resource tracking and potential security/compliance gaps. +> - Paused events are queued and processed once the database is available again, ensuring no infrastructure changes are missed. + +### Reboot DB Instance + +Reboot the source DB instance so that the `rds.logical_replication` change takes effect. + +![Reboot DB Instance](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-new-reboot.png) + +> [!WARNING] +> During the database reboot, users will experience a brief service interruption lasting approximately 1 minute or less. Please plan this maintenance window accordingly. + +### Start Events + +Now enable event processing. Refer to [Enable the events](/guardrails/docs/guides/hosting-guardrails/troubleshooting/pause-events#enable-event-processing). At this stage, the workspace resumes its normal operation. + +## Step 2: Provision New Database Instance + +Deploy a new TED instance to create a new database that will serve as the target for replication from the original (source) database. This allows the upgrade to be performed with minimal downtime by replicating data from the old database to the new one. + +Navigate to the AWS Service Catalog console and deploy a new TED. Follow the instructions provided in [Install Turbot Guardrails Enterprise Database (TED)](/guardrails/docs/guides/hosting-guardrails/installation/install-ted). + + + + +For example, if your original source database is named `turbot-einstein`, name the new one `turbot-einstein-green` (using the `-green` suffix). This suffix helps identify the new instance during the upgrade process. Set the Version parameter to 1.45.0 or higher. 
+ +![Append Name and Version](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-naming-version.png) + + +### In case of **DB engine upgrade** + +If performing a database version upgrade e.g. migrating to PostgreSQL v16.x, use the `DB Engine Version` and `Read Replica DB Engine Version` parameters under the `Database - Advanced - Engine` section. Set the appropriate `DB Engine Parameter Group Family` and the `Hive RDS Parameter Group` under the `Database - Advanced - Parameters` section. + +> [!IMPORTANT] +> This guide assumes you are not using read replicas. For environments with read replicas enabled, additional steps may be required. Please contact Turbot Support for assistance. + +![Set Db Engine Upgrade](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-db-engine-upgrade.png) + + +### In case of **Storage Optimization** + +Set the allocated storage to match the current disk usage using the `Allocated Storage in GB` parameter (e.g., if 210 GB out of 500 GB is used, set it to 210 GB) and define the `Maximum Allocated Storage limit in GB` to a suitable value, both located under the `Database - Advanced - Storage` section; use the `FreeStorageSpace` metric to determine the size. + + +![Set Allocated Storage](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-storage-allocation-new.png) + +Set the encryption by configuring the `Custom Hive Key` parameter to use the original KMS key under the `Advanced - Infrastructure` section. This should be the Key ID, typically formatted as: `1111233-abcd-4444-2322-123456789012`. + +![Set Encryption](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-set-encryption.png) + +Keep all other values unchanged. + +## Step 3: Set Master Password in Source and Target Databases + +Set the master password for both the DB instances via the AWS console. 
+ +> [!TIP] +> Setting the master password in both the source and target databases is required so that the logical replication process can run between the databases. + +Select the `source` DB instance and choose **Modify**. + +![Select Modify](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-modify.png) + +Provide the master password. + +![Set Master Password](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-update-master-password-new.png) + +Select **Modify DB Instance** and apply the changes. + +![Select Modify DB Instance](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-modify-dbinstance-new.png) + +Apply the same changes to the `target` DB. + +> [!NOTE] +> Securely store this master password until the migration is complete. It will be rotated automatically when the `blue<>green` deployment is performed in [Step 17: Update Original TED Stack > Execute `blue<>green`](#execute-greenblue-deployment). + +## Step 4: Create Bastion Host + +Create a bastion host using the [CloudFormation Template](https://github.com/turbot/guardrails-samples/tree/main/enterprise_installation/turbot_bastion_host). Follow the steps provided in [Turbot Bastion Host](https://github.com/turbot/guardrails-samples/tree/main/enterprise_installation/turbot_bastion_host#turbot-bastion-host). + +> [!NOTE] +> Set the bastion host image to `/aws/service/ami-amazon-linux-latest/al2023-ami-kernel-6.1-x86_64`. +> Set RootVolumeSize to a bit larger than the original DB size (e.g., if 300 GB is used, set RootVolumeSize to 350 GB). 
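Both the allocated storage chosen in Step 2 and the bastion RootVolumeSize in the note above depend on how much disk the source database actually uses. As a rough sketch, the used size can be derived from the instance's allocated storage and its `FreeStorageSpace` CloudWatch metric, which is reported in bytes. The `used_storage_gb` helper name and the sample values are assumptions for illustration only.

```shell
# Hypothetical helper: estimate used storage in GB from
#   $1 = allocated storage in GB
#   $2 = FreeStorageSpace in bytes (as reported by CloudWatch)
used_storage_gb() {
  allocated_gb=$1
  free_bytes=$2
  # Integer division rounds free space down, so the used estimate errs high
  free_gb=$(( free_bytes / 1024 / 1024 / 1024 ))
  echo $(( allocated_gb - free_gb ))
}

# Sample values (assumed): 500 GB allocated, 311385128960 bytes (290 GB) free
used_storage_gb 500 311385128960
```

With the sample values this prints 210, matching the "210 GB out of 500 GB" example in Step 2. Add headroom on top of the result when choosing RootVolumeSize.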
+ +### Connect to Bastion Host + +Connect to the newly created bastion host. + +![Connect Bastion Host](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/ec2-connect-bastion-host.png) + +## Step 5: Install PostgreSQL Client + +To install or update the PostgreSQL client on the bastion host, you have two options based on your PostgreSQL version: + +### PostgreSQL 15 +Use the package manager to install PostgreSQL 15: + +```shell +sudo dnf install postgresql15.x86_64 postgresql15-server -y +``` +### PostgreSQL 16 + + +Use the [PostgreSQL 16 installation steps](https://aws.amazon.com/blogs/database/synopsis-of-several-compelling-features-in-postgresql-16): + +```shell +sudo yum install -y gcc readline-devel libicu-devel zlib-devel openssl-devel +sudo wget https://ftp.postgresql.org/pub/source/v16.3/postgresql-16.3.tar.gz +sudo tar -xvzf postgresql-16.3.tar.gz +cd postgresql-16.3 +sudo ./configure --bindir=/usr/bin --with-openssl +sudo make -C src/bin install +sudo make -C src/include install +sudo make -C src/interfaces install +``` + +## Step 6: Create Temporary Folder for Migration + +> [!TIP] +> Create a temporary folder to store migration files and database dumps. This folder will serve as a workspace for: +> - Storing database backup files created by `pg_dump`. +> - Staging data during replication between source and target databases. +> - Maintaining intermediate files generated during migration. +> - Keeping migration artifacts organized and separate from system files. + +Execute the commands in the home directory to create the `tmp_migrations` folder and assign the required permissions. + +```shell +sudo mkdir tmp_migrations +sudo chmod 777 tmp_migrations +cd tmp_migrations +``` + +## Step 7: Set Environment Variables + +Set the source and target DB endpoints, available under the **Connectivity & Security** tab of the RDS DB instance, and export the master password configured in [Step 3](#step-3-set-master-password-in-source-and-target-databases). 
+ +```shell +export SOURCE= +export TARGET= +export PGPASSWORD= +``` +![DB Instance Endpoint](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-endpoint.png) + +## Step 8: Create Publisher and Replication Slot in Source DB Instance + +While continuing with the **first bastion host session**, execute the commands to create a publication and replication slot. Make a note of the output value for future use. + + +```shell +psql --host=$SOURCE --username=master --dbname=turbot +CREATE PUBLICATION pub_blue FOR ALL TABLES; +SELECT * FROM pg_create_logical_replication_slot('rs_blue', 'pgoutput'); +``` +> [!WARNING] +> After creating replication slots, upgrading existing workspaces or creating new ones `will not be possible` until the process is complete. Additionally, no DDL changes can be performed during this time. + + +## Step 9: Create Source DB PG Dump + +> [!TIP] +> A PG dump is required for the following reasons: +> - *Initial Data Copy*: It provides a consistent snapshot of the source database that can be used to initialize the target database before starting logical replication +> - *Data Consistency*: The dump ensures all tables and data are copied in a transactionally consistent state, preventing data inconsistencies during migration +> - *Replication Starting Point*: The dump establishes a known good starting point from which logical replication can begin catching up with any changes that occurred during and after the dump +> - *Backup Safety*: The dump serves as a backup in case issues arise during the migration process +> - *Performance*: Using a dump for the initial data copy is typically faster than relying solely on logical replication to copy the entire database + +### Establish a New Bastion Host Session + +Set the transaction isolation level and export a snapshot using these commands. Use a separate session from the one used in the step above. 
+ +```shell +psql --host=$SOURCE --username=master --dbname=turbot +BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ; +SELECT pg_current_wal_lsn(), pg_export_snapshot(); +``` + +> [!IMPORTANT] +> This does not create an actual snapshot but captures the current state of the database and assigns it an ID. + +Keep this session open, as the snapshot ID will be used in the next step. + +#### Create Snapshot + +```shell +turbot=*> SELECT pg_current_wal_lsn(), pg_export_snapshot(); + pg_current_wal_lsn | pg_export_snapshot +--------------------+--------------------- + AC96/49F46070 | 00000062-000182C4-1 +(1 row) +``` +### Start PG Dump + +In a new session, initiate the `pg_dump` process using the `snapshot ID` obtained from the previous step: + +```shell +nohup time pg_dump -h $SOURCE -U master --snapshot="00000062-000182C4-1" -F c -b -v -f data.dump turbot > dump.log 2>&1 & +``` +### Monitor + +Check the `dump.log` file to confirm the process has begun. Look for log entries indicating `table contents are being dumped`. + +> [!TIP] +> Wait for the table content dump to start. + +```shell +cat dump.log +``` + +Once `pg_dump` is running, return to the [session that exported the snapshot](#establish-a-new-bastion-host-session), which still holds the open `REPEATABLE READ` transaction, and roll the transaction back. + +```sql +ROLLBACK; +``` + +> [!IMPORTANT] +> This process may take several hours, depending on the size of the database dump. +> The `nohup` command ensures that the `pg_dump` process continues to run even if the session is terminated. + +While monitoring, check whether the process is still running. + +```shell +ps aux | grep pg_dump +``` + +Once complete, check the dump log for errors. + +```shell +grep -i error dump.log +cat dump.log +``` + +If no errors appear in `dump.log`, move to the next step. + +## Step 10: Restore Dump in Target Database + +### Start PG Restore + +Restore the database in the target DB instance i.e. 
`turbot-einstein-green` here. + +```shell +nohup time pg_restore -h $TARGET -U master --verbose --no-publications --no-subscriptions --clean --if-exists -d turbot data.dump > restore.log 2>&1 & +``` +> [!IMPORTANT] +> The restore process may take several hours. Periodically run `ps aux` to check if the `pg_restore` process is still active. + +### Monitor + +```shell +ps aux | grep pg_restore +``` + +Check for any errors in the restore process. + +```bash +grep -i error restore.log +``` +If `restore.log` shows no errors other than a few trigger-related or deadlock-related ones, move to the next step. + +**Examples of Probable Errors** + +``` +pg_restore: error: could not execute query: ERROR: deadlock detected - 1 +pg_restore: error: could not execute query: ERROR: operator does not exist: public.ltree = public.ltree - 264 +``` + + +## Step 11: Create Subscription in the Target DB Instance + +> [!TIP] +> Creating a subscription in the target database is required to: +> - Establish a logical replication connection between source and target databases +> - Enable continuous data synchronization after the initial data restore +> - Ensure any changes made to the source database during migration are replicated to the target +> - Minimize downtime by keeping both databases in sync until the final cutover + +Create a subscription in the `target` database. + +```shell +psql --host=$TARGET --username=master --dbname=turbot +CREATE SUBSCRIPTION sub_blue CONNECTION 'host= port=5432 password= user=master dbname=turbot' PUBLICATION pub_blue WITH ( + copy_data = false, + create_slot = false, + enabled = false, + synchronous_commit = false, + connect = true, + slot_name = 'rs_blue' + ); +``` + +```shell +SELECT * FROM pg_replication_origin; +``` +Sample output is shown below. Save the `roname` value for use in the next step. 
+ +```shell +turbot=> SELECT * FROM pg_replication_origin; + roident | roname +---------+------------ + 1 | pg_1846277 +(1 row) +``` +Execute the following to advance the replication origin to the starting LSN, then enable the subscription. + +```shell +SELECT pg_replication_origin_advance('output_from_step_above','lsn_of_starting_point'); +ALTER SUBSCRIPTION sub_blue ENABLE; + +-- Example: +SELECT pg_replication_origin_advance('pg_1846277','AC96/49F46070'); +ALTER SUBSCRIPTION sub_blue ENABLE; +``` +The value `AC96/49F46070` is derived from this [step](#create-snapshot). + +### Monitor Progress + +Execute the following command in the `source` database to monitor the replication progress. Proceed to the next steps once `lsn_distance` has reached **0** at least once. + +```shell +psql --host=$SOURCE --username=master --dbname=turbot +SELECT slot_name, confirmed_flush_lsn as flushed, pg_current_wal_lsn(), (pg_current_wal_lsn() - confirmed_flush_lsn) AS lsn_distance FROM pg_catalog.pg_replication_slots WHERE slot_type = 'logical'; +``` + +## Step 12: Add Triggers + +> [!TIP] +> Triggers are required for the following reasons: +> - During database migration, triggers are not automatically copied from the source to target database +> - These triggers are essential for maintaining data integrity and relationships, particularly for path-based hierarchies in Turbot Guardrails +> - The triggers handle automatic updates of path columns when parent-child relationships change +> - Without these triggers, hierarchical data structures (like resource types, control categories, etc.) would not maintain proper relationships +> - They must be created before the database becomes active to ensure data consistency from the first operation + +Execute the commands on the target database to set the local search path and create the triggers. Replace `$turbot_schema` with the actual workspace schema name. + +Workspace schemas can be retrieved by executing the `\dn` command. 
+ +```shell +psql --host=$TARGET --username=master --dbname=turbot +``` + +```shell +set local search_path to $turbot_schema, public; +create trigger control_category_path_au after update on $turbot_schema.control_categories for each row when (old.path is distinct from new.path) execute procedure $turbot_schema.types_path_au('controls', 'control_category_id', 'control_category_path'); +create trigger control_resource_category_path_au after update on $turbot_schema.resource_categories for each row when (old.path is distinct from new.path) execute procedure $turbot_schema.types_path_au('controls', 'resource_category_id', 'resource_category_path'); +create trigger control_resource_types_path_au after update on $turbot_schema.resource_types for each row when (old.path is distinct from new.path) execute procedure $turbot_schema.types_path_au('controls', 'resource_type_id', 'resource_type_path'); +create trigger control_types_path_au after update on $turbot_schema.control_types for each row when (old.path is distinct from new.path) execute procedure $turbot_schema.types_path_au('controls', 'control_type_id', 'control_type_path'); +create trigger policy_category_path_au after update on $turbot_schema.control_categories for each row when (old.path is distinct from new.path) execute procedure $turbot_schema.types_path_au('policy_values', 'control_category_id', 'control_category_path'); +create trigger policy_resource_category_path_au after update on $turbot_schema.resource_categories for each row when (old.path is distinct from new.path) execute procedure $turbot_schema.types_path_au('policy_values', 'resource_category_id', 'resource_category_path'); +create trigger policy_resource_types_path_au after update on $turbot_schema.resource_types for each row when (old.path is distinct from new.path) execute procedure $turbot_schema.types_path_au('policy_values', 'resource_type_id', 'resource_type_path'); +create trigger policy_types_path_au after update on 
$turbot_schema.policy_types for each row when (old.path is distinct from new.path) execute procedure $turbot_schema.types_path_au('policy_values', 'policy_type_id', 'policy_type_path'); +create trigger resource_resource_category_path_au after update on $turbot_schema.resource_categories for each row when (old.path is distinct from new.path) execute procedure $turbot_schema.types_path_au('resources', 'resource_category_id', 'resource_category_path'); +create trigger resource_resource_type_path_au after update on $turbot_schema.resource_types for each row when (old.path is distinct from new.path) execute procedure $turbot_schema.types_path_au('resources', 'resource_type_id', 'resource_type_path'); +create trigger resource_types_500_rt_path_update_au after update on $turbot_schema.resource_types for each row when (old.path is distinct from new.path) execute procedure $turbot_schema.update_types_path(); +``` +> [!NOTE] +> The above script must be executed separately for each workspace schema when multiple workspaces share the same database. +> +> Any additional triggers will be added here in future updates. 
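
Because the script has to be run once per workspace schema, one way to avoid hand-editing it is to keep it as a template and substitute the schema name. The following is a minimal sketch, assuming the statements above are saved to a file containing the literal placeholder `$turbot_schema`. The file name `triggers.sql.tpl`, the function name `render_triggers`, and the schema name `workspace_a` are all hypothetical.

```shell
# Hypothetical helper: replace the literal placeholder $turbot_schema in a
# template file with a real workspace schema name and print the result.
# Assumes the schema name contains no "/" or "&" characters.
render_triggers() {
  template="$1"
  schema="$2"
  # Single quotes stop the shell from expanding $turbot_schema; \$ matches
  # a literal dollar sign in the sed pattern.
  sed 's/\$turbot_schema/'"$schema"'/g' "$template"
}

# Example with a one-line stand-in for the real template.
printf '%s\n' 'set local search_path to $turbot_schema, public;' > triggers.sql.tpl
render_triggers triggers.sql.tpl workspace_a
```

The rendered output could then be piped to `psql --host=$TARGET --username=master --dbname=turbot --single-transaction` once per schema. Running it in a single transaction matters here because `set local` only takes effect inside a transaction block.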
+ +## Step 13: Test Data + +Run the following queries on both the source and target databases and compare the results for triggers, indexes, functions, and constraints: + +### Trigger + +```sql +SELECT count(trigger_name), trigger_schema FROM information_schema.triggers group by trigger_schema; +``` + +### Indexes + +```sql +SELECT n.nspname AS schema_name, COUNT(i.indexname) AS index_count FROM pg_catalog.pg_indexes i JOIN pg_catalog.pg_namespace n ON i.schemaname = n.nspname WHERE n.nspname NOT IN ('pg_catalog', 'information_schema') GROUP BY n.nspname ORDER BY index_count DESC; +``` + +### Functions + +```sql +SELECT n.nspname AS schema_name, p.proname AS function_name FROM pg_catalog.pg_proc p LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace WHERE n.nspname IN ('pg_catalog'); +``` + +### Constraints + +```sql +SELECT n.nspname AS schema_name, COUNT(c.conname) AS constraint_count FROM pg_catalog.pg_constraint c JOIN pg_catalog.pg_namespace n ON c.connamespace = n.oid WHERE n.nspname NOT IN ('pg_catalog', 'information_schema') GROUP BY n.nspname ORDER BY constraint_count DESC; +``` +### Trigger Count by Status + +```sql +SELECT count(tgname), tgenabled FROM pg_trigger GROUP by tgenabled; +``` + +## Step 14: Rename DB Instances + +### Pause Events + +Pause event processing as described in [Step 1 > Pause Events](#pause-events). + +### Modify DB Instance + +After validating data consistency, swap the DB instance names in the AWS RDS console as follows. + +Rename the original (source) instance by appending `-blue` to its name, e.g., from `turbot-einstein` to `turbot-einstein-blue`. + +![Rename Original Instance](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-rename-original-instance-append-blue.png) + +Rename the new target instance by removing the `-green` suffix, e.g., from `turbot-einstein-green` to `turbot-einstein`. 
+ +![Rename New Instance](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-rename-new-instance-remove-green.png) + +At this stage, the workspace points to the new target DB through the same RDS DB instance endpoint used earlier. + +### Start Events + +Now enable event processing. Refer to [Enable the events](/guardrails/docs/guides/hosting-guardrails/troubleshooting/pause-events#enable-event-processing). + +## Step 15: Update Original TED Stack + +Update the original TED stack (e.g., `ted-einstein`) with the parameter values of the new TED stack (`ted-einstein-green`) created in [Step 2 DB Upgrade](#in-case-of-db-engine-upgrade). + +> [!IMPORTANT] +> Do not change the `Custom Hive Key` parameter. + +Refer to the parameters used in [Step 2: Provision New Database Instance](#in-case-of-db-engine-upgrade). + +![Set Db Engine Upgrade](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-db-engine-upgrade.png) + +### Execute `green<>blue` Deployment + +This will reset the master password along with other applicable parameters. + +![Blue Green Deployment Trigger](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-blue-green-deployment.png) + +## Step 16: Run Smoke Tests + +Run smoke tests against both the restored and new database instances to confirm the upgrade. + +- Validate the Count of Controls + - Pre + - Post +- Validate the Count of Resources + - Pre + - Post + +- Validate the Count of Active Controls + - Pre + - Post +- Ensure all controls are running as expected. +- Confirm events are functioning properly. +- Verify grants are working correctly. +- Ensure stacks are functioning as intended. + +## Step 17: Clean Up + +Delete the new TED stack, i.e. `ted-einstein-green`, along with its associated resources, including the S3 bucket, log groups, and AWS Backup. Clean up replication slots and subscriptions. 
+ + + +## Step 18: Disable and Delete Subscriptions + +Disable and delete subscription and replication slots. + + +```sql +select * from pg_subscription; +alter subscription sub_blue disable; +alter subscription sub_blue set (slot_name=none); +drop subscription sub_blue; +``` + +## Troubleshooting + +| Issue | Description | Guide | +| ------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- | +| Permission Issues | If the current logged-in user lacks permission to modify, update, or create resources in the stack, or if IAM roles or SCPs have changed, preventing built-in roles from accessing needed configuration settings. | [Troubleshoot Permission Issues](/guardrails/docs/enterprise/FAQ/admin-permissions#aws-permissions-for-turbot-guardrails-administrators) | +| Further Assistance | If you continue to encounter issues, please open a ticket with us and attach the relevant information to assist you more efficiently. 
| [Open Support Ticket](https://support.turbot.com) | diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-db-engine-upgrade.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-db-engine-upgrade.png new file mode 100644 index 00000000..596449fd Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-db-engine-upgrade.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-endpoint.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-endpoint.png new file mode 100644 index 00000000..64ea860b Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-endpoint.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-new-reboot.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-new-reboot.png new file mode 100644 index 00000000..31c950d0 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-new-reboot.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-parameter-group-edit.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-parameter-group-edit.png new file mode 100644 index 00000000..21e176c4 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-parameter-group-edit.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-rename-new-instance-remove-green.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-rename-new-instance-remove-green.png new file mode 100644 index 00000000..47ec7eab Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-rename-new-instance-remove-green.png differ diff --git 
a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-rename-original-instance-append-blue.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-rename-original-instance-append-blue.png new file mode 100644 index 00000000..b34f58ea Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-rename-original-instance-append-blue.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-dbinstance-parameter-group.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-dbinstance-parameter-group.png new file mode 100644 index 00000000..efa8ac8b Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-dbinstance-parameter-group.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-modify-dbinstance-new.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-modify-dbinstance-new.png new file mode 100644 index 00000000..0538db8c Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-modify-dbinstance-new.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-modify-dbinstance.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-modify-dbinstance.png new file mode 100644 index 00000000..5fe844c2 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-modify-dbinstance.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-modify.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-modify.png new file mode 100644 index 00000000..382c0259 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-modify.png 
differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-source-database.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-source-database.png new file mode 100644 index 00000000..be529997 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-select-source-database.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-set-logical-replication-group.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-set-logical-replication-group.png new file mode 100644 index 00000000..6b14a105 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-set-logical-replication-group.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-update-master-password-new.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-update-master-password-new.png new file mode 100644 index 00000000..444bcfc2 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-update-master-password-new.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-update-master-password.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-update-master-password.png new file mode 100644 index 00000000..0fa57219 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/rds-update-master-password.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-blue-green-deployment.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-blue-green-deployment.png new file mode 100644 index 00000000..7ffe1ef7 Binary files /dev/null and 
b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-blue-green-deployment.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-launch-product-ted.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-launch-product-ted.png new file mode 100644 index 00000000..3c4a8506 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-launch-product-ted.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-naming-version.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-naming-version.png new file mode 100644 index 00000000..dd82c5ac Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-naming-version.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-rename-blue.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-rename-blue.png new file mode 100644 index 00000000..26dbc665 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-rename-blue.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-set-encryption.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-set-encryption.png new file mode 100644 index 00000000..74bdc71a Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-set-encryption.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-storage-allocation-new.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-storage-allocation-new.png new file mode 100644 index 00000000..fb096ced 
Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-storage-allocation-new.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-storage-allocation.png b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-storage-allocation.png new file mode 100644 index 00000000..e582f9c3 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/database-upgrade/service-catalog-storage-allocation.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/restore/index.md b/docs/guides/hosting-guardrails/disaster-recovery/hive-restore/index.md similarity index 100% rename from docs/guides/hosting-guardrails/disaster-recovery/restore/index.md rename to docs/guides/hosting-guardrails/disaster-recovery/hive-restore/index.md diff --git a/docs/guides/hosting-guardrails/disaster-recovery/index.md b/docs/guides/hosting-guardrails/disaster-recovery/index.md index 02f20d97..0fa5efee 100644 --- a/docs/guides/hosting-guardrails/disaster-recovery/index.md +++ b/docs/guides/hosting-guardrails/disaster-recovery/index.md @@ -55,9 +55,14 @@ This section provides detailed step-by-step instructions on how to use DR featur | Guide | Description | - | - -| [Hive Restore](guides/hosting-guardrails/disaster-recovery/restore) | Guides to restore a Guardrails database from RDS snapshot. -| [DR Testing](guides/hosting-guardrails/disaster-recovery/dr-testing) | Guides to restore a destroyed workspace. -| [Database Upgrade and Storage Optimization](guides/hosting-guardrails/disaster-recovery/database-upgrade-storage-optimization) | Guides to resize and/or upgrade a database engine version with minimal downtime. +| [Architecture Options](guides/hosting-guardrails/disaster-recovery/architecture-options) | Architecture Options. 
+| [Hive Restore](guides/hosting-guardrails/disaster-recovery/hive-restore) | Guides to restore a Guardrails database from RDS snapshot. +| [Database Upgrade](guides/hosting-guardrails/disaster-recovery/database-upgrade) | Guides to upgrade a Guardrails database in scenarios of DB engine or optimize storage. +| [Workspace Restore](guides/hosting-guardrails/disaster-recovery/restore-workspace) | Guides to restore a destroyed workspace. +| [Multi-Region Deployment](guides/hosting-guardrails/disaster-recovery/multi-region-deployment) | Guides to set up a multi-region deployment of Turbot Guardrails using Tier 3 architecture. +| [Multi-Region Failover](guides/hosting-guardrails/disaster-recovery/multi-region-failover) | Guides to set up Disaster Recovery (DR) failover for Turbot Guardrails Multi-Region deployment. + + ## Additional Assistance diff --git a/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/add-domain-name-old.png b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/add-domain-name-old.png new file mode 100644 index 00000000..a9899b38 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/add-domain-name-old.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/add-domain-name.png b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/add-domain-name.png new file mode 100644 index 00000000..33adc686 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/add-domain-name.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/configure-api-mappings.png b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/configure-api-mappings.png new file mode 100644 index 00000000..1bc37029 Binary files /dev/null and 
b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/configure-api-mappings.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/enable-crossregion-replication-details.png b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/enable-crossregion-replication-details.png new file mode 100644 index 00000000..336a0395 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/enable-crossregion-replication-details.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/enable-crossregion-replication.png b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/enable-crossregion-replication.png new file mode 100644 index 00000000..d79b904a Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/enable-crossregion-replication.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/index-original.md b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/index-original.md new file mode 100644 index 00000000..9112ebe5 --- /dev/null +++ b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/index-original.md @@ -0,0 +1,191 @@ +--- +title: Multi-Region Deployment +sidebar_label: Multi-Region Deployment +--- + +# Multi-Region Deployment + +## 1. Introduction + +### 1.1 Purpose + +This document outlines the setup plan for deploying the **Turbot Guardrails** application using the **Tier 3** architecture. The objective is to ensure high availability, minimize downtime, and reduce data loss in the event of a disaster by utilizing a multi-region and multi-availability zone (AZ) deployment strategy. + +### 1.2 Scope + +This setup applies to all production workloads deployed under the **Tier 3** architecture, guaranteeing high availability and fast recovery. 
+ +### 1.3 Target Audience + +This guide is intended for **Guardrails Administrators** with experience in AWS cloud infrastructure management and Guardrails deployment. Familiarity with database recovery and restoration processes is beneficial. + +## 2. Disaster Recovery Objectives + +| Objective | Definition | +| ------------------------------ | -------------------------------------------------------- | +| Recovery Time Objective (RTO) | 2 Hours | +| Recovery Point Objective (RPO) | 2 Hours | +| Availability | 99.9% | +| Use Case | Production deployments requiring rapid disaster recovery | + +## 3. Tier 3 Deployment Architecture + +### 3.1 Overview + +The **Tier 3** architecture enhances resilience by deploying a **standby environment in a secondary AWS region**. The primary and standby environments adhere to the following principles: + +- Installation of **TEF, TED, and TE** will follow the steps outlined in the [main installation guide](https://turbot.com/guardrails/docs/guides/hosting-guardrails/installation). +- The sections below list the differences and key considerations for installations that require multi-region disaster recovery (DR). + + + +### 3.2 Architecture Diagram + +![Tier 3 Architecture](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/tier-3.png) + +## 4. Prerequisites + +### 4.1 Glossary + +- **Primary Region**: The main region where Turbot Guardrails is installed or will be installed. This region acts as the active environment. +- **Disaster Recovery (DR) Region**: The secondary region where the workspace will be failed over in case of a disaster. + +### 4.2 Assumptions + +This guide assumes the following setup for deploying Turbot Guardrails: + +- A **predefined VPC** (not created by Turbot Guardrails). +- **DNS records** are not managed by Turbot Guardrails. +- **IAM roles** are not provisioned by Turbot Guardrails. +- **API Gateway with an internal load balancer** is used. 
+ +### 4.3 Key Considerations + +#### VPC Configuration + +A predefined VPC with subnets mirroring the primary region must be set up in the DR region. + +#### SSL Certificate + +- Ensure the certificate is valid and available in both primary and DR regions. +- If the certificate includes a wildcard domain (e.g., `*.cloudportal.company.com`), no additional changes are required. +- Otherwise, the certificate should be configured to trust the following domains for API Gateway: + - `gateway.cloudportal.company.com` (Primary region) + - `gateway-dr.cloudportal.company.com` (DR region) + +#### Workspace Configuration + +- A **single additional workspace** will be installed in the DR region. +- The domain for the DR workspace will follow the pattern: `{workspace_name}-dr.cloudportal.company.com`. + +#### Product Version Requirements + +Both regions must run the following minimum versions: + +- **TEF:** 1.66.0 +- **TED:** 1.45.0 +- **TE:** 5.49.0 +- **Turbot Resource Name Prefix** should be identical in both regions. Defaults to `turbot`. 
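
The minimum versions listed above can be checked with a simple version comparison. The following is a sketch only, assuming GNU `sort -V` (version sort) is available on the host. The function name `version_at_least` and the sample "installed" versions are illustrative placeholders, not output from Guardrails.

```shell
# Hypothetical helper: succeed if version $1 is >= minimum version $2.
# Assumes GNU coreutils `sort -V` is available.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Placeholder installed versions checked against the documented minimums.
version_at_least "1.66.0" "1.66.0" && echo "TEF ok"
version_at_least "1.46.2" "1.45.0" && echo "TED ok"
version_at_least "5.48.9" "5.49.0" || echo "TE below minimum"
```

Running the same check in both regions helps confirm that the primary and DR environments meet the requirement before failover is tested.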
+ +### 4.4 Differences Between Primary and DR Regions + +| Configuration | Primary Region | DR Region | +|--------------|----------------|------------| +| **TEF Configuration** | • SSL certificate must cover required domains | • SSL certificate must cover required domains | +| | • "API Gateway prefix" parameter set to `gateway` | • "API Gateway prefix" parameter set to `gateway-dr` | +| | • "Guardrails multi-region KMS Key Type" set to `Primary` | • "Guardrails multi-region KMS Key Type" set to KMS key ARN from primary region (alias: `turbot_guardrails`, prefixed with `mrk-`) | +| | | • Manual creation of custom domain names (`gateway.cloudportal.company.com`) for API Gateway | +| **TED Configuration** | • Database name must be identical in both regions | • Database name must be identical in both regions | +| **RDS Configuration** | • Manual configuration of [cross-region RDS DB snapshots](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReplicateBackups.html) with appropriate retention policies | - | + +> [!WARNING] +> When setting up TEF in the DR region, ensure a smooth deployment to avoid rollback issues. If a replica key is created and a rollback is required, the replica key cannot be deleted immediately and will be subject to a 7-day retention period unless removed with AWS Support assistance. **You can create only one replica of each primary key in each AWS Region.** + +> If necessary, complete the TEF setup in the DR region by setting the Guardrails multi-region KMS Key Type (under Advanced - Deployment) to Primary. Once the setup is successfully completed, update the parameter to Replica and delete the multi-region key created in the DR region. + +### 4.5 Workspace Deployment in DR Region + +- Create a **test workspace** in the DR region. +- Install the same set of **mods** as in the primary region to ensure consistency. 
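To keep the DR test workspace aligned with the primary workspace, it can help to compare the installed mods on both sides. The sketch below assumes you have already exported a mapping of mod name to version for each workspace; the `diff_mods` helper, the input dictionaries, and the version numbers are hypothetical and are not a Guardrails API.

```python
def diff_mods(primary: dict[str, str], dr: dict[str, str]) -> dict[str, list[str]]:
    """Compare mod name -> version maps for the primary and DR workspaces."""
    missing = sorted(name for name in primary if name not in dr)
    extra = sorted(name for name in dr if name not in primary)
    mismatched = sorted(
        name for name in primary if name in dr and primary[name] != dr[name]
    )
    return {
        "missing_in_dr": missing,
        "extra_in_dr": extra,
        "version_mismatch": mismatched,
    }


if __name__ == "__main__":
    # Hypothetical inventories; real values would come from each workspace.
    primary_mods = {"aws": "5.1.0", "aws-iam": "5.3.0", "aws-s3": "5.2.0"}
    dr_mods = {"aws": "5.1.0", "aws-s3": "5.1.9"}
    print(diff_mods(primary_mods, dr_mods))
```

Running a comparison like this on the same schedule as your mod automation surfaces drift before a failover is needed.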
+ +#### Context + +Creating a test workspace in the DR region is essential because manually installing mods during an actual disaster recovery scenario can be time-consuming and might lead to delays exceeding your Recovery Time Objective (RTO) and Recovery Point Objective (RPO). By preparing a sandbox workspace in advance in the DR region, you can install mods proactively using the same automation methods (such as pipelines, Terraform scripts, or AutoMod updates) and schedules employed for your primary workspace. This ensures that your DR workspace remains continuously up-to-date and can quickly and reliably take over workloads if your primary workspace experiences downtime. + +## 5. Implementation Steps + +### 5.1 Setting Up Cross-Region Database Backup + +- Navigate to the AWS RDS Service in the Primary region. +- Click on "Automated backups". +- Under the "Current Region" tab, select the Turbot Guardrails database (e.g., `turbot-newton`). +- Select the Guardrails database, click on the "Actions" dropdown button, and choose "Manage cross-Region replication". +- A "Manage cross-Region replication" window will open. +- Check the "Enable replication in another AWS Region" option. + +![Enable cross-Region replication](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/enable-crossregion-replication.png) + +- Fill in the necessary details in the form: + - Destination Region: Select the "DR region". + - Replicated backup retention period: Choose the appropriate retention period in days. + - AWS KMS Key: Select the encryption key used for the Turbot database in the DR region. Typically, this follows the format "turbot_databasename" (e.g., `turbot_newton`). + - Validate the KMS Key ID: Navigate to the KMS service in the DR region to confirm the correct key. 
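The console form above has a scripted equivalent. The following is a minimal sketch that assumes the AWS SDK for Python (boto3) is installed and that credentials are available; the ARNs, account ID, and region names are placeholders, and `build_replication_request` and `enable_replication` are hypothetical helper names. The RDS call `start_db_instance_automated_backups_replication` is issued against the destination (DR) region.

```python
def build_replication_request(
    source_db_instance_arn: str, retention_days: int, dr_kms_key_arn: str
) -> dict:
    """Build the arguments for enabling cross-Region automated backup replication."""
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    return {
        # ARN of the Guardrails database instance in the primary region.
        "SourceDBInstanceArn": source_db_instance_arn,
        # Replicated backup retention period, in days.
        "BackupRetentionPeriod": retention_days,
        # KMS key in the DR region used to encrypt the replicated backups.
        "KmsKeyId": dr_kms_key_arn,
    }


def enable_replication(dr_region: str, request: dict) -> dict:
    """Start replication by calling RDS in the DR (destination) region."""
    import boto3  # imported lazily so the request builder works without boto3

    rds = boto3.client("rds", region_name=dr_region)
    return rds.start_db_instance_automated_backups_replication(**request)


if __name__ == "__main__":
    # Placeholder values; substitute your own ARNs before calling enable_replication.
    request = build_replication_request(
        "arn:aws:rds:us-east-1:123456789012:db:turbot-newton",
        7,
        "arn:aws:kms:us-west-2:123456789012:key/EXAMPLE-KEY-ID",
    )
    print(request)
```

Separating the request builder from the API call makes it possible to review the parameters before anything is changed in AWS.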
+ +![Manage cross-Region replication](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/manage-crossregion-replication.png) + +- Click **Save** to complete the setup. +- Navigate to the **"Replicated"** tab and verify that the database is listed under **"Replicated backups"**. + +### 5.2 Configuring Workspaces in the Primary Region + +- Make sure to set the following policies on the Guardrails workspace: + +- `Turbot > Workspace > Gateway Domain Name`: Fully qualified domain name of the publicly accessible gateway to the workspace - for example, `gateway.turbot.acme.com`. Set to the domain name only, do not include protocol or path information. + +- `Turbot > Workspace > Domain Name`: Fully qualified domain name of the workspace - for example, `console.turbot.acme.com`. Set to the domain name only, do not include protocol or path information. + +### 5.3 Configuring API Gateway Custom Domain Name in the DR Region + +To ensure seamless failover in the DR region, you need to configure the "API Gateway Custom Domain Name". + +- Open the AWS API Gateway service in the "DR region". +- Verify that the custom domain `gateway-dr.cloudportal.company.com` is already present. +- Click on "Add domain name". +- Enter the same domain name as in the primary region: `gateway.cloudportal.company.com`. +- Configure the following settings: + +![Add domain name](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/add-domain-name.png) + +- Type: Public +- API endpoint type: Regional +- Minimum TLS version: TLS 1.2 +- ACM Certificate: Select the ACM Certificate created for Turbot Guardrails. This certificate should be configured to trust both `gateway.cloudportal.company.com` and `gateway-dr.cloudportal.company.com`. +- Click "Add domain name" to finalize the setup. +- Once created, navigate to the "Custom domain name" settings and open the "API mappings" tab. 
+- Click on "Configure API mappings", then select "Add new mapping". + +![Configure API mappings](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/configure-api-mappings.png) + +- Set the following values: + - API: Select `turbot-api`. + - Stage: Choose `turbot`. + - Path (optional): Leave blank. +- Click **Save** to apply the changes. + +### 5.4 Configuring DNS records + +Ensure that the following DNS records are correctly configured to route traffic appropriately: + +- API Gateway DNS Record: + + The domain `gateway.cloudportal.company.com` should have an **A** record pointing to the API Gateway endpoint in the primary region. The API Gateway endpoint typically follows the format: `abcdefghij.execute-api.us-east-1.amazonaws.com` + +- Workspace Console DNS Record: + + The domain `console.cloudportal.company.com` should have a **CNAME** record pointing to the internal load balancer DNS name in the primary region. This internal load balancer DNS name generally follows the format: `internal-turbot-5-49-0-lb-1234567890.us-east-1.elb.amazonaws.com` + +## Additional Assistance + +Turbot Support is happy to consult with Enterprise customers to help determine a strategy to manage these scenarios. Contact us at [help@turbot.com](mailto:help@turbot.com). diff --git a/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/index.md b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/index.md new file mode 100644 index 00000000..51e2b350 --- /dev/null +++ b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/index.md @@ -0,0 +1,190 @@ +--- +title: Multi-Region Deployment +sidebar_label: Multi-Region Deployment +--- + +# Multi-Region Deployment with Guardrails + +In this guide, you will: + +- Set up a multi-region deployment of Turbot Guardrails using Tier 3 architecture. +- Configure disaster recovery (DR) processes to ensure high availability and rapid recovery. 
+ +This guide outlines the deployment of **Turbot Guardrails** using a **Tier 3 architecture**. It aims to ensure high availability, minimize downtime, and reduce data loss in disaster scenarios through a multi-region, multi-AZ deployment strategy. + +> [!NOTE] +>This deployment approach applies to all production workloads deployed under the `Tier 3` architecture, ensuring rapid recovery and high availability. + +## Target Audience +**Guardrails Administrators** experienced with AWS cloud infrastructure, Guardrails deployment, and database recovery. + +## Disaster Recovery Objectives + +| Objective | Definition | +| ------------------------------ | -------------------------------------------------------- | +| Recovery Time Objective (RTO) | 2 hours | +| Recovery Point Objective (RPO) | 2 hours | +| Availability | 99.9% | +| Use Case | Rapid recovery for production workloads | + +## Tier 3 Deployment Architecture + +The **Tier 3** architecture enhances resilience by deploying a standby environment in a secondary AWS region. The primary and standby environments follow these guidelines: + +- **TEF, TED, and TE installation**: Follow the [main installation guide](/guardrails/docs/guides/hosting-guardrails/installation). +- Differences and considerations specific to multi-region disaster recovery are outlined below. + +### Architecture + +![Tier 3 Architecture](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/tier-3.png) + +### Prerequisites + +- `Primary Region`: Active deployment region for Turbot Guardrails. +- `DR Region`: Secondary region for disaster recovery. + +### Assumptions + +The deployment approach outlined in this guide is based on the following assumptions: + +1. The *VPC is pre-configured* and not created as part of the Turbot Guardrails installation +2. *DNS record* management is handled externally, not by Turbot Guardrails +3. *IAM roles* are not provisioned by Turbot Guardrails. +4. 
The *API Gateway* is configured with an internal load balancer architecture + +## Key Considerations + +When implementing a multi-region deployment for disaster recovery, several critical aspects need careful consideration. The following sections outline the key technical requirements and setup guidelines. + +### VPC Configuration + +Ensure the VPC and subnets in the DR region mirror the primary region setup. + +### SSL Certificate + +1. Ensure the certificate is valid and available in both primary and DR regions. +2. Wildcard domain certificates are preferred (e.g. `*.cloudportal.company.com`). +3. If a wildcard certificate is not available, the certificate must explicitly cover both the primary region (`gateway.cloudportal.company.com`) and DR region (`gateway-dr.cloudportal.company.com`) domains. + +### Workspace Configuration + +Deploy an additional workspace in the DR region using the domain pattern: `{workspace_name}-dr.cloudportal.company.com`. Refer to [Create Workspace](/guardrails/docs/guides/hosting-guardrails/installation/workspace-manager). + +### Product Version Requirements + +Both regions require these **minimum** versions: + +1. **TEF:** 1.66.0 +2. **TED:** 1.45.0 +3. **TE:** 5.49.0 +> [!NOTE] +> *Turbot Resource Name Prefix:* Should be identical in both regions. Defaults to `turbot`. + +## Key Differences Between Primary and DR Regions + +The primary and DR regions share identical configuration settings except for a few key differences that need to be configured specifically for each region: + +| Configuration | Attributes | Primary Region | DR Region | +|---------------------|-------------------------------------|-----------------------------------------------------------|----------------------------------------------------------------| +| **TEF Configuration** | `SSL Certificate` | Covers required domains e.g. `gateway.cloudportal.company.com` | Covers required domains e.g. 
`gateway-dr.cloudportal.company.com` | +| | `API Gateway Prefix` (default `gateway`) under the `Network - API Gateway` section | Set to `gateway` | Set to `gateway-dr` | +| | `Multi-region KMS Key Type` under `Advanced - Deployment` section | Set to `Primary` | Set to KMS key ARN from primary region (`alias: turbot_guardrails`, prefixed with `mrk-`) | +| | `API Gateway Custom Domain` | Created automatically | Create manually (`gateway.cloudportal.company.com`) | +| **TED Configuration** | Database Name | Identical in both regions | Identical in both regions | +| **RDS Configuration** | Cross-region DB Snapshots | Manually configured [Cross-region RDS DB snapshots](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReplicateBackups.html) | Uses snapshots replicated from primary region | + + +> [!WARNING] +> When setting up TEF in the DR region, ensure a smooth deployment to avoid rollback issues. If a replica key is created and a rollback is required, the replica key cannot be deleted immediately and will be subject to a 7-day retention period unless removed with AWS Support assistance. **You can create only one replica of each primary key in each AWS Region.** + +> If necessary, complete the TEF setup in the DR region by setting the Guardrails multi-region KMS Key Type (under Advanced - Deployment) to Primary. Once the setup is successfully completed, update the parameter to Replica and delete the multi-region key created in the DR region. + +## Workspace Deployment in DR Region + + 1. Create a **test workspace** in the DR region. Refer to [Create Workspace](/guardrails/docs/guides/hosting-guardrails/installation/workspace-manager). + 2. Install the same mods as the primary region workspace. + +> [!NOTE] +> Creating a test workspace in the DR region is essential. Manually installing mods during an actual disaster recovery scenario can be time-consuming and may exceed your RTO/RPO targets. 
By maintaining a sandbox workspace with proactive mod installation (via pipelines, Terraform scripts, or AutoMod updates), you ensure the DR workspace stays current and can quickly take over if the primary workspace fails. + +## Implementation Steps + +### Step 1: Set Up Cross-Region Database Backup + +1. Open the AWS RDS console in the primary region. +2. Select **Automated backups**. +3. Choose the Guardrails database (e.g. `turbot-babbage`). +4. Select **Manage cross-Region replication** from the **Actions** dropdown. +5. Enable cross-region replication, select the DR region, set the retention period, and select the KMS key. +6. Save and verify replication under the **Replicated** tab. + +![Enable cross-Region replication](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/enable-crossregion-replication.png) + +7. Destination Region: Select the DR region. +8. Replicated backup retention period: Choose the appropriate retention period in days. +9. AWS KMS Key: Select the encryption key used for the Turbot database in the DR region. Typically, this follows the format `turbot_databasename` (e.g. `turbot_babbage`). +10. Validate the KMS Key ID: Navigate to the KMS service in the DR region to confirm the correct key. + +![Enable cross-Region replication](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/enable-crossregion-replication-details.png) + +Select **Save**, then navigate to the `Replicated` tab and verify that the database is listed under `Replicated backups`. + +### Step 2: Configuring Workspaces in the Primary Region + +Set the following policies: + +1. `Turbot > Workspace > Gateway Domain Name`: e.g., `gateway.turbot.acme.com`. +2. `Turbot > Workspace > Domain Name`: e.g., `console.turbot.acme.com`. + +> [!IMPORTANT] +> Set the domain name only; do not include protocol or path information. 
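Because both policies expect a bare domain name, a quick check before setting them can catch a pasted URL. This is a minimal sketch using only the Python standard library; the `is_bare_domain` helper is hypothetical and is not part of Guardrails.

```python
import re

# One DNS label: letters, digits, and hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_bare_domain(value: str) -> bool:
    """Return True for a fully qualified domain name with no protocol, port, or path."""
    if "://" in value or "/" in value or ":" in value:
        return False
    labels = value.split(".")
    return len(labels) >= 2 and all(_LABEL.fullmatch(label) for label in labels)


if __name__ == "__main__":
    for candidate in (
        "gateway.turbot.acme.com",
        "https://gateway.turbot.acme.com",
        "console.turbot.acme.com/apollo",
    ):
        print(candidate, is_bare_domain(candidate))
```

The check is deliberately strict: anything containing a scheme, port, or path separator is rejected rather than silently trimmed.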
+ +### Step 3: Configuring API Gateway Custom Domain Name in the DR Region + +To ensure seamless failover in the DR region, you need to configure the `API Gateway Custom Domain Name`. + +1. Open the AWS API Gateway service in the `DR region`. +2. Verify that the custom domain `gateway-dr.cloudportal.company.com` is already present. +3. Select **Add domain name**. +4. Enter the same `Domain name` as in the primary region: `gateway.cloudportal.company.com`. +5. Set `Type` to `Public`. +6. Set `API endpoint type` to `Regional`. +7. Set `Minimum TLS version` to `TLS 1.2`. +8. In `ACM Certificate`, select the ACM Certificate created for Turbot Guardrails. This certificate should be configured to trust both `gateway.cloudportal.company.com` and `gateway-dr.cloudportal.company.com`. + +Select **Add domain name** to finalize the setup. + +![Add domain name](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/add-domain-name.png) + +9. Once created, navigate to the `Custom domain name` settings and open the `API mappings` tab. +10. Select **Configure API mappings**, then select **Add new mapping**. + +![Configure API mappings](/images/docs/guardrails/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/configure-api-mappings.png) + +11. In `API`, select `turbot-api`, and in `Stage`, choose `turbot`. +12. Apply the changes by selecting **Save**. + +### Step 4: Configure DNS Records + +1. **API Gateway DNS Record**: The domain `gateway.cloudportal.company.com` should have an `A record` pointing to the API Gateway endpoint in the primary region. The API Gateway endpoint typically follows the format: `abcdefghij.execute-api.us-east-1.amazonaws.com`. +2. **Workspace Console DNS Record**: The domain `console.cloudportal.company.com` should have a CNAME record pointing to the internal load balancer DNS name in the primary region. 
This internal load balancer DNS name generally follows the format: `internal-turbot-5-49-0-lb-1234567890.us-east-1.elb.amazonaws.com`. + +### Step 5: Review + +- [ ] Ensure cross-region backup is operational. +- [ ] Verify DNS configurations and API mappings. +- [ ] Test workspace e.g. `console.turbot.acme.com` access is successful in the DR region. + +## Next Steps + +Learn more about: + +- [Turbot Guardrails Hosting Architecture](/guardrails/docs/guides/hosting-guardrails/architecture). +- [DR Architecture Options](/guardrails/docs/guides/hosting-guardrails/disaster-recovery/architecture-options). +- [Multi-Region Failover](/guardrails/docs/guides/hosting-guardrails/disaster-recovery/multi-region-failover) + +## Assistance + + + +If you continue to encounter issues, please open a ticket with us and attach the relevant information to assist you more efficiently. [Open Support Ticket](https://support.turbot.com) \ No newline at end of file diff --git a/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/manage-crossregion-replication.png b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/manage-crossregion-replication.png new file mode 100644 index 00000000..18246b08 Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/manage-crossregion-replication.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/tier-3.png b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/tier-3.png new file mode 100644 index 00000000..cf0327ca Binary files /dev/null and b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment/tier-3.png differ diff --git a/docs/guides/hosting-guardrails/disaster-recovery/multi-region-failover/index-original.md b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-failover/index-original.md new file mode 100644 index 00000000..67467c9d --- /dev/null +++ 
b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-failover/index-original.md @@ -0,0 +1,98 @@ +--- +title: "Multi-Region Failover" +template: Documentation +nav: + title: "Multi-Region Failover" + order: 12 +--- + +# Multi-Region Failover + +## 1. Introduction + +This document outlines the steps required to execute a Disaster Recovery (DR) failover for Turbot Guardrails Multi-Region deployment, ensuring minimal downtime and data loss. It covers the process to switch operations from the primary region to the DR region in the event of a failure. + +## 2. Failover Scenarios + +The DR failover may be triggered under the following conditions: + +- Scheduled compliance testing, as mandated by industry standards (ISO 27001, NIST 800-34, SOC 2, HIPAA, PCI-DSS) or internal governance policies. Organizations typically conduct DR simulations every 6 months or annually to validate readiness. +- Complete failure of the primary region. +- Significant degradation in performance affecting operations. +- Security incidents requiring immediate isolation of the primary region. + +## 3. Prerequisites for Failover + +Before initiating failover, ensure the following: + +- All three provisioned products, TEF, TED, and TE, should be of the same version in both the primary and DR regions. +- The DR region infrastructure is fully set up as per the [Multi-Region Deployment Guide](guides/hosting-guardrails/disaster-recovery/multi-region-deployment). +- The cross-region RDS backups are active and up-to-date. +- API Gateway and Load Balancer configurations in the DR region are in place. +- Mods on the test workspace in the DR region should ideally match the versions in the primary region. If there are discrepancies, they can be updated after the DR process is completed. +- DNS records can be updated to redirect traffic to the DR region. + +## 4. Failover Execution Steps + +### 4.1 Database Failover + +In the DR region, + +1. Navigate to AWS RDS. +2. 
Select the existing Turbot Guardrails database instance. +3. Rename the database by appending -temp to its name (e.g., if the database name is turbot-newton, rename it to turbot-newton-temp). +4. Apply the changes immediately and wait for the rename operation to complete. +5. Once the rename is complete, navigate to the “Automated backups” section. +6. Under the "Replicated" tab, locate and select the Turbot Guardrails database. +7. Click on the "Actions" dropdown menu and choose "Restore to point in time". +8. A Restore to Point in Time window will open. +9. Select an appropriate point in time to restore from. In most cases, choosing "Latest restorable time" is recommended. +10. Ensure the restored database settings match those of the primary region as closely as possible. +11. Set the DB Instance Identifier to match the primary region’s database identifier (e.g., turbot-newton). +12. Select the correct VPC, subnet, and security group configurations. +13. Leave the "Initial database name" field blank. +14. Use the same parameter group as assigned to the turbot-newton-temp database. +15. Click "Restore to point in time" to initiate the restoration process. +16. Wait for the database instance to reach the available state. +17. Once the instance is available, navigate to AWS Service Catalog. +18. Toggle the "Parameter Deployment Trigger" in TEF, TED, and TE from Blue <-> Green, ensuring all services transition properly to the DR setup. + +### 4.2 API Gateway and Load Balancer Updates + +1. Navigate to **AWS API Gateway** in the DR region. +2. Ensure the **custom domain name** (`gateway.cloudportal.company.com`) is correctly mapped (in API mappings tab) to the API Gateway in DR. +3. Update DNS records: + - **API Gateway:** Point `gateway.cloudportal.company.com` to the DR region's API Gateway endpoint. + - **Console Access:** Update `console.cloudportal.company.com` to point to the internal load balancer in the DR region. + +### 4.3 Application Validation + +1. 
Confirm that **Turbot Guardrails** services are accessible via the **DR region endpoints**. +2. Perform a **test login** to the Turbot Guardrails console. +3. Validate that database queries and API requests are **functioning correctly**. +4. Check logs for any **errors or inconsistencies**. + +## Step 5: Validation + +This step validates the DR process. + +- [ ] Local User Authentication: Log in using a local user account to confirm that existing credentials remain valid and can successfully decrypt secrets. +- [ ] SAML Authentication: Log in using a SAML-based user account to verify that authentication works as expected. +- [ ] Permission Assignment: Grant new permissions and confirm that the system correctly recognizes and applies them. +- [ ] Resource & Control Verification: Ensure that the resource count and control status align with pre-disaster values. +- [ ] CMDB Control Execution: Run a CMDB control to validate that API calls to the cloud provider are functioning properly. +- [ ] Event Handlers Validation: Create a new S3 bucket and check if it appears in the Guardrails UI, confirming that real-time event processing and API Gateway are operational. +- [ ] Stack Validation: Execute an Event Handler stack or another relevant stack to ensure that the Factory and Containers operate correctly. +- [ ] Turbot Resource Test: Create and delete a test Turbot Folder to verify system stability and proper resource lifecycle management. + +## 5. Failback to Primary Region + +Once the primary region is restored, follow these steps: + +1. **Sync any new data** from the DR region back to the primary database. +2. **Update DNS records** to point back to the primary region's API Gateway and Load Balancer. +3. **Validate application functionality** in the primary region before resuming normal operations. + +## Additional Assistance + +Turbot Support is happy to consult with Enterprise customers to help determine a strategy to manage these scenarios. 
Contact us at [help@turbot.com](mailto:help@turbot.com). diff --git a/docs/guides/hosting-guardrails/disaster-recovery/multi-region-failover/index.md b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-failover/index.md new file mode 100644 index 00000000..3acf8228 --- /dev/null +++ b/docs/guides/hosting-guardrails/disaster-recovery/multi-region-failover/index.md @@ -0,0 +1,112 @@ +--- +title: Multi-Region Failover +sidebar_label: Multi-Region Failover +--- + +# Multi-Region Failover with Guardrails + +In this guide, you will: +- Execute a disaster recovery (DR) failover for Turbot Guardrails Multi-Region deployment. +- Validate the failover process and ensure system functionality. +- Learn how to failback to the primary region when appropriate. + +This guide provides detailed steps for executing a disaster recovery failover in a multi-region Turbot Guardrails deployment, ensuring minimal downtime and data loss during region transitions. + +> [!NOTE] +> Regular DR testing is crucial for maintaining compliance with industry standards and ensuring operational readiness. + +## Target Audience +**Guardrails Administrators** experienced with AWS cloud infrastructure, database management, and Guardrails operations. + +## Failover Scenarios + +| Scenario | Description | +|----------|-------------| +| Compliance Testing | Scheduled DR testing as required by ISO 27001, NIST 800-34, SOC 2, HIPAA, PCI-DSS. | +| Region Failure | Complete failure of the primary region requiring immediate failover. | +| Performance Issues | Significant degradation affecting operations. | +| Security Incidents | Situations requiring isolation of the primary region. | + +## Prerequisites + +Before initiating failover, ensure: + +- TEF, TED, and TE versions match in both primary and DR regions. +- DR region infrastructure is configured per the [Multi-Region Deployment Guide](/guardrails/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment). 
+- Cross-region RDS backups are current and active. +- API Gateway and Load Balancer configurations are ready in the DR region. +- Test workspace mods in the DR region match primary region versions. +- DNS records can be updated to redirect traffic to the DR region. + +Follow these implementation steps: + +## Step 1: Set Up Database Failover + +1. In the DR region AWS RDS console: + - Select the existing Guardrails database instance. + - Rename by appending `-temp` (e.g., `turbot-newton` to `turbot-newton-temp`). + - Wait for rename completion. + +2. Continue in the RDS console to restore from backup: + - Navigate to `Automated backups` > `Replicated` tab. + - Find the Guardrails database. + - Choose `Restore to point in time` from the **Actions** dropdown. + - Select `Latest restorable time` in the `Restore to Point in Time` window. + - Set the DB Instance Identifier *to match* the primary region (e.g., `turbot-newton`). + - Select the VPC, subnet, and security group configurations. + - Leave the `Initial database name` field blank. + - Use the same DB parameter group as assigned to the `turbot-newton-temp` database. + - Initiate the restoration process by selecting **Restore to point in time**. + - Wait for the database instance to reach the `available` state. + - Once the DB is in the `available` state, proceed to the next step. + +3. Update Enterprise Stacks: + - Access AWS Service Catalog. + - Toggle Parameter Deployment Trigger in TEF, TED, and TE. For more information, see [Guardrails Stack Updates](/guardrails/docs/guides/hosting-guardrails/updating-stacks#guardrails-stack-updates). This ensures all services transition properly to the DR setup. + +## Step 2: Update API Gateway and Load Balancer + +Configure API Gateway in the DR region: + + 1. Navigate to AWS API Gateway in the DR region. + 2. Verify the custom domain mapping: ensure the custom domain name (`gateway.cloudportal.company.com`) is correctly mapped (in the API mappings tab) to the API Gateway in DR. + 3. 
Update DNS records: + - **API Gateway:** Point `gateway.cloudportal.company.com` to the DR region's API Gateway endpoint. + - **Console Access:** Update `console.cloudportal.company.com` to point to the internal load balancer in the DR region. + +## Step 3: Validate DR Region Endpoint Access + +1. Confirm that *Turbot Guardrails* services are accessible via the *DR region endpoints*. +2. Perform a *test login* to the Turbot Guardrails console. +3. Validate that database queries and API requests are *functioning correctly*. +4. Check logs for any *errors or inconsistencies*. + +## Step 4: Review + +- [ ] Local User Authentication: Log in using a local user account to confirm that existing credentials remain valid and can successfully decrypt secrets. +- [ ] SAML Authentication: Log in using a SAML-based user account to verify that authentication works as expected. +- [ ] Permission Assignment: Grant new permissions and confirm that the system correctly recognizes and applies them. +- [ ] Resource & Control Verification: Ensure that the resource count and control status align with pre-disaster values. +- [ ] CMDB Control Execution: Run a CMDB control to validate that API calls to the cloud provider are functioning properly. +- [ ] Event Handlers Validation: Create a new S3 bucket and check if it appears in the Guardrails UI, confirming that real-time event processing and API Gateway are operational. +- [ ] Stack Validation: Execute an Event Handler stack or another relevant stack to ensure that the Factory and Containers operate correctly. +- [ ] Turbot Resource Test: Create and delete a test Turbot Folder to verify system stability and proper resource lifecycle management. + +## Failback to Primary Region + +Once the primary region is restored, follow these steps: + +1. *Sync any new data* from the DR region back to the primary database. +2. *Update DNS records* to point back to the primary region's API Gateway and Load Balancer. +3. 
*Validate application functionality* in the primary region before resuming normal operations. + +## Next Steps + +Learn more about: +- [Turbot Guardrails Hosting Architecture](/guardrails/docs/guides/hosting-guardrails/architecture) +- [DR Architecture Options](/guardrails/docs/guides/hosting-guardrails/disaster-recovery/architecture-options) +- [Multi-Region Deployment](/guardrails/docs/guides/hosting-guardrails/disaster-recovery/multi-region-deployment) + +## Assistance + +If you encounter issues, please open a ticket with us and attach the relevant information to assist you more efficiently. [Open Support Ticket](https://support.turbot.com) diff --git a/docs/guides/hosting-guardrails/disaster-recovery/dr-testing/index.md b/docs/guides/hosting-guardrails/disaster-recovery/restore-workspace/dr-testing-original.md similarity index 100% rename from docs/guides/hosting-guardrails/disaster-recovery/dr-testing/index.md rename to docs/guides/hosting-guardrails/disaster-recovery/restore-workspace/dr-testing-original.md diff --git a/docs/guides/hosting-guardrails/disaster-recovery/restore-workspace/index.md b/docs/guides/hosting-guardrails/disaster-recovery/restore-workspace/index.md new file mode 100644 index 00000000..638cfad0 --- /dev/null +++ b/docs/guides/hosting-guardrails/disaster-recovery/restore-workspace/index.md @@ -0,0 +1,197 @@ +--- +title: Workspace Restore +sidebar_label: Workspace Restore +--- + +# Restoring a Workspace + +In this guide, you will: + +- Test backup and restore procedures for Turbot Guardrails workspaces `within the single region`. +- Monitor and troubleshoot the disaster recovery process. + +An essential part of maintaining Turbot Guardrails is testing disaster recovery. This document covers the process for restoring a destroyed workspace. Restoration should be tested at least once a year, ideally twice. The goal is to have *Guardrails Admins* familiar with the restoration process and the tools involved. 
+ +Testing backup and restore procedures is critical for: + +- Validating backup integrity and restore processes +- Meeting compliance and audit requirements +- Training administrators on recovery procedures +- Measuring recovery time objectives (RTO) + +> [!NOTE] +> Workspace restoration is just *one* of several disaster recovery scenarios. Evaluate other scenarios as part of your organization's comprehensive disaster recovery strategy. + +## Prerequisites + +- Administrator access to AWS Console. +- Familiarity with Guardrails installation. +- Understanding of database backup/restore. +- Access to required AWS services such as RDS, CloudFormation, ECS and Route 53. + +## Process Summary + +- *Build a New Workspace* – Set up a fresh workspace for testing, install required mods, and `take an RDS snapshot`. +- *Simulate Disaster* – `Destroy the workspace` by deleting its CloudFormation stack. +- *Restore the Workspace* – Recover data from the latest backup, apply migrations, and restart the workspace. +- *Validate Restoration* – Log in and verify the workspace is functional. + +>[!IMPORTANT] +> +> Only test with non-production workspaces +> +> Document all parameters and configurations +> +> Time the restore process to measure RTO +> +> Test regularly (recommended twice per year) +> +> Follow security best practices + + +## Step 1: Build a New Workspace + +In this phase, create a workspace and install baseline mods. Then, [import an AWS account](/guardrails/docs/guides/aws/import-aws-account) with **Event Pollers**. + +> [!NOTE] +> Same process applies to **Azure** and **GCP**. +> +> This process assumes that Route53 is used for DNS. Customers with manually configured DNS will need to keep track of their configuration. + +### Steps: + +1. 
**Select TE Version**: + - Choose a dedicated TE version for testing + - Note: ECS container flush during restore may cause brief outages for workspaces using this TE version + - If multiple workspaces use this TE version, [pause event processing](enterprise/FAQ/pause-events) + +2. **Access AWS Master Account**: + - Navigate to the alpha region of your AWS Master account + +3. **Create Test Workspace**: + - Follow the [workspace creation guide](/guardrails/docs/guides/hosting-guardrails/installation/workspace-manager#create-a-workspace) + - `Save all CloudFormation parameters` used (needed for restoration) + - Record credentials from CloudFormation Stack outputs + - Note the Turbot ID of workspace Turbot Root (`tmod:@turbot/turbot#/`) + +4. **Install Required AWS Mods**: + - `aws` + - `aws-iam` + - `aws-kms` + - `aws-s3` + +5. **Configure Workspace**: + - Create "AWS" folder under Turbot Root + - Import an AWS account into the folder + - Verify no controls/policies are in `tbd` state + +6. **Document Initial State**: + - Take screenshots of workspace dashboard + - Record key metrics: + - Number of resources + - Active controls count + - Other relevant statistics + - Save for post-restore validation + +7. **Create Backup**: + - Wait for automated "Restore to point in time" backup + - Or take a manual RDS backup + + +## Step 2: Drop the Workspace + +> [!WARNING] +> Do not delete a production workspace CloudFormation Stack. +> +> Do not delete original database. + +1. Delete the **Workspace CloudFormation stack** created earlier. +2. If necessary, **force delete** the workspace. +3. Verify that the **workspace URL is no longer accessible**. + +## Step 3: Restore the Workspace + +In this step, we will recreate a new workspace which initializes an empty database schema. The goal is to restore this empty schema with the data from our restored DB, effectively bringing back the workspace to its previous state. 
This process ensures we maintain the database structure while recovering all workspace configurations, resources, and control states from the backup. + +### Steps: + +1. **Start RTO Measurement**: + - Begin timing the restore process + - This helps determine your Recovery Time Objective (RTO) + +2. **Recreate Workspace**: + - Use the original Workspace [CloudFormation template](/guardrails/docs/guides/hosting-guardrails/installation/workspace-manager#step-2-download-cloudformation-template) + - Apply identical parameter values from the original workspace + - Deploy the new workspace stack + +3. **Restore Database**: + - Navigate to AWS RDS console + - Choose either: + - Restore from snapshot, or + - Use "Restore to point in time" feature + - Ensure restored DB configurations match original: + - Instance class + - Storage type/size + - Network settings + - Security groups + +4. **Configure Temporary Database**: + - Wait for restored DB to become available + - Record the new database endpoint + - Verify connectivity + +5. **Deploy Bastion Host**: + - Launch a Turbot Bastion Host instance by following the [Turbot Bastion Host Setup](https://github.com/turbot/guardrails-samples/tree/main/enterprise_installation/turbot_bastion_host) guide + - Ensure network access to both databases + +6. **Execute Migration**: + - Run [migration script](https://github.com/turbot/guardrails-samples/tree/main/guardrails_utilities/turbot_schema_migration) to copy DB schema: + - From (Source): The restored database + - To (Target): The existing database that hosts the new, empty workspace schema + +```shell +nohup ./migration.sh & + +# Example: nohup ./migration.sh panda turbot-panda.abcxyzabcxyz.us-east-1.rds.amazonaws.com turbot-babbage.abcxyzabcxyz.us-east-1.rds.amazonaws.com & +``` + +7. Wait for the `pg_dump` and `pg_restore` processes in `migration.sh` to complete. +8. **Flush ECS Containers**: + - In the AWS **ECS console**, select the **Cluster** and open the **Tasks** tab + - Locate the **TE version-related tasks** and **stop them**. 
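The ECS flush in step 8 can also be scripted from a host with the AWS CLI configured. The following is a minimal sketch, not part of the Guardrails tooling: the function names, the cluster name, and the filter string are hypothetical, and it assumes the TE version is recognizable in each task's task definition ARN, which you should verify in your own environment. It defaults to a dry run that only prints the tasks it would stop.

```shell
#!/usr/bin/env bash
# Sketch only: stop ECS tasks whose task definition ARN contains a given string.
# The cluster name and filter string passed in are hypothetical; check how TE
# tasks are actually named in your environment before relying on this match.
set -euo pipefail

# Reads "<taskArn><TAB><taskDefinitionArn>" lines and prints the task ARNs
# whose task definition ARN contains the filter string given as $1.
filter_tasks() {
  awk -F'\t' -v f="$1" 'index($2, f) > 0 { print $1 }'
}

# Prints one "<taskArn><TAB><taskDefinitionArn>" line per task in cluster $1.
list_tasks() {
  local cluster="$1" arns
  arns=$(aws ecs list-tasks --cluster "$cluster" --query 'taskArns[]' --output text)
  if [ -z "$arns" ] || [ "$arns" = "None" ]; then
    return 0
  fi
  # $arns is intentionally unquoted so each ARN becomes its own argument.
  # describe-tasks accepts at most 100 tasks per call; larger clusters need batching.
  aws ecs describe-tasks --cluster "$cluster" --tasks $arns \
    --query 'tasks[].[taskArn,taskDefinitionArn]' --output text
}

# Usage: flush_te_tasks <cluster> <filter>
# DRY_RUN=1 (default) only prints the tasks; DRY_RUN=0 stops them.
flush_te_tasks() {
  local cluster="$1" filter="$2" task
  list_tasks "$cluster" | filter_tasks "$filter" | while read -r task; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "would stop: $task"
    else
      aws ecs stop-task --cluster "$cluster" --task "$task" \
        --reason "DR restore: flush containers" >/dev/null
    fi
  done
}
```

For example, `flush_te_tasks my-cluster 5_47_0` would list matching tasks without stopping them, and `DRY_RUN=0 flush_te_tasks my-cluster 5_47_0` would stop them; both arguments are placeholders. Reviewing the dry-run output first is a reasonable safeguard, since other workspaces may share the same TE version.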
+ +## Step 4: Clear Redis Cache + +To clear the workspace from Redis, log in to the **bastion host** and execute: + +```shell +export REDISHOST=master.turbot-babbage-cache-cluster.abcxyz.use1.cache.amazonaws.com +redis-cli -h $REDISHOST --tls -p 6379 -a "<password>" KEYS "<workspace>*" | xargs redis-cli -h $REDISHOST --tls -p 6379 -a "<password>" DEL + +# Example: redis-cli -h $REDISHOST --tls -p 6379 -a mysecurepassword KEYS "panda*" | xargs redis-cli -h $REDISHOST --tls -p 6379 -a mysecurepassword DEL +``` + +## Step 5: Review + +This step validates the restoration process. + +- [ ] **Login Validation**: Ensure the **previous credentials** still work. +- [ ] **Resource & Control Check**: Verify the **number of resources** and **controls** match pre-disaster stats. +- [ ] **Test New Resource Import**: Create a new **S3 bucket** and verify it appears in **Guardrails UI**. +- [ ] **Verify Control Execution**: Run a **control scan** to confirm that all controls are in **OK** or **Skipped** state. + +## Next Steps + +Explore the following resources to expand your understanding of Guardrails disaster recovery and workspace management: + +- [Turbot Guardrails Hosting Architecture](/guardrails/docs/guides/hosting-guardrails/architecture) +- [Disaster Recovery Architecture Options](/guardrails/docs/guides/hosting-guardrails/disaster-recovery/architecture-options) + + +## Troubleshooting + +| **Issue** | **Description** | **Guide** | +|----------------------------------|----------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------| +| **Workspace Not Accessible** | If the workspace does not restore correctly, ensure that **RDS endpoints are correct** in the migration script. | | +| **Redis Cache Not Cleared** | If controls fail to execute, verify that **Redis cache clearing** was performed correctly. | See **Step 4: Clear Redis Cache** in this guide. 
| +| **Further Assistance** | If the issue persists, open a support ticket and provide **logs & screenshots** for faster resolution. | [Open Support Ticket](https://support.turbot.com) | diff --git a/docs/guides/using-guardrails/console/index.md b/docs/guides/using-guardrails/console/index.md index 659005db..4dc19800 100644 --- a/docs/guides/using-guardrails/console/index.md +++ b/docs/guides/using-guardrails/console/index.md @@ -23,7 +23,7 @@ The **Header** is consistent throughout and consists of: - **Controls**: Quickly and easily find relevant controls, as well as getting an environment wide overview of controls in various states. - **Permissions**: Assign and revoke permissions. - [Directories](guides/directories) can be viewed, modified, and created by + [Directories](/guardrails/docs/guides/configuring-guardrails/directories/local#guardrails-local-directories) can be viewed, modified, and created by clicking the Directories card. - **Reports**: Get curated information, such as CIS controls by account, and easily export the results to CSV. @@ -91,7 +91,7 @@ GraphQL, or a scripting language. The Policies tab dashboard provides visibility into the policy settings, policy types and policy packs. The text field can be used to manipulated using -[filters](reference/filters/policies) to return specific information. The +[filters](/guardrails/docs/reference/filter) to return specific information. The `Policy Packs` and `Policy Settings` cards can be used to easily navigate to those pages. diff --git a/docs/guides/using-guardrails/iam/administrators/index.md b/docs/guides/using-guardrails/iam/administrators/index.md index a98923d6..16fb1082 100644 --- a/docs/guides/using-guardrails/iam/administrators/index.md +++ b/docs/guides/using-guardrails/iam/administrators/index.md @@ -74,7 +74,7 @@ Guardrails allows administrators to set a custom minimum password length for loc ## Setting Guardrails API Keys to expire -By default, Guardrails API keys do not expire. 
Passwords for users in the Guardrails Local directory expire by default at 365 days. Usually, the API keys for the break glass described above are method of last resort to get back into a workspace in the event of an emergency. The [Password Reset](https://github.com/turbot/guardrails-samples/tree/main/queries/password_reset) queries in the Guardrails Samples Repo require API keys. These instructions describe a method for expiring all Guardrails API keys except the break glass user(s). A benefit of this approach is that it makes it easy to apply to one directory but not others. +By default, Guardrails API keys do not expire. Passwords for users in the Guardrails Local directory expire after 365 days by default. Usually, the API keys for the break glass user(s) described above are a method of last resort to get back into a workspace in the event of an emergency. The [Password Reset](https://github.com/turbot/guardrails-samples/blob/main/queries/iam/reset_local_directory_user_password.graphql) queries in the Guardrails Samples Repo require API keys. These instructions describe a method for expiring all Guardrails API keys except those of the break glass user(s). A benefit of this approach is that it makes it easy to apply to one directory but not others. 1. Use the "Aging Turbot Access Keys" report to get an idea of which keys this policy will deactivate. API keys in this report show all keys over 90 days of age, regardless of "Active" or "Inactive" status. 2. In the Terraform below, adjust the regex to match the break glass user(s). Make additional changes to the calc policy as required. diff --git a/docs/guides/using-guardrails/iam/advanced/index.md b/docs/guides/using-guardrails/iam/advanced/index.md index 6598fb04..a62dc715 100644 --- a/docs/guides/using-guardrails/iam/advanced/index.md +++ b/docs/guides/using-guardrails/iam/advanced/index.md @@ -16,7 +16,7 @@ directed to [help@turbot.com](mailto:help@turbot.com). 
- [Authentication](concepts/iam/authentication) - [Identity](concepts/iam/identity) - [Permissions](concepts/iam/permissions) -- [Directories in Guardrails](guides/directories) +- [Directories in Guardrails](/guardrails/docs/guides/configuring-guardrails/directories/local#guardrails-local-directories) - [Guide to IAM Management](guides/iam) Read everything under `IAM`. - [Guardrails Admin Best Practices](guides/iam/administrators) diff --git a/docs/guides/using-guardrails/iam/permission-assignment/index.md b/docs/guides/using-guardrails/iam/permission-assignment/index.md index 5d8b3e4b..d535b85b 100644 --- a/docs/guides/using-guardrails/iam/permission-assignment/index.md +++ b/docs/guides/using-guardrails/iam/permission-assignment/index.md @@ -15,7 +15,7 @@ all profiles require users to log in to initiate Guardrails profile creation. Without a profile for the specific user, PERMISSIONS CANNOT BE ASSIGNED. For more information regarding directory creation, head on over to our -[directories guide](guides/directories). +[directories guide](/guardrails/docs/guides/configuring-guardrails/directories/local#guardrails-local-directories). For a general Guardrails IAM overview, check out the [IAM concepts page](concepts/iam). @@ -58,7 +58,7 @@ We can see that the Demo User is in the **Turbot Local** directory with ## SAML and Google Directories -1. After [directory setup](guides/directories/), users will be able to log into +1. After [directory setup](/guardrails/docs/guides/configuring-guardrails/directories/local#guardrails-local-directories), users will be able to log into the Guardrails console. Users MUST sign in prior to initial permission assignment. Logging in with a user for the first time creates the associated profile in Guardrails. 
diff --git a/docs/guides/using-guardrails/notifications/index.md b/docs/guides/using-guardrails/notifications/index.md index e4ff1556..f6a8fff5 100644 --- a/docs/guides/using-guardrails/notifications/index.md +++ b/docs/guides/using-guardrails/notifications/index.md @@ -18,14 +18,14 @@ Guardrails currently supports the following delivery channels for notifications: 1. **Email notifications** are sent from `guardrails@system.turbot.com` for SaaS customers. Enterprise customers running their own Guardrails environment can configure custom smtp hosts and `sent from` email address. 2. **Slack notifications** are sent via standard webhooks. For documentation on configuring webhooks for slack see: `https://api.slack.com/messaging/webhooks` 3. **Microsoft Teams notifications** are also sent via webhooks. For Teams documentation see: https://learn.microsoft.com/en-us/microsoftteams/platform/webhooks-and-connectors/how-to/add-incoming-webhook?tabs=dotnet -4. **Event Streams** can be created and consumed using the [Guardrails Firehose](guides/firehose) feature. +4. **Event Streams** can be created and consumed using the [Guardrails Firehose](/guardrails/docs/guides/configuring-guardrails/firehose) feature. ## Notification Triggers ### Control Triggers -Control notifications can be triggered any time a [Guardrails' Control]() runs. Typically these control notifications will be filtered based on your rules to ensure notifications are only sent when important changes occur. The most common example would be when a control state changes from `OK` state to `Alarm` state. +Control notifications can be triggered any time a [Guardrails' Control](/guardrails/docs/concepts/controls) runs. Typically these control notifications will be filtered based on your rules to ensure notifications are only sent when important changes occur. The most common example would be when a control state changes from `OK` state to `Alarm` state. 
### Action Triggers diff --git a/docs/guides/using-guardrails/stacks/deploy/index.md b/docs/guides/using-guardrails/stacks/deploy/index.md index 3509e4cf..46721904 100644 --- a/docs/guides/using-guardrails/stacks/deploy/index.md +++ b/docs/guides/using-guardrails/stacks/deploy/index.md @@ -16,7 +16,7 @@ In this example, we will use the example source in the `Deploy AWS IAM Stack` po ## Prerequisites - Guardrails: [TE](https://turbot.com/guardrails/docs/guides/hosting-guardrails/updating-stacks/update-workspace) 5.47+, with [aws-iam](https://hub.guardrails.turbot.com/mods/aws/mods/aws-iam) mod 5.39+ -- Tools: [git](git-scm.com), [Terraform](https://developer.hashicorp.com/terraform) or [OpenTofu](https://opentofu.org/), [Guardrails CLI credentials](https://turbot.com/guardrails/docs/reference/cli/installation#set-up-your-turbot-guardrails-credentials) configured +- Tools: [git](https://git-scm.com/), [Terraform](https://developer.hashicorp.com/terraform) or [OpenTofu](https://opentofu.org/), [Guardrails CLI credentials](https://turbot.com/guardrails/docs/reference/cli/installation#set-up-your-turbot-guardrails-credentials) configured - [One or more AWS accounts imported](/guardrails/docs/guides/aws/import-aws-account) diff --git a/docs/guides/using-guardrails/stacks/destroy/index.md b/docs/guides/using-guardrails/stacks/destroy/index.md index 7fcd7dfe..db41c81b 100644 --- a/docs/guides/using-guardrails/stacks/destroy/index.md +++ b/docs/guides/using-guardrails/stacks/destroy/index.md @@ -11,7 +11,7 @@ In this guide, you will configure stack policies to preview deletion and then de ## Prerequisites - Guardrails: [TE](https://turbot.com/guardrails/docs/guides/hosting-guardrails/updating-stacks/update-workspace) 5.47+, with [aws-iam](https://hub.guardrails.turbot.com/mods/aws/mods/aws-iam) mod 5.39+ -- Tools: [git](git-scm.com), [Terraform](https://developer.hashicorp.com/terraform) or [OpenTofu](https://opentofu.org/), [Guardrails CLI 
credentials](https://turbot.com/guardrails/docs/reference/cli/installation#set-up-your-turbot-guardrails-credentials) configured +- Tools: [git](https://git-scm.com/), [Terraform](https://developer.hashicorp.com/terraform) or [OpenTofu](https://opentofu.org/), [Guardrails CLI credentials](https://turbot.com/guardrails/docs/reference/cli/installation#set-up-your-turbot-guardrails-credentials) configured - [One or more AWS accounts imported](/guardrails/docs/guides/aws/import-aws-account) - Install and attach the [Deploy AWS IAM Stack](https://hub.guardrails.turbot.com/policy-packs/aws_iam_deploy_aws_iam_stack) policy pack, per the [Running Stacks guide](/guardrails/docs/guides/using-guardrails/stacks/deploy) diff --git a/docs/guides/using-guardrails/stacks/import-stack-resource/index.md b/docs/guides/using-guardrails/stacks/import-stack-resource/index.md new file mode 100644 index 00000000..81c8c1d4 --- /dev/null +++ b/docs/guides/using-guardrails/stacks/import-stack-resource/index.md @@ -0,0 +1,128 @@ +--- +title: Importing Stack Resources +sidebar_label: Import Stack Resources +--- + +# Importing Stack Resources in Guardrails + +In this guide, you will: + +- Learn how to **import existing AWS resources** into a Guardrails stack. +- Modify the **stack modifier policy** to include import statements. +- Apply the import configuration at the **folder level** for structured deployment. + +Guardrails allows you to bring existing AWS resources under stack management using **import statements**. This enables Guardrails to track and enforce configuration policies on the imported resources. + +## Prerequisites + +- **Turbot/Owner** or **Turbot/Admin** permissions at the required resource level. +- Familiarity with **Terraform/OpenTofu** and Guardrails stack controls. +- Access to the Guardrails console. +- A **configured Terraform provider** for AWS. 
+ +--- + +## Step 1: Locate the Existing Resource + +Before importing, identify the **AWS S3 bucket** that you want to manage using Guardrails. + +1. **Log in to AWS Console**. +2. Navigate to **Amazon S3** and list the existing buckets. +3. Note down the **S3 bucket name** and **AWS Account ID**. + +Example AWS CLI command: +```bash +aws s3 ls +``` +Expected output: +```plaintext +2025-01-01 12:30:00 example-s3-bucket +``` + +--- + +## Step 2: Retrieve Import Script from Guardrails + +Guardrails provides an **import script** for existing resources. To generate it: + +1. **Log in to the Guardrails console**. +2. Navigate to **Resources** and locate the S3 bucket. +3. Open the **Developer tab** and find the generated **import script**. +4. Copy the import block. + +Example import block for an S3 bucket: +```hcl +import { + to = aws_s3_bucket.example + id = "example-s3-bucket" +} +``` + +--- + +## Step 3: Modify the Stack Import Policy + +To import the S3 bucket, update the **AWS > S3 > Bucket > Stack [Native] > Modifier** policy. + +1. Go to **Policies** in the Guardrails console. +2. Search for **AWS > S3 > Bucket > Stack [Native] > Modifier**. +3. Click **New Policy Setting**. +4. Apply the following **Terraform import block** in the policy at the **folder level**. + +Example Terraform configuration: +```hcl +resource "aws_s3_bucket" "example" { + bucket = "example-s3-bucket" +} + +import { + to = aws_s3_bucket.example + id = "example-s3-bucket" +} +``` +5. Click **Save** to apply the policy. + +--- + +## Step 4: Deploy the Stack in Guardrails + +Once the modifier policy is updated, execute the **stack deployment**. + +1. Navigate to **Stacks** in Guardrails. +2. Locate the **AWS S3 Bucket Stack**. +3. Click **Deploy Stack**. +4. Confirm the import in the **Terraform plan output**. + +Example Terraform CLI command: +```bash +terraform apply +``` +Expected output: +```plaintext +aws_s3_bucket.example: Importing... 
+aws_s3_bucket.example: Import successful +``` + +--- + +## Step 5: Review + +- [ ] Verify the imported S3 bucket appears in **Guardrails Console > Resources**. +- [ ] Navigate to **Stacks** and ensure the imported bucket is **tracked**. +- [ ] Check the **Policies tab** to confirm the **import statement is applied**. +- [ ] Run a **stack plan** to confirm successful import. + +--- + +## Troubleshooting + +| Issue | Description | Guide | +|--------|------------|------| +| **Resource Not Found** | Import failed due to an incorrect bucket name. | Verify the bucket name in AWS Console. | +| **Permission Denied** | Guardrails lacks the required permissions. | Ensure IAM roles are correctly assigned. | +| **Import Fails in Terraform** | The resource is already managed. | Remove the resource from Terraform state before re-importing. | + +--- + +## Next Steps + +- [Deploy a Stack](https://turbot.com/guardrails/docs/guides/using-guardrails/stacks/deploy) +- [Destroy a Stack](https://turbot.com/guardrails/docs/guides/using-guardrails/stacks/destroy) diff --git a/docs/sidebar.json b/docs/sidebar.json index f6202187..49ef6102 100644 --- a/docs/sidebar.json +++ b/docs/sidebar.json @@ -225,6 +225,7 @@ "id": "configuring-guardrails", "link": "guides/configuring-guardrails", "items": [ + "guides/configuring-guardrails/activity-retention", "guides/configuring-guardrails/install-mod", "guides/configuring-guardrails/update-mod", "guides/configuring-guardrails/uninstall-mod", @@ -329,7 +330,8 @@ "link": "guides/using-guardrails/stacks", "items": [ "guides/using-guardrails/stacks/deploy", - "guides/using-guardrails/stacks/destroy" + "guides/using-guardrails/stacks/destroy", + "guides/using-guardrails/stacks/import-stack-resource" ] }, { @@ -460,8 +462,12 @@ "id": "disaster-recovery", "link": "guides/hosting-guardrails/disaster-recovery", "items": [ - "guides/hosting-guardrails/disaster-recovery/restore", - "guides/hosting-guardrails/disaster-recovery/dr-testing" + 
"guides/hosting-guardrails/disaster-recovery/architecture-options", + "guides/hosting-guardrails/disaster-recovery/hive-restore", + "guides/hosting-guardrails/disaster-recovery/database-upgrade", + "guides/hosting-guardrails/disaster-recovery/restore-workspace", + "guides/hosting-guardrails/disaster-recovery/multi-region-deployment", + "guides/hosting-guardrails/disaster-recovery/multi-region-failover" ] } ]