Terraform module for deploying GraphDB HA cluster in Microsoft Azure

Ontotext-AD/terraform-azure-graphdb

GraphDB Azure Terraform Module

This repository contains a set of Terraform modules for deploying Ontotext GraphDB HA cluster on Microsoft Azure.

About GraphDB

Ontotext GraphDB is a highly efficient, scalable and robust graph database with RDF and SPARQL support. With excellent enterprise features, integration with external search applications, compatibility with industry standards, and both community and commercial support, GraphDB is the preferred database choice of both small independent developers and big enterprises.

GraphDB is available on the Azure Marketplace in several listings depending on your needs.

Features

The module provides the building blocks for configuring, deploying and provisioning a highly available GraphDB cluster across multiple availability zones using a VM scale set. Key features of the module include:

  • Azure VM scale set across multiple Availability Zones
  • Azure Application Gateway for load balancing and TLS termination
  • Azure Private Link with private Application Gateway
  • Azure NAT gateway for outbound connections
  • Automated backups in Azure Blob Storage
  • Azure Private DNS for internal GraphDB cluster communication
  • Azure Key Vault for storing sensitive configurations
  • Optional Azure Bastion deployment
  • User assigned identities for RBAC authorization with the least privilege principle
  • and more

Modules Overview

| Module | Purpose | Features |
|--------|---------|----------|
| Vault Module | Creates a Key Vault for storing TLS certificates and secrets | Enables purge protection for Key Vault. Sets soft delete retention days for Key Vault. |
| Backup Module | Sets up a Storage Account for storing GraphDB backups. | Configures storage account tier and replication type. Defines retention policies for storage blobs and containers. |
| AppConfig Module | Establishes an App Configuration store for managing GraphDB configurations. | Enables purge protection for App Configuration. Sets soft delete retention days for App Configuration. |
| TLS Module | Manages TLS certificate secrets in Key Vault and their related identities. | Creates TLS certificate secrets in Key Vault. Configures identity related to the TLS certificate. |
| Application Gateway Module | Sets up a public IP address and Application Gateway for forwarding internet traffic to GraphDB proxies/instances. | Configures TLS certificate for the gateway. Enables private access and private link service. Defines global buffer settings. |
| Bastion Module | Deploys an Azure Bastion host for secure remote connections. | Configures the bastion host within the specified virtual network. |
| Monitoring Module | Configures Azure monitoring for the deployed resources. | Sets up Application Insights for the GraphDB scale set. Sets up web test availability monitoring. Defines retention policies for monitoring data. |
| GraphDB Module | Deploys a VM scale set for GraphDB and its cluster proxies. | Configures networking settings. Sets up GraphDB configurations and licenses. Defines backup storage, VM image, and managed disk settings. |

Versioning

The Terraform module follows the Semantic Versioning 2.0.0 rules and has a release lifecycle separate from the GraphDB versions. The following table shows the version compatibility between GraphDB and the Terraform module.

| GraphDB Terraform | GraphDB |
|-------------------|---------|
| Version 1.x.x | Version 10.6.x |
| Version 1.2.x | Version 10.7.x |
| Version 1.3.x | Version 10.7.x |
| Version 1.4.x | Version 10.8.x |
| Version 1.5.x | Version 10.8.x |

You can track the particular version updates of GraphDB in the changelog or the release notes.

Prerequisites

You need to authenticate to your subscription with the Azure CLI. See Authenticating using the Azure CLI for more details.

Additional steps include:

  • Enable VM Encryption At Host
  • Register AppConfiguration with az provider register --namespace "Microsoft.AppConfiguration"
  • Register AllowApplicationGatewayPrivateLink with az feature register --name AllowApplicationGatewayPrivateLink --namespace Microsoft.Network if you are planning on using Private Link

The Terraform module deploys a VM scale set based on a VM image published in the Azure Marketplace. This requires you to accept the terms which can be accomplished with Azure CLI:

az vm image terms accept --offer graphdb-ee --plan graphdb-byol --publisher ontotextad1692361256062

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| resource_group_name | The name of the existing resource group to use. If not provided, a new resource group will be created. | string | null | no |
| virtual_network_name | The name of the existing vnet to use. If not provided, a new virtual network will be created. | string | null | no |
| resource_name_prefix | Resource name prefix used for tagging and naming Azure resources | string | n/a | yes |
| location | Azure geographical location where resources will be deployed | string | n/a | yes |
| zones | Availability zones to use for resource deployment and HA | list(number) | [ 1, 2, 3 ] | no |
| tags | Common resource tags. | map(string) | {} | no |
| lock_resources | Enables a delete lock on the resource group to prevent accidental deletions. | bool | true | no |
| vmss_dns_servers | List of DNS servers for the VMSS | list(string) | [] | no |
| graphdb_external_address_fqdn | External FQDN address for the deployment | string | null | no |
| virtual_network_address_space | Virtual network address space CIDRs. | list(string) | [ "10.0.0.0/16" ] | no |
| gateway_subnet_address_prefixes | Subnet address prefixes CIDRs where the application gateway will reside. | list(string) | [ "10.0.1.0/24" ] | no |
| graphdb_subnet_address_prefixes | Subnet address prefixes CIDRs where GraphDB VMs will reside. | list(string) | [ "10.0.2.0/24" ] | no |
| gateway_private_link_subnet_address_prefixes | Subnet address prefixes where the Application Gateway Private Link will reside, if enabled | list(string) | [ "10.0.5.0/24" ] | no |
| management_cidr_blocks | CIDR blocks allowed to perform management operations such as connecting to Bastion or Key Vault. | list(string) | n/a | yes |
| inbound_allowed_address_prefix | Source address prefix allowed for connecting to the application gateway | string | "Internet" | no |
| inbound_allowed_address_prefixes | Source address prefixes allowed for connecting to the application gateway. Overrides inbound_allowed_address_prefix | list(string) | [] | no |
| outbound_allowed_address_prefix | Destination address prefix allowed for outbound traffic from GraphDB | string | "Internet" | no |
| outbound_allowed_address_prefixes | Destination address prefixes allowed for outbound traffic from GraphDB. Overrides outbound_allowed_address_prefix | list(string) | [] | no |
| gateway_global_request_buffering_enabled | Whether Application Gateway's Request buffer is enabled. | bool | true | no |
| gateway_global_response_buffering_enabled | Whether Application Gateway's Response buffer is enabled. | bool | true | no |
| gateway_enable_private_access | Enable or disable private access to the application gateway | bool | false | no |
| disable_agw | Disables the creation of Application Gateway by the Terraform module. | bool | false | no |
| gateway_enable_private_link_service | Set to true to enable Private Link service, false to disable it. | bool | false | no |
| gateway_private_link_service_network_policies_enabled | Enable or disable private link service network policies | string | false | no |
| gateway_backend_port | Backend port for the Application Gateway rules | number | 7201 | no |
| gateway_probe_interval | Interval in seconds between the health probe checks | number | 10 | no |
| gateway_probe_timeout | Timeout in seconds for the health probe checks | number | 1 | no |
| gateway_probe_threshold | Number of consecutive health checks to consider the probe passing or failing | number | 2 | no |
| context_path | The context path for the Application Gateway. | string | "" | no |
| tls_certificate_path | Path to a TLS certificate that will be imported in Azure Key Vault and used in the Application Gateway TLS listener for GraphDB. Either tls_certificate_path or tls_certificate_id must be provided. | string | null | no |
| tls_certificate_password | TLS certificate password for password-protected certificates. | string | null | no |
| tls_certificate_id | Resource identifier for a TLS certificate secret from a Key Vault. Overrides tls_certificate_path. Either tls_certificate_id or tls_certificate_path must be provided. | string | null | no |
| tls_certificate_identity_id | Identifier of a managed identity giving access to the TLS certificate specified with tls_certificate_id | string | null | no |
| key_vault_enable_purge_protection | Prevents purging the key vault and its contents by soft deleting it. It will be deleted once the soft delete retention has passed. | bool | true | no |
| key_vault_soft_delete_retention_days | Retention period in days during which soft deleted secrets are kept | number | 30 | no |
| app_config_enable_purge_protection | Prevents purging the App Configuration and its keys by soft deleting it. It will be deleted once the soft delete retention has passed. | bool | true | no |
| app_config_soft_delete_retention_days | Retention period in days during which soft deleted keys are kept | number | 7 | no |
| admin_security_principle_id | UUID of a user or service principal that will become data owner or administrator for specific resources that need permissions to insert data during Terraform apply, i.e. KeyVault and AppConfig. If left unspecified, the current user will be used. | string | null | no |
| graphdb_version | GraphDB version from the marketplace offer | string | "10.8.2" | no |
| graphdb_sku | GraphDB SKU from the marketplace offer | string | "graphdb-byol" | no |
| graphdb_image_id | GraphDB image ID to use for the scale set VM instances in place of the default marketplace offer | string | null | no |
| graphdb_license_path | Local path to a file containing a GraphDB Enterprise license. | string | n/a | yes |
| graphdb_cluster_token | Secret token used to secure the internal GraphDB cluster communication. Will generate one if left undeclared. | string | null | no |
| graphdb_password | Secret token used to access GraphDB cluster. | string | null | no |
| graphdb_properties_path | Path to a local file containing GraphDB properties (graphdb.properties) that would be appended to the default in the VM. | string | null | no |
| graphdb_java_options | GraphDB options to pass to GraphDB with GRAPHDB_JAVA_OPTS environment variable. | string | null | no |
| node_count | Number of GraphDB nodes to deploy in the VM scale set | number | 3 | no |
| instance_type | Azure instance type | string | n/a | yes |
| ssh_key | Public key for accessing the GraphDB instances | string | null | no |
| user_supplied_scripts | Array of additional shell scripts to execute sequentially after the templated user data shell scripts. | list(string) | [] | no |
| storage_account_tier | Specify the performance and redundancy characteristics of the Azure Storage Account that you are creating | string | "Standard" | no |
| storage_account_replication_type | Specify the data redundancy strategy for your Azure Storage Account | string | "ZRS" | no |
| storage_blobs_max_days_since_creation | Specifies the retention period in days since creation before deleting storage blobs | number | 31 | no |
| storage_account_retention_hot_to_cool | Specifies the retention period in days between moving data from hot to cool tier storage | number | 3 | no |
| storage_container_soft_delete_retention_policy | Number of days for retaining the storage container from actual deletion | number | 31 | no |
| storage_blob_soft_delete_retention_policy | Number of days for retaining storage blobs from actual deletion | number | 31 | no |
| backup_schedule | Cron expression for the backup job. | string | "0 0 * * *" | no |
| deploy_bastion | Deploy bastion module | bool | false | no |
| bastion_subnet_address_prefixes | Bastion subnet address prefixes | list(string) | [ "10.0.3.0/26" ] | no |
| deploy_monitoring | Deploy monitoring module | bool | true | no |
| disk_size_gb | Size of the managed data disk which will be created | number | 500 | no |
| disk_iops_read_write | Data disk IOPS | number | 7500 | no |
| disk_mbps_read_write | Data disk throughput | number | 250 | no |
| disk_storage_account_type | Storage account type for the data disks | string | "PremiumV2_LRS" | no |
| disk_network_access_policy | Network access policy for the managed disks | string | "DenyAll" | no |
| disk_public_network_access | Public network access enabled for the managed disks | bool | false | no |
| la_workspace_retention_in_days | The workspace data retention in days. Possible values are either 7 (Free Tier only) or range between 30 and 730. | number | 30 | no |
| la_workspace_sku | Specifies the SKU of the Log Analytics Workspace. Possible values are Free, PerNode, Premium, Standard, Standalone, Unlimited, CapacityReservation, and PerGB2018 (new SKU as of 2018-04-03). Defaults to PerGB2018. | string | "PerGB2018" | no |
| appi_retention_in_days | Specifies the retention period in days. | number | 30 | no |
| appi_daily_data_cap_in_gb | Specifies the Application Insights component daily data volume cap in GB. | number | 1 | no |
| appi_daily_data_cap_notifications_disabled | Specifies if a notification email will be sent when the daily data volume cap is met. | bool | false | no |
| appi_disable_ip_masking | By default the real client IP is masked as 0.0.0.0 in the logs. Use this argument to disable masking and log the real client IP | bool | true | no |
| appi_web_test_availability_enabled | Should the availability web test be enabled | bool | true | no |
| web_test_ssl_check_enabled | Should the SSL check be enabled? | bool | false | no |
| web_test_geo_locations | A list of geo locations the test will be executed from | list(string) | [ "us-va-ash-azr", "us-il-ch1-azr", "emea-gb-db3-azr", "emea-nl-ams-azr", "apac-hk-hkn-azr" ] | no |
| monitor_reader_principal_id | Principal (Object) ID of a user/group which would receive notifications from alerts. | string | null | no |
| notification_recipients_email_list | List of emails which will be notified via e-mail and/or push notifications | list(string) | [] | no |

Usage

To use the GraphDB module, create a new Terraform project or add to an existing one the following module block:

module "graphdb" {
  source  = "Ontotext-AD/graphdb/azure"
  version = "~> 1.0"

  resource_name_prefix = "graphdb"
  location             = "East US"
  zones                = [1, 2, 3]
  tags                 = {
    Environment = "dev"
  }

  instance_type            = "Standard_E8as_v5"
  graphdb_license_path     = "path-to-graphdb-license"
  ssh_key                  = "your-public-key"
  management_cidr_blocks   = ["your-ip-address"]
  tls_certificate_path     = "path-to-your-tls-certificate"

  # OPTIONAL: Required only if the password for the certificate is set
  tls_certificate_password = "password-for-your-tls-certificate"
}

Initialize the module and its required providers with:

terraform init

Before deploying, make sure to inspect the plan output with:

terraform plan

After a careful review of the output plan, deploy with:

terraform apply

Once deployed, you should be able to access the environment at the generated FQDN, which Terraform prints in its outputs at the end of the apply.

Examples

GraphDB Secrets

Instead of generating a random administrator password, you can provide one with:

graphdb_password = "s3cr37P@$w0rD"

The same applies to the shared GraphDB cluster secret. To override the randomly generated token, use the graphdb_cluster_token input listed in the Inputs table:

graphdb_cluster_token = "V6'vj|G]fpQ1_^9_,AE(r}Ct9yKuF&"
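
Hardcoding secrets in configuration files risks leaking them through version control. A minimal sketch, assuming the module is consumed from a root module as in the Usage section, passes both values through variables marked as sensitive. The variable names are illustrative; graphdb_password and graphdb_cluster_token are the module inputs listed in the Inputs table.

```hcl
# Hypothetical root-module variables. Supply values through TF_VAR_*
# environment variables or an untracked *.tfvars file.
variable "graphdb_password" {
  description = "GraphDB administrator password"
  type        = string
  sensitive   = true
}

variable "graphdb_cluster_token" {
  description = "Shared secret for internal GraphDB cluster communication"
  type        = string
  sensitive   = true
}

module "graphdb" {
  source  = "Ontotext-AD/graphdb/azure"
  version = "~> 1.0"

  # Required inputs, with placeholder values as in the Usage section
  resource_name_prefix   = "graphdb"
  location               = "East US"
  instance_type          = "Standard_E8as_v5"
  graphdb_license_path   = "path-to-graphdb-license"
  management_cidr_blocks = ["your-ip-address"]
  tls_certificate_path   = "path-to-your-tls-certificate"

  graphdb_password      = var.graphdb_password
  graphdb_cluster_token = var.graphdb_cluster_token
}
```

Marking a variable as sensitive only redacts it from plan and apply output. Terraform still records the value in the state file, so the state backend should be protected accordingly.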

GraphDB Configurations

The GraphDB instances can be customized either by providing a custom graphdb.properties file, which can contain any of the supported GraphDB configuration properties:

graphdb_properties_path = "<path_to_custom_graphdb_properties_file>"

Or by setting the GDB_JAVA_OPTS environment variable with graphdb_java_options. For example, if you want to print the command line flags, use:

graphdb_java_options = "-XX:+PrintCommandLineFlags"

Bastion

To enable the deployment of Azure Bastion, you simply need to enable the following flag:

deploy_bastion = true

Private Gateway with Private Link

To enable the Private Link service on a private Application Gateway, you need to enable the following flags:

gateway_enable_private_access       = true
gateway_enable_private_link_service = true

See Configure Azure Application Gateway Private Link for further information on configuring and using Application Gateway Private Link.
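
As a sketch of how the Private Link flags fit into a complete module block, the example below also overrides the Private Link subnet with the gateway_private_link_subnet_address_prefixes input from the Inputs table. The CIDR value is an illustrative assumption and must fit inside your virtual network address space. The AllowApplicationGatewayPrivateLink feature from the Prerequisites section is assumed to be registered already.

```hcl
module "graphdb" {
  source  = "Ontotext-AD/graphdb/azure"
  version = "~> 1.0"

  # Required inputs, with placeholder values as in the Usage section
  resource_name_prefix   = "graphdb"
  location               = "East US"
  instance_type          = "Standard_E8as_v5"
  graphdb_license_path   = "path-to-graphdb-license"
  management_cidr_blocks = ["your-ip-address"]
  tls_certificate_path   = "path-to-your-tls-certificate"

  # Private Application Gateway exposed through Private Link
  gateway_enable_private_access       = true
  gateway_enable_private_link_service = true

  # Optional: illustrative subnet for the Private Link service; it must
  # not overlap with the gateway, GraphDB or Bastion subnets.
  gateway_private_link_subnet_address_prefixes = ["10.0.6.0/24"]
}
```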

Providing a TLS certificate

There are two options for setting up the Application Gateway with a TLS certificate.

  1. Provide a local certificate file in PFX format with:
    tls_certificate_path     = "path-to-your-tls-certificate"
    
    # OPTIONAL: Required only if the password for the certificate is set
    tls_certificate_password = "tls-certificate-password"
    Note: This will create a dedicated Key Vault for storing the certificate.
  2. Or provide a reference to an existing TLS certificate with:
    tls_certificate_id          = "key-vault-certificate-secret-id"
    tls_certificate_identity_id = "managed-identity-id"

Note: One of the two options must be used, as TLS is required.
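
For the second option, the identifiers can be looked up with azurerm data sources instead of being pasted in by hand. The following is a sketch under assumptions: the Key Vault, certificate, identity and resource group names are hypothetical, the managed identity is assumed to already have read access to the certificate secret, and the module is assumed to accept the certificate's secret_id as tls_certificate_id.

```hcl
# Hypothetical names: replace them with the ones from your environment.
data "azurerm_key_vault" "tls" {
  name                = "existing-key-vault"
  resource_group_name = "existing-rg"
}

data "azurerm_key_vault_certificate" "graphdb" {
  name         = "graphdb-tls"
  key_vault_id = data.azurerm_key_vault.tls.id
}

data "azurerm_user_assigned_identity" "tls" {
  name                = "graphdb-tls-identity"
  resource_group_name = "existing-rg"
}

module "graphdb" {
  source  = "Ontotext-AD/graphdb/azure"
  version = "~> 1.0"

  # Required inputs, with placeholder values as in the Usage section
  resource_name_prefix   = "graphdb"
  location               = "East US"
  instance_type          = "Standard_E8as_v5"
  graphdb_license_path   = "path-to-graphdb-license"
  management_cidr_blocks = ["your-ip-address"]

  # Reference the existing certificate instead of a local PFX file
  tls_certificate_id          = data.azurerm_key_vault_certificate.graphdb.secret_id
  tls_certificate_identity_id = data.azurerm_user_assigned_identity.tls.id
}
```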

Purge Protection

Resources that support purge protection and soft delete have them enabled by default. You can override the default configurations with the following variables:

# Make sure the resource group delete lock is enabled for production
lock_resources = true

# Configure Key Vault purge protection in case of local TLS certificate usage
key_vault_enable_purge_protection    = true
key_vault_soft_delete_retention_days = 7 # From 7 to 90 days

app_config_enable_purge_protection    = true
app_config_soft_delete_retention_days = 7 # From 1 to 7 days

storage_container_soft_delete_retention_policy = 7 # From 1 to 365 days
storage_blob_soft_delete_retention_policy      = 7 # From 1 to 365 days

Managed Disks

Depending on the amount of data, expected statements or other factors, you might want to reconfigure the default options used for provisioning managed disks for persistent storage.

disk_size_gb         = 1250
disk_iops_read_write = 16000
disk_mbps_read_write = 1000

Monitoring

Resources related to monitoring (Application Insights) are deployed by default. You can change this with:

deploy_monitoring = false
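
When monitoring stays enabled, its behaviour can be tuned with the monitoring inputs from the Inputs table. The values below are illustrative assumptions rather than recommendations, and the e-mail address is a placeholder.

```hcl
module "graphdb" {
  source  = "Ontotext-AD/graphdb/azure"
  version = "~> 1.0"

  # Required inputs, with placeholder values as in the Usage section
  resource_name_prefix   = "graphdb"
  location               = "East US"
  instance_type          = "Standard_E8as_v5"
  graphdb_license_path   = "path-to-graphdb-license"
  management_cidr_blocks = ["your-ip-address"]
  tls_certificate_path   = "path-to-your-tls-certificate"

  deploy_monitoring = true

  # Illustrative values: longer retention and a higher daily data cap
  la_workspace_retention_in_days = 90
  appi_retention_in_days         = 90
  appi_daily_data_cap_in_gb      = 5

  # Also validate the TLS certificate in the availability web test
  web_test_ssl_check_enabled = true

  # Placeholder recipient for alert notifications
  notification_recipients_email_list = ["ops@example.com"]
}
```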

Custom GraphDB VM Image

You can provide the VMSS with a custom VM image by specifying graphdb_image_id, for example:

graphdb_image_id = "/subscriptions/<subscription_id>/resourceGroups/<resource_group_name>/providers/Microsoft.Compute/galleries/<gallery_name>/images/<image_definition_name>/versions/<image_version>"
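
Rather than writing the image ID by hand, it can be resolved with the azurerm_shared_image_version data source. This is a sketch that assumes the image lives in an Azure Compute Gallery. The gallery, image definition, version and resource group names are hypothetical.

```hcl
# Hypothetical gallery coordinates: replace them with your own.
data "azurerm_shared_image_version" "graphdb" {
  name                = "1.0.0"
  image_name          = "graphdb"
  gallery_name        = "my_gallery"
  resource_group_name = "images-rg"
}

module "graphdb" {
  source  = "Ontotext-AD/graphdb/azure"
  version = "~> 1.0"

  # Required inputs, with placeholder values as in the Usage section
  resource_name_prefix   = "graphdb"
  location               = "East US"
  instance_type          = "Standard_E8as_v5"
  graphdb_license_path   = "path-to-graphdb-license"
  management_cidr_blocks = ["your-ip-address"]
  tls_certificate_path   = "path-to-your-tls-certificate"

  # Use the custom image in place of the default marketplace offer
  graphdb_image_id = data.azurerm_shared_image_version.graphdb.id
}
```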

Deploying in Existing Resource Group and Virtual Network

To deploy into an existing Resource Group and Virtual Network, specify their names, for example:

resource_group_name  = "existing_rg"
virtual_network_name = "existing_vnet"

Deploying GraphDB with External Application Gateway and Custom Context Path

You can deploy GraphDB without creating a new Application Gateway, allowing you to use your existing one. Additionally, you can configure a custom context path for your application. To do this, follow these steps:

Prerequisites:

  • Resource Group: A resource group should already be created.
  • Virtual Network: A Virtual Network (VNet) should be set up and ready.
  • Application Gateway: Ensure your Application Gateway is deployed and fully operational.

Example Configuration:

context_path                  = "/graphdb"
disable_agw                   = true
virtual_network_name          = "your-VNet"
resource_group_name           = "your-resource-group"
graphdb_external_address_fqdn = "your-fqdn-or-ip"

Notes:

  • Setting disable_agw to true disables the creation of the Application Gateway by the Terraform module.
  • You need to provide graphdb_external_address_fqdn when using disable_agw.
  • The context_path variable sets the custom context path for your application.

Post-Deployment Actions: After applying the Terraform code, you must perform the following steps:

1. Configure the Application Gateway:

  • Path-Based Routing Rule: Set up a path-based routing rule on your Application Gateway to listen to the same context path. For example, if context_path = "/graphdb", the path-based rule should be /graphdb/*. You can use your external Application Gateway without the context path.

2. Add VMs or VMSS to Backend Pool:

  • Manually add your Virtual Machine Scale Sets (VMSS) to the Application Gateway’s backend pool as targets.

3. Upgrade VMSS Instances:

  • After assigning the VMSS to the backend pool and verifying that the Application Gateway can access the VMSS, upgrade your VMSS instances to the latest model or version. This is essential for the Application Gateway to identify them as valid targets within the backend pool.

4. Network Security Group (NSG) Configuration:

  • Configure NSG rules to allow traffic between the Application Gateway and the VMSS, ensuring the necessary access is in place.
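
The NSG step could be expressed in Terraform roughly as follows. This is a sketch under assumptions: the NSG and resource group names are hypothetical, the source prefix stands in for the subnet of your external Application Gateway, the destination prefix is the default graphdb_subnet_address_prefixes value, and the port is the default gateway_backend_port (7201). The priority is also illustrative and must not clash with existing rules.

```hcl
# Allows the external Application Gateway subnet to reach the GraphDB
# instances on the backend port. All names and CIDRs are placeholders.
resource "azurerm_network_security_rule" "agw_to_graphdb" {
  name                        = "AllowAppGatewayToGraphDB"
  priority                    = 200
  direction                   = "Inbound"
  access                      = "Allow"
  protocol                    = "Tcp"
  source_port_range           = "*"
  destination_port_range      = "7201"
  source_address_prefix       = "10.0.1.0/24" # Application Gateway subnet
  destination_address_prefix  = "10.0.2.0/24" # GraphDB subnet
  resource_group_name         = "your-resource-group"
  network_security_group_name = "your-graphdb-nsg"
}
```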

Local Development

Instead of using the module as a dependency, you can create a local variables file named terraform.tfvars and provide configuration overrides there. Then follow the same steps as in the Usage section.
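
A minimal terraform.tfvars could look like the sketch below. It sets only the inputs marked as required in the Inputs table, plus a TLS certificate path, reusing the placeholder values from the Usage section.

```hcl
# terraform.tfvars (placeholder values; replace them with your own)
resource_name_prefix   = "graphdb"
location               = "East US"
instance_type          = "Standard_E8as_v5"
graphdb_license_path   = "path-to-graphdb-license"
management_cidr_blocks = ["your-ip-address"]
tls_certificate_path   = "path-to-your-tls-certificate"
```

Because such a file tends to accumulate paths and secrets, it is usually best kept out of version control.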

Single Node Deployment

This Terraform module can also deploy a single instance of GraphDB. To do so, set node_count to 1. Everything else happens automatically.

Migrating from a Single Node Deployment to Cluster Deployment

Here is the procedure for migrating a single node deployment to a cluster, for example from one node to three nodes:

  1. Create a backup of your data.
  2. Change the node_count to 3 or more, depending on the cluster size you desire.
  3. Run terraform import 'module.graphdb.azurerm_managed_disk.managed_disks[\"<DISK_NAME>\"]' /subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP_NAME>/providers/Microsoft.Compute/disks/<DISK_NAME>
    • IMPORTANT! Resource names are case-sensitive, and a mismatch will lead to resource recreation and data loss.
    • The CLI syntax differs depending on the OS. Please refer to the documentation.
  4. Validate that the import is successful by checking the terraform.tfstate file. It should contain an azurerm_managed_disk resource with the name of the disk you've imported.
  5. Run terraform plan and review the plan carefully. If everything seems fine, run terraform apply.
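
As an alternative to the terraform import command in step 3, Terraform 1.5 and later support declarative import blocks, which avoid OS-specific shell quoting. The sketch below assumes the same resource address and disk ID format as the CLI command. The placeholders in angle brackets must be replaced with real values.

```hcl
# Place this in the root module next to the "graphdb" module block.
# Requires Terraform 1.5 or later.
import {
  to = module.graphdb.azurerm_managed_disk.managed_disks["<DISK_NAME>"]
  id = "/subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP_NAME>/providers/Microsoft.Compute/disks/<DISK_NAME>"
}
```

With this block in place, terraform plan shows the disk as a resource to be imported rather than created, and the import itself is carried out by terraform apply. The block can be removed once the apply has succeeded.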

Release History

All notable changes between versions are tracked and documented in CHANGELOG.md.

Contributing

Check out the contributors guide CONTRIBUTING.md.

License

This code is released under the Apache 2.0 License. See LICENSE for more details.