Azure Data Services Infrastructure

This folder contains the ARM templates and PowerShell scripts used to deploy the Azure infrastructure for Azure Data Services.

[[TOC]]

Folders

arm - Contains Azure Resource Manager (ARM) templates.
docs - Contains documentation and diagrams for the Azure Data Services infrastructure.
dsc - Contains the Desired State Configuration (DSC) PowerShell script which downloads and installs the Azure Data Factory Self-Hosted Integration Runtime.
runbooks - Contains the PowerShell scripts used by the Azure Automation Account.
scripts - Contains the PowerShell scripts used as part of the DevOps pipeline.
scripts/schema - Contains the config schema components referenced by /infrastructure/scripts/config.schema.json.

Solution Architecture

This diagram was generated using draw.io.

Architectural principles

The following architectural principles were employed in the development of this solution:

Single Git repository - a monorepo with decoupled components, so we can make wholesale changes in one go while still working on individual components in isolation
Consistent - resource naming, resource tags, the parameters passed into scripts and ARM templates, and file structures all follow consistent conventions; once you grasp the basics, everything else should be easy to reason about
Configurable - different organisations have different requirements, so we want to configure the platform without changing code where possible; a JSON config file drives the deployment of a single environment instance
Reusable - we take a modular approach that layers reusable components together to achieve the end result, so that some or all of the platform can easily be reused for other solutions, platforms or scenarios in the future
Secure - we take security seriously and apply security best practices wherever we can, such as security checklist reviews, least privilege, defensive coding, secure strings, key vaults, managed identities, service endpoints and locked-down firewalls
Highly automated creation and update - we automate everything practical and avoid manual tweaks wherever possible; ideally a single script run takes an empty subscription to a working platform, and all scripts should be idempotent so they can be re-run safely at any time to upgrade an environment to the latest configuration
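
The "Configurable" and "Highly automated" principles can be illustrated with a small, language-neutral sketch. It is written in Python purely for illustration; the real deployment uses PowerShell and ARM templates. The config keys, the `<environment>-<kind>` naming convention and the `apply_config` function are all assumptions made up for this example and do not reflect the actual schema in scripts/config.schema.json.

```python
import json

# Hypothetical config for one environment instance. The real config file and
# its schema will differ; these keys are invented for illustration only.
CONFIG_JSON = """
{
  "environment": "dev",
  "resources": {
    "storage": {"sku": "Standard_LRS"},
    "keyvault": {"sku": "standard"}
  }
}
"""


def apply_config(config, deployed):
    """Bring `deployed` (a dict standing in for a subscription) in line with
    `config`, and return the names of resources that were created or updated."""
    changed = []
    for kind, desired in config["resources"].items():
        # Hypothetical naming convention: <environment>-<kind>.
        name = f"{config['environment']}-{kind}"
        # Only touch a resource when its current state differs from the
        # desired state; this check is what makes a re-run safe.
        if deployed.get(name) != desired:
            deployed[name] = dict(desired)
            changed.append(name)
    return changed


config = json.loads(CONFIG_JSON)
subscription = {}                             # start from an empty subscription
first = apply_config(config, subscription)    # creates both resources
second = apply_config(config, subscription)   # re-run: nothing left to change

# Changing only the config (not the code) upgrades the environment.
config["resources"]["storage"]["sku"] = "Standard_GRS"
third = apply_config(config, subscription)    # updates just the storage entry
```

The point of the sketch is the shape of the loop: desired state comes from config, the script compares it with what exists, and a second run with unchanged config is a no-op.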

Modularity model

We use the following model to structure this solution:

Platform - the end-to-end infrastructure solution, orchestrated by azure-pipelines.yml/deployment-steps.yml (in Azure DevOps); this is specific to Azure Data Services
- Modules - groups of related functionality that get deployed together; these are represented by the various Scripts/Deploy-*.ps1 files, each of which appears as a separate task in the Azure DevOps pipelines and can be run independently locally (which speeds up debugging); they are reasonably standalone and portable in and of themselves, so they can be picked up and repurposed with little effort, but they do contain some Azure Data Services specific concepts
  - Cells - individual units of functionality, e.g. a Data Factory, its diagnostics and its Integration Runtime; these are represented by the various Azure Resource Manager templates in arm/*.json, are completely standalone and can be immediately reused
Solutions - the components/content that get deployed into the platform, e.g. web apps, Data Factory pipelines, Databricks notebooks, SQL schemas, etc.
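
The Platform / Modules / Cells layering above can be pictured as a simple containment hierarchy. The following is a rough Python sketch for illustration only, not how the repository is implemented; the class names and the file names `Deploy-ExampleA.ps1`, `arm/example-a1.json` and so on are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cell:
    """One standalone unit of functionality: a single ARM template."""
    template: str  # hypothetical path, e.g. arm/<something>.json


@dataclass
class Module:
    """A deployment script that deploys a group of related cells together."""
    script: str  # hypothetical name following the Deploy-*.ps1 pattern
    cells: List[Cell]

    def deploy(self) -> List[str]:
        # A module can be run on its own, without the rest of the platform.
        return [f"{self.script}: {cell.template}" for cell in self.cells]


@dataclass
class Platform:
    """The end-to-end solution: runs every module in pipeline order."""
    modules: List[Module]

    def deploy(self) -> List[str]:
        steps: List[str] = []
        for module in self.modules:
            steps.extend(module.deploy())
        return steps


module_a = Module("Deploy-ExampleA.ps1",
                  [Cell("arm/example-a1.json"), Cell("arm/example-a2.json")])
module_b = Module("Deploy-ExampleB.ps1", [Cell("arm/example-b1.json")])
platform = Platform([module_a, module_b])

module_steps = module_b.deploy()     # run one module in isolation
platform_steps = platform.deploy()   # run the whole platform
```

Running a single module mirrors running one Deploy-*.ps1 script locally, while `Platform.deploy` corresponds to the pipeline running every module in turn.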
