From a3ebfc7f1fa1fadefbc6bfe9ab2c0bd8868a2b97 Mon Sep 17 00:00:00 2001
From: Gary <26419401+Gary-H9@users.noreply.github.com>
Date: Wed, 3 Apr 2024 13:48:35 +0100
Subject: [PATCH] :memo: Add Ingestion Service Documentation (#402)

---
 source/documentation/tools/index.md           |  3 ++
 source/documentation/tools/ingestion/index.md | 51 +++++++++++++++++++
 source/tools/ingestion/index.html.md.erb      | 11 ++++
 3 files changed, 65 insertions(+)
 create mode 100644 source/documentation/tools/ingestion/index.md
 create mode 100644 source/tools/ingestion/index.html.md.erb

diff --git a/source/documentation/tools/index.md b/source/documentation/tools/index.md
index 81872fb7..dab098ab 100644
--- a/source/documentation/tools/index.md
+++ b/source/documentation/tools/index.md
@@ -42,6 +42,9 @@ Online hosting platform for git. Git is a distributed version control system tha
 
 Moves data from microservices into the Analytical Platform's [curated databases](../data/curated-databases) in a standardised way.
 
+### [Ingestion](ingestion)
+An SFTP-based service that allows users to ingest data into their Analytical Platform data warehouse.
+
 ## Python packages
 
 The Data Engineering team maintain Python packages that help with data manipulation. The following are the packages we consider the most useful for doing so:
diff --git a/source/documentation/tools/ingestion/index.md b/source/documentation/tools/ingestion/index.md
new file mode 100644
index 00000000..930df901
--- /dev/null
+++ b/source/documentation/tools/ingestion/index.md
@@ -0,0 +1,51 @@
+# Ingestion
+
+> Ingestion on the Analytical Platform is currently a beta feature.
+
+## Service Requirements
+
+### Information to be provided to Analytical Platform
+
+To use the Ingestion feature, data owners must provide the following information to the team via the approved process:
+
+- Supplier's name
+- Supplier's email
+- Supplier's IP address(es)
+- Supplier's SSH public key
+- Target location on Analytical Platform (e.g. `s3://${TARGET_BUCKET}/${OPTIONAL_PREFIX}`)
+
+This information will then be merged into the requisite repository. Examples of this information can be found [here](https://github.com/ministryofjustice/modernisation-platform-environments/blob/main/terraform/environments/analytical-platform-ingestion/transfer-user.tf).
+
+### User Action Required
+
+The user's S3 bucket must have the correct permissions to allow the final `transfer` Lambda function to copy files to it.
+
+For a given S3 bucket `<bucket-name>`, include the following statement in its bucket policy:
+
+```json
+{
+  "Version": "2012-10-17",
+  "Statement": [
+    ...
+    {
+      "Sid": "AllowAnalyticalPlatformIngestionService",
+      "Effect": "Allow",
+      "Principal": {
+        "AWS": "arn:aws:iam::<ingestion-account-ID>:role/transfer"
+      },
+      "Action": [
+        "s3:GetObject",
+        "s3:PutObject",
+        "s3:DeleteObject",
+        "s3:PutObjectTagging"
+      ],
+      "Resource": [
+        "arn:aws:s3:::<bucket-name>",
+        "arn:aws:s3:::<bucket-name>/*"
+      ]
+    }
+  ]
+}
+```
+
+Set `<ingestion-account-ID>` to `471112983409` when the `transfer` Lambda function runs in `analytical-platform-ingestion-production`, and to `730335344807` when it runs in `analytical-platform-ingestion-development`.
\ No newline at end of file
diff --git a/source/tools/ingestion/index.html.md.erb b/source/tools/ingestion/index.html.md.erb
new file mode 100644
index 00000000..dbe52389
--- /dev/null
+++ b/source/tools/ingestion/index.html.md.erb
@@ -0,0 +1,11 @@
+---
+title: Ingestion
+weight: 51
+last_reviewed_on: 2024-04-03
+review_in: 1 year
+show_expiry: true
+owner_slack: "#analytical-platform-support"
+owner_slack_workspace: "mojdt"
+---
+
+<%= partial 'documentation/tools/ingestion/index' %>
\ No newline at end of file
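
Note on the bucket policy statement added in `source/documentation/tools/ingestion/index.md`: the statement can be generated rather than hand-edited, which avoids pasting the wrong account ID for an environment. The following is a minimal sketch, not part of the patch. The function name `ingestion_policy_statement`, the `INGESTION_ACCOUNT_IDS` mapping, and the bucket name `example-bucket` are hypothetical; the account IDs, `Sid`, actions, and the `transfer` role name are taken from the documentation in this patch.

```python
import json

# Account IDs as stated in the ingestion documentation added by this patch.
# The dictionary keys ("production", "development") are illustrative labels.
INGESTION_ACCOUNT_IDS = {
    "production": "471112983409",
    "development": "730335344807",
}


def ingestion_policy_statement(bucket_name: str, environment: str) -> dict:
    """Build the bucket policy statement for the ingestion `transfer` role.

    Hypothetical helper: it only assembles the statement and does not call AWS.
    """
    account_id = INGESTION_ACCOUNT_IDS[environment]
    return {
        "Sid": "AllowAnalyticalPlatformIngestionService",
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{account_id}:role/transfer"},
        "Action": [
            "s3:GetObject",
            "s3:PutObject",
            "s3:DeleteObject",
            "s3:PutObjectTagging",
        ],
        "Resource": [
            f"arn:aws:s3:::{bucket_name}",    # the bucket itself
            f"arn:aws:s3:::{bucket_name}/*",  # every object in the bucket
        ],
    }


if __name__ == "__main__":
    # Print a statement for a placeholder bucket in the development environment.
    statement = ingestion_policy_statement("example-bucket", "development")
    print(json.dumps(statement, indent=2))
```

The resulting dictionary would be appended to the `Statement` list of the bucket's existing policy; how that policy is applied (Terraform, console, or SDK) is left to the bucket owner.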