📝 Add Ingestion Service Documentation (#402)
Gary-H9 authored Apr 3, 2024
1 parent 7c2d0e8 commit a3ebfc7
Showing 3 changed files with 65 additions and 0 deletions.
3 changes: 3 additions & 0 deletions source/documentation/tools/index.md
@@ -42,6 +42,9 @@ Online hosting platform for git. Git is a distributed version control system tha

Moves data from microservices into the Analytical Platform's [curated databases](../data/curated-databases) in a standardised way.

### [Ingestion](ingestion)
An SFTP-based service that allows users to ingest data into their Analytical Platform data warehouse.

## Python packages

The Data Engineering team maintain Python packages that help with data manipulation. The following are the packages we consider the most useful for doing so:
51 changes: 51 additions & 0 deletions source/documentation/tools/ingestion/index.md
@@ -0,0 +1,51 @@
# Ingestion

> Ingestion on the Analytical Platform is currently a beta feature.

## Service Requirements

### Information to be provided to Analytical Platform

To use the Ingestion feature, data owners must provide the following information to the team via the approved process:

- Supplier's name
- Supplier's email
- Supplier's IP address(es)
- Supplier's SSH public key
- Target location on Analytical Platform (e.g. `s3://${TARGET_BUCKET}/${OPTIONAL_PREFIX}`)

The Analytical Platform team then merges this information into the relevant repository. For examples of this information, see [`transfer-user.tf`](https://github.com/ministryofjustice/modernisation-platform-environments/blob/main/terraform/environments/analytical-platform-ingestion/transfer-user.tf).

### User Action Required

The user's S3 bucket must have the correct permissions to allow the final `transfer` Lambda function to copy files to it.

For a given S3 bucket `<supplier-bucket-name>`, include the following statement in its bucket policy:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    ...
    {
      "Sid": "AllowAnalyticalPlatformIngestionService",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<ingestion-account-ID>:role/transfer"
      },
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:PutObjectTagging"
      ],
      "Resource": [
        "arn:aws:s3:::<supplier-bucket-name>",
        "arn:aws:s3:::<supplier-bucket-name>/*"
      ]
    }
  ]
}
```

Set `<ingestion-account-ID>` to `471112983409` when the `transfer` Lambda function connects from `analytical-platform-ingestion-production`, and to `730335344807` when it connects from `analytical-platform-ingestion-development`.
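
As an illustration only, the statement above could be generated per environment with a short Python script. This is a minimal sketch, not part of the Ingestion service: the function name `ingestion_statement`, the environment keys `production` and `development`, and the bucket name `example-supplier-bucket` are hypothetical, while the account IDs, actions, and role name are copied from the text above.

```python
import json

# Account IDs quoted in the documentation above, keyed by a
# hypothetical short environment name chosen for this sketch.
INGESTION_ACCOUNT_IDS = {
    "production": "471112983409",
    "development": "730335344807",
}


def ingestion_statement(bucket_name: str, environment: str) -> dict:
    """Build the bucket policy statement for one bucket and environment."""
    if environment not in INGESTION_ACCOUNT_IDS:
        raise ValueError(f"Unknown environment: {environment!r}")
    account_id = INGESTION_ACCOUNT_IDS[environment]
    return {
        "Sid": "AllowAnalyticalPlatformIngestionService",
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{account_id}:role/transfer"},
        "Action": [
            "s3:GetObject",
            "s3:PutObject",
            "s3:DeleteObject",
            "s3:PutObjectTagging",
        ],
        "Resource": [
            f"arn:aws:s3:::{bucket_name}",
            f"arn:aws:s3:::{bucket_name}/*",
        ],
    }


if __name__ == "__main__":
    # "example-supplier-bucket" is a placeholder, not a real bucket.
    statement = ingestion_statement("example-supplier-bucket", "development")
    print(json.dumps(statement, indent=2))
```

The printed statement would still need to be added by hand to the `Statement` array of the bucket's existing policy; the sketch does not call AWS.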
11 changes: 11 additions & 0 deletions source/tools/ingestion/index.html.md.erb
@@ -0,0 +1,11 @@
---
title: Ingestion
weight: 51
last_reviewed_on: 2024-04-03
review_in: 1 year
show_expiry: true
owner_slack: "#analytical-platform-support"
owner_slack_workspace: "mojdt"
---

<%= partial 'documentation/tools/ingestion/index' %>
