`transformations-cli` also provides a GitHub Action, which can be used to deploy transformations.
- Create transformation manifests and place them into a folder structure in your repository.
Important notes:
- When a scheduled transformation is represented in a manifest without a schedule, deploying deletes the existing schedule.
- When an existing notification is not provided along with the transformation being updated, the notification is deleted.
- Values specified as `${VALUE}` are treated as environment variables, while `VALUE` is used directly as the actual value.
- Old jetfire-cli style manifests can be used by adding `legacy: true` inside the old manifest.
- The manifest directory is scanned recursively for `*.yml` and `*.yaml` files, so you can organize your transformations into separate subdirectories.
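To illustrate the `${VALUE}` note above, the fragment below mixes both forms inside an `authentication` block. It is only a sketch: the field names follow the manifest format shown in this document, while the literal values and the variable names `RUN_CLIENT_SECRET` and `RUN_SCOPES` are hypothetical.

```yaml
authentication:
  # Literal value: this exact string is used as the client ID (hypothetical value).
  clientId: my-literal-client-id
  # ${...} form: the value is read from the environment variable
  # RUN_CLIENT_SECRET (hypothetical name), so the secret stays out of the repository.
  clientSecret: ${RUN_CLIENT_SECRET}
  # Literal value again, written directly in the manifest.
  tokenUrl: https://login.microsoftonline.com/<my-azure-tenant-id>/oauth2/v2.0/token
  scopes:
    # Read from the environment variable RUN_SCOPES (hypothetical name).
    - ${RUN_SCOPES}
  cdfProjectName: my-project-name
```

Keeping secrets in the `${...}` form means the same manifest can be committed as-is and deployed to different environments by changing only the environment variables.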
A complete example manifest using OIDC client credentials:

```yaml
# Required
externalId: "test-cli-transform-oidc"

# Required
name: "test-cli-transform-oidc"

# Required
# Valid values are: "assets", "timeseries", "asset_hierarchy", "events", "datapoints",
# "string_datapoints", "sequences", "files", "labels", "relationships",
# "raw", "data_sets", "sequence_rows", "nodes", "edges"
destination:
  type: "assets"
# destination: "assets"

# When writing to RAW tables, use the following syntax:
# destination:
#   type: raw
#   database: some_database
#   table: some_table

# When writing to sequence rows, use the following syntax:
# destination:
#   type: sequence_rows
#   externalId: some_sequence

# When writing to nodes in your data model, use the following syntax:
# NOTE: view is optional; it is not needed for writing nodes without a view.
# NOTE: instanceSpace is optional. If it is not set, instanceSpace becomes a
# mandatory property (column) in the data.
# destination:
#   type: nodes
#   instanceSpace: InstanceSpace
#   view:
#     space: TypeSpace
#     externalId: TypeExternalId
#     version: version

# When writing to edges (aka connection definitions) in your data model, use the following syntax:
# NOTE: instanceSpace is optional. If it is not set, instanceSpace becomes a
# mandatory property (column) in the data.
# destination:
#   type: edges
#   instanceSpace: InstanceSpace
#   edgeType:
#     space: TypeSpace
#     externalId: TypeExternalId

# When writing to edges with a view in your data model, use the following syntax:
# NOTE: instanceSpace is optional. If it is not set, instanceSpace becomes a
# mandatory property (column) in the data.
# destination:
#   type: edges
#   instanceSpace: InstanceSpace
#   view:
#     space: TypeSpace
#     externalId: TypeExternalId
#     version: version

# Optional, default: true
shared: true

# Optional, default: upsert
# Valid values are:
#   upsert: Create new items, or update existing items if their id or externalId
#           already exists.
#   create: Create new items. The transformation will fail if there are id or
#           externalId conflicts.
#   update: Update existing items. The transformation will fail if an id or
#           externalId does not exist.
#   delete: Delete items by internal id.
action: "upsert"

# Required
query: "select 'My Assets Transformation' as name, 'asset1' as externalId"
# Or the path to a file containing the SQL query for this transformation.
# query:
#   file: query.sql

# Optional, default: null
# If null, the transformation will not be scheduled.
schedule: "* * * * *"
# Or you can pause the schedule.
# schedule:
#   interval: "* * * * *"
#   isPaused: true

# Optional, default: true
ignoreNullFields: false

# Optional, default: null
# List of email addresses to notify on transformation errors
notifications:
  - "[email protected]"
  - "[email protected]"

# Optional, default: null
# Skipping this field or providing null clears
# the data set ID on updating the transformation
dataSetId: 1

# Or you can provide the data set external ID instead.
# Optional, default: null
# Skipping this field or providing null clears
# the data set ID on updating the transformation
dataSetExternalId: test-dataset

# Optional: You can tag your transformations with a maximum of 5 tags.
tags:
  - mytag1
  - mytag2

# The client credentials to be used in the transformation
authentication:
  clientId: ${CLIENT_ID}
  clientSecret: ${CLIENT_SECRET}
  tokenUrl: ${TOKEN_URL}
  scopes:
    - ${SCOPES}
  cdfProjectName: ${CDF_PROJECT_NAME}
  # audience: ""

# If you need to specify read/write credentials separately
# authentication:
#   read:
#     clientId: ${CLIENT_ID}
#     clientSecret: ${CLIENT_SECRET}
#     tokenUrl: ${TOKEN_URL}
#     scopes:
#       - ${SCOPES}
#     cdfProjectName: ${CDF_PROJECT_NAME}
#     # audience: ""
#   write:
#     clientId: ${CLIENT_ID}
#     clientSecret: ${CLIENT_SECRET}
#     tokenUrl: ${TOKEN_URL}
#     scopes:
#       - ${SCOPES}
#     cdfProjectName: ${CDF_PROJECT_NAME}
#     # audience: ""
```
The same manifest using API key authentication:

```yaml
externalId: "test-cli-transform"
name: "test-cli-transform"

destination:
  type: "assets"

shared: true
action: "upsert"

query: "select 'My Assets Transformation' as name, 'asset1' as externalId"
# query:
#   file: query.sql

schedule: "* * * * *"
ignoreNullFields: false

notifications:
  - "[email protected]"
  - "[email protected]"

# Optional, default: null
# Skipping this field or providing null clears
# the data set ID on updating the transformation
dataSetId: 1

# Or you can provide the data set external ID instead.
# Optional, default: null
# Skipping this field or providing null clears
# the data set ID on updating the transformation
dataSetExternalId: test-dataset

# Optional: You can tag your transformations with a maximum of 5 tags.
tags:
  - mytag1
  - mytag2

authentication:
  apiKey: ${API_KEY}

# If you need to specify read/write authentication separately
# authentication:
#   read:
#     apiKey: ${API_KEY}
#   write:
#     apiKey: ${API_KEY}
```
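The commented alternatives in the manifests above can be combined. The following is a minimal sketch, not a recommended configuration: it writes to a RAW table, reads the SQL from a file, and keeps a schedule defined but paused. The external ID, name, database, table, and cron expression are hypothetical placeholders, and how the `query.sql` path is resolved is not specified in this document.

```yaml
# All names and values below are hypothetical placeholders.
externalId: "raw-export-example"
name: "raw-export-example"

# Write to a RAW table instead of assets.
destination:
  type: raw
  database: some_database
  table: some_table

# Read the SQL from a file instead of inlining it.
query:
  file: query.sql

# Keep the schedule defined (hourly here), but paused.
schedule:
  interval: "0 * * * *"
  isPaused: true

# Credentials are taken from environment variables at deploy time.
authentication:
  clientId: ${CLIENT_ID}
  clientSecret: ${CLIENT_SECRET}
  tokenUrl: ${TOKEN_URL}
  scopes:
    - ${SCOPES}
  cdfProjectName: ${CDF_PROJECT_NAME}
```

Optional fields such as `shared`, `action`, and `ignoreNullFields` are omitted here, so their documented defaults apply.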
- To deploy a set of transformations in a GitHub workflow, add a step that references the Action in your job.
When using OIDC, the Action needs the client details instead of `api-key`:
```yaml
- name: Deploy transformations
  uses: cognitedata/transformations-cli@main
  env:
    # Credentials to be used when running your transformations,
    # as referenced in your manifests:
    COGNITE_CLIENT_ID: my-cognite-client-id
    COGNITE_CLIENT_SECRET: ${{ secrets.cognite_client_secret }}
  with:
    path: transformations # Transformation manifest folder, relative to GITHUB_WORKSPACE
    # Credentials used for deployment
    client-id: my-jetfire-client-id
    client-secret: ${{ secrets.jetfire_client_secret }}
    token-url: https://login.microsoftonline.com/<my-azure-tenant-id>/oauth2/v2.0/token
    cdf-project-name: my-project-name
    # If you need to provide multiple scopes, the format: "scope1 scope2 scope3"
    scopes: https://<my-cluster>.cognitedata.com/.default
    # audience: "" # Optional
```
With OIDC, this GitHub Action takes the following inputs:

Name | Description |
---|---|
`path` | (Required) The path to a directory containing transformation manifests. This is relative to `$GITHUB_WORKSPACE`, which will be the root of the repository when using `actions/checkout` with default settings. |
`client-id` | (Required) The client ID used for authenticating with transformations. Equivalent to setting the `TRANSFORMATIONS_CLIENT_ID` environment variable. |
`client-secret` | (Required) The client secret used for authenticating with transformations. Equivalent to setting the `TRANSFORMATIONS_CLIENT_SECRET` environment variable. |
`token-url` | (Required) The token URL used for requesting a token to authenticate with transformations. Equivalent to setting the `TRANSFORMATIONS_TOKEN_URL` environment variable. |
`cdf-project-name` | (Required) The CDF project name. Equivalent to setting the `TRANSFORMATIONS_PROJECT` environment variable. |
`scopes` | (Optional) The scopes used for authenticating with transformations. Equivalent to setting the `TRANSFORMATIONS_SCOPES` environment variable. Separate multiple scopes with spaces. |
`audience` | (Optional) The audience used for authenticating with transformations. Equivalent to setting the `TRANSFORMATIONS_AUDIENCE` environment variable. |
`cluster` | (Optional) The name of the cluster where Transformations is hosted. Equivalent to setting the `TRANSFORMATIONS_CLUSTER` environment variable. |
Additionally, you must specify environment variables for any credentials or environment variables referenced in transformation manifests.
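For context, the OIDC deploy step sits inside a complete workflow file. The following is a minimal sketch under stated assumptions: the file name, the push-to-`main` trigger, the job name, and the secret names are hypothetical, and the `env` variable names are chosen to match the `${...}` references in the OIDC manifest example. Only the `uses:` reference and the `with:` inputs come from this document.

```yaml
# .github/workflows/deploy-transformations.yml (hypothetical file name)
name: Deploy transformations

on:
  push:
    branches:
      - main # assumed trigger: deploy on every push to main

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      # Check out the repository so the manifest folder exists under GITHUB_WORKSPACE.
      - name: Check out repository
        uses: actions/checkout@v4

      - name: Deploy transformations
        uses: cognitedata/transformations-cli@main
        env:
          # Variables referenced as ${...} in the manifests
          # (names assumed to match the manifest example).
          CLIENT_ID: my-cognite-client-id
          CLIENT_SECRET: ${{ secrets.COGNITE_CLIENT_SECRET }}
          TOKEN_URL: https://login.microsoftonline.com/<my-azure-tenant-id>/oauth2/v2.0/token
          SCOPES: https://<my-cluster>.cognitedata.com/.default
          CDF_PROJECT_NAME: my-project-name
        with:
          path: transformations
          # Credentials used for the deployment itself
          client-id: my-jetfire-client-id
          client-secret: ${{ secrets.JETFIRE_CLIENT_SECRET }}
          token-url: https://login.microsoftonline.com/<my-azure-tenant-id>/oauth2/v2.0/token
          cdf-project-name: my-project-name
          scopes: https://<my-cluster>.cognitedata.com/.default
```

The checkout step comes first because `path` is resolved relative to `$GITHUB_WORKSPACE`, which only contains the manifests after the repository has been checked out.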
When authenticating with an API key instead, the step looks like this:

```yaml
- name: Deploy transformations
  uses: cognitedata/transformations-cli@main
  with:
    path: transformations # Transformation manifest folder, relative to GITHUB_WORKSPACE
    api-key: ${{ secrets.TRANSFORMATIONS_API_KEY }}
    # If not using the main europe-west1-1 cluster:
    # cluster: greenfield
    # cdf-project-name: my-project # to suppress python SDK warning (not required for API keys).
  env:
    # API key to be used when running your transformations,
    # as referenced in your transformation manifests
    SOME_API_KEY: ${{ secrets.SOME_API_KEY }}
```
With API key authentication, this GitHub Action takes the following inputs:

Name | Description |
---|---|
`path` | (Required) The path to a directory containing transformation manifests. This is relative to `$GITHUB_WORKSPACE`, which will be the root of the repository when using `actions/checkout` with default settings. |
`api-key` | (Required) The API key used for authenticating with transformations. Equivalent to setting the `TRANSFORMATIONS_API_KEY` environment variable. |
`cluster` | (Optional) The name of the cluster where Transformations is hosted. Equivalent to setting the `TRANSFORMATIONS_CLUSTER` environment variable. |
Additionally, you must specify environment variables for any API keys or environment variables referenced in transformation manifests.
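The variable name used in a manifest has to match a name exported in the step's `env` block. The sketch below pairs a manifest excerpt with the corresponding step excerpt as two YAML documents in one block; the manifest path is hypothetical, and `SOME_API_KEY` is simply the name used in the step example in this document.

```yaml
# Manifest excerpt, e.g. transformations/my-transformation.yml (hypothetical path).
# The transformation runs with the key read from the SOME_API_KEY variable.
authentication:
  apiKey: ${SOME_API_KEY}
---
# Workflow step excerpt: exports the same variable name to the Action.
- name: Deploy transformations
  uses: cognitedata/transformations-cli@main
  with:
    path: transformations
    # Key used for the deployment itself
    api-key: ${{ secrets.TRANSFORMATIONS_API_KEY }}
  env:
    # Must match the ${SOME_API_KEY} reference in the manifest
    SOME_API_KEY: ${{ secrets.SOME_API_KEY }}
```

The deployment key (`api-key`) and the key the transformation runs with (`SOME_API_KEY`) are separate values, so they can carry different permissions.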