Merge pull request gooddata#422 from jaceksan/working
TRIVIAL: gooddata-dbt - update documentation

Reviewed-by: Jan Kadlec
             https://github.com/hkad98
gdgate authored Nov 13, 2023
2 parents 2d5d4a9 + 28c4c11 commit 3e19a22
Showing 3 changed files with 87 additions and 32 deletions.
11 changes: 11 additions & 0 deletions gooddata-dbt/.env.custom.dev
@@ -0,0 +1,11 @@
#!/usr/bin/env bash
# (C) 2023 GoodData Corporation

export DB_PASS="<db_password>"

# dbt cloud
export DBT_ACCOUNT_ID=123456
export DBT_TOKEN="<dbt cloud token>"

# Gitlab token for commenting merge requests
export GITLAB_TOKEN="<Gitlab token>"
58 changes: 58 additions & 0 deletions gooddata-dbt/.env.dev
@@ -0,0 +1,58 @@
#!/usr/bin/env bash
# (C) 2023 GoodData Corporation

if [ -f "./.env.custom.dev" ]; then
source ./.env.custom.dev
fi

export DBT_PROFILES_DIR="profile"
export ELT_ENVIRONMENT="production"
export DBT_TARGET="snowflake"

export DB_USER="db_user"
export DB_NAME="DB_NAME"
export OUTPUT_SCHEMA="output_stage"

# Snowflake specific
export DB_ACCOUNT="snowflake_account"
export DB_WAREHOUSE="MY_WAREHOUSE"
export GOODDATA_UPPER_CASE="--gooddata-upper-case" # Snowflake names are upper case

# GoodData
# We use the profiles file (~/.gooddata/profiles.yaml) to store GoodData endpoints and their credentials
# Example:
# dev:
# host: "https://company-dev.cloud.gooddata.com"
# token: "<dev_token>"
# prod1:
# host: "https://company-prod1.cloud.gooddata.com"
# token: "<prod1_token>"
# prod2:
# host: "https://company-prod2.cloud.gooddata.com"
# token: "<prod2_token>"
export GOODDATA_PROFILES="prod1 prod2"
export GOODDATA_ENVIRONMENT_ID="production"

# GitLab (for testing sending messages to merge requests)
export CI_MERGE_REQUEST_PROJECT_ID=123456
export CI_MERGE_REQUEST_IID=1

# dbt cloud - test running a dbt cloud job
export DBT_ALLOWED_DEGRADATION=20
export DBT_JOB_ID=123456
export DBT_PROJECT_ID=123456
export DBT_ENVIRONMENT_ID=123456

# dbt env vars needed for dbt Cloud
# First delete all existing DBT_ variables. Uncomment to clean up your session ;-)
#for var in $(env | grep -E '^DBT_' | cut -d= -f1); do
# unset "$var"
#done
for var in $(env | grep -E '^DB_|_SCHEMA' | grep -vE '^DBT_' | cut -d= -f1); do
# Add "DBT_" prefix to variables without DBT_ prefix
new_var="DBT_${var}"
# Get the value of the original variable
value="${!var}"
# Set the new variable with the modified value
export "$new_var=$value"
done
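The prefixing loop above can be exercised in isolation. A minimal sketch, using the placeholder values from this file (bash is assumed, because of the `${!var}` indirection):

```shell
#!/usr/bin/env bash
# Demonstration of the DBT_ prefixing loop: copy DB_* and *_SCHEMA vars
# to DBT_-prefixed counterparts, as dbt Cloud expects.
export DB_USER="db_user"
export OUTPUT_SCHEMA="output_stage"

for var in $(env | grep -E '^DB_|_SCHEMA' | grep -vE '^DBT_' | cut -d= -f1); do
    # Add "DBT_" prefix and copy the original variable's value
    new_var="DBT_${var}"
    value="${!var}"
    export "$new_var=$value"
done

echo "$DBT_DB_USER"        # prints: db_user
echo "$DBT_OUTPUT_SCHEMA"  # prints: output_stage
```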
50 changes: 18 additions & 32 deletions gooddata-dbt/README.md
@@ -3,19 +3,11 @@ GoodData plugin for dbt. Reads dbt models and profiles, generates GoodData seman

## Install

Currently directly from git only:
```shell
pip install "git+https://github.com/gooddata/gooddata-python-sdk.git#egg=gooddata-dbt&subdirectory=gooddata-dbt"
# Or add the following line to requirements.txt
-e git+https://github.com/gooddata/gooddata-python-sdk.git#egg=gooddata-dbt&subdirectory=gooddata-dbt
```

## Local development
```shell
# Creates virtualenv, installs dependencies
make dev
# Installs the package itself
make install
pip install gooddata-dbt
# Or add the corresponding line to requirements.txt
# Or install specific version
pip install gooddata-dbt==1.0.0
```

## Configuration, parametrization
@@ -25,36 +17,30 @@ Check [gooddata_example.yml](gooddata_example.yml) file for more details.
Each execution can be parametrized with environment variables or tool arguments.
Run the main `--help`, and `--help` for each use case, to learn more.

Example setup of environment variables for local environment (running GoodData Community Edition locally):
```shell
export POSTGRES_HOST="localhost"
export POSTGRES_PORT=5432
export POSTGRES_USER="demouser"
export POSTGRES_PASS=demopass
export POSTGRES_DBNAME=demo
export INPUT_SCHEMA="input_stage"
export OUTPUT_SCHEMA="output_stage"

export DBT_PROFILE_DIR="profile"
export DBT_PROFILE="default"
export ELT_ENVIRONMENT="dev_local"
Alternatively, you can configure everything with environment variables.
You can set env variables directly in a shell session, or store them in .env file(s).
We provide the following examples:
- [.env.dev](.env.dev)
- [.env.custom.dev](.env.custom.dev) is sourced by the file above and contains sensitive variables.
  Add `.env.custom.*` to .gitignore!

export GOODDATA_HOST="http://localhost:3000"
export GOODDATA_ENVIRONMENT_ID="development"
unset GOODDATA_UPPER_CASE
export GOODDATA_TOKEN="YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz"
Then load .env files:
```bash
source .env.local
```
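As a minimal, self-contained sketch of that flow (the file path and variables here are illustrative, not from the repo):

```shell
#!/usr/bin/env bash
# Create a throwaway env file and load it into the current session
cat > /tmp/.env.example <<'EOF'
export GOODDATA_ENVIRONMENT_ID="production"
export GOODDATA_PROFILES="prod1 prod2"
EOF

source /tmp/.env.example
echo "$GOODDATA_ENVIRONMENT_ID"  # prints: production
```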

## Use cases
```shell
gooddata-dbt --help
```
The plugin provides the following use cases:
- deploy_models
- provision_workspaces
- Provisions workspaces to GoodData based on gooddata.yaml file
- register_data_sources
- Registers data source in GoodData for each relevant dbt profile
- deploy_ldm
- Reads dbt models and profiles
- Scans data source (connection props from dbt profiles) through GoodData to get column data types (optional in dbt)
- Registers data source in GoodData
- Generates and stores a PDM (Physical Data Model) from dbt models and the result of the scan
- Generates a GoodData LDM (Logical Data Model) from dbt models. Can utilize custom GoodData-specific metadata; more on that below
- upload_notification
- Invalidates caches for data source
