
Datateer/dbt_faker

 
 

dbt faker

Generate fake, test, demo, or sample data directly from dbt. dbt_faker generates a dbt Python model that produces data inside a dbt project using the Python Faker library.

Welcome to the dbt faker project!

Install

Include in packages.yml:

packages:
  - git: "https://github.com/dbt-labs/dbt_faker.git"
    revision: main 

Requirements

  • dbt version >= 1.3

How to use it

1. Add a source override macro in your project

Create the file macros/dbt_faker_source_override.sql with the following content:

{% macro source(source_name, table_name) %}
{{ return(dbt_faker.dbt_faker_source(source_name, table_name)) }}
{% endmacro %}

Activate the faker_enabled variable in your dbt_project.yml:

vars: 
  faker_enabled: true
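
As a rough illustration of what the override and the faker_enabled variable achieve together, the Python sketch below models the switch between real and fake relations. It is not the package's macro code: the function name resolve_source, the fakeable set, and the fake__<source>_<table> naming are assumptions made for this example only.

```python
def resolve_source(source_name, table_name, faker_enabled, fakeable):
    """Return the relation name a model would read from (illustrative only)."""
    if faker_enabled and (source_name, table_name) in fakeable:
        # Hypothetical naming scheme for the generated fake table.
        return f"fake__{source_name}_{table_name}"
    # Otherwise fall back to the real source relation.
    return f"{source_name}.{table_name}"


# Hypothetical set of (source, table) pairs flagged as fakeable.
fakeable = {("tpch", "orders")}
print(resolve_source("tpch", "orders", True, fakeable))   # fake__tpch_orders
print(resolve_source("tpch", "orders", False, fakeable))  # tpch.orders
```

The point of the sketch is that models keep calling source() unchanged, and a single project variable decides which relation they read.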

2. Declare your sources in sources.yml

Include the columns and their faker_provider values, and add the meta config faker_enabled: true.

version: 2

sources:
  - name: tpch
    meta:
      faker_enabled: true
  - name: fake_tpch
    tables:
      - name: orders
        meta:
          faker_enabled: true
          faker_rows: 250
        columns:
          - name: o_orderkey
            meta:
              faker_provider: pyint

          - name: o_order_date
            meta:
              faker_provider: date
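
To make the role of each meta key concrete, the sketch below walks a declaration like the orders table above, already parsed into Python dictionaries, and collects a per-table generation plan. The function build_plan, the plan layout, and DEFAULT_ROWS are hypothetical; the README does not describe how the package reads these values internally or what row count it uses when faker_rows is absent.

```python
# The orders table from the declaration above, as parsed Python objects.
SOURCES = [
    {
        "name": "fake_tpch",
        "tables": [
            {
                "name": "orders",
                "meta": {"faker_enabled": True, "faker_rows": 250},
                "columns": [
                    {"name": "o_orderkey", "meta": {"faker_provider": "pyint"}},
                    {"name": "o_order_date", "meta": {"faker_provider": "date"}},
                ],
            }
        ],
    }
]

DEFAULT_ROWS = 100  # assumption: the real default is not documented here


def build_plan(sources):
    """Collect row counts and column -> provider mappings per table."""
    plans = []
    for source in sources:
        for table in source.get("tables", []):
            meta = table.get("meta", {})
            plans.append(
                {
                    "source": source["name"],
                    "table": table["name"],
                    "rows": meta.get("faker_rows", DEFAULT_ROWS),
                    # None means no faker_provider was declared for the column.
                    "providers": {
                        col["name"]: col.get("meta", {}).get("faker_provider")
                        for col in table.get("columns", [])
                    },
                }
            )
    return plans


plan = build_plan(SOURCES)
print(plan[0]["rows"], plan[0]["providers"])
```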

3. Generate your python model

Execute the command dbt run-operation generate_faker_model

4. Copy the output of your terminal and create a python model

Create a file (e.g. dbt_faker.py) containing the code generated in step 3.

5. Execute your newly created python model

For example, dbt run -m dbt_faker.py. This creates a table called fake__source_table for each source you have marked as fakeable.

6. Use your fake data!

Run the models that depend on the fake sources and be amazed.

Providers

dbt_faker relies on Faker's robust data providers. To use one, include the provider's name in the faker_provider meta tag. A full list of providers is available in the Faker documentation. Examples include pyint and date, as used in the sources.yml above.

If a faker_provider has not been defined for a column, dbt faker will generate a string by default.
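
The provider lookup and the string fallback can be sketched as a name-to-method dispatch. This is a minimal sketch under stated assumptions, not the package's generated code: fake_value, fake_rows, and the choice of pystr as the fallback are hypothetical, and StubFaker is a stand-in that mimics two real Faker methods (pyint and pystr) so the example runs without Faker installed.

```python
import random
import string


class StubFaker:
    """Stand-in exposing two Faker-style provider methods."""

    def pyint(self):
        return random.randint(0, 9999)

    def pystr(self):
        return "".join(random.choices(string.ascii_letters, k=20))


def fake_value(fake, provider=None, default_provider="pystr"):
    """Call the named provider method, or the assumed string default."""
    method = getattr(fake, provider or default_provider)
    return method()


def fake_rows(fake, providers, n_rows):
    """Build n_rows dictionaries, one fake value per column."""
    return [
        {column: fake_value(fake, provider) for column, provider in providers.items()}
        for _ in range(n_rows)
    ]


# o_comment has no provider declared, so it falls back to a string.
rows = fake_rows(StubFaker(), {"o_orderkey": "pyint", "o_comment": None}, 3)
print(rows[0])
```

With the real library installed, passing Faker() from the faker package in place of StubFaker() should work the same way, since both pyint and pystr are standard Faker methods.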

FAQ

generate_faker_model is skipping my sources

Check that your sources have:
    - columns defined in sources.yml
    - the meta field faker_enabled: true at either the source level or the source table level
    - no meta field faker_enabled: false at the source table level
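
The checklist above can be read as a small decision rule. The sketch below encodes one plausible reading of it; the function is_fakeable and the exact precedence (a table-level false overriding a source-level true) are assumptions drawn from the three bullet points, not from the package's source.

```python
def is_fakeable(source, table):
    """Return True if a source table would be picked up, per the checklist."""
    # Tables without declared columns are skipped.
    if not table.get("columns"):
        return False
    table_flag = table.get("meta", {}).get("faker_enabled")
    # An explicit false at table level opts the table out.
    if table_flag is False:
        return False
    source_flag = source.get("meta", {}).get("faker_enabled")
    # Enabled at either the table level or the source level.
    return table_flag is True or source_flag is True


source = {"name": "tpch", "meta": {"faker_enabled": True}}
with_columns = {"name": "orders", "columns": [{"name": "o_orderkey"}]}
opted_out = {
    "name": "lineitem",
    "meta": {"faker_enabled": False},
    "columns": [{"name": "l_orderkey"}],
}
print(is_fakeable(source, with_columns))  # True
print(is_fakeable(source, opted_out))     # False
```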

dbt run -m dbt_faker.py gives me a warning that the selector hasn't found the model

You may be running a dbt version older than 1.3, which is required to execute dbt Python models.
