Deprecation notice: The author of Tapdance (AJ/Aaron Steers) has joined forces with Meltano to build the end-to-end DataOps platform of the future. Tapdance will remain free to use, but I'm no longer actively developing it. See my blog post for more info on why I made this decision, and find me on Slack if you have questions about what we're building and how you can transition your data platform to Meltano.
Tapdance is an orchestration layer for the open source Singer tap platform.
A baker's dozen reasons to dance.
- Tapdance focuses on ease of use, partially inspired by Pipelinewise.
- Tapdance aims to support all taps with no required plugin changes.
- Tapdance is dockerized at the core - Docker must be installed, but we wrap it so you never have to run it directly.
- Tapdance has built-in IAC (Infrastructure-As-Code) using Terraform and keeps the cost of always-on resources near zero.
- Tapdance is opinionated with regard to ELT best practices.
- Tapdance supports DevOps best practices out of the box, specifically CI/CD and IAC.
- Tapdance is platform-agnostic and runs on Windows, Mac, and Linux alike using a docker-first approach.
- Tapdance plugins are curated - when multiple forks exist for a given plugin, we will curate the best we find and use those as the default. (See the latest list here.)
- Tapdance uses a data-lake-first approach - while it may be possible to load directly into an RDBMS, we prioritize approaches where data lands first in the data lake before ingestion into a SQL DW.
- Tapdance is rules-based - instead of pointing and clicking a hundred times, simply tell Tapdance what type of data you want (or what type of data you don't want).
- Tapdance knows your source schema is not static and it adapts automatically in response to upstream schema changes.
- Tapdance provides stream-isolation - data can be extracted (and retried) one-table-at-a-time, even if this is not a feature of the plugin.
- Tapdance takes care of state automatically - state files are managed for you, so incremental data loads come with no additional effort.
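To make the stream-isolation idea from the list above concrete, here is a minimal shell sketch of syncing one table at a time and retrying only the tables that fail. It is an illustration of the concept, not Tapdance's implementation: the function name `sync_tables_isolated`, its arguments, and the per-table command it calls are all hypothetical.

```shell
#!/usr/bin/env sh
# Illustrative only: NOT how Tapdance is implemented internally.
# Usage: sync_tables_isolated MAX_TRIES SYNC_CMD TABLE...
# Runs "SYNC_CMD TABLE" once per table. A failing table is retried up to
# MAX_TRIES times; tables that already succeeded are never re-run.
sync_tables_isolated() {
  max_tries="$1"
  sync_cmd="$2"
  shift 2
  failed=""
  for table in "$@"; do
    try=1
    until "$sync_cmd" "$table"; do
      if [ "$try" -ge "$max_tries" ]; then
        failed="$failed $table"   # give up on this table only
        break
      fi
      try=$((try + 1))
    done
  done
  # Report tables that never succeeded; return non-zero if any remain.
  if [ -n "$failed" ]; then
    echo "failed:$failed"
    return 1
  fi
  echo "all tables synced"
}
```

Because each table is attempted independently, one bad table does not force the others to be extracted again.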
- Install Chocolatey from an Administrative Command Prompt:
@"%SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe" -NoProfile -InputFormat None -ExecutionPolicy Bypass -Command " [System.Net.ServicePointManager]::SecurityProtocol = 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))" && SET "PATH=%PATH%;%ALLUSERSPROFILE%\chocolatey\bin"
- Install Python3 (if not already installed):
choco install -y python3
- Install Docker:
choco install -y docker-desktop
- Install Tapdance:
pip3 install tapdance
- Install Python with Homebrew (only if not installed):
brew install python3
- Install Docker:
brew install docker
- Install Tapdance:
pip3 install tapdance
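After installing, it can help to confirm that everything is reachable from your shell. The following is a small sketch for macOS or Linux shells (not part of Tapdance itself); the function names `missing_tools` and `check_prerequisites` are made up for this example, and the tool list simply mirrors the install steps above.

```shell
#!/usr/bin/env sh
# missing_tools TOOL...
# Prints the name of each tool that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Checks the tools installed by the steps above: Python 3 (with pip3),
# Docker, and the tapdance command itself.
check_prerequisites() {
  missing=$(missing_tools python3 pip3 docker tapdance)
  if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing from PATH:" $missing
    return 1
  fi
  echo "All prerequisites found."
}
```

Source the snippet and call `check_prerequisites`; a non-zero return status means at least one tool still needs to be installed or added to your PATH.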
Tapdance looks for configuration information in 3 places:
Once configuration is completed, run tapdance by executing the following two commands:
tapdance plan {tap-id}
- Runs discovery on the tap and creates a plan file.
- The plan file shows which tables and columns will be included, based on your specified extraction rules.
tapdance sync {tap-id} {target-id}
- Syncs all data from tap to target, following extraction rules as documented in the previous step's plan file.
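The two commands above can be chained so that the sync only starts when planning succeeds. This is a minimal sketch, assuming `tapdance` returns a non-zero exit status on failure; `run_pipeline` and the `TAPDANCE_CMD` override are hypothetical helpers for this example, not Tapdance features.

```shell
#!/usr/bin/env sh
# run_pipeline TAP_ID TARGET_ID
# Runs "tapdance plan" and, only if it succeeds, "tapdance sync".
# TAPDANCE_CMD is a hypothetical override used here for dry runs
# (for example TAPDANCE_CMD=echo); it is not a Tapdance setting.
run_pipeline() {
  if [ "$#" -ne 2 ]; then
    echo "usage: run_pipeline TAP_ID TARGET_ID" >&2
    return 2
  fi
  tap_id="$1"
  target_id="$2"
  cmd="${TAPDANCE_CMD:-tapdance}"
  # Stop early if planning fails, so no sync runs against a stale plan.
  "$cmd" plan "$tap_id" || { echo "plan failed for $tap_id" >&2; return 1; }
  "$cmd" sync "$tap_id" "$target_id"
}
```

For example, `run_pipeline my-tap my-target` (placeholder IDs) runs both steps in order; setting `TAPDANCE_CMD=echo` first prints the two commands instead of executing them.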
For more help, including an explanation of all optional parameters, run:
tapdance plan --help
or
tapdance sync --help
For step-by-step instructions, see the Tapdance tutorial.