Connector Development

Airbyte supports two types of connectors: Sources and Destinations. A connector takes the form of a Docker image which follows the Airbyte specification.

To build a new connector in Java or Python, we provide templates so you don't need to start everything from scratch.

Note: you are not required to maintain the connectors you create. The goal is that the Airbyte core team and the community help maintain the connector.

Python Connector-Development Kit (CDK)

You can build a connector very quickly in Python with the Airbyte CDK, which generates 75% of the code required for you.

C#/.NET Connector-Development Kit (CDK)

You can build a connector very quickly in C# .NET with the Airbyte Dotnet CDK, which generates 75% of the code required for you.

TS/JS Connector-Development Kit (Faros AI Airbyte CDK)

You can build a connector in TypeScript/JavaScript with the Faros AI CDK, which generates and bootstraps most of the code required for HTTP Airbyte sources.

The Airbyte specification

Before building a new connector, review Airbyte's data protocol specification.
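To make the shape of a protocol-compliant connector more concrete, here is a minimal sketch of a command-line entrypoint. It assumes the four commands (spec, check, discover, read) and the JSON-lines message shapes as we understand them from the protocol; the program name `source-example` and the placeholder payloads are hypothetical, so treat the specification itself as the authority.

```python
import argparse
import json
import sys


def build_parser() -> argparse.ArgumentParser:
    """Parse the commands a connector is expected to handle."""
    parser = argparse.ArgumentParser(prog="source-example")  # hypothetical name
    commands = parser.add_subparsers(dest="command", required=True)
    commands.add_parser("spec")
    for name in ("check", "discover", "read"):
        command = commands.add_parser(name)
        command.add_argument("--config", required=True)
        if name == "read":
            command.add_argument("--catalog", required=True)
            command.add_argument("--state")
    return parser


def run(argv, out=sys.stdout) -> None:
    """Write one placeholder message (a JSON line) for the requested command."""
    args = build_parser().parse_args(argv)
    if args.command == "spec":
        message = {
            "type": "SPEC",
            "spec": {"connectionSpecification": {"type": "object", "properties": {}}},
        }
    elif args.command == "check":
        # A real connector would try to connect using the config file here.
        message = {"type": "CONNECTION_STATUS", "connectionStatus": {"status": "SUCCEEDED"}}
    elif args.command == "discover":
        message = {"type": "CATALOG", "catalog": {"streams": []}}
    else:  # read
        message = {"type": "LOG", "log": {"level": "INFO", "message": "no records to read"}}
    out.write(json.dumps(message) + "\n")
```

Calling `run(["spec"])` prints a single SPEC message to stdout; inside a Docker image, the entrypoint would typically forward its command-line arguments to a function like this.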

Adding a new connector

Requirements

To add a new connector you need to:

  1. Implement & package your connector in an Airbyte Protocol-compliant Docker image
  2. Add integration tests for your connector. At a minimum, all connectors must pass Airbyte's standard test suite, but you can also add your own tests.
  3. Document how to build & test your connector
  4. Publish the Docker image containing the connector

Each requirement has a subsection below.

1. Implement & package the connector

If you are building a connector in any of the following languages/frameworks, then you're in luck! We provide autogenerated templates to get you started quickly:

Sources

  • Python Source Connector
  • Singer-based Python Source Connector. Singer.io is an open source framework with a large community and many available connectors (known as taps & targets). To build an Airbyte connector from a Singer tap, wrap the tap in a thin Python package to make it Airbyte Protocol-compatible. See the GitHub Connector for an example of an Airbyte Connector implemented on top of a Singer tap.
  • Generic Connector: This template provides a basic starting point for any language.
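As a rough illustration of what "wrapping a tap" involves, the sketch below translates a single line of Singer tap output into an Airbyte-style message. The function name `singer_to_airbyte` is hypothetical, and the message shapes reflect our understanding of the Singer and Airbyte formats rather than code taken from the template, so verify them against both specifications.

```python
import json
import time
from typing import Optional


def singer_to_airbyte(line: str, now_ms: Optional[int] = None) -> Optional[dict]:
    """Translate one Singer message (a JSON line) into an Airbyte message.

    Returns None for Singer messages with no direct equivalent here
    (for example SCHEMA, which would feed the catalog instead).
    """
    message = json.loads(line)
    kind = message.get("type")
    if kind == "RECORD":
        emitted_at = now_ms if now_ms is not None else int(time.time() * 1000)
        return {
            "type": "RECORD",
            "record": {
                "stream": message["stream"],
                "data": message["record"],
                "emitted_at": emitted_at,  # milliseconds since the epoch
            },
        }
    if kind == "STATE":
        return {"type": "STATE", "state": {"data": message["value"]}}
    return None
```

A wrapper package would run the tap as a subprocess, pass each stdout line through a function like this, and print the non-None results as JSON lines.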

Destinations

  • Java Destination Connector
  • Python Destination Connector

Creating a connector from a template

Run the interactive generator:

cd airbyte-integrations/connector-templates/generator
./generate.sh

and choose the relevant template by using the arrow keys. This will generate a new connector in the airbyte-integrations/connectors/<your-connector> directory.

Search the generated directory for "TODO"s and follow them to implement your connector. For more detailed walkthroughs and instructions, follow the tutorial relevant to your connector type.

As you implement your connector, make sure to review the Best Practices for Connector Development guide. Following best practices is not a requirement for merging your contribution to Airbyte, but it certainly doesn't hurt ;)

2. Integration tests

At a minimum, your connector must implement the acceptance tests described in Testing Connectors.

Note: Acceptance tests are not yet available for Python destination connectors. Coming soon!

3. Document building & testing your connector

If you're writing in Python or Java, skip this section -- it is provided automatically.

If you're writing in another language, please document the commands needed to:

  1. Build your connector Docker image (usually this is just docker build . but let us know if there are necessary flags, gotchas, etc.)
  2. Run any unit or integration tests in a Docker image.

Your integration and unit tests must be runnable entirely within a Docker image. This is important to guarantee consistent build environments.

When you submit a PR to Airbyte with your connector, the reviewer will use the commands you provide to integrate your connector into Airbyte's build system as follows:

  1. :airbyte-integrations:connectors:source-<name>:build should run unit tests and build the integration's Docker image
  2. :airbyte-integrations:connectors:source-<name>:integrationTest should run integration tests including Airbyte's Standard test suite.
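The two task paths above follow a fixed pattern, so they can be derived from the connector's directory name. The helper below is a small sketch; `gradle_tasks` is a hypothetical name and not part of Airbyte's build tooling.

```python
def gradle_tasks(connector: str) -> dict:
    """Return the Gradle task paths for a connector directory such as 'source-hubspot'."""
    base = f":airbyte-integrations:connectors:{connector}"
    return {
        "build": f"{base}:build",  # unit tests + Docker image
        "integrationTest": f"{base}:integrationTest",  # integration tests
    }


if __name__ == "__main__":
    for task in gradle_tasks("source-hubspot").values():
        print(f"./gradlew {task}")
```

Each printed line is a command that can be run from the repository root.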

4. Publish the connector

Typically, an Airbyter handles this as part of code review. The "Publishing a connector" section below describes the required steps; it is mostly used by Airbyte employees.

Updating an existing connector

The steps for updating an existing connector are the same as for building a new connector minus the need to use the autogenerator to create a new connector. Therefore the steps are:

  1. Iterate on the connector to make the needed changes
  2. Run tests
  3. Add any needed docs updates
  4. Create a PR to get the connector published

Publishing a connector

Once you've finished iterating on the changes to a connector as specified in its README.md, follow these instructions to ship the new version of the connector with Airbyte out of the box.

  1. Bump the version in the Dockerfile of the connector (LABEL io.airbyte.version=X.X.X).

  2. Submit a PR containing the changes you made.

  3. One of the Airbyte maintainers will review the change and publish the new version of the connector to Docker Hub. Triggering tests and publishing connectors can be done by leaving a comment on the PR with the following format (the PR must be from the Airbyte repo, not a fork):

    # to run integration tests for the connector
    # Example: /test connector=connectors/source-hubspot
    /test connector=(connectors|bases)/<connector_name> 
    
    # to run integration tests, publish the connector, and use the updated connector version in our config/metadata files
    # Example: /publish connector=connectors/source-hubspot
    /publish connector=(connectors|bases)/<connector_name>
    
  4. OPTIONAL: Necessary if this is a new connector, or the automated connector version bump fails

    • Update/Add the connector definition in the Airbyte connector index to use the new version:

      • airbyte-config/init/src/main/resources/seed/source_definitions.yaml if it is a source
      • airbyte-config/init/src/main/resources/seed/destination_definitions.yaml if it is a destination.
    • Then run the command ./gradlew :airbyte-config:init:processResources to generate the seed spec yaml files, and commit the changes to the PR. See this readme for more information.

  5. The new version of the connector is now available for everyone who uses it. Thank you!
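Step 1 (bumping the version label) is easy to get wrong by hand, so here is a minimal sketch of doing it programmatically. It assumes the Dockerfile contains exactly one line of the form `LABEL io.airbyte.version=X.X.X`; the function name `bump_version` is hypothetical.

```python
import re

# Matches a whole "LABEL io.airbyte.version=<value>" line, ignoring trailing spaces.
VERSION_LABEL = re.compile(r"^(LABEL\s+io\.airbyte\.version=)(\S+)[ \t]*$", re.MULTILINE)


def bump_version(dockerfile_text: str, new_version: str) -> str:
    """Return the Dockerfile text with the io.airbyte.version label replaced."""
    updated, count = VERSION_LABEL.subn(
        lambda match: match.group(1) + new_version, dockerfile_text
    )
    if count != 1:
        raise ValueError(f"expected exactly one version label, found {count}")
    return updated
```

To apply it, read the connector's Dockerfile, pass the text through `bump_version`, and write the result back before opening the PR.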

The /publish command

Publishing a connector can be done using the /publish command as outlined in the above section. The command runs a GitHub workflow and has the following configurable parameters:

  • connector - Required. This tells the workflow which connector to publish. e.g. connector=connectors/source-amazon-ads. This can also be a comma-separated list of many connectors, e.g. connector=connectors/source-s3,connectors/destination-postgres,connectors/source-facebook-marketing. See the parallel flag below if publishing multiple connectors.
  • repo - Defaults to the main airbyte repo. Set this when building connectors from forked repos. e.g. repo=userfork/airbyte
  • gitref - Defaults to the branch of the PR where the /publish command is run as a comment. If running manually, set this to your branch where you made changes e.g. gitref=george/s3-update
  • run-tests - Defaults to true. Should always run the tests as part of the publish flow so that if tests fail, the connector is not published.
  • comment-id - This is automatically filled if you run /publish from a comment and enables the workflow to write back success/fail logs to the git comment.
  • auto-bump-version - Defaults to true, automates the post-publish process of bumping the connector's version in the yaml seed definitions and generating spec.
  • parallel - Defaults to false. If set to true, a pool of runner agents will be spun up to allow publishing multiple connectors in parallel. Only switch this to true if publishing multiple connectors at once to avoid wasting $$$.
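To show how these parameters combine, the sketch below assembles a /publish comment and includes only the options that differ from the defaults listed above. It assumes parameters are written as space-separated `key=value` pairs, which matches the single-parameter examples in this document but is not confirmed for multiple parameters; `publish_comment` is a hypothetical helper.

```python
from typing import Optional, Sequence


def publish_comment(
    connectors: Sequence[str],
    repo: Optional[str] = None,
    gitref: Optional[str] = None,
    run_tests: bool = True,
    auto_bump_version: bool = True,
    parallel: bool = False,
) -> str:
    """Build a /publish comment, omitting parameters left at their defaults."""
    if not connectors:
        raise ValueError("at least one connector is required")
    parts = ["/publish", "connector=" + ",".join(connectors)]
    if repo:
        parts.append(f"repo={repo}")
    if gitref:
        parts.append(f"gitref={gitref}")
    if not run_tests:
        parts.append("run-tests=false")
    if not auto_bump_version:
        parts.append("auto-bump-version=false")
    if parallel:
        parts.append("parallel=true")
    return " ".join(parts)  # assumption: space-separated key=value pairs
```

For example, `publish_comment(["connectors/source-s3", "connectors/destination-postgres"], parallel=True)` produces one comment that publishes both connectors in parallel.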

Using credentials in CI

In order to run integration tests in CI, you'll often need to inject credentials into CI. There are a few steps for doing this:

  1. Place the credentials into Google Secret Manager (GSM): Airbyte uses a project's Google Secret Manager service as the source of truth for all CI secrets. Place the credentials exactly as they should be used by the connector into a GSM secret here, i.e. the secret should essentially be a copy of the config.json passed into a connector via the --config flag. We use the following naming pattern: SECRET_<capital source OR destination name>_CREDS, e.g. SECRET_SOURCE-S3_CREDS or SECRET_DESTINATION-SNOWFLAKE_CREDS.
  2. Add the GSM secret's labels:
    • connector (required) -- the connector's unique name, or a set of connector names delimited by '_', e.g. connector=source-s3, connector=destination-snowflake
    • filename (optional) -- custom target secret file name. Google does not allow '.' in label values, so Airbyte CI scripts append '.json' automatically. By default, secrets are saved to ./secrets/config.json. For example: filename=config_auth => secrets/config_auth.json
  3. Save the necessary JSON value (Example).
  4. That should be it.
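The naming pattern and the filename label rule can be summarized in a few lines of code. This is only a sketch of the conventions described in the steps above; `secret_name` and `secret_path` are hypothetical helpers, not functions from Airbyte's CI scripts.

```python
from typing import Optional


def secret_name(connector: str) -> str:
    """Map a connector name such as 'source-s3' to its GSM secret name."""
    return f"SECRET_{connector.upper()}_CREDS"


def secret_path(filename: Optional[str] = None) -> str:
    """Return where the secret is saved; '.json' is appended automatically."""
    return f"secrets/{filename or 'config'}.json"


if __name__ == "__main__":
    print(secret_name("source-s3"))    # SECRET_SOURCE-S3_CREDS
    print(secret_path("config_auth"))  # secrets/config_auth.json
    print(secret_path())               # secrets/config.json
```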

Access CI secrets on GSM

Access to GSM storage is limited to Airbyte employees. To give an employee permissions to the project:

  1. Go to the permissions page
  2. Add a new principal to dataline-integration-testing:
    • input their login email
    • select the role Development_CI_Secrets
  3. Save