data-river

Replicates data from mysql to any datastore of your choice (relies on maxwell)

Maven Dependency

Use the following maven dependency:

<dependency>
    <groupId>com.github.vivekkothari</groupId>
    <artifactId>elastic-search-persister</artifactId>
    <version>1.2.1</version>
</dependency>

Imagine you have 2 tables in MySql, Table1 and Table2. You would have to configure maxwell. Once maxwell is properly configured, lets say you want to persist changes in the above tables in Elastic-Search. We would create 2 rivers, table1_river and table2_river. Then for each of these riverType, provide implementation of IFilter, IEnricher and IBackFiller and build a TransformerFactory

Incoming kafka message goes through following 3 steps:

Filtering: IFilter governs whether the incoming message should be processed or not. 2. Enrichment: IEnricher provides a way to enrich the incoming message. (think of joining the row with some other row) 3. Persistence: IPersister persists the message in your desired data store.

There is also and admin task on the admin port of your dropwizard application which can be used to backfill the data. Example

http://localhost:8080/admin/backfill?startDate=2016-01-01T00:00:00&endDate=2016-01-10T00:00:00&riverType=river1

Add following code in the run method of you dropwizard application. Add com.github.vivekkothari.persister.ESRiverConfiguration in your Configuration class Configure appropriately (like Elasticsearch hosts, bulk index configs etc.) then call following method.

configuration.getEsRiverConfiguration().build(environment);

The IPersister can be extended and you can add a new data store to which the messages can be stored.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

data-river

Maven Dependency

Files

README.md

Latest commit

History

README.md

File metadata and controls

data-river

Maven Dependency