Apache Gobblin Initialization Action

This initialization action installs version 0.12.0 RC2 of Apache Gobblin on all nodes within a Google Cloud Dataproc cluster.

The distribution is hosted in the Dataproc-team-owned Google Cloud Storage bucket gobblin-dist.

Using this initialization action

You can use this initialization action to create a new Dataproc cluster with Gobblin installed:

  1. Use the gcloud command to create a new cluster with this initialization action. The following command creates a new cluster named <CLUSTER_NAME>.

    gcloud dataproc clusters create <CLUSTER_NAME> \
        --initialization-actions gs://dataproc-initialization-actions/gobblin/gobblin.sh
  2. Submit jobs:

    gcloud dataproc jobs submit hadoop --cluster=<CLUSTER_NAME> \
        --class org.apache.gobblin.runtime.mapreduce.CliMRJobLauncher \
        --properties mapreduce.job.user.classpath.first=true -- \
        -sysconfig /usr/local/lib/gobblin/conf/gobblin-mapreduce.properties \
        -jobconfig gs://<PATH_TO_JOB_CONFIG>

    Alternatively, you can submit jobs through the Gobblin launcher scripts located in /usr/local/lib/gobblin/bin. By default, Gobblin is configured only for MapReduce mode.

  3. To learn how to use Gobblin, read the Getting Started guide in the Gobblin documentation.
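The -jobconfig flag in step 2 points to a Gobblin job configuration file stored in Cloud Storage. As an illustrative sketch only (the job name and class names below follow Gobblin's bundled Wikipedia example and are not part of this initialization action; they may differ across Gobblin versions), a minimal job configuration might look like:

```properties
# Hypothetical minimal Gobblin job configuration.
# job.name and job.group are arbitrary labels chosen by you.
job.name=ExampleWikipediaPull
job.group=Examples

# Source and converter classes must match your data source;
# these are from Gobblin's example package.
source.class=org.apache.gobblin.example.wikipedia.WikipediaSource
converter.classes=org.apache.gobblin.example.wikipedia.WikipediaConverter

# Write Avro output and publish it with the default publisher.
writer.builder.class=org.apache.gobblin.writer.AvroDataWriterBuilder
writer.destination.type=HDFS
writer.output.format=AVRO
data.publisher.type=org.apache.gobblin.publisher.BaseDataPublisher
```

Upload the file to a bucket (for example with gsutil cp) and pass its gs:// path as the -jobconfig value.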

Important notes

  1. For Gobblin to work with the Dataproc Job API, any additional client libraries (for example, Kafka or MySQL) must be symlinked into the /usr/lib/hadoop/lib directory on each node.
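As a sketch of that symlinking step (the helper function name and the assumption that extra client jars live under /usr/local/lib/gobblin/lib are hypothetical, not defined by this initialization action), you could run something like the following on each node:

```shell
#!/bin/sh
# Hypothetical helper: symlink extra client jars (e.g. Kafka, MySQL connectors)
# into Hadoop's library directory so Dataproc-submitted jobs can load them.
link_client_jars() {
  src_dir="$1"   # directory holding the client jars
  dest_dir="$2"  # Hadoop lib directory, typically /usr/lib/hadoop/lib
  for jar in "$src_dir"/*.jar; do
    # Skip the literal glob when no jars are present.
    [ -e "$jar" ] || continue
    ln -sf "$jar" "$dest_dir/$(basename "$jar")"
  done
}

# On a Dataproc node you would run (paths assumed, run as root or via sudo):
# link_client_jars /usr/local/lib/gobblin/lib /usr/lib/hadoop/lib
```

Running this during cluster creation (for example, from a custom initialization action) keeps every node's classpath consistent.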