This repository has been archived by the owner on Dec 19, 2024. It is now read-only.

To better understand Pangeo Forge, we want to explore "vanilla Beam"

QGreenland-Net/apache-beam-exploration


apache-beam-exploration

Repo for exploring the use of Apache Beam as the orchestrator for OGDC recipes.

The repo currently focuses on following along with the Beam "getting started" materials: https://beam.apache.org/get-started/

System check

To start, run the copy of the word-count example that ships with Apache Beam, just to confirm that Beam is installed correctly:

python -m apache_beam.examples.wordcount_minimal \
  --input data/words.txt \
  --output data/wordcounts_official_example.txt

This outputs a file wordcounts_official_example.txt-00000-of-00001. Why doesn't it match the requested output file name?
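A likely answer: Beam's text sink treats the `--output` value as a file path prefix rather than a complete file name, and by default appends a shard suffix following the pattern `-SSSSS-of-NNNNN` (zero-padded shard index and shard count). The sketch below only imitates that naming convention in plain Python so the observed name can be reproduced; `shard_file_name` and `all_shard_names` are hypothetical helpers, not part of Beam. In a real pipeline, passing `shard_name_template=''` to `WriteToText` should produce a single file with exactly the requested name, though we have not verified that in this repo.

```python
def shard_file_name(prefix: str, shard: int, num_shards: int, suffix: str = "") -> str:
    """Imitate the '-SSSSS-of-NNNNN' naming pattern (hypothetical helper, not Beam API)."""
    # S and N are each replaced by a zero-padded, five-digit number.
    return f"{prefix}-{shard:05d}-of-{num_shards:05d}{suffix}"


def all_shard_names(prefix: str, num_shards: int, suffix: str = "") -> list[str]:
    """List the file names that a run with `num_shards` output shards would produce."""
    return [shard_file_name(prefix, i, num_shards, suffix) for i in range(num_shards)]


if __name__ == "__main__":
    # A small local run typically writes one shard, hence "00000-of-00001".
    print(shard_file_name("data/wordcounts_official_example.txt", 0, 1))
```

With a single shard this prints `data/wordcounts_official_example.txt-00000-of-00001`, matching the file seen above.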

Our own implementation of the example

python -m wordcount_example \
  --input data/words.txt \
  --output data/wordcounts_our_example.txt

The output file looks the same as the output file from the above example. There is significantly less log output, however. Why is that?
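One plausible explanation, assuming our `wordcount_example` module never changes the log level: Python's root logger defaults to `WARNING`, so Beam's `INFO` messages are dropped. As far as we can tell, the official example raises the root logger to `INFO` before it runs. The sketch below uses only the standard `logging` module to show which levels pass at each setting; `visible_levels` and the logger name `beam_log_demo` are made up for illustration.

```python
import logging

LEVELS = [
    ("DEBUG", logging.DEBUG),
    ("INFO", logging.INFO),
    ("WARNING", logging.WARNING),
    ("ERROR", logging.ERROR),
]


def visible_levels(root_level: int) -> list[str]:
    """Return the level names a child logger emits for a given root logger level."""
    root = logging.getLogger()
    child = logging.getLogger("beam_log_demo")  # hypothetical logger name
    child.setLevel(logging.NOTSET)  # defer to the root logger's level
    previous = root.level
    root.setLevel(root_level)
    try:
        return [name for name, level in LEVELS if child.isEnabledFor(level)]
    finally:
        root.setLevel(previous)  # restore whatever level was set before


if __name__ == "__main__":
    print("root at WARNING:", visible_levels(logging.WARNING))
    print("root at INFO:   ", visible_levels(logging.INFO))
```

If this is the cause, adding `logging.getLogger().setLevel(logging.INFO)` before the pipeline runs in our script should bring the log output closer to the official example's.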

Seal tag data spike

python -m seal_csv_to_gpkg

Useful resources
