The spark-lamdo repository builds and configures an Apache Spark cluster with one master and two workers.
Spark version: 3.2.1
The repository contains two directories: /apps and /data.
Job.py in /apps is a PySpark script that performs ETL (extract, transform, and load) into PostgreSQL.
/data holds the input CSV file(s) and the PostgreSQL JDBC driver JAR.
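
The actual transformation logic lives in Job.py; as a rough sketch of the shape such a job takes (the CSV filename, table name, database host, and credentials below are illustrative assumptions, not values from this repository):

```python
# Minimal sketch of a PySpark ETL job in the spirit of Job.py.
# The CSV name, table name, and Postgres connection details are
# placeholders, not the repository's actual values.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("etl-job").getOrCreate()

# Extract: read the CSV shipped in /opt/spark-data (filename assumed).
df = spark.read.csv("/opt/spark-data/input.csv", header=True, inferSchema=True)

# Transform: placeholder for the real cleaning/aggregation logic.
clean = df.dropna()

# Load: write to PostgreSQL over JDBC, using the driver JAR passed via --jars.
(clean.write
    .format("jdbc")
    .option("url", "jdbc:postgresql://postgres:5432/mydb")  # host/db assumed
    .option("dbtable", "public.my_table")                   # table assumed
    .option("user", "postgres")                             # credentials assumed
    .option("password", "postgres")
    .option("driver", "org.postgresql.Driver")
    .mode("overwrite")
    .save())

spark.stop()
```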
# To run:
docker build -t cluster-apache-spark:3.2.1 .
docker compose up -d
# To submit the app, connect to the master or one of the workers (for example with docker exec) and execute:
/opt/spark/bin/spark-submit --master spark://spark-master:7077 \
--jars /opt/spark-data/postgresql-42.2.22.jar \
--driver-memory 1G \
--executor-memory 1G \
/opt/spark-apps/Job.py
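
The same settings can also be expressed inside the script itself; a minimal sketch mirroring the flags above (the app name is an assumption):

```python
# Sketch: the spark-submit flags above expressed as SparkSession config.
# Caveat: if a JVM is already running (e.g. in a notebook), driver memory
# must be set before startup and the builder option may be ignored.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("Job")                       # app name assumed
    .master("spark://spark-master:7077")
    .config("spark.jars", "/opt/spark-data/postgresql-42.2.22.jar")
    .config("spark.driver.memory", "1g")
    .config("spark.executor.memory", "1g")
    .getOrCreate()
)
```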
After the job finishes, check its completion time in the Spark master web UI (served on port 8080 by default).