Skip to content

Docker containers to build an Hadoop infrastructure and experiment feedback control loops atop of it.

License

Notifications You must be signed in to change notification settings

gobo7793/hadoop-benchmark

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Overview

Hadoop-Benchmark is an open-source research acceleration platform for rapid prototyping and evaluation of self-adaptive behaviors in Hadoop clusters. The main objectives are to allow researchers to

rapidly prototype, i.e., to experiment with self-adaptation in Hadoop clusters without the need to cope with low-level system infrastructure details,

reproduction, i.e., to share complete experiments for others to reproduce them independently, and

repetition, i.e., to experiment with and to compare their work, re-doing the same experiments on the same system using the same evaluation methods.

It uses docker and docker-machine to easily create a multi-node cluster (on a single laptop or in a cloud including Grid5000) and provision Hadoop. It contains a number of acknowledged benchmarks and one self-adaptive scenario.

The following is the high-level overview of the created cluster and deployed services: architecture

Requirements

  • docker >= 1.12
  • docker-machine >= 0.8
  • (optional) R >= 3.3.2 with tidyverse and Hmisc for data analysis

Usage

./cluster.sh                                                                              ✓
Usage ./cluster.sh [OPTIONS] COMMAND

Options:

  -f, --force   Use '-f' in docker commands where applicable
  -n, --noop    Only shows which commands would be executed wihout actually executing them
  -q, --quiet   Do not print which commands are executed

Commands:

  Cluster:
    create-cluster
    start-cluster
    stop-cluster
    restart-cluster
    destroy-cluster
    status-cluster

  Hadoop:
    start-hadoop
    stop-hadoop
    restart-hadoop
    destroy-hadoop

  Misc:
    console                   Enter a bash console in a container connected to the cluster
    run-controller CMD        Run a command CMD in the controller container
    hdfs CMD                  Run the HDFS CMD command
    hdfs-download SRC         Download a file from HDFS SRC to current directory

  Info:
    shell-init      Shows information how to initialize current shell to connect to the cluster
                    Useful to execute like: 'eval $(./cluster.sh shell-init)'
    connect-info    Shows information how to connect to the cluster

Documentation

About

Docker containers to build an Hadoop infrastructure and experiment feedback control loops atop of it.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Shell 94.2%
  • R 4.1%
  • XSLT 1.7%