Commit

initial code drop

chu11 committed Oct 3, 2013
1 parent 2c9bc05 commit 18142a7
Showing 28 changed files with 3,307 additions and 23 deletions.
47 changes: 24 additions & 23 deletions LICENSE → COPYING
@@ -1,12 +1,12 @@
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991

-Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
-51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+Copyright (C) 1989, 1991 Free Software Foundation, Inc.
+51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

Preamble

The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
@@ -15,7 +15,7 @@ software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
-the GNU Lesser General Public License instead.) You can apply it to
+the GNU Library General Public License instead.) You can apply it to
your programs, too.

When we speak of free software, we are referring to freedom, not
@@ -55,8 +55,8 @@ patent must be licensed for everyone's free use or not licensed at all.

The precise terms and conditions for copying, distribution and
modification follow.

GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

0. This License applies to any program or other work which contains
@@ -110,7 +110,7 @@ above, provided that you also meet all of these conditions:
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)

These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
@@ -168,7 +168,7 @@ access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.

4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
@@ -225,7 +225,7 @@ impose that choice.

This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.

8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
@@ -255,7 +255,7 @@ make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.

NO WARRANTY

11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
@@ -277,9 +277,9 @@ YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.

END OF TERMS AND CONDITIONS

How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
@@ -290,8 +290,8 @@ to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.

-Magpie contains a number of scripts for running Hadoop jobs in HPC environments using Slurm and running jobs on top of Lustre
-Copyright (C) 2013 chu11
+<one line to give the program's name and a brief idea of what it does.>
+Copyright (C) <year> <name of author>

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
@@ -303,16 +303,17 @@ the "copyright" line and a pointer to where the full notice is found.
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

-You should have received a copy of the GNU General Public License along
-with this program; if not, write to the Free Software Foundation, Inc.,
-51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA


Also add information on how to contact you by electronic and paper mail.

If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:

Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
@@ -329,11 +330,11 @@ necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.

-{signature of Ty Coon}, 1 April 1989
+<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice

This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
-library. If this is what you want to do, use the GNU Lesser General
+library. If this is what you want to do, use the GNU Library General
Public License instead of this License.
24 changes: 24 additions & 0 deletions DISCLAIMER
@@ -0,0 +1,24 @@
This work was produced at the Lawrence Livermore National Laboratory
(LLNL) under Contract No. DE-AC52-07NA27344 (Contract 44) between
the U.S. Department of Energy (DOE) and Lawrence Livermore National
Security, LLC (LLNS) for the operation of LLNL.

This work was prepared as an account of work sponsored by an agency of
the United States Government. Neither the United States Government nor
Lawrence Livermore National Security, LLC nor any of their employees,
makes any warranty, express or implied, or assumes any liability or
responsibility for the accuracy, completeness, or usefulness of any
information, apparatus, product, or process disclosed, or represents
that its use would not infringe privately-owned rights.

Reference herein to any specific commercial product, process, or
service by trade name, trademark, manufacturer or otherwise does
not necessarily constitute or imply its endorsement, recommendation,
or favoring by the United States Government or Lawrence Livermore
National Security, LLC. The views and opinions of authors expressed
herein do not necessarily state or reflect those of the United States
Government or Lawrence Livermore National Security, LLC, and shall
not be used for advertising or product endorsement purposes.

The precise terms and conditions for copying, distribution, and
modification are specified in the file "COPYING".
230 changes: 230 additions & 0 deletions README
@@ -0,0 +1,230 @@
Running Hadoop on Clusters w/ Slurm & Lustre

Albert Chu
Updated October 3rd, 2013
[email protected]

What is this project
--------------------

This project contains a number of scripts for running Hadoop jobs in
HPC environments, using Slurm for node allocation and running the
jobs on top of Lustre.

This project allows you to:

- Run Hadoop interactively or via scripts
- Run MapReduce 1.0 or 2.0 jobs (i.e. Hadoop 1.0 or 2.0)
- Run against HDFS, HDFS over Lustre, or raw Lustre
- Take advantage of SSDs for local caching if available
- Make decent optimizations of Hadoop for your hardware

Credit
------

First, credit must be given to Kevin Regimbal @ PNNL. Initial
experiments were done using heavily modified versions of scripts Kevin
developed for running Hadoop w/ Slurm & Lustre. A number of the
ideas from Kevin's scripts are still in these scripts.

Basic Idea
----------

The basic idea behind these scripts is to:

1) Allocate nodes on a cluster using Slurm.

2) Set up configuration files so the Slurm/MPI rank 0 node is the
   "master". All compute nodes will have configuration files created
   that point to the node designated as the JobTracker/YARN server.

3) Launch Hadoop daemons on all nodes. The Slurm/MPI rank 0 node will
   run the JobTracker/NameNode (or the YARN server in Hadoop 2.0). All
   remaining nodes will run DataNodes/TaskTrackers (or NodeManagers in
   Hadoop 2.0).

Now you have a mini Hadoop cluster to do whatever you want.
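
As a rough sketch, the per-node logic described above looks something
like the following (illustrative only, not the project's actual launch
code; the daemon commands shown are the standard Hadoop 1.0 ones):

    if [ "$SLURM_NODEID" -eq 0 ]
    then
        # rank 0 is the "master"
        hadoop-daemon.sh start namenode
        hadoop-daemon.sh start jobtracker
    else
        # all remaining nodes are workers
        hadoop-daemon.sh start datanode
        hadoop-daemon.sh start tasktracker
    fi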

Basics of HDFS over Lustre
--------------------------

Instead of using local disk, designate a Lustre directory to "emulate"
local disk for each compute node. For example, let's say you have 4
compute nodes. If we create the following paths in Lustre,

/lustre/myusername/node-0
/lustre/myusername/node-1
/lustre/myusername/node-2
/lustre/myusername/node-3

we can give each of these paths to one of the compute nodes, and each
node can treat its path like a local disk.
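
The scripts handle this assignment for you, but the idea is roughly
the following (a sketch; the variable name and HDFS wiring here are
illustrative):

    # each node derives its "local disk" path from its rank
    MYDIR="/lustre/myusername/node-${SLURM_NODEID}"
    mkdir -p "$MYDIR"
    # ... then point the node's HDFS data directory at $MYDIR ...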

Q: Does that mean I have to constantly rebuild HDFS every time I start
   a job?

A: No. Using Slurm/MPI ranks, "disk-paths" can be consistently
   assigned to nodes so that all your HDFS files from a previous run
   are still available on a later run.

Q: But do I have to consistently use the same number of cluster nodes?

A: Generally speaking, yes. If you decide to change the number of
   nodes you run on, you may need to rebalance HDFS blocks or fix
   HDFS. Imagine you had a traditional Hadoop cluster and you were
   increasing or decreasing the number of nodes in it. How would you
   have to handle it?

Increasing the number of nodes in your job is generally safer than
decreasing it. HDFS should be able to find your data and rebalance
it. Be careful if you try to scale down the number of nodes you use
without handling it first; as far as HDFS is concerned, you may have
"lost" data.

Basic Instructions
------------------

1) Download your favorite version of Hadoop off of Apache and install
   it into a location that is accessible on all cluster nodes.
   Usually this is an NFS home directory. Adjust HADOOP_VERSION and
   HADOOP_BUILD_HOME appropriately for the install.
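
For example (the release and mirror below are illustrative; use
whatever version you actually want):

    # run from a directory visible on all nodes, e.g. your NFS home
    wget http://archive.apache.org/dist/hadoop/core/hadoop-1.2.1/hadoop-1.2.1.tar.gz
    tar -xzf hadoop-1.2.1.tar.gz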

2) Open up sbatch.hadoop and set up the Slurm essentials for your
   job. Here are the essentials for the setup (a sketch follows the
   list):

#SBATCH --nodes : Set how many nodes you want in your job

SBATCH_TIMELIMIT : Set the time for this job to run

#SBATCH --partition : Set the job partition

HADOOP_SCRIPTS_HOME : Set where your scripts are.
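
Put together, the Slurm portion of sbatch.hadoop might look something
like this (a sketch; the node count, partition name, time limit, and
path are all illustrative):

    #SBATCH --nodes=8
    #SBATCH --partition=pbatch
    export SBATCH_TIMELIMIT=120
    export HADOOP_SCRIPTS_HOME="$HOME/hadoop-scripts"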

3) Now set up the essentials for Hadoop. Here are the essentials (an
   example follows the list):

HADOOP_MODE : The first time you'll probably want to run w/
'terasort' mode just to try things out. Later you may want to run
w/ 'script' or 'interactive' mode, as that is the more likely way
to run.

HADOOP_FILESYSTEM_MODE : Most likely you'll want "hdfsoverlustre".

HADOOP_HDFSOVERLUSTRE_PATH : For HDFS over Lustre, this must be set.

HADOOP_SETUP_TYPE : Whether you are running MapReduce version 1 or 2.

HADOOP_VERSION : Make sure your build matches HADOOP_SETUP_TYPE
(i.e. don't say you want MapReduce 1 and point to a Hadoop 2.0 build).

HADOOP_BUILD_HOME : Where your Hadoop build is. Typically in an NFS mount.

HADOOP_LOCAL_DIR : A small place for conf files and log files local
to each node. Typically under /tmp.

JAVA_HOME : Because Hadoop needs to know where your Java installation is.
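
A plausible set of values for an HDFS-over-Lustre, MapReduce 1 run
(a sketch; every value is illustrative and must match your install,
and the exact strings accepted for these variables are whatever
sbatch.hadoop documents):

    export HADOOP_MODE="terasort"
    export HADOOP_FILESYSTEM_MODE="hdfsoverlustre"
    export HADOOP_HDFSOVERLUSTRE_PATH="/lustre/myusername/hdfsoverlustre"
    export HADOOP_SETUP_TYPE="MR1"
    export HADOOP_VERSION="1.2.1"
    export HADOOP_BUILD_HOME="$HOME/hadoop-${HADOOP_VERSION}"
    export HADOOP_LOCAL_DIR="/tmp/myusername/hadoop"
    export JAVA_HOME="/usr/lib/jvm/java"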

4) If you are happy with the configuration files provided by this
project, you can use them. If not, change them. If you copy them
to a new directory, adjust HADOOP_CONF_FILES in sbatch.hadoop as
needed.

5) Run "sbatch -k ./sbatch.hadoop", along with any other options you
   see fit.

6) Look at your Slurm output file to see your output. There will also
   be some notes/instructions/tips in the Slurm output file for
   viewing the status of your job, how to interact with it, etc.

Advanced
--------

There are many advanced options and other scripting options. Please
see sbatch.hadoop for details.

The scripts make decent guesstimates on what would be best, but the
best settings always depend on your job and your hardware. Many
options in sbatch.hadoop are available to help you tune performance.

Exported Environment Variables
------------------------------

The following environment variables are exported by the sbatch
hadoop-run script and may be useful.

HADOOP_CLUSTER_NODERANK : the rank of the node you are on. To run
something on only one node of your allocation, it's often convenient
to do something like:

if [ "$HADOOP_CLUSTER_NODERANK" -eq 0 ]
then
    # ... commands to run on just the rank 0 node ...
fi

HADOOP_CONF_DIR : the directory in which configuration files local to
the node are stored.

HADOOP_LOG_DIR : the directory in which log files are stored.
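
These make it convenient to poke at the running cluster from inside
your allocation. For example (a sketch):

    # run an HDFS command against this job's configuration
    hadoop --config "$HADOOP_CONF_DIR" fs -ls /

    # watch this node's logs
    tail -f "$HADOOP_LOG_DIR"/*.log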

Patching Hadoop
---------------

Generally speaking, no modifications to Hadoop are necessary; however,
tweaks may be needed depending on your environment. In some
environments, passwordless ssh is disabled, requiring a modification
to Hadoop to allow you to use non-ssh mechanisms for launching
daemons.

I have submitted a patch to address this at this JIRA:

https://issues.apache.org/jira/browse/HADOOP-9109

For those who use mrsh (https://github.com/chaos/mrsh), applying one
of the appropriate patches in the 'patches' directory will allow you
to specify mrsh instead of ssh for launching remote daemons, via the
HADOOP_REMOTE_CMD environment variable.
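
For example, after applying the appropriate patch (and assuming mrsh
is installed on all nodes):

    export HADOOP_REMOTE_CMD=mrsh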

Special Note on Hadoop 1.0
--------------------------

See this JIRA:

https://issues.apache.org/jira/browse/HDFS-1835

Hadoop 1.0 appears to have more trouble on diskless systems, as
diskless systems have less entropy available. You may wish to apply
the patch in the above JIRA if things don't seem to be working. I
noticed the following a lot on my cluster:

2013-09-19 10:45:37,620 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(apex114.llnl.gov:50010, storageID=, infoPort=50075, ipcPort=50020)

Notice that the storageID is blank; that's because the random number
calculation failed. Subsequently, daemons aren't started, and things
generally break from there.

If you have root privileges, starting up the rngd daemon is another
way to solve this problem without resorting to patching.
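
For example (requires root; this feeds /dev/urandom into the kernel
entropy pool, which is usually sufficient here):

    rngd -r /dev/urandom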

Dependency
----------

This project includes a script called 'hadoop-expand-nodes' which is
used for hostrange expansion within the scripts. It is a hack pieced
together from other scripts.

The preferred mechanism is to use the hostlist command in the
lua-hostlist project. You can find lua-hostlist here: < FILL IN >

The main hadoop-run script will use 'hadoop-expand-nodes' if it cannot
find 'hostlist' in its path.
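
In other words, the selection behaves roughly like this (a sketch,
not the script's literal code):

    # prefer lua-hostlist's 'hostlist' if it is in the path
    if command -v hostlist >/dev/null 2>&1
    then
        expander=hostlist
    else
        expander=hadoop-expand-nodes
    fi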

Contributions
-------------

Feel free to send me patches for new environment variables, new
adjustments, new optimization possibilities, alternate defaults that
you feel are better, etc.

Any patches you submit to me for fixes will be appreciated. I am by
no means a bash expert ... in fact I'm quite bad at it.