This work was produced at the Lawrence Livermore National Laboratory
(LLNL) under Contract No. DE-AC52-07NA27344 (Contract 44) between
the U.S. Department of Energy (DOE) and Lawrence Livermore National
Security, LLC (LLNS) for the operation of LLNL.

This work was prepared as an account of work sponsored by an agency of
the United States Government. Neither the United States Government nor
Lawrence Livermore National Security, LLC nor any of their employees,
makes any warranty, express or implied, or assumes any liability or
responsibility for the accuracy, completeness, or usefulness of any
information, apparatus, product, or process disclosed, or represents
that its use would not infringe privately-owned rights.

Reference herein to any specific commercial product, process, or
service by trade name, trademark, manufacturer or otherwise does
not necessarily constitute or imply its endorsement, recommendation,
or favoring by the United States Government or Lawrence Livermore
National Security, LLC. The views and opinions of authors expressed
herein do not necessarily state or reflect those of the United States
Government or Lawrence Livermore National Security, LLC, and shall
not be used for advertising or product endorsement purposes.

The precise terms and conditions for copying, distribution, and
modification are specified in the file "COPYING".

Running Hadoop on Clusters w/ Slurm & Lustre

Albert Chu
Updated October 3rd, 2013
[email protected]

What is this project
--------------------

This project contains a number of scripts for running Hadoop jobs in
HPC environments using Slurm and for running those jobs on top of
Lustre.

This project allows you to:

- Run Hadoop interactively or via scripts.
- Run MapReduce 1.0 or 2.0 jobs (i.e. Hadoop 1.0 or 2.0)
- Run against HDFS, HDFS over Lustre, or raw Lustre
- Take advantage of SSDs for local caching if available
- Make decent optimizations of Hadoop for your hardware

Credit
------

First, credit must be given to Kevin Regimbal @ PNNL. Initial
experiments were done using heavily modified versions of scripts Kevin
developed for running Hadoop w/ Slurm & Lustre. A number of the ideas
from Kevin's scripts are still in these scripts.

Basic Idea
----------

The basic idea behind these scripts is to:

1) Allocate nodes on a cluster using Slurm.

2) The scripts will set up configuration files so the Slurm/MPI rank 0
   node is the "master". All compute nodes will have configuration
   files created that point to the node designated as the
   jobtracker/yarn server.

3) Launch Hadoop daemons on all nodes. The Slurm/MPI rank 0 node will
   run the JobTracker/NameNode (or Yarn server in Hadoop 2.0). All
   remaining nodes will run DataNodes/TaskTrackers (or NodeManagers in
   Hadoop 2.0).

Now you have a mini Hadoop cluster to do whatever you want.

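As a rough sketch (this is not the project's actual code, just an
illustration), picking the master out of a Slurm allocation might look
something like this:

  # Hypothetical sketch: treat the first host in the allocation as the
  # master, everything else as a worker.
  MASTER=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)
  if [ "$(hostname)" = "$MASTER" ]; then
      echo "master: would run the JobTracker/NameNode (or Yarn server)"
  else
      echo "worker: would run a DataNode/TaskTracker (or NodeManager)"
  fi
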
Basics of HDFS over Lustre
--------------------------

Instead of using local disk, designate a Lustre directory to "emulate"
local disk for each compute node. For example, let's say you have 4
compute nodes. If we create the following paths in Lustre,

/lustre/myusername/node-0
/lustre/myusername/node-1
/lustre/myusername/node-2
/lustre/myusername/node-3

we can give each of these paths to one of the compute nodes, which
they can treat like a local disk.

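As a rough illustration (a sketch with assumed paths, not the
project's actual code), each node could derive its own "local disk"
directory from its Slurm node rank, so the same rank always maps to
the same path across runs:

  # Hypothetical sketch: SLURM_NODEID is the node's rank within the job.
  BASEDIR=/lustre/myusername            # base path from the example above
  MYDIR="${BASEDIR}/node-${SLURM_NODEID}"
  mkdir -p "$MYDIR"                     # reused on later runs, so HDFS data persists
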
Q: Does that mean I have to constantly rebuild HDFS every time I start
   a job?

A: No. Using Slurm/MPI ranks, "disk-paths" can be consistently
   assigned to nodes so that all your HDFS files from a previous run
   can exist on a later run.

Q: But I'll have to consistently use the same number of cluster nodes?

A: Generally speaking, yes. If you decide to change the number of
   nodes you run on, you may need to rebalance HDFS blocks or fix up
   HDFS. Imagine you had a traditional Hadoop cluster and you were
   increasing or decreasing the number of nodes in the cluster. How
   would you have to handle it?

   Increasing the number of nodes in your job is generally safer than
   decreasing it. HDFS should be able to find your data and rebalance
   it. Be careful if you scale down the number of nodes without
   handling it first; as far as HDFS is concerned, you may have "lost
   data".

Basic Instructions
------------------

1) Download your favorite version of Hadoop off of Apache and install
   it into a location where it's accessible on all cluster nodes.
   Usually this is in an NFS home directory. Adjust HADOOP_VERSION and
   HADOOP_BUILD_HOME appropriately for the install.

2) Open up sbatch.hadoop and set up the Slurm essentials for your job
   (a filled-in sketch follows this list):

   #SBATCH --nodes : Set how many nodes you want in your job

   SBATCH_TIMELIMIT : Set the time for this job to run

   #SBATCH --partition : Set the job partition

   HADOOP_SCRIPTS_HOME : Set where your scripts are.

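For example, the relevant part of sbatch.hadoop might end up looking
roughly like the following (the values are placeholders and the exact
syntax may differ in the real file):

  #SBATCH --nodes=8               # number of nodes for the job
  #SBATCH --partition=pbatch      # hypothetical partition name
  SBATCH_TIMELIMIT=120            # time limit, in whatever format your site expects
  HADOOP_SCRIPTS_HOME="${HOME}/hadoop-scripts"   # assumed location of these scripts
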
3) Now set up the essentials for Hadoop (a sketch with placeholder
   values follows this list):

   HADOOP_MODE : The first time you'll probably want to run w/
   'terasort' mode just to try things out. Later you may want to run
   w/ 'script' or 'interactive' mode, as that is the more likely way
   to run.

   HADOOP_FILESYSTEM_MODE : Most likely you'll want "hdfsoverlustre".

   HADOOP_HDFSOVERLUSTRE_PATH : For HDFS over Lustre, this needs to be
   set.

   HADOOP_SETUP_TYPE : Whether you are running MapReduce version 1 or 2.

   HADOOP_VERSION : Make sure your build matches HADOOP_SETUP_TYPE
   (i.e. don't say you want MapReduce 1 and point to a Hadoop 2.0
   build).

   HADOOP_BUILD_HOME : Where your Hadoop build is. Typically on an NFS
   mount.

   HADOOP_LOCAL_DIR : A small place for conf files and log files local
   to each node. Typically a /tmp directory.

   JAVA_HOME : Because you need it ...

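A sketch of what those settings might look like when filled in (all
values below are placeholders and assumptions, not project defaults;
the accepted values for each option are documented in sbatch.hadoop):

  HADOOP_MODE="terasort"                    # 'script' or 'interactive' later on
  HADOOP_FILESYSTEM_MODE="hdfsoverlustre"
  HADOOP_HDFSOVERLUSTRE_PATH="/lustre/myusername/hdfsoverlustre"  # assumed path
  HADOOP_SETUP_TYPE="MR2"                   # hypothetical value meaning MapReduce 2
  HADOOP_VERSION="2.1.0-beta"               # must match HADOOP_SETUP_TYPE
  HADOOP_BUILD_HOME="${HOME}/hadoop-${HADOOP_VERSION}"            # assumed NFS path
  HADOOP_LOCAL_DIR="/tmp/${USER}/hadoop"    # small node-local scratch space
  JAVA_HOME="/usr/lib/jvm/java"             # assumed path on your nodes
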
4) If you are happy with the configuration files provided by this
   project, you can use them. If not, change them. If you copy them
   to a new directory, adjust HADOOP_CONF_FILES in sbatch.hadoop as
   needed.

5) Run "sbatch -k ./sbatch.hadoop", plus any other options you see
   fit.

6) Look at your Slurm output file to see your output. There will also
   be some notes/instructions/tips in the Slurm output file for
   viewing the status of your job, how to interact with it, etc.

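For example, a session might look roughly like this (the job ID is
made up, and the output file name assumes Slurm's default of
slurm-<jobid>.out):

  sbatch -k ./sbatch.hadoop
  # Submitted batch job 1234567
  squeue -u $USER                 # confirm the job is running
  tail -f slurm-1234567.out       # watch the notes/instructions/tips
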
Advanced
--------

There are many advanced options and other scripting options. Please
see sbatch.hadoop for details.

The scripts make decent guesstimates on what would be best, but it
always depends on your job and your hardware. Many options in
sbatch.hadoop are available to help you adjust your performance.

Exported Environment Variables
------------------------------

The following environment variables are exported by the sbatch
hadoop-run script and may be useful.

HADOOP_CLUSTER_NODERANK : the rank of the node you are on. It's often
convenient to do something like the following to act on only one node
of your allocation:

  if [ "$HADOOP_CLUSTER_NODERANK" = "0" ]
  then
      ....
  fi

HADOOP_CONF_DIR : the directory where configuration files local to the
node are stored.

HADOOP_LOG_DIR : the directory where log files are stored.

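For instance, in 'script' mode a job script could point the Hadoop
command line at the per-node configuration (a hedged example; the
exact invocation depends on your Hadoop version and setup):

  ${HADOOP_BUILD_HOME}/bin/hadoop --config ${HADOOP_CONF_DIR} fs -ls /
  echo "logs for this node are under ${HADOOP_LOG_DIR}"
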
Patching Hadoop
---------------

Generally speaking, no modifications to Hadoop are necessary; however,
tweaks may be needed depending on your environment. In some
environments, passwordless ssh is disabled, requiring a modification
to Hadoop so that you can use non-ssh mechanisms for launching
daemons.

I have submitted a patch for adjusting this at this JIRA:

https://issues.apache.org/jira/browse/HADOOP-9109

For those who use mrsh (https://github.com/chaos/mrsh), applying one
of the appropriate patches in the 'patches' directory will allow you
to specify mrsh for launching remote daemons instead of ssh using the
HADOOP_REMOTE_CMD environment variable.

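Presumably that amounts to something along the lines of the following
in your job environment (shown only as an assumed illustration; check
the patched scripts for the exact variable usage):

  export HADOOP_REMOTE_CMD=mrsh
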
Special Note on Hadoop 1.0
--------------------------

See this JIRA:

https://issues.apache.org/jira/browse/HDFS-1835

Hadoop 1.0 appears to have more trouble on diskless systems, as
diskless systems have less entropy in them. So you may wish to apply
the patch in the above JIRA if things don't seem to be working. I
noticed the following a lot on my cluster:

2013-09-19 10:45:37,620 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(apex114.llnl.gov:50010, storageID=, infoPort=50075, ipcPort=50020)

Notice the storageID is blank; that's because the random number
calculation failed. Subsequently daemons aren't started, and so on,
and badness overall.

If you have root privileges, starting up the rngd daemon is another
way to solve this problem without resorting to patching.

Dependency
----------

This project includes a script called 'hadoop-expand-nodes' which is
used for hostrange expansion within the scripts (e.g. expanding a
hostrange such as "node[0-3]" into "node0 node1 node2 node3"). It is
a hack pieced together from other scripts.

The preferred mechanism is to use the hostlist command in the
lua-hostlist project. You can find lua-hostlist here : < FILL IN >

The main hadoop-run script will use 'hadoop-expand-nodes' if it cannot
find 'hostlist' in its path.

Contributions
-------------

Feel free to send me patches for new environment variables, new
adjustments, new optimization possibilities, alternate defaults that
you feel are better, etc.

Any patches you submit to me for fixes will be appreciated. I am by
no means a bash expert ... in fact I'm quite bad at it.