diff --git a/LICENSE b/COPYING similarity index 94% rename from LICENSE rename to COPYING index 92cef9ca1..623b6258a 100644 --- a/LICENSE +++ b/COPYING @@ -1,12 +1,12 @@ -GNU GENERAL PUBLIC LICENSE - Version 2, June 1991 + GNU GENERAL PUBLIC LICENSE + Version 2, June 1991 - Copyright (C) 1989, 1991 Free Software Foundation, Inc., - 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Copyright (C) 1989, 1991 Free Software Foundation, Inc. + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. - Preamble + Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public @@ -15,7 +15,7 @@ software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by -the GNU Lesser General Public License instead.) You can apply it to +the GNU Library General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not @@ -55,8 +55,8 @@ patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. - - GNU GENERAL PUBLIC LICENSE + + GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License applies to any program or other work which contains @@ -110,7 +110,7 @@ above, provided that you also meet all of these conditions: License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) - + These requirements apply to the modified work as a whole. 
If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in @@ -168,7 +168,7 @@ access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. - + 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is @@ -225,7 +225,7 @@ impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. - + 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License @@ -255,7 +255,7 @@ make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. - NO WARRANTY + NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN @@ -277,9 +277,9 @@ YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 
- END OF TERMS AND CONDITIONS - - How to Apply These Terms to Your New Programs + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it @@ -290,8 +290,8 @@ to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. - Magpie contains a number of scripts for running Hadoop jobs in HPC environments using Slurm and running jobs on top of Lustre - Copyright (C) 2013 chu11 + + Copyright (C) This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by @@ -303,16 +303,17 @@ the "copyright" line and a pointer to where the full notice is found. MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. - You should have received a copy of the GNU General Public License along - with this program; if not, write to the Free Software Foundation, Inc., - 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Also add information on how to contact you by electronic and paper mail. If the program is interactive, make it output a short notice like this when it starts in an interactive mode: - Gnomovision version 69, Copyright (C) year name of author + Gnomovision version 69, Copyright (C) year name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. @@ -329,11 +330,11 @@ necessary. 
Here is a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker. - {signature of Ty Coon}, 1 April 1989 + , 1 April 1989 Ty Coon, President of Vice This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the -library. If this is what you want to do, use the GNU Lesser General +library. If this is what you want to do, use the GNU Library General Public License instead of this License. diff --git a/DISCLAIMER b/DISCLAIMER new file mode 100644 index 000000000..1bb04be7e --- /dev/null +++ b/DISCLAIMER @@ -0,0 +1,24 @@ +This work was produced at the Lawrence Livermore National Laboratory +(LLNL) under Contract No. DE-AC52-07NA27344 (Contract 44) between +the U.S. Department of Energy (DOE) and Lawrence Livermore National +Security, LLC (LLNS) for the operation of LLNL. + +This work was prepared as an account of work sponsored by an agency of +the United States Government. Neither the United States Government nor +Lawrence Livermore National Security, LLC nor any of their employees, +makes any warranty, express or implied, or assumes any liability or +responsibility for the accuracy, completeness, or usefulness of any +information, apparatus, product, or process disclosed, or represents +that its use would not infringe privately-owned rights. + +Reference herein to any specific commercial products, process, or +services by trade name, trademark, manufacturer or otherwise does +not necessarily constitute or imply its endorsement, recommendation, +or favoring by the United States Government or Lawrence Livermore +National Security, LLC. 
The views and opinions of authors expressed +herein do not necessarily state or reflect those of the United States +Government or Lawrence Livermore National Security, LLC, and shall +not be used for advertising or product endorsement purposes. + +The precise terms and conditions for copying, distribution, and +modification are specified in the file "COPYING". diff --git a/README b/README new file mode 100644 index 000000000..4c4235624 --- /dev/null +++ b/README @@ -0,0 +1,230 @@ +Running Hadoop on Clusters w/ Slurm & Lustre + +Albert Chu +Updated October 3rd, 2013 +chu11@llnl.gov + +What is this project +-------------------- + +This project contains a number of scripts for running Hadoop jobs in +HPC environments using Slurm and running jobs on top of Lustre. + +This project allows you to: + +- Run Hadoop interactively or via scripts. +- Run MapReduce 1.0 or 2.0 jobs (i.e. Hadoop 1.0 or 2.0) +- Run against HDFS, HDFS over Lustre, or raw Lustre +- Take advantage of SSDs for local caching if available +- Make decent optimizations of Hadoop for your hardware + +Credit +------ + +First, credit must be given to Kevin Regimbal @ PNNL. Initial +experiments were done using heavily modified versions of scripts Kevin +developed for running Hadoop w/ Slurm & Lustre. A number of the +ideas from Kevin's scripts are still in these scripts. + +Basic Idea +---------- + +The basic idea behind these scripts is to: + +1) Allocate nodes on a cluster using Slurm + +2) Set up configuration files so the Slurm/MPI rank 0 node + is the "master". All compute nodes will have configuration files + created that point to the node designated as the jobtracker/yarn + server. + +3) Launch Hadoop daemons on all nodes. The Slurm/MPI rank 0 node will + run the JobTracker/NameNode (or Yarn Server in Hadoop 2.0). All + remaining nodes will run DataNodes/Tasktrackers (or NodeManager in + Hadoop 2.0). + +Now you have a mini Hadoop cluster to do whatever you want.
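The rank-based role split described in the three steps above can be sketched in shell. This is an illustrative sketch only: the variable names (NODERANK, ROLE) are not the actual script's internals, and SLURM_NODEID (the node's rank within a Slurm allocation) is defaulted to 0 so the sketch also runs outside a job.

```shell
# Sketch of the master/worker split described above.
# SLURM_NODEID is the rank of this node within the allocation;
# default it to 0 so the sketch runs anywhere.
NODERANK="${SLURM_NODEID:-0}"

if [ "$NODERANK" -eq 0 ]; then
    # Rank 0 is the "master": JobTracker/NameNode (or Yarn server in 2.0)
    ROLE="master"
else
    # All other ranks run DataNode/TaskTracker (or NodeManager in 2.0)
    ROLE="worker"
fi
echo "node rank $NODERANK runs as $ROLE"
```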
+ +Basics of HDFS over Lustre +-------------------------- + +Instead of using local disk, designate a Lustre directory to "emulate" +local disk for each compute node. For example, let's say you have 4 +compute nodes. If we create the following paths in Lustre, + +/lustre/myusername/node-0 +/lustre/myusername/node-1 +/lustre/myusername/node-2 +/lustre/myusername/node-3 + +We can give each of these paths to one of the compute nodes, which +they can treat like a local disk. + +Q: Does that mean I have to constantly rebuild HDFS every time I start + a job? + +A: No, using Slurm/MPI ranks, "disk-paths" can be consistently + assigned to nodes so that all your HDFS files from a previous run + can exist on a later run. + +Q: But I'll have to consistently use the same number of cluster nodes? + +A: Generally speaking, yes. If you decide to change the number of + nodes you run on, you may need to rebalance HDFS blocks or fix + HDFS. Imagine you had a traditional Hadoop cluster and were + increasing or decreasing the number of nodes in it. How would you + have to handle it? + + Increasing the number of nodes in your job is generally "more ok" + than decreasing it. HDFS should be able to find your data and + rebalance it. Be careful if you try to scale down the number of + nodes you use w/o handling it first. As far as HDFS is concerned, + you may have "lost data". + +Basic Instructions +------------------ + +1) Download your favorite version of Hadoop off of Apache and install + it into a location where it's accessible on all cluster nodes. + Usually this is on an NFS home directory. Adjust HADOOP_VERSION and + HADOOP_BUILD_HOME appropriately for the install. + +2) Open up sbatch.hadoop and set up Slurm essentials for your job.
+ Here are the essentials for the setup: + + #SBATCH --nodes : Set how many nodes you want in your job + + SBATCH_TIMELIMIT : Set the time for this job to run + + #SBATCH --partition : Set the job partition + + HADOOP_SCRIPTS_HOME : Set where your scripts are. + +3) Now set up the essentials for Hadoop. Here are the essentials: + + HADOOP_MODE : The first time you'll probably want to run w/ + 'terasort' mode just to try things out. Later you may want to run + w/ 'script' or 'interactive' mode, as that is the more likely way + to run. + + HADOOP_FILESYSTEM_MODE : most likely you'll want + "hdfsoverlustre". + + HADOOP_HDFSOVERLUSTRE_PATH : For HDFS over Lustre, you need to set this + + HADOOP_SETUP_TYPE : Are you running MapReduce version 1 or 2? + + HADOOP_VERSION : Make sure your build matches HADOOP_SETUP_TYPE + (i.e. don't say you want MapReduce 1 and point to a Hadoop 2.0 build) + + HADOOP_BUILD_HOME : Where your hadoop code is. Typically in an NFS mount. + + HADOOP_LOCAL_DIR : A small place for conf files and log files local + to each node. Typically a /tmp directory. + + JAVA_HOME : Because Hadoop needs it. + +4) If you are happy with the configuration files provided by this + project, you can use them. If not, change them. If you copy them + to a new directory, adjust HADOOP_CONF_FILES in sbatch.hadoop as + needed. + +5) Run "sbatch -k ./sbatch.hadoop" ... and any other options you see + fit. + +6) Look at your Slurm output file to see your output. There will also + be some notes/instructions/tips in the Slurm output file for + viewing the status of your job, how to interact, etc. + +Advanced +-------- + +There are many advanced options and other scripting possibilities. Please +see sbatch.hadoop for details. + +The scripts make decent guesstimates on what would be best, but it +always depends on your job and your hardware. Many options in +sbatch.hadoop are available to help you adjust your performance.
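Concretely, a minimal sbatch.hadoop setup following the essentials described above might look like the sketch below. All values (node count, partition name, paths, version) are illustrative examples, not recommendations, and HADOOP_SETUP_TYPE is left out because the accepted strings are documented in sbatch.hadoop itself.

```shell
# Slurm essentials (example values only)
#SBATCH --nodes=8
#SBATCH --partition=pbatch
export SBATCH_TIMELIMIT=120

# Hadoop essentials (example values; consult the comments in
# sbatch.hadoop for the exact strings each option accepts)
export HADOOP_MODE="terasort"                 # try terasort mode first
export HADOOP_FILESYSTEM_MODE="hdfsoverlustre"
export HADOOP_HDFSOVERLUSTRE_PATH="/lustre/myusername/hdfsoverlustre"
export HADOOP_VERSION="2.2.0"                 # must match HADOOP_SETUP_TYPE
export HADOOP_BUILD_HOME="$HOME/hadoop-$HADOOP_VERSION"
export HADOOP_LOCAL_DIR="/tmp/$USER/hadoop"   # small, node-local
export JAVA_HOME="/usr/lib/jvm/java"          # site-specific path
```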
+ +Exported Environment Variables +------------------------------ + +The following environment variables are exported by the sbatch +hadoop-run script and may be useful. + +HADOOP_CLUSTER_NODERANK : the rank of the node you are on. It's often + convenient to do something like + +if [ $HADOOP_CLUSTER_NODERANK == 0 ] +then + .... +fi + +to do something on only one node of your allocation. + +HADOOP_CONF_DIR : the directory where configuration files local to the + node are stored. + +HADOOP_LOG_DIR : the directory where log files are stored + +Patching Hadoop +--------------- + +Generally speaking, no modifications to Hadoop are necessary; however, +tweaks may be necessary depending on your environment. In some +environments, passwordless ssh is disabled, therefore requiring a +modification to Hadoop to allow you to use non-ssh mechanisms for +launching daemons. + +I have submitted a patch for adjusting this at this JIRA: + +https://issues.apache.org/jira/browse/HADOOP-9109 + +For those who use mrsh (https://github.com/chaos/mrsh), applying one +of the appropriate patches in the 'patches' directory will allow you +to specify mrsh for launching remote daemons instead of ssh using the +HADOOP_REMOTE_CMD environment variable. + +Special Note on Hadoop 1.0 +-------------------------- + +See this JIRA: + +https://issues.apache.org/jira/browse/HDFS-1835 + +Hadoop 1.0 appears to have more trouble on diskless systems, as +diskless systems have less entropy in them. So you may wish to apply +the patch in the above JIRA if things don't seem to be working. I +noticed the following a lot on my cluster: + +2013-09-19 10:45:37,620 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(apex114.llnl.gov:50010, storageID=, infoPort=50075, ipcPort=50020) + +Notice the storageID is blank; that's because the random number +calculation failed. Subsequently, daemons aren't started and things +generally fail from there.
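Since the root cause is low entropy, a quick way to check whether a node is likely to hit this is to read the Linux kernel's available-entropy counter. This is a generic diagnostic, not part of these scripts:

```shell
# Low entropy on diskless nodes can make Hadoop 1.0's random
# storageID generation fail.  Read the kernel's entropy counter;
# fall back to 0 on systems without this /proc file.
ENTROPY=$(cat /proc/sys/kernel/random/entropy_avail 2>/dev/null || echo 0)
echo "available entropy: $ENTROPY bits"
```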
+ +If you have root privileges, starting up the rngd daemon is another +way to solve this problem without resorting to patching. + +Dependency +---------- + +This project includes a script called 'hadoop-expand-nodes', which is +used for hostrange expansion within the scripts. It is a hack pieced +together from other scripts. + +The preferred mechanism is to use the hostlist command in the +lua-hostlist project. You can find lua-hostlist here : < FILL IN > + +The main hadoop-run script will use 'hadoop-expand-nodes' if it cannot +find 'hostlist' in its path. + +Contributions +------------- + +Feel free to send me patches for new environment variables, new +adjustments, new optimization possibilities, alternate defaults that +you feel are better, etc. + +Any patches you submit to me for fixes will be appreciated. I am by +no means a bash expert ... in fact I'm quite bad at it. diff --git a/README.md b/README.md index b6b71ac29..4897b992d 100644 --- a/README.md +++ b/README.md @@ -2,3 +2,24 @@ magpie ====== Magpie contains a number of scripts for running Hadoop jobs in HPC environments using Slurm and running jobs on top of Lustre + +Basic Idea +========== + +The basic idea behind these scripts is to: + +1) Allocate nodes on a cluster using Slurm + +2) Set up configuration files so the Slurm/MPI rank 0 node + is the "master". All compute nodes will have configuration files + created that point to the node designated as the jobtracker/yarn + server. + +3) Launch Hadoop daemons on all nodes. The Slurm/MPI rank 0 node will + run the JobTracker/NameNode (or Yarn Server in Hadoop 2.0). All + remaining nodes will run DataNodes/Tasktrackers (or NodeManager in + Hadoop 2.0). + +Now you have a mini Hadoop cluster to do whatever you want.
+ +Additional details can be found in the project README file \ No newline at end of file diff --git a/TODO b/TODO new file mode 100644 index 000000000..88387b503 --- /dev/null +++ b/TODO @@ -0,0 +1,15 @@ +Major feature support +--------------------- +Spark +Hbase +UDA + +Minor feature support +--------------------- +o multiple lustre directories for dfs.data.dir + - allowing "parallel" writes to lustre from 1 node + +o document rebalancing if you increase/decrease nodes + - handle include/exclude files for balancing + - or just have script to do it for you + diff --git a/conf/core-site-1.0.xml b/conf/core-site-1.0.xml new file mode 100644 index 000000000..fb29d5745 --- /dev/null +++ b/conf/core-site-1.0.xml @@ -0,0 +1,46 @@ + + + + + + + + + + hadoop.tmp.dir + HADOOPTMPDIR + A base for other temporary directories. + + + + fs.default.name + FSDEFAULT + The name of the default file system. A URI whose + scheme and authority determine the FileSystem implementation. The + uri's scheme determines the config property (fs.SCHEME.impl) naming + the FileSystem implementation class. The uri's authority is used to + determine the host, port, etc. for a filesystem. + + + + io.file.buffer.size + IOBUFFERSIZE + The size of buffer for use in sequence files. + The size of this buffer should probably be a multiple of hardware + page size (4096 on Intel x86), and it determines how much data is + buffered during read and write operations. + + + diff --git a/conf/core-site-2.0.xml b/conf/core-site-2.0.xml new file mode 100644 index 000000000..104cd815c --- /dev/null +++ b/conf/core-site-2.0.xml @@ -0,0 +1,55 @@ + + + + + + + + + + hadoop.tmp.dir + HADOOPTMPDIR + A base for other temporary directories. + + + + fs.defaultFS + FSDEFAULT + The name of the default file system. A URI whose + scheme and authority determine the FileSystem implementation. The + uri's scheme determines the config property (fs.SCHEME.impl) naming + the FileSystem implementation class. 
The uri's authority is used to + determine the host, port, etc. for a filesystem. + + + + io.file.buffer.size + IOBUFFERSIZE + The size of buffer for use in sequence files. + The size of this buffer should probably be a multiple of hardware + page size (4096 on Intel x86), and it determines how much data is + buffered during read and write operations. + + + + file.stream-buffer-size + IOBUFFERSIZE + The size of buffer to stream files. + The size of this buffer should probably be a multiple of hardware + page size (4096 on Intel x86), and it determines how much data is + buffered during read and write operations. + + + diff --git a/conf/hadoop-env-1.0.sh b/conf/hadoop-env-1.0.sh new file mode 100644 index 000000000..b3fc06147 --- /dev/null +++ b/conf/hadoop-env-1.0.sh @@ -0,0 +1,57 @@ +# Set Hadoop-specific environment variables here. + +# The only required environment variable is JAVA_HOME. All others are +# optional. When running a distributed configuration it is best to +# set JAVA_HOME in this file, so that it is correctly defined on +# remote nodes. + +# The java implementation to use. Required. +export JAVA_HOME=HADOOP_JAVA_HOME + +# Extra Java CLASSPATH elements. Optional. +# export HADOOP_CLASSPATH= + +# The maximum amount of heap to use, in MB. Default is 1000. +export HADOOP_HEAPSIZE=HADOOP_DAEMON_HEAP_MAX + +# Extra Java runtime options. Empty by default. 
+# export HADOOP_OPTS=-server + +# Command specific options appended to HADOOP_OPTS when specified +export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS" +export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS" +export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS" +export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS" +export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS" +# export HADOOP_TASKTRACKER_OPTS= +# The following applies to multiple commands (fs, dfs, fsck, distcp etc) +# export HADOOP_CLIENT_OPTS + +# Extra ssh options. Empty by default. +# export HADOOP_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR" + +# Where log files are stored. $HADOOP_HOME/logs by default. +# export HADOOP_LOG_DIR=${HADOOP_HOME}/logs + +# File naming remote slave hosts. $HADOOP_HOME/conf/slaves by default. +# export HADOOP_SLAVES=${HADOOP_HOME}/conf/slaves + +# host:path where hadoop code should be rsync'd from. Unset by default. +# export HADOOP_MASTER=master:/home/$USER/src/hadoop + +# Seconds to sleep between slave commands. Unset by default. This +# can be useful in large clusters, where, e.g., slave rsyncs can +# otherwise arrive faster than the master can service them. +# export HADOOP_SLAVE_SLEEP=0.1 + +# The directory where pid files are stored. /tmp by default. +# NOTE: this should be set to a directory that can only be written to by +# the users that are going to run the hadoop daemons. Otherwise there is +# the potential for a symlink attack. +# export HADOOP_PID_DIR=/var/hadoop/pids + +# A string representing this instance of hadoop. $USER by default. +# export HADOOP_IDENT_STRING=$USER + +# The scheduling priority for daemon processes. See 'man nice'. 
+# export HADOOP_NICENESS=10 diff --git a/conf/hadoop-env-2.0.sh b/conf/hadoop-env-2.0.sh new file mode 100755 index 000000000..e2aac2cef --- /dev/null +++ b/conf/hadoop-env-2.0.sh @@ -0,0 +1,77 @@ +# Copyright 2011 The Apache Software Foundation +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# Set Hadoop-specific environment variables here. + +# The only required environment variable is JAVA_HOME. All others are +# optional. When running a distributed configuration it is best to +# set JAVA_HOME in this file, so that it is correctly defined on +# remote nodes. + +# The java implementation to use. +export JAVA_HOME=HADOOP_JAVA_HOME + +# The jsvc implementation to use. Jsvc is required to run secure datanodes. +#export JSVC_HOME=${JSVC_HOME} + +export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"} + +# Extra Java CLASSPATH elements. Automatically insert capacity-scheduler. +for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do + if [ "$HADOOP_CLASSPATH" ]; then + export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f + else + export HADOOP_CLASSPATH=$f + fi +done + +# The maximum amount of heap to use, in MB. Default is 1000. 
+export HADOOP_HEAPSIZE=HADOOP_DAEMON_HEAP_MAX +#export HADOOP_NAMENODE_INIT_HEAPSIZE="" + +# Extra Java runtime options. Empty by default. +export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true" + +# Command specific options appended to HADOOP_OPTS when specified +export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS" +export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS" + +export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS" + +# The following applies to multiple commands (fs, dfs, fsck, distcp etc) +export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS" +#HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData $HADOOP_JAVA_PLATFORM_OPTS" + +# On secure datanodes, user to run the datanode as after dropping privileges +export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER} + +# Where log files are stored. $HADOOP_HOME/logs by default. +#export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER + +# Where log files are stored in the secure data environment. +export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER} + +# The directory where pid files are stored. /tmp by default. +# NOTE: this should be set to a directory that can only be written to by +# the user that will run the hadoop daemons. Otherwise there is the +# potential for a symlink attack. +export HADOOP_PID_DIR=${HADOOP_PID_DIR} +export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR} + +# A string representing this instance of hadoop. $USER by default. 
+export HADOOP_IDENT_STRING=$USER diff --git a/conf/hdfs-site-1.0.xml b/conf/hdfs-site-1.0.xml new file mode 100644 index 000000000..7ddd71d11 --- /dev/null +++ b/conf/hdfs-site-1.0.xml @@ -0,0 +1,80 @@ + + + + + + + + + + dfs.secondary.http.address + HADOOP_MASTER_HOST:50090 + + The secondary namenode http server address and port. + If the port is 0 then the server will start on a free port. + + + + + dfs.name.dir + ${hadoop.tmp.dir}/dfs/name + Determines where on the local filesystem the DFS name node + should store the name table(fsimage). If this is a comma-delimited list + of directories then the name table is replicated in all of the + directories, for redundancy. + + + + dfs.data.dir + ${hadoop.tmp.dir}/dfs/data + Determines where on the local filesystem an DFS data node + should store its blocks. If this is a comma-delimited + list of directories, then data will be stored in all named + directories, typically on different devices. + Directories that do not exist are ignored. + + + + + dfs.replication + HDFSREPLICATION + Default block replication. + The actual number of replications can be specified when the file is created. + The default is used if replication is not specified in create time. + + + + + dfs.block.size + HDFSBLOCKSIZE + The default block size for new files. + + + + + dfs.namenode.handler.count + HDFSNAMENODEHANDLERCLOUNT + The number of server threads for the namenode. + + + + + dfs.datanode.handler.count + HDFSDATANODEHANDLERCLOUNT + The number of server threads for the datanode. + + + + diff --git a/conf/hdfs-site-2.0.xml b/conf/hdfs-site-2.0.xml new file mode 100644 index 000000000..273209745 --- /dev/null +++ b/conf/hdfs-site-2.0.xml @@ -0,0 +1,95 @@ + + + + + + + + + + dfs.namenode.secondary.http-address + HADOOP_MASTER_HOST:50090 + + The secondary namenode http server address and port. + If the port is 0 then the server will start on a free port. 
+ + + + + dfs.namenode.name.dir + file://${hadoop.tmp.dir}/dfs/name + Determines where on the local filesystem the DFS name node + should store the name table(fsimage). If this is a comma-delimited list + of directories then the name table is replicated in all of the + directories, for redundancy. + + + + dfs.datanode.data.dir + file://${hadoop.tmp.dir}/dfs/data + Determines where on the local filesystem an DFS data node + should store its blocks. If this is a comma-delimited + list of directories, then data will be stored in all named + directories, typically on different devices. + Directories that do not exist are ignored. + + + + + dfs.replication + HDFSREPLICATION + Default block replication. + The actual number of replications can be specified when the file is created. + The default is used if replication is not specified in create time. + + + + + dfs.blocksize + HDFSBLOCKSIZE + + The default block size for new files, in bytes. + You can use the following suffix (case insensitive): + k(kilo), m(mega), g(giga), t(tera), p(peta), e(exa) to specify the size (such as 128k, 512m, 1g, etc.), + Or provide complete size in bytes (such as 134217728 for 128 MB). + + + + + + dfs.namenode.handler.count + HDFSNAMENODEHANDLERCLOUNT + The number of server threads for the namenode. + + + + + dfs.datanode.handler.count + HDFSDATANODEHANDLERCLOUNT + The number of server threads for the datanode. + + + + + dfs.stream-buffer-size + IOBUFFERSIZE + The size of buffer to stream files. + The size of this buffer should probably be a multiple of hardware + page size (4096 on Intel x86), and it determines how much data is + buffered during read and write operations. 
+ + + + diff --git a/conf/log4j.properties b/conf/log4j.properties new file mode 100644 index 000000000..7fd5596fc --- /dev/null +++ b/conf/log4j.properties @@ -0,0 +1,224 @@ +# Copyright 2011 The Apache Software Foundation +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# Define some default values that can be overridden by system properties +hadoop.root.logger=INFO, stdout +hadoop.log.dir=. +hadoop.log.file=hadoop.log + +# Define the root logger to the system property "hadoop.root.logger". +log4j.rootLogger=${hadoop.root.logger}, EventCounter + +# Logging Threshold +log4j.threshold=ALL + +# Null Appender +log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender + +# +# Rolling File Appender - cap space usage at 5gb. 
+# +hadoop.log.maxfilesize=256MB +hadoop.log.maxbackupindex=20 +log4j.appender.RFA=org.apache.log4j.RollingFileAppender +log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file} + +log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize} +log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex} + +log4j.appender.RFA.layout=org.apache.log4j.PatternLayout + +# Pattern format: Date LogLevel LoggerName LogMessage +log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n +# Debugging Pattern format +#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n + + +# +# Daily Rolling File Appender +# + +log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender +log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file} + +# Rollover at midnight +log4j.appender.DRFA.DatePattern=.yyyy-MM-dd + +# 30-day backup +#log4j.appender.DRFA.MaxBackupIndex=30 +log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout + +# Pattern format: Date LogLevel LoggerName LogMessage +log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n +# Debugging Pattern format +#log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n + + +# +# console +# Add "console" to rootlogger above if you want to use this +# + +log4j.appender.console=org.apache.log4j.ConsoleAppender +log4j.appender.console.target=System.err +log4j.appender.console.layout=org.apache.log4j.PatternLayout +log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n + +# +# TaskLog Appender +# + +#Default values +hadoop.tasklog.taskid=null +hadoop.tasklog.iscleanup=false +hadoop.tasklog.noKeepSplits=4 +hadoop.tasklog.totalLogFileSize=100 +hadoop.tasklog.purgeLogSplits=true +hadoop.tasklog.logsRetainHours=12 + +log4j.appender.TLA=org.apache.hadoop.mapred.TaskLogAppender +log4j.appender.TLA.taskId=${hadoop.tasklog.taskid} +log4j.appender.TLA.isCleanup=${hadoop.tasklog.iscleanup}
+log4j.appender.TLA.totalLogFileSize=${hadoop.tasklog.totalLogFileSize} + +log4j.appender.TLA.layout=org.apache.log4j.PatternLayout +log4j.appender.TLA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n + +# +# HDFS block state change log from block manager +# +# Uncomment the following to suppress normal block state change +# messages from BlockManager in NameNode. +#log4j.logger.BlockStateChange=WARN + +# +#Security appender +# +hadoop.security.logger=INFO,NullAppender +hadoop.security.log.maxfilesize=256MB +hadoop.security.log.maxbackupindex=20 +log4j.category.SecurityLogger=${hadoop.security.logger} +hadoop.security.log.file=SecurityAuth-${user.name}.audit +log4j.appender.RFAS=org.apache.log4j.RollingFileAppender +log4j.appender.RFAS.File=${hadoop.log.dir}/${hadoop.security.log.file} +log4j.appender.RFAS.layout=org.apache.log4j.PatternLayout +log4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n +log4j.appender.RFAS.MaxFileSize=${hadoop.security.log.maxfilesize} +log4j.appender.RFAS.MaxBackupIndex=${hadoop.security.log.maxbackupindex} + +# +# Daily Rolling Security appender +# +log4j.appender.DRFAS=org.apache.log4j.DailyRollingFileAppender +log4j.appender.DRFAS.File=${hadoop.log.dir}/${hadoop.security.log.file} +log4j.appender.DRFAS.layout=org.apache.log4j.PatternLayout +log4j.appender.DRFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n +log4j.appender.DRFAS.DatePattern=.yyyy-MM-dd + +# +# hdfs audit logging +# +hdfs.audit.logger=INFO,NullAppender +hdfs.audit.log.maxfilesize=256MB +hdfs.audit.log.maxbackupindex=20 +log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger} +log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false +log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender +log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log +log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout +log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n 
+log4j.appender.RFAAUDIT.MaxFileSize=${hdfs.audit.log.maxfilesize} +log4j.appender.RFAAUDIT.MaxBackupIndex=${hdfs.audit.log.maxbackupindex} + +# +# mapred audit logging +# +mapred.audit.logger=INFO,NullAppender +mapred.audit.log.maxfilesize=256MB +mapred.audit.log.maxbackupindex=20 +log4j.logger.org.apache.hadoop.mapred.AuditLogger=${mapred.audit.logger} +log4j.additivity.org.apache.hadoop.mapred.AuditLogger=false +log4j.appender.MRAUDIT=org.apache.log4j.RollingFileAppender +log4j.appender.MRAUDIT.File=${hadoop.log.dir}/mapred-audit.log +log4j.appender.MRAUDIT.layout=org.apache.log4j.PatternLayout +log4j.appender.MRAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n +log4j.appender.MRAUDIT.MaxFileSize=${mapred.audit.log.maxfilesize} +log4j.appender.MRAUDIT.MaxBackupIndex=${mapred.audit.log.maxbackupindex} + +# Custom Logging levels + +#log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG +#log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG +#log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=DEBUG + +# Jets3t library +log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR + +# +# Event Counter Appender +# Sends counts of logging messages at different severity levels to Hadoop Metrics. 
+# +log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter + +# +# Job Summary Appender +# +# Use following logger to send summary to separate file defined by +# hadoop.mapreduce.jobsummary.log.file : +# hadoop.mapreduce.jobsummary.logger=INFO,JSA +# +hadoop.mapreduce.jobsummary.logger=${hadoop.root.logger} +hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log +hadoop.mapreduce.jobsummary.log.maxfilesize=256MB +hadoop.mapreduce.jobsummary.log.maxbackupindex=20 +log4j.appender.JSA=org.apache.log4j.RollingFileAppender +log4j.appender.JSA.File=${hadoop.log.dir}/${hadoop.mapreduce.jobsummary.log.file} +log4j.appender.JSA.MaxFileSize=${hadoop.mapreduce.jobsummary.log.maxfilesize} +log4j.appender.JSA.MaxBackupIndex=${hadoop.mapreduce.jobsummary.log.maxbackupindex} +log4j.appender.JSA.layout=org.apache.log4j.PatternLayout +log4j.appender.JSA.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n +log4j.logger.org.apache.hadoop.mapred.JobInProgress$JobSummary=${hadoop.mapreduce.jobsummary.logger} +log4j.additivity.org.apache.hadoop.mapred.JobInProgress$JobSummary=false + +# +# Yarn ResourceManager Application Summary Log +# +# Set the ResourceManager summary log filename +yarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log +# Set the ResourceManager summary log level and appender +yarn.server.resourcemanager.appsummary.logger=${hadoop.root.logger} +#yarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY + +# To enable AppSummaryLogging for the RM, +# set yarn.server.resourcemanager.appsummary.logger to +# ,RMSUMMARY in hadoop-env.sh + +# Appender for ResourceManager Application Summary Log +# Requires the following properties to be set +# - hadoop.log.dir (Hadoop Log directory) +# - yarn.server.resourcemanager.appsummary.log.file (resource manager app summary log filename) +# - yarn.server.resourcemanager.appsummary.logger (resource manager app summary log level and appender) + 
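The comment block above says to enable ResourceManager app-summary logging from hadoop-env.sh. One way to do that (a sketch — the surrounding variable contents are hypothetical, but the property name matches `yarn.server.resourcemanager.appsummary.logger` defined here) is to append the logger override to the daemon's JVM options:

```shell
# Append an app-summary logger override to the YARN daemon options
# (hypothetical starting value; normally YARN_OPTS is built up by yarn-env.sh).
YARN_OPTS=""
YARN_OPTS="${YARN_OPTS} -Dyarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY"
export YARN_OPTS
echo "${YARN_OPTS}"
```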
+log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger} +log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=false +log4j.appender.RMSUMMARY=org.apache.log4j.RollingFileAppender +log4j.appender.RMSUMMARY.File=${hadoop.log.dir}/${yarn.server.resourcemanager.appsummary.log.file} +log4j.appender.RMSUMMARY.MaxFileSize=256MB +log4j.appender.RMSUMMARY.MaxBackupIndex=20 +log4j.appender.RMSUMMARY.layout=org.apache.log4j.PatternLayout +log4j.appender.RMSUMMARY.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n diff --git a/conf/mapred-site-1.0.xml b/conf/mapred-site-1.0.xml new file mode 100644 index 000000000..3daad6bf9 --- /dev/null +++ b/conf/mapred-site-1.0.xml @@ -0,0 +1,178 @@ + + + + + + + + + io.sort.factor + IOSORTFACTOR + The number of streams to merge at once while sorting + files. This determines the number of open file handles. + + + + io.sort.mb + IOSORTMB + The total amount of buffer memory to use while sorting + files, in megabytes. By default, gives each merge stream 1MB, which + should minimize seeks. + + + + mapred.job.tracker + HADOOP_MASTER_HOST:54311 + The host and port that the MapReduce job tracker runs + at. If "local", then jobs are run in-process as a single map + and reduce task. + + + + + + mapred.job.tracker.handler.count + JOBTRACKERHANDLERCOUNT + + The number of server threads for the JobTracker. This should be roughly + 4% of the number of tasktracker nodes. + + + + + mapred.local.dir + LOCALSTOREDIR/mapred/local + The local directory where MapReduce stores intermediate + data files. May be a comma-separated list of + directories on different devices in order to spread disk i/o. + Directories that do not exist are ignored. + + + + + mapred.system.dir + ${hadoop.tmp.dir}/mapred/system + The directory where MapReduce stores control files. 
+ + + + + mapreduce.jobtracker.staging.root.dir + ${hadoop.tmp.dir}/mapred/staging + The root of the staging area for users' job files + In practice, this should be the directory where users' home + directories are located (usually /user) + + + + + mapred.temp.dir + ${hadoop.tmp.dir}/mapred/temp + A shared directory for temporary files. + + + + + + mapred.reduce.parallel.copies + MRPRALLELCOPIES + The default number of parallel transfers run by reduce + during the copy(shuffle) phase. + + + + + mapred.child.java.opts + -XmxALLCHILDHEAPSIZEm + Java opts for the task tracker child processes. + The following symbol, if present, will be interpolated: @taskid@ is replaced + by current TaskID. Any other occurrences of '@' will go unchanged. + For example, to enable verbose gc logging to a file named for the taskid in + /tmp and to set the heap maximum to be a gigabyte, pass a 'value' of: + -Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc + + The configuration variable mapred.child.ulimit can be used to control the + maximum virtual memory of the child processes. + + + + + mapred.reduce.slowstart.completed.maps + MRSLOWSTART + Fraction of the number of maps in the job which should be + complete before reduces are scheduled for the job. + + + + + mapred.map.tasks + DEFAULTMAPTASKS + The default number of map tasks per job. + Ignored when mapred.job.tracker is "local". + + + + + mapred.reduce.tasks + DEFAULTREDUCETASKS + The default number of reduce tasks per job. Typically set to 99% + of the cluster's reduce capacity, so that if a node fails the reduces can + still be executed in a single wave. + Ignored when mapred.job.tracker is "local". + + + + + mapred.tasktracker.map.tasks.maximum + MAXMAPTASKS + The maximum number of map tasks that will be run + simultaneously by a task tracker. + + + + + mapred.tasktracker.reduce.tasks.maximum + MAXREDUCETASKS + The maximum number of reduce tasks that will be run + simultaneously by a task tracker. 
+ + + + + mapreduce.task.tmp.dir + ${fs.default.name}/mapred/temp + To set the value of tmp directory for map and reduce tasks. + If the value is an absolute path, it is directly assigned. Otherwise, it is + prepended with task's working directory. The java tasks are executed with + option -Djava.io.tmpdir='the absolute path of the tmp dir'. Pipes and + streaming are set with environment variable, + TMPDIR='the absolute path of the tmp dir' + + + + + + tasktracker.http.threads + 80 + The number of worker threads that for the http server. This is + used for map output fetching + + + + + mapred.output.compress + HADOOPCOMPRESSION + Should the job outputs be compressed? + + + + + mapred.compress.map.output + HADOOPCOMPRESSION + Should the outputs of the maps be compressed before being + sent across the network. Uses SequenceFile compression. + + + + + diff --git a/conf/mapred-site-2.0.xml b/conf/mapred-site-2.0.xml new file mode 100644 index 000000000..ffbfb7b23 --- /dev/null +++ b/conf/mapred-site-2.0.xml @@ -0,0 +1,200 @@ + + + + + + + + + mapreduce.task.io.sort.factor + IOSORTFACTOR + The number of streams to merge at once while sorting + files. This determines the number of open file handles. + + + + mapreduce.task.io.sort.mb + IOSORTMB + The total amount of buffer memory to use while sorting + files, in megabytes. By default, gives each merge stream 1MB, which + should minimize seeks. + + + + mapreduce.cluster.local.dir + LOCALSTOREDIR/mapred/local + The local directory where MapReduce stores intermediate + data files. May be a comma-separated list of + directories on different devices in order to spread disk i/o. + Directories that do not exist are ignored. + + + + + mapreduce.jobtracker.system.dir + ${hadoop.tmp.dir}/mapred/system + The directory where MapReduce stores control files. 
+ + + + + mapreduce.jobtracker.staging.root.dir + ${hadoop.tmp.dir}/mapred/staging + The root of the staging area for users' job files + In practice, this should be the directory where users' home + directories are located (usually /user) + + + + + mapreduce.cluster.temp.dir + ${hadoop.tmp.dir}/mapred/temp + A shared directory for temporary files. + + + + + + mapreduce.reduce.shuffle.parallelcopies + MRPRALLELCOPIES + The default number of parallel transfers run by reduce + during the copy(shuffle) phase. + + + + + mapred.child.java.opts + -XmxALLCHILDHEAPSIZEm + Java opts for the task tracker child processes. + The following symbol, if present, will be interpolated: @taskid@ is replaced + by current TaskID. Any other occurrences of '@' will go unchanged. + For example, to enable verbose gc logging to a file named for the taskid in + /tmp and to set the heap maximum to be a gigabyte, pass a 'value' of: + -Xmx2560m -verbose:gc -Xloggc:/tmp/@taskid@.gc + + Usage of -Djava.library.path can cause programs to no longer function if + hadoop native libraries are used. These values should instead be set as part + of LD_LIBRARY_PATH in the map / reduce JVM env using the mapreduce.map.env and + mapreduce.reduce.env config settings. + + + + + mapred.map.child.java.opts + -XmxMAPCHILDHEAPSIZEm + + + + mapreduce.map.memory.mb + MAPCONTAINERMB + + + + mapred.reduce.child.java.opts + -XmxREDUCECHILDHEAPSIZEm + + + + mapreduce.reduce.memory.mb + REDUCECONTAINERMB + + + + mapreduce.job.maps + DEFAULTMAPTASKS + The default number of map tasks per job. + Ignored when mapreduce.jobtracker.address is "local". + + + + + mapreduce.job.reduces + DEFAULTREDUCETASKS + The default number of reduce tasks per job. Typically set to 99% + of the cluster's reduce capacity, so that if a node fails the reduces can + still be executed in a single wave. + Ignored when mapreduce.jobtracker.address is "local". 
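The child-JVM `-Xmx` placeholders and the `mapreduce.map.memory.mb`/`mapreduce.reduce.memory.mb` container placeholders above must be filled in consistently. A common sizing rule of thumb (an assumption here, not a Hadoop requirement) gives the child JVM roughly 80% of its container:

```shell
# Hypothetical container sizes in MB; heap = ~80% of container (rule of thumb).
MAP_CONTAINER_MB=1536
REDUCE_CONTAINER_MB=3072
MAP_CHILD_HEAP_MB=$((MAP_CONTAINER_MB * 8 / 10))
REDUCE_CHILD_HEAP_MB=$((REDUCE_CONTAINER_MB * 8 / 10))
echo "mapred.map.child.java.opts=-Xmx${MAP_CHILD_HEAP_MB}m"
echo "mapred.reduce.child.java.opts=-Xmx${REDUCE_CHILD_HEAP_MB}m"
```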
+ + + + + mapreduce.job.reduce.slowstart.completedmaps + MRSLOWSTART + Fraction of the number of maps in the job which should be + complete before reduces are scheduled for the job. + + + + + mapreduce.task.tmp.dir + ${fs.default.name}/mapred/temp + To set the value of tmp directory for map and reduce tasks. + If the value is an absolute path, it is directly assigned. Otherwise, it is + prepended with task's working directory. The java tasks are executed with + option -Djava.io.tmpdir='the absolute path of the tmp dir'. Pipes and + streaming are set with environment variable, + TMPDIR='the absolute path of the tmp dir' + + + + + mapreduce.framework.name + yarn + The runtime framework for executing MapReduce jobs. + Can be one of local, classic or yarn. + + + + + + mapreduce.jobtracker.handler.count + JOBTRACKERHANDLERCOUNT + + The number of server threads for the JobTracker. This should be roughly + 4% of the number of tasktracker nodes. + + + + + + mapreduce.tasktracker.http.threads + 80 + The number of worker threads that for the http server. This is + used for map output fetching + + + + + yarn.app.mapreduce.am.staging-dir + ${hadoop.tmp.dir}/yarn/ + The staging dir used while submitting jobs. + + + + + mapreduce.job.reduce.shuffle.consumer.plugin.class + org.apache.hadoop.mapreduce.task.reduce.Shuffle + + Name of the class whose instance will be used + to send shuffle requests by reducetasks of this job. + The class must be an instance of org.apache.hadoop.mapred.ShuffleConsumerPlugin. + + + + + mapreduce.output.fileoutputformat.compress + HADOOPCOMPRESSION + Should the job outputs be compressed? + + + + + mapreduce.map.output.compress + HADOOPCOMPRESSION + Should the outputs of the maps be compressed before being + sent across the network. Uses SequenceFile compression. 
+ + + + + diff --git a/conf/yarn-env-2.0.sh b/conf/yarn-env-2.0.sh new file mode 100755 index 000000000..8c73f268f --- /dev/null +++ b/conf/yarn-env-2.0.sh @@ -0,0 +1,112 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# User for YARN daemons +export HADOOP_YARN_USER=${HADOOP_YARN_USER:-yarn} + +# resolve links - $0 may be a softlink +export YARN_CONF_DIR="${YARN_CONF_DIR:-$HADOOP_YARN_HOME/conf}" + +# some Java parameters +# export JAVA_HOME=/home/y/libexec/jdk1.6.0/ +export JAVA_HOME=HADOOP_JAVA_HOME + +if [ "$JAVA_HOME" != "" ]; then + #echo "run java in $JAVA_HOME" + JAVA_HOME=$JAVA_HOME +fi + +if [ "$JAVA_HOME" = "" ]; then + echo "Error: JAVA_HOME is not set." + exit 1 +fi + +JAVA=$JAVA_HOME/bin/java +JAVA_HEAP_MAX=-Xmx1000m + +# For setting YARN specific HEAP sizes please use this +# Parameter and set appropriately +YARN_HEAPSIZE=HADOOP_DAEMON_HEAP_MAX + +# check envvars which might override default args +if [ "$YARN_HEAPSIZE" != "" ]; then + JAVA_HEAP_MAX="-Xmx""$YARN_HEAPSIZE""m" +fi + +# Resource Manager specific parameters + +# Specify the max Heapsize for the ResourceManager using a numerical value +# in the scale of MB. For example, to specify an jvm option of -Xmx1000m, set +# the value to 1000. 
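The heap handling in yarn-env-2.0.sh reduces to a small sketch: a numeric `YARN_HEAPSIZE` (in MB), if set, is turned into an `-Xmx` flag, otherwise a built-in default stands (the override value below is hypothetical):

```shell
# Mirror of the YARN_HEAPSIZE -> JAVA_HEAP_MAX logic in yarn-env-2.0.sh.
JAVA_HEAP_MAX=-Xmx1000m   # built-in default
YARN_HEAPSIZE=2048        # hypothetical override, in MB
if [ "$YARN_HEAPSIZE" != "" ]; then
  JAVA_HEAP_MAX="-Xmx""$YARN_HEAPSIZE""m"
fi
echo "$JAVA_HEAP_MAX"
```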
+# This value will be overridden by an Xmx setting specified in either YARN_OPTS +# and/or YARN_RESOURCEMANAGER_OPTS. +# If not specified, the default value will be picked from either YARN_HEAPMAX +# or JAVA_HEAP_MAX with YARN_HEAPMAX as the preferred option of the two. +#export YARN_RESOURCEMANAGER_HEAPSIZE=1000 + +# Specify the JVM options to be used when starting the ResourceManager. +# These options will be appended to the options specified as YARN_OPTS +# and therefore may override any similar flags set in YARN_OPTS +#export YARN_RESOURCEMANAGER_OPTS= + +# Node Manager specific parameters + +# Specify the max Heapsize for the NodeManager using a numerical value +# in the scale of MB. For example, to specify a JVM option of -Xmx1000m, set +# the value to 1000. +# This value will be overridden by an Xmx setting specified in either YARN_OPTS +# and/or YARN_NODEMANAGER_OPTS. +# If not specified, the default value will be picked from either YARN_HEAPMAX +# or JAVA_HEAP_MAX with YARN_HEAPMAX as the preferred option of the two. +#export YARN_NODEMANAGER_HEAPSIZE=1000 + +# Specify the JVM options to be used when starting the NodeManager.
+# These options will be appended to the options specified as YARN_OPTS +# and therefore may override any similar flags set in YARN_OPTS +#export YARN_NODEMANAGER_OPTS= + +# so that filenames w/ spaces are handled correctly in loops below +IFS= + + +# default log directory & file +if [ "$YARN_LOG_DIR" = "" ]; then + YARN_LOG_DIR="$HADOOP_YARN_HOME/logs" +fi +if [ "$YARN_LOGFILE" = "" ]; then + YARN_LOGFILE='yarn.log' +fi + +# default policy file for service-level authorization +if [ "$YARN_POLICYFILE" = "" ]; then + YARN_POLICYFILE="hadoop-policy.xml" +fi + +# restore ordinary behaviour +unset IFS + + +YARN_OPTS="$YARN_OPTS -Dhadoop.log.dir=$YARN_LOG_DIR" +YARN_OPTS="$YARN_OPTS -Dyarn.log.dir=$YARN_LOG_DIR" +YARN_OPTS="$YARN_OPTS -Dhadoop.log.file=$YARN_LOGFILE" +YARN_OPTS="$YARN_OPTS -Dyarn.log.file=$YARN_LOGFILE" +YARN_OPTS="$YARN_OPTS -Dyarn.home.dir=$YARN_COMMON_HOME" +YARN_OPTS="$YARN_OPTS -Dyarn.id.str=$YARN_IDENT_STRING" +YARN_OPTS="$YARN_OPTS -Dhadoop.root.logger=${YARN_ROOT_LOGGER:-INFO,console}" +YARN_OPTS="$YARN_OPTS -Dyarn.root.logger=${YARN_ROOT_LOGGER:-INFO,console}" +if [ "x$JAVA_LIBRARY_PATH" != "x" ]; then + YARN_OPTS="$YARN_OPTS -Djava.library.path=$JAVA_LIBRARY_PATH" +fi +YARN_OPTS="$YARN_OPTS -Dyarn.policy.file=$YARN_POLICYFILE" diff --git a/conf/yarn-site-2.0.xml b/conf/yarn-site-2.0.xml new file mode 100644 index 000000000..d768295d6 --- /dev/null +++ b/conf/yarn-site-2.0.xml @@ -0,0 +1,91 @@ + + + + + + + The address of the applications manager interface in the RM. + yarn.resourcemanager.address + HADOOP_MASTER_HOST:8032 + + + + The address of the scheduler interface. + yarn.resourcemanager.scheduler.address + HADOOP_MASTER_HOST:8030 + + + + The address of the RM web application. + yarn.resourcemanager.webapp.address + HADOOP_MASTER_HOST:8088 + + + + yarn.resourcemanager.resource-tracker.address + HADOOP_MASTER_HOST:8031 + + + + The address of the RM admin interface. 
+ yarn.resourcemanager.admin.address + HADOOP_MASTER_HOST:8033 + + + + The minimum allocation for every container request at the RM, + in MBs. Memory requests lower than this won't take effect, + and the specified value will get allocated at minimum. + yarn.scheduler.minimum-allocation-mb + YARNMINCONTAINER + + + + The maximum allocation for every container request at the RM, + in MBs. Memory requests higher than this won't take effect, + and will get capped to this value. + yarn.scheduler.maximum-allocation-mb + YARNMAXCONTAINER + + + + Amount of physical memory, in MB, that can be allocated + for containers. + yarn.nodemanager.resource.memory-mb + YARNRESOURCEMEMORY + + + + yarn.nodemanager.aux-services + mapreduce.shuffle + + + + yarn.nodemanager.aux-services.mapreduce.shuffle.class + org.apache.hadoop.mapred.ShuffleHandler + + + + List of directories to store localized files in. An + application's localized file directory will be found in: + ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}. + Individual containers' work directories, called container_${contid}, will + be subdirectories of this. + + yarn.nodemanager.local-dirs + LOCALSTOREDIR/yarn-nm/ + + + + yarn.scheduler.capacity.root.queues + default + The queues at this level (root is the root queue).
+ + + + + yarn.scheduler.capacity.root.default.capacity + 100 + + + diff --git a/create-dist.sh b/create-dist.sh new file mode 100755 index 000000000..93f9ac0c9 --- /dev/null +++ b/create-dist.sh @@ -0,0 +1,41 @@ +#!/bin/sh + +version="1.0" +distdir=magpie-${version} +tarball=${distdir}.tar.gz + +rm -rf ${distdir} +rm -rf ${tarball} +mkdir ${distdir} + +cp -a --parents \ + COPYING \ + DISCLAIMER \ + README \ + TODO \ + hadoop-example-environment-extra \ + hadoop-example-job \ + hadoop-example-post-job \ + hadoop-example-pre-job \ + hadoop-expand-nodes \ + hadoop-gather \ + hadoop-post-run \ + hadoop-run \ + sbatch.hadoop \ + conf/core-site-1.0.xml \ + conf/core-site-2.0.xml \ + conf/hdfs-site-1.0.xml \ + conf/hdfs-site-2.0.xml \ + conf/mapred-site-1.0.xml \ + conf/mapred-site-2.0.xml \ + conf/yarn-env-2.0.sh \ + conf/hadoop-env-1.0.sh \ + conf/hadoop-env-2.0.sh \ + conf/yarn-site-2.0.xml \ + conf/log4j.properties \ + patches/hadoop-1.2.1-9109.patch \ + patches/hadoop-2.1.0-beta-9109.patch \ + ${distdir} + +tar czf ${tarball} ${distdir} +rm -rf ${distdir} diff --git a/hadoop-example-environment-extra b/hadoop-example-environment-extra new file mode 100755 index 000000000..a7b3cd8e9 --- /dev/null +++ b/hadoop-example-environment-extra @@ -0,0 +1,5 @@ +# This is an example of extra stuff you may wish to put into your +# environment. + +ulimit -n 8192 +ulimit -u 4096 diff --git a/hadoop-example-job b/hadoop-example-job new file mode 100755 index 000000000..e1678ddd6 --- /dev/null +++ b/hadoop-example-job @@ -0,0 +1,26 @@ +#!/bin/sh + +# This script executes a teragen and terasort with Hadoop 2.0. It is +# an example of how you can set up a script to run a Hadoop job. It is +# set via the HADOOP_SCRIPT_PATH environment variable and setting the +# HADOOP_MODE to 'script'. See the sbatch.hadoop file for details.
+# +# Please adjust to whatever you would want to do with Hadoop + +cd ${HADOOP_BUILD_HOME} + +command="bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-$HADOOP_VERSION.jar teragen 50000000 terasort-example-job-teragen" +echo "Running $command" >&2 +$command + +command="bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-$HADOOP_VERSION.jar terasort -Dmapred.reduce.tasks=1 terasort-example-job-teragen terasort-example-job-sort" + +echo "Running $command" >&2 +$command + +command="bin/hadoop fs -rm -r terasort-example-job-teragen" +$command +command="bin/hadoop fs -rm -r terasort-example-job-sort" +$command + +exit 0 \ No newline at end of file diff --git a/hadoop-example-post-job b/hadoop-example-post-job new file mode 100755 index 000000000..0a6e51f25 --- /dev/null +++ b/hadoop-example-post-job @@ -0,0 +1,17 @@ +#!/bin/sh + +# This script is an example of something you might like to do after a +# job is run. It is set by the HADOOP_POST_JOB_RUN environment +# variable. See the sbatch.hadoop file for details. + +# Cleanup some stuff +# rm -rf /ssd/tmp1/achu/ + +# Cleaning up on networked directory, only do on one node +if [ "${HADOOP_CLUSTER_NODERANK}" == "0" ] +then + # rm -rf /p/lscratchg/achu/hadoop/hdfsoverlustre/ + : # no-op so the 'then' body is never empty; replace with real cleanup +fi + +exit 0 \ No newline at end of file diff --git a/hadoop-example-pre-job b/hadoop-example-pre-job new file mode 100755 index 000000000..88be864b8 --- /dev/null +++ b/hadoop-example-pre-job @@ -0,0 +1,34 @@ +#!/bin/sh + +# This script is an example of something you might like to do before a +# job is run. It is set by the HADOOP_PRE_JOB_RUN environment +# variable. See the sbatch.hadoop file for details.
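Both example hooks use the same guard so that cluster-wide work runs on exactly one node of the allocation. The pattern in isolation (the rank value is hardcoded here for illustration; hadoop-run normally exports `HADOOP_CLUSTER_NODERANK` from `SLURM_NODEID`):

```shell
# Run a one-time step only on the rank-0 node of the allocation.
HADOOP_CLUSTER_NODERANK=0   # normally exported by hadoop-run
if [ "${HADOOP_CLUSTER_NODERANK}" = "0" ]
then
    echo "one-time step runs here"
else
    echo "skipped on this node"
fi
```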
+ +# Get some debugging info before I run + +# Run only on one node, no need to do it on all nodes +if [ "${HADOOP_CLUSTER_NODERANK}" == "0" ] +then + ulimit -a + + # Cat conf files for documentation + echo "**** mapred-site.xml ****" + cat ${HADOOP_CONF_DIR}/mapred-site.xml + + echo "**** core-site.xml ****" + cat ${HADOOP_CONF_DIR}/core-site.xml + + if [ -f "${HADOOP_CONF_DIR}/hdfs-site.xml" ] + then + echo "**** hdfs-site.xml ****" + cat ${HADOOP_CONF_DIR}/hdfs-site.xml + fi + + if [ -f "${HADOOP_CONF_DIR}/yarn-site.xml" ] + then + echo "**** yarn-site.xml ****" + cat ${HADOOP_CONF_DIR}/yarn-site.xml + fi +fi + +exit 0 \ No newline at end of file diff --git a/hadoop-expand-nodes b/hadoop-expand-nodes new file mode 100755 index 000000000..51e579f7e --- /dev/null +++ b/hadoop-expand-nodes @@ -0,0 +1,75 @@ +#!/usr/bin/perl -w + +# Most of this code is cut & pasted from Hostlist.pm in the Genders +# project. See http://sourceforge.net/projects/genders/ + +use strict; + +# expand_quadrics_range +# +# expand nodelist in quadrics form +# +sub expand_quadrics_range +{ + my ($list) = @_; + my ($pfx, $ranges, $suffix) = split(/[\[\]]/, $list, 3); + + return $list if (!defined $ranges); + + return map {"$pfx$_$suffix"} + map { s/^(\d+)-(\d+)$/"$1".."$2"/ ? eval : $_ } + split(/,|:/, $ranges); +} + +# expand() +# turn a hostname range into a list of hostnames. Try to autodetect whether +# a quadrics-style range or a normal hostname range was passed in. 
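For a simple (non-quadrics) range such as `mynode[0-3]`, the list below is what hadoop-expand-nodes ultimately prints. This sketch only illustrates the output shape, with the prefix and bounds hardcoded; the Perl code handles suffixes, comma lists, and quadrics-style ranges:

```shell
# Expand a simple bracketed hostname range by hand (illustration only).
prefix="mynode"; lo=0; hi=3
NODES=""
i=$lo
while [ "$i" -le "$hi" ]; do
    NODES="${NODES}${prefix}${i} "
    i=$((i + 1))
done
echo "$NODES"
```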
+# +sub expand +{ + my ($list) = @_; + + if ($list =~ /\[/ && $list !~ /[^[]*\[.+\]/) { + # Handle case of no closing bracket - just return + return ($list); + } + + # matching "[" "]" pair with stuff inside will be considered a quadrics + # range: + if ($list =~ /[^[]*\[.+\]/) { + # quadrics ranges are separated by whitespace in RMS - + # try to support that here + $list =~ s/\s+/,/g; + + # + # Replace ',' chars internal to "[]" with ':" + # + while ($list =~ s/(\[[^\]]*),([^\[]*\])/$1:$2/) {} + + return map { expand_quadrics_range($_) } split /,/, $list; + + } else { + return map { + s/(\w*?)(\d+)-(\1|)(\d+)/"$2".."$4"/ + || + s/(.+)/""/; + map {"$1$_"} eval; + } split /,/, $list; + } +} + +if (@ARGV != 1) { + my $prog = `basename $0`; + + chomp($prog); + print "Usage: $prog \n"; + print " e.g. $prog mynodes[0-200]\n"; + exit 1; +} + +my @nodes = expand($ARGV[0]); + +foreach my $node (@nodes) +{ + print "$node\n"; +} diff --git a/hadoop-gather b/hadoop-gather new file mode 100755 index 000000000..219db7381 --- /dev/null +++ b/hadoop-gather @@ -0,0 +1,39 @@ +#!/bin/sh +############################################################################# +# Copyright (C) 2013 Lawrence Livermore National Security, LLC. +# Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER). +# Written by Albert Chu +# LLNL-CODE-644248 +# +# This file is part of Magpie, scripts for running Hadoop on +# traditional HPC systems. For details, see . +# +# Magpie is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# Magpie is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. 
+# +# You should have received a copy of the GNU General Public License +# along with Magpie. If not, see . +############################################################################# + +# This script does final log gathering for debugging + +export HADOOP_LOCAL_JOB_DIR=${HADOOP_LOCAL_DIR}/${SLURM_JOB_NAME}/${SLURM_JOB_ID} + +cd $HADOOP_LOCAL_JOB_DIR + +export NODENAME=`hostname` + +export TARGET_DIR=$HADOOP_SCRIPTS_HOME/${SLURM_JOB_NAME}/${SLURM_JOB_ID}/nodes/${NODENAME} +mkdir -p $TARGET_DIR + +cp -a conf ${TARGET_DIR} +cp -a log ${TARGET_DIR} + +exit 0 diff --git a/hadoop-post-run b/hadoop-post-run new file mode 100755 index 000000000..407569cdb --- /dev/null +++ b/hadoop-post-run @@ -0,0 +1,41 @@ +#!/bin/bash +############################################################################# +# Copyright (C) 2013 Lawrence Livermore National Security, LLC. +# Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER). +# Written by Albert Chu +# LLNL-CODE-644248 +# +# This file is part of Magpie, scripts for running Hadoop on +# traditional HPC systems. For details, see . +# +# Magpie is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# Magpie is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Magpie. If not, see . +############################################################################# + +# This script executes post-job scripts on behalf of the user. For the +# most part, it shouldn't be edited. See sbatch.hadoop for +# configuration details. + +if [ "${HADOOP_POST_JOB_RUN}X" != "X" ] \ + && [ !
-x "${HADOOP_POST_JOB_RUN}" ] +then + echo "Script HADOOP_POST_JOB_RUN=\"${HADOOP_POST_JOB_RUN}\" is not executable" + exit 1 +fi + +if [ "${HADOOP_POST_JOB_RUN}X" != "X" ] +then + ${HADOOP_POST_JOB_RUN} +fi + +exit 0 diff --git a/hadoop-run b/hadoop-run new file mode 100755 index 000000000..1959e3103 --- /dev/null +++ b/hadoop-run @@ -0,0 +1,1015 @@ +#!/bin/bash +############################################################################# +# Copyright (C) 2013 Lawrence Livermore National Security, LLC. +# Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER). +# Written by Albert Chu +# LLNL-CODE-644248 +# +# This file is part of Magpie, scripts for running Hadoop on +# traditional HPC systems. For details, see . +# +# Magpie is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# Magpie is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Magpie. If not, see . +############################################################################# + + +# This script is the core processing script for running Hadoop jobs. +# For the most part, it shouldn't be edited. See sbatch.hadoop for +# configuration details.
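hadoop-run's first job is validating its input variables with chains of if-tests, one per accepted value. The same enum-style check can also be written compactly with a case statement (a sketch with a hypothetical variable name — the script itself uses the if-chain style):

```shell
# Enum-style validation of a mode variable via case (sketch).
MODE="script"
case "${MODE}" in
    terasort|script|interactive|setuponly)
        echo "mode ok: ${MODE}"
        ;;
    *)
        echo "MODE must be terasort, script, interactive, or setuponly" >&2
        exit 1
        ;;
esac
```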
+ +# +# Check a variety of input variables +# + +if [ "${HADOOP_MODE}" != "terasort" ] \ + && [ "${HADOOP_MODE}" != "script" ] \ + && [ "${HADOOP_MODE}" != "interactive" ] \ + && [ "${HADOOP_MODE}" != "setuponly" ] +then + echo "HADOOP_MODE must be set to terasort, script, interactive, setuponly" + exit 1 +fi + +if [ "${HADOOP_FILESYSTEM_MODE}" != "hdfs" ] \ + && [ "${HADOOP_FILESYSTEM_MODE}" != "hdfsoverlustre" ] \ + && [ "${HADOOP_FILESYSTEM_MODE}" != "rawnetworkfs" ] +then + echo "HADOOP_FILESYSTEM_MODE must be set to hdfs, hdfsoverlustre, or rawnetworkfs" + exit 1 +fi + +if [ "${HADOOP_SETUP_TYPE}" != "MR1" ] \ + && [ "${HADOOP_SETUP_TYPE}" != "MR2" ] +then + echo "HADOOP_SETUP_TYPE must be set to MR1 or MR2" + exit 1 +fi + +if [ "${HADOOP_BUILD_HOME}X" == "X" ] +then + echo "HADOOP_BUILD_HOME must be set" + exit 1 +fi + +if [ ! -d ${HADOOP_BUILD_HOME} ] +then + echo "${HADOOP_BUILD_HOME} does not point to a directory" + exit 1 +fi + +if [ "${HADOOP_SCRIPTS_HOME}X" == "X" ] +then + echo "HADOOP_SCRIPTS_HOME must be set" + exit 1 +fi + +if [ ! -d ${HADOOP_SCRIPTS_HOME} ] +then + echo "${HADOOP_SCRIPTS_HOME} does not point to a directory" + exit 1 +fi + +if [ "${JAVA_HOME}X" == "X" ] +then + echo "JAVA_HOME must be set" + exit 1 +fi + +if [ ! 
-d ${JAVA_HOME} ] +then + echo "${JAVA_HOME} does not point to a directory" + exit 1 +fi + +if [ "${HADOOP_MODE}" == "terasort" ] +then + if [ "${HADOOP_VERSION}X" == "X" ] + then + echo "HADOOP_VERSION must be set for terasort mode" + exit 1 + fi +fi + +if [ "${HADOOP_MODE}" == "interactive" ] \ + || [ "${HADOOP_MODE}" == "setuponly" ] +then + if [ "${SBATCH_TIMELIMIT}X" == "X" ] + then + echo "SBATCH_TIMELIMIT environment variable must be set for interactive or setuponly mode" + exit 1 + fi + + if [ ${SBATCH_TIMELIMIT} -lt 10 ] + then + echo "timelimit must be at least 10 for interactive or setuponly mode" + exit 1 + fi +fi + +if [ "${HADOOP_MODE}" == "script" ] +then + if [ "${HADOOP_SCRIPT_PATH}X" == "X" ] + then + echo "Script HADOOP_SCRIPT_PATH must be specified" + exit 1 + fi + + if [ ! -x ${HADOOP_SCRIPT_PATH} ] + then + echo "Script HADOOP_SCRIPT_PATH \"$HADOOP_SCRIPT_PATH\" does not have execute permissions" + exit 1 + fi +fi + +if [ "${HADOOP_FILESYSTEM_MODE}" == "hdfs" ] \ + && [ "${HADOOP_HDFS_PATH}X" == "X" ] +then + echo "Must specify environment variable HADOOP_HDFS_PATH" + exit 1 +fi + +if [ "${HADOOP_FILESYSTEM_MODE}" == "hdfsoverlustre" ] \ + && [ "${HADOOP_HDFSOVERLUSTRE_PATH}X" == "X" ] +then + echo "Must specify environment variable HADOOP_HDFSOVERLUSTRE_PATH" + exit 1 +fi + +if [ "${HADOOP_FILESYSTEM_MODE}" == "rawnetworkfs" ] \ + && [ "${HADOOP_RAWNETWORKFS_PATH}X" == "X" ] +then + echo "Must specify environment variable HADOOP_RAWNETWORKFS_PATH" + exit 1 +fi + +if [ "${HADOOP_PRE_JOB_RUN}X" != "X" ] \ + && [ !
-x "${HADOOP_PRE_JOB_RUN}" ] +then + echo "Script HADOOP_PRE_JOB_RUN=\"${HADOOP_PRE_JOB_RUN}\" is not executable" + exit 1 +fi + +if [ "${HADOOP_TERASORT_CLEAR_CACHE}X" != "X" ] \ + && ( [ "${HADOOP_TERASORT_CLEAR_CACHE}" != "yes" ] && [ "${HADOOP_TERASORT_CLEAR_CACHE}" != "no" ]) +then + echo "HADOOP_TERASORT_CLEAR_CACHE must be set to yes or no" + exit 1 +fi + +if [ "${HADOOP_COMPRESSION}X" != "X" ] \ + && ( [ "${HADOOP_COMPRESSION}" != "yes" ] && [ "${HADOOP_COMPRESSION}" != "no" ]) +then + echo "HADOOP_COMPRESSION must be set to yes or no" + exit 1 +fi + +# +# Setup a whole bunch of directories +# + +export HADOOP_CLUSTER_NODERANK="${SLURM_NODEID:=0}" + +mkdir -p $HADOOP_LOCAL_DIR +if [ $? -ne 0 ] ; then + echo "mkdir failed making ${HADOOP_LOCAL_DIR}" + exit 1 +fi + +cd $HADOOP_LOCAL_DIR + +export HADOOP_LOCAL_JOB_DIR=${HADOOP_LOCAL_DIR}/${SLURM_JOB_NAME}/${SLURM_JOB_ID} +mkdir -p $HADOOP_LOCAL_JOB_DIR +if [ $? -ne 0 ] ; then + echo "mkdir failed making ${HADOOP_LOCAL_JOB_DIR}" + exit 1 +fi + +export HADOOP_CONF_DIR=${HADOOP_LOCAL_JOB_DIR}/conf +mkdir -p $HADOOP_CONF_DIR +if [ $? -ne 0 ] ; then + echo "mkdir failed making ${HADOOP_CONF_DIR}" + exit 1 +fi + +export HADOOP_LOG_DIR=${HADOOP_LOCAL_JOB_DIR}/log +mkdir -p $HADOOP_LOG_DIR +if [ $? -ne 0 ] ; then + echo "mkdir failed making ${HADOOP_LOG_DIR}" + exit 1 +fi + +export YARN_CONF_DIR=${HADOOP_CONF_DIR} +export YARN_LOG_DIR=${HADOOP_LOG_DIR} + +# Create master and slave files + +hash hostlist 2>/dev/null +if [ $? 
-eq 0 ] +then + hostrangescript="hostlist" + hostrangescriptoptions="-e -d \n" +else + if [ -x ${HADOOP_SCRIPTS_HOME}/hadoop-expand-nodes ] + then + hostrangescript="${HADOOP_SCRIPTS_HOME}/hadoop-expand-nodes" + hostrangescriptoptions="" + else + echo "Cannot find tool to expand host ranges" + exit 1 + fi +fi + +${hostrangescript} ${hostrangescriptoptions} ${SLURM_JOB_NODELIST} | head -1 > ${HADOOP_CONF_DIR}/masters +${hostrangescript} ${hostrangescriptoptions} ${SLURM_JOB_NODELIST} | tail -n+2 > ${HADOOP_CONF_DIR}/slaves + +master=`head -1 ${HADOOP_CONF_DIR}/masters` + +slave_count=`cat ${HADOOP_CONF_DIR}/slaves|wc -l` + +# +# Calculate values for various config file variables, based on +# recommendations, rules of thumb, or user input. +# + +# Recommendation from Cloudera, parallel copies sqrt(number of nodes), floor of ten +parallelcopies=$(echo "sqrt ( ${slave_count} )" | bc -l | xargs printf "%1.0f") +if [ "${parallelcopies}" -lt "10" ] +then + parallelcopies=10 +fi + +# Recommendation from Cloudera, 10% of nodes w/ floor of ten, ceiling 200 +# In my experience this is low b/c of high core counts, so bump higher to 50% +namenodehandlercount=$(echo "${slave_count} * .5" | bc -l | xargs printf "%1.0f") +if [ "${namenodehandlercount}" -lt "10" ] +then + namenodehandlercount=10 +fi + +if [ "${namenodehandlercount}" -gt "200" ] +then + namenodehandlercount=200 +fi + +# General rule of thumb is half namenode handler count, so * .25 instead of * .5 +datanodehandlercount=$(echo "${slave_count} * .25" | bc -l | xargs printf "%1.0f") +if [ "${datanodehandlercount}" -lt "10" ] +then + datanodehandlercount=10 +fi + +if [ "${datanodehandlercount}" -gt "200" ] +then + datanodehandlercount=200 +fi + +# Per description, about 4% of nodes but w/ floor of 10 +jobtrackerhandlercount=$(echo "${slave_count} * .04" | bc -l | xargs printf "%1.0f") +if [ "${jobtrackerhandlercount}" -lt "10" ] +then + jobtrackerhandlercount=10 +fi + +# Optimal depends on file system +if [
"${HADOOP_FILESYSTEM_MODE}" == "hdfs" ] +then + iobuffersize=65536 +elif [ "${HADOOP_FILESYSTEM_MODE}" == "hdfsoverlustre" ] +then + # Default block size is 1M in Lustre + # XXX: If not default, can get from lctl or similar? + iobuffersize=1048576 +elif [ "${HADOOP_FILESYSTEM_MODE}" == "rawnetworkfs" ] +then + # Assuming Lustre, so copy above 1M + iobuffersize=1048576 +fi + +javahometmp=`echo "${JAVA_HOME}" | sed "s/\\//\\\\\\\\\//g"` + +memtotal=`cat /proc/meminfo | grep MemTotal | awk '{print $2}'` +memtotalgig=$(echo "(${memtotal} / 1048576)" | bc -l | xargs printf "%1.0f") +proccounttmp=`cat /proc/cpuinfo | grep processor | wc -l` +# subtract one, leave 1 processor for daemons +proccount=`expr ${proccounttmp} - 1` + +if [ "${HADOOP_MAX_TASKS_PER_NODE}X" != "X" ] +then + maxtaskspernode=${HADOOP_MAX_TASKS_PER_NODE} +else + maxtaskspernode=${proccount} +fi + +if [ "${YARN_RESOURCE_MEMORY}X" != "X" ] +then + yarnresourcememory=${YARN_RESOURCE_MEMORY} +else + # 80% of system memory seems like a good estimate + tmp1=$(echo "${memtotalgig} * .8" | bc -l | xargs printf "%1.0f") + yarnresourcememory=$(echo "${tmp1} * 1024" | bc -l | xargs printf "%1.0f") +fi + +if [ "${HADOOP_CHILD_HEAPSIZE}X" != "X" ] +then + allchildheapsize=${HADOOP_CHILD_HEAPSIZE} +else +# achu: This is my stupid bash way to round to the nearest 256M. I'm +# sure there is something smarter I just don't know. Sorry, not a +# bash expert +# +# For MapReduce1, yarn resource memory makes no sense, but we can use +# the value for some of the calculations below. 
+# + tmp1=$(echo "(${yarnresourcememory} / 1024) / ${maxtaskspernode}" | bc -l | xargs printf "%1.2f") + tmp2=$(echo "(${tmp1} * 100) / 25" | bc -l | xargs printf "%1.0f") + tmp3=$(echo "${tmp2} * .25" | bc -l | xargs printf "%1.2f") + allchildheapsize=$(echo "${tmp3} * 1024" | bc -l | xargs printf "%1.0f") +fi + +if [ "${HADOOP_CHILD_MAP_HEAPSIZE}X" != "X" ] +then + mapchildheapsize=${HADOOP_CHILD_MAP_HEAPSIZE} +else + mapchildheapsize=${allchildheapsize} +fi + +if [ "${HADOOP_CHILD_REDUCE_HEAPSIZE}X" != "X" ] +then + reducechildheapsize=${HADOOP_CHILD_REDUCE_HEAPSIZE} +else + reducechildheapsize=${allchildheapsize} +fi + +# Cloudera recommends 256 for io.sort.mb. Cloudera blog suggests +# io.sort.factor * 10 ~= io.sort.mb. + +if [ "${HADOOP_IO_SORT_MB}X" != "X" ] +then + iosortmb=${HADOOP_IO_SORT_MB} +else + iosortmb=256 +fi + +if [ "${HADOOP_IO_SORT_FACTOR}X" != "X" ] +then + iosortfactor=${HADOOP_IO_SORT_FACTOR} +else + iosortfactor=`expr ${iosortmb} \/ 10` +fi + +mapcontainermb=`expr ${mapchildheapsize} + 256` +reducecontainermb=`expr ${reducechildheapsize} + 256` + +yarnmincontainer=1024 +if [ ${mapcontainermb} -lt ${yarnmincontainer} ] +then + yarnmincontainer=${mapcontainermb} +fi + +if [ ${reducecontainermb} -lt ${yarnmincontainer} ] +then + yarnmincontainer=${reducecontainermb} +fi + +yarnmaxcontainer=8192 +if [ ${mapcontainermb} -gt ${yarnmaxcontainer} ] +then + yarnmaxcontainer=${mapcontainermb} +fi + +if [ ${reducecontainermb} -gt ${yarnmaxcontainer} ] +then + yarnmaxcontainer=${reducecontainermb} +fi + +if [ "${HADOOP_MAPREDUCE_SLOWSTART}X" != "X" ] +then + mapredslowstart=${HADOOP_MAPREDUCE_SLOWSTART} +else + # Hadoop default is 0.05, which is normally ridiculously low. 
+ mapredslowstart="0.25" +fi + +if [ "${HADOOP_DEFAULT_MAP_TASKS}X" != "X" ] +then + defaultmaptasks=${HADOOP_DEFAULT_MAP_TASKS} +else + defaultmaptasks=`expr ${maxtaskspernode} \* ${slave_count}` +fi + +if [ "${HADOOP_DEFAULT_REDUCE_TASKS}X" != "X" ] +then + defaultreducetasks=${HADOOP_DEFAULT_REDUCE_TASKS} +else + defaultreducetasks=${slave_count} +fi + +if [ "${HADOOP_MAX_MAP_TASKS}X" != "X" ] +then + maxmaptasks=${HADOOP_MAX_MAP_TASKS} +else + maxmaptasks=${maxtaskspernode} +fi + +if [ "${HADOOP_MAX_REDUCE_TASKS}X" != "X" ] +then + maxreducetasks=${HADOOP_MAX_REDUCE_TASKS} +else + maxreducetasks=${maxtaskspernode} +fi + +if [ "${HADOOP_HDFS_BLOCKSIZE}X" != "X" ] +then + hdfsblocksize=${HADOOP_HDFS_BLOCKSIZE} +else + # 64M is Hadoop default, widely considered bad choice, we'll use 128M as default + hdfsblocksize=134217728 +fi + +if [ "${HADOOP_HDFS_REPLICATION}X" != "X" ] +then + hdfsreplication=${HADOOP_HDFS_REPLICATION} +else + # replication of 1 instead of 3 in HDFS over Lustre can cause some + # extra file transfers and slow down jobs, however it's been + # observed as not a "huge factor". + if [ "${HADOOP_FILESYSTEM_MODE}" == "hdfsoverlustre" ] + then + hdfsreplication=1 + else + hdfsreplication=3 + fi +fi + +if [ "${HADOOP_DAEMON_HEAP_MAX}X" != "X" ] +then + hadoopdaemonheapmax="${HADOOP_DAEMON_HEAP_MAX}" +else + hadoopdaemonheapmax="1000" +fi + +if [ "${HADOOP_COMPRESSION}X" != "X" ] +then + if [ "${HADOOP_COMPRESSION}" == "yes" ] + then + compression=true + else + compression=false + fi +else + compression=false +fi + +openfiles=`ulimit -n` +if [ "${openfiles}" != "unlimited" ] +then + openfileshardlimit=`ulimit -H -n` + + # we estimate 1024 per 100 nodes. Obviously depends on many + # factors such as core count, but it's a reasonble and safe + # over-estimate calculated based on experience. 
+ openfilescount=`expr ${slave_count} \/ 100 ` + if [ "${openfilescount}" == "0" ] + then + openfilescount=1 + fi + openfilescount=`expr ${openfilescount} \* 1024` + + if [ "${openfileshardlimit}" != "unlimited" ] + then + if [ ${openfilescount} -gt ${openfileshardlimit} ] + then + openfilescount=${openfileshardlimit} + fi + fi +else + openfilescount="unlimited" +fi + +userprocesses=`ulimit -u` +if [ "${userprocesses}" != "unlimited" ] +then + userprocesseshardlimit=`ulimit -H -u` + + # we estimate 1024 per 100 nodes. + userprocessescount=`expr ${slave_count} \/ 100 ` + if [ "${userprocessescount}" == "0" ] + then + userprocessescount=1 + fi + userprocessescount=`expr ${userprocessescount} \* 1024` + + if [ "${userprocesseshardlimit}" != "unlimited" ] + then + if [ ${userprocessescount} -gt ${userprocesseshardlimit} ] + then + userprocessescount=${userprocesseshardlimit} + fi + fi +else + userprocessescount="unlimited" +fi + +# +# Setup file system +# + +if [ "${HADOOP_FILESYSTEM_MODE}" == "hdfs" ] +then + # Setup node directory per node + fspath="${HADOOP_HDFS_PATH}" + hadooptmpdir=`echo "${fspath}" | sed "s/\\//\\\\\\\\\//g"` + fsdefault=`echo "hdfs://${master}:54310" | sed "s/\\//\\\\\\\\\//g"` + + # Assume if path doesn't exist must format + if [ -d "${fspath}" ] + then + format=0 + else + format=1 + mkdir -p ${fspath} + if [ $? -ne 0 ] ; then + echo "mkdir failed making ${fspath}" + exit 1 + fi + fi +elif [ "${HADOOP_FILESYSTEM_MODE}" == "hdfsoverlustre" ] +then + # Setup node directory per node + fspath="${HADOOP_HDFSOVERLUSTRE_PATH}/${SLURM_JOB_NAME}/node-${HADOOP_CLUSTER_NODERANK}" + hadooptmpdir=`echo "${fspath}" | sed "s/\\//\\\\\\\\\//g"` + fsdefault=`echo "hdfs://${master}:54310" | sed "s/\\//\\\\\\\\\//g"` + + # Assume if path doesn't exist must format + if [ -d "${fspath}" ] + then + format=0 + else + format=1 + mkdir -p ${fspath} + if [ $?
-ne 0 ] ; then + echo "mkdir failed making ${fspath}" + exit 1 + fi + /usr/bin/lfs setstripe --size ${hdfsblocksize} --count 1 ${fspath} + fi +elif [ "${HADOOP_FILESYSTEM_MODE}" == "rawnetworkfs" ] +then + # Setup node directory per node, still need for "local" files + fspath="${HADOOP_RAWNETWORKFS_PATH}" + pernodepath="${fspath}/${SLURM_JOB_NAME}/node-${HADOOP_CLUSTER_NODERANK}" + hadooptmpdir=`echo "${pernodepath}" | sed "s/\\//\\\\\\\\\//g"` + fsdefault=`echo "file:///" | sed "s/\\//\\\\\\\\\//g"` + + if [ ! -d "${fspath}" ] + then + mkdir -p ${fspath} + if [ $? -ne 0 ] ; then + echo "mkdir failed making ${fspath}" + exit 1 + fi + /usr/bin/lfs setstripe --size ${hdfsblocksize} --count 1 ${fspath} + fi + + if [ ! -d "${pernodepath}" ] + then + mkdir -p ${pernodepath} + if [ $? -ne 0 ] ; then + echo "mkdir failed making ${pernodepath}" + exit 1 + fi + fi + + format=0 +else + echo "Illegal HADOOP_FILESYSTEM_MODE \"${HADOOP_FILESYSTEM_MODE}\" specified" + exit 1 +fi + +if ([ "${HADOOP_FILESYSTEM_MODE}" == "hdfsoverlustre" ] \ + || [ "${HADOOP_FILESYSTEM_MODE}" == "rawnetworkfs" ]) \ + && [ "${HADOOP_LOCALSTORE}X" != "X" ] +then + localstoredirtmp=${HADOOP_LOCALSTORE} + localstoredir=`echo "${localstoredirtmp}" | sed "s/\\//\\\\\\\\\//g"` + + if [ ! -d "${localstoredirtmp}" ] + then + mkdir -p ${localstoredirtmp} + if [ $?
-ne 0 ] ; then + echo "mkdir failed making ${localstoredirtmp}" + exit 1 + fi + fi +else + localstoredir="\$\{hadoop.tmp.dir\}" +fi + +# +# Get config files for setup +# + +if [ "${HADOOP_CONF_FILES}X" == "X" ] +then + hadoopconffiledir=${HADOOP_SCRIPTS_HOME}/conf +else + hadoopconffiledir=${HADOOP_CONF_FILES} +fi + +if [ ${HADOOP_SETUP_TYPE} == "MR1" ] +then + coresitexml=${hadoopconffiledir}/core-site-1.0.xml + mapredsitexml=${hadoopconffiledir}/mapred-site-1.0.xml + hadoopenvsh=${hadoopconffiledir}/hadoop-env-1.0.sh + hdfssitexml=${hadoopconffiledir}/hdfs-site-1.0.xml +elif [ ${HADOOP_SETUP_TYPE} == "MR2" ] +then + coresitexml=${hadoopconffiledir}/core-site-2.0.xml + mapredsitexml=${hadoopconffiledir}/mapred-site-2.0.xml + hadoopenvsh=${hadoopconffiledir}/hadoop-env-2.0.sh + yarnsitexml=${hadoopconffiledir}/yarn-site-2.0.xml + yarnenvsh=${hadoopconffiledir}/yarn-env-2.0.sh + hdfssitexml=${hadoopconffiledir}/hdfs-site-2.0.xml +else + echo "Illegal HADOOP_SETUP_TYPE \"${HADOOP_SETUP_TYPE} \" specified" + exit 1 +fi + +# +# Setup configuration files and environment files +# + +if [ ${HADOOP_SETUP_TYPE} == "MR1" ] \ + || [ ${HADOOP_SETUP_TYPE} == "MR2" ] +then + + sed -e "s/HADOOPTMPDIR/${hadooptmpdir}/g" \ + -e "s/FSDEFAULT/${fsdefault}/g" \ + -e "s/IOBUFFERSIZE/${iobuffersize}/g" \ + $coresitexml > ${HADOOP_CONF_DIR}/core-site.xml + + sed -e "s/HADOOP_MASTER_HOST/${master}/g" \ + -e "s/MRPRALLELCOPIES/${parallelcopies}/g" \ + -e "s/JOBTRACKERHANDLERCOUNT/${jobtrackerhandlercount}/g" \ + -e "s/MRSLOWSTART/${mapredslowstart}/g" \ + -e "s/ALLCHILDHEAPSIZE/${allchildheapsize}/g" \ + -e "s/MAPCHILDHEAPSIZE/${mapchildheapsize}/g" \ + -e "s/MAPCONTAINERMB/${mapcontainermb}/g" \ + -e "s/REDUCECHILDHEAPSIZE/${reducechildheapsize}/g" \ + -e "s/REDUCECONTAINERMB/${reducecontainermb}/g" \ + -e "s/DEFAULTMAPTASKS/${defaultmaptasks}/g" \ + -e "s/DEFAULTREDUCETASKS/${defaultreducetasks}/" \ + -e "s/MAXMAPTASKS/${maxmaptasks}/g" \ + -e "s/MAXREDUCETASKS/${maxreducetasks}/g" \ + 
-e "s/LOCALSTOREDIR/${localstoredir}/g" \ + -e "s/IOSORTFACTOR/${iosortfactor}/g" \ + -e "s/IOSORTMB/${iosortmb}/g" \ + -e "s/HADOOPCOMPRESSION/${compression}/g" \ + $mapredsitexml > ${HADOOP_CONF_DIR}/mapred-site.xml + + sed -e "s/HADOOP_JAVA_HOME/${javahometmp}/g" \ + -e "s/HADOOP_DAEMON_HEAP_MAX/${hadoopdaemonheapmax}/g" \ + $hadoopenvsh > ${HADOOP_CONF_DIR}/hadoop-env.sh + + echo "export HADOOP_CONF_DIR=\"${HADOOP_CONF_DIR}\"" >> ${HADOOP_CONF_DIR}/hadoop-env.sh + echo "export HADOOP_LOG_DIR=\"${HADOOP_LOG_DIR}\"" >> ${HADOOP_CONF_DIR}/hadoop-env.sh + + if [ "${HADOOP_REMOTE_CMD:-ssh}" != "ssh" ] + then + echo "export HADOOP_SSH_CMD=\"${HADOOP_REMOTE_CMD}\"" >> ${HADOOP_CONF_DIR}/hadoop-env.sh + fi + + if [ -f ${HADOOP_ENVIRONMENT_EXTRA_PATH} ] + then + cat ${HADOOP_ENVIRONMENT_EXTRA_PATH} >> ${HADOOP_CONF_DIR}/hadoop-env.sh + else + echo "ulimit -n ${openfilescount}" >> ${HADOOP_CONF_DIR}/hadoop-env.sh + echo "ulimit -u ${userprocessescount}" >> ${HADOOP_CONF_DIR}/hadoop-env.sh + fi +fi + +if [ ${HADOOP_SETUP_TYPE} == "MR2" ] +then + sed -e "s/HADOOP_MASTER_HOST/${master}/g" \ + -e "s/YARNMINCONTAINER/${yarnmincontainer}/g" \ + -e "s/YARNMAXCONTAINER/${yarnmaxcontainer}/g" \ + -e "s/YARNRESOURCEMEMORY/${yarnresourcememory}/g" \ + -e "s/LOCALSTOREDIR/${localstoredir}/g" \ + $yarnsitexml > ${HADOOP_CONF_DIR}/yarn-site.xml + + sed -e "s/HADOOP_JAVA_HOME/${javahometmp}/g" \ + -e "s/HADOOP_DAEMON_HEAP_MAX/${hadoopdaemonheapmax}/g" \ + $yarnenvsh > ${HADOOP_CONF_DIR}/yarn-env.sh + + echo "export YARN_CONF_DIR=\"${HADOOP_CONF_DIR}\"" >> ${HADOOP_CONF_DIR}/yarn-env.sh + echo "export YARN_LOG_DIR=\"${HADOOP_LOG_DIR}\"" >> ${HADOOP_CONF_DIR}/yarn-env.sh + + if [ "${HADOOP_REMOTE_CMD:-ssh}" != "ssh" ] + then + echo "export YARN_SSH_CMD=\"$HADOOP_REMOTE_CMD\"" >> ${HADOOP_CONF_DIR}/yarn-env.sh + fi + + if [ -f ${HADOOP_ENVIRONMENT_EXTRA_PATH} ] + then + cat ${HADOOP_ENVIRONMENT_EXTRA_PATH} >> ${HADOOP_CONF_DIR}/yarn-env.sh + else + echo "ulimit -n ${openfilescount}" >> 
${HADOOP_CONF_DIR}/yarn-env.sh + echo "ulimit -u ${userprocessescount}" >> ${HADOOP_CONF_DIR}/yarn-env.sh + fi +fi + +if [ "${HADOOP_SETUP_TYPE}" == "MR1" ] \ + || [ "${HADOOP_SETUP_TYPE}" == "MR2" ] +then + sed -e "s/HADOOP_MASTER_HOST/${master}/g" \ + -e "s/HDFSBLOCKSIZE/${hdfsblocksize}/g" \ + -e "s/HDFSREPLICATION/${hdfsreplication}/g" \ + -e "s/HDFSNAMENODEHANDLERCLOUNT/${namenodehandlercount}/g" \ + -e "s/HDFSDATANODEHANDLERCLOUNT/${datanodehandlercount}/g" \ + -e "s/IOBUFFERSIZE/${iobuffersize}/g" \ + $hdfssitexml > ${HADOOP_CONF_DIR}/hdfs-site.xml + + cat ${hadoopconffiledir}/log4j.properties > ${HADOOP_CONF_DIR}/log4j.properties +fi + +# +# Perform format if necessary +# + +if [ "${format}" -eq "1" ] +then + # Only HADOOP_CLUSTER_NODERANK 0 will format the node + if [ "$HADOOP_CLUSTER_NODERANK" -eq "0" ] + then + cd $HADOOP_BUILD_HOME + echo 'Y' | bin/hadoop namenode -format + else + # If this is the first time running, make everyone else wait + # until the format is complete + sleep 30 + fi +fi + +# +# Launch pre-run script just before run +# + +if [ "${HADOOP_PRE_JOB_RUN}X" != "X" ] +then + ${HADOOP_PRE_JOB_RUN} +fi + +# +# Run the job +# + +if [ "${HADOOP_SETUP_TYPE}" == "MR1" ] +then + scriptprefix="bin" + terasortexamples="hadoop-examples-$HADOOP_VERSION.jar" + rmoption="-rmr" +elif [ "${HADOOP_SETUP_TYPE}" == "MR2" ] +then + scriptprefix="sbin" + terasortexamples="share/hadoop/mapreduce/hadoop-mapreduce-examples-$HADOOP_VERSION.jar" + rmoption="-rm -r" +fi +remotecmd="${HADOOP_REMOTE_CMD:=ssh}" + +if [ "$HADOOP_CLUSTER_NODERANK" -eq "0" ] +then + cd ${HADOOP_BUILD_HOME} + + if [ ${HADOOP_MODE} != "setuponly" ] + then + echo "Starting hadoop" + if [ "$HADOOP_FILESYSTEM_MODE" == "hdfs" ] \ + || [ "$HADOOP_FILESYSTEM_MODE" == "hdfsoverlustre" ] + then + ${scriptprefix}/start-dfs.sh + fi + + if [ "${HADOOP_SETUP_TYPE}" == "MR1" ] + then + ${scriptprefix}/start-mapred.sh + fi + + if [ "${HADOOP_SETUP_TYPE}" == "MR2" ] + then + 
${scriptprefix}/start-yarn.sh + fi + + # My rough estimate for setup time is 30 seconds per 64 nodes + sleepwait=`expr ${slave_count} \/ 64 \* 30` + if [ ${sleepwait} -lt 30 ] + then + sleepwait=30 + fi + echo "Waiting ${sleepwait} seconds to allow daemons to set up" + sleep ${sleepwait} + + echo "*******************************************************" + echo "*" + echo "* You can view your job/cluster status by launching a web browser and pointing to ..." + echo "*" + if [ ${HADOOP_SETUP_TYPE} == "MR1" ] + then + echo "* Jobtracker: http://$master:50030" + elif [ ${HADOOP_SETUP_TYPE} == "MR2" ] + then + echo "* Yarn Resource Manager: http://$master:8088" + fi + echo "*" + echo "* HDFS Namenode: http://$master:50070" + echo "*" + echo "* To interact with Hadoop:" + echo "* ${remotecmd} $master" + echo "* Set HADOOP_CONF_DIR, such as ..." + echo "* export HADOOP_CONF_DIR=\"${HADOOP_CONF_DIR}\"" + echo "* or" + echo "* setenv HADOOP_CONF_DIR \"${HADOOP_CONF_DIR}\"" + echo "* cd $HADOOP_BUILD_HOME" + echo "*" + echo "* then do as you please, e.g. bin/hadoop fs ..."
+ echo "*" + echo "* To end your session early, kill the daemons via:" + if [ "$HADOOP_FILESYSTEM_MODE" == "hdfs" ] \ + || [ "$HADOOP_FILESYSTEM_MODE" == "hdfsoverlustre" ] + then + echo "* ${scriptprefix}/stop-dfs.sh" + fi + if [ "${HADOOP_SETUP_TYPE}" == "MR1" ] + then + echo "* ${scriptprefix}/stop-mapred.sh" + fi + if [ "${HADOOP_SETUP_TYPE}" == "MR2" ] + then + echo "* ${scriptprefix}/stop-yarn.sh" + fi + echo "*******************************************************" + fi + + if [ "${HADOOP_MODE}" == "terasort" ] + then + echo "*******************************************************" + echo "* Running Terasort" + echo "*******************************************************" + + if [ "${HADOOP_TERASORT_SIZE}X" == "X" ] + then + terasortsize=50000000 + else + terasortsize=$HADOOP_TERASORT_SIZE + fi + + if [ "${HADOOP_FILESYSTEM_MODE}" == "rawnetworkfs" ] + then + pathprefix="${HADOOP_RAWNETWORKFS_PATH}/" + fi + + if [ "${HADOOP_TERASORT_CLEAR_CACHE}X" != "X" ] + then + if [ "${HADOOP_TERASORT_CLEAR_CACHE}" == "yes" ] + then + clearcache="-Ddfs.datanode.drop.cache.behind.reads=true -Ddfs.datanode.drop.cache.behind.writes=true" + else + clearcache="" + fi + else + clearcache="-Ddfs.datanode.drop.cache.behind.reads=true -Ddfs.datanode.drop.cache.behind.writes=true" + fi + + cd ${HADOOP_BUILD_HOME} + + command="bin/hadoop jar ${terasortexamples} teragen ${clearcache} $terasortsize ${pathprefix}terasort-teragen" + echo "Running $command" >&2 + $command + + sleep 30 + + if [ "${HADOOP_TERASORT_REDUCER_COUNT:-0}" -ne "0" ] + then + rtasks=$HADOOP_TERASORT_REDUCER_COUNT + else + rtasks=`expr $slave_count \* 2` + fi + + command="bin/hadoop jar ${terasortexamples} terasort -Dmapred.reduce.tasks=$rtasks -Ddfs.replication=1 ${clearcache} ${pathprefix}terasort-teragen ${pathprefix}terasort-sort" + + echo "Running $command" >&2 + $command + + command="bin/hadoop fs ${rmoption} ${pathprefix}terasort-teragen" + $command + command="bin/hadoop fs ${rmoption} 
${pathprefix}terasort-sort" + $command + elif [ "${HADOOP_MODE}" == "script" ] + then + echo "*******************************************************" + echo "* Executing script $HADOOP_SCRIPT_PATH" + echo "*******************************************************" + ${HADOOP_SCRIPT_PATH} + elif [ "${HADOOP_MODE}" == "interactive" ] \ + || [ "${HADOOP_MODE}" == "setuponly" ] + then + echo "*******************************************************" + echo "* Entering ${HADOOP_MODE} mode" + echo "*" + + if [ "${HADOOP_MODE}" == "setuponly" ] + then + echo "* To setup, you likely want to goto:" + echo "* ${remotecmd} $master" + echo "* Set HADOOP_CONF_DIR, such as ..." + echo "* export HADOOP_CONF_DIR=\"${HADOOP_CONF_DIR}\"" + echo "* or" + echo "* setenv HADOOP_CONF_DIR \"${HADOOP_CONF_DIR}\"" + echo "* cd $HADOOP_BUILD_HOME" + echo "*" + echo "* You probably want to run run:" + echo "*" + if [ "$HADOOP_FILESYSTEM_MODE" == "hdfs" ] \ + || [ "$HADOOP_FILESYSTEM_MODE" == "hdfsoverlustre" ] + then + echo "* ${scriptprefix}/start-dfs.sh" + fi + + if [ "${HADOOP_SETUP_TYPE}" == "MR1" ] + then + echo "* ${scriptprefix}/start-mapred.sh" + fi + + if [ "${HADOOP_SETUP_TYPE}" == "MR2" ] + then + echo "* ${scriptprefix}/start-yarn.sh" + fi + echo "*" + fi + echo "* To launch jobs:" + echo "* ${remotecmd} $master" + echo "* Set HADOOP_CONF_DIR, such as ..." + echo "* export HADOOP_CONF_DIR=\"${HADOOP_CONF_DIR}\"" + echo "* or" + echo "* setenv HADOOP_CONF_DIR \"${HADOOP_CONF_DIR}\"" + echo "* cd $HADOOP_BUILD_HOME" + echo "* bin/hadoop jar ..." 
+ echo "*" + + if [ "${HADOOP_MODE}" == "setuponly" ] + then + echo "* To cleanup your session, kill deamons likely like:" + if [ "$HADOOP_FILESYSTEM_MODE" == "hdfs" ] \ + || [ "$HADOOP_FILESYSTEM_MODE" == "hdfsoverlustre" ] + then + echo "* ${scriptprefix}/stop-dfs.sh" + fi + if [ "${HADOOP_SETUP_TYPE}" == "MR1" ] + then + echo "* ${scriptprefix}/stop-mapred.sh" + fi + if [ "${HADOOP_SETUP_TYPE}" == "MR2" ] + then + echo "* ${scriptprefix}/stop-yarn.sh" + fi + fi + echo "*" + echo "*******************************************************" + hadoopsleepamount=`expr ${SBATCH_TIMELIMIT} - 5` + hadoopsleepseconds=`expr ${hadoopsleepamount} \* 60` + sleep ${hadoopsleepseconds} + fi + + cd ${HADOOP_BUILD_HOME} + + if [ ${HADOOP_MODE} != "setuponly" ] + then + echo "Stopping hadoop" + if [ "$HADOOP_FILESYSTEM_MODE" == "hdfs" ] \ + || [ "$HADOOP_FILESYSTEM_MODE" == "hdfsoverlustre" ] + then + ${scriptprefix}/stop-dfs.sh + fi + + if [ "${HADOOP_SETUP_TYPE}" == "MR1" ] + then + ${scriptprefix}/stop-mapred.sh + fi + + if [ "${HADOOP_SETUP_TYPE}" == "MR2" ] + then + ${scriptprefix}/stop-yarn.sh + fi + fi +fi + +exit 0 diff --git a/patches/hadoop-1.2.1-9109.patch b/patches/hadoop-1.2.1-9109.patch new file mode 100644 index 000000000..25b35b6ec --- /dev/null +++ b/patches/hadoop-1.2.1-9109.patch @@ -0,0 +1,53 @@ +diff -pruN bin/hadoop-daemon.sh bin/hadoop-daemon.sh +--- bin/hadoop-daemon.sh 2013-07-22 15:26:38.000000000 -0700 ++++ bin/hadoop-daemon.sh 2013-09-05 23:01:30.063914000 -0700 +@@ -26,6 +26,7 @@ + # HADOOP_PID_DIR The pid files are stored. /tmp by default. + # HADOOP_IDENT_STRING A string representing this instance of hadoop. $USER by default + # HADOOP_NICENESS The scheduling priority for daemons. Defaults to 0. 
++# HADOOP_SSH_CMD Specify an alternate remote shell command + ## + + usage="Usage: hadoop-daemon.sh [--config ] [--hosts hostlistfile] (start|stop) " +@@ -107,6 +108,7 @@ export HADOOP_ROOT_LOGGER="INFO,DRFA" + log=$HADOOP_LOG_DIR/hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.out + pid=$HADOOP_PID_DIR/hadoop-$HADOOP_IDENT_STRING-$command.pid + HADOOP_STOP_TIMEOUT=${HADOOP_STOP_TIMEOUT:-5} ++RSH_CMD=${HADOOP_SSH_CMD:-ssh} + + # Set default scheduling priority + if [ "$HADOOP_NICENESS" = "" ]; then +@@ -128,7 +130,7 @@ case $startStop in + + if [ "$HADOOP_MASTER" != "" ]; then + echo rsync from $HADOOP_MASTER +- rsync -a -e ssh --delete --exclude=.svn --exclude='logs/*' --exclude='contrib/hod/logs/*' $HADOOP_MASTER/ "$HADOOP_HOME" ++ rsync -a -e $RSH_CMD --delete --exclude=.svn --exclude='logs/*' --exclude='contrib/hod/logs/*' $HADOOP_MASTER/ "$HADOOP_HOME" + fi + + hadoop_rotate_log $log +diff -pruN bin/hadoop-daemon.sh.orig bin/hadoop-daemon.sh.orig +diff -pruN bin/slaves.sh bin/slaves.sh +--- bin/slaves.sh 2013-07-22 15:26:38.000000000 -0700 ++++ bin/slaves.sh 2013-09-05 23:02:01.904041000 -0700 +@@ -24,6 +24,8 @@ + # Default is ${HADOOP_CONF_DIR}/slaves. + # HADOOP_CONF_DIR Alternate conf dir. Default is ${HADOOP_HOME}/conf. + # HADOOP_SLAVE_SLEEP Seconds to sleep between spawning remote commands. ++# HADOOP_SSH_CMD Specify an alternate remote shell command. ++# Defaults to ssh if not specified. + # HADOOP_SSH_OPTS Options passed to ssh when running remote commands.
+ ## + +@@ -58,8 +60,10 @@ if [ "$HOSTLIST" = "" ]; then + fi + fi + ++RSH_CMD=${HADOOP_SSH_CMD:-ssh} ++ + for slave in `cat "$HOSTLIST"|sed "s/#.*$//;/^$/d"`; do +- ssh $HADOOP_SSH_OPTS $slave $"${@// /\\ }" \ ++ $RSH_CMD $HADOOP_SSH_OPTS $slave $"${@// /\\ }" \ + 2>&1 | sed "s/^/$slave: /" & + if [ "$HADOOP_SLAVE_SLEEP" != "" ]; then + sleep $HADOOP_SLAVE_SLEEP diff --git a/patches/hadoop-2.1.0-beta-9109.patch b/patches/hadoop-2.1.0-beta-9109.patch new file mode 100644 index 000000000..c1db56329 --- /dev/null +++ b/patches/hadoop-2.1.0-beta-9109.patch @@ -0,0 +1,81 @@ +diff -pruN sbin/hadoop-daemon.sh sbin/hadoop-daemon.sh +--- sbin/hadoop-daemon.sh 2013-08-15 13:59:05.000000000 -0700 ++++ sbin/hadoop-daemon.sh 2013-09-03 11:01:17.320536000 -0700 +@@ -26,6 +26,7 @@ + # HADOOP_PID_DIR The pid files are stored. /tmp by default. + # HADOOP_IDENT_STRING A string representing this instance of hadoop. $USER by default + # HADOOP_NICENESS The scheduling priority for daemons. Defaults to 0. ++# HADOOP_SSH_CMD Specify an alternate remote shell command + ## + + usage="Usage: hadoop-daemon.sh [--config ] [--hosts hostlistfile] [--script script] (start|stop) " +@@ -114,6 +115,7 @@ export HDFS_AUDIT_LOGGER=${HDFS_AUDIT_LO + log=$HADOOP_LOG_DIR/hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.out + pid=$HADOOP_PID_DIR/hadoop-$HADOOP_IDENT_STRING-$command.pid + HADOOP_STOP_TIMEOUT=${HADOOP_STOP_TIMEOUT:-5} ++RSH_CMD=${HADOOP_SSH_CMD:-ssh} + + # Set default scheduling priority + if [ "$HADOOP_NICENESS" = "" ]; then +@@ -135,7 +137,7 @@ case $startStop in + + if [ "$HADOOP_MASTER" != "" ]; then + echo rsync from $HADOOP_MASTER +- rsync -a -e ssh --delete --exclude=.svn --exclude='logs/*' --exclude='contrib/hod/logs/*' $HADOOP_MASTER/ "$HADOOP_PREFIX" ++ rsync -a -e $RSH_CMD --delete --exclude=.svn --exclude='logs/*' --exclude='contrib/hod/logs/*' $HADOOP_MASTER/ "$HADOOP_PREFIX" + fi + + hadoop_rotate_log $log +diff -pruN sbin/slaves.sh sbin/slaves.sh +--- sbin/slaves.sh 2013-08-15
13:59:05.000000000 -0700 ++++ sbin/slaves.sh 2013-09-03 11:01:17.326529000 -0700 +@@ -24,6 +24,8 @@ + # Default is ${HADOOP_CONF_DIR}/slaves. + # HADOOP_CONF_DIR Alternate conf dir. Default is ${HADOOP_PREFIX}/conf. + # HADOOP_SLAVE_SLEEP Seconds to sleep between spawning remote commands. ++# HADOOP_SSH_CMD Specify an alternate remote shell command. ++# Defaults to ssh if not specified. + # HADOOP_SSH_OPTS Options passed to ssh when running remote commands. + ## + +@@ -55,9 +57,11 @@ else + SLAVE_NAMES=$(cat "$SLAVE_FILE" | sed 's/#.*$//;/^$/d') + fi + ++RSH_CMD=${HADOOP_SSH_CMD:-ssh} ++ + # start the daemons + for slave in $SLAVE_NAMES ; do +- ssh $HADOOP_SSH_OPTS $slave $"${@// /\\ }" \ ++ $RSH_CMD $HADOOP_SSH_OPTS $slave $"${@// /\\ }" \ + 2>&1 | sed "s/^/$slave: /" & + if [ "$HADOOP_SLAVE_SLEEP" != "" ]; then + sleep $HADOOP_SLAVE_SLEEP +diff -pruN sbin/yarn-daemon.sh sbin/yarn-daemon.sh +--- sbin/yarn-daemon.sh 2013-08-15 13:59:07.000000000 -0700 ++++ sbin/yarn-daemon.sh 2013-09-03 11:01:17.317540000 -0700 +@@ -26,6 +26,7 @@ + # YARN_PID_DIR The pid files are stored. /tmp by default. + # YARN_IDENT_STRING A string representing this instance of hadoop. $USER by default + # YARN_NICENESS The scheduling priority for daemons. Defaults to 0. 
++# YARN_SSH_CMD Specify an alternate remote shell command + ## + + usage="Usage: yarn-daemon.sh [--config ] [--hosts hostlistfile] (start|stop) " +@@ -94,6 +95,7 @@ export YARN_ROOT_LOGGER=${YARN_ROOT_LOGG + log=$YARN_LOG_DIR/yarn-$YARN_IDENT_STRING-$command-$HOSTNAME.out + pid=$YARN_PID_DIR/yarn-$YARN_IDENT_STRING-$command.pid + YARN_STOP_TIMEOUT=${YARN_STOP_TIMEOUT:-5} ++RSH_CMD=${YARN_SSH_CMD:-ssh} + + # Set default scheduling priority + if [ "$YARN_NICENESS" = "" ]; then +@@ -115,7 +117,7 @@ case $startStop in + + if [ "$YARN_MASTER" != "" ]; then + echo rsync from $YARN_MASTER +- rsync -a -e ssh --delete --exclude=.svn --exclude='logs/*' --exclude='contrib/hod/logs/*' $YARN_MASTER/ "$HADOOP_YARN_HOME" ++ rsync -a -e $RSH_CMD --delete --exclude=.svn --exclude='logs/*' --exclude='contrib/hod/logs/*' $YARN_MASTER/ "$HADOOP_YARN_HOME" + fi + + hadoop_rotate_log $log diff --git a/sbatch.hadoop b/sbatch.hadoop new file mode 100644 index 000000000..6eaf2af0b --- /dev/null +++ b/sbatch.hadoop @@ -0,0 +1,352 @@ +#!/bin/sh +############################################################################# +# Copyright (C) 2013 Lawrence Livermore National Security, LLC. +# Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER). +# Written by Albert Chu +# LLNL-CODE-644248 +# +# This file is part of Magpie, scripts for running Hadoop on +# traditional HPC systems. For details, see . +# +# Magpie is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# Magpie is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Magpie.
If not, see <http://www.gnu.org/licenses/>.
+#############################################################################
+
+## SLURM Customizations
+
+# Node count. Node count should include one node for the
+# head/management node. For example, if you want 8 compute nodes to
+# process data, specify 9 nodes below.
+#SBATCH --nodes=9
+
+# Time to run the job
+export SBATCH_TIMELIMIT=60
+
+# Job name. This will be used in naming directories for the job.
+#SBATCH --job-name=myhadoopjob
+
+# Partition to launch job in
+#SBATCH --partition=pbatch
+
+## SLURM Values
+# Generally speaking, don't touch these; they just need to be set for SLURM.
+
+#SBATCH --ntasks-per-node=1
+#SBATCH --exclusive
+#SBATCH --no-kill
+
+############################################################################
+# Hadoop Configurations
+############################################################################
+
+# Set how Hadoop should run
+#
+# "terasort" - run terasort. Useful for making sure things are set up
+# the way you like.
+#
+# "script" - execute a script that lists all of your Hadoop jobs.
+#
+# "interactive" - manually interact to submit jobs, peruse HDFS, etc.
+# Also useful for moving data in/out of HDFS. In this
+# mode you'll login to the cluster node that is your
+# 'master' node and interact with Hadoop directly
+# (e.g. bin/hadoop ...)
+#
+# "setuponly" - Like 'interactive', but only sets up conf files. Useful
+# if you want to set up daemons yourself, etc.
+#
+export HADOOP_MODE="terasort"
+
+# Set how the filesystem should be set up
+#
+# "hdfs" - Normal straight up HDFS if you have local disk in your
+# cluster. Be careful running this in a cluster environment.
+# The next time you execute your job, if a different set of
+# nodes are allocated to you, the HDFS data you wrote from a
+# previous job may not be there.
+#
+# "hdfsoverlustre" - HDFS over Lustre. See README for description.
+#
+# "rawnetworkfs" - Use Hadoop RawLocalFileSystem (i.e. file: scheme),
+# to use a networked file system directly.
It could be a
+# Lustre mount or NFS mount. Whatever you please.
+#
+export HADOOP_FILESYSTEM_MODE="hdfsoverlustre"
+
+# Set Hadoop Setup Type
+#
+# Will inform scripts on how to set up config files and what daemons to
+# launch/setup. The hadoop build/binaries set by HADOOP_BUILD_HOME
+# needs to match up with what you set here.
+#
+# MR1 - MapReduce/Hadoop 1.0 w/ HDFS
+# MR2 - MapReduce/Hadoop 2.0 w/ HDFS
+#
+export HADOOP_SETUP_TYPE="MR2"
+
+# Version
+#
+# Make sure the version for Mapreduce version 1 or 2 matches whatever
+# you set in HADOOP_SETUP_TYPE.
+#
+export HADOOP_VERSION="2.1.0-beta"
+
+# Path to your Hadoop build/binaries
+#
+# Make sure the build for Mapreduce version 1 or 2 matches whatever
+# you set in HADOOP_SETUP_TYPE.
+#
+# This should be accessible on all nodes in your allocation. Typically
+# this is in an NFS mount.
+#
+export HADOOP_BUILD_HOME="/home/achu/hadoop/hadoop-${HADOOP_VERSION}"
+
+# Terasort size
+#
+# For "terasort" mode.
+#
+# Specify 10000000000 for a terabyte, for actual benchmarking.
+#
+# Specify something small, like 50000000, for basic sanity tests.
+# Defaults to 50000000.
+#
+# export HADOOP_TERASORT_SIZE=50000000
+
+# Terasort reducer count
+#
+# For "terasort" mode.
+#
+# If not specified, will be compute node count * 2.
+#
+# export HADOOP_TERASORT_REDUCER_COUNT=4
+
+# Terasort cache
+#
+# For "real benchmarking" you should flush page cache between a
+# teragen and a terasort. You can disable this for sanity runs/tests
+# to make things go faster. Specify yes or no. Defaults to yes.
+#
+# export HADOOP_TERASORT_CLEAR_CACHE=no
+
+# Directory where your launching scripts/files are stored
+export HADOOP_SCRIPTS_HOME="/home/achu/hadoop/hadoop-launch"
+
+# Directory where script configuration templates are stored
+#
+# If not specified, assumed to be $HADOOP_SCRIPTS_HOME/conf
+#
+# export HADOOP_CONF_FILES="${HADOOP_SCRIPTS_HOME}/conf"
+
+# Specify script to execute for "script" mode
+#
+# See hadoop-example-job for an example of what to put in the script.
+#
+export HADOOP_SCRIPT_PATH="${HADOOP_SCRIPTS_HOME}/hadoop-example-job"
+
+# Path for HDFS when using local disk
+#
+export HADOOP_HDFS_PATH="/ssd/tmp1/achu/hdfs/"
+
+# Lustre mount to do Hadoop HDFS out of
+#
+# Note that different versions of Hadoop may not be compatible with
+# your current HDFS data. If you're going to switch around to
+# different versions, perhaps set different paths for different data.
+#
+export HADOOP_HDFSOVERLUSTRE_PATH="/p/lscratchg/achu/hadoop/hdfsoverlustre/"
+
+# Path for rawnetworkfs
+#
+export HADOOP_RAWNETWORKFS_PATH="/p/lscratchg/achu/hadoop/rawnetworkfs/"
+
+# If you have a local SSD, performance may be better if you store
+# intermediate data on it rather than on Lustre (or some other networked
+# fs). If the below environment variable is specified, local
+# intermediate data will be stored in the specified directory.
+# Otherwise it will go to an appropriate directory in the Lustre/networked
+# FS.
+#
+# Be wary, local SSD stores may have less space than HDDs or
+# networked file systems. It can be easy to run out of space.
+#
+# export HADOOP_LOCALSTORE="/ssd/tmp1/achu/localstore/"
+
+# Path local to each cluster node, typically something in /tmp.
+# This will store local conf files and log files for your job.
+#
+# This will not be used for storing intermediate files or
+# distributedcache files. See HADOOP_LOCALSTORE above for that.
+#
+export HADOOP_LOCAL_DIR="/tmp/achu/hadoop"
+
+# HDFS Block Size
+#
+# Commonly 134217728, 268435456, 536870912 (i.e.
128m, 256m, 512m)
+#
+# If not specified, defaults to 134217728
+#
+# export HADOOP_HDFS_BLOCKSIZE=134217728
+
+# HDFS Replication
+#
+# HDFS commonly uses 3. It's not necessary if doing HDFS over Lustre,
+# but higher replication can help w/ Hadoop task scheduling and job
+# performance at the cost of increased disk usage.
+#
+# If not specified, defaults to 3 for HDFS, defaults to 1 for HDFS over
+# Lustre.
+#
+# export HADOOP_HDFS_REPLICATION=3
+
+# Tasks per Node
+#
+# If not specified, defaults to #cores - 1 (i.e. leave 1 processor for
+# system daemons).
+#
+# export HADOOP_MAX_TASKS_PER_NODE=8
+
+# Default Map tasks for Job
+#
+# If not specified, defaults to HADOOP_MAX_TASKS_PER_NODE * compute
+# nodes.
+#
+# export HADOOP_DEFAULT_MAP_TASKS=8
+
+# Default Reduce tasks for Job
+#
+# If not specified, defaults to # compute nodes (i.e. 1 reducer per
+# node)
+#
+# export HADOOP_DEFAULT_REDUCE_TASKS=8
+
+# Max Map tasks for Task Tracker
+#
+# If not specified, defaults to HADOOP_MAX_TASKS_PER_NODE
+#
+# export HADOOP_MAX_MAP_TASKS=8
+
+# Max Reduce tasks for Task Tracker
+#
+# If not specified, defaults to HADOOP_MAX_TASKS_PER_NODE
+#
+# export HADOOP_MAX_REDUCE_TASKS=8
+
+# Heap size for JVM
+#
+# Specified in K. If not specified, a reasonable estimate will be
+# calculated based on total memory available and number of CPUs on the
+# system.
+#
+# HADOOP_CHILD_MAP_HEAPSIZE and HADOOP_CHILD_REDUCE_HEAPSIZE are for
+# Yarn (i.e. HADOOP_SETUP_TYPE = MR2)
+#
+# If HADOOP_CHILD_MAP_HEAPSIZE and/or HADOOP_CHILD_REDUCE_HEAPSIZE are
+# not specified, they are assumed to be HADOOP_CHILD_HEAPSIZE.
+#
+# export HADOOP_CHILD_HEAPSIZE=2048
+# export HADOOP_CHILD_MAP_HEAPSIZE=2048
+# export HADOOP_CHILD_REDUCE_HEAPSIZE=2048
+
+# Mapreduce Slowstart, indicating the percent of maps that should complete
+# before reducers begin.
+#
+# If not specified, defaults to 0.25 (the Hadoop default is 0.05, really low)
+#
+# export HADOOP_MAPREDUCE_SLOWSTART=0.25
+
+# Container Memory
+#
+# Memory on compute nodes for containers. Typically a "nice chunk" less
+# than the actual memory on the machine, b/c the machine needs memory for
+# its own needs (kernel, daemons, etc.). Specified in megs.
+#
+# If not specified, a reasonable estimate will be calculated based on
+# total memory on the system.
+# export YARN_RESOURCE_MEMORY=32768
+
+# Daemon Heap Max
+#
+# Heap maximum for Hadoop daemons (i.e. ResourceManager, NodeManager,
+# etc.), specified in megs.
+#
+# If not specified, defaults to 1000 (which is the Hadoop default)
+#
+# May need to be increased if you are scaling large.
+#
+# export HADOOP_DAEMON_HEAP_MAX=2000
+
+# Compression
+#
+# Should compression of outputs and intermediate data be enabled?
+# Specify yes or no.
+#
+# Effectively: is time spent compressing data going to save you time
+# on I/O? Sometimes yes, sometimes no.
+#
+# export HADOOP_COMPRESSION=yes
+
+# IO Sort Factors + MB
+#
+# The number of streams of files to sort while reducing and the memory
+# amount to use while sorting. This is a quite advanced mechanism
+# taking into account many factors. If not specified, some reasonable
+# number will be assumed.
+#
+# export HADOOP_IO_SORT_FACTOR=10
+# export HADOOP_IO_SORT_MB=100
+
+# Environment Extra
+#
+# Specify extra environment information that should be passed into
+# Hadoop. This file will simply be appended to hadoop-env.sh
+# (and, if appropriate, yarn-env.sh).
+#
+# By default, a reasonable estimate for max user processes and open
+# file descriptors will be calculated and put into hadoop-env.sh (and,
+# if appropriate, yarn-env.sh). However, it's always possible they may
+# need to be set differently. Everyone's cluster/situation can be
+# slightly different.
+#
+# export HADOOP_ENVIRONMENT_EXTRA_PATH="${HADOOP_SCRIPTS_HOME}/hadoop-example-environment-extra"
+
+# Convenience Scripts
+#
+# Specify scripts to be executed before / after your job. They are run
+# on all nodes.
+#
+# Typically the pre-job script is used to set something up or get
+# debugging info. The post-job script is used to clean something up.
+#
+# See the examples for ideas of what you can do w/ these scripts.
+#
+# export HADOOP_PRE_JOB_RUN="${HADOOP_SCRIPTS_HOME}/hadoop-example-pre-job"
+# export HADOOP_POST_JOB_RUN="${HADOOP_SCRIPTS_HOME}/hadoop-example-post-job"
+
+# Specify an ssh-equivalent remote command if ssh is not available on
+# your cluster
+# export HADOOP_REMOTE_CMD="mrsh"
+
+# Necessary for Hadoop execution
+export JAVA_HOME="/usr/lib/jvm/jre-1.6.0-sun.x86_64/"
+
+############################################################################
+# Hadoop Run
+############################################################################
+
+srun --no-kill -W 0 $HADOOP_SCRIPTS_HOME/hadoop-run
+srun --no-kill -W 0 $HADOOP_SCRIPTS_HOME/hadoop-post-run
+
+# Gather configuration and log files after your job is done, for
+# debugging. Data will be stored in
+# $HADOOP_SCRIPTS_HOME/$SLURM_JOB_NAME/$SLURM_JOB_ID
+
+# srun --no-kill -W 0 $HADOOP_SCRIPTS_HOME/hadoop-gather
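+
+# To run this template: adjust the HADOOP_* exports and #SBATCH
+# directives above for your site (the paths, partition, and node count
+# shown are just example values), then submit this file to SLURM with
+# sbatch, e.g.:
+#
+#   sbatch sbatch.hadoop
+#
+# Job status can be checked with "squeue -u $USER".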