Jeroen van Dijk edited this page Jun 14, 2013 · 1 revision
- How to deploy to EMR?
- What are common errors on EMR?
- Why does my job fail when running on EMR, but not locally?
Q: What are common errors on EMR?
A: These are the types of errors you are most likely to encounter:
- ClassNotFoundException errors caused by dashes (“-”) in the names of namespaces or functions that become part of the main class names used by Hadoop. Advice: don’t use dashes.
- Classpath collisions with libraries that ship with the Hadoop distribution. See Classpath precedence.
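To make the first point concrete: Java class and package names cannot contain dashes, and the Clojure compiler replaces dashes in namespace names with underscores when it generates classes. A plausible cause of the ClassNotFoundException is therefore a mismatch between the dashed name you pass to Hadoop and the underscored name that actually exists in the jar. The sketch below only illustrates that naming rule; the class `DashCheck`, its helper methods, and the namespace `my-job.core` are hypothetical and not part of any library.

```java
import javax.lang.model.SourceVersion;

public class DashCheck {
    // Replace dashes with underscores, mirroring how Clojure namespace
    // names are mapped to Java package and class names.
    static String toJavaName(String clojureName) {
        return clojureName.replace('-', '_');
    }

    // True if every dot-separated part is a legal Java identifier.
    static boolean isValidJavaName(String name) {
        return SourceVersion.isName(name);
    }

    public static void main(String[] args) {
        String namespace = "my-job.core"; // hypothetical namespace with a dash

        // The dashed form is not a legal Java class name.
        System.out.println(namespace + " valid? " + isValidJavaName(namespace));

        // The underscored form is what a generated class would be called.
        String javaName = toJavaName(namespace);
        System.out.println(javaName + " valid? " + isValidJavaName(javaName));
    }
}
```

Avoiding dashes altogether, as advised above, sidesteps the question of which form to give Hadoop as the main class.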
Q: Why does my job fail when running on EMR, but not locally?
A: Generally speaking, the EMR environment differs from a local run through Leiningen. To debug the failure, work through the following steps:
- Is the error in the list of common errors above?
- Does your job run locally on the same version of Hadoop that EMR is using? See How to run job locally?
- Does the error occur again when you re-run the job? If not, wait until you see a pattern.
- Are you using spot instances? If yes, have the instances been killed?
- If none of the above helps, ask the mailing list.
Q: How to deploy to EMR?
A: Lemur is a tool built to easily launch Hadoop jobs on EMR.
Q: How do I make sure my libraries are loaded before the libraries of the Hadoop distribution?
A: Certain Hadoop versions let you control classpath “precedence” through configuration options:
Hadoop version(s) | Configuration option |
---|---|
0.20.203 – 0.20.205 | mapreduce.user.classpath.first=true |
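As a rough illustration, the option from the table can be written as a standard Hadoop configuration property. This sketch assumes a job configuration file is the right place for the setting in your setup. The table documents the option only for Hadoop 0.20.203 – 0.20.205, so it may have no effect on other versions.

```xml
<configuration>
  <!-- Ask Hadoop to put the job's own libraries ahead of those shipped
       with the distribution (documented above for 0.20.203 – 0.20.205). -->
  <property>
    <name>mapreduce.user.classpath.first</name>
    <value>true</value>
  </property>
</configuration>
```

If the job's main class parses Hadoop's generic options (for example through ToolRunner), the same property can usually be passed on the command line as `-D mapreduce.user.classpath.first=true`.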