Friday, November 26, 2010

Running Hadoop jobs using the Eclipse Hadoop plugin

The Hadoop distribution contains an Eclipse plugin that lets you browse the Hadoop filesystem (HDFS) and run MapReduce jobs on a Hadoop cluster.
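
To give an idea of the kind of job you would launch this way, here is a minimal sketch of the standard WordCount example written against the 0.20 mapreduce API, roughly what you would pick when using the plugin's "Run on Hadoop" shortcut. The class is not specific to the plugin, and the class and package-less layout below are just placeholders.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Emits (word, 1) for every token in the input line.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Sums up the counts for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private final IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory in HDFS
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory in HDFS
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}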

When I tried to use the plugin bundled with the current stable Hadoop version 0.20.2, I found that this version of the plugin is not compatible with Eclipse 3.4 and above. It still lets you browse HDFS shares, but you cannot start Hadoop jobs: if you try to start a job, nothing happens. The error log contains the following message:

Error
Fri Nov 26 10:13:43 CET 2010
Plug-in org.apache.hadoop.eclipse was unable to load class 
org.apache.hadoop.eclipse.launch.HadoopApplicationLaunchShortcut.

java.lang.NoClassDefFoundError: 
org/eclipse/jdt/internal/debug/ui/launcher/JavaApplicationLaunchShortcut

After some Google searches I found out that this is a known issue that has already been fixed in JIRA ticket MAPREDUCE-1280.

Unfortunately, there is no Hadoop release yet that contains this bugfix, so I compiled the plugin from the current Hadoop 0.20 SVN branch. I attached the compiled plugin to the JIRA ticket; you can download it from here.
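
If you would rather build the plugin yourself, the steps I used were roughly the following. The branch URL, the eclipse.home property and the version string reflect my setup and may need to be adjusted for your checkout and your Eclipse installation.

# check out the 0.20 branch and build the hadoop core classes first
svn checkout http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20/ hadoop-branch-0.20
cd hadoop-branch-0.20
ant compile

# then build the eclipse plugin itself, pointing it at your local Eclipse
cd src/contrib/eclipse-plugin
ant jar -Declipse.home=/path/to/your/eclipse -Dversion=0.20.3-dev

# the plugin jar should then show up under build/contrib/eclipse-plugin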

This version of the plugin works for me on Mac OS X 10.6 with Hadoop 0.20.2 and both Eclipse 3.5.2 and Eclipse 3.6.1.

To install the plugin, simply delete any old version of it from your Eclipse installation and put the jar file into Eclipse's dropins folder.

If you had an older version of the plugin installed, you also need to start Eclipse once with the "-clean" command line switch. Help on running Eclipse with command line switches can be found here.
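
On my Mac the whole installation boils down to something like the following two commands; the Eclipse location and the plugin file name are placeholders for wherever your Eclipse lives and whatever the downloaded jar is called.

cp hadoop-eclipse-plugin.jar /Applications/eclipse/dropins/
/Applications/eclipse/Eclipse.app/Contents/MacOS/eclipse -clean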

If you have any trouble getting this to work, feel free to ask for support here.