Thursday, 12 June 2014

Executing Code With Mahout Without Using Maven


Recently I started learning Mahout using Mahout in Action. The first topic I chose was unsupervised learning, and the very first example I came across was about clustering. I coded the K-Means clustering example and compiled it successfully. Then I followed the standard procedure of creating a jar and running it on the Hadoop cluster to submit a Map-Reduce clustering job. But it did not work: I kept getting a NoClassDefFoundError.

After some googling I realized that one option is to install Maven, build the project with it, and then execute the job.
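For completeness, that Maven route boils down to declaring Mahout as a dependency with a version matching your cluster's Mahout build and letting Maven pull in the transitive Hadoop/Mahout jars. A sketch of the relevant fragment (the version shown matches my CDH3 install; adjust for yours):

```xml
<dependency>
  <groupId>org.apache.mahout</groupId>
  <artifactId>mahout-core</artifactId>
  <version>0.5</version>
</dependency>
```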


But I was looking for a way to execute Mahout code on Hadoop without using Maven. Following is the class that I coded:


import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.mahout.clustering.WeightedVectorWritable;
import org.apache.mahout.clustering.kmeans.Cluster;
import org.apache.mahout.clustering.kmeans.KMeansDriver;
import org.apache.mahout.common.distance.EuclideanDistanceMeasure;
import org.apache.mahout.math.RandomAccessSparseVector;
import org.apache.mahout.math.Vector;
import org.apache.mahout.math.VectorWritable;

public class SimpleKMeansClustering {

    // Nine 2-D points forming two obvious groups: one around (1,1)-(3,3)
    // and one around (8,8)-(9,9).
    public static final double[][] points = { {1, 1}, {2, 1}, {1, 2},
                                              {2, 2}, {3, 3}, {8, 8},
                                              {9, 8}, {8, 9}, {9, 9} };

    // Writes the input vectors to a SequenceFile so the K-Means job can read them.
    public static void writePointsToFile(List<Vector> points, String fileName,
                                         FileSystem fs, Configuration conf) throws IOException {
        Path path = new Path(fileName);
        SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf, path,
                LongWritable.class, VectorWritable.class);
        long recNum = 0;
        VectorWritable vec = new VectorWritable();
        for (Vector point : points) {
            vec.set(point);
            writer.append(new LongWritable(recNum++), vec);
        }
        writer.close();
    }

    // Converts the raw double[][] into Mahout vectors.
    public static List<Vector> getPoints(double[][] raw) {
        List<Vector> points = new ArrayList<Vector>();
        for (int i = 0; i < raw.length; i++) {
            double[] fr = raw[i];
            Vector vec = new RandomAccessSparseVector(fr.length);
            vec.assign(fr);
            points.add(vec);
        }
        return points;
    }

    public static void main(String[] args) throws Exception {
        int k = 2;
        List<Vector> vectors = getPoints(points);

        File testData = new File("testdata");
        if (!testData.exists()) {
            testData.mkdir();
        }
        testData = new File("testdata/points");
        if (!testData.exists()) {
            testData.mkdir();
        }

        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        writePointsToFile(vectors, "testdata/points/file1", fs, conf);

        // Seed the initial clusters with the first k points.
        Path path = new Path("testdata/clusters/part-00000");
        SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf, path,
                Text.class, Cluster.class);
        for (int i = 0; i < k; i++) {
            Vector vec = vectors.get(i);
            Cluster cluster = new Cluster(vec, i, new EuclideanDistanceMeasure());
            writer.append(new Text(cluster.getIdentifier()), cluster);
        }
        writer.close();

        // Run K-Means: convergence delta 0.001, at most 10 iterations,
        // run the final clustering step, do not run sequentially.
        KMeansDriver.run(conf, new Path("testdata/points"), new Path("testdata/clusters"),
                new Path("output"), new EuclideanDistanceMeasure(), 0.001, 10, true, false);

        // Read back the clustered points and print each point's cluster assignment.
        SequenceFile.Reader reader = new SequenceFile.Reader(fs,
                new Path("output/" + Cluster.CLUSTERED_POINTS_DIR + "/part-m-00000"), conf);
        IntWritable key = new IntWritable();
        WeightedVectorWritable value = new WeightedVectorWritable();
        while (reader.next(key, value)) {
            System.out.println(value.toString() + " belongs to cluster " + key.toString());
        }
        reader.close();
    }
}


Then I compiled the Java file as:

javac -cp "/usr/lib/hadoop/lib/*:/usr/lib/hadoop/*:/usr/lib/mahout/*:/usr/lib/mahout/lib/*" SimpleKMeansClustering.java

The above command produced my class file: SimpleKMeansClustering.class

Now, since I did not want to use Maven to build my project, I looked at Sean Owen's comment:

Use the "job" JAR file provided by Mahout. It packages up all the dependencies. You need to add your classes to it too.

So I went to my Mahout installation directory:


> cd /usr/lib/mahout
> ls /usr/lib/mahout/
bin examples mahout-core-0.5-cdh3u3.jar mahout-examples-0.5-cdh3u3.jar mahout-math-0.5-cdh3u3.jar mahout-utils-0.5-cdh3u3.jar
conf lib mahout-core-0.5-cdh3u3-job.jar mahout-examples-0.5-cdh3u3-job.jar mahout-taste-webapp-0.5-cdh3u3.war
I copied the job jar from the listing above, mahout-core-0.5-cdh3u3-job.jar, to the directory where my Mahout code resides.

Then I added my class file to the job jar mahout-core-0.5-cdh3u3-job.jar using the following command, and simply executed the jar by invoking my class SimpleKMeansClustering. The code ran properly in Map-Reduce fashion and generated this output:


> jar uf mahout-core-0.5-cdh3u3-job.jar SimpleKMeansClustering.class
> hadoop jar mahout-core-0.5-cdh3u3-job.jar SimpleKMeansClustering
1.0: [1.000, 1.000] belongs to cluster 0
1.0: [2.000, 1.000] belongs to cluster 0
1.0: [1.000, 2.000] belongs to cluster 0
1.0: [2.000, 2.000] belongs to cluster 0
1.0: [3.000, 3.000] belongs to cluster 0
1.0: [8.000, 8.000] belongs to cluster 1
1.0: [9.000, 8.000] belongs to cluster 1
1.0: [8.000, 9.000] belongs to cluster 1
1.0: [9.000, 9.000] belongs to cluster 1
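The grouping in that output can be sanity-checked without Hadoop or Mahout at all: assign each of the nine points to the nearer of the two converged centroids using plain Euclidean distance. The sketch below is not Mahout code, just the assignment step k-means performs on every iteration; the centroids are the means of the two visible groups, (1.8, 1.8) and (8.5, 8.5).

```java
// Standalone sanity check of the clustering result: nearest-centroid
// assignment with Euclidean distance over the same nine points.
public class NearestCentroidCheck {

    // Means of the two groups the data naturally splits into.
    static final double[][] CENTROIDS = { {1.8, 1.8}, {8.5, 8.5} };

    // Euclidean distance between two 2-D points.
    static double dist(double[] a, double[] b) {
        double dx = a[0] - b[0], dy = a[1] - b[1];
        return Math.sqrt(dx * dx + dy * dy);
    }

    // Index of the centroid nearest to p.
    static int nearest(double[] p) {
        return dist(p, CENTROIDS[0]) <= dist(p, CENTROIDS[1]) ? 0 : 1;
    }

    public static void main(String[] args) {
        double[][] points = { {1, 1}, {2, 1}, {1, 2}, {2, 2}, {3, 3},
                              {8, 8}, {9, 8}, {8, 9}, {9, 9} };
        for (double[] p : points) {
            System.out.println("[" + p[0] + ", " + p[1] + "] belongs to cluster " + nearest(p));
        }
    }
}
```

The first five points land in cluster 0 and the last four in cluster 1, matching the Map-Reduce job's output.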





