Recently I started learning Mahout using Mahout in Action. The first topic I chose was unsupervised learning, and the very first example I found was about clustering. I coded the K-Means clustering example and compiled it successfully. Then I followed the standard procedure of creating a JAR and running it on the Hadoop cluster to submit a MapReduce clustering job. But it did not work, and I kept getting a NoClassDefFoundError.
After some googling I realized that I could install Maven, build the project with it, and then execute it.
But I was looking for a way to run Mahout code on Hadoop without using Maven. Following is the class that I coded:
Then I compiled the Java file to create a class file:
The above command created my class file: SimpleKMeansClustering.class
Since I did not want to use Maven to build my project, I looked at Sean Owen's comment:
Use the "job" JAR file provided by Mahout. It packages up all the dependencies. You need to add your classes to it too.
So I went to my Mahout installation directory:
I copied the job JAR (mahout-core-0.5-cdh3u3-job.jar) from that directory to the directory where my Mahout code resides.
Then I added my class file to the main JAR file, mahout-core-0.5-cdh3u3-job.jar, and simply executed that JAR by invoking my class, SimpleKMeansClustering. My code ran properly in MapReduce fashion and generated its output.
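As a rough sketch of those last two steps (adding the class to the job JAR and submitting it), something like the following should work. It assumes that SimpleKMeansClustering is in the default package, that the class file and the job JAR sit in the current directory, and that the `jar` and `hadoop` commands are on the PATH. The `run` helper and the `DRY_RUN` switch are hypothetical additions for illustration only; by default the script just prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch only: file and class names are taken from the post; adjust for your setup.
JOB_JAR="mahout-core-0.5-cdh3u3-job.jar"   # job JAR copied from the Mahout install directory
MAIN_CLASS="SimpleKMeansClustering"         # assumed to live in the default package

# Hypothetical helper: print the command when DRY_RUN=1 (the default), otherwise run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

add_and_submit() {
  # Add the compiled class to the job JAR (u = update the archive, f = archive file name).
  run jar uf "$JOB_JAR" "$MAIN_CLASS.class"
  # Submit the job JAR to Hadoop, naming the class whose main method should run.
  run hadoop jar "$JOB_JAR" "$MAIN_CLASS"
}

add_and_submit
```

Running the script with `DRY_RUN=0` executes the two commands for real. If the class were declared in a package, the class file would have to be added from the matching directory path and invoked by its fully qualified name.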