Thursday, June 14, 2012

Error while executing MapReduce WordCount program (Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable)

Quite often I see questions from people who are new to the Hadoop world, or just starting their Hadoop journey, saying that they get the following error while executing the traditional WordCount program :

Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable

If you are also getting this error, then you have to set your map output key and value classes explicitly, as shown below (a fuller driver sketch follows these snippets) :

- If you are using the old MapReduce API (JobConf), do this :
  conf.setMapOutputKeyClass(Text.class);
  conf.setMapOutputValueClass(IntWritable.class);

- If you are using the new MapReduce API (Job), do this :
  job.setMapOutputKeyClass(Text.class);
  job.setMapOutputValueClass(IntWritable.class);
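For reference, here is a minimal sketch of a complete driver using the new API, with the two calls from above in place. The class names WordCountDriver, WordCountMapper and WordCountReducer are placeholders for your own classes :

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCountDriver.class);

        // Your own mapper and reducer classes (placeholder names)
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);

        // Declare the map output types so they match what the mapper emits
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        // Final output types written by the reducer
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}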

REASON : Your MapReduce application is most likely using TextInputFormat as the InputFormat class. TextInputFormat generates keys of type LongWritable (the byte offset of each line) and values of type Text (the line itself), whereas your application expects keys of type Text. That mismatch between the expected and the actual map output key type is why you get this error.
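To make the types concrete, here is a sketch of what a typical WordCount mapper looks like when the input comes from TextInputFormat (the class name WordCountMapper is just a placeholder) : the input key is the LongWritable byte offset supplied by TextInputFormat, while the keys the mapper emits are Text, which is exactly what setMapOutputKeyClass has to declare.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Input types (LongWritable, Text) come from TextInputFormat:
// the key is the byte offset of the line, the value is the line itself.
// Output types (Text, IntWritable) are what the mapper actually emits,
// and they must match setMapOutputKeyClass / setMapOutputValueClass.
public class WordCountMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer tokenizer = new StringTokenizer(value.toString());
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            context.write(word, ONE);  // key type is Text, not LongWritable
        }
    }
}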

NOTE : For more detailed information, you can visit the official MapReduce documentation.
