How to run a MapReduce job in Hadoop:
Step 2: In Eclipse, go to File -> New -> Project -> Java Project and give the project a name.
Step 3: Write your MapReduce application in Eclipse.
Step 4: Remember to create separate Mapper, Reducer, and Driver classes.
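To make the roles of these classes concrete, here is a minimal stand-alone sketch of the map and reduce logic for word counting. In a real Hadoop job the two methods below would live in classes extending org.apache.hadoop.mapreduce.Mapper and Reducer (as in the classic WordCount example); this plain-Java version only illustrates the map -> shuffle -> reduce flow and runs without a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every token in a line of input.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(Map.entry(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle + reduce phase: group the emitted pairs by key and sum the
    // counts, as the Reducer would do for each key's list of values.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : new String[] {"hello hadoop", "hello world"}) {
            pairs.addAll(map(line));
        }
        // Prints the word counts, e.g. {hadoop=1, hello=2, world=1}
        System.out.println(reduce(pairs));
    }
}
```

The Driver class (not sketched here) is the one with the main method that configures the Job object, sets the Mapper and Reducer classes, and points at the input and output paths.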
Step 5: Right-click the project name -> Export -> JAR file -> Next -> give the JAR file a name -> Finish.
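If you prefer the command line to Eclipse's export wizard, the compiled classes can also be packaged with the JDK's jar tool. This assumes the class files are under bin/, Eclipse's default output folder:

```shell
# Package the compiled classes (assumed to be under bin/) into a JAR
jar cf wcount.jar -C bin .
```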
Step 6: Keep your input data file ready.
Step 7: Create an input directory on HDFS:
hdfs dfs -mkdir /input
Step 8: Copy your data file from the local drive to HDFS:
hdfs dfs -put [path of data file] /input
Step 9: Run the yarn command:
yarn jar [jar-file-name] [package-name.Driver-class-name] [input path] [output path]
E.g.: yarn jar wcount.jar com.amir.WordCount /input /output
A series of map and reduce tasks will then run. If anything goes wrong, the MapReduce job
will fail and report an error.
Step 10: After the job has completed successfully, check the output path:
hdfs dfs -ls /output
By default, two files are created in the output directory:
1) _SUCCESS
2) part-r-00000
Here 'r' signifies that the file is the output of a Reducer task.
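To see the actual results (for the WordCount example above, tab-separated word/count pairs), print the reducer output file:

```shell
# Print the contents of the reducer output file on HDFS
hdfs dfs -cat /output/part-r-00000
```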