Getting Started with Hadoop: Advanced Operations Using MapReduce
Overview/Description
In this Skillsoft Aspire course, explore how MapReduce can be used to extract the five most expensive vehicles in a data set and to build an inverted index for the words appearing in a set of text files. Begin by defining a vehicle type that represents automobiles and can be stored in a Java PriorityQueue, then configure a Mapper that uses a PriorityQueue to keep the five most expensive automobiles it has processed from the data set. Learn how to use a PriorityQueue in the Reducer to receive the five most expensive automobiles from each Mapper and write the top five overall to the output, then execute the application to verify the results. Next, explore how to use the MapReduce framework to generate an inverted index, and configure the Reducer and Driver for the inverted index application. Then run the application and examine the inverted index on HDFS (Hadoop Distributed File System). The concluding exercise covers advanced operations using MapReduce.
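To make the two techniques above concrete, here is a minimal sketch in plain Java, with no Hadoop dependencies, so it does not reproduce the course's actual Mapper, Reducer, or Driver classes. The names `AdvancedMapReduceSketch`, `Vehicle`, `topN`, and `invertedIndex`, as well as the sample data, are assumptions made for illustration. The first part keeps a bounded min-heap per simulated input split (the Mapper's role) and merges the partial results (the Reducer's role). The second part maps each word to the set of files that contain it.

```java
import java.util.*;

// Hypothetical sketch: simulates the MapReduce logic in plain Java.
class AdvancedMapReduceSketch {

    // Assumed vehicle type; ordered by price so a PriorityQueue acts as a min-heap on price.
    static final class Vehicle implements Comparable<Vehicle> {
        final String model;
        final double price;

        Vehicle(String model, double price) {
            this.model = model;
            this.price = price;
        }

        @Override
        public int compareTo(Vehicle other) {
            return Double.compare(price, other.price);
        }
    }

    // Keeps the n most expensive vehicles: whenever the heap grows past n,
    // the cheapest vehicle (the heap's head) is evicted.
    static List<Vehicle> topN(Iterable<Vehicle> vehicles, int n) {
        PriorityQueue<Vehicle> heap = new PriorityQueue<>();
        for (Vehicle v : vehicles) {
            heap.add(v);
            if (heap.size() > n) {
                heap.poll();
            }
        }
        List<Vehicle> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder()); // most expensive first
        return result;
    }

    // Builds word -> sorted set of file names containing that word.
    static Map<String, Set<String>> invertedIndex(Map<String, String> files) {
        Map<String, Set<String>> index = new TreeMap<>();
        for (Map.Entry<String, String> file : files.entrySet()) {
            for (String word : file.getValue().toLowerCase().split("\\W+")) {
                if (word.isEmpty()) {
                    continue; // split can yield an empty leading token
                }
                index.computeIfAbsent(word, k -> new TreeSet<>()).add(file.getKey());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Two simulated input splits, each handled by its own "mapper".
        List<Vehicle> split1 = List.of(new Vehicle("A", 10), new Vehicle("B", 50),
                new Vehicle("C", 30), new Vehicle("D", 70),
                new Vehicle("E", 20), new Vehicle("F", 90));
        List<Vehicle> split2 = List.of(new Vehicle("G", 60), new Vehicle("H", 80),
                new Vehicle("I", 40));

        // The "reducer" merges the per-mapper top fives and applies the same logic.
        List<Vehicle> merged = new ArrayList<>(topN(split1, 5));
        merged.addAll(topN(split2, 5));
        for (Vehicle v : topN(merged, 5)) {
            System.out.println(v.model + "\t" + v.price);
        }

        Map<String, String> files = new HashMap<>();
        files.put("a.txt", "Hadoop stores data");
        files.put("b.txt", "MapReduce processes data");
        System.out.println(invertedIndex(files));
    }
}
```

Because each Mapper emits at most five records, the Reducer only has to compare a handful of candidates per Mapper, however large the input is. In a real Hadoop job the same heap logic would sit inside the Mapper and Reducer classes, with the output written once all input records have been processed.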
Expected Duration (hours)
0.8
Lesson Objectives
Getting Started with Hadoop: Advanced Operations Using MapReduce
Course Number
it_dshpfddj_05_enus
Expertise Level
Intermediate