Data Flow for the Hadoop Ecosystem
Overview/Description
Target Audience
Prerequisites
Expected Duration
Lesson Objectives
Course Number
Expertise Level
Overview/Description
Hadoop is a framework written in Java for running applications on large clusters of commodity hardware and incorporates features similar to those of the GFS and of the MapReduce computing paradigm. You'll explore a demonstration of the use of Sqoop and Hive with Hadoop to flow and fuse data. The demonstration includes preprocessing data, partitioning data and joining data. This learning path can be used as part of the preparation for the Cloudera Certified Administrator for Apache Hadoop (CCA-500) exam.
Target Audience
Technical personnel with a background in Linux, SQL, and programming who intend to join a Hadoop Engineering team in roles such as Hadoop developer, data architect, or data engineer or roles related to technical project management, cluster operations, or data analysis
Prerequisites
None
Expected Duration (hours)
1.9
Lesson Objectives
Data Flow for the Hadoop Ecosystem
df_ahec_a10_it_enus
Expertise Level
Intermediate