Course details

Accessing Data with Spark: Data Analysis using Spark SQL





Overview/Description

Continue to explore Apache Spark, the de facto framework for big data science, in this Skillsoft Aspire course. You will learn how to analyze a Spark DataFrame by treating it as though it were a relational database table: creating a view from the DataFrame, running SQL queries against that view, and using windows to define and explore categories of data. Key concepts in this course include the stages involved in optimizing any query or method call on the contents of a Spark DataFrame; how to create views out of a DataFrame's contents and run queries against them; and how to trim and clean a DataFrame before creating a view, as a precursor to running SQL queries. Next, you will learn how to analyze data by running different kinds of SQL queries, including grouping and aggregations; how Spark infers the schema of loaded data and how to configure a DataFrame with an explicitly defined schema; and what a window is in the context of Spark DataFrames. Finally, you will see how to create and analyze categories of data in a dataset by using windows.



Expected Duration (hours)
0.9

Lesson Objectives

  • Course Overview
  • recall the different stages involved in optimizing any query or method call on the contents of a Spark DataFrame
  • create views out of a Spark DataFrame's contents and run queries against them
  • trim and clean a DataFrame before a view is created as a precursor to running SQL queries on it
  • perform an analysis of data by running different kinds of SQL queries, including grouping and aggregations
  • recognize how Spark DataFrames infer the schema of data loaded into them and configure a DataFrame with an explicitly defined schema
  • define what a window is in the context of Spark DataFrames and when windows can be used
  • create and analyze categories of data in a dataset using windows
  • analyze data using Spark SQL

Course Number
it_dsadskdj_03_enus

Expertise Level
Intermediate