In the overview, Chris will explain what will be covered in the Data Analyst track of the Skillsoft Aspire Data Science journey, as well as show you how to use the code assets and the lab and how to browse the different assets in the track.
Traditional data warehousing is transitioning to be more cloud-based, and this is a key area to master for data science. In this Skillsoft Aspire course, you will examine the organizational implications of data silos and explore how data lakes can help make data secure, discoverable, and queryable. Discover how data lakes can work with batch and streaming…
Traditional data warehousing is transitioning to be more cloud-based, and this is a key area to master for data science. In this Skillsoft Aspire course, you will discover how to configure Glue crawlers to work with different data stores on AWS. Examine how to visualize the data stored in the data lake with AWS QuickSight and how…
Traditional data warehousing is transitioning to be more cloud-based, and this is a key area to master for data science. In this Skillsoft Aspire course, you will discover how to build a data lake on the AWS cloud by storing data in S3 buckets and indexing this data using AWS Glue. Explore how to run crawlers to…
Data engineering is the area of data science that focuses on practical applications of data collection and analysis. In this Skillsoft Aspire course, you will explore distributed systems, batch vs. in-memory processing, NoSQL uses, and the various tools available for data management/big data and the ETL process.
Apache Hadoop is a collection of open-source software utilities that facilitates solving data science problems. In this Skillsoft Aspire course, you will explore the theory behind big data analysis using Hadoop and how MapReduce enables the parallel processing of large datasets distributed on a cluster of machines.
Apache Hadoop is a collection of open-source software utilities that facilitates solving data science problems. In this Skillsoft Aspire course, you will discover how to use Hadoop's MapReduce, including how to provision a Hadoop cluster on the cloud and then build a hello world application using MapReduce to calculate the word frequencies in a text document.
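The word-frequency idea can be sketched without a cluster. The following is a minimal, single-process illustration of the map, shuffle, and reduce steps in plain Python; it does not use Hadoop's API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_frequencies`) and the sample text are assumptions made up for this example.

```python
import re
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one line of text."""
    return [(word, 1) for word in re.findall(r"[a-z']+", line.lower())]


def shuffle(pairs):
    """Shuffle: group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reducer: sum the counts collected for one word."""
    return word, sum(counts)


def word_frequencies(lines):
    """Run the three steps in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input standing in for a text document.
    sample = ["the quick brown fox", "the lazy dog", "The fox"]
    print(word_frequencies(sample))
```

On a real cluster the mapper and reducer would run on different machines and the framework would handle the shuffle; here everything runs in one process purely to show the data flow.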
On the career path to Data Science, a fundamental understanding of statistics, specifically inferential statistics, is required. Inferential statistics go beyond merely describing a dataset and seek to posit and prove or disprove the existence of relationships within the data. In this Skillsoft Aspire course, you will explore hypothesis testing, which is widely applied in data science.
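As one illustration of hypothesis testing, the sketch below runs a two-sided permutation test for a difference in means using only the Python standard library. The choice of a permutation test, the function name `permutation_test`, its parameters, and the sample values are all assumptions for this example; the course may well use other tests.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Null hypothesis: group labels are interchangeable, so reshuffling them
    should produce differences as large as the observed one fairly often.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


if __name__ == "__main__":
    # Hypothetical measurements for two groups.
    control = [12.1, 11.8, 12.4, 12.0, 11.9]
    treated = [12.9, 13.1, 12.7, 13.4, 13.0]
    print(f"estimated p-value: {permutation_test(control, treated):.4f}")
```

A small p-value suggests the observed difference would be unusual if the labels did not matter; it does not by itself prove that a relationship exists.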
On the career path to Data Science, a fundamental understanding of statistics and modeling is required. The goal of all modeling is generalizing as well as possible from a sample to the population of big data as a whole. In this Skillsoft Aspire course, you will explore the first step in this process, obtaining a representative sample from which meaningful…
HDFS is the file system used in data science that enables the parallel processing of big data in a distributed cluster. When managing a data warehouse, not all users should be given free rein over all the datasets. In this Skillsoft Aspire course, you will explore how file permissions can be viewed and configured in HDFS. The NameNode UI is used…
On the career path to Data Science, a fundamental understanding and application of statistics, specifically modeling, is required. The goal of all modeling is generalizing as well as possible from a sample to the population as a whole. In this Skillsoft Aspire course, you will explore the first step in this process, obtaining a representative sample from which meaningful…
HDFS is the file system used in data science that enables the parallel processing of big data in a distributed cluster. In this Skillsoft Aspire course, you will explore the Hadoop file system using the HDFS dfs shell and perform basic file and directory-level operations. Transfer files between a local file system and HDFS and explore ways to create and delete…
HDFS is the file system used in data science that enables the parallel processing of big data in a distributed cluster. In this Skillsoft Aspire course, you will discover how to set up a Hadoop cluster on the cloud and explore the bundled web apps: the YARN Cluster Manager app and the HDFS NameNode UI. Then use the hadoop fs…
Data architecture is a foundation that you need to understand in your data science journey. It is a high-level concept that is typically made up of models, policies, rules, or standards that determine how and what data is collected, stored, arranged, integrated, and then put into use within an organization. In this Skillsoft Aspire course, you will explore how we…
R is a programming language used for statistical computing and graphics, and it is an essential skill for data science and big data. In this Skillsoft Aspire course, you will examine how to apply classification and clustering methods to data science problems using R.
R is a programming language used for statistical computing and graphics, and it is an essential skill for data science and big data. In this Skillsoft Aspire course, you will explore data in R using the dplyr library, including working with tabular data, piping data, mutating data, summarizing data, combining datasets, and grouping data.
R is a programming language used for statistical computing and graphics, and it is an essential skill for data science and big data. In this Skillsoft Aspire course, you will explore the common data structures used in R, including vectors, lists, matrices, factors, and data frames.
While understanding data analysis is key for data science, applying data analysis with different languages and applications is important for any data scientist. In this Skillsoft Aspire course, you will discover how to perform data analysis using Anaconda Python, R, and related analytical libraries and tools.
R is a programming language used for statistical computing and graphics, and it is an essential skill for data science and big data. In this Skillsoft Aspire course, you will discover how to use R to import and export tabular data in CSV, Excel, and HTML format.
HDFS is the file system used in data science that enables the parallel processing of big data in a distributed cluster. In this Skillsoft Aspire course, you will explore the concepts of analyzing large datasets and explore how Hadoop and HDFS make this process very efficient.
R is a programming language used for statistical computing and graphics, and it is an essential skill for data science and big data. In this Skillsoft Aspire course, you will discover how to apply regression methods to data science problems using R.
Apache Spark is an open-source cluster-computing framework used for data science, and it has become the de facto big data framework. In this Skillsoft Aspire course, you will explore the basics of Apache Spark, an analytics engine for working with big data that is built on top of Hadoop. Discover how it allows operations on data with both its own library…
Pandas is a Python software library for data manipulation and analysis that is used in data science and big data. In this Skillsoft Aspire course, you will explore different ways to iterate over and sort Pandas DataFrames. Discover how to handle missing data and perform grouping operations, as well as how to combine data from different DataFrames using join…
Pandas is a Python software library for data manipulation and analysis that is used in data science and big data. In this Skillsoft Aspire course, you will discover how to work with series and tabular data, including initialization, population, and manipulation of Pandas Series and DataFrames.
NumPy is a Python library used in data science and big data that works with arrays when performing scientific computing with Python. In this Skillsoft Aspire course, you will explore advanced array operations such as image manipulation, fancy indexing, views, and broadcasting.
NumPy is a Python library used in data science and big data that works with arrays when performing scientific computing with Python. In this Skillsoft Aspire course, you will explore how to initialize and load data into arrays and learn about basic array manipulation operations using NumPy including universal functions, indexing and slicing, array iteration, and array reshaping.