Big Data using Apache Spark

Learn Apache Spark, the most popular framework for building big data applications, with a practical guide to putting it to work.


Big data is a major challenge for mid-sized and large companies: how do you process it effectively and efficiently? This program introduces Apache Spark, the open-source cluster computing system that makes data analytics applications fast both to write and to run. In this program, you will learn to handle large datasets quickly using Spark's Python API (PySpark).


TECHNOLOGY USED: Apache Spark, Databricks

What Will I Learn?

  • Learn how to deploy interactive, batch, and streaming Spark applications

Topics for this course

39 Lessons · 40h

Introduction to Hadoop, Spark and Python

An introduction to Big data
Hadoop Architecture
Mapper and Reducer
What is Apache Spark?
Apache Spark Jobs and APIs
Apache Spark 2.0 Architecture
Setting up Conda notebooks and testing Spark in them
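The mapper and reducer stages covered above are easy to sketch in plain Python (no Hadoop required): a mapper emits (key, value) pairs, the framework groups pairs by key, and a reducer folds each group into one result. This is a hypothetical illustration of the concept, not Hadoop's actual API:

```python
from collections import defaultdict

def mapper(line):
    # Emit a (word, 1) pair for every word in the input line.
    for word in line.split():
        yield (word, 1)

def reducer(word, counts):
    # Fold all counts for one key into a single total.
    return (word, sum(counts))

def map_reduce(lines):
    # Shuffle phase: group mapper output by key.
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    # Reduce phase: one reducer call per key.
    return dict(reducer(k, v) for k, v in groups.items())

print(map_reduce(["big data", "big spark"]))  # {'big': 2, 'data': 1, 'spark': 1}
```

The real framework distributes the map, shuffle, and reduce phases across a cluster, but the word-count logic is exactly this shape.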

Introduction to Apache Spark

RDDs in Apache Spark

Python Spark Libraries

PySpark RDDs

DataFrames

Prepare the Data for Modelling

Introducing MLlib

About the instructors

Pedagogy Trainings

4.67 rating (6 ratings)
23 Courses
83 students



Course Details


  • Python core
  • Programming basics
  • Hadoop MapReduce

Target Audience

  • Big Data Engineers
  • Machine Learning Engineers