
Big Data using Apache Spark

Learn Apache Spark and get a guide to the most popular and workable framework for building and using big data applications.

Description

Big data is a major challenge for large and mid-sized companies: how can they work with it effectively and efficiently? This program introduces Apache Spark, the open-source cluster computing system that makes data science and analytics applications fast to both write and execute. In this program, you will process large datasets quickly via Spark's Python API.

 

TECHNOLOGY USED: Apache Spark, Databricks

What Will I Learn?

  • Learn how to deploy interactive, batch and streaming applications

Topics for this course

39 Lessons · 40h

Introduction to Hadoop, Spark and Python

An introduction to Big data
Hadoop Architecture
Mapper and Reducer
What is Apache Spark?
Apache Spark Jobs and APIs
Apache Spark 2.0 Architecture
Using Conda notebooks to test Spark
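The Mapper and Reducer model listed above can be sketched in plain Python, with no Hadoop cluster required. The word-count example below is purely illustrative (the function names `mapper` and `reducer` are ours, not a Hadoop API):

```python
from functools import reduce
from itertools import chain

def mapper(line):
    # Map phase: emit a (word, 1) pair for every word in the line.
    return [(word.lower(), 1) for word in line.split()]

def reducer(counts, pair):
    # Reduce phase: sum the counts emitted for each word.
    word, n = pair
    counts[word] = counts.get(word, 0) + n
    return counts

lines = ["Spark makes big data simple", "big data needs big tools"]
mapped = chain.from_iterable(mapper(line) for line in lines)
word_counts = reduce(reducer, mapped, {})
print(word_counts["big"])  # → 3
```

In Hadoop, the framework additionally shuffles and groups pairs by key between the two phases; here a single dictionary stands in for that step.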

Introduction to Apache Spark

RDDs in Apache Spark

Python Spark Libraries

PySpark RDDs
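A key property of PySpark RDDs is that transformations such as map and filter are lazily evaluated: nothing runs until an action forces the result. As a rough stdlib analogy (this is plain Python, not Spark), generator expressions defer work the same way:

```python
data = range(1, 11)

# "Transformations": these are lazy generators, so nothing is computed yet.
squared = (x * x for x in data)
evens = (x for x in squared if x % 2 == 0)

# "Action": forcing the pipeline, analogous to collect() or sum() on an RDD.
result = sum(evens)
print(result)  # → 220
```

In real PySpark the equivalent would be `sc.parallelize(range(1, 11)).map(lambda x: x * x).filter(lambda x: x % 2 == 0).sum()`, which additionally distributes the work across a cluster.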

DataFrames

Prepare the Data for Modelling

Introducing MLlib

About the instructors

Pedagogy Trainings

With a blend of Mathematics, Statistics, Data Analysis and Interpretation, we deliver training in numerous technologies, ranging from basic report authoring to large-scale data integration, data quality, data visualization, data mining, and new-generation distributed computing with big data.
17 Courses
2 students


Course Details

  • Level: Intermediate
  • Categories: Big Data, Data Science
  • Total Hours: 40h
  • Total Lessons: 39
  • Total Enrolled: 0
  • Last Update: May 2, 2020

Requirements

  • Python core
  • Programming basics
  • Hadoop MapReduce

Target Audience

  • Big data and Machine Learning engineers