Master the skills needed for Data Science in Just 8 Weeks

Course Summary: Advanced Data Analytics with Spark

Our Advanced Data Analytics with Spark cohort is a four-week evening course.

Apache Spark is a fast, general-purpose engine for large-scale data processing. It was developed as an alternative to the traditional MapReduce processing paradigm. By keeping data in memory, Spark can run up to 100X faster than Hadoop MapReduce, and 10X faster when running on disk. Spark is especially well suited to iterative processing, which many machine learning algorithms rely on.

Spark runs on top of Hadoop, as a standalone platform, or in the cloud. It is fast, easy to use, and comes with a powerful stack of libraries, including SQL and DataFrames. The course requires some experience programming in Python.

Course Details: Advanced Data Analytics with Spark

Week 1: Spark Fundamentals

C: Introduction to Spark
C: Why Spark?
C: Introduction to RDDs
C: Data sharing
C: Data Partitioning

Week 2: Spark SQL

C: Working with the Spark Shell
C: What is Spark SQL?
C: Spark SQL vs Spark Core
C: DataFrames API

Week 3: Spark Streaming

C: DStreams
C: Transformations: Stateless and Stateful
C: Checkpointing and Output Operations
C: Tuning and Debugging Spark

Learning Objectives: Advanced Data Analytics with Spark

Become familiar with Spark fundamentals. Learn about the different components of Spark.
Use Spark on an HDFS cluster. Gain experience working with RDDs.
Learn how to tune and debug Spark.
Tools used: Python, Spark

Next Steps:

Drop us a note to schedule an interview and see if this course is a good fit for you.
Enroll@bitbootcamp.com

Campus

New York City

Next Cohort

January 10th, 2017 - February 2nd, 2017
Tuesday and Thursday: 6:30 PM to 9:30 PM

Tuition

$2,500 USD

Financing

Financing options available with Pave