A Brief Tour of the Magic Behind Apache Spark

Scale
06/13/2017 - 11:50 to 12:10
Palais Atelier
short talk (20 min)
Intermediate

Session abstract: 

Apache Spark is one of the most popular general-purpose distributed systems of the past few years. It has APIs in Scala, Java, and Python, and more recently there have been a few different attempts to provide support for R, C#, and Julia.

This talk covers the core concepts of Apache Spark, but then switches gears to look at how the "magic" of Spark can have unintended consequences for your programs. You will gain a better understanding of the impact of lazy evaluation, the shift away from arbitrary lambda expressions to "statements", and of course the "dreaded" shuffle (plus a few more concepts). Based on that, the presenter will shamelessly try to self-promote her own book (for one slide, I promise :p) and then look at (free) resources for learning more about the magic behind Spark.