Why and how to leverage the power and simplicity of SQL on Apache Flink

Stream
06/12/2018 - 17:20 to 18:00
Kesselhaus
long talk (40 min)
Intermediate

Session abstract: 

SQL is the lingua franca of data processing, and everybody working with data knows SQL. Apache Flink provides SQL support for querying and processing batch and streaming data. Flink’s SQL support powers large-scale production systems at Alibaba, Huawei, and Uber. Based on Flink SQL, these companies have built systems for their internal users as well as publicly offered services for paying customers. In my talk, I will discuss why you should, and how you can, leverage the simplicity and power of SQL on Flink, even if you are not Alibaba or Uber.

I will start by exploring the use cases that Flink SQL was designed for and present real-world problems that it can solve. In particular, I'll explain why unified batch and stream processing is important and what it means to run SQL queries on streams of data. After discussing why and when you should use Flink SQL, I will show how to leverage its full potential. The Flink community is developing a service that integrates a query interface, (external) table catalogs, and result-serving functionality for static, appending, and updating result sets. I will discuss the design and features of this query service and how it will enable exploratory batch and streaming queries, ETL pipelines, and live-updating query results that serve applications such as real-time dashboards. The talk concludes with a brief demo of a client running queries against the service.
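
To make the idea of an "updating result set" concrete, here is a minimal sketch in plain Python. It does not use Flink or any of its APIs. It only imitates what a continuous query such as `SELECT user, COUNT(*) FROM clicks GROUP BY user` might emit as rows arrive one at a time. The names `clicks`, `continuous_count`, and `materialize` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def continuous_count(events):
    """Imitate a continuous GROUP BY count over a stream of user names.

    Yields a changelog of (op, user, count) tuples: "+" adds a result row,
    "-" retracts a previously emitted row that is now outdated.
    """
    counts = defaultdict(int)
    for user in events:
        if counts[user] > 0:
            # The earlier result for this user is stale, so retract it first.
            yield ("-", user, counts[user])
        counts[user] += 1
        yield ("+", user, counts[user])


def materialize(changelog):
    """Apply a changelog to an in-memory table, as a dashboard might."""
    table = {}
    for op, user, count in changelog:
        if op == "+":
            table[user] = count
        else:
            table.pop(user, None)
    return table


# Assumed sample input: three click events from two users.
clicks = ["alice", "bob", "alice"]
changelog = list(continuous_count(clicks))
result = materialize(changelog)
```

A query without aggregation (for example, a simple filter) would only ever emit "+" rows, which corresponds to an appending result; the retractions above are what make the grouped result an updating one.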
