Processing Data: Integrating Kafka with Apache Spark


“Flexible and intuitive, DataFrames are a popular data structure in data analytics. In this course, build Spark applications that use DataFrames to process data streamed to Kafka topics.
Begin by setting up a simple Spark app that streams in messages from a Kafka topic, processes and transforms them, and publishes them to an output sink. Next, leverage the Spark DataFrame application programming interface by performing selections, projections, and aggregations on data streamed in from Kafka, while also exploring the use of SQL queries for those transformations. Finally, you will perform windowing operations: both tumbling windows, where the windows do not overlap, and sliding windows, where consecutive windows overlap and share some of the data.”
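
The difference between tumbling and sliding windows can be illustrated with a small, pure-Python sketch. This is not Spark code and not taken from the course: the helper name `assign_windows` is hypothetical, timestamps are plain integer seconds, and window starts are assumed to align to multiples of the slide interval counted from zero. It only shows which windows a single event falls into.

```python
from typing import List, Tuple


def assign_windows(ts: int, duration: int, slide: int) -> List[Tuple[int, int]]:
    """Return every half-open [start, end) window that contains event time ts.

    Hypothetical helper for illustration only; all values are in seconds.
    A tumbling window is the special case slide == duration.
    """
    if duration <= 0 or slide <= 0 or slide > duration:
        raise ValueError("require 0 < slide <= duration")

    # Latest window start at or before ts, aligned to a multiple of slide.
    start = ts - (ts % slide)

    windows = []
    # Step back one slide at a time while the window still covers ts.
    while start > ts - duration:
        windows.append((start, start + duration))
        start -= slide
    return sorted(windows)


# Tumbling: 60-second windows that do not overlap, so one window per event.
tumbling = assign_windows(ts=125, duration=60, slide=60)

# Sliding: 60-second windows that start every 30 seconds, so windows overlap.
sliding = assign_windows(ts=125, duration=60, slide=30)

print(tumbling)  # [(120, 180)]
print(sliding)   # [(90, 150), (120, 180)]
```

With a tumbling window, the event at second 125 is counted once. With the sliding window it is counted in two overlapping windows, which is why aggregates over sliding windows can include the same record more than once.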