Flink Forward

One Sink to Rule Them All: Introducing the New Async Sink

Next time you want to integrate with a new destination for a demo, concept or production application, the Async Sink framework will bootstrap development, allowing you to move quickly without compromise. In Flink 1.15 we introduced the Async Sink base (FLIP-171), with the goal to encapsulate common logic and allow developers to focus on the key integration code. The new framework handles things like request batching, buffering records, applying backpressure, retry strategies, and at least once semantics. It allows you to focus on your business logic, rather than spending time integrating with your downstream consumers. During the session we will dive deep into the internals to uncover how it works, why it was designed this way, and how to use it. We will code up a new sink from scratch and demonstrate how to quickly push data to a destination. At the end of this talk you will be ready to start implementing your own Flink sink using the new Async Sink framework. ...

Build and run streaming applications with Apache Flink and Amazon Kinesis Data Analytics (FF)

Stream processing facilitates the collection, processing, and analysis of real-time data and enables the continuous generation of insights and quick reactions to emerging situations. Yet, despite these advantages compared to traditional batch-oriented analytics applications, streaming applications are much more challenging to operate. Some of these challenges include the ability to provide and maintain low end-to-end latency, to seamlessly recover from failure, and to deal with a varying amount of throughput. ...

Build a Real-time Stream Processing Pipeline with Apache Flink on AWS (FF)

The increasing number of available data sources in today’s application stacks created a demand to continuously capture and process data from various sources to quickly turn high volume streams of raw data into actionable insights. Apache Flink addresses many of the challenges faced in this domain as it’s specifically tailored to distributed computations over streams. While Flink provides all the necessary capabilities to process streaming data, provisioning and maintaining a Flink cluster still requires considerable effort and expertise. We will discuss how cloud services can remove most of the burden of running the clusters underlying your Flink jobs and explain how to build a real-time processing pipeline on top of AWS by integrating Flink with Amazon Kinesis and Amazon EMR. We will furthermore illustrate how to leverage the reliable, scalable, and elastic nature of the AWS cloud to effectively create and operate your real-time processing pipeline with little operational overhead. ...