Making it Easier to Build Connectors with Apache Flink: Introducing the Async Sink

· 302 words · 2 minute read

Apache Flink is a popular open source framework for stateful computations over data streams. It allows you to formulate queries that are continuously evaluated in near real time against an incoming stream of events. To persist derived insights from these queries in downstream systems, Apache Flink comes with a rich connector ecosystem that supports a wide range of sources and destinations. However, the existing connectors may not always be enough to support all conceivable use cases. Our customers and the community kept asking for more connectors and better integrations with various open source tools and services.

But that’s not an easy problem to solve. Creating and maintaining production-ready sinks for a new destination is a lot of work. For critical use cases, it’s undesirable to lose messages or to compromise on performance when writing into a destination. However, sinks have commonly been developed and maintained independently of each other. This further adds to the complexity and cost of adding sinks to Apache Flink, as more functionality had to be independently reimplemented and optimized for each sink.

To better support our customers and the entire Apache Flink community, we set out to make it easier and less time consuming to build and maintain sinks. We contributed the Async Sink to the Flink 1.15 release, which improved cloud interoperability and added more sink connectors and formats, among other updates. The Async Sink is an abstraction for building sinks with at-least-once semantics. Instead of reimplementing the same core functionality for every new sink that is created, the Async Sink provides common sink functionality that can be extended upon. In the remainder of this post, we’ll explain how the Async Sink works, how you can build a new sink based on the Async Sink, and discuss our plans to continue our contributions to Apache Flink.