Steffen's Blog
Field Engineer @ Materialize
At Materialize, we’ve built a data warehouse that runs on real-time data. Our customers use this real-time data to power critical business use cases, from fraud detection, to dynamic pricing, to loan underwriting.
To provide our customers with streaming data, we have first-class support for loading and unloading data via Apache Kafka, the de facto standard for moving real-time data. Because of the sensitivity of their data, our customers require strong encryption and authentication schemes at a minimum.
There are various strategies for securely connecting to Kafka clusters between different networks or over the public internet. Many cloud providers even offer endpoints that privately route traffic between networks and are not exposed to the internet. But, depending on your network setup and how you are running Kafka, these options … might not be an option!
In this session, we’ll discuss how you can use SSH bastions or a self-managed PrivateLink endpoint to establish connectivity to your Kafka clusters without exposing brokers directly to the internet.
Over time, deploying and running Kafka has become easier and easier. Today you can choose from a large ecosystem of managed offerings or just deploy to Kubernetes directly. But although you have plenty of options to optimize your Kafka configuration and choose infrastructure that matches your use case and budget, it’s not always easy to tell how these choices affect overall cluster performance.
In this session, we’ll take a look at Kafka performance from an infrastructure perspective.
This post is also available on the Materialize blog.
Materialize is a distributed SQL database built on streaming internals. With it, you can use the SQL you are already familiar with to build powerful stream processing capabilities. But as with any abstraction, the underlying implementation details sometimes leak through. Queries that look simple and innocent when you formulate them in SQL can require more resources than expected when evaluated incrementally against a continuous stream of arriving updates.
After more than 7.5 years, my time at AWS came to a close at the end of 2022. It’s been an incredible journey to learn and grow professionally.
I’m still surprised how much trust and support I’ve received over the years to focus on things I found important and impactful. Just last year, the work I started to improve the Apache Flink connector system was contributed back to the open source project, not only resulting in several blog posts and a session at Flink Forward, but also getting early adoption that led to support for new destinations that now integrate with Apache Flink.