How Materialize Unlocks Private Kafka Connectivity via PrivateLink and SSH

At Materialize, we’ve built a data warehouse that runs on real-time data. Our customers use this real-time data to power critical business use cases, from fraud detection, to dynamic pricing, to loan underwriting. To provide our customers with streaming data, we have first-class support for loading and unloading data via Apache Kafka, the de facto standard for transporting real-time data. Because of the sensitivity of their data, our customers require strong encryption and authentication schemes at a minimum....

June 10, 2024 · Steffen Hausmann

Navigating Private Network Connectivity Options for Kafka Clusters

There are various strategies for securely connecting to Kafka clusters between different networks or over the public internet. Many cloud providers even offer endpoints that privately route traffic between networks and are not exposed to the internet. But, depending on your network setup and how you are running Kafka, these options … might not be an option! In this session, we’ll discuss how you can use SSH bastions or a self-managed PrivateLink endpoint to establish connectivity to your Kafka clusters without exposing brokers directly to the internet....

March 20, 2024 · Steffen Hausmann

A Beginner’s Guide to Kafka Performance in Cloud Environments

Over time, deploying and running Kafka has become easier and easier. Today you can choose among a large ecosystem of managed offerings or just deploy to Kubernetes directly. But, although you have plenty of options to optimize your Kafka configuration and choose infrastructure that matches your use case and budget, it’s not always easy to tell how these choices affect overall cluster performance. In this session, we’ll take a look at Kafka performance from an infrastructure perspective....

May 16, 2023 · Steffen Hausmann

Best practices for right-sizing your Apache Kafka clusters to optimize performance and cost

Apache Kafka is well known for its performance and tunability to optimize for various use cases. But sometimes it can be challenging to find the right infrastructure configuration that meets your specific performance requirements while minimizing the infrastructure cost. This post explains how the underlying infrastructure affects Apache Kafka performance. We discuss strategies on how to size your clusters to meet your throughput, availability, and latency requirements. Along the way, we answer questions like “when does it make sense to scale up vs....

March 14, 2022 · Steffen Hausmann

Performance Testing Framework for Apache Kafka

The tool is designed to evaluate the maximum throughput of a cluster and compare the put latency of different broker, producer, and consumer configurations. To run a test, you specify the parameters to be tested, and the tool iterates through all combinations of those parameters, producing a graph similar to the one below. https://github.com/aws-samples/performance-testing-framework-for-apache-kafka/

March 7, 2022 · Steffen Hausmann
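
The parameter sweep described for the performance testing framework above (specify several values per parameter, then test every combination) can be illustrated with a minimal Python sketch. This is not the framework's code or its configuration format. The parameter names, the values, and the `expand_grid` helper are hypothetical and only show how a grid of options expands into individual test cases.

```python
from itertools import product

# Hypothetical parameter grid: names and values are illustrative only and
# are not taken from the framework's actual configuration format.
parameter_grid = {
    "num_producers": [1, 6],
    "num_partitions": [36, 72],
    "acks": ["1", "all"],
}


def expand_grid(grid):
    """Return one dict per combination of the parameter values in `grid`."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    # product() yields the Cartesian product of all value lists.
    return [dict(zip(names, values)) for values in product(*value_lists)]


test_cases = expand_grid(parameter_grid)
for case in test_cases:
    print(case)
```

The number of test cases is the product of the number of values per parameter, so each extra parameter or value multiplies the total run time of a sweep.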