Build a Unified Batch and Stream Processing Pipeline with Apache Beam on AWS

In this workshop, we explore an end-to-end example that combines batch and streaming aspects in a single, uniform Beam pipeline. We start by analyzing incoming taxi trip events in near real time with an Apache Beam pipeline. We then show how to archive the trip data to Amazon S3 for long-term storage. We subsequently explain how to read the historic data from S3 and backfill new metrics by executing the same Beam pipeline in a batch fashion....
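To give a rough flavor of the unified approach, the sketch below applies one shared Beam transform to a bounded S3 source for the batch backfill; in the streaming case, a Kinesis source would take the place of the S3 input. The bucket paths, the CSV layout of a trip event, and the class names are hypothetical placeholders, not the workshop's actual code.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.transforms.PTransform;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TypeDescriptors;
import org.joda.time.Duration;

public class UnifiedTripMetrics {

  // The same transform works on bounded (S3 backfill) and unbounded (Kinesis) input.
  static class TripsPerBorough
      extends PTransform<PCollection<String>, PCollection<KV<String, Long>>> {
    @Override
    public PCollection<KV<String, Long>> expand(PCollection<String> trips) {
      return trips
          .apply(Window.<String>into(FixedWindows.of(Duration.standardMinutes(5))))
          .apply(MapElements.into(TypeDescriptors.strings())
              // hypothetical: the pickup borough is the second CSV column
              .via((String line) -> line.split(",")[1]))
          .apply(Count.perElement());
    }
  }

  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    p.apply("ReadArchivedTrips", TextIO.read().from("s3://trip-archive-bucket/trips/*"))
        // for near real time, swap in a streaming source such as KinesisIO.read()
        .apply(new TripsPerBorough())
        .apply(MapElements.into(TypeDescriptors.strings())
            .via((KV<String, Long> kv) -> kv.getKey() + "," + kv.getValue()))
        .apply(TextIO.write()
            .to("s3://trip-archive-bucket/metrics/borough-counts")
            .withWindowedWrites()
            .withNumShards(1));

    p.run();
  }
}
```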

August 26, 2020 · Steffen Hausmann

Build real-time analytics for a ride-sharing app (ANT401)

In this session, we walk through how to perform real-time analytics on ride-sharing and taxi data, and we explore how to build a reliable, scalable, and highly available streaming architecture based on managed services. You learn how to deploy, operate, and scale an Apache Flink application with Amazon Kinesis Data Analytics for Java Applications. You leave this workshop knowing how to build an end-to-end streaming analytics pipeline: ingesting data into a Kinesis data stream, writing and deploying a Flink application to perform basic stream transformations and aggregations, and persisting the results to Amazon Elasticsearch Service for visualization in Kibana....
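For a flavor of the kind of Flink application the session covers, here is a minimal sketch that consumes a Kinesis data stream and counts trip events per minute. The stream name, region, and the `print()` stand-in for the Elasticsearch sink are assumptions for illustration, not the workshop code.

```java
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer;
import org.apache.flink.streaming.connectors.kinesis.config.AWSConfigConstants;
import org.apache.flink.streaming.connectors.kinesis.config.ConsumerConfigConstants;

public class TripCountJob {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    // hypothetical stream name and region
    Properties consumerConfig = new Properties();
    consumerConfig.setProperty(AWSConfigConstants.AWS_REGION, "us-east-1");
    consumerConfig.setProperty(ConsumerConfigConstants.STREAM_INITIAL_POSITION, "LATEST");

    env
        .addSource(new FlinkKinesisConsumer<>(
            "taxi-trip-events", new SimpleStringSchema(), consumerConfig))
        // a basic aggregation: number of trip events per one-minute window
        .map(event -> 1L)
        .returns(Types.LONG)
        .windowAll(TumblingProcessingTimeWindows.of(Time.minutes(1)))
        .reduce(Long::sum)
        // in the workshop the results are persisted to Amazon Elasticsearch Service;
        // print() stands in for that sink here
        .print();

    env.execute("trip-count");
  }
}
```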

December 2, 2019 · Steffen Hausmann

Streaming Analytics Workshop

In this workshop, you will build an end-to-end streaming architecture to ingest, analyze, and visualize streaming data in near real time. You set out to improve the operations of a taxi company in New York City by analyzing the telemetry data of its fleet in near real time. You will not only learn how to deploy, operate, and scale an Apache Flink application with Kinesis Data Analytics for Java Applications, but also explore the basic concepts of Apache Flink and of running Flink applications in a fully managed environment on AWS....
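To sketch the ingestion step, the snippet below puts a single telemetry record into a Kinesis data stream with the AWS SDK for Java. The stream name, region, and JSON payload are hypothetical placeholders and stand in for the workshop's actual producer tooling.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

import com.amazonaws.regions.Regions;
import com.amazonaws.services.kinesis.AmazonKinesis;
import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder;
import com.amazonaws.services.kinesis.model.PutRecordRequest;

public class TelemetryProducer {
  public static void main(String[] args) {
    AmazonKinesis kinesis = AmazonKinesisClientBuilder.standard()
        .withRegion(Regions.US_EAST_1)
        .build();

    // hypothetical telemetry event for one taxi
    String event = "{\"taxiId\":\"1234\",\"latitude\":40.75,\"longitude\":-73.99}";

    // Records with the same partition key land on the same shard, which keeps
    // the events of a single taxi in order.
    kinesis.putRecord(new PutRecordRequest()
        .withStreamName("taxi-telemetry")
        .withPartitionKey("1234")
        .withData(ByteBuffer.wrap(event.getBytes(StandardCharsets.UTF_8))));
  }
}
```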

June 20, 2019 · Steffen Hausmann