In this talk I present the results of a set of experiments comparing the performance of several implementations of time-series data aggregation. There are three implementations: a baseline that uses no streaming framework, one using Apache Flink, and one using Apache Spark Streaming*. All three ran against the same Kafka cluster and consumed the same data stream, with the goal of understanding the limitations of each. The limitations were measured at three input data rates: 100%, 6000%, and breaking-point load.
Day 2 - September 13th
11:00 - 11:40