Apache Kafka is a powerful publish-subscribe messaging service that delivers high volumes of messages across ad hoc topics to subscribers, with message durability for offline consumers. As with any technology, Kafka comes with trade-offs, in this case ones that influence consumer behavior, particularly at scale. Likewise, throughput when writing to durable storage can be improved by distilling data storage operations into commutative, idempotent sets of operations.
This talk will detail Urban Airship's experience using Kafka to process billions of messages per day. The talk will begin with an in-depth look at Kafka's core design concepts and how they influence, both positively and negatively, the nuances of writing consumers. Beyond consumers, this talk will detail how Urban Airship leverages the strengths of different storage engines (Cassandra, HBase and in-house solutions) across a common consumer infrastructure for disparate goals including near real-time message routing, analytics and system measurement.
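
As a rough illustration of what commutative, idempotent storage operations can look like, the sketch below merges messages into per-key sets of message IDs. It is not Urban Airship's implementation: the message shape (key, message ID) and the function names apply_message and replay are hypothetical, chosen only for this example. Because adding to a set is both idempotent and order-independent, a consumer that redelivers or reorders messages ends up with the same stored state.

```python
from itertools import permutations


def apply_message(state, message):
    """Return a new state with one message merged in.

    The message shape (key, message_id) is an assumption for this sketch.
    Adding an ID to a set is idempotent (duplicates change nothing) and
    commutative (the order of additions does not matter).
    """
    key, message_id = message
    new_state = {k: set(v) for k, v in state.items()}
    new_state.setdefault(key, set()).add(message_id)
    return new_state


def replay(messages):
    """Fold a sequence of messages into a state, starting from empty."""
    state = {}
    for message in messages:
        state = apply_message(state, message)
    return state


messages = [("device-a", 1), ("device-b", 2), ("device-a", 3)]
baseline = replay(messages)

# Redelivering every message (as an at-least-once consumer might) is harmless.
duplicated = replay(messages + messages)

# Every ordering of the same messages yields the same state.
reordered = [replay(list(p)) for p in permutations(messages)]
```

With writes shaped this way, a consumer can retry or replay a batch after a failure without coordinating with the storage layer, which is one reason such operations tend to help write throughput.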