Kafka

Building LinkedIn's Real-Time Data Pipeline

At the core of many of LinkedIn's analytics applications is a real-time data pipeline built on top of Apache Kafka. This system handles over 10 billion...

0 replies - 1628 views - 02/08/13 by Mitch Pronschinske in Articles

Intra-Cluster Replication in Apache Kafka

Originally authored by Jun Rao. Jun will present at this year's ApacheCon North America. Kafka is a distributed publish-subscribe messaging system. It...

0 replies - 2423 views - 02/07/13 by Mitch Pronschinske in Articles

VMware's RabbitMQ vs. LinkedIn's Kafka

Although quantitative, data-driven comparisons like benchmarks are the most fundamental comparisons we have in technology, it can often be more helpful to get...

0 replies - 4838 views - 01/16/13 by Mitch Pronschinske in Articles

Kafka Possibly Moving to Java CRC, Akka Suggested

If you haven't heard of Kafka, a super-fast distributed message queue from LinkedIn, you probably ought to look into it, and its heavily involved Apache...

0 replies - 4646 views - 12/14/12 by Mitch Pronschinske in Articles

A Big Data Quadfecta: (Cassandra + Storm + Kafka) + ElasticSearch

In my previous post, I discussed our BigData Trifecta, which includes Storm, Kafka and Cassandra. Kafka played the role of our work/data queue. ...

0 replies - 4881 views - 11/12/12 by Brian O'Neill in Articles

Takeaways from the Kafka Talk at AirBnB: the Power of Structured Data and the Myth of “Exactly Once”

Last night, I attended Jay Kreps’s talk on Apache Kafka at AirBnB. Jay is a Principal Engineer at LinkedIn and is one of the original authors of...

2 replies - 3777 views - 09/03/12 by Sadayuki Furuhashi in Articles