Alec is a Content Curator at DZone. He lives in Raleigh and spends his free time writing and programming. Alec is a DZone Zone Leader and has posted 505 posts at DZone.

The Best of the Week: Big Data Zone

Make sure you didn't miss anything with this list of the Best of the Week in the Big Data Zone (Oct. 11 to Oct. 17). Here they are, in order of popularity:

1. Big Data Analytics Beyond Hadoop

This article outlines the need to look beyond Hadoop for some big data analytics. From a batch analytics perspective, Spark is ideal for iterative machine learning algorithms. From a real-time analytics perspective, Storm is preferable. From a specialized data structures perspective, GraphLab is an ideal paradigm for processing large graphs.

2. Bayesian Modeling for the Perfect Pizza

How do you check if your pizza’s done? You look at it. But what if you’re mass producing pizza? In "Multivariate Bayesian cognitive modeling for unsupervised quality control of baked pizzas," the authors propose using computers specifically trained to look at a pizza and say, “yup, perfect! Done.”

3. The Rise of Big Data

While helping a MongoDB user with a sharding issue (his chunks weren't splitting), the author learned an important lesson about big data and tactfulness.

4. Apache Releases Hadoop 2.0: MapReduce, YARN and Big Changes for Big Data

Hadoop 2.0 is here, and with it come some big changes. The most notable, as detailed by a recent article from InfoWorld, is MapReduce 2.0, which is now incorporated into a larger system called YARN (Yet Another Resource Negotiator). Take a look and see what may be in store for Big Data. 

5. Detecting Reddit Voting Rings Using This Weird Little Data Trick

The HyperLogLog counter is a probabilistic data structure that estimates the number of unique elements in a list. What if we created a HyperLogLog counter for each Reddit user and, for every upvote, updated the corresponding counter with the voter's ID? Given this setup, here’s how we detect the voting ring.
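To make the setup concrete, here is a minimal Python sketch of a HyperLogLog counter with one counter kept per Reddit user. It is an illustration under stated assumptions, not the linked article's implementation: the class, the `record_upvote` helper, the register count (2^12), and the SHA-1 based hash are all hypothetical choices made for this example. It only covers estimating each user's unique voters; the ring-detection step itself is left to the original article.

```python
import hashlib
import math
from collections import defaultdict


class HyperLogLog:
    """Illustrative HyperLogLog: estimates how many distinct items were added."""

    def __init__(self, p=12):
        self.p = p                      # number of hash bits used as register index
        self.m = 1 << p                 # number of registers (4096 by default)
        self.registers = [0] * self.m
        # Standard bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def _hash64(self, item):
        # Assumption: the first 8 bytes of SHA-1 serve as a 64-bit hash.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, item):
        x = self._hash64(item)
        width = 64 - self.p
        index = x >> width                   # top p bits pick a register
        rest = x & ((1 << width) - 1)        # remaining bits
        # Rank = 1-based position of the leftmost 1 bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self):
        # Raw estimate from the harmonic mean of 2^register values.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        # Small-range correction: fall back to linear counting.
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return estimate


# One counter per Reddit user, updated with the voter's ID on every upvote.
voters_by_user = defaultdict(HyperLogLog)


def record_upvote(author, voter_id):
    """Hypothetical helper: note that voter_id upvoted a post by author."""
    voters_by_user[author].add(voter_id)


if __name__ == "__main__":
    for voter in range(1000):
        for _ in range(3):              # repeat votes do not inflate the count
            record_upvote("alice", f"voter-{voter}")
    print(round(voters_by_user["alice"].count()))   # close to 1000
```

Each counter uses a fixed amount of memory (one small integer per register) no matter how many upvotes it sees, which is what makes keeping a separate counter for every user practical. The estimate is approximate; with 4096 registers the typical relative error is on the order of a couple of percent.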