

Google I/O: Dumping MapReduce


Almost ten years ago, Google changed the Big Data world when it released a paper on MapReduce, a model for working through immense amounts of data that gained immediate traction in the community and remained the norm for a decade. At its developer conference on Wednesday, however, Google followed a growing trend by dropping MapReduce in favor of what it calls Google Cloud Dataflow.

Forbes writes,

The company realizes that time has moved on however and there is a need for a broader tool that allows for ingestion, transformation and analysis of data in ways that cover both streaming data and more traditional batch data processing. 

The service is also a response to an exponentially growing volume of data to sift through. Ten years ago, social media wasn't as abundant, prolific or widely used, wearable tech was barely a thought, and traditional statistical methods for digging through data were more than enough to provide valuable insights. As Wired writes,

Long ago, with a sweeping software system called MapReduce, Google set the standard for processing “big data.” A tool that ran across hundreds of servers, MapReduce is what the company used to build the enormous index of webpages that underpins its search engine. Thanks to an open source clone of MapReduce–Hadoop–the rest of the world now crunches data in similar ways. But Hölzle says that Google no longer uses MapReduce. It now uses Flume, aka FlumeJava, for this kind of massive “batch processing.”
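
For readers who haven't worked with the model, here is a minimal single-machine sketch of the MapReduce idea in Python, using the classic word count. It is only an illustration: the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for this example and are not Google's or Hadoop's API, and a real system would spread each phase across many servers.

```python
from collections import defaultdict


def map_phase(documents):
    """Map step: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: collapse each key's values into a single result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    """Run the three phases in order on an in-memory list of documents."""
    return reduce_phase(shuffle_phase(map_phase(documents)))


if __name__ == "__main__":
    # Hypothetical sample input, standing in for a crawl of web pages.
    pages = ["big data big index", "data pipelines"]
    print(word_count(pages))
```

The map and reduce steps are independent per document and per key, which is what lets a cluster run them in parallel.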

The new service will allow Google to build more complex data pipelines; it will also integrate with MillWheel stream processing. From VentureBeat,

It can either run a series of computing jobs, batch-style, or do constant work as data flows in. Engineers can start using the service in Google’s burgeoning public cloud. Google takes care of managing the thing.
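
The batch-versus-streaming distinction can be sketched in a few lines of Python. This is not the Cloud Dataflow API, just a hedged illustration with hypothetical function names (count_batch, count_stream): the same counting logic is applied once to a complete data set, or incrementally as each event arrives.

```python
from collections import Counter


def count_batch(events):
    """Batch style: all the data is available, so answer once at the end."""
    return Counter(events)


def count_stream(events):
    """Streaming style: yield an updated snapshot after every event."""
    running = Counter()
    for event in events:
        running[event] += 1
        yield dict(running)


if __name__ == "__main__":
    # Hypothetical event feed; in practice this could be an unbounded source.
    clicks = ["home", "search", "home"]
    print(count_batch(clicks))
    for snapshot in count_stream(iter(clicks)):
        print(snapshot)
```

Once the stream has delivered every event, its last snapshot matches the batch answer. The difference is that the streaming version has useful partial results the whole time.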

In other Google I/O coverage, check out the 5.0 news or an update on the Internet of Things.

Published at DZone with permission of its author, Whitney Baker.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)


Adeyemi Oyekan replied on Thu, 2014/06/26 - 9:58am

10 yrs to get big bizness on big streams ir clouds dont even exist again imagine a telnet cellphone n pentium 2
