MapReduce

Scala and Hadoop: Hand in Hand at Twitter

If you have read the paper published by Google’s Jeffrey Dean and Sanjay Ghemawat (MapReduce: Simplified Data Processing on Large Clusters), they revealed...

0 replies - 9020 views - 08/05/12 by Istvan Szegedi in Articles

Introducing a Simple PaaS Built on Hadoop YARN

This post describes a prototype implementation of a simple PaaS built on the Hadoop YARN framework and the key findings from the experiment. While there...

0 replies - 5090 views - 07/31/12 by Jaigak Song in Articles

Building Your First MongoDB App in Ruby: An OSCON 2012 Tutorial

This is a 3-hour tutorial I wrote for and gave at OSCON 2012. Here is the summary: This tutorial will introduce the features of MongoDB by building a...

0 replies - 5732 views - 07/29/12 by Steve Francia in Articles

On the Future of Hadoop: Hadoop 2.0, NextGen MapReduce (YARN), and more...

Episode #8 of the Podcast is a talk with Arun C. Murthy. We talked about Hortonworks HDP1, the first release from Hortonworks, Apache Hadoop 2.0, NextGen...

0 replies - 5279 views - 07/25/12 by Joe Stein in Articles

Use Cassandra to Run Hadoop MapReduce

So if you are looking for a good NoSQL read of HBase vs. Cassandra you can check out...

1 reply - 7795 views - 07/16/12 by Joe Stein in Articles

Getting to Know Amazon Elastic MapReduce

Amazon Elastic MapReduce is a service in the AWS portfolio that can be used for data processing and analytics on vast amounts of data. It is based on...

0 replies - 3058 views - 07/16/12 by Istvan Szegedi in Articles

Using AWS Elastic MapReduce Results with Mobile BI Analytics

So far we covered server-side/cloud components – how to process data with MapReduce running in the cloud or on our own Hadoop cluster. This time it is...

0 replies - 3193 views - 07/16/12 by Istvan Szegedi in Articles

A Brief Introduction to Riak, from Clusters to Links to MapReduce

While researching NoSQL databases recently, I stumbled upon Riak, found myself intrigued, and decided to dive a little deeper. The quick and dirty: Riak is...

0 replies - 4527 views - 07/12/12 by Scott Leberknight in Articles

Creating .NET-based Mappers and Reducers for Hadoop with JNBridgePro

This post was originally authored by Wayne Citrin on the JNBridge Labs page. The Apache Hadoop framework enables distributed processing of very large...

0 replies - 4903 views - 07/12/12 by Mitch Pronschinske in Articles

Hadoop Hangover: MapReduce2 or YARN?

Well, we have seen new versions and new releases in software as days roll by. No wonder! The same is the case with Hadoop! In the past few months we saw new releases on...

0 replies - 3708 views - 07/05/12 by Swathi Venkatachala in Articles

Hadoop on Azure: Hive and Amazon Elastic MapReduce

Note: This post is the second half of my recent Executing an Elastic MapReduce Hive Workflow from the AWS Management Console article with a slightly modified...

0 replies - 4110 views - 06/21/12 by Roger Jennings in Articles

Getting MapReduce Working on MongoDB

MapReduce is a software framework introduced by Google in 2004 to support distributed computing on large data sets on clusters of computers. You can read...

0 replies - 4463 views - 05/10/12 by Prabath Siriwardena in Articles

MongoDB, Hadoop, and Humongous Data

Learn how to integrate MongoDB with Hadoop for large-scale distributed data processing. Using Hadoop’s MapReduce and Streaming you will learn how to do...

0 replies - 4222 views - 05/08/12 by Steve Francia in Articles

Effective Testing Strategies for MapReduce Applications

In this article I demonstrate various strategies that I have used to test Hadoop...

0 replies - 5783 views - 05/02/12 by Tim Reardon in Articles

Amazon EMR Tutorial: Running a Hadoop Job Using Custom JAR

Introduction: Amazon EMR is a web service which can be used to easily and efficiently process enormous amounts of data. It uses a hosted Hadoop framework running...

0 replies - 9700 views - 04/23/12 by Muhammad Khojaye in Articles