Hadoop

Scala and Hadoop: Hand in Hand at Twitter

If you have read the paper published by Google’s Jeffrey Dean and Sanjay Ghemawat (MapReduce: Simplified Data Processing on Large Clusters), they revealed...

0 replies - 8992 views - 08/05/12 by Istvan Szegedi in Articles

Hurdles to Your First Hadoop Cluster

Yesterday we were working on setting up our first Hadoop cluster. Though there is plenty of online documentation on this, we still faced a few challenges...

0 replies - 3376 views - 08/02/12 by Abhishek Jain in Articles

Heroku Users Can Now Access Treasure Data Analytics Platform

We are pleased to announce that Treasure Data Heroku Add-on is now GA on Heroku! Treasure Data Hadoop Add-on lets Heroku users access our cloud-based...

0 replies - 2882 views - 08/01/12 by Sadayuki Furuhashi in Articles

Introducing a Simple PaaS Built on Hadoop YARN

This post describes a prototype implementation of a simple PAAS built on the Hadoop YARN framework and the key findings from the experiment. While there...

0 replies - 5071 views - 07/31/12 by Jaigak Song in Articles

How I Got Twitter Data Onto SQL Server

I’ve been looking at how it might be possible to bring data from Twitter into SQL Server. You might ask, why? Well, why not? It’s more an exercise...

1 reply - 4168 views - 07/25/12 by Nick Haslam in Articles

On the Future of Hadoop: Hadoop 2.0, NextGen MapReduce (YARN), and more...

Episode #8 of the Podcast is a talk with Arun C. Murthy. We talked about Hortonworks HDP1, the first release from Hortonworks, Apache Hadoop 2.0, NextGen...

0 replies - 5264 views - 07/25/12 by Joe Stein in Articles

Faster Datanodes With Less Wait IO in Hadoop

I have often noticed that the check Hadoop uses to calculate usage for the data nodes causes a fair amount of wait IO on them, driving up load. Every cycle...

0 replies - 2873 views - 07/23/12 by Joe Stein in Articles

Spring Data - Apache Hadoop

Spring for Apache Hadoop is a Spring project to support writing applications that can benefit from the integration of the Spring Framework and Hadoop. This...

0 replies - 4210 views - 07/18/12 by Istvan Szegedi in Articles

NoSQL HBase and Hadoop with Todd Lipcon from Cloudera

Episode #6 of the Podcast is a talk with Todd Lipcon from Cloudera discussing HBase. We talked about NoSQL and how it should stand for “Not Only SQL” and...

0 replies - 2241 views - 07/17/12 by Joe Stein in Articles

Book Review: Data Analysis with Open Source Tools

Before I get to the book review, I wanted to mention a basic note about book reviews. In the past I have reviewed books in a less than traditional manner,...

0 replies - 3808 views - 07/17/12 by Robert Diana in Articles

Use Cassandra to Run Hadoop MapReduce

So if you are looking for a good NoSQL read of HBase vs. Cassandra you can check out...

1 reply - 7762 views - 07/16/12 by Joe Stein in Articles

Working with HBase and Hadoop

HBase is a NoSQL database. It is based on Google’s Bigtable distributed storage system, as described in Google’s research paper: “A Bigtable is a...

0 replies - 7240 views - 07/14/12 by Istvan Szegedi in Articles

Creating .NET-based Mappers and Reducers for Hadoop with JNBridgePro

This post was originally authored by Wayne Citrin on the JNBridge Labs page. The Apache Hadoop framework enables distributed processing of very large...

0 replies - 4893 views - 07/12/12 by Mitch Pronschinske in Articles

GigaOm: 'Hadoop's days are numbered…' Are they?

Interesting article at GigaOm: http://bit.ly/OINpfr I won’t repeat the main points, but basically it says that since Hadoop is disk/ETL/batch-based it...

3 replies - 6465 views - 07/12/12 by Nikita Ivanov in Articles

Hadoop: My Experience with Cloudera and MapR

A few months back we embarked on a new Hadoop cluster at Medialets. We have been live with Hadoop in production since April 2010 and we are still...

0 replies - 8071 views - 07/11/12 by Joe Stein in Articles