
A little bit late this week (busy weekend), and as usual, a lot of interesting posts here and there: “What happened to...
0 replies - 2399 views - 02/16/13 by Arthur Charpentier in Articles

We love to work with the latest big data technologies. Currently, our favorite open-source tools are the NoSQL database MongoDB,...
0 replies - 2756 views - 02/15/13 by Daniel Bartl in Articles

Note: This tutorial comes from guest writer Mark Grover. Enjoy.
Introduction
If you have delved into Apache Hadoop and related projects, you know that...
1 reply - 5482 views - 02/15/13 by Eric Gregory in Articles

Note: This interview appears courtesy of Justin Kestelyn and Kathleen Ting.
In this installment of “Meet the Engineer”, get to know Customer Operations...
0 replies - 1135 views - 02/15/13 by Eric Gregory in Articles

This twenty-minute tutorial from Dan Morrill explains a simple Pig script line by line. Concise and useful.
0 replies - 1482 views - 02/15/13 by Eric Gregory in Articles

This Friday in Montréal, I will give a crash course entitled Econometric Modeling in Finance and Insurance with the R Language....
0 replies - 1706 views - 02/14/13 by Arthur Charpentier in Articles

Where Did The Term "Big Data" Come From
The NYTimes did some deep investigating into the etymological origins of the biggest buzzword in IT right now. A...
0 replies - 2741 views - 02/14/13 by Mitch Pronschinske in Articles

It is common to fit a model using training data, and then to evaluate its performance on a test data set. When the data are time series,...
0 replies - 414 views - 02/14/13 by Rob J Hyndman in Articles

These days there is a lot of hype around jargon like Hadoop, HBase, Hive, Pig and BigData. I was itching to learn what these terms...
0 replies - 619 views - 02/14/13 by Krishna Prasad in Articles

A couple of weeks ago I wrote about some feature extraction work that I’d done on the Kaggle Digit Recognizer data set and having...
0 replies - 932 views - 02/13/13 by Mark Needham in Articles

I have adapted the WordCount tutorial to Maven-based development, as this is probably the most popular way to develop in companies. I am not going to...
0 replies - 1227 views - 02/13/13 by Armel Nene in Articles

Over at 37signals’ Signal v. Noise, Noah Lorang recently cautioned against the recent infatuation with “creative” visualization techniques. He...
0 replies - 746 views - 02/13/13 by Sadayuki Furuhashi in Articles

I am currently building a Machine Learning system. In this blog I want to capture the elements of a machine learning system. My definition...
0 replies - 684 views - 02/13/13 by Krishna Prasad in Articles

This post is about how to launch a CDH4 MRv1 or CDH4 YARN cluster on EC2 instances. It's said that you can launch a cluster with the help of Whirr and in a...
0 replies - 1385 views - 02/12/13 by Swathi Venkatachala in Articles

Yesterday, I was uploading some old posts to complete the migration (I go back to my old posts, one by one, to check links of pictures, reformatting R code,...
0 replies - 802 views - 02/12/13 by Arthur Charpentier in Articles