Big Data

Lego, Causal Modeling, R and the Web, and More Data Links

A little bit late this week (busy weekend), and as usual, a lot of interesting posts here and there: “What happened to...

0 replies - 2399 views - 02/16/13 by Arthur Charpentier in Articles

Hadoop, MongoDB, and R Community Linksheet

We love to work with the latest big data technologies. Currently, our favorite open-source tools are the NoSQL database MongoDB,...

0 replies - 2756 views - 02/15/13 by Daniel Bartl in Articles

Getting Hadoop, Hive and HBase Up and Running in Less than 15 Minutes

Note: This tutorial comes from guest writer Mark Grover. Enjoy. Introduction: If you have delved into Apache Hadoop and related projects, you know that...

1 reply - 5482 views - 02/15/13 by Eric Gregory in Articles

Meet the Engineer: Kathleen Ting

Note: This interview appears courtesy of Justin Kestelyn and Kathleen Ting. In this installment of “Meet the Engineer”, get to know Customer Operations...

0 replies - 1135 views - 02/15/13 by Eric Gregory in Articles

This Quick Pig Overview Brings You Up to Speed Line by Line

This twenty-minute tutorial from Dan Morrill explains a simple Pig script line by line. Concise and useful.

0 replies - 1482 views - 02/15/13 by Eric Gregory in Articles

Crash Course on R for Financial and Actuarial Econometrics

This Friday in Montréal, I will give a crash course entitled Econometric Modeling in Finance and Insurance with the R Language....

0 replies - 1706 views - 02/14/13 by Arthur Charpentier in Articles

DZone Links You Don't Want To Miss (2/14/13)

Where Did The Term "Big Data" Come From? The NYTimes did some deep investigating into the etymological origins of the biggest buzzword in IT right now. A...

0 replies - 2741 views - 02/14/13 by Mitch Pronschinske in Articles

Out-of-Sample One Step Forecasts

It is common to fit a model using training data, and then to evaluate its performance on a test data set. When the data are time series,...

0 replies - 414 views - 02/14/13 by Rob J Hyndman in Articles

SpringData-Hadoop: Jumpstart Hadoop with Spring

These days there is a lot of hype around jargon like Hadoop, HBase, Hive, Pig and BigData. I was itching to learn what are these terms...

0 replies - 619 views - 02/14/13 by Krishna Prasad in Articles

Feature Extraction/Selection – What I’ve Learnt So Far

A couple of weeks ago I wrote about some feature extraction work that I’d done on the Kaggle Digit Recognizer data set and having...

0 replies - 932 views - 02/13/13 by Mark Needham in Articles

Hadoop Developer - WordCount tutorial using Maven and NetBeans 7.3RC2

I have adapted the WordCount tutorial to Maven-based development, as this is probably the most popular way to develop in companies. I am not going to...

0 replies - 1227 views - 02/13/13 by Armel Nene in Articles

The Safe Triumvirate of Visualization, In-Faux-Graphics, and All That

Over at 37signals’ Signal v. Noise, Noah Lorang recently cautioned against the infatuation with “creative” visualization techniques. He...

0 replies - 746 views - 02/13/13 by Sadayuki Furuhashi in Articles

GATE, NLTK: Basic Components of a Machine Learning System

I am currently building a Machine Learning system. In this blog post I want to capture the elements of a machine learning system. My definition...

0 replies - 684 views - 02/13/13 by Krishna Prasad in Articles

Hadoop Hangover: Launch a Hadoop Cluster CDH4 Using Apache Whirr

This post is about how to launch a CDH4 MRv1 or CDH4 YARN cluster on EC2 instances. It's said that you can launch a cluster with the help of Whirr and in a...

0 replies - 1385 views - 02/12/13 by Swathi Venkatachala in Articles

Pills, Half Pills, and Probabilities

Yesterday, I was uploading some old posts to complete the migration (I go back to my old posts, one by one, to check picture links, reformatting R code,...

0 replies - 802 views - 02/12/13 by Arthur Charpentier in Articles