Big Data

Crash Course on R for Financial and Actuarial Econometrics

This Friday, I will give a crash course in Montréal entitled Econometric Modeling in Finance and Insurance with the R Language...

0 replies - 1780 views - 02/14/13 by Arthur Charpentier in Articles

DZone Links You Don't Want To Miss (2/14/13)

Where Did The Term "Big Data" Come From? The NYTimes did some deep investigating into the etymological origins of the biggest buzzword in IT right now. A...

0 replies - 2845 views - 02/14/13 by Mitch Pronschinske in Articles

Out-of-Sample One Step Forecasts

It is common to fit a model using training data, and then to evaluate its performance on a test data set. When the data are time series,...

0 replies - 458 views - 02/14/13 by Rob J Hyndman in Articles

SpringData-Hadoop: Jumpstart Hadoop with Spring

These days there is a lot of hype around jargon like Hadoop, HBase, Hive, Pig and BigData. I was itching to learn what these terms...

0 replies - 690 views - 02/14/13 by Krishna Prasad in Articles

Feature Extraction/Selection – What I’ve Learnt So Far

A couple of weeks ago I wrote about some feature extraction work that I’d done on the Kaggle Digit Recognizer data set and having...

0 replies - 981 views - 02/13/13 by Mark Needham in Articles

Hadoop Developer - WordCount tutorial using Maven and NetBeans 7.3RC2

I have adapted the WordCount tutorial to Maven-based development, as this is probably the most popular way to develop in companies. I am not going to...

0 replies - 1362 views - 02/13/13 by Armel Nene in Articles

The Safe Triumvirate of Visualization, In-Faux-Graphics, and All That

Over at 37signals’ Signal v. Noise, Noah Lorang recently cautioned against the current infatuation with “creative” visualization techniques. He...

0 replies - 788 views - 02/13/13 by Sadayuki Furuhashi in Articles

GATE, NLTK: Basic Components of a Machine Learning System

I am currently building a Machine Learning system. In this blog I want to capture the elements of a machine learning system. My definition...

0 replies - 794 views - 02/13/13 by Krishna Prasad in Articles

Hadoop Hangover: Launch a Hadoop Cluster CDH4 Using Apache Whirr

This post is about how to launch a CDH4 MRv1 or CDH4 Yarn cluster on EC2 instances. It's said that you can launch a cluster with the help of Whirr and in a...

0 replies - 1516 views - 02/12/13 by Swathi Venkatachala in Articles

Pills, Half Pills, and Probabilities

Yesterday, I was uploading some old posts to complete the migration (I go back to my old posts, one by one, to check picture links, reformatting R code,...

0 replies - 828 views - 02/12/13 by Arthur Charpentier in Articles

Visualization, Modeling, and Surprises

This afternoon Hadley Wickham gave a great talk on data analysis. Here’s a paraphrase of something profound he said. Visualization can surprise you,...

0 replies - 2025 views - 02/11/13 by John Cook in Articles

The Most Important Concept in IT and Other Data Links of the Week

A nice post on statistics and code that I read this week: “Statisticians and computer scientists – if there is no code, there is no...

0 replies - 1851 views - 02/11/13 by Arthur Charpentier in Articles

LZOP Decompression - Revenge of the Useless Cat

For me, LZOP is the ubiquitous compression codec when working with large text files in HDFS, due to its MapReduce data locality advantages. As a result, when I...

0 replies - 1368 views - 02/11/13 by Alex Holmes in Articles

Google on Datastore Query, Index, and Transactions

This Google Developers tutorial delves into querying, indexing, and transactions with App Engine's Datastore service, driven by Google Bigtable: In this...

0 replies - 1632 views - 02/11/13 by Eric Gregory in Articles

Using R — .Call(“hello”)

In an introductory post on R APIs to C code, Calling C Code ‘Hello World!’, we explored the .C() function with some ‘Hello World!’ baby...

0 replies - 1306 views - 02/10/13 by Jonathan Callahan in Articles