Big Data

A Gentle Introduction to JBatch

As Java EE 7 rolls forward, my good friend and colleague Arun Gupta has been trying to blog about some of the most exciting new features. His blog...

0 replies - 620 views - 02/21/13 by Reza Rahman in Articles

R: Building Up a Data Frame Row by Row

Jen and I recently started working on the Kaggle Titanic problem and we thought it’d probably be useful to start with some exploratory data...

0 replies - 930 views - 02/21/13 by Mark Needham in Articles

DZone Links You Don't Want To Miss (2/21/13)

18 API Business Models: The Breakdown  A useful model of 18 different ways to monetize your APIs.  See what models Twitter, Facebook, Amazon,...

0 replies - 2907 views - 02/21/13 by Mitch Pronschinske in Articles

Text Processing (Part 1): Entity Recognition

Entity recognition is commonly used to parse unstructured text documents and extract useful entity information (like location, person, brand) to construct a...

0 replies - 2902 views - 02/20/13 by Ricky Ho in Articles

British Statisticians and American Gangsters

A few months ago, I published a post (in French) after reading Leonard Mlodinow’s The Drunkard’s Walk. More precisely,...

0 replies - 948 views - 02/20/13 by Arthur Charpentier in Articles

From Simpson's Paradox to Pies

Today, I wanted to publish a post on economics, and decision theory. And probability too… Those who do follow my blog should know that I am a big...

0 replies - 1108 views - 02/20/13 by Arthur Charpentier in Articles

Algorithm of the Week: Bellman-Ford in Python Using Vectorization/Numpy

I recently wrote about an implementation of the Bellman Ford shortest path algorithm and concluded by saying that it took 27 seconds to calculate...

0 replies - 2107 views - 02/19/13 by Mark Needham in Articles

Data Science: Don't Filter Data Prematurely

Last year I wrote a post describing how I’d gone about getting data for my ThoughtWorks graph and, in retrospect, one mistake in my approach is...

0 replies - 1487 views - 02/19/13 by Mark Needham in Articles

This Summer's Forecasting Conferences in Seoul, Rome, and Paris

This year there are no fewer than three forecasting conferences planned for June and July 2013. As well as the annual International...

0 replies - 398 views - 02/19/13 by Rob J Hyndman in Articles

DZone Links You Don't Want To Miss (2/19/13)

Interactive Design Trends of 2013: You need to download this slide deck. It's overflowing with advice that will keep you ahead of the curve. Why...

1 reply - 101391 views - 02/19/13 by Mitch Pronschinske in Articles

Sorting Rows and Columns in a Matrix (with Some Music, and Some Magic)

This morning, I was working on some paper on inequality measures, and for computational reasons, I had to sort elements in a matrix. To make it simple, I...

0 replies - 1594 views - 02/18/13 by Arthur Charpentier in Articles

How Cloud9 IDE Used Treasure Data to Re-segment Users

We just published Cloud9 IDE’s success story. As many of you know, Cloud9 IDE is an online development environment that makes coding more...

0 replies - 1007 views - 02/18/13 by Sadayuki Furuhashi in Articles

Simulating a Line-Following Robot in R

I’ve been reading up on controlling mobile robots, and built a simple robotic movement simulator in R, using the graphing libraries. The model sets up a...

0 replies - 1520 views - 02/18/13 by Gary Sieling in Articles

Statistical Consulting with Zombal

This is a guest post by Benedict Noel of Zombal. Many statisticians do a little bit of consulting in addition to...

0 replies - 940 views - 02/18/13 by Rob J Hyndman in Articles

DZone Links You Don't Want To Miss (2/18/13)

Steve Yegge's Predictions from 2004 Now this was a fun little blast from the past.  Steve Yegge (who I think is awesome) made some interesting...

0 replies - 2571 views - 02/18/13 by Mitch Pronschinske in Articles