Big Data

"It's Open Source, So the Source is, You Know, Open."

Even though I mostly sit at work trying to look busy, every so often someone does stumble into my office with a question or a problem so I’ve got to...

0 replies - 2098 views - 04/02/13 by Arnon Rotem-gal-oz in Articles

Scalding: Finding K Nearest Neighbors for Fun and Profit

Imagine you run a popular e-commerce site focused on hip and trendy clothing for women, and you've got a whiz-bang recommendation engine that recommends products...

0 replies - 2024 views - 04/02/13 by Muhammad Ashraf in Articles

The Statistics of Easter

This morning, there was an interesting post entitled “why does Easter move around so much?” online...

0 replies - 363 views - 04/01/13 by Arthur Charpentier in Articles

Open Source in the Academy, What a Data Scientist Does, and More Data Links

A couple of very interesting posts this week (and also a funnier one): “Scholarship: Beyond the paper” (via Neuro_Skeptic’s...

0 replies - 1846 views - 03/30/13 by Arthur Charpentier in Articles

Geek Reading for the Weekend

I have talked about human filters and my plan for digital curation. These items are the fruits of those ideas, the items I deemed worthy from...

0 replies - 2081 views - 03/29/13 by Robert Diana in Articles

Differentiation Across the Apache Hadoop Distribution Vendor Landscape

Hadoop Spring Roundup - With the Strata Conference, the Gartner BI Summit, and the Hadoop Summit all occurring in the month of March, there is a mountain...

0 replies - 441 views - 03/28/13 by Bootstrap Mark... in Articles

HOWTO: Monitor Amazon Elastic MapReduce Metrics

This tutorial from AWS details the ins and outs of monitoring EMR jobflows: 

0 replies - 1266 views - 03/27/13 by Eric Gregory in Articles

The Temporal Doppler Effect, Learnable Coding, and More Data Links

Once again, a lot of interesting posts, here and there (but, as usual, outside this blog): econometrics in academic journals, “Star Wars: the Empirics Strike...

0 replies - 867 views - 03/27/13 by Arthur Charpentier in Articles

Ranking Conferences by Social Endogamy Using Graph Databases

Lately there is a lot of research interest in benchmarking graph databases and NoSQL, which is a great area to explore. However, it is also...

0 replies - 1140 views - 03/27/13 by Damaris Coll in Articles

Dev of the Week: Mikio Braun

Every week, we feature a new developer/blogger from the DZone community here and in our newsletter, catching up to find out what they're working on...

0 replies - 4198 views - 03/27/13 by Eric Gregory in Articles

Hadoop/R Integration I: Streaming

If you've spent any time with MapReduce frameworks in general, by now you probably know the word-count example is the MapReduce equivalent of "Hello...

0 replies - 5523 views - 03/27/13 by Wayne Adams in Articles

Autocomplete on Multivalued Fields Using Faceting

In the previous blog post about autocomplete on multivalued fields, we discussed how highlighting can help us get the information we are...

0 replies - 1940 views - 03/26/13 by Rafał Kuć in Articles

Integrated Data as the Perfect Zinger

There’s a classic episode of the American television show Seinfeld called “The Comeback” where a main character, George Costanza, is...

0 replies - 888 views - 03/26/13 by Christopher Taylor in Articles

Data Analysis Training via Coursera

I recently finished the Coursera course "Data Analysis," which immediately followed and somewhat overlapped "Computing for Data Analysis," also from...

1 reply - 3945 views - 03/26/13 by Wayne Adams in Articles

A Year of Blogging Analyzed with R

Exactly one year ago today, I published my first blog post on branchandbound.net. During this year I’ve written 12 posts, including this one....

0 replies - 1037 views - 03/25/13 by Sander Mak in Articles