Big Data

Averaging Languages Spoken, Dumbassification of Academia, and More Data Links

This week, as usual, there is much more to read somewhere else than on my own blog. “Average number of languages spoken in different countries”...

0 replies - 1380 views - 03/03/13 by Arthur Charpentier in Articles

ETS Models Now in EViews 8

The ETS modelling framework developed in my 2002 IJF paper (with Koehler, Snyder and Grose), and in my 2008...

0 replies - 1088 views - 03/03/13 by Rob J Hyndman in Articles

Covariant and Contravariant

The terms covariant and contravariant come up in many contexts. An earlier post discussed how the terms are used in programming and...

0 replies - 1529 views - 03/02/13 by John Cook in Articles

How Intel’s Hadoop Distribution Wants to be Different

Intel announced its Big Data strategy this week, with its own Hadoop distribution as the crown jewel. Many people will be surprised that a chipmaker wants to be...

0 replies - 2352 views - 02/28/13 by Maarten Ectors in Articles

Systems Integration in the NoSQL Era with Apache Camel

In February 2013, I was at ApacheCon NA 2013 in Portland, Oregon, USA. It was a small but great conference. I met so many awesome Apache experts and...

0 replies - 1038 views - 02/28/13 by Kai Wähner in Articles

Running the Numbers on Papal Tenures with R

The job of Bishop of Rome – i.e. the Pope – has usually been considered a life-long commitment. There have been 266 popes since 32 A.D....

0 replies - 1577 views - 02/28/13 by Arthur Charpentier in Articles

R and Hadoop Data Analytics - RHadoop

Introduction: R is a programming language and a software suite used for data analysis, statistical computing and data visualization. It is highly extensible...

0 replies - 4315 views - 02/27/13 by Istvan Szegedi in Articles

Exact Chaos

Pick a number x between 0 and 1. Then repeatedly replace x with 4x(1-x). For almost all starting values of x, the result exhibits...

0 replies - 436 views - 02/27/13 by John Cook in Articles

Super Storm Sandy and 100% Uptime

In the aftermath of Super Storm Sandy, this panel of CTOs from AppNexus, adMarketplace, Tapad, x+1 and Aerospike discussed issues and best practices in...

0 replies - 1206 views - 02/27/13 by Claire Umeda in Articles

The Impact of Real-Time Big Data on Business

In the aftermath of Super Storm Sandy, this panel of CTOs from AppNexus, adMarketplace, Tapad, x+1 and Aerospike discussed issues and best practices in...

0 replies - 1719 views - 02/26/13 by Claire Umeda in Articles

Using the Libjars Option with Hadoop

When working with MapReduce, one of the challenges encountered early on is determining how to make your third-party JARs available to the map and...

0 replies - 1826 views - 02/26/13 by Alex Holmes in Articles

Text Processing, Part 2: Oh, Inverted Index

This is the second part of my text processing series. In this blog, we'll look into how text documents can be stored in a form that can be easily...

0 replies - 1414 views - 02/26/13 by Ricky Ho in Articles

Big Data Beyond MapReduce: Google's Big Data Papers

Mainstream Big Data is all about MapReduce, but when looking at real-time data, limitations of that approach are starting to show. In this post, I’ll review...

0 replies - 19644 views - 02/25/13 by Mikio Braun in Articles

Building Suggest-As-You-Type With Carrot2 Clustering

The first interaction that a customer has with your e-commerce web site is with the search box itself. So it is of utmost importance to make the user...

0 replies - 1178 views - 02/25/13 by John Berryman in Articles

Using R — Package Installation Problems

The post titled Installing Packages described the basics of package installation with R. The process is wonderfully simple when everything goes...

0 replies - 1045 views - 02/25/13 by Jonathan Callahan in Articles