Big Data/Analytics Zone is brought to you in partnership with:

John Cook is an applied mathematician working in Houston, Texas. His career has been a blend of research, software development, consulting, and management. John is a DZone MVB and is not an employee of DZone and has posted 168 posts at DZone. You can read more from them at their website. View Full User Profile

Book Review: Practical Data Analysis

12.13.2013
| 2381 views |
  • submit to reddit

Many people have drawn Venn diagrams to locate machine learning and related ideas in the intellectual landscape. Drew Conway’s diagram may have been the first. It has at least been frequently referenced.

By this classification, Hector Cuesta’s new book Practical Data Anaysis is located toward the “hacking skills” corner of the diagram. No single book can cover everything, and this one emphasizes practical software knowledge more than mathematical theory or details of a particular problem domain.

The biggest strength of the book may be that it brings together in one place information on tools that are used together but whose documentation is scattered. The book is great source for sample code. The source code is available on GitHub, though it’s more understandable in the context of the book.

Much of the book uses Python and related modules and tools including:

  • NumPy
  • mlpy
  • PIL
  • twython
  • Pandas
  • NLTK
  • IPython
  • Wakari

It also uses D3.js (with JSON, CSS, HTML, …), MongoDB (with MapReduce, Mongo Shell, PyMongo, …), and miscellaneous other tools and APIs.

There’s a lot of material here in 360 pages, making it a useful reference.

Published at DZone with permission of John Cook, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)