
Alec is a Content Curator at DZone. He lives in Raleigh and spends his free time writing and programming. Alec is a DZone Zone Leader and has posted 524 posts at DZone.

MongoDB Aggregation: How to Work with 30 Years of NBA Data


If you've been waiting for the day when MongoDB and basketball would finally intersect, I have good news for you: in a recent post at The Code Barbarian, Valeri Karpov crunches 30 years' worth of NBA data with MongoDB aggregation.

Karpov begins by exploring the structure of the NBA data (the dataset alone would be deeply exciting to any basketball fan), then runs a few sanity checks and gives a general overview of MongoDB's aggregation framework and its methods. Using these techniques, Karpov produces a few statistics:

  • Teams with the most wins in a given season
  • Correlations between wins and particular stats
  • Multiple stats compared against win percentage
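
To give a rough sense of what the first of these looks like, here is a minimal sketch of a pipeline that ranks teams by wins in one season. It is not taken from Karpov's post. The document shape is an assumption made for illustration: each game is assumed to have a `date` field and a `teams` array whose entries carry a `name` and a `won` flag (1 for the winner). The helper name `most_wins_pipeline` and the August-to-August season boundaries are also hypothetical. The stages themselves (`$match`, `$unwind`, `$group`, `$sort`, `$limit`) are standard aggregation operators.

```python
from datetime import datetime


def most_wins_pipeline(season_start, season_end, top_n=5):
    """Build an aggregation pipeline ranking teams by wins in one season.

    Assumed (hypothetical) schema: each game document has a `date` and a
    `teams` array of {"name": ..., "won": 0 or 1} entries.
    """
    return [
        # Keep only games played inside the season window.
        {"$match": {"date": {"$gte": season_start, "$lt": season_end}}},
        # Emit one document per team per game.
        {"$unwind": "$teams"},
        # Keep only the winning side of each game.
        {"$match": {"teams.won": 1}},
        # Count wins per team name.
        {"$group": {"_id": "$teams.name", "wins": {"$sum": 1}}},
        # Most wins first, then keep the top few.
        {"$sort": {"wins": -1}},
        {"$limit": top_n},
    ]


# Example season window; the exact boundaries are an assumption.
pipeline = most_wins_pipeline(datetime(1999, 8, 1), datetime(2000, 8, 1))
```

With PyMongo, a list like this could be passed to `collection.aggregate(pipeline)` on whichever collection holds the games. Building the pipeline as plain data keeps the sketch runnable without a database connection.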

Ultimately, it's all just for fun: a way to test out techniques for finding gems in the data. For example:

An interesting factoid: the team that recorded the fewest defensive rebounds in a win was the 1995-96 Toronto Raptors, who beat the Milwaukee Bucks 93-87 on 12/26/1995 despite recording only 14 defensive rebounds.

Fun aside, Karpov's experiments provide a fantastic overview of MongoDB's aggregation framework: how it works, some of its key methods, and what it can do. Check out Karpov's full post to find some truly obscure NBA statistics.