
Gary Sieling is a software developer interested in dev-ops, database technologies, and machine learning. He has a computer science degree from the Rochester Institute of Technology. He has worked on many products in the legal and regulatory industries, including building and supporting several data warehousing applications. Gary is a DZone MVB, is not an employee of DZone, and has posted 62 posts at DZone.

Visualizing Six Million Files and Folders

07.12.2013

Each year nearly 300,000 cases are filed in federal civil court and 1.3-1.6 million in federal bankruptcy court, but these numbers pale in comparison to state courts, which accept just over 100 million cases each year.

Even a small extract of these takes up a fair amount of space:


This is what a court docket looks like:
[Image: level4]

A docket includes an actual document (PDF or HTML) and XML metadata, although this varies case by case.

Divided up into groups of a half dozen, these feel manageable:

[Image: level3]

Each of these groups sits in one of 256 folders, and each folder contains groups of a half dozen like the listing above:
[Image: level2]

Each of these 256 folders is contained within another group of 256 folders:
[Image: level1]
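The two-level layout described above can be sketched in a few lines of Python. The article does not say how a case is assigned to a folder, so this sketch assumes a hash-based scheme: the first two bytes of a SHA-1 digest of the case identifier pick the outer and inner folder. The function name `shard_path`, the root folder `dockets`, and the sample case identifier are hypothetical and used only for illustration.

```python
import hashlib
from pathlib import PurePosixPath

FANOUT = 256  # folders per level, as described in the article


def shard_path(case_id: str, root: str = "dockets") -> PurePosixPath:
    """Map a case identifier to root/<level1>/<level2>/<case_id>.

    Assumption: folders are chosen by hashing the case identifier.
    The article does not state the actual assignment scheme.
    """
    digest = hashlib.sha1(case_id.encode("utf-8")).digest()
    level1 = f"{digest[0]:02x}"  # one of 256 outer folders (00..ff)
    level2 = f"{digest[1]:02x}"  # one of 256 inner folders (00..ff)
    return PurePosixPath(root) / level1 / level2 / case_id


# Back-of-the-envelope check on the layout.
leaf_folders = FANOUT * FANOUT            # 256 * 256 inner folders in total
cases = 500_000                           # the article's "~500k cases"
average_per_leaf = cases / leaf_folders   # average cases per inner folder

if __name__ == "__main__":
    print(shard_path("example-case-0001"))  # hypothetical case identifier
    print(leaf_folders, round(average_per_leaf, 1))
```

With 256 x 256 = 65,536 inner folders and roughly 500,000 cases, the average works out to between seven and eight cases per inner folder, which is in line with the "groups of a half dozen" shown in the listings.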

And that’s just for ~500k cases. Imagine how much paper is sitting out there in the world.

Published at DZone with permission of Gary Sieling, author and DZone MVB. (source)
