Performance Zone is brought to you in partnership with:

Cody Powell (@codypo) is the cofounder and CTO of Famigo. Famigo's main offering is a cross-platform recommendation engine for mobile content, helping families find things like the best android apps, best iPad apps, and free apps. He's a graduate of Trinity University, an ardent supporter of the Texas Rangers, and he makes a mean mojito. Cody is a DZone MVB and is not an employee of DZone and has posted 26 posts at DZone. You can read more from them at their website. View Full User Profile

Bytes Matter

  • submit to reddit

I love to profile applications, because I always learn something that surprises me.

Initial Profiler Surprise: Client Side
Case in point, I was recently profiling our Android application, the Famigo Sandbox. This app sends a lot of data back and forth with our API, as we try to determine which of the apps on your phone are safe for your kids. I always assumed that, if app performance suffered during some of the chattier features, it was probably due to slow cell reception.

The profiler told me that I was wrong; the transfer time was almost always negligible. What wasn't negligible was the amount of CPU time it took to parse the JSON coming from the API into native types. (Note that I'm measuring JSON parse time across an average app session, not just for one call.)

Like most JSON decoders, we parse everything, regardless of whether we use it or not. I took another look at our API responses and learned that our app actually didn't need half of what we were sending.

Now, we weren't doing anything too crazy on any individual API call. We consistently returned too much data everywhere, though, across many API calls. In aggregate, these bytes mattered. Once we learned this, we streamlined the data returned from our API and quickly saw our JSON parsing bottleneck go away.

Subsequent Profiler Surprise: Server Side
Here, I was profiling our website, which is essentially an app recommendation engine for families. We consistently see some calls take a long time, and I assumed it was the complexity of the queries. For example, our queries to find and sort the best iphone apps or free android apps take into account a lot of disparate data from our own reviewers, the app stores, and all of our family users.

When I profiled these calls again, I was shocked. The queries were actually well-tuned (as of the author of these queries, yes, this is shocking); the slowness was coming from the ORM (pedantic note: it's really an ODM - shakes TI85 threateningly) we use to turn our MongoDB documents into our lovely Python models.

This problem was actually very similar to the problem seen in our Android app. MongoDB documents are encoded in BSON, which is very similar to JSON, and our ORM is responsible for parsing that BSON into usable types. On almost all of these queries, we were asking our db drivers to parse the entire document when we really only needed a small subset (1/3 or 1/4) of the fields. That's hardly noticeable when you're dealing with a few documents, but it becomes quite a bottleneck with thousands of documents. Again, I realized that bytes matter.

Once I figured out the problem, the fix was easy. Instead of asking for every field on every document in the query, I simply specified the fields I wanted. When this change went live, the bottleneck disappeared and we got an easy 40% improvement in average render time.

Let Us Conclude
I don't think I need to restate this, but I will, because it's my website and we hammer points into the ground 'round these parts. The lesson is that the more data you return, the more you must process.

This is so basic that it's often easy to ignore entirely. However, once you have real users and real data, bytes matter, and they matter more and more as you scale. Use them wisely.

Published at DZone with permission of Cody Powell, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)