
Ashod Nakashian is well into his second decade leading and developing professional software solutions. Throughout the years, Ashod has participated in projects covering a wide gamut of the software spectrum, from web-based services, state-of-the-art audio/video processing and conversion solutions, and multi-tier data and email migration and archival solutions to mission-critical real-time electronic trading servers. When not designing, coding or debugging, he enjoys reading, writing and photography.

The Second Law of Optimization

06.18.2011

By now everybody and their screensaver have heard the Optimization Mantra: Don’t Do It!

This is commonly wrapped in a three-rule package (I suspect there is a word for that), the first two of which are copies of the mantra, while the third adds the wise word “Yet” to the one-and-only true rule and addresses it to the “expert.” I suspect the middle “rule” didn’t exist originally and was added later for effect, and perhaps to bring the total to the magic number of three.

I can almost imagine Knuth after figuring out a single-character bug in a bunch of code, with coffee mugs and burger wraps (or whatever it was that was popular in the ’60s) scattered around the desk… eyes bloodshot, sleepless and edgy, declaring “Premature optimization is the root of all evil.” (But in my mind he uses more graphic synonyms for ‘evil’.)

Knuth later attributed that bit about premature optimization to Tony Hoare (the author of Quicksort), thereby distorting my mental image of young Knuth swearing as he fixed his code, only to be saved later by Hoare himself, who apparently doesn’t remember uttering or coining such words. (Somebody’s got a bad memory… maybe more than one somebody.)

[Figure: illustration of constrained optimization. Image via Wikipedia.]

Smart Aleck

Premature optimization: most of us have been there. And that’s what makes those words so familiar. Words of wisdom, if you will. We’ve all decided to do a smart trick or two before fleshing out the algorithm, before even checking whether it compiles, let alone checking the result, only to be dumbfounded by the output. I can figure it out! we declare… and after half a day, we’d be damned if we rewrote that stupid function from scratch. No chance, bub.

Probably the smarter amongst us would learn from the experience of such dogged persistence and avoid trickery the next time around. While few would admit to the less-intelligent decisions they took in the past, at least some will have learned a lesson or two when the next opportunity knocked.

The aforementioned trickery doesn’t have to be optimization trickery, mind you. Some people (read: everyone) like to be smart-asses every so often and show off. Sure, many end up shooting themselves in the foot and making fools of themselves. But that doesn’t stop the kids from doing a crazy jump while waving to their friends, iPod on and eating crackers, just to impress someone… who typically turns around precisely when they shouldn’t. (Did I mention skateboards?)

The rules are sound, no doubt. Another rule of optimization, when the time comes, is to use profilers and never, ever make costly assumptions; and any assumption is probably costly. That, too, is sound advice. But taken at face value, these words of wisdom can cause some harm.

Let’s take it from the top. Leaving aside the smaller projects we might have started and tended and developed for years, most projects involve multiple developers and typically span generations of developers. They are legacy projects in the sense of having a long and rich history. No one person can tell you this history, let alone describe all parts of the code. On such a project, if performance is an issue, you shouldn’t go about shooting in the dark and spending days or even weeks on your hunches. Such an approach will not only waste time but also add a lot of noise and pollute the code-base and source repository (if you commit to the trunk, which you should never do until the work is done and ready to merge).

In such a case, one must use a profiler, consult with others, and especially talk with module owners, veteran developers and the architects before making any changes. The change-set must be planned, designed and well managed. The larger the project, the more important this stage becomes. No funny tricks, please.

Efficient Code != Premature Optimization

Among the standard interview questions we often ask (and get asked) are those on data structures and algorithms (discrete math and information theory). I typically ask candidates to compare the data structures in terms of performance, which should cover both internal details and complexity characteristics (big O). It’s also a good opportunity to see how organized their thoughts are. We use arrays, lists and maps/dictionaries quite often, and not having a good grasp of their essence is a shortcoming. As a follow-up, I typically ask how they decide which to use. This isn’t an easy question, I realize. Some things are hard to put into words, even when we understand them well in our minds. But interviews aren’t meant to be easy.

The worst answer I ever got was “I use List, because that’s what I get.” To which I had to ask “Where?” Apparently, the candidate worked on a legacy project that used Lists almost exclusively, and bizarrely she never had a need for anything else. The best answer typically gives a description of the use-case: the candidate describes the algorithm they’ll implement and, from that, decides which container to use.

The best answer isn’t merely a more detailed or technical answer. It’s the best because it’s the only answer that gives you reasons. The candidate must have thought about the problem and decided on an algorithm, must have quantified that algorithm’s complexity (big O), and must have known the performance characteristics of the different containers for the operations the algorithm needs. They have thought about the problem and their solution thoroughly before choosing containers, designing classes and whatnot.
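
To make that concrete, here is a minimal Java sketch of the kind of reasoning a good answer walks through (the method and variable names are hypothetical, chosen only for illustration): an algorithm that does frequent membership tests favors a HashSet, whose contains() is O(1) on average, over a List, whose contains() is an O(n) scan.

    import java.util.List;
    import java.util.Set;

    public class ContainerChoice {
        // List.contains() scans linearly: O(n) per lookup,
        // so this is O(n * m) overall.
        static boolean anyBannedList(List<String> banned, List<String> users) {
            for (String user : users) {
                if (banned.contains(user)) return true;
            }
            return false;
        }

        // Same algorithm with a HashSet-backed Set: O(1) average per lookup,
        // O(n + m) overall including building the set.
        static boolean anyBannedSet(Set<String> banned, List<String> users) {
            for (String user : users) {
                if (banned.contains(user)) return true;
            }
            return false;
        }
    }

The algorithm dictated the container, not the other way around; that is the reasoning the question probes for.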

The traditional wisdom tells us to avoid premature optimization and when absolutely necessary, we should first use a profiler. But both of these can also be interpreted as follows: it’s OK to write inefficient and bloated code, and when necessary, we’ll see what the profiler comes up with.

Performance as an afterthought is very costly. Extremely so. But I don’t recommend premature optimization. There is a very thin line between the well-thought-out, designed code you’d expect a professional to produce and student toy-project-style coding. The latter focuses on getting the problem-of-the-moment solved, without any regard to error handling, performance or indeed maintenance. We’ve all done it: multiple similar functions; the same function reused for different purposes with too many responsibilities; unreasonable resource consumption and horrible performance characteristics that the author is probably oblivious to. And so on.

It’s not premature optimization to use a dictionary/map instead of a list or whatever the most common container in your language of choice happens to be, not when we read items most of the time. It’s not premature optimization to use an O(n) algorithm instead of an O(n²) one that isn’t much more complicated (if not an O(log n) algorithm). It’s not premature optimization to refactor a function so that it doesn’t handle multiple unrelated cases. Similarly, moving invariant data outside a loop isn’t premature optimization, nor is caching the results of a very complex calculation we don’t need to redo.
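
The last two points look like this in a minimal Java sketch (the names and the pretend-expensive calculation are hypothetical):

    import java.util.HashMap;
    import java.util.Map;

    public class NotPremature {
        // Hoisting invariant data out of a loop: the discount factor
        // doesn't depend on the loop variable, so compute it once.
        static double total(double[] prices, double rate) {
            double discount = Math.pow(1.0 - rate, 2); // hoisted invariant
            double total = 0;
            for (double p : prices) {
                total += p * discount;
            }
            return total;
        }

        // Caching a complex calculation we don't need to redo
        // (simple memoization).
        private static final Map<Integer, Double> cache = new HashMap<>();

        static double expensive(int n) {
            return cache.computeIfAbsent(n, k -> {
                double r = 0; // stand-in for some costly computation
                for (int i = 1; i <= k * 1000; i++) r += Math.sqrt(i);
                return r;
            });
        }
    }

Neither change costs readability; both are just writing the code properly the first time.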

Regex object construction is typically an expensive operation due to the parsing and optimizations involved. Some dynamic languages allow for runtime compilation for further optimization. If the expression string isn’t modified, creating a new instance of this object over and over isn’t smart. In C# that means repeatedly constructing a Regex object (especially one with RegexOptions.Compiled); in Java, it means calling matches() on a String, which runs Pattern.compile() on every invocation. Making the compiled object a static member is the smartest solution and hardly a premature optimization. And the list goes on.
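
In Java, for instance, the difference looks like this (a minimal sketch; the validation method and pattern are made up for illustration):

    import java.util.regex.Pattern;

    public class EmailCheck {
        // Wasteful: String.matches() recompiles the pattern on every call.
        static boolean isEmailSlow(String s) {
            return s.matches("[\\w.+-]+@[\\w-]+\\.[\\w.]+");
        }

        // Better: compile once, reuse forever; the construction cost
        // is paid a single time when the class is loaded.
        private static final Pattern EMAIL =
                Pattern.compile("[\\w.+-]+@[\\w-]+\\.[\\w.]+");

        static boolean isEmailFast(String s) {
            return EMAIL.matcher(s).matches();
        }
    }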

As much as I’d hate to have a pretentious show-off on my team, who’d go around “optimizing” code by making wild guesses and random changes, without running a profiler or talking with their colleagues, I’d hate it even more if the team spent their time cleaning up after one another. It’s easy to write code without thinking more than a single step ahead. It’s easy to type some code, run it, add random trace logs (instead of properly debugging), augment the code, run again, and repeat until the correct output is observed.

I don’t know about you, but to me, writing and modifying code instead of designing and working out the algorithm beforehand is simply counterproductive. It’s not fun either. Similarly, debugging is much more interesting and engaging than adding random trace logs until we figure out what’s going on.

I’m not suggesting that this extreme worst case I’ve described is the norm (although you’d be surprised to learn just how common it is). My point is that there is a golden mean between “premature optimization” and “garbage coding.”

The Cost of Change

When it’s time to spend valuable resources on optimization (including the cost of buying profilers), I don’t expect us to discover that we needed a hash-table instead of an array after all. Rather, I should expect the profiler to come up with more subtle insights: information that we couldn’t easily guess (and indeed shouldn’t need to). I should expect the seniors on the team to have a good grasp of the performance characteristics of the project, its weak points and its limitations. Surely the profiler will give us accurate information, but unless we are in a good position to make informed and educated guesses, the profiler won’t help us much.

Furthermore, understanding and analyzing the profiler’s output isn’t trivial. And if we have no clue what to change, and how our changes would affect performance, we’ll use the profiler much like the student who prints traces of variables and repeatedly makes random changes until the output is the one expected. In short, the profiler just gives us raw data; we still have to interpret it, design a change-set and have a reasonably sound expectation of improved performance. Otherwise, profiling will be pure waste.

It’s well documented that the cost of change increases exponentially the later a project is in its development cycle. (See, for example, Code Complete.) This means a design issue caught during design or planning will cost next to nothing to fix. However, try to fix that design defect during system testing, after having most modules integrated and working, and you’ll find that the change cascades over all the subsequent stages and the work already completed.

This cost is sometimes overlooked, thanks to the Rule of Optimization. The rule strongly discourages thinking about performance, when one should at least give it serious thought once the business and technical requirements are finalized (as far as design is concerned) and an initial design is complete. The architecture should answer the performance question, and at every step of the development path developers must consider the consequences of their choices of algorithms and data structures.

This doesn’t suggest optimization-oriented development. Rather, having a conscious grasp of the performance implications can avoid a lot of painful change down the road. As already noted, designing and writing efficient code doesn’t necessarily mean premature optimization. It just means we’re responsible and balancing the cost: investing a little early to avoid a high cost in the future. For a real-life example see Robert O’Callahan’s post linked above.

I know there is a camp that by this point is probably laughing at my naïve thinking. By the time I finish optimizing, or even writing efficient and clean code, they’ll say, their product would’ve shipped and the customer would’ve bought the latest and fastest hardware to offset its disadvantage in performance. “What’s the point?” they add. While this is partially true (and it has happened before), given the same data, the better-performing product will still finish sooner on the same hardware. In addition, now that processors have stopped scaling vertically, code designed for concurrent scalability (horizontal scaling) will outperform even the best algorithm. And that’s not to mention that data outgrows hardware any day.

Conclusion

Premature optimization is a major trap. We learn by falling, getting up, dusting off and falling again. We learn by making mistakes. The wisdom of the community tells us to avoid experimenting on our production code and to postpone optimization as much as possible. Only when the code is mature, and only when necessary, should we, with the aid of a profiler, identify the hot-spots and then, and only then, very carefully optimize the code.

This strategy encourages developers to come up with inefficient, thoughtless and often outright ugly code, all in the name of avoiding premature optimization. Furthermore, it incorrectly assumes that profiling is a magic solution to improving performance, and it neglects to mention how involved profiling is. Those who have no clue why their code is bogged down won’t know what to change even if the profiler screams out the slowest statement.

There is no excuse for writing inefficient code when the alternative is available at little or no cost. There is no excuse for not thinking through the algorithm before typing, and no excuse for leaving old experimental bits and pieces around because we might need them later, or because we’ll clean up later when we optimize. The cost of poorly designed, badly performing code is very high: it’s the least maintainable code, and improving it can be extremely expensive.

Let’s optimize later, but let’s write efficient code, not optimum, just efficient, from the get-go.

 

From http://blog.ashodnakashian.com/2011/06/the-second-law-of-optimization/

Published at DZone with permission of Ashod Nakashian, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)


Comments

Balázs Bessenyei replied on Sun, 2011/06/19 - 7:28am

Agreed; as long as the more "efficient" solution does not hurt readability and maintainability, it should not be considered premature optimization, assuming there is no significant difference between the two solutions.

Violating readability and maintainability is acceptable, if backed by profiler data and well documented.

It is important to note that micro-benchmarks should not be accepted as good reasons, especially in the case of managed, JIT-enabled languages (like Java or CLI-based languages). For standard compiled languages, the results would only be valid for a given compiler version and optimization level.

Jonathan Fisher replied on Sun, 2011/06/19 - 1:36pm

Not only should you not 'pre-optimize', but you shouldn't over-engineer. Humans are extremely poor judges of:

1) where a system is slow (don't optimize when you don't know where it's slow)
2) where a system will need to be extended (don't enable extension for unknown requirements)
3) where a system will be reused (don't enable reuse for unknown requirements)

These rules of course don't apply to projects where you're intending to design a reusable library...

Ashod Nakashian replied on Mon, 2011/06/20 - 2:29am in response to: Jonathan Fisher

Indeed. The point to remember is to know your code, requirements and algorithms. One shouldn't deliberately write poorly performing code just because they haven't profiled yet. There is a reason we study discrete math, and playing ignorant, when we should know we've chosen a poor algorithm and the alternative is equally readable and easy (or hard) to implement, isn't commendable. In all other cases, I agree, we're poor judges and we shouldn't guess without data. But the obvious cases must be pointed out at design time. Profiling helps for the subtle cases.

Mladen Girazovski replied on Wed, 2011/06/22 - 2:57am in response to: Ashod Nakashian

>> But the obvious cases must be pointed out at design time. Profiling helps for the subtle cases.

What would be an obvious case, in your opinion? How can you judge, without profiling, which parts of the system need to be optimized because they aren't performing well?

Ashod Nakashian replied on Wed, 2011/06/22 - 8:11am in response to: Mladen Girazovski

It's hard to condense the article into a few sentences (it's best to read it), and it's hard to give concrete examples because I may corner my argument. However, I think we can agree that in many cases we know what data we're working with. If you see some code reading a full table from the database into memory, you must ask how large the table can be. If it's reasonable that it'll be "large," you shouldn't wait for it to max out in the wild; you should change the algorithm. If you have a collection of names in which you look up quite frequently, you should use a map/dictionary or at the very least a sorted array (with binary search). If you're using a linked-list, you know you'll have bad performance. If you need to read some data from the server, reading the number of items once is certainly smarter than asking the server on every loop iteration. I'm sure you can think of other cases too.
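
A minimal Java sketch of that name-lookup case (the data is made up for illustration):

    import java.util.Arrays;

    public class NameLookup {
        public static void main(String[] args) {
            String[] names = {"carol", "alice", "bob", "dave"};

            Arrays.sort(names); // pay the sort cost once up front...

            // ...then each lookup is an O(log n) binary search
            // instead of an O(n) linear scan.
            boolean found = Arrays.binarySearch(names, "bob") >= 0;
            System.out.println(found); // true
        }
    }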

The point is that there are obvious cases where we shouldn't ignore the problem or wait for the profiler. We should know how our algorithm will scale with data, and if we know for a fact that the data can be "large," then we should choose algorithms that scale better. If the alternative algorithms are too complicated, then we can postpone this, as that'd be premature optimization. But most cases are low-hanging fruit. Choosing a hashtable instead of an array doesn't cost us anything, and if it'll give us much better performance, then we should do it even when the data isn't too large, unless we have good reasons not to.

In fact, I'm more interested in developers thinking and knowing about the limitations of their chosen algorithms than in encouraging them to improve them as early as possible. It's still very rewarding to know when an algorithm will break (low resources) or will take an excessively long time (bad big-O), even if they don't improve it any.

Mladen Girazovski replied on Wed, 2011/06/22 - 6:55am

I see your point; however, what performs better in real life cannot be purely assumed based on theoretical knowledge. You have to measure (with a profiler) to be sure that you have actually improved the performance.

The reason I'm asking is that I've seen it numerous times: a developer writes less readable code because he was under the impression that he was improving performance, but since he didn't check the improvement with a profiler, lots of times the outcome was slower code that was also less readable.

A very popular example is the use of a StringBuilder instead of String concatenation. Lots of people still think that using a StringBuilder is always faster than String concatenation, which isn't true. While not necessarily being faster (often even slower), it certainly is less readable to use a StringBuilder instead of a +.
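
A minimal Java sketch of both sides of that trade-off (the method names are hypothetical):

    public class Concat {
        // Readable and fine: the compiler already turns a single
        // concatenation expression into efficient code.
        static String greeting(String name) {
            return "Hello, " + name + "!";
        }

        // Where StringBuilder genuinely matters: concatenation in a loop,
        // where each + would build and discard an intermediate String.
        static String join(String[] parts) {
            StringBuilder sb = new StringBuilder();
            for (String p : parts) {
                sb.append(p).append(',');
            }
            return sb.toString();
        }
    }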

If you remember the 80/20 rule, which in this instance means that 80% of the resources are used by 20% of the code (usually it's more like 90/10), then most of our assumed improvements will not impact performance significantly (at least not in a positive way), so readability should be prioritised and optimisations should only be done if necessary or required. I think this is the reason why we should not give in to our "natural" obsession with performance, but rather work on the expressiveness and stability of the code.

Sure, if I know that I will have a List with 500,000 elements that I want to access randomly, I will not use a LinkedList but an ArrayList instead; but for 5 elements it makes no measurable difference and I wouldn't worry too much about it.
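
A minimal sketch of why random access punishes a LinkedList (the numbers are only illustrative):

    import java.util.ArrayList;
    import java.util.LinkedList;
    import java.util.List;

    public class RandomAccessCost {
        // ArrayList.get(i) is O(1); LinkedList.get(i) walks the nodes, O(n),
        // which turns this loop into O(n^2) on a LinkedList.
        static long sumEveryTenth(List<Integer> list) {
            long sum = 0;
            for (int i = 0; i < list.size(); i += 10) {
                sum += list.get(i);
            }
            return sum;
        }

        public static void main(String[] args) {
            List<Integer> array = new ArrayList<>();
            List<Integer> linked = new LinkedList<>();
            for (int i = 0; i < 500_000; i++) {
                array.add(i);
                linked.add(i);
            }
            sumEveryTenth(array);  // fast
            sumEveryTenth(linked); // crawls
        }
    }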

The danger with optimizations, however, is that if you don't check your improvement, you cannot be sure that you've improved anything. This is why people who claim a performance improvement are supposed to prove that the performance has actually improved, measurably.

In other industries, nobody would ever claim to have improved something solely on a theoretical basis (like, in our case, the big O notation); they would have to present actual proof to support that claim.

Ashod Nakashian replied on Wed, 2011/06/22 - 8:36am in response to: Mladen Girazovski

Good points.

>> what performs better in real life cannot be purely assumed based on theoretical knowledge, you have to measure (profiler) to be sure [...]
Not necessarily so. You are giving the impression that performance is random. That's far from the truth. Sure, there are very subtle cases (CPU architecture, for one), and certainly the smaller the data the less it matters, but we should know our target market and expected data loads. Where there is high load, that's where things matter. We can (and should) postpone all other cases.

>> The reason i'm asking is that i've seen it numerous times that a developer writes less readable code instead of readable because he was under the impression that he was improving performance [...]
I've tried to mention at different points in the article that the condition for choosing a better algorithm is that it's little or no more complicated/less readable than the default. Readability is also very high on the list. Please see my article on that topic here. It's also published on dzone. But I absolutely agree with you.

>> A very popular example is the use of a StringBuilder instead of String concatenation [...]
This is micro-optimization and must be avoided in almost every case. The only exception is when you're writing reusable code (say a library) and you need to design for the typical case. Profilers are your best friend.

>> Sure, if i know that i will have a List with 500000 elements that i want to access randomly, i will not use a LinkedList but ArrayList instead [...]
Yes, this is what I'm addressing in the article. Many people consider this case premature optimization!

I'll just add that at each point we have to make decisions. They all affect correctness, readability and performance. I think it's a fallacy that there is a "default" option from which, using a profiler, we optimize. I've worked with people who create very readable and highly performant code virtually all the time. Even when optimization was called for, their code stood the test of having very reasonable performance, and we decided not to change it because it was good enough and we didn't want to sacrifice readability. Yet we had a project rewritten twice to redesign for scalability that was still bogged down and had many outstanding support tickets (it was 4x slower than the original that the two redesigns tried to replace). Surely there are better ways to think, design and write code, and there are poor ways. Wouldn't you agree that even the best profilers and "mature" optimizations couldn't replace the former group and save that failing project?

Mladen Girazovski replied on Wed, 2011/06/22 - 9:10am in response to: Ashod Nakashian

>> Not necessarily so. You are giving the impression that performance is random. That's far from the truth. Sure, there are very subtle cases (CPU architecture for one,) and certainly the smaller the data the less it matters, but we should know our target market and expected data loads. Where there is high-load, that's where things matter. We can (and should) postpone all other cases.

What i'm trying to say is that humans can only take guesses when it comes to performance. Rules are usually simple, but reality is complex.

Take the Java compiler and the Java runtime (JVM), for example: the compiler will transform any String + into a StringBuilder. Someone who doesn't know that would assume it's always faster to use a StringBuilder; someone who knows this will use a StringBuilder only in loops. Also, the JVM (using the HotSpot JIT) will try to optimize bottlenecks in your code if they are significant; there is no way to predict the performance of code under these circumstances. Btw., the JIT will work better on "simple, dumb" code than on code that the developer already tried to optimize.
I see your point when you mention obvious cases (i.e., the huge List), but IMHO these cases are far less frequent than we think; on the other hand, if we encounter such an obvious case, we should not ignore it.

>> I've tried to mention at different point in the article that the condition to choosing a better algorithm is "if it's little or no" more complicated/readable than the default. Readability is also very high on the list. Please see my article on that topic here. It's also published on dzone. But I absolutely agree with you.

I remember your other article :) I think that we both agree that if the option is either readable or supposedly performant, there needs to be proof that the benefit outweighs the cost, in other words: Provide proof that your optimization is really faster.

>> This is micro-optimization and must be avoided in almost every case. The only exception is when you're writing reusable code (say a library) and you need to design for the typical case. Profilers are your best friend.

I agree.

>> Yes, this is what I'm addressing in the article. Many people consider this case premature optimization!

I thought so, but i just wanted to make sure that we are talking about the same thing.

>> I'll just add that at each point we have to make decisions. They all affect correctness, readability and performance. I think it's a fallacy that there is a "default" option and from there using a profiler we optimize. I've worked with people who create very readable and highly performant code virtually all the time. Even when optimization was called for, their code stood the test of having a very reasonable performance and we decided not to change it because it was good enough and we didn't want to sacrifice readability. Yet, we had a project re-written twice to redesign for scalability that was still bogged down and had many outstanding support tickets (it was 4x slower than the original the two redesigns tried to replace.) Surely there are better ways to think, design and write code and there are poor ways. Wouldn't you agree that even the best profilers and "mature" optimizations couldn't replace the former group and save that failing project?

Well, scalability is not the same as performance; often enough they contradict each other. That might sound strange at first, but it really isn't.

When it comes to scalability, it's not just about algorithms (at least not the simpler ones), but also about honoring the rules of the platform that offers scalability. Let's take a look at Java WebApps: you could save lots of roundtrips to the DB if you just put the data into the session context once per user (assuming the requirements state that the data doesn't change frequently and outdated data is not a big issue). Having big sessions, however, makes a WebApp less scalable. A profiler isn't, IMHO, a proper tool for testing scalability; profilers work best for checking resources used (CPU and memory).

But of course, no tool can replace experience, especially the bad experiences, since humans tend to remember those longer.

Ashod Nakashian replied on Wed, 2011/06/22 - 9:56pm in response to: Mladen Girazovski

No doubt we're on the same page. It's hard to give examples of each type of situation one could run into and give advice on each case. As you said, experience can't be replaced. I'm assuming the reader is someone perhaps ideally like you: someone who has enough experience to judge the cases they might run into, so that, having read about someone else's experience and/or advice, they can decide for themselves. I couldn't put it better: "Rules are usually simple, but reality is complex."

Sirikant Noori replied on Sat, 2012/03/31 - 2:44pm

The reason behind all this was the developer's words: "I didn't want to do any premature optimization … and forgot about it." In reality, he must have known that you can project the results, e.g. select only the columns you need, and do the paging in the RDBMS, which happens to be very optimized for that sort of job. He was just being lazy, calling it "premature optimization."

