Don't optimize too early, don't optimize too late

Software engineering, just like any kind of engineering, is a discipline which involves many informed, planned compromises—a surprising amount of "economizing".

The task of optimizing software—that is, minimizing its resource requirements within an effort budget allocated toward this purpose—is a classic example in which the economic realities of software development were recognized early by the pioneers of our discipline. Hence Knuth's adage that "premature optimization is the root of all evil". Reasonably, as performance engineers we should focus our efforts on those areas that (after measurement) consume the most resources, rather than waste time following personal hunches about imagined "best practices". One might describe this process as "top-down", or strategic, optimization: first use measurement tools to identify the critical areas and the big decisions that affect performance (such as algorithmic complexity), then fix the identified problems.
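As a minimal sketch of this "measure first, then fix" workflow, consider profiling with Python's standard `cProfile` module. The workload (`find_duplicates` and its deliberately weak data structure) is invented for the illustration, not taken from any particular codebase:

```python
import cProfile
import pstats

# A hypothetical workload with one algorithmically weak spot:
# membership tests against a list are O(n), so the loop is O(n^2) overall.
def find_duplicates(items):
    seen = []          # the culprit: a list where a set belongs
    dups = []
    for x in items:
        if x in seen:  # O(n) linear scan on every iteration
            dups.append(x)
        else:
            seen.append(x)
    return dups

profiler = cProfile.Profile()
profiler.enable()
find_duplicates(list(range(2000)) * 2)
profiler.disable()

# The profile points straight at the hot spot; the fix (use a set)
# changes the algorithmic complexity, not just a constant factor.
stats = pstats.Stats(profiler)
stats.sort_stats("cumulative").print_stats(5)
```

When a problem is concentrated like this, the profile makes the big decision (choice of data structure) obvious, which is exactly why the top-down approach pays off.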

The top-down optimization approach works very well in most cases, but sometimes it fails, because some performance problems are incremental and combinatorial in nature. Rather than being located in a few critical areas of code (low-hanging fruit for a trained performance engineer), they turn out to be pervasive issues: performance suffers "death by a thousand cuts", not due to some grand omission, but due to universal, cumulative neglect.
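The cumulative effect is easy to reproduce in miniature. The sketch below is an illustrative stand-in, assuming a habit (rebuilding a list with `+` instead of appending in place) repeated throughout a codebase; each individual step is too cheap for any single profile line to stand out, yet the aggregate cost is quadratic:

```python
import timeit

# "Death by a thousand cuts": each step is individually cheap and no
# single call dominates a profile, but the pattern copies the whole
# accumulator on every iteration, so the total cost is O(n^2).
def accumulate_slow(items):
    result = []
    for x in items:
        result = result + [x]   # copies result every time
    return result

def accumulate_fast(items):
    result = []
    for x in items:
        result.append(x)        # amortized O(1) per step: O(n) overall
    return result

items = list(range(5_000))
t_slow = timeit.timeit(lambda: accumulate_slow(items), number=5)
t_fast = timeit.timeit(lambda: accumulate_fast(items), number=5)
print(f"slow: {t_slow:.3f}s  fast: {t_fast:.3f}s")
```

No single line here is a bottleneck worth a profiler's attention; only the habit is. That is precisely the kind of problem a purely top-down hunt for hot spots tends to miss.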

Recognizing that such systemic problems exist, and that they are far more difficult to deal with than local bottlenecks, should make us think again about the dangers of the popular, easy-going "make it work, make it right, make it fast" approach to software engineering (or should we say, software prototyping?).

In other industries, serious opposing viewpoints already exist that treat quality as an emergent property of products. Historically, mastery in any craft has always involved paying relentless attention to every detail throughout the process of crafting and construction, as opposed to post hoc economizing and extinguishing only the most critical problems. In that respect, recall-and-repair, also a form of "late improvement", is a practice more characteristic of modern mass production. Not coincidentally, mass production works best where discerning patrons or purchasers are replaced by relatively hapless consumers onto whom manufacturers may shift some of their production risks in exchange for a lower price (or a higher profit). High-end, highly customized products seldom benefit from such conditions.

Incidentally, paying attention to detail is also psychologically natural for individuals who understand and love developing software, and a source of personal pride and motivation (sometimes it also degenerates into hair-splitting; as always, you can have too much of a good thing). At the heart of what I call bottom-up optimization approaches is a conviction that a flawless final product can only result from combining flawless components within a flawless process, a core belief that superior quality pays off in myriad ways, not excluding the immaterial reward of a feeling of accomplishment. It is a conviction which most software engineers accept quite readily when it comes to the functional correctness of their products. Nobody would use a buggy standard library or consider fixing its bugs either "premature" or "evil". However, possibly due to overemphasis and misinterpretation of Knuth's wise statements by less capable minds, the same level of attention to detail is rarely spent on designing for software performance (except in systems where hardware costs cannot be shifted onto users or poor performance translates into incorrect execution; or in markets so competitive that inadequate performance translates into lost sales or operational losses on hardware).

So what should we make of these divergent views on early and late, bottom-up and top-down optimization? I think the key to striking the right balance ultimately lies in understanding how value is created and destroyed within the software development process. It is a strategic mistake to focus your optimization efforts on the areas that are easy to measure (e.g. late profiling) while neglecting the difficult ones. On the other hand, it is a tactical mistake to waste effort on improvements of provably little consequence, especially while disregarding clear opportunity costs. But, as in any discipline, it takes years of practice to reliably distinguish which of those mistakes you are making (and even experts may disagree). Always optimize your bottom line—but first understand how many numbers stand above it and how they are connected.
