Monday, February 28, 2005

Why building software is not like building bridges

Taken from Steve Rowe's blog (http://blogs.msdn.com/SteveRowe/archive/2005/02/28/381910.aspx)

I was having a conversation with a friend the other night and we came across the age-old “software should be like building buildings” argument. It goes something like this: Software should be more like other forms of engineering, such as building bridges or buildings. Those, it is argued, are more mature engineering practices. If software engineering were more like them, programs would be more stable and projects would more often come in on time. This analogy is flawed.

Before I begin, I must state that I’ve never engineered buildings or bridges before. I’m sure I’ll make some statements that are incorrect. Feel free to tell me so in the comments section.

First, making software, at least systems software, is nothing like making buildings. Engineering a bridge does not involve reinventing the wheel each time. While there may be some new usage of old principles, there isn’t a lot of research involved. The problem space is well understood and the solutions are usually already known. On the other hand, software engineering, by its very nature, is new every time. If I want two bridges, I need to engineer and build two bridges. If I want two copies of Windows XP, I only engineer and build it once. I can then make an unlimited number of perfect copies. Because of this, software engineering is more R&D than traditional engineering. Research is expected to have false starts, to fail and backtrack. Research cannot be put on a strict timeline. We cannot know for certain that we’ll find the cure for cancer by March 18, 2005.

Second, the fault tolerances for buildings are higher than for software. More often than not, placing one rivet or one brick a fraction off won’t cause the building to collapse. On the other hand, a buffer overflow of even a single byte could allow a system to be exploited. Buildings are not built flawlessly. Not even small ones. I have a friend who has a large brick fireplace inside their room rather than outside the house because the builders made a mistake when they built it. In large buildings, there are often lots of small things wrong. Wall panels don’t line up perfectly and are patched over, walls are not square to each other, etc. These are acceptable problems. Software is expected to be perfect. In software, small errors are magnified. It only takes one null pointer to crash a program or a small memory leak to bring a system to its knees. In building skyscrapers, small errors are painted over.

Third, software engineering is incredibly complex, even compared to building bridges and skyscrapers. The Linux kernel alone has 5.7 million lines of code. Windows 98 had 18 million lines of code. Windows XP reportedly has 40 million lines of code. By contrast, the Chrysler Building has 391,881 rivets and 3.8 million bricks.

Finally, it is a myth that bridge and building engineering projects come in on time. One has to look no further than Boston’s Big Dig project to see that. Software development, too, often takes longer and costs more than expected. This is not a desirable situation and we, as software engineers, should do what we can to improve our track record. The point is that we are not unique in this failing.

It is incorrect to compare software development to bridge building. Bridge building is not as perfect as software engineers like to think it is, and software development is not as simple as we might want it to be. This isn’t to excuse the failings of software projects. We can and must explore new approaches like unit tests, code reviews, threat models, and Scrum (to name a few). It is to say that we shouldn’t ever expect predictability from what is essentially an R&D process. Software development is always doing that which has not been done before. As such, it probably will never reliably be delivered on time, on budget, and defect-free. We must improve where we can but hold the bar at a realistic level so we know when we’ve succeeded.
