Catalyst Conference 2008


Software Development Lifecycle

May 16, 2008

Is Microsoft’s SDL Working?

Blogger: Pete Lindstrom

Microsoft’s Security Development Lifecycle (SDL) is the main product of its Trustworthy Computing Initiative, launched from the now-famous Bill Gates memo in 2002. Six years into the initiative, Microsoft surely must be reaping the benefits of, for example, the well-publicized security training every developer went through.

So, how do we determine whether the SDL is working? Microsoft suggests that this is a simple exercise: compare the number of public vulnerabilities disclosed for products developed prior to SDL with the number disclosed for similar products developed after SDL. The most recent case compared first-year vulnerabilities in Windows XP SP2 and Vista. The count is down, and with it Microsoft provides a quick and easy example of the logical fallacy “post hoc ergo propter hoc” (“after this, therefore because of this”), which in this case means “public perception is ripe for deception.”

The biggest problem with Microsoft’s assertion is simply that too many uncontrolled variables could just as easily be making the difference. Among the unknowns are how much effort independent researchers are putting in and how much of that effort is focused on a specific Microsoft platform. At the very least, Microsoft has done an admirable job in making people feel more secure. (I happen to believe the SDL is working as well, but that belief is a matter of conjecture without strong evidence.)

If Microsoft wants to use public vulnerability counts as the ultimate arbiter, it needs to create an environment where independent researchers are encouraged to find bugs. Creating a controlled bounty program for a limited time period would increase incentives and at least provide circumstantial evidence of SDL effectiveness. Interestingly, if the vulnerability count were higher, it still wouldn’t mean SDL is ineffective, but the framing of the conversation would be entirely different.

The plot thickens when Microsoft makes claims that spending more time and leveraging external resources are a part of SDL. Whether they are or not, there is a big difference between making programmers more secure developers and simply spending more money on a problem. You don’t really need SDL if the latter is more beneficial.

But if public vulnerability counts are not the answer, what should Microsoft be doing to demonstrate the effectiveness of its SDL? Well, it is much easier to determine causality by controlling for all other variables, and conducting a test of two groups – one with SDL training and one without. Comparing vulnerability creation rates per unit output (either developer-hours or lines of code, for example) would go a long way to answering the effectiveness question.
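As a rough illustration of the comparison described above, here is a minimal Python sketch that normalizes vulnerability counts by output and compares the two groups. The function name `rate_ratio`, the choice of thousands of lines of code (KLOC) as the unit, and all of the numbers are assumptions made for the example, not Microsoft data. The confidence interval uses the common normal approximation on the log of a ratio of two Poisson rates, which is only one of several reasonable ways to do this.

```python
import math


def rate_ratio(vulns_a, size_a, vulns_b, size_b, z=1.96):
    """Compare vulnerabilities per unit of output for two groups.

    Returns (ratio, low, high), where ratio is rate A divided by rate B
    and (low, high) is an approximate 95% confidence interval.
    Treats the counts as Poisson and uses the normal approximation on
    log(ratio), so it is unreliable for very small counts.
    """
    if min(vulns_a, vulns_b) <= 0 or min(size_a, size_b) <= 0:
        raise ValueError("counts and sizes must be positive")
    ratio = (vulns_a / size_a) / (vulns_b / size_b)
    # Standard error of log(ratio) for two independent Poisson counts.
    se = math.sqrt(1 / vulns_a + 1 / vulns_b)
    low = math.exp(math.log(ratio) - z * se)
    high = math.exp(math.log(ratio) + z * se)
    return ratio, low, high


# Hypothetical experiment: both groups write 100 KLOC.
# The SDL-trained group introduces 30 vulnerabilities, the untrained group 60.
ratio, low, high = rate_ratio(30, 100.0, 60, 100.0)
print(f"SDL-trained rate is {ratio:.2f}x the untrained rate "
      f"(approx. 95% CI {low:.2f} to {high:.2f})")
```

If the whole interval sits below 1.0, the trained group is creating fewer vulnerabilities per unit of output than chance alone would easily explain. The same function works with developer-hours in place of KLOC.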

At this stage, it might be difficult to find a group of in-house developers who aren’t SDL trained, and Microsoft is so fully invested in the program that it wouldn’t allow an untrained developer on a real project. A new experiment may therefore need to be set up using some arbitrary project created solely for that purpose. Alternatively, Microsoft could measure the differences in development skills after an acquisition and during the transition to SDL-trained developers. A final option is to conduct a private benchmarking exercise where effectiveness is compared among multiple groups.

At this stage, it may be even harder to figure out the effectiveness of an SDL-trained QA group. Presumably, QA training will help the group find more bugs earlier, but if the developers are getting better, then the rate of finding vulnerabilities will go down. There are techniques associated with defect density that could be leveraged to determine this effectiveness level as well.
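One way to separate the two effects described above is to track a ratio rather than a raw count. The sketch below uses two widely used metrics, defect density and defect removal efficiency (the share of all known defects that QA caught before release). Choosing these two metrics is my assumption about what such a measurement could look like, and the function names and figures are hypothetical.

```python
def defect_density(defects, kloc):
    """Defects per thousand lines of code (KLOC)."""
    if kloc <= 0:
        raise ValueError("kloc must be positive")
    return defects / kloc


def removal_efficiency(found_by_qa, escaped):
    """Fraction of all known defects that QA caught before release.

    'escaped' counts defects discovered only after release, so the
    result can only be computed once field data has come in.
    """
    total = found_by_qa + escaped
    if total <= 0:
        raise ValueError("no defects recorded")
    return found_by_qa / total


# Hypothetical releases of the same size (100 KLOC each).
# Before training: QA finds 80 defects, 20 more escape to the field.
# After training:  QA finds 45 defects, 5 more escape to the field.
for label, found, escaped in [("before", 80, 20), ("after", 45, 5)]:
    density = defect_density(found + escaped, 100.0)
    efficiency = removal_efficiency(found, escaped)
    print(f"{label}: density {density:.2f}/KLOC, "
          f"QA caught {efficiency:.0%} of known defects")
```

In this made-up case QA finds fewer defects after training in absolute terms, yet catches a larger share of what exists. Density reflects the developers and removal efficiency reflects QA, so the two can be judged separately.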

Creating fewer bugs and finding more bugs early, I believe, are the real expectations of SDL, and finding those numbers would provide much stronger evidence for or against its effectiveness. Not only that, but this information would better frame discussions around ultimate effectiveness of software development: Microsoft is likely to have spent more money than anyone else on its SDL efforts, so the benchmarks provided by the company would serve as an upper limit for expectations.