Source code redux

Building a static source code analyzer is a daunting task. I note that the latest edition of CSO has an ad for Ounce Labs – I guess I should state that I don’t work for Ounce, don’t know anyone who does, and have never (never-ever-ever) used their stuff. Having said that, I really want to see how their product deals with variable state. Give me a shout if you have any first-hand knowledge ;)

My problem is best summarized with a simple example. I recently did a code audit of a bank’s web app which was handling incoming numeric data. Data came in as a verified Decimal, was converted to a string, and much later the string was converted to an integer without any exception handling. Easy to spot the flaw, right? Well, not for the static analyzer, as the conversions were spread across multiple files, multiple includes, multiple classes, and so on. To catch this, a static code analyzer has to be smart enough to understand variables, scope, conversions, mathematical operations, etc.

My source code analyzer didn’t flag the true nature of the bug. Instead, my tool told me where all the data conversions were taking place without exception handling. I had to manually trace each of these variables back to its beginning and all the way through its handling, modifications, etc. to the point where it was dereferenced and used in business logic.

Yes, I could have just generated an alert based on the fact that the conversion took place without any exception handling. However, that will generate false positives on programs where the data comes in as an integer, is converted to a string, and is later converted back to an integer. A source code analysis tool with the smarts to automate all of that manual ‘tracing’ would be a valuable tool. I’d buy it. I’d be interested in hearing if such a tool exists.
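To make the example concrete, here is a minimal sketch in Python (not the language or code from the audit). The function names `to_display`, `to_units`, and `flag_conversion` are hypothetical, and the "trace" is a toy: it assumes the analyzer has already recovered the chain of types a value passes through, which is exactly the hard part.

```python
from decimal import Decimal

def to_display(amount: Decimal) -> str:
    # Step 1 (imagine this in one file): a validated Decimal becomes a string.
    return str(amount)

def to_units(text: str) -> int:
    # Step 2 (imagine this in a different class, much later):
    # the string is parsed as an integer with no exception handling.
    return int(text)

# Assumption for this sketch: only values that started life as an int
# are known to survive a str -> int round trip.
SAFE_ORIGINS = {"int"}

def flag_conversion(chain):
    """Decide whether a chain of type names deserves an alert.

    `chain` lists the types a value passes through, in order, for
    example ["Decimal", "str", "int"]. Returns True when a str -> int
    step is reached by a value whose origin is not in SAFE_ORIGINS.
    """
    for previous, current in zip(chain, chain[1:]):
        if previous == "str" and current == "int" and chain[0] not in SAFE_ORIGINS:
            return True
    return False

if __name__ == "__main__":
    print(flag_conversion(["Decimal", "str", "int"]))  # the bank case
    print(flag_conversion(["int", "str", "int"]))      # the false-positive case
    print(to_units(to_display(Decimal("12"))))         # parses cleanly
    try:
        to_units(to_display(Decimal("12.50")))
    except ValueError as error:
        print("unhandled in the real code:", error)
```

A check that only looks for `int(...)` without exception handling would flag both chains. Knowing where the value originated is what separates the real bug from the false positive.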

Lastly, apologies for leaving CodeScout off my list of source code tools. It has a few built-in checks (like SWAAT) which can be extended fairly easily. However, its nicest feature is a fully compliant regex parser which you can run over your entire source tree. It is very fast, so you can use it to identify candidate flaws very quickly.
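For readers who haven’t used a regex-based scanner, the idea can be sketched in a few lines of Python. This is not CodeScout’s interface or implementation; `scan_tree`, its parameters, and the sample pattern are all assumptions made up for illustration.

```python
import re
from pathlib import Path

def scan_tree(root, pattern, suffixes=(".py",)):
    """Return (path, line_number, line) for each matching line under root.

    Hypothetical helper: walks every file below `root` whose suffix is
    in `suffixes` and applies one regular expression line by line.
    """
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        # errors="replace" keeps the scan going on oddly encoded files.
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((str(path), number, line.strip()))
    return hits

if __name__ == "__main__":
    import sys
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    # Sample pattern (an assumption): crude match for integer conversions.
    for path, number, line in scan_tree(root, r"\bint\s*\("):
        print(f"{path}:{number}: {line}")
```

A scan like this only tells you where conversions happen. It does nothing to trace where the converted value came from, so it finds candidates rather than confirmed bugs.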