Monday, December 20, 2010
A paper just hit the Web this month describing a methodology for automated analysis of source/binary code for vulnerabilities and for automatically generating a corresponding, working exploit. The paper was written by Thanassis Avgerinos at Carnegie Mellon as part of his graduate research. You can view a video and the paper at the site they set up for the project.
It's an interesting paper that unfortunately received mostly negative comments from the "hacking" community (i.e., DailyDave, Sean, Sotirov, and others on Twitter). Mostly I believe that's because that community has been trained to point out failures in general (applications, operating systems, enterprises, irrelevant academic research, etc.). The outsider/attacker mindset can make it difficult to accept ideas that don't come from your trusted circle of friends. In fact, the individual most outspoken against the research (Sean Heelan, a researcher from Ireland now with Immunity) wrote his 2009 master's thesis on a similar, but less ambitious, topic: "Automatic Generation of Control Flow Hijacking Exploits for Software Vulnerabilities".
The new paper is an excellent read. It incorporates preconditioned symbolic execution into an end-to-end system to demonstrate that it is possible to take source and binary pairs and automatically develop a working exploit (in bounded, fairly simple cases). The authors properly acknowledge the prior work by Brumley (using patch-based approaches) and Heelan, and they point out the remaining work (which is significant!). As the hacking community has shown, it becomes significantly more difficult to scale this approach and automate it for more complex scenarios.
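To give a feel for the core idea, here is a toy sketch (entirely hypothetical, not the authors' implementation) of what "preconditioned" exploration means: rather than exploring every possible input to a vulnerable program, the search is constrained up front (here, by a minimum input length), shrinking the state space that must be examined before an exploiting input is found.

```python
# Toy model of preconditioned exploit search. The function names, buffer
# size, and "magic byte" check are invented for illustration only; a real
# system like the one in the paper reasons over symbolic constraints on
# program paths, not concrete enumeration.

BUF_SIZE = 16

def vulnerable(inp):
    """Model of a target program: a strcpy-style copy into a fixed
    16-byte buffer, guarded by a simple input-format check."""
    if inp[:1] != b"A":           # branch 1: reject malformed input
        return "rejected"
    if len(inp) > BUF_SIZE:       # branch 2: the copy would overflow
        return "overflow"         # stands in for a crash / control hijack
    return "safe"

def search_exploit(precondition_min_len):
    """Enumerate candidate inputs, but only those satisfying the
    precondition (len >= precondition_min_len). The precondition prunes
    the short inputs that provably cannot trigger the overflow."""
    for length in range(precondition_min_len, 64):
        candidate = b"A" + b"B" * (length - 1)
        if vulnerable(candidate) == "overflow":
            return candidate
    return None

# With the precondition, the search skips every input shorter than the
# buffer and finds an overflowing input almost immediately.
exploit = search_exploit(precondition_min_len=BUF_SIZE)
```

In the real system the "candidates" are symbolic path constraints handed to a solver, and the precondition (e.g., a known crashing input length or format) is what makes the path explosion tractable; the toy above only captures the pruning intuition.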
As we move into an era where Artificial Intelligence can automatically translate language, discover laws of physics, and learn to read, it doesn't seem far-fetched to automatically fuzz software and develop exploit code. Oh, wait... we can do that now. The real challenge for this research is not solving the simple case, but showing how it could tackle more complex problems without minimizing their difficulty, which the author originally did, causing many to ignore or overlook his work.
Humans will presumably stay ahead of AI for the near future, but we've already lost at chess. It seems to me the only reason computers can't find bugs in software better than the best human is that we haven't programmed them to do so yet.