An Introduction To Model Checking

An Introduction to Model Checking: The development of model checking methods is one of the towering achievements of the late 20th century in terms of debugging concurrent systems. This development came about in the face of the pressing need faced by the computing community in the late 1970’s for effective program debugging methods. The correctness of programs is the central problem in computing, because there are often very designs that are correct, and vastly more that are incorrect. In some sense, defining what ‘correct’ means is half the problem - proving correctness being the other half of the problem.
The quest for high performance and flexibility in terms of usage (e.g., in mobile computing applications) require systems (software or hardware) to be designed using multiple computers, processes, threads, and/or function units, thus quickly making system behavior highly concurrent and non-intuitive. With the rapid progress in computing, especially with the availability of inexpensive microprocessors, the computer user community found itself – in the late 1970’s – in a position where it had plenty of inexpensive hardware but close to no practical debugging methods for concurrent systems! We will now examine the chain of events that led to the introduction of model checking in this setting.

The enterprise of sequential program verification pioneered, among others, by Floyd, Hoare, and Dijkstra was soon followed by the quest for parallel program correctness, pioneered, among others, by Owicki and Gries, and Lamport. The difficulty of these methods when applied to real-world programs led to alternate proposals, such as relying on social processes. Unfortunately, the arbitrariness hidden in such proposals makes it difficult to situate computation engineering in the same plane as other engineering disciplines where design verification against rigorous specifications is the rule rather than the exception. Contrast the following guarantees: where the former is likely to be offered by a “If I press the eject button, I am guaranteed to be safely ejected from a burning airplane in less than 5 seconds.” versus “If I am lucky to be in a plane that was debugged by an expert reader of a program who happened to spot a bug, then I might get ejected in a reasonable amount of time.”
In one thread of work that was evolving in the late 1970’s, some scientists, notably Pnueli [97], had the vision of focusing on concurrency. In a nutshell, by focusing on control and not data, it becomes possible to model a system in terms of finite-state machines, and then employ decision procedures to check for its reactive properties. Even after such simplifications, system control tends to be highly non-intuitive, and hence simply not amenable to any reasonable social processes. Automated analysis of finite-state models can, on the other hand, automatically hunt bugs and report them back. Pnueli’s vision lead to Manna, Pnueli, and many others developing temporal logic proof systems.
We must admit that in this historical “sampler” that we are presenting, it is entirely possible that we have overlooked some key references, despite our best efforts to prevent any such omissions. One such unfortunately omitted citation from many recent works on model checking pertains to Carl Adam Petri’s seminal work. Petri not only proposed many basic ideas in concurrency and work-flow as early as 1963, but also saw the importance of focusing on control flow and synchronization. Similarly, Hoare and Milner pioneered much of the understanding of concurrency in terms of process algebras. However, none of the early works had emphasized algorithmic approaches similar to model checking, which is our focus in this section.
The breakthrough towards algorithmic methods for reasoning about concurrent systems (as opposed to the initial proof theoretic methods) was introduced in the work of Clarke and Emerson, Queille and Sifakis, and Clarke, Emerson, and Sistla. This line of work also received multiple fundamental contributions, notably from Vardi and Wolper who introduced an automata theoretic approach to automatic program verification, and a team of researchers at AT&T Bell Laboratories, notably by Holzmann, Peled, Yannakakis, and Kurshan, who developed various ways to build finite-state machine models and formally analyze them. Known as model checking, these methods relied on (i) creating a finite state model of the concurrent system being verified, and (ii) showing that this model possesses desired temporal properties (expressed in temporal logic). Graph traversal algorithms were employed in lieu of deductive methods, thus turning the whole exercise of verification largely into one of building system models as graphs, and performing traversals on these graphs without encountering state explosion.
State explosion—having to deal with an exponential number of states—is an unfortunate reality of model checking methods because finite-state models of concurrent systems tend to interleave in an exponential number of ways with respect to the number of components in the system. Effective methods to combat state explosion became the hot topic of research – but meanwhile model checking methods were being applied to a number of real systems, with success, finding deepseated bugs in them! In, Bryant published many seminal results pertaining to binary decision diagrams (BDD), and following his popularization of BDDs in the area of hardware verification, McMillan wrote his very influential dissertation on symbolic model checking. This is one line of work that truly made model checking feasible for certain “well structured,” very large state spaces, found in hardware modeling. The industry now employs BDDs in symbolic trajectory evaluation methods.
Model checking has truly caught on in the area of hardware verification, and promises to make inroads into software verification—the area of “software model checking” being very actively researched at the time of writing this very sentence. In particular, Boolean satisfiability (SAT) methods are being widely researched. In modern reasoning systems, SAT and BDD methods are being used in conjunction with first-order reasoning systems, for example in tools such as BLAST. In addition, higherorder logic based reasoning systems also employ BDD, SAT, and even model checking methods as automated proof assistants within them. As examples of concrete outcomes, we can mention two success stories:

Model checking in the design of modern microprocessors: All modern microprocessors are designed to be able to communicate with other microprocessors through shared memory. Unfortunately since only one processor can be writing at any memory location at a given time, and since “shared memory” exists in the form of multiple levels of caches, with further levels of caches being far slower to access than nearly levels of caches, extremely complex protocols are employed to allow processors to share memory. Even one bug in one of these protocols can render a microprocessor useless, requiring a redesign that can cost several 100s of millions of dollars. No modern microprocessor is sold today without its cache coherence protocol being debugged through model checking.
Model checking in the design of device drivers: Drivers for computer input/output devices such as Floppy Disk Controllers, USB Drivers, and Blue-tooth Drivers are extremely complex. Traditional debugging is unable to weed out hidden bugs unless massive amounts of debugging time are expended. Latent bugs can crash computers and/or become security holes. Projects such as the Microsoft Research SLAM project have technology transitioned model checking into the real world by making the Static Driver Verifier part of the Windows Driver Foundation. With this, and other similar developments, device-driver writers now have the opportunity to model-check their protocols and find deep-seated bugs that have often escaped, and/or have taken huge amounts of time to locate using traditional debugging cycles.
Has the enterprise of model checking succeeded? What about social processes? We offer two quotes:

Model checking has recently rescued the reputation of formal methods
Don’t rely on social processes for verification.

SKEDSOFT