Tim Mattson
The ideal programming language would map directly onto the way algorithm designers and programmers think. To achieve this ideal, language designers need to understand what goes on inside a programmer's head. In other words, language designers need a cognitive model for programming.
The literature on cognitive models of programming is both sparse and confusing. It is sparse because experiments leading to cognitive models are difficult to carry out. It is confusing because the research community has not converged on a consensus model.
Fortunately, there is a related problem that has been studied in more detail. This is the problem of how a programmer comprehends a program as he or she reads it. We can use this literature combined with the few cognitive models in the literature to paint a picture of a programmer's reasoning as they produce software.
This note presents a cognitive model of programming. Before launching into the model, I define my terms by talking about what a program is. This is followed by the model itself and then some closing comments about what I hope to do with this model.
A problem defines a situation to be resolved. The resolution of the problem is its solution. The definition of the problem in sufficient detail to drive a solution is called a specification.
The programmer takes a problem and finds a solution. When this solution is coded to execute on a computer, we call it a program.
Every problem can be viewed in terms of two parts: state and transitions.
The state is defined in terms of a space of primitive elements or the domain. The programmer works in two domains. The problem domain is the space within which the problem is defined. It's the domain exposed to the intended user of the program. The programming domain is the space within which the program is constructed.
For example, in a banking system, the primitive elements of the problem domain are account balances, deposits, and withdrawals, while the primitive elements of the programming domain are linked lists and transactions.
Transitions represent changes from one state to another. The transitions themselves are defined in terms of sequences of operations. When the operations are expressed in terms of a programming notation, they are referred to as statements.
So at last, we can define a program. A program is a solution to a problem designed to execute on a computer. It consists of a collection of statements that define the program's state and all possible transitions on that state.
Programmers progress on three fronts as they work on a programming problem:
Problem solving does not pass sequentially from specification to model to code. Rather, programmers work on all three fronts together, addressing at any given time the issues that gives them the "most bang for the buck". In the literature, this strategy is called opportunistic refinement.
For example, programmers start with the problem specification. They produce a high level model for the problem and use information from the model to better refine the specification. At some point, portions of the problem are modeled well enough that it is easier to represent them as code. These would be coded and possibly provide information that would lead to further refinements in the problem-model and specification.
The point is that programmers simultaneously work on all three fronts, shifting to whichever problem element allows them to be most productive at any given time. Before looking at opportunistic refinement in more detail, let's consider the three fronts of our cognitive model in more detail.
The problem specification is a definition of the problem to be solved. The specification is given in the original problem domain. Ideally, the specification contains all the information required to drive the problem solution. In some cases, the definition is a formal problem. In other cases, it is an informal description of the problem and a list of possible input and output conditions.
The specification is almost never complete when the programmer starts work on the problem. Even an experienced analyst finds it difficult to anticipate everything that needs to go into the problem specification. Hence, the specification is a moving target and except for the simplest of problems, it evolves as progress is made towards a solution.
The heart of our cognitive model is a progression of abstractions constructed by the programmer as he or she turns the specification into a program. Initially, the abstractions are high level models with the same structure and behavior as the original problem. The elements of these models are directly related to items in the problem domain.
The models are abstract and informal. As the programmer refines the design, the models move to lower levels of abstraction. Eventually, the contents of the models are dominated by elements from the programming language as opposed to the original problem domain. As more details are added, the models address lower-level issues and become abstractions of the computer upon which the program will execute.
The process of understanding and writing a programs becomes one of constructing a hierarchy of models and understanding how they fit together. In most cases, the models do not exist outside the programmer's mind. While informal, they are detailed enough to support mental simulations. These simulations play a very important role in reasoning about a program. In particular, programmers use models to propose and test hypothesis about the solution to the problem [Brooks83]. These hypothesis are tested by simulation in the model space. These simulations expose opportunities to further refine or correct the models.
At some point, solutions denoted in terms of abstract models are refined to a point where it makes sense to represent them as code. This transition from abstract model to code is perhaps the most critical one for us to understand. Two conclusions will guide our understanding of this transition from model to code.
First, a number of studies have shown ([Robertson90] [Petre88]) that the programming language has little impact on how experts solve their problems. In other words, prior to writing code in a formal programming language, the programmer represents the code using personal informal notations. These notations are constructed from many sources including programming languages, logic, mathematics, etc. and adapted to the problem at hand.
The issue of notation, however, dodges the more fundamental question. How does a programmer move from model to code? The literature suggests that this transition occurs in terms of groups or chunks of operations. These chunks are called "plans". Plans are code fragments that solve a particular problem. Plans have been used as the foundation for knowledge representation in a program for many years[Wiedenbeck93]. The idea is most closely associated with Soloway and his colleagues ([Solloway82], [Ehrlich84], [Soloway84]). One of the best studies producing empirical evidence for plans is [Rist86], in which plans were shown to exist by analyzing how novices and experts comprehended programs.
According to the plan theory of programming, experienced programmers hold large catalogs of plans in their minds. These programmers refine models until they exist at such low levels that the plans are apparent. At that point, the transition to code takes place in terms of these plans. When a completely new problem is encountered, a new plan is built up statement by statement starting with the key or focal statement.
Defining the basic components of a cognitive model isn't enough. To make a useful cognitive model, we need to understand the strategies employed by programmers as they write programs.
Programmers employ many strategies in the course of constructing a program. The strategies vary depending on how familiar the problem is to the programmer. As reported in [Rist86] when a problem is familiar to the programmer, he or she proceeds directly from the initial state to the solution. When the problem is unfamiliar, programmers use some form of goal chaining. The most common approach is to start from the final goal and use a goal back chaining to reduce the problem to sub-goals. This strategy is used recursively until a series of sub-goals are produced that can be directly mapped onto plans.
Goal chaining is a very general technique. It is effective later in the programming process, but the programmer needs to do considerable refinement of the specification and problem-models before goal chaining can be used productively.
So how does a program start the process of successive refinements? In most cases the initial refinements utilize one or both of the following approaches:
These techniques define handles for the programmers to use as they refine the problem-models and close in on a solution.
In this note, we have defined a basic cognitive model for programming. Our model views programming as taking place on three parallel fronts: the problem specification, problem-models, and code. The three fronts are refined opportunistically using a number of different strategies. The transition from abstract models to code is carried out in terms of chunks of code called plans.
We can explain a great deal about program construction and comprehension using this cognitive model. Our model is weak, however, when it comes to the strategies employed by programmers. Programmers employ many different strategies in the course of writing software. Our model shows where the strategies enter into the process, i.e., strategies are used to refine the models and to isolate and combine plans. Our model fails, however, to provide abstractions that can help us understand the diversity of strategies employed by programmers.
A reasonable question to ask at this point is "so what"? What can we do with this cognitive model? Our cognitive model can be used to help us more effectively compare different programming environments. It can be used to help us design programming environments that more closely match the way programmers think. Using this cognitive model, we have produced a short list of rules the programming environment designer can use. While this cognitive model is interesting as a way to understand the science behind programming, perhaps the most pragmatic benefit of this model are these rules.