Tim Mattson
In this note, I summarize the literature on the psychology of programming. The audience for this note is programming environment developers as opposed to psychologists. To make this material most accessible to its target audience, the core of this note is a series of rules to be used when designing a new programming environment. These rules help tool developers produce systems that more closely match the way programmers think about programming.
A programming environment must fit in with the abstract models programmers use as they reason about their problems. This mapping must be clear, clean, and readily apparent to the programmer.
Expert programmers reason about the behavior of their programs as they design and write code. They do this in terms of mental models with the same functional nature and structure as the problem they are trying to solve.
These models are multi-leveled and include abstract models in the problem domain, working models of the programming language, and low-level models of the computer system. These models are rich enough to support mental simulations on an abstract computational model.
A programming environment needs to fit in with this hierarchy of models. This means that the models must be apparent in the environment and the interfaces between them must be easy to understand.
[Guindom90] showed how designers use simulations of their problems within abstract models as part of the design process. This rule follows from [Brooks83], in which program comprehension is described in terms of a series of models that start at the highest level, the problem domain, and work their way downward to machine models. The key is not only the models themselves but also an understanding of how the models map onto each other. This approach is discussed and related to plans in [Davis93].
The programmer - not the programming environment - must control which level of abstraction is used at any point in the software development cycle.
Writing programs requires reasoning at several different levels. For example, when making decisions about the input to a problem, it is useful to think of the problem in terms related to the problem space. On the other hand, when choosing between various algorithms or designing the data structures in a program, it is useful to drop down to a low-level model of the computer itself.
These multiple concerns require distinct models appropriate to the design task at hand. Hence, we say that the programmer uses a hierarchy of models. Each one describes the problem under consideration, but the language used and the details addressed vary greatly between the model levels.
As software is designed and coded, issues come up that are best addressed at different levels of abstraction. In other words, the layers in the model hierarchy are not addressed sequentially. The programmer may want to work at the highest-level model within the problem domain, then drop to the lowest level to choose the representation for the key data structures, and then jump back up to the top level for the next step in the design.
Some tools force the programmer to work through the abstraction layers sequentially. This does not match the way programmers think. They need to be able to choose when they work in terms of a high-level abstraction and when they drop down to a low-level approach that directly uses the computational model.
This topic is discussed in the review [Petre90].
The design in terms of the high-level models must be apparent in the program source code. In other words, the abstract design must be well grounded in the program text, with explicit mappings between the highest specification levels and the program's source code.
An inevitable part of programming is debugging and refining the source code. This means that a program must be read as well as written. In fact, a program is read many more times than it is written, so, if anything, the ability to read and comprehend a program is more important than how easy it is to write in the first place.
To understand a program, the reader must work backwards from the source code to construct the abstractions behind the software design. This process is much easier if the abstractions are clearly visible in the program text.
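As a minimal, hypothetical illustration of this point (the inventory example and every name in it are assumptions made for this sketch, not material from the cited literature), the two functions below compute the same result. In the first, the reader must reconstruct the design from tuple indices. In the second, the problem-domain abstractions appear by name in the program text.

```python
from dataclasses import dataclass

# Hypothetical example: all names are illustrative assumptions.


def f(a):
    # Low-level form: the meaning of x[1] and x[2] must be inferred.
    t = 0.0
    for x in a:
        if x[2] > 0:
            t += x[1] * x[2]
    return t


@dataclass
class StockItem:
    name: str
    unit_price: float
    quantity: int

    def is_in_stock(self) -> bool:
        return self.quantity > 0

    def value(self) -> float:
        return self.unit_price * self.quantity


def inventory_value(items):
    # Domain-level form: "sum the value of the items in stock"
    # can be read directly from the source text.
    return sum(item.value() for item in items if item.is_in_stock())
```

Both versions do the same work. The difference lies only in how much of the abstract design a later reader has to rebuild from the code.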
This rule is based on the five-level characterization of expert programmer knowledge discussed in [Wiedenbeck93]. It is also consistent with the models of comprehension proposed in [Brooks83].
A programming environment must support software development in which design and coding progress together (i.e., opportunistic refinement).
Programmers spend very little time in pure design activities. It is wishful thinking by software managers that a program is fully designed before coding begins. Programmers switch between design and coding throughout a project. They come up with some ideas, test them with rapid prototypes, and amend their design as appropriate.
We call this approach "opportunistic refinement". A programmer confronted with a problem switches between design, coding, and testing activities depending on where the best opportunities for making progress lie.
This is discussed in the cognitive model paper.
A programming environment must not hide the machine from the programmer.
Programmers know their software will execute on a real system even when they are working on high-level design issues. A programming environment must not interfere with this view of the system. This is particularly the case where efficiency is concerned.
This topic is discussed in the review [Petre90]. This rule is essentially a corollary of Rule 2 (the programmer, not the programming environment, controls the level of abstraction).
Programmers work in terms of chunks of code. A good programming environment will make it easy to code in chunks at increasingly higher levels of abstraction.
Programmers design code in chunks or plans. The more experienced the programmer, the more plans are used as opposed to statement-by-statement construction of a program. As the programmer becomes even more experienced, these plans take on a larger role and apply to the high-level design of a program. A good programming environment needs to recognize this approach to programming and make the high-level design in plans explicit.
Plans are usually discussed in terms of program comprehension. For a discussion of plans in program construction, see [Rist86].
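The contrast between statement-by-statement construction and construction from plans can be sketched in code. The example below is a hypothetical illustration only: the sentinel-terminated averaging task and the plan names (a sentinel-controlled input plan, and running-total, counter, and guard plans) are assumptions chosen for this sketch, not definitions taken from the cited papers.

```python
# Hypothetical illustration; function and plan names are assumptions.


def average_statement_by_statement(readings, sentinel=-1):
    # Every plan is interleaved in one loop; the reader must untangle them.
    total = 0
    count = 0
    for r in readings:
        if r == sentinel:
            break
        total += r
        count += 1
    if count == 0:
        return None
    return total / count


def take_until(values, sentinel):
    """Sentinel-controlled input plan: yield values up to the sentinel."""
    for v in values:
        if v == sentinel:
            return
        yield v


def mean_or_none(values):
    """Running-total, counter, and divide-by-zero guard plans as one chunk."""
    values = list(values)
    return sum(values) / len(values) if values else None


def average_with_plans(readings, sentinel=-1):
    # The high-level design is explicit: compose two named plans.
    return mean_or_none(take_until(readings, sentinel))
```

In the second form each plan has a name and a boundary, so the high-level design is visible as a composition of chunks rather than as a sequence of individual statements.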