In order to produce programs that are readable, reliable, and can be easily maintained or modified, one must use modular software design. This means that, instead of having a large collection of statements strung together in one partition of in-line code, we segment or divide the statements into logical groups called modules. Each module performs one or two tasks, then passes control to another module. By breaking up the code into "bite-sized chunks", so to speak, we are able to better control the flow of data and control. This is especially true in large software systems.
This section is organized as follows:
In Section 3.1, we discuss the basic justification and concepts associated with software modularity, and show how in-line code can be subdivided to support modular structure. Sections 3.2 and 3.3 contain a discussion of the PASCAL PROGRAM and PROCEDURE statements, and how they support software modularity. In Section 3.4, we show how constructs called variables can be used to pass information to and from procedures. (This transfer of information is similar in concept to passing food in and out of a cafeteria window.)
Definition. Spaghetti code is the term used for a computer program that is not well structured and tends to have highly tangled flows of data and control.
Example. Most assembly language code and machine code are good examples of spaghetti code. The following sample of machine code is illustrative:
    110101010010001000111001001
    010101001000100001011101001
    000111001101110001101101010
    001111010010010101011001010
    001010101111110100101010001
Clearly, there is very little discernible structure in this type of code.
Definition. In programming languages, the semantic gap is the difference between the language you use to program the hardware (machine code) and the language you would like to use to program the computer as a system. We call the latter, more abstract language a high-level language or HLL.
Observation. Throughout the history of computing, there have been hundreds of attempts to make computer programming languages something like English: easy to read and implicitly easy to understand. PASCAL is the result of one such effort. The designer of PASCAL, Niklaus Wirth, wanted to have an HLL that was easy to learn, read, and write. So, he designed PASCAL around the following concepts:
Clearly written software is often an elusive goal, because technical programmers tend to prefer cryptic variable names (e.g., PR2CD$ instead of clear notation such as PRICE). Furthermore, there are many programmers who do not have good writing skills, and definitely don't enjoy writing documentation. Thus, to be a good programmer, you must concentrate on improving the quality of your software not only through careful design and programming, but also through careful documentation.
Modular code is easy to produce from a design, but often hard to produce from spaghetti code. We discuss this process below, where we show general examples of code modularization. Modern software development tools facilitate the generation of modular code, and often check syntax of programming statements, with some variable type checking possible. Thus, there exists a variety of evolving techniques for software design in modular form.
Concisely written code is important to ensuring proper program function. For example, if your code is so tangled that you can't determine what it does, how easy will it be for others to understand your work? It is also important not to create excessively complicated procedures, which are difficult to debug and maintain, and thus tend to be unreliable.
Software reliability follows from rigorous software design, checking one's work, and carefully debugging and testing the software you write in an incremental fashion. By incremental development, we mean the construction of a software system and testing of that software on a piece-by-piece basis. For example, after you write the lowest-level routines, you should test them all thoroughly before you write the functions or procedures that call those routines.
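As a minimal sketch of incremental development, the following program tests one low-level routine on its own, before any procedure that calls it has been written. The function Square and the test values are hypothetical, chosen only for illustration, and the IF statement used here is a PASCAL selection structure.

```pascal
PROGRAM TestSquare;                      {Hypothetical test driver}

FUNCTION Square(n: integer): integer;    {Low-level routine under test}
BEGIN
  Square := n * n;
END;

BEGIN
  {Check a few known cases before writing any code that calls Square}
  IF (Square(0) = 0) AND (Square(3) = 9) AND (Square(-4) = 16) THEN
    WRITELN('All checks passed')
  ELSE
    WRITELN('At least one check failed');
END.
```

Once a driver like this reports success, the routines that call Square can be written and tested in the same way, one level at a time.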
Ease of maintenance follows directly from clarity and concision. For example, if code can be clearly understood, then you or others would have no trouble understanding and modifying its functionality. Additionally, concise code is easier to maintain because (a) there is less code to examine and modify, (b) there is less probability of making mistakes in modifying the code, and (c) debugging is easier due to limited scope of functionality.
Now, let's examine some techniques for the conversion of spaghetti code to modular form.
Spaghetti code is usually written piecewise, in the sense that a programmer might write Lines 1-50 on Monday, Lines 51-107 on Tuesday, and so forth. Because humans tend to group their thoughts, there is usually some locality to the spaghetti code. This means that dataflow and the flow of control (i.e., statement or expression execution) tend to cluster in space (i.e., in the code and in memory) and in time.
There exists a class of computer programs (usually, as part of a suite of software development tools), called performance analysis software, dependency checking software, etc. Such programs accept as input the spaghetti code and produce a calling tree that illustrates control and data dependencies among modules in the software system.
For example, consider the spaghetti code shown in Figure 3.1a, where the tangled lines denote flow of control. By partitioning the code of Figure 3.1a into various modules, each of which executes two or three operations or tasks, it is possible to produce the modular organization depicted in Figure 3.1b.
Figure 3.1. General concept of modularizing spaghetti code: (a) schematic illustration of control transfers (arcs) in spaghetti code; (b) procedure segmentation according to locality of control transfer.
A main program that calls three procedures is produced.
    PROCEDURE P1(< args >):
      < procedure definition >
    END-PROC
    PROCEDURE P2(< args >):
      < procedure definition >
    END-PROC
    PROCEDURE P3(< args >):
      < procedure definition >
    END-PROC
    PROGRAM Main:
      < declarations and/or executable code >
      P1(< args >)    #execute P1
      < executable code >
      P2(< args >)    #execute P2
      < executable code >
      P3(< args >)    #execute P3
      < executable code >
    END-PROG
The preceding pseudocode portrays the modularity shown in Figure 3.1b.
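As a rough sketch, the pseudocode for a main program with three procedures might be rendered in PASCAL as follows. The program name ModularDemo and the messages are placeholders, and the argument lists are omitted for brevity.

```pascal
PROGRAM ModularDemo;         {Plays the role of Main in the pseudocode}

PROCEDURE P1;                {First module}
BEGIN
  WRITELN('P1: first task');
END;

PROCEDURE P2;                {Second module}
BEGIN
  WRITELN('P2: second task');
END;

PROCEDURE P3;                {Third module}
BEGIN
  WRITELN('P3: third task');
END;

BEGIN                        {Main program: only calls the modules}
  P1;                        {Execute P1}
  P2;                        {Execute P2}
  P3;                        {Execute P3}
END.
```

Each procedure is defined before the main program body, so the body itself reads as a short list of the tasks performed.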
There are other methods that can facilitate conversion of spaghetti code to modularized code, which include:
MAIN > (P1 P2 P3). It is not necessarily easy to determine the calling hierarchy from spaghetti code, but this information can sometimes be available from software called an execution profiler. Such programs keep a record of control flow and can be used to determine branching and jump behavior (transfers of control).
The PASCAL language provides three methods for encapsulating code in procedures. First, the FUNCTION statement specifies a function that accepts values from its argument list and returns a value or result through the function name. Second, the PROCEDURE statement specifies a procedure that accepts values from its argument list and returns one or more values through its argument list. Third, the PROGRAM statement allows the programmer to specify high-level source code that calls predefined procedures to implement a structured software system. We define these statements as follows:
PROGRAM specification statement:
Purpose: The PROGRAM statement specifies the name of a main program (i.e., the top-level procedure).
Syntax: PROGRAM program-name ( input-file , output-file ) ;
where
    program-name denotes the name of the program
    input-file denotes the identifier of the file from which the program reads input
    output-file denotes the identifier of the file to which the program writes output.
Example:
    PROGRAM Prog1 (input, output);
    PROGRAM Prog1;
Notes: The input and output file names and their associated parentheses are optional. In standard PASCAL, the entries are identifiers (most commonly the predefined files input and output) rather than operating-system file names such as myfile.dat, and the way they are bound to actual files varies with the compiler and operating system (e.g., DOS, UNIX, etc.).
PASCAL programs have three parts:
    DO THIS                        NOT THIS
    -----------------------        ----------------------------
    MAIN-PROGRAM                   MAIN-PROGRAM
      Proc #1 specification          Proc #1 specification
        <proc-1 code>                  Proc #2 specification
      Proc #2 specification              <proc-2 code>
        <proc-2 code>                  <proc-1 code>
      Proc #3 specification          Proc #3 specification
        <proc-3 code>                  <proc-3 code>
      <main-program code>            <main-program code>
    END.                           END.
The preceding pseudocode becomes difficult to interpret visually (and, therefore, difficult to maintain) when Procedure #2 is defined within Procedure #1. Although this is valid from the perspective of PASCAL syntax, it is not good programming style, because it decreases readability and, therefore, increases code maintenance cost.
FUNCTION specification statement:
Purpose: The FUNCTION statement specifies the name of a procedure that inputs values through its argument list and can be thought of as returning a result through its name.
Syntax: FUNCTION function-name ( argument-1 : type-1 ; ... ; argument-N : type-N ) : result-type ;
where
    function-name denotes the name of the function
    argument-i denotes the name of the i-th argument of the function
    type-i denotes the datatype of the i-th argument
    result-type denotes the datatype of the value returned through the function name.
Example:
    FUNCTION sine(x: real): real;
    FUNCTION Distance(x, y: real): real;
Notes: Arguments that share a datatype can be listed together, separated by commas, as in Distance. Do not try to pass output values through the argument list of a function. This can cause problems in some PASCAL implementations.
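PROCEDURE specification statement:
Purpose: The PROCEDURE statement specifies the name of a procedure that can input and output values through its argument list.
Syntax: PROCEDURE proc-name ( argument-1 : type-1 ; ... ; argument-N : type-N ) ;
where
    proc-name denotes the name of the procedure
    argument-i denotes the name of the i-th argument of the procedure
    type-i denotes the datatype of the i-th argument.
An argument that returns a value to the caller is preceded by the keyword VAR.
Example:
    PROCEDURE sine(x: real; VAR result: real);
    PROCEDURE Distance(x, y: real; VAR result: real);
    PROCEDURE Smile;
Notes: In the preceding examples, the VAR argument result is used for passing results to the calling procedure. An argument declared without VAR receives a copy of the caller's value, so assignments made to it inside the procedure are not seen by the caller. (The name result is used rather than output because output is a predefined PASCAL identifier that denotes the standard output file.) In the procedure Smile, there are no input or output variables declared in the argument list, but this procedure (and others) may use global variables, which are discussed in Section 3.4.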
BEGIN...END block specification statement:
Purpose: The BEGIN...END statement delimits a block of statements, which is called a compound statement.
Syntax: BEGIN <statements> END
where
    statements denotes one or more PASCAL statements.
Example:
    BEGIN
      WRITELN('Hello, world');
      WRITELN('Second statement');
      WRITELN('Last statement');
    END;
Notes: In the preceding example, each statement ends with a semicolon (;). Since the PASCAL design philosophy views each statement as a clause, the semicolon punctuation convention (adopted from English) is employed.
General Comments: Indentation is used to highlight and clarify program structure. For example, each new level of statements should be indented two or three spaces to the right. When a block of statements is closed (e.g., with an END statement), then the indent shifts two or three spaces to the left. Each statement begins on a new line, except for multiple short assignment statements that initialize values in a program.
In the following section, we consider several examples of PASCAL procedural code.
    PROGRAM TestFun;                        {Program specification}
    VAR x: integer;                         {Declare variable x as integer}
    FUNCTION Xcubed(x: integer): integer;   {Function specification}
    BEGIN                                   {Function begins here}
      Xcubed := x * x * x;                  {Function definition}
    END;                                    {Function ends here}
    BEGIN                                   {Program begins here}
      x := 4;                               {Assign value to x}
      WRITELN('x3=', Xcubed(x));            {Print value of x^3}
    END.                                    {Program ends here}
In the preceding PASCAL code, note that the VAR statement specifies a variable of a given datatype. In this case, the variable x is specified as an integer. The function heading likewise gives the datatype of its argument x and, after the colon, the datatype of the value that the function returns. Additionally, the WRITELN statement outputs the legend x3= to the screen, followed by the value returned by the function call Xcubed(x). If we preferred not to put the function call in WRITELN's argument list, we could rewrite the preceding code as:
    PROGRAM TestFun;                        {Program specification}
    VAR x,y: integer;                       {Declare variables x,y as integer}
    FUNCTION Xcubed(x: integer): integer;   {Function specification}
    BEGIN                                   {Function begins here}
      Xcubed := x * x * x;                  {Function definition}
    END;                                    {Function ends here}
    BEGIN                                   {Program begins here}
      x := 4;                               {Assign value to x}
      y := Xcubed(x);                       {Assign function output to y}
      WRITELN('x3=', y);                    {Print value of y}
    END.                                    {Program ends here}
Let us replicate the functionality of the preceding code by using the PROCEDURE construct and passing the output through a procedural argument instead of a FUNCTION name, as follows:
    PROGRAM TestFun;                                {Program specification}
    VAR x,y: integer;                               {Declare variables x,y as integer}
    PROCEDURE Xcubed(x: integer; VAR y: integer);   {Procedure specification}
    BEGIN                                           {Procedure begins here}
      y := x * x * x;                               {Procedure definition - y gets x^3}
    END;                                            {Procedure ends here}
    BEGIN                                           {Program begins here}
      x := 4;                                       {Assign value to x}
      Xcubed(x,y);                                  {Procedure call}
      WRITELN('x3=', y);                            {Print value of y = x^3}
    END.                                            {Program ends here}
The keyword VAR before y in the procedure heading lets the procedure pass its result back to the caller. In the preceding programs, the variables x and y declared in the main program's VAR statement have global scope. That is, their definition as integers holds throughout the main program and called procedures (the function is also a called procedure), although inside Xcubed the names x and y refer to the arguments of the same names. In the following section, we shall see that there is a way to define x and y that makes procedures and functions reusable. This also facilitates efficiency and reliability in software development.
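One way to make such a routine reusable, sketched here as an illustration rather than as the method developed later in these notes, is to give the function its own argument and local variable so that it does not depend on any names in the main program. The names ReuseDemo, n, square, a, and b are hypothetical.

```pascal
PROGRAM ReuseDemo;                       {Hypothetical example program}

VAR a,b : integer;                       {Variables owned by the main program}

FUNCTION Xcubed(n: integer): integer;    {n is visible only inside Xcubed}
VAR square : integer;                    {Local variable, visible only inside Xcubed}
BEGIN
  square := n * n;
  Xcubed := square * n;
END;

BEGIN
  a := 2;
  b := 5;
  WRITELN('a3=', Xcubed(a));             {Same function, different arguments}
  WRITELN('b3=', Xcubed(b));
  WRITELN('3^3=', Xcubed(3));            {A constant works as well}
END.
```

Because Xcubed refers only to its own argument and local variable, it can be copied into another program without renaming anything in that program.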
In PASCAL, valid datatypes that we will consider in this class are:
In PASCAL, a variable name is a string of valid PASCAL characters that begins with a letter. We recommend that you use the characters {A-Z,a-z,0-9,_} for your variable names. The following example is illustrative:
    VALID NAMES            INVALID NAMES
    -------------------    ------------------
    Cost, Price            $amount, @price
    score                  score+exam-grade
In each case of invalid names, reserved symbols or characters that have multiple meanings are used in the name string. This is bad practice that can lead to compiler errors (i.e., your program won't compile), or can lead to confusion when debugging or modifying programs that contain such names.
In PASCAL, variables are typed using the VAR statement, which is described as follows:
VAR specification statement:
Purpose: The VAR (variable) statement specifies the name and datatype of procedure or program variables.
Syntax: VAR varname-1,...,varname-N : datatype ;
where
    varname-i denotes the name of the i-th variable in the list
    datatype denotes a valid PASCAL datatype.
Example:
    VAR x,y,z : integer;
    VAR sum,prod : real;
    VAR name,ssn : string;
Notes: It is good programming style to specify only one datatype in each VAR statement. It is also good style not to continue VAR statements on multiple lines. This makes the program easier to read.
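The following minimal sketch shows VAR statements written in this style inside a complete program. The program and variable names are hypothetical. The sketch assumes a Turbo Pascal or Free Pascal style compiler, which accepts the string datatype and more than one VAR statement in a declaration section.

```pascal
PROGRAM VarDemo;               {Hypothetical example program}

VAR count,total : integer;     {One VAR statement per datatype}
VAR price : real;
VAR name : string;             {string is assumed to be supported by the compiler}

BEGIN
  count := 3;                  {Assign values of the declared datatypes}
  total := count * 2;
  price := 2.50;
  name := 'Widget';
  WRITELN(name, ': count=', count, ' total=', total);
  WRITELN('price=', price:0:2);   {:0:2 prints two decimal places}
END.
```

Assigning a value of the wrong datatype, for example count := 'Widget', is rejected by the compiler, which is one of the benefits of declaring a datatype for every variable.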
We next consider the issue of scope of variables. This issue is discussed in detail in Chapter 6 of Koffman, the textbook for this class, from which we condense the following discussion.
Each nested procedure has its own declaration section and executable code. The latter is also called the procedure body. This is also true for main programs. Figure 3.2 illustrates procedure nesting in a program called Nest. Each rectangle represents a defined procedure, also called a program block.
A program block contains a program module's formal parameter list, which is defined by the VAR statement, as well as the declaration section and procedure body. In Figure 3.2, we illustrate the following procedure nesting hierarchy:
    (Nest > (Outer > Inner, Too))
The statements in each procedure operate only on local variables. This is good programming practice, and facilitates modularity. If we were to use global variables, which are declared once at the beginning of the main program and then hold through all procedures, this would be bad software engineering practice, because:
Figure 3.2. Scope of variables in a PASCAL program (after Koffman, 1992).
Example. A variable declared as type T in some procedure P is available within P and all its subordinate procedures as a variable of type T.
Example. Suppose we have the procedural definition hierarchy (Main > (P1 > (P1a, P1b), P2)). That is, P1 and P2 are defined within Main, and P1a and P1b are defined within P1. If a variable v is declared within Main as a string but within P1 as real, then v has the type real in P1, P1a, and P1b. However, v retains the type string in Main and P2.
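The scope rule in this example can be sketched as a small program. The program name ScopeDemo stands in for Main, and the values and messages are hypothetical. The sketch assumes a compiler that supports the string datatype. P2 reads a global variable only to make the scope visible, not as recommended practice.

```pascal
PROGRAM ScopeDemo;             {Plays the role of Main in the example}

VAR v : string;                {v is a string in the main program and in P2}

PROCEDURE P1;
VAR v : real;                  {This v hides the outer v in P1, P1a, and P1b}

  PROCEDURE P1a;
  BEGIN
    v := v + 1.0;              {Refers to the real v declared in P1}
  END;

  PROCEDURE P1b;
  BEGIN
    WRITELN('In P1b, v = ', v:0:1);
  END;

BEGIN                          {Body of P1}
  v := 2.5;
  P1a;
  P1b;
END;

PROCEDURE P2;
BEGIN
  WRITELN('In P2, v = ', v);   {Refers to the string v of the main program}
END;

BEGIN                          {Main program}
  v := 'hello';
  P1;
  P2;
  WRITELN('In main, v = ', v);
END.
```

The assignments made to v inside P1 and P1a change only the real variable declared in P1, so the string assigned in the main program is still intact when P2 and the final WRITELN print it.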
This concludes our overview of software modularity and PASCAL.
We next discuss PASCAL selection structures.