Data Structures, Algorithms, & Applications in Java
Amortized Complexity
Copyright 1999 Sartaj Sahni

What is Amortized Complexity?
Examples
Maintenance Contract
The McWidget Company
Subset Generation
Simulated Pointers
Array Stacks and Queues
Union-Find
Rearranging Railroad Cars
Exercises
Further Reading

What is Amortized Complexity?
The complexity of a method or operation, as defined in Chapter 2 of the text, is the actual complexity of the method/operation. The actual complexity of an operation is determined by the step count for that operation, and the actual complexity of a sequence of operations is determined by the step count for that sequence. The actual complexity of a sequence of operations may be determined by adding together the step counts for the individual operations in the sequence. Typically, determining the step count for each operation in the sequence is quite difficult, and instead, we obtain an upper bound on the step count for the sequence by adding together the worst-case step count for each operation.

Example Consider the method insert of Program 2.10. This method inserts an element into a sorted array, and its step count ranges from a low of 4 to a high of 2n+4, where n is the number of elements already in the array. Suppose we perform 5 insert operations beginning with n = 0. Further, suppose, that the actual step counts for these insert operations are 4, 4, 6, 10, and 8, respectively. The actual step count for the sequence of insert operations is 4 + 4 + 6 + 10 + 8 = 32. If we did not know the actual step count for the individual operations, we could obtain an upper bound on the actual step count for the operation sequence using one of the following two approaches.

Since the worst-case step count for an insert operation is 2n+4, sum_{0 <= i <= 4}(2i+4) = 4 + 6 + 8 + 10 + 12 = 40 is an upper bound on the step count for the sequence of 5 inserts.
The maximum number of elements already in the array at the time an insert operation begins is 4. Therefore, the worst-case step count of an insert operation is 2*4+4 = 12. Therefore, 5*12 = 60 is an upper bound on the step count for the sequence of 5 inserts.

In the preceding example, the upper bound obtained by the first approach is closer to the actual step count for the operation sequence. We say that the count obtained by the first approach is a tighter (i.e., closer to the real count) upper bound than that obtained by the second approach.

When determining the complexity of a sequence of operations, we can, at times, obtain tighter bounds using amortized complexity rather than worst-case complexity. Unlike the actual and worst-case complexities of an operation which are closely related to the step count for that operation, the amortized complexity of an operation is an accounting artifact that often bears no direct relationship to the actual complexity of that operation. The amortized complexity of an operation could be anything. The only requirement is that the sum of the amortized complexities of all operations in the sequence be greater than or equal to the sum of the actual complexities. That is

(1)       sum_{1 <= i <= n}amortized(i) >=
sum_{1 < = i <= n}actual(i)

where amortized(i) and actual(i), respectively, denote the amortized and actual complexities of the ith operation in a sequence of n operations. Because of this requirement on the sum of the amortized complexities of the operations in any sequence of operations, we may use the sum of the amortized complexities as an upper bound on the complexity of any sequence of operations.

You may view the amortized cost of an operation as being the amount you charge the operation rather than the amount the operation costs. You can charge an operation any amount you wish so long as the amount charged to all operations in the sequence is at least equal to the actual cost of the operation sequence.

Relative to the actual and amortized costs of each operation in a sequence of n operations, we define a potential function P(i) as below
(2) P(i) = amortized(i) - actual(i) + P(i-1)

That is, the ith operation causes the potential function to change by the difference between the amortized and actual costs of that operation. If we sum Equation (2) for 1 <= i <= n, we get
sum _{1 <= i <= n}P(i) = sum _{1 <= i <= n}(amortized(i) - actual(i) + P(i-1))
or
sum _{1 <= i <= n}(P(i) - P(i-1)) = sum _{1 <= i <= n}(amortized(i) - actual(i))
or
P(n) - P(0) = sum _{1 <= i <= n}(amortized(i) - actual(i))

From Equation (1), it follows that
(3) P(n) - P(0) >= 0

Under the assumption that P(0) = 0, the potential P(i) is the amount by which the first i operations have been overcharged (i.e., they have been charged more than their actual cost).

Generally, when we analyze the complexity of a sequence of n operations, n can be any nonnegative integer. Therefore, Equation (3) must hold for all nonegative integers.

The preceding discussion leads us to the following three methods to arrive at amortized costs for operations:

Aggregate Method
In the aggregate method, we determine an upper bound UpperBoundOnSumOfActualCosts(n) for the sum of the actual costs of the n operations. The amortized cost of each operation is set equal to UpperBoundOnSumOfActualCosts(n)/n. You may verify that this assignment of amortized costs satisfies Equation (1) and is, therefore, valid.
Accounting Method
In this method, we assign amortized costs to the operations (probably by guessing what assignment will work), compute the P(i)s using Equation (2), and show that P(n)-P(0) >= 0.
Potential Method
Here, we start with a potential function (probably obtained using good guess work) that satisfies Equation (3), and compute the amortized complexities using Equation (2).

Although, at this time, you may feel most comfortable with the aggregate method. this method is often the hardest to use because it is often quite difficult to obtain a bound on the aggregate actual cost that is smaller than the bound obtained by using the worst-case cost of each operation in the sequence. The accounting method is intuitive (we simply verify that the sum of the amortized costs is at least equal to the sum of the actual costs) and often results in tight bounds on the complexity of a sequence of operations. The potential method is often the hardest to use (because of the difficulty of determining the proper potential function to use), but for some applications remains the only way to obtain tight complexity bounds.

Examples

1. Maintenance Contract
Problem Definition
In January, you buy a new car from a dealer who offers you the following maintenance contract: $50 each month other than March, June, September and December (this covers an oil change and general inspection), $100 every March, June, and September (this covers an oil change, a minor tune-up, and a general inspection), and $200 every December (this covers an oil change, a major tune-up, and a general inspection). We are to obtain an upper bound on the cost of this maintenance contract as a function of the number of months.

Worst-Case Method
Using the ideas developed in Chapter 2, we can bound the contract cost for the first n months by taking the product of n and the maximum cost incurred in any month (i.e., $200). This would be analagous to the traditional way to estimate the complexity--take the product of the number of operations and the worst-case complexity of an operation. Using this approach, we get $200n as an upper bound on the contract cost. The upper bound is correct because the actual cost for n months does not exceed $200n.

Aggregate Method
To use the aggregate method for amortized complexity, we first determine an upper bound on the sum of the costs for the first n months. As tight a bound as is possible is desired. The sum of the actual monthly costs of the contract for the first n months is


200*floor(n/12) + 100*(floor(n/3) - floor(n/12)) + 50*(n - floor(n/3))

= 100*floor(n/12) + 50*floor(n/3) + 50*n

<= 100*n/12 + 50*n/3 + 50*n

= 50n(1/6 + 1/3 + 1)

= 50n(3/2)

= 75n

The amortized cost for each month is set to $75. The table below shows the actual costs, the amortized costs, and the potential function value (assuming P(0) = 0) for the first 16 months of the contract.

month          |  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
actual cost    | 50  50 100  50  50 100  50  50 100  50  50 200  50  50 100  50
amortized cost | 75  75  75  75  75  75  75  75  75  75  75  75  75  75  75  75
P()            | 25  50  25  50  75  50  75 100  75 100 125   0  25  50  25  50

Notice that some months are charged more than their actual costs and others are charged less than their actual cost. The cumulative difference between what the operations are charged and their actual costs is given by the potential function. The potential function satisfies Equation (3) for all values of n. When we use the amortized cost of $75 per month, we get $75n as an upper bound on the contract cost for n months. This bound is tighter than the bound of $200n obtained using the worst-case monthly cost.

Accounting Method
When we use the accounting method, we must first assign an amortized cost for each month and then show that this assignment satisifes Equation (3). We have the option to assign a different amortized cost to each month. In our maintenance contract example, we know the actual cost by month and could use this actual cost as the amortized cost. It is, however, easier to work with an equal cost assignment for each month. Later, we shall see examples of operation sequences that consist of two or more types of operations (for example, when dealing with lists of elements, the operation sequence may be made up of search, insert, and remove operations). When dealing with such sequences we often assign a deifferent amortized cost to operations of different types (however, operations of the same type have the same amortized cost).

To get the best upper bound on the sum of the actual costs, we must set the amortized monthly cost to be the smallest number for which Equation (3) is satisifed for all n. From the above table, we see that using any cost less than $75 will result in P(n) - P(0) < 0 for some values of n. Therefore, the smallest assignable amortized cost consistent with Equation (3) is $75.

Generally, when the accounting method is used, we have not computed the aggregate cost. Therefore, we would not know that $75 is the least assignable amortized cost. So we start by assigning an amortized cost (obtained by making an educated guess) to each of the different operation types and then proceed to show that this assignment of amortized costs satisfies Equation (3). Once we have shown this, we can obtain an upper bound on the cost of any operation sequence by computing


sum_{1 <= i <= k} f(i) * amortized(i)

where k is the number of different operation types and f(i) is the frequency of operation type i (i.e., the number of times operations of this type occur in the operation sequence).

For our maintenance contract example, we might try an amortized cost of $70. When we use this amortized cost, we discover that Equation (3) is not satisifed for n = 12 (for example) and so $70 is an invalid amortized cost assignment. We might next try $80. By constructing a table such as the one above, we will observe that Equation (3) is satisfied for all months in the first 12 month cycle, and then conclude that the equation is satisifed for all n. Now, we can use $80n as an upper bound on the contract cost for n months.

Potential Method
We first define a potential function for the analysis. The only guideline you have in defining this function is that the potential function represents the cumulative difference between the amortized and actual costs. So, if you have an amortized cost in mind, you may be able to use this knowledge to develop a potential function that satsifies Equation (3), and then use the potential function and the actual operation costs (or an upper bound on these actual costs) to verify the amortized costs.

If we are extremely experienced, we might start with the potential function


P(0) = 0


P(n) = 0 for n mod 12 = 0


P(n) = 25 for n mod 12 = 1 or 3


P(n) = 50 for n mod 12 = 2, 4 or 6


P(n) = 75 for n mod 12 = 5, 7 or 9


P(n) = 100 for n mod 12 = 8, or 10


P(n) = 125 for n mod 12 = 11

Without the aid of the table constructed for the aggregate method, it would take quite some ingenuity to come up with this potential function. Having formulated a potential function and verified that this potential function satisfies Equation (3) for all n, we proceed to use Equation (2) to determine the amortized costs.

From Equation (2), we obtain
amortized(i) = actual(i) + P(i) - P(i-1)

Therefore, amortized(1) = actual(1) + P(1) - P(0) = 50 + 25 - 0 = 75, amortized(2) = actual(2) + P(2) - P(1) = 50 + 50 - 25 = 75, amortized(3) = actual(3) + P(3) - P(2) = 100 + 25 - 50 = 75, and so on. Therefore, the amortized cost for each month is $75. So, the actual cost for n months is at most $75n.

2. The McWidget Company
Problem Definition
The famous McWidget company manufactures widgets. At its headquarters, the company has a large display that shows how many widgets have been manufactured so far. Each time a widget is manufactured, a maintenance person updates this display. The cost for this update is $c + dm, where c is a fixed trip charge, d is a charge per display digit that is to be changed, and m is the number of digits that are to be changed. For example, when the display is changed from 1399 to 1400, the cost to the company is $c + 3d because 3 digits must be changed. The McWidget company wishes to amortize the cost of maintaining the display over the widgets that are manufactured, charging the same amount to each widget. More precisely, we are looking for an amount $e = amortized(i) that should leavied against each widget so that the sum of these charges equals or exceeds the actual cost of maintaining/updating the display ($e*n >= actual total cost incurred for first n widgets for all n >= 1). To keep the overall selling price of a widget low, we wish to find as small an e as possible. Clearly, e > c + d because each time a widget is made, at least one digit (the least significant one) has to be changed.

Worst-Case Method
This method does not work well in this application because there is no finite worst-case cost for a single display update. As more and more widgets are manufactured, the number of digits that need to be changed increases. For example, when the 1000th widget is made, 4 digits are to be changed incurring a cost of c + 4d, and when the 1,000,000th widget is made, 7 digits are to be changed incurring a cost of c + 7d. If we use the worst-case method, the amortized cost to each widget becomes infinity.

Aggregate Method
Let n be the number of widgets made so far. As noted earlier, the least significant digit of the display has been changed n times. The digit in the tens place changes once for every ten widgets made, that in the hundreds place changes once for evey hundred widgets made, that in the thousands place changes once for evey thousand widgets made, and so on. Thefore, the aggregate number of digits that have changed is bounded by n(1 + 1/10 + 1/100 + 1/1000 + ...) = (1.11111...)n. So, the amortized cost of updating the display is c + d(1.11111...)n/n < c + 1.12d. If the McWidget company adds $c + 1.12d to the selling price of each widget, it will collect enough money to pay for the cost of maintaining the display. Each widget is charged the cost of changing 1.12 digits regardless of the number of digits that are actually changed. The table given below shows the actual cost, as measured by the number of digits that change, of maintaining the display, the amortized cost (i.e., 1.12 digits per widget), and the potential function. The potential function gives the difference between the sum of the amortized costs and the sum of the actual costs. Notice how the potential function builds up so that when it comes time to pay for changing two digits, the previous potential function value plus the current amortized cost exceeds 2. From our derivation of the amortized cost, it follows that the potential function is always nonnegative.

widget        |    1    2    3    4    5    6    7    8    9   10   11   12   13   14 
actual cost   |    1    1    1    1    1    1    1    1    1    2    1    1    1    1
amortized cost| 1.12 1.12 1.12 1.12 1.12 1.12 1.12 1.12 1.12 1.12 1.12 1.12 1.12 1.12
P()           | 0.12 0.24 0.36 0.48 0.60 0.72 0.84 0.96 1.08 0.20 0.32 0.44 0.56 0.68
=============================================================================================
widget        |   15   16   17   18   19   20   21   22   23   24   25   26   27   28 
actual cost   |    1    1    1    1    1    2    1    1    1    1    1    1    1    1
amortized cost| 1.12 1.12 1.12 1.12 1.12 1.12 1.12 1.12 1.12 1.12 1.12 1.12 1.12 1.12
P()           | 0.80 0.92 1.04 1.16 1.28 0.40 0.52 0.64 0.76 0.88 1.00 1.12 1.24 1.36

Accounting Method
We begin by assigning an amortized cost to the individual operations, and then we show that these assigned costs satsify Equation (3). Having already done an amortized analysis using the aggregate method, we see that Equation (3) is satisfied when we assign an amortized cost of $c + 1.12d to each display change. Typically, however, the use of the accounting method is not preceded by an application of the aggregate method and we start by guessing an amortized cost and then showing that this guess satisfies Equation (3).

Suppose we assign a guessed amortized cost of $c + 2d for each display change.

P(n) - P(0) = sum _{1 <= i <= n}(amortized(i) - actual(i))

= (c + 2d)n - sum_{1 <= i <= n}actual(i)

= (c + 2d)n - (c + (1 + 1/10 + 1/100 + ...)d)n

>= (c + 2d)n - (c + 1.12d)n

>= 0

This analysis also shows us that we can reduce the amortized cost of a widget to $c + 1.12d.

An alternative proof method that is useful in some analyses involves distributing the excess charge P(i) - P(0) over various accounting entities, and using these stored excess charges (called credits) to establish P(i+1) - P(0) >= 0. For our McWidget example, we use the display digits as the accounting entities. Initially, each digit is 0 and each digit has a credit of 0 dollars. Suppose we have guessed an amortized cost of $c + (1.111...)d. When the first widget is manufactured, $c + d of the amortized cost is used to pay for the update of the display and the remaining $(0.111...)d of the amortized cost is retained as a credit by the least significant digit of the display. Similarly, when the second through ninth widgets are manufactured, $c + d of the amortized cost is used to pay for the update of the display and the remaining $(0.111...)d of the amortized cost is retained as a credit by the least significant digit of the display. Following the manufacture of the ninth widget, the least significant digit of the display has a credit of $(0.999...)d and the remaining digits have no credit. When the tenth widget is manufactured, $c + d of the amortized cost are used to pay for the trip charge and the cost of changing the least significant digit. The least significant digit now has a credit of $(1.111...)d. Of this credit, $d are used to pay for the change of the next least significant digit (i.e., the digit in the tens place), and the remaining $(0.111...)d are transferred to the ten's digit as a credit. Continuing in this way, we see that when the display shows 99, the credit on the ten's digit is $(0.999...)d and that on the one's digit (i.e., the least significant digit) is also $(0.999...)d. When the 100th widget is manufactured, $c + d of the amortized cost are used to pay for the trip charge and the cost of changing the least significant digit, and the credit on the least significant digit becomes $(1.111...)d. Of this credit, $d are used to pay for the change of the tens digit from 9 to 0, the remaining $(0.111...)d credit on the one's digit is transferred to the ten's digit. The credit on the ten's digit now becomes $(1.111...)d. Of this credit, $d are used to pay for the change of the hundred's digit from 0 to 1, the remaining $(0.111...)d credit on the ten's digit is transferred to the hundred's digit.

The above accounting scheme ensures that the credit on each digit of the display always equals $(0.111...)dv, where v is the value of the digit (e.g., when the display is 206 the credit on the one's digit is $(0.666...)d, the credit on the tens's digit is $0, and that on the hundred's digit is $(0.222...)d.

From the preceding discussion, it follows that P(n) - P(0) equals the sum of the digit credits and this sum is always nonnegative. Therefore, Equation (3) holds for all n.

Potential Method
We first postulate a potential function that satisfies Equation (3), and then use this function to obtain the amortized costs. From the alternative proof used above for the accounting method, we can see that we should use the potential function
P(n) = (0.111...)d sum_iv_i

where v_i is the value of the ith digit of the display. For example, when the display shows 206 (at this time n = 206), the potential function value is (0.888...)d. This potential function satisifies Equation (3).

Let q be the number of 9s at the right end of j (i.e., when

j
= 12903999

, q = 3). When the display changes from j to j+1, the potential change is (0.111...)d(1-9q) and the actual cost of updating the display is $c + (q+1)d. From Equation (2), it follows that the amortized cost for the display change is

actual cost + potential change

= c + (q+1)d + (0.111...)d(1-9q)

= c + (1.111...)d

3. Subset Generation
Problem Definition
The subsets of a set of n elements are defined by the 2ⁿ vectors x[1:n], where each x[i] is either 0 or 1. x[i] = 1 iff the ith element of the set is a member of the subset. The subsets of a set of three elements are given by the eight vectors 000, 001, 010, 011, 100, 101, 110, 111, for example. A recursive method to generate all subsets was developed as the solution to Exercise 1.29. We shall now develop an enumerator for the subsets. For convenience, we shall actually enumerate only the nonempty subsets (i.e., all subsets other than 000...000). Further, we shall deviate from the syntax of Java enumerators and provide only two methods--a constructor and a method nextSubset, which returns null when there is no next subset and returns the next subset otherwise. The code is given below.

public class SubsetEnumerator
{
   // data member
   int n;      // set size
   int [] x;   // subset array
   
   /** constructor */
   public SubsetEnumerator(int theN)
   {
      n = theN;
      x = new int [n + 1];
   }

   /** @return next subset
     * @return null iff there is no next subset */
   public int [] nextSubset()
   {
      // generate next subset
      int i = n;
      while (i > 0 && x[i] == 1)
      {
         x[i] = 0;
         i--;
      }

      if (i == 0)
         return null;
      else
      {
         x[i] = 1;
         return x;
      }
   }

   /** test program */
   public static void main(String [] args)
   {
      int n = 4;
      int [] x;
      SubsetEnumerator sv = new SubsetEnumerator(n);

      // output all subsets other than the null subset
      while (true)
      {
         x = sv.nextSubset();
         if (x == null) break;
         for (int i = 1; i <= n; i++)
            System.out.print(x[i] + " ");
         System.out.println();
      }
   }
}

We wish to determine how much time it takes to generate the first m, 1 <= m < 2ⁿ subsets.

Worst-Case Method
The complexity of nextSubset is Theta(c), where c is the number of xs that change. Since all n of the xs could change in a single invocation of nextSubset, the worst-case complexity of nextSubset is Theta(n). Using the worst-case method, the time required to generate the first m subsets is O(mn).

Aggregate Method
The complexity of nextSubset equals the number of x[i]s that change. When nextSubset is invoked m times, x[n] changes m times; x[n - 1] changes floor(m/2) times; x[n - 2] changes floor(m/4) times; x[n - 3] changes floor(m/8) times; and so on. Therefore the sum of the actual costs of the first m invocations is


sum_{0 <= i <= floor(log₂m)}floor(m/2ⁱ)
< 2m

Therefore, the complexity of generating the first m subsets is actually O(m), a tighter bound than obtained using the worst-case method.

The amortized complexity of nextSubset is (sum of actual costs)/m < 2m/m = O(1).

Accounting Method
We first guess the amortized complexity of nextSubset, and then show that this amortized complexity satisfies Equation (3). Suppose we guess that the amortized complexity is 2. To verify this guess, we must show that
P(m) - P(0) >= 0

for all m.

We shall use the alternative proof method used in the McWidget example. In this method, we distribute the excess charge P(i) - P(0) over various accounting entities, and use these stored excess charges to establish P(i+1) - P(0) >= 0. We use the x[j]s as the accounting entities. Initially, each x[j] is 0 and has a credit of 0. When the first subset is generated, 1 unit of the amortized cost is used to pay for the single x[j] that changes and the remaining 1 unit of the amortized cost is retained as a credit by x[n], which is the x[j] that has changed to 1. When the second subset is generated, the credit on x[n] is used to pay for changing x[n] to 0 in the while loop, 1 unit of the amortized cost is used to pay for changing x[n-1] to 1, and the remaining 1 unit of the amortized cost is retained as a credit by x[n-1], which is the x[j] that has changed to 1. When the third subset is generated, 1 unit of the amortized cost is used to pay for changing x[n] to 1, and the remaining 1 unit of the amortized cost is retained as a credit by x[n], which is the x[j] that has changed to 1. When the fourth subset is generated, the credit on x[n] is used to pay for changing x[n] to 0 in the while loop, the credit on x[n-1] is used to pay for changing x[n-1] to 0 in the while loop, 1 unit of the amortized cost is used to pay for changing x[n-2] to 1, and the remaining 1 unit of the amortized cost is retained as a credit by x[n-2], which is the x[j] that has changed to 1. Continuing in this way, we see that each x[j] that is 1 has a credit of 1 unit on it. This credit is used to pay the actual cost of changing this x[j] from 1 to 0 in the while loop. One unit of the amortized cost of nextSubset is used to pay for the actual cost of changing an x[j] to 1 in the else clause, and the remaining one unit of the amortized cost is retained as a credit by this x[j].

The above accounting scheme ensures that the credit on each x[j] that is 1 is exactly 1, and the credit on each x[j] that is 0 is 0.

From the preceding discussion, it follows that P(m) - P(0) equals the number of x[j]s that are 1. Since this number is always nonnegative, Equation (3) holds for all m.

Having established that the amortized complexity of nextSubset is 2 = O(1), we conclude that the complexity of generating the first m subsets equals m * amortized complexity = O(m).

Potential Method
We first postulate a potential function that satisfies Equation (3), and then use this function to obtain the amortized costs. Let P(j) be the potential just after the jth subset is generated. From the proof used above for the accounting method, we can see that we should define
P(j) = number of x[i]s in the jth subset that are equal to 1.

By definition, the 0th subset has all x[i] equal to 0. Since P(0) = 0 and P(j) >= 0 for all j, this potential function P satisfies Equation (3).

Consider any subset x[1:n]. Let q be the number of 1s at the right end of x[] (i.e.,

x[n],
x[n-1], ..., x[n-q+1]

, are all 1s). Assume that there is a next subset. When the next subset is generated, the potential change is 1-q because q 1s are replaced by 0 in the while loop and a 0 is replaced by a 1 in the else clause. The actual cost of generating the next subset is q+1. From Equation (2), it follows that, when there is a next subset, the amortized cost for nextSubset is

actual cost + potential change

= q + 1 + 1 - q

= 2

When there is no next subset, the potential change is -q and the actual cost of nextSubset is q. From Equation (2), it follows that, when there is no next subset, the amortized cost for nextSubset is

actual cost + potential change

= q - q

= 0

Therefore, we can use 2 as the amortized complexity of nextSubset. Consequently, the actual cost of generating the first m subsets is O(m).

4. Simulated Pointers
Problem Definition
Section 7.3 of the text discusses how we can simulate Java pointers using integers. The discussion of Section 7.3 resulted in the development of the class SimulatedSpace1 which has a constructor and the methods allocateNode and deallocateNode. We wish to determine the time it takes to perform any sequence of a allocateNode and d deallocateNode operations.

The code for the class SimulatedSpace1 For the analysis, we assume that the cost of changeSize1D is node.length. is given below.

public class SimulatedSpace1
{
   // data members
   private int firstNode;
   SimulatedNode [] node;  // package visible

   // constructor
   public SimulatedSpace1(int numberOfNodes)
   {
      node = new SimulatedNode [numberOfNodes];

      // create nodes and link into a chain
      for (int i = 0; i < numberOfNodes - 1; i++)
         node[i] = new SimulatedNode(i + 1);
      // last node of array and chain
      node[numberOfNodes - 1] = new SimulatedNode(-1);
      // firstNode has the default initial value 0
   }
   
   public int allocateNode(Object element, int next)
   {// Allocate a free node and set its fields.
      if (firstNode == -1)
      {   // double number of nodes
          node = (SimulatedNode []) ChangeArraySize.
                 changeSize1D(node, 2 * node.length);
          // create and link new nodes
          for (int i = node.length / 2;
               i < node.length - 1; i++)
             node[i] = new SimulatedNode(i + 1);
          node[node.length] = new SimulatedNode(-1);
          firstNode = node.length / 2;
      }

      int i = firstNode;         // allocate first node
      firstNode = node[i].next;  // firstNode points to
                                 // next free node
      node[i].element = element; 
      node[i].next = next;
      return i;
   }
   
   public void deallocateNode(int i)
   {// Free node i.
      // make i first node on free space list
      node[i].next = firstNode;
      // remove element reference so that space can be garbage collected
      node[i].element = null;
      firstNode = i;
   }
}

Worst-Case Method
Since the worst-case complexity of allocateNode is Theta(a) (because of the array doubling done by the invocation of changeSize1d) and that of deallocateNode is Theta(1), the complexity of a sequence of a allocateNode and d deallocateNode operations is O(a²+d).

Aggregate Method
The total time spent on a allocateNode operations is O(a + time spent in array doubling). The time spent in array doubling is O(sum_{0 <= i <= k}2ⁱ), where k = ceil(log₂a). Therefore, the time spent in array doubling is O(a), and the aggregate time for a allocateNode operations is O(a). The amortized complexity of allocateNode is therefore O(a/a) = O(1).

The aggregate time for d deallocateNode operations is O(d), and the amortized complexity of deallocateNode is O(d/d) = O(1).

Consequently, the actual complexity of a sequence of a allocateNode and d deallocateNode operations is O(a+d). This is a tighter bound than obtained using the worst-case method.

Accounting Method
For the deallocate method, we assign an amortized complexity that equals its worst-case complexity, that is O(1). For the allocate method, we assign an amortized complexity of 3, which is less than its worst-case complexity. To verify the correctness of this assignment, we must show that
P(a) - P(0) >= 0

for all a.

We shall use the alternative proof method introduced in the McWidget example. We use the node[i]s as the accounting entities. Initially, each node[i] has a credit of 0. When a node, node[i], is allocated, we use 1 unit of the amortized cost to pay for the actual cost of the node allocation (excluding any array doubling that may occur), and the remaining 2 units of the amortized cost are retained as a credit by node[i]. When array doubling occurs for the first time, all nodes have been allocated and so have a credit of 2 units on them. The total available credits are 2*node.lengthBeforeDoubling. These credits are used to pay for the array doubling (the cost of doubling the array size is node.lengthBeforeDoubling). When any subsequent array doubling occurs, the nodes node[j], node.lengthAfterDoubling/2 <= j < node.lengthAfterDoubling, have a credit of at least 2 units on them (other nodes may also have a nonzero credit). Therefore, the total credits on the nodes is at least node.lengthAfterDoubling. These credits are adequate to pay for the subsequent array doubling.

From the preceding discussion, it follows that P(a) - P(0) is always nonnegative. Therefore, Equation (3) holds for all a.

Having established that the amortized complexity of both allocateNode and deallocateNode is O(1), we conclude that the actual complexity of a sequence of a allocateNode and d deallocateNode operations is O(a+d).

Potential Method
We first postulate a potential function that satisfies Equation (3), and then use this function to obtain the amortized complexity of allocateNode. Let P(i) be the potential following the ith allocate node operation. Since we assign deallocateNode an amortized complexity that equals its actual complexity, the potential is not affected by deallocate operations. The potential function that we shall use is


P(i) = 2 * (number of allocate node operations since the most recent array
doubling)

We see that P(0) = 0 and P(i) >= 0 for all i. So, this potential function satisfies Equation (3).

If the ith allocate node operation does not require array doubling, the actual cost of this operation is 1 unit. Also, P(i) - P(i-1) = 2. Therefore, the amortized cost of the operation is

actual cost + potential change

= 1 + 2

= 3

If the ith allocate node operation requires array doubling, the actual cost of this operation is 1 + node.length units. Following the node allocation, the potential P(i) is 2. Further, since at least node.length/2 allocate node operations (excluding the current one) must have taken place since the last time the array size was doubled, P(i-1) >= node.length. Therefore, the amortized cost of the operation is

actual cost + potential change

<= (1 + node.length) + (2 - node.length)

= 3

Consequently, we may assign each allocate node operation an amortized cost of 3.

Since the amortized complexity of both allocateNode and deallocateNode is O(1), we conclude that the actual complexity of a sequence of a allocateNode and d deallocateNode operations is O(a+d).

5. Array Stacks and Queues
The classes ArrayStack and ArrayQueue were developed in Chapters 9 and 10, respectively. For both classes, because it may be necessary to double the array size, the actual complexity of the insert operation (ArrayStack.push and ArrayQueue.put) is O(s), where s is the current size of the stack or queue. The actual complexity of the remove operation (ArrayStack.pop and ArrayQueue.remove) is O(1). Therefore, using the worst-case method, the complexity of a sequence of i insert and r remove operations is O(i²+r). Using developments identical to those used for the case of simulated pointers, we can show that the amortized complexity of the insert operation is O(1). This enables us to obtain the tighter bound of O(i + r) on the complexity of any sequence of i insert and r remove operations performed on an initially empty stack or queue. Using these amortized complexities, we can establish the time complexities of the stack and queue applications considered in Chapters 9 and 10 without appealing to Theorem 5.1.

6. Union-Find
Problem Definition
Consider the class UnionFindSecondSolution developed in Section 7.7.4 of the text. Suppose that u < n union operations and f find operations are performed. The complexity of an individual union is O(u) and that of an individual find is O(1). In the text, we showed that the aggregate complexity of u union operations is O(u log u), and then concluded that the actual complexity of a sequence of u < n union operations and f find operations is O(u log u + f). Had we used the worst-case method, we would have arrived at the complexity O(u²+f). Since the aggregate complexity of u union operations is O(u log u), the amortized complexity of the method union is O(log u). The amortized complexity of the method find is the same as its actual complexity, that is O(1). Let us see how we can arrive at the amortized complexity of union using the accounting and potential function methods.

Accounting Method
To the method find, we assign an amortized complexity that equals its worst-case complexity, that is O(1). To the method union, we first assign an amortized complexity of L = ceil(log₂ (u+1)), which is less than its worst-case complexity. To verify the correctness of this assignment, we must show that
P(u) - P(0) >= 0

for all u, where P(i) denotes the potential following the ith union operation. Since the actual and amortized complexities of the method find are the same, the find operations do not affect the potential.

It is easy to verify that, at all times, each set contains exactly one element that has never moved (see the text for the definition of moved). Our credit assignment scheme will have the following properties (except, possibly, following the last union operation):

The element in each set that has never been moved has 0 credits.
The credits on an element that has moved at least once is at least L - (number of times this element has moved).

For example, when the number u of union operations is 6, L = 3. Suppose that the initial sets are {1}, {2}, ..., {10}. Initially, no element has moved, each set has exactly one element that has never moved, and each element has 0 credits. If the first union unites the sets {1} and {2} by regarding the first set as the smaller set, then following the union we have the set {1,2}. The credits on the element 1 will be at least L-1 = 2 (note that the element 1 has been moved once) and the credits on the element 2, which is the single element in the set {1,2} that has never moved, will be zero.

Suppose that a union operation unites the sets A and B, where A is the smaller set. Let the resulting set be C. The set C contains exactly one element that has never moved. This is the single element of B that has never moved. This element has 0 credits. 1 unit of the amortized cost of L units associated with this union operation is used to pay for moving the never moved element a of A. The remaining L-1 units of the amortized cost are assigned as credits to the element a. These credits are adequate to pay for the at most L-1 future moves that element a may make. The remaining portion of the actual cost of uniting sets A and B is paid for by reducing the number of credits on each of the elements A-{a} by 1. Note that each of these elements must have a credit that exceeds 0, because the first time an element is moved it is assigned L-1 credits which are enough to pay for all future moves of the element.

It follows that, after the u union operations, no element has a negative credit. Therefore, P(u) - P(0) >= 0, and

L =
floor(log₂(u+1)) = O(log u)

is the amortized complexity of a union operation. Hence, the complexity of any sequence of u < n union operations and f find operations is O(u log u + f).

Potential Method
Let P(i) = sum [(s_j-1) (L-log₂ s_j)] be the potential following the ith union operation. Here L = log₂ (u+1) and s_j is the size of set j. Initially, the size of each set is 1 and so, P(0) = 0. Since no set can have a size more than u+1, P(i) >= 0. Therefore, the potential function satisfies Equation (3).

When two sets of size a and b, a <= b are united, the change in the potential function is
(a+b-1)(L-log(a+b)) - (a-1)(L-log a) - (b-1)(L-log b)
= L + (a-1)log(a/(a+b)) + (b-1)log(b/(a+b)) - log (a+b)
< L + (a-1)log(1/2) + (b-1)log(1) - log (2)
= L - a + 1 + 0 - 1
= L - a
Since the actual complexity of the union operation is a, its amortized complexity is

actual cost + potential change

< a + (L - a)

= L

= O(log u)

Consequently, the complexity of any sequence of u < n union operations and f find operations is O(u log u + f).

7. Rearranging Railroad Cars
Problem Definition
Section 9.5.3 of the text developed a solution to the railroad cars rearranging problem in which the holding tracks operate as stacks. We shall analyze the complexity of the method RailroadWithStacks.railroad (Program 9.10). For convenience, a simplified version of this method is given below.

/** rearrange railroad cars beginning with the initial order
 * inputOrder[1:theNumberOfCars]
 * @return true if successful, false if impossible. */
public static boolean railroad(int [] inputOrder,
                      int theNumberOfCars, int theNumberOfTracks)
{
   // code of complexity O(numberOfTracks) comes here

   // rearrange cars
   for (int i = 1; i <= numberOfCars; i++)
      if (inputOrder[i] == nextCarToOutput)
      {// send car inputOrder[i] straight out
          System.out.println("Move car " + inputOrder[i] + " from input "
                             + "track to output track");
          nextCarToOutput++;
 
          // output from holding tracks
          while (smallestCar == nextCarToOutput)
          {
             outputFromHoldingTrack();
             nextCarToOutput++;
          }
      }
      else
         // put car inputOrder[i] in a holding track
         if (!putInHoldingTrack(inputOrder[i]))
            return false;

   return true;
}

The method outputFromHoldingTrack removes and outputs a railroad car from a holding track, and the method putInHoldingTrack puts a car into a holding track using the rule given in Section 9.5.3. This latter method returns the value false iff there is no holding track that satisfies the stated rule. For purposes of our analysis, we assume that the actual complexity of both outputFromHoldingTrack and putInHoldingTrack is O(numberOfTracks). In reality, this assumption is valid only for the method outputFromHoldingTrack. The actual complexity of putInHoldingTrack is O(numberOfCars+numberOfTracks) because of the cost of array doubling when adding an element to a stack. The amortized complexity of putInHoldingTrack is, however, O(numberOfTracks).

The complexity of the code outside of the for loop is O(numberOfTracks), and our analysis here will focus on the complexity of the for loop.

Worst-Case Method
In examining the if statement, we see that the worst-case complexity of the else clause is Theta(numberOfTracks). The then clause consists of the statements executed when the if conditional (inputOrder[i] == nextCarToOutput) is true. The worst-case complexity of the then clause is Theta(numberOfCars * numberOfTracks) because the holding tracks may have Theta(numberOfCars) on them and we may have have to output all of these. Therefore, the worst-case complexity of the if statement is Theta(numberOfCars * numberOfTracks).

Using the worst-case method, we arrive at O(numberOfCars² * numberOfTracks) as the complexity of the for loop (and of the method railroad). Although this analysis is correct, we have not obtained a tight bound on the complexity. A tighter bound is obtained using amortized complexity methods.

Aggregate Method
Since p < numberOfCars cars are put into the holding tracks in the else clause, exactly p cars are removed from the holding tracks in the then clause. Therefore, the aggregate number of iterations of the while loop is p, and the aggregate complexity of the while loop is O(p * numberOfTracks). Therefore, the aggregate complexity of the for loop (and also of the method railroad) is O(numberOfCars * numberOfTracks). Using the aggregate method, we have arrived at a tighter bound on the complexity than obtained using the worst-case method.
Since the aggregate complexity of the for loop is O(numberOfCars * numberOfTracks), and the number of times the for loop is entered is numberOfCars, the amortized complexity of each iteration of the for loop is

O(numberOfCars * numberOfTracks / numberOfCars) =
O(numberOfTracks)

.

Accounting Method
To simplify the analysis, we use numbeOfTracks (rather than O(numberOfTracks)) as the complexity of the methods putInHoldingTrack and outputFromHoldingTrack. We assign the then clause an amortized complexity of 1, and the else clause is assigned an amortized complexity equal to 2 * numberOfTracks + 2. To verify the correctness of this complexity assignment, we must show that the sum of the amortized complexities is greater than or equal to the sum of the actual complexities (the sum is taken over all executions of the then and else clauses). This is equivalent to showing that Equation (3) holds.

When the else clause is entered, numberOfTracks units of its amortized complexity are used to pay for the execution of the method putInHoldingTrack, 1 unit is used to pay for testing the value returned by putInHoldingTrack and exceuting the return false statement (if executed). The remaining numberOfTracks + 1 units of the amortized complexity are held as a credit by the railroad car that is put into a holding track.

When the then clause is entered, the single unit of amortized cost assigned to the then clause is used to pay for the statements that print a line and increment nextCarToOutput, and for one test of the while loop conditional. If the while loop is entered, then the numberOfTracks + 1 credits on each car output from a holding track are used in the following way: numberOfTracks units are used to pay for the execution of the method outputFromHoldingTrack, and the remaining 1 unit is used to pay for the statement that increments nextCarToOutput as well as for one test of the while loop conditional. Therefore, the amortized costs are adequate to pay for all of the actual costs.

Since the amortized cost of the then clause is 1 and that of the else clause is 2 * numberOfTracks + 2, and since the number of times each clause is executed is O(numberOfCars), the actual complexity of the for loop (and hence of the entire method) is

O(numberOfCars * (2 * numberOfTracks + 2 + 1))
= O(numberOfCars * numberOfTracks)

.

Potential Method
Let P(j) denote the potential following the iteration i = j of the for loop. We define, P(j) = (numberOfTracks + 1)n, where n is the number of railroad cars in the holding tracks after the for loop iteration with i = j completes. Also, P(0) = 0 is the initial potential (note that n = 0 initially because the holding tracks are empty when we start the method railroad). Since this potential function is nonnegative for all j, Equation (3) is satisfied.

Suppose that when i = j + 1, the else clause is entered. Since a car is added to the holding tracks, P(j+1) - P(j) = numberOfTracks + 1. Therefore, the amortized complexity of the else clause is

actual cost + potential change

= (numberOfTracks + 1) + (numberOfTracks + 1)

= 2 * (numberOfTracks + 1)

Next, suppose that when i = j + 1, the then clause is entered. If the while loop iteraties p times, p cars are removed from the holding tracks. Therefore, P(j+1) - P(j) = -p * (numberOfTracks + 1). The actual complexity of the then clause is 1 + p * (numberOfTracks + 1). So, the amortized complexity of the then clause is

actual cost + potential change

= 1 + p * (numberOfTracks + 1) - p * (numberOfTracks + 1)

= 1

Having established that the amortized complexities of the then and if clauses are 1 and 2 * (numberOfTracks + 1), resepctively, we conclude that the actual complexity of the for loop (and hence of the entire method) is O(numberOfCars * numberOfTracks).

Exercises

What is the relationship between a correct assignment of amortized complexities to operations and the actual complexities of these operations?
How are potential function, amortized complexity, and actual complexity related?
The McWidget company has been bought out by a computer manufacturer who insists that all displays be in binary. Rework the McWidget example using a binary display.
Suppose that a sequence of tasks is performed. The actual complexity of the ith task is 1 when i is not a power of 2. When i is a power of 2, the complexity of the ith task is i. Use each of the methods (a) aggregate, (b) accounting, and (c) potential function to show that the amortized complexity of a task is O(1).
Imagine that a data structure is represented as an array whose initial length is 1. The data structure operations are insert and remove. An insert takes 1 time unit except when the number of elements in the data structure prior to the insert equals the array length n; at this time, the insert takes n time units because we double the array length. A remove takes 1 time unit except when the number of elements left in the array is less than (array length)/4. When the number of elements left in the array is less than (array length)/4, the array length is halved and the remove takes (array length)/2 time units. Use each of the methods (a) aggregate, (b) accounting, and (c) potential function to show that the amortized complexity of each data structure operation is O(1).

Further Reading
A survey of amortized complexity appears in the paper ``Amortized computational complexity,'' by Robert Tarjan, SIAM Journal on Algebraic and Discrete Methods, 6, 2, 306-318, 1985.

Data Structures, Algorithms, & Applications in Java Amortized Complexity Copyright 1999 Sartaj Sahni

Data Structures, Algorithms, & Applications in Java
Amortized Complexity
Copyright 1999 Sartaj Sahni