Resizing Arrays

To eliminate the limitations of a fixed-capacity stack, when there is no room left in the array during a push, we create a new array of greater size, copy the elements of the original array into it, and then abandon the old array and use the new one in its place.
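
To make the mechanism concrete, here is a minimal sketch of that copy step in Java; the field names (`a` for the underlying array, `size` for the number of items in use) are illustrative assumptions, not fixed by the text above:

```java
// A hedged sketch of the resize step for a stack of strings,
// assuming the items live in a[0..size-1].
private void resize(int capacity) {
    String[] copy = new String[capacity];  // the new, larger array
    for (int i = 0; i < size; i++)
        copy[i] = a[i];                    // copy each element across
    a = copy;                              // abandon the old array
}
```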

The question is: "How large do we make the new (resized) array?"

To answer this question, we first note that all the copying between arrays done during a resize comes at a cost...

If, each time we need more room in the array, we resize it to have just one additional cell, then as we insert $N$ items (starting from a stack with one empty cell), the resizing requires the following number of "copies" to be made: $$1 + 2 + 3 + \cdots + (N-1)$$

Note that copying an array element from one array to another actually requires two array accesses (i.e., get the old value, put the new value). As such, if we measure the cost of resizing in terms of "array accesses", then the cost is twice the number of copies made: $$2(1 + 2 + 3 + \cdots + (N-1)) = 2 \cdot \frac{N(N-1)}{2} = N(N-1) \sim N^2$$

Unfortunately, this quadratic growth is simply not feasible for large values of $N$.
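
For contrast, a grow-by-one push might look like the following sketch (reusing the assumed `resize` helper from the sketch above); every push on a full array copies all existing items, which is exactly the source of the quadratic cost:

```java
// Grow-by-one (the strategy analyzed above): a full array gains
// just one cell per resize, so inserting N items performs
// 1 + 2 + ... + (N-1) copies in total.
public void push(String item) {
    if (size == a.length)
        resize(a.length + 1);  // one extra cell only
    a[size++] = item;          // one array access for the new item
}
```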

Suppose, however, that we resize the array by doubling its size every time it gets full. For ease of calculation, let us suppose that $N = 2^k$ for some $k$. Then, as we insert $N$ items onto the stack (again starting from a stack with one empty cell), the resizing costs the following number of array accesses:

$$ 2(1 + 2 + 4 + \cdots + 2^{k-1}) = 2^{k+1} - 2 = 2N - 2 \sim 2N $$

So the cost rises linearly with the number of elements we push onto the stack.
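
In code, the only change from the grow-by-one sketch is the argument passed to the assumed `resize` helper:

```java
// Doubling (the strategy analyzed above): when the array fills,
// grow to twice its current length, so each item is copied only
// a constant number of times on average.
public void push(String item) {
    if (size == a.length)
        resize(2 * a.length);  // double the capacity
    a[size++] = item;
}
```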

We could of course do better if we multiplied the size of the array by an even larger value, but then there would likely be far more unused cells in the array on average. Whatever we might gain in speed, we lose in the amount of memory allocated. As such, we opt for the smaller memory footprint of a resizing factor of $2$.

Of course, the total cost in array accesses of pushing $N$ items must include not just the array accesses for resizing, but also those for writing each pushed element into the array (a single array access per push). Thus, the following gives the grand total cost of inserting $N$ elements, in array accesses: $$2(1 + 2 + 4 + \cdots + 2^{k-1}) + N \approx 2N + N = 3N$$

Amortized over the $N$ pushes, we then see an average of only $3$ array accesses per push.
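
A small, self-contained program (hypothetical, purely for illustration) can confirm these counts by evaluating the two cost formulas derived above:

```java
// Hypothetical demo: total array accesses to push N items under each
// growth strategy, counting 2 accesses per copied item plus 1 access
// to write each pushed item.
public class ResizeCost {

    // Grow-by-one: 2 * (1 + 2 + ... + (N-1)) copy accesses + N writes = N^2.
    static long growByOne(long n) {
        return n * (n - 1) + n;
    }

    // Doubling (N a power of 2): 2 * (1 + 2 + 4 + ... + N/2) + N = 3N - 2.
    static long doubling(long n) {
        long copies = 0;
        for (long cap = 1; cap < n; cap *= 2)
            copies += cap;           // items copied at each doubling
        return 2 * copies + n;
    }

    public static void main(String[] args) {
        for (long n = 1 << 10; n <= 1 << 20; n <<= 5)
            System.out.printf("N=%-8d grow-by-one=%-14d doubling=%d (%.2f per push)%n",
                    n, growByOne(n), doubling(n), (double) doubling(n) / n);
    }
}
```

The ratio printed in the last column hovers just under $3$, matching the amortized bound.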

Thrashing

When it comes to popping elements off the stack, we also want to be mindful of how much unused space we have allocated to the array.

One might think that when the array drops to half full, we could resize down to an array half as big (we could always double it again if needed). However, this turns out to be a bad strategy in the common case where pushes and pops come with relatively equal frequency.

As a worst case, consider a full array followed by the sequence: push, pop, push, pop, push, pop, ... Under the halve-at-half strategy, the push doubles the array, and the pop immediately brings it back to half full and halves it again, so every single operation produces a resize costing $\sim N$ copies! This is called "thrashing" the array.

So when popping, we only resize down, halving the capacity, when the array is one-quarter full. Immediately after such a resize the array is exactly half full, so many pushes or pops must occur before another resize can be triggered, as the sketch below shows. (Note that we keep the resizing factor a power of $2$, so halving the capacity always yields a whole number of cells.)
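
A pop with this shrink rule might look like the following sketch (again reusing the assumed `resize` helper and field names from earlier):

```java
// Shrink only at one-quarter full, so that right after a resize the
// array is half full -- many operations away from the next resize.
public String pop() {
    String item = a[--size];
    a[size] = null;                // drop the reference so it can be reclaimed
    if (size > 0 && size == a.length / 4)
        resize(a.length / 2);      // halve the capacity
    return item;
}
```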

Note, by doing this we ensure that the array holding the contents of our stack will ALWAYS be between 25% and 100% full.

Performance Analysis

We can look at the performance of this implementation by considering the order of growth of running times in the best, worst, and amortized cases...

Note the order of growth in the amortized case is the average running time per operation over a worst-case sequence of operations...

                  best    worst    amortized
  construction      1       1          1
  push              1       N          1
  pop               1       N          1
  size              1       1          1

Of course, we shouldn't just consider the running-time cost of the various stack operations. We also need to worry about the cost in terms of memory used to store $N$ items in a stack. The worst case here happens when the array implementing the stack is only a quarter full, at which point we have allocated four times the memory we need for what we actually have in the stack. Still, this is just a constant times $N$, so the memory cost is $O(N)$.