The Call Stack, a.k.a. The Runtime Stack
Last time, we were discussing various uses for the stack data structure. One thing that we mentioned was that stacks are used to manage function calls. Stacks make sense here: if we have functions that call other functions, we need to finish the most recently called function first, before we can finish the function that called it.
Every time you call a function, the computer needs to keep track of three things that are unique to that particular function call. First, you need to store the arguments of that function, which are different each time it is called. Second, you need to have a place to store any local variables used in the function. Finally, you need to know where in your program you want to return to when your function completes. These three pieces of information are stored together in what is called a "stack frame".
For example, let's take the following code.
    public static void main(String[] args) {
        Car myCar = new Car("Red");
        System.out.println(myCar); // println ends up calling myCar.toString()
    }
Here, we have a main method, which calls the println method, which calls a toString method. Each of these three function calls has its own set of arguments, local variables, and a return address. So when we first call the main method, a new stack frame is created. This frame is then pushed onto the "call stack". The computer will now attempt to evaluate the main method.
Next, the main method calls the println method. At this point, a new stack frame is created for the println. This is then pushed onto the stack.
    --------------------
    |                  |
    --------------------
    |     println      |
    --------------------
    |       main       |
    --------------------
Now, we leave main and attempt to evaluate println. Of course, the stack frame keeps the address in main that we want to return to after our println call. While we evaluate println, the toString method of the Car class gets called. So once again, a new stack frame gets pushed onto the stack.
    --------------------
    |     toString     |
    --------------------
    |     println      |
    --------------------
    |       main       |
    --------------------
Now, after we finish evaluating toString, we just pop it off the call stack and return to the correct address in println. We then finish evaluating println, then pop the next frame off the stack, and so on.
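The push and pop sequence above can be mimicked with an ordinary stack. The following is only a toy model, not how the JVM actually stores frames: the CallStackSketch class, its Frame fields, and the "the runtime" return label are all hypothetical names made up for this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A toy model of the call stack. All names here are hypothetical.
class CallStackSketch {
    // One stack frame: the function's name, its arguments, and where to resume.
    static final class Frame {
        final String function;
        final String arguments;
        final String returnsTo;

        Frame(String function, String arguments, String returnsTo) {
            this.function = function;
            this.arguments = arguments;
            this.returnsTo = returnsTo;
        }
    }

    private final Deque<Frame> frames = new ArrayDeque<>();

    // Calling a function pushes a new frame on top of the stack.
    void call(String function, String arguments, String returnsTo) {
        frames.push(new Frame(function, arguments, returnsTo));
    }

    // Returning pops the top frame and reports where execution resumes.
    String finish() {
        return frames.pop().returnsTo;
    }

    // Name of the function currently being evaluated (the top frame).
    String current() {
        return frames.peek().function;
    }

    int depth() {
        return frames.size();
    }

    public static void main(String[] args) {
        CallStackSketch stack = new CallStackSketch();
        stack.call("main", "args", "the runtime");
        stack.call("println", "myCar", "main");
        stack.call("toString", "this = myCar", "println");
        // Unwind: the most recently called function always finishes first.
        while (stack.depth() > 0) {
            String name = stack.current();
            System.out.println(name + " returns to " + stack.finish());
        }
    }
}
```

Running main prints the frames in the order they are popped: toString first, then println, then main.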
Of course, this is a somewhat simplistic view of what actually goes on when you make function calls, but for this class, we don't need to be concerned with the details, since all of this gets done for us.
Recursion
Let's look at the following method.
    public static int pow2(int n) {
        if (n == 0) return 1;
        return 2 * pow2(n - 1);
    }
This function may seem a little strange at first. As you can see, it actually calls itself. However, since we've already looked at how the call stack works, this shouldn't seem that strange. When the function is called for the first time, it gets a stack frame just as usual. It has arguments (in this case n) and a return address. In this particular function there are no local variables, but if there were, they too would be placed in the stack frame.
This frame is then pushed onto the call stack. Then, when the function tries to call itself, nothing strange happens. Just like normal, a new stack frame is created that contains its arguments (now n-1) and a return address. This time, the return address will be in the function pow2. So if we had called pow2(1) from the main method, our call stack would look like this.
    --------------------
    |       pow2       |  args = 0, returns to pow2 method
    --------------------
    |       pow2       |  args = 1, returns to main method
    --------------------
    |       main       |
    --------------------
This is evaluated exactly like any other series of function calls. When n is 0, the function just returns 1, and we pop the top frame off the stack. We can now finish evaluating the original function call. Finally, we pop that off the stack and we're back in the main method, with the correct result returned.
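To watch these frames come and go, here is a sketch of pow2 with tracing added. The extra indent parameter, the printed messages, and the Pow2Trace class name are additions for illustration only; the arithmetic is the same as in the method above.

```java
// pow2 with tracing. The indent parameter and the print statements are
// illustrative additions; they are not part of the original method.
class Pow2Trace {
    static int pow2(int n, String indent) {
        // Entering the method corresponds to a frame being pushed.
        System.out.println(indent + "push pow2(" + n + ")");
        int result = (n == 0) ? 1 : 2 * pow2(n - 1, indent + "  ");
        // Returning corresponds to the frame being popped.
        System.out.println(indent + "pop  pow2(" + n + ") returns " + result);
        return result;
    }

    public static void main(String[] args) {
        pow2(3, "");
    }
}
```

Each level of indentation in the output is one more frame on the call stack. The pops appear in the reverse order of the pushes, which is exactly the last-in, first-out behavior of a stack.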
This type of function that calls itself is called a recursive function. In this case, the function computes powers of 2. Let's look at some more examples of recursive functions.
    public static int factorial(int n) {
        if (n == 0) return 1;
        return n * factorial(n - 1);
    }
Here's a recursive function that computes factorials. If you're skeptical, think about what n! means mathematically. 0! is defined to be 1, and n! = n*(n-1)*(n-2)*...*(2)*(1). Of course, (n-1)*(n-2)*...*(2)*(1) is just (n-1)!. So we can say that n! = n*(n-1)!. Look how closely this resembles the code for the function. This is one of the nice things about recursive functions: there is generally a very intuitive correspondence between what the function does and how the code looks.
Let's break the function down a bit further. We can see that this function has two parts. In one case, if n is zero, we return immediately. This is called the base case. Next, we have the case where the function calls itself. This is called the recursive case. If you've taken any discrete math courses, you probably recognize this as being very similar to induction. As a result, it is usually easier to prove things about recursive functions than about their non-recursive counterparts.
One thing to be aware of when writing recursive functions is how important the base case is. Let's look at a function that doesn't have a base case.
    public static void oops() {
        oops();
    }
This cute little guy is perfectly valid in Java. However, it clearly doesn't compute anything, and it goes on forever (well, until you run out of memory). It will always call itself, meaning it will just keep adding new stack frames onto the call stack. Eventually, since computers have only a finite amount of memory, the stack will run out of room and a StackOverflowError will be thrown.
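To see this happen without crashing the program, the runaway recursion can be wrapped in a try block. In Java the failure surfaces as a java.lang.StackOverflowError. The depth counter and the OopsDemo class below are additions for this sketch, and the number of frames you reach depends on the JVM and its stack size, so no particular count should be expected.

```java
// A sketch of the runaway recursion being cut off by the JVM.
// The depth counter is an illustrative addition.
class OopsDemo {
    static int depth = 0;

    static void oops() {
        depth++;   // count one more frame pushed
        oops();    // no base case: this never stops on its own
    }

    // Returns how many calls were made before the stack ran out of room.
    static int framesBeforeOverflow() {
        depth = 0;
        try {
            oops();
        } catch (StackOverflowError e) {
            // By the time we get here the stack has unwound back to this
            // method, so there is room to keep running.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Stack overflowed after " + framesBeforeOverflow() + " calls");
    }
}
```

Catching StackOverflowError like this is fine for a demonstration, but in real code the right fix is to give the recursion a base case.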
Also, make sure that the recursive case will eventually reach the base case. What if we made a typo writing our factorial function?
    public static int badFactorial(int n) {
        if (n == 0) return 1;
        return n * badFactorial(n + 1); // typo: should be n - 1
    }
This will also cause a stack overflow. Any positive argument moves the recursion away from the base case: you'll never get to 0 by incrementing a positive number (the stack runs out long before int wraparound could bring n back around to 0).
It should be noted that both the pow2 and factorial functions can be written easily without recursion. A simple loop (or less) can usually accomplish the same thing. In fact, not only can these be done without recursion, the recursive solution is typically slower, since every recursive call has to push a frame onto the stack and later pop it off. So while recursive functions are often very intuitive and easier to write, they are usually slower than their iterative counterparts.
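As a rough sketch of what those loop versions might look like (the IterativeVersions class name is made up for this example), each one keeps a running result in a local variable instead of relying on a chain of stack frames:

```java
// Loop-based versions of pow2 and factorial. Each call uses a single
// stack frame no matter how large n is.
class IterativeVersions {
    static int pow2(int n) {
        int result = 1;
        for (int i = 0; i < n; i++) {
            result *= 2;        // double once per step
        }
        return result;
        // The "or less" option: for 0 <= n <= 30, (1 << n) gives the same value.
    }

    static int factorial(int n) {
        int result = 1;         // 0! and 1! are both 1
        for (int i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(pow2(10));      // 1024
        System.out.println(factorial(5));  // 120
    }
}
```

Like the recursive versions, these use int arithmetic, so they overflow for large n (factorial already does for n greater than 12).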
Sometimes, this slowdown can be even more dramatic than just the overhead of repeatedly pushing and popping from the call stack. Let's look at the Fibonacci sequence. As you're probably aware, we define the 0th Fibonacci number as 1, the 1st Fibonacci number as 1, and the nth Fibonacci number as the sum of the (n-2)nd and (n-1)st Fibonacci numbers. In other words, each Fibonacci number is the sum of the previous two.
So, with this definition, we can very easily write a recursive function to compute them.
    public static int fib(int n) {
        if (n == 0) return 1;
        if (n == 1) return 1;
        return fib(n - 1) + fib(n - 2);
    }
This function correctly computes the Fibonacci sequence. But it does so in an extremely inefficient way. It contains two recursive calls. We call this binary recursion.
Let's think about what we would do if we were asked to compute the 4th Fibonacci number by hand. We would start with the base cases, fib(0) = 1 and fib(1) = 1. We would then get fib(2) by adding 1 + 1 to get 2. Next, we would get fib(3) by adding 1 + 2 to get 3. Finally, we compute our goal, fib(4), by adding 2 and 3 to get our answer of 5. This method does one addition per Fibonacci number, so its running time is O(n), linear with respect to n. However, let's look at how our function would compute this.
First, it tries to compute fib(4). It pushes the call frame onto the stack and then tries to compute a value. To do this, it must compute fib(3) and fib(2). However, to compute fib(3), it must compute fib(2) and fib(1). fib(1) is a base case, but for fib(2), it needs to compute fib(1) and fib(0). And we're not done yet. We still need to compute fib(2) again for the original call, which means computing fib(1) and fib(0) again. We can look at the function calls as a binary tree.
                    fib(4)
                   /      \
              fib(3)      fib(2)
              /    \      /    \
         fib(2)  fib(1) fib(1) fib(0)
         /    \
     fib(1)  fib(0)
Our function computes fib(0) and fib(2) twice each, and computes fib(1) three times! Let's look at what it would do for fib(5).
                                  fib(5)
                             /              \
                      fib(4)                  fib(3)
                     /      \                /      \
               fib(3)        fib(2)      fib(2)    fib(1)
               /    \        /    \      /    \
          fib(2)  fib(1) fib(1) fib(0) fib(1) fib(0)
          /    \
      fib(1)  fib(0)
This is terrible. fib(1) is computed 5 times! The runtime of this function grows exponentially with n. That's O(2^n), which is very slow. And we've already shown that there is a very simple algorithm that does the computation in linear time. The moral of the story is that binary recursion, although it looks nice and simple, is, in fact, the devil.
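To make the difference concrete, here is a sketch that counts the calls made by the recursive version and compares it with a loop that mirrors the by-hand method. The calls counter, the method names, and the FibComparison class are additions for illustration; both methods use the same convention as above, fib(0) = fib(1) = 1.

```java
// Compares binary recursion with the linear by-hand method.
// The call counter is an illustrative addition.
class FibComparison {
    static long calls = 0;

    // The recursive definition, with one line added to count calls.
    static int fibRecursive(int n) {
        calls++;
        if (n == 0) return 1;
        if (n == 1) return 1;
        return fibRecursive(n - 1) + fibRecursive(n - 2);
    }

    // The by-hand method: start from the base cases and add upward,
    // remembering only the previous two values.
    static int fibIterative(int n) {
        int previous = 1;  // fib(0)
        int current = 1;   // fib(1)
        for (int i = 2; i <= n; i++) {
            int next = previous + current;
            previous = current;
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        for (int n = 5; n <= 30; n += 5) {
            calls = 0;
            int value = fibRecursive(n);
            System.out.println("fib(" + n + ") = " + value
                    + ", recursive calls: " + calls
                    + ", loop agrees: " + (value == fibIterative(n)));
        }
    }
}
```

For n = 5 the counter reports 15 calls, one for each node in the fib(5) tree, while the loop performs only 4 additions. The gap widens quickly as n grows.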
Generally speaking, we want to look for non-recursive solutions to problems if possible. However, next class we will look at situations where recursion is our only option.