133-200 Lecture #5 (Wednesday, November 30, 2005)

Lecture #33 (Wednesday, November 30, 2005)

Linked Lists

This lecture is a little controversial. Since the 1970s, every intro CS sequence has covered linked lists. But because of recent changes in machinery, LinkedLists have become less useful and ArrayLists have become more useful. Some people say that there should be reduced coverage of LinkedLists in CS courses, and others say that linked lists should remain a part of programming education for years to come. CMU continues to teach LinkedLists because we believe they are an important tool for developing programming skills. So while the structure itself may not be as useful, knowing how to think about it is very useful. And, in truth, good uses abound.
Users typically don't notice slowdowns if they occur a little at a time. It is when the program stops or freezes that users notice and are unhappy. While ArrayLists are great for features such as indexing, they do have their drawbacks. When you add onto an ArrayList that is full, the ArrayList must grow, typically by allocating a much larger backing array (doubling its size, for example). Because it must then copy all of its data into the new array, that one add can cause a noticeable pause when you have a lot of information. With a LinkedList, however, you can grow the list to be as big or as small as you need, one node at a time, without allocating an unnecessary amount of memory.

What is a Linked List?

As you probably know, a list is an ordered collection of items. Like any good collection, you want to be able to add and remove items from your list, as well as traverse the list to look at each item or to look for a particular item.
A LinkedList is a collection of Nodes. In Java, a Node is an object that the programmer creates, and it contains two references: a reference to an object, and a reference to the next Node object in the list.
Imagine that you've written a Person class, and that you would normally create an ArrayList to store Person objects, but your instructor told you to store them in a linked list instead.
Here's a LinkedList of five Person objects.

There are five nodes in this linked list:

Node a Node b Node c Node d Node e

Each of these five nodes contains two references:
a reference to a Person object, and a reference to a Node object.
Take a look at each of the five nodes in the LinkedList. Look at the second reference, that is, the Node reference, in each of them. Notice what this Node reference refers to.
Node a refers to Node b.
Node b refers to Node c.
Node c refers to Node d.
Node d refers to Node e.
Node e refers to null, the special value that represents nothing.
With the exception of Node e, each Node in the LinkedList refers to the next Node. ArrayLists are "random access" data structures. This means that you can refer directly to each element in the list by its index.
list.get(1);
Not so with a LinkedList. A LinkedList is a "sequential access" data structure. Each node knows about the node that comes after it, but the node knows nothing about any of the other nodes in the list.
Node e refers to null because it is the last node in the list. It's the end of the chain, the end of the line. Nothing comes after it.
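To make "sequential access" concrete, here is a minimal sketch of what it takes to reach a given position in a chain of nodes. The Person class with a name field, the two-argument Node constructor, and the nodeAt helper are assumptions made for illustration; they are not part of the lecture's code.

```java
class SequentialAccessDemo {
    // Returns the node n links past first (n = 0 gives first itself),
    // or null if the chain ends before that position.
    static Node nodeAt(Node first, int n) {
        Node current = first;
        for (int i = 0; i < n && current != null; i++) {
            current = current.getNext(); // the only way forward is one link at a time
        }
        return current;
    }

    public static void main(String[] args) {
        // Build the chain back to front so each node can be given its successor.
        Node e = new Node(new Person("Eve"), null); // last node: next is null
        Node d = new Node(new Person("Dan"), e);
        Node c = new Node(new Person("Cat"), d);
        Node b = new Node(new Person("Bob"), c);
        Node a = new Node(new Person("Ann"), b);

        System.out.println(nodeAt(a, 3).getData().getName()); // prints Dan
        System.out.println(nodeAt(a, 5));                     // prints null
    }
}

// Hypothetical stand-in for the Person class mentioned in the notes.
class Person {
    private final String name;
    Person(String name) { this.name = name; }
    String getName() { return name; }
}

// A node holds a Person and a reference to the node after it.
class Node {
    private final Person data;
    private final Node next; // null when this is the last node
    Node(Person data, Node next) { this.data = data; this.next = next; }
    Person getData() { return data; }
    Node getNext() { return next; }
}
```

Reaching position 3 takes three calls to getNext(), whereas an ArrayList can jump straight to any index.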
The list also contains three other references to Nodes:
Node head Node tail Node index
Now look at head, tail, and index. These references refer to Nodes in the LinkedList, too. Take a look and see which nodes each of them refers to.
head refers to the first Node in the LinkedList. tail refers to the last Node in the LinkedList. index can be used by the programmer to keep track of a particular node in the list.

Inserting a Node into the LinkedList

To add a node to the beginning of the list, you simply set the new node's Node reference to refer to the first Node in the list. Take a look at the second reference in newNode. It refers to Node a.
But wait. We need to do something else. Now newNode, not Node a, is at the beginning of the linked list. If you remember, head is supposed to refer to the first node in the linked list. So we need to make head refer to newNode. Take another look at the picture. head does indeed refer to newNode, the new first node in the linked list.
How do we add a node to the end of a list? Much the same way you add a node to the beginning of the list.

You simply make newNode's second reference refer to nothing (null), make the current end of the list (Node e) refer to newNode, and make tail refer to newNode.
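The two insertions can be sketched in Java as follows. This is only an illustration under some assumptions: the setNext() method, the InsertDemo class, and the Person name field are hypothetical names, and the checks for an empty list are an addition that the pictures do not cover.

```java
class InsertDemo {
    Node head; // first node in the list, or null if the list is empty
    Node tail; // last node in the list, or null if the list is empty

    // Insert at the beginning: point newNode at the old first node, then move head.
    void insertAtFront(Node newNode) {
        newNode.setNext(head);
        head = newNode;
        if (tail == null) {
            tail = newNode; // list was empty, so newNode is also the last node
        }
    }

    // Insert at the end: newNode refers to null, the old last node refers
    // to newNode, and tail moves to newNode.
    void insertAtEnd(Node newNode) {
        newNode.setNext(null);
        if (tail == null) {
            head = newNode; // list was empty, so newNode is also the first node
        } else {
            tail.setNext(newNode);
        }
        tail = newNode;
    }

    public static void main(String[] args) {
        InsertDemo list = new InsertDemo();
        list.insertAtEnd(new Node(new Person("a")));
        list.insertAtEnd(new Node(new Person("e")));
        list.insertAtFront(new Node(new Person("new")));
        System.out.println(list.head.getData().getName()); // prints new
        System.out.println(list.tail.getData().getName()); // prints e
    }
}

// Hypothetical stand-in for the Person class mentioned in the notes.
class Person {
    private final String name;
    Person(String name) { this.name = name; }
    String getName() { return name; }
}

class Node {
    private final Person data;
    private Node next;
    Node(Person data) { this.data = data; }
    Person getData() { return data; }
    Node getNext() { return next; }
    void setNext(Node next) { this.next = next; } // assumed setter, not named in the notes
}
```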

Deleting a Node from a LinkedList

How would you remove a node from a LinkedList?

Take a careful look at this picture. Look at Node a. Its second reference now refers to Node c. References can reference only one object at a time. By resetting the reference to Node c, we lost Node b, which is as good as deleting it.
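In code, "losing" Node b is a single reference change. The sketch below assumes a setNext() method on Node and a helper named removeAfter; both names are hypothetical. Once nothing refers to the removed node, Java's garbage collector is free to reclaim it.

```java
class RemoveDemo {
    // Unlinks the node that follows 'before' and returns it,
    // or returns null if 'before' is the last node.
    static Node removeAfter(Node before) {
        Node removed = before.getNext();
        if (removed != null) {
            before.setNext(removed.getNext()); // skip over the removed node
        }
        return removed;
    }

    public static void main(String[] args) {
        Node c = new Node(new Person("c"), null);
        Node b = new Node(new Person("b"), c);
        Node a = new Node(new Person("a"), b);

        removeAfter(a); // a now refers to c; b is no longer part of the list
        System.out.println(a.getNext().getData().getName()); // prints c
    }
}

// Hypothetical stand-in for the Person class mentioned in the notes.
class Person {
    private final String name;
    Person(String name) { this.name = name; }
    String getName() { return name; }
}

class Node {
    private final Person data;
    private Node next;
    Node(Person data, Node next) { this.data = data; this.next = next; }
    Person getData() { return data; }
    Node getNext() { return next; }
    void setNext(Node next) { this.next = next; } // assumed setter, not named in the notes
}
```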

Traversing a list

Now we need to do something with the index reference. We can use it to refer to whatever node we like. If we want index to refer to the first node in the list, we can write
index = a;
If we want to access the Person object inside of Node a, we can write
Person temp = a.getData();
This assumes that we have written a getData() method in the Node class. But in real linked lists, we don't actually give the individual nodes names like
Node a; Node b; Node c; Node d;
That's what is special about linked lists. With the exception of the first Node in the list, we refer to each Node in terms of the Node before it.
Look at the picture of the LinkedList again. index currently refers to the first Node in the list, the head of the list. Without knowing that b is named b, how can I now make index refer to node b?
When we write the Node class, we will give it two instance variables, one for the reference to the object the Node refers to, which in our case is a Person object, and one for the reference to the Node object that comes after it, the next Node.
private Person data;
private Node next;
Let's also assume that we've written getData() that returns data for that node, and getNext() that returns next for that node.
Then if I want index to refer to b, I simply write
index = index.getNext(); //change index so that it refers to the next node
I reset the index reference to refer to the node that comes after the node index currently refers to. In this way, I can walk through the entire list if I like, stopping when I reach a node for which index.getNext() returns null.
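Here is a minimal sketch of that walk, using the data and next instance variables and the getData() and getNext() methods described above. The Person name field, the Node constructor, and the helper methods countNodes and lastNode are assumptions for illustration. Note that lastNode stops at the node whose getNext() returns null, while countNodes keeps going until index itself is null so that the last node is counted too.

```java
class TraverseDemo {
    // Visits every node, starting at head, and counts them.
    static int countNodes(Node head) {
        int n = 0;
        Node index = head;
        while (index != null) {
            n++;
            index = index.getNext(); // change index so that it refers to the next node
        }
        return n;
    }

    // Walks until it reaches the node whose getNext() returns null.
    static Node lastNode(Node head) {
        if (head == null) {
            return null; // empty list has no last node
        }
        Node index = head;
        while (index.getNext() != null) {
            index = index.getNext();
        }
        return index;
    }

    public static void main(String[] args) {
        Node c = new Node(new Person("c"), null);
        Node b = new Node(new Person("b"), c);
        Node a = new Node(new Person("a"), b);
        System.out.println(countNodes(a));                    // prints 3
        System.out.println(lastNode(a).getData().getName()); // prints c
    }
}

// Hypothetical stand-in for the Person class mentioned in the notes.
class Person {
    private final String name;
    Person(String name) { this.name = name; }
    String getName() { return name; }
}

class Node {
    private Person data;
    private Node next;
    Node(Person data, Node next) { this.data = data; this.next = next; }
    Person getData() { return data; }
    Node getNext() { return next; }
}
```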

Implementing A Node

Before we can create a LinkedList, we first need to create a class to represent one piece of the list, called a node. A Node consists of an Object to store the data, and a reference to the next Node in the list.
Since a Node is only useful as a part of a LinkedList, we can make the implementation of the Node class part of the implementation of the LinkedList. When one class is defined inside another class, we call it a nested class (or inner class); this is different from a subclass, which extends another class. So, we can define the Node class as a private nested class of the LinkedList class. This ensures that no one outside of the LinkedList class can use a Node by itself.
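In Java, a class defined inside another class is written as a nested class. The following is a minimal sketch of that arrangement; the outer class name NestedNodeList and its two small methods are assumptions, included only so that the example does something observable.

```java
class NestedNodeList {
    // Declared private, so code outside NestedNodeList cannot name or create a Node.
    // Declared static because a Node does not need access to the list that owns it.
    private static class Node {
        private Object data; // the item stored in this node
        private Node next;   // the node after this one, or null

        Node(Object data, Node next) {
            this.data = data;
            this.next = next;
        }
    }

    private Node head; // first node, or null if the list is empty

    // The outer class may use Node freely, including its private fields.
    void addFirst(Object item) {
        head = new Node(item, head);
    }

    Object getFirst() {
        return head == null ? null : head.data;
    }

    public static void main(String[] args) {
        NestedNodeList list = new NestedNodeList();
        list.addFirst("b");
        list.addFirst("a");
        System.out.println(list.getFirst()); // prints a
    }
}
```

Callers only ever see the items they stored; the Node objects stay an internal detail of the list.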

Implementing A LinkedList

Now that we have a node, how do we construct a LinkedList? Since each node stores a reference to the next one in the list, we at least need a reference to the beginning of the list. We call this reference the "head" of the list, and from the head we can get to any other node in the list. For convenience, we will also store a reference to the end of the list, called the "tail". This saves us the time it takes to get from the beginning to the end of the list when we need to operate on the last Node.
For another level of convenience, we will keep an integer called "count", which will be able to tell us exactly how many elements are in the LinkedList.

The Cost of LinkedLists

Let's talk about operations. What does it take to implement an addFirst method?
addFirst

create new Node
make new Node point to whatever head used to be
make head point at new Node
increment count

And what does it take to implement an addLast method?
addLast

create new Node
check to see if it is the first Node in the list, if it is add it and be done, otherwise
loop to the end and add there
increment count

So for addFirst, it does not matter how many things are in the LinkedList; the cost is always the same.
What is the cost of addLast? If we loop from the front of the list to get to the end -- it can be a long walk -- one step for each item within the list. But, the reference tail enables us to addLast() just as easily as addFirst(), by jumping right to the end.
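The addFirst and addLast steps can be sketched in Java as follows. This is one possible version, not the course's official code: the class name CountedList and the size(), getFirst(), and getLast() helpers are assumptions. Because addLast uses tail instead of looping, neither method contains a loop, so the cost does not depend on the length of the list.

```java
class CountedList {
    private static class Node {
        Object data;
        Node next;
        Node(Object data) { this.data = data; }
    }

    private Node head;  // first node, or null if the list is empty
    private Node tail;  // last node, or null if the list is empty
    private int count;  // number of nodes in the list

    void addFirst(Object item) {
        Node newNode = new Node(item); // create new Node
        newNode.next = head;           // make new Node point to whatever head used to be
        head = newNode;                // make head point at new Node
        if (tail == null) {
            tail = newNode;            // first node added is also the last node
        }
        count++;                       // increment count
    }

    void addLast(Object item) {
        Node newNode = new Node(item); // create new Node
        if (head == null) {
            head = newNode;            // first Node in the list: just add it
        } else {
            tail.next = newNode;       // jump right to the end instead of looping
        }
        tail = newNode;
        count++;                       // increment count
    }

    int size() { return count; }
    Object getFirst() { return head == null ? null : head.data; }
    Object getLast() { return tail == null ? null : tail.data; }

    public static void main(String[] args) {
        CountedList list = new CountedList();
        list.addLast("b");
        list.addFirst("a");
        list.addLast("c");
        System.out.println(list.getFirst() + " " + list.getLast() + " " + list.size()); // prints a c 3
    }
}
```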
A method like removeNth will always be costly, though. Even if we wanted to remove the last item in the list, the tail reference does not help us. In order to remove an item, we have to change the next reference of the node in front of it. So to remove the last item, we would have to loop through the whole list to get to the second-to-last node.
A method such as isEmpty() becomes very simple, though, because we only have to check whether count is equal to zero.
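The contrast between the two costs shows up clearly in code. The sketch below is an illustration under assumptions: the class name RemovableList and the removeLast method are hypothetical, and removeLast stands in for the "remove the last item" case of removeNth discussed above. Notice that removeLast needs a loop even though tail is available, while isEmpty is a single comparison.

```java
class RemovableList {
    private static class Node {
        Object data;
        Node next;
        Node(Object data) { this.data = data; }
    }

    private Node head;
    private Node tail;
    private int count;

    void addLast(Object item) {
        Node newNode = new Node(item);
        if (head == null) {
            head = newNode;
        } else {
            tail.next = newNode;
        }
        tail = newNode;
        count++;
    }

    // One comparison, no matter how long the list is.
    boolean isEmpty() {
        return count == 0;
    }

    // Removes and returns the last item, or returns null if the list is empty.
    Object removeLast() {
        if (head == null) {
            return null;
        }
        Object removed = tail.data;
        if (head == tail) {
            head = null; // the only node is gone, so the list is now empty
            tail = null;
        } else {
            Node index = head;
            while (index.next != tail) {
                index = index.next; // walk to the second-to-last node
            }
            index.next = null;      // unlink the old last node
            tail = index;
        }
        count--;
        return removed;
    }

    public static void main(String[] args) {
        RemovableList list = new RemovableList();
        list.addLast("a");
        list.addLast("b");
        list.addLast("c");
        System.out.println(list.removeLast()); // prints c
        System.out.println(list.isEmpty());    // prints false
    }
}
```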

Next Class

...we will learn how to code linked lists.