Last class we talked about graphs and how to represent them in programs. We also talked about trees. A tree is a connected graph with no cycles, and a spanning tree of a graph is a tree that contains all of the graph's nodes. A cycle is a way of leaving a node and eventually making your way back to that same node without reusing an edge.
If we remove some edges, we can convert from a graph to a tree.
If you have two nodes that are unattached, is it a tree? No, because a tree has to be connected.
This is a tree:
    o --- o

If you add another edge, is it still a tree?

    o --- o
    |     |
     -----

No, it has a cycle.
For 3 nodes, anything with fewer than 2 edges is not a tree. With 2 edges you get a tree, and if you try to add an edge anywhere else, you will create a cycle, so it is no longer a tree.

    o -- o -- o

For 4 nodes, you start out with

    o   o   o   o

You add one edge, still not a tree.

    o --- o   o   o

Add another edge, still not a tree.

      o --- o
     /
    o         o

If you add one more, you can finally get a tree.

      o --- o
     /       \
    o         o

Can you add an edge anywhere else? No, or else that would create a cycle.
In general, if you have N nodes, you need at least N-1 edges to connect all of them. Moreover, if you have N or more edges, there must be a cycle, so you cannot have a tree. Therefore a spanning tree must have exactly N-1 edges. In general, a spanning tree of a graph is a tree that contains all of the graph's nodes and some (or all) of its edges.
    o --- o ---- o
     \   /      /
       o -------

In this graph, there are multiple ways of making a spanning tree:

    o --- o ---- o
     \
       o

or

    o     o      o
     \   /      /
       o -------

or

    o --- o      o
         /      /
       o -------

etc.
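The N-1 rule can be checked mechanically. The following is a minimal sketch, assuming nodes are numbered 0 to N-1 and each edge is an int pair; the class name `TreeCheck` and the method `isSpanningTree` are made up for illustration and are not from the lecture. It tests that there are exactly N-1 edges and that every node can be reached from node 0.

```java
import java.util.ArrayList;
import java.util.List;

class TreeCheck {
    // Returns true if the edges form a spanning tree on nodes 0..n-1:
    // exactly n-1 edges, and every node reachable from node 0.
    static boolean isSpanningTree(int n, int[][] edges) {
        if (edges.length != n - 1) {
            return false; // fewer edges cannot connect everything; more must contain a cycle
        }
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            adj.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        boolean[] visited = new boolean[n];
        return countReachable(0, adj, visited) == n;
    }

    // Counts the nodes reachable from v, marking each one as visited.
    private static int countReachable(int v, List<List<Integer>> adj, boolean[] visited) {
        visited[v] = true;
        int count = 1;
        for (int w : adj.get(v)) {
            if (!visited[w]) {
                count += countReachable(w, adj, visited);
            }
        }
        return count;
    }
}
```

If the three top nodes of the graph above are numbered 0, 1, 2 and the bottom node 3, the first spanning tree shown is the edge set {0-1, 1-2, 0-3}, which passes the check, while the full graph with all 5 edges does not.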
Given any connected graph, how can we find a spanning tree? We have to be able to select edges to remove but still keep it connected.
Start out by picking a node, and consider it the root.
Pick one of the ways to go from there, move to that neighboring node, and mark it as visited.
    1 --- 2
    |\   /|
    |  3  |
    |/   \|
    4 --- 0

    0 -- [2, 3, 4]
    1 -- [2, 3, 4]
    2 -- [0, 1, 3]
    3 -- [0, 1, 2, 4]
    4 -- [1, 3, 0]

Let's say we consider 0 the root. In general, the easiest way to pick the next one is to choose the first node in the adjacency list/matrix that hasn't been visited yet. For 0, that is 2.

    0 -- 2

Now 2's next that hasn't been visited is 1

    0 -- 2 -- 1

1's next not visited is 3

    0 -- 2 -- 1 -- 3

3's next not visited is 4

    0 -- 2 -- 1 -- 3 -- 4

This is all of the nodes, so we're done.
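This is a depth-first search. In code it would look like:

    public void depthFirstSearch(Vertex v) {
        v.visited = true;                  // mark first so we never come back to v
        for (Vertex w : v.adjacent) {      // adjacent: assumed list of v's neighbors
            if (!w.visited) {
                // the edge v-w becomes part of the spanning tree
                depthFirstSearch(w);
            }
        }
    }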
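To see the traversal produce the same tree as the walkthrough, here is a minimal self-contained sketch. It assumes the graph is stored as an array of adjacency lists indexed by node label (with node 3's neighbors taken to be 0, 1, 2, 4) instead of `Vertex` objects; the class name `DfsSpanningTree` and the "parent-child" edge strings are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

class DfsSpanningTree {
    // Adjacency lists for the example graph; the index is the node label.
    static final int[][] ADJ = {
        {2, 3, 4},    // 0
        {2, 3, 4},    // 1
        {0, 1, 3},    // 2
        {0, 1, 2, 4}, // 3
        {1, 3, 0}     // 4
    };

    // Visits v, then recurses into each unvisited neighbor in list order,
    // recording the edge used to reach it as "parent-child".
    static void dfs(int v, int[][] adj, boolean[] visited, List<String> treeEdges) {
        visited[v] = true;
        for (int w : adj[v]) {
            if (!visited[w]) {
                treeEdges.add(v + "-" + w);
                dfs(w, adj, visited, treeEdges);
            }
        }
    }

    // Returns the spanning tree edges found by a depth-first search from root.
    static List<String> spanningTree(int root, int[][] adj) {
        List<String> treeEdges = new ArrayList<>();
        dfs(root, adj, new boolean[adj.length], treeEdges);
        return treeEdges;
    }

    public static void main(String[] args) {
        System.out.println(spanningTree(0, ADJ)); // prints [0-2, 2-1, 1-3, 3-4]
    }
}
```

The four edges form the single chain 0 -- 2 -- 1 -- 3 -- 4 from the walkthrough.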
Using the same example, shown again below, you can also build a spanning tree with a breadth-first approach.

    1 --- 2
    |\   /|
    |  3  |
    |/   \|
    4 --- 0

    0 -- [2, 3, 4]
    1 -- [2, 3, 4]
    2 -- [0, 1, 3]
    3 -- [0, 1, 2, 4]
    4 -- [1, 3, 0]

Start with the root in the queue:

    queue[0]

    0

dequeue[0], and enqueue its unvisited neighbors, giving queue[2, 3, 4]

          2
          |
       3  |
        \ |
    4 --- 0

dequeue[2]. The only node adjacent to 2 that hasn't been visited is 1, so queue it in, giving queue[3, 4, 1]

    1 --- 2
          |
       3  |
        \ |
    4 --- 0

We dequeue the other 3 nodes in the queue, but see no new non-visited children, so that's our spanning tree.
Coding breadth-first search on a graph is much like the breadth-first search we did before, except we have to make sure we haven't already visited a node.

    public void breadthFirstSearch(Vertex v) {
        Queue q = new Queue();             // assumed Queue class with enqueue/dequeue/isEmpty
        v.visited = true;
        q.enqueue(v);
        while (!q.isEmpty()) {
            Vertex w = (Vertex) q.dequeue();
            // print here
            for (Vertex x : w.adjacent) {  // adjacent: assumed list of w's neighbors
                if (!x.visited) {
                    x.visited = true;      // mark when enqueued so x is never queued twice
                    q.enqueue(x);
                }
            }
        }
    }
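As a runnable counterpart, the sketch below applies the same idea to the example graph. It is only an illustration under a few assumptions: the graph is an array of adjacency lists indexed by node label (node 3's neighbors taken to be 0, 1, 2, 4), `java.util.ArrayDeque` stands in for the `Queue` class used above, and the class name `BfsSpanningTree` is made up.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class BfsSpanningTree {
    // Adjacency lists for the example graph; the index is the node label.
    static final int[][] ADJ = {
        {2, 3, 4},    // 0
        {2, 3, 4},    // 1
        {0, 1, 3},    // 2
        {0, 1, 2, 4}, // 3
        {1, 3, 0}     // 4
    };

    // Returns the spanning tree edges found by a breadth-first search from root,
    // each written as "parent-child".
    static List<String> spanningTree(int root, int[][] adj) {
        List<String> treeEdges = new ArrayList<>();
        boolean[] visited = new boolean[adj.length];
        Queue<Integer> queue = new ArrayDeque<>();
        visited[root] = true;
        queue.add(root);
        while (!queue.isEmpty()) {
            int w = queue.remove();          // dequeue
            for (int x : adj[w]) {
                if (!visited[x]) {
                    visited[x] = true;       // mark when enqueued, not when dequeued
                    treeEdges.add(w + "-" + x);
                    queue.add(x);            // enqueue
                }
            }
        }
        return treeEdges;
    }

    public static void main(String[] args) {
        System.out.println(spanningTree(0, ADJ)); // prints [0-2, 0-3, 0-4, 2-1]
    }
}
```

Three of the four edges leave the root directly, which matches the tree drawn in the walkthrough.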
Not surprisingly, the depth-first search gives a deeper tree with longer paths, and the breadth-first search gives a wider, shallower tree.
Spanning trees can solve a lot of problems. For example, the above might represent different rooms in a house. By finding a spanning tree, we can minimize the number of connections needed to power the whole house.
What if there are weights associated with the edges? Can we find a spanning tree that gives a minimal total cost? Say we wanted to minimize the amount of cable we have to buy to wire up the house. Such a tree is often called a minimum spanning tree. How do we find it?
            2
        a ----- b
     1/ |     / | \ 1
     /  |    /  |  \
    d  2|   /   |3  f-|
    | \ |  / 2  |  /  |
    | 2\| /     | / 2 |
    |   c ----- e     |
    |       1         |
     ----------------- 1

We can try a greedy approach: keep adding the shortest remaining wire that doesn't create a cycle. Add every edge to a heap, and then removeMin each time, adding the removed edge to the tree only if it doesn't create a cycle. You know you're done when you've added N-1 edges, so just stop then.
This is called Kruskal's algorithm. It is actually correct and works, but what guarantees that? Here's the idea. Each node needs to be added to the tree. If you draw an edge, you're making a connection from one node to another. There's no cheaper way to make that connection, or else that cheaper edge would have been added already.
This algorithm depends on something that we haven't learned yet: detecting whether adding an edge would create a cycle or not.
To do this we have an array with one entry for each node, initializing all of its values to -1, representing that the node is not attached to anything. Every time you add an edge, just update the entry of whichever node you consider the child to point to the parent. For example, if you're going to add an edge from 1 to 3, then either put a 3 in index 1, or a 1 in index 3. To test an edge, follow the parent entries up from each endpoint; if both lead to the same node, the endpoints are already connected and the edge would create a cycle.
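The parent-array idea can be sketched as a small class. This is one possible arrangement, not necessarily the one used in lecture: the names `ParentArray`, `findRoot`, `wouldCreateCycle`, and `addEdge` are hypothetical, and when two groups are joined this version always points one group's root at the other group's root, so the array contents can differ from a hand-worked trace even though the cycle answers are the same.

```java
import java.util.Arrays;

class ParentArray {
    // parent[i] is the node that i points to, or -1 if i points to nothing.
    final int[] parent;

    ParentArray(int n) {
        parent = new int[n];
        Arrays.fill(parent, -1);
    }

    // Follows parent links from v until reaching a node with no parent.
    int findRoot(int v) {
        while (parent[v] != -1) {
            v = parent[v];
        }
        return v;
    }

    // Two nodes with the same root are already connected,
    // so an edge between them would close a cycle.
    boolean wouldCreateCycle(int u, int v) {
        return findRoot(u) == findRoot(v);
    }

    // Adds the edge u-v if it is safe. Returns false (and changes nothing)
    // when the edge would create a cycle.
    boolean addEdge(int u, int v) {
        int rootU = findRoot(u);
        int rootV = findRoot(v);
        if (rootU == rootV) {
            return false;
        }
        parent[rootV] = rootU; // v's whole group now hangs under u's root
        return true;
    }
}
```

For example, after `addEdge(0, 3)`, `addEdge(1, 5)`, and `addEdge(3, 5)`, nodes 0 and 1 share a root, so `addEdge(0, 1)` returns false.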
We'll create the initial priority queue:
    a-d: 1
    c-e: 1
    b-f: 1
    d-f: 1
    a-b: 2
    a-c: 2
    c-d: 2
    e-f: 2
    b-e: 3

and our initial array
      a   b   c   d   e   f
    [-1, -1, -1, -1, -1, -1]

We'll take the first one off, which is a-d. This is making d a child of a, so we modify the array by reflecting this

      a   b   c   d   e   f
    [-1, -1, -1,  0, -1, -1]

    a --- d

Then the next one, c-e, with e the child of c.

      a   b   c   d   e   f
    [-1, -1, -1,  0,  2, -1]

    a --- d     c --- e

Then the next one, b-f, with f the child of b.

      a   b   c   d   e   f
    [-1, -1, -1,  0,  2,  1]

    a ---- d     c --- e

    b --- f

Then the next one, d-f, with f the child of d. Since d and f have parents already, just have a's parent point to f.

      a   b   c   d   e   f
    [ 5, -1, -1,  0,  2,  1]

    a ---- d     c --- e
          /
    b --- f

Then the next one, a-b, with b the child of a. We see that a is related to f, which is related to b, so we know that this would create a cycle, so we can't add this one.

      a   b   c   d   e   f
    [ 5, -1, -1,  0,  2,  1]

    a ---- d     c --- e
          /
    b --- f

The next one is a-c, with c the child of a.

      a   b   c   d   e   f
    [ 5, -1,  0,  0,  2,  1]

      c --- e
     /
    a ---- d
          /
    b --- f

There are now 5, or 6-1 edges added, so we're done.
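Putting the heap and the parent array together gives the whole algorithm. The following is a rough sketch rather than a definitive implementation. It assumes nodes a to f are numbered 0 to 5, uses the nine edges from the priority queue above with d-f taken to have weight 1 (an assumption read off the diagram), and uses `java.util.PriorityQueue` as the heap. The names `KruskalSketch`, `Edge`, and `exampleEdges` are made up. Because the heap may break ties between equal weights in a different order, the exact edges chosen can differ from the trace, but the total weight is the same.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class KruskalSketch {
    // An undirected edge between nodes u and v with the given weight.
    static final class Edge {
        final int u, v, weight;
        Edge(int u, int v, int weight) { this.u = u; this.v = v; this.weight = weight; }
    }

    // Follows parent links from v until reaching a node with no parent (-1).
    static int findRoot(int[] parent, int v) {
        while (parent[v] != -1) {
            v = parent[v];
        }
        return v;
    }

    // Returns the edges of a minimum spanning tree of a connected graph on nodes 0..n-1.
    static List<Edge> minimumSpanningTree(int n, List<Edge> edges) {
        PriorityQueue<Edge> heap =
            new PriorityQueue<>((x, y) -> Integer.compare(x.weight, y.weight));
        heap.addAll(edges);
        int[] parent = new int[n];
        Arrays.fill(parent, -1);
        List<Edge> tree = new ArrayList<>();
        while (tree.size() < n - 1 && !heap.isEmpty()) {
            Edge e = heap.poll();            // removeMin: cheapest remaining edge
            int rootU = findRoot(parent, e.u);
            int rootV = findRoot(parent, e.v);
            if (rootU != rootV) {            // endpoints not yet connected: no cycle
                parent[rootV] = rootU;
                tree.add(e);
            }
        }
        return tree;
    }

    static int totalWeight(List<Edge> tree) {
        int sum = 0;
        for (Edge e : tree) {
            sum += e.weight;
        }
        return sum;
    }

    // The example graph, with a=0, b=1, c=2, d=3, e=4, f=5.
    static List<Edge> exampleEdges() {
        return Arrays.asList(
            new Edge(0, 3, 1), new Edge(2, 4, 1), new Edge(1, 5, 1),
            new Edge(3, 5, 1), // d-f: weight 1 is an assumption
            new Edge(0, 1, 2), new Edge(0, 2, 2), new Edge(2, 3, 2),
            new Edge(4, 5, 2), new Edge(1, 4, 3));
    }

    public static void main(String[] args) {
        List<Edge> tree = minimumSpanningTree(6, exampleEdges());
        System.out.println(tree.size() + " edges, total weight " + totalWeight(tree));
        // prints "5 edges, total weight 6"
    }
}
```

The total of 6 agrees with the hand trace: four edges of weight 1 (a-d, c-e, b-f, d-f) plus one edge of weight 2.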