Graphs:
A directed graph is a set of
nodes V and a set of pairs of nodes E for which there is never a pair
of the form <u,u>. Each pair in E is called an edge. A graph is said to be undirected if for every edge
<u,v> there is an edge <v,u>. If this is the case, we
normally treat E as having only one of the two edges <u,v> or
<v,u>. Normally we call an undirected graph just a graph and a directed graph a directed graph. (Note that since
E is a set, it contains any given pair <u,v> at most once - that is,
there is always at most one edge from node u to node v for every u and
v. A structure allowing edges between a node and itself - i.e., of form
<u,u> - and/or multiple edges between the same pair of nodes is
called a multigraph. These
will come up later in the course.)
The number of possible edges for a directed graph with n nodes is
n*(n-1) since there would be n choices for the first node and for each
such chosen node n-1 possible choices for the second node. In a graph,
there would be exactly half of this since we would treat <u,v>
and <v,u> as the same edge. Thus a graph has at most
n*(n-1)/2 edges.
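The two counts above can be checked by brute force. The following is only a small sketch: the function names are made up for illustration, and the nodes are assumed to be labeled 0 to n-1.

```python
from itertools import combinations, permutations

def max_directed_edges(n):
    # n choices for the first node, n-1 for the second
    return n * (n - 1)

def max_undirected_edges(n):
    # <u,v> and <v,u> count as the same edge, so exactly half as many
    return n * (n - 1) // 2

# Enumerate every possible edge on 4 nodes and compare with the formulas.
nodes = range(4)
directed = list(permutations(nodes, 2))    # ordered pairs with u != v
undirected = list(combinations(nodes, 2))  # unordered pairs with u != v
print(len(directed), len(undirected))
```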
A path between vertices
u and v is a sequence of edges "leading from u to v" - that is, a
sequence of pairs of the form
<u,v1> <v1,v2>...<vk,vk+1><vk+1,v>
That is, it is a sequence of pairs which, if "followed" would lead step by
step through other vertices from u to v. A path is said to be simple if no vertex is repeated
except possibly the starting and finishing node. (Any path
contains a simple path, which can be constructed by finding the first
vertex that is repeated, removing the pairs that lead from its first
visit back to it, and
repeating until there are no such repetitions.) A simple path for which
the starting and ending vertices are different and which visits every
vertex is called a Hamiltonian Path.
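The construction of a simple path from an arbitrary path can be sketched as follows. This is one possible reading of the construction, not a definitive version. It assumes that the path is given as the list of vertices it visits (so the edges are the consecutive pairs) and that the two endpoints are different. The name simplify_path is invented for this example.

```python
def simplify_path(vertices):
    """Return a simple path with the same endpoints as the given path."""
    simple = []      # vertices of the simple path built so far
    position = {}    # vertex -> its index in `simple`
    for v in vertices:
        if v in position:
            # v was visited before: drop everything walked since then
            cut = position[v]
            for dropped in simple[cut + 1:]:
                del position[dropped]
            del simple[cut + 1:]
        else:
            position[v] = len(simple)
            simple.append(v)
    return simple

# The detour 1 -> 2 -> 1 is removed.
print(simplify_path([0, 1, 2, 1, 3]))
```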
A cycle is a path from
a node u back to itself - that is, a sequence of pairs of the form
<u,v1> <v1,v2>...<vk,vk+1><vk+1,u>
That is, a sequence of pairs which if "followed" would lead step by step
from vertex u back to itself. A cycle is called a simple cycle if it is also simple -
that is, no vertex except the first and last appears more than once.
(From any non-simple cycle, you can create a simple
cycle by removing the pairs that leave a repeated vertex and return to it.) A Hamiltonian Cycle is a simple cycle
that hits every node in the graph.
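Checking whether a given cycle is Hamiltonian follows directly from the definition. The sketch below makes several assumptions that are not in the notes: the cycle is given as a list of vertices whose first and last entries are equal, the vertices are labeled 0 to n-1, and the edges are given as a Python set of pairs (for an undirected graph, containing both <u,v> and <v,u>). It also assumes n is at least 3 in the undirected case, so that no edge is used twice.

```python
def is_hamiltonian_cycle(cycle, n, edges):
    # Must return to the start and contain n distinct vertices before that.
    if len(cycle) != n + 1 or cycle[0] != cycle[-1]:
        return False
    if set(cycle[:-1]) != set(range(n)):
        return False
    # Every consecutive pair must actually be an edge.
    return all((a, b) in edges for a, b in zip(cycle, cycle[1:]))

# A square 0-1-2-3-0, stored with both directions of every edge.
square = set()
for u, v in [(0, 1), (1, 2), (2, 3), (3, 0)]:
    square.add((u, v))
    square.add((v, u))
print(is_hamiltonian_cycle([0, 1, 2, 3, 0], 4, square))
```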
A graph (or directed graph) is said to be connected if there is a path
between any two nodes in the graph (respectively directed graph). A
directed graph is said to be strongly
connected if for every two vertices u and v there is a path from
u to v and a path from v to u. (If there is a path from u to v and a
path from v to u in a directed graph, then we say that u and v are mutually reachable.)
A component of a graph (or
directed graph) containing node v is the largest connected set of
vertices containing v. Every graph (or directed graph) can
be broken down into a set of disjoint components. (Note that the set of
vertices in a component and the set of edges connecting vertices in the
component forms a new connected graph - a subgraph of the original graph.) A strong component of a
directed graph G containing vertex v is the largest set of vertices in
G which contains v and such that if u is another vertex in that set,
then u and v are mutually reachable. (Note that the set of vertices in
a strong component and the set of edges connecting vertices in that set
forms a new strongly connected graph.)
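One way to compute the strong component containing v is to intersect the set of vertices reachable from v with the set of vertices from which v can be reached. This is a sketch only. It assumes the directed graph is stored as a list adj in which adj[u] holds every v such that <u,v> is an edge, with vertices labeled 0 to n-1. The function names are invented for the example.

```python
def reachable_from(adj, start):
    """Set of vertices reachable from start by following edges."""
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return seen

def strong_component(adj, v):
    # Reverse every edge so that reachability in `reverse`
    # means "can reach v" in the original graph.
    reverse = [[] for _ in range(len(adj))]
    for a in range(len(adj)):
        for b in adj[a]:
            reverse[b].append(a)
    return reachable_from(adj, v) & reachable_from(reverse, v)

def is_strongly_connected(adj):
    return len(adj) == 0 or len(strong_component(adj, 0)) == len(adj)

# 0 -> 1 -> 2 -> 0 is a cycle; 3 can be reached but cannot get back.
adj = [[1], [2], [0, 3], []]
print(strong_component(adj, 0), strong_component(adj, 3))
```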
A tree is a connected graph
for which there are no cycles. In such a graph if you choose any node
and call it the root, then
there is exactly one path from the root to every other node which does
not repeat any vertex.
A forest is a graph with the
property that each component is a tree.
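A quick test for forests and trees can be sketched with a union-find structure, a technique that is not part of these notes and is swapped in here for brevity. A graph is a forest exactly when no edge joins two vertices that are already connected. The tree test relies on the standard fact (not stated above) that a forest on n vertices with n-1 edges has a single component. The sketch assumes each undirected edge is listed once and vertices are labeled 0 to n-1.

```python
def is_forest(n, edges):
    parent = list(range(n))  # each vertex starts in its own set

    def find(x):
        # Follow parent links to the representative of x's set.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False     # u and v already connected: this edge closes a cycle
        parent[ru] = rv
    return True

def is_tree(n, edges):
    return n > 0 and len(edges) == n - 1 and is_forest(n, edges)

print(is_tree(4, [(0, 1), (1, 2), (1, 3)]))
```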
A traversal of a graph is an
algorithm that somehow visits each node of a graph once.
An adjacency list representation of a graph G=(V,E) is an array A of
n=|V| lists such that v is in the list A[u] if and
only if <u,v> is an edge (i.e., is in edge set E).
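As an illustration, an adjacency list can be built from a list of edges in one pass. This is a minimal sketch: the function name and the directed flag are assumptions made for the example, and vertices are assumed to be labeled 0 to n-1.

```python
def build_adjacency(n, edges, directed=True):
    adj = [[] for _ in range(n)]   # one (initially empty) list per vertex
    for u, v in edges:
        adj[u].append(v)           # v is in adj[u] iff <u,v> is an edge
        if not directed:
            adj[v].append(u)       # undirected: store both directions
    return adj

print(build_adjacency(3, [(0, 1), (0, 2)]))
print(build_adjacency(3, [(0, 1), (0, 2)], directed=False))
```

Building the structure takes one step per vertex plus one step per edge, so it is itself of order n+m.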
A breadth-first traversal of a
graph is the following algorithm. (From here on we will assume that
the vertices are labeled with the numbers from 0 to n-1 where n=|V| is
the number of vertices.)
Copy the set of vertices from V into a set T
Create an array of booleans of size n called Marked and set every
element to false
Create an empty queue Q
While T is not empty
    Take an element w from T, enqueue it in Q and set Marked[w]=true
    While Q is not empty
        Dequeue the first node v from Q
        For every node u such that <v,u> is an edge and for which Marked[u]=false
            Enqueue u in Q and set Marked[u]=true
        end For
        Remove v from T
    end While
end While
Notice that if our graph is represented by an adjacency list, then this
algorithm takes time roughly Cm+Dn for some constants C and D, where m
is the number of edges and n is the number of vertices, and is,
therefore, of time order m+n.
Notice also that each time through the outer loop visits one connected
component, so if the outer loop increments a counter nComponents each
time through, then the graph was connected exactly when nComponents
ends up being 1.
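The breadth-first traversal with the component counter can be sketched in Python as follows. It is an illustration rather than a literal transcription: instead of keeping the set T, it scans the vertices in order and starts a new search from each one that is still unmarked, which has the same effect. It assumes an undirected graph stored as a list of neighbor lists, and the function name is invented.

```python
from collections import deque

def bfs_components(adj):
    n = len(adj)
    marked = [False] * n
    order = []            # vertices in the order they are visited
    n_components = 0
    for w in range(n):
        if marked[w]:
            continue      # w was reached by an earlier search
        n_components += 1
        marked[w] = True
        queue = deque([w])
        while queue:
            v = queue.popleft()
            order.append(v)
            for u in adj[v]:
                if not marked[u]:
                    marked[u] = True
                    queue.append(u)
    return order, n_components

# Two components: {0, 1, 2, 3} and {4, 5}.
adj = [[1, 2], [0, 3], [0], [1], [5], [4]]
print(bfs_components(adj))
```

Each vertex is enqueued once and each adjacency list is scanned once, which matches the order m+n estimate.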
A depth-first traversal of a
graph is the following algorithm:
OverallDFS()
    Create array Marked of length n=|V| and set all
    elements of Marked false
    Copy each element of V to set T
    While there is at least one element u left in T
        DFS(u)
    end While
end OverallDFS
DFS(u):
    Set Marked[u]=true
    Remove u from T
    for each edge <u,v>
        if !Marked[v]
            DFS(v)
        end if
    end for
end DFS
The book chooses to implement set T as a stack. This is probably
a mistake, since it means that constructing the tree of nodes as
they are added to T requires a Parent array. If you implement T
as a queue, then the tree is "as you generate it".
Notice that each call to DFS inside OverallDFS adds a component
to the traversal.
This means that OverallDFS leads to a trivial algorithm to test if a
graph is connected. Simply change OverallDFS so that it sets a variable
nComponents to 0 before the loop and adds 1 each time through the loop.
If, after the loop is finished, nComponents is 1, then the graph was
connected. As with breadth-first traversal, if the graph is represented
using an adjacency list, then the algorithm is order n+m where n is
the number of vertices and m is the number of edges.
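A Python sketch of OverallDFS with the nComponents counter is given below. As assumptions for the example, the set T is replaced by a scan over the Marked array, the graph is an undirected graph stored as a list of neighbor lists, and the names are adapted to Python style. The recursion mirrors the pseudocode, so very deep graphs could exceed Python's default recursion limit.

```python
def overall_dfs(adj):
    n = len(adj)
    marked = [False] * n
    order = []                # vertices in the order they are first visited

    def dfs(u):
        marked[u] = True
        order.append(u)
        for v in adj[u]:
            if not marked[v]:
                dfs(v)

    n_components = 0
    for u in range(n):
        if not marked[u]:     # u starts a component not seen before
            n_components += 1
            dfs(u)
    return order, n_components

# Two components: {0, 1, 2, 3} and {4, 5}.
adj = [[1, 2], [0, 3], [0], [1], [5], [4]]
print(overall_dfs(adj))
```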
A Directed Acyclic Graph (usually
written DAG) is a directed graph with no cycles. A topological sort of a DAG is a
re-ordering of the vertices of the graph so that if v1,v2,...,vn
is the new ordering then <vi,vj> is an edge
only if i<j. Given a DAG, the following is an algorithm for
computing a topological sort for the DAG.
TopologicalSort(G=(V,E))
    Set i=0
    Create a new empty queue Q
    Create an array InDegree of length n=|V|
    For v=0 to n-1
        InDegree[v]=0
    endFor
    For v=0 to n-1
        For each u in Adjacency[v]
            InDegree[u]++
        endFor
    endFor
    For v=0 to n-1
        If InDegree[v]==0
            enqueue v in Q
        endIf
    endFor
    Construct an array NewOrdering of length n
    While Q is not empty
        dequeue element v from Q
        Set NewOrdering[i]=v
        i=i+1
        Set InDegree[v]=-1
        For every u in Adjacency[v]
            subtract 1 from InDegree[u]
            If InDegree[u]==0
                enqueue u in Q
            end If
        end For
    end While
    At this point, NewOrdering is a topological sort if G was a DAG
Interestingly, this algorithm (of order m+n) can be used to see if a
directed graph is a DAG. If there is no initial v with InDegree[v]==0,
then the graph is not a DAG, and if at the end some element of the array
InDegree is not <0 (equivalently, i<n), then the graph is not a DAG.
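The topological sort and the DAG test can be combined in one short Python sketch. It follows the pseudocode but, as choices made for this example, it returns None when the graph is not a DAG and detects that case by counting the vertices placed in the ordering rather than by setting InDegree entries to -1. The graph is assumed to be a list of neighbor lists.

```python
from collections import deque

def topological_sort(adj):
    n = len(adj)
    in_degree = [0] * n
    for v in range(n):
        for u in adj[v]:
            in_degree[u] += 1
    # Start with every vertex that has no incoming edge.
    queue = deque(v for v in range(n) if in_degree[v] == 0)
    new_ordering = []
    while queue:
        v = queue.popleft()
        new_ordering.append(v)
        for u in adj[v]:
            in_degree[u] -= 1          # edge <v,u> is now accounted for
            if in_degree[u] == 0:
                queue.append(u)
    # Vertices on a cycle never reach in-degree 0, so they are never output.
    return new_ordering if len(new_ordering) == n else None

print(topological_sort([[1, 2], [3], [3], []]))
print(topological_sort([[1], [2], [0]]))
```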
Note further that storing the indegree and outdegree of each node
alongside the adjacency list will not change the order of the time
required to build the adjacency list in the first place.