Graphs:


A directed graph is a set of nodes V and a set E of ordered pairs of nodes in which there is never a pair of the form <u,u>. Each pair in E is called an edge. A graph is said to be undirected if for every edge <u,v> there is also an edge <v,u>. If this is the case, we normally treat E as having only one of the two edges <u,v> or <v,u>. Normally we call an undirected graph just a graph and a directed graph a directed graph. (Note that since E is a set, there is always at most one edge from node u to node v for every u and v. A structure allowing edges between a node and itself - i.e., of the form <u,u> - and/or multiple edges between the same pair of nodes is called a multigraph. These will come up later in the course.)

The number of possible edges for a directed graph with n nodes is n*(n-1), since there are n choices for the first node and, for each such choice, n-1 possible choices for the second node. In a graph, there would be exactly half of this since we treat <u,v> and <v,u> as the same edge. Thus a graph has at most n*(n-1)/2 edges.
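As a quick check of the counting argument, the following sketch enumerates every possible edge on n labeled nodes. The function names are made up for illustration; the nodes are assumed to be labeled 0 to n-1.

```python
from itertools import combinations, permutations

def all_directed_edges(n):
    # Every ordered pair <u,v> with u != v: n choices for u, n-1 for v.
    return list(permutations(range(n), 2))

def all_undirected_edges(n):
    # Every unordered pair {u,v}: <u,v> and <v,u> count as one edge.
    return list(combinations(range(n), 2))

n = 5
print(len(all_directed_edges(n)), n * (n - 1))          # both are 20
print(len(all_undirected_edges(n)), n * (n - 1) // 2)   # both are 10
```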

A path between vertices u and v is a sequence of edges "leading from u to v" - that is, a sequence of pairs of the form
    <u,v1> <v1,v2>...<vk,vk+1><vk+1,v>
That is, it is a sequence of pairs which, if "followed", would lead step by step through other vertices from u to v. A path is said to be simple if no vertex is repeated except possibly the starting and finishing node. (Any path contains a simple path, which can be constructed by finding a vertex that repeats, removing the pairs that lead from its first occurrence back to it, and repeating until there are no such repetitions.)
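The loop-removal construction can be sketched as follows. This is only an illustration under some assumptions: the path is given as the list of vertices it visits in order (so consecutive entries are the pairs), the start and end vertices differ, and the name simplify_path is made up here.

```python
def simplify_path(path):
    """Return a simple path with the same endpoints, given a list of vertices."""
    simple = []      # vertices kept so far, in order
    position = {}    # vertex -> its index in simple
    for v in path:
        if v in position:
            # v repeats: drop everything visited since its first occurrence.
            cut = position[v]
            for w in simple[cut + 1:]:
                del position[w]
            del simple[cut + 1:]
        else:
            position[v] = len(simple)
            simple.append(v)
    return simple

# The walk 0 -> 1 -> 2 -> 1 -> 3 contains the loop 1 -> 2 -> 1.
print(simplify_path([0, 1, 2, 1, 3]))   # [0, 1, 3]
```

Every consecutive pair in the result is also a consecutive pair somewhere in the input, so the result only uses edges of the original path.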

A cycle is a path from a node u back to itself - that is, a sequence of pairs of the form
     <u,v1> <v1,v2>...<vk,vk+1><vk+1,u>
That is, a sequence of pairs which, if "followed", would lead step by step from vertex u back to itself. A cycle is called a simple cycle if it is also simple - that is, no vertex except the first and last appears more than once. (For any non-simple cycle, you can create a simple cycle by removing the pairs that return to a repeated vertex.) A Hamiltonian cycle is a simple cycle that visits every node in the graph.

A graph (or directed graph) is said to be connected if there is a path between any two nodes in the graph (respectively directed graph). A directed graph is said to be strongly connected if for every two vertices u and v there is a path from u to v and a path from v to u. (If there is a path from u to v and a path from v to u in a directed graph, then we say that u and v are mutually reachable.)

A component of a graph (or directed graph) containing node v is the largest connected set of vertices containing v.   Every graph (or directed graph) can be broken down into a set of disjoint components. (Note that the set of vertices in a component and the set of edges connecting vertices in the component forms a new connected graph.) A strong component of a directed graph G containing vertex v is the largest set of vertices in G which contains v and such that if u is another vertex in that set, then u and v are mutually reachable. (Note that the set of vertices in a strong component and the set of edges connecting vertices in that set forms a new strongly connected graph.)
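The definition of a strong component translates directly into a (not especially fast) computation: collect everything reachable from v, collect everything that can reach v, and intersect the two sets. The sketch below assumes the directed graph is stored as a list of neighbor lists indexed by vertex number; the helper names are made up for illustration.

```python
def reachable(adj, start):
    """Set of vertices reachable from start by following edges in adj."""
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return seen

def reverse(adj):
    """Neighbor lists of the graph with every edge <u,v> turned into <v,u>."""
    rev = [[] for _ in adj]
    for u, neighbors in enumerate(adj):
        for v in neighbors:
            rev[v].append(u)
    return rev

def strong_component(adj, v):
    # u is mutually reachable with v iff v reaches u and u reaches v.
    return reachable(adj, v) & reachable(reverse(adj), v)

# Edges: 0->1, 1->2, 2->0, 2->3
adj = [[1], [2], [0, 3], []]
print(strong_component(adj, 0))   # {0, 1, 2}
print(strong_component(adj, 3))   # {3}
```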

A tree is a connected graph with no cycles. In such a graph, if you choose any node and call it the root, then there is exactly one path from the root to every other node which does not repeat any vertex.
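One way to test the definition is sketched below: search outward from vertex 0, remembering each vertex's parent, and report a cycle as soon as an edge leads to an already-discovered vertex other than the parent. The sketch assumes an undirected graph stored as neighbor lists with each edge listed in both directions and no repeated edges; the name is_tree and the choice to reject the empty graph are assumptions made here.

```python
def is_tree(adj):
    """True if the undirected graph (neighbor lists) is connected and acyclic."""
    n = len(adj)
    if n == 0:
        return False              # assumption: the empty graph is not a tree
    parent = {0: None}            # discovered vertices and who discovered them
    stack = [0]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u == parent[v]:
                continue          # the edge we arrived on, seen from the other side
            if u in parent:
                return False      # a second route to u: there is a cycle
            parent[u] = v
            stack.append(u)
    return len(parent) == n       # connected iff every vertex was discovered

print(is_tree([[1], [0, 2], [1]]))          # path 0-1-2: True
print(is_tree([[1, 2], [0, 2], [0, 1]]))    # triangle: False
```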

A forest is a graph with the property that each component is a tree.

A traversal of a graph is an algorithm that somehow visits each node of a graph once.

An adjacency list representation of a graph G=(V,E) is an array A of n=|V| lists, one list per vertex, such that v is in the list A[u] if and only if <u,v> is an edge (i.e., is in the edge set E).
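A minimal sketch of building such a structure from a list of edges, assuming the vertices are labeled 0 to n-1. The function name and the directed flag are made up for illustration; for an undirected graph each edge is stored in both lists.

```python
def build_adjacency_list(n, edges, directed=False):
    """Return A where v is in A[u] iff <u,v> is an edge."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)    # undirected: <v,u> is an edge as well
    return adj

print(build_adjacency_list(4, [(0, 1), (1, 2)]))
# [[1], [0, 2], [1], []]
```

Building the structure touches each vertex once and each edge once, so it takes time of order n+m.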

A breadth-first traversal of a graph is the following algorithm. (From here on we will assume that the vertices are labeled with the numbers from 0 to n-1 where n=|V| is the number of vertices.)

Copy the set of vertices from V into a set T
Create an array of booleans of size n called Marked and set every element to false
Create Queue Q
While T is not empty
    Remove an element w from T, enqueue it in Q and set Marked[w]=true
    While Q is not empty
        Dequeue the first node v from Q
        For every node u such that <v,u> is an edge and for which Marked[u]=false
            Enqueue u in Q, set Marked[u]=true and remove u from T
        end For
    end While
end While
    
Notice that if our graph is represented by an adjacency list, then this algorithm takes time roughly Cm+Dn, where m is the number of edges and n is the number of vertices, and is therefore of order m+n. Notice also that each pass through the outer loop visits one connected component, so if the outer loop increments a counter nComponents each time through, then the graph is connected exactly when nComponents ends up being 1.
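The breadth-first traversal with the component counter can be sketched as follows. This is one possible rendering, not the only one: instead of keeping an explicit set T, it scans the vertices 0 to n-1 and starts a new search at each one that is still unmarked, which plays the same role. The graph is assumed to be an undirected graph stored as neighbor lists, and the function name is made up.

```python
from collections import deque

def bfs_components(adj):
    """Return (visit order, number of connected components)."""
    n = len(adj)
    marked = [False] * n
    order = []
    n_components = 0
    for w in range(n):              # stands in for "take an element w from T"
        if marked[w]:
            continue
        n_components += 1           # each new start vertex opens a new component
        marked[w] = True
        queue = deque([w])
        while queue:
            v = queue.popleft()
            order.append(v)
            for u in adj[v]:
                if not marked[u]:
                    marked[u] = True
                    queue.append(u)
    return order, n_components

# Two components: {0,1,2,3} and {4,5}
adj = [[1, 2], [0, 3], [0], [1], [5], [4]]
print(bfs_components(adj))   # ([0, 1, 2, 3, 4, 5], 2)
```

Each vertex is enqueued once and each neighbor list is scanned once, which is where the order m+n bound comes from.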

A depth-first traversal of a graph is the following algorithm:
OverallDFS()
    Create array Marked of length n=|V| and set all elements of Marked false
    Copy each element of V to set T
    While there is at least one element u left in T
        DFS(u)
    end While
end OverallDFS

DFS(u):
    Set Marked[u]=true and remove u from T
    for each edge <u,v>
        if !Marked[v]
            DFS(v)
        end if
    end for
end DFS

The book chooses to implement set T as a stack. This is probably a mistake, since constructing the tree of nodes as they are added to T then requires a Parent array. If T is implemented as a queue, then the tree is available "as you generate it".

Notice that each call to DFS inside OverallDFS adds a component to the traversal.

This means that OverallDFS leads to a trivial algorithm to test if a graph is connected. Simply change OverallDFS so that it sets a variable nComponents to 0 before the loop and adds 1 each time through the loop. If, after the loop is finished, nComponents is 1, then the graph was connected. As with breadth-first search, if the graph is represented using an adjacency list, then the algorithm is of order n+m where n is the number of vertices and m is the number of edges.
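A sketch of the depth-first version with the component counter, under the same assumptions as before (undirected graph stored as neighbor lists, vertices labeled 0 to n-1, made-up function names). As in the pseudocode, DFS is recursive, so for very deep graphs Python's recursion limit would need to be raised or the recursion replaced by an explicit stack.

```python
def dfs_components(adj):
    """Return (visit order, number of connected components)."""
    n = len(adj)
    marked = [False] * n
    order = []

    def dfs(u):
        marked[u] = True
        order.append(u)
        for v in adj[u]:
            if not marked[v]:
                dfs(v)

    n_components = 0
    for u in range(n):              # stands in for "an element u left in T"
        if not marked[u]:
            n_components += 1       # each top-level call covers one component
            dfs(u)
    return order, n_components

adj = [[1, 2], [0, 3], [0], [1], [5], [4]]
order, n_components = dfs_components(adj)
print(order)                 # [0, 1, 3, 2, 4, 5]
print(n_components == 1)     # False: the graph is not connected
```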

A Directed Acyclic Graph (usually written DAG) is a directed graph with no cycles. A topological sort of a DAG is a re-ordering of its vertices so that if v1,v2,...,vn is the new ordering then <vi,vj> is an edge only if i<j. Given a DAG, the following is an algorithm for computing a topological sort of it.

TopologicalSort(G=(V,E))
    Set i=0
    Create a new empty queue Q
    Create an array InDegree of length n=|V|
    For v=0 to n-1
       InDegree[v]=0
    endFor
    For v=0 to n-1
       for each u in Adjacency[v]
          InDegree[u]++
       endFor
    endFor
    For v=0 to n-1
       if(InDegree[v]==0)
         enqueue v in Q
       endIf
    endFor
    Construct an array NewOrdering of length n
    While Q is not empty
        dequeue element v from Q
        Set NewOrdering[i]=v
        i=i+1
        Set InDegree[v]=-1
        For every u in Adjacency[v]
            subtract 1 from InDegree[u]
            If InDegree[u]==0
                enqueue u in Q
            end If
        end For
    end While
    At this point, NewOrdering is a topological sort if G was a DAG

Interestingly, this algorithm (of order m+n) can be used to test whether a directed graph is a DAG. If there is no initial v with InDegree[v]==0, then the graph is not a DAG, and if at the end some element of InDegree is not less than 0 (that is, some vertex was never dequeued), then the graph is not a DAG.
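The algorithm and the DAG test can be sketched together. This version makes one small substitution: rather than setting InDegree[v]=-1 and inspecting the array afterwards, it checks whether all n vertices made it into the ordering, which is the same condition. The directed graph is assumed to be stored as neighbor lists, and returning None for a non-DAG is a convention chosen here.

```python
from collections import deque

def topological_sort(adj):
    """Return a topological ordering of the vertices, or None if not a DAG."""
    n = len(adj)
    in_degree = [0] * n
    for v in range(n):
        for u in adj[v]:
            in_degree[u] += 1
    queue = deque(v for v in range(n) if in_degree[v] == 0)
    new_ordering = []
    while queue:
        v = queue.popleft()
        new_ordering.append(v)
        for u in adj[v]:
            in_degree[u] -= 1           # the edge <v,u> is now accounted for
            if in_degree[u] == 0:
                queue.append(u)
    if len(new_ordering) < n:
        return None                     # some vertex was never dequeued: a cycle
    return new_ordering

print(topological_sort([[1, 2], [3], [3], []]))   # [0, 1, 2, 3]
print(topological_sort([[1], [0]]))               # None
```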

Note further that storing the indegree and outdegree of each vertex alongside the adjacency list will not change the order of the time required to build the adjacency list in the first place.

Exercises in the book #7,8