Graphs:


A directed graph is a set of nodes V and a set E of ordered pairs of nodes in which there is never a pair of the form <u,u>. Each pair in E is called an edge. A graph is said to be undirected if for every edge <u,v> there is also an edge <v,u>. If this is the case, we normally treat E as having only one of the two edges <u,v> or <v,u>. Normally we call an undirected graph just a graph and a directed graph a directed graph. (Note that since E is a set, it contains any given pair <u,v> at most once - that is, there is always at most one edge from node u to node v for every u and v. A structure allowing edges between a node and itself - i.e., of form <u,u> - and/or multiple edges between the same pair of nodes is called a multigraph. These will come up later in the course.)

The number of possible edges for a directed graph with n nodes is n*(n-1), since there are n choices for the first node and, for each such chosen node, n-1 possible choices for the second node. In a graph, there would be exactly half of this since we would treat <u,v> and <v,u> as the same edge. Thus a graph has at most n*(n-1)/2 edges.
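The counting argument above can be checked by brute force for small cases. The following is a minimal sketch; the function names max_directed_edges and max_undirected_edges are made up for this illustration and are not from the course text.

```python
from itertools import combinations, permutations


def max_directed_edges(n: int) -> int:
    """Number of ordered pairs <u,v> with u != v over n nodes."""
    return n * (n - 1)


def max_undirected_edges(n: int) -> int:
    """Half the directed count, since <u,v> and <v,u> are the same edge."""
    return n * (n - 1) // 2


# Brute-force check: enumerate every possible edge on 5 nodes.
n = 5
nodes = range(n)
assert len(list(permutations(nodes, 2))) == max_directed_edges(n)
assert len(list(combinations(nodes, 2))) == max_undirected_edges(n)
```

Here permutations(nodes, 2) enumerates ordered pairs of distinct nodes (directed edges) and combinations(nodes, 2) enumerates unordered pairs (undirected edges).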

A path between vertices u and v is a sequence of edges "leading from u to v" - that is, a sequence of pairs of the form
    <u,v1> <v1,v2>...<vk,vk+1><vk+1,v>
That is, it is a sequence of pairs which, if "followed", would lead step by step through other vertices from u to v. A path is said to be simple if no vertex is repeated except possibly the starting and finishing node. (Obviously, any path contains a simple path, which can be constructed by removing the pairs that lead from a repeated vertex back to that same vertex and repeating until there are no such repetitions.) A simple path for which the starting and ending vertices are different and which visits every vertex is called a Hamiltonian Path.
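The construction of a simple path from an arbitrary path can be sketched in code. This is only an illustration under some assumptions: the path is given as a list of the vertices it visits (so consecutive entries are the pairs), the start and end vertices are different, and the name simplify_path is hypothetical.

```python
def simplify_path(vertices):
    """Return a simple path with the same endpoints as the given path.

    Assumption: `vertices` lists the vertices in the order visited and
    the first and last vertices are different.
    """
    simple = []      # the simple path built so far
    position = {}    # vertex -> its index in `simple`
    for v in vertices:
        if v in position:
            # v was seen before: drop the loop that left v and came back.
            cut = position[v]
            for dropped in simple[cut + 1:]:
                del position[dropped]
            del simple[cut + 1:]
        else:
            position[v] = len(simple)
            simple.append(v)
    return simple


# The path 0,1,2,1,3 revisits vertex 1; removing the loop 1,2,1 leaves 0,1,3.
print(simplify_path([0, 1, 2, 1, 3]))
```

Every consecutive pair in the result was already a consecutive pair somewhere in the input, so the result uses only edges of the original path.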

A cycle is a path from a node u back to itself - that is, a sequence of pairs of the form
     <u,v1> <v1,v2>...<vk,vk+1><vk+1,u>
That is, a sequence of pairs which, if "followed", would lead step by step from vertex u back to itself. A cycle is called a simple cycle if it is also simple - that is, no vertex except the first and last appears more than once. (It is obvious that for any non-simple cycle, you can create a simple cycle by removing the pairs that return to a repeated vertex.) A Hamiltonian Cycle is a simple cycle that hits every node in the graph.

A graph (or directed graph) is said to be connected if there is a path between any two nodes in the graph (respectively directed graph). A directed graph is said to be strongly connected if for every two vertices u and v there is a path from u to v and a path from v to u. (If there is a path from u to v and a path from v to u in a directed graph, then we say that u and v are mutually reachable.)

A component of a graph (or directed graph) containing node v is the largest connected set of vertices containing v. Every graph (or directed graph) can be broken down into a set of disjoint components. (Note that the set of vertices in a component and the set of edges connecting vertices in the component form a new connected graph - a subgraph of the original graph.) A strong component of a directed graph G containing vertex v is the largest set of vertices in G which contains v and such that if u is another vertex in that set, then u and v are mutually reachable. (Note that the set of vertices in a strong component and the set of edges connecting vertices in that set form a new strongly connected graph.)
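The definition of a strong component can be turned directly into a (not especially fast) computation: collect the vertices reachable from v, collect the vertices from which v is reachable, and intersect the two sets. The sketch below assumes vertices are numbered 0 to n-1 and that the graph is given as a list of neighbor lists; the names reachable_from and strong_component are hypothetical.

```python
def reachable_from(adjacency, start):
    """Set of vertices reachable from `start` by following edges."""
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for u in adjacency[v]:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return seen


def strong_component(adjacency, v):
    """Vertices u such that u and v are mutually reachable."""
    n = len(adjacency)
    # Reverse every edge so that "reachable from v" in the reversed
    # graph means "can reach v" in the original graph.
    reverse = [[] for _ in range(n)]
    for a in range(n):
        for b in adjacency[a]:
            reverse[b].append(a)
    return reachable_from(adjacency, v) & reachable_from(reverse, v)


# Edges 0->1, 1->2, 2->0, 2->3: vertices 0, 1, 2 lie on a cycle; 3 does not.
adjacency = [[1], [2], [0, 3], []]
print(strong_component(adjacency, 0))
print(strong_component(adjacency, 3))
```

Each call does two searches, so it costs on the order of n+m for a single vertex v.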

A tree is a connected graph for which there are no cycles. In such a graph if you choose any node and call it the root, then there is exactly one path from the root to every other node which does not repeat any vertex.

A forest is a graph with the property that each component is a tree.

A traversal of a graph is an algorithm that somehow visits each node of a graph once.

An adjacency list representation of a graph G=(V,E) is an array A of n=|V| lists, one list per vertex, such that v is in the list A[u] if and only if <u,v> is an edge (i.e., is in the edge set E).
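As a concrete illustration, an adjacency list can be built from a list of edges in one pass. This is a minimal sketch that assumes the vertices are numbered 0 to n-1; the function name build_adjacency_list and the directed flag are made up for this example.

```python
def build_adjacency_list(n, edges, directed=True):
    """Return a list A of n lists with v in A[u] iff <u,v> is an edge.

    Assumption: vertices are the integers 0..n-1 and `edges` is a list
    of (u, v) pairs with no repeats.
    """
    adjacency = [[] for _ in range(n)]
    for u, v in edges:
        adjacency[u].append(v)
        if not directed:
            # An undirected edge is stored in both directions.
            adjacency[v].append(u)
    return adjacency


print(build_adjacency_list(3, [(0, 1), (0, 2)]))
print(build_adjacency_list(3, [(0, 1), (0, 2)], directed=False))
```

Building the structure touches each vertex once and each edge once, so construction takes time of order n+m.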

A breadth-first traversal of a graph is the following algorithm. (From here on we will assume that the vertices are labeled with the numbers from 0 to n-1 where n=|V| is the number of vertices.)

Copy the set of vertices from V into a set T
Create an array of booleans of size n called Marked and set every element to false
Create Queue Q
While T is not empty
    Take an element w from T and enqueue it in Q and set Marked[w]=true
    While Q is not empty
        Dequeue the first node v from Q
        For every node u such that <v,u> is an edge and for which Marked[u]=false
            Enqueue u in Q and set Marked[u]=true
        end For
        Remove v from T
    end While
end While
    
Notice that if our graph is represented by an adjacency list, then this algorithm takes time roughly Cm+Dn, where m is the number of edges and n is the number of vertices, and is therefore of time order m+n. Notice also that each pass through the outer loop visits one connected component, so if the outer loop increments a counter nComponents on each pass, then the graph was connected exactly when nComponents ends up being 1.
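The breadth-first traversal and the component counter might look as follows in Python. This is a sketch rather than a definitive implementation: it assumes an undirected graph stored as a list of neighbor lists, and instead of maintaining the set T it scans the vertices 0 to n-1 and skips the ones already marked, which plays the same role. The name bfs_components is hypothetical.

```python
from collections import deque


def bfs_components(adjacency):
    """Breadth-first traversal; returns (visit order, number of components)."""
    n = len(adjacency)
    marked = [False] * n
    order = []
    n_components = 0
    # Scanning w = 0..n-1 and skipping marked vertices stands in for
    # "take an element w from T" in the pseudocode.
    for w in range(n):
        if marked[w]:
            continue
        n_components += 1
        marked[w] = True
        queue = deque([w])
        while queue:
            v = queue.popleft()
            order.append(v)
            for u in adjacency[v]:
                if not marked[u]:
                    marked[u] = True
                    queue.append(u)
    return order, n_components


# Two components: {0, 1, 2} and {3, 4}.
adjacency = [[1, 2], [0], [0], [4], [3]]
order, n_components = bfs_components(adjacency)
print(order, n_components)
```

Each vertex is enqueued once and each adjacency list is scanned once, which is where the order m+n running time comes from.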

A depth-first traversal of a graph is the following algorithm:
OverallDFS()
    Create array Marked of length n=|V| and set all elements of Marked false
    Copy each element of V to set T
    While there is at least one element u left in T
             DFS(u)
    end While
end OverallDFS
      
DFS(u):
    Set Marked[u]=true
    Remove u from T
    for each edge <u,v>
       if !Marked[v]
          DFS(v)
       end if
    end for
end DFS

The book chooses to implement set T as a stack. This is probably a mistake, since doing so means that constructing the tree of nodes as they are added to T requires a Parent array. If you implement T as a queue, then the tree is "as you generate it".

Notice that each call to DFS inside OverallDFS adds a component to the traversal.

This means that OverallDFS leads to a trivial algorithm to test if a graph is connected. Simply change OverallDFS so that it sets a variable nComponents to 0 before the loop and adds 1 each time through the loop. If, after the loop is finished, nComponents is 1, then the graph was connected. As with breadth-first traversal, if the graph is represented using an adjacency list, then the algorithm is of order n+m where n is the number of vertices and m is the number of edges.
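A possible Python rendering of OverallDFS with the component counter is sketched below. It assumes an undirected graph stored as a list of neighbor lists, replaces the set T by a scan over the unmarked vertices, and uses plain recursion, so very deep graphs could hit Python's recursion limit. The name dfs_components is hypothetical.

```python
def dfs_components(adjacency):
    """Depth-first traversal; returns (visit order, number of components)."""
    n = len(adjacency)
    marked = [False] * n
    order = []

    def dfs(u):
        marked[u] = True
        order.append(u)
        for v in adjacency[u]:
            if not marked[v]:
                dfs(v)

    n_components = 0
    # Each top-level call to dfs visits exactly one component.
    for u in range(n):
        if not marked[u]:
            n_components += 1
            dfs(u)
    return order, n_components


# Component {0, 1, 2, 3} plus the isolated vertex 4.
adjacency = [[1, 2], [0, 3], [0], [1], []]
order, n_components = dfs_components(adjacency)
print(order, n_components)
print("connected" if n_components == 1 else "not connected")
```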

A Directed Acyclic Graph (usually written DAG) is a directed graph with no cycles. A topological sort of a DAG is a re-ordering of its vertices so that if v1,v2,...,vn is the new ordering then <vi,vj> is an edge only if i<j. The following is an algorithm for computing a topological sort of a given DAG.

TopologicalSort(G=(V,E))
    Set i=0
    Create a new empty queue Q
    Create an array InDegree of length n=|V|
    For v=0 to n-1
       InDegree[v]=0
    endFor
    For v=0 to n-1
       for each u in Adjacency[v]
          InDegree[u]++
       endFor
    endFor
    For v=0 to n-1
       if(InDegree[v]==0)
         enqueue v in Q
       endIf
    endFor
    Construct an array NewOrdering of length n.
    While Q is not empty
        dequeue element v from Q
        Set NewOrdering[i]=v
        i=i+1
        Set InDegree[v]=-1
        For every u in Adjacency[v]
            subtract 1 from InDegree[u]
            If InDegree[u]==0
                enqueue u in Q
            end If
        end For
    end While
    At this point, NewOrdering is a topological sort if G was a DAG
end TopologicalSort

Interestingly, this algorithm (of order m+n) can be used to test whether a directed graph is a DAG. If there is no initial vertex v with InDegree[v]==0, then the graph is not a DAG, and if at the end some element of the array InDegree is not <0 (that is, some vertex was never dequeued), then the graph is not a DAG.
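The queue-based topological sort and the DAG test can be combined in one function. The sketch below makes a small substitution, stated plainly: rather than setting InDegree[v]=-1 and inspecting the array afterwards, it checks whether all n vertices made it into the ordering, which detects the same condition. The graph is assumed to be a list of out-neighbor lists over vertices 0 to n-1, and the name topological_sort is hypothetical.

```python
from collections import deque


def topological_sort(adjacency):
    """Return a topological ordering, or None if the graph has a cycle."""
    n = len(adjacency)
    in_degree = [0] * n
    for v in range(n):
        for u in adjacency[v]:
            in_degree[u] += 1
    # Start with every vertex that has no incoming edges.
    queue = deque(v for v in range(n) if in_degree[v] == 0)
    new_ordering = []
    while queue:
        v = queue.popleft()
        new_ordering.append(v)
        for u in adjacency[v]:
            in_degree[u] -= 1
            if in_degree[u] == 0:
                queue.append(u)
    # A vertex on a cycle never reaches in-degree 0, so it is never output.
    if len(new_ordering) < n:
        return None
    return new_ordering


print(topological_sort([[1, 2], [3], [3], []]))  # edges 0->1, 0->2, 1->3, 2->3
print(topological_sort([[1], [2], [0]]))         # a 3-cycle, so not a DAG
```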

Note further that adding indegree and outdegree to the adjacency list will not change the order of the time required to build the adjacency list in the first place.