Chapter 12 Slides (Part 1)

« previous | Thursday, March 31, 2011 | next »

Graph ADT

Vertices: nodes that contain a label
Edge: connects two vertices and contains data

Examples:

Road maps
map of flight paths between airports
Networks
Social Networking Sites
Database relations

Edge Types

Directed: edges have arrows (can only be traversed in that direction); a graph with all directed edges is called a directed graph or digraph
Undirected: edges have no direction; a graph with all undirected edges is called an undirected graph
Weighted: edges store values (e.g. distance) about the relationships between vertices
Unweighted: edges only connect vertices

Terminology

Dense: many, many edges
Sparse: very few edges

For Undirected Graphs

Endpoints: two vertices of a given edge
Edges incident on a vertex: Edges that are connected to a given vertex
Adjacent vertices: two vertices that are connected by an edge
Parallel: for two nodes connected by two edges (double-bond in chemistry)
Self-loop: Any edge that starts and ends at the same vertex
Degree

Adding Directed edges

Outgoing edges: edges pointing away from a given vertex
Incoming edges: edges pointing to a given vertex
In-degree: how many edges come to a given vertex
Out-degree: how many edges go out from a given vertex

Traversal

Path: sequence of alternating vertices and edges; begins and ends with a vertex; edge preceded and followed by endpoints
Simple path: No vertex is crossed twice
Cycle: Starts and ends with same vertex
Simple cycle: no vertex is crossed twice

Mathematical Properties

$n$ is # of vertices
$m$ is # of edges
$\deg(v)$ is degree of vertex $v$

Undirected Graphs

$\textstyle \sum _{v}\deg(v)=2m$
$0\leq m\leq {\tfrac {n(n-1)}{2}}$

Directed Graph

$\textstyle \sum _{v}\mathrm {indeg} (v)=m$
$\textstyle \sum _{v}\mathrm {outdeg} (v)=m$
$0\leq m\leq n(n-1)$

Functions

Vertices and edges are both Positions

Accessor Methods

aVertex()
incidentEdges(v)
degree(v)
endVertices(e)
opposite(v,e)
areAdjacent(v,w)

Directed Accessors

isDirected(e)
origin(e)
destination(e)

Update Methods

insertVertex(o)
insertEdge(v,w,o)
insertDirectedEdge(v,w,o)
removeVertex(v)
removeEdge(e)

Generic Methods

numVertices()
numEdges()
vertices()
edges()

Implementations

Edge List

Store ordered pairs of edges containing their vertices and weight. Store separate list of vertices

Adjacency List Structure

(best data structure for graph)

Similar to edge list, but vertex list (hash table with vertices as keys) stores ordered pairs of adjacent vertices

Adjacency Matrix

(good for dense graphs where vertexes are not updated frequently)

Store table of size vertices × vertices with data inside it. Keep the vertex and edge sequences/lists

Performance

Property	Edge List	Adjacency List	Adjacency Matrix
Space	$O(n+m)$	$O(n+m)$	$O(n^{2})$
`incidentEdges(v)`	$O(m)$	$O(\deg(v))$	$O(n)$
`adjacentVertices(v)`	$O(m)$	$O(\deg(v))$	$O(n)$
`areAdjacent(v,w)`	$O(m)$	$O(\min(\deg(v),\deg(w)))$	$O(1)$
`insertVertex(o)`	$O(1)$	$O(1)$	$O(n^{2})$
`insertEdge(v,w,o)`	$O(1)$	$O(1)$	$O(1)$
`removeVertex(v)`	$O(m)$	$O(\deg(v))$ *	$O(n^{2})$
`removeEdge(e)`	$O(1)$	$O(1)$ *	$O(1)$

we could store pointers to both adjacency ordered pairs under each of the vertex's hash chain, so we would have access to any adjacent vertices from any edge.

Subgraphs

Thursday, April 7, 2011

Subgraph is a subset of vertices and edges in the graph

Called a spanning subgraph if it contains all vertices, but not all edges

"Connected graph" has edges between two subgraphs

A "Connected component" is a cluster of vertices that are connected by edges

Trees and Forests

A tree is an undirected graph such that it

is connected
has no cycles

A forest is graph that is a collection of trees

Spanning trees and forests are spanning subgraphs (contains all vertices) that satisfy tree/forest definitions

Depth-First Search

Example Depth-First Search

A traversal of a graph (like Euler tour is to binary trees):

visit all vertices and edges: Runs in $O(n+m)$ time.
determines whether a graph is connected
computes connected components
computes a spanning forest
finds any cycles

Algorithm DFS(Graph G, Vertex v)
  Input: Graph and starting vertex
  Output: Labeling of edges of G in the connected component of v
  setLabel(v, VISITED)
  for all e in G.incidentEdges(v)
    if getLabel(e) = UNEXPLORED
    w = opposite (v,e)
    if getLabel(w) = UNEXPLORED
      setLabel(e, DISCOVERY)
      DFS(G, w)
    else
      setLabel(e, BACK)
    end
  end
end

Algorithm DFS(Graph G)  // Driver function
  for all u in G.vertices()
    setLabel(u, UNEXPLORED)
  end
  for all e in G.edges()
    setLabel(e, UNEXPLORED)
  end
  for all v in G.vertices()
    if getLabel(v) = UNEXPLORED
      DFS(G,v)
    end
  end
end

Analysis

for all loop in DFS(G, v) is $O(\deg(v))$ and decreases each time as more vertices are visited

Loops in DFS(G) are $O(m)+O(n)+O(m)$

Path Finding

Tuesday, April 12, 2011

Perform Depth-First Search to find path from vertex $v$ to vertex $z$

Algorithm pathDFS(G, v, z, S = Stack())
  setLabel(v, VISITED)
  S.push(v)
  if (v==z) return S.elements()
  for e in G.incidentEdges(v)
    if (getLabel(e) == UNEXPLORED)
      setLabel(e, DISCOVERY)
      S.push(e)
      pathDFS(G, w, z, S)
      S.pop()
    else
      setLabel(e, BACK)
    end
  end
end

Adjacency List: $O(n+m)$ Adjacency Matrix: $O(n^{2})$

Cycle Finding

Perform DFS to find cycle from a vertex $v$ looking for a back edge

Algorithm cycleDFS(G, V, S = Stack())
  setLabel(v, VISITED)
  S.push(v)
  for e in G.incidentEdges(v)
    if (getLabel(e) == UNEXPLORED)
      w = opposite(v,e)
      s.push(e)
      if (getLabel(w) == UNEXPLORED)
        setLabel(e, DISCOVERY)
        pathDFS(G,w)
        S.pop(e)
      else
        T = Stack()
        do
          o = S.pop()
          T.push(S.pop())
        while (o!=w)
        return T.elements()
      end
    end
  end
  S.pop()
end

Breadth-First Search

Algorithm BFS(Graph G)  // Driver function
  for u in G.vertices()
    setLabel(u, UNEXPLORED)
  end
  for e in G.edges()
    setLabel(e, UNEXPLORED)
  end
  for v in G.vertices()
    if (getLabel(v) == UNEXPLORED)
      BFS(G,v)
    end
  end
end

Algorithm BFS(Graph G, Vertex s)
  L[0] = new Sequence()
  L[0].insertLast(s)
  setLabel(s, VISITED)
  i = 0
  while !L[i].isEmpty()
    L[i+1] = new Sequence()
    for v in L[i].elements()
      for e in G.incidentEdges(v)
        if (getLabel(e) == UNEXPLORED)
          w = opposite(v,e)
          if (getLabel(w) == UNEXPLORED)
            setLabel(e, DISCOVERY)
            setLabel(w, VISITED)
            L[i+1].insertLast(w)
          else
            setLabel(e, CROSS)
          end
        end
      end
    end
    i++
  end
end

Minimum Spanning Trees

A spanning tree that minimizes the total weights (sum) of all edges

Cycle Property: Adding an edge to a spanning tree will create a cycle. Construct by removing the largest edge of every cycle. If 2 edges in cycle are equal, there are multiple unique minimum spanning trees

Partition Property: Partitioning a spanning subtree will identify one connecting edge. That edge will have the least value of all edges between the two vertex groups.

Prim-Jarnik's Algorithm

Pick a random vertex $s$ . We will grow a minimum spanning tree from that vertex.
Look at all edges connecting the current tree to any other node in the graph (but not in the cloud yet). Include the edge with the smallest weight and add its adjacent vertex to the cloud
Repeat step 2 until all nodes are included

Uses the partition property to connect two groups of nodes

Implementation

A priority queue stores vertices outside the cloud (key: distance, element: vertex)
store distance (weight from current node to antoher other node), parent edge (in min tree), and locator (in priority queue) with each vertex

Algorithm PrimJarnikMST(G)
  Q = HeapPriorityQueue()
  s = G.aVertex()
  for (v in G.vertices)                { O(n) }
    if (v == s) setDistance(v, 0)
    else setDistance(v, INFINITY)      // original distance is infinity
    setParent(v, NULL)
    l = Q.insert(getDistance(v), v)    // add vertex to priority queue
    setLocator(v, l)
  end
  while (!Q.isEmpty())                 { O(n) }
    u = Q.removeMin()                    { O(log n) }
    for (e in G.incidentEdges(u))        { O(deg(u)) + O(m) }
      z = G.opposite(u, e)
      r = weight(e)
      if (r < getDistance(z))            // find and set distance from node u to node z
        setDistance(z, r)
        setParent(z, e)
        Q.replaceKey(getLocator(z), r)   { O(log n) }
      end
    end
  end
end

n × T(:removeMin, n)
m × T(:replaceKey, n)
m × T(:change_data)
n × T(:incidentEdges, n, m)

Total time: $O(n\log {n}+m\log {n}+m+n\deg(u))=O(n\log {n}+m\log {n})$ (assuming Adjacency List and Heap Priority Queue) Note: $n\deg(u)=2m$

Kruskal's Algorithm

Order all edges by weight
Add smallest edge to cloud if it does not create a cycle
Repeat for all edges

Implementation

Priority Queue stores edges outside cloud (key: weight, element: edge)

Algorithm KruskalMST(G)
  for (v in G.vertices())
    define Cloud(v)
  end
  Q = PriorityQueue()
  for (e in G.edges())          { O(m) }
    Q.insert(weight(e), e)      { O(log m) }
  end
  T = Graph()
  while (T.size() < G.size()-1)   { O(n) }
    e = T.removeMin()               { O(log n) }
    (u,v) = endpoints(e)
    if (Cloud(v) != Cloud(u))
      T.insert(e)
      merge(Cloud(v), Cloud(u))     // replace with union(u,v)
    end
  end
  return T
end

Partition Data Structure

Stores a collection of separate sets.

Each set stored in a sequence and each element in the set has a reference back to its set (so find() is fast)

An element will always be added to a set that contains at least 2× the number of elements, so it will be processed $O(\log {n})$ times.

Functions:

find(u) returns the set storing u ( $O(1)$ )
union(u,v) replace the sets storing u and v with their union ( $O(\min(n_{u},n_{v}))$ )

Shortest Path

Given a weighted graph and two vertices, what is the shortest path from u to v (minimum weight)

Shortest path is not contained in the Minimum Spanning Tree

Shortest Path Tree

Not contained in the Minimum Spanning Tree — violates the cycle property

Properties:

A subpath of a shortest path is itself a shortest path
There is a tree of shortest paths from a start vertex to any other vertex

Dijkstra's Algorithm

Very similar to Prim-Jarník's Algorithm

Assumptions:

Graph is connected
Edges are undirected
Edge weights are nonnegative

Grow a cloud of vertices starting with start vertex. Store a label at each vertex (initialized to ∞) that represents the distance from start vertex to that vertx.

Look at vertices outside of cloud, and add the shortest one.

Note: distances from start vertex to vertices in cloud will always be less than distances from cloud to external vertices

Algorithm DijkstraDistances(G, s)
  Q = HeapPriorityQueue();
  for v in G.vertices();               { O(n) }
    if (v==s) setDistance(v, 0)
    else setDistance(v, INFINITY)
    l = Q.insert(getDistance(v), v)      { O(log n); if bottom-up, this entire loop is O(n) }
    setLocator(v,l)
  end
  while !Q.isEmpty()                   { O(n) }
    { pull new vertex into cloud }
    u = Q.removeMin()
    for e in G.incidentEdges(u)          { O(deg(u)); Total: O(sum(deg(u)) = O(m)}
      { all steps in this loop called "relaxing edge" e }
      z = G.opposite(u, e)
      r = getDistance(u) + weight(e)     // only difference between Prim's and Dijkstra's
      if (r < getDistance(z))
        setDistance(z, r)
        setParent(z, u)                    // helps us reconstruct traversal path
        Q.replaceKey(getLocator(z), r)     { O(log n) }
      end
    end
  end
end

Not in the code: Keep track of the edges that give the distances stored in the Queue

Overall times for Adjacency List/Matrix and Heap/UnsortedSequence Priority Queue:

[PQ] n × insert() init = { Heap: $O(n)$ (bototm-up), UnSeq: $O(n)$ }
[Graph] n × incidentEdges() = { AL: $O(m)$ , AM: $O(n^{2})$ }
[PQ] n × removeMin() = { Heap: $O(n\log {n})$ , UnSeq: $O(n^{2})$ }
[PQ] O(m) × replaceKey() = { Heap: $O(m\log {n})$ , UnSeq: $O(m)$ }

	Adjacency Matrix	Adjacency List
Heap	$O(n^{2}+m\log {n})$	$O(n\log {n}+m\log {n})$
Unsorted Sequence	$O(n^{2}+m)=O(n^{2})$	$O(n^{2}+m)=O(n^{2})$

Bellman-Ford Algorithm

Works even with negative-weight edges! (must assume directed edges)

Algorithm BellmanFord(Graph G, Vertex s)
  for v in G.vertices()
    if (v == s) setDistance(v, 0)
    else setDistance(v, INFINITY)
  end
  for i from 1 to G.numVertices()-1  { O(n) }
    for e in G.edges()                 { AL: O(m), AM: O(n^2) }
      if (getDistance(u) == INFINITY) continue
      u = G.origin(e)
      z = G.opposite(u,e)    { z is destination }
      r = getDistance(u) + weight(e)
      if (r < getDistance(z))
        setDistance(z,r)
        setParent(z, u)
      end
    end
  end
end

Analysis

Adjacency List: $O(nm)$ Adjacency Matrix: $O(n^{3})$

All-Pairs Shortest Path

Find shortest path tree for every vertex:

3 Options:

Call Dijkstra's Algorithm n times $O(nm\log {n})$
Call Bellman-Ford's algorithm n times $O(n^{2}m)$
Another algorithm we won't cover…

CSCE 221 Chapter 12

Contents

Graph ADT

Edge Types

Terminology

Mathematical Properties

Functions

Implementations

Edge List

Adjacency List Structure

Adjacency Matrix

Performance

Subgraphs

Trees and Forests

Depth-First Search

Analysis

Path Finding

Cycle Finding

Breadth-First Search

Minimum Spanning Trees

Prim-Jarnik's Algorithm

Implementation

Kruskal's Algorithm

Implementation

Partition Data Structure

Shortest Path

Shortest Path Tree

Dijkstra's Algorithm

Bellman-Ford Algorithm

Analysis

All-Pairs Shortest Path

Navigation menu

Search