Traversing Graphs

When we traverse a graph -- just like traversing a tree (which is really just a special kind of graph) -- we visit each vertex in the graph exactly once. Also like trees, traversing graphs can be done in different ways. One particularly important means for traversing a connected graph is called a depth-first search. Another is a breadth-first search. We will examine both of these, in turn.

Depth-First Searches

As a guiding metaphor, suppose you have discovered an old abandoned network of underground bunkers linked by long (and mostly dark) tunnels. Some of these bunkers have multiple tunnels connected to them, others have only one.

You wish to search through the bunkers for anything valuable that can be found. Checking the first couple of tunnels, however, it quickly becomes apparent that 1) there are a LOT of these bunkers connected together; and 2) apart from the possible difference in the number of tunnels to which they connect -- these bunkers look largely identical to one another.

Not wanting to get lost repeatedly exploring the same bunkers over and over again, you decide to mark which bunkers you have visited as you conduct your search. Fearful that rodents might abscond with any breadcrumbs you might leave behind to this end, you grab a can of luminous spray paint instead that you just so happened to conveniently bring with you.

As you visit each bunker, you paint a large luminous number on its wall -- 1 for the first bunker visited, 2 for the next, and so on. In this way, should you ever run into a dead end or a bunker whose other tunnels all lead to bunkers that have already been visited, you can always backtrack to the bunker most recently explored that still had unexplored bunkers connected to it. Should such backtracking take you all the way back to where you began and you see that all of the connecting tunnels have been visited, then there is nothing left for you to explore.

If you never backtrack unless you have to -- and instead always push yourself deeper into the unknown by traversing the next unexplored tunnel, you are said to be conducting a depth-first search. The analogy for graphs should hopefully be obvious -- bunkers are vertices and tunnels are edges. Starting at a given vertex, one conducts a depth-first search of a connected graph by exploring as far as possible along untaken edges before backtracking.

Depth-first searches can be implemented either with stacks or recursion.

In both, instead of a can of spray paint -- one typically uses an array of booleans to indicate which vertices have been "visited". Supposing for a moment that we named this array visited[]. Recalling that in our representation for graphs, we represented vertices by integers from $0$ to $(n-1)$ for a graph of $n$ vertices, then visited[] should have $n$ elements, and visited[v] should be true if $v$ has been visited and false otherwise.

With this in mind, here is the process for conducting a depth-first search of a connected graph using a stack:

  1. Push a starting vertex, s, onto the stack.

  2. Repeat the following until the stack is empty:

    1. Remove the top vertex from the stack, call it $v$

    2. Mark vertex $v$ as "visited".

    3. Add all $v$'s unvisited "neighbors" (i.e. vertices adjacent to $v$) to the stack

Given the natural relationship between stacks and recursion, we can implement a depth-first search even more quickly using recursion:

  1. Create a recursive function dfs(G,v) that marks $v$ as visited and then recursively calls dfs(G,u) for every vertex $u$ adjacent to $v$ that has not already been marked visited.
  2. Then one just kicks off the search by calling dfs(G,s) for some starting vertex $s$.

Breadth-First Search

Source: Cranky Bear by Nick Bland
Returning to our metaphor of bunkers and tunnels, suppose that after a few minutes of thinking about things you realize that there might be a downside to so quickly getting oneself so deep into the network of bunkers. It's clear that nobody has been in these bunkers for a very long time. What if one of nature's critters -- like a bear -- has taken up residence inside? Clearly, one would want to explore these bunkers in such a way that one could get back to the entrance as quickly as possible, if needed!

For simplicity, let us assume that all of the tunnels are the same length. This allows us to measure the length of the path from any bunker back to the starting bunker in terms of how many tunnels are traversed.

So that we keep a minimum distance between us and the entrance at all times as we explore the network of bunkers, we'll modify our search method in the following way:

We start by painting the number $1$ on our first bunker.

Then, mustering all our courage for the task ahead, we run down each of its tunnels incident to some unvisited bunker; quickly paint successive numbers (e.g., $2,3,4,\ldots$) on these bunkers, marking them as visited; and quickly run back.

Finally, we paint a big X over the current bunker's number so that we never have to do that again from this bunker.

Then, we navigate to the bunker with the smallest number that hasn't yet been X'd out, and repeat the process until there are no such bunkers left.

By picking the smallest numbered bunker that hasn't yet been X'd out, we ensure the distance to the entrance is kept at a minimum. As such, the order in which the bunkers are visited first includes all of the bunkers one tunnel away from the entrance, and then all bunkers two tunnels away from the entrance, then three tunnels away, and so on.

This is called a breadth-first search. Interestingly, if we chose to next explore the bunker with the largest number that hasn't yet been X'd out, we would have a depth-first search again.

Turning our attention back to the universe of graph traversals instead of bunkers and tunnels -- a breadth-first search of a connected graph is a search where one begins at some starting vertex and explores all vertices one edge away from the start, and then two edges away, three edges away, and so on -- by choosing from the edges connected to unvisited vertices yet to be traversed the one that was least recently encountered.

Breadth-first searches, which examine vertices in increasing distance (in terms of number of edges) from the starting vertex, can be implemented with a queue. The process for doing so is shown below:

  1. Enqueue a starting vertex, $s$, marking it as visited

  2. Repeat the following until the queue is empty:

    1. Dequeue the least recently added vertex $v$

    2. Enqueue each of $v$'s unvisited neighbors, marking each as visited

Applications

Breadth-first and depth-first traversals of a graph lay at the heart of algorithms that answer many important questions about graphs.

For example, if when one marks each unvisited neighbor of a vertex $v$ as visited in a breadth first search, one also records $v$ as the vertex that should precede each one of these neighbors, one fimds a path of shortest length between the starting vertex used in the search and any other vertex in the same connected component of the graph.

As another example, if one has a graph that possibly isn't connected, one can partition the vertices into connected components using a depth-first search. To do this, one changes the visited[] array from a boolean array to an array of integers. Then, one conducts a depth-first search on vertex $0$, marking visited vertices by storing a $1$ in this array. When the search is done, one has identified all of the vertices in the connected component containing $0$. If there are any vertices left, pick one and conduct a depth-first search from this new vertex, marking visited vertices by storing a $2$ in the visited[] array. When this search finishes, one has identified all of the vertices in a second connected component. If there are still vertices left, we pick one of these and do the same process. Continuing in this way, we proceed until no vertices are left -- at which point we have identified by the integers used in the visited[] array all of the connected components of the graph.

Similar uses of breadth-first and depth-first searches can determine if there are any cycles in a graph, or if a graph is bipartite (i.e., a graph whose vertices can be partitioned into two sets where no vertices of the same set are adjacent). These are but a paltry few of the problems that can be solved using these two important graph traversal methods.