### Recommended Links

Like this piece? Share it with your friends:

I recently received an email from an audience of my blog on Map/Reduce algorithm design
regarding how to detect whether a graph is acyclic using Map/Reduce. I
think this is an interesting problem and can imagine there can be wide
range of application to it.

Although I haven't solved this exact problem in the past, I'd like to
sketch out my thoughts on a straightforward approach, which may not be
highly optimized. My goal is to invite other audience who has solved
this problem to share their tricks.

To define the problem: Given a simple directed graph, we want to tell whether it contains any cycles.

Lets say the graph is represented in incidence edge format where each
line represent an edge from one node to another node. The graph can be
split into multiple files.

node1, node2
node4, node7
node7, node1
....

Here is a general description of the algorithm.

We'll maintain two types of graph data.

- A set of link files where each line represent an edge in the graph. This file will be static.
- A set of path files where each line represent a path from one node
to another node. This file will be updated in each round of map/reduce.

The general idea is to grow the path file until either the path cannot
grow further, or the path contain a cycle, we'll use two global flags to
keep track of that: "change" flag to keep track of whether new path is
discovered, and "cycle" flag to keep track whether a cycle is detected.

The main driver program will create the initial path file by copying the
link file. And it will set the initial flag condition: Set cycle and
change flag to true. In each round, the driver will

- Check if the cycle is still false and change is true, exit if this is not the case
- Otherwise, it will clear the change flag and invoke the map/reduce

Within each round, the map function will do the following

- Emit the tuple [key=toNode, value=fromNode] if it is a path
- Emit the tuple [key=fromNode, value=toNode] if it is a link

This ensure a path and all extended link reaches the same reducer, which will do the following

- Check to see if the beginning point of the path is the same as the
endpoint of the link. If so, it detects a cycle and mark the cycle flag
to be true.
- Otherwise, it compute the product of every path to every link, and form the new path which effectively extend the length by one.

The following diagram illustrates the algorithm ...