
A Simple Clustered Task Distribution System

10.06.2008

Conclusion

We implemented a simple, highly decentralized, clustered task distribution system in roughly 500 lines
of code and 5 classes. The system is failure-resilient, because all nodes are peers and there is no central
server.

All peers are equal (every peer can act as both master and slave), and tasks are grabbed by a node based
on an ID assigned by the submitter (master).

Crashes or graceful termination of nodes do not lead to lost tasks: the system re-balances itself
and assigns orphaned tasks to another node in the cluster.
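The grab rule described above can be illustrated with a minimal sketch. The modulo-based rule and the names below are simplified assumptions based on the article's description, not the actual demo code: the node whose rank in the current view equals the task's ID modulo the view size executes the task, and when the view shrinks, the same rule re-assigns orphaned tasks.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of "everyone sees the task, exactly one node executes it".
// A task carries an ID assigned by the submitter; the node whose rank in the
// current view equals id % view.size() runs it. When the view changes (crash
// or graceful leave), ranks shift and orphaned tasks land on a survivor.
public class RankAssignment {

    // Should the node at myRank execute the task with this ID,
    // given the current cluster view size?
    static boolean shouldExecute(int taskId, int myRank, int viewSize) {
        return taskId % viewSize == myRank;
    }

    public static void main(String[] args) {
        List<String> view = new ArrayList<>(List.of("A", "B", "C"));
        int taskId = 7;                        // 7 % 3 == 1 -> node "B"
        for (int rank = 0; rank < view.size(); rank++)
            if (shouldExecute(taskId, rank, view.size()))
                System.out.println(view.get(rank) + " executes task " + taskId);

        view.remove("B");                      // "B" crashes: view is now [A, C]
        for (int rank = 0; rank < view.size(); rank++)
            if (shouldExecute(taskId, rank, view.size()))
                System.out.println(view.get(rank) + " takes over task " + taskId);
        // 7 % 2 == 1 -> node "C" picks up the orphaned task
    }
}
```

Because every node applies the same deterministic rule to the same view, no coordination with a central server is needed to decide who takes over.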
The system is so small because it runs on top of JGroups. Had we written it without JGroups, we would
have had to implement the following functionality ourselves:

  • Cluster membership and failure detection: we need to know when the membership changes, and
    all nodes in a cluster need to receive these views in exactly the same order
  • Multicasting (with UDP datagrams): fragmentation (if a task or result is greater than 65K),
    retransmission (UDP datagrams can be lost), and suppression of duplicate messages (we must not receive the same task multiple times!)
  • Simple switching between transports (UDP, TCP), and configuration/customization of the
    transport: e.g. adding compression or encryption
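To give a taste of what the fragmentation bullet means, here is a self-contained sketch (not JGroups code; JGroups' fragmentation protocols handle this transparently, together with retransmission and de-duplication). It splits a payload into chunks that each fit under the ~65K UDP datagram limit and reassembles them:

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Toy illustration of message fragmentation: split a payload into chunks that
// each fit into a UDP datagram, then reassemble them on the receiving side.
public class FragmentationSketch {
    static final int MAX_CHUNK = 60_000;   // stay safely below the ~65K UDP limit

    static List<byte[]> fragment(byte[] payload) {
        List<byte[]> chunks = new ArrayList<>();
        for (int off = 0; off < payload.length; off += MAX_CHUNK) {
            int len = Math.min(MAX_CHUNK, payload.length - off);
            byte[] chunk = new byte[len];
            System.arraycopy(payload, off, chunk, 0, len);
            chunks.add(chunk);
        }
        return chunks;
    }

    static byte[] reassemble(List<byte[]> chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] c : chunks) out.write(c, 0, c.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] task = new byte[150_000];                    // serialized task > 65K
        List<byte[]> chunks = fragment(task);
        System.out.println(chunks.size() + " fragments");   // 3 fragments
        System.out.println(reassemble(chunks).length);      // 150000
    }
}
```

A real implementation would also tag each chunk with a message ID and sequence number so lost chunks can be retransmitted and duplicates discarded; that bookkeeping is exactly what JGroups saves us from writing.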

The current task distribution system is far from complete (after all, this is just a demo of what can be done with JGroups!); possible further improvements include:

  • Implementation of java.util.concurrent.ExecutorService. This would extend the in-VM thread pool
    to a clustered thread pool, where tasks are executed not only by threads in the same
    JVM, but also by threads on different hosts. This would allow masters to submit tasks (for
    example a collection of tasks) and wait for their completion later. In the current solution, the
    thread calling submit() is blocked until a timeout occurs or the result becomes available.
  • Not all nodes store the task, but only a subset of them. When a node X crashes, we ask
    everyone for the tasks assigned to X, and these are returned by the nodes that stored them.
  • Use random numbers rather than monotonically increasing ints to create ClusterIDs. Currently,
    we use a round-robin ID allocation scheme. While this is pretty good at distributing all tasks
    evenly, in some cases it might be better to assign weights to individual cluster nodes, according
    to the number of cores, memory, etc. Tasks could then be assigned more optimally, whereas the
    current solution assigns all tasks evenly, which means slower hosts get the same number of
    tasks as faster hosts. We should probably externalize the policy that creates the IDs and/or
    picks the nodes, so it can be replaced.
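The first improvement above could look roughly as follows. This is only a sketch of the proposed API shape, with a plain in-VM pool standing in for the cluster: submit() returns a Future immediately, so the master can submit a batch and collect results later instead of blocking per task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ClusteredExecutorSketch {
    // Submit n tasks without blocking on each one, then collect all results.
    // A real clustered implementation would multicast each task to the group
    // instead of queueing it on a local pool.
    static int submitAndCollect(int n) throws Exception {
        ExecutorService cluster = Executors.newFixedThreadPool(4);  // stand-in
        List<Future<Integer>> futures = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            final int task = i;
            futures.add(cluster.submit(() -> task * task));  // returns immediately
        }
        int sum = 0;
        for (Future<Integer> f : futures)
            sum += f.get();          // wait for completion later, all at once
        cluster.shutdown();
        return sum;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(submitAndCollect(5));   // 1+4+9+16+25 = 55
    }
}
```

The appeal of this design is that existing code written against ExecutorService would become clustered simply by swapping the implementation.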
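The subset-storage improvement needs a deterministic way to pick which nodes store a given task, so that after a crash the survivors know which tasks to return. A minimal sketch (names and the hashing scheme are illustrative assumptions, not the demo code) hashes the task ID to select k consecutive nodes from the view:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of "only a subset of nodes stores the task": hash the task ID to
// pick k consecutive nodes from the view as storers. When a node crashes,
// the surviving storers can return the tasks that were assigned to it.
public class SubsetStorageSketch {
    static List<String> storers(int taskId, List<String> view, int k) {
        List<String> result = new ArrayList<>();
        int start = Math.floorMod(Integer.hashCode(taskId), view.size());
        for (int i = 0; i < Math.min(k, view.size()); i++)
            result.add(view.get((start + i) % view.size()));
        return result;
    }

    public static void main(String[] args) {
        List<String> view = List.of("A", "B", "C", "D", "E");
        // Two replicas per task instead of five full copies.
        System.out.println(storers(42, view, 2));   // [C, D]
    }
}
```

With k replicas instead of full replication on every node, memory usage drops from O(cluster size) to O(k) per task, at the cost of the extra "who stored tasks for X?" round after a crash.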
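The last bullet, externalizing the placement policy, might be sketched as a small interface with swappable implementations. The names below are hypothetical, not from the demo code: the default reproduces the current round-robin behavior, while a weighted variant favors nodes with more cores or memory.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of an externalized, replaceable placement policy.
interface PlacementPolicy {
    int pickNode(int numNodes);   // index of the node that gets the next task
}

// Current behavior: distribute tasks evenly, regardless of node capacity.
class RoundRobinPolicy implements PlacementPolicy {
    private final AtomicInteger next = new AtomicInteger();
    public int pickNode(int numNodes) {
        return Math.floorMod(next.getAndIncrement(), numNodes);
    }
}

// Weighted alternative: nodes with higher weight (e.g. more cores)
// receive proportionally more tasks.
class WeightedPolicy implements PlacementPolicy {
    private final int[] weights;
    private final Random rnd = new Random();
    WeightedPolicy(int[] weights) { this.weights = weights; }
    public int pickNode(int numNodes) {
        int total = 0;
        for (int i = 0; i < numNodes; i++) total += weights[i];
        int r = rnd.nextInt(total);
        for (int i = 0; i < numNodes; i++) {
            r -= weights[i];
            if (r < 0) return i;
        }
        return numNodes - 1;
    }
}

public class PolicyDemo {
    public static void main(String[] args) {
        PlacementPolicy rr = new RoundRobinPolicy();
        System.out.println(rr.pickNode(3) + " " + rr.pickNode(3) + " "
                + rr.pickNode(3));   // 0 1 2
    }
}
```

Because the policy is behind an interface, it can be replaced without touching the submission or execution code, which is exactly the pluggability the bullet asks for.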

Download

The full demo code can be downloaded here

Published at DZone with permission of its author, Bela Ban.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)