Nikita Ivanov is a founder and CEO of GridGain Systems, developer of one of the most innovative real-time big data platforms in the world. He has almost 20 years of experience in software development, a vision and pragmatic view of where development technology is going, and high quality standards in software engineering and entrepreneurship. Nikita is a DZone MVB, is not an employee of DZone, and has posted 27 posts at DZone.

MapReduce Without Splitting

06.23.2008

I frequently get a question along these lines: since GridGain is all about MapReduce-style processing, what if I don’t (or can’t) logically split my task into multiple sub-tasks?

There are two facets to this question:

  1. Non-splitting is a perfectly fine use case of MapReduce: you simply split into one sub-task. That allows you to move the entire task execution onto the grid.
  2. By not splitting and simply putting the entire task on the grid for execution, you gain scalability (but usually not performance).
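
To make the "split into one sub-task" idea concrete, here is a minimal sketch in plain Java. It is not the GridGain API: the class `SingleSplitSketch` and the `mapReduce` helper are hypothetical names used only for illustration, and everything runs in-process. The point is only that the split step may legitimately return a single job containing the whole argument.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class SingleSplitSketch {
    // Hypothetical, minimal map/reduce shape (not the GridGain API):
    // split the argument into jobs, run each job, then reduce the results.
    public static <A, R> R mapReduce(A arg,
                                     Function<A, List<A>> split,
                                     Function<A, R> job,
                                     Function<List<R>, R> reduce) {
        List<R> results = split.apply(arg).stream()
                .map(job)
                .collect(Collectors.toList());
        return reduce.apply(results);
    }

    public static void main(String[] args) {
        // The "split" returns exactly one sub-task: the whole argument.
        // The "reduce" just hands back that single result.
        int length = SingleSplitSketch.<String, Integer>mapReduce(
                "hello grid",
                Collections::singletonList,
                String::length,
                results -> results.get(0));
        System.out.println(length); // prints 10
    }
}
```

On a real grid the single job would be shipped to some node instead of being run in the caller's thread, which is where the scalability comes from.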

There are edge cases where you can gain performance even without splitting:

  • If the computers on your grid are more powerful or less busy than the caller (given the right collision resolution policy)
  • If you run your tasks locally on a multi-core CPU (assuming the original processing was sequential, so that running whole tasks on separate threads makes better use of the cores)
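
The local multi-core case can be sketched with nothing but `java.util.concurrent`. This is an illustration under assumptions, not GridGain code: `LocalOffloadSketch`, `someBusiness` and `runAll` are hypothetical names, and a fixed thread pool stands in for the grid. Each call is submitted as one whole, unsplit task, so calls that used to run one after another can now run side by side on separate cores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LocalOffloadSketch {
    // Hypothetical stand-in for an unsplit piece of business logic:
    // sums the integers from 1 to n.
    public static long someBusiness(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // Runs each call as one whole task on a local thread pool
    // sized to the number of available cores.
    public static List<Long> runAll(List<Integer> inputs) throws Exception {
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int n : inputs) {
                tasks.add(() -> someBusiness(n));
            }
            // invokeAll returns the futures in the same order as the tasks.
            List<Long> results = new ArrayList<>();
            for (Future<Long> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(Arrays.asList(10, 100, 1000))); // prints [55, 5050, 500500]
    }
}
```

No single call gets faster here; the gain comes from several whole tasks overlapping in time.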

Non-splitting is an extremely important use case because it lets you gain scalability with minimum effort. In fact, with GridGain you achieve that with just one @Gridify annotation in most cases:

@Gridify
public void someBusiness(Object arg) throws SomeException {
    // Some business logic.
}
Next time you call this method its entire execution will be moved onto the grid.
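
For readers curious how an annotation can redirect a call at all, here is a rough sketch of the general idea using a JDK dynamic proxy. It is emphatically not how @Gridify is implemented: the `@Offload` annotation, the `Business` interface, `LocalBusiness` and the `offloading` helper are all hypothetical, and a local `ExecutorService` stands in for the grid.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Proxy;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class OffloadProxySketch {
    // Hypothetical marker annotation; a stand-in for illustration only.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface Offload {}

    public interface Business {
        @Offload
        String someBusiness(String arg);
    }

    // Reports which thread actually executed the business logic.
    public static class LocalBusiness implements Business {
        public String someBusiness(String arg) {
            return Thread.currentThread().getName();
        }
    }

    // Wraps target so that calls to @Offload methods run on the executor
    // instead of the caller's thread; other calls pass straight through.
    @SuppressWarnings("unchecked")
    public static <T> T offloading(Class<T> iface, T target, ExecutorService executor) {
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] {iface},
                (proxy, method, args) -> {
                    if (!method.isAnnotationPresent(Offload.class)) {
                        return method.invoke(target, args);
                    }
                    Callable<Object> call = () -> method.invoke(target, args);
                    // Block until the offloaded call completes, like a normal call.
                    return executor.submit(call).get();
                });
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Business business = offloading(Business.class, new LocalBusiness(), executor);
            System.out.println("caller: " + Thread.currentThread().getName());
            System.out.println("worker: " + business.someBusiness("x"));
        } finally {
            executor.shutdown();
        }
    }
}
```

The caller's code does not change: it still makes an ordinary method call, and the interception layer decides where the body runs.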

I’ve seen a number of pilot projects where, within several hours of downloading GridGain, a team had a 6-8 node grid up and was offloading task execution onto it, gaining an instant 6-8x scalability improvement (!). And I repeat: within several hours…

Published at DZone with permission of Nikita Ivanov, author and DZone MVB.

Comments

Dmitriy Setrakyan replied on Mon, 2008/06/23 - 2:00pm

Just want to add to the article that if you don't need to split and don't want to use AOP, you can use the grid-enabled ExecutorService that comes with GridGain.

To achieve the same result as above, you would have to do the following:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

...
ExecutorService execSvc = GridFactory.getGrid().newExecutorService();

// Submit your logic for execution on the grid.
// submit(Runnable) returns Future<?>, so the result is declared with a wildcard.
Future<?> future = execSvc.submit(new Runnable() {
    public void run() {
        // Call your business logic.
        someBusinessLogic();
    }
});

// Wait for completion (blocks until the task has finished).
future.get();
...

Best,
Dmitriy Setrakyan
GridGain - Grid Computing Made Simple