Ten Useful GridGain How-To Tips
1. How to execute a task on a grid?
GridGain supports both annotation-based and API-based GridTask execution. For the annotation-based approach, attach the @Gridify annotation to your Java method. For the API-based approach, call the Grid.execute(..) method directly. There are examples of both approaches on our Wiki.
2. How to split a task into multiple sub-units of work (jobs) for parallel execution?
The main abstraction in GridGain is GridTask, which has three methods: map(), result(), and reduce(). In the GridTask.map(..) method you create multiple GridJobs, which are then executed in parallel. There are also adapters, such as GridTaskAdapter and GridTaskSplitAdapter, that make implementing GridTask much simpler.
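The split-and-reduce flow can be illustrated without the GridGain API at all. The sketch below is only a local model: WordLengthTask and its methods are hypothetical names, not GridGain classes, and a thread pool stands in for the grid nodes. It shows the shape of the idea: split the input into independent jobs, run them in parallel, then reduce the partial results into one value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a split-style task. This is NOT GridGain's
// GridTaskSplitAdapter; it only mirrors the split -> execute -> reduce flow.
class WordLengthTask {
    // "Split" step: create one independent job per word.
    static List<Callable<Integer>> split(String phrase) {
        List<Callable<Integer>> jobs = new ArrayList<>();
        for (String word : phrase.trim().split("\\s+")) {
            jobs.add(word::length);
        }
        return jobs;
    }

    // "Reduce" step: combine the partial results into a single value.
    static int reduce(List<Integer> results) {
        int total = 0;
        for (int r : results) {
            total += r;
        }
        return total;
    }

    // A local thread pool plays the role of the grid (assumption for the sketch).
    static int execute(String phrase) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : pool.invokeAll(split(phrase))) {
                results.add(f.get());
            }
            return reduce(results);
        } catch (Exception e) {
            throw new IllegalStateException("job execution failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling WordLengthTask.execute("hello grid world") creates three jobs and sums their results. In a real task the jobs would be GridJob instances shipped to remote nodes rather than Callables on local threads.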
3. How to execute the same task on all nodes?
In the GridTask.map(..) method, or in any of the adapters, simply return as many jobs as there are nodes (you can return the same job instance multiple times). This way every job will be executed on a different grid node. When using GridTaskSplitAdapter, make sure that GridRoundRobinLoadBalancingSpi is configured (this is the default configuration).
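To see why returning one job per node together with round-robin balancing reaches every node, here is a small self-contained model. BroadcastSketch and its methods are hypothetical, node and job names are plain strings, and the real SPI may order nodes differently; the point is only that N jobs dealt out in rotation over N nodes land one per node.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of round-robin job assignment; not a GridGain class.
class BroadcastSketch {
    // Builds a job list with exactly one entry per node.
    static List<String> oneJobPerNode(List<String> nodes, String jobName) {
        List<String> jobs = new ArrayList<>();
        for (int i = 0; i < nodes.size(); i++) {
            jobs.add(jobName);
        }
        return jobs;
    }

    // Deals jobs out to nodes in rotation, starting from the first node.
    static Map<String, List<String>> assignRoundRobin(List<String> nodes, List<String> jobs) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        Map<String, List<String>> assignment = new LinkedHashMap<>();
        for (String node : nodes) {
            assignment.put(node, new ArrayList<>());
        }
        for (int i = 0; i < jobs.size(); i++) {
            String node = nodes.get(i % nodes.size());
            assignment.get(node).add(jobs.get(i));
        }
        return assignment;
    }
}
```

With three nodes and the three jobs produced by oneJobPerNode, each node receives exactly one job. With a random balancer the same three jobs could pile up on one node, which is why the round-robin setting matters here.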
4. How to pick a random node for execution?
Change the load balancer in the GridGain configuration to GridWeightedRandomLoadBalancingSpi. This way every job will be assigned to a random node.
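As a rough illustration of what a weighted random balancer does, the sketch below picks a node index with probability proportional to its weight. WeightedRandomSketch is a hypothetical class, and the integer weights are an assumption of this example; the SPI's actual weighting scheme is not described here.

```java
import java.util.Random;

// Hypothetical illustration of weighted random selection; not the SPI itself.
class WeightedRandomSketch {
    // Returns an index into weights, chosen with probability weight / sum(weights).
    static int pickIndex(int[] weights, Random rnd) {
        int total = 0;
        for (int w : weights) {
            if (w < 0) {
                throw new IllegalArgumentException("weights must be non-negative");
            }
            total += w;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("at least one positive weight is required");
        }
        // Draw a ticket in [0, total) and walk the weights until it is used up.
        int ticket = rnd.nextInt(total);
        for (int i = 0; i < weights.length; i++) {
            ticket -= weights[i];
            if (ticket < 0) {
                return i;
            }
        }
        throw new IllegalStateException("unreachable: ticket always falls inside total");
    }
}
```

With equal weights every node is equally likely to be chosen, and a node with weight zero is never chosen.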
5. How to pick a specific node for task execution?
In the GridTask.map(..) method you get the list of all available GridNode instances, and you can inspect every node's metrics and attributes. GridGain uses node metrics to expose the runtime vitals of every node, such as CPU utilization, heap memory, thread counts, and averages for job counts and execution times. In addition, all system and environment properties of a node are automatically attached to it as attributes, and users can attach any custom attributes they like. You can use all this information to intelligently select a specific node for job execution (for example, you can pick all Linux nodes with more than 50% of heap memory available).
Another approach is to configure GridTopologySpi and/or GridLoadBalancingSpi so that they select the right nodes for your jobs.
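The "Linux nodes with more than 50% of heap available" example can be sketched as a simple filter. NodeInfo and NodeSelector below are hypothetical stand-ins, not the GridNode API; the "os.name" key assumes the standard Java system property is what ends up in the node attributes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical snapshot of a node's attributes and heap metrics (not GridNode).
class NodeInfo {
    final String id;
    final Map<String, String> attributes;
    final long heapUsed;
    final long heapMax;

    NodeInfo(String id, Map<String, String> attributes, long heapUsed, long heapMax) {
        this.id = id;
        this.attributes = attributes;
        this.heapUsed = heapUsed;
        this.heapMax = heapMax;
    }

    // Fraction of the heap that is still free, between 0.0 and 1.0.
    double heapFreeRatio() {
        return heapMax <= 0 ? 0.0 : (double) (heapMax - heapUsed) / heapMax;
    }
}

class NodeSelector {
    // Keeps only Linux nodes whose free heap ratio exceeds minFreeRatio.
    static List<NodeInfo> linuxWithFreeHeap(List<NodeInfo> nodes, double minFreeRatio) {
        List<NodeInfo> selected = new ArrayList<>();
        for (NodeInfo node : nodes) {
            // "os.name" is assumed to be copied from the JVM system properties.
            String os = node.attributes.getOrDefault("os.name", "");
            boolean isLinux = os.toLowerCase(Locale.ROOT).contains("linux");
            if (isLinux && node.heapFreeRatio() > minFreeRatio) {
                selected.add(node);
            }
        }
        return selected;
    }
}
```

Inside a real map(..) implementation the same predicate would be applied to the node list passed in by the grid, and jobs would then be mapped only to the nodes that pass it.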
6. How to limit task execution to a subset of nodes?
Use node attributes to segment your grid. Then configure GridAttributesTopologySpi to include only the nodes that have specific attributes. This way only the nodes with the configured attributes will be passed to the GridTask.map(..) method.
7. How to deploy a grid task?
GridGain supports implicit and explicit task deployment. In implicit mode you don't have to do anything: simply execute a task, and all classes and resources it uses will be loaded onto remote nodes automatically via the GridGain Peer Class Loading mechanism. You also have the option to deploy a GAR file explicitly on all nodes.
8. How to limit a maximum number of jobs that can execute in parallel?
GridGain has a notion of GridCollisionSpi. This SPI is invoked every time a job arrives at a remote node. You can configure the number of parallel jobs for whichever Collision SPI you choose, and all jobs that exceed this number will be queued for execution. You also have the option to reject any job and fail it over to another node.
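The queueing behavior described above can be modeled in a few lines. ParallelJobLimiter is a hypothetical class, not a GridCollisionSpi implementation, and job IDs are plain strings; it only shows how a fixed limit turns arriving jobs into either active or waiting ones, and how a finished job frees a slot for the oldest waiting one.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of a parallel-job limit; not a GridCollisionSpi implementation.
class ParallelJobLimiter {
    private final int maxParallel;
    private final List<String> active = new ArrayList<>();
    private final Deque<String> waiting = new ArrayDeque<>();

    ParallelJobLimiter(int maxParallel) {
        if (maxParallel < 1) {
            throw new IllegalArgumentException("maxParallel must be at least 1");
        }
        this.maxParallel = maxParallel;
    }

    // Called when a job arrives. Returns true if it starts now, false if it is queued.
    synchronized boolean onJobArrived(String jobId) {
        if (active.size() < maxParallel) {
            active.add(jobId);
            return true;
        }
        waiting.addLast(jobId);
        return false;
    }

    // Called when a job finishes. Returns the job promoted from the queue, or null.
    synchronized String onJobFinished(String jobId) {
        if (!active.remove(jobId)) {
            return null; // unknown or already finished job
        }
        String next = waiting.pollFirst();
        if (next != null) {
            active.add(next);
        }
        return next;
    }

    synchronized int activeCount() {
        return active.size();
    }

    synchronized int waitingCount() {
        return waiting.size();
    }
}
```

With a limit of two, a third arriving job waits until one of the first two finishes. A rejecting policy would, instead of queueing, hand the job back to be failed over to another node.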
9. How to change load balancing policy?
GridGain comes with multiple GridLoadBalancingSpi implementations out of the box, covering a wide range of load balancing algorithms such as round-robin, random, adaptive, and affinity. Two of the more interesting ones are the adaptive policy, which monitors the grid load and automatically adjusts itself to pick the least loaded node, and the affinity policy, which always assigns a job to the same node based on the affinity key provided. The affinity policy is perfect for collocating computations with data and is often used for integration with data grids.
There is also a concept of Job Stealing, which allows less loaded nodes to steal jobs from more loaded nodes.
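The two policies singled out above reduce to very small selection rules. The sketch below is a simplified model with hypothetical names (BalancingSketch, nodeFor, leastLoaded); the hash-modulo mapping and the single load number per node are assumptions of this example, not how the shipped SPIs are implemented.

```java
import java.util.List;
import java.util.Map;

// Hypothetical selection rules for affinity and adaptive balancing; not the SPIs.
class BalancingSketch {
    // Affinity: the same key always maps to the same node while the node list is unchanged.
    static String nodeFor(Object affinityKey, List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        // floorMod keeps the index non-negative even for negative hash codes.
        return nodes.get(Math.floorMod(affinityKey.hashCode(), nodes.size()));
    }

    // Adaptive: pick the node that currently reports the lowest load.
    static String leastLoaded(Map<String, Double> loadByNode) {
        String best = null;
        double bestLoad = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Double> entry : loadByNode.entrySet()) {
            if (entry.getValue() < bestLoad) {
                bestLoad = entry.getValue();
                best = entry.getKey();
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("at least one node is required");
        }
        return best;
    }
}
```

Note that a plain modulo mapping reshuffles most keys whenever a node joins or leaves, so it is suitable only as an illustration of the "same key, same node" property.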
10. How to control node fail-over behavior?
In GridGain all node failures and job rejections are failed over automatically. However, users can take full control of fail-over behavior by overriding the GridTask.result(..) method and deciding which cases should be failed over and which should not. Users can also control the maximum number of fail-over hops a job can make before it is considered failed by configuring any of the GridFailoverSpi implementations shipped with GridGain.
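A fail-over decision of this kind usually combines two questions: is the error worth retrying, and has the job already hopped too many times? The sketch below models that logic with hypothetical names (FailoverSketch, Decision, onResult). Treating IllegalArgumentException as non-retryable is purely an assumption of the example, and the hop limit here is a plain parameter rather than an SPI setting.

```java
// Hypothetical fail-over decision logic; not GridTask.result(..) or GridFailoverSpi.
class FailoverSketch {
    enum Decision { ACCEPT, FAILOVER, FAIL }

    // Decides what to do with a job result.
    //   error      - the exception thrown by the job, or null if it succeeded
    //   hopsSoFar  - how many times this job has already been failed over
    //   maxHops    - the maximum number of fail-over hops allowed
    static Decision onResult(Throwable error, int hopsSoFar, int maxHops) {
        if (error == null) {
            return Decision.ACCEPT;
        }
        // Assumption: bad input will fail on every node, so retrying is pointless.
        if (error instanceof IllegalArgumentException) {
            return Decision.FAIL;
        }
        // Any other error is retried on another node until the hop limit is reached.
        return hopsSoFar < maxHops ? Decision.FAILOVER : Decision.FAIL;
    }
}
```

For example, a job that failed with an I/O error after one hop is failed over again when the limit is two, but is marked failed once it has used both hops.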
For more information visit: