Dmitriy Setrakyan manages daily operations of GridGain Systems and brings over 12 years of experience spanning all areas of application software development, from design and architecture to team management and quality assurance. His experience includes architecture and leadership in the development of distributed middleware platforms, financial trading systems, CRM applications, and more. Dmitriy is a DZone MVB, is not an employee of DZone, and has posted 54 posts at DZone.

Say Hello To GridGain Data Grid

01.06.2011

I have been thinking about how a HelloWorld example should look for a data grid. After checking some other products, I noticed that the most popular approach for a HelloWorld app on a data grid is an example with two counterparts: a client and a server. The client generally prints out each operation it performs on the cache, and the server prints out the same operation whenever the data arrives on the remote server. This way users can see that a value stored in the cache actually does get distributed to remote nodes.

After looking at such examples, it occurred to me that this client/server approach can be implemented much more simply in GridGain using zero deployment and basic event subscription. All we need to do is make sure that cache operations get printed out on remote nodes so we can visualize what's going on. For that, however, we don't need to create a separate server app; we can do it all from our client example code.

So, let's make sure that events are printed out. To do that, we will execute a closure on all grid nodes; the closure subscribes to cache events and prints them. It can be executed directly from the example code and will be automatically deployed on remote nodes. Here is what the code looks like:

// Execute this runnable on all grid nodes, local and remote.
G.grid().run(BROADCAST, new Runnable() {
    @Override public void run() {
        // Event listener which will print out cache events, so we
        // can visualize what happens on remote nodes.
        GridLocalEventListener lsnr = new GridLocalEventListener() {
            @Override public void onEvent(GridEvent e) {
                System.out.println("Event '" + e.type() + "' for key: " +
                    ((GridCacheEvent)e).key());
            }
        };

        // GridNodeLocal is a ConcurrentMap attached to every grid node.
        // G.grid() resolves to the grid instance of the node running this code.
        Object prev = G.grid().nodeLocal().putIfAbsent("lsnr", lsnr);

        // Make sure that we only subscribe once regardless
        // of how many times we run the example.
        if (prev == null)
            G.grid().addLocalEventListener(lsnr,
                EVT_CACHE_OBJECT_PUT,
                EVT_CACHE_OBJECT_READ,
                EVT_CACHE_OBJECT_REMOVED);
    }
});
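
The subscribe-once guard above relies only on the standard ConcurrentMap.putIfAbsent contract, which can be tried without a grid. The following is a minimal plain-Java sketch of that idiom, not GridGain code: the class SubscribeOnceSketch, its fields, and the use of Consumer as a listener type are all hypothetical stand-ins for a node and its node-local map.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical stand-in for a single grid node: a per-node map plus a
// listener list. None of these names are GridGain APIs.
class SubscribeOnceSketch {
    private final ConcurrentMap<String, Object> nodeLocal = new ConcurrentHashMap<>();
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

    // Registers the listener only if nothing is stored under "lsnr" yet.
    boolean subscribeOnce(Consumer<String> lsnr) {
        // putIfAbsent returns the previous value, or null if the key was absent.
        Object prev = nodeLocal.putIfAbsent("lsnr", lsnr);

        if (prev == null) {
            listeners.add(lsnr);
            return true;
        }

        return false;
    }

    int listenerCount() {
        return listeners.size();
    }

    public static void main(String[] args) {
        SubscribeOnceSketch node = new SubscribeOnceSketch();

        // Simulate running the example three times against the same node.
        for (int run = 0; run < 3; run++)
            node.subscribeOnce(e -> System.out.println("Event: " + e));

        System.out.println("Listeners registered: " + node.listenerCount());
    }
}
```

Because putIfAbsent is atomic, the guard holds even if two runs of the example reach the same node concurrently; only the first caller sees null and registers its listener.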

Note how easy it is in GridGain to execute any kind of code on all grid nodes (or any subset of nodes) without actually having to deploy anything. Now let's play with some basic cache operations and see what happens:

// Create strongly typed cache projection to avoid casting.
final GridCacheProjection<Integer, String> cache =
    G.grid().cache().projection(Integer.class, String.class);

// Store some values in cache.
for (int i = 0; i < 10; i++)
    cache.put(i, "value-" + i);

// Note that size may differ depending on whether cache
// is replicated or partitioned.
System.out.println("Cache size: " + cache.size());

// Visit every cache element stored on local node.
// Note that 'CI1' is just a type alias for the 'GridInClosure' type.
cache.forEach(new CI1<GridCacheEntry<Integer, String>>() {
    @Override public void apply(GridCacheEntry<Integer, String> e) {
        // Peek at locally cached values.
        System.out.println("Visited locally cached entry: " + e.peek());
    }
});

// Collocate computations and data.
for (int i = 0; i < 10; i++) {
    final int key = i;

    // Find primary node for a key.
    final UUID nodeId = cache.mapKeyToNode(key);

    // Execute your computations on nodes where the data is cached to avoid a
    // potentially heavy operation of bringing data to the local node.
    // This is called Collocation of Computations and Data.
    G.grid().node(nodeId).run(UNICAST, new Runnable() {
        @Override public void run() {
            System.out.println("Collocating computations and data on node: " + nodeId);

            // Usually you would do something more complex than this :)
            System.out.println("Cached value: " + cache.peek(key));
        }
    });
}

// The 'get' operation will bring values from remote nodes
// even if they are not cached on local node. Generally,
// you would want to avoid it, if possible, as it may
// create unnecessary data traffic.
for (int i = 0; i < 10; i++)
    System.out.println("Cached value: " + cache.get(i));

The example above is just a small sample of what you can do with the GridGain data grid. Note that if the cache is configured to be replicated (which is the default), then data will be replicated to all nodes and every node will hold the same copy. If the cache is partitioned, then only a designated primary node (and also backup nodes, if any) will cache a specific key-value pair.
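
To make the difference between the two modes concrete, here is a toy plain-Java model of key placement. It is only a sketch under stated assumptions and does not use GridGain: the class CachePlacementSketch, the modulo-based key mapping, and the three-node cluster are all invented for illustration, and backups are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of replicated vs. partitioned placement. This is not GridGain
// code: the class, the modulo-based mapping and the node count are assumptions.
class CachePlacementSketch {
    final List<Map<Integer, String>> nodes = new ArrayList<>();
    final boolean replicated;

    CachePlacementSketch(int nodeCount, boolean replicated) {
        this.replicated = replicated;

        for (int i = 0; i < nodeCount; i++)
            nodes.add(new HashMap<>());
    }

    // Simplified stand-in for key-to-node mapping; real data grids use
    // more elaborate affinity functions.
    int primaryNode(int key) {
        return Math.floorMod(Integer.hashCode(key), nodes.size());
    }

    void put(int key, String val) {
        if (replicated) {
            // Replicated: every node stores its own copy.
            for (Map<Integer, String> n : nodes)
                n.put(key, val);
        }
        else {
            // Partitioned: only the primary node stores the entry.
            nodes.get(primaryNode(key)).put(key, val);
        }
    }

    int sizeOnNode(int idx) {
        return nodes.get(idx).size();
    }

    public static void main(String[] args) {
        for (boolean mode : new boolean[] {true, false}) {
            CachePlacementSketch cache = new CachePlacementSketch(3, mode);

            for (int i = 0; i < 10; i++)
                cache.put(i, "value-" + i);

            System.out.println(mode ? "Replicated:" : "Partitioned:");

            for (int n = 0; n < 3; n++)
                System.out.println("  node " + n + " holds " + cache.sizeOnNode(n) + " entries");
        }
    }
}
```

In this model every node holds all 10 entries in replicated mode, while in partitioned mode the 10 entries are split across the three nodes. This is also why the locally reported cache size can differ between the two modes.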

Also note how easily we brought our computations to the nodes where the data is cached, as opposed to bringing the data to the computations. Performing computations without any unnecessary data movement (a.k.a. data noise) is one of the most important elements in achieving better scalability.
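
The cost difference can be illustrated with a small plain-Java sketch that counts how many characters would have to leave a node under each approach. It is a rough model, not GridGain code: the Node class, its fetch and runLocally methods, and the charsSent counter are hypothetical, and a real grid would also pay for shipping the job itself.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy comparison of moving data vs. moving computation. Not GridGain code.
class CollocationSketch {
    // Hypothetical remote node holding cached values.
    static class Node {
        final Map<Integer, String> store = new HashMap<>();

        // Number of characters that left this node.
        long charsSent;

        // "Bring data to the computation": ship the whole value to the caller.
        String fetch(int key) {
            String val = store.get(key);

            charsSent += val.length();

            return val;
        }

        // "Bring computation to the data": run the job here, ship only the result.
        <R> R runLocally(int key, Function<String, R> job) {
            R res = job.apply(store.get(key));

            charsSent += String.valueOf(res).length();

            return res;
        }
    }

    public static void main(String[] args) {
        char[] big = new char[1_000_000];
        Arrays.fill(big, 'x');

        Node remote = new Node();
        remote.store.put(1, new String(big));

        int len1 = remote.fetch(1).length();
        long moved1 = remote.charsSent;

        remote.charsSent = 0;

        int len2 = remote.runLocally(1, String::length);
        long moved2 = remote.charsSent;

        System.out.println("Fetch then compute: result=" + len1 + ", chars moved=" + moved1);
        System.out.println("Compute remotely:   result=" + len2 + ", chars moved=" + moved2);
    }
}
```

Both approaches produce the same result, but the first moves the entire one-million-character value while the second moves only the few characters of the answer.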

To run this example, start up a few standalone GridGain nodes by executing the GRIDGAIN_HOME/bin/ggstart.{sh|bat} script and watch what happens.

From http://gridgain.blogspot.com/2011/01/say-hello-to-gridgain-data-grid.html

Published at DZone with permission of Dmitriy Setrakyan, author and DZone MVB.

