Introducing the Infinispan Data Grid Platform
This two-part series aims to introduce the reader to Infinispan, a new open source, LGPL licensed data grid platform. The first part will focus on an overview of the scope and capabilities of Infinispan, along with usage examples and a brief tour of the APIs. An upcoming second part will take a deep-dive into the architecture, more advanced APIs and extending Infinispan.
What is Infinispan?
Infinispan is a peer-to-peer, in-memory data grid (IMDG) platform, written for the Java virtual machine (JVM). I first publicly announced the project in April 2009, and a string of alpha and beta releases quickly followed. The project has grown out of experiences gained on JBoss Cache, a popular clustered caching library, and while JBoss Cache was used as a “living prototype” for Infinispan - a place to harvest ideas, designs, usage patterns and the wants of the community - Infinispan itself is in no way related to or dependent on JBoss Cache.
While there are several differences between JBoss Cache and Infinispan, the most significant difference is scope. JBoss Cache focused on being a clustered caching library while Infinispan is a data grid platform, complete with GUI management tooling, and the potential to scale to thousands of nodes. At the same time, Infinispan also fulfills the requirements of a clustered caching library, and even performs exceptionally as a standalone, non-clustered cache. To help existing JBoss Cache users, Infinispan provides for an easy migration path.
Features
Infinispan’s upcoming release is 4.0. Wondering why our first release is numbered this way? Have a look at this FAQ.
Current release: 4.0
This release offers the following features:
APIs
- Familiar Map-like API (org.infinispan.Cache extends java.util.concurrent.ConcurrentMap)
- Alternative JBoss Cache-compatible tree-like API
- RESTful API for remote access
- Support for non-JVM clients via RESTful API
- Clustered and non-clustered (standalone) operation modes
- Clustered operational modes include invalidation, replication and distribution, making use of either synchronous or asynchronous network communications
- Distribution with optional L1 caching (“near-caching”) reduces network latency for frequent lookups, while maintaining the scalability of distribution
JTA transaction support
- Infinispan is an XA resource that is compatible with any JTA-compliant transaction manager
Write-through caching to pluggable CacheStores
- Can be configured to provide “warm starts”
- Ships with high-performance disk, JDBC and cloud storage based CacheStore implementations
- Support for custom implementations
- Also useful as an overflow space if memory is scarce
Eviction and expiration
- FIFO and LRU based eviction policies
- Expiration of data based on lifespan and idle time
Management
- JMX statistics and monitoring
- JOPR-based GUI management
High-performance custom marshaling layer to provide fast, low-overhead serialization and deserialization
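To make the eviction and expiration features listed above more concrete, here is a minimal sketch in plain Java. It is not Infinispan's implementation: the class name LruExpiringCache, the Slot helper and the explicit "now" timestamps are all assumptions made for illustration. It shows LRU eviction once a size limit is exceeded, plus expiration by lifespan (time since creation) and by idle time (time since last access). It is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a single-threaded LRU cache with lifespan and idle-time expiry.
final class LruExpiringCache<K, V> {
    // Holds a value together with the timestamps needed for expiration.
    private static final class Slot<T> {
        final T value;
        final long createdAt;
        long lastUsedAt;

        Slot(T value, long now) {
            this.value = value;
            this.createdAt = now;
            this.lastUsedAt = now;
        }
    }

    private final long lifespanMillis;
    private final long maxIdleMillis;
    private final LinkedHashMap<K, Slot<V>> entries;

    LruExpiringCache(final int maxEntries, long lifespanMillis, long maxIdleMillis) {
        this.lifespanMillis = lifespanMillis;
        this.maxIdleMillis = maxIdleMillis;
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<K, Slot<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Slot<V>> eldest) {
                return size() > maxEntries; // evict the LRU entry on overflow
            }
        };
    }

    void put(K key, V value, long now) {
        entries.put(key, new Slot<V>(value, now));
    }

    V get(K key, long now) {
        Slot<V> slot = entries.get(key);
        if (slot == null) {
            return null;
        }
        boolean tooOld = now - slot.createdAt >= lifespanMillis;
        boolean tooIdle = now - slot.lastUsedAt >= maxIdleMillis;
        if (tooOld || tooIdle) {
            entries.remove(key); // expired: drop it and report a miss
            return null;
        }
        slot.lastUsedAt = now;
        return slot.value;
    }

    int size() {
        return entries.size();
    }
}
```

Passing the clock in as a parameter keeps the sketch deterministic; a real cache would read the system clock and would typically also purge expired entries in the background.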
Future releases
Please visit Infinispan’s project roadmap page to check out what is in store for future releases, including the JPA-like API, a client/server module, querying capabilities, and distributed code execution.
Getting started with Infinispan
Demos
A good place to start is to download the Infinispan distribution. The distribution includes a number of demo applications including a GUI-based one to visualize state moving around a grid.
1. Download an Infinispan distribution. E.g.,
$ wget http://sourceforge.net/projects/infinispan/files/infinispan/4.0.0.CR2/infinispan-4.0.0.CR2-bin.zip/download
2. Unzip the archive
$ unzip infinispan-4.0.0.CR2-bin.zip
3. Run the GUI demo by invoking the runGuiDemo.sh script provided with the distribution (or runGuiDemo.bat on Windows platforms)
$ infinispan-4.0.0.CR2/bin/runGuiDemo.sh
4. The GUI console should load up and you should see something like the frame below:
Clicking on the START CACHE button will start the cache instance wrapped up in the frame.
5. Naturally, one instance is of limited use - things are more fun when you have several cache instances in a cluster. Invoking the runGuiDemo.sh script a few more times will create more GUI frames, and you can start them all. For example, starting two more GUIs, we can see that the caches discover each other and form a cluster.
You can start as many cache instances as you wish.
6. You can now use one of the instances to generate data.
7. You can see that this entry has been added to the cache:
8. You will be able to retrieve this key from any of the frames in the cluster:
Using Infinispan in your project
The easiest way to do this is to use Maven, and add dependencies to Infinispan modules in your project’s POM. Alternatively you could download the distribution and get the necessary jar files there, for inclusion in your project.
To use Infinispan with Maven, just add the following to your project’s pom.xml:
<dependencies>
  <dependency>
    <groupId>org.infinispan</groupId>
    <artifactId>infinispan-core</artifactId>
    <version>4.0.0.CR2</version>
  </dependency>
</dependencies>
<repositories>
  <repository>
    <id>repository.jboss.org</id>
    <url>http://repository.jboss.org/maven2</url>
  </repository>
</repositories>
From that point on, you will be able to create instances of the cache and use it:
import org.infinispan.Cache;
import org.infinispan.config.Configuration;
import org.infinispan.manager.CacheManager;
import org.infinispan.manager.DefaultCacheManager;

// 1. Create a configuration
Configuration cfg = new Configuration();
// ... Customize your configuration as you wish.
// ... Defaults to LOCAL mode.
CacheManager mgr = new DefaultCacheManager(cfg);
// 2. Let's create a stock price cache, keyed on String
Cache<String, Float> stockPriceCache = mgr.getCache("stockPriceCache");
// 3. Let's check if we have the price of IBM, and if not,
//    retrieve and cache it (getStockPriceFromTheInternet is your own lookup method)
String ticker = "IBM";
Float value;
if (stockPriceCache.containsKey(ticker)) {
    value = stockPriceCache.get(ticker);
} else {
    value = getStockPriceFromTheInternet(ticker);
    stockPriceCache.put(ticker, value);
}
System.out.printf("Got the price of %s as %s", ticker, value);
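Since org.infinispan.Cache extends java.util.concurrent.ConcurrentMap, the lookup in step 3 can also be written with a single get followed by putIfAbsent, which avoids calling containsKey and get separately. The sketch below uses a plain ConcurrentHashMap as a stand-in for the cache so that it runs without Infinispan on the classpath. The class name StockPriceLookup, the fetchPrice method and the remoteFetches counter are hypothetical and exist only for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class StockPriceLookup {
    // Stand-in for a Cache<String, Float>; any ConcurrentMap behaves the same way here.
    private final ConcurrentMap<String, Float> prices = new ConcurrentHashMap<String, Float>();

    // Counts calls to the slow source, for illustration only.
    int remoteFetches = 0;

    // Hypothetical placeholder for an expensive remote lookup.
    private Float fetchPrice(String ticker) {
        remoteFetches++;
        return 100.0f + ticker.length();
    }

    Float priceOf(String ticker) {
        // One read instead of containsKey followed by get.
        Float cached = prices.get(ticker);
        if (cached != null) {
            return cached;
        }
        Float fetched = fetchPrice(ticker);
        // putIfAbsent returns the existing value if another thread stored one first.
        Float previous = prices.putIfAbsent(ticker, fetched);
        return previous != null ? previous : fetched;
    }
}
```

Two threads may still both call fetchPrice for the same ticker, but only one value is stored and both callers see that value.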
For more examples, we maintain an online, interactive tutorial that walks you through the basic steps of creating a cache and using it. It is a good starting point, and it is a very good idea to have the Infinispan API documentation handy while you work through it. Exploring the API in this manner is a great way to get up to speed with Infinispan quickly.
Migrating configurations
The Infinispan distribution ships with tools to migrate cache configuration files from JBoss Cache, EHCache and Oracle Coherence to Infinispan configuration files. This can be a useful starting point if you are considering Infinispan as a replacement for one of these cache systems. Information on these tools can be found here.
Using the REST API
Feel like connecting to an Infinispan backend using the REST API? Download the REST server for Infinispan! This is in the form of a WAR file which can be deployed in most Servlet containers.
1. Download the Infinispan REST server
$ wget http://sourceforge.net/projects/infinispan/files/infinispan/4.0.0.CR2/infinispan-4.0.0.CR2-server-rest.zip/download
2. Unzip the archive
$ unzip infinispan-4.0.0.CR2-server-rest.zip
3. Deploy the webapp in your favorite Servlet container or Java EE server
$ cp infinispan-4.0.0.CR2/webapp/infinispan.war $JBOSS_HOME/server/default/deploy/
4. Start your Servlet container or Java EE server
The Infinispan REST server should now be listening on the host name and port that you have used to configure your Servlet container or Java EE server. E.g., on http://localhost:8080/infinispan
5. Connecting to your REST server is easy - here is an example using Python:
# Python 2 example (httplib and the print statement); on Python 3 use http.client and print()
import httplib
hostname = "localhost:8080"
# Putting data in
conn = httplib.HTTPConnection(hostname)
data = "SOME DATA HERE !"  # could be a string, or a file ...
conn.request("POST", "/infinispan/rest/Bucket/0", data, {"Content-Type": "text/plain"})
response = conn.getresponse()
print response.status
# Getting data out
conn = httplib.HTTPConnection(hostname)
conn.request("GET", "/infinispan/rest/Bucket/0")
response = conn.getresponse()
print response.status
print response.read()
Or using Ruby:
require 'net/http'
http = Net::HTTP.new('localhost', 8080)
# Create a new entry
http.post('/infinispan/rest/MyData/MyKey', 'DATA HERE', {"Content-Type" => "text/plain"})
# Get it back
puts http.get('/infinispan/rest/MyData/MyKey').body
# Use PUT to overwrite
http.put('/infinispan/rest/MyData/MyKey', 'MORE DATA', {"Content-Type" => "text/plain"})
# And remove ...
http.delete('/infinispan/rest/MyData/MyKey')
We have a more detailed guide on using the REST server. The long-term goal with this API is to converge with the REST-* effort and standardize the API for distributed caches via REST.
We have a detailed guide on using the JOPR GUI tool to manage your cache instances as well. The current version allows you to visually monitor the health of your data grid and plots graphs of various metrics over time. As this tool evolves, we hope to add features such as automatic provisioning of nodes based on rules, where the grid’s true elastic nature can be realized.
Querying the grid
While querying is only really scheduled for 4.1, our next release, we do have a tech preview of querying in the current, 4.0 release. Keep in mind that, as a tech preview, the querying interface and API are subject to change, but it does give you a feel for what to expect. Details about the tech preview of the Query API, along with instructions on usage and sample code, can be found here.
Operational modes
Here we discuss the different operational modes in more detail.
Standalone operation
As a simple, standalone cache to store data that is expensive to retrieve - for example from a database or a mainframe - or recalculate, Infinispan’s highly concurrent core container performs exceptionally well with minimal overhead, and is highly tuned for multi-core/multi-CPU environments. Synchronization and locking are kept to a minimum while delivering concurrent transaction isolation, and offering all of the other features of the platform, including a write-through CacheStore, eviction and JTA transaction compatibility.
Clustered operation
In addition to standalone operation, Infinispan can operate as a cluster, where nodes are aware of each other’s presence and are able to interact and maintain coherence of state. The following clustered modes are supported:
Invalidated data grid
A clustered, invalidated data grid is essentially a set of local, standalone caches which are aware of each other. When an entry is changed in any cache in the grid, the entire grid is made aware of the fact. Other nodes that happen to have cached the same entry learn that their copy is now out of date and invalidate it. This low-cost invalidation message involves a multicast of the modified key(s), and prompts remote caches to remove corresponding entries from their caches. This mode is commonly used in “read-mostly” scenarios where there is a data store elsewhere which can be consulted for data if it is invalidated from parts of the data grid.
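The invalidation flow described above can be pictured with a small in-process simulation. This is only a sketch under stated assumptions: the "grid" is a list of objects in one JVM, the multicast is a plain loop, and the names InvalidationGrid, Node, join, read and write are hypothetical rather than part of Infinispan's API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative simulation of invalidation mode; no networking involved.
final class InvalidationGrid {
    // The authoritative data store that the caches sit in front of.
    final Map<String, String> backingStore = new HashMap<String, String>();
    private final List<Node> nodes = new ArrayList<Node>();

    final class Node {
        final Map<String, String> local = new HashMap<String, String>();

        String read(String key) {
            String value = local.get(key);
            if (value == null) {
                // Miss: consult the backing store and cache the result locally.
                value = backingStore.get(key);
                if (value != null) {
                    local.put(key, value);
                }
            }
            return value;
        }

        void write(String key, String value) {
            backingStore.put(key, value);
            local.put(key, value);
            // Only the key is "multicast"; peers drop their stale copy.
            for (Node peer : nodes) {
                if (peer != this) {
                    peer.local.remove(key);
                }
            }
        }
    }

    Node join() {
        Node node = new Node();
        nodes.add(node);
        return node;
    }
}
```

Note that the new value never travels to the other nodes. They reload it from the data store on their next read, which is why this mode suits read-mostly workloads.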
Replicated data grid
A replicated data grid is where each instance contains a replica of its neighbors. As such, any change made to any instance is replicated across the entire cluster. This is useful if the cluster size is small and the entire cluster can benefit from having all of the state local and in-memory. However, this operational mode does not scale in terms of memory, since adding more cluster nodes does not give you access to more addressable memory, and you are theoretically limited to the heap of a single JVM, discounting overhead.
Distributed data grid
This is the default clustered operation mode in Infinispan, and makes use of a consistent hash algorithm to determine where keys should be located in the cluster. Consistent hashing allows for cheap, fast and above all, deterministic location of keys with no need for further metadata or network traffic. The goal of distribution is to maintain enough copies of state in the cluster so as to be durable, but not too many copies so as to be scalable. As such, the number of copies of each entry maintained in the grid - the numOwners configuration attribute - is tunable, and represents the tradeoff between performance and scalability on the one hand and durability of data on the other. Regardless of how large the cluster is though, the number of copies is fixed. This means that such a setup scales linearly as nodes are added to the cluster. Further, capacity added is capacity realized, since adding more nodes means more usable memory can be addressed. For example, discounting for overhead, 200 JVMs in such a cluster, with a heap size of 1GB each and numOwners set to 2, would give you 100GB (200 x 1GB / 2) of addressable memory in the entire system!
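As a rough illustration of how a consistent hash can locate the owners of a key deterministically, here is a minimal ring-based sketch in plain Java. It is not the algorithm or hash function Infinispan uses. The class ConsistentHashSketch, its spread mixing step and the addressableGb helper are assumptions for illustration, and only the numOwners name comes from the text above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Illustrative consistent-hash ring: each node sits at one position on the ring.
final class ConsistentHashSketch {
    private final TreeMap<Integer, String> ring = new TreeMap<Integer, String>();
    private final int numOwners;

    ConsistentHashSketch(List<String> nodeNames, int numOwners) {
        this.numOwners = numOwners;
        for (String node : nodeNames) {
            // If two names collide on a position, the later one replaces the earlier.
            ring.put(spread(node.hashCode()), node);
        }
    }

    // Simple bit-mixing step, chosen for illustration only.
    private static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Returns up to numOwners distinct nodes, walking clockwise from the key's position.
    List<String> locate(Object key) {
        List<String> owners = new ArrayList<String>();
        int position = spread(key.hashCode());
        for (String node : ring.tailMap(position, true).values()) {
            if (owners.size() < numOwners) {
                owners.add(node);
            }
        }
        // Wrap around to the start of the ring if more owners are needed.
        for (String node : ring.headMap(position, false).values()) {
            if (owners.size() < numOwners) {
                owners.add(node);
            }
        }
        return owners;
    }

    // Usable capacity when every entry is stored numOwners times, ignoring overhead.
    static long addressableGb(int nodes, long heapGbPerNode, int numOwners) {
        return nodes * heapGbPerNode / numOwners;
    }
}
```

Every node that knows the cluster membership computes the same owner list for a key, so no lookup table or extra network round trip is needed to find it. The addressableGb helper reproduces the arithmetic of the example above: 200 nodes with 1GB each and two owners per entry yield 100GB.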
L1 caching (“near caching”)
With a distributed data grid, there is no guarantee that the instance you speak to locally holds the entry you are looking for. The system may have to make a remote call to another cache node to retrieve the requested entry. While this remote lookup happens transparently, it has a cost associated with it. To minimize this cost in the event of repeated lookups on the same key, L1 caching can be enabled. L1 caching causes the requesting node to cache the retrieved entry locally and listen for changes to the key on the wire. L1-cached entries are given an internal expiry to control memory usage. Enabling L1 will improve performance for repeated reads of non-local keys, but will increase memory consumption to some degree. It offers a nice tradeoff between the “read-mostly” performance of an invalidated data grid and the scalability of a distributed one.
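The behavior described above can be sketched in a few lines of plain Java. This is an assumption-laden illustration rather than Infinispan's implementation: the remote owner is simulated with an ordinary Map, time is passed in explicitly, and the names L1CacheSketch, L1Entry, onRemoteChange and remoteLookups are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative near cache in front of a (simulated) remote owner of the data.
final class L1CacheSketch {
    private static final class L1Entry {
        final String value;
        final long expiresAt;

        L1Entry(String value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, String> remoteOwner; // stands in for the owning node
    private final Map<String, L1Entry> l1 = new HashMap<String, L1Entry>();
    private final long l1LifespanMillis;

    // Counts simulated remote calls, for illustration only.
    int remoteLookups = 0;

    L1CacheSketch(Map<String, String> remoteOwner, long l1LifespanMillis) {
        this.remoteOwner = remoteOwner;
        this.l1LifespanMillis = l1LifespanMillis;
    }

    String get(String key, long now) {
        L1Entry hit = l1.get(key);
        if (hit != null && now < hit.expiresAt) {
            return hit.value; // served locally, no remote call
        }
        remoteLookups++;
        String value = remoteOwner.get(key);
        if (value != null) {
            // Keep a local copy with an internal expiry to bound memory use.
            l1.put(key, new L1Entry(value, now + l1LifespanMillis));
        } else {
            l1.remove(key);
        }
        return value;
    }

    // Called when a change to the key is observed on the wire.
    void onRemoteChange(String key) {
        l1.remove(key);
    }
}
```

Repeated reads of the same non-local key cost one remote lookup until the entry either expires or is invalidated by a change elsewhere in the grid.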
Use cases
Infinispan can be used for a number of purposes.
Traditional cache usage - to front databases or other expensive, non-scalable data stores - is one. Such usage helps “read-mostly” setups to relieve their data store from congestion, and provide quick, low-latency access to data being read.
Clustering toolkit
Use as a toolkit to cluster a container, framework or server by distributing on-the-fly state and allowing for failover is another common use case. Such usage allows framework or server developers to create clustered offerings where state management is delegated to Infinispan and clients connected to such backends can gracefully fail over to another instance if one were to experience a failure.
Data store
Increasingly, though, use as a primary data store in itself is gaining popularity, especially for unstructured or semi-structured data. Due to the low-latency, high-concurrency and highly scalable nature of in-memory data grids, they have become popular in many applications that require the ability to scale on-demand, or to have fast, low-latency access to data. Infinispan fits well with the NoSQL movement, which is gaining momentum, as well as cloud-deployments where traditional data stores are problematic. Upcoming features such as indexing and querying of state as well as distributed execution (“move the process to the data, not data to the process”) make this an interesting space to watch.
Integrating with other products and frameworks
We know of several open source and proprietary products considering Infinispan as a part of their offering, and here are some that have reached a certain degree of maturity that may be of interest.
Hibernate 2nd Level Cache Provider
As of version 3.5, Hibernate ships with an Infinispan cache provider for 2nd level caching. This setup typically uses Infinispan as a clustered, invalidated data grid and helps improve performance on “read-mostly” entities. More details on this cache provider can be found here.
Lucene Directory Provider
Contributed to Infinispan’s codebase, this module allows you to use Infinispan as a distributed, in-memory store for Lucene indexes. More details can be found here.
Next steps
Want more?
A formal user guide is in the process of being written. Expect this to be available soon, but in the meantime the wiki should be your primary source for information. The wiki serves as a launchpad for more information on Infinispan, from design documents to FAQs, API docs to configuration references, tutorials to tips on contributing to the project. Can’t find the information you need on a specific subject? Visit the Users’ Forum to ask about it. We use JIRA as a project issue tracker. And of course you should follow the project on Twitter! Like our logo? Check out these cool desktop wallpapers the good people at JBoss.org designed for us!
Have a look at this page, which details the resources available to anyone interested in participating in the project, along with information on how to get in touch with the development team.
Comments
claude hussenet replied on Sun, 2009/12/27 - 12:12pm
Hi Manik, thank you for the post. It works well with CR2 but not with CR3; I am getting the following error.
Thank you-Claude
Exception in thread "main" java.lang.NoClassDefFoundError: org/infinispan/util/logging/LogFactory
    at org.infinispan.demo.InfinispanDemo.(InfinispanDemo.java:47)
Caused by: java.lang.ClassNotFoundException: org.infinispan.util.logging.LogFactory
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:319)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:264)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:332)
    ... 1 more
Could not find the main class: org.infinispan.demo.InfinispanDemo. Program will exit.
claude hussenet replied on Sun, 2009/12/27 - 7:23pm
Answering my own question.
The modules/core directory is missing in CR3.
Copying the modules/core/lib directory from CR2 and moving infinispan-core.jar from CR3 into modules/core fixes the issue.
Claude Hussenet
Lin Ye replied on Thu, 2010/01/07 - 12:02pm
Hi Manik,
I am working at GE Energy, and we are interested in using Infinispan or JBoss Cache to keep high performance data. I would like to talk to you more about this, and could you please let me know the best way to reach you?
First, I would like to clarify one thing with you. I read a couple of your articles on "DIST - distributed cache mode" & "JBoss Cache - Partitioning". It seems to me that Partitioning fits our particular needs better, but I am not sure if it's implemented in any version of JBoss Cache at all (as Infinispan is using DIST and I only found Buddy Replication in the JBoss Cache Users Guide)? I would appreciate your response.
Thanks,
Lin
Tarak Mehata replied on Tue, 2011/06/21 - 4:06am
Hi Manik,
I would like to know how to run the Infinispan HotRod server and HotRod client. I ran the GUI demo successfully,
but the client/server architecture requires version 4.2, so I downloaded that. I can start both the client and the server, but there is no response from the server. Please let me know where I am going wrong.
Thanks
Tarak Mehta