Vikash Ranjan works as a Technical Evangelist and has been in the Java field for the last 10 years. He has worked with Yahoo, IDeaS, EDS, and General Motors. Vikash has posted 2 posts at DZone.

Java's Unsung Heroes

06.09.2010

Whenever we buy an electronic gadget, the scariest part is going through the verbose manual. The 'Getting Started' section, however, provides some respite. The same holds true when we learn a new language: we naturally focus on the kickstart, but staying in the same gear for long may prevent us from understanding the limits of the language. Java is no different, and in the process we ignore many packages and APIs because they are not on the kickstart path, even though they may have tremendous potential and capabilities. I'll try to bring out some of the unsung heroes of the Java SDK. This is not an exhaustive list.

1. Java NIO package (java.nio)

This package has changed the pace of the servers that switched from the old I/O APIs, but it has received limited traction in user applications, for several reasons. Using the NIO packages does require some care with regard to OutOfMemoryError issues, but we can keep that out of this discussion.

One of the most important aspects of NIO is its ability to operate in non-blocking mode, which the traditional Java I/O library does not offer.

With the old I/O, commonly needed features such as file locking, non-blocking and asynchronous I/O operations, and the ability to map a file to memory were not available. Non-blocking I/O was achieved through workarounds such as multithreading or JNI, which caused performance issues.

A server's ability to handle several client requests effectively depends on how it uses I/O streams. When a server has to handle hundreds of clients simultaneously, it must be able to use I/O services concurrently. One way to cater for this scenario in Java is to use threads, but an almost one-to-one ratio of threads to clients (100 clients need 100 threads) brings enormous thread overhead and can cause performance and scalability problems, because each thread consumes memory for its own stack and the CPU spends time context switching between threads instead of doing real computation.

To overcome this problem, a set of non-blocking I/O classes was introduced to the Java platform in the java.nio package. Channels, Buffers and Selectors are the core of NIO, and the non-blocking I/O mechanism is built around Selectors and Channels. Multiplexing refers to sending multiple signals, or streams, simultaneously over a single carrier; multiplexed I/O allows a growing number of users to be served by a fixed number of threads. The selector handles multiple open sockets (rather than one thread per socket), which allows the server to manage multiple clients with a single thread.
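To make the single-thread, many-sockets idea concrete, here is a minimal sketch of an echo server built on a Selector. The class name SelectorEchoServer, the loopback address, and the 1024-byte buffer size are assumptions made for this illustration; a production server would also register for OP_WRITE instead of looping on write, and would close client channels on shutdown.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Hypothetical class: one thread serves every client through one Selector.
class SelectorEchoServer implements Runnable, AutoCloseable {
    private final Selector selector;
    private final ServerSocketChannel server;

    SelectorEchoServer() throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: the OS picks a free port
        server.configureBlocking(false);                    // required before register()
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    @Override
    public void run() {
        ByteBuffer buffer = ByteBuffer.allocate(1024); // reused for every client
        try {
            while (selector.isOpen()) {
                selector.select(); // blocks until at least one channel is ready
                if (!selector.isOpen()) {
                    break;
                }
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove(); // the selected-key set must be drained by hand
                    if (!key.isValid()) {
                        continue;
                    }
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        echo((SocketChannel) key.channel(), buffer);
                    }
                }
            }
        } catch (IOException | ClosedSelectorException | CancelledKeyException e) {
            // Selector closed or I/O failure: this sketch simply stops serving.
        }
    }

    private static void echo(SocketChannel client, ByteBuffer buffer) throws IOException {
        buffer.clear();
        int read;
        try {
            read = client.read(buffer);
        } catch (IOException e) {
            read = -1; // treat a broken connection like end-of-stream
        }
        if (read < 0) {
            client.close(); // closing the channel also cancels its key
            return;
        }
        buffer.flip();
        while (buffer.hasRemaining()) {
            client.write(buffer); // acceptable for a sketch; may spin if the socket is full
        }
    }

    @Override
    public void close() throws IOException {
        selector.close(); // wakes up a thread blocked in select()
        server.close();
    }
}
```

To try it, run the server on one thread (for example `new Thread(new SelectorEchoServer()).start()`), connect any number of clients to the port returned by `port()`, and observe that all of them are served by that single thread.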

 NIO

·  Before JDK 1.4, servers had a number of performance problems: I/O calls could easily block; reading I/O easily generated garbage; and many threads were needed to scale the server.

·  Many threads, each blocked on I/O, is an inefficient architecture compared with one thread blocked on many I/O calls (multiplexed I/O).

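A Buffer is a reusable portion of memory. A direct buffer lives outside the Java heap, so native I/O can operate on it directly; a MappedByteBuffer, which is a kind of direct buffer, maps a region of a file into memory so that it can be loaded straight from disk to RAM. Non-direct buffers are backed by a byte[] on the heap, and I/O on them involves an intermediate copy step.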

A direct byte buffer may be created by invoking the allocateDirect factory method of this class. The buffers returned by this method typically have somewhat higher allocation and deallocation costs than non-direct buffers. The contents of direct buffers may reside outside of the normal garbage-collected heap, and so their impact upon the memory footprint of an application might not be obvious. It is therefore recommended that direct buffers be allocated primarily for large, long-lived buffers that are subject to the underlying system's native I/O operations. In general it is best to allocate direct buffers only when they yield a measureable gain in program performance.

NIO is smarter than the earlier stream-based I/O. Direct buffers, however, cost more to create than non-direct buffers, because they are allocated through native calls rather than on the JVM heap.

final int CAPACITY = 0x800000; // 8 MB

ByteBuffer buffer = ByteBuffer.allocateDirect(CAPACITY);

allocateDirect allocates the buffer in native (OS) memory rather than on the JVM heap. Its primary advantage is that large files or other huge blocks of primitive data do not use up Java heap space.

A direct buffer is created using either the ByteBuffer.allocateDirect(), or the FileChannel.map() method that maps a region of a file directly into memory.
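As a rough illustration of these two creation routes, the sketch below allocates a direct buffer and maps a temporary file read-only. The class name DirectBufferSketch, the helper readMapped, and the temp-file name are made up for this example, and it assumes the file contains UTF-8 text and is not empty.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical class, for illustration only.
class DirectBufferSketch {

    // Reads a whole file through a read-only memory mapping and decodes it as UTF-8.
    static String readMapped(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return StandardCharsets.UTF_8.decode(mapped).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer heap = ByteBuffer.allocate(1024);         // backed by a byte[] on the heap
        ByteBuffer direct = ByteBuffer.allocateDirect(1024); // allocated outside the heap
        System.out.println("heap buffer direct?   " + heap.isDirect());
        System.out.println("direct buffer direct? " + direct.isDirect());

        Path tmp = Files.createTempFile("nio-sketch", ".txt");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "hello nio".getBytes(StandardCharsets.UTF_8));
        System.out.println(readMapped(tmp));
    }
}
```

The mapping stays valid after the channel is closed, which is why readMapped can use try-with-resources; the mapped memory is released only when the buffer itself is garbage collected.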

 

With a direct buffer, no separate buffer in user space is required. When a file is mapped, it is mapped in kernel-space memory. There is typically no Java array underlying a direct buffer (this depends on the implementation), only a "raw" section of memory. Normal I/O involves an intermediate copy into a user-space buffer; BufferedInputStream, for example, creates an internal byte array in user space.

Note that the two approaches bring the file into memory in different ways, so which one is faster is system and data dependent. For large files, though, the benefit of NIO can be substantial. Direct buffers have a higher creation cost than non-direct buffers because they use native system operations rather than JVM operations.

2. Java Reference APIs (java.lang.ref)

Applications that need lots of cache typically fight the deadly java.lang.OutOfMemoryError, and so they need to control the size of the cache. By default, all objects in the cache are held through hard (strong) references, which essentially means that the JVM can reclaim them only once they are no longer referenced. The requirements of the cache, however, mandate that cached objects be eternal, and that may trigger an OutOfMemoryError if the JVM runs short of memory. In the majority of cases the OutOfMemoryError is fatal, since the JVM cannot do any productive work without memory and keeps trying to free memory that does not exist. I think it's better to slow down the application than to hang or crash it.

Here comes the concept of soft references which, unlike their adamant hard-reference cousins, yield and become garbage when the JVM runs low on memory. Three types of reference objects are provided, each weaker than the last: soft, weak, and phantom. Each type corresponds to a different level of reachability. Soft references are for implementing memory-sensitive caches. The flip side is that the application needs to check that the object still exists in the cache before invoking a method on it. If the cached object was garbage collected, the application needs to recreate the object and put it back in the cache.
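The check-then-recreate pattern described above can be sketched in a few lines. This is only an illustration under assumptions: the class name SoftCacheSketch and the loader function are hypothetical, and the sketch ignores eviction of stale map entries.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: values are held only through SoftReferences,
// so the collector may reclaim them when memory runs low.
class SoftCacheSketch<K, V> {
    private final Map<K, SoftReference<V>> map = new HashMap<>();
    private final Function<K, V> loader; // recreates a value that was never cached or was collected

    SoftCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    synchronized V get(K key) {
        SoftReference<V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get(); // null if absent or already collected
        if (value == null) {
            value = loader.apply(key);              // recreate the object
            map.put(key, new SoftReference<>(value));
        }
        // The caller now holds a strong reference, so the object
        // cannot be collected while it is in use.
        return value;
    }
}
```

A caller might write `new SoftCacheSketch<String, byte[]>(name -> loadFromDisk(name))`, where loadFromDisk stands in for whatever expensive operation produced the value in the first place.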

A SoftReference wraps the object that would otherwise be held through a hard reference. There is also the concept of a ReferenceQueue, which receives SoftReference objects once their referents have been cleared by the collector, so that the application can do its own cleanup. Soft and weak references are automatically cleared by the collector before being added to the queues with which they are registered, if any. Therefore soft and weak references need not be registered with a queue in order to be useful, while phantom references do. An object that is reachable via phantom references will remain so until all such references are cleared or themselves become unreachable. A cache built on SoftReference can still implement Map and provide the same functionality, but instead of storing a hard reference to each value it stores a SoftReference.

Example:  

import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.Map;

public class SoftCacheMap implements Map {

    private final Map cacheMap;
    private Thread cleanupThread;
    private final ReferenceQueue clearedReferences;

    ......

    /**
     * Wrapper class to enable efficient handling of the references
     * @author vranjan
     */
    private static class Entry extends SoftReference {
        private final Object _key;

        public Entry(Object key, Object value, ReferenceQueue queue) {
            super(value, queue);
            _key = key;
        }

        /**
         * Gets the key
         * @return the key associated with this value.
         */
        final Object getKey() {
            return _key;
        }

        /**
         * Gets the value
         * @return the value; null if it is no longer accessible
         */
        final Object getValue() {
            return this.get();
        }
    }

    // Put an object in the cache, wrapped in a SoftReference
    public Object put(Object key, Object o) {
        SoftReference refKey = new SoftCacheMap.Entry(key, o, clearedReferences);
        Object obj = null;
        synchronized (cacheMap) {
            obj = cacheMap.put(key, refKey);
            if (cacheMap.size() > peakSize)
                peakSize = cacheMap.size();
        }
        if (!referenceQueueCleanupThread)
            removeClearedReferences();
        return obj;
    }

    // Get the object from the cache
    public Object get(Object key) {
        if (!referenceQueueCleanupThread)
            removeClearedReferences();
        SoftReference sr;
        synchronized (cacheMap) {
            sr = (SoftReference) cacheMap.get(key);
        }
        if (sr == null)
            return null;
        Object value = sr.get();
        if (value == null) {
            // The referent was garbage collected: drop the stale entry
            // (the original removed the entry on every lookup)
            synchronized (cacheMap) {
                cacheMap.remove(key);
            }
        }
        return value;
    }
}
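The put and get methods above call removeClearedReferences(), which the excerpt does not show. One plausible shape for it, assuming the map stores reference objects that remember their own key, is sketched below. The names QueueDrainSketch and KeyedRef are hypothetical and this is not the author's implementation.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of draining a ReferenceQueue to evict stale cache entries.
class QueueDrainSketch {

    // Like the Entry class above: the reference remembers the key it was stored under.
    static final class KeyedRef extends SoftReference<Object> {
        final Object key;

        KeyedRef(Object key, Object value, ReferenceQueue<Object> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    final Map<Object, KeyedRef> map = new HashMap<>();
    final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    synchronized void put(Object key, Object value) {
        map.put(key, new KeyedRef(key, value, queue));
    }

    // Removes map entries whose references have been enqueued; returns how many were removed.
    synchronized int removeClearedReferences() {
        int removed = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) { // poll() never blocks
            KeyedRef keyed = (KeyedRef) ref;
            // Remove only if the map still holds this exact reference;
            // the key may have been re-put with a fresh value in the meantime.
            if (map.remove(keyed.key, keyed)) {
                removed++;
            }
        }
        return removed;
    }
}
```

Because poll() returns immediately, this method can be called from put and get without stalling them; a dedicated cleanup thread would instead block on the queue's remove() method.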

 

3. ThreadLocal (java.lang.ThreadLocal)

Threads normally don't carry information along their path of execution, so we work around this by using global variables to store data, tolerating the redundancy and synchronization overhead that come with them. ThreadLocal provides a mechanism by which the executing thread can carry data along its execution path, cutting across modules. Any part of the application through which the thread runs can access the data, and that data is hidden from other executing threads.

Example (class WebdavServlet) using ThreadLocal:

// Declaring the ThreadLocal:

/**
* Mechanism to carry the data
*/
private static final ThreadLocal<String> threadLocal = new ThreadLocal<String>();

public static void setThreadLocalValue(String reqUsername) {
    threadLocal.set(reqUsername);
}

public static String getThreadLocalValue() {
    return threadLocal.get();
}

// Set the data in the ThreadLocal:

WebdavServlet.setThreadLocalValue(someString);

// Get the data from the ThreadLocal:

String str = WebdavServlet.getThreadLocalValue();
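To see that each thread really gets its own copy, the following sketch starts several threads that all write to the same ThreadLocal and then read it back. The class name ThreadLocalIsolationSketch and the method runWorkers are invented for this illustration; the latch is there only to make sure every thread has stored its value before any thread reads.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class: one shared ThreadLocal, one private value per thread.
class ThreadLocalIsolationSketch {
    private static final ThreadLocal<String> USER = new ThreadLocal<>();

    // Value seen by the calling thread (null if this thread never set one).
    static String current() {
        return USER.get();
    }

    // Starts one thread per name; returns what each thread read back from the ThreadLocal.
    static Map<String, String> runWorkers(String... names) throws InterruptedException {
        Map<String, String> seen = new ConcurrentHashMap<>();
        CountDownLatch allSet = new CountDownLatch(names.length);
        Thread[] threads = new Thread[names.length];
        for (int i = 0; i < names.length; i++) {
            final String name = names[i];
            threads[i] = new Thread(() -> {
                USER.set(name);
                allSet.countDown();
                try {
                    allSet.await();             // wait until every thread has stored its value
                    seen.put(name, USER.get()); // still this thread's own value
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    USER.remove();              // avoids stale values on pooled threads
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return seen;
    }
}
```

Calling `runWorkers("alice", "bob")` returns a map in which each name maps to itself, and `current()` on the calling thread stays null because that thread never set a value. The remove() call in the finally block matters in servlet containers, where threads are pooled and would otherwise carry one request's data into the next.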

Conclusion:

These are some of the JDK's unsung heroes that I could think of. There may be others hidden in some corners, and I would appreciate it if someone could highlight them too. It's always worth exploring the APIs in the Javadoc, even if you may not use them immediately. It may prevent you from re-inventing the wheel in the future.

Published at DZone with permission of its author, Vikash Ranjan.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Neon Light replied on Wed, 2010/06/09 - 8:55pm

That was a very interesting post. With the exception of LocalThread, I was not familiar with soft referencing and file memory mapping. Now I am. Thank you. 

Oliver Weiler replied on Thu, 2010/06/10 - 4:30am

Very nice post. Would be awesome if you could continue this post by doing some sort of Unsung series. Pleeease!

Senthil Balakrishnan replied on Thu, 2010/06/10 - 7:08pm

Good post ! . I felt java.util.concurrent  deserves a place here...

Sen

Vikash Ranjan replied on Fri, 2010/06/11 - 4:23am in response to: Senthil Balakrishnan

Yes, you are right. Thanks. Once I get a couple more pointers, I'll update this article.

Puran Singh replied on Fri, 2010/06/11 - 4:37pm

thanks.. I always wanted to know more about soft and hard references i saw someone used in caching and wanted to know about it..thanks again.

Joern Huxhorn replied on Sat, 2010/06/12 - 10:31am

Nice article.

The links to the API point to your harddisk, though... ;)

Shamik Majumdar replied on Wed, 2010/06/16 - 4:22pm

Nice article. I think the following items should also come here - 1) Java Executors and Executorservice details. 2) AtomicInteger or AtomicReference from java.util.concurrent should take a place here. Actually while thinking about it, probably you can add a general topic on java.util.concurrent package as there are just so many things in it.

Vikash Ranjan replied on Thu, 2010/06/17 - 3:36am in response to: Shamik Majumdar

Thanks for the suggestions. Will surely add a few more  items.
