Vikash works as a Technical Evangelist and has been in the Java field for the last 10 years. He has worked with Yahoo, IDeaS, EDS, and General Motors. Vikash has posted 2 posts at DZone.

Java Performance Tuning, Profiling, and Memory Management

09.01.2009

Java application performance is an abstract notion until you face its real implications, and it may mean different things depending on how you interpret the word 'performance'. This article is meant to give the developer a perspective on the various aspects of the JVM internals and on the controls and switches that can be adjusted to suit your application. There is no single size that fits all; you need to customize the settings for your application.

You may be facing one of the issues listed below:

  1. The dreaded java.lang.OutOfMemoryError
  2. Your application is crawling.

Before we take the plunge into solving the issues, we first need to understand some of the theory behind the issues.

Theory

What does the JVM do? The Java Virtual Machine has two primary jobs:

  1. Executes code
  2. Manages memory
    This includes allocating memory from the OS, managing Java allocation including heap compaction,
    and removing garbage objects

Besides the above, the JVM also handles tasks such as managing monitors.

Very Basic Java Theory

An object is created in the heap and becomes eligible for garbage collection once there are no more references to it. Objects cannot be reclaimed or freed by explicit language directives. Objects become garbage when they are no longer reachable from the root set (e.g. static fields and local variables on thread stacks).

[Figure: Garbage]

In the figure, objects inside the blue square are reachable from the thread root set, while objects outside the square (in red) are not.

The sequence of the garbage collection process is as follows:

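1. Trace from the root set to find the objects that are no longer referenced at all.
2. Put the garbage objects that override finalize() on the finalizer queue.
3. Run finalize() on each of these instances.
4. Free the memory. Objects that had to be finalized are reclaimed only by a later collection, once they are found unreachable again.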
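
The root set tracing in step 1 can be illustrated with a toy model. The sketch below is only an illustration and not how HotSpot is implemented: the `Node` class and the `mark` method are hypothetical names introduced here, and the "heap" is just a handful of linked Java objects.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Toy model of root-set tracing; an illustration only. */
final class ReachabilitySketch {

    /** A hypothetical heap object that only holds references to other nodes. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Returns every node reachable from the given roots (the "mark" step). */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {        // first visit: follow its references
                pending.addAll(n.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        Node d = new Node("d");
        a.refs.add(b);                  // root -> a -> b
        c.refs.add(d);                  // c and d reference each other,
        d.refs.add(c);                  // but no root reaches them

        Set<Node> live = mark(Arrays.asList(a));
        for (Node n : Arrays.asList(a, b, c, d)) {
            System.out.println(n.name + (live.contains(n) ? " is reachable" : " is garbage"));
        }
    }
}
```

Note that `c` and `d` still reference each other, yet both count as garbage because nothing in the root set leads to them.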

Infant mortality in Java

Most objects in a typical Java application die young (a figure of around 80% is often quoted). This may not be true for your application, so you need to work out a rough infant mortality number for it and tune the JVM accordingly.
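
One rough way to get a first impression of collector activity from inside the application is the standard `java.lang.management` API. The sketch below is an assumption about how you might start, not a precise infant mortality measurement: the class name `GcActivitySketch` and the allocation loop are made up for illustration, and the collector names printed depend on the JVM and the collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcActivitySketch {

    /** One line per collector: name, collections so far, accumulated time. */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Create many short-lived objects. The JIT may optimize some of these
        // allocations away, so treat the numbers as indicative only.
        long checksum = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] temp = new byte[64];
            temp[0] = (byte) i;
            checksum += temp[0];
        }
        System.out.println("checksum=" + checksum);
        for (String line : snapshot()) {
            System.out.println(line);
        }
    }
}
```

If the young-generation collector runs often while the old-generation collector hardly runs at all, most objects are probably dying young. For real numbers, GC logging (for example the `-verbose:gc` option) gives a more reliable picture.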

[Figure: Infant mortality in Java]

JVM flavors

The Sun JVM understands the options -classic, -client and -server:
  • classic: disables the HotSpot JIT compiler.
  • client (default): activates the HotSpot JIT for "client" applications.
  • server: activates the "server" HotSpot JIT. It requires a fair amount of time to warm up, but delivers the best performance for server applications.
Don't forget that, if you use them, -server or -client must be the first argument to java.
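
To check which flavor a running application actually ended up with, you can read the standard VM system properties. This is a minimal sketch; the class name `VmInfoSketch` is made up here, and `java.vm.info` is not guaranteed to be set on every VM, so it may print as null.

```java
final class VmInfoSketch {

    /** Describes the running VM using system properties. */
    static String describe() {
        return System.getProperty("java.vm.name")
                + " (" + System.getProperty("java.vm.info") + ")"
                + ", version " + System.getProperty("java.vm.version");
    }

    public static void main(String[] args) {
        // On a HotSpot VM the name typically contains "Client VM" or "Server VM".
        System.out.println(describe());
    }
}
```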
 
The HotSpot JVM uses adaptive optimization:
  • The JVM begins by interpreting all code, but it monitors execution to find the "hot spots", the code that runs most often.
  • It fires off a background thread that compiles the hot spot bytecode to native code.
  • Because the HotSpot JVM compiles and optimizes only the hot spots, it has more time than a traditional JIT to perform optimizations.
  • The HotSpot JVM keeps the old bytecodes around in case a method moves out of the hot spot.
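
The warm-up effect can often be observed with a simple timing loop. The sketch below is an informal illustration under the assumption that `sumOfSquares` (a made-up method) gets hot enough to be compiled; it is not a rigorous benchmark, and the timings will vary from run to run and from machine to machine.

```java
final class WarmUpSketch {

    /** A small, CPU-bound method that is called often enough to become "hot". */
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        long sink = 0; // accumulate results so the work is not discarded entirely
        for (int round = 1; round <= 10; round++) {
            long start = System.nanoTime();
            for (int call = 0; call < 20_000; call++) {
                sink += sumOfSquares(1_000);
            }
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println("round " + round + ": " + micros + " us");
        }
        System.out.println("sink=" + sink);
    }
}
```

On a HotSpot JVM the first rounds are usually slower than the later ones, because the method starts out interpreted and is compiled in the background. Running with `-XX:+PrintCompilation` shows when methods are compiled.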

Java Garbage Collector

The following describes what the Java Garbage Collector does. 

Sun Classic (1.1 JVM) ...for historical reasons

  • Mark, Sweep & Compact
      Mark: identify garbage
      Sweep: Find garbage on heap, de-allocate it
      Compact: collect all empty memory together
  • Eligibility for garbage collection is determined by walking across memory and determining reachability; the heap is then compacted.
  • Compaction simply copies the live objects so that they are adjacent in memory, leaving one large, contiguous block of free memory.
  • The main problem with classic mark, sweep and compact is that all other threads have to be suspended while the garbage collector runs.
  • Pause time is proportional to the number of objects on the heap.
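
The sweep and compact steps can be pictured with a toy heap modeled as an array of slots. This is a simplified sketch, not the classic VM's actual code: `CompactionSketch`, the `heap` array and the `marked` flags are assumptions made for illustration, with `marked[i]` standing for the result of the mark phase.

```java
import java.util.Arrays;

final class CompactionSketch {

    /**
     * Slides the marked (live) entries to the front of the heap and clears the rest.
     * Returns the index of the first free slot after compaction.
     */
    static int compact(String[] heap, boolean[] marked) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (marked[i]) {
                heap[free] = heap[i];   // copy the live object next to the previous one
                free++;
            }
        }
        // Everything from 'free' onwards is now one contiguous block of free memory.
        Arrays.fill(heap, free, heap.length, null);
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"a", "b", "c", "d", "e"};
        boolean[] marked = {true, false, true, false, true};
        int free = compact(heap, marked);
        System.out.println(Arrays.toString(heap) + ", first free slot = " + free);
    }
}
```

The loop visits every slot, which mirrors why the pause grows with the size of the heap.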
 
Sun HotSpot (1.2+ JVM)
  • Sun improved memory management in the Java 2 VMs by switching to a generational garbage collection scheme.
  • The Java heap is separated into two regions (we will exclude the Permanent Generation for the time being):
     New Objects
     Old Objects
  • The New Objects region is subdivided into three smaller regions:
     1. Eden, where objects are allocated
     2. Two survivor semi-spaces: From and To
  • The Eden area is set up like a stack: an object allocation is implemented as a pointer increment. When the Eden area is full, the GC does a reachability test and then copies all the live objects from Eden (and from the From region) to the To region.
  • The labels on the survivor regions are then swapped: To becomes From, so the From area now holds the surviving objects and the To area is empty.
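
The "pointer increment" style of allocation in Eden can be sketched as follows. This is a toy model under stated assumptions: `EdenSketch` is a hypothetical class, sizes are plain integers rather than real object layouts, and the copying of survivors is left out.

```java
/** Toy bump-pointer allocator; an illustration of the idea only. */
final class EdenSketch {
    private final int capacity;
    private int top = 0;    // offset of the next free byte

    EdenSketch(int capacity) {
        this.capacity = capacity;
    }

    /**
     * Returns the start offset of the new object, or -1 if Eden is full
     * (the point at which a real VM would trigger a minor GC).
     */
    int allocate(int size) {
        if (size <= 0 || size > capacity - top) {
            return -1;
        }
        int address = top;
        top += size;        // the whole allocation is a pointer increment
        return address;
    }

    /** After live objects have been copied to a survivor space, Eden is simply reset. */
    void reset() {
        top = 0;
    }

    int used() {
        return top;
    }

    public static void main(String[] args) {
        EdenSketch eden = new EdenSketch(100);
        System.out.println(eden.allocate(40));  // 0
        System.out.println(eden.allocate(40));  // 40
        System.out.println(eden.allocate(40));  // -1, Eden is full
        eden.reset();
        System.out.println(eden.allocate(40));  // 0 again
    }
}
```

Because the survivors are copied out, the collector never has to free individual objects in Eden; resetting the pointer reclaims the whole area at once.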
 

JVM Heap:

The Java heap is divided into 3 generations: Young (Eden plus the two survivor spaces), Old (Tenured), and Permanent.

Arrangement of generations:

[Figure: JVM Generations]

The diagram shows how objects are created in the New generation and then move between the survivor spaces at every GC run. If they survive long enough to be considered old, they are moved to the Tenured generation. The number of GC cycles an object needs to survive to be considered old enough can be configured.

[Figure: JVM Generations]

By default, the JVM performs two kinds of collection, one for the young generation (minor GC) and one for the old generation (major GC). The minor GC (smaller pause, but more frequent) cleans up garbage in the young generation, while the major GC (larger pause, but less frequent) cleans up the garbage in the old generation. If the major GC also fails to free the required memory, the JVM grows the heap to make room for new objects. This cycle can go on until the heap reaches the maximum size configured for the JVM (-Xmx; the default is 64MB for the client JVM), after which the JVM throws java.lang.OutOfMemoryError.
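
The current and maximum heap sizes described above can be read at run time through `java.lang.Runtime`. The sketch below is a minimal illustration; the class name `HeapSizeSketch` and the output format are made up here.

```java
final class HeapSizeSketch {

    /** Reports used, committed (current) and maximum heap sizes in megabytes. */
    static String report() {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        long max = rt.maxMemory();          // upper limit, as set by -Xmx or the default
        long committed = rt.totalMemory();  // memory currently reserved for the heap
        long used = committed - rt.freeMemory();
        return "used=" + (used / mb) + "MB, committed=" + (committed / mb)
                + "MB, max=" + (max / mb) + "MB";
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Logging this periodically shows the committed size creeping towards the maximum. If the used size stays close to the maximum even after major collections, an OutOfMemoryError is likely to follow.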

Published at DZone with permission of its author, Vikash Ranjan.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Ramazan VARLIKLi replied on Wed, 2009/09/02 - 6:59am

 

- JSPs are converted into servlets when they are compiled, so there shouldn't be any performance difference between the two

- StringBuilder is a faster version of StringBuffer

Kirk Pepperdine replied on Wed, 2009/09/02 - 7:01am in response to: Ramazan VARLIKLi

StringBuilder is an unsynchronized version of StringBuffer

qi yao replied on Thu, 2009/09/03 - 11:55pm

I think it's a very good article.

Ashish Paliwal replied on Fri, 2009/09/04 - 4:54am

A great article. It was much needed, as very little information is available for debugging these frequently occurring scenarios. By any chance, would you be posting anything that can help debug concurrency issues?

Vikash Ranjan replied on Fri, 2009/09/04 - 1:57pm in response to: Ashish Paliwal

Thanks guys. If you are talking about deadlock issues, you can try JConsole. It's a simple tool available with the JDK and decent for deadlock detection. It's lightweight and can connect to any JVM. At any time it can show you the status of all the threads running in the JVM. See if the links below help:

http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html#DeadlockDetection

http://java.sun.com/javase/6/docs/technotes/guides/management/jconsole.html

Go to the scenario where you suspect deadlock and check the deadlocked threads. It can even tell you the monitor details.

Raveendra Maddali replied on Thu, 2009/09/10 - 12:30am

Hi, we have a Windows server machine that has enough memory to bring up a JVM with more than 1GB. We have WebLogic 10.3 on that server. When we try to bring up a managed server with JRockit and Xmx 1024m, it comes up fine. But when we try the same with the Sun JDK and Xmx 1024m, it fails to allocate the object heap and exits. What is the difference between JRockit and the Sun JDK here? Please let us know if anybody has come across this kind of situation.

Raveendra Maddali replied on Wed, 2009/09/09 - 12:18am

Hi, what is the best open source Java profiler for finding memory leaks?
And how do I set up profiling on a remote server?
That is, I want to profile a managed server running on a Solaris machine from my local (client) machine.
I tried to set up JConsole remotely but could not make it work. Any other profilers that are easy to integrate would be very helpful.

Thanks in advance!

Vikash Ranjan replied on Thu, 2009/09/10 - 1:03pm in response to: Raveendra Maddali

Not sure why you need an open source profiler... did you mean free?

If I am not wrong, they all use the same JVMPI/JVMTI APIs.

You can download and try some. JProfiler is very good. So is AppPerfect; we tried it and then bought licenses.

AppPerfect @ http://www.appperfect.com/products/java-profiler.html

JProfiler @ http://www.brothersoft.com/jprofiler-download-81861.html  

Yourkit  @  http://www.yourkit.com/   

JProbe I have heard is ok too.

 

Vikash Ranjan replied on Thu, 2009/09/10 - 1:06pm in response to: Raveendra Maddali

BTW, JConsole has the easiest setup :)

All the profilers that I mentioned need some more effort for a remote connection.

But you should be able to troubleshoot with http://java.sun.com/javase/6/docs/technotes/guides/management/jconsole.html

Vikash Ranjan replied on Mon, 2009/11/09 - 2:48am in response to: Raveendra Maddali

In case you are still looking for an answer :)

There are differences in the default parameters that start the JVM, so sometimes you may need to modify your parameters to adjust the VM. One I know of is the PermGen, which affects the maximum heap. JRockit dynamically resizes the PermGen, so it can start with a lower value, leaving more space for the heap. There may be other reasons as well. You may need to check the differences in the JVM startup parameters to get a comprehensive answer.

The max heap you can set depends on the availability of RAM plus swap, as some of that may be used by other processes. Please try increasing the swap (that's easier to do; don't forget to restart the machine).

Carla Brian replied on Tue, 2012/06/19 - 5:54pm

This is good. This is really helpful as well in testing the performance of the java application. Good job on this. - Garrett Hoelscher

Suminda Dharmasena replied on Tue, 2013/04/23 - 11:45am

Hi,

Most of the objects created in Java need not be GCed. Some objects can be allocated on the stack through escape analysis. Ideally, objects can be deallocated once their use is over. Static escape analysis can be performed to see the extent of sharing and the optimal time to delete. If any objects fall through this, then GC can be performed.

Also, annotations can be introduced to mark the timing of when GC happens. This would reduce the overhead and penalty of GC and pauses.

Suminda
