Several years of experience in designing and building products. My roles have ranged from writing code to leading development of core technical features of a Java EE App Server (Container Development), a Business Intelligence Server, a Web Service Stack, and various collaboration tools. Presently working on developing a BI Search Engine for Cognos Software at IBM. Committer on the open source Apache Web Services Stack, Apache CXF. Member of the JSR 303 expert group (Bean Validation spec). Bharath has posted 3 posts at DZone.

The Interesting Leak

06.23.2008

Does your application use a WeakHashMap with String keys? Pause. Here is something you need to know when working with such a WeakHashMap. This article is extracted from a couple of posts made on my blog.

Part-1

This again is one of those interesting observations I ran into during some memory optimizations in CXF. Coincidentally, here too it involves a WeakHashMap.
Consider the code snippet below.

import java.util.Map;
import java.util.WeakHashMap;

public class TestWeakHashMap
{
    // Two keys are created with 'new', two are string literals
    private String str1 = new String("newString1");
    private String str2 = "literalString2";
    private String str3 = "literalString3";
    private String str4 = new String("newString4");
    private Map<String, Object> map = new WeakHashMap<String, Object>();

    private void testGC()
    {
        map.put(str1, new Object());
        map.put(str2, new Object());
        map.put(str3, new Object());
        map.put(str4, new Object());

        /**
         * Discard the strong reference to all the keys
         */
        str1 = null;
        str2 = null;
        str3 = null;
        str4 = null;

        // Runs until the process is stopped manually
        while (true) {
            System.gc();
            /**
             * Verify Full GC with the -verbose:gc option
             * We expect the map to be emptied as the strong references to
             * all the keys are discarded.
             */
            System.out.println("map.size(); = " + map.size() + " " + map);
        }
    }

    public static void main(String[] args)
    {
        new TestWeakHashMap().testGC();
    }
}

What do we expect the size of the map to be after full GC? I initially thought it should be empty. But it turned out to be 2.

Look at the way the four Strings are initialized. Two of them are created using the 'new' operator, whereas the other two are defined as literals. The Strings created using the 'new' operator are allocated in the Java heap, but the Strings defined as literals live in the literal pool.
The Strings allocated in the literal pool (Perm Space) are not garbage collected for as long as the class that defines them remains loaded.
This means that the Strings 'str2' and 'str3' are always strongly referenced, and the corresponding entries are never removed from the WeakHashMap.

So next time you create a 'new String()', put it as a key in a WeakHashMap, and later intern() the String, beware: your key may end up strongly referenced. [Invoking the intern() method on a String adds it to the pool if no String equal to it already exists in the pool.]
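
The observation above can be reproduced with a bounded variant of the test. This is only a minimal sketch: the class name WeakKeySurvival, its method names, and the key values are made up for illustration and are not part of the CXF code. Keep in mind that System.gc() is merely a request, so only the literal-key result is dependable; the entry with the 'new' key is usually cleared, but the JVM does not guarantee when.

```java
import java.util.Map;
import java.util.WeakHashMap;

public class WeakKeySurvival {

    // Ask for a few collections; the JVM is free to ignore these requests.
    static void requestGc() {
        for (int i = 0; i < 5; i++) {
            System.gc();
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // The literal is referenced by this class for as long as the class is
    // loaded, so the weak key can never be cleared.
    static boolean literalKeySurvives() {
        Map<String, Object> map = new WeakHashMap<>();
        map.put("literalKey", new Object());
        requestGc();
        return map.containsKey("literalKey");
    }

    // The key object exists only inside the map, so it is eligible for
    // collection. The result is typically 0 but may still be 1.
    static int sizeAfterDroppingHeapKey() {
        Map<String, Object> map = new WeakHashMap<>();
        map.put(new String("heapKey"), new Object());
        requestGc();
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("literal key survives: " + literalKeySurvives());
        System.out.println("size with heap key:   " + sizeAfterDroppingHeapKey());
    }
}
```

Unlike the endless loop in the original snippet, this version stops after a handful of GC requests, which makes it easier to run as a quick check.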

Part-2

This is a follow-up to my previous post, The Interesting Leak.
I was quite amused after reading Markus' follow-up post, An interesting leak when using WeakHashMaps.
What surprised me most was that if you put

 ("abc" + "def").intern();

as key in the WeakHashMap, it doesn't get GC'd whereas

(new String("abc") + "def").intern()

as key leads to the entry being garbage collected.
Huh!
Tried all combinations. No clarity. Last resort - Pinged Rajiv.
And so went the conversation -

Me: If you put ("abc" + "def").intern(); as key, it doesn't get GC'd, but if you put (new String("abc") + "def").intern() it gets GC'd
Rajiv : Decompile and see if "abc"+"def" is being converted to "abcdef" by javac
Me: Yes it is. So?
[This could be the clue. Am still thinking.. tick tick tick]

Rajiv: Check if "abcdef"== (new String("abc") + "def").intern()
Me: It is... printed the identity hashcodes.
Rajiv: In the class you have both "abcdef" and (new String("abc") + "def").intern() and still
(new String("abc") + "def").intern() gets gc'ed?
Me: God! Then it doesn't get gc'd.
[Now Rajiv cracks it -]

Rajiv: "I think intern is weak map and constant pool has a strong ref"
Me: ohh!
Me: In that case (new String("abc")).intern(); should get GC'd, right? But we saw it doesn't. The magic happens only when someString is '+'d to (new String("abc")) and then the resultant String is interned.
Me: Just (new String("abc")).intern() doesn't get GC'd.
Rajiv: When you say (new String("abc")).intern() there is a string "abc" in constant pool.
Me: Yes "abc" in constant pool would be the literal we created and passed as argument to the String constructor.
Rajiv: (new String("abc")).intern() returns that string. So it won't get GC'd.
Me: Oh yeah. Got it!
Me: So only when you do a "+" you get a String which is not there in constant pool and hence it gets GC'd ...
Rajiv: ya right.

I had earlier thought the intern pool and the constant pool were the same. But Rajiv's theory of the intern pool being a weak map and the constant pool holding a strong ref looks quite convincing.
Oo la.. That solved our mystery.
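
The identities that came up in the conversation can be checked directly. The sketch below assumes only standard Java language semantics; the class name InternIdentity and its method names are illustrative. Note that because this class itself contains the literal "abcdef", the interned result is that very literal, which is exactly the situation Rajiv pointed out.

```java
public class InternIdentity {

    // "abc" + "def" is a compile-time constant expression, so javac folds it
    // into the single constant "abcdef".
    static boolean foldedConcatIsTheLiteral() {
        return ("abc" + "def") == "abcdef";
    }

    // Concatenation involving a non-constant operand creates a new String
    // at run time, distinct from the literal.
    static boolean runtimeConcatIsDistinct() {
        String concatenated = new String("abc") + "def";
        return concatenated != "abcdef";
    }

    // intern() returns the pooled instance, which here is the literal
    // referenced from this class.
    static boolean internedConcatIsTheLiteral() {
        return (new String("abc") + "def").intern() == "abcdef";
    }

    public static void main(String[] args) {
        System.out.println(foldedConcatIsTheLiteral());
        System.out.println(runtimeConcatIsDistinct());
        System.out.println(internedConcatIsTheLiteral());
    }
}
```

All three checks return true. In a class that does not contain "abcdef" as a literal or constant expression, nothing but the intern pool would refer to the interned string, which is the case where the entry was seen to be collected.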

Originally Posted at: http://thoughts.bharathganesh.com

Published at DZone with permission of its author, Bharath Ganesh.


Comments

Artur Biesiadowski replied on Mon, 2008/06/23 - 2:07am

Interesting article. I suppose you could write an entire series of these, with the next one focusing on autoboxed integers: what is the difference between map.put(100, something) and map.put(200, something)?

An important point to stress is that none of the basic Java wrapper classes make good keys for weak hash maps.
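
The autoboxing question hinted at above comes down to the Integer cache: boxing a value between -128 and 127 always yields a shared, cached instance, which stays strongly reachable and so would never be cleared as a weak key. Here is a small sketch of that difference; the class and method names are hypothetical, and the result for 200 reflects default JVM settings rather than a guarantee, since a JVM may cache a wider range.

```java
public class BoxedKeyIdentity {

    // Autoboxing goes through Integer.valueOf, which returns cached
    // instances for small values.
    static boolean sameBoxedInstance(int value) {
        Integer first = value;
        Integer second = value;
        return first == second;
    }

    public static void main(String[] args) {
        // 100 lies in the guaranteed cache range, so both boxes are one object.
        System.out.println("100: " + sameBoxedInstance(100));
        // 200 is outside the guaranteed range; with default settings each
        // boxing typically creates a fresh object.
        System.out.println("200: " + sameBoxedInstance(200));
    }
}
```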

Markus Kohler replied on Mon, 2008/06/23 - 5:44am

Hi,

Actually, I tried to explain it in my blog, but I think I wasn't very clear.

The point is that objects that have an implicit reference from the class are not reclaimed.

As soon as the class is unloaded, the object can be reclaimed as well.

Unfortunately, I don't know of any tools that could show you this internal reference.

Regards,

Markus

Slava Imeshev replied on Tue, 2008/06/24 - 8:17pm

I think the name of the posting is misleading. What you have experienced is normal WeakHashMap behaviour. You placed in the map a reference to an object that is not de-referenceable, so it was not GC'd and, as a result, not removed from the map. That's exactly what it is supposed to do.

Regards,

Slava Imeshev

Markus Kohler replied on Wed, 2008/06/25 - 2:31am

Hi "Slava Imeshev",

I'm not sure what you mean by "not de-referenceable".

The point is that String literals get interned automatically by the Java compiler. This creates an implicit reference from the class to the literal. I say "implicit" because you will not see the reference in any of the existing tools, neither in heap dump analysis tools (like the Eclipse Memory Analyzer) nor in the debugger.

That's surprising, and that's why IMHO this is a very interesting post.

Regards,

Markus (http://kohlerm.blogspot.com)

Bharath Ganesh replied on Tue, 2008/07/01 - 6:57am in response to: Slava Imeshev

Slava,

As Markus said, the point I was trying to make in the article is that if you use a String literal as a key (as opposed to an interned-only String, which is not present in the global literal pool), the entry will never be removed from the WeakHashMap, even though your application does not hold any strong reference to the String.

I was also trying to show the difference in behaviour between an interned String and a String literal.

Slava Imeshev replied on Tue, 2008/07/01 - 1:43pm in response to: Bharath Ganesh

Well, this does not change what I said. The WeakHashMap behaves normally. The test is that after a key has been evicted, you should not be able to find an object with the same identity. This test does not hold for a literal and holds for new String("literal"). That's it. This is not a leak but rather a lack of understanding of how WeakHashMap works. You get the same problem with any private static final object.

Slava
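
The comparison with a private static final object can be illustrated with a short sketch. The names StaticKeyDemo and PINNED_KEY are hypothetical and chosen only for this example. Because the class keeps a strong reference to the key for as long as it is loaded, the entry stays in the map no matter how often a collection is requested.

```java
import java.util.Map;
import java.util.WeakHashMap;

public class StaticKeyDemo {

    // Strongly reachable from the class for as long as the class is loaded.
    private static final Object PINNED_KEY = new Object();

    // Returns the map size after a GC request; the pinned key keeps its entry.
    static int sizeAfterGc() {
        Map<Object, String> map = new WeakHashMap<>();
        map.put(PINNED_KEY, "value");
        System.gc();
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("entries left: " + sizeAfterGc());
    }
}
```

The difference from the string literal case is one of visibility: here the static field shows up in a heap dump as the path that keeps the key alive.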

 

Markus Kohler replied on Wed, 2008/07/02 - 2:13am in response to: Slava Imeshev

Well IMHO, you both are kind of right and wrong ;)

If you have a private static final object and put it as a key into your WeakHashMap, of course that key will live as long as the class is there. But in that case it is obvious, and a good tool (http://www.eclipse.org/mat) will show that it's there because the class is still there.

In this case, no tool that I'm aware of will show you that there still is a hard reference from the class to the string literal. You just have to know it ...

Regards,

Markus

Tanya Huff replied on Thu, 2011/05/05 - 11:13pm in response to: Markus Kohler

@Markus Once again, Markus, I agree with you. It is correct that the key will live as long as the class is there.
