I have a passion for talking to people, identifying problems, and writing software. When I'm doing my job correctly, software is easy to use, easy to understand, and easy to write... in that order. Michael is a DZone MVB and is not an employee of DZone and has posted 52 posts at DZone. You can read more from them at their website. View Full User Profile

The Java Collections Framework for Newbies

12.18.2011
| 6413 views |
  • submit to reddit

I don't consider myself a java expert by any measure, but there's a disturbing thing I've noticed. There are a LOT of people who claim to be "java developers", but they have zero clue what the "java collections framework" is. This post is designed for folks who keep getting stumped on interview questions or are mystified when someone starts talking about the difference between a Set and a List (for example).

If you google "java collections framework for dummies" you'll find this link which has a more complete, if fairly dense explanation. I'm going to do you one better and give a rule of thumb that you can use without thinking about it.

At it the root of things, a collection is something you can store other things inside. Just like in real life, a collection of marbles is just a "bunch" of marbles. The big difference in the collections framework is that the different implementations have different things they DO with the marbles that you need to understand.

For example, let's consider the ArrayList... everybody and their brother should know this... if not, you are not a java developer, go read a book. Some special things about an array list: It stores the entries in order of when they are added, you can select an element by it's index, it can contain duplicates of the same element. From a performance perspective, it is VERY fast to lookup and add things by index and add things to an ArrayList, on average, it is slow to see if a particular object is present because you must iterate the elements of the list to see if it's there.

Next, let's talk about HashSet... I realize that this might sound vaguely drug related to the uninitiated, but a hashset has some interestingly different characteristics from a list. First off, a HashSet has no concept of order or index, you can add things to it, you can iterate over it, but you cannot look things up by index nor are there any guarantees of what order things will be presented to you when it loop over members. Another interesting characteristic is that it cannot contain duplicates, if you try to add the same object twice, it will NOT fail, it will just return false and you can happily move on.

Last but not least, there is the Hashtable (or his slightly more dangerous cousin, the HashMap). This is used to store key/value pairs. Instead of keying things by an index (like an arraylist), you can key them by just about anything you want. You can do things like myMap.put("foo","bar") and then myMap.get("foo") will return bar...

There is a LOT more to this, but with this quick reference you can at least begin to do useful things in java.

Examples of using a List
ArrayList myList = new ArrayList();
myList.add("Second Thing");
myList.add("Second Thing");
myList.add("First Thing");

System.out.println(myList.get(0));
will output
Second Thing
An interesting thing to note is that the size of this is 3
System.out.println(myList.size());
will output
3
The following:
for (String thing: myList) {
  System.out.println(thing);
}
will always output:
Second Thing
Second Thing
First Thing
Next lets look at a set:
HashSet mySet = new HashSet();
mySet.add("Second Thing");
mySet.add("Second Thing");
mySet.add("First Thing");
The first difference we can see is that
System.out.println(mySet.size());
returns
2
Which makes complete sense if you understand that sets cannot contain duplicates (and you understand how the equals method of String works...;) Another interesting thing is that: The following:
for (String thing: myList) {
  System.out.println(thing);
}
might output:
Second Thing
First Thing
or it might output:
First Thing
Second Thing

It so happens that it returns the second version on my machine but it's really JVM/runtime specific (it depends on how the HashSet is implemented and how hashcode is implemented and a bunch of other variables I don't even fully understand).

More importantly, the following will be likely be much faster for LARGE collections:
System.out.println(mySet.contains("Third Thing"));
Finally, the grandDaddy of all the entire framework, hashtable.
 Hashtable myMap = new Hashtable();
 myMap.put("a", "Second Thing");
 myMap.put("b", "Second Thing");
 myMap.put("c", "First Thing");

 System.out.println(myMap.get("a"));

Will output
Second Thing
and the following:
for (Map.Entry entry: myMap.entrySet()) {
    System.out.println(entry.getKey() + "=" + entry.getValue());
}
will output
b=Second Thing
a=Second Thing
c=First Thing
Hopefully with these examples, you can get an idea of the capabilities of the collections framework. There is much much more to it and I encourage ANYONE doing java development to spend time playing around and learning the different characteristics of the various components as I've only lightly skimmed the surface.

 

From http://mikemainguy.blogspot.com/2011/12/java-collections-framework-for-newbies.html

Published at DZone with permission of Michael Mainguy, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Tags:

Comments

Ash Mughal replied on Wed, 2012/01/25 - 6:53pm

Hi Michael,

Thanks for posting such an informative and valuable post. It really helps to get some good understanding of Java Collections Framework. However, I would like to add some more to Collections Framework.

The key interfaces used by the collection framework are List, Set and Map. The List and Set extends the
Collection interface. Should not confuse the Collection interface with the Collections class which is a utility class.

A Set is a collection with unique elements and prevents duplication within the collection. HashSet and TreeSet
are implementations of a Set interface.

A List is a collection with an ordered sequence of elements and may contain duplicates. ArrayList, LinkedList and Vector are implementations of a List interface.


The Collection API also supports maps, but within a hierarchy distinct from the Collection interface. A Map is an object that maps keys to values, where the list of keys is itself a collection object. A map can contain duplicate values, but the keys in a map must be distinct. HashMap, TreeMap and Hashtable are implementations of a Map interface.

new java

Aries McRae replied on Tue, 2011/12/20 - 1:29am

So, how do you know which collection is appropriate for your scenario?

Here's a flow chart that show just that (courtesy of Sergiy Kovalchuk)

http://www.sergiy.ca/guide-to-selecting-appropriate-map-collection-in-java

 

Java Map/Collection Cheat Sheet

 

 

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.