Sandeep is an ITA at Tata Consultancy Services. He is also the author of a Java blog. Sandeep has posted 24 posts at DZone.

Java Anagram Buster

03.17.2012

While browsing the net, I came across this problem: write code that tests whether two given strings are anagrams. From Wikipedia:

“An anagram is a type of word play, the result of rearranging the letters of a word or phrase to produce a new word or phrase, using all the original letters exactly once; for example, orchestra can be rearranged into carthorse.”

I tried to solve this problem in Java, and the result is below. The algorithm I used is very simple:

  1. Clean the input: remove all spaces and punctuation marks (they don't affect the comparison) and convert to lower case.
  2. Go through string one character by character and check whether each character exists in string two.
  3. If it exists, remove it from string two and move on to the next character. If it does not exist, we have found a mismatch and the strings are not anagrams.
  4. If every character from string one exists in string two (and the cleaned strings have the same length), the strings are anagrams.

Java code to test whether two strings are anagrams:

public class AnagramTester {

	public static void main(String[] args) {
		String one = "The United States of America";
		String two = "Attaineth its cause, freedom";
		System.out.println(new AnagramTester().test(one, two));
	}

	public boolean test(String a, String b) {

		boolean result = true;

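		// Drop whitespace and punctuation, then ignore case.
		// StringBuilder is used so matched characters can be deleted from "two".
		StringBuilder one = new StringBuilder(a.replaceAll("[\\s+\\W+]", "").toLowerCase());
		StringBuilder two = new StringBuilder(b.replaceAll("[\\s+\\W+]", "").toLowerCase());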
		
		if (one.length() == two.length()) {

			int index = -1;

			for (char c : one.toString().toCharArray()) {

				index = two.indexOf(String.valueOf(c));

				if (index == -1) {
					result = false;
					break;
				}
				two.deleteCharAt(index);
			}
		} else {
			result = false;
		}

		return result;

	}

}

 

I tested the above code with most of the anagrams found in the Anagram Site and it worked well.

If you think the above code can be improved in some way, feel free to comment.

 

Published at DZone with permission of its author, Sandeep Bhandari. (source)


Comments

Carlos Hoces replied on Sun, 2012/03/18 - 5:27pm

Another possible solution:

// Requires: import java.util.Arrays;
public boolean test(String a, String b) {
    boolean result = false;

    final String one = a.replaceAll("[\\s+\\W+]", "").toLowerCase();
    final String two = b.replaceAll("[\\s+\\W+]", "").toLowerCase();

    // Anagrams contain identical characters once both arrays are sorted.
    if (one.length() == two.length()) {
        final char[] oneArray = one.toCharArray();
        final char[] twoArray = two.toCharArray();
        Arrays.sort(oneArray);
        Arrays.sort(twoArray);
        result = Arrays.equals(oneArray, twoArray);
    }
    return result;
}

Francesco Illuminati replied on Thu, 2013/10/24 - 4:36pm

If the prime number calculation is left out of the algorithm's cost, this seems like an efficient alternative:


public class Anagram {
    public static final Anagram INSTANCE = new Anagram();

    private static final int[] PRIMES =
        Prime.INSTANCE.calculateFirst('Z' - 'A' + 1);

    private Anagram() {}

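    public int calculateValue(final String phrase) {
        // Note: int multiplication wraps silently on overflow, so for longer
        // phrases two different sets of letters can produce the same value.
        int value = 1;
        for (char c: phrase.toCharArray()) {
            int index = AlphabetIndexer.getIndex(c);
            if (index != -1) {
                value *= PRIMES[index];
            }
        }
        return value;
    }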

    public boolean isAnagram(final String phrase1, final String phrase2) {
        return calculateValue(phrase1) == calculateValue(phrase2);
    }
}

public class AlphabetIndexer {
    public final static int ALPHABET_LENGTH = 'Z' - 'A' + 1;

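    public static int getIndex(final char c) {
        // Valid indices are 0..ALPHABET_LENGTH - 1, hence >= in the range checks
        // (with > the characters '[' and '{' would map to index 26).
        int value = c - 'A';
        if (value < 0 || value >= ALPHABET_LENGTH) {
            value = c - 'a';
            if (value < 0 || value >= ALPHABET_LENGTH) {
                return -1;
            }
        }
        return value;
    }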
}

public class Prime {
    public static final Prime INSTANCE = new Prime();

    private Prime() {}

    public int[] calculateFirst(final int number) {
        final int[] primes = new int[number];
        int counter = 1, index = 0;
        boolean isPrime;
        while (index < number) {
            counter++;
            isPrime = true;
            for (int i=0; i<index; i++) {
                if (counter % primes[i] == 0) {
                    isPrime = false;
                    break;
                }
            }
            if (isPrime) {
                primes[index] = counter;
                index++;
            }
        }
        return primes;
    }
}
EDIT: primes should be multiplied and not added!
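
One caveat with the prime-product idea is that an int product wraps around on overflow. With the letter A mapped to the prime 2, for example, any phrase containing 32 or more A's ends up with the value 0, so different phrases can compare as equal. The following is only a sketch of one way around that, assuming BigInteger is acceptable for the product; the class name PrimeProductAnagram and the hard-coded prime table are illustrative choices, not part of the code above.

```java
import java.math.BigInteger;

final class PrimeProductAnagram {
    // The first 26 primes, one per letter a..z (hard-coded for brevity).
    private static final int[] PRIMES = {
        2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
        43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101
    };

    // Multiplies the primes of all ASCII letters in the phrase.
    // Characters other than a-z / A-Z are skipped.
    static BigInteger value(String phrase) {
        BigInteger product = BigInteger.ONE;
        for (char c : phrase.toCharArray()) {
            int index = -1;
            if (c >= 'a' && c <= 'z') {
                index = c - 'a';
            } else if (c >= 'A' && c <= 'Z') {
                index = c - 'A';
            }
            if (index != -1) {
                product = product.multiply(BigInteger.valueOf(PRIMES[index]));
            }
        }
        return product;
    }

    // By unique prime factorization, equal products mean equal letter counts.
    static boolean isAnagram(String phrase1, String phrase2) {
        return value(phrase1).equals(value(phrase2));
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("The United States of America",
                                     "Attaineth its cause, freedom")); // prints true
    }
}
```

The trade-off is that BigInteger multiplication gets slower as the product grows, so this variant gives up some of the speed that made the trick attractive.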

David Whatever replied on Sun, 2012/03/18 - 4:43pm

Since you have to walk the strings anyway, I would say a good algorithm would be:

  1. Do the easy checks (pointer equality means true; a null value for either string is an exception).
  2. Then walk each string, returning a 26-int array with the count of each letter found; any non-alphabetic character is ignored.
  3. Finally, compare the arrays with Arrays.equals().
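
The three steps above could be sketched roughly as follows. This is a minimal illustration, not the commenter's own code: the class and method names (LetterCountAnagram, countLetters) are made up here, and it assumes that only the ASCII letters a-z count, ignoring case.

```java
import java.util.Arrays;
import java.util.Objects;

final class LetterCountAnagram {
    // Step 2: walk the string once and count each ASCII letter, ignoring case.
    // Anything that is not a-z / A-Z is skipped.
    static int[] countLetters(String s) {
        int[] counts = new int[26];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++;
            } else if (c >= 'A' && c <= 'Z') {
                counts[c - 'A']++;
            }
        }
        return counts;
    }

    static boolean isAnagram(String a, String b) {
        // Step 1: easy checks. A null argument is an error; the same reference is trivially true.
        Objects.requireNonNull(a, "a");
        Objects.requireNonNull(b, "b");
        if (a == b) {
            return true;
        }
        // Step 3: the strings are anagrams if their letter counts match.
        return Arrays.equals(countLetters(a), countLetters(b));
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("orchestra", "carthorse")); // prints true
    }
}
```

Each string is visited once and the final comparison touches a fixed 26 entries, so the running time grows linearly with the input length.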

Jonas Olsson replied on Mon, 2012/03/19 - 5:17am in response to: David Whatever

This. Any sorting and you're no longer O(n).

Francesco Illuminati replied on Mon, 2012/03/19 - 5:19am in response to: David Whatever

Yes, this is the more general solution. It's O(n), so it's OK. My use of prime numbers avoids the Arrays.equals() call, saving a bunch of calculations, but it's only a fancy trick.

David Whatever replied on Tue, 2012/03/20 - 12:59am in response to: Francesco Illuminati

For a more clever solution, you could change the methods slightly to expect the 26-int array to be passed in, and increment each element based on the count of each corresponding letter. Then the process changes to:

  1. Perform the easy checks
  2. Initialize a 26-int array (defaults to zeroes)
  3. Pass the first string and the array into the method to increment the array based on letter frequency
  4. Negate each element in the array
  5. Pass the second string and the array into the method to increment the array based on letter frequency
  6. Verify the array is zeroed out

I don't think the change is worth saving the creation of a second array, though.
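
For illustration, the single-array variant might look something like this. It is a sketch under the same assumptions as the description (ASCII letters only, case ignored); the names SingleArrayAnagram and addCounts are hypothetical.

```java
import java.util.Objects;

final class SingleArrayAnagram {
    // Increments counts[i] once for every occurrence of the i-th letter in s.
    // Non-alphabetic characters are ignored.
    static void addCounts(String s, int[] counts) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++;
            } else if (c >= 'A' && c <= 'Z') {
                counts[c - 'A']++;
            }
        }
    }

    static boolean isAnagram(String a, String b) {
        // 1. Easy checks.
        Objects.requireNonNull(a, "a");
        Objects.requireNonNull(b, "b");
        if (a == b) {
            return true;
        }
        // 2. One shared array; Java initializes it to zeroes.
        int[] counts = new int[26];
        // 3. Count the letters of the first string.
        addCounts(a, counts);
        // 4. Negate every element.
        for (int i = 0; i < counts.length; i++) {
            counts[i] = -counts[i];
        }
        // 5. Count the letters of the second string on top of the negated values.
        addCounts(b, counts);
        // 6. The strings are anagrams only if everything cancelled out.
        for (int n : counts) {
            if (n != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("orchestra", "carthorse")); // prints true
    }
}
```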

 
