Mark is a graph advocate and field engineer for Neo Technology, the company behind the Neo4j graph database. As a field engineer, Mark helps customers embrace graph data and Neo4j, building sophisticated solutions to challenging data problems. When he's not with customers, Mark is a developer on Neo4j and writes about his experiences of being a graphista on a popular blog. He tweets at @markhneedham. Mark is a DZone MVB, is not an employee of DZone, and has posted 544 posts at DZone.

Canonical Identifiers


Duncan and I had an interesting problem recently where we had to make it possible to search within an ‘item’ to find possible sub items that exist inside it.

The URI for the item was something like this:


Let’s say Item 234 contains the following sub items:

  • Mark
  • duncan

We have a search box on the page which allows us to type in the name of a sub item and go to the sub item’s page if it exists, or see an error message if it doesn’t.

If the user types in the sub item name exactly right then there’s no problem:


redirects to:


It becomes more interesting if the user gets the case of the sub item wrong e.g. they type ‘mark’ instead of ‘Mark’.

It’s not very friendly in terms of user experience to give the user an error message if they do that, so I suggested that we just make the look up of the sub item case insensitive:


would therefore find us the ‘Mark’ sub item.

Duncan pointed out that we’d now have more than one URI for the same document, which isn’t particularly great since theoretically there should be a one-to-one mapping between a URI and a given document.

He pointed out that we could do a look up to find the ‘canonical identifier’ before we did the redirect such that if you typed in ‘mark’:


would redirect to:


Only the logic for checking the existence of a sub item would be case insensitive. That keeps the search user friendly while each document still ends up with a single URI.
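To make the idea concrete, here is a minimal sketch in Python of a case-insensitive existence check that hands back the canonical spelling for the redirect. It isn't the code we wrote: the `SUB_ITEMS` data, the function names and the `/items/.../subitems/...` path layout are all assumptions made up for illustration.

```python
from typing import Optional
from urllib.parse import quote

# Hypothetical data store: the sub item names held by item 234.
SUB_ITEMS = {234: ["Mark", "duncan"]}


def find_canonical_name(item_id: int, typed_name: str) -> Optional[str]:
    """Return the stored spelling of a sub item, matching case-insensitively."""
    wanted = typed_name.casefold()
    for name in SUB_ITEMS.get(item_id, []):
        if name.casefold() == wanted:
            return name  # the canonical identifier, exactly as stored
    return None


def redirect_target(item_id: int, typed_name: str) -> Optional[str]:
    """Build the path to redirect to, or None if the sub item doesn't exist."""
    canonical = find_canonical_name(item_id, typed_name)
    if canonical is None:
        return None  # the caller shows the error message instead
    # Assumed path layout; the redirect always uses the canonical spelling.
    return f"/items/{item_id}/subitems/{quote(canonical)}"
```

With this, typing ‘mark’ or ‘Mark’ leads to the same redirect target, so only one URI per sub item is ever handed back to the browser.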



Published at DZone with permission of Mark Needham, author and DZone MVB.




Ash Mughal replied on Thu, 2012/01/26 - 2:53am

But choosing a good identifier and making it stick is pretty hard. Identifier functional characteristics often conflict with one another, and the identifiers favored by business purposes (which pretty much always carry the day) aren't necessarily optimal from a perspective of long term persistence and management.

One aspect of the problem just got easier. Google, Yahoo, and Microsoft have agreed on a convention for identifying the canonical identifier for a given resource that may be rendered under different transactional URLs.

