Mark is a graph advocate and field engineer for Neo Technology, the company behind the Neo4j graph database. As a field engineer, Mark helps customers embrace graph data and Neo4j, building sophisticated solutions to challenging data problems. When he's not with customers, Mark is a developer on Neo4j and writes about his experiences of being a graphista on a popular blog. He tweets at @markhneedham.

Neo4J: Searching for Nodes by Name With Lucene Autocomplete

As I mentioned in a post a few days ago, I’ve been graphing connections between ThoughtWorks people using neo4j and wanted to build autocomplete functionality so that I could search for the names of people in the graph.

The solution I came up with was to create a Lucene index with an entry for each node and a common property on each document in the index, so that I’d be able to get all the index entries easily.

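I created the index like this, using the neography gem (‘neo’ is a Neography::Rest client and ‘node’ is a node that has already been created):

neo.add_node_to_index("people", "type", "person", node)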

I can then get all the names like this:

all_people = neo.get_node_index("people", "type", "person").map { |n| n["data"]["name"] }

It seemed like there must be a better way to do this, and Michael Hunger was kind enough to show me a couple of cleaner solutions.

One way is to query the initial index rather than creating a new one:

all_people = neo.find_node_index("people", "name:*").map { |n| n["data"]["name"] }

The ‘find_node_index’ method allows us to pass in a Lucene query, which gets executed via neo4j’s REST API. In this case we’re using a wildcard query on the ‘name’ property, so it returns every document that has a ‘name’ entry.

This way of getting all the names seemed to be much more resource intensive than my other approach, and when I ran it a few times in a row I got OutOfMemory errors. My graph only has a few thousand nodes in it, so I’m not sure why that is yet.

I think it should be possible to query the Lucene index directly with the partial name, but I was struggling to get spaces in the search term to encode correctly and was getting back no results.
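One possible way around the spaces problem is to backslash-escape them before the query is sent, since Lucene’s classic query syntax treats an unescaped space as a separator between terms. The sketch below only builds the query string and has not been verified against neography or neo4j: ‘lucene_escape’ and ‘prefix_query’ are hypothetical helper names, and whether the result survives URL encoding on its way through the REST API is an open assumption.

```ruby
# Characters with special meaning in Lucene's classic query syntax, plus
# whitespace, which would otherwise split the input into separate terms.
LUCENE_SPECIAL = /[+\-!(){}\[\]^"~*?:\\&|\s]/

# Hypothetical helper: put a backslash in front of every special character.
def lucene_escape(term)
  term.gsub(LUCENE_SPECIAL) { |char| "\\#{char}" }
end

# Hypothetical helper: build a prefix query on one field, with a trailing
# wildcard so that partial names can match.
def prefix_query(field, partial)
  "#{field}:#{lucene_escape(partial)}*"
end

puts prefix_query("name", "Mark Need")
```

The resulting string could then be passed as the second argument to ‘find_node_index’ in place of "name:*".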

Another approach is to use a Cypher query to get a collection of all the nodes:

all_people = neo.execute_query("start n=node(*) return n")["data"].map { |n| n[0]["data"]["name"] }
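The ‘[0]’ in that snippet is there because each row of a Cypher result is an array with one entry per returned column. The sketch below runs the same mapping over a hand-written hash that only mimics the shape implied by the code above. The ‘names_from’ helper and the sample names are made up for illustration, and a real response contains more fields.

```ruby
# Hypothetical helper: pull the 'name' property out of every row of a
# parsed Cypher response that returned a single column of nodes.
def names_from(response)
  # row[0] is the node in the first (and only) returned column.
  names = response["data"].map { |row| row[0]["data"]["name"] }
  # Nodes without a 'name' property come back as nil, so drop them.
  names.compact
end

# Hand-written stand-in for the parsed JSON (assumed shape, invented data).
sample_response = {
  "columns" => ["n"],
  "data" => [
    [{ "data" => { "name" => "Ann Archer" } }],
    [{ "data" => { "name" => "Bob Annan" } }],
    [{ "data" => {} }] # a node that has no 'name' property
  ]
}

p names_from(sample_response)
```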

I imagine this approach wouldn’t scale with graph size, since it pulls back every node in the graph, but for my graph it works just fine.
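Once ‘all_people’ has been loaded by any of the approaches above, the autocomplete lookup itself can be a simple in-memory filter. This is a minimal sketch, not the code behind the application described here: ‘autocomplete’ is a hypothetical helper, the matching rule (case-insensitive prefix of the full name or of any word in it) and the default limit of 10 are assumptions, and the sample names are invented.

```ruby
# Hypothetical helper: return the names that match a partial search term.
def autocomplete(names, partial, limit = 10)
  needle = partial.strip.downcase
  return [] if needle.empty?

  matches = names.select do |name|
    lowered = name.downcase
    # Match the start of the full name or the start of any word in it.
    lowered.start_with?(needle) ||
      lowered.split.any? { |word| word.start_with?(needle) }
  end

  matches.sort.first(limit)
end

all_people = ["Carl Dent", "Bob Annan", "Anna Bell", "Ann Archer"]
p autocomplete(all_people, "an")
```

Filtering in memory like this is only reasonable while the list of names stays small, which is the same caveat that applies to loading every name up front.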

Published at DZone with permission of Mark Needham, author and DZone MVB. (source)
