Andreas Kollegger is a leading speaker and writer on graph databases and Neo4j and the bridge between community and developer efforts. He works actively in the community, speaking around the world and promoting the larger Neo4j ecosystem of projects. Author of Fair Trade Software, and the lead for Neo4j in the cloud, Andreas plays a valuable role for progressive happenings within Neo4j. Andreas is a DZone MVB, is not an employee of DZone, and has posted 75 posts at DZone.

Neo4j 2.0: Importing Data the Spreadsheet Way!

01.27.2014
[This post was originally written by Pernilla Lindh]

Hi all graphistas out there,

And happy new year! I hope you had an excellent start, so let's keep this year rocking with a spirit of graph-love! In March last year, our Rik Van Bruggen wrote a lovely blog post on how to import data into Neo4j using spreadsheets. It is simple and easy to understand, but it only covers Neo4j version 1.9.3. Now it’s a new year, and in December we launched a shiny new version of Neo4j, the 2.0.0 release! Baadadadaam! So I thought I had better provide an update to his blog post, in the spirit of his work. (Thank you Rik!)

You can still use the Neo4j CSV batch-importer (now for 2.0.0) from Michael Hunger, or look at other data import options.

If you simply want to use Cypher, Rik’s way is much easier. That’s why I have updated Rik’s old Cypher statements in a new spreadsheet that shows how to import into Neo4j 2.0.0.

How does it work?

Open the spreadsheet.

The sheet is composed of two parts:

  • columns A, B and C: these contain the data for the Nodes of our graph, using a custom “id”, a “name”, and a “gender” as properties.

  • columns F, G and H: these contain the data for the Relationships of our graph, having a “from-id” (where the relationship starts), a “to-id” (where the relationship ends), and a “relationship type”. Columns F and G reference the nodes by their ids in column A.

And then comes the secret sauce: how to create Cypher statements from this node and relationship information.

For this we use very simple statements that combine the columns mentioned above, the Cypher syntax and string concatenation. Look at columns D and I:

Nodes

We just use this formula to create the Cypher statement:

 ="MERGE (meetup:Event {id:'"&A3&"', name:'"&B3&"'})"

Instead of CREATE we use MERGE, which is a new feature in 2.0.0: it creates the node if it does not exist yet, and otherwise it does not create a new one. You can read more about it in the Neo4j Manual. Output for row 3:

  MERGE (meetup:Event {id:'153602002', name:'Meetup Malmö'})
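
For readers who prefer to see the concatenation outside a spreadsheet, here is a minimal Python sketch of the same idea. It is only an illustration and not part of the original sheet: the function names `cypher_quote` and `event_merge` are hypothetical, and the escaping of quotes is an extra step that the spreadsheet formula above does not perform.

```python
def cypher_quote(value):
    """Wrap a value in single quotes, escaping backslashes and single quotes.

    Assumption: this escaping is an addition for safety; the spreadsheet
    formula simply pastes the cell value between two quote characters.
    """
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + text + "'"


def event_merge(event_id, name):
    """Build the same MERGE statement as the column D formula for row 3."""
    return "MERGE (meetup:Event {id:%s, name:%s})" % (
        cypher_quote(event_id),
        cypher_quote(name),
    )


# Values taken from row 3 of the example above.
print(event_merge("153602002", "Meetup Malmö"))
```

For values without quotes or backslashes, the result is identical to what the formula produces. A name such as O'Brien would otherwise end the string literal early, which is why the sketch escapes it.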

If we check the next row, we will see a change. Since we know that every person listed in the sheet will attend our meetup, we can create the whole relationship too. So we combine the creation of the “Person” node with connecting it to the meetup node we just created.

 ="MERGE (_"&A4&":Person {id:'"&A4&"', name:'"&B4&"', gender:'"&C4&"'})-[:ATTENDS]->(meetup)"

Output for row 4:

 MERGE (_2:Person {id:'2', name:'Donald Duck', gender:'man'})-[:ATTENDS]->(meetup)

As you can see, it takes the id, name and gender properties from columns A, B and C and puts them into a “MERGE” Cypher statement.
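
The row 4 formula can be sketched the same way in Python. This is a rough equivalent under stated assumptions, not the sheet itself: `person_merge` is a hypothetical name, and the only data row used is the one shown above.

```python
def person_merge(person_id, name, gender):
    """Mirror the row 4 formula: merge a Person node and attach it to the meetup.

    The node identifier is an underscore followed by the id (for example _2),
    exactly as the spreadsheet formula builds it from column A.
    """
    return (
        "MERGE (_{0}:Person {{id:'{0}', name:'{1}', gender:'{2}'}})"
        "-[:ATTENDS]->(meetup)"
    ).format(person_id, name, gender)


# One (id, name, gender) tuple per spreadsheet row; this is the row from the article.
rows = [("2", "Donald Duck", "man")]

for row in rows:
    print(person_merge(*row))
```

Note that the generated statement refers to the `meetup` identifier from the row 3 statement. Cypher identifiers are scoped to a single query, so this presumably only connects to the existing meetup node when the generated lines are run together as one query.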

Published at DZone with permission of Andreas Kollegger, author and DZone MVB. (source)
