
Graph Databases use case: Bibliographic exploration

09.19.2011

Bibliographic exploration is an interesting use case for Graph Databases. It arises from the need to query huge bibliographic resources to obtain information that is relevant to researchers.

Researchers have many questions to ask of bibliographic resources, but the vast amount of heterogeneous information stored in them makes it difficult to obtain good answers quickly.

Bibliographic resources store articles, their authors, and the keywords that best describe those articles. This type of information is naturally linked: authors are linked with other authors through the articles they have written together, and articles are in turn connected to the keywords that are most relevant to them.

Graph Databases are a good solution for storing huge amounts of strongly connected information. They store information the same way it is naturally connected, so answers can be retrieved by following links directly, without having to join tables as in traditional SQL databases.
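
To make the idea of "following links directly" concrete, here is a minimal sketch in plain Java. It is not the API of any graph database product: the class `TinyBibGraph`, the edge type names `written_by` and `has_keyword`, and the sample data are all assumptions made for illustration. Each node keeps its own typed edges, so a lookup starts at the node and reads its neighbors instead of matching rows across tables.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; names and data are illustrative assumptions.
class TinyBibGraph {
    // For each node, its outgoing edges grouped by edge type.
    private final Map<String, Map<String, List<String>>> adjacency = new HashMap<>();

    // Adds a directed, typed edge from one node to another.
    void addEdge(String from, String edgeType, String to) {
        adjacency.computeIfAbsent(from, k -> new HashMap<>())
                 .computeIfAbsent(edgeType, k -> new ArrayList<>())
                 .add(to);
    }

    // Returns the nodes reachable from 'node' over one edge of the given type.
    List<String> neighbors(String node, String edgeType) {
        Map<String, List<String>> byType =
                adjacency.getOrDefault(node, Collections.emptyMap());
        return byType.getOrDefault(edgeType, Collections.emptyList());
    }

    // Small made-up data set: two articles, two authors, one keyword.
    static TinyBibGraph sample() {
        TinyBibGraph g = new TinyBibGraph();
        g.addEdge("Article A", "written_by", "Author X");
        g.addEdge("Article A", "written_by", "Author Y");
        g.addEdge("Article A", "has_keyword", "graphs");
        g.addEdge("Article B", "written_by", "Author X");
        return g;
    }
}
```

With this sample, `TinyBibGraph.sample().neighbors("Article A", "written_by")` returns the two authors of that article. A real graph database adds persistence, indexing and scale on top of this basic shape.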

Take, for instance, the following image, which shows how a bibliographic resource could be stored (and displayed) in a graph database.

Bibliographic graph

We can see that authors are nodes in the graph, connected by edges that represent their collaboration on papers. Clicking on an edge of the graph returns all the articles that the two authors have written together.

Bibliographic graph

This type of query takes seconds to return a result in a Graph Database. It could be relevant to new researchers, such as PhD students, or to researchers entering a new area who want to investigate authors, the papers they have written, who they have collaborated with, and on which topics.
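
The collaboration edges described above can be sketched in plain Java as follows. This is only an illustration under stated assumptions: `CoAuthorship`, its methods and the sample papers are hypothetical and are not taken from the BIBEX demo or from any database API. Each edge is keyed by an unordered pair of authors and carries the list of articles the two wrote together, which is what a click on an edge would return.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; class, method names and data are illustrative assumptions.
class CoAuthorship {
    // Made-up input: article title -> list of its authors.
    static Map<String, List<String>> sampleArticles() {
        Map<String, List<String>> articles = new LinkedHashMap<>();
        articles.put("Paper 1", Arrays.asList("Ann", "Bob"));
        articles.put("Paper 2", Arrays.asList("Ann", "Bob", "Cy"));
        articles.put("Paper 3", Arrays.asList("Ann", "Cy"));
        return articles;
    }

    // Key for the undirected edge between two authors (order-independent).
    static String edgeKey(String a, String b) {
        return a.compareTo(b) <= 0 ? a + "|" + b : b + "|" + a;
    }

    // Builds one edge per pair of co-authors, labeled with their shared articles.
    static Map<String, List<String>> buildEdges(Map<String, List<String>> articles) {
        Map<String, List<String>> edges = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : articles.entrySet()) {
            List<String> authors = entry.getValue();
            for (int i = 0; i < authors.size(); i++) {
                for (int j = i + 1; j < authors.size(); j++) {
                    edges.computeIfAbsent(edgeKey(authors.get(i), authors.get(j)),
                                          k -> new ArrayList<>())
                         .add(entry.getKey());
                }
            }
        }
        return edges;
    }

    // Articles written together by the two authors; empty if they never collaborated.
    static List<String> writtenTogether(Map<String, List<String>> edges, String a, String b) {
        return edges.getOrDefault(edgeKey(a, b), Collections.emptyList());
    }
}
```

Once the edges are built, answering "what did Ann and Bob write together?" is a single lookup on the edge between them, regardless of the order in which the two names are given.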

Another interesting aspect of storing bibliographic information in Graph Databases is the use of citation metrics. The quality of an article or an author can be judged by both the number of citations received and the standing (quality) of the works that cite it. Again, Graph Databases are well suited to working with these metrics, since computing one only requires retrieving the neighbors of a given “article” node over edges of type “cite”:

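// Once the DB is open and "graph" refers to its Graph object

int articleType = graph.findType("article");
int titleAttr = graph.findAttribute(articleType, "title");
// Look up the article node by the value of its "title" attribute
Value titleValue = new Value();
titleValue.setString("The World-Wide Web.");
long www = graph.findObject(titleAttr, titleValue);
int citeType = graph.findType("cite");
// Articles that cite "www" are its neighbors over ingoing "cite" edges
Objects citations = graph.neighbors(www, citeType, EdgesDirection.Ingoing);
long articleQualityValue = citations.count();
citations.close();

// You should close the DB here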

Using citations, we could answer questions like “Who is an authority on a certain topic?” or “Who is the most suitable reviewer for a certain paper?” The ability to answer such complex queries is what makes bibliographic exploration an excellent use case for graph databases.
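
As a rough illustration of the "authority on a topic" question, the sketch below ranks authors by the citations their articles on a given keyword have received. It is written in plain Java with made-up data. The class `TopicAuthority`, the scoring rule and the sample articles are assumptions for this example, not part of any database API, and the rule only counts citations without weighting them by the quality of the citing article.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; names, data and scoring rule are illustrative assumptions.
class TopicAuthority {
    // An "article" node with its links to authors, keywords and cited articles.
    static final class Article {
        final String title;
        final List<String> authors;
        final List<String> keywords;
        final List<String> cites;

        Article(String title, List<String> authors, List<String> keywords, List<String> cites) {
            this.title = title;
            this.authors = authors;
            this.keywords = keywords;
            this.cites = cites;
        }
    }

    // Made-up data: three articles about "graphs", one about "databases".
    static List<Article> sample() {
        List<String> none = Collections.emptyList();
        return Arrays.asList(
            new Article("G1", Arrays.asList("Ann"), Arrays.asList("graphs"), none),
            new Article("G2", Arrays.asList("Bob"), Arrays.asList("graphs"), Arrays.asList("G1")),
            new Article("G3", Arrays.asList("Ann", "Bob"), Arrays.asList("graphs"), Arrays.asList("G1")),
            new Article("D1", Arrays.asList("Cy"), Arrays.asList("databases"), Arrays.asList("G1", "G2")));
    }

    // Sums, per author, the ingoing "cite" edges of their articles on the keyword.
    static Map<String, Integer> citationScores(List<Article> articles, String keyword) {
        Map<String, Integer> citedBy = new HashMap<>();
        for (Article a : articles) {
            for (String cited : a.cites) {
                citedBy.merge(cited, 1, Integer::sum);
            }
        }
        Map<String, Integer> scores = new TreeMap<>();
        for (Article a : articles) {
            if (!a.keywords.contains(keyword)) {
                continue;
            }
            int received = citedBy.getOrDefault(a.title, 0);
            for (String author : a.authors) {
                scores.merge(author, received, Integer::sum);
            }
        }
        return scores;
    }

    // Author with the highest score (ties go to the alphabetically first name),
    // or null when no article carries the keyword.
    static String topAuthority(List<Article> articles, String keyword) {
        String best = null;
        int bestScore = -1;
        for (Map.Entry<String, Integer> e : citationScores(articles, keyword).entrySet()) {
            if (e.getValue() > bestScore) {
                best = e.getKey();
                bestScore = e.getValue();
            }
        }
        return best;
    }
}
```

In a graph database the same computation would be expressed as neighbor retrievals over the keyword, authorship and citation edges rather than as loops over an in-memory list.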

Let’s conclude with the main advantages of using Graph Databases for bibliographic exploration:

  • Data sources with bibliographic information are huge and strongly connected. Graph Databases can store billions of objects and are designed specifically to store linked information.
  • Bibliographic exploration is more interesting when it merges as many sources as possible. Graph Databases can store data with heterogeneous schemas, such as bibliographic repositories, publishers, patents, or any other source of information.
  • Researchers need answers as quickly as possible so that they can keep their efforts focused on their main research topic. Graph Databases can query connected data in a few seconds, even for complex queries.
  • New, complex questions can be answered easily thanks to the ease with which Graph Databases navigate linked information.

 * The code example uses the DEX Graph Database Java API.

** Images are taken from BIBEX social free demo. Available here

Published at DZone with permission of its author, Damaris Coll.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)