Big Data/Analytics Zone is brought to you in partnership with:

I am a Java Web Developer, Blogger, technology enthusiast and 3D graphic hobbyist. Mostly I am writing about Java related technologies including Java EE, Spring and Grails. Michael is a DZone MVB and is not an employee of DZone and has posted 23 posts at DZone. You can read more from them at their website. View Full User Profile

Getting started with Spring Data Solr

12.12.2013
| 17433 views |
  • submit to reddit

Spring Data Solr is an extension to the Spring Data project which aims to simplify the usage of Apache Solr in Spring applications. Please note that this is not an introduction into Spring (Data) or Solr. I assume you have at least some basic understanding of both technologies. Within the following post I will show how you can use Spring Data repositories to access Solr features in Spring applications.

Configuration

First we need a running Solr server. For simplicity reasons we will use the example configuration that comes with the current Solr release (4.5.1 at the time I am writing) and is described in the official Solr tutorial. So we only have to download Solr, extract it in a directory of our choice and then run java -jar start.jar from the <solr home>/example directory.

Now let's move to our demo application and add the Spring Data Solr dependency using maven: 

<dependency>
  <groupId>org.springframework.data</groupId>
  <artifactId>spring-data-solr</artifactId>
  <version>1.0.0.RELEASE</version>
</dependency>

In this example I am using Spring Boot to set up a small example Spring application. I am using the following Spring Boot dependencies and the Spring Boot parent pom for this:

<parent>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-parent</artifactId>
  <version>0.5.0.BUILD-SNAPSHOT</version>
</parent>
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter</artifactId>
</dependency>
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-test</artifactId>
  <scope>test</scope>
</dependency>

Don't worry if you haven't used Spring Boot yet. These dependencies mainly act as shortcut for common (Spring) dependencies and simplify the configuration a bit. If you want to integrate Spring Data Solr within an existing Spring application you can skip the Spring Boot dependencies.

The Spring bean configuration is quite simple, we only have to define two beans ourself:

@ComponentScan
@EnableSolrRepositories("com.mscharhag.solr.repository")
public class Application {
 
  @Bean
  public SolrServer solrServer() {
    return new HttpSolrServer("http://localhost:8983/solr");
  }
 
  @Bean
  public SolrTemplate solrTemplate(SolrServer server) throws Exception {
    return new SolrTemplate(server);
  }
}

The solrServer bean is used to connect to the running Solr instance. Since Spring Data Solr uses Solrj we create a Solrj HttpSolrServer instance. It would also be possible to use an embedded Solr server by using EmbeddedSolrServer. The SolrTemplate provides common functionality to work with Solr (similar to Spring's JdbcTemplate). A solrTemplate bean is required for creating Solr repositories. Please also note the @EnableSolrRepositories annotation. With this annotation we tell Spring Data Solr to look in the specified package for Solr repositories.

Creating a Document

Before we can query Solr we have to add documents to the index. To define a document we create a POJO and add Solrj annotations to it. In this example we will use a simple Book class as document:

public class Book {
 
  @Field
  private String id;
   
  @Field
  private String name;
   
  @Field
  private String description;
   
  @Field("categories_txt")
  private List<Category> categories;
   
  // getters/setters
}
public enum Category {
  EDUCATION, HISTORY, HUMOR, TECHNOLOGY, ROMANCE, ADVENTURE
}

Each book has a unique id, a name, a description and belongs to one or more categories. Note that Solr requires a unique ID of type String for each document by default. Fields that should be added to the Solr index are annotated with the Solrj @Field annotation. By default Solrj tries to map document field names to Solr fields of the same name. The Solr example configuration already defines Solr fields named id, name and description so it should not be necessary to add these fields to the Solr configuration.

In case you want to change the Solr field definitions you can find the example configuration file at <solr home>/example/solr/collection1/conf/schema.xml. Within this file you should find the following field definitions:

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" /> 
<field name="name" type="text_general" indexed="true" stored="true" />
<field name="description" type="text_general" indexed="true" stored="true"/>

In general title would be a better attribute name for a Book than name. However, by using name we can use the default Solr field configuration. So I go for name instead of title for simplicity reasons.

For categories we have to define the field name manually using the @Field annotation: categories_txt. This matches the dynamic field named *_txt from the Solr example. This field definition can also be found in schema.xml:

<dynamicField name="*_txt" type="text_general"   indexed="true"  stored="true" multiValued="true"/>
Published at DZone with permission of Michael Scharhag, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)