My passion is building crawlers and search engines. In particular, I specialize in building vertical search engines like Indeed.com, Homethinking.com, Bright.com and Enormo.com (all companies I've worked with). I've also worked on products such as Atlassian Jira and Confluence to improve their search capabilities.

Book review of Apache Solr 3 Enterprise Search Server

02.29.2012
Published by: Packt Publishing
ISBN: 1849516065

Reviewer Ratings

Relevance:
5

Readability:
5

Overall:
5


One Minute Bottom Line

Apache Solr 3 Enterprise Search Server is a book I'd heartily recommend to new or even moderately experienced users of Apache Solr. It brings together information that is spread across the Lucene and Solr wikis and Javadocs, making it a handy desk reference.

Review

Apache Solr 3 Enterprise Search Server, published by Packt Publishing, is the only Solr book available at the moment.

It's a fairly comprehensive book, and it discusses many new Solr 3 features. Considering the breakneck pace of Solr development and the rate at which new features get introduced, you have to hand it to the authors for releasing a book that isn't outdated by the time it hits the bookshelves.

Nonetheless, it does have shortcomings. I'll cover some of these shortly.

Firstly, the table of contents:

Chapter 1: Quick Starting Solr
Chapter 2: Schema and Text Analysis
Chapter 3: Indexing Data
Chapter 4: Searching
Chapter 5: Search Relevancy
Chapter 6: Faceting
Chapter 7: Search Components
Chapter 8: Deployment
Chapter 9: Integrating Solr
Chapter 10: Scaling Solr
Appendix: Search Quick Reference

A complete TOC with chapter sections is available here: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book

The good points

Overall, the book does an excellent job of covering Solr basics such as the Lucene query syntax, scoring, schema.xml, DIH (DataImportHandler), faceting and the various search components.

There are chapters dedicated to deploying, integrating and scaling Solr, which is nice. I found the Scaling Solr chapter in particular to be filled with common performance enhancement tips.

The DisMax query parser is covered in great detail, which is good because I've often found it to be a stumbling block for new Solr users.
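
For readers who have not met DisMax yet, here is a rough sketch (mine, not taken from the book) of what a DisMax request looks like. The parameters defType, q, qf and mm are standard Solr request parameters; the host, core path, field names and boosts below are made up purely for illustration.

```python
from urllib.parse import urlencode


def dismax_url(base_url, user_query, query_fields, min_should_match="1"):
    """Build a Solr select URL that uses the DisMax query parser.

    query_fields maps a field name to its boost, e.g. {"title": 2.0}.
    """
    params = {
        "defType": "dismax",   # select the DisMax query parser
        "q": user_query,       # raw user input, no Lucene syntax needed
        "qf": " ".join(f"{field}^{boost}" for field, boost in query_fields.items()),
        "mm": min_should_match,  # minimum number of clauses that must match
    }
    return base_url + "?" + urlencode(params)


# Hypothetical host, core and field names, for illustration only.
url = dismax_url(
    "http://localhost:8983/solr/select",
    "enterprise search",
    {"title": 2.0, "body": 1.0},
)
print(url)
```

The point of DisMax is that the user's text goes into q as-is, while the fields to search and their relative weights live in qf, which is usually where new users get confused.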

The bad points

Not many, but here are a few gripes.

The two most important files a new Solr user needs to understand are schema.xml and solrconfig.xml. There should have been more emphasis placed on them early on. I don't even see solrconfig.xml anywhere in the TOC.

There is no mention of the Solr admin interface, which is an absolute gem for a number of tasks, such as understanding tokenizers. In the text analysis section of Chapter 2, there really should be a walkthrough of Solr Admin's analyzer interface.

I think there could have been at least an attempt at describing the underlying data structure in which documents are stored (the inverted index), as well as a basic introduction to the tf-idf scoring model. Neither is mentioned at all in Chapter 5, Search Relevancy. One could argue that this is out of the scope of the book, but if a reader is to arrive at a deep understanding of what Lucene really is, understanding inverted indices and tf-idf is clearly a must.
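
To give a flavour of what I mean, here is a toy sketch of an inverted index with textbook tf-idf scoring. It is my own simplification, not something from the book, and it is not Lucene's actual scoring formula, which adds normalization and other factors. The documents and function names are invented for illustration.

```python
import math
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: term frequency in that document}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def tf_idf_scores(index, num_docs, query):
    """Score each matching document as the sum of tf * idf over query terms."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any document
        # Rare terms (few postings) get a higher inverse document frequency.
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return dict(scores)


# Made-up documents, for illustration only.
docs = {
    1: "solr is a search server",
    2: "lucene is a search library",
    3: "solr builds on lucene",
}
index = build_inverted_index(docs)
scores = tf_idf_scores(index, len(docs), "solr search")
print(sorted(scores.items(), key=lambda item: item[1], reverse=True))
```

The index is "inverted" because it goes from term to documents rather than from document to terms, which is what makes lookups by query term cheap. Document 1 ranks first here because it is the only one containing both query terms.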

Summary

All in all, Apache Solr 3 Enterprise Search Server is a book I'd heartily recommend to new or even moderately experienced users of Apache Solr.

It brings together information that is spread across the Lucene and Solr wikis and Javadocs, making it a handy desk reference.

Source: http://www.supermind.org/blog/1015/book-review-of-apache-solr-3-enterprise-search-server

Published at DZone with permission of its author, Kelvin Tan.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)
