Doug has been engrossed in programming since his parents first bought him an Apple IIe computer in 4th grade. Throughout his early career, Doug proved his flexibility and ingenuity in crafting solutions in a variety of environments. Doug’s most recent work has been in the telecom industry, developing tools to analyze large amounts of network traffic using C++ and Python. Doug loves learning and synthesizing this knowledge into code and blog articles. Doug is a DZone MVB, is not an employee of DZone, and has posted 36 posts at DZone.

Migrating the American Medical Association’s Search to Solr

Our client, Silverchair Information Systems, recently completed a successful migration of the American Medical Association’s search to Solr. Building on Silverchair’s semantic platform, we helped move Silverchair’s SCM platform off Windows Search and onto Solr, dramatically improving performance and search quality. The AMA is the first of Silverchair’s clients to benefit from this swap-out.

Numerous search quality problems had to be tackled to get the AMA’s search just right. These include:

  1. Research journal users value recent publications very highly. Users want to see recent research, not just documents that score well because search terms occur frequently in them. If you were a doctor, would you rather see brain cancer research published this decade, or in the early 20th century?
  2. Searching on an author's name is an important feature. It’s understandably important that if you’ve published findings in the AMA’s prestigious journals, you’ll want to be able to find your research.
  3. Silverchair uses a semantic tagging system to improve search quality. Integrating the semantic features into AMA’s search required a lot of tuning and testing. Turns out we know a thing or two about this too.
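
The article does not say how the recency problem in item 1 was solved. As a rough illustration only, one common approach is to multiply a document's text score by a reciprocal function of its age, the same shape as Solr's `recip(x,m,a,b) = a / (m*x + b)` function query. The sketch below is plain Python, not the AMA configuration; the document titles, scores, dates, and the names `recency_boost` and `boosted_score` are all hypothetical.

```python
from datetime import datetime, timezone

# Milliseconds in an average year; used to scale document age into years.
MS_PER_YEAR = 365.25 * 24 * 3600 * 1000


def recip(x, m, a, b):
    """Reciprocal decay a / (m*x + b), the same shape as Solr's recip()."""
    return a / (m * x + b)


def recency_boost(published, now):
    """Return 1.0 for a brand-new document, decaying toward 0 with age."""
    age_ms = (now - published).total_seconds() * 1000
    # With m = 1/MS_PER_YEAR and a = b = 1, the boost is 1 / (age_in_years + 1).
    return recip(age_ms, 1.0 / MS_PER_YEAR, 1.0, 1.0)


def boosted_score(text_score, published, now):
    """Combine a (hypothetical) text relevance score with the recency boost."""
    return text_score * recency_boost(published, now)


# Hypothetical data: the older study matches the query text better,
# but the recent study should still rank first once age is considered.
now = datetime(2013, 1, 1, tzinfo=timezone.utc)
docs = [
    ("early-20th-century study", 2.0, datetime(1913, 1, 1, tzinfo=timezone.utc)),
    ("recent study", 1.5, datetime(2012, 1, 1, tzinfo=timezone.utc)),
]
ranked = sorted(docs, key=lambda d: boosted_score(d[1], d[2], now), reverse=True)

for title, text_score, published in ranked:
    print(title, round(boosted_score(text_score, published, now), 4))
```

A multiplicative boost like this keeps text relevance in play while letting age break ties decisively; how steeply the boost should decay is exactly the kind of setting that needs tuning against real queries.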

For these problems, we’ve been using a new tool, Quepid, to help fix, regression test, and collaborate with the team on the quality of search results. I hope to talk about this tool and the related methodology, Test Driven Relevancy, at the upcoming Lucene Revolution (go here to vote for it).

It’s been amazing to see how Quepid completely alters how teams of technologists, programmers, and content experts can collaborate on search quality. Quepid offers a platform where the team can monitor the impact of tuning on multiple searches simultaneously, rate the quality of results, demonstrate problems with troublesome queries, and troubleshoot and resolve those search problems. Furthermore, by laying out queries that represent all the search problems resolved in the past, it has helped us avoid fixing one problem while breaking dozens of others.
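
To make the regression-testing idea concrete, here is a minimal sketch of what a relevancy regression check could look like. It is not Quepid's API or implementation: it assumes a hypothetical `search` callable that returns ranked document IDs, a hand-rated set of relevant documents per query, and precision-at-k as the quality measure. All queries, document IDs, and thresholds are made up.

```python
def precision_at_k(results, relevant, k):
    """Fraction of the top-k results that were rated relevant."""
    top = results[:k]
    return sum(1 for doc_id in top if doc_id in relevant) / k


def run_cases(search, cases, k=3, threshold=0.3):
    """Return (query, score) for every rated query that falls below threshold."""
    failures = []
    for query, relevant in cases.items():
        score = precision_at_k(search(query), relevant, k)
        if score < threshold:
            failures.append((query, score))
    return failures


# Hypothetical ratings: which documents the team judged relevant per query.
CASES = {
    "brain cancer": {"doc1", "doc2"},
    "smith j": {"doc7"},
}


def search_before(query):
    """Stand-in for the search engine before a tuning change."""
    canned = {
        "brain cancer": ["doc1", "doc2", "doc9"],
        "smith j": ["doc7", "doc3", "doc4"],
    }
    return canned.get(query, [])


def search_after(query):
    """Stand-in for the engine after a change that breaks author search."""
    canned = {
        "brain cancer": ["doc1", "doc2", "doc9"],
        "smith j": ["doc3", "doc4", "doc5"],
    }
    return canned.get(query, [])


print(run_cases(search_before, CASES))  # no failing queries
print(run_cases(search_after, CASES))   # the author query has regressed
```

Running every previously fixed query after each tuning change is what catches the case where improving one search quietly breaks another.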

Are you stuck in a morass of never-improving search quality? Do you want to make it easier to collaborate on search quality across disciplines? Quepid could be the right tool for you. Sign up if you’re interested in hearing about the upcoming alpha release of this product.

And finally, congratulations to Silverchair and the AMA on their successful migration!

Published at DZone with permission of Doug Turnbull, author and DZone MVB. (source)