
Software Review: From Zero to Searching with LucidWorks Enterprise

03.01.2011
The staff and customers of today’s businesses and organizations must deal with an enormous amount of data from a wide variety of sources. While everyone needs quick access to this information, the typical methods, like search engines, are not keeping up with the need for sifting through this vast array of information.  Users need to be able to gather data from a variety of sources, filter out the unrelated information, and capture only what is truly needed.

To help organizations with this problem, Lucid Imagination has created LucidWorks Enterprise.  LucidWorks Enterprise is a search platform that packages frontend application interfaces with the open-source project Apache Solr.  Solr was developed as a programming tool to facilitate high-speed searching of large volumes of information for web-based users or other applications. While Solr holds the promise of being a major benefit to organizations, on its own it can be a difficult application to implement and also has some shortcomings in the scope of its searching and administration capabilities.

LucidWorks Enterprise (LWE) helps make the promise of Solr come true for an organization, without the excessive amount of development time needed to implement a separate high-speed search tool.  LucidWorks Enterprise adds a very clean frontend to a fully tested implementation of the Solr application.  This frontend provides a variety of administrative and search functionalities that are not available in the core Solr implementation.

Some of the most notable features added by LucidWorks Enterprise include:

  • Dashboard with quick status view of operations and searches performed
  • Fast data source setup for indexing web sites, local files, database information, and other Solr Data Sources
  • Indexing management, including creation of data sources, scheduling of source document reindexing, management of indexed fields, and administration of system settings
  • Maintaining query option settings
  • Configuring user rights
  • Creating collections of information indexes
  • Creating searches into the Solr database
  • APIs for all operations, providing remote administration, updating, and searching
  • User alerts for notification of updates to established searches

 Lucidworks Dashboard

Installation

To see whether LucidWorks Enterprise does in fact deliver on the promise of high-speed searches, I decided to install the application and see how it works. I chose to set up a virtual machine with the minimum requirements for the application. LucidWorks Enterprise works on most Windows platforms, Linux, Mac OS X+ and CentOS, so I decided to see how easy it is to set up on a Windows platform.

The virtual machine was set up on a workstation running VMware Workstation, which was also running another virtual machine in addition to the LucidWorks installation. The virtual machine for LucidWorks was configured with 3 GB of RAM, one processor, and 100 GB of hard disk.

Following the instructions in the Windows installer made for a smooth installation, and I had no issues getting the application going. The only real concern with using Windows as the test platform is that the documentation seems to reference Linux-style commands for all management and modification tasks. There was very little reference to Windows, and wherever command-line operations were shown, they were Linux commands. While the lack of Windows references was not a problem in testing, it may be a concern when implementing in a production environment.

The installation documentation does have some omissions that can be troublesome. For example, the default login is contained in the README file of the installation and not in the actual documentation. This is not a major issue; it just takes time to find all the bits and pieces needed. The other thing to note on a Windows platform is that you will need an editor other than Notepad: the text files were saved with a non-Windows application, and opening them in Notepad makes them virtually unreadable, with no line breaks.
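If installing another editor is not an option, the files can be converted instead. The following is a minimal sketch that assumes the problem is Unix-style line endings (the usual cause of missing line breaks in Notepad, though the installation files were not inspected to confirm it). The function names are illustrative and not part of LucidWorks.

```python
from pathlib import Path


def to_windows_line_endings(data: bytes) -> bytes:
    """Return data with every line ending normalized to CRLF."""
    # Collapse existing CRLF and bare CR to LF first so nothing gets doubled.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", b"\r\n")


def convert_file(source: Path, target: Path) -> None:
    """Write a CRLF copy of source to target, leaving the original untouched."""
    target.write_bytes(to_windows_line_endings(source.read_bytes()))
```

Writing to a separate target file keeps the original intact in case the application expects its files in their shipped form.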

I did have a couple of questions come up during the installation and initial usage of the application, but I was able to get a quick response from the Lucid support team. They answered my questions promptly and seemed very knowledgeable about their product and its implementation.

Documentation

The Lucid site has a number of documentation resources available for installation and use of LucidWorks Enterprise. The documentation materials available from their site include:

  • LucidWorks Enterprise User Guide
  • Complimentary Certified Distribution Reference Guide book for Solr
  • Wiki documentation on the LucidWorks platform
  • Links to a variety of other Solr-related documentation and books
  • Built-in help system within the user interface
  • Whitepapers on implementing Solr and LucidWorks Enterprise

Lucidworks Reference Guide

While I did not read everything available, I did skim through most of the book, wiki, and online help system. The materials in those areas are predominantly geared towards Linux installations, so if you are planning a full implementation of LucidWorks Enterprise, Linux is probably the OS you want to target for your production platform.

The downloadable Certified Solr Reference Guide is a very comprehensive book, but be aware that a number of the screenshots in the book are from the Solr Admin console and not from the LucidWorks Enterprise front end, or at least not from the Windows version. Most of the differences are minor and easily understood; you just need to be aware of them. Many of the examples in the book refer to the Solr wiki or other sites for more details. The level of detail on those other sites can vary greatly, so plan on spending time doing research when getting deeper into the product.

The one area that could probably use more documentation is the API examples. There are examples for .NET, Perl, and Python clients to access the APIs. Yet for Java and JavaScript, which are considered the primary languages, the Lucid documentation only refers the reader to the Solr site for that API information. After spending a bit of time going through the Solr documentation, I think LucidWorks would benefit from providing its own examples of how to interface with LucidWorks and Solr. The examples on the Solr site are very much geared towards developers with experience in the Solr platform, so if you are going to be doing the development work, count on spending some time learning.

Using LucidWorks Enterprise

I decided to try indexing some local files on my file server. I chose a section that had a few hundred documents on a large project and set LucidWorks to indexing them. The process for creating the index source was easy to follow and allowed for quick selection of the folder that contained the documents. Once the indexing started, it took less than a minute to index the Word and PDF documents in the folders. With thousands of documents, this process could take some time. LucidWorks Enterprise can be set up to scale to a large network and very large volumes of documents; it’s just a matter of how much horsepower you need based on the volume of indexing to be performed.

Creating a data source


After the indexing was complete, I tried searching the documents. This was a very pleasant surprise: it searched the documents for any set of words given and returned a list of relevant documents very quickly. In addition to the list, a sidebar gives quick links to documents by author, source, keywords, and document type. The automated suggestion mechanism is one of the big features of LucidWorks Enterprise over the generic Solr installation.

Lucidworks Search Results

LucidWorks Enterprise has a significant set of enhancements for providing the user with a better search experience.  The documentation has a large portion dedicated to covering how LucidWorks Enterprise can be tuned to provide more accurate and relevant searches for the users.  The list of options available for customizing the searches warrants a series of articles all to themselves, so I will not go into depth here.

Lucidworks Query Summary


In addition to advanced search capabilities, LucidWorks Enterprise can be set up to send alerts to users when indexes change. This functionality makes the system very useful to organizations that have frequently changing information sources and that need to let their users know about those changes.

One area that should be noted on indexing: if you set up a web site to be indexed and the site links to other sites, it is possible for the indexing to run amok. I selected a site and set the number of levels to crawl, and it happened that one of the levels linked to a news site. The indexing just kept going and going. Fortunately, I was able to find out how to stop it from a reference in one of the pieces of documentation.


To stop an indexing operation once it starts, you have to go to the source in the administration interface and uncheck that source’s “Activate” option. That will stop the indexing for that source at that point. It is not an obvious option, but it is good to know.
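The runaway crawl above shows why a depth limit alone may not be enough once a page links to a large external site. As a purely illustrative sketch (this is not how LucidWorks Enterprise is configured, and the function and parameter names are hypothetical), a crawler can stay bounded by combining a depth limit with a same-host check:

```python
from urllib.parse import urlparse


def in_crawl_scope(url: str, depth: int, start_url: str, max_depth: int) -> bool:
    """Decide whether a discovered link should be followed.

    A link is in scope only if it is within max_depth levels of the start
    page AND lives on the same host as the start page.
    """
    if depth > max_depth:
        return False
    # Compare hostnames so links to external sites are never followed.
    return urlparse(url).hostname == urlparse(start_url).hostname
```

With a rule like this, a link from the indexed site to a news site would be skipped at any depth, because its hostname differs from the starting site's.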

Another note on usage: if you want your users to be able to open files from the file system, you will have to modify a configuration file in LucidWorks. The LucidWorks default setting is to not allow downloading of files, and the documentation provides lots of disclaimers about letting that happen. But if your main users are internal staff and not external users, this option could be very useful when indexing internal documents.

Interfacing with other systems

LucidWorks Enterprise is designed to provide interfacing with other systems through its API set and the Solr API set. Most of the LucidWorks Enterprise API calls are in support of the enhanced functionality provided by LucidWorks. The actual search APIs are those from the Solr API set and are only documented in the Solr documentation.
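To give a sense of what a call to the Solr search API looks like, here is a minimal sketch that builds a query URL for Solr's standard select handler. The parameters q, rows, start, and wt are standard Solr query parameters. The base URL is an assumption: the host, port, and path where a LucidWorks Enterprise installation exposes Solr were not verified for this review, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def build_select_url(base_url: str, query: str, rows: int = 10, start: int = 0) -> str:
    """Build a URL for Solr's select handler.

    base_url is an assumed location of the Solr endpoint; adjust it to
    match the actual installation.
    """
    params = {
        "q": query,      # the search terms
        "rows": rows,    # number of results to return
        "start": start,  # offset of the first result, for paging
        "wt": "json",    # response writer: ask for JSON output
    }
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"
```

The resulting URL can be fetched with any HTTP client; paging through results is a matter of increasing start by rows on each request.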

Since the search interfacing only uses the Solr API set, I did not set up an interface environment to test the API calls. The LucidWorks API set does seem to cover all aspects of the frontend and has options that are not even in the user interface. For example, adding new users to the LucidWorks environment requires API calls to create or change users. An optional link to an LDAP authentication system can be used if your environment maintains its users in LDAP. The user interface does provide a mechanism to assign rights to the three main areas of the user interface, but not to create users.

Lucidworks User Administration

Pricing

LucidWorks Enterprise is geared towards large organizations with significant quantities of data to be indexed. The pricing structure for LucidWorks Enterprise is subscription-based, starting at free for testing or developing with the LucidWorks Enterprise software and progressing through two tiers of development support to a production subscription. The pricing is not geared for smaller organizations, as the annual production subscription is currently $64,000. LucidWorks does not currently have a lower-end offering for smaller organizations, but it is quite possible that they may see that as a market and provide a lower-cost option in the future.

Summary

Overall, LucidWorks Enterprise is a very good product that installs cleanly, performs very well, and is reasonably well documented.  Their support team is very responsive and even passed on some of the suggestions I sent to their development and documentation teams.

For organizations that have a significant amount of data from various sources to distribute on a regular basis to their employees or customers, LucidWorks Enterprise provides a solid and relatively painless way to accelerate the move to the Solr platform.

Published at DZone with permission of its author, Ed Rought.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)