Brian has 10+ years of experience as a technology leader and architect in a wide variety of settings from early startups to Fortune 500 companies. With experience delivering SaaS solutions in business intelligence, artificial intelligence and VoIP, his current focus is big data and analytics. Brian leads the Virgil project on Apache Extras, which is a services layer built on Cassandra that provides REST, Map/Reduce, Search and Distributed Processing capabilities. Brian is a DZone MVB and is not an employee of DZone.

Cassandra & Solr Integration in Virgil GUI

12.15.2011
Up front, I'd like to say this is still pretty raw. We'd love to get feedback and contributions.

That said, Virgil (a services layer and GUI on top of Cassandra) now has the ability to integrate SOLR and Cassandra. When you add and delete rows and columns via the REST interface, an index is updated in SOLR.

For more information check out:
http://code.google.com/a/apache-extras.org/p/virgil/wiki/solr

Let us know what we can do better.

From Google Code:

Introduction

Virgil integrates SOLR such that columns and rows added and deleted via the REST API are automatically indexed by SOLR. The indexing occurs outside of Cassandra.

Getting Started

Install SOLR

  1. Download SOLR
  2. Unzip/Untar the install
  3. Start the default server with:
       cd $SOLR_HOME/example
       java -jar start.jar
     This will start a SOLR instance with the default schema.

Configure Virgil

  1. Open the virgil.yaml in src/main/resources/ and confirm that the hostname is correct and indexing is enabled:
       solr_host: http://localhost:8983/solr/
       enable_indexing: true
  2. Start Virgil with bin/virgil

Usage

Make sure you've followed the Getting Started steps to create a playground keyspace and a toys column family.

To use the indexing, add a query parameter onto your url. For adds, tack on "index=true". For deletes, tack on "purgeIndex=true".

You can see the contents of the SOLR index with the following URL. Initially, it should be empty: http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
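
If you'd rather build that URL in code than paste it, here is a minimal Python sketch. It assumes SOLR is on the default example port, as above. The function name solr_select_url is made up for illustration and is not part of Virgil or SOLR.

```python
from urllib.parse import urlencode

# Assumption: SOLR is running locally on the default example port.
SOLR_SELECT = "http://localhost:8983/solr/select/"


def solr_select_url(query="*:*", start=0, rows=10):
    """Build a select URL with the same parameters as the one in the article."""
    params = {
        "q": query,
        "version": "2.2",
        "start": start,
        "rows": rows,
        "indent": "on",
    }
    # urlencode percent-encodes the query, so "*:*" is sent safely.
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    print(solr_select_url())
```

Passing a different query string (for example a field search) only changes the q parameter; the rest of the URL stays the same.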

Then, after each of the following steps, you should be able to refresh and see the index updated.

Insert Row (rowkey = "swingset", columns foo:1, bar:37 )

curl -X PUT http://localhost:8080/virgil/data/playground/toys/swingset?index=true -d "{\"foo\":\"1\",\"bar\":\"37\"}"

Insert Column (rowkey = "swingset", columns snaf:lisa )

curl -X PUT http://localhost:8080/virgil/data/playground/toys/swingset/snaf?index=true -d "lisa"

Delete Column (rowkey = "swingset", column snaf )

curl -X DELETE http://localhost:8080/virgil/data/playground/toys/swingset/snaf?purgeIndex=true

Delete Row (rowkey = "swingset")

curl -X DELETE http://localhost:8080/virgil/data/playground/toys/swingset?purgeIndex=true
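
The same four calls can be scripted. The sketch below uses only the Python standard library and builds the requests without sending them. It assumes Virgil is on localhost:8080 as in the curl commands, and virgil_request is a hypothetical helper name, not something Virgil ships.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Assumption: Virgil is listening on localhost:8080, as in the curl examples.
VIRGIL_DATA = "http://localhost:8080/virgil/data"


def virgil_request(method, keyspace, column_family, row_key,
                   column=None, body=None):
    """Build (but do not send) a Virgil REST request with the index flag set."""
    parts = [keyspace, column_family, row_key]
    if column is not None:
        parts.append(column)
    path = "/".join(quote(p, safe="") for p in parts)
    # Adds are indexed with index=true; deletes purge with purgeIndex=true.
    flag = "purgeIndex=true" if method == "DELETE" else "index=true"
    data = body.encode("utf-8") if body is not None else None
    return Request(f"{VIRGIL_DATA}/{path}?{flag}", data=data, method=method)


# The four steps from the article, in order.
steps = [
    virgil_request("PUT", "playground", "toys", "swingset",
                   body=json.dumps({"foo": "1", "bar": "37"})),
    virgil_request("PUT", "playground", "toys", "swingset",
                   column="snaf", body="lisa"),
    virgil_request("DELETE", "playground", "toys", "swingset", column="snaf"),
    virgil_request("DELETE", "playground", "toys", "swingset"),
]

if __name__ == "__main__":
    for req in steps:
        print(req.get_method(), req.full_url)
        # To actually send a request: urllib.request.urlopen(req)
```

Keeping the build step separate from the send step makes it easy to print and check each URL before anything touches Cassandra or SOLR.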

Using the Index

To achieve the integration, Virgil uses the dynamic fields capability of SOLR. If you run the curl commands above up through Delete Column, stopping before Delete Row, you'll end up with the following document in SOLR:

<result name="response" numFound="1" start="0">
   <doc>
      <str name="bar_t">37</str>
      <str name="columnFamily_t">toys</str>
      <str name="foo_t">1</str>
      <str name="id">toys.swingset</str>
      <str name="rowKey_t">swingset</str>
   </doc>
</result>

You'll notice that each row becomes a document in SOLR, and each column becomes a field. Additionally, we include the column family and rowkey as fields, and we combine the two to generate a unique id for the document. Since we are using dynamic fields, we tack on a "_t" to each field name, which allows SOLR to recognize it as text. Based on that, you should be able to search now in SOLR using your column names + "_t".
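
To make the mapping concrete, here is a small Python sketch of the row-to-document layout described above. It only illustrates the naming scheme. It is not Virgil's actual code, and row_to_solr_doc is a made-up name.

```python
def row_to_solr_doc(column_family, row_key, columns):
    """Map one Cassandra row to the SOLR document layout described above."""
    doc = {
        # The unique id combines column family and row key.
        "id": f"{column_family}.{row_key}",
        "columnFamily_t": column_family,
        "rowKey_t": row_key,
    }
    for name, value in columns.items():
        # The text suffix lets the dynamic field rule pick up each column.
        doc[f"{name}_t"] = value
    return doc


if __name__ == "__main__":
    print(row_to_solr_doc("toys", "swingset", {"foo": "1", "bar": "37"}))
```

Run against the swingset row, this yields the same field names as the document shown above: id, columnFamily_t, rowKey_t, foo_t and bar_t.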

Give it a whirl from the admin console: http://localhost:8983/solr/admin/

Try plugging in the following as search criteria.

bar_t:37
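
If you want to consume the search results from code rather than the admin console, the XML can be read with the Python standard library. This is a sketch that assumes the response looks like the sample document shown earlier, with single-valued elements only. parse_docs is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Sample copied from the document shown earlier in the article.
SAMPLE = """<result name="response" numFound="1" start="0">
   <doc>
      <str name="bar_t">37</str>
      <str name="columnFamily_t">toys</str>
      <str name="foo_t">1</str>
      <str name="id">toys.swingset</str>
      <str name="rowKey_t">swingset</str>
   </doc>
</result>"""


def parse_docs(xml_text):
    """Return each <doc> as a dict of field name to text value."""
    root = ET.fromstring(xml_text)
    # <result> may be the root (as in the sample) or a child of a wrapper element.
    result = root if root.tag == "result" else root.find("result")
    if result is None:
        return []
    # Assumption: every field is a single-valued element with a name attribute.
    return [{field.get("name"): field.text for field in doc}
            for doc in result.findall("doc")]


if __name__ == "__main__":
    for doc in parse_docs(SAMPLE):
        print(doc["id"], doc)
```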


IMPORTANT NOTE

Since there is no way to add and delete fields of a document cleanly within SOLR, there is a penalty for deleting or adding a single column. Thus, it is best to do row inserts and deletions when you can.
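
One way to picture the penalty is the toy model below. It assumes that a single-column change forces the whole document to be rebuilt and replaced. That is an assumption for illustration and is not confirmed by the source. Everything here is an in-memory stand-in: rows, index and the function names are hypothetical, not Virgil or SOLR APIs.

```python
# In-memory stand-ins (hypothetical): a row store and a document index.
rows = {}    # (column_family, row_key) -> {column: value}
index = {}   # document id -> full document


def reindex_row(cf, key):
    """Rebuild and replace the whole document; this model has no per-field update."""
    doc_id = f"{cf}.{key}"
    columns = rows.get((cf, key))
    if not columns:
        index.pop(doc_id, None)  # nothing left to index for this row
        return
    doc = {"id": doc_id, "columnFamily_t": cf, "rowKey_t": key}
    doc.update({f"{name}_t": value for name, value in columns.items()})
    index[doc_id] = doc


def put_column(cf, key, column, value):
    rows.setdefault((cf, key), {})[column] = value
    reindex_row(cf, key)  # the full row is read back and re-sent for one column


def delete_column(cf, key, column):
    rows.get((cf, key), {}).pop(column, None)
    reindex_row(cf, key)  # same full rebuild on a single-column delete


if __name__ == "__main__":
    put_column("toys", "swingset", "foo", "1")
    put_column("toys", "swingset", "snaf", "lisa")
    delete_column("toys", "swingset", "snaf")
    print(index)
```

In this model every column-level call costs a full row read plus a full document write, which is why batching changes into row-level inserts and deletes is cheaper.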


Blog Source:  http://brianoneill.blogspot.com/2011/11/cassandra-integration-w-solr-using.html


Published at DZone with permission of Brian O' Neill, author and DZone MVB.
