Connecting Redis to Solr For Boosting Documents
There are a number of cases in Solr where it's preferable to retrieve data from an external datastore for boosting purposes, instead of trying to contort Solr with multiple queries, joins, etc.
Here's a trivial example:
Jobs are stored as documents in Solr. Users of the application can rank a job from 1-10. We need to boost each job with the user's rank if it exists.
Trying to model this fully in Solr would be fairly inefficient, especially for a large number of jobs and/or users, since every time a user ranks a job, the searcher has to be reloaded before that data becomes available for searching.
A much more efficient approach is to store the rank data in a NoSQL store like Redis, retrieve the rank at query time, and use it to boost the documents accordingly.
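To make the data layout concrete, here's a rough sketch that uses an in-memory map as a stand-in for Redis hashes: one hash per user, with the job id as the field and the rank as the value. The class name RankStore, the key "ranks:user:42" and the job ids are all made up for illustration; a real setup would issue HSET/HGET through a Redis client instead.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for Redis hashes (hypothetical helper, not a Redis client).
final class RankStore {
    // hash key -> (field -> value), mirroring the HSET/HGET data model
    private final Map<String, Map<String, String>> hashes = new HashMap<>();

    // Same idea as: HSET key field value
    void hset(String key, String field, String value) {
        hashes.computeIfAbsent(key, k -> new HashMap<>()).put(field, value);
    }

    // Same idea as: HGET key field (returns null when the key or field is absent)
    String hget(String key, String field) {
        Map<String, String> hash = hashes.get(key);
        return hash == null ? null : hash.get(field);
    }

    public static void main(String[] args) {
        RankStore store = new RankStore();
        // User 42 ranks job-1001 as 7; nothing in the index has to change.
        store.hset("ranks:user:42", "job-1001", "7");
        System.out.println(store.hget("ranks:user:42", "job-1001"));
        System.out.println(store.hget("ranks:user:42", "job-2002"));
    }
}
```

The point of the layout is that recording a rank is a single write to the store, and an unranked job simply has no entry.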
This can be accomplished using a custom FunctionQuery. I've blogged about how to create custom function queries in Solr before, so this is simply an application of the subject.
Here's the code:
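// Imports assume the Solr/Lucene 3.x package layout.
import org.apache.lucene.queryParser.ParseException;
import org.apache.solr.search.FunctionQParser;
import org.apache.solr.search.ValueSourceParser;
import org.apache.solr.search.function.ValueSource;

public class RedisValueSourceParser extends ValueSourceParser {
    @Override
    public ValueSource parse(FunctionQParser fp) throws ParseException {
        // First argument selects the Redis data type: z (sortedset) or h (hash).
        String dataType = fp.parseArg();
        if (!dataType.equalsIgnoreCase("z") && !dataType.equalsIgnoreCase("h")) {
            throw new ParseException("Expecting first arg to be either z (sortedset) or h (hash)");
        }
        String redisKey = fp.parseArg(); // key of the Redis collection
        String field = fp.parseArg();    // Solr field used as the id for lookups
        return new RedisValueSource(dataType, redisKey, field);
    }
}

This FunctionQuery accepts 3 arguments: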
1. dataType, either a Redis sortedset or hash
2. the key to the Redis collection
3. the field to use as an id field
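If you want to play with the argument handling outside of Solr, here's a standalone sketch of the same validation. It is not how Solr parses function queries (FunctionQParser does that); the class RedisFunctionArgs and its naive comma splitting are assumptions made purely for illustration.

```java
import java.util.Locale;

// Hypothetical standalone mirror of the three-argument validation.
final class RedisFunctionArgs {
    final String dataType; // "z" (sortedset) or "h" (hash)
    final String redisKey; // key of the Redis collection
    final String idField;  // Solr field used as the id

    private RedisFunctionArgs(String dataType, String redisKey, String idField) {
        this.dataType = dataType;
        this.redisKey = redisKey;
        this.idField = idField;
    }

    // Parses a string such as "redis(h,bar,id)". Naive: no quoting or nesting support.
    static RedisFunctionArgs parse(String expr) {
        String s = expr.trim();
        if (!s.startsWith("redis(") || !s.endsWith(")")) {
            throw new IllegalArgumentException("Expecting redis(type,key,field)");
        }
        String[] parts = s.substring("redis(".length(), s.length() - 1).split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expecting exactly 3 arguments");
        }
        String type = parts[0].trim().toLowerCase(Locale.ROOT);
        if (!type.equals("z") && !type.equals("h")) {
            throw new IllegalArgumentException(
                "Expecting first arg to be either z (sortedset) or h (hash)");
        }
        return new RedisFunctionArgs(type, parts[1].trim(), parts[2].trim());
    }

    public static void main(String[] args) {
        RedisFunctionArgs a = parse("redis(h,bar,id)");
        System.out.println(a.dataType + " " + a.redisKey + " " + a.idField);
    }
}
```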
Here's what the salient part of RedisValueSource looks like:
@Override
public DocValues getValues(Map context, IndexReader reader) throws IOException {
    // Values of the id field for every document in this reader, from the field cache.
    final String[] lookup = FieldCache.DEFAULT.getStrings(reader, field);
    final Jedis jedis = new Jedis("localhost");
    return new DocValues() {
        @Override
        public String strVal(int doc) {
            final String id = lookup[doc];
            if (redisDataType.equalsIgnoreCase("h")) {
                return jedis.hget(redisKey, id); // null when the hash has no such field
            }
            // zscore returns null for a missing member, so don't unbox it blindly.
            final Double score = jedis.zscore(redisKey, id);
            return score == null ? null : score.toString();
        }

        @Override
        public String toString(int doc) {
            return strVal(doc);
        }
    };
}
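The snippet above makes one Redis call per document and only shows the string value. As a sketch of two things you'd likely want around it, here's a small helper that turns the raw string into a float with a fallback for missing or malformed entries, and remembers ids it has already looked up. The class CachedBoostLookup is hypothetical, and the Function passed in is just a stand-in for a call like jedis.hget(redisKey, id).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: null-safe numeric conversion plus a simple per-instance cache.
final class CachedBoostLookup {
    private final Function<String, String> fetch; // stand-in for the Redis lookup
    private final float defaultBoost;             // used when no usable value exists
    private final Map<String, Float> cache = new HashMap<>();
    int fetchCount = 0;                           // exposed only to show the cache working

    CachedBoostLookup(Function<String, String> fetch, float defaultBoost) {
        this.fetch = fetch;
        this.defaultBoost = defaultBoost;
    }

    float floatVal(String id) {
        Float cached = cache.get(id);
        if (cached != null) {
            return cached;
        }
        fetchCount++;
        float value = parseOrDefault(fetch.apply(id));
        cache.put(id, value);
        return value;
    }

    private float parseOrDefault(String raw) {
        if (raw == null) {
            return defaultBoost; // id has no entry in the store
        }
        try {
            return Float.parseFloat(raw.trim());
        } catch (NumberFormatException e) {
            return defaultBoost; // stored value is not numeric
        }
    }

    public static void main(String[] args) {
        Map<String, String> fakeRedis = new HashMap<>();
        fakeRedis.put("doc1", "4.0");
        CachedBoostLookup lookup = new CachedBoostLookup(fakeRedis::get, 0.0f);
        System.out.println(lookup.floatVal("doc1"));
        System.out.println(lookup.floatVal("missing"));
    }
}
```

Whether a cache like this is appropriate depends on how fresh the ranks need to be; it trades round trips for staleness within the lifetime of the instance.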
From here, you can use the following Solr query to perform boosting based on the Redis value:
http://localhost:8983/solr/select?defType=edismax&q=cat:electronics&bf=redis(h,bar,id)&debugQuery=on
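If you build this URL in application code, the bf value needs URL-encoding since it contains parentheses and commas. Here's a minimal sketch using only the JDK; the class BoostQueryUrl and its parameters are hypothetical, and it always uses the hash (h) variant.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that assembles the select URL with an encoded bf parameter.
final class BoostQueryUrl {
    static String build(String baseUrl, String query, String redisKey, String idField) {
        // The boost function itself, e.g. redis(h,bar,id)
        String bf = "redis(h," + redisKey + "," + idField + ")";
        return baseUrl
            + "?defType=edismax"
            + "&q=" + enc(query)
            + "&bf=" + enc(bf);
    }

    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(
            build("http://localhost:8983/solr/select", "cat:electronics", "bar", "id"));
    }
}
```

This does nothing to check the key itself, so a Redis key containing commas or parentheses would still confuse the function arguments.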
The explain output looks like this:
3.4664698 = (MATCH) sum of:
  1.070082 = (MATCH) weight(cat:electronics in 2), product of:
    0.80067647 = queryWeight(cat:electronics), product of:
      1.3364723 = idf(docFreq=14, maxDocs=21)
      0.59909695 = queryNorm
    1.3364723 = (MATCH) fieldWeight(cat:electronics in 2), product of:
      1.0 = tf(termFreq(cat:electronics)=1)
      1.3364723 = idf(docFreq=14, maxDocs=21)
      1.0 = fieldNorm(field=cat, doc=2)
  2.3963878 = (MATCH) FunctionQuery(redis(h,bar,id)), product of:
    4.0 = 4.0
    1.0 = boost
    0.59909695 = queryNorm
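To see where the Redis value ends up in the final score, here's the arithmetic from that explain output reproduced as a small sketch. The class ExplainArithmetic is hypothetical and just multiplies and adds the numbers shown; the 4.0 is the value fetched from Redis for this document.

```java
// Hypothetical helper that recomputes the explain output's numbers.
final class ExplainArithmetic {
    // queryWeight = idf * queryNorm
    static float queryWeight(float idf, float queryNorm) {
        return idf * queryNorm;
    }

    // fieldWeight = tf * idf * fieldNorm
    static float fieldWeight(float tf, float idf, float fieldNorm) {
        return tf * idf * fieldNorm;
    }

    // FunctionQuery contribution = function value * boost * queryNorm
    static float functionScore(float value, float boost, float queryNorm) {
        return value * boost * queryNorm;
    }

    // Final score = term weight (queryWeight * fieldWeight) + function contribution
    static float total(float idf, float tf, float fieldNorm,
                       float redisValue, float boost, float queryNorm) {
        float termScore = queryWeight(idf, queryNorm) * fieldWeight(tf, idf, fieldNorm);
        return termScore + functionScore(redisValue, boost, queryNorm);
    }

    public static void main(String[] args) {
        float idf = 1.3364723f;
        float queryNorm = 0.59909695f;
        System.out.println(functionScore(4.0f, 1.0f, queryNorm));
        System.out.println(total(idf, 1.0f, 1.0f, 4.0f, 1.0f, queryNorm));
    }
}
```

In other words, the Redis value is scaled by queryNorm before it is added, so a rank of 4.0 contributes about 2.4 to the score here rather than 4.0.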

Comments
David Smiley replied on Fri, 2012/07/06 - 12:26pm
How does this perform?
The standard Solr solution involves ExternalFileField, but that is only half of a full implementation since you have to generate the file and trigger a commit. Commits needn't happen with every change, just spaced out enough to meet requirements.