Rafał Kuć is a team leader and software developer, currently working as a software architect and Solr and Lucene specialist. He is mainly focused on Java, but open to any tool or programming language that helps him reach his goals more easily and quickly. Rafał is also one of the founders of the solr.pl site, where he shares his knowledge and helps people with their problems. Rafał is a DZone MVB, is not an employee of DZone, and has posted 75 posts at DZone. You can read more from him at his website.

Use of cache=false and Cost Parameters

03.10.2012

Since the release of Solr 3.4, users have had a nice feature that controls whether the results of a query or filter query are placed in the cache. In addition, we gained the ability to set the cost of a filter query. Let’s see how to use these features.

Parameter cache=false

By setting the cache parameter to false, we tell Solr not to cache the results of the current query. The parameter can also be used as an attribute of a filter query (fq), which tells Solr not to cache that filter’s results. What do we gain from this behavior? Let’s imagine the following filter as part of a query:

fq={!frange l=10 u=100}log(sum(sqrt(popularity),100))

If we know that queries with a filter like the one above are rare, we can decide not to cache them, so that the cache is not filled with data that is unlikely to be reused. To do that, we add the cache=false attribute in the following way:

fq={!frange l=10 u=100 cache=false}log(sum(sqrt(popularity),100))

As mentioned, adding this attribute means the filter results will not be cached.
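A non-cached filter like the one above can also be assembled programmatically before it is sent to Solr. The following is only a minimal sketch in Python using the standard library: the helper names frange_filter and select_query_string are hypothetical, and the main query q=*:* is an assumption made for illustration. It shows how the local parameters and the URL encoding fit together; it is not taken from any Solr client library.

```python
from urllib.parse import urlencode


def frange_filter(func, lower, upper, cache=True, cost=None):
    """Build an fq value for the frange parser (hypothetical helper).

    cache=False adds the cache=false local parameter; cost is only
    added when a value is given.
    """
    local_params = [f"l={lower}", f"u={upper}"]
    if not cache:
        local_params.append("cache=false")
    if cost is not None:
        local_params.append(f"cost={cost}")
    return "{!frange " + " ".join(local_params) + "}" + func


def select_query_string(q, filters):
    """URL-encode a main query plus any number of fq parameters."""
    pairs = [("q", q)] + [("fq", fq) for fq in filters]
    return urlencode(pairs)


# The filter from the article, marked as non-cached.
fq = frange_filter("log(sum(sqrt(popularity),100))", 10, 100, cache=False)

# q=*:* is an assumed main query, used only for illustration.
query_string = select_query_string("*:*", [fq])

print(fq)
print(query_string)
```

The resulting query string can be appended to the URL of whichever search handler your Solr instance exposes. Encoding matters here because the filter contains braces, spaces and parentheses.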

Parameter cost

Another feature introduced in Solr 3.4 is the ability to set a cost for filters that we don’t want to cache. Filter queries with a specified cost are executed last, after all the cached filters. The cost attribute is specified as an integer value. Let’s look at the following example filters:


fq=cat:video&fq={!cache=false cost=50}productGroup:12&fq={!frange l=10 u=100 cache=false cost=150}log(sum(sqrt(popularity),100))

The first filter to execute will be fq=cat:video, because it is cached. The next one to be evaluated will be the non-cached filter with the lower cost value, fq={!cache=false cost=50}productGroup:12. The last filter to be evaluated will be the most expensive one, the frange filter with cost=150. In addition, that last filter will operate only on documents that match the main query and all previous filters, because its cost is 100 or higher and the frange parser supports this kind of post-filtering.

Remember that the cost attribute works only when the filter query is not cached.
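The ordering rules described in this section can be illustrated with a small Python model. This is only a simplified sketch of the behavior explained above, not Solr’s actual implementation: the Filter class and the evaluation_plan function are hypothetical names, and the sketch assumes that a non-cached filter becomes a post-filter candidate at a cost of 100 or more (in Solr this additionally requires a query type that supports post-filtering, such as frange).

```python
from dataclasses import dataclass


@dataclass
class Filter:
    """Hypothetical description of one fq parameter."""
    query: str
    cache: bool = True
    cost: int = 0


def evaluation_plan(filters):
    """Return (query, stage) pairs in the order the article describes.

    Cached filters come first and their cost is ignored; non-cached
    filters follow, ordered by ascending cost.
    """
    cached = [f for f in filters if f.cache]
    uncached = sorted((f for f in filters if not f.cache), key=lambda f: f.cost)

    plan = [(f.query, "cached") for f in cached]
    for f in uncached:
        # Assumed threshold: cost of 100 or more marks a post-filter candidate.
        stage = "post-filter candidate" if f.cost >= 100 else "non-cached"
        plan.append((f.query, stage))
    return plan


# The three filters from the example, deliberately listed out of order.
filters = [
    Filter("{!frange l=10 u=100}log(sum(sqrt(popularity),100))", cache=False, cost=150),
    Filter("productGroup:12", cache=False, cost=50),
    Filter("cat:video"),
]

for query, stage in evaluation_plan(filters):
    print(stage, "->", query)
```

Whatever order the filters are listed in, the model places cat:video first, productGroup:12 second and the frange filter last, matching the walkthrough above.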

To sum up

With the cache and cost attributes we can control what is placed in the Solr cache, which is very useful in most situations where we know what queries are sent to our Solr instances. What’s more, these attributes can improve query performance for queries that have filters with a cost of 100 or higher. I think it’s worth taking a moment to look at your queries and consider whether you really need to cache all of them :)

Published at DZone with permission of Rafał Kuć, author and DZone MVB. (source)
