Mitch Pronschinske is a Senior Content Analyst at DZone. That means he writes and searches for the finest developer content in the land so that you don't have to. He often eats peanut butter and bananas, likes to make his own ringtones, enjoys card and board games, and is married to an underwear model. Mitch is a DZone Zone Leader and has posted 2573 posts at DZone.

Fixing Solr Java Heap OOM with “OmitNorms=true”

10.27.2011
I found another post today from the blogger who decided to clone Wikipedia and index it with Solr. This time he's got a short commentary that can serve as useful advice to you search indexers out there.

Solr has been running out of heap memory while trying to add a *small* number of documents to my 11,000,000 document Wikipedia index. So, diving bravely into the world of Java heap memory …

Because I am indexing diverse types of data (Wikipedia and Nutch to begin with), I have a lot of fields: I count 31 that do not set omitNorms, which is false by default.

11,000,000 documents * 1 byte * 31 fields = 341,000,000 bytes, or roughly 340MB of RAM all by itself. So time to fire up the schema editor!  -- Fred Zimmerman
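
The back-of-the-envelope estimate above can be reproduced in a few lines of Python. This is only a rough sketch: it assumes norms cost one byte per document per field that has them enabled (which is what the "* 1" in the calculation implies), and the function name `norms_heap_bytes` is made up for illustration, not part of Solr. It uses the exact document count of 11,000,000.

```python
def norms_heap_bytes(num_docs, fields_with_norms, bytes_per_norm=1):
    """Rough estimate of heap used by norms.

    Assumption: one norm value of `bytes_per_norm` bytes is held
    per document for every field that does not set omitNorms=true.
    """
    return num_docs * fields_with_norms * bytes_per_norm


num_docs = 11_000_000      # documents in the Wikipedia index
fields_with_norms = 31     # fields left at the default omitNorms=false

total = norms_heap_bytes(num_docs, fields_with_norms)
print(f"all 31 fields keep norms: {total:,} bytes (~{total / 1_000_000:.0f} MB)")

# Hypothetical scenario: omitNorms=true on all but 3 of those fields.
remaining = norms_heap_bytes(num_docs, 3)
print(f"only 3 fields keep norms: {remaining:,} bytes (~{remaining / 1_000_000:.0f} MB)")
```

The figure of 3 remaining fields is purely illustrative. The point is that the cost scales linearly with the number of fields that keep norms, so every field switched to omitNorms="true" in the schema saves about 11 MB on an index of this size under the one-byte assumption.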


I would also check out Fred's posts on Indexing Nutch Crawls and DataImportHandler Commands.

Source:  http://business.zimzaz.com/wordpress/2011/10/fixing-solr-heap-oom-with-omitnormstrue/