Mitch Pronschinske is the Lead Research Analyst at DZone. Researching and compiling content for DZone's research guides is his primary job. He likes to make his own ringtones, watches cartoons/anime, enjoys card and board games, and plays the accordion. Mitch is a DZone Zone Leader and has posted 2576 posts at DZone.

Fixing Solr Java Heap OOM with “OmitNorms=true”

10.27.2011
I found another post today from the blogger who decided to clone Wikipedia and index it with Solr. This time he has a short commentary that can serve as useful advice for you search indexers out there.

Solr has been running out of heap memory while trying to add a *small* number of documents to my 11,000,000 document Wikipedia index. So, diving bravely into the world of Java heap memory …

Because I am indexing diverse types of data (Wikipedia and Nutch to begin with), I have a lot of fields: I count 31 without omitNorms values, which is false by default.

11,000,000 * 1 * 31 ≈ 31 x 10M = 310MB RAM all by itself. So time to fire up the schema editor! -- Fred Zimmerman
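
The estimate above can be reproduced with a short script. This is only a rough sketch: it assumes one byte of norm data per document per field (the "1" in the quoted arithmetic), treats a missing omitNorms attribute as false, and ignores any omitNorms setting inherited from a field type. The schema fragment and its field names are hypothetical, made up for illustration. With the full 11,000,000 documents the result is 341,000,000 bytes; the 310MB figure comes from rounding the document count down to 10M.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema fragment; field names are illustrative only
# and do not come from the original post.
SCHEMA = """
<schema name="example" version="1.4">
  <fields>
    <field name="id" type="string" indexed="true" stored="true" omitNorms="true"/>
    <field name="title" type="text" indexed="true" stored="true"/>
    <field name="body" type="text" indexed="true" stored="false"/>
    <field name="url" type="string" indexed="true" stored="true" omitNorms="false"/>
  </fields>
</schema>
"""

# Assumption: one byte of norm data per document per field.
BYTES_PER_NORM = 1


def fields_with_norms(schema_xml):
    """Return the names of indexed fields that do not set omitNorms="true".

    Simplification: omitNorms declared on a field type is not considered.
    """
    root = ET.fromstring(schema_xml.strip())
    names = []
    for field in root.iter("field"):
        if field.get("indexed", "true").lower() != "true":
            continue  # unindexed fields carry no norms
        if field.get("omitNorms", "false").lower() != "true":
            names.append(field.get("name"))
    return names


def norms_heap_bytes(num_docs, num_fields):
    """Estimate heap used by norms: docs * bytes per norm * fields."""
    return num_docs * BYTES_PER_NORM * num_fields


if __name__ == "__main__":
    print(fields_with_norms(SCHEMA))
    # Figures from the post: 11,000,000 documents, 31 fields with norms.
    print(norms_heap_bytes(11_000_000, 31))
```

Running the field check against a real schema.xml gives the list of fields to revisit in the schema editor. Setting omitNorms="true" is mainly a candidate for fields that do not need length normalization or index-time boosts.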


I would also check out Fred's posts on Indexing Nutch Crawls and DataImportHandler Commands.

Source:  http://business.zimzaz.com/wordpress/2011/10/fixing-solr-heap-oom-with-omitnormstrue/