It's very interesting. Is there some way to use your FSM implementation
without Lucene itself? I'm interested because I am writing a
dictionary-based lemmatizer for Russian. Stemmers do not work well for
Russian, because it is a very complicated language with a very rich inflection
model. So I need a memory-efficient data structure that lets
me map char sequences to their ordinal lemma numbers. I think FST
