java - Why is using MappingCharFilter in a Lucene 4.1 analyzer breaking wildcard matches?
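I created a simple StripSpacesAndSeparatorsAnalyzer:

    import java.io.Reader;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.Tokenizer;
    import org.apache.lucene.analysis.charfilter.MappingCharFilter;
    import org.apache.lucene.analysis.charfilter.NormalizeCharMap;
    import org.apache.lucene.analysis.core.KeywordTokenizer;
    import org.apache.lucene.analysis.core.LowerCaseFilter;
    import org.apache.lucene.util.Version;

    public class StripSpacesAndSeparatorsAnalyzer extends Analyzer {

        protected NormalizeCharMap charConvertMap;

        // Map spaces and separator characters to the empty string.
        protected void setCharConvertMap() {
            NormalizeCharMap.Builder builder = new NormalizeCharMap.Builder();
            builder.add(" ", "");
            builder.add("-", "");
            builder.add("_", "");
            builder.add(":", "");
            charConvertMap = builder.build();
        }

        public StripSpacesAndSeparatorsAnalyzer() {
            setCharConvertMap();
        }

        // Keep the whole field value as a single token and lower-case it.
        @Override
        protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            Tokenizer source = new KeywordTokenizer(reader);
            // The Lucene 4.x LowerCaseFilter constructor takes a Version argument.
            TokenStream filter = new LowerCaseFilter(Version.LUCENE_41, source);
            return new TokenStreamComponents(source, filter);
        }

        // Strip the mapped characters before the tokenizer sees the text.
        @Override
        protected Reader initReader(String fieldName, Reader reader) {
            return new MappingCharFilter(charConvertMap, reader);
        }
    }

So it ignores characters such as hyphens in the field, and I can search for

    catno:wrathcd25
    catno:wrathcd-25

and get the same results, which works (the original value of the field added to the index was wrathcd-25).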
However, there is a problem with wildcard searching:
catno:wrathcd25* works, but
catno:wrathcd-25* does not.
If I amend the analyzer to comment out the initReader() method, then
catno:wrathcd-25* now works, but of course
catno:wrathcd25 no longer works.
What am I doing wrong, please?
Let me guess: you parse the query using the regular QueryParser, right?
Try using AnalyzingQueryParser; it should do the trick. From the Javadoc:
Overrides Lucene's default QueryParser so that Fuzzy-, Prefix-, Range-, and WildcardQuerys are also passed through the given analyzer, but wildcard characters (like *) don't get removed from the search terms.
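To illustrate why the hyphenated wildcard query misses, here is a minimal plain-Java sketch that models the behaviour without using Lucene at all. The class and method names (WildcardNormalizationSketch, analyze, prefixMatches) are made up for this illustration. It assumes the regular QueryParser at most lower-cases a wildcard term and does not run it through the analyzer, so the hyphen stays in the query prefix while the indexed term has had it stripped.

```java
import java.util.Locale;

public class WildcardNormalizationSketch {

    // Hypothetical stand-in for the analyzer in the question:
    // strip the mapped characters, then lower-case.
    static String analyze(String text) {
        return text.replace(" ", "")
                   .replace("-", "")
                   .replace("_", "")
                   .replace(":", "")
                   .toLowerCase(Locale.ROOT);
    }

    // Models a prefix query "rawPrefix*" against a single indexed term.
    // analyzePrefix == false: the prefix is only lower-cased (assumed
    // behaviour of the regular parser for wildcard terms).
    // analyzePrefix == true: the prefix goes through the same analysis
    // as the indexed value.
    static boolean prefixMatches(String indexedTerm, String rawPrefix, boolean analyzePrefix) {
        String prefix = analyzePrefix
                ? analyze(rawPrefix)
                : rawPrefix.toLowerCase(Locale.ROOT);
        return indexedTerm.startsWith(prefix);
    }

    public static void main(String[] args) {
        // The field value is analyzed at index time, so the hyphen is gone.
        String indexed = analyze("WRATHCD-25");

        System.out.println(prefixMatches(indexed, "wrathcd25", false));  // true
        System.out.println(prefixMatches(indexed, "wrathcd-25", false)); // false
        System.out.println(prefixMatches(indexed, "wrathcd-25", true));  // true
    }
}
```

The third call is the situation the quoted Javadoc describes: once the wildcard term is passed through the same analyzer, the prefix and the indexed term agree again.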