> Can the lexer/tokenizer even split a string like "皖DSYQOW"? Even if it can, how many tokens does a mixed letter-and-digit string like a license plate number produce? Is full-text search even applicable here?
I wish somebody working at Baidu or Google could jump in, but I think your understanding is correct. Each token will be the whole string "皖DSYQOW", or maybe just the Latin part "DSYQOW". In the latter case, a search for '皖DSYQOW' is probably a search for 'DSYQOW' first, followed by a filter on the result set for matches preceded by '皖'.
If we don't consider text indexes, I can think of one strategy: manual "indexing" of your license plate numbers (车牌号). In the column that contains the license plate number plus other text (i.e. your column hphm), you only want to search for the license plates, not any other text. Correct? I mean, if the column has the value 'some text 皖DSYQOW some other text', you always search for '皖DSYQOW' and don't care about the text before and after it. Correct? If so, create a separate column that contains *only* the license plate number '皖DSYQOW', and index that column. Then your query becomes "... WHERE license_plate = '皖DSYQOW'". Your select-list can still include the free-text column hphm; just don't index hphm and don't run fuzzy searches against it.
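A minimal sketch of that strategy, using SQLite via Python for illustration (the table name `incidents` and column `license_plate` are assumptions; adapt them to your actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE incidents (
        id            INTEGER PRIMARY KEY,
        hphm          TEXT,   -- free text that merely mentions the plate
        license_plate TEXT    -- the extracted plate, and nothing else
    )
""")
# Index only the extracted column; hphm stays unindexed.
conn.execute("CREATE INDEX idx_plate ON incidents(license_plate)")

conn.execute(
    "INSERT INTO incidents (hphm, license_plate) VALUES (?, ?)",
    ("some text 皖DSYQOW some other text", "皖DSYQOW"),
)

# Exact equality on the indexed column; the free-text column can
# still appear in the select-list.
row = conn.execute(
    "SELECT hphm FROM incidents WHERE license_plate = ?",
    ("皖DSYQOW",),
).fetchone()
print(row[0])  # -> some text 皖DSYQOW some other text
```

The point is that the lookup never touches the free text at all: the application extracts the plate once at insert time, and every search afterwards is a plain B-tree equality match.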
This does, of course, abandon the requirement that your search be fuzzy. For example, if your free-text column hphm has the value '皖DSYQOW crashed into 皖ABCDEF on a Sunday morning', and you want to find that row with either '皖DSYQOW' or '皖ABCDEF' in the where-clause, what I suggested won't work.