Search resource list
word_split
- A word segmentation program based on reverse maximum matching. The corpus is relatively small.
Dictory
- Implements a Chinese dictionary and word segmenter. It uses a B-tree as the storage and lookup structure, hashes Chinese words when storing or searching, and splits Chinese sentences into words with the longest reverse matching algorithm. The dictionary holds more than 108,000 words, and segmentation accuracy is above 90%.
mongolian
- A Unicode-encoded Mongolian corpus that can be used to inspect Mongolian text encoding and for follow-up research. The corpus is relatively small.
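
The reverse maximum matching approach named in the word_split and Dictory entries can be sketched roughly as follows. This is a minimal illustration and not the code from either package: the function name `reverse_max_match`, the `max_len` parameter, and the toy dictionary are all assumptions, and a plain Python set stands in for the B-tree and hashing that Dictory describes.

```python
def reverse_max_match(sentence, dictionary, max_len=4):
    """Segment `sentence` by scanning from right to left and taking
    the longest dictionary word that ends at the current position."""
    words = []
    end = len(sentence)
    while end > 0:
        # Try the longest candidate first, then shrink it from the left.
        for size in range(min(max_len, end), 0, -1):
            candidate = sentence[end - size:end]
            # A single character is always accepted as a fallback,
            # so the scan is guaranteed to make progress.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                end -= size
                break
    # Words were collected right to left, so restore reading order.
    words.reverse()
    return words


# Hypothetical toy dictionary; a real one would be far larger.
toy_dictionary = {"研究", "研究生", "生命", "起源"}
print(reverse_max_match("研究生命的起源", toy_dictionary))
# → ['研究', '生命', '的', '起源']
```

On this sentence a left-to-right scan would first take 研究生 and leave 命 stranded, which is one reason the reverse direction is often preferred for Chinese.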
