Search results: resource list
segment
- segment: a simple Chinese word segmentation program. Command line: java -jar segmenter.jar [-b|-g|-8|-s|-t] inputfile.txt, where -b selects Big5, -g GB2312, -8 UTF-8, -s simplified characters, and -t traditional characters. The segmented text is saved to inputfile.txt.seg
C99
- An algorithm for domain-independent linear text segmentation. This is the Windows version of the C99 algorithm that was presented in my NAACL00 paper. [Directories] bin: contains executables, JAR file and test files; classes: compiled code as
KSeg4J.1.0
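- A mechanical (dictionary-based) word segmentation module for Simplified Chinese. It uses forward and backward maximum matching to resolve ambiguity, and is packaged as a JAR that can be imported and used directly.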
ReutersClassification
- Classifies the text of the Reuters-21578 dataset by calling the preprocessing and classification methods in weka.jar.
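
The KSeg4J.1.0 entry above mentions forward and backward maximum matching. A minimal Java sketch of that general technique follows; it is not KSeg4J's actual code or API. The class name `MaxMatchSketch`, the toy dictionary, the `maxLen` parameter, and the tie-breaking rule (fewer words, then fewer single-character words, then the backward result) are all assumptions chosen for illustration, and the listing does not say which rule KSeg4J applies.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of bidirectional maximum matching; not KSeg4J's API.
// Assumes all characters are in the Basic Multilingual Plane, so one
// Java char corresponds to one character.
public class MaxMatchSketch {

    // Forward maximum matching: at each position take the longest
    // dictionary word (up to maxLen chars), else a single character.
    static List<String> forward(String text, Set<String> dict, int maxLen) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int end = Math.min(text.length(), i + maxLen);
            while (end > i + 1 && !dict.contains(text.substring(i, end))) {
                end--;
            }
            out.add(text.substring(i, end));
            i = end;
        }
        return out;
    }

    // Backward maximum matching: the same idea, scanning from the end.
    static List<String> backward(String text, Set<String> dict, int maxLen) {
        LinkedList<String> out = new LinkedList<>();
        int j = text.length();
        while (j > 0) {
            int start = Math.max(0, j - maxLen);
            while (start < j - 1 && !dict.contains(text.substring(start, j))) {
                start++;
            }
            out.addFirst(text.substring(start, j));
            j = start;
        }
        return out;
    }

    static long singles(List<String> words) {
        return words.stream().filter(w -> w.length() == 1).count();
    }

    // Assumed heuristic: prefer fewer words, then fewer single-character
    // words; if still tied, return the backward result.
    static List<String> segment(String text, Set<String> dict, int maxLen) {
        List<String> f = forward(text, dict, maxLen);
        List<String> b = backward(text, dict, maxLen);
        if (f.size() != b.size()) {
            return f.size() < b.size() ? f : b;
        }
        return singles(f) < singles(b) ? f : b;
    }

    public static void main(String[] args) {
        Set<String> dict = Set.of("研究", "研究生", "生命", "起源"); // toy dictionary
        System.out.println(segment("研究生命起源", dict, 3));
    }
}
```

On this toy input the forward pass greedily takes 研究生 and leaves 命 stranded as a single character, while the backward pass yields 研究 / 生命 / 起源; the assumed heuristic therefore picks the backward result.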