搜索资源列表
MFC查词典、分词、词频统计程序
- MFC编程,功能是查词典(用户可自己导入文本),分词,统计词频,还可以保存结果!我们MFC课的期末作业,强烈推荐!-MFC programming function is to check dictionary (users can import their own version), participle, statistical, frequency, the results can be saved! We MFC class at the end operations, strongly
本程序可以实现对已有网页的信息提取和分词
- 本程序可以实现对已有网页的信息提取和分词,结果会导入叫做res.txt的文件中。本程序是开发搜索引擎的前期工作。-This procedure can be achieved on existing Web information extraction and segmentation, the results into a file called res.txt. This program is the development of the preliminary work the searc
Free ICTCLAS 中科院的分词软件ICTCLAS
- 中科院的分词软件ICTCLAS,自己已经把他用到程序里了感觉效果很好,分享给大家-Chinese Academy of Sciences of the sub-word software ICTCLAS, he has used his program works well in a sense, we share
Chinese-Segmentation.rar
- 自己编写的中文分词源程序,用vc++编写,附有完整的文档,以及标准的分词数据库,I have written the source code of the Chinese word segmentation, using vc++ to prepare, with complete documentation, as well as sub-standard speech database
windows_c_32.rar
- 中国科学院的最新版本的中文分析程序,可以进行分词、词性标注等,The latest version of the Chinese Academy of Sciences of the Chinese language analysis procedures, can be sub-word-of-speech tagging, etc.
word-frequency
- java 编写的词频统计,包含极易分词软件的包,Lucene包,程序调试通过-java written word frequency, word that contains the software package easy points, Lucene package, program debugging by
SW_I_WordSegment
- SW-I中文分词算法,MFC程序,在visual studio 2008中调试通过。默认词库为mdb,由于较大未包含在源文件中,请自行下载mdb格式的词典。-SW-I Chinese word segmentation algorithm, MFC procedures, visual studio 2008 in debug through. Default thesaurus for the mdb, as a result of the larger not included in the
中文分词算法
- 本程序使用给出的字典进行学习并对训练语料进行分词处理,采用C语言编写,高效易懂!
text
- python写的gbk分词分句程序 可以使用sogou或者谷歌输入法的词库进行分词-python written procedures gbk participle clause can use Google sogou or input method for segmentation of the thesaurus
MmFenCi
- 基于MM的分词算法,有兴趣者可以把程序中没有完成的部分继续。-MM sub-word based algorithm, are interested in can not complete the program part to continue.
WordSegmentation
- 基于java的一个分词程序 速度比较快 精确度比较高-A java-based segmentation procedures faster relatively high accuracy
WordSegment
- 很简单的中文分词程序,命令行程序,在VisualStudio2008中调试通过,内附测试文档。-Chinese language is very simple segmentation procedures, command-line procedures, the debugging of VisualStudio2008 passed, the document containing the test.
word_split
- 这个一个基于逆向最大匹配的分词程序,语料规模比较小。-The maximum matching based on the reverse of the sub-term process, relatively small-scale corpus.
imdict-chinese-analyzer
- imdict-chinese-analyzer 是 imdict智能词典 的智能中文分词模块,算法基于隐马尔科夫模型(Hidden Markov Model, HMM),是中国科学院计算技术研究所的ictclas中文分词程序的重新实现(基于Java),可以直接为lucene搜索引擎提供简体中文分词支持。-imdict-chinese-analyzer is a smart imdict Chinese Dictionary smart module segmentation algorithm
fenci
- 一个简单的基于词典分词的程序,lucene的分词程序不少,但有时候并不需要复杂的功能,只是需要简单的根据指定的词典分词。代码简单,可以作为学习参考-A simple dictionary-based word process, lucene procedures for sub-word a lot, but sometimes does not require complex functions, but only require a simple dictionary word accord
IKAnalyzer3.1.1_userguide
- java分词程序,能够精确分词,包含词库等-java word program, word accurately, including the thesaurus, etc.
yard
- 一个简单的中文分词程序,用纯java编写,请解压后,在java环境中运行。-A simple Chinese word segmentation program, written in pure java, please unzip, run the java environment.
splitword
- 基于VC++6.0的中文分词程序。内含词典。-VC++6.0 based Chinese word segmentation procedure. Embedded dictionary.
ICTCLAS50_Linux_RHAS_64_JNI
- 中科院中文分词程序,国内相关领域的的权威.这是Java(JNI)64位版-Institute of Chinese word segmentation program, the domestic authority of the relevant fields, which is Java (JNI) 64-bit version
20180306142010_ICTCLAS2016分词系统下载包
- 供中文文本挖掘程序员使用,训练文本挖掘能力(Chinese Corpus, used to exercise and test your ability of digging in Chinese Text)