Search results
Segmenter
- This program segments large batches of text data into words. The segmentation results are good, and it can also filter out unwanted stop words.
zidong
- Automatic text summarization implemented in C++, including automatic word segmentation, sentence weight calculation, and extraction. The complete program code is included.
bianli
- A small directory-traversal program written in VS2010 with a DOS (console) interface. It may be useful for the "word segmentation" part of digital content security.
RMM
- This is the RMM algorithm. It supports forward and reverse maximum matching and is one of the important algorithms in natural language processing. To adapt it, just replace the lexicon used in the program. The bundled lexicon is taken from the 1988 People's Daily corpus, and the algorithm's accuracy on Chinese word segmentation is above 90%.
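The forward and reverse maximum matching named in the RMM entry above can be sketched roughly as follows. This is a minimal illustration in Python, not the listed program's code: the function names, the `max_len` window of 4 characters, and the tiny sample lexicon are all assumptions made for the example.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Scan left to right, always taking the longest lexicon word at the cursor."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; a single character is the fallback.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def reverse_max_match(text, lexicon, max_len=4):
    """Scan right to left, always taking the longest lexicon word ending at the cursor."""
    tokens = []
    j = len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()  # tokens were collected back to front
    return tokens


# Hypothetical sample lexicon, chosen so the two directions disagree.
lexicon = {"研究", "研究生", "生命", "的", "起源"}
sentence = "研究生命的起源"
forward = forward_max_match(sentence, lexicon)
reverse = reverse_max_match(sentence, lexicon)
```

On this sample the forward pass greedily takes "研究生" first, while the reverse pass recovers "研究 / 生命 / 的 / 起源", which is one reason the reverse direction is often preferred for Chinese. Swapping in a different lexicon only means passing a different set.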
NBclassfier
- A Bayesian sentiment classifier, validated with five-fold cross-validation. The program can be run directly; it assumes the input text has already been segmented into words.
MyPaodingTest
- A test program for the Paoding Chinese word segmenter, intended only as a reference for beginners.
MapTest
- Inverted index. This program uses the ICTCLAS segmentation tool to segment Chinese text, builds an inverted index, and writes it to a specified file.
fenci_v1.0_utf8
- This program uses a mathematical algorithm to provide basic word segmentation for articles. The page is clean and simple.
devide
- A small C program for word segmentation. It segments large numbers of documents and may be useful in data mining.
TextAnalysis
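- TextAnalysis system and algorithm design. The input is the word structure information produced by ICTCLAS segmentation, and the part of speech of each word is checked. 1. If a word has no part of speech, the current loop iteration is skipped; this drops meaningless items such as modal particles. 2. Each sentence contains several clauses, and each clause is an independent subject-verb-object structure, so the system splits clauses at punctuation marks. The sentiment weights of all clauses are then summed to give the sentiment weight of the whole sentence. 3. In the dictionary preprocessing stage, the system assigns different weights to words of different intensity. To improve processing efficiency, the system only analyzes the parts of speech that contribute most to expressing sentiment (including adjectives, …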
word_counting
- A small program written in C# that splits a document into words on separators such as spaces and commas and counts the occurrences of each word.
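Splitting on separators and counting, as the word_counting entry describes, might look like the sketch below. It is written in Python rather than the entry's C#, and the function name `count_words` and the default separator set (space and comma) are assumptions for illustration only.

```python
import re
from collections import Counter


def count_words(text, separators=" ,"):
    """Split text on any run of separator characters and count each token."""
    # Build a character class such as [\ ,]+ from the separator string.
    pattern = "[" + re.escape(separators) + "]+"
    # re.split can yield empty strings at the edges, so filter them out.
    tokens = [token for token in re.split(pattern, text) if token]
    return Counter(tokens)


counts = count_words("red green,red, blue")
```

Here `counts["red"]` is 2 and the other two words each count 1. Passing a longer `separators` string (for example `" ,.;"`) extends the delimiter set without changing the function.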
Windows_32_C_Demo
- ICTCLAS Chinese word segmentation system. pku_test.txt is an unsegmented document file; the demo calls ICTCLAS to segment the text in it.
matching-Chinese-word-by-HMM-and-MM
- A Chinese word segmentation system developed with MFC that provides both forward and reverse segmentation.
ConvertPinYin
- A program that converts Chinese characters to pinyin. It can convert files, excluding punctuation, and it uses a word segmentation library.
Split
- A Java implementation of the reverse maximum matching algorithm for Chinese word segmentation. The program handles relatively simple Chinese word segmentation.
IKAnalyzer
- A simple customer-service chatbot system implemented in Java. Word segmentation uses IK, and the bot language is AIML. The program already sets up a Java socket service and implements Chinese word segmentation, synonym output, and answer matching. The libraries used are IK and program-ab. This is a small result of a month's work; I hope others find it useful.
word
- A small word segmentation program written in C, worth studying closely.
MFC-Look-it-up-in-the-dictionary
- A program for dictionary lookup, word segmentation, and word frequency statistics. Very practical; download recommended.
Sina-weibo
- Runs on C# + MySQL and integrates ICTCLAS word segmentation with the TF*PDF algorithm. It can run analyses such as trend analysis and hot-topic discovery on the collected information. In addition, you can adjust the regular expressions in the program to match data in the relevant code regions.
201411149222244
- Download any Chinese text document, and this program will segment it into words and count how many times each word occurs.