Search resource list
SemanticFR(软件大赛版)
- Crawls web pages, runs word segmentation and semantic analysis on the sentences, and filters web content based on semantics.
fenci
- Implements Chinese word segmentation with the NLPIR toolkit; a very handy Chinese segmentation tool.
199801
- People's Daily corpus for Chinese word segmentation and part-of-speech (POS) tagging.
RMM
- Implements word segmentation with the reverse (backward) maximum matching algorithm; the segmentation results are written to a separate txt file.
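Reverse maximum matching can be sketched in a few lines of pure Python; the toy dictionary and the 4-character window below are illustrative assumptions, not part of the package above:

```python
def rmm_segment(text, dictionary, max_len=4):
    """Reverse (backward) maximum matching: scan from the end of the
    sentence, greedily taking the longest dictionary word that ends
    at the current position."""
    words = []
    i = len(text)
    while i > 0:
        # Try the longest candidate first, shrinking toward one character.
        for size in range(min(max_len, i), 0, -1):
            candidate = text[i - size:i]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i -= size
                break
    words.reverse()
    return words

# Toy dictionary; a real system loads a large lexicon from disk.
dictionary = {"研究", "研究生", "生命", "的", "起源"}
print(rmm_segment("研究生命的起源", dictionary))
# → ['研究', '生命', '的', '起源']
```

Scanning backward is what distinguishes RMM from forward maximum matching: on this classic example, a forward scan would wrongly take "研究生" first.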
24.HMM
- Implements Chinese word segmentation with an HMM, and can also discover new words automatically.
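The HMM approach tags each character as B/M/E/S (word begin, middle, end, or single-character word) and decodes with Viterbi. The probability tables below are hand-picked toy values, not trained parameters; a real system would estimate them from a tagged corpus such as the People's Daily one above:

```python
import math

NEG_INF = float("-inf")

def viterbi_segment(text, start, trans, emit):
    """Tag each character B/M/E/S with Viterbi, then cut a word at E or S."""
    states = "BMES"
    V = [{s: start.get(s, NEG_INF) + emit[s].get(text[0], NEG_INF)
          for s in states}]
    back = [{}]
    for ch in text[1:]:
        V.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, V[-2][p] + trans[p].get(s, NEG_INF)) for p in states),
                key=lambda x: x[1])
            V[-1][s] = score + emit[s].get(ch, NEG_INF)
            back[-1][s] = prev
    # Follow back-pointers from the best final state.
    s = max(V[-1], key=V[-1].get)
    tags = [s]
    for bp in reversed(back[1:]):
        s = bp[s]
        tags.append(s)
    tags.reverse()
    words, w = [], ""
    for ch, t in zip(text, tags):
        w += ch
        if t in "ES":
            words.append(w)
            w = ""
    return words + ([w] if w else [])

lp = math.log
# Toy log-probabilities (a word can only start with B or S, etc.).
start = {"B": lp(0.6), "S": lp(0.4)}
trans = {
    "B": {"M": lp(0.3), "E": lp(0.7)},
    "M": {"M": lp(0.3), "E": lp(0.7)},
    "E": {"B": lp(0.6), "S": lp(0.4)},
    "S": {"B": lp(0.6), "S": lp(0.4)},
}
emit = {
    "B": {"我": lp(0.7), "们": lp(0.05), "爱": lp(0.2), "北": lp(0.8), "京": lp(0.05)},
    "M": {"我": lp(0.05), "们": lp(0.1), "爱": lp(0.1), "北": lp(0.05), "京": lp(0.1)},
    "E": {"我": lp(0.05), "们": lp(0.8), "爱": lp(0.1), "北": lp(0.05), "京": lp(0.8)},
    "S": {"我": lp(0.2), "们": lp(0.05), "爱": lp(0.6), "北": lp(0.1), "京": lp(0.05)},
}
print(viterbi_segment("我们爱北京", start, trans, emit))
# → ['我们', '爱', '北京']
```

Because the model scores character sequences rather than looking words up in a fixed lexicon, it can segment out-of-vocabulary words, which is the "new word discovery" the entry refers to.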
Forgetting-algorithm demo program (lexicon generation, word segmentation, word weighting)
- Keyword extraction and word segmentation via the non-mainstream "forgetting" algorithm.
FudanNLP-1.6.1
- Documentation of the word segmentation module; easy to use and well suited to secondary development.
coreseek
- A very handy Chinese word segmentation tool; it took a long search online to find one this good.
jieba
- The jieba ("结巴") Chinese word segmentation module for Python; splits sentences into words.
20170314140452_ICTCLAS2016分词系统下载包
- Download package for the ICTCLAS Chinese word segmentation system, latest 2017 release.
jieba-0.39
- A very powerful Python package for Chinese word segmentation, used in natural language processing.
N-gram model word segmentation and statistics algorithms
- An N-Gram (sometimes called an N-element model) is a very important concept in natural language processing. In NLP, given a corpus, an N-Gram model can be used to estimate how plausible a sentence is. N-Grams are also used to measure the degree of difference between two strings, a common technique in fuzzy matching. Starting from these basics, this resource demonstrates a variety of powerful applications of N-Grams in NLP.
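Both uses mentioned in the entry — scoring how plausible a sentence is and fuzzy string comparison — can be sketched in a few lines of Python; the toy corpus and the add-one smoothing are illustrative assumptions, not code from the package above:

```python
import math
from collections import Counter

def bigram_counts(corpus):
    """Unigram/bigram counts over whitespace-tokenized sentences,
    with <s>/</s> boundary markers."""
    uni, bi = Counter(), Counter()
    for sent in corpus:
        toks = ["<s>"] + sent.split() + ["</s>"]
        uni.update(toks[:-1])
        bi.update(zip(toks, toks[1:]))
    return uni, bi

def sentence_logprob(sentence, uni, bi, vocab_size):
    """Add-one smoothed bigram log-probability: a rough score of how
    'reasonable' a sentence is under the corpus."""
    toks = ["<s>"] + sentence.split() + ["</s>"]
    return sum(math.log((bi[a, b] + 1) / (uni[a] + vocab_size))
               for a, b in zip(toks, toks[1:]))

def ngram_similarity(a, b, n=2):
    """Dice coefficient over character n-grams: a common fuzzy-matching
    measure of how close two strings are (1.0 = identical n-gram bags)."""
    ga = Counter(a[i:i + n] for i in range(len(a) - n + 1))
    gb = Counter(b[i:i + n] for i in range(len(b) - n + 1))
    return 2 * sum((ga & gb).values()) / (sum(ga.values()) + sum(gb.values()))

uni, bi = bigram_counts(["the cat sat", "the dog sat"])
print(sentence_logprob("the cat sat", uni, bi, vocab_size=6) >
      sentence_logprob("cat the sat", uni, bi, vocab_size=6))  # → True
print(round(ngram_similarity("natural language", "natural languages"), 3))
```

The scrambled sentence scores lower because its bigrams were never seen in the corpus, which is exactly the "is this sentence reasonable" use the entry describes.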
CSATP
- Automatic word segmentation system for Chinese articles, with a GUI, written in Java.
cppjieba-master
- cppjieba: the jieba word segmentation method for Chinese, implemented in C++.
IK Analyzer 2012FF_hf1
- Source code of the IK Analyzer tokenizer; very handy, supports smart segmentation, and gives a relatively high retrieval hit rate.
IKAnalyzer-1.0.0
- The IKAnalyzer word segmentation Jar package, available for anyone who needs it.
ChPreprocess
- Uses the jieba package to read data from an Excel sheet, perform Chinese word segmentation, and run corpus analysis.
Course design assignment
- Segments an article with a word segmentation package and counts how many times each word occurs.
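Once any of the segmenters above has produced a word list, the frequency count itself is a one-liner; the sample word list here is a made-up stand-in for real segmenter output:

```python
from collections import Counter

def word_frequencies(words):
    """Count how often each word appears in an already-segmented text."""
    return Counter(words)

# Words as produced by any segmenter (hypothetical sample).
words = ["中文", "分词", "统计", "分词", "词频"]
print(word_frequencies(words).most_common(1))
# → [('分词', 2)]
```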
paoding-analysis-2.0.4-beta
- paoding-dic-home.properties is the configuration file for Paoding ("庖丁解牛"), a Lucene-based Chinese word segmentation system.
segmentation
- Segments text and removes stop words, punctuation, etc. with a stop-word list.
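The stop-word and punctuation filtering step can be sketched as a simple list comprehension; the token list, the tiny stop-word set, and the punctuation characters below are illustrative assumptions:

```python
import string

def remove_stopwords(tokens, stopwords):
    """Drop stop words and punctuation (ASCII plus common Chinese marks)
    from an already-segmented token list."""
    punct = set(string.punctuation) | set("，。！？、；：“”‘’（）")
    return [t for t in tokens if t not in stopwords and t not in punct]

tokens = ["我", "喜欢", "，", "自然", "语言", "的", "处理", "。"]
stopwords = {"我", "的"}
print(remove_stopwords(tokens, stopwords))
# → ['喜欢', '自然', '语言', '处理']
```

Real systems load the stop-word set from a file (one word per line), but the filtering logic is the same.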