Search results: resource list
fencijiansuo
- Word-segmentation search: a word segmentation and retrieval program written in Java, with a reasonably good interface.
lunce01
- Search implemented with Lucene 3.5. Covers basic functions such as indexing, searching, and word segmentation.
Calfreq
- Tokenizes English documents, computes word frequencies for the text, and outputs them in sorted order.
Stem
- Tokenizes English documents, applies Porter stemming to the words, and outputs the number of occurrences of each stem in the text.
ltp_new
- Uses the LTP system from the Harbin Institute of Technology Information Retrieval Lab for word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, and semantic role labeling. Shows usage of the latest API version (Java) and generates XML files.
lexical-analysis
- Tokenizes source code using a prewritten minimal DFA. Contents: GetToken.java (source code), SourceCode.txt (the source code to tokenize), Tokens.txt (the resulting token sequence).
WordSegment
- Chinese word segmentation in Java, dictionary included. After installing the JDK, just run the bundled WordSegment.java.
TermAttribute
- A Lucene class used for word segmentation of text.
Split
- A Java implementation of the reverse maximum matching algorithm for Chinese word segmentation; the program handles relatively simple Chinese word segmentation.
Fenci
- Source code for a Chinese word segmentation program, including the dictionary it uses.
jieba-analysis-master
- Jieba (结巴) word segmentation, used as a tokenizer in Lucene; the segmenter can also extract keywords automatically.
ksoap2-android-2.5.2.jar
- Described as the most usable Java word segmentation tool; segmentation mainly follows the approach of the Paoding (庖丁) analyzer.
ictclas4j
- Java version of the Chinese Academy of Sciences word segmenter (ictclas4j), rewritten from the C-language version.
DeleteStopWord
- This source code is mainly for Chinese text preprocessing. It first segments the text into words, then filters out the stop words.
Data
- Dictionary files used for word segmentation; very helpful for building a Chinese word segmentation lexicon.
fenci
- Java code for Chinese word segmentation based on IKAnalyzer2012; can remove stop words.
LDA_java
- Java source code for LDA (Latent Dirichlet Allocation); supports word segmentation and stop-word removal.
zhongwenfenci-
- Chinese word segmentation.
SplitWord
- Word segmentation software written by a BUPT (Beijing University of Posts and Telecommunications) professor, based on the ICTCLAS segmenter developed by the Chinese Academy of Sciences. For the main interface, see the Test class in the test package.
JnaTest_V1
- A word segmentation and new-word discovery system built on the Chinese Academy of Sciences NLPIR segmenter. Source code for Task 2.3 of the 20th China Conference on Information Retrieval (CCIR 2014), on new-word discovery and sentiment analysis for microblogs; can handle large microblog corpora.
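
For readers unfamiliar with the reverse maximum matching approach mentioned in the Split entry, here is a minimal sketch of the general technique. It is not taken from any of the listed packages: the class name ReverseMaxMatch, the method segment, the toy dictionary, and the maxLen parameter are all assumptions made for illustration. The idea is to scan from the end of the text, try the longest candidate first, and shrink it until a dictionary word is found, falling back to a single character.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of reverse maximum matching; not the code in the Split package.
public class ReverseMaxMatch {

    // Segments text right to left, preferring the longest dictionary word
    // (up to maxLen characters) that ends at the current position.
    static List<String> segment(String text, Set<String> dict, int maxLen) {
        List<String> words = new ArrayList<>();
        int end = text.length();
        while (end > 0) {
            int len = Math.min(maxLen, end);
            // Shrink the candidate until it is in the dictionary or is a single character.
            while (len > 1 && !dict.contains(text.substring(end - len, end))) {
                len--;
            }
            words.add(text.substring(end - len, end));
            end -= len;
        }
        // Words were collected back to front, so restore reading order.
        Collections.reverse(words);
        return words;
    }

    public static void main(String[] args) {
        // Toy dictionary, assumed for this example only.
        Set<String> dict = new HashSet<>(Arrays.asList("研究", "研究生", "生命", "起源"));
        System.out.println(segment("研究生命起源", dict, 3)); // prints [研究, 生命, 起源]
    }
}
```

With the same toy dictionary, a forward (left-to-right) scan would first match 研究生 and leave 命 stranded, which is one reason the reverse variant is often chosen for simple dictionary-based Chinese segmenters.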