Search results: resource list
ictclas4j
- Java source code for the Chinese Academy of Sciences Chinese word segmentation system. It segments Chinese text well and provides a foundation for text mining.
chinese
- Source code for a Chinese word segmentation and keyword extraction system developed in Java. Documentation is included and can be followed step by step.
ICTCLAS50_Windows_32_JNI
- Chinese Academy of Sciences segmenter (Java) with good segmentation results. The package also contains folders for the API, a demo, and usage documentation.
ICTCLAS003
- Uses the Chinese Academy of Sciences segmenter to perform word segmentation from Java. It is mainly meant to help people who, like me, are new to ICTCLAS get familiar with calling the segmenter through a simple run.
Nlpir
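- The NLPIR Chinese word segmentation system (also known as ICTCLAS2013). Main features include Chinese word segmentation, part-of-speech tagging, named entity recognition, and user dictionaries; it supports GBK, UTF-8, and BIG5 encodings. Newly added features are microblog segmentation, new word discovery, and keyword extraction. Dr. Zhang Huaping has worked on it for more than ten years, with ten kernel upgrades, and the description states that it ranks first both in China and internationally. The project environment is already configured, so it can be imported into Eclipse and used right away; TestUTF8.java under src can be run directly and provides a segmentation interface.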
ftp
- A segmenter developed in Java; it mainly handles FTP upload and download.
fencijiansuo
- Segmentation-based search: a search tool written in Java with a fairly good interface.
FreeICTCLAS
- Segments Chinese text; a C++ implementation of a word segmentation algorithm for multiple Chinese texts.
ltp_new
- Uses the LTP system from the Harbin Institute of Technology Information Retrieval Laboratory for word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, and semantic role labeling. It shows how to use the latest version of the API (Java version) and generates XML files.
lexical-analysis
- Tokenizes source code using a prewritten minimal DFA. Contents: GetToken.java (source code), SourceCode.txt (the source code to tokenize), Tokens.txt (the resulting token sequence).
WordSegment
- Chinese word segmentation, Java version. The dictionary is included; after installing the JDK, just run the bundled WordSegment.java.
IKAnalyzer
- A simple customer-service chatbot system implemented in Java, using IK for word segmentation and AIML as the bot language. The program already sets up a Java socket service and implements Chinese word segmentation, synonym output, and answer matching. The libraries used are IK and program-ab. This is a small result of a month's work; I hope it is useful to you.
ksoap2-android-2.5.2.jar
- Described as the easiest-to-use Java word segmentation tool; it segments mainly on the principle of the Paoding (庖丁) segmenter.
HMMSeg
- Java implementation of a hidden Markov model (HMM) word segmentation algorithm. It includes a training set of 100,000 entries and a dictionary; you can also add your own training data.
LDA_java
- Java source code for LDA (Latent Dirichlet Allocation); it supports word segmentation and stop-word removal.
textmin
- Naive Bayes text classification code with word segmentation. Written in Java, with explanatory comments.
JnaTest_V1
- Java interface code that calls the ICTCLAS2014 segmentation system to perform new word discovery.
IKAnalyzer_all_jar
- A lightweight Java-based word segmenter that works quite well; give it a try.
MMSeg
- Automatic Chinese word segmentation system written in Java, with a graphical interface. It implements forward maximum matching (FMM) and backward maximum matching (BMM).
chinese-segment
- An open-source Chinese word segmentation project in Java.