Search results: resource list
Chinese word segmentation
- A Chinese word segmentation system that recognizes Chinese sentences and splits them into words; a good example of natural language understanding.
TextCategorization
- A Chinese text classification program based on the naive Bayes algorithm. It classifies Chinese text: train the classifier first, then run recognition. This beta version supports only three text categories and uses a simple Chinese word segmentation method. The program is not yet practical and is intended for algorithm research and improvement.
myKbest_0513
- Chinese word segmentation using the N-shortest-path algorithm. ICTCLAS research and study group: http://groups.google.com/group/ictclas?msg=subscribe
word_segment
- Java-based full-text detection and segmentation (word segmentation).
TestSeg
- Chinese Word segmentation
WordSeg
- Segments Chinese sentences with the maximum matching method. Maximum matching is the most commonly used segmentation algorithm; it is simple and practical, and its accuracy can exceed 80%.
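The maximum matching idea described in this entry can be sketched roughly as follows. This is a minimal illustration, not the code shipped in WordSeg: the function name `forward_max_match`, the `max_len` window, and the toy dictionary are all assumptions made for the example.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Greedy forward maximum matching (illustrative sketch).

    `dictionary` is any set-like container of known words; `max_len` is an
    assumed upper bound on word length in characters.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the dictionary hits.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy dictionary, for illustration only.
toy_dict = {"研究", "研究生", "生命", "起源"}
print(forward_max_match("研究生命起源", toy_dict))
# ['研究生', '命', '起源']
```

The greedy choice of "研究生" here is the kind of ambiguity error that keeps plain maximum matching below perfect accuracy.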
ProbWordSeg
- Maximum probability word segmentation. This algorithm resolves ambiguity in Chinese word segmentation fairly well, but it is less efficient than maximum matching.
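One common way to realize maximum probability segmentation is dynamic programming over unigram word probabilities. The sketch below assumes that formulation; it is not taken from ProbWordSeg itself, and the name `max_prob_segment`, the `unknown_prob` fallback for unseen single characters, and the toy probabilities are hypothetical.

```python
import math


def max_prob_segment(text, word_prob, max_len=4, unknown_prob=1e-8):
    """Pick the segmentation with the highest product of word probabilities.

    Assumes a unigram model: `word_prob` maps word -> probability.
    Unseen single characters get `unknown_prob` (an assumed smoothing value).
    """
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best log-probability of text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last word
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in word_prob:
                p = word_prob[word]
            elif end - start == 1:
                p = unknown_prob
            else:
                continue  # multi-character strings must be in the lexicon
            score = best[start] + math.log(p)
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Walk the back-pointers from the end of the text to recover the words.
    tokens = []
    end = n
    while end > 0:
        start = back[end]
        tokens.append(text[start:end])
        end = start
    return tokens[::-1]


# Hypothetical probabilities, for illustration only.
toy_prob = {"研究": 0.02, "研究生": 0.005, "生命": 0.01, "起源": 0.008}
print(max_prob_segment("研究生命起源", toy_prob))
# ['研究', '生命', '起源']
```

Every candidate word ending at every position is scored, which is why this approach costs more than a single greedy pass.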
FreeICTCLAS
- C++ source code for ICTCLAS, suitable for learning C++ and for research on Chinese word segmentation algorithms.
FreeICTCLAS
- ICTCLAS from the Institute of Automation, Chinese Academy of Sciences, written in C++. Used for segmenting Chinese text.
softwarecode
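- Chinese word segmentation is an important step in Chinese information processing. It is widely used in machine translation, text retrieval, speech recognition, text proofreading, artificial intelligence, and search engine technology. The choice of segmentation algorithm, the way the Chinese lexicon is built, and the completeness of its entries are all closely tied to the performance of a Chinese word segmentation system.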
WordPartation2
- A Chinese word segmentation program that uses the maximum matching algorithm and supports files in GB2312 encoding.
GBKhash
- A natural language program that uses a hash table built on GBK encoding for fast Chinese word segmentation.
segChnWord
- A Chinese word segmentation evaluation system that measures segmentation quality and reports accuracy and other metrics.
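The entry does not say which metrics segChnWord reports beyond accuracy. As one plausible sketch, segmentation quality is often scored by comparing word spans between a reference and a system output, giving precision, recall, and F1. The function names `to_spans` and `evaluate` below are hypothetical.

```python
def to_spans(tokens):
    """Convert a token list into a set of (start, end) character spans."""
    spans = set()
    pos = 0
    for tok in tokens:
        spans.add((pos, pos + len(tok)))
        pos += len(tok)
    return spans


def evaluate(gold_tokens, pred_tokens):
    """Return (precision, recall, f1) over matching word spans.

    Assumes both token lists cover the same underlying text.
    """
    gold = to_spans(gold_tokens)
    pred = to_spans(pred_tokens)
    correct = len(gold & pred)  # words with identical boundaries
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


# Hypothetical reference and system output, for illustration only.
gold = ["研究", "生命", "起源"]
pred = ["研究生", "命", "起源"]
print(evaluate(gold, pred))
```

In this example only "起源" has matching boundaries on both sides, so one of three words counts as correct for both precision and recall.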
Chinese-text-categorization-Study
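- This paper is an experimental comparison of Bayes, KNN, and SVM applied to Chinese text classification. ICTCLAS is used to segment the Chinese documents. For high-dimensional, large-volume data, TF-IDF is used for feature selection and also to weight the feature terms, so that every text in the corpus has a uniform, processable structural model. The three classification algorithms are then trained on, and used to classify, the weighted data.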
partition
- Implementation and testing of a segmentation system. String-based segmentation that extracts individual phrases according to segmentation markers.
PanGu_Release_V2.3.1.0
- PanGu word segmentation algorithm, for search and anywhere else segmentation is needed. Source code.
Farsi-character-segmentation
- This paper proposes a method for segmenting off-line handwritten Farsi/Arabic words into subwords under overlapped or connected conditions.
automatic-word-segmentation
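- Implement an automatic Chinese word segmentation program in any programming language. Optional: recognition of person names, place names, and organization names. Download the 1999 People's Daily segmented corpus annotated by the Institute of Computational Linguistics at Peking University, build a word list, and implement forward and reverse maximum matching segmentation.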
Chinese-Word-Segmentation
- A good Chinese word segmentation algorithm; for details, unzip the archive and read the comments. The dictionary file must also be placed in the directory.
line and word seg source code
- line and word segmentation
