Resource list
The-Unicode-Standard
- The Unicode Standard document; may be useful to anyone studying font encodings.
PU123ACorpora.tar
- A corpus for people working on spam email filtering. It works well and should be helpful.
Common computer algorithms
- A collection of frequently used computer algorithms, organized for sharing; they may come in handy later.
initial
- Chinese Academy of Sciences (CAS) word segmentation system, C++ version; an example that performs segmentation through a simple interface call. (Only the target file name in the main function needs to be changed.)
Basic dictionary program
- Converts a dictionary stored in binary form, as used in Chinese information processing, into text form, which makes the dictionary easier to understand. Very useful.
spamFiliter
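- Chinese email filtering. Segments the training emails into words and trains a Bayesian model on them, then classifies the test emails.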
Unihan
- Unicode character database, from the most recently published international standard. Includes the scripts of East Asia.
qirunVIP
- Source code of a VIP information processing system, provided in case anyone needs it.
participle.rar
- Word segmentation system with builds for Linux and Windows environments; accurate segmentation, a practical tool.
Word
- A simplified version of the CAS word segmentation program, built as a dynamic link library.
OP-COM100119a-DK-OK
- Windows diagnostic tool
Complete workspace, feature-rich, tailor-made for UNICODE
- A workspace dedicated to processing UNICODE-encoded files. Features are still being added; it took the author a long time to put together.
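
The spamFiliter entry above describes a three-step pipeline: segment the training emails, train a Bayesian model, then classify the test emails. As a rough illustration only (this is not the code from that package), the training and classification steps could be sketched as a multinomial naive Bayes classifier with add-one smoothing. The function names `train` and `classify`, the label strings, and the sample tokens are all hypothetical, and the emails are assumed to have already been split into word tokens by a separate segmenter.

```python
import math
from collections import Counter


def train(labeled_docs):
    """Build a naive Bayes model from (tokens, label) pairs.

    Assumption: each document is already segmented into a list of tokens.
    """
    word_counts = {}        # label -> Counter of token frequencies
    doc_counts = Counter()  # label -> number of training documents
    vocab = set()           # every token seen during training
    for tokens, label in labeled_docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def classify(model, tokens):
    """Return the label with the highest log posterior for the tokens."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in doc_counts.items():
        counts = word_counts[label]
        total_tokens = sum(counts.values())
        # Log prior of the class.
        score = math.log(n_docs / total_docs)
        # Log likelihood of each token with add-one (Laplace) smoothing.
        for tok in tokens:
            score += math.log((counts[tok] + 1) / (total_tokens + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical, pre-segmented training emails.
training = [
    (["免费", "中奖", "点击"], "spam"),
    (["免费", "发票"], "spam"),
    (["会议", "明天", "报告"], "ham"),
    (["项目", "报告"], "ham"),
]
model = train(training)
print(classify(model, ["免费", "中奖"]))
print(classify(model, ["报告"]))
```

Working in log space avoids numeric underflow on long emails, and the smoothing term keeps a token that never appeared in one class from driving that class's probability to zero.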