Search results: resource list
aiml-en-us-foundation-alice.snapshot
- AIML-format dialogue corpus from the ALICE question-answering system. A fairly comprehensive English question-answering corpus, shared for research use; it can be translated into Chinese and used as a reference when designing a Chinese question-answering system.
Word2VEC
- Extracts cosine distances from a corpus trained with Word2vec.
aiml
- Python version of AIML, with the ALICE corpus included. Download it if you need it.
learning-data-mining-with-python
- Source code accompanying the book 《python数据挖掘入门与实践》 (Learning Data Mining with Python), Chapter 1 through Chapter 12. Runs in IPython Notebook and includes application examples such as social media mining, authorship attribution, news corpus analysis, and big data processing.
TF
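- TF-IDF is a statistical method for evaluating how important a word is to one document within a document set or corpus. A word's importance increases in proportion to the number of times it appears in the document, and decreases in inverse proportion to how frequently it appears across the corpus. Variants of TF-IDF weighting are often used by search engines as a measure or ranking of the relevance between a document and a user query.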
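The weighting described in the TF-IDF entry can be sketched in a few lines of Python. This is a minimal illustration and not the code in the listed package: the function name `tf_idf`, the sample documents, and the choice of relative term frequency with an unsmoothed natural-log IDF are all assumptions made for the example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length,
    idf = ln(number of documents / documents containing the term).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, already tokenized.
docs = [
    ["corpus", "word", "word", "search"],
    ["corpus", "query", "search"],
    ["corpus", "document"],
]
scores = tf_idf(docs)
```

With this variant, a term that occurs in every document (here `corpus`) gets weight zero, while a term repeated in one document and absent elsewhere (here `word`) gets the highest weight in that document.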
tc-corpus-answer
- Fudan Chinese text corpus: ten categories of text, not word-segmented, for anyone interested.
GMM_gulici
- Isolated word recognition based on GMM, including source code and a corpus.
aec-test-audio
- Audio samples for testing AEC.
Corpus
- A dialogue corpus of roughly 100,000 entries that can be used for chatbot dialogue training.
hownet
- Complete version of HowNet (知网), bundled with various related papers and documents; a Chinese corpus.
jevmkm
- SVM text classifier source code with an English interface; includes a corpus. No decompression password is required.
databayy
- An important corpus; a very useful resource file for your word segmentation program.
95777978
- SVM text classifier source code with an English interface; includes a corpus. No decompression password is required.
canaonstruction
- A corpus query system; useful for learning VC file operations and how to build a management platform.
LSI
- News similarity analysis based on a latent semantic model, run on a news corpus of more than three thousand articles.
DocumentSimilarity.py
- News similarity algorithm based on the vector space model. Computes article similarity on a 1998 People's Daily corpus and outputs the result as an upper triangular matrix.
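As a rough sketch of what the DocumentSimilarity.py entry describes, the snippet below computes pairwise cosine similarity between documents and fills only the upper triangle of the result. It is not the listed script: the function names, the sample documents, and the use of raw term counts as vector components are assumptions, since the entry does not say which term weighting it uses.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def upper_triangular_similarity(docs):
    """Return an n x n matrix with similarities on and above the diagonal.

    Entries below the diagonal stay 0.0 because similarity is symmetric,
    so each pair only needs to be computed once.
    """
    vectors = [Counter(doc) for doc in docs]
    n = len(vectors)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            matrix[i][j] = cosine(vectors[i], vectors[j])
    return matrix


# Hypothetical tokenized articles.
docs = [
    ["economy", "growth", "report"],
    ["economy", "report", "policy"],
    ["football", "match"],
]
sim = upper_triangular_similarity(docs)
```

Storing only the upper triangle halves the number of similarity computations, which matters when the corpus holds thousands of articles.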
2_simplifyweibo
- A sentiment corpus of 200,000 entries, already word-segmented.
COAE2014task01
- Corpus materials for the Sixth Chinese Opinion Analysis Evaluation (COAE2014).
nklrc
- SVM text classifier source code with an English interface; includes a corpus. No decompression password is required.
Simple dictionary-based word segmentation (with txt corpus)
- Dictionary-based word segmentation; uses a dictionary to segment English text into words.
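The entry does not say which matching strategy the package uses. One common dictionary-based approach is forward maximum matching, sketched below under that assumption; the function name `forward_max_match`, the sample dictionary, and the fallback of emitting single characters for unknown text are all hypothetical choices for illustration.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Segment text greedily, always taking the longest dictionary word.

    Characters that start no dictionary word are emitted one at a time.
    """
    if max_len is None:
        max_len = max((len(w) for w in dictionary), default=1)
    max_len = max(1, max_len)  # guard so the scan always advances

    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical dictionary and unspaced input.
words = {"text", "corpus", "search", "research"}
tokens = forward_max_match("textcorpusresearch", words)
```

Greedy longest-match is simple and fast, but it can pick the wrong split when a long word overlaps the start of the next one, which is why such segmenters usually depend on a well-chosen dictionary file.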