Search results: resource list
classifier-1.12
- Clusters text retrieved from Google search results. Provided as a Java package, with source code showing how to call it.
clusty.tar
- A text collection for cluster analysis. The compressed archive is intended for data processing in cluster analysis.
kmeansjulei
- A K_means clustering program written in VC++. Detailed run instructions are in the text readme inside the folder.
wenbenwajue1232
- A summary of text mining that analyzes a range of clustering algorithms. Worth a look.
HLSSplit.RAR
- Keyword extraction is widely used in information retrieval, text classification/clustering, and information filtering.
KMEANSII
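- K-means clustering algorithm II from neural networks. 1. KMIn is the input data text file: its first parameter is the number of points to cluster, the second is the dimensionality of the points, and the third is the number of clusters required. 2. KM2OUT is the result produced by K-means clustering algorithm II.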
LHY
- Code for text statistics and recognition that uses a clustering algorithm. A major assignment for a statistics course.
DataSets
- Datasets for text clustering, provided by contributors abroad for use with clustering algorithms (dataset, EMforSoftKmeans dataset, Binary_1, Binary_2).
toolkit_for_words_En
- Handles stop words and same-stem words in English text without changing the structure of the article. Suitable for preprocessing in text classification, text clustering, and recommendation.
cluster
- Python implementations of the k-means algorithm and the Fast Search And Find Of Density Peaks algorithm, applied to text clustering.
maxent-master
- Maximum entropy model algorithm for research in statistical learning, text classification, and text clustering.
kmeans
- k-means is a classic text clustering algorithm and one of the ten classic data mining algorithms. This is a Java implementation of k-means.
datamining
- Slides (PPT) in PDF format from the University of Southampton, UK. They introduce data mining techniques and applications, including decision trees, recommender systems, text clustering, search engines, and market basket analysis.
DataStructTest
- K-means text clustering method (IDEA project package). Runs as downloaded.
Text-clustering
- Text clustering algorithms in machine learning. Contains five files, including Python implementation code and test data.
words_1025_dic.txt
- dbscan. Do not download for now: it contains errors and will be tidied up later (dbscan and word2vec for Chinese words).
English
- Contains the original English documents together with the documents produced by text preprocessing: removing special symbols, tokenization, stemming, similarity computation, and so on. 500 English documents in total.
EnglishChuLi
- A text preprocessing program written in Python, with implementation code for every step: removing punctuation, removing stop words, similarity computation, PCA dimensionality reduction, clustering, and visualization. Runs in PyCharm with a Python 3 development environment.
ChineseChuLi
- Python programs for Chinese text processing, including word segmentation, removing special characters, removing stop words, a web crawler, PCA dimensionality reduction, K-means clustering, and visualization.
协同过滤算法 (Collaborative filtering algorithm)
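- Text clustering. Document clustering rests mainly on the well-known clustering assumption: documents in the same class are highly similar to one another, while documents in different classes are much less similar. As an unsupervised machine learning method, clustering needs no training process and no manual labeling of document categories in advance, so it offers a degree of flexibility and a high level of automation. It has become an important means of effectively organizing, summarizing, and navigating text information, and is attracting attention from a growing number of researchers.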
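
The clustering assumption described in the last entry can be illustrated with a small sketch. This is not code from any of the resources listed here. It assumes whitespace tokenization, raw term counts as document vectors, cosine similarity as the similarity measure, and the first k documents as initial cluster seeds; the function names `vectorize`, `cosine`, and `kmeans` are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def kmeans(docs, k, iterations=10):
    """Assign each document to the most similar of k centroids.

    Assumption: the first k documents serve as initial centroids, which
    keeps the sketch deterministic but is a poor choice for real data.
    """
    vectors = [vectorize(d) for d in docs]
    centroids = [Counter(v) for v in vectors[:k]]
    labels = None
    for _ in range(iterations):
        new_labels = [
            max(range(k), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                # Summed counts; the scale does not affect cosine similarity.
                total = Counter()
                for v in members:
                    total.update(v)
                centroids[c] = total
    return labels


docs = [
    "cat dog pet",
    "stock market price",
    "dog pet food",
    "market price shares",
]
print(kmeans(docs, k=2))
```

No labels are supplied: documents that share vocabulary end up in the same cluster purely because they are more similar to each other than to the rest, which is the unsupervised behavior the entry describes.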