Resource search results
DATA
- Text clustering and classification data set: 500 records extracted from 20 Newsgroups and Reuters, organized into four tables.
kmean
- Clusters a set of 150 data items with the k-means algorithm.
swissroll
- Generates the Swiss roll data set and reduces its dimensionality with the LLE (locally linear embedding) algorithm. Practical, and likely helpful to anyone currently studying manifold learning.
fuzzyPR
- A classic fuzzy clustering algorithm that can be used to cluster UCI data sets. The program comes with detailed documentation.
ids
- Source code for a network intrusion detection system that detects the presence of network intrusions. Its input is a collected dump data set.
DimensionalityReductionOfClusteredData
- Matlab code for Dimensionality Reduction of Clustered Data Sets
protein-data
- A source data set in ARFF format, usable in the WEKA data mining software.
pls
- Partial least squares, as described here, means reducing the data set's dimensionality with principal component analysis before running a least-squares linear regression. The source code is unabridged and provided free of charge by the GreenSim team; please credit the GreenSim team (http://blog.sina.com.cn/greensim) when reposting.
dbscan
- DBSCAN algorithm, which uses differences in density within the data set to separate clusters.
DocumentSet_rar
- Files that are very useful as data sets for document clustering. I did a project based on these document sets for my PG degree.
program-and-data-sets
- Analyzes bearing fault data in the time domain and extracts a range of time-domain feature parameters, including mean, RMS value, kurtosis, margin factor, and waveform (shape) factor. Suitable for beginners learning parameter extraction for bearing fault diagnosis. Includes 21 sets of bearing data in 3 classes: normal, inner race fault, and outer race fault.
Data-Mining-Concepts-and-Techniques
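- Introduces what data mining is and what knowledge discovery in databases is. The book presents its material from a database perspective, with particular emphasis on the basic concepts and techniques for discovering interesting patterns hidden in large data sets. The implementation methods discussed are aimed mainly at developing scalable, efficient data mining tools. Beyond the classification of data mining systems, it also covers the challenging problems in building future data mining tools.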
gridded-data-sets.pdf
- GMT gridding document with detailed, easy-to-follow examples. Covers gridding selected columns of a multi-column data file.
VBPbutton-Control-DataSet-data
- A small VB program that uses buttons to browse a data set, showing how to control a data set with buttons. Useful study material for VB beginners.
Wine-Quality-Data-Set
- Red and white wine quality data set, usable as a data source for data mining and machine learning.
matlab-data-generation
- Data generation code for the data-generation experiments in the paper "Detecting Novel Associations in Large Data Sets".
Sparse-data-setsCCA
- Sparsifies the data set, reduces its dimensionality with canonical correlation analysis (CCA), and then classifies it.
Data Mining Concepts and Techniques
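- Data Mining: Concepts and Techniques. This book is an introduction to what data mining is and what knowledge discovery in databases is. It presents its material from a database perspective, with particular emphasis on the basic concepts and techniques for discovering interesting patterns hidden in large data sets. The implementation methods discussed are aimed mainly at developing scalable, efficient data mining tools.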
Data sets for spectral clustering
- Data used in a paper; very practical. The paper's code is included.
Geolife Data 1.3
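- Geolife GPS trajectory data set, user guide. This GPS trajectory data set was collected in the (Microsoft Research Asia) Geolife project by 178 users over a period of four years (April 2007 to October 2011). Each GPS trajectory is represented by a sequence of time-stamped points, each containing latitude, longitude, and altitude. The data set contains 17,621 trajectories with a total distance of 1,251,654 km and a total duration of 48,203 hours. It can be used in many research areas, such as mobility pattern mining, user activity recognition, location-based social networks, location privacy, and location recommendation.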
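
The Geolife entry describes each trajectory as a sequence of time-stamped points with latitude, longitude, and altitude, and reports total distance and duration. As a minimal sketch of how such totals could be computed for one trajectory, the Python below sums great-circle (haversine) distances between consecutive points. The tuple layout, the function names `haversine_km` and `trajectory_summary`, the spherical Earth radius, and the sample coordinates are all assumptions made for illustration; none of them come from the data set or its user guide, and reading the data set's files is not shown.

```python
import math
from datetime import datetime

# Assumption: spherical Earth with the mean radius in kilometres.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def trajectory_summary(points):
    """Return (distance_km, duration_hours) for one trajectory.

    `points` is a time-ordered list of hypothetical
    (timestamp, latitude, longitude, altitude) tuples.
    Altitude is carried along but ignored in the distance.
    """
    if len(points) < 2:
        return 0.0, 0.0
    distance = sum(
        haversine_km(p[1], p[2], q[1], q[2])
        for p, q in zip(points, points[1:])
    )
    duration = (points[-1][0] - points[0][0]).total_seconds() / 3600.0
    return distance, duration


if __name__ == "__main__":
    # Made-up sample points, not taken from the data set.
    sample = [
        (datetime(2008, 1, 1, 8, 0, 0), 40.000, 116.300, 50.0),
        (datetime(2008, 1, 1, 8, 5, 0), 40.005, 116.305, 52.0),
        (datetime(2008, 1, 1, 8, 10, 0), 40.010, 116.312, 51.0),
    ]
    km, hours = trajectory_summary(sample)
    print(f"distance: {km:.3f} km, duration: {hours:.3f} h")
```

Summing `trajectory_summary` over all trajectories would give data-set-wide totals comparable in kind to the figures quoted in the entry, though the exact method used to produce those figures is not stated there.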
