Resource search results
kmeans
- Applies k-means clustering to article text in order to extract the main content of web pages.
extractWiki
- Extracts the body text of Wikipedia pages from enwiki-latest-pages-articles.xml.
Social-Networks-PPT-a-R
- Social network data mining in the R language environment, with source code and data, plus the PPT slides and related literature used in the case studies.
selenium_sina_text
- A crawler written in Python that scrapes the WAP version of Sina Weibo, including users' posts, timestamps, client devices, comment counts, repost counts, and other metrics. Ready to use as-is.
ThemeCrawler
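- Common search strategies today fall into two main types: strategies based on web link structure and strategies based on content evaluation. The first determines a page's importance from the link relationships between pages and uses that to decide the order in which links are visited. Although it accounts for link structure and the link relationships between pages, it ignores how relevant a page's content is to the topic, so the crawl is prone to "topic drift". The second mainly considers page content; its advantages are a clear approach and simple computation. However, it ignores the link relationships between pages and is therefore weak at predicting the value of linked pages. To address these problems, this work proposes applying the cuckoo search algorithm to a topic-focused crawler.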
data-mining-video
- Data mining videos from 炼数成金 (Dataguru), first two weeks, covering an introduction to data mining and the standard data mining process. Suitable as an overview for beginners who are just getting started.
beautifulsoup4test1
- Crawls Qiushibaike (糗事百科) and processes the scraped content with the BeautifulSoup module.
pachongtest2
- Uses Python to crawl Zhihu Daily (知乎日报): follows every sub-link on the Zhihu Daily page, scrapes it, and modifies the content. Uses the re, urllib2, and BeautifulSoup modules.
cnbeta
- Uses Python to crawl the latest content from cnbeta, using the Scrapy module.
MachineLearningInaction
- The code from the book "Machine Learning in Action" (《机器学习实战》), written in Python.
eccv10_tutorial_part2
- Slides (PPT) on sparse coding for image classification, written by the originator of sparse representation. The content is engaging, the analysis is clear, and the material is easy to follow.
wp-autopost-pro.3.7.7 (1)
- The WP-AutoPost plugin can collect content from any website and update your WordPress site fully automatically. It is very simple to use, requires no complex configuration, is powerful and stable, and supports all WordPress features.
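
As a rough illustration of the two search strategies described in the ThemeCrawler entry above, the sketch below ranks a crawl frontier by a weighted mix of a link-structure score and a content-relevance score. It is not the ThemeCrawler code and does not implement cuckoo search; the function names, the `alpha` weight, and both scoring formulas are assumptions made for this example only.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Candidate:
    # Only `priority` takes part in comparisons; the URL is payload.
    priority: float
    url: str = field(compare=False)


def content_relevance(text, topic_terms):
    """Fraction of words in `text` that are topic terms (a crude content score)."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(1 for w in words if w in topic_terms) / len(words)


def link_score(url, inlinks):
    """Share of all known in-links that point at `url` (a crude link-structure score)."""
    total = sum(len(sources) for sources in inlinks.values())
    if total == 0:
        return 0.0
    return len(inlinks.get(url, ())) / total


def rank_frontier(pages, inlinks, topic_terms, alpha=0.5):
    """Return URLs ordered from most to least promising.

    `pages` maps URL -> text seen for that link (hypothetical input shape).
    alpha=1.0 uses link structure only; alpha=0.0 uses content only.
    """
    heap = []
    for url, text in pages.items():
        score = (alpha * link_score(url, inlinks)
                 + (1 - alpha) * content_relevance(text, topic_terms))
        # heapq is a min-heap, so push the negated score.
        heapq.heappush(heap, Candidate(-score, url))
    return [heapq.heappop(heap).url for _ in range(len(heap))]


if __name__ == "__main__":
    pages = {
        "a": "data mining tutorial",
        "b": "celebrity gossip news",
        "c": "data mining",
    }
    inlinks = {"a": {"x"}, "b": {"x", "y", "z"}, "c": set()}
    topic = {"data", "mining"}
    print(rank_frontier(pages, inlinks, topic, alpha=1.0))  # link structure only
    print(rank_frontier(pages, inlinks, topic, alpha=0.0))  # content only
```

With `alpha=1.0` the heavily linked but off-topic page "b" is ranked first, which is the "topic drift" risk the entry mentions; with `alpha=0.0` the ranking ignores link relationships entirely.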
