File name: BuptCrawl
Upload date: 2013-11-19
File size: 5.41 MB
Downloads: 0
Description: A web crawler demo written in Java. It converts the crawled web pages into a uniform XML format, parses the resulting XML files, and assigns a number to each DOM node. The content of any element node can then be retrieved by its node number.
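The node-numbering idea in the description can be illustrated with a minimal sketch. It is not taken from the archive: the class name `NodeNumbering`, its methods, and the numbering scheme (element nodes numbered from 1 in document order) are assumptions made for illustration. It uses the JDK's built-in DOM API so that it runs without extra libraries, whereas the file list suggests the project itself relies on dom4j.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of BuptCrawl: numbers element nodes and looks them up by number.
class NodeNumbering {

    /** Parses an XML string into a DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Assigns 1-based numbers to element nodes in document (pre-order) order. */
    static Map<Integer, Element> numberElements(Document doc) {
        Map<Integer, Element> byNumber = new LinkedHashMap<>();
        visit(doc.getDocumentElement(), byNumber);
        return byNumber;
    }

    // Numbers the element first, then recurses into its child elements.
    private static void visit(Element element, Map<Integer, Element> byNumber) {
        byNumber.put(byNumber.size() + 1, element);
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                visit((Element) child, byNumber);
            }
        }
    }

    /** Returns the text content of the element with the given number, or null if there is none. */
    static String contentOf(Map<Integer, Element> byNumber, int number) {
        Element element = byNumber.get(number);
        return element == null ? null : element.getTextContent();
    }

    public static void main(String[] args) {
        // Stand-in for a crawled page that has already been cleaned into well-formed XML.
        String xml = "<html><head><title>Demo</title></head>"
                + "<body><p>Hello</p><p>World</p></body></html>";
        Map<Integer, Element> nodes = numberElements(parse(xml));
        for (Map.Entry<Integer, Element> entry : nodes.entrySet()) {
            System.out.println(entry.getKey() + " " + entry.getValue().getTagName());
        }
        System.out.println(contentOf(nodes, 5));
    }
}
```

In this sketch the first `<p>` element receives number 5, so `contentOf(nodes, 5)` returns its text. Real crawled HTML is rarely well-formed, so a cleaning step (the role that classes such as `HtmlClean` in the file list presumably play) would have to run before parsing.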
File list (generated automatically by the site):
BuptCrawl/
BuptCrawl/.classpath
BuptCrawl/.project
BuptCrawl/.settings/
BuptCrawl/.settings/org.eclipse.core.resources.prefs
BuptCrawl/.settings/org.eclipse.jdt.core.prefs
BuptCrawl/bin/
BuptCrawl/bin/com/
BuptCrawl/bin/com/bupt/
BuptCrawl/bin/com/bupt/crawler/
BuptCrawl/bin/com/bupt/crawler/Controller.class
BuptCrawl/bin/com/bupt/crawler/dom4j/
BuptCrawl/bin/com/bupt/crawler/dom4j/Dom4JUtils.class
BuptCrawl/bin/com/bupt/crawler/dom4j/Downloader.class
BuptCrawl/bin/com/bupt/crawler/dom4j/HtmlClean.class
BuptCrawl/bin/com/bupt/crawler/dom4j/HtmlCodeUtil.class
BuptCrawl/bin/com/bupt/crawler/MyCrawler.class
BuptCrawl/bin/edu/
BuptCrawl/bin/edu/uci/
BuptCrawl/bin/edu/uci/ics/
BuptCrawl/bin/edu/uci/ics/crawler4j/
BuptCrawl/bin/edu/uci/ics/crawler4j/crawler/
BuptCrawl/bin/edu/uci/ics/crawler4j/crawler/Configurable.class
BuptCrawl/bin/edu/uci/ics/crawler4j/crawler/CrawlConfig.class
BuptCrawl/bin/edu/uci/ics/crawler4j/crawler/CrawlController$1.class
BuptCrawl/bin/edu/uci/ics/crawler4j/crawler/CrawlController.class
BuptCrawl/bin/edu/uci/ics/crawler4j/crawler/Page.class
BuptCrawl/bin/edu/uci/ics/crawler4j/crawler/WebCrawler.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/basic/
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/basic/BasicCrawlController.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/basic/BasicCrawler.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/imagecrawler/
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/imagecrawler/Cryptography.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/imagecrawler/ImageCrawlController.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/imagecrawler/ImageCrawler.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/localdata/
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/localdata/CrawlStat.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/localdata/Downloader.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/localdata/LocalDataCollectorController.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/localdata/LocalDataCollectorCrawler.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/multiple/
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/multiple/BasicCrawler.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/multiple/MultipleCrawlerController.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/shutdown/
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/shutdown/BasicCrawler.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/shutdown/ControllerWithShutdown.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/statushandler/
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/statushandler/StatusHandlerCrawlController.class
BuptCrawl/bin/edu/uci/ics/crawler4j/examples/statushandler/StatusHandlerCrawler.class
BuptCrawl/bin/edu/uci/ics/crawler4j/fetcher/
BuptCrawl/bin/edu/uci/ics/crawler4j/fetcher/CustomFetchStatus.class
BuptCrawl/bin/edu/uci/ics/crawler4j/fetcher/IdleConnectionMonitorThread.class
BuptCrawl/bin/edu/uci/ics/crawler4j/fetcher/PageFetcher$1.class
BuptCrawl/bin/edu/uci/ics/crawler4j/fetcher/PageFetcher$GzipDecompressingEntity.class
BuptCrawl/bin/edu/uci/ics/crawler4j/fetcher/PageFetcher.class
BuptCrawl/bin/edu/uci/ics/crawler4j/fetcher/PageFetchResult.class
BuptCrawl/bin/edu/uci/ics/crawler4j/frontier/
BuptCrawl/bin/edu/uci/ics/crawler4j/frontier/Counters$ReservedCounterNames.class
BuptCrawl/bin/edu/uci/ics/crawler4j/frontier/Counters.class
BuptCrawl/bin/edu/uci/ics/crawler4j/frontier/DocIDServer.class
BuptCrawl/bin/edu/uci/ics/crawler4j/frontier/Frontier.class
BuptCrawl/bin/edu/uci/ics/crawler4j/frontier/InProcessPagesDB.class
BuptCrawl/bin/edu/uci/ics/crawler4j/frontier/WebURLTupleBinding.class
BuptCrawl/bin/edu/uci/ics/crawler4j/frontier/WorkQueues.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/BinaryParseData.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/ExtractedUrlAnchorPair.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/HtmlContentHandler$Element.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/HtmlContentHandler$HtmlFactory.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/HtmlContentHandler.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/HtmlParseData.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/ParseData.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/Parser.class
BuptCrawl/bin/edu/uci/ics/crawler4j/parser/TextParseData.class
BuptCrawl/bin/edu/uci/ics/crawler4j/robotstxt/
BuptCrawl/bin/edu/uci/ics/crawler4j/robotstxt/HostDirectives.class
BuptCrawl/bin/edu/uci/ics/crawler4j/robotstxt/RobotstxtConfig.class
BuptCrawl/bin/edu/uci/ics/crawler4j/robotstxt/RobotstxtParser.class
BuptCrawl/bin/edu/uci/ics/crawler4j/robotstxt/RobotstxtServer.class
BuptCrawl/bin/edu/uci/ics/crawler4j/robotstxt/RuleSet.class
BuptCrawl/bin/edu/uci/ics/crawler4j/url/
BuptCrawl/bin/edu/uci/ics/crawler4j/url/TLDList.class
BuptCrawl/bin/edu/uci/ics/crawler4j/url/URLCanonicalizer.class
BuptCrawl/bin/edu/uci/ics/crawler4j/url/UrlResolver$Url.class
BuptCrawl/bin/edu/uci/ics/crawler4j/url/UrlResolver.class
BuptCrawl/bin/edu/uci/ics/crawler4j/url/WebURL.class
BuptCrawl/bin/edu/uci/ics/crawler4j/util/
BuptCrawl/bin/edu/uci
