Search resource list
nutch_recrawl_mergecrawl
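- Nutch is an open-source search engine. recrawl is a script that updates the index; mergecrawl is a bash script that merges the crawls of several sites for querying.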
nutchtutorial
- Nutch tutorial: development documentation for the Nutch search engine.
je-analysis-1.5.3
- Word-segmentation source code developed in Java. It can be called from Lucene and Nutch to segment Chinese text.
nutch0.8
- Nutch 0.8 source code for the open-source search engine; hopefully there is a lot to learn from it.
lucenenutch
- Companion code for a book on Lucene and Nutch; this part covers chapter 2.
vicaya-0.1.6.0
- A search engine built on Nutch, used to search online for CDRom.
nutch-4.7.x-1.x-dev.tar
- search engines sql server
OReilly.Hadoop.The.Definitive.Guide.June.2009.RETA
- Hadoop got its start in Nutch. A few of us were attempting to build an open source web search engine and having trouble managing computations running on even a handful of computers.
lukemin.tar
- lukemin: a tool for viewing all kinds of information about the pages fetched by the Nutch crawler, clearly and comprehensively.
search
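- A sample Lucene application with complete code, from building the index through to web search. The database it uses is from dedecms, which you can download yourself. config.xml is the configuration file; set the index directory and the username and password for the database connection there. The example can serve directly as a reference for building full-text search with Lucene.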
lucenePnutchPmapreducePsearch-engine
- Three master's theses on open-source search engines: 1. Implementation of a Lucene-based web search engine; 2. Research on a MapReduce-based distributed intelligent search engine framework; 3. Analysis and implementation of a Nutch-based vertical search engine.
NutchAnalysis
- Fixes the problem of Korean text not being parsed in Nutch. The file is a .jj file that must be processed with JavaCC. As Nutch users will know, once the 5 generated files have been swapped in, you recrawl, run ant, package a new nutch-1.0.jar, and replace the copy under Tomcat.
08214942iobg
- Lucene + Nutch search engine (Lucene development documentation and examples of various features).
apache-nutch-1.4-src.tar
- A very good open-source search engine; you can design and add your own code.
clou
- A Nutch-based search engine built on a three-host cluster.
Hadoopsource
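- Google's core competitive technology is its computing platform. A similar solution has appeared at Apache; its components currently all belong to Apache's Hadoop project, with the following correspondence: Chubby-->ZooKeeper, GFS-->HDFS, BigTable-->HBase, MapReduce-->Hadoop. There are many open-source projects based on similar ideas, and Hadoop is the most popular framework among them. This document briefly introduces a Hadoop development workflow.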
lucene
- Lucene + Nutch search engine development; distributed search engine development.
Nutch
- Apache Nutch 1.3 study notes: thorough notes with comprehensive coverage.
nutch-yuqing
- Introduces the popular techniques used in real-world online public opinion monitoring systems.
Hadoop-based-distributed-crawler
- Discusses the basic techniques of search engines and the basic principles of web crawlers, and analyzes Nutch as a technical prototype of a distributed crawler.