Search resource list
Searcharoo
- It is a simple, free, easy to install Search page written in C#. The goal is to build a simple search tool that can be installed simply by placing three files on a website, and that could be easily extended to add all the features you might need for
NewsSpider
- A news spider I wrote myself a while ago for crawling news. The uploaded package includes documentation, and the code is commented.
multithreadingsearch
- A handy search tool that can search across multiple files.
firtex_beta102_src
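- FirteX overview. Features: supports incremental indexing, differential indexing, and multi-field indexing, with three forward-index modes; supports plain text, HTML, PDF, and other file formats; provides fast Chinese word segmentation; offers index access interfaces from low level to high level, so index files can be used flexibly; provides a rich query syntax with multi-field search, date-range search, custom sorting of results, and more. Performance: indexes more than 200Mb per minute on a Pentium 4 2.8 GHz machine with 2 GB of RAM; searching a nearly 7 GB index (covering 100 GB of web pages and 11 GB of plain text) uses only a dozen or so MB of memory, within a few milli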
sogzq
- Purpose: tracks search engine spiders (bots) and logs their visits; the log can be viewed online or downloaded as a generated CSV file.
inverted_index.rar
- A simple inverted-file (inverted index) implementation, one of the steps in building a search engine. It makes heavy use of the STL and is simple and easy to understand. Performance is average.
lucene-2.4.0
- The best analyzer code, although it ships only as .class files; these can be decompiled, so take a look.
ginss-engine
- Search engine for a local Samba network. This is the indexer engine. It uses MySQL as its main information store, offers extended multimedia indexing such as merging identical video files and indexing ID3 tags, and checks whether computers are online.
AnalyzerViewer_source
- Lucene.Net is a high performance Information Retrieval (IR) library, also known as a search engine library. Lucene.Net contains powerful APIs for creating full text indexes and implementing advanced and precise search technologies into your programs.
jiatongpeifushousuowenjian
- Detailed code for searching files with wildcards; anyone interested is welcome to study it together.
search
- Search engine over selected .txt files, implemented as a .cgi document; results are sorted by a 0.5 mutual score for every similarity.
Crawler
- A mini crawler engine for html files. The application is written in Visual C++ with MFC.
larbin-2.6.3
- An efficient web crawler with a configuration file you can edit yourself. It runs on Linux and is a useful reference.
strigi.tar
- Strigi is an efficient search engine framework: a daemon with a very fast and efficient crawler that can index data on your hard drive. Indexing runs without hammering your system, which makes it a fast, small desktop search system. It can also index many file formats.
Hyperion
- Source code for an open-source desktop search engine. Features include fast file search (the author says it often takes less than one second), filtering by music/document/image, filtering by file type, and filtering by file access and file size.
Lucene.PaodingSrc.jar
- The latest open-source Chinese word segmenter, paoding, including the jar package and source code. It may be of some help to people designing search systems.
Search.test1
- Mainly a test of downloading files from the Internet with ASP.NET; it can also parse Word, Excel, and PDF files into text files. Restriction: Office 2000 must be installed.
python_sina_crawl
- A crawler for Sina Weibo. How to run: after saving all the code, open Main.py and set LoginName to your Sina Weibo account and PassWord to your password. Run Main.py; the program creates a CrawledPages folder in the current directory and saves all crawled files there.
openpyxl-2.4.1.tar
- Reads MS Excel files from Python. (init)
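
Note on inverted_index.rar: the entry above describes a simple inverted-file implementation built on the STL. For readers unfamiliar with the idea, here is a minimal sketch in Python rather than C++. It is not the code from that package; the function names (build_inverted_index, search) and the sample documents are made up for illustration, and the tokenizer simply splits on word characters.

```python
from collections import defaultdict
import re


def build_inverted_index(docs):
    """Map each lowercase word to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive tokenizer (an assumption): runs of word characters, lowercased.
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def search(index, query):
    """Return ids of documents that contain every word of the query (AND search)."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    result = set(index.get(words[0], []))
    for word in words[1:]:
        # Intersect posting lists so only documents with all words remain.
        result &= set(index.get(word, []))
    return sorted(result)


# Hypothetical sample documents, keyed by file name.
docs = {
    "a.txt": "simple search engine",
    "b.txt": "a simple crawler",
    "c.txt": "search engine crawler",
}
index = build_inverted_index(docs)
print(search(index, "search engine"))
```

The index is built once and each query only touches the posting lists of its own words, which is why an inverted file is a common building block for a search engine.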
