Search results: resource list
WordCount.tar
- The most basic and most important Hadoop example: WordCount. It runs map/reduce over the words in a file to obtain the number of times each word appears.
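The archive itself contains the Java example; purely as an illustrative sketch (the function names below are ours, not taken from the archive), the same map/reduce logic can be written in a Hadoop Streaming style in Python:

```python
import sys
from itertools import groupby

def mapper(lines):
    # Map phase: emit a (word, 1) pair for every word in the input.
    for line in lines:
        for word in line.split():
            yield word, 1

def reducer(pairs):
    # Reduce phase: with pairs grouped by key, sum the counts per word.
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    # Streaming-style driver: read lines from stdin, print "word<TAB>count".
    for word, count in reducer(mapper(sys.stdin)):
        print(f"{word}\t{count}")
```

In a real job the sort between map and reduce is performed by the Hadoop shuffle; `sorted()` stands in for it here.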
hadoop2011
- A basic Hadoop handbook, intended as a reference for Hadoop beginners; an introduction to Hadoop.
hadoop_Ubuntu_StudyNotes
- Study notes on learning Hadoop under Ubuntu.
mahout-distribution-0.5.tar
- Mahout is an open-source project under the Apache Software Foundation (ASF) that provides scalable implementations of classic machine-learning algorithms, aiming to help developers create intelligent applications more quickly and easily. The Apache Mahout project is in its third year and currently has three public releases. Mahout includes many implementations, covering clustering, classification, recommendation filtering, and frequent-itemset mining. In addition, by building on the Apache Hadoop libraries, Mahout scales effectively into the cloud.
testHdfs.tar
- Reads and writes HDFS through the C API provided by Hadoop.
HadoopStreaming.tar
- A self-defined InputFormat for use with Hadoop Streaming, which makes the Map input <key, value> pairs carry the file name as the key and the entire document content as the value.
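The archive solves this with a custom Java InputFormat. As a related sketch (ours, not the archive's code): when the mapper itself runs under Hadoop Streaming, it can read the name of the current input file from an environment variable that Streaming exports to the task (`map_input_file` in 0.20-era releases, `mapreduce_map_input_file` in later ones) and emit it as the key:

```python
import os
import sys

def map_lines(lines, environ=os.environ):
    # Hadoop Streaming exports the current split's file name to the mapper
    # process via an environment variable; fall back to "unknown" if unset.
    filename = environ.get("mapreduce_map_input_file",
                           environ.get("map_input_file", "unknown"))
    for line in lines:
        yield filename, line.rstrip("\n")

if __name__ == "__main__":
    # Emit tab-separated <file name, line> pairs on stdout.
    for key, value in map_lines(sys.stdin):
        print(f"{key}\t{value}")
```

This gives <file name, line> rather than <file name, whole document>; getting the entire file as one value is exactly what the custom InputFormat in the archive is for.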
Parallel_R
- Study materials on parallel execution in R; a tutorial on running R on Hadoop.
SequenceFile
- Operations on Hadoop's SequenceFile class, including writing and reading; suitable for those just beginning to learn Hadoop.
HDFS
- This code shows how to perform file operations on an already-configured Hadoop platform.
hadoopPSQL
- Papers on the relationship between Hadoop and SQL; they can help you understand how database principles are realized with Hadoop.
shortestPath
- Uses Hadoop to solve, in parallel, the shortest-path problem on graphs over massive data, with HBase as the back-end database; mainly intended for community (social-network) data analysis, and a good reference for learning Hadoop.
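As a hedged sketch of the technique (ours, not the archive's code): single-source shortest paths on an unweighted graph can be computed as iterated MapReduce rounds of parallel breadth-first search, where each node "maps" its tentative distance to its neighbours and the "reduce" keeps the minimum per node:

```python
from collections import defaultdict

INF = float("inf")

def sssp_round(graph, dist):
    # One MapReduce round: every node emits its own distance (to preserve it)
    # plus distance + 1 to each neighbour; the reduce keeps the minimum.
    emitted = defaultdict(list)
    for node, d in dist.items():
        emitted[node].append(d)
        if d < INF:
            for nbr in graph.get(node, []):
                emitted[nbr].append(d + 1)
    return {node: min(cands) for node, cands in emitted.items()}

def shortest_paths(graph, source):
    # Iterate rounds until no distance changes (the BFS frontier is exhausted).
    dist = {node: INF for node in graph}
    dist[source] = 0
    while True:
        new = sssp_round(graph, dist)
        if new == dist:
            return dist
        dist = new
```

The driver loop here mirrors the usual pattern of chaining Hadoop jobs, one per BFS level, until a round produces no updates.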
graph_betweeness
- Uses Hadoop to solve the betweenness problem on graphs, with the distributed database HBase; a very good example.
help
- Notes on backing up the NameNode metadata directory over NFS: write the mount into /etc/fstab (192.168.1.206:/u01/data/hadoop/hdfs/name /name_bak/ nfs defaults 0 0); after boot, run cat /proc/mounts to check whether the NFS directory is mounted, and mount it manually if it is not; then edit the master's hdfs-site.xml and add name_bak/hdfs/name to the configuration.
hdfs_shell.pdf
- Hadoop shell commands, collected from the web; identical to those on the official Apache site.
ApacheLog
- Demo source code for a simple log-analysis system on Hadoop.
Hhadoop-010taa
- Hadoop is a framework for running applications on large clusters of commodity hardware. It transparently provides applications with a set of stable and reliable interfaces and handles data movement. Hadoop implements Google's MapReduce algorithm, which can split an application into many small units of work, each of which can be executed or re-executed on any node in the cluster. In addition, Hadoop provides a distributed file system that stores data on the compute nodes and delivers high aggregate throughput for reading and writing data.
Pppageranka
- A MapReduce (Hadoop) implementation of PageRank; the package includes extraction of the link relationships, computation of the PageRank values, and display of the ranked results.
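As an illustrative sketch of the computation step (ours, not the package's code): one PageRank iteration maps each page's rank, split evenly, to the pages it links to, and the reduce sums the contributions per page and applies the damping factor:

```python
from collections import defaultdict

def pagerank_round(links, ranks, damping=0.85):
    # Map phase: each page emits rank / outdegree to every page it links to.
    contrib = defaultdict(float)
    for page, outlinks in links.items():
        if outlinks:
            share = ranks[page] / len(outlinks)
            for target in outlinks:
                contrib[target] += share
    # Reduce phase: sum contributions per page and apply the damping factor.
    n = len(links)
    return {page: (1 - damping) / n + damping * contrib[page]
            for page in links}
```

In the Hadoop version this round is one MapReduce job, chained until the ranks converge; a final sort-by-value job produces the ranking display.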
zookeeper
- In Hadoop-0.20.2 the SecondaryNameNode merely downloads the image (fsimage) and edit log (edits) from the NameNode periodically, merges them into a new image, and uploads it back to the NameNode; this process is called a CheckPoint. The node running the SecondaryNameNode holds neither the namespace nor the block-to-DataNode mappings in memory, so when the NameNode goes down the metadata cannot be fully recovered, only the metadata as of the last CheckPoint. For this reason, Hadoop-0.21.0 introduced a BackupNode.
accepted
- Hadoop WordCount source code, used to count the number of occurrences of each word.
AdvancedWordCount
- An advanced version of the Hadoop WordCount program; it can take punctuation and the like into account when counting words, making use of some of Hadoop's more advanced features.
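A minimal sketch of the punctuation-aware idea (ours, not the archive's code): tokenize with a regular expression so that punctuation attached to a word does not distort the counts:

```python
import re
from collections import Counter

WORD_RE = re.compile(r"[A-Za-z0-9']+")

def count_words(lines):
    # Extract alphanumeric tokens, ignoring punctuation, and lower-case them
    # so that "Hadoop," and "hadoop" are counted as the same word.
    counts = Counter()
    for line in lines:
        counts.update(w.lower() for w in WORD_RE.findall(line))
    return counts
```

In a real MapReduce job this tokenization would live inside the Mapper; the counting itself is the same shuffle-and-sum as in plain WordCount.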