文件名称:Data-Processing-on-Large-Clusters
-
所属分类:
- 标签属性:
- 上传时间:2015-07-10
-
文件大小:149.21kb
-
已下载:0次
-
提 供 者:
-
相关连接:无下载说明:别用迅雷下载,失败请重下,重下不扣分!
介绍说明--下载内容来自于网络,使用问题请自行百度
Abstract:
Map-Reduce is a programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. Through a simple interface with two functions, map and reduce, this model facilitates parallel implementation of many real-world tasks such as data processing for search engines and machine learning. However, this model does not directly support processing multiple related heterogeneous datasets. While processing relational data is a common need, this limitation causes dif- ficulties and/or inefficiency when Map-Reduce is applied on relational operations like joins. We improve Map-Reduce into a new model called MapReduce-Merge. It adds to Map-Reduce a Merge phase that can efficiently merge data already partitioned and sorted (or hashed) by map and reduce modules. We also demonstrate that this new model can express relational algebra operators as well as implement several join algorithms.
Map-Reduce is a programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. Through a simple interface with two functions, map and reduce, this model facilitates parallel implementation of many real-world tasks such as data processing for search engines and machine learning. However, this model does not directly support processing multiple related heterogeneous datasets. While processing relational data is a common need, this limitation causes dif- ficulties and/or inefficiency when Map-Reduce is applied on relational operations like joins. We improve Map-Reduce into a new model called MapReduce-Merge. It adds to Map-Reduce a Merge phase that can efficiently merge data already partitioned and sorted (or hashed) by map and reduce modules. We also demonstrate that this new model can express relational algebra operators as well as implement several join algorithms.
(系统自动生成,下载前可以参看下载内容)
下载文件列表
Map-Reduce-Merge Simplified Relational Data Processing on Large Clusters.docx
本网站为编程资源及源代码搜集、介绍的搜索网站,版权归原作者所有! 粤ICP备11031372号
1999-2046 搜珍网 All Rights Reserved.