中国综合性科技类核心期刊(北大核心)

中国科学引文数据库来源期刊(CSCD)

美国《化学文摘》(CA)收录

美国《数学评论》(MR)收录

俄罗斯《文摘杂志》收录

Message Board

Respected readers, authors and reviewers, you can add comments to this page on any questions about the contribution, review, editing and publication of this journal. We will give you an answer as soon as possible. Thank you for your support!

Name
E-mail
Phone
Title
Content
Verification Code
Issue 5
Nov.  2014
Turn off MathJax
Article Contents
BIAN Hao-Qiong, CHEN Yue-Guo, DU Xiao-Yong, GAO Yan-Jie. Equi-join optimization on spark[J]. Journal of East China Normal University (Natural Sciences), 2014, (5): 261-270. doi: 10.3969/j.issn.1000-5641.2014.05.023
Citation: BIAN Hao-Qiong, CHEN Yue-Guo, DU Xiao-Yong, GAO Yan-Jie. Equi-join optimization on spark[J]. Journal of East China Normal University (Natural Sciences), 2014, (5): 261-270. doi: 10.3969/j.issn.1000-5641.2014.05.023

Equi-join optimization on spark

doi: 10.3969/j.issn.1000-5641.2014.05.023
  • Publish Date: 2014-09-25
  • Equi-join is one of the most common and costly operations in data analysis. The implementations of equijoin on Spark are different from those in parallel databases. Equi-join algorithms based on data prepartitioning which are commonly used in parallel databases can hardly be implemented on Spark. Currently common used equijoin algorithms on Spark, such as broadcast join and repartition join, are not efficient. How to improve equijoin performance on Spark becomes the key of big data analysis on Spark. This work combines the advantages of SimiJoin and Partition Join and provides an optimized equijoin algorithm based on the features of Spark. It is indicated by cost analysis and experiment that this algorithm is 12 times faster than algorithms used in stateoftheart data analysis systems on Spark.
  • loading
  • 加载中

Catalog

    通讯作者: 陈斌, bchen63@163.com
    • 1. 

      沈阳化工大学材料科学与工程学院 沈阳 110142

    1. 本站搜索
    2. 百度学术搜索
    3. 万方数据库搜索
    4. CNKI搜索
    Article views (2617) PDF downloads(1308) Cited by()
    Proportional views

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return