中国综合性科技类核心期刊(北大核心)

中国科学引文数据库来源期刊(CSCD)

美国《化学文摘》(CA)收录

美国《数学评论》(MR)收录

俄罗斯《文摘杂志》收录

Message Board

Respected readers, authors and reviewers, you can add comments to this page on any questions about the contribution, review, editing and publication of this journal. We will give you an answer as soon as possible. Thank you for your support!

Name
E-mail
Phone
Title
Content
Verification Code
Issue 5
Sep.  2017
Turn off MathJax
Article Contents
YU Yang, ZHOU Min-qi, FANG Zhu-he. An outer join algorithm based on Cuckoo filter[J]. Journal of East China Normal University (Natural Sciences), 2017, (5): 40-51. doi: 10.3969/j.issn.1000-5641.2017.05.005
Citation: YU Yang, ZHOU Min-qi, FANG Zhu-he. An outer join algorithm based on Cuckoo filter[J]. Journal of East China Normal University (Natural Sciences), 2017, (5): 40-51. doi: 10.3969/j.issn.1000-5641.2017.05.005

An outer join algorithm based on Cuckoo filter

doi: 10.3969/j.issn.1000-5641.2017.05.005
  • Received Date: 2017-06-19
  • Publish Date: 2017-09-25
  • In recent years, due to the development of the Internet, data size has been increased rapidly. At the era of big data, the analysis efficiency of distributed database system needs to be optimized urgently. Nevertheless, the join operation is the main performance bottleneck of a distributed database system. Join operations are mainly divided into inner join and outer join, and outer join is widely used in the business situations. Distributed join algorithm involves a large amount of network transmission, which affects the performance of the system severely. Although there are some studies in the literature of inner join optimized, these optimization methods cannot be directly applied to outer join. This paper proposes a distributed outer join algorithm based on Cuckoo filter. By building a Cuckoo filter with replication subdivision technology for data filtering and allocation, it reduces the amount of data transmission and improves the degree of parallelism accordingly. Finally, it improves the query performance. We implement this algorithm in Ginkgo. Based on the given extensive experimental verification, the algorithm largely improves the efficiency of the outer join.
  • loading
  • [1]
    STONEBRAKER M. The case for shared nothing[J]. Database Engineering, 1986, 9:4-9.
    [2]
    王立. 分布式内存数据库系统的查询处理与优化[D]. 上海: 华东师范大学, 2015. http://cdmd.cnki.com.cn/Article/CDMD-10269-1015348214.htm
    [3]
    邹先霞, 贾维嘉, 潘久辉.实化外连接视图的增量计算[J].系统工程与电子技术, 2011, 33(4):938-942. http://www.cnki.com.cn/Article/CJFDTOTAL-XTYD201104045.htm
    [4]
    张磊, 方祝和, 周敏奇, 等.面向内存计算的连接算法[J].华东师范大学学报(自然科学版), 2014(5):180-191. http://xblk.ecnu.edu.cn/CN/abstract/abstract25018.shtml
    [5]
    BERNSTEIN P A, CHIU D-M W. Using semi-joins to solve relational queries[J] Journal of the Association for Computing Machinery, 1981, 28(1):25-40. doi:  10.1145/322234.322238
    [6]
    樊秋实, 周敏奇, 周傲英.基线与增量数据分离架构下的分布式连接算法[J].计算机学报, 2016, 39(10):2102-2113 doi:  10.11897/SP.J.1016.2016.02102
    [7]
    茅潇潇, 段惠超, 高明. OceanBase中基于布隆过滤器的连接算法[J].华东师范大学学报(自然科学版), 2016(5):67-74. http://xblk.ecnu.edu.cn/CN/abstract/abstract25336.shtml
    [8]
    BLASGEN M W, ESWARAN K P. Storage and access in relational data bases[J]. IBM Systems Journal, 1977, 16(4):363-377. doi:  10.1147/sj.164.0363
    [9]
    赵宇兰.分布式数据库查询优化研究[M].成都:电子科技大学出版社, 2016.
    [10]
    LI J T, XIA X L. Research on outer join optimization in parallel DBMS[C]//Proceedings of the 2011 International Conference on Computer Science and Information Technology (ICCSIT 2011). 2011:502-507.
    [11]
    XU Y, KOSTAMAA P. A new algorithm for small-large table outer joins in parallel DBMS[C]//Proceedings of the IEEE 29th International Conference on Data Engineering (ICDE) (2010). IEEE, 2010:1018-1024.
    [12]
    CHENG L, KOTOULAS S, WARD T E, et al. Robust and efficient large-large table outer joins on distributed infrastructures[C]//European Conference on Parallel Processing. Berlin:Springer International Publishing, 2014:258-269.
    [13]
    XU Y, KOSTAMAA P, ZHOU X, et al. Handling data skew in parallel joins in shared-nothing systems[C]//Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data. ACM, 2008:1043-1052.
    [14]
    CHENG L, KOTOULAS S, WARD T E, et al. Efficiently handling skew in outer joins on distributed systems[C]//Proceedings of the 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. IEEE, 2014:295-304.
    [15]
    FAN B, ANDERSEN D G, KAMINSKY M, et al. Cuckoo Filter:Practically better than bloom[C]//Proceedings of the ACM International on Conference on Emerging Networking Experiments & Technologies. ACM, 2014:75-88.
    [16]
    PAGH R, RODLER F F. Cuckoo Hashing[J]. Journal of Algorithms, 2004, 51(2):122-144. doi:  10.1016/j.jalgor.2003.12.002
  • 加载中

Catalog

    通讯作者: 陈斌, bchen63@163.com
    • 1. 

      沈阳化工大学材料科学与工程学院 沈阳 110142

    1. 本站搜索
    2. 百度学术搜索
    3. 万方数据库搜索
    4. CNKI搜索

    Figures(9)  / Tables(2)

    Article views (217) PDF downloads(309) Cited by()
    Proportional views

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return