Issue 5
Nov.  2016
MAO Xiao-xiao, DUAN Hui-chao, GAO Ming. A join algorithm based on bloom filter in OceanBase[J]. Journal of East China Normal University (Natural Sciences), 2016, (5): 67-74. doi: 10.3969/j.issn.1000-5641.2016.05.008

A join algorithm based on bloom filter in OceanBase

doi: 10.3969/j.issn.1000-5641.2016.05.008
  • Received Date: 2016-06-24
  • Publish Date: 2016-09-25
  • In the era of big data, the de-IOE movement and large-scale events such as Double 11 have placed higher demands on the performance of distributed databases. OceanBase is an open-source distributed database developed by Alibaba. It supports cross-table relational queries over massive data, but its performance on complex queries still needs improvement. The network transmission overhead caused by the join operator seriously degrades the performance of a distributed database. This paper proposes a join algorithm based on a Bloom filter: it builds a Bloom filter on the join column of the left table and uses it to filter the data of the right table. The key benefit of the algorithm is that it reduces both unnecessary data transmission and the memory consumed by data processing. We implement the algorithm in OceanBase, and the experimental results show that it can greatly improve the efficiency of the join operator.
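
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the OceanBase implementation: the `BloomFilter` class, the `bloom_filter_join` function, the filter size, and the hash scheme are all assumptions chosen for illustration. The sketch builds a Bloom filter on the join column of the left table, uses it to discard right-table rows that cannot match (in a distributed setting this step would run where the right table is stored, so fewer rows cross the network), and then performs an exact hash join to eliminate any false positives.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: k salted hash positions over an m-bit array."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Sizes are illustrative assumptions, not values from the paper.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def bloom_filter_join(left_rows, right_rows, left_key, right_key):
    """Join two lists of dict rows, pre-filtering the right side.

    Returns the joined (left, right) pairs and the number of right rows
    that survived the filter, i.e. the rows that would be transmitted.
    """
    # Step 1: build the filter on the join column of the left table.
    bloom = BloomFilter()
    for row in left_rows:
        bloom.add(row[left_key])

    # Step 2: drop right rows whose join key is definitely not on the left.
    shipped = [r for r in right_rows if bloom.might_contain(r[right_key])]

    # Step 3: exact hash join on the survivors removes false positives.
    index = {}
    for row in left_rows:
        index.setdefault(row[left_key], []).append(row)
    joined = [(l, r) for r in shipped for l in index.get(r[right_key], [])]
    return joined, len(shipped)


if __name__ == "__main__":
    left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    right = [{"uid": 2, "v": "x"}, {"uid": 3, "v": "y"}, {"uid": 2, "v": "z"}]
    pairs, shipped_count = bloom_filter_join(left, right, "id", "uid")
    print(len(pairs), "joined pairs;", shipped_count, "right rows shipped")
```

Because a Bloom filter never produces false negatives, the join result is the same as without the filter; the only cost of a false positive is that an unneeded row is shipped and then rejected in the exact join step.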
