中国综合性科技类核心期刊(北大核心)

中国科学引文数据库来源期刊(CSCD)

美国《化学文摘》(CA)收录

美国《数学评论》(MR)收录

俄罗斯《文摘杂志》收录

留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

基于Paxos的分布式一致性算法的实现与优化

祝朝凡 郭进伟 蔡鹏

祝朝凡, 郭进伟, 蔡鹏. 基于Paxos的分布式一致性算法的实现与优化[J]. 华东师范大学学报(自然科学版), 2019, (5): 168-177. doi: 10.3969/j.issn.1000-5641.2019.05.014
引用本文: 祝朝凡, 郭进伟, 蔡鹏. 基于Paxos的分布式一致性算法的实现与优化[J]. 华东师范大学学报(自然科学版), 2019, (5): 168-177. doi: 10.3969/j.issn.1000-5641.2019.05.014
ZHU Chao-fan, GUO Jin-wei, CAI Peng. Implementation and optimization of a distributed consistency algorithm based on Paxos[J]. Journal of East China Normal University (Natural Sciences), 2019, (5): 168-177. doi: 10.3969/j.issn.1000-5641.2019.05.014
Citation: ZHU Chao-fan, GUO Jin-wei, CAI Peng. Implementation and optimization of a distributed consistency algorithm based on Paxos[J]. Journal of East China Normal University (Natural Sciences), 2019, (5): 168-177. doi: 10.3969/j.issn.1000-5641.2019.05.014

基于Paxos的分布式一致性算法的实现与优化

doi: 10.3969/j.issn.1000-5641.2019.05.014
基金项目: 

国家重点研发计划 2018YFB1003303

国家自然科学基金 61432006

详细信息
    作者简介:

    祝朝凡, 男, 硕士研究生, 研究方向为分布式数据库.E-mail:chaofanzhu2@163.com

    通讯作者:

    蔡鹏, 男, 副教授, 研究方向为高性能事务处理.E-mail:pcai@dase.ecnu.edu.cn

  • 中图分类号: TP392

Implementation and optimization of a distributed consistency algorithm based on Paxos

  • 摘要: 互联网的不断发展,企业的信息化程度不断加强,不计其数的数据需要得到及时处理.但是网络环境不稳定,容易发生数据丢失、节点宕机,从而造成严重后果.因此,构建可以容错的分布式存储系统变得越来越受欢迎.为了保证系统的高可用性和一致性,需要引入分布式一致性算法.为了提高系统在不稳定网络下的性能,传统基于Paxos的分布式系统允许日志中存在空洞.然而,当节点进入恢复状态时,这些系统通常需要大量网络交互来补全日志空洞,这极大地增加了节点恢复的时间,从而影响了系统的可用性.针对节点恢复过程中补全日志空洞代价过大的问题,本文重新设计了日志项结构,优化了数据恢复流程,通过实验模拟,验证改进的基于Paxos的一致性算法的有效性.
  • 图  1  主节点选举

    Fig.  1  Master node election

    图  2  日志文件结构

    Fig.  2  Structure of log file

    图  3  写请求过程

    Fig.  3  Process of write requests

    图  4  选主时间

    Fig.  4  Time of election

    图  5  恢复策略对比

    Fig.  5  Comparison of recovery strategy

    算法1   节点恢复算法
    1:  if已提交的日志没有全部回放
    2:    从日志文件中获取firstUnApplyIndex=还未回放的日志起点
    3:    从日志文件中获取firstUnchosenIndex=最小的未提交的日志项的索引值
    4:    for(i= firstUnApplyIndex to firstUnchosenIndex-1)do
    5:      以≤10个日志项为一组, 利用includeIndex[9]排除不需要回放的日志项
    6:      剔除后的这组日志如果还有空洞, 利用全局paxos获取日志
    7:      将这一组日志回放
    8:    end for
    9:  end if
    10:   通过全局paxos获取多数派认可的最大日志号=max_logIndex
    11:   for(i= firstUnchosenIndex to
    12:    针对每一项日志进行全局的paxos过程
    13:    if(获得大多数节点的确认)
    14:      提交日志号为i的日志
    15:      回放日志号为i的日志
    16:    end if
    17:   end for
    下载: 导出CSV
  • [1] ADDISIE A, BERTACCO V. Collaborative accelerators for in-memory MapReduce on scale-up machines[C]//Proceedings of the 24th Asia and South Pacific Design Automation Conference. New York: ACM, 2019: 747-753.
    [2] APPUSWAMY R, GKANTSIDIS C, NARAYANAN D, et al. Scale-up vs scale-out for hadoop: Time to rethink?[C]//Proceedings of the 4th annual Symposium on Cloud Computing. New York: ACM, 2013: 20.
    [3] KRASKA T, PANG G, FRANKLIN M J, et al. MDCC: Multi-data center consistency[C]//Proceedings of the 8th ACM European Conference on Computer Systems. New York: ACM, 2013: 113-126.
    [4] MUÑOZ-ESCOÍ F D, DE JUAN-MARÍN R, GARCÍA-ESCRIVÁ J R, et al. CAP theorem:Revision of its related consistency models[J]. The Computer Journal, 2019, 62(6):943-960. doi:  10.1093/comjnl/bxy142
    [5] LAMPORT L. Paxos made simple[J]. ACM Sigact News, 2001, 32(4):18-25. http://d.old.wanfangdata.com.cn/NSTLQK/NSTL_QKJJ0235401123/
    [6] LEE J, MUEHLE M. Distributed transaction management using two-phase commit optimization: U.S. Patent 8, 442, 962[P]. 2013-5-14.
    [7] ATIF M. Analysis and verification of two-phase commit& three-phase commit protocols[C]//2009 International Conference on Emerging Technologies. New York: IEEE, 2009: 326-331.
    [8] HERLIHY M. A quorum-consensus replication method for abstract data types[J]. ACM Transactions on Computer Systems (TOCS), 1986, 4(1):32-53. doi:  10.1145/6306.6308
    [9] BURROWS M. The Chubby lock service for loosely-coupled distributed systems[C]//Proceedings of the 7th Symposium on Operating systems design and implementation. USENIX Association, 2006: 335-350.
    [10] BAKER J, BOND C, JAMES C, et al. Megastore: Providing scalable, highly available storage for interactive services[C]//Proceedings of CIDR'11, 2011: 9-12.
    [11] CORBETT J C, DEAN J, EPSTEIN M, et al. Spanner:Google's globally distributed database[J]. ACM Transactions on Computer Systems (TOCS), 2013, 31(3):8. http://d.old.wanfangdata.com.cn/Periodical/txxb200506011
    [12] ZHENG J, LIN Q, XU J, et al. PaxosStore:High-availability storage made practical in WeChat[J]. Proceedings of the VLDB Endowment, 2017, 10(12):1730-1741. doi:  10.14778/3137765
    [13] RAO J, SHEKITA E J, TATA S. Using paxos to build a scalable, consistent, and highly available datastore[J]. Proceedings of the VLDB Endowment, 2011, 4(4):243-254. doi:  10.14778/1938545
    [14] OKI B M, LISKOV B H. Viewstamped replication: A new primary copy method to support highly-available distributed systems[C]//Proceedings of the seventh annual ACM Symposium on Principles of distributed computing. New York: ACM, 1988: 8-17.
    [15] OKI B M. Viewstamped replication for highly available distributed systems[R].Massachusetts Inst of Tech Cambridge Lab for Computer Science, 1988.
    [16] LAMPORT L, MASSA M. Cheap paxos[C]//International Conference on Dependable Systems and Networks, 2004. New York: IEEE, 2004: 307-314.
    [17] MAO Y, JUNQUEIRA F P, MARZULLO K. Mencius: Building efficient replicated state machines for WANs[C]//Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, OSDI'08, Berkeley, 2008: 369-384.
    [18] LAMPORT L B. Generalized paxos: U.S. Patent 7, 698, 465[P]. 2010-4-13.
    [19] ONGARO D, OUSTERHOUT J. In search of an understandable consensus algorithm[C]//2014{USENIX}Annual Technical Conference ({USENIX}{ATC}14). 2014: 305-319.
    [20] MORARU I, ANDERSEN D G, KAMINSKY M. Egalitarian paxos[C]//ACM Symposium on Operating Systems Principles, 2012.
    [21] LIN W, JIANG H, ZHAO N, et al. An optimized multi-Paxos protocol with centralized failover mechanism for cloud storage applications[C]//International Conference on Collaborative Computing: Networking, Applications and Worksharing. New York: Springer, 2018: 610-625.
    [22] POKE M, HOEFLER T. Dare: High-performance state machine replication on rdma networks[C]//Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing. New York: ACM, 2015: 107-118.
  • 加载中
图(5) / 表(1)
计量
  • 文章访问数:  148
  • HTML全文浏览量:  116
  • PDF下载量:  0
  • 被引次数: 0
出版历程
  • 收稿日期:  2019-07-28
  • 刊出日期:  2019-09-25

目录

    /

    返回文章
    返回