[1] BAILIS P, KINGSBURY K. The network is reliable [J]. Communications of the ACM, 2014, 57(9): 48-55. DOI: 10.1145/2643130.
[2] DEAN J. Designs, lessons and advice from building large distributed systems [R/OL]. The 3rd ACM SIGOPS International Workshop on Large Scale Distributed Systems and Middleware. (2009-10-10)[2019-08-07]. http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf.
[3] ALSBERG P A, DAY J D. A principle for resilient sharing of distributed resources [C]//Proceedings of the 2nd International Conference on Software Engineering. IEEE, 1976: 562-570.
[4] STONEBRAKER M. Concurrency control and consistency of multiple copies of data in distributed INGRES [J]. IEEE Transactions on Software Engineering, 1979(3): 188-194. DOI: 10.1109/TSE.1979.234180.
[5] GARCIA-MOLINA H. Elections in a distributed computing system [J]. IEEE Transactions on Computers, 1982(1): 48-59.
[6] LEESATAPORNWONGSA T, LUKMAN J F, LU S, et al. TaxDC: A taxonomy of non-deterministic concurrency bugs in datacenter distributed systems [J]. ACM SIGPLAN Notices, 2016, 51(4): 517-530. DOI: 10.1145/2954679.2872374.
[7] GAO Y, DOU W S, QIN F, et al. An empirical study on crash recovery bugs in large-scale distributed systems [C]//Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. ACM, 2018: 539-550.
[8] FONSECA P, ZHANG K, WANG X, et al. An empirical study on the correctness of formally verified distributed systems [C]//Proceedings of the 12th European Conference on Computer Systems. ACM, 2017: 328-343.
[9] GANESAN A, ALAGAPPAN R, ARPACI-DUSSEAU A C, et al. Redundancy does not imply fault tolerance: Analysis of distributed storage reactions to single errors and corruptions [J]. ACM Transactions on Storage, 2017, 13(3): Article 20. DOI: 10.1145/3125497.
[10] DAWSON S, JAHANIAN F, MITTON T. ORCHESTRA: A fault injection environment for distributed systems [C]//Proceedings of the 26th International Symposium on Fault Tolerant Computing. IEEE, 1996: 404-414.
[11] GUNAWI H S, DO T, JOSHI P, et al. FATE and DESTINI: A framework for cloud recovery testing [C]//Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation. ACM, 2011: 238-252.
[12] KANAWATI G A, KANAWATI N A, ABRAHAM J A. FERRARI: A flexible software-based fault and error injection system [J]. IEEE Transactions on Computers, 1995, 44(2): 248-260. DOI: 10.1109/12.364536.
[13] LEESATAPORNWONGSA T, HAO M Z, JOSHI P, et al. SAMC: Semantic-aware model checking for fast discovery of deep bugs in cloud systems [C]//Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation. USENIX Association, 2014: 399-414.
[14] BASIRI A, BEHNAM N, DE ROOIJ R, et al. Chaos engineering [J]. IEEE Software, 2016, 33(3): 35-41. DOI: 10.1109/MS.2016.60.
[15] ALVARO P, ROSEN J, HELLERSTEIN J M. Lineage-driven fault injection [C]//Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. ACM, 2015: 331-346.
[16] ALVARO P, MARCZAK W R, CONWAY N, et al. Dedalus: Datalog in time and space [C]//International Datalog 2.0 Workshop Datalog 2.0 2010: Datalog Reloaded. Berlin: Springer, 2010: 262-281.
[17] BUNEMAN P, KHANNA S, TAN W C. Why and where: A characterization of data provenance [C]//International Conference on Database Theory. Berlin: Springer, 2001: 316-330.
[18] CUI Y, WIDOM J, WIENER J L. Tracing the lineage of view data in a warehousing environment [J]. ACM Transactions on Database Systems (TODS), 2000, 25(2): 179-227. DOI: 10.1145/357775.357777.
[19] ALVARO P, CONWAY N, HELLERSTEIN J M, et al. Consistency analysis in Bloom: A CALM and collected approach [C]//5th Biennial Conference on Innovative Data Systems Research (CIDR ’11). 2011: 249-260.
[20] LUKMAN J F, KE H, STUARDO C A, et al. FlyMC: Highly scalable testing of complex interleavings in distributed systems [C]//Proceedings of the 14th EuroSys Conference 2019. ACM, 2019: Article number 20.
[21] MAJUMDAR R, NIKSIC F. Why is random testing effective for partition tolerance bugs? [J]. Proceedings of the ACM on Programming Languages, 2018, 2(POPL): Article number 46. DOI: 10.1145/3158134.
[22] LAMPORT L. The part-time parliament [J]. ACM Transactions on Computer Systems (TOCS), 1998, 16(2): 133-169. DOI: 10.1145/279227.279229.
[23] ONGARO D, OUSTERHOUT J. In search of an understandable consensus algorithm [C]//Proceedings of the 2014 USENIX Conference on USENIX Annual Technical Conference. USENIX Association, 2014: 305-320.