CS 244 ’19: Exploring Copysets under Repeated Failures



Jayden Navarro, Isaiah Brandt-Sims

Original paper: Asaf Cidon, Stephen M. Rumble, Ryan Stutsman, Sachin Katti, John Ousterhout, and Mendel Rosenblum. 2013. Copysets: reducing the frequency of data loss in cloud storage. In Proceedings of the 2013 USENIX conference on Annual Technical Conference (USENIX ATC’13). USENIX Association, Berkeley, CA, USA, 37-48.

For our final project, we reproduce and extend one of the results from the paper “Copysets: Reducing the Frequency of Data Loss in Cloud Storage” by Asaf Cidon et al. First, we present a recreation of Figure 6 and a limited simulated reproduction of it. The figure shows the probability of data loss under different replication schemes when 1% of the nodes in a cluster fail due to a correlated failure (e.g., power loss). We also highlight a mistake we discovered in one of the equations used to generate Figure 6 in the original work. Next, we extend the research by exploring the probability of data loss under repeated failure events. Our findings show that, when repeated failures occur at short time intervals, Copyset Replication parameters that yield a high probability of data loss under a single failure can outperform parameters that yield a low probability of data loss under a single failure.
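
To make the single-failure experiment concrete, the following is a minimal Monte Carlo sketch, not the simulator used for this project. It assumes a simplified permutation-based copyset construction (each random permutation of the nodes is cut into consecutive groups of `r` nodes) and counts a failure event as data loss whenever every node of at least one copyset has failed. The function names `build_copysets` and `loss_probability`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import math
import random


def build_copysets(num_nodes, r, scatter_width, rng):
    """Build copysets from random permutations (simplified assumption).

    Each permutation is split into consecutive groups of r nodes; leftover
    nodes at the end of a permutation are ignored.
    """
    num_permutations = math.ceil(scatter_width / (r - 1))
    nodes = list(range(num_nodes))
    copysets = set()
    for _ in range(num_permutations):
        rng.shuffle(nodes)
        for start in range(0, num_nodes - r + 1, r):
            copysets.add(frozenset(nodes[start:start + r]))
    return copysets


def loss_probability(copysets, num_nodes, failed_fraction, trials, rng):
    """Estimate the chance that one correlated failure destroys a copyset."""
    num_failed = max(1, round(num_nodes * failed_fraction))
    losses = 0
    for _ in range(trials):
        failed = set(rng.sample(range(num_nodes), num_failed))
        # Data is lost if all replicas of some copyset are among the failed nodes.
        if any(copyset <= failed for copyset in copysets):
            losses += 1
    return losses / trials


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative values only: 1000 nodes, 3 replicas, 1% of nodes failing.
    for scatter_width in (2, 10, 50):
        copysets = build_copysets(1000, 3, scatter_width, rng)
        p = loss_probability(copysets, 1000, 0.01, 2000, rng)
        print(f"S={scatter_width:3d} copysets={len(copysets):5d} P(loss)~{p:.4f}")
```

The sketch covers only a single failure event. Modeling repeated failures would additionally require assumptions about recovery time and the interval between events, which are not specified here.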
