ABSTRACT
A fundamental fact about polynomial interpolation is that k evaluations of a degree-(k-1) polynomial f are sufficient to determine f. This is also necessary in a strong sense: given only k-1 evaluations, we learn nothing about the value of f at any k-th point. In this paper, we study a variant of the polynomial interpolation problem. Instead of querying entire evaluations of f (which are elements of a large field F), we are allowed to query partial evaluations; that is, each evaluation delivers a few elements from a small subfield of F, rather than a single element from F. We show that in this model, one can do significantly better than in the traditional setting, in terms of the amount of information required to determine the missing evaluation. More precisely, we show that O(k) bits suffice to recover a missing evaluation. In contrast, the traditional method of looking at k evaluations requires Omega(k log(k)) bits. We also show that our result is optimal for linear methods, even up to the leading constants.

Our motivation comes from the use of Reed-Solomon (RS) codes for distributed storage systems, in particular for the exact repair problem. The traditional use of RS codes in this setting is analogous to the traditional interpolation problem. Each node in a system stores an evaluation of f, and if one node fails we can recover it by reading k other nodes. However, each node is free to send less information, leading to the modified problem above. The quickly developing field of regenerating codes has yielded several codes which take advantage of this freedom. However, these codes are not RS codes, and RS codes are still often used in practice; in 2011, Dimakis et al. asked how well RS codes could perform in this setting. Our results imply that RS codes can also take advantage of this freedom to download partial symbols.
In some parameter regimes (those with small levels of sub-packetization), our scheme for RS codes outperforms all known regenerating codes. Even with a high degree of sub-packetization, our methods give non-trivial schemes, and we give an improved repair scheme for a specific (14,10)-RS code used in the Facebook Hadoop Analytics cluster.
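To make the idea of partial evaluations concrete, here is a minimal Python sketch of a trace-based linear repair on a toy code. Everything about the instance is an assumption chosen for illustration and is not taken from the abstract: the field is F = GF(4) over the subfield GF(2), the code has n = 4 nodes and k = 2, and the helper names (gf4_mul, node_bit, trace_repair) are hypothetical. The abstract does not confirm that this is the construction used in the paper; it is only one way to realize "each node sends a subfield element". In this toy case the traditional repair reads k = 2 full symbols (4 bits), while the sketch reads one bit from each of the 3 surviving nodes (3 bits).

```python
from itertools import product

W = 2  # the element w of GF(4), with w^2 = w + 1; elements are the ints 0..3


def gf4_mul(a, b):
    """Multiply two GF(4) elements (addition in GF(4) is XOR)."""
    r = 0
    for i in range(2):
        if (b >> i) & 1:
            r ^= a << i
    if r & 0b100:  # reduce modulo w^2 + w + 1
        r ^= 0b111
    return r


def gf4_inv(a):
    """Inverse of a nonzero element, found by exhaustive search."""
    return next(b for b in range(1, 4) if gf4_mul(a, b) == 1)


def trace(x):
    """Field trace GF(4) -> GF(2): Tr(x) = x + x^2, always 0 or 1."""
    return x ^ gf4_mul(x, x)


def evaluate(c0, c1, x):
    """f(x) = c0 + c1*x, a polynomial of degree at most k - 1 = 1."""
    return c0 ^ gf4_mul(c1, x)


def node_bit(alpha, alpha_star, value):
    """The single bit a surviving node sends: Tr(f(alpha) / (alpha - alpha_star))."""
    return trace(gf4_mul(value, gf4_inv(alpha ^ alpha_star)))


def trace_repair(alpha_star, bits):
    """Recover f(alpha_star) from one bit per surviving node."""
    basis = (1, W)
    targets = []
    for zeta in basis:
        t = 0
        for alpha, bit in bits.items():
            # Tr(zeta * (alpha - alpha_star)) is known to the repairer;
            # it weights the bit received from node alpha.
            t ^= trace(gf4_mul(zeta, alpha ^ alpha_star)) & bit
        targets.append(t)  # equals Tr(zeta * f(alpha_star))
    # The two traces against a basis pin down a unique field element.
    return next(x for x in range(4)
                if [trace(gf4_mul(z, x)) for z in basis] == targets)


def check_all():
    """Exhaustively check every polynomial and every failed node."""
    for c0, c1, alpha_star in product(range(4), repeat=3):
        bits = {a: node_bit(a, alpha_star, evaluate(c0, c1, a))
                for a in range(4) if a != alpha_star}
        if trace_repair(alpha_star, bits) != evaluate(c0, c1, alpha_star):
            return False
    return True
```

Calling check_all() runs the repair for all 16 polynomials and all 4 choices of failed node. The sketch relies on the fact that, for these toy parameters, the polynomial Tr(zeta * (x - alpha_star)) / (x - alpha_star) has degree 1, so it defines a parity check of the code.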
- Cadambe, V. R., Huang, C., Jafar, S. A., and Li, J. Optimal repair of MDS codes in distributed storage via subspace interference alignment. arXiv:1106.1250 (2011).
- Cadambe, V. R., Huang, C., and Li, J. Permutation code: Optimal exact-repair of a single failed node in MDS code based distributed storage systems. In Information Theory Proceedings (ISIT), 2011 IEEE International Symposium on (2011), IEEE, pp. 1225–1229.
- Cadambe, V. R., Jafar, S., Maleki, H., Ramchandran, K., Suh, C., et al. Asymptotic interference alignment for optimal repair of MDS codes in distributed storage. Information Theory, IEEE Transactions on 59, 5 (2013), 2974–2987.
- Dimakis, A. G., Godfrey, P., Wu, Y., Wainwright, M. J., and Ramchandran, K. Network coding for distributed storage systems. Information Theory, IEEE Transactions on 56, 9 (2010), 4539–4551.
- Dimakis, A. G., Ramchandran, K., Wu, Y., and Suh, C. A survey on network codes for distributed storage. Proceedings of the IEEE 99, 3 (2011), 476–489.
- https://github.com/facebookarchive/hadoop-20/tree/master/src/contrib/raid/src/java/org/apache/hadoop/raid. Accessed: July 2015.
- Goparaju, S., Tamo, I., and Calderbank, R. An improved sub-packetization bound for minimum storage regenerating codes. Information Theory, IEEE Transactions on 60, 5 (2014), 2770–2779.
- Guruswami, V., and Wootters, M. Repairing Reed-Solomon codes. arXiv:1509.04764 (2015).
- Han, Y. S., Pai, H.-T., Zheng, R., and Varshney, P. K. Update-efficient regenerating codes with minimum per-node storage. In Information Theory Proceedings (ISIT), 2013 IEEE International Symposium on (2013), IEEE, pp. 1436–1440.
- Han, Y. S., Zheng, R., and Mow, W. H. Exact regenerating codes for Byzantine fault tolerance in distributed storage. In INFOCOM Proceedings (2012), IEEE, pp. 2498–2506.
- Huang, W., Langberg, M., Kliewer, J., and Bruck, J. Communication efficient secret sharing. arXiv:1505.07515 (2015).
- MacWilliams, F. J., and Sloane, N. J. A. The theory of error correcting codes. Elsevier, 1977.
- Papailiopoulos, D. S., Dimakis, A. G., and Cadambe, V. R. Repair optimal erasure codes through Hadamard designs. Information Theory, IEEE Transactions on 59, 5 (2013), 3021–3037.
- Rashmi, K., Shah, N. B., Kumar, P. V., and Ramchandran, K. Explicit codes minimizing repair bandwidth for distributed storage. In Allerton Conference on Communication, Control, and Computing (2009), IEEE, pp. 1243–1249.
- Rashmi, K., Shah, N. B., Kumar, P. V., and Ramchandran, K. Explicit construction of optimal exact regenerating codes for distributed storage. In Allerton Conference on Communication, Control, and Computing (2009), IEEE, pp. 1243–1249.
- Sathiamoorthy, M., Asteris, M., Papailiopoulos, D., Dimakis, A. G., Vadali, R., Chen, S., and Borthakur, D. XORing elephants: Novel erasure codes for big data. In Proceedings of the VLDB Endowment (2013), vol. 6, VLDB Endowment, pp. 325–336.
- Shanmugam, K., Papailiopoulos, D. S., Dimakis, A. G., and Caire, G. A repair framework for scalar MDS codes. Selected Areas in Communications, IEEE Journal on 32, 5 (2014), 998–1007.
- Suh, C., and Ramchandran, K. Exact-repair MDS codes for distributed storage using interference alignment. In Information Theory Proceedings (ISIT), 2010 IEEE International Symposium on (2010), IEEE, pp. 161–165.
- Suh, C., and Ramchandran, K. On the existence of optimal exact-repair MDS codes for distributed storage. arXiv:1004.4663 (2010).
- Tamo, I., and Barg, A. Bounds on locally recoverable codes with multiple recovering sets. In Information Theory (ISIT), 2014 IEEE International Symposium on (2014), IEEE, pp. 691–695.
- Tamo, I., and Barg, A. A family of optimal locally recoverable codes. Information Theory, IEEE Transactions on 60, 8 (2014), 4661–4676.
- Tamo, I., Papailiopoulos, D. S., and Dimakis, A. G. Optimal locally repairable codes and connections to matroid theory. In Information Theory Proceedings (ISIT), 2013 IEEE International Symposium on (2013), IEEE, pp. 1814–1818.
- Tamo, I., Wang, Z., and Bruck, J. Zigzag codes: MDS array codes with optimal rebuilding. Information Theory, IEEE Transactions on 59, 3 (2013), 1597–1616.
- Tamo, I., Wang, Z., and Bruck, J. Access versus bandwidth in codes for storage. Information Theory, IEEE Transactions on 60, 4 (2014), 2028–2037.
- University of Texas ECE Department. Erasure Coding for Distributed Storage wiki. http://storagewiki.ece.utexas.edu/. Accessed: July 2015.
- Wu, Y., and Dimakis, A. G. Reducing repair traffic for erasure coding-based storage via interference alignment. In Information Theory Proceedings (ISIT), 2009 IEEE International Symposium on (2009), IEEE, pp. 2276–2280.
- Wu, Y., Dimakis, A. G., and Ramchandran, K. Deterministic regenerating codes for distributed storage. In Allerton Conference on Control, Computing, and Communication (2007).
Index Terms
- Repairing Reed-Solomon codes