ABSTRACT
Not all important network properties need to hold at all times. Often, what matters instead is the fraction of time, or the probability, that these properties hold. Computing the probability of a property in a network that relies on complex, interdependent routing protocols is challenging: it requires determining all failure scenarios in which the property is violated. Doing so accurately and at scale is beyond the capabilities of current network analyzers.
In this paper, we introduce NetDice, the first scalable and accurate probabilistic network configuration analyzer supporting BGP, OSPF, ECMP, and static routes. Our key contribution is an inference algorithm that efficiently explores the space of failure scenarios. More specifically, given a network configuration and a property φ, our algorithm automatically identifies a set of links whose failure provably cannot change whether φ holds. By pruning the corresponding failure scenarios, NetDice accurately approximates P(φ). NetDice supports practical properties and expressive failure models, including correlated link failures.
We implement NetDice and evaluate it on realistic configurations. NetDice is practical: it can precisely verify probabilistic properties in a few minutes, even in large networks.
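The pruning idea described in the abstract can be illustrated with a small, heavily simplified sketch. It is not NetDice's algorithm: it assumes independent link failures, ignores BGP, OSPF, ECMP, and static routes entirely, and takes φ to be plain reachability between two nodes in an undirected graph. The names `find_path`, `prob_reachable`, and `fail_prob` are hypothetical and chosen for illustration only. The point it shows is that, once a working path is found, only the links on that path need to be branched on; the failure of any other link cannot change whether φ holds, so those scenarios are never enumerated.

```python
from collections import deque


def find_path(links, down, src, dst):
    """BFS over all links not in `down`; return a path as a list of links, or None."""
    adj = {}
    for link in links:
        if link in down:
            continue
        u, v = link
        adj.setdefault(u, []).append((v, link))
        adj.setdefault(v, []).append((u, link))
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while parent[node] is not None:
                prev, link = parent[node]
                path.append(link)
                node = prev
            return path
        for nxt, link in adj.get(node, []):
            if nxt not in parent:
                parent[nxt] = (node, link)
                queue.append(nxt)
    return None


def prob_reachable(fail_prob, src, dst, up=frozenset(), down=frozenset()):
    """Exact P(dst reachable from src), assuming independent link failures.

    `up` and `down` hold links whose state is already fixed in this branch.
    All other links are undecided and optimistically treated as up.
    """
    path = find_path(fail_prob, down, src, dst)
    if path is None:
        # Failing more links cannot restore reachability: the whole branch violates phi.
        return 0.0
    # Only undecided links on the path matter; all other links are pruned.
    hot = [link for link in path if link not in up]
    total = 0.0
    prefix = 1.0  # probability that hot[0..i-1] are all up
    for i, link in enumerate(hot):
        # Branch: the first i hot links are up, the i-th one is down.
        total += prefix * fail_prob[link] * prob_reachable(
            fail_prob, src, dst, up | set(hot[:i]), down | {link}
        )
        prefix *= 1.0 - fail_prob[link]
    # Remaining mass: every hot link is up, so phi holds whatever the other links do.
    return total + prefix


# Hypothetical three-node topology: a direct link A-B and a detour A-C-B.
fail_prob = {("A", "B"): 0.1, ("A", "C"): 0.1, ("C", "B"): 0.1}
p = prob_reachable(fail_prob, "A", "B")
print(p)  # approximately 0.981
```

In this toy topology the recursion visits four branches instead of all 2^3 = 8 failure scenarios; on larger graphs, where most links lie off the paths in use, the gap is much wider. Handling real routing protocols and correlated failures, as the paper does, requires considerably more machinery than this sketch suggests.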
Index Terms
- Probabilistic Verification of Network Configurations