ABSTRACT
Prior measurement studies of the Internet have explored traffic and topology, but have largely ignored edge hosts. While the number of Internet hosts is very large, and many are hidden behind firewalls or in private address space, there is much to be learned from examining the population of visible hosts, those with public unicast addresses that respond to messages. In this paper we introduce two new approaches to explore the visible Internet. Applying statistical population sampling, we use censuses to walk the entire Internet address space, and surveys to frequently probe a fraction of that space. We then use these tools to evaluate address usage, where we find that only 3.6% of allocated addresses are actually occupied by visible hosts, and that occupancy is unevenly distributed, with a quarter of responsive /24 address blocks (subnets) less than 5% full, and only 9% of blocks more than half full. We show that about 34 million addresses are very stable and visible to our probes (about 16% of responsive addresses), and we project from this up to 60 million stable Internet-accessible computers. The remainder of allocated addresses are used intermittently, with a median occupancy of 81 minutes. Finally, we show that many firewalls are visible, measuring significant diversity in the distribution of firewalled block size. To our knowledge, we are the first to take a census of edge hosts in the visible Internet since 1982, to evaluate the accuracy of active probing for address census and survey, and to quantify these aspects of the Internet.
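The block-occupancy figures above (the share of responsive /24 blocks that are less than 5% full or more than half full) can be illustrated with a minimal sketch. This is not the authors' measurement code: the function names `block_occupancy` and `share_of_blocks` are hypothetical, and the sketch assumes the input is simply a list of IPv4 addresses that answered a probe.

```python
from collections import Counter
from ipaddress import IPv4Address


def block_occupancy(responsive):
    """Return {"/24 block": fraction of its 256 addresses that responded}.

    `responsive` is an iterable of dotted-quad IPv4 strings (an assumed
    input format); duplicate addresses are counted once.
    """
    # Dropping the low 8 bits of the address identifies its /24 block.
    counts = Counter(int(IPv4Address(addr)) >> 8 for addr in set(responsive))
    return {
        f"{IPv4Address(prefix << 8)}/24": n / 256
        for prefix, n in counts.items()
    }


def share_of_blocks(occupancy, low=0.05, high=0.5):
    """Return (share of blocks below `low` full, share above `high` full).

    Only blocks with at least one responsive address appear in
    `occupancy`, mirroring the "responsive /24 blocks" wording above.
    """
    total = len(occupancy)
    if total == 0:
        return 0.0, 0.0
    sparse = sum(1 for frac in occupancy.values() if frac < low)
    dense = sum(1 for frac in occupancy.values() if frac > high)
    return sparse / total, dense / total
```

Passing the addresses that answered one census pass to `block_occupancy`, and the result to `share_of_blocks`, yields the two shares that correspond to the "less than 5% full" and "more than half full" statistics. The thresholds are parameters so that other cut-offs can be examined.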