DOI: 10.1145/1452520.1452542
research-article

Census and survey of the visible internet

Published: 20 October 2008

ABSTRACT

Prior measurement studies of the Internet have explored traffic and topology, but have largely ignored edge hosts. While the number of Internet hosts is very large, and many are hidden behind firewalls or in private address space, there is much to be learned from examining the population of visible hosts, those with public unicast addresses that respond to messages. In this paper we introduce two new approaches to explore the visible Internet. Applying statistical population sampling, we use censuses to walk the entire Internet address space, and surveys to probe frequently a fraction of that space. We then use these tools to evaluate address usage, where we find that only 3.6% of allocated addresses are actually occupied by visible hosts, and that occupancy is unevenly distributed, with a quarter of responsive /24 address blocks (subnets) less than 5% full, and only 9% of blocks more than half full. We show about 34 million addresses are very stable and visible to our probes (about 16% of responsive addresses), and we project from this up to 60 million stable Internet-accessible computers. The remainder of allocated addresses are used intermittently, with a median occupancy of 81 minutes. Finally, we show that many firewalls are visible, measuring significant diversity in the distribution of firewalled block size. To our knowledge, we are the first to take a census of edge hosts in the visible Internet since 1982, to evaluate the accuracy of active probing for address census and survey, and to quantify these aspects of the Internet.
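
The block-occupancy figures above (for example, the share of responsive /24 blocks that are less than 5% full) can be illustrated with a small sketch. This is not the authors' tooling: the function names `block_occupancy` and `share_below` and the sample addresses are hypothetical, and the input is assumed to be a plain list of IPv4 addresses that answered a probe.

```python
from collections import Counter
from ipaddress import IPv4Address


def block_occupancy(responsive):
    """Map each /24 block to the fraction of its 256 addresses that responded.

    `responsive` is an iterable of dotted-quad IPv4 strings (assumed input
    format); duplicates are counted once.
    """
    # Shifting the 32-bit address right by 8 bits leaves the /24 prefix.
    counts = Counter(int(IPv4Address(a)) >> 8 for a in set(responsive))
    return {
        f"{IPv4Address(prefix << 8)}/24": n / 256
        for prefix, n in counts.items()
    }


def share_below(occupancy, threshold):
    """Fraction of responsive blocks whose occupancy is below `threshold`."""
    if not occupancy:
        return 0.0
    return sum(1 for v in occupancy.values() if v < threshold) / len(occupancy)


if __name__ == "__main__":
    # Hypothetical probe results using documentation address ranges.
    sample = ["192.0.2.1", "192.0.2.2", "192.0.2.2", "198.51.100.7"]
    occ = block_occupancy(sample)
    print(occ)
    print(share_below(occ, 0.05))
```

Only blocks with at least one responsive address appear in the result, which matches the abstract's framing of occupancy among responsive blocks rather than among all allocated blocks.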


      • Published in

        IMC '08: Proceedings of the 8th ACM SIGCOMM conference on Internet measurement
        October 2008
        352 pages
        ISBN: 9781605583341
        DOI: 10.1145/1452520

        Copyright © 2008 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        Overall Acceptance Rate: 277 of 1,083 submissions, 26%
