Approximate entropy for testing randomness

Published online by Cambridge University Press:  14 July 2016

Andrew L. Rukhin*
Affiliation:
University of Maryland
* Postal address: Department of Mathematics and Statistics, University of Maryland at Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA. Email address: rukhin@math.umbc.edu

Abstract

This paper arose from interest in assessing the quality of random number generators. The problem of testing randomness of a string of binary bits produced by such a generator gained importance with the wide use of public key cryptography and the need for secure encryption algorithms. All such algorithms are based on a generator of (pseudo) random numbers; the testing of such generators for randomness became crucial for the communications industry where digital signatures and key management are vital for information processing.

The concept of approximate entropy has been introduced in a series of papers by S. Pincus and co-authors. The corresponding statistic is designed to measure the degree of randomness of observed sequences. It is based on incremental contrasts of empirical entropies based on the frequencies of different patterns in the sequence. Sequences with large approximate entropy must have substantial fluctuation or irregularity. Alternatively, small values of this characteristic imply strong regularity, or lack of randomness, in a sequence. Pincus and Kalman (1997) evaluated approximate entropies for binary and decimal expansions of e, π, √2 and √3 with the surprising conclusion that the expansion of √3 demonstrated much less irregularity than that of π. Tractable small sample distributions are hardly available, and testing randomness is based, as a rule, on fairly long strings. Therefore, to have rigorous statistical tests of randomness based on this approximate entropy statistic, one needs the limiting distribution of this characteristic under the randomness assumption. Until now this distribution remained unknown and was thought to be difficult to obtain. To derive the limiting distribution of approximate entropy we modify its definition. It is shown that the approximate entropy as well as its modified version converges in distribution to a χ² random variable. The P-values of approximate entropy test statistics for binary expansions of e, π and √3 are plotted. Although some of these values for √3 digits are small, they do not provide enough statistical significance against the randomness hypothesis.
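
The abstract does not state the formulas, so the following Python sketch rests on assumptions taken from the way this test is commonly stated for binary strings: block frequencies are counted with cyclic wraparound (one reading of the "modified" definition), the statistic is 2n[ln 2 − ApEn(m)], and it is referred to a χ² distribution with 2^m degrees of freedom. The function names `phi` and `apen_test` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def phi(bits, m):
    """Sum of p*log(p) over the observed overlapping m-blocks.

    Blocks are counted cyclically (the sequence wraps around), which is an
    assumed form of the modified definition mentioned in the abstract.
    """
    n = len(bits)
    if m == 0:
        return 0.0
    extended = bits + bits[:m - 1]  # append the first m-1 symbols for wraparound
    counts = Counter(tuple(extended[i:i + m]) for i in range(n))
    return sum((c / n) * math.log(c / n) for c in counts.values())


def apen_test(bits, m):
    """Return (ApEn(m), chi-square statistic, P-value) for a binary sequence.

    Assumes the statistic 2n[ln 2 - ApEn(m)] is approximately chi-square
    with 2**m degrees of freedom under the randomness hypothesis.
    """
    if m < 1:
        raise ValueError("this sketch requires block length m >= 1")
    n = len(bits)
    if n == 0:
        raise ValueError("empty sequence")
    apen = phi(bits, m) - phi(bits, m + 1)
    # Clamp at zero to absorb floating-point rounding.
    chi2 = max(0.0, 2.0 * n * (math.log(2) - apen))
    # The degrees of freedom 2**m are even, so the chi-square tail has the
    # closed form exp(-x) * sum_{j<k} x**j / j!, with k = 2**(m-1), x = chi2/2.
    k = 2 ** (m - 1)
    x = chi2 / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= x / j
        total += term
    p_value = math.exp(-x) * total
    return apen, chi2, p_value
```

Calling `apen_test(bits, m)` on a string or list of 0/1 symbols returns the three quantities; a small P-value suggests regularity, a large one is consistent with randomness. Because the limiting distribution is asymptotic, the sketch is only meaningful for fairly long sequences relative to 2^m.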

Type
Research Papers
Copyright
Copyright © 2000 by The Applied Probability Trust 

References

Billingsley, P. (1956). Asymptotic distributions of two goodness of fit criteria. Ann. Math. Statist. 27, 1123–1129.
Chaitin, G. (1975). Randomness and mathematical proof. Scientific American 232, 47–52.
Devroye, L., Gyorfi, L., and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Springer, New York, pp. 31–34.
Good, I. J. (1953). The serial test for sampling numbers and other tests for randomness. Proc. Camb. Phil. Soc. 47, 276–284.
Good, I. J. (1957). On the serial test for random sequences. Ann. Math. Statist. 28, 262–264.
Good, I. J. (1997). The roughness of visitations in a Markov chain, a review with extensions. In Advances in the Theory and Practice of Statistics, eds. Johnson, N. L. and Balakrishnan, N. John Wiley, New York, pp. 67–87.
Good, I. J., and Gover, T. N. (1967). The generalized serial test and the binary expansion of √2. J. Roy. Statist. Soc. A 130, 102–107.
Gustafson, H., Dawson, E., Nielsen, L., and Caelli, W. (1994). A computer package for measuring the strength of encryption algorithms. Computers and Security 13, 687–697.
Kimberley, M. (1987). Comparison of two statistical tests for keystream sequences. Electron. Lett. 23, 365–366.
Knuth, D. E. (1998). The Art of Computer Programming, Vol. 2, 3rd edn. Addison-Wesley, Reading, MA, pp. 61–80.
Marsaglia, G. (1985). A current view of random number generation. In Computer Science and Statistics: Proceedings of the Sixteenth Symposium on the Interface, Elsevier, New York, pp. 3–10.
Marsaglia, G. (1996). Diehard: a battery of tests for randomness. Available at http://stat.fsu.edu/geo/diehard.html.
Menezes, A. J., van Oorschot, P. C., and Vanstone, S. A. (1997). Handbook of Applied Cryptography. CRC Press, Boca Raton, p. 181.
Morales, D., Pardo, L., and Vajda, I. (1996). Uncertainty of discrete stochastic systems: General theory and statistical inference. IEEE Trans. Systems, Man and Cybernetics Part A 26, 681–697.
Pincus, S., and Huang, W.-M. (1992). Approximate entropy, statistical properties and applications. Commun. Statist. Part A Theory and Methods 21, 3061–3077.
Pincus, S., and Kalman, R. E. (1997). Not all (possibly) ‘random’ sequences are created equal. Proc. Nat. Acad. Sci. USA 94, 3513–3518.
Pincus, S., and Singer, B. H. (1996). Randomness and degrees of irregularity. Proc. Nat. Acad. Sci. USA 93, 2083–2088.
Rao, C. R., and Mitra, C. K. (1971). Generalized Inverse of Matrices and its Applications. John Wiley, New York, pp. 171–173.
Read, T. R., and Cressie, N. A. (1988). Goodness-of-Fit Statistics for Discrete Multivariate Data. Springer, New York, p. 46.
Seife, C. (1997). New test sizes up randomness. Science 276, 532.
Vajda, I. (1989). Theory of Statistical Inference and Information. Kluwer, Dordrecht, pp. 300–328.