
Longest common subsequences of two random sequences

Published online by Cambridge University Press:  14 July 2016

Václav Chvátal
Affiliation:
Université de Montréal
David Sankoff
Affiliation:
Université de Montréal

Summary

Given two random k-ary sequences of length n, what is f(n,k), the expected length of their longest common subsequence? This problem arises in the study of molecular evolution. We calculate f(n,k) for all k, where n ≦ 5, and f(n,2) where n ≦ 10. We study the limiting behaviour of n⁻¹f(n,k) and derive upper and lower bounds on these limits for all k. Finally we estimate by Monte-Carlo methods f(100,k), f(1000,2) and f(5000,2).
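
As an illustration of the quantity f(n,k) defined above, the following is a minimal Python sketch, not the authors' method. It assumes that "random" means each symbol is drawn independently and uniformly from a k-letter alphabet, which the summary does not state explicitly. The function names `lcs_length`, `exact_f` and `monte_carlo_f` are hypothetical and chosen for this example only. The exact routine enumerates every pair of sequences and is therefore practical only for very small n; the Monte Carlo routine averages over sampled pairs.

```python
import itertools
import random


def lcs_length(a, b):
    """Length of a longest common subsequence of a and b (standard dynamic programming)."""
    prev = [0] * (len(b) + 1)  # row of the DP table for the previous prefix of a
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def exact_f(n, k):
    """Exact expectation by brute force over all k**n * k**n pairs.

    Assumption: symbols are i.i.d. uniform on {0, ..., k-1}, so every pair is equally likely.
    """
    seqs = list(itertools.product(range(k), repeat=n))
    total = sum(lcs_length(a, b) for a in seqs for b in seqs)
    return total / (len(seqs) ** 2)


def monte_carlo_f(n, k, trials, seed=0):
    """Monte Carlo estimate of the expectation under the same uniform i.i.d. assumption."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a = [rng.randrange(k) for _ in range(n)]
        b = [rng.randrange(k) for _ in range(n)]
        total += lcs_length(a, b)
    return total / trials
```

Under the uniform assumption, `exact_f(1, k)` is the probability that two single symbols agree, 1/k, and `exact_f(2, 2)` evaluates to 1.125. Dividing `monte_carlo_f(n, k, trials)` by n gives a rough estimate of the ratio n⁻¹f(n,k) whose limit the summary discusses; the dynamic program costs time proportional to n² per sampled pair, so large n with many trials is slow in pure Python.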

Type
Research Papers
Copyright
Copyright © Applied Probability Trust 1975 

