ABSTRACT
Two different approaches to determining the sequence of the human genome are currently being pursued: one is the “clone-by-clone” approach, employed by the publicly funded Human Genome Project, and the other is the “whole genome shotgun” approach, favored by researchers at Celera Genomics. An interim strategy employed at Celera, called hierarchical assembly, makes use of preliminary data produced by both approaches. This paper introduces the Bactig Ordering Problem, a key problem that arises in this context, and presents an efficient heuristic, called the greedy path-merging algorithm, that performs well on real data.