ABSTRACT
In this paper we describe an algorithm for localizing structured models, i.e., sequences of simple motifs and distance constraints. The algorithm combines standard pattern matching procedures with a constraint satisfaction solver and, unlike similar tools, can search for partial matches. A significant feature of our approach, especially for efficiency in the application context, is that the potentially exponentially many solutions to the problem are represented in compact form as a graph. Moreover, the time and space needed to build the graph are linear in the number of occurrences of the component patterns.
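The idea of representing many solutions compactly as a graph can be illustrated with a minimal sketch. It is not the algorithm of the paper: it assumes exact matching of each motif, a hypothetical gap constraint `(lo, hi)` measured from the end of one motif to the start of the next, and full matches only (no partial matches). The function names `occurrences`, `build_graph`, and `count_matches` are illustrative. This naive construction also stores edges explicitly, so it does not achieve the linear bound stated above.

```python
from bisect import bisect_left, bisect_right


def occurrences(text, motif):
    """Return start positions of all (possibly overlapping) exact occurrences."""
    hits, start = [], text.find(motif)
    while start != -1:
        hits.append(start)
        start = text.find(motif, start + 1)
    return hits


def build_graph(text, motifs, gaps):
    """Build a layered graph of motif occurrences.

    Layer i holds the occurrences of motifs[i]. An edge links position p in
    layer i to position q in layer i + 1 when the gap between the end of
    motifs[i] and the start of motifs[i + 1] lies within gaps[i] = (lo, hi).
    """
    layers = [occurrences(text, m) for m in motifs]
    edges = []
    for i, (lo, hi) in enumerate(gaps):
        end_offset = len(motifs[i])
        nxt = layers[i + 1]  # already sorted in increasing order
        layer_edges = {}
        for p in layers[i]:
            a = bisect_left(nxt, p + end_offset + lo)
            b = bisect_right(nxt, p + end_offset + hi)
            layer_edges[p] = nxt[a:b]
        edges.append(layer_edges)
    return layers, edges


def count_matches(layers, edges):
    """Count full matches by dynamic programming, without enumerating them."""
    counts = {q: 1 for q in layers[-1]}
    for i in range(len(edges) - 1, -1, -1):
        counts = {p: sum(counts[q] for q in edges[i][p]) for p in layers[i]}
    return sum(counts.values())


if __name__ == "__main__":
    layers, edges = build_graph("AABBC", ["A", "B", "C"], [(0, 2), (0, 2)])
    print(count_matches(layers, edges))
```

Each path through the layers corresponds to one match, so the number of matches can grow multiplicatively with the number of motifs while the graph holds only one node per occurrence.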