Finding Common Motifs with Gaps Using Finite Automata

  • Conference paper
Implementation and Application of Automata (CIAA 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4094)

Abstract

We present an algorithm that uses finite automata to find the common motifs with gaps occurring in all strings belonging to a finite set S = {S_1, S_2, ..., S_r}. To find these common motifs, we must first identify the factors that exist in each string, so the algorithm begins by constructing a factor automaton for each string S_i. To find the factors common to all the strings, the algorithm gathers the factors of every string in one data structure by computing an automaton that accepts the union of these factor automata. Using this automaton we create a new factor alphabet. Based on this factor alphabet, a finite automaton is created for each string S_i that accepts the sequences of all non-overlapping factors residing in that string. The intersection of the latter automata produces the finite automaton that accepts all the common subsequences with gaps over the factor alphabet that are present in all the strings of S. These common subsequences are the common motifs of the strings.
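
The pipeline outlined in the abstract can be illustrated with a small Python sketch. It is not the authors' construction: the per-string factor automata are simulated on the fly as sets of active positions, the common factors are obtained by running these simulations in lockstep (a product-style stand-in for the union automaton and factor alphabet of the paper), and a candidate motif is checked by a greedy left-to-right scan rather than by intersecting automata. The function names `step`, `common_factors`, `occurs_in_order`, and `is_common_motif` are hypothetical, and the sketch assumes that a gap may have any length, including zero.

```python
def step(s, states, ch):
    """Advance a set of positions in the nondeterministic factor automaton of s."""
    return frozenset(i + 1 for i in states if i < len(s) and s[i] == ch)


def common_factors(strings):
    """Return the non-empty factors that occur in every string, sorted."""
    # In a factor automaton every position is an initial (and final) state.
    start = tuple(frozenset(range(len(s) + 1)) for s in strings)
    alphabet = sorted(set.intersection(*(set(s) for s in strings)))
    found = []
    stack = [("", start)]
    while stack:
        word, config = stack.pop()
        for ch in alphabet:
            nxt = tuple(step(s, st, ch) for s, st in zip(strings, config))
            if all(nxt):  # every string still has at least one live position
                found.append(word + ch)
                stack.append((word + ch, nxt))
    return sorted(found)


def occurs_in_order(s, factors):
    """Check that the factors occur in s in the given order without overlapping."""
    pos = 0
    for f in factors:
        hit = s.find(f, pos)  # leftmost occurrence starting at or after pos
        if hit < 0:
            return False
        pos = hit + len(f)  # the next factor must start after this one ends
    return True


def is_common_motif(strings, factors):
    """A sequence of factors is a common motif if it occurs in order in every string."""
    return all(occurs_in_order(s, factors) for s in strings)


strings = ["abcd", "xabyd"]
print(common_factors(strings))                # ['a', 'ab', 'b', 'd']
print(is_common_motif(strings, ["ab", "d"]))  # True
print(is_common_motif(strings, ["d", "ab"]))  # False
```

The greedy scan is sufficient for the check because taking the leftmost occurrence of each factor leaves the longest possible suffix for the factors that follow. Enumerating all motifs this way would be exponential, which is why the paper builds and intersects automata instead.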

This research has been partially supported by the Ministry of Education, Youth and Sports under research program MSM 6840770014 and the Czech Science Foundation as project No. 201/06/1039.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Antoniou, P., Holub, J., Iliopoulos, C.S., Melichar, B., Peterlongo, P. (2006). Finding Common Motifs with Gaps Using Finite Automata. In: Ibarra, O.H., Yen, HC. (eds) Implementation and Application of Automata. CIAA 2006. Lecture Notes in Computer Science, vol 4094. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11812128_8

  • DOI: https://doi.org/10.1007/11812128_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-37213-4

  • Online ISBN: 978-3-540-37214-1

  • eBook Packages: Computer Science, Computer Science (R0)
