
Definable relations and first-order query languages over strings

Published: 01 September 2003

Abstract

We study analogs of classical relational calculus in the context of strings. We start by studying string logics. Taking a classical model-theoretic approach, we fix a set of string operations and look at the resulting collection of definable relations. These form an algebra: a class of n-ary relations for every n, closed under projection and Boolean operations. We show that by choosing the string vocabulary carefully, we get string logics that have desirable properties: computable evaluation and normal forms. We identify five distinct models and study the differences in their model theory and complexity of evaluation. We identify a subset of these models that have additional attractive properties, such as finite VC dimension and quantifier elimination.

Once you have a logic, the addition of free predicate symbols gives you a string query language. The resulting languages have attractive closure properties from a database point of view: while SQL does not allow the full composition of string pattern-matching expressions with relational operators, these logics yield compositional query languages that can capture common string-matching queries while remaining tractable. For each of the logics studied in the first part of the article, we study properties of the corresponding query languages. We give bounds on the data complexity of queries, extend the normal form results from logics to queries, and show that the languages have corresponding algebras expressing safe queries.
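As a rough illustration of what "an algebra of relations closed under projection and Boolean operations" and "adding a free predicate symbol" can mean, the following minimal sketch evaluates one query by brute force. It is not the article's method. Everything in it is an assumption made for illustration: the alphabet, the prefix relation as the only string operation, the database relation `R`, the helper names, and above all the truncation of the set of all strings to a small finite universe, which in general changes the meaning of quantifiers.

```python
from itertools import product

ALPHABET = "ab"   # hypothetical two-letter alphabet
MAX_LEN = 2       # assumption: truncate the set of all strings so it can be enumerated

# Finite stand-in universe: every string over ALPHABET of length <= MAX_LEN.
UNIVERSE = ["".join(p) for n in range(MAX_LEN + 1) for p in product(ALPHABET, repeat=n)]


def atom(arity, predicate):
    """Relation of all tuples in UNIVERSE^arity that satisfy `predicate`."""
    return {t for t in product(UNIVERSE, repeat=arity) if predicate(*t)}


def complement(rel, arity):
    """Negation: all tuples of the given arity that are not in `rel`."""
    return set(product(UNIVERSE, repeat=arity)) - rel


def project_out(rel, index):
    """Existential quantification: drop coordinate `index` from every tuple."""
    return {t[:index] + t[index + 1:] for t in rel}


# A free predicate symbol, interpreted by a (hypothetical) database relation.
R = {("ab",), ("ba",)}

# Q(x) := exists y ( R(y) and x is a proper prefix of y )
proper_prefix = atom(2, lambda x, y: y.startswith(x) and x != y)
r_on_second = atom(2, lambda x, y: (y,) in R)
query = project_out(proper_prefix & r_on_second, 1)  # conjunction, then projection

print(sorted(query))
print(sorted(complement(query, 1)))
```

Conjunction is set intersection, negation is complement relative to the universe, and the existential quantifier is a projection, so the query is built compositionally from the atomic relations. Over the full infinite set of strings this enumeration is not available, which is why properties such as computable evaluation and safety have to be established for each choice of vocabulary.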

