Abstract
We study analogs of classical relational calculus in the context of strings. We start by studying string logics. Taking a classical model-theoretic approach, we fix a set of string operations and look at the resulting collection of definable relations. These form an algebra: a class of n-ary relations for every n, closed under projection and Boolean operations. We show that by choosing the string vocabulary carefully, we get string logics that have desirable properties: computable evaluation and normal forms. We identify five distinct models and study the differences in their model-theory and complexity of evaluation. We identify a subset of these models that have additional attractive properties, such as finite VC dimension and quantifier elimination.

Once you have a logic, the addition of free predicate symbols gives you a string query language. The resulting languages have attractive closure properties from a database point of view: while SQL does not allow the full composition of string pattern-matching expressions with relational operators, these logics yield compositional query languages that can capture common string-matching queries while remaining tractable. For each of the logics studied in the first part of the article, we study properties of the corresponding query languages. We give bounds on the data complexity of queries, extend the normal form results from logics to queries, and show that the languages have corresponding algebras expressing safe queries.
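To make the idea of a compositional string query concrete, the following is a minimal sketch and not the article's formalism. It assumes a hypothetical vocabulary consisting of a prefix relation and a last-letter test, and it assumes that quantifiers range only over the strings stored in the database relation `R`. The names `is_prefix`, `ends_with`, and `query` are illustrative choices, not taken from the article.

```python
from typing import Callable, Iterable, Set


def is_prefix(x: str, y: str) -> bool:
    """Assumed string predicate: x is a (not necessarily strict) prefix of y."""
    return y.startswith(x)


def ends_with(letter: str) -> Callable[[str], bool]:
    """Assumed unary predicate family: the last letter of a string is `letter`."""
    return lambda x: x.endswith(letter)


def query(R: Iterable[str]) -> Set[str]:
    """Evaluate { x in R | exists y in R : x is a strict prefix of y and y ends in 'a' }.

    The string predicates are composed freely with the relational parts of the
    query (membership in R, existential quantification, conjunction).
    Quantification is restricted to strings in R, an assumption of this sketch.
    """
    rows = set(R)
    last_is_a = ends_with("a")
    return {
        x
        for x in rows
        if any(x != y and is_prefix(x, y) and last_is_a(y) for y in rows)
    }


if __name__ == "__main__":
    print(query({"ab", "aba", "b", "bb"}))
```

Here the string tests appear inside a quantified relational expression rather than as a separate pattern-matching layer, which is the kind of composition the abstract contrasts with SQL.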
Index Terms
- Definable relations and first-order query languages over strings