
Recognizing substrings of LR(k) languages in linear time

Published: 01 May 1994

Abstract

LR parsing techniques have long been studied as efficient and powerful methods for processing context-free languages. A linear-time algorithm for recognizing languages representable by LR(k) grammars has long been known. Recognizing substrings of a context-free language is at least as hard as recognizing full strings of the language, since the latter problem easily reduces to the former. In this article we present a linear-time algorithm for recognizing substrings of LR(k) languages, thus showing that the substring recognition problem for these languages is no harder than the full string recognition problem. An interesting data structure, the Forest-Structured Stack, allows the algorithm to track all possible parses of a substring without losing the efficiency of the original LR parser. We present the algorithm, prove its correctness, analyze its complexity, and mention several applications that have been constructed.
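
To make the problem statement concrete, the sketch below checks substring membership by brute force for a toy language. It is not the article's algorithm: it runs in exponential time and uses no LR tables or Forest-Structured Stack. The language L = { a^n b^n : n >= 1 } and the names `in_language`, `is_substring_of_language`, and `max_pad` are assumptions chosen only for illustration.

```python
from itertools import product

ALPHABET = "ab"  # alphabet of the toy language (illustrative assumption)


def in_language(s: str) -> bool:
    """Full-string recognition for the toy language L = { a^n b^n : n >= 1 }."""
    n = len(s) // 2
    return n >= 1 and s == "a" * n + "b" * n


def is_substring_of_language(w: str, max_pad: int) -> bool:
    """Brute-force substring recognition.

    Returns True if some prefix x and suffix y, each of length at most
    max_pad, make x + w + y a sentence of L. The search is exponential in
    max_pad; it only illustrates the definition of the problem.
    """
    pads = [""]
    for k in range(1, max_pad + 1):
        pads += ["".join(p) for p in product(ALPHABET, repeat=k)]
    return any(in_language(x + w + y) for x in pads for y in pads)


if __name__ == "__main__":
    for w in ["aab", "abb", "ba"]:
        print(w, is_substring_of_language(w, max_pad=len(w) + 1))
```

Here "aab" and "abb" are substrings of sentences of L (for example "aabb"), while "ba" is not, because no sentence of L contains a b followed by an a. For this particular toy language a padding bound of len(w) + 1 is enough; in general no such simple bound is assumed.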

References

  1. AGRAWAL, R., AND DETRO, K. D. 1983. An efficient incremental LR parser for grammars with epsilon productions. Acta Inf. 19, 4, 369-376.
  2. AHO, A. V., AND JOHNSON, S. C. 1974. LR parsing. ACM Comput. Surv. 6, 2, 99-124.
  3. AHO, A. V., AND ULLMAN, J. D. 1977. Principles of Compiler Design. Addison-Wesley, Reading, Mass.
  4. AHO, A. V., AND ULLMAN, J. D. 1972. The Theory of Parsing, Translation and Compiling. Vol. I. Parsing. Prentice-Hall, Englewood Cliffs, N.J.
  5. AHO, A. V., SETHI, R., AND ULLMAN, J. D. 1986. Compilers: Principles, Techniques and Tools. Addison-Wesley, Reading, Mass.
  6. CELENTANO, A. 1978. Incremental LR parsers. Acta Inf. 10, 4, 307-321.
  7. CORMACK, G. V. 1989. An LR substring parser for noncorrecting syntax error recovery. In Proceedings of the SIGPLAN'89 Conference on Programming Language Design and Implementation. ACM, New York, 161-169.
  8. COPPERSMITH, D., AND WINOGRAD, S. 1987. Matrix multiplication via arithmetic progressions. In Proceedings of STOC'87. ACM Press, New York, 1-6.
  9. DEREMER, F. L. 1971. Simple LR(k) grammars. Commun. ACM 14, 7, 453-460.
  10. DEREMER, F. L. 1969. Practical translators for LR(k) languages. Ph.D. dissertation, MIT, Cambridge, Mass.
  11. EARLEY, J. 1970. An efficient context-free parsing algorithm. Commun. ACM 13, 2, 94-102.
  12. FISCHER, C. N. 1975. On parsing context-free languages in parallel environments. Ph.D. thesis, Tech. Rep. TR-75-237, Cornell University, Ithaca, N.Y.
  13. HOPCROFT, J. E., AND ULLMAN, J. D. 1979. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading, Mass.
  14. KNUTH, D. E. 1965. On the translation of languages from left to right. Inf. Contr. 8, 6, 607-639.
  15. KORENJAK, A. J. 1969. A practical method for constructing LR(k) processors. Commun. ACM 12, 11, 613-623.
  16. LEWIS, P. M., ROSENKRANTZ, D. J., AND STEARNS, R. E. 1976. Compiler Design Theory. Addison-Wesley, Reading, Mass.
  17. LIGETT, D., MCCLUSKEY, G., AND MCKEEMAN, W. M. 1982. Parallel LR parsing. Tech. Rep. TR-82-03, Wang Inst. of Graduate Studies, Tyngsboro, Mass.
  18. LINDSTROM, G. 1970. The design of parsers for incremental language processors. In Proceedings of the 2nd ACM Symposium on Theory of Computation. ACM, New York, 81-91.
  19. REKERS, J., AND KOORN, W. 1991. Substring parsing for arbitrary context-free grammars. In Proceedings of the 2nd International Workshop on Parsing Technologies. ACL SIGPARSE, 218-224.
  20. RICHTER, H. 1985. Noncorrecting syntax error recovery. ACM Trans. Program. Lang. Syst. 7, 3, 478-489.
  21. SCHELL, R. M. 1979. Methods for constructing parallel compilers for use in a multiprocessor environment. Ph.D. thesis, Tech. Rep. UIUCDCS-R-79-958, Univ. of Illinois, Urbana, Ill.
  22. SCHNEIDER, V. B. 1969. A system for designing fast programming language translators. In Proceedings of AFIPS. AFIPS, Washington, D.C.
  23. TOMITA, M. 1986. Efficient Parsing for Natural Language. Kluwer Academic Publishers, Hingham, Mass.
  24. VALIANT, L. 1975. General context free recognition in less than cubic time. J. Comput. Syst. Sci. 10, 2, 308-315.

          Reviews

          Marian Gheorghe

          The problem of recognizing the substrings of LR(k) languages is addressed. The second section gives a detailed description of the algorithm for the case of one lookahead symbol. The algorithm uses an interesting data structure called a forest-structured stack. Other algorithms parse the substrings of a language [1] by modifying the LR parsing tables of the classical approach [2]; in contrast, the algorithm presented in this paper modifies the parsing algorithm itself and requires no changes to the original LR tables. Importantly, this method has linear time complexity, the same as the original LR parsing algorithm. In the fifth section, the algorithm is slightly modified to deal with canonical LR(k) grammars for k ≥ 2. The correctness and the complexity of the algorithm are established in the third and fourth sections. The conclusions mention that a first version of the algorithm was developed by the first author in 1980 and was applied as a basis for syntax checking in the XEDIT editor. New results are also reported on applying this recognition algorithm as a syntax checker in different editor systems and in the areas of speech recognition and natural language parsing. Because of its linear time complexity, this algorithm seems appropriate for different applications involving substring recognition.


          • Published in

            ACM Transactions on Programming Languages and Systems, Volume 16, Issue 3
            May 1994
            773 pages
            ISSN:0164-0925
            EISSN:1558-4593
            DOI:10.1145/177492

            Copyright © 1994 ACM

            Publisher

            Association for Computing Machinery

            New York, NY, United States

