Open Access

Recognizing substrings of LR(k) languages in linear time

Authors:
Joseph Bates, Carnegie Mellon Univ., Pittsburgh, PA
Alon Lavie, Carnegie Mellon Univ., Pittsburgh, PA

ACM Transactions on Programming Languages and Systems, Volume 16, Issue 3, pp 1051–1077, https://doi.org/10.1145/177492.177768

Published: 01 May 1994

Abstract

LR parsing techniques have long been studied as efficient and powerful methods for processing context-free languages. A linear-time algorithm for recognizing languages representable by LR(k) grammars has long been known. Recognizing substrings of a context-free language is at least as hard as recognizing full strings of the language, since the latter problem easily reduces to the former. In this article we present a linear-time algorithm for recognizing substrings of LR(k) languages, thus showing that the substring recognition problem for these languages is no harder than the full string recognition problem. An interesting data structure, the Forest-Structured Stack, allows the algorithm to track all possible parses of a substring without losing the efficiency of the original LR parser. We present the algorithm, prove its correctness, analyze its complexity, and mention several applications that have been constructed.
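
The abstract does not spell out the reduction from full-string recognition to substring recognition. One standard way to see it, offered here as an illustrative sketch and not as the authors' construction, is to wrap every string of the language L in fresh end markers: a marked string `<w>` is then a substring of some member of L' = { `<v>` : v in L } only when w itself is in L. In the Python sketch below, the toy language (balanced parentheses), the marker characters `<` and `>`, and all function names are assumptions chosen for illustration, and the substring oracle is brute force, not the linear-time algorithm of the article.

```python
from itertools import product


def in_language(w: str) -> bool:
    """Toy full-string recognizer: balanced parentheses over the alphabet '()'."""
    depth = 0
    for ch in w:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0


def is_substring_of_marked(x: str, max_len: int) -> bool:
    """Brute-force substring oracle for L' = { '<' + w + '>' : w in L }.

    Enumerates every w in L with len(w) <= max_len and checks whether x
    occurs inside the marked string. Exponential time; illustration only.
    """
    for n in range(max_len + 1):
        for chars in product("()", repeat=n):
            w = "".join(chars)
            if in_language(w) and x in "<" + w + ">":
                return True
    return False


def recognize_via_substring(w: str) -> bool:
    """Decide w in L using only the substring oracle for the marked language.

    Because the markers occur nowhere else, '<' + w + '>' can only match a
    whole marked string, so candidates of length len(w) are sufficient.
    """
    return is_substring_of_marked("<" + w + ">", max_len=len(w))
```

For example, `recognize_via_substring("(())")` agrees with `in_language("(())")`, while the unmarked query `is_substring_of_marked("))", 4)` succeeds even though `"))"` is not in the language, which shows why the substring problem is the more permissive of the two.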

Index Terms

Recognizing substrings of LR(k) languages in linear time
1. Software and its engineering
  1. Software notations and tools
    1. Compilers
      1. Parsers
      2. Translator writing systems and compiler generators
2. Theory of computation
  1. Formal languages and automata theory
    1. Grammars and context-free languages
  2. Semantics and reasoning
    1. Program reasoning
      1. Parsing

Recommendations

Deterministic parsing of ambiguous grammars

Methods of describing the syntax of programming languages in ways that are more flexible and natural than conventional BNF descriptions are considered. These methods involve the use of ambiguous context-free grammars together with rules to resolve ...
On parsing and condensing substrings of LR languages in linear time

LR parsers have long been known as being an efficient algorithm for recognizing deterministic context-free grammars. In this article, we present a linear-time method for parsing substrings of LR languages. The algorithm depends on the LR automaton which ...
Recognizing substrings of LR(k) languages in linear time
POPL '92: Proceedings of the 19th ACM SIGPLAN-SIGACT symposium on Principles of programming languages

LR parsing techniques have long been studied as efficient and powerful methods for processing context free languages. A linear time algorithm for recognizing languages representable by LR(k) grammars has long been known. Recognizing substrings of a ...

Reviews

Reviewer: Marian Gheorghe

The problem of recognizing the substrings of LR(k) languages is addressed. A detailed description of the algorithm in the case of one lookahead symbol is given in the second section. An interesting data structure, called a forest-structured stack, is used by the algorithm. Other algorithms parse the substrings of a language [1] by modifying the LR parsing tables of the classical approach [2]; in contrast, the algorithm presented in this paper modifies the parsing algorithm itself and does not require any changes in the original LR tables. It is important that this method has a linear time complexity, the same as that of the original LR parsing algorithm. The algorithm is slightly modified, in the fifth section, to deal with canonical LR(k) grammars, for k ≥ 2. The correctness and the complexity of the algorithm are established in the third and fourth sections. The conclusions mention that a first version of the algorithm was developed by the first author in 1980 and was applied as a basis for syntax checking in the XEDIT editor. New results concerning the application of this recognition algorithm in different editor systems as a syntax checker and in the areas of speech recognition and parsing natural languages are also reported. Because of its linear time complexity, this algorithm seems to be appropriate for different applications involving substring recognition.

Published in

ACM Transactions on Programming Languages and Systems Volume 16, Issue 3
May 1994
773 pages
ISSN:0164-0925
EISSN:1558-4593
DOI:10.1145/177492

Copyright © 1994 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
LR parsing
substrings
Qualifiers
- article

Article Metrics
- Total Citations: 17
- Total Downloads: 534
- Downloads (Last 12 months): 31
- Downloads (Last 6 weeks): 6