Computable Syntactic Analysis: The 1959 Computer Sentence-Analyzer

X is the center of XY (or of Y), and Y is the adjunct of XY (or of X) in sentences S₁, if for every S₁ we can obtain a sentence S₂ by replacing XY by X, but not in general by replacing XY by Y, and if X is the smallest part of XY for which this holds. X, Y range over the set of word-categories defined for each language; for certain purposes they can be taken to range over the individual words or morphemes (word-parts) of the language. If for every sentence composed of the class or constituent (§ 2.7) sequence ABC there exists a sentence AC, the centers are A and C; but this definition would not specify to which center B is the adjunct. If, in addition, for every sentence ABD there exists a sentence AD, we can add to the definition that Y is adjunct of X only if in all sentences of the form S₂ (here = AC ⋁ AD) every X is replaceable by XY yielding a sentence (an S₁). The analysis is of interest, of course, only if S₂ (and S₁) are convenient types of S for a characterization of language structure, not if they leave inconvenient residues of S types. If X is the center of some sentence section, and X also appears elsewhere in S₁, a unique decision as to which occurrence of X is the center requires the addition either of simplicity conditions (over the structure of the adjunct or the types of S in which X is center) or else of co-occurrence similarities (X₁ and not X₂ is the center if, in the set of word-triples for which X₁X₂Y occurs, the dependence of X₁ values on Y values is great in comparison with the dependence of X₂). With a definition thus strengthened, the choice of centers is unique, including the center ∑ of a whole sentence, except in unimportant respects (e. g. in X and X, either X could be taken as center).
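The substitution test in this definition can be sketched in code. The following is a minimal illustration only, assuming sentences are given as tuples of word-category symbols and that the toy corpus stands in for the set of all sentences; the names `replace_pair` and `is_center` are hypothetical, and the minimality condition ("the smallest part of XY") and the later strengthenings are not modelled.

```python
def replace_pair(sentence, pair, keep):
    """Yield each sequence obtained by replacing one adjacent
    occurrence of `pair` in `sentence` with the single symbol `keep`."""
    first, second = pair
    for i in range(len(sentence) - 1):
        if sentence[i] == first and sentence[i + 1] == second:
            yield sentence[:i] + (keep,) + sentence[i + 2:]


def is_center(pair, center, sentences):
    """Return True if replacing the pair by `center` always yields a
    sentence of the corpus, while replacing it by the other member
    does not always do so."""
    corpus = set(sentences)
    other = pair[1] if center == pair[0] else pair[0]
    keep_center = [r in corpus for s in corpus
                   for r in replace_pair(s, pair, center)]
    keep_other = [r in corpus for s in corpus
                  for r in replace_pair(s, pair, other)]
    return bool(keep_center) and all(keep_center) and not all(keep_other)


# Toy corpus (an assumption for illustration): "the A N W V" and "the N W V".
corpus = [("T", "A", "N", "W", "V"), ("T", "N", "W", "V")]
n_is_center = is_center(("A", "N"), "N", corpus)
a_is_center = is_center(("A", "N"), "A", corpus)
```

On this toy corpus N comes out as the center of AN and A as its adjunct, since dropping A leaves a sentence of the corpus while dropping N does not.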
However, the assignment of word sequences to substrings is not unique; so that for some sentences, the fact that a sentence is a case of one structure does not exclude its being recognized as also a case of some other structure.
As before, the structural assignment need not be unique: a given word in a given position may fit into two or more structural assignments. This is not a failure of the computation, but a specific and known homomorphic mapping (homonymous ambiguity) of the set of structures onto the set of sentences.
The major word categories are: A adjective (recursive,...), N noun (operation,...), V verb (recognize,...), T article (the,...), W tense or auxiliary (-ed, will,...), P preposition (of,...), D adverb (recursively,...), C conjunction (and, because), V+ verb with its full object (took the book, elected the man president), V− verb with one N̄ or N̋ missing from its object (took, elected president). V_{i + i} indicates that the object is of type i which is called for by a V of subset i. (The matching subscripts may be omitted, since they are understood; see note 17.) S sentence, ∑ sentence-center; for the major S type considered here, ∑ is NWV_{i + i}.
G(N) itself does not in turn produce a new N, and hence is not recursive. But in some cases it is formed out of some other recursive operation: If G′(P) is the left concatenation of P onto P, forming e. g. over near from near, and out over near from over near, etc., we have: G′(0, Pⁱ) = P = P⁰ and G′(n,Pⁱ) = Pⁱ⁺¹. Then G(N) consists in the left concatenation of Pⁱ (i. e. of the resultant of G′(P)) to N.
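The recursive left concatenation described here can be illustrated with a short sketch. This is an assumption-laden illustration, not the author's formulation: P-strings are modelled as lists of words, the helper names `p_string_stages` and `adjoin_to_noun` are hypothetical, and the noun used below is an invented example.

```python
def p_string_stages(innermost, added):
    """Return the successive P-strings P^0, P^1, ... obtained by
    left-concatenating each preposition in `added` onto the previous
    resultant, starting from the single preposition `innermost`."""
    stages = [[innermost]]            # P^0 = P
    for p in added:
        stages.append([p] + stages[-1])   # P^(i+1) = p + P^i
    return stages


def adjoin_to_noun(p_string, noun):
    """G(N): left-concatenate a finished P-string onto N.
    The result is not itself fed back into G, so G is not recursive."""
    return " ".join(p_string + [noun])


stages = p_string_stages("near", ["over", "out"])
phrase = adjoin_to_noun(stages[-1], "Boston")  # "Boston" is an invented noun
```

Here `stages` holds near, over near, and out over near in turn, matching the examples in the note; only the inner operation feeds on its own resultant.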
Although a greater number of repetitions, or a wider variety of words in the adjoined substrings, gives an increasingly bizarre effect (in different degrees for different operations).
There are various additional operations of this type. E. g. F_a- operates on N to yield compound nouns (A-N: wild-flowers); thus F_n and F_a- both produce N_n and can operate on each other’s resultants. F_n can also operate on A (including Wing and Wed) to yield compound adjectives (N-A: stone-cold). F_a can in some cases operate on N_p (i. e. N_a⁰ = N_pⁱ) as in a veritable bull in a china shop.
F_d can operate on A even when it is not part of N_a: This is very nice. But F_a operates only on N_a; A appearing elsewhere are not repeatable: we don’t have This is nice old. V (not all) with -ing, -ed can also be adjoined by F_a; or additional operators have to be set up for them: burning interest, broken tubes. We do not say that F_d operates on N_a, since it does not operate on N_a⁰ (e. g. on N-N or on N). We cannot say that F_a adjoins A_d to N_a, because only the exterior A can be preceded by D; i. e. F_d operates after F_a: there is no DADAN (without commas). F_d also operates on PN: completely at ease.
We do not write N_d, since this would indicate a recursive operation on N, and would include the non-existent F_d on N_a⁰. If either the operator F_x or the operand Y is limited, we write F_x.Y for their permitted resultants, not Y_xⁱ. A number of detailed restrictions are omitted in this survey. Also omitted are distinctions among some operators (and hence some word-classes) which are grouped together into F_q and into F_e.
A_r·F_p (and A_r with several other F_r, chiefly F_k and F_e, the various F not repeated) is itself a right operator (with difficulty repeatable) on N; we may write it F_-a: children lost in thought.
F_t- also (infrequently) adjoins a hyphenated to V− to the right of A; the resultant may be written A_r, with A_r⁰ = A: a hard-to-distinguish thin line. (The to V− is not hyphenated when A_r is not in N_a: This is hard to distinguish.) There are other infrequently-met operators which hyphenate strings to the right of A in N_a, or simply adjoin the strings to A.
There are other, less-frequently occurring, right operators both on N and on V-containing strings. A right substring which is adjoined almost only to ∑ is , which NWV-,: as in I found her there, which I had long hoped to do.
I. e. (a) N, W, WV, etc.; (b) N-, A, D, PN, Wing+, Wen+, wh-strings, NVing +, C4 NWV +, etc.; (c) N_a, A_d, N_r, A_r, YCY, etc. In addition, Y can be F_t.A_d and F_t.F_n (the old and the new plans and the dress and the shoe sales); and there are some special and infrequent cases, such as the value of Y being two non-contiguous constituents: e. g. the subject and the object, as in He speaks English and I French. Among the types of strings excluded from the values of Y are (aside from certain special cases) the adjoinings due to two or more operations (which may be the same); e. g. there is no AACAA: nice large and new beautiful (without comma).
Subsidiary sentence types are chiefly: sentences in which certain N-replacer substrings, not adjoined to N, occupy the position of N in the major type; questions; imperatives; object-subject-verb arrangements.
One member of W is zero. If we did not admit zero members, we would have to say that W may or may not occur here.
For example, the subset V_e (containing the single verb have, which is also a member of other subsets) has as object W_{ien + i} (i. e. a verb of any subset i plus the suffix en plus the object of type i); the subset V_g (is, like, etc.) has as object W_{iing + i}; the subset V_n (sell, find) has N; the subset V_nt (find, know) has N to V_i + i; etc.
In V_i−i, the − equals: _+i minus one N̋; or _+iP (i. e. _+iPN̄ minus the last N̄). N̄ is a match between (a) the set of strings produced by the N′ generator and (b) the sequence of marks to the left of each N in a sentence; the first approximation to N̄ is the longest string (out of b) which ends on the right with this N and which is a member of the set N′. A second approximation is obtained if part of the N̄ can be reassigned to some other element in the well-formedness of the sentence: e. g. the ANN = N̄ may be reassigned into TA = N̄ plus NN = N̄, or into TAN = N̄ plus N = N̄ (both yielding two N̄ in the sentence); ACAN = N̄ may be assigned into A (as an object) plus C plus AN = N̄ and into D (as object or right identity) plus C plus DAN = N̄ (if an appropriate V precedes).
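The first approximation described in this note is a longest-match search leftward from each N. A minimal sketch follows, under loud assumptions: the set N′ is replaced by a hypothetical stand-in pattern (an optional T, any number of A, then one or more N), marks are single-character category symbols, and the name `first_approximation` is invented here. The second approximation (reassignment) is not modelled.

```python
import re

# Hypothetical stand-in for the N' generator: optional article,
# any number of adjectives, one or more nouns. The real set N'
# is defined elsewhere in the paper and is richer than this.
N_PRIME = re.compile(r"T?A*N+")


def first_approximation(marks, n_index):
    """Return (start, end) of the longest string of marks that ends
    at position n_index and is a member of the stand-in set N',
    or None if no such string exists."""
    for start in range(n_index + 1):          # leftmost start = longest
        candidate = "".join(marks[start:n_index + 1])
        if N_PRIME.fullmatch(candidate):
            return start, n_index
    return None


span_a = first_approximation(list("TANNWV"), 3)   # marks T A N N W V
span_b = first_approximation(list("VTAN"), 3)     # marks V T A N
```

Scanning start positions from the left means the first hit is the longest candidate, so no explicit length comparison is needed.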
Repetition with C of any circled element X or any succession of these XY, i. e. the adjoining of CX to X and of CXY to XY, is not shown here, but is to be carried out in accordance with 2.7.
In certain positions, e. g. the second N̋ of an object, few or none of these strings can be substituted for N̋. Note that while the F_r produce adjunct strings adjoined to an N′ (and together with that N′ constitute an N̋-string), the N-replacers are strings (mostly similar to some F_r) which occupy the position of an N′ in an including string.
The local-tree network can be completed by considering the dictionary also to be a tree, which in this case follows not the sequence of a text, but the sequence in the matched word and classification in the dictionary. (The entry, i. e. the word, reaches immediately to the output; but the analogy to the tree will become useful when we consider the variable output here as in the later trees.) Symbols from the computer program of TDAP 16-20 are: T1 = a, an; B = adjectivized pronouns (e. g. their); R = pronouns; T2 = the; Q = quantifiers (e. g. few); M = words not in the dictionary (mostly nouns); L = numbers; G = V + ing; S = V + en; N7 = measure nouns (e. g. minutes).
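The dictionary step can be pictured as a simple lookup that falls back to M for unlisted words. The sketch below is only an illustration under stated assumptions: it uses just the example words given in this note, assigns one symbol per word (a fuller dictionary would presumably list several classifications for some words), and the names `CATEGORY_OF`, `classify` and `classify_text` are hypothetical, not taken from the original program.

```python
# Example entries taken from the note; everything else is treated as M.
CATEGORY_OF = {
    "a": "T1",
    "an": "T1",
    "their": "B",     # adjectivized pronoun
    "the": "T2",
    "few": "Q",       # quantifier
    "minutes": "N7",  # measure noun
}


def classify(word):
    """Return the category symbol for a word, or M if the word
    is not in the dictionary (such words are mostly nouns)."""
    return CATEGORY_OF.get(word.lower(), "M")


def classify_text(words):
    """Map a sequence of words to a sequence of category symbols."""
    return [classify(w) for w in words]


symbols = classify_text(["The", "few", "minutes", "zebra"])
```

The word zebra is an invented example of an unlisted word, which receives the symbol M.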
See 5.2, 3.
The two relations of sentence-center to sentence (note 1 and 6.1) are not a surprising correlation, but the result of simplicity considerations in the construction of the analysis. The categories in respect to which the adjoinings were recursive (in 1 above) were determined on this basis: roughly, N̄ⁱ is that set of word sequences that can replace N⁰ (i. e. N) in a sentence.
Ad indicates D composed of A plus the adverbializing suffix -ly; An indicates N composed of A plus nominalization (in this case, the dropping of -ful). Various restrictions on the correspondences of substring and sentence are not mentioned here, but will be discussed in a later paper.
Under given conditions, these substrings correspond to other sentence transforms. For example, if F_p contains Wing of N₂, the corresponding sentence is often N₂ WV + (where + is zero): barking of dogs — dogs bark.
The substring adjoined, usually between commas, by F_n, is more exactly not N̋ but any string which can be the object of is in a sentence; compare the operator F_k.

University of Pennsylvania, USA
Zellig S. Harris


Harris, Z.S. (1970). Computable Syntactic Analysis: The 1959 Computer Sentence-Analyzer. In: Papers in Structural and Transformational Linguistics. Formal Linguistics Series. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-6059-1_16

DOI: https://doi.org/10.1007/978-94-017-6059-1_16
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-017-5716-4
Online ISBN: 978-94-017-6059-1
eBook Packages: Springer Book Archive
