1 Introduction

The Sylvester–Gallai theorem is a well-known theorem in combinatorial geometry. It was proven by Gallai [18] in response to a question of Sylvester [33] from 40 years earlier (Fig. 1).

Fig. 1. Sylvester’s question [33]

Theorem 1.1

(Sylvester–Gallai theorem) Suppose that \(P\) is a finite set of points in the plane, not all on one line. Then there is an ordinary line spanned by \(P\), that is to say a line containing exactly two points of \(P\).

Several different proofs of this theorem now appear in the literature. We will be particularly interested in a proof due to Melchior [25] based on projective duality and Euler’s formula, which we will recall in Sect. 3. It is natural to wonder how many ordinary lines there are in a set \(P\) of points, not all on a line, when the cardinality \(|P|\) of \(P\) is equal to \(n\). Melchior’s argument in fact shows that there are at least three ordinary lines, but considerably more is known. Motzkin [26] was the first to obtain a lower bound (of order \(n^{1/2}\)) tending to infinity with \(n\). Kelly and Moser [22] proved that there are at least \(3n/7\) ordinary lines, and Csima and Sawyer [11] improved this to \(6n/13\) when \(n \ne 7\). Their work used ideas from the thesis of Hansen [20], which purported to prove the \(n/2\) lower bound but was apparently flawed. An illuminating discussion of this point may be found in the MathSciNet review of [11]. There are several nice surveys on this and related problems; see [4], [15, Chap. 17], [27] or [28].

One of our main objectives in this paper is to clarify this issue for large \(n\). The following theorem resolves, for large \(n\), a long-standing conjecture which has been known as the Dirac–Motzkin conjecture. Apparently neither author formally conjectures this in print, though Dirac [12] twice states that its truth is “likely”. Motzkin [26] does not seem to mention it at all.

Theorem 1.2

(Dirac–Motzkin conjecture) Suppose that \(P\) is a finite set of \(n\) points in the plane, not all on one line. Suppose that \(n \geqslant n_0\) for a sufficiently large absolute constant \(n_0\). Then \(P\) spans at least \(n/2\) ordinary lines.

We will in fact establish a more precise result obtaining the exact minimum for all \(n \geqslant n_0\) as well as a classification of the extremal examples. One rather curious feature of this more precise result is that if \(n\) is odd there must be at least \(3\lfloor n/4\rfloor \) ordinary lines. See Theorems 2.2 and 2.4 below for more details. When \(n\) is even, one can attain \(n/2\) ordinary lines by combining \(n/2\) equally spaced points on a circle with \(n/2\) points at infinity; see Proposition 2.1 below.

For small values of \(n\), there are exceptional configurations with fewer than \(n/2\) ordinary lines. Kelly and Moser [22] observe that a triangle together with the midpoints of its sides and its centroid has \(n = 7\) and just \(3\) ordinary lines. Crowe and McKee [10] provide a more complicated configuration with \(n = 13\) and 6 ordinary lines. It is possible that Theorem 1.2 remains true for all \(n\) with the exception of these two examples (or equivalently, one could take \(n_0\) as low as \(14\)). Unfortunately our method does not give a good bound for \(n_0\); we could in principle compute such a bound, but it would be of double exponential type and, we think, not worth the effort.

Our methods also apply (in fact with considerably less effort) to resolve sufficiently large instances of a slightly less well-known (but considerably older) problem referred to in the literature as the orchard problem. This was first formally posed by Sylvester [32] in 1868 (Fig. 3), though the 1821 book of Jackson [21] has a whole section containing puzzles of a similar flavour, posed more eloquently than any result in this paper (Fig. 2).

Fig. 2. Jackson’s question [21]

Fig. 3. Sylvester’s question [32]

Theorem 1.3

(Orchard problem) Suppose that \(P\) is a finite set of \(n\) points in the plane. Suppose that \(n \geqslant n_0\) for some sufficiently large absolute constant \(n_0\). Then there are no more than \(\lfloor n (n-3)/6\rfloor + 1\) lines that are \(3\)-rich, that is they contain precisely \(3\) points of \(P\). (Here and in the sequel, \(\lfloor x \rfloor \) denotes the integer part of \(x\).)

This theorem is tight for large \(n\), as noted by Sylvester [32], and also subsequently by Burr et al. [5], who discuss this problem extensively. We will give these examples, which are based on irreducible cubic curves, in Proposition 2.6 below. In fact these are the only examples where equality occurs for large \(n\): see the remarks at the very end of Sect. 9. Again, there are counterexamples for small \(n\). In particular, the example of a triangle, the midpoints of its sides and its centroid has \(n = 7\) but 6 lines containing precisely three points of \(P\); by contrast, the bound of Theorem 1.3 is \(5\) in this case.

As observed in [5], lower bounds for the number \(N_2\) of ordinary lines can be converted into upper bounds for the number \(N_3\) of 3-rich lines thanks to the obvious double-counting identity \(\sum _{k=2}^n \binom{k}{2} N_k = \binom{n}{2}\) (with \(N_k\) denoting the number of \(k\)-rich lines). In particular, previously known lower bounds on the Dirac–Motzkin conjecture can be used to deduce upper bounds on the orchard problem. However, one cannot deduce Theorem 1.3 in this fashion from Theorem 1.2; this is related to the fact that the extremal examples showing the sharpness of the two theorems are quite different, as we shall see in Sect. 2 below.
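As a sanity check on the double-counting identity, the following sketch counts the \(k\)-rich lines of the seven-point Kelly–Moser configuration exactly, using integer homogeneous coordinates. The particular triangle and the helper names `line_through` and `line_counts` are our own illustrative choices and are not taken from the references.

```python
from itertools import combinations
from math import comb

def line_through(p, q):
    """Coefficients (a, b, c) of the line through two projective points (cross product)."""
    return (p[1]*q[2] - p[2]*q[1],
            p[2]*q[0] - p[0]*q[2],
            p[0]*q[1] - p[1]*q[0])

def on_line(line, p):
    """Exact incidence test: [x, y, z] lies on ax + by + cz = 0."""
    return sum(a*b for a, b in zip(line, p)) == 0

def line_counts(points):
    """Return {k: N_k}, where N_k is the number of lines containing exactly k of the points."""
    lines = set()
    for p, q in combinations(points, 2):
        line = line_through(p, q)
        # Identify a line with the set of indices of the points lying on it.
        lines.add(frozenset(i for i, r in enumerate(points) if on_line(line, r)))
    counts = {}
    for incident in lines:
        counts[len(incident)] = counts.get(len(incident), 0) + 1
    return counts

# A triangle (chosen for convenience), the midpoints of its sides, and its centroid.
vertices = [(0, 0, 1), (6, 0, 1), (0, 6, 1)]
midpoints = [(3, 0, 1), (3, 3, 1), (0, 3, 1)]
centroid = [(2, 2, 1)]
counts = line_counts(vertices + midpoints + centroid)
print(counts)
```

Here `counts` is `{2: 3, 3: 6}`, so \(N_2 + 3N_3 = 21 = \binom{7}{2}\), in agreement with the identity. Discarding the terms with \(k \geqslant 4\) gives \(N_3 \leqslant \frac{1}{3}\big (\binom{n}{2} - N_2\big )\); inserting \(N_2 \geqslant n/2\) yields only \(N_3 \leqslant n(n-2)/6\), which is weaker than the bound of Theorem 1.3 once \(n > 6\).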

Underlying the proof of both of these results are structure theorems for sets with few ordinary lines, which are perhaps of independent interest. The most basic such result is the following. We use the asymptotic notation \(X=O(Y)\) or \(X \ll Y\) to denote the bound \(|X| \leqslant C Y\) for some absolute constant \(C\).

Theorem 1.4

(Weak structure theorem) Suppose that \(P\) is a finite set of \(n\) points in the plane. Suppose that \(P\) spans at most \(Kn\) ordinary lines for some \(K \geqslant 1\). Suppose also that \(n \geqslant \exp \exp (C K^C)\) for some sufficiently large absolute constant \(C\). Then all but at most \(O(K^{O(1)})\) points of \(P\) lie on an algebraic curve \(\gamma \) of degree at most 3.

In fact we establish a slightly more precise statement, see Proposition 6.14 below. Note that we do not require the algebraic curve \(\gamma \) to be irreducible; thus \(\gamma \) could be an irreducible cubic, the union of a conic and a line, or the union of three lines. As we shall see in later sections, cubic curves arise naturally in the study of point-line configurations with few ordinary lines, in large part due to the well-known abelian group structure (or pseudo-group structure) defined by the collinearity relation on such curves (or equivalently, by Chasles’s version of the Cayley–Bacharach theorem, see Proposition 4.1). The lower bound \(n \geqslant \exp \exp (CK^C)\) is present for rather artificial reasons, and can likely be improved substantially.

Projective geometry Much of the paper is best phrased in the language of projective geometry. We recall for the convenience of the reader the notion of the projective plane \(\mathbb R \mathbb P ^2\) as \((\mathbb R ^3 \setminus \{0\})/\sim \), where \((x, y, z) \sim (x^{\prime }, y^{\prime }, z^{\prime })\) if and only if there is some \(\lambda \ne 0\) such that \(x^{\prime } = \lambda x, y^{\prime } = \lambda y\) and \(z^{\prime } = \lambda z\). We denote points of \(\mathbb R \mathbb P ^2\) with square brackets, thus \([x, y, z]\) is the equivalence class of \((x, y, z)\) under \(\sim \). We have the embedding \(\mathbb R ^2 \hookrightarrow \mathbb R \mathbb P ^2\) given by \((x, y) \mapsto [x, y, 1]\); in fact \(\mathbb R \mathbb P ^2\) may be thought of as \(\mathbb R ^2\) together with the line at infinity consisting of points \([x, y, 0]\) (modulo the equivalence relation \(\sim \)). For the point-line incidence problems considered in this paper, the projective and affine formulations are equivalent. Indeed given a finite set of points \(P\) in \(\mathbb R \mathbb P ^2\), we may apply a generic projective transformation so as to move all points of \(P\) to the affine plane \(\mathbb R ^2\) if desired, without affecting the number of ordinary lines or 3-rich lines. This is illustrated in the figures in Sect. 2.

For our purposes, there are two main advantages of working in projective space instead of affine space. The first is to allow the use of projective transformations to normalise one’s geometric configurations, for instance by moving a line to the line at infinity, or transforming a non-singular irreducible cubic curve into an elliptic curve in Weierstrass normal form. The other main advantage is the ability to utilise point-line duality. Given a point \(p = [a,b,c]\), one may associate the line \(p^* := \{ [x, y, z] : ax + by + cz = 0\}\), and conversely given a projective line \(\ell = \{[x, y, z] : ax + by + cz = 0\}\) one may associate the point \(\ell ^* = [a, b, c]\). It is clear that \(p \in \ell \) if and only if \(\ell ^* \in p^*\). Working in the dual can provide us with information that is difficult to access otherwise. We shall see this twice: once in Sects. 3 and 4, when we apply Euler’s formula in the dual setting following an argument of Melchior [25], and then again in Sect. 6 where we will employ a convexity argument, due to Luke Alexander Betts, in the dual setting.
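In coordinates, duality amounts to reading the same triple either as a point or as the coefficients of a line. The short sketch below (helper names are ours) illustrates how three collinear points become three concurrent lines in the dual.

```python
def cross(u, v):
    """Cross product: the line through two points, or the point where two lines meet."""
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def incident(point, line):
    """[x, y, z] lies on {ax + by + cz = 0} exactly when the dot product vanishes."""
    return sum(a*b for a, b in zip(point, line)) == 0

# Three collinear points of the affine plane, embedded via (x, y) -> [x, y, 1].
p, q, r = (0, 1, 1), (1, 3, 1), (2, 5, 1)   # all on y = 2x + 1
ell = cross(p, q)                            # coefficients (a, b, c) of the line through p, q
assert incident(r, ell)

# Dually, p*, q*, r* are the lines with coefficient vectors p, q, r, and ell* is the
# point with coordinates ell. Collinearity of p, q, r becomes concurrency at ell*.
ell_star = ell
concurrent = all(incident(ell_star, dual_line) for dual_line in (p, q, r))
meeting_point = cross(p, q)                  # intersection of the lines p* and q*
```

The same cross product serves both to join two points and to intersect two lines, which is one reason the dual picture is convenient for computation.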

Next, we give a structure theorem which is more precise than Theorem 1.4.

Theorem 1.5

(Full structure theorem) Suppose that \(P\) is a finite set of \(n\) points in the projective plane \(\mathbb R \mathbb P ^2\). Let \(K > 0\) be a real parameter. Suppose that \(P\) spans at most \(Kn\) ordinary lines. Suppose also that \(n \geqslant \exp \exp (CK^C)\) for some sufficiently large absolute constant \(C\). Then, after applying a projective transformation if necessary, \(P\) differs by at most \(O(K)\) points (which can be added or deleted) from an example of one of the following three types:

  1. (i)

    \(n-O(K)\) points on a line;

  2. (ii)

    The set

    $$\begin{aligned} X_{2m} := {}&\Big \{ \Big [ \cos \frac{2\pi j}{m}, \sin \frac{2\pi j}{m}, 1\Big ] : 0 \leqslant j < m \Big \} \\&\cup \Big \{ \Big [-\sin \frac{\pi j}{m}, \cos \frac{\pi j}{m},0\Big ] : 0 \leqslant j < m \Big \} \end{aligned} \tag{1.1}$$

    consisting of \(m\) points on the unit circle and \(m\) points on the line at infinity, for some \(m = \frac{n}{2}+O(K)\);

  3. (iii)

    A coset \(H \oplus g, 3g \in H\), of a finite subgroup \(H\) of the non-singular real points on an irreducible cubic curve, with \(H\) having cardinality \(n+O(K)\) (the group law \(\oplus \) on such curves is reviewed in Sect. 2 below).

Conversely, every set of this type has at most \(O(K n)\) ordinary lines.

We have the following consequence, which can handle slowly growing values of \(K\).

Corollary 1.6

Suppose that \(P\) is a finite set of \(n\) points in the projective plane \(\mathbb R \mathbb P ^2\). Suppose that \(P\) spans at most \(n(\log \log n)^c\) ordinary lines for some sufficiently small constant \(c>0\). Then, after applying a projective transformation, \(P\) differs by at most \(o(n)\) points from one of the examples (i), (ii), (iii) detailed in Theorem 1.5 above. In particular, we may add/remove \(o(n)\) points to/from \(P\) to get a set with at most \(n + O(1)\) ordinary lines.

Here, of course, \(o(n)\) denotes a quantity which, after dividing by \(n\), tends to zero as \(n\) goes to infinity.

Remark

This corollary may, for all we know, be true with a much weaker assumption, perhaps even that \(P\) spans \(o(n^2)\) ordinary lines. Very likely, if one could weaken the hypothesis \(n \geqslant \exp \exp (CK^C)\) in Theorem 1.4 then one could do so also in Theorem 1.5.

Proof methods As mentioned previously, the starting point of our arguments will be Melchior’s proof [25] of the Sylvester–Gallai theorem using duality and the Euler formula \(V-E+F=1\) for polygonal decompositions of the projective plane \(\mathbb R \mathbb P ^2\). Melchior’s argument uses at one point the obvious fact that all polygons have at least three sides to obtain an inequality implying the existence of ordinary lines. The same argument also shows that if a point set \(P\) spans very few ordinary lines, then almost all of the polygons in the dual configuration \(\Gamma _P\) (cut out by the dual lines \(p^*\) for \(p \in P\)) must in fact have exactly three sides. Because of this, it is possible to show in this case that the dual configuration contains large regions which have the combinatorial structure of a regular triangular grid.
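For orientation, here is a hedged sketch of the inequality just alluded to; the precise argument is deferred to Sect. 3. We assume that \(P\) is not contained in a line, and write \(V\), \(E\), \(F\) for the numbers of vertices, edges and faces of the dual configuration, and \(N_k\) for the number of lines containing exactly \(k\) points of \(P\). Such a line is dual to a vertex at which exactly \(k\) dual lines cross, so this vertex has degree \(2k\); moreover each edge borders two faces, and each face has at least three sides. Hence

$$\begin{aligned} V = \sum _{k \geqslant 2} N_k, \qquad E = \sum _{k \geqslant 2} k N_k, \qquad 3F \leqslant 2E, \end{aligned}$$

and Euler’s formula gives

$$\begin{aligned} 3 = 3V - 3E + 3F \leqslant 3V - E = \sum _{k \geqslant 2} (3-k) N_k, \quad \text{that is,} \quad N_2 \geqslant 3 + \sum _{k \geqslant 4} (k-3) N_k. \end{aligned}$$

In particular \(N_2 \geqslant 3\), and equality in \(3F \leqslant 2E\) forces every face to be a triangle.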

The next key observation is that inside any triangular grid of non-trivial size, one can find “hexagonal” configurations of lines and points (see Fig. 14) which are dual to the configuration of lines and points arising in Chasles’s version of the Cayley–Bacharach theorem (Proposition 4.1 below). From this observation and some elementary combinatorial arguments, one can start placing large subsets of \(P\) on a single cubic curve. For instance, in Proposition 5.1 we will be able to establish a “cheap structure theorem” asserting that a set of \(n\) points with fewer than \(Kn\) ordinary lines can be covered by no more than \(500K\) cubic curves. This observation turns out not to be new—a closely related technique is used in a paper of Carnicer and Godés [7] concerning generalised principal lattices, which arise in interpolation theory.

In principle, this cheap structure theorem already reduces the underlying geometry from a two-dimensional one (the projective plane \(\mathbb R \mathbb P ^2\)) to a one-dimensional one (the union of a number of cubic curves). Unfortunately, the collinearity relation between distinct cubic curves is too complicated to handle directly. Because of this, we must refine the previous combinatorial analysis to strengthen the structural control on a point set \(P\) with few ordinary lines. By studying the lines connecting a typical point \(p\) in \(P\) with all the other points in \(P\) using the triangular grid analysis, one can obtain a more complicated partition of \(P\) into cubic curves passing through \(p\). A detailed statement may be found in Lemma 5.2. Comparing such a partition with the reference partition coming from the cheap structure theorem, one can obtain Proposition 5.3, a structure theorem of intermediate strength. Roughly speaking this result asserts that most of the points in \(P\) lie on a single irreducible cubic curve, on a union of an irreducible conic and a bounded number of lines (with the points shared almost evenly between the conic and the lines), or on the union of a bounded number of lines only.

The next stage is to cut the number of lines involved down to one. The key proposition here is Proposition 6.3. It asserts, roughly speaking, that a set of \(n\) points on two or more lines, which contains \(\gg n\) points on each line, must generate \(\gg n^2\) ordinary lines. This is a statement that fails in finite field geometries, and must use at some point the torsion-free nature of the real line \(\mathbb R \) [see Sect. 2 for more discussion of this point, particularly with regard to the “near-counterexamples” (2.1) and (2.2)]. There are two key cases of the proposition which need to be established. The first is when the lines involved are all concurrent or, after a projective transformation, all parallel. In this case we use an argument of Betts, Proposition 6.4, involving projective duality and convexity. The use of convexity here is where the torsion-free nature of \(\mathbb R \) is implicitly used. In the case when the lines are not concurrent, we instead rely on Menelaus’s theorem to introduce various ratios of lengths, and then exploit a sum-product estimate of Elekes et al. [14]. This latter result, stated in Proposition A.9, also implicitly exploits the torsion-free nature of \(\mathbb R \).

The result of the above analysis is a yet stronger structure theorem for sets \(P\) with few ordinary lines: \(P\) is mostly placed in either an irreducible cubic curve, the union of an irreducible conic and a line, or on a single line. A detailed statement may be found in Proposition 6.14. The latter case, in which almost all points lie on a line, is easily studied. To deal with the other two cases one uses the abelian group structure on irreducible cubics, as well as the analogous pseudo-group structure on the union of a conic and a line. The information that \(P\) contains few ordinary lines can then be converted to an additive combinatorics property on finite subsets of an abelian group. Fairly standard tools from additive combinatorics then show that \(P\) is almost a finite subgroup of that abelian group. This allows us to rule out “essentially torsion-free” situations, such as that provided by singular cubic curves (except for the acnodal singular cubic curve), and eventually leads to the full structure theorem in Theorem 1.5.

To solve the Dirac–Motzkin conjecture and the orchard problem for large \(n\), we observe that potential counterexamples \(P\) to either conjecture will have few ordinary lines and hence can be described by Theorem 1.5. This quickly implies that \(P\) is close, up to projective transformation, to one of the known extremisers coming from roots of unity or from subgroups of elliptic curves, with a small number of additional points added or removed. The remaining task is to compute the effect that these added/removed points have on the number of ordinary lines or 3-rich lines in \(P\). Here, to get the strongest results, we will need a slight variant of a result of Poonen and Rubinstein [29] in order to control the number of times a point may be collinear with two roots of unity. See Proposition 7.5 for details.

Of the two problems, the orchard problem turns out to be somewhat easier, and can in fact be established using only the intermediate structure theorem in Proposition 5.3 rather than the more difficult structure theorem in Theorem 1.5.

2 The Key Examples

The aim of this section is to describe the various key examples of sets with few ordinary lines or many 3-rich lines. In particular we describe the sets \(X_{2m}\) in (1.1) and the Sylvester examples appearing in the various cases of the main structure theorem, Theorem 1.5. We will also mention some important “near-counterexamples” which do not actually exist as finite counterexamples to the structure theorem, but nevertheless are close enough to genuine counterexamples that some attention must be given in the analysis to explicitly exclude variants of these examples from the list of possible configurations. All of these examples are connected to the group (or pseudo-group) structure on a cubic curve, or equivalently to Chasles’s version of the Cayley–Bacharach theorem as described in Proposition 4.1 below. The main variation in the examples comes from the nature of the cubic curve being considered, which may or may not be irreducible and/or nonsingular.

Böröczky and near-Böröczky examples We begin with the sets \(X_{2m}\) from (1.1), together with some slight perturbations of these sets described by Böröczky (as cited in Ref. [10]). These sets, it turns out, provide the examples of non-collinear sets of \(n\) points with the fewest number of ordinary lines, at least for \(n\) large.

Proposition 2.1

(Böröczky examples) Let \(m \geqslant 3\) be an integer. Then we have the following.

  1. (i)

    The set \(X_{2m}\) contains \(2m\) points and spans precisely \(m\) ordinary lines.

  2. (ii)

    The set \(X_{4m}\) together with the origin \([0,0,1]\) contains \(4m + 1\) points and spans precisely \(3m\) ordinary lines.

  3. (iii)

    The set \(X_{4m}\) minus the point \([0,1,0]\) on the line at infinity contains \(4m-1\) points and spans precisely \(3m-3\) ordinary lines.

  4. (iv)

    The set \(X_{4m + 2}\) minus any of the \(2m + 1\) points on the line at infinity contains \(4m+1\) points and spans \(3m\) ordinary lines.

Thus, if we define a function \(f : \mathbb N \rightarrow \mathbb N \) by setting \(f(2m) := m, f(4m+1) := 3m\) and \(f(4m - 1) := 3m - 3\), then there is an example of a set of \(n\) points in \(\mathbb R \mathbb P ^2\), not all on a line, spanning \(f(n)\) ordinary lines.

Proof

This is a rather straightforward check, especially once one has drawn suitable pictures. Whilst the unit circle together with the line at infinity form a pleasant context for calculational work, drawing configurations involving the line at infinity is problematic. In the four diagrams below, Figs. 4, 5, 6 and 7, we have applied a projective transformation to aid visualisation. First of all we applied a rotation about the origin through \(\pi /12\), and followed this by the projective map \([x, y, z] \mapsto [-y, x, 2z + x]\). The unit circle is then sent to the ellipse whose equation in the affine plane is \(4x^2 + 3(y + \frac{1}{3})^2 = \frac{4}{3}\), while the line at infinity is sent to the horizontal line \(y = 1\). The origin is mapped to itself, and the point \([0, 1, 0]\) at infinity now has coordinates \((-\cot (\pi /12),1) \approx (-3.73,1)\). In the pictures, ordinary lines are red and lines with three or more points of \(P\) are dotted green.
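The claims about this change of coordinates can be spot-checked numerically. In the sketch below the rotation is taken to be clockwise; this orientation is our assumption, chosen because it reproduces the stated image \((-\cot (\pi /12),1)\) of \([0,1,0]\), and the function names are ours.

```python
import math

THETA = math.pi / 12

def rotate_clockwise(p, theta=THETA):
    """Rotate [x, y, z] about the origin through theta, clockwise (assumed orientation)."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (c*x + s*y, -s*x + c*y, z)

def projective_map(p):
    """The map [x, y, z] -> [-y, x, 2z + x] from the text."""
    x, y, z = p
    return (-y, x, 2*z + x)

def to_affine(p):
    x, y, z = p
    return (x / z, y / z)

def image(p):
    return to_affine(projective_map(rotate_clockwise(p)))

def ellipse_residual(pt):
    """Zero exactly when pt lies on 4x^2 + 3(y + 1/3)^2 = 4/3."""
    X, Y = pt
    return 4*X**2 + 3*(Y + 1/3)**2 - 4/3

# Sample points on the unit circle and on the line at infinity.
circle = [(math.cos(a), math.sin(a), 1.0) for a in (0.0, 0.7, 2.0, 3.1, 4.5)]
max_residual = max(abs(ellipse_residual(image(p))) for p in circle)
infinity_heights = [image((math.cos(a), math.sin(a), 0.0))[1] for a in (0.3, 1.2, 2.5)]
origin_image = image((0.0, 0.0, 1.0))
pole_image = image((0.0, 1.0, 0.0))
print(max_residual, infinity_heights, origin_image, pole_image)
```

With an anticlockwise rotation the ellipse and the line \(y = 1\) are unchanged, but the image of \([0,1,0]\) is reflected to \((\cot (\pi /12), 1)\).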

It is helpful to note that the line joining

$$\begin{aligned} \Big [\cos \frac{2\pi j}{m}, \sin \frac{2\pi j}{m}, 1\Big ] \end{aligned}$$

and

$$\begin{aligned} \Big [\cos \frac{2\pi j^{\prime }}{m}, \sin \frac{2\pi j^{\prime }}{m}, 1\Big ] \end{aligned}$$

passes through the point

$$\begin{aligned} \Big [-\sin \frac{\pi ( j+j^{\prime })}{m}, \cos \frac{\pi ( j+ j^{\prime })}{m},0\Big ] \end{aligned}$$

on the line at infinity (cf. the proof of Proposition 7.3).
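One way to see this, sketched here with the shorthand \(\alpha = 2\pi j/m\) and \(\beta = 2\pi j^{\prime }/m\) (our notation), is via the sum-to-product formulae: the chord joining the two points has direction

$$\begin{aligned} (\cos \alpha - \cos \beta , \sin \alpha - \sin \beta ) = 2 \sin \frac{\alpha - \beta }{2} \Big ( -\sin \frac{\alpha + \beta }{2}, \cos \frac{\alpha + \beta }{2} \Big ), \end{aligned}$$

and \(\frac{\alpha + \beta }{2} = \frac{\pi (j + j^{\prime })}{m}\). When \(j = j^{\prime }\) the same formula gives the point \([-\sin \frac{2\pi j}{m}, \cos \frac{2\pi j}{m}, 0]\), which is the direction of the tangent line at the corresponding root of unity.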

For case (i), the ordinary lines are the \(m\) tangent lines to the \(m\)th roots of unity. The case \(m = 6\) is depicted in Fig. 4.

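In case (ii), the ordinary lines are the \(2m\) tangent lines to the \(2m\)th roots of unity together with the \(m\) lines joining the origin \([0,0,1]\) to those points \([-\sin \frac{\pi j}{2m}, \cos \frac{\pi j}{2m}, 0]\), \(0 \leqslant j < 2m\), for which \(j + m\) is odd (for the remaining \(j\), this line also passes through two of the roots of unity). The case \(m = 3\), in which \(j\) is even, is depicted in Fig. 5.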

In case (iii), the ordinary lines are the \(2m\) tangent lines to the \(2m\)th roots of unity except for those at the points \([\pm 1,0,1]\), whose corresponding point at infinity has now been removed. However we do have \(m - 1\) new ordinary lines, the vertical lines joining \([\cos \frac{\pi j}{m}, \sin \frac{\pi j}{m},1]\) and \([\cos \frac{\pi (2m - j)}{m}, \sin \frac{\pi (2m - j)}{m},1]\) for \(j = 1,\dots , m-1\). The case \(m = 3\) is illustrated in Fig. 6.

Finally, in case (iv) the ordinary lines are the \(2m + 1\) tangent lines to the \((2m+1)\)th roots of unity except for one whose corresponding point at infinity has been removed, together with \(m\) new ordinary lines joining pairs of roots of unity. This is illustrated in Fig. 7 in the case \(m = 2\). \(\square \)
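The four counts can also be confirmed by brute force for small \(m\). The sketch below is a floating-point check under our own choices (the tolerance `tol`, the function names, and the convention that `boroczky(m)` returns \(X_{2m}\)); it is not part of the proof.

```python
import math
from itertools import combinations

def boroczky(m):
    """The set X_{2m} of (1.1): m points on the unit circle, then m points at infinity."""
    circle = [(math.cos(2*math.pi*j/m), math.sin(2*math.pi*j/m), 1.0) for j in range(m)]
    infinity = [(-math.sin(math.pi*j/m), math.cos(math.pi*j/m), 0.0) for j in range(m)]
    return circle + infinity

def unit(v):
    norm = math.sqrt(sum(c*c for c in v))
    return tuple(c / norm for c in v)

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

def count_ordinary(points, tol=1e-9):
    """Number of lines containing exactly two of the given projective points."""
    pts = [unit(p) for p in points]
    lines = set()
    for p, q in combinations(pts, 2):
        line = unit(cross(p, q))
        # Record the line as the set of indices of points lying on it (up to tol).
        lines.add(frozenset(i for i, r in enumerate(pts)
                            if abs(sum(a*b for a, b in zip(line, r))) < tol))
    return sum(1 for on_line in lines if len(on_line) == 2)

def without(points, index):
    return points[:index] + points[index + 1:]

# In boroczky(M) the point [0, 1, 0] is the first point at infinity, at index M.
case_i = count_ordinary(boroczky(6))                        # X_12
case_ii = count_ordinary(boroczky(6) + [(0.0, 0.0, 1.0)])   # X_12 plus the origin (m = 3)
case_iii = count_ordinary(without(boroczky(6), 6))          # X_12 minus [0,1,0] (m = 3)
case_iv = count_ordinary(without(boroczky(5), 5))           # X_10 minus [0,1,0] (m = 2)
print(case_i, case_ii, case_iii, case_iv)
```

The four values are 6, 9, 6 and 6, matching \(m\), \(3m\), \(3m-3\) and \(3m\) for the parameters indicated in the comments.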

Fig. 4. The Böröczky example \(X_{12}\), a set with \(n = 12\) points and 6 ordinary lines. The ordinary lines (in red) are just the tangent lines to the 6th roots of unity on the unit circle

Fig. 5. The Böröczky example \(X_{12}\) together with the origin \([0,0,1]\), a set with \(n = 13\) points and 9 ordinary lines. The ordinary lines (in red) are just the tangent lines to the 6th roots of unity on the unit circle, plus 3 extra lines through the origin and 3 of the points on the line at infinity

Fig. 6. The Böröczky example \(X_{12}\) minus the point at infinity \([0,1,0]\), a set with \(n = 11\) points and \(6\) ordinary lines. The ordinary lines (in red) are the 4 tangent lines to the 6th roots of unity on the unit circle not through the point at infinity, plus 2 extra lines passing through the point at infinity

Fig. 7. The Böröczky example \(X_{10}\) minus the point at infinity \([0,1,0]\), a set with \(n = 9\) points and 6 ordinary lines. The ordinary lines (in red) are the 4 tangent lines to the 5th roots of unity on the unit circle not through the point at infinity, plus 2 extra lines passing through the point at infinity

We remark that Proposition 2.1 illustrates a basic fact, namely that if one adds or removes \(K\) points to an \(n\)-point configuration, then the number of ordinary lines (or 3-rich lines) is modified by at most \(O(Kn + K^2)\); this can be seen by first considering the \(K=1\) case and then iterating. This stability with respect to addition or deletion of a few points is reflected in the conclusions of the various structural theorems in this paper.
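The \(K = 1\) step can be sketched as follows, writing \(N_2(Q)\) for the number of ordinary lines spanned by a set \(Q\) (notation used only here). If \(|P| = n\) and \(p \notin P\), then the only lines whose incidence count differs between \(P\) and \(P \cup \{p\}\) are those joining \(p\) to a point of \(P\), and there are at most \(n\) of these. Thus

$$\begin{aligned} \big | N_2(P \cup \{p\}) - N_2(P) \big | \leqslant n. \end{aligned}$$

During \(K\) successive additions or deletions every intermediate configuration has at most \(n + K\) points, so the total change is at most \(K(n+K) = Kn + K^2\). The same argument applies verbatim to 3-rich lines.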

We may now state our more precise version of the Dirac–Motzkin conjecture for large \(n\).

Theorem 2.2

(Sharp threshold for Dirac–Motzkin) Let the function \(f : \mathbb N \rightarrow \mathbb N \) be defined by setting \(f(2m) := m, f(4m+1) := 3m\) and \(f(4m - 1) := 3m - 3\). There is an \(n_0\) such that the following is true. If \(n \geqslant n_0\) and if \(P\) is a set of \(n\) points in \(\mathbb R \mathbb P ^2\), not all on a line, then \(P\) spans at least \(f(n)\) ordinary lines. Furthermore if equality occurs then, up to a projective transformation, \(P\) is one of the Böröczky examples described in Proposition 2.1 above.

Remark

Note in particular that there is an essentially unique extremal example unless \(n \equiv 1 (\mathrm{mod }\, 4)\), in which case there are two, namely examples (ii) and (iv) above. Note also that all of the examples in (iv) are equivalent up to rotation.

Let us record, in addition to the Böröczky examples mentioned in Proposition 2.1, the following near-extremal example.

Proposition 2.3

(Near-Böröczky example) The set \(X_{4m}\) minus the point \([-\sin \frac{\pi }{2m}, \cos \frac{\pi }{2m},0]\) on the line at infinity contains \(4m -1\) points and spans \(3m\) ordinary lines.

Proof

This is illustrated in Fig. 8 in the case \(m = 3\). The ordinary lines are the \(2m\) tangent lines to the \(2m\)th roots of unity as well as \(m\) lines joining \([\cos \frac{\pi j}{m}, \sin \frac{\pi j}{m},1]\) and \([\cos \frac{\pi j^{\prime }}{m}, \sin \frac{\pi j^{\prime }}{m},1]\) with \(j + j^{\prime } \equiv 1 (\mathrm{mod }\, 2m)\).

Fig. 8. The near-Böröczky example with \(n = 11\) points and 9 ordinary lines. The ordinary lines (in red) are the 6 tangent lines to the 6th roots of unity on the unit circle plus 3 lines passing through the removed point \([-\sin \frac{\pi }{6}, \cos \frac{\pi }{6},0]\)

We may now state a still more precise result, which asserts that all configurations not equivalent to one of the above examples must necessarily have a significantly larger number of ordinary lines than \(f(n)\), when \(n\) is large. In fact there must be at least \(n-O(1)\) ordinary lines in such cases.

Theorem 2.4

(Strong Dirac–Motzkin conjecture) There is an absolute constant \(C\) such that the following is true. If \(P\) is a set of \(n\) points in \(\mathbb R \mathbb P ^2\), not all on a line, spanning no more than \(n - C\) ordinary lines then \(P\) is equivalent under a projective transformation to one of the Böröczky examples or to a near-Böröczky example.

The threshold \(n-C\) is sharp except for the constant \(C\). Indeed we will shortly see that finite subgroups of elliptic curves of cardinality \(n\) give examples of sets with \(n-O(1)\) ordinary lines. This gives infinitely many new examples of sets with few ordinary lines which are inequivalent under projective transformation due to the projective invariance of the discriminant of an elliptic curve.

Sylvester’s cubic curve examples We turn now to Sylvester’s examples of point sets coming from cubic curves, as further discussed by Burr et al. [5]. While these do not provide the best examples of sets with few ordinary lines, it appears that consideration of them is essential in order to solve the Dirac–Motzkin problem. Of course, they also feature in the statement of our main structural result, Theorem 1.5, and are optimal for the orchard problem (see Sect. 9). Finally, they provide essentially different examples of sets with \(n + O(1)\) ordinary lines to any of those considered so far.

For a leisurely discussion of all the projective algebraic geometry required in this paper, including an extensive discussion of cubic curves, we recommend the book [3].

Let \(\gamma \) be any irreducible cubic curve. It is known (see [3, Chap. 12]) that \(\gamma \) has a point of inflection, that is to say a point where the tangent meets \(\gamma \) to order \(3\). By moving this to the point \([0,1,0]\) at infinity, we may bring \(\gamma \) into the form \(y^2 = f(x)\) in affine coordinates, where \(f(x)\) is a cubic polynomial. If \(\gamma \) is smooth then it is called an elliptic curve. An elliptic curve may have one or two components; these two cases are illustrated in Fig. 10. If \(\gamma \) has a singular point then it may be transformed into one of the following three (affine) forms:

  • (nodal case) \(y^2 = x^2(x+1)\);

  • (cuspidal case) \(y^2 = x^3\);

  • (acnodal case) \(y^2 = x^2 (x-1)\).

See [3, Theorem 8.3] for details. These three singular cases are illustrated in Fig. 9.

Fig. 9. The three different types of singular cubic curve

Fig. 10. Two elliptic curves, illustrating the group law and showing the two possibilities for the group structure

We remark that the classification of cubic curves over \(\mathbb R \) has a long and honourable history dating back to Isaac Newton.

The group law Suppose that \(\gamma \) is an irreducible cubic curve, and write \(\gamma ^*\) for the set of nonsingular points of \(\gamma \). If \(\gamma \) is smooth then of course \(\gamma = \gamma ^*\), and in this case \(\gamma \) is an elliptic curve. We may define an abelian group structure on \(\gamma ^*\) by taking the identity \(O\) to be a point of inflection on \(\gamma ^*\) and, roughly speaking, \(P \oplus Q \oplus R = O\) if and only if \(P, Q, R\) are collinear. The “roughly speaking” refers to the fact that we must take appropriate account of multiplicity, thus \(P \oplus P \oplus Q = O\) if the tangent to \(\gamma \) at \(P\) also passes through \(Q\). The inverse \(\ominus P\) of \(P\) is defined using the fact that \(\ominus P, O\) and \(P\) are collinear. See [3, Chap. 9] for more details, including a proof that this does indeed give \(\gamma ^*\) the structure of an abelian group.

We have the following theorem regarding the nature of \(\gamma ^*\) as a group.

Theorem 2.5

Let \(\gamma \) be an irreducible cubic curve, and let \(\gamma ^*\) be the set of its nonsingular points. Then we have the following possibilities for \(\gamma ^*\), considered as a group:

  • (elliptic curve case) \(\mathbb R /\mathbb Z \) or \(\mathbb R /\mathbb Z \times \mathbb Z /2\mathbb Z \), depending on whether \(\gamma \) has 1 or 2 connected components;

  • (nodal case) \(\mathbb R \times \mathbb Z /2\mathbb Z \);

  • (cuspidal case) \(\mathbb R \);

  • (acnodal case) \(\mathbb R /\mathbb Z \).

Once again, details may be found in [3]. Thinking about the curves topologically, the theorem is reasonably evident. In the three singular cases isomorphisms \(\phi : G \rightarrow \gamma ^*\) can be given quite explicitly, as detailed in the following list.

  • In the nodal case \(y^2 = x^2(x+1)\), the map \(\phi : \mathbb R \times \mathbb Z /2\mathbb Z \rightarrow \gamma ^*\) defined by \(\phi (x,\varepsilon ) = (t^2 - 1, t (t^2 - 1))\), where \(t = \coth x\) if \(\varepsilon = 0\) and \(t = \tanh x\) if \(\varepsilon = 1\) provides an isomorphism;

  • In the cuspidal case \(y^2 = x^3\), the map \(\phi : \mathbb R \rightarrow \gamma ^*\) defined by \(\phi (x) = (\frac{1}{x^2}, \frac{1}{x^3})\) provides an isomorphism;

  • In the acnodal case \(y^2 = x^2(x -1)\), the map \(\phi : \mathbb R /\mathbb Z \rightarrow \gamma ^*\) defined by \(\phi (x) = (t^2 + 1, t (t^2 + 1))\), where \(t = \cot (\pi x)\), provides an isomorphism.

We leave the reader to provide the details. In the nodal case (for example) we recommend first proving that \((t^2 - 1, t(t^2 - 1)), (u^2 - 1, u(u^2 - 1))\) and \((v^2 - 1, v (v^2 - 1))\) are collinear if and only if \(-v = (1 + tu)/(t + u)\).
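The recommended first step in the nodal case can be sanity-checked in exact rational arithmetic. The following is only a sketch: the helper names and the sample parameters \(t = 2\), \(u = 3\) are our own choices for illustration, and a numerical check of this kind does not replace the proof.

```python
from fractions import Fraction

def nodal_point(t):
    """Affine point (t^2 - 1, t (t^2 - 1)) on the nodal cubic y^2 = x^2 (x + 1)."""
    return (t * t - 1, t * (t * t - 1))

def on_nodal_cubic(point):
    x, y = point
    return y * y == x * x * (x + 1)

def collinear(p, q, r):
    """Exact collinearity test for three affine points via a 2x2 determinant."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]) == 0

def third_parameter(t, u):
    """The value v with -v = (1 + t u) / (t + u); assumes t + u != 0."""
    return -(1 + t * u) / (t + u)

# Hypothetical sample parameters, chosen only for illustration.
t, u = Fraction(2), Fraction(3)
v = third_parameter(t, u)
P, Q, R = nodal_point(t), nodal_point(u), nodal_point(v)
print(v, collinear(P, Q, R))
```

Replacing \(v\) by any other parameter value should make the collinearity test fail, since a line meets the cubic in at most three points.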

The following maps in the other direction, described in Silverman [30, III. 7] (the acnodal case is described in [30, Exercise 3.15]) are perhaps even tidier. Here \(\infty = [0,1,0]\).

  • In the nodal case the map \((x,y) \mapsto (y - x)/(y+x)\) and \(\infty \mapsto 1\) gives an isomorphism from \(\gamma ^*\) to \(\mathbb R ^* \cong \mathbb R \times \mathbb Z /2\mathbb Z \);

  • In the cuspidal case the map \((x,y) \mapsto x/y\) and \(\infty \mapsto 0\) gives an isomorphism from \(\gamma ^*\) to \(\mathbb R \);

  • In the acnodal case the map \((x,y) \mapsto -(x + iy)^2/x^3\) and \(\infty \mapsto 1\) gives an isomorphism from \(\gamma ^*\) to the unit circle \(S^1\) in the complex plane.

Sylvester’s examples By a Sylvester example \(E_n\) we mean a set of \(n\) points \(P\) in the plane which corresponds to a subgroup of order \(n\) of an irreducible cubic curve \(\gamma \). If \(n > 2\) the existence of such an example requires \(\gamma \) to be either an elliptic curve or an acnodal (Footnote 3) cubic curve, by the classification of the group structure of \(\gamma \) described in Theorem 2.5. A Sylvester example coming from an elliptic curve is depicted in Fig. 11.

Fig. 11

A Sylvester example with \(n = 8\), the subgroup being isomorphic to \(\mathbb Z /2\mathbb Z \times \mathbb Z /4\mathbb Z \). The labels reflect the group structure, thus \(03\) corresponds to the element \((0,3) \in \mathbb Z /2\mathbb Z \times \mathbb Z /4\mathbb Z \). This comes from an elliptic curve with equation \(y^2 = x^3 - \frac{1}{36} x^2 - \frac{5}{36} x + \frac{25}{1296}\) to which we have applied the projective transformation \([x, y, z] \mapsto [x, y, x+ y + z]\), so that the point at infinity maps to the point \((0,1)\) in the affine plane (which is then an inflection point for the curve). There are 7 ordinary lines, marked in red, and also 7 3-rich lines, marked in dotted green

As it turns out, Sylvester examples have somewhat more ordinary lines than the Böröczky examples, namely \(n+O(1)\) instead of \(n/2+O(1)\) or \(3n/4+O(1)\), and are thus not extremisers for the Dirac–Motzkin conjecture. However, due to the more evenly distributed nature of the Sylvester examples, they have significantly more 3-rich lines. Indeed, the following is essentially established in Ref. [5].

Proposition 2.6

Let \(n \geqslant 3\), and let \(E_n\) be a subgroup of order \(n\) in \(\gamma ^*\), the group of nonsingular points of an irreducible cubic curve \(\gamma \) (which must be an elliptic curve or an acnodal cubic). Then \(E_n\) spans \(n-1-2 \cdot \mathbf 1 _{3|n}\) ordinary lines and \(\lfloor \frac{n(n-3)}{6} \rfloor + 1\) \(3\)-rich lines, where \(\mathbf 1 _{3|n}\) is equal to 1 when 3 divides \(n\) and zero otherwise. Furthermore, if \(x \in \gamma ^*\) is such that \(x \not \in E_n\) and \(x \oplus x \oplus x \in E_n\) then \(E_n \oplus x\) has \(n-1\) ordinary lines and \(\lfloor \frac{n(n-3)}{6} \rfloor \) \(3\)-rich lines.

Proof

Let \(N_2\) be the number of ordinary lines, and \(N_3\) be the number of 3-rich lines. From Bézout’s theorem no line can meet \(E_n\) in more than three points, and so by double counting we have the identity

$$\begin{aligned} N_2 + \binom{3}{2} N_3 = \binom{n}{2}. \end{aligned}$$

A brief computation (splitting into three cases depending on the residue of \(n\) modulo 3) then shows that \(N_3 = \lfloor \frac{n(n-3)}{6} \rfloor + 1\) if and only if \(N_2 = n-1-2 \cdot \mathbf 1 _{3|n}\). But from the group law the number of ordinary lines is precisely equal to the number of elements \(a \in E_n\) such that \(-2a\) is distinct from \(a\), or in other words the number \(n\) of elements in \(E_n\) minus the number of third roots in \(E_n\). But \(\gamma ^*\) is isomorphic as a group to either \(\mathbb R /\mathbb Z \) or \((\mathbb R /\mathbb Z ) \times (\mathbb Z /2\mathbb Z )\), and so \(E_n\) is isomorphic to either \(\mathbb Z /n\mathbb Z \) or to \((\mathbb Z /(n/2)\mathbb Z ) \times (\mathbb Z /2\mathbb Z )\). It has \(1 + 2 \cdot \mathbf 1 _{3|n}\) third roots in either case, and the claim follows.

The analysis in the shifted case \(E_n \oplus x\) is analogous, the only difference being that \(E_n \oplus x\) does not contain any third roots of unity. \(\square \)
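The counts for the subgroup \(E_n\) itself can be reproduced by brute force in an abstract model of the group. This is a minimal sketch under stated assumptions: we do not construct any curve, but simply take the group to be a product of cyclic groups (the tuple `moduli` and all function names are ours), declare three distinct elements to span a 3-rich line when they sum to zero, and count one ordinary line for each element \(a\) with \(-2a \ne a\), as in the proof above.

```python
from itertools import combinations, product

def group_elements(moduli):
    """All elements of Z/m_1 x ... x Z/m_r, represented as tuples."""
    return list(product(*(range(m) for m in moduli)))

def add(a, b, moduli):
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))

def line_counts(moduli):
    """Return (N_2, N_3) for the whole group, using the group-law criterion for collinearity."""
    elems = group_elements(moduli)
    zero = tuple(0 for _ in moduli)
    # Tangent at a meets the curve again at -2a; the line is ordinary when -2a != a.
    n2 = sum(1 for a in elems
             if tuple((-2 * x) % m for x, m in zip(a, moduli)) != a)
    # Three distinct elements are collinear exactly when they sum to zero.
    n3 = sum(1 for a, b, c in combinations(elems, 3)
             if add(add(a, b, moduli), c, moduli) == zero)
    return n2, n3

def predicted(n):
    """The values of (N_2, N_3) stated in Proposition 2.6 for a subgroup of order n."""
    three_divides = 1 if n % 3 == 0 else 0
    return n - 1 - 2 * three_divides, (n * (n - 3)) // 6 + 1

print(line_counts((8,)), line_counts((2, 4)), predicted(8))
```

For instance `line_counts((2, 4))` models the subgroup \(\mathbb Z /2\mathbb Z \times \mathbb Z /4\mathbb Z \) of Fig. 11, and the pair returned can be compared with `predicted(8)` and with the double-counting identity.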

Remarks

For the sake of comparison, the \(n\)-point examples in Proposition 2.1 can all be computed to have \(n^2/8 + O(n)\) 3-rich lines instead of \(n^2/6 + O(n)\) for the Sylvester examples. This discrepancy can be explained by the existence of a high-multiplicity line with \(n/2+O(1)\) points in those examples. This absorbs many of the pairs of points that could otherwise be generating 3-rich lines.

We note also that the acnodal case allows for a quite explicit construction of a set of \(n\) points defining \(\sim n^2/6\) 3-rich lines, without the use of the Weierstrass \(\wp \)-function which would be necessary in the elliptic curve case. We leave the reader to supply the details, using the parametrisation detailed after the statement of Theorem 2.5. We are not sure whether this point has been raised in the literature before.

Near-counterexamples In addition to the actual examples coming from Böröczky’s constructions and from elliptic curve subgroups, there are also some important “near-counterexamples” which do not directly enter into the analysis (because they involve an infinite number of points, rather than a finite number), but which nevertheless appear to indirectly complicate the analysis by potentially generating spurious counterexamples to the structural theory of points with few ordinary lines. These then need to be eliminated by additional arguments.

As with the previously discussed examples, the near-counterexamples discussed here will lie on cubic curves. But whilst the actual examples were on an elliptic curve, an acnodal singular cubic curve, or on the union of a conic and a line, the near-counterexamples will lie on three lines (which may or may not be concurrent), or on a non-acnodal singular cubic curve.

We first consider a near-counterexample on three concurrent lines. Up to projective transformation, one can take the lines to be the parallel lines

$$\begin{aligned} \ell _1&:= \{ [x_1,0,1]: x_1\in \mathbb R \} \cup \{ [1,0,0] \}, \\ \ell _2&:= \{ [x_2,1,1]: x_2\in \mathbb R \} \cup \{ [1,0,0] \}, \\ \ell _3&:= \{ [x_3,2,1]: x_3\in \mathbb R \} \cup \{ [1,0,0] \}. \end{aligned}$$

Observe that \([x_1, 0, 1], [x_2, 1, 1]\) and \([x_3, 2, 1]\) are collinear if and only if \(x_1 + x_3 = 2x_2\). Thus, if we consider the infinite point set

$$\begin{aligned}&P := \{ [n_1,0,1]: n_1 \in \mathbb Z \} \cup \{ [n_2,1,1]: n_2 \in \textstyle \frac{1}{2}\displaystyle \mathbb Z \} \nonumber \\&\qquad \quad \cup \{ [n_3,2,1]: n_3 \in \mathbb Z \} \end{aligned}$$
(2.1)

then there are no ordinary lines whatsoever; every line joining a point in \(P \cap \ell _1\) with a point in \(P \cap \ell _2\) meets a point in \(P \cap \ell _3\), and similarly for permutations. If \(\mathbb Z \) could somehow have a non-trivial finite subgroup, then one could truncate this example into a counterexample to the Sylvester–Gallai theorem, i.e. a finite set with no ordinary lines. Of course, this cannot actually happen, but this example strongly suggests that one needs to somehow use the torsion-free nature of the additive group \(\mathbb R \) at some point in the arguments, for instance by exploiting arguments based on convexity, or by using additive combinatorics results exploiting the ordered nature of \(\mathbb R \). One such example, a variant of which we prove in Lemma A.3, is the trivial inequality \(|A+B| \geqslant |A| + |B|-1\) for finite subsets \(A,B\) of \(\mathbb R \). This can be viewed as a quantitative version of the assertion that \(\mathbb R \) has no non-trivial finite subgroups.
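To see concretely why truncation fails, one can count ordinary lines in finite pieces of (2.1). The sketch below is illustrative only: the function names and the cut-off parameter `N` are ours, and we double all \(x\)-coordinates (an affine change of variables, which preserves collinearity) so that every coordinate is an integer.

```python
from collections import defaultdict
from itertools import combinations
from math import gcd

def line_through(p, q):
    """Canonical integer coefficients (a, b, c) of the line a x + b y + c = 0 through p != q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    c = -(a * x1 + b * y1)
    g = gcd(gcd(a, b), c)
    a, b, c = a // g, b // g, c // g
    if a < 0 or (a == 0 and b < 0):
        a, b, c = -a, -b, -c
    return (a, b, c)

def ordinary_lines(points):
    """Lines containing exactly two of the given points."""
    on_line = defaultdict(set)
    for p, q in combinations(points, 2):
        on_line[line_through(p, q)].update((p, q))
    return [line for line, pts in on_line.items() if len(pts) == 2]

def truncation(N):
    """Finite piece of (2.1) with |n_1|, |n_2|, |n_3| <= N, x-coordinates doubled."""
    row0 = [(2 * a, 0) for a in range(-N, N + 1)]
    row1 = [(b, 1) for b in range(-2 * N, 2 * N + 1)]
    row2 = [(2 * c, 2) for c in range(-N, N + 1)]
    return row0 + row1 + row2

for N in (1, 2, 3):
    print(N, len(ordinary_lines(truncation(N))))
```

Every ordinary line of a truncation joins the middle row to an outer row and exits through the missing part of the other outer row, so it stops being ordinary once `N` is enlarged sufficiently; the ordinary lines never disappear at any finite stage, however.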

There is a similar near-counterexample involving three non-concurrent lines. Again, after applying a projective transformation, we may work with the lines

$$\begin{aligned} \ell _1&:= \{ [x,0,1]: x\in \mathbb R \} \cup \{ [1,0,0] \}, \\ \ell _2&:= \{ [0,y,1]: y\in \mathbb R \} \cup \{ [0,1,0] \}, \\ \ell _3&:= \{ [-z,1,0]: z\in \mathbb R \} \cup \{ [1,0,0] \}. \end{aligned}$$

Observe that if \(x,y,z \in \mathbb R ^\times := \mathbb R \backslash \{0\}\), then \([x, 0, 1], [0, y, 1]\) and \([-z, 1, 0]\) are collinear precisely when \(z = x/y\). Thus, if we consider the infinite point set

$$\begin{aligned}&P :=\{ [2^{n_1},0,1]: n_1 \in \mathbb Z \} \cup \{ [0, 2^{n_2},1]: n_2 \in \mathbb Z \} \nonumber \\&\qquad \quad \cup \{ [-2^{n_3},1,0]: n_3 \in \mathbb Z \}, \end{aligned}$$
(2.2)

then again there are no ordinary lines: every line joining a point in \(P \cap \ell _1\) with a point in \(P \cap \ell _2\) meets a point in \(P \cap \ell _3\), and similarly for permutations. As before, this example suggests that the (essentially) torsion-free nature of the multiplicative group \(\mathbb R ^{\times }\) must somehow come into play at some point in the argument.
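The collinearity criterion \(z = x/y\) behind (2.2) reduces to a single \(3 \times 3\) determinant, which can be checked exactly on a window of exponents. This is a small sketch; the helper names `on_l1`, `on_l2`, `on_l3` and the window \(-3,\ldots ,3\) are assumptions made for illustration.

```python
from fractions import Fraction

def det3(p, q, r):
    """Determinant of the 3x3 matrix with rows p, q, r (homogeneous coordinates)."""
    return (p[0] * (q[1] * r[2] - q[2] * r[1])
            - p[1] * (q[0] * r[2] - q[2] * r[0])
            + p[2] * (q[0] * r[1] - q[1] * r[0]))

def on_l1(n1):
    return (Fraction(2) ** n1, 0, 1)

def on_l2(n2):
    return (0, Fraction(2) ** n2, 1)

def on_l3(n3):
    return (-(Fraction(2) ** n3), 1, 0)

def collinear(p, q, r):
    """Three points of the projective plane are collinear iff the determinant vanishes."""
    return det3(p, q, r) == 0

# The point of l_3 on the line through [2^1, 0, 1] and [0, 2^3, 1] should have exponent 1 - 3.
print(collinear(on_l1(1), on_l2(3), on_l3(-2)))
```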

Finally, we give an example that lies on a cuspidal singular cubic curve, which after projective transformation can be written as

$$\begin{aligned} \gamma := \{ [x,y,z]: y z^2 = x^3 \}. \end{aligned}$$

Removing the singular point at \([0,1,0]\), we may parameterise the smooth points \(\gamma ^*\) of this curve by \(\{ [t, t^3, 1]: t \in \mathbb R \}\). A brief determinant computation shows that three distinct smooth points \([t_1, t_1^3, 1], [t_2, t_2^3, 1]\) and \([t_3, t_3^3, 1]\) on the curve are collinear precisely when \(t_1 + t_2 + t_3 = 0\). Thus, if one sets \(P\) to be the infinite set

$$\begin{aligned} P := \{ [n,n^3,1]: n \in \mathbb Z \} \end{aligned}$$
(2.3)

then there are very few ordinary lines—indeed only those lines that are tangent to \(\gamma \) at one point \([n, n^3, 1]\) and meet \(\gamma \) at a second point \([(-2n), (-2n)^3, 1]\) for some \(n \in \mathbb Z \backslash \{0\}\) will be ordinary. This example can be viewed as a degenerate limit of the Sylvester examples \(E_n\) when the discriminant is sent to zero and \(n\) sent to infinity. Again, finitary versions of this example can be ruled out, but only after one exploits the torsion-free nature of the group associated to \(\gamma ^*\), which in this case is isomorphic to \(\mathbb R \). Similar remarks also apply to nodal singular cubic curves such as \(\{ [x,y,z]: y^2 z = x^3 + x^2 z\}\), the smooth points of which form a group isomorphic to \(\mathbb R \times (\mathbb Z /2\mathbb Z )\), which is essentially torsion free in the sense that there are no large finite subgroups.
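Both the determinant criterion and the description of the ordinary lines in (2.3) can be checked on a finite window of integers. The following is a sketch only: the window size `N` and the function names are ours, and a truncated window necessarily has extra ordinary lines coming from pairs whose third point \(-(a+b)\) falls outside it, so we separate those out.

```python
from itertools import combinations

def det3(p, q, r):
    """Determinant of the 3x3 matrix with rows p, q, r (homogeneous coordinates)."""
    return (p[0] * (q[1] * r[2] - q[2] * r[1])
            - p[1] * (q[0] * r[2] - q[2] * r[0])
            + p[2] * (q[0] * r[1] - q[1] * r[0]))

def cusp_point(t):
    """The smooth point [t, t^3, 1] of the cuspidal cubic y z^2 = x^3."""
    return (t, t ** 3, 1)

def collinear_on_cusp(t1, t2, t3):
    return det3(cusp_point(t1), cusp_point(t2), cusp_point(t3)) == 0

def ordinary_pairs(N):
    """Pairs a < b in {-N, ..., N} whose line contains no third point of the window."""
    window = range(-N, N + 1)
    return [(a, b) for a, b in combinations(window, 2)
            if not any(c not in (a, b) and collinear_on_cusp(a, b, c) for c in window)]

# Ordinary pairs whose predicted third parameter -(a + b) still lies in the window:
interior = [(a, b) for a, b in ordinary_pairs(4) if abs(a + b) <= 4]
print(interior)
```

The pairs in `interior` should be exactly those of the form \(\{n, -2n\}\), that is, the tangent lines described above.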

A variant of the example (2.3) lies on the union

$$\begin{aligned} \{ [x,y,z]: yz = x^2 \} \cup \{ [x, y, z]: z = 0 \} \end{aligned}$$

of a parabola and the line at infinity. Observe that two points \([t_1, t_1^2, 1], [t_2, t_2^2, 1]\) on the parabola and a point \([1,t_3,0]\) on the line at infinity with \(t_1,t_2,t_3 \in \mathbb R \) are collinear if and only if \(t_3 = t_1 + t_2\). Thus, the infinite set

$$\begin{aligned} P := \{ [n, n^2, 1]: n \in \mathbb Z \} \cup \{ [1, n, 0]: n \in \mathbb Z \}, \end{aligned}$$
(2.4)

which can be viewed as a degenerate limit of a Böröczky example, has very few ordinary lines (namely, the lines tangent to the parabola at one point \([n, n^2, 1]\) and also passing through \([1, 2n, 0]\)).

The existence of these near-counterexamples forces us to use a somewhat ad hoc case-by-case analysis. Tools such as Chasles’s version of the Cayley–Bacharach theorem, which are valid for all cubic curves, get us only so far. They must be followed up by more specialised arguments exploiting the torsion or lack thereof in the group structure. In this way we can rule out near-counterexamples involving triples of lines, or singular irreducible cubics, until only the Böröczky and Sylvester type of examples remain.

3 Melchior’s Proof of the Sylvester–Gallai Theorem

In this section, we review Melchior’s proof [25] of the Sylvester–Gallai theorem. As mentioned in Sect. 1, this is the starting point for all of our arguments.

Theorem 1.1 (Sylvester–Gallai, again) Suppose that \(P\) is a finite set of points in the plane, not all on one line. Then \(P\) spans at least one ordinary line.

Proof

Let \(P\) be a set of \(n\) points in \(\mathbb R \mathbb P ^2\). Consider the dual collection \(P^* := \{ p^*: p \in P \}\) of \(n\) lines in \(\mathbb R \mathbb P ^2\). These lines determine a graph (Footnote 4) \(\Gamma _P\) in \(\mathbb R \mathbb P ^2\) whose vertices are the intersections of pairs of lines \(p_1^*, p_2^*\) (or equivalently points \(\ell ^*\), where \(\ell \) is a line joining two or more points of \(P\)), and whose edges are (projective) line segments of lines in \(P^*\) connecting two vertices of \(\Gamma _P\) with no vertex in the interior. Note that as the points in \(P\) were assumed not to lie on one line, every line in \(P^*\) must meet at least two vertices of \(\Gamma _P\); in particular, the graph \(\Gamma _P\) contains no loops. (It is however possible for a line to meet exactly two vertices in \(\Gamma _P\), in which case those two vertices are joined by two edges, rather than one.) Also, by construction, each vertex of \(\Gamma _P\) is incident to at least two lines in \(P^*\). As such, the graph \(\Gamma _P\) partitions the projective plane \(\mathbb R \mathbb P ^2\) into some number \(V\) of vertices, some number \(E\) of edges, and some number \(F\) of faces, each of which is the projective image of a polygon. In particular, each face has at least three edges, and any edge is incident to two distinct faces.

By Euler’s formula in the projective plane \(\mathbb R \mathbb P ^2\) we have (Footnote 5)

$$\begin{aligned} V - E + F = 1. \end{aligned}$$
(3.1)

To proceed further, suppose that for each \(k = 2,3,4,\dots \) the set \(P\) has \(N_k\) lines containing precisely \(k\) points of \(P\). Then \(V\), which by duality is the number of lines defined by pairs of points in \(P\), satisfies

$$\begin{aligned} V = \sum _{k = 2}^n N_k. \end{aligned}$$
(3.2)

Furthermore the degree \(d(\ell ^*)\) of a vertex \(\ell ^*\) in our graph is twice the number of lines in \(P^*\) passing through \(\ell ^*\), which is \(2|P \cap \ell |\). Thus, summing over all lines \(\ell \),

$$\begin{aligned} 2E = \sum _{\ell } d(\ell ^*) = 2\sum _{\ell } |P\cap \ell | = \sum _{k = 2}^n 2k N_k. \end{aligned}$$
(3.3)

Finally, for \(s = 3,4,5,\ldots \) write \(M_s\) for the number of faces in \(\Gamma _P\) with \(s\) edges. Since each edge is incident to exactly two faces, we have

$$\begin{aligned} 2E = \sum _{s = 3}^n s M_s. \end{aligned}$$
(3.4)

Combining (3.1), (3.2), (3.3) and (3.4) gives the following expression for \(N_2\), the number of ordinary lines:

$$\begin{aligned} N_2 = 3 + \sum _{k = 4}^n (k-3)N_k + \sum _{s = 4}^n (s-3) M_s. \end{aligned}$$
(3.5)

It follows immediately that \(N_2 \geqslant 3\), which of course implies the Sylvester–Gallai theorem. \(\square \)
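The quantities \(N_k\) in (3.5) are easy to compute for small configurations, which gives a quick check that \(N_2 - 3 - \sum _{k \geqslant 4} (k-3)N_k\) is non-negative, as (3.5) requires (it equals \(\sum _{s \geqslant 4}(s-3)M_s\)). The sketch below is illustrative only: it handles affine points with integer coordinates, not all collinear, and the function names and the two sample configurations are our own.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import gcd

def line_through(p, q):
    """Canonical integer coefficients (a, b, c) of the line a x + b y + c = 0 through p != q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    c = -(a * x1 + b * y1)
    g = gcd(gcd(a, b), c)
    a, b, c = a // g, b // g, c // g
    if a < 0 or (a == 0 and b < 0):
        a, b, c = -a, -b, -c
    return (a, b, c)

def line_census(points):
    """Counter mapping k to N_k, the number of lines containing exactly k of the points."""
    on_line = defaultdict(set)
    for p, q in combinations(points, 2):
        on_line[line_through(p, q)].update((p, q))
    return Counter(len(pts) for pts in on_line.values())

def face_excess(points):
    """N_2 - 3 - sum_{k >= 4} (k - 3) N_k; by (3.5) this is sum_{s >= 4} (s - 3) M_s."""
    census = line_census(points)
    return census[2] - 3 - sum((k - 3) * cnt for k, cnt in census.items() if k >= 4)

grid = [(x, y) for x in range(3) for y in range(3)]      # 3 x 3 lattice
near_pencil = [(i, 0) for i in range(5)] + [(0, 1)]      # five collinear points and one more
print(line_census(grid), face_excess(grid))
print(line_census(near_pencil), face_excess(near_pencil))
```

A value of zero for `face_excess` means, by (3.5), that every face of the dual graph is a triangle, as happens for the near-pencil.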

After discarding the non-negative term \(\sum _{s=4}^n (s-3) M_s\), Eq. (3.5) implies Melchior’s inequality

$$\begin{aligned} N_2 \geqslant 3 + \sum _{k = 4}^n (k-3)N_k. \end{aligned}$$

In this paper, however, we will need to save the term \(\sum _{s=4}^n (s-3) M_s\), as it gives crucial control on the geometry of the dual configuration \(\Gamma _P\), ensuring that this configuration resembles a triangulation when \(N_2\) is small. More precisely, we have the following proposition.

Proposition 3.1

(Few bad edges) Suppose that \(P\) is a set of \(n\) points in the projective plane \(\mathbb R \mathbb P ^2\), not all on a line, and suppose that \(P\) has at most \(Kn\) ordinary lines. Consider the planar graph \(\Gamma _P\) obtained by dualising \(P\) as described above. Then \(\Gamma _P\) is an “almost triangulation” in the following sense. Say that an edge of \(\Gamma _P\) is good if both of its vertices have degree \(6\), and if both faces it adjoins are triangles. Say that it is bad otherwise. Then the number of bad edges in \(\Gamma _P\) is at most \(16Kn\).

Proof

From (3.5) we have

$$\begin{aligned} \sum _{s=4}^n s M_s \leqslant 4 \sum _{s=4}^n (s-3)M_s \leqslant 4N_2 \leqslant 4Kn. \end{aligned}$$
(3.6)

Secondly, let us observe that

$$\begin{aligned} \sum _{\ell : d(\ell ^*) \ne 6} d(\ell ^*) \leqslant 12Kn. \end{aligned}$$
(3.7)

To see this, recall that \(d(\ell ^*) = 2 |P\cap \ell |\). We thus obtain

$$\begin{aligned} \sum _{\ell : d(\ell ^*) > 6} d(\ell ^*)&= 2\sum _{\ell : |P\cap \ell | > 3} |P\cap \ell | \\&= \sum _{k \geqslant 4} 2kN_k \leqslant 8 \sum _{k \geqslant 4} (k-3) N_k \leqslant 8Kn. \end{aligned}$$

Noting also that

$$\begin{aligned} \sum _{\ell :d(\ell ^*) = 4} d(\ell ^*) = 2\sum _{\ell : |P\cap \ell | = 2} |P\cap \ell | = 4 N_2 \leqslant 4Kn, \end{aligned}$$

(3.7) follows.

Now we can place an upper bound on the number \(B\) of bad edges. Each face with \(s > 3\) sides gives \(s\) bad edges, and each vertex \(\ell ^*\) with degree \(d(\ell ^*) \ne 6\) gives \(d(\ell ^*)\) bad edges. As these are the only sources of bad edges, we have

$$\begin{aligned} B \leqslant \sum _{s > 3} s M_s + \sum _{\ell :d(\ell ^*) \ne 6} d(\ell ^*) \leqslant 16Kn, \end{aligned}$$

by (3.6) and (3.7). \(\square \)

4 Triangular Structure in the Dual and Cubic Curves

In general, the number of edges overall in \(\Gamma _P\) is expected to be of the order of \(n^2\) (cf. Beck’s theorem [2]). Thus, when \(K\) is small, Proposition 3.1 should be viewed as an assertion that almost all the edges of \(\Gamma _P\) are good. For instance, it shows that any dual line \(p^*, p \in P\) should contain at most \(O(K)\) bad edges on the average. Intuitively, this suggests that \(\Gamma _P\) is an “almost triangulation” in which most vertices have degree \(6\) and most faces are triangles. In Sect. 5 we will use this information to put the points of \(P\) on a small number of cubic curves, which will be our starting point for more powerful structural theorems on \(P\).

By a cubic curve we mean a set of points in \(\mathbb R \mathbb P ^2\) of the form

$$\begin{aligned}&\{[X,Y,Z]: a_1X^3 + a_2X^2 Y + a_3XY^2 + a_4Y^3 + a_5X^2 Z \\&\qquad \qquad \qquad +\, a_6XYZ + a_7Y^2 Z + a_8XZ^2 + a_9YZ^2 + a_{10}Z^3 = 0 \} \end{aligned}$$

for some coefficients \(a_1,\ldots ,a_{10} \in \mathbb R \), not all zero, or in other words the locus of a nontrivial homogeneous polynomial of degree \(3\). Note that we do not assume this polynomial to be irreducible. In particular, we consider the union of three lines, as well as the union of a conic and a line, to be examples of cubic curves.

A key observation in our arguments will be the fact that pockets of true triangular structure in the dual \(\Gamma _P\) signify a collection of points of \(P\) lying on a single cubic curve. Results of this type may be found in Lemmas 4.4 and 4.5 below. A key ingredient will be the following incredibly classical fact from projective geometry, usually known as the Cayley–Bacharach theorem (although the case we require was proven by Chasles [9], prior to the more general results of Cayley [8] and Bacharach [1]).

Proposition 4.1

(Chasles) Suppose that two sets of three lines define nine distinct points of intersection in \(\mathbb R \mathbb P ^2\). Then any cubic curve passing through eight of these points also passes through the ninth.

This situation is shown in Fig. 12 below. See [13] or the blog post [34] for a discussion of this result, including its link to Pappus’s theorem, Pascal’s theorem, and the associativity of the group law on an elliptic curve.

Fig. 12

Chasles’s theorem. There are two sets of three lines: the solid lines \({\overline{\{p_0,q_{-1},r_1\}}}, {\overline{\{p_1, q_0, r_{-1}\}}}, {\overline{\{p_{-1}, q_1, r_0\}}}\) and the dotted lines \({\overline{\{p_0, q_1, r_{-1}\}}}, {\overline{\{p_1, q_{-1}, r_0\}}}\) and \({\overline{\{p_{-1}, q_0, r_1\}}}\). The nine points of intersection \(p_{-1}, p_0, p_1, q_{-1}, q_0, q_1, r_{-1}, r_0\) and \(r_1\) are all distinct. Any cubic curve passing through 8 of these points also passes through the 9th; one such curve is shown
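Proposition 4.1 is a statement about linear conditions on the ten coefficients of a cubic: the ninth point imposes no condition beyond those imposed by the other eight. This can be tested in exact arithmetic by comparing matrix ranks. The sketch below is only an illustration; the two triples of lines `SET_A`, `SET_B` are a hypothetical example of our own choosing, and the rank routine is a plain Gaussian elimination over the rationals.

```python
from fractions import Fraction

def cross(u, v):
    """Cross product; for two lines in homogeneous coordinates it gives their intersection point."""
    return (u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0])

def monomials(p):
    """The ten cubic monomials at [x, y, z], in the order of the coefficients a_1, ..., a_10."""
    x, y, z = p
    return [x**3, x*x*y, x*y*y, y**3, x*x*z, x*y*z, y*y*z, x*z*z, y*z*z, z**3]

def rank(rows):
    """Rank of a matrix, computed exactly by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk

# Hypothetical example: the lines x = 0, x = z, x = 2z and the lines y = 0, y = x + z, y = 7z - x.
SET_A = [(1, 0, 0), (1, 0, -1), (1, 0, -2)]
SET_B = [(0, 1, 0), (1, -1, 1), (1, 1, -7)]
pts = [cross(a, b) for a in SET_A for b in SET_B]

def ninth_is_forced(points, k):
    """True if every cubic through the eight points other than points[k] also contains points[k]."""
    eight = points[:k] + points[k + 1:]
    return rank([monomials(p) for p in eight]) == rank([monomials(p) for p in points])

print(all(ninth_is_forced(pts, k) for k in range(9)))
```

Replacing the ninth point by a point off the configuration, such as \([5,5,1]\), should raise the rank by one, so the phenomenon is special to the nine intersection points.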

Proposition 4.1 allows one to establish some duality relationships (Footnote 6) between triangular grids and cubic curves. We first define what we mean by a triangular grid:

Definition 4.2

(Triangular grid) Let \(I,J,K\) be three discrete intervals in \(\mathbb Z \) (thus \(I\) takes the form \(\{ i \in \mathbb Z : i_- \leqslant i \leqslant i_+\}\) for some integers \(i_-, i_+\), and similarly for \(J\) and \(K\)). A triangular grid with dimensions \(I,J,K\) is a collection of lines \((p_i^*)_{i \in I}, (q_j^*)_{j \in J}, (r_k^*)_{k \in K}\) in \(\mathbb R \mathbb P ^2\), which we will view as duals of not necessarily distinct points \(p_i, q_j, r_k\) in \(\mathbb R \mathbb P ^2\), obeying the following axioms:

  (i) If \(i \in I, j \in J, k \in K\) are integers with \(i+j+k=0\), then the lines \(p_i^*, q_j^*, r_k^*\) are distinct and meet at a point \(P_{ijk}\). Furthermore, this point \(P_{ijk}\) is not incident to any line in the grid which is not already identical to one of the lines \(p_i^*, q_j^*, r_k^*\). Thus, for instance, if \(i^{\prime } \in I\) is such that \(p_{i^{\prime }}^* \ne p_i^*, q_j^*, r_k^*\), then \(p_{i^{\prime }}^*\) cannot contain \(P_{ijk}\).

  (ii) If \(i \in I, j,j^{\prime } \in J, k,k^{\prime } \in K\) are such that \(i+j+k=i+j^{\prime }+k^{\prime }=0\) and \(0 < |j-j^{\prime }| \leqslant 2\) (or equivalently \(0 < |k-k^{\prime }| \leqslant 2\)), then the intersection points \(P_{ijk}\) and \(P_{ij^{\prime }k^{\prime }}\) are distinct. In particular, this forces \(q_j^* \ne q_{j^{\prime }}^*\) and \(r_k^* \ne r_{k^{\prime }}^*\). Similarly for cyclic permutations of \(i,j,k\) and of \(j^{\prime },k^{\prime }\).

An example of a triangular grid is depicted in Fig. 13.

Fig. 13

A triangular grid with dimensions \(\{-2,\ldots ,2\}, \{-10,\ldots ,-1\}\) and \(\{1,\ldots ,10\}\)

The following basic consequence of Proposition 4.1 drives our whole argument.

Lemma 4.3

(Completing a hexagon) Let \(i_0,j_0,k_0\) be integers with \(i_0+j_0+k_0=0\), let \(I := \{i_0-1,i_0,i_0+1\}, J := \{j_0-1,j_0,j_0+1\}, K := \{k_0-1,k_0,k_0+1\}\), and let \((p_i)_{i \in I}, (q_j)_{j \in J}, (r_k)_{k \in K}\) be triples of points whose duals form a triangular grid with dimensions \(I, J, K\). Then the nine points \((p_i)_{i \in I}, (q_j)_{j \in J}, (r_k)_{k \in K}\) are distinct, and any cubic curve which passes through eight of them passes through the ninth.

Proof

By relabeling, we may assume that \(i_0=j_0=k_0=0\), thus the nine points are \(p_{-1}, p_0, p_1, q_{-1}, q_0, q_1, r_{-1}, r_0, r_1\). Once it is shown that these nine points are distinct, their duals form a “hexagon” as depicted in Fig. 14 below as part of a larger triangular grid. The configuration in Fig. 14, however, is precisely the dual of the configuration of 9 points appearing in Chasles’s theorem (see Fig. 12), and the claim then follows.

It remains to establish the distinctness of the nine points. By applying Definition 4.2(ii) to the intersections of \(p_i^*, q_0^*, r_{-i}^*\) for \(i=-1,0,1\) we see that the \(p_{-1}, p_0, p_1\) are distinct; similarly for \(q_{-1}, q_0, q_1\) and \(r_{-1}, r_0, r_1\). Next, from Definition 4.2(i) we see that \(p_i\) and \(q_j\) are distinct as long as \(-1 \leqslant i+j \leqslant 1\), and similarly for cyclic permutations. The only remaining claim to check, up to permutations and reflections, is that \(p_1\) and \(q_1\) are distinct. But if these two points coincided, then the intersections of \(p_1^*, q_{-1}^*, r_0^*\) and \(p_{-1}^*, q_1^*, r_0^*\) would also coincide, contradicting Definition 4.2(ii). \(\square \)

Fig. 14

The dual of the configuration in Chasles’s theorem, showing a “hexagon” formed by the duals of two sets of three lines. The light grey lines are duals of other points on the cubic curve shown in Fig. 12, specifically those points in longer arithmetic progressions (in the group law on \(\gamma \)) containing the \(p_i, q_j, r_k\). We have included them mainly for aesthetic interest, but also as a more complicated example of a triangular grid

We now iterate the above lemma.

Lemma 4.4

Suppose that \(m \geqslant 4\) is an integer and that \(i_-, i_+\) are integers with \(2 \leqslant i_+ \leqslant m-2\) and \(2-m \leqslant i_- \leqslant -1\). Suppose that we have a collection of points \((p_i)_{i_- \leqslant i \leqslant i_+}, (q_j)_{-m \leqslant j \leqslant -1}\) and \((r_k)_{1 \leqslant k \leqslant m}\) in \(\mathbb R \mathbb P ^2\) whose duals form a triangular grid with the indicated dimensions (The case \(i_- = -2, i_+ = 2\) and \(m = 10\) is illustrated in Fig. 13). Then all of the points \(p_i, q_j, r_k\) lie on a single cubic curve \(\gamma \).

Proof

Consider the nine points \(p_{-1}, p_0, p_1, p_2, q_{-3}, q_{-2}, q_{-1}, r_1, r_2\). The space of cubic homogeneous polynomials is a vector space of dimension 10, and so by straightforward linear algebra there is a cubic curve \(\gamma \) containing these nine points \(p_{-1}, p_0, p_1, p_2, q_{-3}, q_{-2}, q_{-1}, r_1\) and \(r_2\). (Note that it is not necessary for the nine points to be distinct in order to obtain this claim.) We will now claim that all the remaining points \(p_i, q_j, r_k\) in the configuration also lie on \(\gamma \).

Firstly, by applying Lemma 4.3 to the set

$$\begin{aligned} p_{-1}, p_0, p_1, q_{-3}, q_{-2}, q_{-1}, r_1, r_2, r_3 \end{aligned}$$

we see that as eight of the points already lie in \(\gamma \), the ninth point \(r_3\) must also. We now know that the 10 points

$$\begin{aligned} p_{-1}, p_0, p_1, p_2, q_{-3}, q_{-2}, q_{-1}, r_1, r_2, r_3 \end{aligned}$$

all lie on \(\gamma \). Now apply Lemma 4.3 to the set

$$\begin{aligned} p_{0}, p_1, p_2, q_{-4}, q_{-3}, q_{-2}, r_1, r_2, r_3. \end{aligned}$$

We conclude that \(q_{-4}\) also lies on \(\gamma \), so now the 11 points

$$\begin{aligned} p_{-1}, p_0, p_1, p_2, q_{-4}, q_{-3}, q_{-2}, q_{-1}, r_1, r_2, r_3 \end{aligned}$$

all lie on \(\gamma \).

Next apply Lemma 4.3 to the set

$$\begin{aligned} p_{-1}, p_0, p_1, q_{-4}, q_{-3}, q_{-2}, r_2, r_3, r_4 \end{aligned}$$

to conclude that \(r_{4}\) lies on \(\gamma \). We now know that the 12 points

$$\begin{aligned} p_{-1}, p_0, p_1, p_2, q_{-4}, q_{-3}, q_{-2}, q_{-1}, r_1, r_2, r_3, r_4 \end{aligned}$$

all lie on \(\gamma \).

By shifting the \(q\) indices down by one and \(r\) indices up by one repeatedly, we may then inductively place \(q_{-k}\) and \(r_k\) in \(\gamma \) for all \(4 \leqslant k \leqslant m\). Finally, by applying Lemma 4.3 inductively to the sets

$$\begin{aligned} p_{i-1}, p_{i}, p_{i+1}, q_{-i-3}, q_{-i-2}, q_{-i-1}, r_1, r_2, r_3 \end{aligned}$$

for \(i=2,\ldots ,i_+-1\) (noting that \(-i-3 \geqslant -i_+-2 \geqslant -m\)) we may place \(p_i\) in \(\gamma \) for all \(2 < i \leqslant i_+\), and similarly by applying Lemma 4.3 inductively to

$$\begin{aligned} p_{-i-1}, p_{-i}, p_{-i+1}, q_{-3}, q_{-2}, q_{-1}, r_{i+1}, r_{i+2}, r_{i+3} \end{aligned}$$

for \(i=1,\ldots ,-i_{-}-1\) we can also place \(p_{-i}\) in \(\gamma \) for all \(1 < i \leqslant -i_{-}\). This concludes the proof of the claim. \(\square \)
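The bookkeeping in this proof can be replayed mechanically. The sketch below tracks only the labels of the points, not the points themselves: starting from the nine labels placed on \(\gamma \) by linear algebra, it repeatedly looks for a hexagon (in the sense of Lemma 4.3) inside the index ranges with exactly eight labels already known, and adds the ninth. This greedy closure is our own reformulation rather than the exact order of steps used above, and the function names are hypothetical.

```python
def hexagon(i0, j0):
    """Labels of the nine points in the hexagon centred at (i0, j0, k0) with k0 = -i0 - j0."""
    k0 = -i0 - j0
    return ([("p", i0 + d) for d in (-1, 0, 1)]
            + [("q", j0 + d) for d in (-1, 0, 1)]
            + [("r", k0 + d) for d in (-1, 0, 1)])

def all_points(i_minus, i_plus, m):
    """Labels of every point in the triangular grid of Lemma 4.4."""
    return ({("p", i) for i in range(i_minus, i_plus + 1)}
            | {("q", j) for j in range(-m, 0)}
            | {("r", k) for k in range(1, m + 1)})

def reachable_points(i_minus, i_plus, m):
    """Labels that can be placed on the cubic by repeated use of Lemma 4.3."""
    known = ({("p", i) for i in (-1, 0, 1, 2)}
             | {("q", j) for j in (-3, -2, -1)}
             | {("r", k) for k in (1, 2)})
    changed = True
    while changed:
        changed = False
        for i0 in range(i_minus + 1, i_plus):
            for j0 in range(-m + 1, -1):
                if not 2 <= -i0 - j0 <= m - 1:
                    continue  # the hexagon would leave the index range of the r_k
                missing = [label for label in hexagon(i0, j0) if label not in known]
                if len(missing) == 1:
                    known.add(missing[0])
                    changed = True
    return known

print(reachable_points(-2, 2, 10) == all_points(-2, 2, 10))
```

The call shown corresponds to the grid of Fig. 13; the same comparison can be run for any \(i_-, i_+, m\) satisfying the hypotheses of the lemma.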

This lemma is already enough to imply our most basic structural result for sets with few ordinary lines, Proposition 5.1. To get stronger results, such as Proposition 5.3, we need to perform a deeper analysis. The new feature in the following lemma is the last statement.

Lemma 4.5

Suppose that \(L \geqslant 10\) and that \(m \geqslant 10L\). Suppose that we have a collection of \(4L + 1 + 2m\) points \((p_i)_{-2L \leqslant i \leqslant 2L}, (q_j)_{-m \leqslant j \leqslant -1}\) and \((r_k)_{1 \leqslant k \leqslant m}\) in \(\mathbb R \mathbb P ^2\) whose duals form a \((4L+1) \times m \times m\) triangular grid with the indicated dimensions. Assume furthermore that the points \(p_i, q_j, r_k\) are all distinct. Then all of the points \(p_i, q_j, r_k\) lie on a single cubic curve \(\gamma \), each irreducible component of which contains at least \(L\) of the points \(p_i, q_j, r_k\).

Proof

Note from Definition 4.2 and the distinctness of the \(p_i, q_j, r_k\) that the intersection points \(P_{ijk} = p_i^* \cap q_j^* \cap r_k^*\) in the grid are all distinct.

That all the \(p_i, q_j, r_k\) lie on a single cubic curve \(\gamma \) follows from Lemma 4.4.

If \(\gamma \) is already an irreducible cubic then we are done. By enlarging \(\gamma \) via the addition of extra lines if necessary, we may otherwise suppose that we are in one of the following two cases:

  • Case 1: \(\gamma \) is the union of three distinct lines \(\ell , \ell ^{\prime }, \ell ^{\prime \prime }\);

  • Case 2: \(\gamma \) is the union of an irreducible conic \(\sigma \) and a line \(\ell \).

In each case, we are to show that all irreducible components contain at least \(L\) points \(p_i, q_j, r_k\).

In Case 1, consider a triple of points \(p_i, q_j, r_k\) with \(i + j + k = 0\). Since \(p_i, q_j, r_k\) are collinear and lie on \(\ell \cup \ell ^{\prime } \cup \ell ^{\prime \prime }\), one of the following two possibilities holds:

  (i) One of the three lines \(\ell ,\ell ^{\prime },\ell ^{\prime \prime }\) is incident to all three of \(p_i, q_j, r_k\) (i.e. the line \({\overline{\{p_i,q_j,r_k\}}}\) is one of \(\ell , \ell ^{\prime }\), or \(\ell ^{\prime \prime }\));

  (ii) \(p_i, q_j, r_k\) lie on one of each of the lines \(\ell , \ell ^{\prime }, \ell ^{\prime \prime }\) (for instance, one could have \(p_i \in \ell ^{\prime }, q_j \in \ell ^{\prime \prime }, r_k \in \ell \), or any of the other five possible permutations). Note that we allow a point to lie on more than one of the lines \(\ell ,\ell ^{\prime },\ell ^{\prime \prime }\).

First of all note that (i) cannot hold for more than three triples \((i, j, k)\) with \(i + j + k = 0\). Indeed, as observed previously, the intersection points \(P_{ijk} := p_i^* \cap q_j^* \cap r_k^*\) are all distinct, and so the lines containing \(\{p_i, q_j, r_k\}\) are distinct for distinct triples \((i,j,k)\).

Let \(\Omega \) be the set of triples \((i,j,k)\) with \(i + j + k = 0, -L \leqslant i \leqslant L, -2L \leqslant j < -L\) and \(L < k \leqslant 2L\). Suppose that (i) holds for some triple \((i,j,k) \in \Omega \) and that \(p_i, q_j, r_k\) all lie on \(\ell \) (say). We now consider the triples \((i^{\prime }, j^{\prime }, k^{\prime }) \in \Omega \) with \(i \ne i^{\prime }, j \ne j^{\prime }, k \ne k^{\prime }\). With at most two exceptions, (ii) holds for any such triple. Fix one of these triples for which (ii) holds. One of the points \(p_{i^{\prime }}, q_{j^{\prime }}, r_{k^{\prime }}\) then lies on \(\ell \). Suppose that \(p_{i^{\prime }}\) lies in \(\ell \). But, noting that \(1 \leqslant -i^{\prime } - j \leqslant m\), we see that \(p_{i^{\prime }}, q_j\) and \(r_{-i^{\prime } - j}\) are collinear and so \(r_{-i^{\prime } - j}\) lies on \(\ell \) as well. That is, the lines containing \(\{ p_i, q_j, r_k\}\) and \(\{p_{i^{\prime }}, q_j, r_{-i^{\prime }-j}\}\) are the same. This is a contradiction as we noted above. Similarly, if \(q_{j^{\prime }}\) lies in \(\ell \), then \(1 \leqslant -i -j^{\prime } \leqslant m\) and we can conclude that the lines containing \(\{p_i,q_j,r_k\}\) and \(\{p_i,q_{j^{\prime }},r_{-i-j^{\prime }}\}\) are again coincident, a contradiction. Finally, if \(r_{k^{\prime }}\) lies in \(\ell \), then \(-m \leqslant -i-k^{\prime } \leqslant -1\) and the lines containing \(\{p_i,q_j,r_k\}\) and \(\{p_i,q_{-i-k^{\prime }},r_{k^{\prime }}\}\) are coincident, again a contradiction.
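The index inequalities used in this paragraph only say that the auxiliary third point always has a label inside the grid: an \(r\)-index between \(1\) and \(m\), or a \(q\)-index between \(-m\) and \(-1\). They can be confirmed by enumeration for any particular \(L\) and \(m\). The following sketch does this; the function names are ours and the sample values \(L = 10\), \(m = 100\) are just the smallest ones allowed by Lemma 4.5.

```python
def omega(L):
    """All triples (i, j, k) with i + j + k = 0, -L <= i <= L, -2L <= j < -L, L < k <= 2L."""
    return [(i, j, -i - j)
            for i in range(-L, L + 1)
            for j in range(-2 * L, -L)
            if L < -i - j <= 2 * L]

def indices_stay_in_grid(L, m):
    """Check the three index ranges used in Case 1 for every pair of triples in Omega."""
    triples = omega(L)
    for (i, j, k) in triples:
        for (i2, j2, k2) in triples:
            if not 1 <= -i2 - j <= m:      # r-index of the third point on the line of p_{i'}, q_j
                return False
            if not 1 <= -i - j2 <= m:      # r-index of the third point on the line of p_i, q_{j'}
                return False
            if not -m <= -i - k2 <= -1:    # q-index of the third point on the line of p_i, r_{k'}
                return False
    return True

print(len(omega(10)), indices_stay_in_grid(10, 100))
```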

It follows that, whenever \((i,j,k) \in \Omega \), we are in case (ii) and not in (i), that is to say the points \(p_i, q_j, r_k\) lie on one of each of the lines \(\ell , \ell ^{\prime }, \ell ^{\prime \prime }\), but do not all lie on one of the lines \(\ell , \ell ^{\prime }\), or \(\ell ^{\prime \prime }\). Suppose without loss of generality that \(p_0 \in \ell , q_{-2L} \in \ell ^{\prime }, r_{2L} \in \ell ^{\prime \prime }\). If \(q_{-2L + 1} \in \ell ^{\prime \prime }\) then the collinear points \(p_{-1},q_{-2L+1},r_{2L}\) all lie on \(\ell ^{\prime \prime }\); as \((-1,-2L+1, 2L) \in \Omega \), we obtain a contradiction. Similarly, if \(q_{-2L+1} \in \ell \), then \(p_0, q_{-2L+1},r_{2L-1}\) all lie on \(\ell \), again a contradiction. Thus \(q_{-2L + 1} \in \ell ^{\prime }\), which implies \(r_{2L-1} \in \ell ^{\prime \prime }\). Repeating this argument we see that in fact all of the points \(q_j, -2L \leqslant j < -L\), lie on \(\ell ^{\prime }\) and all of the points \(r_k, L < k \leqslant 2L\), lie on \(\ell ^{\prime \prime }\). Finally, considering the triple \((i, -2L - i, 2L) \in \Omega \), we see that all of the points \(p_i, -L < i \leqslant 0\), lie on \(\ell \). We have established that each of the lines \(\ell ,\ell ^{\prime },\ell ^{\prime \prime }\) contains at least \(L\) of the points \(p_i, q_j, r_k\), concluding the proof of the lemma in Case 1.

We turn now to Case 2, where the argument is very similar. Consider once again a triple of points \(p_i, q_j, r_k\) with \(i + j + k = 0\). These lie on a line. By Bézout’s theorem, there are two cases:

  1. (i)

    \(p_i, q_j, r_k\) all lie on \(\ell \);

  2. (ii)

    two of \(p_i, q_j, r_k\) lie on \(\sigma \) and the other lies on \(\ell \).

If (i) ever holds for some triple \((i,j,k)\) then there is at most one such triple. Suppose it holds for some triple \((i,j,k) \in \Omega \). There are again many triples \((i^{\prime },j^{\prime },k^{\prime }) \in \Omega \) with \(i \ne i^{\prime }, j \ne j^{\prime }, k \ne k^{\prime }\). For any such triple, two of \(p_{i^{\prime }}, q_{j^{\prime }}, r_{k^{\prime }}\) lie on \(\sigma \) and the other lies on \(\ell \). Suppose, without loss of generality, that \(p_{i^{\prime }} \in \ell \). Then, noting that \(1 \leqslant -i^{\prime } - j \leqslant m\), we see that \(p_{i^{\prime }}, q_j, r_{-i^{\prime }-j}\) are collinear and so \(r_{-i^{\prime }-j}\) lies on \(\ell \) as well, and thus (i) also holds for the triple \((i^{\prime }, j, -i^{\prime } - j)\). This leads to a contradiction exactly as before.

It follows that, whenever \((i,j,k) \in \Omega \), two of the points \(p_i, q_j, r_k\) lie on \(\sigma \) and the other lies on \(\ell \). If \(p_0 \in \sigma \) then one of \(q_{-j}, r_j\) lies on \(\sigma \) and the other lies on \(\ell \), for each \(j\) with \(L < j \leqslant 2L\). Thus both \(\sigma \) and \(\ell \) contain at least \(L\) of the points \(p_i, q_j, r_k\). If, on the other hand, \(p_0 \in \ell \), then all the \(q_{-j}, r_j\) with \(L < j \leqslant 2L\) lie on \(\sigma \). But for each \(i, |i| \leqslant L-1\), there are some \(j,k\) with \(-2L \leqslant j < -L\) and \(L < k \leqslant 2L\) such that \((i, j, k) \in \Omega \), and so \(p_i \in \ell \) for all these \(i\) too. Thus both \(\ell \) and \(\sigma \) contain at least \(L\) of the points \(p_i, q_j, r_k\) in this case also. \(\square \)

We remark that this analysis can be pushed further in order to say something about the distribution of the points \(p_i, q_j, r_k\) on (for example) three lines \(\ell , \ell ^{\prime },\ell ^{\prime \prime }\). One could most probably give some kind of complete classification of (say) \(100 \times 100 \times 100\) triangular grids. However it is also possible to take a self-contained additive-combinatorial approach, leading to better bounds, and this is the technique we pursue in Sect. 6.

To conclude this section let us remark that a number of beautiful pictures of triangular structures arising from cubic curves (for various different types of cubic) may be found in the paper [6], another work in the interpolation theory literature.

5 Almost Triangular Structure and Covering by Cubics

Recall that if \(P \subset \mathbb R \mathbb P ^2\) is a set of points then \(\Gamma _P\) is the graph defined by the dual lines \(p^*, p \in P\). We now know (Proposition 3.1) that if \(P\) has few ordinary lines then \(\Gamma _P\) has a highly triangular structure. We also understand (Lemma 4.4) that triangular structure in \(\Gamma _P\) corresponds to points of \(P\) lying on a cubic curve. In this section, we put these facts together to prove some of the structural results stated in the introduction.

The main result is Lemma 5.2, whose statement and proof are somewhat technical. To convey the main idea (and because we will need it later, and because it may be of independent interest) we first establish the following much easier result. This result also comes with better bounds—indeed it says something even if one only knows that \(P\) spans \(o(n^2)\) ordinary lines—than our more technical later result.

Proposition 5.1

(Cheap structure theorem) Suppose that \(P\) is a finite set of \(n\) points in the plane. Suppose that \(P\) spans at most \(Kn\) ordinary lines for some \(K \geqslant 1\). Then \(P\) lies on the union of \(500K\) cubic curves.

Proof

We first dispose of a degenerate case. Suppose that one of the dual lines \(p^*, p \in P\), meets fewer than \(500K\) points in \(\Gamma _P\). Then every dual line meets one of these points, which means that \(P\) is covered by at most \(500K\) lines. As every line is already a cubic curve, we are done in this case. Thus we may assume that each dual line \(p^*\) meets at least \(500K\) points in \(\Gamma _P\). In particular, it meets at least three points of \(\Gamma _P\).

Recall the definition of a “good edge” of \(\Gamma _P\): an edge both of whose vertices have degree \(6\), and where both faces adjoining it are triangles.

Let us say that an edge is really good if all paths of length two from both of its endpoints consist entirely of good edges. If we have a segment \(S\) of \(l \geqslant 1\) consecutive edges on \(p^*\), all of which are really good, then the structure of \(\Gamma _P\) is locally that of a triangular grid with dimensions \(\{-2,\ldots ,2\}, \{-l-4,\ldots ,-1\}, \{1,\ldots ,l+4\}\); note that the distinct intersection property of Definition 4.2(i) is automatic since every dual line is assumed to meet at least three points in \(\Gamma _P\). Applying Lemma 4.4, we conclude that if \(S\) is such a segment of consecutive really good edges, containing at least one edge, then the set of \(q \in P \setminus \{p\}\) for which \(q^*\) meets \(S\) all lie on a cubic curve \(\gamma _S\) (which also contains \(p\)).

If an edge is not really good, we say that it is somewhat bad. We know, by Proposition 3.1, that the number of bad edges is at most \(16K n\). Now associated to any somewhat bad edge \(e\) is a path of length \(1, 2\) or \(3\) whose first edge is \(e\) and whose last edge is bad, and which is furthermore the only bad edge on that path (take a minimal path starting in \(e\) and ending in a bad edge). The number of paths of length 3 of the form bad–good–good is at most \(16Kn \times 5 \times 5\), since each vertex of a good edge has degree 6. Taking account of paths of length 2 and 1 as well, we obtain an upper bound of \(500Kn\) for the number of somewhat bad edges.
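
As a check on the constant, the count can be written out explicitly. The display below is a sketch under the assumption (ours, not stated above) that paths of length 2 and 1 are counted in the same way as those of length 3, with a factor of \(5\) for each good edge appended to a bad edge:

$$\begin{aligned} \underbrace{16Kn \times 5 \times 5}_{\text{length } 3} + \underbrace{16Kn \times 5}_{\text{length } 2} + \underbrace{16Kn}_{\text{length } 1} = (400 + 80 + 16)Kn = 496Kn \leqslant 500Kn. \end{aligned}$$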

By the pigeonhole principle there is a line \(p^*\) which contains \(t \leqslant 500K\) somewhat bad edges. These somewhat bad edges partition \(p^*\) into \(t\) segments of consecutive really good edges (a segment may have length zero). Let the segments with at least one edge be \(S_1,\ldots , S_{t^{\prime }}\), and let the segments of length zero, which are simply vertices, consist of vertices \(v_{t^{\prime }+1}, \ldots , v_{t}\).

If \(q \in P \setminus \{p\}\), then \(q^*\) meets \(p^*\) either in a vertex of one of the \(S_i\), or in one of the additional vertices \(v_j\). In the former case, as discussed previously, Lemma 4.4 places \(q\) in a cubic curve \(\gamma _{S_i}\) depending on \(S_i\). In the latter case, \(q\) lies in the dual line \(v_j^*\). Such a dual line can be thought of as a (degenerate) cubic curve. Taking the union of all these cubic curves, of which there are at most \(t^{\prime } + (t-t^{\prime }) \leqslant 500K\), gives the result. \(\square \)

Proposition 5.1 is already a fairly strong structure theorem for sets with few ordinary lines. It is possible that the ordinary lines in a union of \(O(1)\) cubics can be analysed directly, though this certainly does not seem to be straightforward. Fortunately, there is much more to be extracted from Proposition 3.1 and the results of Sect. 4, enabling us to prove more precise statements that refine Proposition 5.1, albeit with somewhat worse explicit constants.

The next lemma is the main technical result of this section. In an effort to make the paper more readable, we have formulated it so that, once it is proven, we will have no further need of the dual graph \(\Gamma _P\) and consequences of Melchior’s inequality.

Lemma 5.2

Suppose that \(P\) is a set of \(n\) points in the plane. Suppose that \(P\) spans at most \(Kn\) ordinary lines for some \(K \geqslant 1\), and let \(L \geqslant 10\) be a parameter. Suppose that \(P\) cannot be covered by a collection of \(4L\) concurrent lines. Then for every \(p \in P\) there is a partition \(P = \{p\} \cup \Sigma _{1,p} \cup \dots \cup \Sigma _{c_p, p}\) with the following properties:

  1. (i)

    For \(i = 1,\ldots ,c_p\) the points of \(\Sigma _{i,p}\) lie on a (not necessarily irreducible) curve \(\gamma _{i,p}\) of degree at most three, which also contains \(p\);

  2. (ii)

    If \(\gamma _{i,p}\) is not a line, then each irreducible component of it contains at least \(L\) points of \(P\);

  3. (iii)

    If \(\gamma _{i,p}\) is not a line, then the points of \(\Sigma _{i,p}\) may be partitioned into pairs \((q,r)\) such that \(p, q, r\) are collinear, and no other points of \(P\) are on the line joining \(p, q\) and \(r\);

  4. (iv)

    We have the upper bound \(\sum _{p \in P} c_p \leqslant 2^{19} L^3 K n\) on the average size of \(c_p\).

Proof

The proof of this lemma is basically the same as the last one, except now we work with a considerably enhanced notion of what it means to be a “really good edge”. Call an edge extremely good if all paths of length \(2L\) from both of its endpoints consist entirely of good edges. In the last proposition, we only needed paths of length 2. If an edge \(e\) is not extremely good, let us say that it is slightly bad.

We now count the number of slightly bad edges by an argument similar to that used to prove the previous proposition. Let \(e\) be a slightly bad edge, and let \(r\) be the length of the shortest path from an endpoint of \(e\) to a vertex of a bad edge. Thus \(0 \leqslant r \leqslant 2L-1\), and there is a vertex \(v\) of a bad edge \(e^{\prime }\) that is at distance exactly \(r\) from an endpoint \(w\) of \(e\). Then all paths of length up to \(r\) from either of the vertices of \(e\) are good, which means that the \(r\)-neighbourhood of \(e\) has the combinatorial structure of a triangular grid, and also that \(v\) lies on the boundary of this neighbourhood and has degree six. Among other things, this implies that among all the paths of length \(r\) from \(v\) to \(w\), there is a path that changes direction only once. To describe this path, as well as the slightly bad edge \(e\), one could specify the bad edge \(e^{\prime }\), followed by an endpoint \(v\) of that bad edge of degree six, followed by an edge emanating from \(v\), which is followed along for some length \(r_1\) to a vertex of degree six, at which point one switches to one of the other four available directions and follows that direction for a further length \(r_2\), with \(r_1+r_2 \leqslant r\) (so in particular \(0 \leqslant r_1,r_2 \leqslant 2L-1\)), until one reaches a vertex \(w\), at which point the slightly bad edge \(e\) is one of the six edges adjacent to \(w\). From Proposition 3.1 and simple counting arguments, we may thus bound the total number of slightly bad edges \(e\) crudely by

$$\begin{aligned} 16Kn \times 2 \times 6 \times 2L \times 4 \times 2L \times 6 \leqslant 2^{15} K L^2n \end{aligned}$$

and so we conclude that the number of slightly bad edges is at most \(2^{15}KL^2n\). One could save a few powers of two here by being more careful, but we will not do so.
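
For the reader who wishes to verify the crude bound, the numerical factors in the product above may be collected as follows; this only regroups the factors already displayed and introduces nothing new:

$$\begin{aligned} 16 \times 2 \times 6 \times 2 \times 4 \times 2 \times 6 = 2^{11} \cdot 3^2 = 18432 \leqslant 2^{15} = 32768, \end{aligned}$$

so that the product is \(18432\, K L^2 n \leqslant 2^{15} K L^2 n\).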

Suppose that there are \(b_p\) slightly bad edges on \(p^*\). Then

$$\begin{aligned} \sum _{p \in P} b_p \leqslant 2^{15} L^2 K n. \end{aligned}$$
(5.1)

If we have a segment of \(m \geqslant 10L\) consecutive edges on \(p^*\), all of which are extremely good, and with \(p^*\) containing at least \(4L\) additional edges beyond these \(m\), then the structure of \(\Gamma _P\) is locally that of a triangular grid of dimensions \(\{-2L,\ldots ,2L\}, \{-m,\ldots ,-1\}, \{1,\ldots ,m\}\). Note that as we are assuming that \(P\) cannot be covered by \(4L\) concurrent lines, every dual line \(p^*\) meets at least \(4L+1\) distinct points in \(\Gamma _P\), ensuring the disjointness property in Definition 4.2(ii). Indeed, the fact that each dual line meets at least \(4L+1\) distinct points, and that \(p^*\) contains at least \(4L\) additional edges beyond the \(m\) consecutive edges, ensures that all the lines in this triangular grid are distinct.

Thus if \(S\) is such a segment then, by Lemma 4.5, the set \(\Sigma _S\) of all \(q \in P \setminus \{p\}\) for which \(q^*\) meets \(S\) all lie on a cubic curve \(\gamma _S\) which contains \(p\), and each component of which contains at least \(L\) points of \(P\). Furthermore, since the lines \(q^*\) meet the vertices of \(S\) in pairs (since each such vertex certainly has degree 6) the points of \(\Sigma _S\) may be divided into pairs \((q,r)\) such that \(p,q,r\) are collinear, and no other point of \(P\) lies on the line joining \(p,q\) and \(r\). Compare with conclusion (iii) of this lemma.

The line \(p^*\) is divided into \(b_p\) segments, each containing one or more vertices, by the slightly bad edges. We then create some subsegments \(S_1,\ldots ,S_t\) by the following rule:

  1. (i)

    If \(p^*\) contains at most \(14L\) edges in all, then we set \(t=0\), so no subsegments \(S_1,\ldots ,S_t\) are created;

  2. (ii)

    If \(p^*\) contains more than \(14L\) edges, and one of the segments \(S\) cut out by the slightly bad edges contains all but at most \(4L\) of the edges, we set \(t=1\), and define \(S_1\) to be a subsegment of \(S\) omitting precisely \(4L\) edges;

  3. (iii)

    In all other cases, we set \(S_1,\ldots , S_t\) to be those segments cut out by the slightly bad edges with at least \(10L\) edges.

We then let \(v_{t + 1},\ldots ,v_{c_p}\) be the vertices not contained in any of the \(S_1,\ldots ,S_t\). By construction, we see that we always have \(t \leqslant b_p\) and that the number \(c_p-t\) of remaining vertices \(v_i\) is at most \(\max ( 14L, 4L, (10L+1)b_p) \leqslant (10L+1)b_p + 14L\). We thus have

$$\begin{aligned} c_p \leqslant (10L + 2)b_p + 14L. \end{aligned}$$
(5.2)

Define \(\Sigma _{i,p} := \Sigma _{S_i}\) and \(\gamma _{i,p} := \gamma _{S_i}\) for \(i \leqslant t\), and for \(i \geqslant t+1\) let \(\gamma _{i,p}\) be the line \(v^*_i\) and take \(\Sigma _{i,p}\) to consist of the points of \(P \setminus \{p\}\) lying on this line.

This collection of cubics and lines has properties (i), (ii) and (iii) claimed in the lemma. The bound (iv) follows immediately from (5.1) and (5.2) and the crude bound \((10L+2) 2^{15} L^2 K + 14L \leqslant 2^{19} L^3 K\), valid for \(L \geqslant 10\) and \(K \geqslant 1\). \(\square \)
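
One way to verify the crude bound used for (iv) is sketched below. It relies only on the elementary inequalities \(10L + 2 \leqslant 11L\) and \(14L \leqslant L^3 \leqslant 2^{15}L^3K\), both valid for \(L \geqslant 10\) and \(K \geqslant 1\); these intermediate steps are our own choice, and other routes are possible:

$$\begin{aligned} \sum _{p \in P} c_p&\leqslant (10L+2)\sum _{p \in P} b_p + 14Ln \leqslant \big ((10L+2)2^{15}L^2K + 14L\big )n, \\ (10L+2)2^{15}L^2K + 14L&\leqslant 11 \cdot 2^{15}L^3K + 2^{15}L^3K = 12 \cdot 2^{15}L^3K \leqslant 2^{19}L^3K. \end{aligned}$$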

We are now in a position to prove a result which is still not quite as strong as our main structure theorem, Theorem 1.5, but is still considerably more powerful (albeit with worse explicit constants) than the rather crude statement of Proposition 5.1.

Proposition 5.3

(Intermediate structure theorem) Suppose that \(P\) is a finite set of \(n\) points in the plane. Suppose that \(P\) spans at most \(Kn\) ordinary lines for some \(K \geqslant 1\). Then one of the following three alternatives holds:

  1. (i)

    \(P\) is contained in the union of an irreducible cubic \(\gamma \) and an additional \(2^{75} K^5\) points.

  2. (ii)

    \(P\) lies on the union of an irreducible conic \(\sigma \) and an additional \(2^{64} K^4\) lines. Furthermore, \(\sigma \) contains between \(\frac{n}{2} - 2^{76} K^5\) and \(\frac{n}{2} + 2^{76} K^5\) points of \(P\), and \(P \setminus \sigma \) spans at most \(2^{62} K^4 n\) ordinary lines.

  3. (iii)

    \(P\) is contained in the union of \(2^{16} K\) lines and an additional \(2^{87} K^6\) points.

Remark

The explicit expressions such as \(2^{75} K^5\) in the above proposition could of course be replaced by the less specific notation \(O(K^{O(1)})\) if desired, and the reader may wish to do so in the proof below as well.

Proof

If \(P\) can be covered by \(60000K \leqslant 2^{16}K\) concurrent lines then we are of course done, so we will assume that this is not the case.

By Proposition 5.1 we know that \(P\) is covered by at most \(500K\) cubic curves. By breaking each of these curves up into irreducible components, we may thus cover \(P\) by distinct irreducible curves \(\gamma _1,\ldots ,\gamma _m\), each of degree at most three, for some

$$\begin{aligned} m \leqslant {1500K}. \end{aligned}$$
(5.3)

By Bézout’s Theorem, no pair of distinct irreducible curves intersects in more than 9 points, and so there is a set \(P^{\prime } \subset P\), with

$$\begin{aligned} |P \backslash P^{\prime }| \leqslant 9 \binom{m}{2} \leqslant 2^{24} K^2, \end{aligned}$$

such that each point of \(P^{\prime }\) lies on just one of the curves \(\gamma _i\).
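
The numerical bound on \(|P \backslash P^{\prime }|\) can be checked directly from (5.3); the following is a routine verification, included only for convenience:

$$\begin{aligned} 9\binom{m}{2} \leqslant \frac{9}{2} m^2 \leqslant \frac{9}{2} (1500K)^2 = 10125000\, K^2 \leqslant 16777216\, K^2 = 2^{24} K^2. \end{aligned}$$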

Suppose first of all that one of the \(\gamma _i\), say \(\gamma _1\), is an irreducible cubic and contains at least \(2^{76} K^5\) points of \(P\). Then it also contains at least \(2^{75} K^5\) points of \(P^{\prime }\). Write \(n_0 := |P^{\prime } \cap \gamma _1|\): thus \(n_0 \geqslant 2^{75} K^5\).

By construction and (5.3), \(P\) is not covered by \(40m\) concurrent lines. Applying Lemma 5.2 with \(L := 10m\), we see that for each \(p^{\prime } \in P\) we may partition \(P\) as \(\{p^{\prime }\} \cup \Sigma _{1,p^{\prime }} \cup \dots \cup \Sigma _{c_{p^{\prime }},p^{\prime }}\), where \(\sum _{p^{\prime } \in P} c_{p^{\prime }} \leqslant 2^{19} (10 m)^3 K n \leqslant 2^{61} K^4 n\) and each \(\Sigma _{i,p^{\prime }}\) is contained in some (not necessarily irreducible) cubic \(\gamma _{i,p^{\prime }}\) containing \(p^{\prime }\) which is either a line, or has the property that each irreducible component of it contains at least \(10m\) points of \(P\).
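
The passage from \(2^{19}(10m)^3Kn\) to \(2^{61}K^4n\) uses only (5.3) and the estimate \(15000 \leqslant 2^{14} = 16384\); a sketch of the computation is

$$\begin{aligned} 10m \leqslant 15000K \leqslant 2^{14}K, \qquad 2^{19}(10m)^3Kn \leqslant 2^{19} \cdot 2^{42} K^3 \cdot Kn = 2^{61}K^4n. \end{aligned}$$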

By the pigeonhole principle, there is some \(p^{\prime } \in P^{\prime } \cap \gamma _1\) with the property that \(c_{p^{\prime }} \leqslant 2^{61} K^4 n/n_0\). Fix this \(p^{\prime }\). By Bézout’s theorem, an irreducible curve of degree at most three that is not already one of the \(\gamma _j\) meets \(P\) in no more than \(9m\) points, and so we infer that each \(\gamma _{i,p^{\prime }}\) is either a line, or else every irreducible component of it is one of the \(\gamma _j\). Since \(p^{\prime }\) lies on \(\gamma _1\) but not on any other \(\gamma _i\), we infer that all the \(\gamma _{i,p^{\prime }}\) are lines except that one of them, say \(\gamma _{1,p^{\prime }}\), may be \(\gamma _1\). Furthermore, none of the lines \(\gamma _{j,p^{\prime }}, j = 2,\dots ,c_{p^{\prime }}\), which all contain \(p^{\prime }\), coincides with any of \(\gamma _2,\ldots ,\gamma _m\). By another application of Bézout’s theorem, each of them contains at most \(3m\) points of \(P\).

It follows that

$$\begin{aligned} n = |P|&\leqslant |P \cap \gamma _1| + \sum _{j = 2}^{c_{p^{\prime }}} |P \cap \gamma _{j,p^{\prime }}| \\&\leqslant n_0 + 2^{24} K^2 + 3mc_{p^{\prime }} \leqslant n_0 + \frac{2^{74} K^5 n}{n_0}. \end{aligned}$$

Since \(n_0 \geqslant 2^{75} K^5\), we conclude that \(n_0 \geqslant n/2\), which when inserted again into the above inequality gives \(n_0 \geqslant n - 2^{75} K^5\), which is option (i) of Proposition 5.3.

The analysis of option (ii) goes along similar lines but is a little more complicated. Suppose now that one of the \(\gamma _i\), say \(\gamma _1\), is an irreducible conic and contains at least \(2^{76} K^5\) points of \(P\). Once again, it also contains at least \(2^{75} K^5\) points of \(P^{\prime }\). Write \(n_0 := |P^{\prime } \cap \gamma _1|\); thus \(n_0 \geqslant 2^{75} K^5\).

By Lemma 5.2 as before we may, for each \(p^{\prime } \in P\), partition \(P\) as \(\{p^{\prime }\} \cup \Sigma _{1,p^{\prime }} \cup \cdots \cup \Sigma _{c_{p^{\prime }},p^{\prime }}\) with \(\sum _{p^{\prime }} c_{p^{\prime }} \leqslant 2^{19} (10 m)^3 Kn \leqslant 2^{61} K^4 n\) and each \(\Sigma _{i,p^{\prime }}\) contained in some (not necessarily irreducible) curve \(\gamma _{i,p^{\prime }}\) of degree at most three containing \(p^{\prime }\) which is either a line, or has the property that each irreducible component contains at least \(10m\) points of \(P\).

By the pigeonhole principle as before, we may find \(p^{\prime } \in P^{\prime } \cap \gamma _1\) such that \(c_{p^{\prime }} \leqslant 2^{61} K^4 n/n_0\). Now fix this \(p^{\prime }\).

Suppose that \(\gamma _{i,p^{\prime }}\) is not a line. Then, by Bézout’s theorem as above, each irreducible component of \(\gamma _{i,p^{\prime }}\) is one of the \(\gamma _j\). Since \(p^{\prime } \in \gamma _{i,p^{\prime }}\), and \(p^{\prime }\) lies on \(\gamma _1\) but not on any other \(\gamma _j\), one of the irreducible components of \(\gamma _{i,p^{\prime }}\) is \(\gamma _1\). Thus for each \(i\) one of the following is true:

  1. (i)

    \(\Sigma _{i,p^{\prime }}\) is contained in a line through \(p^{\prime }\);

  2. (ii)

    \(\Sigma _{i,p^{\prime }}\) is contained in the conic \(\gamma _1\);

  3. (iii)

    \(\Sigma _{i,p^{\prime }}\) is contained in the union of the conic \(\gamma _1\) and a line \(\gamma _{j_i}\).

Now recall Lemma 5.2. Item (iii) of that lemma asserts that in cases (ii) and (iii) above the points of \(\Sigma _{i,p^{\prime }}\) may be divided into collinear triples \((p^{\prime },q,r)\). This immediately rules out option (ii). For those \(i\) satisfying (iii) we see that \(|\Sigma _{i,p^{\prime }}| = 2|\Sigma _{i,p^{\prime }} \cap \gamma _1|\). For those \(i\) satisfying (i) it follows from Bézout’s theorem that \(|\Sigma _{i,p^{\prime }}| \leqslant 3m\).

Let \(I\) be the set of indices \(i\) for which \(\Sigma _{i,p^{\prime }}\) is not contained in a line through \(p^{\prime }\), that is to say for which option (iii) above holds. It follows that

$$\begin{aligned} n = |P| = 1 + \sum _{i = 1}^{c_{p^{\prime }}} |\Sigma _{i,p^{\prime }}| \leqslant 1 + 2\sum _{i \in I} |\Sigma _{i,p^{\prime }} \cap \gamma _1| + 3mc_{p^{\prime }} \end{aligned}$$

However, any line through \(p^{\prime }\) meets \(\gamma _1\) (which contains \(p^{\prime }\)) in at most one other point, and so

$$\begin{aligned} \sum _{i \in I} |\Sigma _{i,p^{\prime }} \cap \gamma _1| \leqslant |P \cap \gamma _1| + c_{p^{\prime }}. \end{aligned}$$

Since

$$\begin{aligned} |P \cap \gamma _1| \leqslant |P^{\prime } \cap \gamma _1| + |P \backslash P^{\prime }| \leqslant n_0 + 2^{24} K^2 \end{aligned}$$

we conclude that

$$\begin{aligned} n \leqslant 2n_0 + 2^{25} K^2 + (3m+2) c_{p^{\prime }} + 1 \leqslant 2n_0 + \frac{2^{74} K^5 n}{n_0}. \end{aligned}$$

Since \(n_0 \geqslant 2^{75} K^5\), this is easily seen to imply that \(n_0 \geqslant n/4\), and hence \(n_0 \geqslant n/2 - 2^{76} K^5\) and \(c_{p^{\prime }} \leqslant 2^{63} K^4\). In the converse direction, we have

$$\begin{aligned} n \geqslant \sum _{i \in I} |\Sigma _{i,p^{\prime }}| = 2\sum _{i \in I} |\Sigma _{i,p^{\prime }} \cap \gamma _1| \end{aligned}$$

and so

$$\begin{aligned} |P \cap \gamma _1| \leqslant \frac{n}{2} + 1 + |\bigcup _{i \not \in I} \Sigma _{i,p^{\prime }} \cap \gamma _1|. \end{aligned}$$

For \(i \not \in I, \Sigma _{i,p^{\prime }} \cap \gamma _1\) consists of at most one point, and so

$$\begin{aligned} |P \cap \gamma _1| \leqslant \frac{n}{2} + 1 + c_{p^{\prime }} \leqslant \frac{n}{2} + 2^{76} K^5 \end{aligned}$$

and hence \(\gamma _1\) contains between \(\frac{n}{2} - 2^{76} K^5\) and \(\frac{n}{2} + 2^{76} K^5\) elements of \(P\).
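
The deductions drawn above from the inequality \(n \leqslant 2n_0 + 2^{74}K^5n/n_0\) may be spelled out as follows. This is a sketch of one way to organise the arithmetic, using only \(n_0 \geqslant 2^{75}K^5\) and the pigeonhole bound \(c_{p^{\prime }} \leqslant 2^{61}K^4n/n_0\):

$$\begin{aligned} \frac{2^{74}K^5n}{n_0} \leqslant \frac{n}{2}&\implies n \leqslant 2n_0 + \frac{n}{2} \implies n_0 \geqslant \frac{n}{4}, \\ \frac{n}{n_0} \leqslant 4&\implies n \leqslant 2n_0 + 2^{76}K^5 \implies n_0 \geqslant \frac{n}{2} - 2^{75}K^5 \geqslant \frac{n}{2} - 2^{76}K^5, \\ \frac{n}{n_0} \leqslant 4&\implies c_{p^{\prime }} \leqslant 4 \cdot 2^{61}K^4 = 2^{63}K^4, \qquad 1 + c_{p^{\prime }} \leqslant 2^{76}K^5. \end{aligned}$$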

Looking back at the three possibilities (i), (ii) and (iii) above, we see that the other points lie in the union of the lines \(\gamma _{i,p^{\prime }}\) and \(\gamma _j\), of which there are at most \(c_{p^{\prime }} + m \leqslant 2^{64} K^4\).

To complete the proof that we are in case (ii) claimed in the proposition, we need to give an upper bound for the number of ordinary lines spanned by the set \(P \setminus \gamma _1\). Such a line could be ordinary in \(P\), but there are at most \(Kn\) such lines. Otherwise, such a line passes through a point \(p^{\prime } \in P \cap \gamma _1\), and contains precisely two points in \(P \setminus \gamma _1\). Let us say that such a line is bad. The number of bad lines arising from \(p^{\prime } \in P \setminus P^{\prime }\), a set of cardinality at most \(2^{24} K^2\), is at most \(2^{23} K^2 n\). Suppose then that \(p^{\prime } \in P^{\prime } \cap \gamma _1\). As above, for each such \(p^{\prime }\) we have a partition \(P = \{p^{\prime }\} \cup \Sigma _{1,p^{\prime }} \cup \cdots \cup \Sigma _{c_{p^{\prime }},p^{\prime }}\), and we now know that \(\Sigma _{i,p^{\prime }}\) is either contained in a line through \(p^{\prime }\) or is contained in the union of \(\gamma _1\) and a line. Furthermore in the latter case we know from Lemma 5.2 (iii) that every line through \(p^{\prime }\) and a point of \(\Sigma _{i,p^{\prime }}\) passes through precisely two other points of \(P\), one on \(\gamma _1\) and the other not. Therefore it is not bad. The number of bad lines through \(p^{\prime }\) is thus at most \(c_{p^{\prime }}\), and so the total number of bad lines arising from \(p^{\prime } \in P^{\prime } \cap \gamma _1\) is at most \(\sum _{p^{\prime }} c_{p^{\prime }} \leqslant 2^{61} K^4 n\). Statement (ii) of the proposition follows immediately.
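
To see that the three contributions just described are consistent with the bound \(2^{62}K^4n\) claimed in statement (ii), one may add them up as follows (a routine check, using only \(K \geqslant 1\)):

$$\begin{aligned} Kn + 2^{23}K^2n + 2^{61}K^4n \leqslant \big (1 + 2^{23} + 2^{61}\big )K^4n \leqslant 2^{62}K^4n. \end{aligned}$$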

We have now considered all cases in which any irreducible cubic or conic from amongst the \(m\) curves \(\gamma _i\) contains more than \(2^{76} K^5\) points of \(P\). If this is not the case, the only curves among the \(\gamma _i\) containing more than \(2^{76} K^5\) points of \(P\) are lines. Thus \(P\) may be covered by \(m \leqslant 2^{16} K\) lines and at most \(2^{76} K^5 m \leqslant 2^{87} K^6\) points, which gives option (iii). \(\square \)
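
The constants in option (iii) follow from (5.3) together with the estimate \(1500 \leqslant 2^{11} = 2048\); the check is

$$\begin{aligned} m \leqslant 1500K \leqslant 2^{16}K, \qquad 2^{76}K^5 m \leqslant 2^{76}K^5 \cdot 2^{11}K = 2^{87}K^6. \end{aligned}$$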

6 Unions of Lines

Suppose that \(P\) is a set of \(n\) points spanning at most \(Kn\) ordinary lines. We know from Proposition 5.3 that all but \(O(K^{O(1)})\) points of \(P\) lie on an irreducible cubic, an irreducible conic and some lines, or some lines. The aim of this section is to reduce the number of lines, in all cases, to at most one. The main result of this section is the following proposition, which may again be of independent interest.

Proposition 6.1

Suppose that a set \(P \subset \mathbb R \mathbb P ^2\) of size \(n\) lies on a union \(\ell _1 \cup \cdots \cup \ell _m\) of lines, and that \(P\) spans at most \(Kn\) ordinary lines. Suppose that \(n \geqslant n_0(m,K)\) is sufficiently large. Then all except at most \(3K\) of the points of \(P\) lie on a single line.

We first handle the (easy) case where one line has almost all the points. For this, one does not need to know that \(P\) is contained in the union of a few lines.

Lemma 6.2

Suppose that \(P \subset \mathbb R \mathbb P ^2\) is a set of size \(n\), that \(P\) spans at most \(Kn\) ordinary lines, and that at least \(\frac{2}{3}n\) of the points of \(P\) lie on a single line \(\ell \). Then in fact all except at most \(3K\) of the points lie on \(\ell \).

Proof

Let \(p \in P \setminus \ell \). Then \(p\) forms at least \(2n/3\) lines with the points of \(P \cap \ell \). At most \(n/3\) of these contain another point of \(P\), and so at least \(n/3\) of them are ordinary. Therefore the number of ordinary lines is at least \(|P \setminus \ell |n/3\), and the claim follows immediately. \(\square \)
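
The final step of this proof amounts to comparing the lower bound just obtained with the hypothesis on the number of ordinary lines:

$$\begin{aligned} \frac{|P \setminus \ell |\, n}{3} \leqslant Kn \implies |P \setminus \ell | \leqslant 3K. \end{aligned}$$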

Suppose now, and for the rest of the section, that \(P \subset \ell _1 \cup \cdots \cup \ell _m\). The opposite extreme to that considered by the above lemma is when all the lines \(\ell _i\) contain many points of \(P\). The next result, which is the key technical step in the proof of Proposition 6.1 (and is in fact rather stronger than that proposition), may be of independent interest.

Proposition 6.3

Suppose that \(m \geqslant 2\) and that a set \(P \subset \mathbb R \mathbb P ^2\) of size \(n\) lies on a union \(\ell _1 \cup \cdots \cup \ell _m\) of lines, and that at least \(\varepsilon n\) points of \(P\) lie on each of the lines \(\ell _i\). Suppose that \(m, \frac{1}{\varepsilon } \leqslant n^{\frac{1}{10000}}\). Then \(P\) spans at least \(\varepsilon ^{12} n^2/m^6\) ordinary lines.

Remark

The exponent \(\frac{1}{10000}\) could certainly be improved somewhat, but a really significant improvement—beyond \(\frac{1}{100}\), say—would require new methods.

The proof of this proposition is quite long. Before embarking upon it we show how to derive Proposition 6.1 as a consequence.

Deduction of Proposition 6.1 from Proposition 6.3 and Lemma 6.2. Reorder the lines so that \(n_1 \geqslant n_2 \geqslant \cdots \geqslant n_m\), where \(n_i := |P \cap \ell _i|\). Set \(\varepsilon _j := n_j/n\). If \(\varepsilon _2 \leqslant 1/(3m)\) then \(\varepsilon _1 \geqslant 2/3\) and we are done by Lemma 6.2, so suppose that \(\varepsilon _2 \geqslant 1/(3m)\). Write \(P_j := P \cap (\ell _1 \cup \cdots \cup \ell _j), j = 2,3,\ldots ,m\). By Proposition 6.3, the set \(P_j\) determines at least \(\varepsilon _j^{12} n^2/m^6\) ordinary lines (here we have used the trivial lower bound \(|P_j| \geqslant |P_1| \geqslant n/m\)). Since \(|P \setminus P_j| \leqslant m \varepsilon _{j+1}n\), the number of these which fail to be ordinary lines in \(P\) is bounded above by \(m \varepsilon _{j+1} n^2\). Let \(j\) be the least index such that \(\varepsilon _{j+1} < \frac{1}{2}\varepsilon _j^{12}m^{-7}\) or, if there is no such index, set \(j := m\). Then it follows that the number of ordinary lines in \(P\) is at least \(\frac{1}{2}\varepsilon _j^{12} n^2/m^6\). We have the lower bound \(\varepsilon _j \geqslant \exp (-e^{Cm})\), and so \(P\) spans \(\gg \exp (-e^{Cm})n^2\) ordinary lines. If \(n \geqslant n_0(m,K)\) is sufficiently large this is greater than \(Kn\), so we obtain a contradiction. \(\square \)
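
The first step of the deduction, namely that a small value of \(\varepsilon _2\) forces \(\varepsilon _1 \geqslant 2/3\), is left implicit. A sketch of the verification, reading the threshold as \(1/(3m)\) and using only that the \(m\) lines cover \(P\) and that \(n_1 \geqslant n_2 \geqslant \cdots \geqslant n_m\), is as follows:

$$\begin{aligned} n \leqslant \sum _{i=1}^m n_i \leqslant n_1 + (m-1)n_2 \implies \varepsilon _1 \geqslant 1 - (m-1)\varepsilon _2 \geqslant 1 - \frac{m-1}{3m} \geqslant \frac{2}{3}. \end{aligned}$$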

Remark

We note that \(n_0(m,K)\) can be taken to have the shape \(n_0(m,K)~\sim ~K \exp \exp (Cm)\).

We may now focus our attention on establishing Proposition 6.3. We will divide into two quite different cases, according as the lines \(\ell _i\) all intersect at a point or not.

Proposition 6.4

Suppose that \(m \geqslant 2\) and that \(P\) lies on a union \(\ell _1 \cup \cdots \cup \ell _m\) of lines, all of which pass through a point, and that at least \(\varepsilon n\) points of \(P\) lie on each of the lines \(\ell _i\). Then \(P\) spans at least \(\varepsilon ^2n^2/50\) ordinary lines.

Proof

We are greatly indebted to Luke Alexander Betts, a second year undergraduate at Trinity College, Cambridge, who showed us the following argument. For brevity we give a slightly crude version of the argument he showed us.

Applying a projective transformation, we may assume without loss of generality that all the lines pass through the origin \([0,0,1]\) in the affine part of \(\mathbb R \mathbb P ^2\). Dualising, these \(m\) lines become points on the line at infinity, and the sets \(P \cap \ell _i\) become sets of parallel lines. If \(\mathcal{L }\) is a set of lines in \(\mathbb R ^2\), we say that a point is ordinary for \(\mathcal{L }\) if it lies on precisely two of the lines in \(\mathcal{L }\). We say that \(\mathcal{L }\) is \(t\)-parallel if, for every line \(\ell \in \mathcal{L }\), there are at least \(t\) other lines parallel to it. Finally, we say that a point lying on three or more of the lines from \(\mathcal{L }\) is a triple point. The dual statement to Proposition 6.4 (with \(t\) replacing \(\varepsilon n\)) is then the following. \(\square \)

Proposition 6.5

Let \(t>0\) be a real number. Suppose that \(\mathcal{L }\) is a \(t\)-parallel set of lines in \(\mathbb R ^2\), and that not all the lines of \(\mathcal{L }\) are parallel. Then there are at least \(t^2/50\) ordinary points for \(\mathcal{L }\).

The heart of the matter is the following lemma.

Lemma 6.6

Suppose that \(\mathcal{L }\) is \(t\)-parallel, but not all the lines of \(\mathcal{L }\) are parallel. Then there is a line \(\ell \in \mathcal{L }\) containing at least \(t/2\) ordinary points for \(\mathcal{L }\).

Proof

If there are no triple points determined by \(\mathcal{L }\) then the conclusion is immediate, as every line intersects at least \(t+1 > t/2\) other lines. If there are triple points determined by \(\mathcal{L }\), let \(T\) be the set of them. Let \(v\) be a vertex of the convex hull of \(T\), lying on lines \(\ell _1,\ell _2,\ell _3 \in \mathcal{L }\). There are open rays (half-lines) \(\ell _1^+,\ell _2^+,\ell _3^+\) emanating from \(v\) which do not intersect \(T\). Suppose without loss of generality that the rays \(\ell _1^+, \ell _3^+\) lie on either side of the line \(\ell _2\), as depicted in Fig. 15. Each of the \(t\) other lines in \(\mathcal{L }\) parallel to \(\ell _2\) meets either \(\ell _1^+\) or \(\ell _3^+\), and so at least one of these rays meets at least \(t/2\) lines in \(\mathcal{L }\). All of these points of intersection must, by construction, be ordinary points for \(\mathcal{L }\).

Now let us return to the main problem, the dual form of Proposition 6.4 stated above. Note that if \(\mathcal{L }^{\prime }\) is formed by removing at most \(t/5\) lines from \(\mathcal{L }\) then it is still \(4t/5\)-parallel. Thus, by \(\lceil t/5 \rceil \) applications of Lemma 6.6 we may inductively find distinct lines \(\ell _1,\ldots , \ell _{\lceil t/5\rceil }\) such that \(\ell _i\) contains at least \(2t/5\) ordinary points for \(\mathcal{L } \setminus \{\ell _1,\ldots ,\ell _{i-1}\}\). These are not necessarily ordinary points for \(\mathcal{L }\), but any such point that is not lies on one of \(\ell _1,\ldots ,\ell _{i-1}\). Since it also lies on \(\ell _i\), there are at most \(i-1 < t/5\) such points, and so \(\ell _i\) contains at least \(t/5\) ordinary points of \(\mathcal{L }\). Each ordinary point of \(\mathcal{L }\) lies on at most two of the lines \(\ell _i\), so we get at least \(t^2/50\) ordinary points in total. \(\square \)
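As an informal sanity check on Proposition 6.5 and Lemma 6.6, the following Python sketch counts ordinary points by brute force for one small \(t\)-parallel arrangement. The helper names `ordinary_points` and `three_families`, and the choice of three families of \(t+1\) parallel lines, are illustrative assumptions and not part of the argument above.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations


def ordinary_points(lines):
    """Map each ordinary point to the pair of line indices through it.

    Each line is an integer triple (a, b, c) meaning a*x + b*y = c;
    the lines are assumed to be pairwise distinct.
    """
    through = defaultdict(set)
    for (i, (a1, b1, c1)), (j, (a2, b2, c2)) in combinations(enumerate(lines), 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel lines do not meet in the affine plane
        # Cramer's rule, in exact rational arithmetic
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        through[(x, y)].update((i, j))
    return {p: idx for p, idx in through.items() if len(idx) == 2}


def three_families(t):
    """t + 1 lines in each of three directions, so the set is t-parallel."""
    horizontal = [(0, 1, i) for i in range(t + 1)]   # y = i
    vertical = [(1, 0, i) for i in range(t + 1)]     # x = i
    diagonal = [(1, -1, k) for k in range(t + 1)]    # x - y = k
    return horizontal + vertical + diagonal


t = 4
lines = three_families(t)
ordinary = ordinary_points(lines)
# number of ordinary points lying on each individual line
per_line = [sum(1 for idx in ordinary.values() if i in idx)
            for i in range(len(lines))]

print(len(ordinary), ">=", t * t / 50)   # bound of Proposition 6.5
print(max(per_line), ">=", t / 2)        # bound of Lemma 6.6
```

In this example the bounds hold with a great deal of room, which is consistent with the remark that the argument given is a slightly crude one.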

Fig. 15
figure 15

Figure relevant to the proof of Lemma 6.6. Here, \(t = 6\) and the ray \(\ell _3^+\) meets \(4 > 6/2\) lines in \(\mathcal{L }\). All of these points of intersection are ordinary as they lie outside the convex hull of the triple points of \(\mathcal{L }\), shaded in blue (Color figure online)

We have now established Proposition 6.4, which is the particular case of Proposition 6.3 in which all the lines \(\ell _i\) pass through a single point. We turn now to the case in which this is not so. The next proposition, together with Proposition 6.4, immediately implies Proposition 6.3 and hence the main result of the section, Proposition 6.1.

Proposition 6.7

Suppose that \(m \geqslant 2\) and that a set \(P \subset \mathbb R \mathbb P ^2\) of size \(n\) lies on a union \(\ell _1 \cup \cdots \cup \ell _m\) of lines, not all of which pass through a single point, and that at least \(\varepsilon n\) points of \(P\) lie on each of the lines \(\ell _i\). Suppose that \(m, \frac{1}{\varepsilon } \leqslant n^{\frac{1}{10000}}\). Then \(P\) spans \(\gg \varepsilon ^{12} n^2/m^6\) ordinary lines.

The proof proceeds via several lemmas. It also requires some additive-combinatorial ingredients not needed elsewhere in the paper, which we collect in the appendix. It is convenient, for this portion of the argument, to work entirely in the affine plane. Let us begin, then, by supposing that a projective transformation has been applied so that all lines \(\ell _i\) and their intersections lie in the affine plane \(\mathbb R ^2\).

Suppose that \(\ell \) is a line. Then by a ratio map on \(\ell \) we mean a map

$$\begin{aligned} \psi = \psi _{q,q^{\prime }}: \ell \rightarrow \mathbb R \cup \{\infty \} \end{aligned}$$

of the form

$$\begin{aligned} \psi _{q,q^{\prime }}(p) = \frac{\mathrm{length }(pq)}{\mathrm{length }(pq^{\prime })}, \end{aligned}$$

where \(q,q^{\prime }\) are distinct points on \(\ell \) and the lengths \(\mathrm{length }(p q),\mathrm{length }( p q^{\prime })\) are signed lengths on \(\ell \). We say that the ratio maps \(\psi _{q, q^{\prime }}\) and \(\psi _{q^{\prime }, q}\) are equivalent, but otherwise all ratio maps are deemed inequivalent. Note that such ratio maps implicitly appeared in the analysis of the infinite near-counterexample (2.2).

An ordered triple of lines \(\ell _i, \ell _j, \ell _k\) not intersecting in a single point defines two ratio maps \(\phi _{i,j,k} : \ell _i \rightarrow \mathbb R \cup \{\infty \}\) and \(\tilde{\phi }_{i,j,k} : \ell _j \rightarrow \mathbb R \cup \{\infty \}\) via

$$\begin{aligned} \phi _{i,j,k} := \psi _{\ell _i \cap \ell _j, \ell _i \cap \ell _k} \qquad \text{ and } \qquad \tilde{\phi }_{i,j,k} := \psi _{\ell _i \cap \ell _j, \ell _j \cap \ell _k}. \end{aligned}$$

We will make considerable use of these maps in what follows, as well as of the following definition.

Definition 6.8

(Quotient set) Suppose that \(X \subset \mathbb R \cup \{\infty \}\) is a set. Then we write \(\mathcal{Q }(X)\) for the set of all quotients \(x_1/x_2\) with \(x_1, x_2 \in X\) and \(x_1, x_2 \notin \{0,\infty \}\).

The following definition depends on the parameter \(n\), which is the number of points in the set \(P\). Since this is fixed throughout the section, we do not indicate dependence on it explicitly.

Definition 6.9

Let \(A\) be a finite subset of some line \(\ell \), and let \(\psi \) be a ratio map on \(\ell \). Then we say that \(A\) is a \(\psi \)-grid if \(A\) is a union of at most \(n^{\frac{1}{30}}\) sets \(S\) such that \(|\mathcal{Q }(\psi (S))| \leqslant n^{1+ \frac{1}{10}}\). We say that \(A\) is a grid if it is a \(\psi \)-grid for some ratio map \(\psi \) on \(\ell \).

Lemma 6.10

Let \(\ell \) be a line, and suppose that \(\psi \) and \(\psi ^{\prime }\) are inequivalent ratio maps on \(\ell \). Suppose that \(A\) is a \(\psi \)-grid and that \(A^{\prime }\) is a \(\psi ^{\prime }\)-grid. Then \(|A \cap A^{\prime }| \ll n^{1 - \frac{1}{25}}\).

Proof

Without loss of generality we may assume that \(\ell \) is the \(x\)-axis parametrised as \(\{(t,0) : t \in \mathbb R \}\). By abuse of notation we identify \(A\) and \(A^{\prime }\) with subsets of \(\mathbb R \). The ratio maps \(\psi , \psi ^{\prime }\) are given by \(\psi (t) = (t+a)/(t+b), \psi ^{\prime }(t) = (t + a^{\prime })/(t + b^{\prime })\) with \(a \ne b, a^{\prime } \ne b^{\prime }\) and \(\{a,b\} \ne \{a^{\prime }, b^{\prime }\}\).

Suppose that \(A\) is a \(\psi \)-grid and that \(A^{\prime }\) is a \(\psi ^{\prime }\)-grid. Write \(A = \bigcup _{i = 1}^{n^{1/30}} S_i\) and \(A^{\prime } = \bigcup _{j = 1}^{n^{1/30}} S^{\prime }_j\) where \(|\mathcal{Q }(\psi (S_i))|\) and \(|\mathcal{Q }(\psi ^{\prime }(S^{\prime }_j))|\) are both at most \(n^{1 + \frac{1}{10}}\). It suffices to show that \(|S_i \cap S^{\prime }_j| \ll n^{\frac{22}{25}}\).

Suppose that \(X\) is the set of all \(x \in S_i \cap S^{\prime }_j\) for which \(\psi (x) > 0\). Since \(X\) is contained in both \(S_i\) and \(S^{\prime }_j, |\mathcal{Q }(\psi (X))|\) and \(|\mathcal{Q }(\psi ^{\prime }(X))|\) are both at most \(n^{1 + \frac{1}{10}}\). Writing \(Y := \{\log \psi (x) : x \in X\}\), we see that

$$\begin{aligned} |Y - Y|, |f(Y) - f(Y)| \leqslant 2n^{1 + \frac{1}{10}}, \end{aligned}$$
(6.1)

where

$$\begin{aligned} f(y)&= \log \circ \psi ^{\prime } \circ \psi ^{-1} \circ \exp (y) \\&= \log ((b - a^{\prime })e^y + a^{\prime } - a) - \log ((b - b^{\prime })e^y + b^{\prime } - a) \\&= \log (Ae^y + B) - \log (A^{\prime } e^y + B^{\prime }), \end{aligned}$$

say, and \(f(Y) := \{f(y) : y \in Y, f(y) \text{ is } \text{ defined } \}\). Note that we do not have \(AA^{\prime } = 0\) and \(B B^{\prime } = 0\) (there are four cases to consider). We thus compute

$$\begin{aligned} f^{\prime \prime }(y) = \frac{e^y (b - a)(a^{\prime } - b^{\prime }) (A A^{\prime } e^{2y} - B B^{\prime })}{(A e^y + B)^2 (A^{\prime } e^y + B^{\prime })^2}. \end{aligned}$$

This is continuous except when \(e^y = -B^{\prime }/A^{\prime }\) or \(-B/A\), and nonzero except when \(e^{2y} = BB^{\prime }/AA^{\prime }\). It follows that \(\mathbb R \) may be split into at most \(4\) pieces on which \(f\) is defined and strictly concave/convex. By Proposition A.9, this implies that at least one of \(|Y - Y|, |f(Y) - f(Y)|\) has size \(\gg |Y|^{5/4}\). Comparing with (6.1) we see that \(|X| = |Y| \ll n^{\frac{22}{25}}\).

An almost identical argument applies when \(X\) is the set of all \(x \in S_i \cap S^{\prime }_j\) for which \(\psi (x) < 0\), taking \(Y := \{ \log (-\psi (x)) : x \in X\}\): now we have \(f(y) = \log (-Ae^y + B) - \log (-A^{\prime } e^y + B^{\prime })\), but the rest of the argument is the same.

Putting these two cases together gives \(|S_i \cap S^{\prime }_j| \ll n^{\frac{22}{25}}\), which is what we wanted to prove. \(\square \)
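The change of variables in this proof can be checked numerically. The sketch below, under the assumption of one hypothetical choice of parameters \(a, b, a^{\prime }, b^{\prime }\) (written `a, b, a2, b2`), compares \(\log \circ \psi ^{\prime } \circ \psi ^{-1} \circ \exp \) with the closed form \(\log (Ae^y + B) - \log (A^{\prime } e^y + B^{\prime })\), and then uses a finite difference to confirm that the convexity of \(f\) changes at the point where \(e^{2y} = BB^{\prime }/AA^{\prime }\). It is an illustration only, not a proof.

```python
import math


def f_direct(y, a, b, a2, b2):
    """log(psi'(psi^{-1}(e^y))), computed by explicitly inverting psi."""
    u = math.exp(y)
    t = (a - u * b) / (u - 1)          # solves (t + a)/(t + b) = u, needs u != 1
    return math.log((t + a2) / (t + b2))


def f_closed(y, a, b, a2, b2):
    """The closed form log(A e^y + B) - log(A' e^y + B')."""
    A, B = b - a2, a2 - a
    A2, B2 = b - b2, b2 - a
    return math.log(A * math.exp(y) + B) - math.log(A2 * math.exp(y) + B2)


def second_difference(g, y, h=1e-3):
    """Central finite-difference approximation to g''(y)."""
    return (g(y + h) - 2 * g(y) + g(y - h)) / (h * h)


# Hypothetical parameters with a != b, a' != b' and {a, b} != {a', b'}.
a, b, a2, b2 = 0.0, 3.0, 1.0, 2.0
A, B, A2, B2 = b - a2, a2 - a, b - b2, b2 - a
y0 = 0.5 * math.log((B * B2) / (A * A2))   # the point with e^{2 y0} = B B' / (A A')


def g(y):
    return f_closed(y, a, b, a2, b2)


for y in (-1.0, 0.5, 2.0):
    print(y, f_direct(y, a, b, a2, b2), g(y))   # the two columns should agree

# f'' has opposite signs on either side of y0
print(second_difference(g, y0 - 1), second_difference(g, y0 + 1))
```

For these parameters all four of \(A, B, A^{\prime }, B^{\prime }\) are positive, so \(f\) is defined on all of \(\mathbb R \) and only the zero of \(f^{\prime \prime }\) splits the line into pieces.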

Ratio maps may be used in understanding the metric properties of intersections of lines as a consequence of Menelaus’s theorem, illustrated in Fig. 16.

Fig. 16
figure 16

An illustration of Menelaus’s theorem. The lengths are signed

Lemma 6.11

Let \(\ell _i, \ell _j, \ell _k\) be three lines not meeting at a point. Let \(X_i \subset \ell _i\) and \(X_j \subset \ell _j\), and let \(\Gamma \subset X_i \times X_j\) be a set of pairs, with neither \(X_i\) nor \(X_j\) containing \(\ell _i \cap \ell _j\). Let \(X_k \subset \ell _k\) be the set of points on \(\ell _k\) formed by intersecting the lines \(\overline{\{x_i ,x_j\}}, (x_i, x_j) \in \Gamma \), with \(\ell _k\). Then

$$\begin{aligned} |X_k| = | \{ \phi _{i,j,k}(x_i)/\tilde{\phi }_{i,j,k}(x_j) : (x_i, x_j) \in \Gamma \}|. \end{aligned}$$

Proof

Apply Menelaus’s theorem with \(AC = \ell _i, AB = \ell _j\) and \(BC = \ell _k\). Suppose that \(x_i \in \ell _i\) and \(x_j \in \ell _j\), and write \(E = x_i, F = x_j\) in the diagram. Then \(D\) is the point at which \(\overline{\{x_i,x_j\}}\) intersects \(\ell _k\). Note that \(\phi _{i,j,k}(x_i) = EA/EC, {\tilde{\phi }}_{i,j,k}(x_j) = FA/FB\). By Menelaus’s theorem it follows that \(\phi _{i,j,k}(x_i)/{\tilde{\phi }}_{i,j,k}(x_j) = DB/DC\). This ratio uniquely determines the point \(D\), and the lemma follows. \(\square \)
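The identity used in this proof can be checked in exact arithmetic. The following Python sketch fixes a hypothetical triangle \(A, B, C\) (so that \(\ell _i = AC\), \(\ell _j = AB\), \(\ell _k = BC\)), computes \(D\) as the intersection of \(\overline{\{E,F\}}\) with \(BC\), and compares \(\phi _{i,j,k}(E)/{\tilde{\phi }}_{i,j,k}(F)\) with the signed ratio \(DB/DC\). All function names and the sample coordinates are assumptions made for illustration.

```python
from fractions import Fraction as Fr


def signed_ratio(p, q, r):
    """Signed ratio length(pq)/length(pr) for collinear p, q, r with p != r."""
    (px, py), (qx, qy), (rx, ry) = p, q, r
    num = (qx - px) * (rx - px) + (qy - py) * (ry - py)
    den = (rx - px) ** 2 + (ry - py) ** 2
    return Fr(num) / Fr(den)


def line_through(p, q):
    """Coefficients (a, b, c) of the line a*x + b*y = c through p and q."""
    (px, py), (qx, qy) = p, q
    a, b = qy - py, px - qx
    return a, b, a * px + b * py


def intersect(l1, l2):
    """Intersection point of two non-parallel lines (Cramer's rule)."""
    (a1, b1, c1), (a2, b2, c2) = l1, l2
    det = a1 * b2 - a2 * b1
    return (Fr(c1 * b2 - c2 * b1) / det, Fr(a1 * c2 - a2 * c1) / det)


def menelaus_check(A, B, C, E, F):
    """Return (phi(E)/phi~(F), DB/DC, D) for E on line AC and F on line AB."""
    D = intersect(line_through(E, F), line_through(B, C))
    phi = signed_ratio(E, A, C)         # EA/EC
    phi_tilde = signed_ratio(F, A, B)   # FA/FB
    return phi / phi_tilde, signed_ratio(D, B, C), D


# Hypothetical triangle and two pairs (E, F) with the same quotient.
A, B, C = (0, 0), (4, 0), (0, 3)
q1, r1, D1 = menelaus_check(A, B, C, (0, 1), (1, 0))
q2, r2, D2 = menelaus_check(A, B, C, (0, 2), (Fr(16, 7), 0))
print(q1, r1, D1)
print(q2, r2, D2)
```

The two pairs \((E,F)\) were chosen to have equal quotients, and they determine the same point \(D\); this is the mechanism by which the size of \(X_k\) equals the number of distinct quotients.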

Lemma 6.12

Suppose that \(P\) is a set of \(n\) points lying on a union \(\ell _1 \cup \cdots \cup \ell _m\) of lines, that the lines \(\ell _i\) are not all concurrent, and that at least \(\varepsilon n\) points of \(P\) lie on each line \(\ell _i\). Suppose that \(m, \frac{1}{\varepsilon } \leqslant n^{\frac{1}{10000}}\). Then either \(P\) spans at least \(\varepsilon ^2n^2/8\) ordinary lines, or else there are at least two values of \(i\) such that \(P \cap \ell _i\) contains a grid with size \(\gg \varepsilon ^4 n/m^2\).

Proof

By the dual version of the Sylvester–Gallai theorem, there is some pair of lines \(\ell _i, \ell _j\) such that no other line passes through \(\ell _i \cap \ell _j\). Each of \(\ell _i, \ell _j\) contains at least \(\varepsilon n\) points of \(P\), at least \(\varepsilon n - 1 \geqslant \varepsilon n/2\) of which are not the intersection point \(\ell _i \cap \ell _j\). If \(P\) spans fewer than \(\varepsilon ^2 n^2/8\) ordinary lines then there is some \(k\) such that for at least \(\varepsilon ^2 n^2/8m\) pairs \(p_i \in \ell _i, p_j \in \ell _j (p_i, p_j \ne \ell _i \cap \ell _j\)) the line \({\overline{\{p_i ,p_j\}}}\) meets \(\ell _k\) in a point of \(P\). Write \(X_i := (\ell _i \cap P) \setminus (\ell _i \cap \ell _j), X_j := (\ell _j \cap P) \setminus (\ell _i \cap \ell _j)\) and \(\Gamma \subset X_i \times X_j\) for the set of pairs \((p_i, p_j) \in X_i \times X_j\) for which \({\overline{\{p_i, p_j\}}}\) meets \(\ell _k\) in a point of \(X_k = \ell _k \cap P\). By Lemma 6.11 it follows that

$$\begin{aligned} n \geqslant |X_k| \geqslant | \{ \phi _{i,j,k}(x_i)/{\tilde{\phi }}_{i,j,k}(x_j):(x_i, x_j) \in \Gamma \}|. \end{aligned}$$

By Corollary A.2 and the hypothesis on \(m\) and \(\frac{1}{\varepsilon }\) it follows that there are sets \(X^{\prime }_i \subset X_i, X^{\prime }_j \subset X_j\) with \(|X^{\prime }_i|, |X^{\prime }_j| \gg \varepsilon ^4 n/m^2\) and

$$\begin{aligned} |\mathcal{Q }(\phi _{i,j,k}(X^{\prime }_i))|, |\mathcal{Q }({\tilde{\phi }}_{i,j,k}(X^{\prime }_j) )| \ll m^{11} n/\varepsilon ^{22} < n^{1 + \frac{1}{10}}. \end{aligned}$$

Thus certainly \(X^{\prime }_i\) is a \(\phi _{i,j,k}\)-grid and \(X^{\prime }_j\) is a \({\tilde{\phi }}_{i,j,k}\)-grid. \(\square \)

As a result of this lemma we may, in proving Proposition 6.7, restrict attention to sets where \(P \cap \ell _i\) contains a large grid for at least two values of \(i\). We study this situation further in the next lemma, whose proof is a little involved.

Lemma 6.13

Suppose \(P\) is a set of \(n\) points lying on a union of lines \(\ell _1 \cup \cdots \cup \ell _m\), where \(m \leqslant n^{\frac{1}{10000}}\). Suppose that \(i \ne j\) and that \(X_i \subset P \cap \ell _i, X_j \subset P \cap \ell _j\) are grids, both of size at least \(\varepsilon ^{\prime } n\), where \(\varepsilon ^{\prime } \gg n^{-\frac{1}{1000}}\), and neither containing \(\ell _i \cap \ell _j\). Then either \(P\) spans \(\gg (\varepsilon ^{\prime })^3 n^2\) ordinary lines, or else there is a line \(\ell _k\), not passing through \(\ell _i \cap \ell _j\), and a grid \(X_k \subset P \cap \ell _k\), such that all but at most \(\varepsilon ^{\prime } |X_i||X_j|/25\) pairs \((x_i , x_j ) \in X_i \times X_j\) are such that the line \(\overline{\{x_i, x_j\}}\) meets \(\ell _k\) in a point of \(X_k\).

Proof

Write \(\eta := \varepsilon ^{\prime }/100\). Suppose that \(P\) does not contain at least \((\varepsilon ^{\prime })^3 n^2/100 = \eta (\varepsilon ^{\prime } n)^2\) ordinary lines. Then there are at least \((1 - \eta ) |X_i||X_j|\) pairs \((x_i, x_j) \in X_i \times X_j\) such that \(\overline{\{x_i, x_j\}}\) meets some other line \(\ell _k\). Write \(\Gamma _k \subset X_i \times X_j\) for the set of pairs \((x_i, x_j) \in X_i \times X_j\) such that \(\overline{\{x_i, x_j\}}\) meets \(\ell _k\) in a point of \(P\). Thus \(\sum _k |\Gamma _k| \geqslant (1 - \eta ) |X_i||X_j|\). We claim that there is at most one value of \(k\) for which \(|\Gamma _k| \geqslant \eta |X_i||X_j|/m\), and that for this \(k\) (if it exists) the line \(\ell _k\) does not pass through \(\ell _i \cap \ell _j\). Note that if \(|\Gamma _k| \geqslant \eta |X_i||X_j|/m\) then certainly \(|\Gamma _k| \geqslant \delta n^2\), where \(\delta = n^{-1/250}\).

This claim is a consequence of the following two facts.

Fact 1 If \(\ell _k\) passes through \(\ell _i \cap \ell _j\) then \(|\Gamma _k| < \delta n^2\).

Fact 2 If we have two lines \(\ell _k, \ell _{k^{\prime }}\), neither passing through \(\ell _i \cap \ell _j\), then at least one of \(|\Gamma _k|, |\Gamma _{k^{\prime }}|\) has size at most \(\delta n^2\).

Proof of Fact 1

Suppose that \(\ell _k\) passes through \(\ell _i \cap \ell _j\), but that \(|\Gamma _k| \geqslant \delta n^2\). Here (and in the proof of Fact 2 below) we apply an affine transformation so that \(\ell _i\) is the \(x\)-axis and \(\ell _j\) is the \(y\)-axis. Suppose that \(\ell _k\) is the line \(\{(t, \lambda _k t) : t \in \mathbb R \}\). Suppose that \(X_i = \{(a,0) : a \in A\}\) and \(X_j = \{(0,b) : b \in B\}\) and, by slight abuse of notation, identify \(\Gamma _k\) with a subset of \(A \times B\) in the obvious way. A short computation confirms that the intersection of the line through \((a,0) \in \ell _i\) and \((0,b) \in \ell _j\) with \(\ell _k\) is the point \(( \frac{1}{1/a + \lambda _k/b}, \frac{\lambda _k}{1/a + \lambda _k/b} )\). By Corollary A.2 there are sets \(A^{\prime } \subset A\) and \(B^{\prime } \subset B\) with \(|A^{\prime }|, |B^{\prime }|\gg \delta n\) and \(|\frac{1}{A^{\prime }} - \frac{1}{A^{\prime }}| \ll \delta ^{-11}n \ll n^{1 + \frac{1}{10}}\). Since \(X_i\) is a grid, there is some ratio function \(\psi (t) = (t + \alpha )/(t + \beta ), \alpha \ne \beta \), such that \(A^{\prime } \subset \bigcup _{i = 1}^{n^{1/30}} S_i\), where \(|\mathcal{Q }(\psi (S_i))| \leqslant n^{1 + \frac{1}{10}}\) for each \(i\). Let \(A^{\prime \prime }\) be the largest of the intersections \(A^{\prime } \cap S_i\); then

$$\begin{aligned} |A^{\prime \prime }| \gg \delta n^{1 - \frac{1}{30} }, \, |\mathcal{Q }(\psi (A^{\prime \prime }))| \leqslant n^{1 + \frac{1}{10}} , \, \left| \frac{1}{A^{\prime \prime }} - \frac{1}{A^{\prime \prime }}\right| \ll n^{1 + \frac{1}{10}}. \end{aligned}$$

Writing \(Y := 1/A^{\prime \prime }\) and \(f(y) := \log (1 + \alpha y) - \log (1 + \beta y)\) we thus have

$$\begin{aligned} |Y| \gg \delta n^{1 - \frac{1}{30}}, \, |Y - Y|, |f(Y) - f(Y)| \ll n^{1 + \frac{1}{10}}. \end{aligned}$$
(6.2)

However

$$\begin{aligned} f^{\prime \prime }(y) = \frac{\beta ^2}{(1 + \beta y)^2} - \frac{\alpha ^2}{(1 + \alpha y)^2} = (\beta - \alpha )\frac{2\alpha \beta y + (\alpha + \beta )}{(1 + \alpha y)^2 (1 + \beta y)^2} \end{aligned}$$

is continuous away from \(y = -1/\alpha \) and \(y = - 1/\beta \) and has just one real zero, and therefore one may divide \(\mathbb R \) into at most 4 intervals on the interior of which \(f\) is defined and strictly convex/concave. By Proposition A.9 it follows that one of \(|Y - Y|, |f(Y) - f(Y)|\) has size \(\gg |Y|^{5/4}\). This comfortably contradicts (6.2) for large \(n\). \(\square \)
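The "short computation" in the proof of Fact 1 can be reproduced in exact arithmetic. In the sketch below the function name `meet_lk` and the sample values of \(a\), \(b\) and \(\lambda _k\) are hypothetical; the point is that the intersection with \(\ell _k\) depends on the pair \((a,b)\) only through the sum \(1/a + \lambda _k/b\), which is why few intersection points force additive structure on \(1/A\).

```python
from fractions import Fraction as Fr


def meet_lk(a, b, lam):
    """Point where the line through (a, 0) and (0, b) meets {(t, lam*t)}.

    The line through (a, 0) and (0, b) is x/a + y/b = 1, so on l_k the
    parameter t satisfies t * (1/a + lam/b) = 1.
    """
    s = Fr(1) / a + Fr(lam) / b
    if s == 0:
        raise ValueError("the line is parallel to l_k")
    t = 1 / s
    return (t, lam * t)


lam = 3
for a, b in [(1, 2), (2, 5), (-3, 4)]:
    x, y = meet_lk(a, b, lam)
    # the point lies on both lines
    print((a, b), (x, y), x / a + y / b == 1, y == lam * x)

# Two different pairs with the same value of 1/a + lam/b give the same point.
print(meet_lk(1, 2, 1), meet_lk(2, 1, 1))
```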

Proof of Fact 2

Suppose that \(|\Gamma _k|, |\Gamma _{k^{\prime }}| \geqslant \delta n^2\). By Lemma 6.11 and the fact that neither \(\ell _k\) nor \(\ell _{k^{\prime }}\) contains more than \(n\) points we have

$$\begin{aligned} | \{ \phi _{i,j,k}(x_i)/{\tilde{\phi }}_{i,j,k}(x_j) : (x_i, x_j) \in \Gamma _{k}\}| \leqslant n \end{aligned}$$

and

$$\begin{aligned} | \{ \phi _{i,j,k^{\prime }}(x_i)/{\tilde{\phi }}_{i,j,k^{\prime }}(x_j):(x_i, x_j) \in \Gamma _{k^{\prime }} \} | \leqslant n. \end{aligned}$$

Applying Corollary A.2 exactly as before we deduce that there are sets \(X^{(k)}_i, X^{(k^{\prime })}_i \subset X_i\) and sets \(X^{(k)}_j, X^{(k^{\prime })}_j \subset X_j\), all of size \(\gg \delta n > n^{1 - \frac{1}{25}}\), such that all of \(|\mathcal{Q }(\phi _{i,j,k}(X^{(k)}_i))|, |\mathcal{Q }(\phi _{i,j,k^{\prime }}(X^{(k^{\prime })}_i))|, |\mathcal{Q }({\tilde{\phi }}_{i,j,k}(X^{(k)}_j))|\) and \(|\mathcal{Q }({\tilde{\phi }}_{i,j,k^{\prime }}(X^{(k^{\prime })}_j))|\) have size \(\ll \delta ^{-11}n < n^{1 + \frac{1}{10}}\). Thus \(X^{(k)}_i\) is a \(\phi _{i,j,k}\)-grid, \(X^{(k^{\prime })}_i\) is a \(\phi _{i,j,k^{\prime }}\)-grid, \(X^{(k)}_j\) is a \(\tilde{\phi }_{i,j,k}\)-grid and \(X^{(k^{\prime })}_j\) is a \(\tilde{\phi }_{i,j,k^{\prime }}\)-grid. Since \(X_i\) and \(X_j\) are themselves grids, and since grids corresponding to inequivalent ratio functions intersect in a set of size no more than \(n^{1 - \frac{1}{25}}\) by Lemma 6.10, we see that \(\phi _{i,j,k} \sim \phi _{i,j,k^{\prime }}\) and \({\tilde{\phi }}_{i,j,k} \sim {\tilde{\phi }}_{i,j,k^{\prime }}\).

It follows immediately from the definition of \(\phi _{i,j,k}, \phi _{i,j,k^{\prime }}, {\tilde{\phi }}_{i,j,k}, {\tilde{\phi }}_{i,j,k^{\prime }}\) that \(\ell _i \cap \ell _k = \ell _i \cap \ell _{k^{\prime }}\) and \(\ell _j \cap \ell _k = \ell _j \cap \ell _{k^{\prime }}\), and therefore \(\ell _k = \ell _{k^{\prime }}\). This is contrary to assumption, and so we have established Fact 2.

This completes the proof of the claim. It follows that there is some \(k\) such that at least \((1 - 2\eta )|X_i||X_j|\) pairs \(x_i \in X_i, x_j \in X_j\) are such that \({\overline{\{x_i, x_j\}}}\) meets \(P \cap \ell _k\), and furthermore \(\ell _k\) does not pass through \(\ell _i \cap \ell _j\). If \(x_k \in P \cap \ell _k\), we say that \(x_k\) is well-covered if there are at least \(\eta (\varepsilon ^{\prime })^2 n\) pairs \((x_i , x_j) \in X_i \times X_j\) such that \({\overline{\{x_i, x_j\}}}\) passes through \(x_k\). Since \(|P \cap \ell _k| \leqslant n\), the number of pairs \((x_i, x_j)\) used up by poorly-covered \(x_k\) is no more than \(\eta (\varepsilon ^{\prime })^2n^2 \leqslant \eta |X_i||X_j|\). Thus, for at least \((1 - 3\eta )|X_i||X_j|\) pairs \((x_i, x_j)\), the line \(\overline{\{x_i ,x_j\}}\) meets \(\ell _k\) in a well-covered point \(x_k\). Write \({\tilde{X}}_k \subset P \cap \ell _k\) for the set of well-covered points; we are going to show that there is a grid \(X_k\) occupying almost all of \({\tilde{X}}_k\).

Note that, by definition of well-covered, for any \(Y \subset {\tilde{X}}_k\) there is a graph \(\Gamma \subset Y \times X_i, |\Gamma | \geqslant \eta (\varepsilon ^{\prime })^2 |Y||X_i|\), such that each line \({\overline{\{y , x_i\}}}\) meets \(X_j\) whenever \((y, x_i) \in \Gamma \). By Lemma 6.11 and the trivial bound \(|X_j| \leqslant n\), this implies that

$$\begin{aligned} | \{ \phi _{k,i,j}(y)/{\tilde{\phi }}_{k,i,j}(x_i) : (y, x_i) \in \Gamma \} | \leqslant n. \end{aligned}$$

Suppose that \(|Y| \geqslant \eta n\). Then \(|\Gamma | \gg \eta ^2 (\varepsilon ^{\prime })^3 n^2 \gg n^{2-\frac{1}{200}}\). By Corollary A.2 there are sets \(Y^{\prime } \subset Y\) and \(X^{\prime }_i \subset X_i, |Y^{\prime }|, |X^{\prime }_i| \gg n^{1-\frac{1}{200}}\), such that \( | \mathcal{Q }(\phi _{k,i,j}(Y^{\prime }))| < n^{1 +\frac{1}{10}}\). Applying this argument repeatedly, we see that all of \({\tilde{X}}_k\) except for a set of size at most \(\eta n\) can be covered by disjoint sets \(Y^{\prime }\) with these properties. There are at most \(Cn^{\frac{1}{200}} < n^{\frac{1}{30}}\) of these sets, and so the union of them, \(X_k\) say, is a \(\phi _{k,i,j}\)-grid.

Finally, note that the number of pairs \((x_i, x_j)\) for which \({\overline{\{ x_i ,x_j\}}}\) passes through \({\tilde{X}}_k \setminus X_k\) is at most \(n |{\tilde{X}}_k \setminus X_k| \leqslant \eta n^2\). For all other pairs, \({\overline{\{x_i, x_j\}}}\) passes through the grid \(X_k\).

At last, this completes the proof of the lemma. \(\square \)

We are now in a position to complete the proof of Proposition 6.7 and hence of all the other results in this section.

Proof of Proposition 6.7

Suppose that \(P\) lies on \(\ell _1 \cup \dots \cup \ell _m\), with at least \(\varepsilon n\) points on each line and not all of the lines through a single point. Suppose that \(m, \frac{1}{\varepsilon } \leqslant n^{\frac{1}{10000}}\). By Lemma 6.12 we are done unless there are two values \(i,j \in \{1,\ldots ,m\}\) such that \(P \cap \ell _i, P \cap \ell _j\) both contain grids of size at least \(\varepsilon ^{\prime } n\), with \(\varepsilon ^{\prime } \gg \varepsilon ^4/m^2 > n^{-\frac{1}{1000}}\). Suppose without loss of generality that \(\ell _i\) contains the largest grid amongst all grids in \(P\); call this \(X_i\). Let \(X_j\) be a grid in \(P \cap \ell _j\) of size at least \(\varepsilon ^{\prime } n\), and apply Lemma 6.13 to these two grids \(X_i, X_j\). If the first conclusion of that lemma holds then \(P\) contains \(\gg (\varepsilon ^{\prime } )^3 n^2 \gg \varepsilon ^{12} n^2/m^6\) ordinary lines, and we are done. Otherwise there is a line \(\ell _k\), not passing through \(\ell _i \cap \ell _j\), and a grid \(X_k \subset P \cap \ell _k\), such that for all \((x_i, x_j)\) in a set \(\Gamma \) of size at least \((1 - 4\eta ) |X_i||X_j|\), the line \({\overline{\{x_i,x_j\}}}\) meets \(\ell _k\) in a point of \(X_k\).

By Lemma 6.11 we have

$$\begin{aligned} |\{ \phi _{i,j,k}(x_i)/{\tilde{\phi }}_{i,j,k}(x_j) : (x_i, x_j) \in \Gamma \}| \leqslant |X_k|. \end{aligned}$$

It follows from Lemma A.4 that

$$\begin{aligned} |X_k| \geqslant |X_i| + |X_j| - 4 - 4\sqrt{2\eta |X_i||X_j|}. \end{aligned}$$

Since \(\eta = \varepsilon ^{\prime }/100\) and \(n\) is sufficiently large, this is strictly greater than \(|X_i|\), contrary to the assumption that \(X_i\) was a grid in \(P\) of largest size. \(\square \)

In this rather long section, we have proved results about sets \(P\), contained in a union \(\ell _1 \cup \cdots \cup \ell _m\) of lines, spanning few ordinary lines. They are plausibly of independent interest. Let us now, however, return to our main task and combine what we have established with the main results of previous sections. By combining Proposition 6.1 (using the bound on \(n_0(m,K)\) noted after Proposition 6.3) with Proposition 5.3 we obtain the following structural result. From the qualitative point of view at least, this supersedes all previous results in the paper (and, in particular, implies Theorem 1.4 as a corollary).

Proposition 6.14

Suppose that \(P\) is a set of \(n\) points in \(\mathbb R \mathbb P ^2\) for some \(n \geqslant 100\) spanning at most \(Kn\) ordinary lines, where \(1 \leqslant K \leqslant c(\log \log n)^c\) for some sufficiently small absolute constant \(c\). Then \(P\) differs in at most \(O(K^{O(1)})\) points from a subset of a set of one of the following three types:

  (i) An irreducible cubic curve;

  (ii) The union of an irreducible conic and a line;

  (iii) A line.

Proof

We apply Proposition 5.3. In case (i) we are already done. In cases (ii) and (iii) we see that, after removing an irreducible conic if necessary, we have \(\geqslant n/2 - O(K^{O(1)})\) points on \(O(K^{O(1)})\) lines determining at most \(O(K^{O(1)}) n\) ordinary lines. By Proposition 6.1 and the hypothesis \(K \leqslant c(\log \log n)^c\), all but \(O(K^{O(1)})\) of these points lie on a single line, and the claim follows. \(\square \)

7 The Detailed Structure Theorem

We turn now to the proof of the detailed structure theorem, Theorem 1.5. We first establish a slightly weaker version in which the linear bounds \(O(K)\) have been relaxed to polynomial bounds \(O(K^{O(1)})\). This somewhat weaker statement is already sufficient for our main application, the proof of the Dirac–Motzkin conjecture for large \(n\). At the end of this section we indicate how to recover the full strength of Theorem 1.5.

Theorem 7.1

Suppose that \(P\) is a finite set of \(n\) points in the projective plane \(\mathbb R \mathbb P ^2\). Suppose that \(P\) spans at most \(Kn\) ordinary lines for some \(K \geqslant 1\), and suppose that \(n \geqslant \exp \exp (CK^C)\) for some sufficiently large absolute constant \(C\). Then, after applying a projective transformation if necessary, \(P\) differs by at most \(O(K^{O(1)})\) points from an example of one of the following three types:

  (i) \(n-O(K^{O(1)})\) points on a line;

  (ii) The set \(X_{2m}\) defined in (1.1) for some \(m = \frac{1}{2}n - O(K^{O(1)})\);

  (iii) A coset \(H \oplus x, 3x \in H\), of some finite subgroup \(H\) of the real points on an irreducible cubic curve with \(H\) having cardinality \(n+O(K^{O(1)})\).

We now prove this theorem. We already know from Proposition 6.14 that a set with at most \(Kn\) ordinary lines must mostly lie on a line, the union of a conic and a line, or an irreducible cubic curve, and the proof of Theorem 1.5 proceeds by analysing the second two possibilities further.

The key to doing this is the fact that collinearities on (possibly reducible) cubic curves are related to group structure. This is particularly clear in the case of an irreducible cubic, as we briefly discussed in Sect. 2. However, one can see some group structure even in somewhat degenerate cases.

An important ingredient in our analysis is the following result of a fairly standard type from additive combinatorics, which may be thought of as a kind of structure theorem for triples of sets \(A, B, C\) with few “arithmetic ordinary lines”.

Proposition A.5 Suppose that \(A, B, C\) are three subsets of some abelian group \(G\), all of cardinality within \(K\) of \(n\), where \(K \leqslant \varepsilon n\) for some absolute constant \(\varepsilon > 0\). Suppose that there are at most \(Kn\) pairs \((a,b) \in A \times B\) for which \(a +b \notin C\). Then there is a subgroup \(H \leqslant G\) and cosets \(x + H, y + H\) such that \(|A \triangle (x + H)|, |B \triangle (y + H)|, |C \triangle (x + y + H)| \leqslant 7K\).

This result will not be at all surprising to those initiated in additive combinatorics, but we do not know of a convenient reference for it. We supply a complete proof in Appendix A.

Suppose that \(P\) is mostly contained in a cubic curve \(\gamma \) which is not a line. We subdivide into two cases according to whether \(\gamma \) is irreducible or not.

Lemma 7.2

(Configurations mostly on an irreducible cubic) Suppose that \(P\) is a set of \(n\) points in \(\mathbb R \mathbb P ^2\) spanning at most \(Kn\) ordinary lines. Suppose that all but \(K\) of the points of \(P\) lie on an irreducible cubic curve \(\gamma \), and suppose that \(n \geqslant CK\) for a suitably large absolute constant \(C\). Then there is a coset \(H \oplus x\) of \(\gamma \) with \(3x = x \oplus x \oplus x \in H\) such that \(|P \triangle (H \oplus x)| = O(K)\). In particular, \(\gamma \) is either an elliptic curve or an acnodal cubic.

Proof

Let \(\gamma ^*\) be the smooth points of \(\gamma \), which we give a group law as in Sect. 2.

Set \(P^{\prime } := P \cap \gamma ^*\). Then \(|P^{\prime }| = |P| + O(K)\), and \(P^{\prime }\) spans at most \(O(Kn)\) ordinary lines. If \(p_1, p_2 \in \gamma ^*\) are distinct then the line joining \(p_1\) and \(p_2\) meets \(\gamma ^*\) again in the unique point \( \ominus p_1 \ominus p_2\). The bound on the number of ordinary lines therefore implies that \(\ominus p_1 \ominus p_2 \in P^{\prime }\) for all but at most \(O(Kn)\) pairs \(p_1, p_2 \in P^{\prime }\). Applying Proposition A.5 (with \(A = B = P^{\prime }\) and \(C = \ominus P^{\prime }\)), it follows that there is a coset \(H \oplus x\) of \(\gamma ^*\) such that \(|P \triangle (H \oplus x)| = O(K)\) and also \(|P \triangle (H \ominus 2x)| = O(K)\). From this it follows that \(|(H \oplus x) \triangle (H \ominus 2x)| = O(K)\), which implies that \(3x \in H\) if \(n \geqslant CK\) is large enough.

If \(\gamma \) is not an elliptic curve or an acnodal cubic then the group \(\gamma ^*\) is isomorphic to \(\mathbb R \) or to \(\mathbb R \times \mathbb Z /2\mathbb Z \), and neither of these groups has a finite subgroup of size larger than 2. \(\square \)

We turn now to the consideration of sets \(P\) which are almost contained in the union of an irreducible conic \(\sigma \) and a line \(\ell \). The union \(\sigma \cup \ell \) is a reducible cubic and so does not have a bona fide group law. There is, however, a very good substitute for one as the following proposition shows. In what follows we write \(\sigma ^* := \sigma \setminus (\sigma \cap \ell )\) and \(\ell ^* := \ell \setminus (\sigma \cap \ell )\). Note that the intersection \(\sigma \cap \ell \) has size \(0, 1\) or \(2\).

Proposition 7.3

(Quasi-group law) Suppose that \(\sigma \) is an irreducible conic and that \(\ell \) is a line. Then there is an abelian group \(G = G_{\sigma , \ell }\) with operation \(\oplus \) and bijective maps \(\psi _{\sigma } : G \rightarrow \sigma ^*, \psi _{\ell } : G \rightarrow \ell ^*\) such that \(\psi _{\sigma }(x), \psi _{\sigma }(y)\) and \(\psi _{\ell }(z)\) are collinear precisely if \(x \oplus y \oplus z = 0\). Furthermore \(G_{\sigma , \ell }\) is isomorphic to \(\mathbb Z /2\mathbb Z \times \mathbb R \) if \(|\sigma \cap \ell | = 2\), to \(\mathbb R \) if \(|\sigma \cap \ell | = 1\) and to \(\mathbb R /\mathbb Z \) if \(|\sigma \cap \ell | = 0\).

Proof

This is certainly a known result, but it is also an easy and fun exercise to work through by hand, as we now sketch. If \(|\sigma \cap \ell | = 2\), we may apply a projective transformation to move the two points of intersection to \([0,0,1]\) and \([0,1,0]\), and \(\sigma ^*\) to the parabola \(\{[a,a^2,1] : a \in \mathbb R ^\times \}\) and \(\ell ^*\) to \(\{[0,-b,1] : b \in \mathbb R ^{\times } \}\). We note that \([a_1,a_1^2,1], [a_2, a_2^2, 1]\) and \([0,-b,1]\) are collinear if and only if \(b = a_1 a_2\).

Consider the maps \(\psi _{\ell } : \mathbb Z /2\mathbb Z \times \mathbb R \rightarrow \ell ^*\) defined by \(\psi _{\ell }(\varepsilon , x) = [0, -b, 1]\) where \(b = (-1)^\varepsilon e^{-x}\) and \(\psi _{\sigma } : \mathbb Z /2\mathbb Z \times \mathbb R \rightarrow \sigma ^*\) defined by \(\psi _{\sigma }(\varepsilon , x) = [a, a^2, 1]\) where \(a = (-1)^\varepsilon e^x\). Then we see that the maps \(\psi _{\ell }, \psi _{\sigma }\) are bijections and that the claimed collinearity property holds.

Now suppose that \(|\sigma \cap \ell | = 1\). Applying a projective transformation, we may suppose that the point of intersection is \([0,1,0]\) and move \(\sigma ^*\) to the parabola \(\{ [a,a^2,1] : a \in \mathbb R \}\) and \(\ell ^*\) to the line at infinity \(\{ [1,-b,0]: b \in \mathbb R \}\). Now note that distinct points \([a_1,a_1^2,1], [a_2, a_2^2, 1]\) and \([1,-b,0]\) are collinear if and only if \(a_1 + a_2 + b = 0\) [cf. the near-example (2.4)]. Thus in this case the result is true with \(\psi _{\sigma }(a) = [a,a^2,1]\) and \(\psi _{\ell }(b) = [1,-b,0]\).
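The collinearity criteria in the two parabola normalisations above can be sanity-checked with exact rational arithmetic. The following is only a sketch: the helper names (`det3`, `conic_point`, `line_point_two`, `line_point_one`, `check_case`) and the sample grid `SAMPLES` are our own illustrative choices, and a finite check is of course not a proof.

```python
from fractions import Fraction
from itertools import product


def det3(p, q, r):
    """Determinant of the 3x3 matrix whose rows are the homogeneous points p, q, r."""
    return (p[0] * (q[1] * r[2] - q[2] * r[1])
            - p[1] * (q[0] * r[2] - q[2] * r[0])
            + p[2] * (q[0] * r[1] - q[1] * r[0]))


def conic_point(a):
    """The point [a, a^2, 1] on the parabola."""
    return (a, a * a, Fraction(1))


def line_point_two(b):
    """The point [0, -b, 1], used when the conic and the line meet in two points."""
    return (Fraction(0), -b, Fraction(1))


def line_point_one(b):
    """The point [1, -b, 0] at infinity, used when they meet in one point."""
    return (Fraction(1), -b, Fraction(0))


# Hypothetical sample of nonzero rational parameters (an assumption of this sketch).
SAMPLES = [Fraction(k, 3) for k in range(-6, 7) if k != 0]


def check_case(line_point, relation):
    """True if, for distinct a1, a2 in SAMPLES, collinearity matches relation(a1, a2, b)."""
    for a1, a2, b in product(SAMPLES, repeat=3):
        if a1 == a2:
            continue
        collinear = det3(conic_point(a1), conic_point(a2), line_point(b)) == 0
        if collinear != relation(a1, a2, b):
            return False
    return True


# Two intersection points: collinear iff b = a1 * a2.
two_point_ok = check_case(line_point_two, lambda a1, a2, b: b == a1 * a2)
# One intersection point: collinear iff a1 + a2 + b = 0.
one_point_ok = check_case(line_point_one, lambda a1, a2, b: a1 + a2 + b == 0)
```

Behind the check, the two determinants factor as \((a_2 - a_1)(a_1 a_2 - b)\) and \((a_1 - a_2)(a_1 + a_2 + b)\) respectively, which is why the criteria hold whenever \(a_1 \ne a_2\).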

Finally suppose that \(\sigma \) and \(\ell \) do not intersect. Applying a projective transformation we may map \(\ell \) to the line at infinity \(\{ [\sin \pi \theta , \cos \pi \theta , 0]: \theta \in \mathbb R /\mathbb Z \}\). As \(\sigma \) is disjoint from \(\ell \) it must be a compact conic section in \(\mathbb R ^2\), that is to say an ellipse. By a further affine transformation we may assume that it is in fact the unit circle \(\{[\cos 2\pi \theta , \sin 2 \pi \theta ,1] : \theta \in \mathbb R /\mathbb Z \}\). By elementary trigonometry it may be verified that the points \([\cos 2\pi \alpha _1, \sin 2 \pi \alpha _1, 1], [\cos 2\pi \alpha _2, \sin 2 \pi \alpha _2, 1]\) and \([\sin \pi \beta , \cos \pi \beta , 0]\) are collinear if and only if \(\alpha _1 + \alpha _2 + \beta = 0\), thus in this case the result is true with \(\psi _{\sigma }(\theta ) = [\cos 2\pi \theta , \sin 2 \pi \theta ,1] \) and \(\psi _{\ell }(\theta ) = [\sin \pi \theta , \cos \pi \theta , 0]\). \(\square \)
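The "elementary trigonometry" in the disjoint case can be spot-checked numerically. The sketch below restricts to parameters of the form \(j/m\), so that the condition \(\alpha _1 + \alpha _2 + \beta = 0\) in \(\mathbb R /\mathbb Z \) becomes an exact congruence on integers; the function names and the tolerance `tol` are assumptions made for illustration only.

```python
import math


def det3(p, q, r):
    """Determinant of the 3x3 matrix whose rows are the homogeneous points p, q, r."""
    return (p[0] * (q[1] * r[2] - q[2] * r[1])
            - p[1] * (q[0] * r[2] - q[2] * r[0])
            + p[2] * (q[0] * r[1] - q[1] * r[0]))


def circle_point(j, m):
    """psi_sigma(j/m) = [cos(2 pi j/m), sin(2 pi j/m), 1]."""
    t = 2 * math.pi * j / m
    return (math.cos(t), math.sin(t), 1.0)


def infinity_point(k, m):
    """psi_ell(k/m) = [sin(pi k/m), cos(pi k/m), 0]."""
    t = math.pi * k / m
    return (math.sin(t), math.cos(t), 0.0)


def check_grid(m, tol=1e-9):
    """True if, for distinct circle points, collinearity matches i + j + k = 0 mod m."""
    for i in range(m):
        for j in range(m):
            if i == j:
                continue
            for k in range(m):
                d = det3(circle_point(i, m), circle_point(j, m), infinity_point(k, m))
                if (abs(d) < tol) != ((i + j + k) % m == 0):
                    return False
    return True
```

Here the determinant equals \(2 \sin (\pi (\alpha _1 + \alpha _2 + \beta )) \sin (\pi (\alpha _1 - \alpha _2))\), so on such a grid it is either zero up to rounding error or bounded well away from zero, and the tolerance is not delicate.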

We may now derive the following consequence, analogously to Lemma 7.2.

Lemma 7.4

(Conic and line) Suppose that \(P \subset \mathbb R \mathbb P ^2\) is a set of \(n \geqslant n_0(K)\) points, all except \(K\) of which lie on the union of an irreducible conic \(\sigma \) and a line \(\ell \). Suppose that \(P\) defines at most \(Kn\) ordinary lines, and suppose that \(P\) has \(n/2 + O(K)\) points on each of \(\sigma \) and \(\ell \). Then, after a projective transformation, \(P\) differs from one of the sets \(X_{n^{\prime }}\) by at most \(O(K)\) points.

Proof

Write \(P^{\prime } := P \cap (\sigma \cup \ell )\). Set \(P_{\sigma } := P \cap \sigma ^*\) and \(P_{\ell } := P \cap \ell ^*\). Then \(|P_{\sigma }| + |P_{\ell }| = |P| + O(K)\), and \(P^{\prime }\) spans at most \(O(Kn)\) ordinary lines. Consider also the pull-backs \(A := \psi _{\sigma }^{-1}(P_{\sigma })\) and \(B := \psi _{\ell }^{-1}(P_{\ell })\), where \(\psi _{\ell }, \psi _{\sigma }\) are the “quasi-group law” maps introduced in the preceding proposition. Both \(A\) and \(B\) are subsets of \(G_{\sigma , \ell }\), a group for which there are three possibilities, detailed in Proposition 7.3.

The assumption about ordinary lines implies that \(\ominus a_1 \ominus a_2 \in B\) for all but at most \(O(K n)\) pairs \(a_1, a_2 \in A\). Applying Proposition A.5, it follows that there is a subgroup \(H \leqslant G_{\sigma , \ell }\) and cosets \(x \oplus H, -2x \oplus H\) such that \(|A \triangle (x \oplus H)|, |B \triangle (-2x \oplus H)| = O(K)\).

If \(n \geqslant CK\) for large enough \(C\) then it follows that \(|\sigma \cap \ell | = 0\) since (with reference to the three possibilities for \(G_{\sigma , \ell }\) described in Proposition 7.3) neither \(\mathbb Z /2\mathbb Z \times \mathbb R \) nor \(\mathbb R \) has a finite subgroup of size larger than 2. Applying a projective transformation, we may assume that \(\ell \) is the line at infinity and, as in the proof of Proposition 7.3, that \(\sigma \) is the unit circle. We may apply a further rotation so that \(x = 0\), that is to say \(|A \triangle H| = O(K)\) and \(|B \triangle H| = O(K)\).

All finite subgroups of \(\mathbb R /\mathbb Z \) are cyclic and so we have \(H = \{j/m:j \in \{0,1,\dots , m-1\}\}\) for some \(m\), that is to say \(H\) consists of the (additive) \(m\)th roots of unity. But then \(\psi _{\sigma }(H) \cup \psi _{\ell }(H)\) is precisely the set \(X_{n^{\prime }} = X_{2m}\) described in the introduction and in the statement of Theorem 1.5. \(\square \)

Putting Lemmas 7.2 and 7.4 together with the main result of the previous section, Proposition 6.14, we immediately obtain Theorem 7.1. The remainder of this section is devoted to establishing our most precise structure theorem, Theorem 1.5. Let us remind the reader that this is the same as Theorem 7.1, only the polynomial error terms \(O(K^{O(1)})\) are replaced by linear errors \(O(K)\). The reader interested in the proof of the Dirac–Motzkin conjecture for large \(n\) may proceed immediately to the next section, where Theorem 7.1 is already sufficient.

Proof of Theorem 1.5

The converse claim to this theorem already follows from the analysis in Sect. 2, so we focus on the forward claim. We may assume that the constant \(C\) is sufficiently large. We then apply Theorem 7.1 to obtain (after a projective transformation) that \(P\) differs by \(O(K^{O(1)})\) points from one of the three examples (i), (ii), (iii) listed. Our task is to bootstrap this \(O(K^{O(1)})\) error to a linear error \(O(K)\).

Suppose first that case (i) holds, thus all except \(O(K^{O(1)})\) points of \(P\) lie on a line \(\ell \). Then every point \(p\) in \(P\) that does not lie on \(\ell \) forms at least \(n-O(K^{O(1)})\) lines with a point in \(P \cap \ell \). At most \(O(K^{O(1)})\) of these can meet a further point in \(P\), so each point in \(P \backslash \ell \) produces at least \(n-O(K^{O(1)})\) ordinary lines connecting that point with a point in \(P \cap \ell \). We conclude that the number of ordinary lines is at least \((n-O(K^{O(1)})) |P \backslash \ell |\); since there are at most \(Kn\) ordinary lines, we conclude that \(|P \backslash \ell | = O(K)\), and the claim follows.

Now suppose that case (ii) holds, thus \(P\) differs by \(O(K^{O(1)})\) points from \(X_{2m}\) for some \(m = \frac{1}{2}n-O(K^{O(1)})\). To analyse this we need the following result, essentially due to Poonen and Rubinstein [29].

Proposition 7.5

Let \(\Pi _n \subset \mathbb{C }\equiv \mathbb R ^2\) denote the regular \(n\)-gon consisting of the \(n\)th roots of unity. Then no point other than the origin or an element of \(\Pi _n\) lies on more than \(C\) lines joining pairs of vertices in \(\Pi _n\), for some absolute constant \(C\).

Actually, in [29] it was shown that \(C\) could be taken to be 7 when one restricts attention to points inside the unit circle. The case of points outside the unit circle was not directly treated in that paper, but can be handled by a variant of the methods of that paper. See Appendix B for details. For the purposes of establishing Theorem 1.5, the full strength of Proposition 7.5 is unnecessary. Indeed, the more elementary weaker version established in Proposition B.2 would also suffice for this purpose.

Corollary 7.6

Suppose that \(p \in \mathbb R \mathbb P ^2\) does not lie on the line at infinity, is not an \(m\)th root of unity and is not the origin [0,0,1]. Then at least \(2m - O(1)\) of the \(2m\) lines joining \(p\) to a point of \(X_{2m}\) pass through no other point of \(X_{2m}\).

Proof

If \(x \in X_{2m}\) then we say that \(x\) is bad if the line \(\overline{\{p,x\}}\) passes through another point \(y\in X_{2m}, y \ne x\). Suppose first that \(x\) is an \(m\)th root of unity. We claim that if \(x\) is bad then \(\overline{\{p,x\}}\) passes through another \(m\)th root of unity, different from \(x\), or else \(\overline{\{p,x\}}\) is tangent to the unit circle. If \(y\) is already an \(m\)th root of unity then we are done; otherwise \(y\) is one of the \(m\) points on the line at infinity. But then the line \(\overline{\{x,y\}}\) passes through another \(m\)th root of unity \(x^{\prime }\) unless it is tangent to the unit circle, and we have proved the claim.

This enables us to count the number of bad \(x\) which are \(m\)th roots of unity. There are at most two coming from the possibility that \(\overline{\{p,x\}}\) is tangent to the unit circle. Otherwise, \(\overline{\{ p, x\}}\) contains another \(m\)th root of unity \(x^{\prime }\), whence \(p\) lies on the line \(\overline{\{x,x^{\prime }\}}\). This gives at most \(O(1)\) further possibilities by Proposition 7.5.

Now suppose that \(x\) is one of the \(m\) points on the line at infinity. If \(\overline{\{p,x\}}\) passes through an \(m\)th root of unity \(y\) then it passes through another such root of unity \(y^{\prime }\) unless \(\overline{\{p,x\}}\) is tangent to the unit circle. There are at most \(2\) points on the line at infinity corresponding to the tangent lines, and then at most \(O(1)\) corresponding to the chords \(\overline{\{y ,y^{\prime }\}}\) on which \(p\) lies, by another application of Proposition 7.5. \(\square \)

We return now to the analysis of case (ii). Let \(p\) be a point of \(P\) not on either the unit circle or the line at infinity. Then by Corollary 7.6, only \(O(1)\) of the \(n-O(K^{O(1)})\) lines connecting \(p\) with \(X_{2m}\) also meet another element of \(X_{2m}\). As \(P\) only differs from \(X_{2m}\) by \(O(K^{O(1)})\) points, we conclude that there are \(n-O(K^{O(1)})\) ordinary lines of \(P\) that connect \(p\) with an element of \(X_{2m}\). As in case (i), this implies that there are at most \(O(K)\) points of \(P\) lying outside the union of the unit circle and the line at infinity. Applying Lemma 7.4, we obtain the claim.

Finally, we consider the case (iii). By Lemma 7.2, it suffices to show that there are at most \(O(K)\) points of \(P\) that do not lie on the curve \(E\), which is either an elliptic curve or the smooth points of an acnodal singular cubic curve. By the same argument used to handle cases (i) and (ii), it then suffices to show that each point \(p\) in \(P \backslash E\) generates \(\gg n\) ordinary lines in \(P\).

For this, it suffices to establish the following lemma.

Lemma 7.7

Suppose that \(E\) is an elliptic curve or the smooth points of an acnodal singular cubic curve and that \(H \oplus x\) is a coset of a finite subgroup of \(E\) of size \(n > 10^4\). Then, if \(p \notin E\) is a point, there are at least \(n/1000\) lines through \(p\) that meet exactly one element of \(H \oplus x\).

Remark

The constant \(1/1000\) could be improved a little by our methods, but we have not bothered to perform such an optimisation here. It seems reasonable to conjecture, in analogy with the results in [29], that in fact there are only \(O(1)\) lines through \(p\) that can meet three elements of a coset \(x \oplus H\) of a finite subgroup of an elliptic curve, but this would seem to lie far deeper.

Proof

We first exclude one degenerate case, in which \(E\) is the smooth points of an acnodal singular cubic curve, and \(p\) is the isolated (i.e. acnodal) singular point of that curve. In this case, any line through \(p\) meets exactly one point of \(E\), and the claim is trivial. Thus, we may assume that \(p\) does not lie on the cubic curve that contains \(E\).

Suppose the result is false. Then at least \(0.999n\) of the lines joining \(p\) to \(x \oplus H\) meet \(x \oplus H\) in \(2\) or \(3\) points. In the former case, the line must be tangent to \(E\). There are at most \(6\) such tangents (see Footnote 7). Thus at least \(0.998n\) of the lines joining \(p\) to points of \(x \oplus H\) meet \(x \oplus H\) in 3 points.

As a topological group, the cubic curve \(E\) is isomorphic to either \(\mathbb R /\mathbb Z \) or \(\mathbb R /\mathbb Z \times (\mathbb Z /2\mathbb Z )\). Consider all the lines through \(p\) that are tangent to \(E\); there are at most \(6\) such lines, each meeting \(E\) in at most \(2\) points. These (at most) 12 points partition \(E\) into no more than \(13\) connected open sets \(A_1,\ldots ,A_{13}\) (topologically, these are either arcs or closed loops), plus \(12\) endpoint vertices. From a continuity argument, we see that for each \(i\) one of the following statements is true:

  1. (i)

    the lines connecting \(p\) to points of \(A_i\) do not meet \(E\) again;

  2. (ii)

    there exist \(A_j, A_k\), distinct from each other and from \(A_i\), such that any line connecting \(p\) and a point in \(A_i\) meets \(E\) again, once at a point in \(A_j\), and once at a point in \(A_k\).

Suppose that \(i\) is of type (i). Then by our supposition that the lemma is false we may assume that \(|A_i \cap (H \oplus x)| < 0.001n\), since all the lines from \(p\) to \(A_i \cap (H \oplus x)\) contain no other point of \(E\). By the pigeonhole principle there is some \(i\) of type (ii) with \(|A_i \cap (H \oplus x)| > \frac{1}{13}(1-0.012) n > 0.05n > 3\). By property (ii), lines from \(p\) through \(A_i\) meet the curve \(E\) again in \(A_j\) and \(A_k\).

Recall that for all except at most \(0.002n\) elements \(q\) of \(A_i \cap (H \oplus x)\), the line \(\overline{\{p,q\}}\) meets \(A_j\) and \(A_k\) at elements of \(H \oplus x\). It is easy to conclude from this, and similar statements for \(j,k\), that the sizes of \(A_i \cap (H \oplus x), A_j \cap (H \oplus x)\) and \(A_k \cap (H \oplus x)\) differ by at most \(0.004 n\).

Let \(\phi _{ij}: A_i \rightarrow A_j\) be the map that sends a point \(q\) in \(A_i\) to the point \(\overline{\{p,q\}} \cap A_j\), then \(\phi _{ij}\) is a homeomorphism from the set \(A_i\) to the set \(A_j\); in particular, \(\phi _{ij}\) is either orientation-preserving or orientation-reversing, once one places an orientation on both \(A_i\) and \(A_j\). Furthermore, \(\phi _{ij}\) maps all but at most \(0.002 n\) of the elements of \(A_i \cap (H \oplus x)\) to \(A_j \cap (H \oplus x)\) and vice versa. Now as \(H\) is a subgroup of \(E\), which as an abelian topological group is either \(\mathbb R /\mathbb Z \) or \(\mathbb R /\mathbb Z \times (\mathbb Z /2\mathbb Z )\), we see that the sets \(A_i \cap (H \oplus x), A_j \cap (H \oplus x)\), being intersections of arcs in \(\mathbb R /\mathbb Z \) with the discrete coset \(x \oplus H\), are arithmetic progressions in \(x \oplus H\) with a common spacing \(h\).

Now for all but at most \(0.002 n\) values of \(y \in A_i \cap (H \oplus x), \phi _{ij}\) maps \(y\) to a point of \(A_j \cap (H \oplus x)\). For all but at most \(1 + 0.002 n\) values of \(y \in A_i \cap (H \oplus x), \phi _{ij}\) maps \(y \oplus h\) to a point of \(A_j \cap (H \oplus x)\). (The extra \(1\) comes from the fact that there is one endpoint value of \(y\) in the progression \(A_i \cap (x \oplus H)\) for which \(y \oplus h\) does not lie in this progression.) Of the values of \(y\) satisfying both of these statements, for all but at most \(0.004n\) values we have

$$\begin{aligned} \phi _{ij}(y \oplus h) = \phi _{ij}(y) \oplus h^{\prime } \end{aligned}$$
(7.1)

for \(h^{\prime }\) equal to either \(h\) or \(\ominus h\) (depending on whether \(\phi _{ij}\) is orientation-preserving or orientation-reversing). Thus (7.1) holds for all except at most \(0.01n\) values of \(y \in A_i \cap (H \oplus x)\).

Similarly, defining \(\phi _{ik}\) in exactly the same way as \(\phi _{ij}\), we see that for all except at most \(0.01n\) elements \(y\) in \(A_i \cap (H \oplus x)\) we have

$$\begin{aligned} \phi _{ik}(y \oplus h) = \phi _{ik}(y) \oplus h^{\prime \prime } \end{aligned}$$
(7.2)

for \(h^{\prime \prime }\) equal to either \(h\) or \(\ominus h\).

Recalling that \(|A_i \cap (x \oplus H)| > 0.05n\), we may thus find \(y \in (x \oplus H) \cap A_i\) such that both (7.1) and (7.2) hold. On the other hand, as \(y, \phi _{ij}(y), \phi _{ik}(y)\) are collinear, we have

$$\begin{aligned} y \oplus \phi _{ij}(y) \oplus \phi _{ik}(y) = O \end{aligned}$$

and similarly

$$\begin{aligned} y \oplus h \oplus \phi _{ij}(y \oplus h) \oplus \phi _{ik}(y \oplus h) = O. \end{aligned}$$

From these equations and (7.1), (7.2) we conclude that

$$\begin{aligned} h \oplus h^{\prime } \oplus h^{\prime \prime } = O. \end{aligned}$$

Since \(h^{\prime }, h^{\prime \prime }\) are equal to either \(h\) or \(\ominus h\), we conclude that \(h\) has order at most 3, and so \(|A_i \cap (H \oplus x)| \leqslant 3\). However we have already observed that \(|A_i \cap (H \oplus x)| > 3\), a contradiction. \(\square \)
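The last step is a four-case check on the signs: \(h \oplus h^{\prime } \oplus h^{\prime \prime }\) is one of \(3h\), \(h\), \(h\) or \(\ominus h\), so its vanishing forces \(3h = O\). As a small illustration, the sketch below enumerates the sign choices in the cyclic group \(\mathbb Z /N\mathbb Z \); modelling the relevant group as cyclic, and the function names, are assumptions of the sketch.

```python
import math
from itertools import product


def order(h, N):
    """Order of h in the cyclic group Z/N."""
    return N // math.gcd(h, N)


def forced_orders(N):
    """Orders of all h in Z/N with h + s1*h + s2*h = 0 for some signs s1, s2 in {+1, -1}."""
    orders = set()
    for h in range(N):
        for s1, s2 in product((1, -1), repeat=2):
            if (h + s1 * h + s2 * h) % N == 0:
                orders.add(order(h, N))
    return orders
```

In every case only the orders 1 and 3 can occur, so an arithmetic progression with common spacing \(h\) has at most three distinct elements.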

The proof of Theorem 1.5 is now complete.

8 The Dirac–Motzkin Conjecture

The Dirac–Motzkin conjecture is the statement that, for \(n\) large, a set \(P\) of \(n\) points in \(\mathbb R ^2\) not all lying on a line spans at least \(n/2\) ordinary lines. The main result of this paper is a proof of a more precise version of this for large \(n\), Theorem 2.2, together with a characterization of the extremal examples. We prove the even more precise Theorem 2.4, of which Theorem 2.2 is an easy consequence, in this section. We refer the reader to Sect. 2 for a precise statement of these two results and a leisurely discussion of the relevant examples.

Suppose that \(P\) spans at most \(n\) ordinary lines and that \(P\) is not collinear. We may apply our main structure theorem, Theorem 1.5, to conclude that \(P\) differs in \(O(1)\) points from one of three examples: points on a line, a set \(X_{n^{\prime }}\), and a coset of a subgroup of an irreducible cubic curve (Sylvester-type example). In fact, the weaker and rather easier Theorem 7.1 suffices for this purpose.

It is obvious that the first type of set spans at least \(n - O(1)\) ordinary lines. Sets close to a Sylvester example are also relatively easy to handle.

Lemma 8.1

Suppose that \(P \subset \mathbb R \mathbb P ^2\) differs in \(K\) points from a coset \(H \oplus x\) of a subgroup \(H\) of some irreducible cubic curve, where \(3x = x \oplus x \oplus x \in H\). Then \(P\) spans at least \(n - O(K)\) ordinary lines.

Proof

Write \(h_0 := 3x\), thus \(h_0 \in H\). For every \(h \in H\), the line joining \(h \oplus x\) and \((-2h \ominus h_0) \oplus x\) is tangent to \(\gamma \) at \(h \oplus x\), since \((h \oplus x) \oplus (h \oplus x) \oplus (-2h \ominus h_0 \oplus x) = 0\). Therefore it is an ordinary line unless \(3h \oplus h_0 = 0\), in which case the points \(h \oplus x\) and \(-2h \ominus h_0 \oplus x\) coincide. Thus the only points of \(H \oplus x\) not belonging to an ordinary line spanned by \(H \oplus x\) correspond to the points of \(H\) with \(3h \oplus h_0 = 0\). Since \(H\) is isomorphic to a subgroup of either \(\mathbb R /\mathbb Z \) or \(\mathbb Z /2\mathbb Z \times \mathbb R /\mathbb Z \), there are no more than \(3\) of these. It follows immediately that any set formed by removing at most \(K\) points of \(H \oplus x\) has at least \(n - O(K)\) ordinary lines, and these are all tangent lines to \(\gamma \). No point in the plane lies on more than 6 tangent lines to \(\gamma \), and so the addition of a point destroys no more than 6 of our \(n - O(K)\) ordinary lines. It follows that \(P\) itself spans at least \(n - O(K)\) ordinary lines, as we wanted to prove. \(\square \)
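The count of tangent lines in this proof can be illustrated in a toy model. The sketch below assumes that \(H\) is cyclic of order \(n\), identified with \(\mathbb Z /n\mathbb Z \), and labels the point \(h \oplus x\) by \(h\); the case \(H \cong \mathbb Z /2\mathbb Z \times \mathbb Z /N\mathbb Z \) is not covered, and the function names are our own.

```python
def exceptional_points(n, h0):
    """Labels h in Z/n with 3h + h0 = 0, where the tangent meets the coset only at h."""
    return [h for h in range(n) if (3 * h + h0) % n == 0]


def ordinary_tangent_lines(n, h0):
    """Pairs (h, other): the tangent at h meets the coset again at other = -2h - h0 != h."""
    lines = []
    for h in range(n):
        other = (-2 * h - h0) % n
        if other != h:
            lines.append((h, other))
    return lines
```

In this model there are at most three exceptional labels, and the remaining tangent lines are pairwise distinct as point pairs, giving at least \(n - 3\) ordinary lines spanned by the coset.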

Combining this lemma with the remarks just preceding it, we have now established the existence of an absolute constant \(C\) such that a set of \(n\) points, not all on a line, and spanning at most \(n - C\) ordinary lines, differs in \(O(1)\) points from a set \(X_{2m}\) consisting of the \(m\)th roots of unity plus \(m\) corresponding points on the line at infinity. Now the \(m\) tangents to the unit circle at roots of unity pass through only one other point of \(X_{2m}\), and so \(X_{2m}\) has \(m\) ordinary lines. Furthermore, since each point not on the unit circle can be incident to at most two such tangent lines, the addition/deletion of \(O(1)\) points does not affect more than \(O(1)\) of these lines. This already establishes a weak version of the Dirac–Motzkin conjecture: every non-collinear set of \(n\) points spans at least \(n/2 - O(1)\) ordinary lines.
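The line structure of \(X_{2m}\) can be enumerated directly from the quasi-group law of Proposition 7.3, without any floating-point geometry. In the sketch below the labels `("s", i)` and `("l", k)` stand for \(\psi _{\sigma }(i/m)\) and \(\psi _{\ell }(k/m)\); these labels, the function names and the restriction \(m \geqslant 3\) are assumptions of the sketch.

```python
from math import comb


def lines_of_X2m(m):
    """Lines spanned by X_{2m} (m >= 3), each given as a frozenset of labelled points."""
    lines = {frozenset(("l", k) for k in range(m))}  # the line at infinity
    for i in range(m):
        for j in range(i, m):
            # The chord through s_i and s_j (the tangent at s_i when i == j)
            # meets the line at infinity at the point with label -i-j mod m.
            lines.add(frozenset({("s", i), ("s", j), ("l", (-i - j) % m)}))
    return lines


def ordinary_line_count(m):
    """Number of lines containing exactly two points of X_{2m}."""
    return sum(1 for line in lines_of_X2m(m) if len(line) == 2)


def pair_count(m):
    """Number of point pairs covered by the enumerated lines."""
    return sum(comb(len(line), 2) for line in lines_of_X2m(m))
```

The ordinary lines found this way are exactly the \(m\) tangents, and the pair count agrees with \(\binom{2m}{2}\), which confirms that no spanned line has been missed.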

To prove Theorem 2.4, a much more precise result, we must analyse configurations close to \(X_{2m}\) more carefully. What is needed is precisely the following result which, together with what we have already said in this section, completes the proof of Theorem 2.4. Recall from Sect. 2 the examples of Böröczky.

Proposition 8.2

There is an absolute constant \(C\) such that the following is true. Suppose that \(P\) differs from \(X_{2m}\) in at most \(K\) points, and that \(P\) spans at most \(2m - CK\) ordinary lines. Then \(P\) is a Böröczky example or a near-Böröczky example.

We now prove this proposition. Suppose that \(P\) differs from \(X_{2m}\) in at most \(K\) points. Suppose first of all that \(P\) contains a point \(p\) outside \(X_{2m}\), that \(p\) does not lie on the line at infinity and that \(p \ne [0,0,1]\). Then by Corollary 7.6, the \(m\) tangent lines to the unit circle, as well as at least \(m - O(1)\) of the lines \(\overline{\{p,x\}}\) connecting \(p\) to a point \(x \in X_{2m}\), pass through precisely \(2\) points of \(X_{2m} \cup \{p\}\). It is clear that the addition/deletion of \(K\) points other than \(p\) cannot add or delete points on more than \(O(K)\) of these lines, and so \(P\) spans at least \(2m - O(K)\) ordinary lines.

Now suppose that \(P\) contains an additional point \(p\) on the line at infinity. Then the \(m\) tangent lines to the \(m\)th roots of unity, as well as the at least \(m - 2\) lines \({\overline{\{p,x\}}}\), \(x\) an \(m\)th root of unity, which are not tangent to the unit circle, contain precisely two points of \(X_{2m} \cup \{p\}\). Once again the addition/deletion of \(K\) points other than \(p\) cannot add or delete points on more than \(O(K)\) of these lines, and so again \(P\) spans at least \(2m - O(K)\) ordinary lines.

We have now reduced matters to the case \(P \subset X_{2m} \cup \{[0,0,1]\}\). Starting from \(X_{2m}\), the omission of a point or the addition of \([0,0,1]\) creates a certain number of new lines with precisely two points, and of course no point other than \([0,0,1]\) or the omitted point is on more than one of these new lines. By inspection in any case there are always at least \(m/2 - O(1)\) of these lines, and so there are at least \(2m - O(K)\) ordinary lines unless we do at most one of the operations of adding \([0,0,1]\) or removing a point of \(X_{2m}\). At this point a short inspection of the possibilities leads to the conclusion that the Böröczky examples and the near-Böröczky examples are the only ones which do not have at least \(2m - O(1)\) ordinary lines. This, at last, concludes the proof of Theorem 2.4. \(\square \)

Remark

We relied on Corollary 7.6, which depended on the result of Poonen and Rubinstein [29]. For the purposes of proving the Dirac–Motzkin conjecture for large \(n\), the somewhat easier Proposition B.2 is sufficient.

9 The Orchard Problem

In this section we establish Theorem 1.3, the statement that a set of \(n\) points in the plane contains no more than \(\lfloor \frac{1}{6} n (n-3) \rfloor + 1\) \(3\)-rich lines when \(n\) is sufficiently large. The sharpness of this bound was established in Proposition 2.6.

If \(N_k\) is the number of lines containing precisely \(k\) points of \(P\) then, by double-counting pairs of points in \(P\), we have

$$\begin{aligned} \sum _{k \geqslant 2} \binom{k}{2} N_k = \binom{n}{2}. \end{aligned}$$
(9.1)

From this it follows that if \(N_3 > \lfloor \frac{1}{6} n(n-3) \rfloor + 1\) then

$$\begin{aligned} N_2 + \sum _{k \geqslant 4} \binom{k}{2} N_k \leqslant n, \end{aligned}$$
(9.2)

from which we conclude that \(N_2\), the number of ordinary lines spanned by \(P\), is at most \(n\). Furthermore no line contains more than \(O(\sqrt{n})\) points.
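The passage from (9.1) to (9.2) is a short computation, which can be checked mechanically. The sketch below evaluates the largest value that the left-hand side of (9.2) can take when \(N_3\) exceeds the orchard bound by at least one, and the largest \(k\) with \(\binom{k}{2} \leqslant n\); the function names are hypothetical.

```python
from math import comb


def orchard_bound(n):
    """The quantity floor(n(n-3)/6) + 1 from Theorem 1.3."""
    return n * (n - 3) // 6 + 1


def leftover_pairs(n):
    """Upper bound for N_2 + sum_{k>=4} C(k,2) N_k, from (9.1), if N_3 >= orchard_bound(n) + 1."""
    return comb(n, 2) - 3 * (orchard_bound(n) + 1)


def max_line_size(n):
    """Largest k with C(k,2) <= n, bounding the size of a line counted in (9.2)."""
    k = 2
    while comb(k + 1, 2) <= n:
        k += 1
    return k
```

Since \(n(n-3)\) is congruent to \(0\) or \(4\) modulo \(6\), the leftover is in fact at most \(n - 4\), and any \(k\) with \(\binom{k}{2} \leqslant n\) satisfies \(k \leqslant 1 + \sqrt{2n}\), consistent with the \(O(\sqrt{n})\) bound.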

We may now apply Theorem 1.5, our structure theorem for sets with few ordinary lines. Since no line contains more than \(O(\sqrt{n})\) points of \(P\) we see that in fact only option (iii) of that theorem can occur, that is to say \(P\) differs in \(O(1)\) points from a coset \(H \oplus x, 3x \in H\), of a subgroup \(H\) of some irreducible cubic curve \(\gamma \), which is either an elliptic curve or (the smooth points of) an acnodal singular curve. The rest of the analysis is straightforward but a little tedious.

Suppose that \(3x = h_0\). As in the proof of Lemma 8.1, the tangent line to \(\gamma \) at \(h \oplus x\) meets \(H \oplus x\) in the point \((-2h \ominus h_0) \oplus x\), which is distinct from the first point unless \(3h \oplus h_0 = 0\). There are at most \(O(1)\) of these. In creating \(P\) from \(H \oplus x\) by the addition/deletion of \(O(1)\) points, at most \(O(1)\) of these lines are affected.

Since \(P\) spans at most \(n\) ordinary lines, it follows that \(P\) contains only \(O(1)\) ordinary lines other than these tangent lines. Furthermore, since we now know that \(P\) contains at least \(n - O(1)\) ordinary lines, that is to say \(N_2 = n + O(1)\), we conclude from (9.2) that \(N_4 = O(1)\). We are going to conclude that \(P = H \oplus x\), a statement whose proof we divide into three parts.

Claim 1 There is no point of \(P\) off the curve \(\gamma \). If \(p\) is such a point, all except \(O(1)\) of the lines joining \(p\) to points of \(P \cap \gamma \) must contain precisely two points of \(P \cap \gamma \), or else there would be too many lines containing \(p\) with \(2\) or \(4\) points of \(P\). Note that this cannot happen if \(\gamma \) is the smooth points of an acnodal singular cubic curve and \(p\) is the isolated singular point, since every line through \(p\) meets at most one point of \(\gamma \); thus \(p\) lies outside of the cubic curve containing \(\gamma \). Consider the lines \(\ell \) from \(p\) to \(P \cap \gamma \) which are not tangent to \(\gamma \) and which contain precisely two points of \(P \cap \gamma \) and precisely two points of \(H \oplus x\), these points being the same. Since \(P \cap \gamma \) differs from \(H \oplus x\) in \(O(1)\) points, all except \(O(1)\) of the lines from \(p\) have this property. But any line not tangent to \(\gamma \) and containing the two points \(h \oplus x\) and \(h^{\prime } \oplus x\) also contains \(-(h \oplus h^{\prime } \oplus h_0) \oplus x\), a third point of \(H \oplus x\). This is a contradiction.

Claim 2 There is no point of \(P\) outside the set \(H \oplus x\). Suppose that \(k \oplus x\) is such a point. Then if \(h \in H\), the line joining \(k \oplus x\) and \(h \oplus x\) meets \(\gamma \) again at \(-(k \oplus h \oplus h_0) \oplus x\), which is not a point of \(H \oplus x\). This point can thus only lie in \(P\) for \(O(1)\) values of \(h\), and hence there are \(n - O(1)\) ordinary lines of \(P\) emanating from \(k \oplus x\). In addition to the \(n - O(1)\) tangent lines, this gives at least \(2n - O(1)\) ordinary lines in \(P\), a contradiction.

Claim 3 \(P\) contains all of \(H \oplus x\). Suppose that \(h_* \oplus x\) is a point of \(H \oplus x\) not contained in \(P\). For all except \(O(1)\) values of \(h\), the points \(h \oplus x\) and \(-(h \oplus h_* \oplus h_0) \oplus x\) lie in \(P\), and the line joining \(h_* \oplus x\) to them is not tangent to \(\gamma \). All such lines then contain precisely two points of \(P\), and once again we obtain \(n - O(1)\) ordinary lines to add to the \(n - O(1)\) tangent lines we already have. Once again a contradiction ensues.

We have now shown that if \(P\) is a set of \(n\) points in the plane with \(N_3\), the number of lines containing precisely \(3\) points of \(P\), satisfying \(N_3 > \lfloor \frac{1}{6} n(n-3) \rfloor + 1\), then \(P\) is a coset \(H \oplus x\) on \(\gamma \), an elliptic curve or the smooth points of an acnodal cubic, with \(3x \in H\). But by Proposition 2.6 we have \(N_3 \leqslant \lfloor \frac{1}{6} n(n-3) \rfloor + 1\) in any such case, and we are done.
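The bound quoted from Proposition 2.6 can be illustrated in the same toy model as before. The sketch assumes that \(H\) is cyclic of order \(n\), labels \(h \oplus x\) by \(h \in \mathbb Z /n\mathbb Z \), and uses the fact that three distinct points of the coset are collinear precisely when \(h_1 \oplus h_2 \oplus h_3 \oplus h_0 = 0\); the non-cyclic case is not treated and the function name is our own.

```python
from itertools import combinations


def three_rich_lines(n, h0):
    """Number of sets {h1, h2, h3} of distinct labels in Z/n with h1 + h2 + h3 + h0 = 0."""
    count = 0
    for h1, h2 in combinations(range(n), 2):
        h3 = (-h1 - h2 - h0) % n
        # Requiring h1 < h2 < h3 counts each collinear triple exactly once.
        if h3 > h2:
            count += 1
    return count
```

In this model the count never exceeds \(\lfloor \frac{1}{6} n(n-3) \rfloor + 1\), with equality when \(h_0 = 0\), that is, when the coset is the subgroup itself.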

Remarks

Note that we have in fact classified (for large \(n\)) the optimal configurations in the orchard problem as coming from cosets in elliptic curves or acnodal cubics. We note that nothing like the full force of Theorem 1.5 is required for the orchard problem (as opposed to the Dirac–Motzkin conjecture). Once the much weaker Proposition 5.3 is established, we can immediately rule out possibilities (ii) and (iii) of that proposition and hence do away with all of the material in Sect. 6 and some of the material in Sect. 7 too.