1 Introduction

The Euclidean Steiner problem asks for a tree of shortest total length that interconnects a given collection of points or terminals in Euclidean space. For example, to interconnect the four vertices of a square in the plane, a shortest tree contains two further points apart from the four terminals (Fig. 1). Such a shortest tree is called a minimum Steiner tree on the given collection of terminals, and the additional points are called Steiner points. The Steiner problem is well studied, especially in the plane. An overview of the extensive literature on this problem can be found in the monographs of Hwang, Richards and Winter [1], Cieslik [2], Prömel and Steger [3], and the recent Brazil and Zachariasen [4]. For more on the history of the problem, see Boltyanski, Martini, and Soltan [5] and the recent Brazil, Graham, Thomas, and Zachariasen [6].

It is well known that a minimum Steiner tree in Euclidean space has maximum degree three, that the Steiner points always have degree three, and that each angle spanned by two edges with a common endpoint is at least \(120^{\circ }\), and exactly \(120^{\circ }\) at each Steiner point [1, Section 6.1]. In the plane, there is a ruler-and-compass construction of a minimum Steiner tree once the graph structure (or topology) is known. This construction, also known as the Melzak algorithm [7], can be done in linear time [8]. On the other hand, determining the topology of a minimum Steiner tree is hard. There is a super-exponential number of different topologies [9], and it is already NP-hard to decide whether a given set of points in the plane has a Steiner tree of length smaller than a given length [10]. Nevertheless, the GeoSteiner package of Warme, Winter and Zachariasen quickly finds minimum Steiner trees on a relatively large number of points in the plane [11].
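As a small illustration of the angle condition, the following sketch checks the familiar tree on the vertices of a unit square (the example of Fig. 1). The coordinates, the helper names `dist` and `angle`, and the choice of the unit square are assumptions made for this illustration only.

```python
import math

def dist(p, q):
    """Euclidean distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def angle(x, y, z):
    """Convex angle at x spanned by the segments xy and xz, in radians."""
    v = (y[0] - x[0], y[1] - x[1])
    w = (z[0] - x[0], z[1] - x[1])
    c = (v[0] * w[0] + v[1] * w[1]) / (math.hypot(*v) * math.hypot(*w))
    return math.acos(max(-1.0, min(1.0, c)))

# Terminals: the vertices of the unit square (an illustrative choice).
a, b, c, d = (0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)

# Two additional points on the horizontal midline, at distance
# h = 1/(2*sqrt(3)) from the left and right sides of the square.
h = 1.0 / (2.0 * math.sqrt(3.0))
s1, s2 = (h, 0.5), (1.0 - h, 0.5)

edges = [(a, s1), (d, s1), (s1, s2), (s2, b), (s2, c)]
steiner_length = sum(dist(p, q) for p, q in edges)

# The three angles at s1 (by symmetry the same values occur at s2).
angles_at_s1 = [angle(s1, a, d), angle(s1, a, s2), angle(s1, d, s2)]

# Shortest tree that uses only the four terminals: three sides of the square.
spanning_length = 3.0
```

Under these assumptions the tree has length \(1+\sqrt{3}\), which is shorter than 3, and all angles at the two additional points equal \(120^{\circ }\).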

There are polynomial time approximation schemes to calculate minimum Steiner trees in Euclidean space (Arora [12] and Mitchell [13]; see also [14]). However, for the actual implementation of these schemes, there has been progress so far only for certain planar problems [15]. A major obstacle in the implementation of these schemes for higher-dimensional problems is that their time complexity depends doubly exponentially on the dimension, and there is some evidence that this is unavoidable [16].

Fig. 1 Minimum Steiner tree (in red) of the vertices of a square

In higher dimensions, the Steiner points are not necessarily constructible, and finding the optimal Steiner points results in solving high-degree algebraic equations, or solving a convex optimisation problem numerically [17]. See the papers [9, 17–23] for work on finding minimum Steiner trees in Euclidean spaces of dimension at least 3. We mention that Steiner trees in 3-space have been considered in theoretical investigations of multiquarks in particle physics [24] and in higher dimensions have been used to determine phylogenetic trees [25].

One problem arising from a numerical approach is that of estimating how close an approximation is to a locally minimum Steiner tree with a given Steiner topology. Rubinstein et al. [22] studied the relative error in the length of an approximate Steiner tree in terms of how far the angles at Steiner points deviate from \(120^{\circ }\). This paper is a further contribution to this topic.

Before we can give an exact definition of the relative error, we introduce our terminology and notation in Sect. 2. Then, in Sect. 3 we define the relative error and formulate the main conjectures from [22]. Our results are stated and summarised in Sect. 4. Sect. 5 is a brief discussion of the monotonicity of the relative error as the number of terminals increases. In Sect. 6, we prove our results for large relative errors. For small relative errors, we subdivide the proofs into a section on upper bounds (Sect. 7) and a section on lower bounds (Sect. 8). We conclude in Sect. 9 with some remarks. Two tedious induction proofs of results in Sect. 8 are presented in the Appendix.

2 Terminology

We define a Steiner topology for n terminals to be a tree \({\mathcal T}\) with n special vertices \(t_1,\ldots ,t_n\), called terminals, all of degree at most 3, and all other vertices, called Steiner points, of degree exactly 3. A Steiner topology is full if all terminals have degree 1. Let \(N=\{p_1,\ldots ,p_n\}\) be a family of n points in \(\mathbb {R}^d\) (allowing repeated points). A Steiner tree T for N, with topology \({\mathcal T}\), is a representation of \({\mathcal T}\) in \(\mathbb {R}^d\), with each \(t_i\) represented by \(p_i\), each Steiner point of \({\mathcal T}\) represented by an arbitrary point of \(\mathbb {R}^d\), and edges represented by straight-line segments. We say that such a Steiner tree interconnects N. A Steiner tree is full if its topology is full. We allow Steiner points to coincide with each other and with terminals, and hence edges incident to a Steiner point to have length 0. An edge of length 0 is called degenerate, and we say that a Steiner tree that contains a degenerate edge is degenerate. We allow edges to intersect each other.

The (convex) angle determined by two edges xy and xz with a common endpoint x is denoted \(\sphericalangle yxz\). Its angular measure is also denoted by \(\sphericalangle yxz\), and we assume that angular measures are in the interval \([0,\pi ]\). We use radians for angular measure throughout the paper, except in a few places, where it will be clear that we use degrees.

We denote the Euclidean length of an edge pq by \(|pq|\). The length L(T) of a tree T is the sum of the Euclidean lengths of its edges. Among all the trees that interconnect a given set N of terminals there is at least one tree of minimum length, which we call a minimum Steiner tree of N. We define a locally minimum Steiner tree to be a non-degenerate tree with a Steiner topology and with all angles spanned by the edges at each vertex at least \(2\pi /3\). Since each Steiner point in a Steiner topology has degree 3, it easily follows (in any dimension) that each of the three angles at a Steiner point is exactly \(2\pi /3\) and that the three edges incident to the Steiner point are coplanar. As mentioned above, any minimum Steiner tree is a locally minimum Steiner tree. A full minimum Steiner tree is a minimum Steiner tree that is also full.

We denote the largest integer not greater than x by \(\lfloor x\rfloor \).

3 Formulation of the Problem, Conjectures and Previous Results

In [22] the following notions were introduced. Let \(\varepsilon \geqslant 0\) be given. An \(\varepsilon \)-approximate Steiner tree is a tree with a Steiner topology, with all the angles spanned by the edges at each Steiner point belonging to the interval \([2\pi /3-\varepsilon ,2\pi /3+\varepsilon ]\). Note that a 0-approximate Steiner tree is the same as a locally minimum Steiner tree. (In [22] a distinction was made between a pseudo-Steiner point of an \(\varepsilon \)-approximate Steiner tree and a Steiner point of a locally minimum Steiner tree. For the sake of simplicity we make no such distinction and use the term Steiner point for both.)
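The definition can be turned into a direct check. The sketch below computes the smallest \(\varepsilon \) for which a given tree is \(\varepsilon \)-approximate, assuming a non-degenerate tree (no edges of length 0). The representation of the tree by a dictionary of coordinates, an edge list and a set of Steiner labels, as well as the function names, are hypothetical choices for this illustration.

```python
import math
from itertools import combinations

def angle(x, y, z):
    """Convex angle at x spanned by the segments xy and xz (any dimension)."""
    v = [b - a for a, b in zip(x, y)]
    w = [b - a for a, b in zip(x, z)]
    dot = sum(p * q for p, q in zip(v, w))
    nv = math.sqrt(sum(p * p for p in v))
    nw = math.sqrt(sum(q * q for q in w))
    return math.acos(max(-1.0, min(1.0, dot / (nv * nw))))

def angle_deviation(points, edges, steiner):
    """Largest deviation from 2*pi/3 over all angles at Steiner points.

    points  : dict mapping a vertex label to its coordinates
    edges   : list of pairs of vertex labels
    steiner : set of labels of the Steiner points
    Assumes that no edge has length 0.
    """
    neighbours = {v: [] for v in points}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    eps = 0.0
    for s in steiner:
        for u, v in combinations(neighbours[s], 2):
            theta = angle(points[s], points[u], points[v])
            eps = max(eps, abs(theta - 2.0 * math.pi / 3.0))
    return eps

# Illustrative tree on three terminals with one Steiner point at the origin.
points = {"s": (0.0, 0.0), "t1": (1.0, 0.0), "t2": (0.0, 1.0), "t3": (-1.0, -1.0)}
edges = [("s", "t1"), ("s", "t2"), ("s", "t3")]
eps = angle_deviation(points, edges, {"s"})
```

In this example the angles at the Steiner point are \(90^{\circ }\), \(135^{\circ }\) and \(135^{\circ }\), so the tree is \(\varepsilon \)-approximate exactly when \(\varepsilon \geqslant \pi /6\).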

For \(d\geqslant 2, n\geqslant 3\) and \(\varepsilon \geqslant 0\), let \({\mathcal A}_\varepsilon ^d(n)\) denote the set of all full \(\varepsilon \)-approximate Steiner trees on n terminals in \(\mathbb {R}^d\), and let \(\overline{{\mathcal A}}_\varepsilon ^d(n)\) denote the subset of all \(T\in {\mathcal A}_\varepsilon ^d(n)\) for which the terminals have a minimum Steiner tree with the same topology as T. In particular, \({\mathcal A}_0^d(n)\) is the set of all full locally minimum Steiner trees on n terminals in \(\mathbb {R}^d\), and \(\overline{{\mathcal A}}_0^d(n)\) is the set of all full minimum Steiner trees on n terminals in \(\mathbb {R}^d\).

Given a tree T in \(\mathbb {R}^d\) with Steiner topology \({\mathcal T}\), let S(T) denote the shortest tree in \(\mathbb {R}^d\) on the terminals of T with topology \({\mathcal T}\), where we allow degenerate shortest trees. Even though S(T) is not necessarily a Steiner tree (see, for instance, [4, Figure 1.7]), it can be shown that S(T) is always unique [9, Section 4].

Rubinstein, Weng and Wormald [22] defined the following two quantities:

$$\begin{aligned} F_d(\varepsilon ,n) := \sup \left\{ \frac{L(T)-L(S(T))}{L(S(T))}:T\in {\mathcal A}_\varepsilon ^d(n)\right\} \end{aligned}$$

and

$$\begin{aligned} \overline{F}_d(\varepsilon ,n) := \sup \left\{ \frac{L(T)-L(S(T))}{L(S(T))}:T\in \overline{{\mathcal A}}_\varepsilon ^d(n)\right\} , \end{aligned}$$

and made the following conjectures in the case \(d\geqslant 3\). Although they did not consider the two-dimensional case, we include it, as it is also still open, and most of our results will be in the plane.

Conjecture 3.1

For any \(d\geqslant 2\) there exist \(\varepsilon _0>0\) and \(C_d>0\) such that for all \(\varepsilon \in ]0,\varepsilon _0[\) and \(n\in \mathbb {N}, F_d(\varepsilon ,n) < C_d\varepsilon \).

Conjecture 3.2

For any \(d\geqslant 2\) there exist \(\varepsilon _0>0\) and \(C_d>0\) such that for all \(\varepsilon \in ]0,\varepsilon _0[\) and \(n\in \mathbb {N}, \overline{F}_d(\varepsilon ,n) < C_d\varepsilon \).

The second conjecture is weaker than the first, but it seems difficult to deduce an upper bound for \(\overline{F}_d\) that cannot already be deduced for \(F_d\). Rubinstein, Weng and Wormald [22] showed that for \(\varepsilon <1/n^2, F_d(\varepsilon ,n)\leqslant C_d (\varepsilon \log n + \varepsilon ^2 n^3)\). They also consider larger values of \(\varepsilon \).
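To make the relative error \((L(T)-L(S(T)))/L(S(T))\) concrete, the following sketch evaluates it for a single planar tree on three terminals. Here S(T) is the star centred at the Fermat point of the terminals, which the sketch approximates with Weiszfeld's iteration; this is a generic numerical method swapped in for illustration, not a method used in this paper. The sketch assumes that all angles of the terminal triangle are smaller than \(2\pi /3\), so that S(T) is non-degenerate. All function names and the sample coordinates are hypothetical.

```python
import math

def star_length(centre, terminals):
    """Total length of the star joining centre to each terminal."""
    return sum(math.dist(centre, t) for t in terminals)

def fermat_point(terminals, iters=1000):
    """Approximate the point minimising the sum of distances to the terminals.

    Plain Weiszfeld iteration started at the centroid. Assumes that the
    minimiser is not a terminal (all triangle angles below 2*pi/3).
    """
    n = len(terminals)
    x = (sum(t[0] for t in terminals) / n, sum(t[1] for t in terminals) / n)
    for _ in range(iters):
        num_x = num_y = den = 0.0
        for t in terminals:
            d = math.dist(x, t)
            if d < 1e-15:
                return t  # iterate hit a terminal; outside the assumed case
            num_x += t[0] / d
            num_y += t[1] / d
            den += 1.0 / d
        x = (num_x / den, num_y / den)
    return x

def relative_error(terminals, steiner):
    """(L(T) - L(S(T))) / L(S(T)) for the star T centred at steiner."""
    length_T = star_length(steiner, terminals)
    length_S = star_length(fermat_point(terminals), terminals)
    return (length_T - length_S) / length_S

terminals = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]   # sample terminals
length_S = star_length(fermat_point(terminals), terminals)
err = relative_error(terminals, (1.5, 1.0))        # sample Steiner point
```

For this triangle the computed value of L(S(T)) can be compared with the classical closed form \(\sqrt{(a^2+b^2+c^2)/2+2\sqrt{3}\,\varDelta }\) for the Fermat length, where a, b, c are the side lengths and \(\varDelta \) is the area.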

4 Overview of New Results

Our results are summarised in Table 1. Our first main result is an upper bound for the relative error in the plane.

Table 1 Summary of results

Theorem 4.1

If \(n\geqslant 3\) and \(0<\varepsilon <\pi /(n-2)\), then

$$\begin{aligned} F_2(\varepsilon ,n)\leqslant \frac{1}{\cos \tfrac{(n-2)\varepsilon }{2}}-1. \end{aligned}$$

The proof is in Sect. 7. As a consequence, Conjecture 3.1 holds in the plane if \(\varepsilon \) is sufficiently small, depending on n.

Corollary 4.1

If \(0<\varepsilon <\pi /(n-2)\), then \(F_2(\varepsilon ,n) = O(n^2\varepsilon ^2)\). Consequently, if \(\varepsilon = O(1/n^2)\) as \(n\rightarrow \infty \), then \(F_2(\varepsilon ,n)=O(\varepsilon )\).
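The bound of Theorem 4.1 and its quadratic behaviour are easy to evaluate numerically. The sketch below compares the bound with the leading Taylor term \((n-2)^2\varepsilon ^2/8\) of \(1/\cos x-1\) at \(x=(n-2)\varepsilon /2\); the function name and the sample values of n and \(\varepsilon \) are assumptions made for this illustration.

```python
import math

def planar_upper_bound(eps, n):
    """Right-hand side of Theorem 4.1: 1/cos((n-2)*eps/2) - 1."""
    if n < 3 or not (0.0 < eps < math.pi / (n - 2)):
        raise ValueError("requires n >= 3 and 0 < eps < pi/(n-2)")
    return 1.0 / math.cos((n - 2) * eps / 2.0) - 1.0

n = 50
eps = 1.0 / n**2                      # sample value with eps = O(1/n^2)
bound = planar_upper_bound(eps, n)

# Leading term of sec(x) - 1 = x^2/2 + ... at x = (n-2)*eps/2.
leading_term = ((n - 2) * eps) ** 2 / 8.0
```

With these sample values the bound is about \(4.6\times 10^{-5}\), which is indeed smaller than \(\varepsilon =4\times 10^{-4}\), in line with Corollary 4.1.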

In [22] an example is given, which shows that Conjecture 3.1 is sharp for each \(d\geqslant 3\). Our second main result, Theorem 4.2, is a lower bound for \(F_2\), which shows that Conjecture 3.1 is already sharp in the plane for sufficiently small \(\varepsilon \).

Theorem 4.2

For any \(k\geqslant 1\), if \(\varepsilon =c/k^2\) with \(0<c<1\), then

$$\begin{aligned} F_2(\varepsilon ,2^k+1) > \frac{c}{24}\varepsilon . \end{aligned}$$

Consequently, if \(\varepsilon <(\log _2 n)^{-2}\), then \(F_2(\varepsilon ,n) =\varOmega ((\log n)^2\varepsilon ^2)\).

The proof is in Sect. 8. In Sect. 6, we show some bounds for larger \(\varepsilon \).

In the above definition of \(F_d\), we consider the worst-case relative error between a full \(\varepsilon \)-approximate Steiner tree T on n terminals and the shortest tree S(T) with the same topology as T, even though S(T) may have a degenerate topology. Instead, we could restrict ourselves to trees T, for which S(T) is non-degenerate. Note that for any \(T\in {\mathcal A}_\varepsilon ^d(n), S(T)\) is non-degenerate iff S(T) is a locally minimum Steiner tree. We therefore introduce the following variants of the previous two quantities:

$$\begin{aligned} G_d(\varepsilon ,n) := \sup \left\{ \frac{L(T)-L(S(T))}{L(S(T))}:T\in {\mathcal A}_\varepsilon ^d(n), S(T)\in {\mathcal A}_0^d(n)\right\} \end{aligned}$$

and

$$\begin{aligned} \overline{G}_d(\varepsilon ,n) := \sup \left\{ \frac{L(T)-L(S(T))}{L(S(T))}:T\in \overline{{\mathcal A}}_\varepsilon ^d(n), S(T)\in \overline{{\mathcal A}}_0^d(n)\right\} . \end{aligned}$$

Clearly, \(G_d(\varepsilon ,n)\leqslant F_d(\varepsilon ,n)\) and \(\overline{G}_d(\varepsilon ,n)=\overline{F}_d(\varepsilon ,n)\). The construction that we make to prove the lower bounds of Theorem 4.2 in fact gives a lower bound for \(G_2(\varepsilon ,n)\) for certain values of n, as in Theorem 4.3.

Theorem 4.3

For any \(k\geqslant 1\), if \(\varepsilon =c/k^2\) with \(0<c<1\), then

$$\begin{aligned} G_2(\varepsilon ,2^k+1) > \frac{c}{24}\varepsilon . \end{aligned}$$

Unfortunately we do not know whether \(G_2(\varepsilon ,n)\) is monotone in n (see Sect. 5), so we cannot state a lower bound for general n.

5 Monotonicity of \(F_d\) and \(G_d\)

In many of the examples constructed in this paper, the number of terminals is of a special form such as a power of 2. In order to make general statements for all n, we need to know that \(F_d\) and \(G_d\) are monotone in n. Monotonicity in \(\varepsilon \) and in d are straightforward. Indeed, if \(0\leqslant \varepsilon _1 < \varepsilon _2\), then an \(\varepsilon _1\)-approximate Steiner tree is also an \(\varepsilon _2\)-approximate Steiner tree, hence \(F_d(\varepsilon _1,n)\leqslant F_d(\varepsilon _2,n), G_d(\varepsilon _1,n)\leqslant G_d(\varepsilon _2,n)\) and \(\overline{F}_d(\varepsilon _1,n)\leqslant \overline{F}_d(\varepsilon _2,n)\). Clearly \(F_d, G_d\) and \(\overline{F}_d\) are monotone in d:

$$\begin{aligned} F_2\leqslant F_3\leqslant \cdots ,\quad G_2\leqslant G_3\leqslant \cdots \quad \text {and}\quad \overline{F}_2\leqslant \overline{F}_3\leqslant \cdots . \end{aligned}$$

It is still relatively simple to show that \(F_d\) is also monotone in n, as we show next.

Proposition 5.1

For any \(d\geqslant 2, \varepsilon >0\) and \(n\geqslant 3, F_d(\varepsilon ,n)\leqslant F_d(\varepsilon ,n+1)\).

Proof

Consider any \(\varepsilon \)-approximate Steiner tree T with a full Steiner topology on n terminals. Let S be a shortest tree with the same terminal set and with the same (possibly degenerate) topology as T. We show that

$$\begin{aligned} F_d(\varepsilon ,n+1)\geqslant L(T)/L(S) -1. \end{aligned}\tag{1}$$

Let \(\delta >0\) be arbitrary. Modify T to obtain an \(\varepsilon \)-approximate Steiner tree \(T'\) on \(n+1\) terminals as follows. Choose any terminal t of T. It is joined to a Steiner point s of T. Let \(t_1\) and \(t_2\) be two points at distance \(\delta \) from t such that the three angles at t are equal: \(\sphericalangle t_1tt_2=\sphericalangle t_1ts=\sphericalangle t_2ts\). (Thus, \(t_1, t_2, t\) and s have to be coplanar.) If we consider \(t_1\) and \(t_2\) to be two new terminals, and consider t to be a Steiner point, then we obtain an \(\varepsilon \)-approximate Steiner tree \(T'\) on \(n+1\) terminals of length \(L(T')=L(T)+2\delta \).

We modify S by adding the edges \(t_1t\) and \(t_2t\) to obtain a tree \(S'\) with the same topology as \(T'\) (allowing degenerate topologies). Then,

$$\begin{aligned} L(S(T'))\leqslant L(S')=L(S)+2\delta , \end{aligned}$$

and

$$\begin{aligned} F_d(\varepsilon ,n+1)\geqslant \frac{L(T')}{L(S(T'))}-1\geqslant \frac{L(T)+2\delta }{L(S)+2\delta }-1. \end{aligned}$$

Since this holds for all \(\delta >0\), (1) follows. Since (1) holds for an arbitrary \(\varepsilon \)-approximate Steiner tree on n terminals,

$$\begin{aligned} F_d(\varepsilon ,n+1)\geqslant \sup L(T)/L(S)-1=F_d(\varepsilon ,n). \end{aligned}$$

\(\square \)

The monotonicity of \(G_d(\varepsilon ,n)\) in n seems to be subtler, and we have only been able to show it for \(d\geqslant 3\).

Proposition 5.2

For any \(d\geqslant 3, \varepsilon >0\) and \(n\geqslant 3, G_d(\varepsilon ,n)\leqslant G_d(\varepsilon ,n+1)\).

Proof

Let \(\delta >0\) be arbitrary. Let T be a full \(\varepsilon \)-approximate Steiner tree on n terminals in \(\mathbb {R}^d\) such that S(T) is non-degenerate (in particular, S(T) is still full). Choose any terminal t of T. It is joined to a Steiner point s in T and also to a Steiner point \(s'\) in S(T). Choose a point \(t_1\) such that \(t_1t\) is perpendicular to ts and to \(ts'\), and \(|tt_1|=\sqrt{3}\delta \). Let \(t_2\) be the unique point such that t is the midpoint of \(t_1t_2\). Without any loss of generality, \(\delta <|ts|, |ts'|\). Then, there exists a unique point \(s_2\) on st such that \(\sphericalangle t_1s_2t_2=2\pi /3\) and a unique point \(s'_2\) on \(s't\) such that \(\sphericalangle t_1s'_2t_2=2\pi /3\). Let \(T'\) be the tree obtained from T by removing t and st, and adding the Steiner point \(s_2\), terminals \(t_1\) and \(t_2\), and edges \(ss_2, t_1s_2\) and \(t_2s_2\). Then, \(T'\) is an \(\varepsilon \)-approximate Steiner tree on \(n+1\) terminals, and \(L(T')=L(T)+3\delta \). Furthermore, \(S(T')\) is the tree obtained from S(T) by removing t and \(s't\), and adding the Steiner point \(s'_2\), terminals \(t_1\) and \(t_2\), and edges \(s's'_2, t_1s'_2\) and \(t_2s'_2\). Then, \(L(S(T'))=L(S(T))+3\delta \). We conclude that

$$\begin{aligned} G_d(\varepsilon ,n+1) \geqslant \frac{L(T')}{L(S(T'))}-1 = \frac{L(T)+3\delta }{L(S(T))+3\delta }-1, \end{aligned}$$

and by letting \(\delta \rightarrow 0\) and taking the \(\sup \) of the right-hand side, the proof is finished. \(\square \)
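The local modification used in the proof of Proposition 5.2 can be verified numerically. The sketch below places t at the origin and a hypothetical neighbour s at distance 1 along the first coordinate axis, and checks that the new Steiner point \(s_2\) lies at distance \(\delta \) from t, that all three angles at \(s_2\) equal \(2\pi /3\), and that the length increases by \(3\delta \). The coordinates and the value of \(\delta \) are assumptions made for this illustration.

```python
import math

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def norm(v):
    return math.sqrt(sum(a * a for a in v))

def angle(x, y, z):
    """Convex angle at x spanned by the segments xy and xz."""
    v, w = sub(y, x), sub(z, x)
    c = sum(a * b for a, b in zip(v, w)) / (norm(v) * norm(w))
    return math.acos(max(-1.0, min(1.0, c)))

delta = 0.01
t = (0.0, 0.0, 0.0)
s = (1.0, 0.0, 0.0)                       # hypothetical neighbour, |ts| = 1 > delta

# t1 t is perpendicular to ts, |t t1| = sqrt(3)*delta, t is the midpoint of t1 t2.
t1 = (0.0, math.sqrt(3.0) * delta, 0.0)
t2 = (0.0, -math.sqrt(3.0) * delta, 0.0)

# The point on the segment st at distance delta from t.
s2 = (delta, 0.0, 0.0)

angles_at_s2 = [angle(s2, t1, t2), angle(s2, t1, s), angle(s2, t2, s)]

old_length = norm(sub(s, t))                                   # edge st
new_length = norm(sub(s, s2)) + norm(sub(s2, t1)) + norm(sub(s2, t2))
gain = new_length - old_length                                 # expected: 3*delta
```

Since the same computation applies with \(s'\) in place of s, both L(T) and L(S(T)) grow by exactly \(3\delta \), as stated in the proof.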

We have not been able to show that \(\overline{F}_d(\varepsilon ,n)=\overline{G}_d(\varepsilon ,n)\) is monotone in n. We are also not sure whether \(G_2(\varepsilon ,n)\leqslant G_2(\varepsilon ,n+1)\) or \(\overline{F}_2(\varepsilon ,n)\leqslant \overline{F}_2(\varepsilon ,n+1)\) always hold.

6 Results for Large \(\varepsilon \)

This section contains upper and lower bounds for \(F_d\) for values of \(\varepsilon \) that are independent of n. In Proposition 6.1 we obtain the modest upper bound of \(2n-4\) for \(F_2(\varepsilon ,n)\), as long as \(\varepsilon \leqslant \pi /6\). We do not know of any better upper bound in the plane for small and fixed \(\varepsilon \). In Theorem 6.1 we give an explicit upper bound for \(F_d(\varepsilon ,n)\) for all values of \(\varepsilon <2\pi /3\). For instance, we obtain \(F_d(\varepsilon ,n)\leqslant O\left( \left( 2/\sqrt{3}+\varepsilon \right) ^n\right) \) for small \(\varepsilon \).

Theorem 6.2 sharpens Lemma 2.2 of [22] in the range \(\varepsilon \in ]\pi /3,2\pi /3[\) by giving a lower bound for \(F_d\) for all \(d\geqslant 2\) of the form \(n^{\alpha (\varepsilon )}\), where \(\alpha (\varepsilon )\) is an explicit function of \(\varepsilon \). In particular, it will follow that, if \(\varepsilon >105.6\ldots ^\circ \), then \(\alpha (\varepsilon )>2\); hence, the lower bound grows super-quadratically. This indicates that Theorem 2.1 of [22] can only hold if \(\varepsilon \) is sufficiently small. We also obtain a lower bound for \(\varepsilon =\pi /3\) of the form \(\varOmega (\log n)\).

Proposition 6.1

If \(\varepsilon \leqslant \pi /6\) and \(n\geqslant 3\), then \(F_2(\varepsilon ,n)\leqslant 2n-4\).

Proof

Since \(2\pi /3-\varepsilon \geqslant \pi /2\), it follows that each Steiner point of an \(\varepsilon \)-approximate Steiner tree T is in the convex hull of its neighbours. It easily follows that each Steiner point is in the convex hull K of the terminals. Therefore, each edge of T has length at most \({{\mathrm{diam}}}K\). Since T has \(2n-3\) edges, and any Steiner tree on the terminals has length at least \({{\mathrm{diam}}}K\), it follows that \(L(T)/L(S(T))\leqslant 2n-3\), hence \(F_2(\varepsilon ,n)\leqslant 2n-4\). \(\square \)

We will often use the following reverse triangle inequality.

Lemma 6.1

In \(\triangle abc\),

$$\begin{aligned} |ab|+|bc| \leqslant \frac{|ac|}{\cos (\theta /2)}, \end{aligned}$$

where \(\theta \) is the exterior angle at b.

Proof

Let the angular measures of the interior angles of \(\triangle abc\) at a, b, c be \(\alpha , \beta , \gamma \), respectively, so that \(\theta =\pi -\beta =\alpha +\gamma \). By the sine rule,

$$\begin{aligned} \frac{|ab|+|bc|}{|ac|}&= \frac{\sin \gamma }{\sin \beta }+\frac{\sin \alpha }{\sin \beta } = \frac{\sin \alpha +\sin \gamma }{\sin \theta } = \frac{2\sin \left( \frac{\alpha +\gamma }{2}\right) \cos \left( \frac{\alpha -\gamma }{2}\right) }{\sin \theta }\\&\leqslant \frac{2\sin \left( \frac{\alpha +\gamma }{2}\right) }{\sin \theta } = \frac{2\sin (\theta /2)}{2\sin (\theta /2)\cos (\theta /2)} = \frac{1}{\cos (\theta /2)}. \end{aligned}$$

\(\square \)
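As a sanity check of Lemma 6.1, the following sketch evaluates the slack \(|ac|/\cos (\theta /2)-(|ab|+|bc|)\) on a sample of triangles, parametrised by the two side lengths at b and the interior angle \(\beta \) at b. The parametrisation, the sampling ranges and the function name are assumptions made for this illustration. The slack should never be negative, and it vanishes when \(|ab|=|bc|\), that is, when \(\alpha =\gamma \).

```python
import math
import random

def reverse_triangle_gap(ab, bc, beta):
    """Return |ac|/cos(theta/2) - (|ab| + |bc|).

    The triangle has b at the origin, side lengths ab and bc at b, and
    interior angle beta at b; theta = pi - beta is the exterior angle at b.
    """
    a = (ab, 0.0)
    c = (bc * math.cos(beta), bc * math.sin(beta))
    ac = math.dist(a, c)
    theta = math.pi - beta
    return ac / math.cos(theta / 2.0) - (ab + bc)

rng = random.Random(0)  # fixed seed so that the sample is reproducible
gaps = [
    reverse_triangle_gap(rng.uniform(0.5, 2.0), rng.uniform(0.5, 2.0), rng.uniform(0.2, 3.0))
    for _ in range(1000)
]
smallest_gap = min(gaps)

# Isosceles case |ab| = |bc|: the inequality holds with equality.
isosceles_gap = reverse_triangle_gap(1.0, 1.0, 2.0)
```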

We define a cherry of a Steiner topology \({\mathcal T}\) to be a subgraph of \({\mathcal T}\), consisting of two terminals with a common Steiner point. It is easy to see that any Steiner topology on at least 3 terminals has at least two cherries. We will later use the fact that for any terminal t there exists a cherry with two terminals not equal to t (to see this, note that in the subtree of \({\mathcal T}\) on the Steiner points, there are at least two leaves, unless \(n=3\)).

Lemma 6.2

Let T be an \(\varepsilon \)-approximate Steiner tree in \(\mathbb {R}^d\), where \(0\leqslant \varepsilon <2\pi /3\).

  (i) For any cherry with terminals \(t_1\) and \(t_2\) and Steiner point s,

    $$\begin{aligned} |st_1|+|st_2|\leqslant |t_1t_2|/\sin (\pi /3-\varepsilon /2). \end{aligned}$$

  (ii) If D is the diameter of the set of terminals, then for any terminal t and Steiner point s,

    $$\begin{aligned}|ts|\leqslant D\cos (\varepsilon /2)/\sin (\pi /3-\varepsilon /2). \end{aligned}$$

Proof

For the first statement, we use Lemma 6.1:

$$\begin{aligned} \frac{|st_1|+|st_2|}{|t_1t_2|} \leqslant \frac{1}{\cos \frac{1}{2}(\pi -\sphericalangle t_1st_2)} = \frac{1}{\sin \frac{1}{2}\sphericalangle t_1st_2} \leqslant \frac{1}{\sin (\pi /3-\varepsilon /2)}. \end{aligned}$$

For the second statement, consider the plane \(\varPi \) through t and the terminals \(t_1, t_2\) of a cherry (if these points are collinear, choose any plane through them). Let o be the midpoint of \(t_1t_2\). Let \(C_i\) be the circle with centre \(t_i\) and radius D. Denote the half plane bounded by \(t_1t_2\) and containing t by H. Let p be the point where \(C_1\) and \(C_2\) intersect in H. Without any loss of generality, t is inside the angle \(\sphericalangle pot_2\).

Fig. 2 Proof of Lemma 6.2

First, suppose that \(\varepsilon \leqslant \pi /6\) (Fig. 2a). Let c be the point on the line op in the half plane H such that \(\sphericalangle ct_1t_2=\pi /6-\varepsilon \). Let \(t'\) be the point where the ray from c through t intersects \(C_1\). Then, \(|ct|\leqslant |ct'|\leqslant |cp|\) (Euclid III.7 [26]). Let C be the circle with centre c that passes through \(t_1\) and \(t_2\), and let it intersect the line op in the half plane opposite H in q. Then, for any point \(x\in C\cap H, \sphericalangle t_1xt_2=\pi /3+\varepsilon \) and for any \(x\in C{\setminus } H, \sphericalangle t_1xt_2=2\pi /3-\varepsilon \). Since \(\pi /3+\varepsilon <2\pi /3-\varepsilon \leqslant \sphericalangle t_1st_2, s\) is in the ball B with centre c that passes through \(t_1\) and \(t_2\). In particular, \(|cs|\leqslant |cq|\). We conclude that

$$\begin{aligned} |ts|\leqslant |tc|+|cs|\leqslant |pc|+|cq|=|pq|=D\frac{\sin \sphericalangle pt_1q}{\sin \sphericalangle t_1qp}. \end{aligned}\tag{2}$$

We bound \(\sphericalangle pt_1q\) from below as follows. Since \(|t_1t_2|\leqslant D=|t_1p|, \sphericalangle pt_1t_2\geqslant \pi /3\). Furthermore, \(\sphericalangle qt_1t_2=\pi /6+\varepsilon /2\). Therefore, \(\sphericalangle pt_1q\geqslant (\pi +\varepsilon )/2\). We substitute this estimate, together with \(\sphericalangle t_1qp=\pi /3-\varepsilon /2\) into (2), to obtain

$$\begin{aligned} |ts| \leqslant D\frac{\sin (\pi /2+\varepsilon /2)}{\sin (\pi /3-\varepsilon /2)} = D\frac{\cos (\varepsilon /2)}{\sin (\pi /3-\varepsilon /2)}. \end{aligned}$$

The case where \(\varepsilon >\pi /6\) is similar (Fig. 2b). Let \(c_1\) and \(c_2\) be points on the line op such that \(\sphericalangle c_it_1o=\varepsilon -\pi /6, i=1,2\). Similar to the previous case, \(|ot|\leqslant |op|\).

Let \(B_i\) be the ball with centre \(c_i\) and radius \(|c_1t_1|=|c_2t_1|, i=1,2\). Let \(q_1\) be the point where the line \(oc_1\) intersects \(B_1\) in the half plane H, and \(q_2\) be the point where \(oc_1\) intersects \(B_2\) in the half plane opposite H. Since \(B_1\cup B_2\) is the set of all points x such that \(\sphericalangle t_1xt_2\geqslant 2\pi /3-\varepsilon , s\in B_1\cup B_2\). If \(s\in B_1\), then Euclid III.7 gives that \(|os|\leqslant |oq_1|=|oq_2|\). It follows from the triangle inequality that \(|st|\leqslant |so|+|ot|\leqslant |q_2o|+|op|=|pq_2|\). Similar to the previous case, we obtain

$$\begin{aligned} |pq_2| = D\frac{\sin \sphericalangle pt_1q_2}{\sin \sphericalangle t_1q_2p} \leqslant D\frac{\sin (\pi /2+\varepsilon /2)}{\sin (\pi /3-\varepsilon /2)}= D\frac{\cos (\varepsilon /2)}{\sin (\pi /3-\varepsilon /2)}. \end{aligned}$$

\(\square \)

Theorem 6.1

For any \(\varepsilon \in ]0,2\pi /3[\) and \(d\geqslant 2\),

$$\begin{aligned} F_d(\varepsilon ,n)=O\left( \left( \cos (\varepsilon /2)/\sin (\pi /3-\varepsilon /2)\right) ^n\right) . \end{aligned}$$

Proof

Let \(A=\cos (\varepsilon /2)/\sin (\pi /3-\varepsilon /2)\) and \(B=1/\sin (\pi /3-\varepsilon /2)\). We show by induction on \(n\geqslant 2\) that

$$\begin{aligned} L(T)\leqslant \left( A^{n-2}+\frac{(A^{n-2}-1)B}{A-1}\right) D. \end{aligned}\tag{3}$$

If \(n=2\), then \(L(T)=D\), which equals the right-hand side. Next let \(n>2\) and assume that (3) holds for \(\varepsilon \)-approximate Steiner trees on \(n-1\) terminals. Consider a cherry of T with Steiner point s and terminals \(t_1\) and \(t_2\). By Lemma 6.2, the distance between s and any terminal of T is at most \( AD \), and \(|st_1|+|st_2|\leqslant B|t_1t_2|\leqslant BD \). Remove \(t_1\) and \(t_2\) and the edges \(st_1\) and \(st_2\) from T and change s into a terminal to obtain an \(\varepsilon \)-approximate Steiner tree \(T'\) on \(n-1\) terminals. The diameter of this set of terminals is \(D'\leqslant AD \). By the induction hypothesis, \(L(T')\leqslant \left( A^{n-3}+\frac{(A^{n-3}-1)B}{A-1}\right) D'\). Therefore,

$$\begin{aligned} L(T)&= L(T') +|st_1|+|st_2| \\&\leqslant \left( A^{n-3}+\frac{(A^{n-3}-1)B}{A-1}\right) AD + BD \\&= \left( A^{n-2}+\frac{(A^{n-2}-1)B}{A-1}\right) D. \end{aligned}$$

Finally, the length of a minimum Steiner tree joining the terminals of T is at least D, so \(L(S(T))\geqslant D\), and it follows that

$$\begin{aligned} \frac{L(T)}{L(S(T))}-1\leqslant A^{n-2}+\frac{(A^{n-2}-1)B}{A-1}-1 = O(A^n). \end{aligned}$$

\(\square \)
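The induction step in the proof of Theorem 6.1 amounts to the recurrence \(f(n)=A\,f(n-1)+B\) with \(f(2)=1\), whose solution is the coefficient of D in (3). The sketch below compares the closed form with the recurrence for a sample value of \(\varepsilon \); the function names and the sample value are assumptions made for this illustration.

```python
import math

def closed_form(n, A, B):
    """Coefficient of D in (3): A^(n-2) + (A^(n-2) - 1) * B / (A - 1)."""
    return A ** (n - 2) + (A ** (n - 2) - 1.0) * B / (A - 1.0)

def by_recurrence(n, A, B):
    """Iterate f(m) = A * f(m-1) + B from the base case f(2) = 1."""
    f = 1.0
    for _ in range(3, n + 1):
        f = A * f + B
    return f

eps = 0.3  # sample value in ]0, 2*pi/3[
A = math.cos(eps / 2.0) / math.sin(math.pi / 3.0 - eps / 2.0)
B = 1.0 / math.sin(math.pi / 3.0 - eps / 2.0)

checks = [(closed_form(n, A, B), by_recurrence(n, A, B)) for n in range(2, 13)]
```

Note that \(A>1\) for every \(\varepsilon \in ]0,2\pi /3[\), since \(\sin (\pi /3-\varepsilon /2)=\cos (\pi /6+\varepsilon /2)<\cos (\varepsilon /2)\), so the division by \(A-1\) is harmless.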

The following is a sharper version of Lemma 2.2 in [22]. The proof is along the lines of the proof of Lemma 2.2 in [22], but is done in the plane.

Theorem 6.2

For each \(\varepsilon \in ]\pi /3,2\pi /3[\),

$$\begin{aligned} F_2(\varepsilon ,n) =\varOmega \bigl (n^{\log _2 C_2(\varepsilon )}\bigr ), \end{aligned}$$

where \(C_2(\varepsilon ):=\left( 2\sin \left( \frac{\pi }{3}-\frac{\varepsilon }{2}\right) \right) ^{-1}\). Furthermore, \(F_2(\pi /3, n)=\varOmega (\log n)\).
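The exponent \(\log _2 C_2(\varepsilon )\) is straightforward to evaluate. It equals 2 exactly when \(\sin (\pi /3-\varepsilon /2)=1/8\), which the sketch below solves for \(\varepsilon \) and converts to degrees. The function names are hypothetical.

```python
import math

def C2(eps):
    """C_2(eps) = 1 / (2 * sin(pi/3 - eps/2))."""
    return 1.0 / (2.0 * math.sin(math.pi / 3.0 - eps / 2.0))

def exponent(eps):
    """Exponent log_2 C_2(eps) of n in the lower bound of Theorem 6.2."""
    return math.log2(C2(eps))

# exponent(eps) = 2  <=>  C_2(eps) = 4  <=>  sin(pi/3 - eps/2) = 1/8.
threshold = 2.0 * (math.pi / 3.0 - math.asin(1.0 / 8.0))
threshold_deg = math.degrees(threshold)
```

The exponent is 0 at \(\varepsilon =\pi /3\) and increases without bound as \(\varepsilon \) approaches \(2\pi /3\).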

By making \(\varepsilon \) large enough, the lower bound in Theorem 6.2 grows faster than any polynomial. In particular, if \(\varepsilon >105.6\ldots ^\circ \), then the lower bound is super-quadratic (compare with Theorem 2.1 in [22]). Theorem 6.2 follows from the following lemma (combined with Proposition 5.1).

Lemma 6.3

Let \(k\geqslant 1\) and \(\pi /3<\varepsilon <2\pi /3\). Then, \(F_2(\varepsilon ,2^{k+1}) > \sqrt{3}\frac{C^k-1}{C-1}-1\), where \(C=\left( 2\sin \left( \frac{\pi }{3}-\frac{\varepsilon }{2}\right) \right) ^{-1}\). Furthermore, for \(k\geqslant 1, F_2(\pi /3, 2^{k+1})\geqslant \sqrt{3}k-1\).

Proof

Let \(\pi /3\leqslant \varepsilon <2\pi /3\) and \(k\geqslant 1\). We construct an \(\varepsilon \)-approximate Steiner tree with \(2^{k+1}\) terminals. Let \(r=\sin \left( \frac{\pi }{3}-\frac{\varepsilon }{2}\right) \). Let \(C_0,C_1,\ldots ,C_k\) be concentric circles with common centre o and with \(C_i\) of radius \(r^i\).

First, we construct “half” the tree with \(2^k\) terminals on \(C_k\) and Steiner points on the other circles. Fix any \(p_1\in C_0\). There are two tangent lines from \(p_1\) to \(C_1\). Denote the points where they touch \(C_1\) by \(p_2\) and \(p_3\), chosen such that \(\sphericalangle p_2p_1p_3\) is positively oriented. See Fig. 3. Note that \(\sphericalangle p_2p_1p_3=2\pi /3-\varepsilon \).

Fig. 3 Constructing a lower bound in the plane

In general, for each \(i=1,\ldots ,k\), once \(p_{2^{i-1}},p_{2^{i-1}+1},\ldots ,p_{2^i-1}\in C_{i-1}\) have been determined, for each \(p_j\in C_{i-1}\), let \(p_{2j}\) and \(p_{2j+1}\) be the two points where the tangents from \(p_j\) touch \(C_i\), chosen such that \(\sphericalangle p_{2j}p_jp_{2j+1}\) is positively oriented. Again, \(\sphericalangle p_{2j}p_jp_{2j+1}=2\pi /3-\varepsilon \). The points \(p_{2^k},\ldots ,p_{2^{k+1}-1}\in C_k\) will be \(2^k\) of the terminals. We join each \(p_j\) to \(p_{2j}\) and \(p_{2j+1}\), for \(j=1,\ldots ,2^k-1\).

Next, we “double” the tree, by choosing one of the directions on the tangent line of \(C_0\) at \(p_1\), and moving each \(p_i\) in that direction by a distance of \(\delta \), where \(\delta >0\) is very small. Denote the moved points by \(p'_i\). We move o in the same direction to obtain \(o'\). The moved points \(p'_{2^k},\ldots ,p'_{2^{k+1}-1}\) will give another \(2^k\) terminals. We join \(p'_j\) to \(p'_{2j}\) and \(p'_{2j+1}\), for \(j=1,\ldots ,2^k-1\). Finally, we join \(p_1\) and \(p_1'\). All \(p_j\) and \(p'_j\) with \(j<2^k\) are Steiner points. Each angle at a Steiner point is one of three values \(2\pi /3-\varepsilon , 5\pi /6-\varepsilon /2\), and \(\pi /6+\varepsilon /2\). These all belong to the interval \([2\pi /3-\varepsilon ,2\pi /3+\varepsilon ]\), since \(\varepsilon \geqslant \pi /3\). Thus, we obtain a full \(\varepsilon \)-approximate Steiner tree T on \(2^{k+1}\) terminals, all on the circle \(C_k\) of radius \(r^k\). Note that many of the \(p_j\) coincide. For instance, it is always the case that \(p_5=p_6\). This is allowed in our definition of an \(\varepsilon \)-approximate Steiner tree. Alternatively, we could have slightly perturbed the radii of the circles by \(\delta \) to ensure that all \(p_j\) are distinct.

Next, we calculate L(T). An edge from a point of T on \(C_{i}\) to a point on \(C_{i+1}\) has length \(r^i\cos \left( \frac{\pi }{3}-\frac{\varepsilon }{2}\right) \). Therefore,

$$\begin{aligned} L(T)&= \delta + 2\left( 2\cos \left( \tfrac{\pi }{3}-\tfrac{\varepsilon }{2}\right) +4r\cos \left( \tfrac{\pi }{3}-\tfrac{\varepsilon }{2}\right) +\cdots +2^kr^{k-1}\cos \left( \tfrac{\pi }{3}-\tfrac{\varepsilon }{2}\right) \right) \\&= \delta + 4\cos \left( \tfrac{\pi }{3}-\tfrac{\varepsilon }{2}\right) \left( 1+2r+\cdots +(2r)^{k-1}\right) \\&= \delta + 4\cos \left( \tfrac{\pi }{3}-\tfrac{\varepsilon }{2}\right) \frac{1-(2r)^k}{1-2r} \end{aligned}$$

if \(\varepsilon >\pi /3\), and \(L(T)=\delta +2\sqrt{3}k\) if \(\varepsilon =\pi /3\). We form a Steiner tree S with a degeneration of the topology of T by joining each \(p_i\) to o, each \(p_i'\) to \(o'\), and o to \(o'\). Then, \(L(S(T))\leqslant L(S)=\delta +2 (2r)^{k}\), which equals \(\delta +2\) if \(\varepsilon =\pi /3\).

Therefore, if \(\varepsilon >\pi /3\), then

$$\begin{aligned} F_2(\varepsilon ,2^{k+1}) \geqslant \frac{\delta + 4\cos \left( \frac{\pi }{3}-\frac{\varepsilon }{2}\right) \dfrac{1-(2r)^k}{1-2r}}{\delta +2(2r)^{k}}-1 \end{aligned}$$

for each \(\delta >0\); hence,

$$\begin{aligned} F_2(\varepsilon ,2^{k+1})&\geqslant \frac{4\cos \left( \frac{\pi }{3}-\frac{\varepsilon }{2}\right) \left( 1-(2r)^k\right) }{2(2r)^{k}(1-2r)} - 1 \\&= \frac{2\cos \left( \frac{\pi }{3}-\frac{\varepsilon }{2}\right) }{1-2r}\left( \left( \frac{1}{2r}\right) ^k-1\right) -1 \\&= \frac{2\sqrt{1-\frac{1}{4C^2}}}{1-\frac{1}{C}}(C^k-1)-1 \\&= \frac{\sqrt{4C^2-1}}{C-1}(C^k-1)-1 > \sqrt{3}\frac{C^k-1}{C-1}-1, \end{aligned}$$

where \(C=1/(2r)=(2\sin (\frac{\pi }{3}-\frac{\varepsilon }{2}))^{-1}\). Similarly, if \(\varepsilon =\pi /3\), then

$$\begin{aligned} F_2(\varepsilon ,2^{k+1})\geqslant \frac{\delta +2\sqrt{3}k}{2+\delta }-1, \end{aligned}$$

and letting \(\delta \rightarrow 0\), we obtain the required result. \(\square \)
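The construction in the proof of Lemma 6.3 can be reproduced numerically. The sketch below builds the "half" tree in polar coordinates around o, using the fact that the tangent points of \(p_j\) on the next circle lie at polar angle \(\pm \arccos r\) from \(p_j\). It then compares the summed edge lengths with the closed form and evaluates the limit of \(L(T)/L(S)-1\) as \(\delta \rightarrow 0\). The function name, the sign convention for the orientation, and the sample values of \(\varepsilon \) and k are assumptions made for this illustration.

```python
import math

def half_tree(eps, k):
    """Points p_1, ..., p_(2^(k+1)-1) and edges of the half tree.

    p_1 lies on C_0 (radius 1); the children p_(2j), p_(2j+1) of p_j are the
    tangent points from p_j on the next circle, whose radius is r times smaller.
    """
    r = math.sin(math.pi / 3.0 - eps / 2.0)
    turn = math.acos(r)          # polar angle between p_j and each tangent point
    phi = {1: 0.0}               # polar angle of each point
    pts = {1: (1.0, 0.0)}
    edges = []
    for i in range(1, k + 1):
        radius = r ** i
        for j in range(2 ** (i - 1), 2 ** i):
            for child, sign in ((2 * j, 1.0), (2 * j + 1, -1.0)):
                phi[child] = phi[j] + sign * turn
                pts[child] = (radius * math.cos(phi[child]), radius * math.sin(phi[child]))
                edges.append((j, child))
    return pts, edges, r

eps, k = 2.0, 4                  # sample values with pi/3 < eps < 2*pi/3
pts, edges, r = half_tree(eps, k)

half_length = sum(math.dist(pts[u], pts[v]) for u, v in edges)
closed_form = 2.0 * math.cos(math.pi / 3.0 - eps / 2.0) * (1.0 - (2.0 * r) ** k) / (1.0 - 2.0 * r)

# Angle at p_1 between the two tangent segments, via the law of cosines.
d12, d13, d23 = math.dist(pts[1], pts[2]), math.dist(pts[1], pts[3]), math.dist(pts[2], pts[3])
apex_angle = math.acos((d12**2 + d13**2 - d23**2) / (2.0 * d12 * d13))

# Limit of L(T)/L(S) - 1 as delta -> 0, with L(T) = delta + 2*half_length
# and L(S) = delta + 2*(2r)^k, compared with the bound of Lemma 6.3.
C = 1.0 / (2.0 * r)
limit_ratio = half_length / (2.0 * r) ** k - 1.0
lemma_bound = math.sqrt(3.0) * (C**k - 1.0) / (C - 1.0) - 1.0
```

The coincidence \(p_5=p_6\) noted in the proof is visible here as well: both points have polar angle 0.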

7 Upper Bounds for Small \(\varepsilon \) (Proof of Theorem 4.1)

In this section, we prove Theorem 4.1 using an unfolding algorithm described in [18] and [22] based on Melzak’s algorithm for finding the shortest Steiner tree for a fixed Steiner topology (if this shortest tree happens to be what we call a locally minimum Steiner tree). This algorithm unfolds an approximate Steiner tree into a broken line segment. First, we describe this unfolding and then use it in the special cases of 3 and 4 terminals in the plane to determine the exact values of \(F_2(\varepsilon ,3)\) and \(F_2(\varepsilon ,4)\) (Propositions 7.1 and 7.2). Then, the proof of Theorem 4.1 should be clear.

The following inequality and its proof form the basis for the unfolding algorithm.

Lemma 7.1

Let \(\triangle abc\) be an equilateral triangle in \(\mathbb {R}^d\). Then, for any \(x\in \mathbb {R}^d, |xa|\leqslant |xb|+|xc|\), with equality iff x is on the minor arc bc of the circumcircle of \(\triangle abc\).

Proof

The proof is essentially the same as the classical proof that the Fermat point of a triangle with all angles less than \(2\pi /3\) minimises the sum of the distances to the vertices. Because there are only 4 points to consider, we may assume without any loss of generality that \(x,a,b,c\in \mathbb {R}^3\).

Rotate \(\triangle bxc\) by an angle of \(\pi /3\) around the axis through b perpendicular to the plane \(\varPi \) through a, b and c such that c is rotated to a. Then, b stays fixed, and x is rotated to \(x'\), say. Also, \(|xc|=|x'a|\). Let \(p:\mathbb {R}^3\rightarrow \varPi \) be the orthogonal projection onto \(\varPi \). Then, \(\triangle bp(x)p(x')\) is equilateral. Since \(xx'\) is parallel to \(\varPi , |xx'|=|p(x)p(x')|=|bp(x)|\leqslant |bx|\). Therefore, \(|xa|\leqslant |xx'|+|x'a|\leqslant |bx|+|xc|\). Equality holds iff x is in the plane \(\varPi \) and \(a, x', x\) are collinear, which holds iff \(\sphericalangle bx'a=2\pi /3\), iff \(\sphericalangle bxc=2\pi /3\), iff x is on the minor arc bc of the circumcircle of \(\triangle abc\). \(\square \)
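The inequality of Lemma 7.1 can be illustrated numerically. The sketch below assumes a particular equilateral triangle (inscribed in the unit circle of the plane \(z=0\) in \(\mathbb {R}^3\)) and random sample points; these choices, and all names, are illustrative and not part of the proof.

```python
import math
import random

def dist(p, q):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(p, q)))

def on_circle(theta):
    """Point of the unit circle in the plane z = 0 of R^3."""
    return (math.cos(theta), math.sin(theta), 0.0)

# Equilateral triangle abc inscribed in the unit circle (an illustrative choice).
A, B, C = on_circle(0.0), on_circle(2 * math.pi / 3), on_circle(4 * math.pi / 3)

def slack(x):
    """|xb| + |xc| - |xa|, which Lemma 7.1 says is non-negative."""
    return dist(x, B) + dist(x, C) - dist(x, A)

random.seed(0)
min_slack = min(
    slack(tuple(random.uniform(-3.0, 3.0) for _ in range(3))) for _ in range(10000)
)

# Points on the arc from b to c that avoids a (angles 120 to 240 degrees).
arc_slack = max(abs(slack(on_circle(math.radians(t)))) for t in range(120, 241, 10))
```

Here `min_slack` should be non-negative up to rounding, while `arc_slack` should vanish up to rounding, matching the equality case.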

Consider a family of n terminals \(N_n\) in \(\mathbb {R}^d\) and a full Steiner topology \({\mathcal T}_n\) for those terminals. Choose one of the terminals \(t_0\) as root of \({\mathcal T}_n\). We define a Melzak sequence of \(N_n\) and \({\mathcal T}_n\) to be two sequences \(N_n,N_{n-1},\ldots ,N_2\) and \({\mathcal T}_n,{\mathcal T}_{n-1},\ldots ,{\mathcal T}_2\), where each \({\mathcal T}_i\) is a full Steiner topology on \(N_i\) and with root \(t_0\) (thus, \(t_0\in N_i\) for all i). We obtain \(N_{i-1}\) and \({\mathcal T}_{i-1}\) from \(N_i\) and \({\mathcal T}_i\) as follows. Choose any cherry of \({\mathcal T}_i\) with two terminals \(t_1, t_2\ne t_0\) and Steiner point s with neighbours, say, \(t_1, t_2\) and p. Replace \(t_1\) and \(t_2\) in \(N_{i}\) by any point \(t\in \mathbb {R}^d\) such that \(\triangle t_1t_2t\) is an equilateral triangle, thus obtaining \(N_{i-1}\). Remove s and its incident edges from \({\mathcal T}_{i}\) and replace them by the edge pt, to obtain \({\mathcal T}_{i-1}\). If \(N_2=\left\{ t_0,t\right\} \), say, then we call the line segment \(t_0t\) an unfolding of \(N_n\) with respect to the topology \({\mathcal T}_n\).

It is not difficult to see that, if there is more than one cherry to choose from at a certain stage, it does not matter which we choose first. We may in fact process both cherries in parallel (this is equivalent to saying that in the subtree of \({\mathcal T}_n\) on the Steiner points, it does not matter in which order we remove leaves, and that this may be done in parallel).

Lemma 7.1 and induction immediately give the following, which is Theorem 3.1 of [22] and Theorem 4.2 of [18]:

Lemma 7.2

The length of any unfolding of a terminal set \(N_n\subset \mathbb {R}^d\) with respect to a full Steiner topology \({\mathcal T}_n\) is a lower bound for the shortest tree on \(N_n\), which has \({\mathcal T}_n\) as topology (allowing degenerate topologies).
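As an illustration of the Melzak sequence and of Lemma 7.2, the following sketch unfolds the four vertices of a unit square, using complex numbers for points in the plane. The choice of terminals, the topology (cherries \(\{t_0,t_1\}\) and \(\{t_2,t_3\}\)) and all function names are assumptions made for this example. Since any unfolding gives a lower bound, the sketch simply takes the largest one over the two possible sides at each step.

```python
import cmath
import itertools
import math

def third_vertex(t1, t2, sign):
    """Point t such that t1, t2, t form an equilateral triangle; sign picks the side."""
    return t1 + (t2 - t1) * cmath.exp(sign * 1j * math.pi / 3)

def unfolding_lengths(t0, t1, t2, t3):
    """Lengths of all unfoldings for the full topology with cherries {t0, t1} and {t2, t3}."""
    lengths = []
    for sign_a, sign_b in itertools.product((1, -1), repeat=2):
        t = third_vertex(t2, t3, sign_a)        # N_4 -> N_3: replace t2, t3 by t
        t_last = third_vertex(t1, t, sign_b)    # N_3 -> N_2: replace t1, t by t_last
        lengths.append(abs(t0 - t_last))        # the unfolding is the segment t0 t_last
    return lengths

# Vertices of the unit square, with t0 as root (an illustrative choice).
square = (0 + 0j, 0 + 1j, 1 + 0j, 1 + 1j)
best_lower_bound = max(unfolding_lengths(*square))
```

For the square, the best lower bound equals \(1+\sqrt{3}\), the length of the minimum Steiner tree, so the bound of Lemma 7.2 is attained in this case.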

Next, we describe the plan of the proof of Theorem 4.1. First, we unfold a planar \(\varepsilon \)-approximate Steiner tree into a polygonal path of the same length, and estimate the turn at each internal vertex of the path. By Lemma 7.2, the length between the endpoints of the unfolding is a lower bound on the length of a Steiner minimal tree on the same terminal set. By a result of E. Schmidt [27] (Lemma 7.3 below), this length is minimised among all polygonal paths with the same angles and edges of the same length, by a planar, convex path. Finally, we minimise the distance between the endpoints among all polygonal paths of the same total length and the same sum of turns.

Before providing the detail of the general case, we show how to determine exact values for small n.

Proposition 7.1

For all \(\varepsilon \in ]0,\pi /3[, F_2(\varepsilon ,3)=G_2(\varepsilon ,3)=\frac{1}{\cos (\varepsilon /2)}-1\).

Fig. 4 Unfolding an \(\varepsilon \)-approximate Steiner tree on 3 terminals

Proof

We show that \(F_2(\varepsilon ,3)\leqslant (\cos (\varepsilon /2))^{-1}-1\). Consider an \(\varepsilon \)-approximate Steiner tree T on three terminals \(t_0, t_1, t_2\) in the plane, with Steiner point s and edges \(e_i=st_i, i=0,1,2\), numbered in such a way that \(e_0, e_1, e_2\) are in anticlockwise order around s. See Fig. 4. Let \(\sphericalangle t_0st_1=2\pi /3+\varepsilon _1, \sphericalangle t_0st_2=2\pi /3+\varepsilon _2\) and \(\sphericalangle t_1st_2=2\pi /3+\varepsilon _3\), where \(\left|\varepsilon _i\right|\leqslant \varepsilon , i=1,2,3\). Since \(\varepsilon \leqslant \pi /3\), the three angles sum to \(2\pi \), and \(\varepsilon _1+\varepsilon _2+\varepsilon _3=0\).

We unfold the tree into a polygonal line of total length L(T) as follows. We rotate \(e_1=st_1\) by an angle of \(\pi /3\) around s to obtain the edge \(e_1'=st_1'\), say. We rotate \(e_2=st_2\) by an angle of \(-\pi /3\) around \(t_1\) to obtain the edge \(e_2'=t_1't_2'\). Then, \(t_0st_1't_2'\) is a polygonal line of length L(T) (see Fig. 4). The turn from edge \(e_0\) to \(e_1'\) equals \(\varepsilon _1\), and the turn from \(e_1'\) to \(e_2'\) equals \(\varepsilon _3\). Since \(\left|\varepsilon _1+\varepsilon _3\right|=\left|\varepsilon _2\right|\leqslant \varepsilon <\pi \), the rays \(\overrightarrow{t_0s}\) and \(\overrightarrow{t_2't_1'}\) intersect in p, say. Then, \(L(T)=|t_0s|+|st_1'|+|t_1't_2'|\leqslant |t_0p|+|pt_2'|\). By Lemma 7.2, \(L(S(T))\geqslant |t_0t_2'|\). It follows that

$$\begin{aligned} \frac{L(T)}{L(S(T))}\leqslant \frac{|t_0p|+|pt_2'|}{|t_0t_2'|}\leqslant \frac{1}{\cos {\varepsilon /2}}\quad \text {by Lemma 6.1}, \end{aligned}$$

and

$$\begin{aligned} F_2(\varepsilon ,3)=\sup \frac{L(T)}{L(S(T))}-1\leqslant \frac{1}{\cos {\varepsilon /2}}-1. \end{aligned}$$

To show that \(G_2(\varepsilon ,3)\geqslant (\cos (\varepsilon /2))^{-1}-1\), consider an \(\varepsilon \)-approximate tree T as above with \(\varepsilon _1=\varepsilon _2=-\varepsilon /2, \varepsilon _3=\varepsilon , |t_0s|=\delta \) for arbitrarily small \(\delta >0\), and \(|t_1s|=|t_2s|=1\). Then, \(L(T)=2+\delta \) and \(L(S(T))=\delta +2\cos (\varepsilon /2)\). Since all angles in \(\triangle t_0t_1t_2\) are less than \(2\pi /3\) if \(\delta \) is small enough, S(T) is not degenerate, hence \(G_2(\varepsilon ,3)\geqslant \frac{2+\delta }{\delta +2\cos (\varepsilon /2)}-1\) for all sufficiently small \(\delta >0\). It follows that \(G_2(\varepsilon ,3)\geqslant (\cos (\varepsilon /2))^{-1}-1\). \(\square \)
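The value of \(L(S(T))\) used for the extremal tree above can be recovered from the unfolding, as the following short calculation suggests. We assume, only for this calculation, that the unfolding \(t_0st_1't_2'\) is placed in the complex plane with \(t_0s\) along the positive real axis. Its three segments have lengths \(\delta , 1, 1\), and the turns are \(\varepsilon _1=-\varepsilon /2\) and \(\varepsilon _3=\varepsilon \), so that

$$\begin{aligned} |t_0t_2'|&= \left|\delta +e^{-i\varepsilon /2}+e^{i\varepsilon /2}\right| = \delta +2\cos (\varepsilon /2),\\ \frac{L(T)}{|t_0t_2'|}&= \frac{2+\delta }{\delta +2\cos (\varepsilon /2)}\rightarrow \frac{1}{\cos (\varepsilon /2)}\quad \text {as }\delta \rightarrow 0. \end{aligned}$$

This agrees with the stated value of \(L(S(T))\), which is consistent with the lower bound of Lemma 7.2 being attained when S(T) is not degenerate.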

Proposition 7.2

For all \(\varepsilon \in ]0,\pi /3[, F_2(\varepsilon ,4)=G_2(\varepsilon ,4)=\frac{1}{\cos \varepsilon }-1\).

Fig. 5 Unfolding an \(\varepsilon \)-approximate Steiner tree on 4 terminals

Proof

Consider an \(\varepsilon \)-approximate Steiner tree T on four terminals \(t_1, t_2, t_3, t_4\), with Steiner points \(s_1\) and \(s_2\), and edges \(e_1=s_1t_1, e_2=s_1t_2, e_0=s_1s_2, e_3=s_2t_3, e_4=s_2t_4\), labelled in such a way that \(e_0, e_1, e_2\) are in anticlockwise order around \(s_1\), and \(e_0, e_4, e_3\) are in anticlockwise order around \(s_2\). Furthermore, let \(\sphericalangle t_1s_1t_2=2\pi /3+\varepsilon _1, \sphericalangle t_1s_1s_2=2\pi /3+\varepsilon _2, \sphericalangle s_1s_2t_4=2\pi /3+\varepsilon _3\) and \(\sphericalangle t_3s_2t_4=2\pi /3+\varepsilon _4\), where \(\left|\varepsilon _i\right|\leqslant \varepsilon , i=1,2,3,4\), and \(\left|\varepsilon _1+\varepsilon _2\right|, \left|\varepsilon _3+\varepsilon _4\right|\leqslant \varepsilon \). See Fig. 5. As in the proof of Proposition 7.1, we unfold the tree into a polygonal line of total length L(T), and with the distance between the endpoints a lower bound for L(S(T)). Rotate \(e_1\) by \(\pi /3\) around \(s_1\) to obtain \(e_1'=s_1t_1'\). Rotate \(e_2\) by \(-\pi /3\) around \(t_1\) to obtain \(e_2'=t_1't_2'\). Rotate \(e_3\) by \(-\pi /3\) around \(t_4\) to obtain \(e_3'=t_4't_3'\). Rotate \(e_4\) by \(\pi /3\) around \(s_2\) to obtain \(e_4'=s_2t_4'\). This gives a polygonal line \(P=t_2't_1's_1s_2t_4't_3'\) of length L(T), with turns \(-\varepsilon _1\) at \(t_1', -\varepsilon _2\) at \(s_1, \varepsilon _3\) at \(s_2\), and \(\varepsilon _4\) at \(t_4'\). Note that the turn between any two of the five edges of P will be at most \(2\varepsilon \) in absolute value. For instance, the absolute turn between \(e_1'\) and \(e_3'\) equals \(\left|-\varepsilon _2+\varepsilon _3+\varepsilon _4\right|\leqslant \left|\varepsilon _2\right|+\left|\varepsilon _3+\varepsilon _4\right|\leqslant 2\varepsilon \). If we reorder the edges of P to make a new, convex polygonal line \(P'\) with the same endpoints as P (Fig. 5, middle), then \(P'\) will lie inside the triangle \(\triangle t_2't_3'p\) bounded by \(t_2't_3'\) and the lines through the first and last edges of \(P'\). The turn from the first edge to the last edge of \(P'\) is exactly the maximum turn between two edges of P, so is at most \(2\varepsilon \). Hence, the angle at the apex of this triangle will be at least \(\pi -2\varepsilon \), and by Lemma 6.1, \(L(T)/|t_2't_3'|\leqslant 1/\cos \varepsilon \). The proof of the upper bound concludes in the same way as that of Proposition 7.1.

To show that \(G_2(\varepsilon ,4)\geqslant (\cos \varepsilon )^{-1}-1\), fix the above \(\varepsilon \)-approximate Steiner tree to have \(\varepsilon _1=0, \varepsilon _2=\varepsilon , \varepsilon _3=-\varepsilon , \varepsilon _4=0, |s_1s_2|=\delta \) and

$$\begin{aligned} |s_1t_1|=|s_1t_2|=|s_2t_3|=|s_2t_4|=1. \end{aligned}$$

It is not difficult to see that the Melzak algorithm obtains a locally minimum Steiner tree S(T) for any \(\varepsilon <\pi /3\). \(\square \)
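The limiting ratio for this extremal tree can be read off from its unfolding. The following calculation is a sketch under the assumption (made only here) that the middle edge \(s_1s_2\) of the unfolding P lies along the real axis of the complex plane. The five edges of P have lengths \(1, 1, \delta , 1, 1\) and the turns are \(0, -\varepsilon , -\varepsilon , 0\), so that \(L(T)=4+\delta \) and

$$\begin{aligned} |t_2't_3'|&= \left|2e^{i\varepsilon }+\delta +2e^{-i\varepsilon }\right| = \delta +4\cos \varepsilon ,\\ \frac{L(T)}{|t_2't_3'|}&= \frac{4+\delta }{\delta +4\cos \varepsilon }\rightarrow \frac{1}{\cos \varepsilon }\quad \text {as }\delta \rightarrow 0. \end{aligned}$$

Whenever the Melzak algorithm yields a locally minimum Steiner tree, as noted above, \(L(S(T))\) equals \(|t_2't_3'|\), which gives the matching lower bound for \(G_2(\varepsilon ,4)\).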

The following generalises the idea in the above proof of estimating the length of a polygonal path in terms of the distance between its endpoints. We do not know the history of this elementary result, but an extension of this lemma to curves of finite total curvature was proved by Schmidt [27] (see also [28, Theorem 5.8.1] and [29, Proposition 7.1]).

Lemma 7.3

Consider a polygonal path \(p_0p_1\ldots p_n\) in the plane. For each \(i=1,\ldots ,n-1\), define the turn \(\varepsilon _i\) at \(p_i\) to be the signed angular measure in \([-\pi ,\pi ]\) by which the ray with source at \(p_i\) in the direction opposite to \(\overrightarrow{p_ip_{i-1}}\) has to turn to coincide with the ray \(\overrightarrow{p_ip_{i+1}}\). Let

$$\begin{aligned} \kappa =\max _{1\leqslant i\leqslant j\leqslant n-1}\left|\sum _{t=i}^j\varepsilon _t\right|. \end{aligned}$$

If \(\kappa <\pi \), then

$$\begin{aligned} \frac{\sum _{i=0}^{n-1}|p_i p_{i+1}|}{|p_0 p_n|}\leqslant \frac{1}{\cos (\kappa /2)}. \end{aligned}$$

Proof

The case \(n=2\) is just Lemma 6.1, so assume that \(n\geqslant 3\). Since \(\kappa <\pi \), the n unit vectors

$$\begin{aligned} u_i=\left\| p_{i+1}-p_i\right\| ^{-1}(p_{i+1}-p_i) \end{aligned}$$

all lie in an open half circle. The polygonal path \(p_0p_1\ldots p_n\) can be replaced with a convex polygonal path \(p_0'p_1'\ldots p_n'\) such that \(p_0=p_0', p_n=p_n'\) and each segment of the new path is a translation of a segment of the original path, selected so that the turns all have the same sign. Then, \(p_0'p_1'\ldots p_n'\) is a convex polygonal path with the same \(\kappa \) and the same endpoints as the original polygonal path. Let the lines \(p_0'p_1'\) and \(p_{n-1}'p_n'\) intersect in q. Since \(\kappa <\pi , p_0'p_1'\ldots p_n'\) is contained in \(\triangle p_0'qp_n'\). By a well-known elementary geometric inequality, \(\sum _{i=0}^{n-1}|p_i' p_{i+1}'|\leqslant |p_0'q|+|qp_n'|\). It remains to apply the case \(n=2\) of the lemma to the path \(p_0'qp_n'\). \(\square \)
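Lemma 7.3 lends itself to a quick numerical sanity check. The sketch below builds random planar polygonal paths from edge lengths and signed turns, computes \(\kappa \) as the largest absolute sum of consecutive turns, and compares the length ratio with \(1/\cos (\kappa /2)\). The sampling ranges, the seed and all names are arbitrary choices for illustration.

```python
import cmath
import math
import random

def kappa(turns):
    """Largest absolute value of a sum of consecutive turns."""
    best = 0.0
    for i in range(len(turns)):
        total = 0.0
        for j in range(i, len(turns)):
            total += turns[j]
            best = max(best, abs(total))
    return best

def length_ratio(lengths, turns):
    """Total length of the path divided by the distance between its endpoints."""
    direction, position = 0.0, 0j
    for i, length in enumerate(lengths):
        position += length * cmath.exp(1j * direction)
        if i < len(turns):
            direction += turns[i]          # turn before walking along the next edge
    return sum(lengths) / abs(position)

random.seed(1)
worst_gap = float("inf")
for _ in range(2000):
    n = random.randint(2, 8)               # number of edges
    lengths = [random.uniform(0.1, 2.0) for _ in range(n)]
    turns = [random.uniform(-0.4, 0.4) for _ in range(n - 1)]
    kap = kappa(turns)
    if kap < math.pi:                       # hypothesis of the lemma
        worst_gap = min(worst_gap, 1 / math.cos(kap / 2) - length_ratio(lengths, turns))
```

If the lemma holds, `worst_gap` stays non-negative (up to rounding) over all sampled paths.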

Fig. 6 Proof of Theorem 4.1

Proof of Theorem 4.1

We choose a root edge of an \(\varepsilon \)-approximate Steiner tree T on n terminals and unfold the two parts of T separated by the root edge to obtain a polygonal path P with \(2n-3\) edges, of the same length as T. See Fig. 6, where the blue \(\varepsilon \)-approximate tree has been unfolded. The turn at each internal vertex of the polygonal path P is indicated. The quantity \(\kappa \) of Lemma 7.3 is the maximum absolute turn between any two edges of P. For example, the total turn between edge a and edge h on P in Fig. 6 equals \(-\varepsilon _1-\varepsilon _2+\varepsilon _3+\varepsilon _4+\varepsilon _5-\varepsilon _4-\varepsilon _5-\varepsilon _6 = (-\varepsilon _1-\varepsilon _2) -\varepsilon _6\), which is the sum of the errors at the two Steiner points on the path between edges a and h in the tree. Thus, the absolute turn between a and h in P is at most \(2\varepsilon \). In general, since there are at most \(n-2\) Steiner points in a full Steiner topology on n terminals, there are at most \(n-2\) Steiner points on the path between any two edges in an \(\varepsilon \)-approximate Steiner tree, each contributing an error of absolute value at most \(\varepsilon \). It follows that \(\kappa \leqslant (n-2)\varepsilon \). We now apply Lemma 7.3 to obtain that \(L(T)/L(S(T))\leqslant 1/\cos (\frac{1}{2}(n-2)\varepsilon )\). \(\square \)

8 Construction of an \(\varepsilon \)-Approximate Full Binary Tree in the Plane

In this section we prove Theorems 4.2 and 4.3 by constructing a sequence of \(\varepsilon \)-approximate Steiner trees \(T_k\) (\(k\in \mathbb {N}\)), for which it is possible to calculate the ratio between their length and the length of a locally minimum Steiner tree on the same terminals, if \(\varepsilon <1/k^2\). A somewhat similar construction is made in [30]. The calculation will make essential use of complex numbers. Using complex numbers to solve problems in classical Euclidean geometry is an old trick [31,32,33], and even in the geometric Steiner tree literature there are papers where complex numbers appear [34, 35].

Proof of Theorems 4.2 and 4.3

Throughout the proof we denote the largest integer, not greater than x, by \(\lfloor x\rfloor \). Fix \(k\in \mathbb {N}\). We describe an \(\varepsilon \)-approximate Steiner tree \(T_k\) with \(2^k+1\) terminals \(p_i\) (for \(i=0\) and \(2^k\leqslant i\leqslant 2^{k+1}-1\)), \(2^k-1\) Steiner points \(p_i\) (\(1\leqslant i\leqslant 2^k-1\)) and \(2^{k+1}-1\) edges \(e_i=p_ip_{\lfloor i/2\rfloor }\) (\(1\leqslant i\leqslant 2^{k+1}-1\)). Let each \(e_i\) have length \(2^{-\lfloor \log _2 i\rfloor }\), and let the angles at the edges incident to the Steiner point \(p_i\) be

$$\begin{aligned} \sphericalangle p_{2i}p_ip_{2i+1}=2\pi /3, \sphericalangle p_{2i+1}p_ip_{\lfloor i/2\rfloor }=2\pi /3-\varepsilon ,\text { and }\sphericalangle p_{\lfloor i/2\rfloor }p_i p_{2i}=2\pi /3+\varepsilon \end{aligned}$$

(Fig. 7). This determines the tree uniquely up to congruence. See Fig. 8 for the case \(k=3\). Since there are \(2^j\) edges of length \(2^{-j}\) (\(j=0,1,\ldots ,k\)),

$$\begin{aligned} L(T_k)=k+1. \end{aligned}$$
(4)

We construct this tree recursively, using complex numbers. Let \(p_0=0\in \mathbb {C}\) and \(p_1=1\in \mathbb {C}\). Then, \(e_1=p_0p_1\). Let \(\omega =e^{i\pi /3}\) and \(z=e^{i\varepsilon }\).

Fig. 7 Angles around a Steiner point in the binary tree construction

Fig. 8 Construction of \(T_k, k=3\)

Once \(p_{\lfloor i/2\rfloor }\) and \(p_i\) have been defined, define \(p_{2i}\) and \(p_{2i+1}\) as in Fig. 7. If we walk from \(p_{\lfloor i/2\rfloor }\) to \(p_i\) and then turn in the direction of \(p_{2i}\), the turn is a right turn by an angle of \(\pi /3-\varepsilon \). Furthermore, \(|p_ip_{2i}|=\frac{1}{2} |p_{\lfloor i/2\rfloor }p_i|\). Therefore,

$$\begin{aligned} p_{2i}-p_i = \frac{1}{2}(p_i-p_{\lfloor i/2\rfloor })\omega ^{-1}z. \end{aligned}$$
(5)

Similarly, if we turn instead in the direction of \(p_{2i+1}\), this is a left turn by an angle of \(\pi /3+\varepsilon \), which gives

$$\begin{aligned} p_{2i+1}-p_i = \frac{1}{2}(p_i-p_{\lfloor i/2\rfloor })\omega z. \end{aligned}$$
(6)

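We obtain the following recurrence:

$$\begin{aligned} p_0=0,\quad p_1=1,\quad p_{2i}=p_i+\frac{1}{2}(p_i-p_{\lfloor i/2\rfloor })\omega ^{-1}z,\quad p_{2i+1}=p_i+\frac{1}{2}(p_i-p_{\lfloor i/2\rfloor })\omega z\quad (i\geqslant 1). \end{aligned}$$
(7)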

To describe its solution, we have to consider the sequence of left and right turns as we walk from \(p_0\) to \(p_i\). This can be found from the binary expression of i. Let \(h(i)=\lfloor \log _2 i\rfloor \). Let \(b_0,b_1,\ldots ,b_{h(i)-1}\in \{0,1\}\) be the unique values such that

$$\begin{aligned} i=\sum _{j=0}^{h(i)-1} b_j 2^j + 2^{h(i)}. \end{aligned}$$

If we replace 0 by R and 1 by L in the sequence \(b_{h(i)-1},\ldots , b_0\), we obtain the left and right turns in the path from \(p_0\) to \(p_i\). Let \(a_j(i)\) be the number of 1s in \(b_{h(i)-1}, \ldots , b_{h(i)-j}\) minus the number of 0s in \(b_{h(i)-1}, \ldots , b_{h(i)-j}\). In particular, \(a_0(i)=0\).

Lemma 8.1

For each \(i\geqslant 1\),

$$\begin{aligned} p_i=\sum _{j=0}^{h(i)} \omega ^{a_j(i)}\left( \frac{z}{2}\right) ^j. \end{aligned}$$
(8)

Proof

Observe that \(h(2i)=h(2i+1)=h(i)+1\),

$$\begin{aligned} a_j(2i)= & {} a_j(2i+1)=a_j(i)\text { for each } j=0,\ldots ,h(i), \text {and}\nonumber \\ a_{h(i)}(i)= & {} a_{h(2i)}(2i)+1=a_{h(2i)-1}(2i)=a_{h(2i+1)}(2i+1)-1=a_{h(2i+1)-1}(2i+1).\nonumber \\ \end{aligned}$$
(9)

It then follows by induction, using (5) and (6), that

$$\begin{aligned} p_i-p_{\lfloor i/2\rfloor }=\omega ^{a_{h(i)}(i)}\left( \frac{z}{2}\right) ^{h(i)}. \end{aligned}$$
(10)

Finally, by induction and (7) we obtain (8). \(\square \)
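The closed form (8) can be compared numerically with the points produced by relations (5) and (6). The following is a minimal sketch; the function names, the array layout (index i holds \(p_i\)) and the sample values of k and \(\varepsilon \) are assumptions made for illustration.

```python
import cmath
import math

def tree_points(k, eps):
    """Points p_0, ..., p_{2^(k+1)-1} from p_0 = 0, p_1 = 1 and relations (5), (6)."""
    omega = cmath.exp(1j * math.pi / 3)
    z = cmath.exp(1j * eps)
    p = [0j] * 2 ** (k + 1)
    p[1] = 1 + 0j
    for i in range(1, 2 ** k):
        step = 0.5 * (p[i] - p[i // 2])
        p[2 * i] = p[i] + step * z / omega        # right turn, relation (5)
        p[2 * i + 1] = p[i] + step * z * omega    # left turn, relation (6)
    return p

def closed_form(i, eps):
    """Right-hand side of (8), reading the turns from the binary digits of i."""
    omega = cmath.exp(1j * math.pi / 3)
    z = cmath.exp(1j * eps)
    bits = bin(i)[3:]                 # binary digits of i after the leading 1
    total, a = 1 + 0j, 0              # the j = 0 term, since a_0(i) = 0
    for j, bit in enumerate(bits, start=1):
        a += 1 if bit == "1" else -1  # a_j(i): number of 1s minus number of 0s so far
        total += omega ** a * (z / 2) ** j
    return total

points = tree_points(4, 0.05)
total_length = sum(abs(points[i] - points[i // 2]) for i in range(1, 32))
```

For \(k=4\) the total edge length computed this way should equal \(k+1=5\), in line with (4).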

We remark that each \(p_i\) is a polynomial in z of degree h(i) with coefficients in the ring \(\mathbb {Z}[1/2,\omega ]\). Next, we apply Melzak’s Algorithm to the terminals of \(T_k\) to obtain the locally minimum Steiner tree \(S(T_k)\) with the same topology. Surprisingly, it turns out that the Steiner points of \(S(T_k)\) are also polynomials in z with coefficients in \(\mathbb {Z}[1/2,\omega ]\).

Fig. 9 Melzak’s algorithm

The first step in Melzak’s algorithm is to calculate the so-called quasi-terminals \(q_i\) (\(1\leqslant i\leqslant 2^{k+1}-1\)) [18]. For each \(i=2^k,\ldots ,2^{k+1}-1\), let \(q_i=p_i\). Then, for each \(i=2^k-1,\ldots ,1\), once \(q_{2i}\) and \(q_{2i+1}\) have been defined, let \(q_i\) be the unique point such that the triangle \(\varDelta _i=\triangle q_i q_{2i}q_{2i+1}\) is equilateral, and such that \(p_i\) and \(q_i\) are on opposite sides of the line \(q_{2i}q_{2i+1}\). Let \(C_i\) be the circumcircle of \(\varDelta _i\) and \(c_i\) its centre (Fig. 9). Since \(\sphericalangle p_{2i}p_ip_{2i+1}=2\pi /3, \sphericalangle p_{\lfloor i/2\rfloor }p_iq_i=\pi -\varepsilon \) and \(|p_ip_{2i}|=|p_ip_{2i+1}|\), we obtain by induction that for \(i=1,\ldots ,2^k-1, \sphericalangle q_{2i}p_iq_{2i+1}=2\pi /3, \sphericalangle q_ip_iq_{2i}=\sphericalangle q_ip_iq_{2i+1}=\pi /3\). Hence, \(p_i\) is on \(C_i\) and the centre \(c_i\) of \(C_i\) is the midpoint of \(p_i\) and \(q_i\). Furthermore, \(|p_iq_i|=2|p_iq_{2i}|=2|p_iq_{2i+1}|\). Since \(c_iq_{2i}p_iq_{2i+1}\) is a parallelogram, we have

$$\begin{aligned} c_i&=p_i+(q_{2i}-p_i)+(q_{2i+1}-p_i) \\ \end{aligned}$$

and

$$\begin{aligned} q_i&= p_i+2(c_i-p_i)\nonumber \\&= p_i+2(q_{2i}-p_i)+2(q_{2i+1}-p_i). \end{aligned}$$
(11)

If \(2^{k-1}\leqslant i< 2^k\), then we have \(q_{2i}=p_{2i}\) and \(q_{2i+1}=p_{2i+1}\), hence

$$\begin{aligned} q_i&= p_i+2(p_{2i}-p_i)+2(p_{2i+1}-p_i)\\&= p_i+(p_i-p_{\lfloor i/2\rfloor })\omega ^{-1}z+(p_i-p_{\lfloor i/2\rfloor })\omega z\quad \text {by (5) and (6)}\\&= p_i + (p_i-p_{\lfloor i/2\rfloor })z\\&= p_i + \omega ^{a_{k-1}(i)}\left( \frac{z}{2}\right) ^{k-1}z\quad \text {by (10)}. \end{aligned}$$

By induction, we obtain that for each \(i<2^{k-1}\) (use (11), (10), (9); see “Appendix”)

$$\begin{aligned} q_i=p_i+\omega ^{a_{h(i)}(i)}\left( \frac{z}{2}\right) ^{h(i)}\sum _{j=1}^{k-h(i)}z^j \quad (i\geqslant 1). \end{aligned}$$
(12)

Therefore, each \(q_i\) is a polynomial in z of degree k. In particular,

$$\begin{aligned} q_1=\sum _{j=0}^k z^j. \end{aligned}$$
(13)

Furthermore, the centres

$$\begin{aligned} c_i=\frac{1}{2}(p_i+q_i)=p_i+\frac{1}{2}\omega ^{a_{h(i)}(i)}\left( \frac{z}{2}\right) ^{h(i)}\sum _{j=1}^{k-h(i)} z^j \end{aligned}$$
(14)

are polynomials in z of degree k. In particular,

$$\begin{aligned} c_1=1+\frac{1}{2}\sum _{j=1}^k z^j. \end{aligned}$$
(15)
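Formulas (12) and (13) can be checked against the geometric definition of the quasi-terminals. In the sketch below, \(q_i\) is taken to be the third vertex of the equilateral triangle on \(q_{2i}q_{2i+1}\) that lies farther from \(p_i\) (the two candidates are mirror images in the line \(q_{2i}q_{2i+1}\), so the farther one is on the opposite side). The names and the sample parameters are illustrative assumptions.

```python
import cmath
import math

def quasi_terminals(k, eps):
    """Return (p, q, z): tree points, quasi-terminals and z = exp(i*eps)."""
    omega = cmath.exp(1j * math.pi / 3)
    z = cmath.exp(1j * eps)
    p = [0j] * 2 ** (k + 1)
    p[1] = 1 + 0j
    for i in range(1, 2 ** k):                       # relations (5) and (6)
        step = 0.5 * (p[i] - p[i // 2])
        p[2 * i], p[2 * i + 1] = p[i] + step * z / omega, p[i] + step * z * omega
    q = list(p)                                      # q_i = p_i at the terminals
    for i in range(2 ** k - 1, 0, -1):               # children are processed first
        a, b = q[2 * i], q[2 * i + 1]
        candidates = [a + (b - a) * omega, a + (b - a) / omega]
        q[i] = max(candidates, key=lambda c: abs(c - p[i]))
    return p, q, z

def formula_12(p, z, k, i):
    """Right-hand side of (12), with omega^{a_h(i)} (z/2)^{h(i)} replaced via (10)."""
    h = i.bit_length() - 1                           # h(i) = floor(log2 i)
    return p[i] + (p[i] - p[i // 2]) * sum(z ** j for j in range(1, k - h + 1))

p, q, z = quasi_terminals(4, 0.05)
```

With these values, `q[1]` should agree with \(\sum _{j=0}^{4}z^j\), as in (13).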

Finally, we construct the Steiner points \(s_i, 1\leqslant i\leqslant 2^k-1\). Formally, we let \(s_0=p_0=0\). Once \(s_{\lfloor i/2\rfloor }\) has been constructed, \(s_i\) is the point where the minor arc of \(C_i\) intersects the segment \(s_{\lfloor i/2\rfloor }q_i\). See Fig. 9. This gives the shortest Steiner tree for this tree topology as long as this minor arc intersects \(s_{\lfloor i/2\rfloor }q_i\). This happens iff \(\sphericalangle s_{\lfloor i/2\rfloor }q_ip_i\leqslant \pi /6\) and \(s_{\lfloor i/2\rfloor }\) is outside \(C_i\). For \(i\geqslant 1\), we calculate \(s_i\) by solving \(\left|s_i-c_i\right|=\left|q_i-c_i\right|\), where

$$\begin{aligned} s_i=q_i-\lambda (q_i-s_{\lfloor i/2\rfloor }),\quad 0<\lambda <1. \end{aligned}$$
(16)

If we square \(\left|q_i-\lambda (q_i-s_{\lfloor i/2\rfloor })-c_i\right|=\left|q_i-c_i\right|\) and use conjugates, we can solve for \(\lambda \):

$$\begin{aligned} \lambda = \frac{q_i-c_i}{q_i-s_{\lfloor i/2\rfloor }} + \frac{\overline{q_i}-\overline{c_i}}{\overline{q_i}-\overline{s_{\lfloor i/2\rfloor }}}, \end{aligned}$$

and substitute into (16) to determine \(s_i\):

$$\begin{aligned} s_i&= q_i - \left( \frac{q_i-c_i}{q_i-s_{\lfloor i/2\rfloor }} + \frac{\overline{q_i}-\overline{c_i}}{\overline{q_i}-\overline{s_{\lfloor i/2\rfloor }}}\right) (q_i-s_{\lfloor i/2\rfloor })\nonumber \\&= c_i - \frac{(\overline{q_i}-\overline{c_i})(q_i-s_{\lfloor i/2\rfloor })}{\overline{q_i}-\overline{s_{\lfloor i/2\rfloor }}}. \end{aligned}$$
(17)

In particular, using (13) and (15), \(s_1=\frac{1}{2} + \frac{1}{2} z^k\). It follows by induction (use (17), (14), (12); see “Appendix”) that

$$\begin{aligned} s_i = p_i + \frac{\omega ^{a_{h(i)}(i)}}{2^{h(i)+1}}\left( \sum _{j=0}^{k-h(i)-1} z^j\right) (z^{h(i)+1}-1)\qquad (i=1,\ldots ,2^k-1). \end{aligned}$$
(18)
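The closed form (18) can be compared with the Steiner points obtained by applying (17) level by level. The sketch below is one way to do this numerically; it uses (11) for the quasi-terminals and rewrites the factor \(\omega ^{a_{h(i)}(i)}/2^{h(i)+1}\) in (18) via (10). Function names and sample parameters are illustrative assumptions.

```python
import cmath
import math

def steiner_points(k, eps):
    """Return (p, s, z): tree points, Steiner points from (17), and z = exp(i*eps)."""
    omega = cmath.exp(1j * math.pi / 3)
    z = cmath.exp(1j * eps)
    p = [0j] * 2 ** (k + 1)
    p[1] = 1 + 0j
    for i in range(1, 2 ** k):                  # relations (5) and (6)
        step = 0.5 * (p[i] - p[i // 2])
        p[2 * i], p[2 * i + 1] = p[i] + step * z / omega, p[i] + step * z * omega
    q = list(p)
    for i in range(2 ** k - 1, 0, -1):          # relation (11)
        q[i] = p[i] + 2 * (q[2 * i] - p[i]) + 2 * (q[2 * i + 1] - p[i])
    s = [0j] * 2 ** k                           # s_0 = p_0 = 0
    for i in range(1, 2 ** k):                  # relation (17), parents first
        c = 0.5 * (p[i] + q[i])                 # centre c_i
        d = q[i] - s[i // 2]
        s[i] = c - (q[i] - c).conjugate() * d / d.conjugate()
    return p, s, z

def formula_18(p, z, k, i):
    """Right-hand side of (18)."""
    h = i.bit_length() - 1                      # h(i) = floor(log2 i)
    geometric = sum(z ** j for j in range(k - h))
    # By (10), omega^{a_h(i)} / 2^{h+1} equals (p_i - p_parent) / (2 z^h).
    return p[i] + (p[i] - p[i // 2]) / (2 * z ** h) * geometric * (z ** (h + 1) - 1)

p, s, z = steiner_points(4, 0.05)
```

In particular `s[1]` should agree with \(\frac{1}{2}+\frac{1}{2}z^k\) for \(k=4\).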

Next, we calculate the edge lengths of the Steiner tree.

$$\begin{aligned} s_{2i}-s_i&= p_{2i} + \frac{\omega ^{a_{h(2i)}(2i)}}{2^{h(2i)+1}}\left( \sum _{j=0}^{k-h(2i)-1}z^j\right) (z^{h(2i)+1}-1)\\&\quad - p_i - \frac{\omega ^{a_{h(i)}(i)}}{2^{h(i)+1}}\left( \sum _{j=0}^{k-h(i)-1}z^j\right) (z^{h(i)+1}-1)\quad \text {by (18)}\\&= \frac{\omega ^{a_{h(2i)}(2i)}}{2^{h(2i)}} \Biggl [z^{h(2i)} + \frac{1}{2}\left( \sum _{j=0}^{k-h(2i)-1}z^j\right) (z^{h(2i)+1}-1)\\&\quad -\omega \left( \sum _{j=0}^{k-h(2i)}z^j\right) (z^{h(2i)}-1)\Biggr ] \quad \text {by (9) and (10).} \end{aligned}$$

Similarly,

$$\begin{aligned} s_{2i+1}-s_i&= \frac{\omega ^{a_{h(2i+1)}(2i+1)}}{2^{h(2i+1)}} \Biggl [z^{h(2i+1)} + \frac{1}{2}\left( \sum _{j=0}^{k-h(2i+1)-1}z^j\right) (z^{h(2i+1)+1}-1)\\&\quad -\omega ^{-1}\left( \sum _{j=0}^{k-h(2i+1)}z^j\right) (z^{h(2i+1)}-1)\Biggr ]. \end{aligned}$$

Let \(h\in \left\{ 1,\ldots ,k\right\} \) and define

$$\begin{aligned} p_{k,h}(z) := z^h + \frac{1}{2}\left( \sum _{j=0}^{k-h-1}z^j\right) (z^{h+1}-1)-\omega \left( \sum _{j=0}^{k-h}z^j\right) (z^h-1) \end{aligned}$$

and

$$\begin{aligned} q_{k,h}(z) := z^h + \frac{1}{2}\left( \sum _{j=0}^{k-h-1}z^j\right) (z^{h+1}-1)-\omega ^{-1}\left( \sum _{j=0}^{k-h}z^j\right) (z^h-1). \end{aligned}$$

It follows that

$$\begin{aligned} s_{2i}-s_i=0\text { iff }p_{k,h(2i)}(z) = 0, \end{aligned}$$

and

$$\begin{aligned} s_{2i+1}-s_i=0\text { iff }q_{k,h(2i+1)}(z) = 0. \end{aligned}$$

Since \(p_{k,h}(1)=q_{k,h}(1)=1\), both \(p_{k,h}(z)-1\) and \(q_{k,h}(z)-1\) have \(z-1\) as a factor. In fact,

$$\begin{aligned} \left|p_{k,h}(z)-1\right|&= \biggl |\sum _{j=0}^{h-1}z^j +\frac{1}{2}\sum _{j=0}^{k-h-1}z^j \sum _{j=0}^{h}z^j -\omega \sum _{j=0}^{k-h}z^j \sum _{j=0}^{h-1}z^j \biggr |\cdot \left|z-1\right|\\&\leqslant \left( h+\frac{1}{2}(k-h)(h+1)+(k-h+1)h\right) \left|z-1\right|\\&< k^2\left|z-1\right|, \end{aligned}$$

and similarly, \(\left|q_{k,h}(z)-1\right| < k^2\left|z-1\right|\). It follows that, if \(\left|z-1\right|<1/k^2\), then \(p_{k,h}(z)\ne 0\) and \(q_{k,h}(z)\ne 0\). Therefore, the Melzak construction gives a non-degenerate locally minimum Steiner tree for all \(\varepsilon \in [0,1/k^2[\), since \(\left|z-1\right|\leqslant \varepsilon \).
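The estimate on \(p_{k,h}\) and \(q_{k,h}\) can be illustrated by evaluating the polynomials directly. The sketch below uses the definitions above; the helper names and the sample values \(k=5\), \(\varepsilon =0.9/k^2\) are assumptions made for illustration.

```python
import cmath
import math

def poly_kh(k, h, z, omega_power=1):
    """p_{k,h}(z) for omega_power = 1 and q_{k,h}(z) for omega_power = -1."""
    omega = cmath.exp(1j * math.pi / 3) ** omega_power
    s_short = sum(z ** j for j in range(k - h))        # sum_{j=0}^{k-h-1} z^j
    s_long = sum(z ** j for j in range(k - h + 1))     # sum_{j=0}^{k-h} z^j
    return z ** h + 0.5 * s_short * (z ** (h + 1) - 1) - omega * s_long * (z ** h - 1)

def coefficient_bound(k, h):
    """The factor h + (k-h)(h+1)/2 + (k-h+1)h multiplying |z - 1| in the estimate."""
    return h + 0.5 * (k - h) * (h + 1) + (k - h + 1) * h

k = 5
z = cmath.exp(1j * 0.9 / k ** 2)       # eps = 0.9 / k^2 < 1 / k^2
deviations = {
    (h, sign): abs(poly_kh(k, h, z, sign) - 1)
    for h in range(1, k + 1)
    for sign in (1, -1)
}
```

Each deviation should be at most `coefficient_bound(k, h) * abs(z - 1)`, and in particular below 1, so that neither polynomial vanishes at this z.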

The length of the Steiner tree is

$$\begin{aligned} L(S(T_k))=|p_0q_1|=\left|\sum _{j=0}^k z^j\right|. \end{aligned}$$

The modulus of this sum of complex numbers can be interpreted as the distance between the endpoints of a convex polygonal path consisting of \(k+1\) segments, of unit length, with a turn of \(\varepsilon \) between two adjacent segments. This is easily calculated to be \(\sin [(k+1)\varepsilon /2]/\sin (\varepsilon /2)\). Thus, the ratio between the length of the approximate tree \(T_k\) and the length of the locally minimum Steiner tree \(S(T_k)\) is (recall (4))

$$\begin{aligned} \frac{L(T_k)}{L(S(T_k))} = \frac{(k+1)\sin (\varepsilon /2)}{\sin [(k+1)\varepsilon /2]} \geqslant 1+\frac{k^2+2k}{24}\varepsilon ^2. \end{aligned}$$

Therefore, \(G_2(\varepsilon ,2^k+1) > (k\varepsilon )^2/24\) if \(\varepsilon <1/k^2\), and Theorems 4.2 and 4.3 follow. \(\square \)
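The final ratio and its lower bound can be checked numerically as follows. This is a small sketch using \(L(T_k)=k+1\) from (4) and \(L(S(T_k))=\left|\sum _{j=0}^k z^j\right|\); the function names and the sample values \(k=6\), \(\varepsilon =0.02\) are illustrative assumptions.

```python
import cmath
import math

def length_ratio(k, eps):
    """L(T_k) / L(S(T_k)), with L(T_k) = k + 1 and L(S(T_k)) = |1 + z + ... + z^k|."""
    z = cmath.exp(1j * eps)
    steiner_length = abs(sum(z ** j for j in range(k + 1)))
    return (k + 1) / steiner_length

def closed_form(k, eps):
    """The same ratio written with sines."""
    return (k + 1) * math.sin(eps / 2) / math.sin((k + 1) * eps / 2)

k, eps = 6, 0.02                       # eps < 1/k^2
ratio = length_ratio(k, eps)
quadratic_bound = 1 + (k * k + 2 * k) / 24 * eps ** 2
```

For these values the ratio exceeds 1 by slightly more than \((k^2+2k)\varepsilon ^2/24\), which illustrates that the relative error grows quadratically in both k and \(\varepsilon \).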

9 Conclusions

1. In this paper we considered the planar case of the conjectures of Rubinstein, Wormald and Weng [22]. Although we proved one of their conjectures when \(\varepsilon \) is sufficiently small in terms of the number of terminals (Corollary 4.1), the full conjecture is still open even in the plane, a setting that one would have expected to be simple. It is especially frustrating that for a small constant \(\varepsilon \) (for instance, \(\varepsilon =10^{-3}\)), the best upper bound we have is \(F_2(\varepsilon ,n)=O(n)\) (Proposition 6.1).

2. In the \(\varepsilon \)-approximate Steiner tree constructed in Sect. 8, the edge lengths are halved at each new level of the tree. If we let the edge lengths decay sufficiently fast, then most likely the topology of the \(\varepsilon \)-approximate tree will be the same as the topology of a minimum Steiner tree for \(\varepsilon \) sufficiently small [36]. Thus, the locally minimum tree constructed using the Melzak algorithm as in Sect. 8 will most likely be a minimum Steiner tree on the terminals. This would then give a (minuscule) lower bound for \(\overline{F}_2(\varepsilon ,n)\). However, the calculations are much harder when the ratio at which the edge lengths change is not exactly 1/2, and we have not carried these out. For similar ideas, see the papers [36] and [30].

3. In the proof of Theorems 4.2 and 4.3 (Sect. 8) we showed that the polynomials \(p_{k,h}\) and \(q_{k,h}\) do not have roots at distance smaller than \(1/k^2\) from 1. We suspect that these polynomials actually have roots at distance approximately \(c/k^2\) from 1, for some constant c.

4. It is to be expected that the lower bound in Theorem 4.3 should hold for general n, even if it turns out that \(G_2(\varepsilon ,n)\) is not monotone in n. Most likely the proof can be adapted for values of n other than \(2^k+1\) by modifying the construction in Sect. 8, but we did not look at this in detail.

5. In the definitions of \(F_d, \overline{F}_d\) and \(G_d\) in Sects. 3 and 4, we could have included all \(\varepsilon \)-approximate trees on n points instead of considering only the full ones. However, by decomposing a Steiner tree into full components, it can be shown that the values of \(F_d, d\geqslant 2\), and \(G_d, d\geqslant 3\), will not change (use the inequality \(\frac{a+b}{c+d}\leqslant \max \{\frac{a}{c},\frac{b}{d}\}\) and Propositions 5.1 and 5.2). We do not know whether the values of \(\overline{F}_d\) or \(G_2\) will also be unchanged.