
A homogeneous interior-point algorithm for nonsymmetric convex conic optimization

  • Full Length Paper
  • Series A

Mathematical Programming

Abstract

A homogeneous interior-point algorithm for solving nonsymmetric convex conic optimization problems is presented. Starting each iteration from the vicinity of the central path, the method steps in the approximate tangent direction and then applies a correction phase to locate the next well-centered primal–dual point. Features of the algorithm include that it makes use only of the primal barrier function, that it is able to detect infeasibilities in the problem and that no phase-I method is needed. We prove convergence to \(\epsilon \)-accuracy in \({\mathcal {O}}(\sqrt{\nu } \log {(1/\epsilon )})\) iterations. To improve performance, the algorithm employs a new Runge–Kutta type second order search direction suitable for the general nonsymmetric conic problem. Moreover, quasi-Newton updating is used to reduce the number of factorizations needed, implemented so that data sparsity can still be exploited. Extensive and promising computational results are presented for the \(p\)-cone problem, the facility location problem, entropy maximization problems and geometric programs; all formulated as nonsymmetric convex conic optimization problems.


Notes

  1. A positive semidefinite matrix with all non-negative entries is called doubly non-negative.

  2. See Appendix 1 for a list of properties of this class of functions.

References

  1. Andersen, E.D., Andersen, K.D.: The MOSEK interior point optimizer for linear programming: an implementation of the homogeneous algorithm. In: Frenk, H., Roos, K., Terlaky, T., Zhang, S. (eds.) High Performance Optimization, pp. 197–232. Kluwer, Boston (1999)


  2. Andersen, E.D., Roos, C., Terlaky, T.: On implementing a primal–dual interior-point method for conic quadratic optimization. Math. Program. 95(2), 249–277 (2003)


  3. Andersen, E.D., Ye, Y.: On a homogeneous algorithm for the monotone complementarity problem. Math. Program. 84(2), 375–399 (1999)


  4. Ben-Tal, A., Nemirovski, A.S.: Lectures on Modern Convex Optimization: Analysis, Algorithms and Engineering Applications. SIAM, Philadelphia (2001)


  5. Boyd, S., Kim, S.J., Vandenberghe, L., Hassibi, A.: A tutorial on geometric programming. Optim. Eng. 8, 67–127 (2007)


  6. Butcher, J.C.: Numerical Methods for Ordinary Differential Equations, 2nd edn. Wiley, New York (2008)


  7. Chares, P.R.: Cones and interior-point algorithms for structured convex optimization involving powers and exponentials. PhD thesis, Université catholique de Louvain (2009)

  8. Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming, version 1.21 (2010). http://cvxr.com/cvx

  9. Güler, O.: Barrier functions in interior point methods. Math. Oper. Res. 21, 860–885 (1996)


  10. Karmarkar, N.: A new polynomial-time algorithm for linear programming. Combinatorica 4, 373–395 (1984)


  11. Luo, Z.Q., Sturm, J.F., Zhang, S.: Conic convex programming and self-dual embedding. Optim. Methods Softw. 14, 169–218 (2000)


  12. Mehrotra, S.: On the implementation of a primal–dual interior point method. SIAM J. Optim. 2, 575–601 (1992)


  13. MOSEK optimization software: developed by MOSEK ApS. www.mosek.com

  14. Nesterov, Y.E.: Constructing self-concordant barriers for convex cones. CORE Discussion Paper (2006/30)

  15. Nesterov, Y.E.: Towards nonsymmetric conic optimization. Optim. Methods Softw. 27, 893–917 (2012)


  16. Nesterov, Y.E., Nemirovski, A.S.: Interior-Point Polynomial Algorithms in Convex Programming. SIAM, Philadelphia (1994)


  17. Nesterov, Y.E., Todd, M.J.: Self-scaled barriers and interior-point methods for convex programming. Math. Oper. Res. 22, 1–42 (1997)


  18. Nesterov, Y.E., Todd, M.J.: Primal–dual interior-point methods for self-scaled cones. SIAM J. Optim. 8, 324–364 (1998)


  19. Nesterov, Y.E., Todd, M.J., Ye, Y.: Infeasible-start primal–dual methods and infeasibility detectors for nonlinear programming problems. Math. Program. 84, 227–267 (1999)


  20. Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. Springer, Berlin (2006)


  21. Renegar, J.: A Mathematical View of Interior-Point Methods in Convex Optimization. SIAM, Philadelphia (2001)


  22. Sturm, J.F.: Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw. 12, 625–653 (1999)


  23. Sturm, J.F.: Implementation of interior point methods for mixed semidefinite and second order cone optimization problems. Optim. Methods Softw. 17, 1105–1154 (2002)


  24. Tuncel, L.: Primal–dual symmetry and scale invariance of interior-point algorithms for convex optimization. Math. Oper. Res. 23, 708–718 (1998)


  25. Tuncel, L.: Generalization of primal–dual interior-point methods to convex optimization problems in conic form. Found. Comput. Math. 1, 229–254 (2001)


  26. Xu, X., Hung, P.F., Ye, Y.: A simplified homogeneous and self-dual linear programming algorithm and its implementation. Ann. Oper. Res. 62, 151–171 (1996)


  27. Xue, G., Ye, Y.: An efficient algorithm for minimizing a sum of \(p\)-norms. SIAM J. Optim. 10, 551–579 (1999)



Acknowledgments

The authors thank Erling D. Andersen and Joachim Dahl of MOSEK ApS for many insights and for supplying test problems for the geometric programs and the entropy problems. The authors also thank the reviewers for many helpful comments.

Corresponding author

Correspondence to Anders Skajaa.

Electronic supplementary material

Supplementary material 1 (pdf 324 KB)

Appendices

Appendix 1: Properties of the barrier function

Here we list some properties of logarithmically homogeneous self-concordant barriers (lhscb) that we use in this paper. Many more properties and proofs can be found in [17, 18].

Let \(\mathcal {K}^{\circ }\) denote the interior of \(\mathcal {K}\). We assume that \(F: \mathcal {K}^{\circ } \rightarrow \mathbb {R}\) is a lhscb for \(\mathcal {K}\) with barrier parameter \(\nu \). This means that for all \(x \in \mathcal {K}^{\circ }\) and \(t>0\),

$$\begin{aligned} F(tx) = F(x) - \nu \log {t}. \end{aligned}$$

It follows that the conjugate of \(F\), denoted \(F^*\) and defined for \(s \in (\mathcal {K}^*)^{\circ }\) by

$$\begin{aligned} F^*(s) = \sup _{x \in \mathcal {K}}\{ - s^T x - F(x) \} \end{aligned}$$

is a lhscb for the dual cone \(\mathcal {K}^*\). Similarly to the notation used in [17, 18], we write the local Hessian norms on \(\mathcal {K}\) and \(\mathcal {K}^*\) as:

$$\begin{aligned} \Vert g \Vert _x&= \Vert H_x^{1/2} g \Vert , \quad \text {for }x \in \mathcal {K}^{\circ } \\ \Vert h \Vert _s^*&= \Vert (H_s^*)^{1/2} h \Vert , \quad \text {for }s \in (\mathcal {K}^*)^\circ \\ \Vert h \Vert _x^*&= \Vert H_x^{-1/2} h \Vert , \quad \text {for }x \in \mathcal {K}^\circ , \end{aligned}$$

where \(H_s^* = \nabla ^2 F^*(s)\). Notice the different definitions of \(\Vert \cdot \Vert _y^*\) depending on whether \(y\) is in \(\mathcal {K}\) or \(\mathcal {K}^*\). Using this convention and that \(-g_x \in (\mathcal {K}^*)^\circ \) and \(H_{-g_x}^* = H_x^{-1}\), we see that

$$\begin{aligned} \Vert s \Vert ^*_{-g_x} = \Vert (H_{-g_x}^*)^{1/2} s \Vert = \Vert H_x^{-1/2} s \Vert = \Vert s \Vert _x^*. \end{aligned}$$
(18)
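Identity (18) can be checked concretely for the logarithmic barrier \(F(x) = -\sum _i \log x_i\) on the nonnegative orthant, a lhscb with \(\nu = n\) whose conjugate is \(F^*(s) = -\sum _i \log s_i - n\); here \(g_x = (-1/x_i)_i\), \(H_x = {{\mathrm{diag}}}(x_i^{-2})\) and \(H_s^* = {{\mathrm{diag}}}(s_i^{-2})\). The following numerical sketch is illustrative only and not part of the argument:

```python
import math
import random

random.seed(0)
n = 5
x = [random.uniform(0.5, 2.0) for _ in range(n)]
s = [random.uniform(0.5, 2.0) for _ in range(n)]

# For F(x) = -sum(log x_i): g_x = (-1/x_i)_i, H_x = diag(1/x_i^2).
minus_g = [1.0 / xi for xi in x]
# Conjugate barrier F*(s) = -sum(log s_i) - n has H*_s = diag(1/s_i^2),
# so H*_{-g_x} = diag(x_i^2), which equals H_x^{-1}.
H_star_at_minus_g = [1.0 / v**2 for v in minus_g]
H_inv = [xi**2 for xi in x]
assert all(abs(a - b) < 1e-12 for a, b in zip(H_star_at_minus_g, H_inv))

# (18): ||s||^*_{-g_x} = ||(H*_{-g_x})^{1/2} s|| equals ||H_x^{-1/2} s|| = ||s||_x^*.
lhs = math.sqrt(sum(h * si**2 for h, si in zip(H_star_at_minus_g, s)))
rhs = math.sqrt(sum((xi * si) ** 2 for xi, si in zip(x, s)))
assert abs(lhs - rhs) < 1e-9
```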

For \(x \in \mathcal {K}^\circ \), \(F\) satisfies

$$\begin{aligned} H_x x&= -g_x \end{aligned}$$
(19)
$$\begin{aligned} x^T g_x&= -\nu \end{aligned}$$
(20)
$$\begin{aligned} \Vert x \Vert _x^2&= \nu . \end{aligned}$$
(21)
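For the same logarithmic barrier (with \(\nu = n\)), the logarithmic homogeneity and the identities (19)–(21) are easy to verify numerically. The sketch below is purely illustrative:

```python
import math
import random

random.seed(1)
n = 6
x = [random.uniform(0.1, 3.0) for _ in range(n)]
nu = n  # barrier parameter of F(x) = -sum(log x_i)

F = lambda v: -sum(math.log(vi) for vi in v)
g = [-1.0 / xi for xi in x]    # gradient of F
H = [1.0 / xi**2 for xi in x]  # diagonal of the Hessian of F

# Logarithmic homogeneity: F(t x) = F(x) - nu * log(t).
t = 2.7
assert abs(F([t * xi for xi in x]) - (F(x) - nu * math.log(t))) < 1e-10

# (19): H_x x = -g_x
assert all(abs(Hi * xi + gi) < 1e-12 for Hi, xi, gi in zip(H, x, g))
# (20): x^T g_x = -nu
assert abs(sum(xi * gi for xi, gi in zip(x, g)) + nu) < 1e-12
# (21): ||x||_x^2 = x^T H_x x = nu
assert abs(sum(Hi * xi * xi for Hi, xi in zip(H, x)) - nu) < 1e-12
```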

The Dikin ellipsoids are feasible [4]. That is:

$$\begin{aligned} x \in \mathcal {K}^\circ&\Rightarrow W(x) = \{ u : \Vert u - x \Vert _x \le 1 \} \subseteq \mathcal {K} \end{aligned}$$
(22)
$$\begin{aligned} s \in (\mathcal {K}^*)^\circ&\Rightarrow W^*(s) = \{h : \Vert h - s \Vert _s^* \le 1 \} \subseteq \mathcal {K}^*. \end{aligned}$$
(23)
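For the logarithmic barrier the local norm is \(\Vert v \Vert _x = \Vert (v_i/x_i)_i \Vert \), so (22) says that the unit Dikin ellipsoid around \(x > 0\) stays in the nonnegative orthant. A quick randomized check (illustrative only):

```python
import math
import random

random.seed(2)
n = 4
x = [random.uniform(0.2, 2.0) for _ in range(n)]

for _ in range(1000):
    # Sample v with local norm ||v||_x = ||(v_i/x_i)_i|| = r < 1.
    d = [random.gauss(0, 1) for _ in range(n)]
    norm = math.sqrt(sum(di**2 for di in d))
    r = random.random()  # radius in [0, 1)
    v = [r * di / norm * xi for di, xi in zip(d, x)]
    u = [xi + vi for xi, vi in zip(x, v)]
    # (22): every point of the unit Dikin ellipsoid lies in K = R^n_+.
    assert all(ui >= 0 for ui in u)
```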

Appendix 2: The homogeneous and self-dual model

1.1 Optimality and infeasibility certificate

Let \(G\) be defined by (5) and notice that \(G\) is skew-symmetric: \(G = -G^T\).

  1. 1.

    Observe that we can write (hsd) as \(G(y; x; \tau ) - (0; s;\kappa ) = 0\). Pre-multiplying this equation by \((y; x; \tau )^T\) gives \(x^T s + \tau \kappa = 0\).

  2. 2.

    \(\tau > 0\) implies \(\kappa = 0\) and hence \(b^T (y/\tau ) - c^T (x/\tau ) = 0\) and therefore \(x^T s = 0\). Dividing the first two linear feasibility equations of (hsd) by \(\tau \), we obtain the linear feasibility equations of (1). Thus \((x,y,s)/\tau \) is optimal for (pd).

  3. 3.

    If \(\kappa > 0\) then \(\tau = 0\) so \(Ax=0\) and \(A^T y + s = 0\). Further \(c^T x - b^T y = -\kappa < 0\), so \(c^T x\) and \(-b^T y\) cannot both be non-negative. Assume \(-b^T y < 0\). If (pd) is primal-feasible then there exists \(\bar{x} \in \mathcal {K}\) such that \(A \bar{x} = b\). But then \(0 > -b^T y = -\bar{x}^T A^T y = \bar{x}^T s \ge 0\), a contradiction. We can argue similarly if \(c^T x < 0\),

and this completes the proof of Lemma 1.
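The only structural fact used in step 1 above is that \(v^T G v = 0\) for every \(v\) whenever \(G = -G^T\). The sketch below illustrates this numerically with random data \(A, b, c\) standing in for the problem data, using the standard skew-symmetric block pattern of a homogeneous embedding (the precise \(G\) is given by (5) in the paper):

```python
import random

random.seed(3)
m, n = 3, 5
A = [[random.gauss(0, 1) for _ in range(n)] for _ in range(m)]
b = [random.gauss(0, 1) for _ in range(m)]
c = [random.gauss(0, 1) for _ in range(n)]

# Skew-symmetric block matrix G = [[0, A, -b], [-A^T, 0, c], [b^T, -c^T, 0]].
N = m + n + 1
G = [[0.0] * N for _ in range(N)]
for i in range(m):
    for j in range(n):
        G[i][m + j] = A[i][j]
        G[m + j][i] = -A[i][j]
    G[i][m + n] = -b[i]
    G[m + n][i] = b[i]
for j in range(n):
    G[m + j][m + n] = c[j]
    G[m + n][m + j] = -c[j]

assert all(abs(G[i][j] + G[j][i]) < 1e-12 for i in range(N) for j in range(N))

v = [random.gauss(0, 1) for _ in range(N)]  # v = (y; x; tau)
quad = sum(v[i] * G[i][j] * v[j] for i in range(N) for j in range(N))
# v^T G v = 0, which is what forces x^T s + tau*kappa = 0 at (hsd)-feasible points.
assert abs(quad) < 1e-9
```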

1.2 Self-duality

The dual problem of (hsd) is

$$\begin{aligned} \max _{\hat{y}_1,\hat{y}_2,\hat{y}_3,\hat{s}}\qquad&0 \nonumber \\ \text {s.t.}\qquad&\left( \begin{array}{c@{\quad }c@{\quad }c} A^T &{} 0 &{} -c \\ 0 &{} c^T &{} -b^T \\ 0 &{} -I &{} 0 \\ -1 &{} 0 &{} 0 \end{array} \right) \left( \begin{array}{c} {\hat{y}_1} \\ {\hat{y}_2} \\ {\hat{y}_3} \end{array} \right) + \left( \begin{array}{c} {\hat{s}_1} \\ {\hat{s}_2} \\ {\hat{s}_3} \\ {\hat{s}_4} \end{array} \right) = 0 \end{aligned}$$
(24)
$$\begin{aligned}&-A \hat{y}_2 + b \hat{y}_3 = 0 \end{aligned}$$
(25)
$$\begin{aligned}&\hat{s} \in (\mathcal {K}\times \mathbb {R}_+ \times \mathcal {K}^*\times \mathbb {R}_+)^*, \quad \hat{y} \text { free}. \end{aligned}$$
(26)

After a few eliminations, we see that (24)–(26) are equivalent to

$$\begin{aligned}&A\hat{s}_3 - b\hat{s}_4 = 0 \\&-A^T \hat{y}_1 + c\hat{s}_4 - \hat{s}_1 = 0 \\&b^T \hat{y}_1 - c^T \hat{s}_3 - \hat{s}_2 = 0 \\& (\hat{s}_3,\hat{s}_4) \in \mathcal {K}\times \mathbb {R}_+,\;\; (\hat{s}_1,\hat{s}_2) \in \mathcal {K}^*\times \mathbb {R}_+,\;\; \hat{y}_1 \in \mathbb {R}^m. \nonumber \end{aligned}$$
(27)

Through the following identification of variables

$$\begin{aligned} \hat{s}_1&\sim s, \quad \hat{s}_2 \sim \kappa , \quad \hat{s}_3 \sim x, \quad \hat{s}_4 \sim \tau , \quad \hat{y}_1 \sim y, \end{aligned}$$

it is clear that the constraints (27) are equivalent to those of the problem (hsd). Since the objective function in both problems is constant zero, the two problems are identical and this proves Lemma 2.

Appendix 3: Prediction

The direction \(d_{z}\) is defined by

$$\begin{aligned} G(d_{y}; d_{{\bar{x}}}) - (0; d_{{\bar{s}}})&= -\left( G(y; {\bar{x}}) - (0; {\bar{s}}) \right) \end{aligned}$$
(28)
$$\begin{aligned} d_{{\bar{s}}} + \mu H_{\bar{x}} d_{{\bar{x}}}&= -{\bar{s}} \end{aligned}$$
(29)

1.1 Reduction of residuals

We first show:

$$\begin{aligned} \text {1.}\&{\bar{s}}^T d_{{\bar{x}}} + {\bar{x}}^T d_{{\bar{s}}} + {\bar{x}}^T {\bar{s}} = \psi (z)^T d_{{\bar{x}}} \end{aligned}$$
(30)
$$\begin{aligned} \text {2.}\&({\bar{x}}+d_{{\bar{x}}})^T ({\bar{s}}+d_{{\bar{s}}}) = 0 \end{aligned}$$
(31)
$$\begin{aligned} \text {3.}\&d_{{\bar{x}}}^T d_{{\bar{s}}} = -\psi (z)^T d_{{\bar{x}}}. \end{aligned}$$
(32)
  1. 1.

    We get \({\bar{s}}^T d_{{\bar{x}}} + {\bar{x}}^T d_{{\bar{s}}} + {\bar{x}}^T {\bar{s}} \mathop {=}\limits ^{(29)} {\bar{s}}^T d_{{\bar{x}}} + {\bar{x}}^T(-{\bar{s}}-\mu H_{\bar{x}} d_{{\bar{x}}}) + {\bar{x}}^T {\bar{s}}\), which, after reduction, gives \(d_{{\bar{x}}}^T ({\bar{s}} - \mu H_{\bar{x}} {\bar{x}}) = \psi (z)^T d_{{\bar{x}}}\).

  2. 2.

    Equation (28) is equivalent to \(G(y+d_{y}; {\bar{x}}+d_{{\bar{x}}}) - (0; {\bar{s}}+d_{{\bar{s}}}) = 0\). Pre-multiplying this equation by \((y+d_{y}; {\bar{x}}+d_{{\bar{x}}})^T\) gives (31).

  3. 3.

    Follows from expanding (31) and using (30).

Now the lemma follows readily: We simply note that the first equation follows directly from elementary linear algebra. To show the second:

$$\begin{aligned} {\bar{\nu }}\mu (z^+)&= ({\bar{x}}+\alpha d_{{\bar{x}}})^T ({\bar{s}}+\alpha d_{{\bar{s}}}) \\&= {\bar{x}}^T {\bar{s}} + \alpha ({\bar{s}}^T d_{{\bar{x}}} + {\bar{x}}^T d_{{\bar{s}}}) + \alpha ^2 d_{{\bar{x}}}^Td_{{\bar{s}}} \\&\mathop {=}\limits ^{(30){-}(32)} {\bar{x}}^T {\bar{s}} + \alpha (-{\bar{x}}^T {\bar{s}} + \psi (z)^T d_{{\bar{x}}}) + \alpha ^2(-\psi (z)^T d_{{\bar{x}}}) \\&= (1-\alpha ){\bar{x}}^T {\bar{s}} + \alpha (1-\alpha ) \psi (z)^T d_{{\bar{x}}} \end{aligned}$$

which after division by \({\bar{\nu }}\) proves Lemma 3.
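Note that identity (30) relies only on the centering equation (29) and the definition \(\psi (z) = {\bar{s}} - \mu H_{\bar{x}} {\bar{x}}\); it holds for an arbitrary \(d_{{\bar{x}}}\). This can be checked numerically with a diagonal stand-in Hessian (illustrative only, not the algorithm itself):

```python
import random

random.seed(4)
n = 5
x = [random.uniform(0.5, 2.0) for _ in range(n)]
s = [random.uniform(0.5, 2.0) for _ in range(n)]
H = [1.0 / xi**2 for xi in x]  # log-barrier Hessian diagonal, as a stand-in
mu = 0.7
dx = [random.gauss(0, 1) for _ in range(n)]  # arbitrary direction

# (29): d_s = -s - mu * H * d_x
ds = [-si - mu * Hi * di for si, Hi, di in zip(s, H, dx)]
# psi(z) = s - mu * H * x
psi = [si - mu * Hi * xi for si, Hi, xi in zip(s, H, x)]

dot = lambda u, v: sum(a * b for a, b in zip(u, v))
# (30): s^T d_x + x^T d_s + x^T s = psi(z)^T d_x
lhs = dot(s, dx) + dot(x, ds) + dot(x, s)
assert abs(lhs - dot(psi, dx)) < 1e-9
```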

1.2 Bounds on \({\bar{s}}\), \(d_{{\bar{s}}}\) and \(d_{{\bar{x}}}\)

Assume \(\Vert \psi \Vert _{\bar{x}}^* \le \eta \mu \). By definition, \(\psi = {\bar{s}} - \mu H_{\bar{x}} {\bar{x}}\), which after left-multiplication by \(H_{\bar{x}}^{-1/2}\), taking norms and squaring both sides gives

$$\begin{aligned} (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2&= (\Vert \psi \Vert _{\bar{x}}^*)^2 + \mu ^2 \Vert {\bar{x}} \Vert _{\bar{x}}^2 + 2\mu {\bar{x}}^T \psi \nonumber \\&= (\Vert \psi \Vert _{\bar{x}}^*)^2 + \mu ^2 {\bar{\nu }} \le \mu ^2 ({\bar{\nu }} + \eta ^2) \nonumber \\ \Vert {\bar{s}} \Vert _{\bar{x}}^*&\le \mu \sqrt{\eta ^2 + {\bar{\nu }}} \end{aligned}$$
(33)

where we used (21) and \({\bar{x}}^T \psi = 0\).
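Inequality (33) uses only \(\Vert \psi \Vert _{\bar{x}}^* \le \eta \mu \), \(\Vert {\bar{x}} \Vert _{\bar{x}}^2 = {\bar{\nu }}\) and \({\bar{x}}^T \psi = 0\). The sketch below constructs \({\bar{s}} = -\mu g_{\bar{x}} + \psi \) with these properties for the log-barrier stand-in and checks the bound numerically (illustrative only):

```python
import math
import random

random.seed(5)
n = 6
eta, mu = 1.0 / 6.0, 0.8
x = [random.uniform(0.5, 2.0) for _ in range(n)]

# Build psi with x^T psi = 0, scaled so ||psi||_x^* = ||diag(x) psi|| = 0.9*eta*mu.
p = [random.gauss(0, 1) for _ in range(n)]
proj = sum(xi * pi for xi, pi in zip(x, p)) / sum(xi * xi for xi in x)
psi = [pi - proj * xi for pi, xi in zip(p, x)]
scale = 0.9 * eta * mu / math.sqrt(sum((xi * pi) ** 2 for xi, pi in zip(x, psi)))
psi = [scale * pi for pi in psi]

# s = -mu * g_x + psi = mu/x + psi  (log barrier: g_x = -1/x)
s = [mu / xi + pi for xi, pi in zip(x, psi)]

s_norm = math.sqrt(sum((xi * si) ** 2 for xi, si in zip(x, s)))  # ||s||_x^*
# (33): ||s||_x^* <= mu * sqrt(eta^2 + nu), here nu = n
assert s_norm <= mu * math.sqrt(eta**2 + n) + 1e-12
```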

This bound allows us to obtain bounds on \(d_{{\bar{x}}}\) and \(d_{{\bar{s}}}\): Left-multiplying (29) by \(H_{\bar{x}}^{-1/2}\), taking norms and squaring both sides gives

$$\begin{aligned} (\Vert d_{{\bar{s}}} \Vert _{\bar{x}}^*)^2 + \mu ^2 \Vert d_{{\bar{x}}} \Vert _{\bar{x}}^2&= (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 - 2 \mu d_{{\bar{x}}}^T d_{{\bar{s}}} \mathop {=}\limits ^{(32)} (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 + 2 \mu d_{{\bar{x}}}^T \psi \\&\le (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 + 2 \mu \Vert d_{{\bar{x}}} \Vert _{\bar{x}} \Vert \psi \Vert _{\bar{x}}^* \end{aligned}$$

by the Cauchy–Schwarz inequality. Therefore: \( \mu ^2 \Vert d_{{\bar{x}}} \Vert _{\bar{x}}^2 \le (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 + 2 \mu \Vert d_{{\bar{x}}} \Vert _{\bar{x}} \Vert \psi \Vert _{\bar{x}}^* \). Now subtracting \(2 \mu \Vert d_{{\bar{x}}} \Vert _{\bar{x}} \Vert \psi \Vert _{\bar{x}}^*\) and adding \((\Vert \psi \Vert _{\bar{x}}^*)^2\) to both sides, we get

$$\begin{aligned} \left( \mu \Vert d_{{\bar{x}}} \Vert _{\bar{x}} - \Vert \psi \Vert _{\bar{x}}^* \right) ^2 \le (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 + ( \Vert \psi \Vert _{\bar{x}}^* )^2 \end{aligned}$$

or

$$\begin{aligned} \Vert d_{{\bar{x}}} \Vert _{\bar{x}}&\le \mu ^{-1} \left( \Vert \psi \Vert _{\bar{x}}^* + \sqrt{(\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 + ( \Vert \psi \Vert _{\bar{x}}^* )^2} \right) \nonumber \\&\le \mu ^{-1}(\eta \mu + \sqrt{\mu ^2(\eta ^2 + {\bar{\nu }}) + \eta ^2 \mu ^2}) = \eta + \sqrt{2\eta ^2 + {\bar{\nu }}} =: k_{\bar{x}}. \end{aligned}$$
(34)

For \(d_{{\bar{s}}}\), we similarly have

$$\begin{aligned} (\Vert d_{{\bar{s}}} \Vert _{\bar{x}}^*)^2&\le (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 + 2 \mu \Vert d_{{\bar{x}}} \Vert _{\bar{x}} \Vert d_{{\bar{s}}} \Vert _{\bar{x}}^* \nonumber \\ (\Vert d_{{\bar{s}}} \Vert _{\bar{x}}^* - \mu \Vert d_{{\bar{x}}} \Vert _{\bar{x}} )^2&\le (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 + \mu ^2 \Vert d_{{\bar{x}}} \Vert _{\bar{x}}^2 \nonumber \\ \Vert d_{{\bar{s}}} \Vert _{\bar{x}}^*&\le k_{\bar{x}} \mu + \sqrt{\mu ^2(\eta ^2 + {\bar{\nu }}) + k_{\bar{x}}^2 \mu ^2} = k_{\bar{s}} \mu \end{aligned}$$
(35)

where \(k_{\bar{s}} := k_{\bar{x}} + \sqrt{(\eta ^2 + {\bar{\nu }}) + k_{\bar{x}}^2}\).

1.3 Feasibility of \(z^+\)

Define \(\alpha _1 := k_{\bar{x}}^{-1} = \Omega (1/\sqrt{\bar{\nu }})\). Then for any \(\alpha \le \alpha _1\), we have

$$\begin{aligned} \Vert {\bar{x}} - ({\bar{x}}+\alpha d_{{\bar{x}}}) \Vert _{\bar{x}}&= \alpha \Vert d_{{\bar{x}}} \Vert _{\bar{x}} \mathop {\le }\limits ^{(34)} \alpha k_{\bar{x}} \le 1 \end{aligned}$$

and so from (22), we conclude \({\bar{x}}+\alpha d_{{\bar{x}}} = {\bar{x}}^+ \in \bar{\mathcal {K}}\).

Now, define \(\alpha _2 := (1-\eta )k_{\bar{s}}^{-1} = \Omega (1/\sqrt{\bar{\nu }})\). Then for \(\alpha \le \alpha _2\), we have

$$\begin{aligned} \mu ^{-1} \Vert {\bar{s}}^+ + \mu g_{\bar{x}} \Vert _{-g_{\bar{x}}}^*&= \mu ^{-1}\Vert {\bar{s}} + \alpha d_{{\bar{s}}} + \mu g_{\bar{x}} \Vert _{-g_{\bar{x}}}^* =\mu ^{-1}\Vert \psi + \alpha d_{{\bar{s}}} \Vert _{-g_{\bar{x}}}^* \\&\mathop {\le }\limits ^{(18)} \mu ^{-1}\Vert \psi \Vert _{\bar{x}}^* + \mu ^{-1}\alpha \Vert d_{{\bar{s}}} \Vert _{{\bar{x}}}^* \mathop {\le }\limits ^{(35)} \eta + \alpha k_{\bar{s}} \le 1. \end{aligned}$$

Since \(-g_{\bar{x}} \in \bar{\mathcal {K}}^*\), we have by (23) that \(\mu ^{-1}{\bar{s}}^+ \in \bar{\mathcal {K}}^*\) and therefore \({\bar{s}}^+ \in \bar{\mathcal {K}}^*\). Therefore, Lemma 4 holds with \(\alpha = \min \{\alpha _1,\alpha _2\} = \Omega (1/\sqrt{\bar{\nu }}) = \Omega (1/\sqrt{\nu })\).

1.4 Bound on \(\psi ^+\)

First recall the definition (6): \(\psi ({\bar{x}},{\bar{s}},t) = {\bar{s}} + t g_{\bar{x}}\). Now consider for a fixed \(v_0\) the function

$$\begin{aligned} \Phi _t({\bar{x}}) = {\bar{x}}^T v_0 + t F\left( {\bar{x}} \right) \end{aligned}$$

which is self-concordant with respect to \({\bar{x}}\). Define its Newton step by \( n_t({\bar{x}}) := -\nabla ^2 \Phi _t({\bar{x}})^{-1} \nabla \Phi _t({\bar{x}}) \). Define also \(q{}=\Vert n_{t_2}({\bar{x}}) \Vert _{\bar{x}}\). From the general theory of self-concordant functions, the following inequality holds: if \(q{} < 1\), then

$$\begin{aligned} \Vert n_{t_2}({\bar{x}}_2) \Vert _{{\bar{x}}_2}&\le \left( \frac{q{}}{1-q{}}\right) ^2. \end{aligned}$$
(36)

For a proof of this relation, see e.g. Theorem 2.2.4 in [21]. With \(v_0 = {\bar{s}}^+, t_2=\mu ^+\) and \({\bar{x}}_2 = {\bar{x}}^+\), the inequality (36) is

$$\begin{aligned} \Vert \psi ^+ \Vert _{{\bar{x}}^+}^*&\le \mu ^+ \left( \frac{q{}}{1-q{}}\right) ^2. \end{aligned}$$
(37)

where \(\mu ^+ q{} = \Vert H_{\bar{x}}^{-1}({\bar{s}}^+ + \mu ^+ g_{\bar{x}}) \Vert _{\bar{x}} = \Vert {\bar{s}}^+ + \mu ^+ g_{\bar{x}} \Vert _{\bar{x}}^*\). From Lemma 3 and (34):

$$\begin{aligned} | \mu - \mu ^+ |&= | -\alpha \mu + \alpha (1-\alpha ) {\bar{\nu }}^{-1} \psi ^T d_{{\bar{x}}} | \nonumber \\&\le \mu \alpha \left( 1 + (1-\alpha )\eta k_{\bar{x}} {\bar{\nu }}^{-1} \right) . \end{aligned}$$
(38)

By the assumption \(\Vert \psi \Vert _{\bar{x}}^* \le \eta \mu \) combined with (34), we have \(\psi ^T d_{{\bar{x}}} \ge -\eta k_{\bar{x}} \mu \). Therefore

$$\begin{aligned} \mu ^+&= (1-\alpha )\mu + \alpha (1-\alpha ){\bar{\nu }}^{-1}\psi ^T d_{{\bar{x}}} \nonumber \\&\ge \mu (1-\alpha )( 1-\alpha \eta k_{\bar{x}} {\bar{\nu }}^{-1} ) \nonumber \\ \mu /\mu ^+&\le (1-\alpha )^{-1}( 1-\alpha \eta k_{\bar{x}} {\bar{\nu }}^{-1} )^{-1} \end{aligned}$$
(39)

Let us now obtain a bound on \(q{}\).

$$\begin{aligned} \mu ^+ q{}&= \Vert {\bar{s}}^+ + \mu ^+ g_{\bar{x}} \Vert _{\bar{x}}^* = \Vert \psi - (\mu - \mu ^+) g_{\bar{x}} + \alpha d_{{\bar{s}}} \Vert _{\bar{x}}^* \\&\le \Vert \psi \Vert _{\bar{x}}^* + |\mu - \mu ^+| \Vert g_{\bar{x}} \Vert _{\bar{x}}^* + \alpha \Vert d_{{\bar{s}}} \Vert _{\bar{x}}^* \nonumber \\&\le \eta \mu + \mu \alpha \left( 1 + (1-\alpha )\eta k_{\bar{x}} {\bar{\nu }}^{-1} \right) \sqrt{{\bar{\nu }}} + \alpha k_{\bar{s}} \mu \nonumber \\&\le \mu \left( \eta + \alpha k_{\bar{s}} + \alpha (1 + (1-\alpha ){\bar{\nu }}^{-1}\eta k_{\bar{x}})\sqrt{{\bar{\nu }}} \right) \nonumber \\ q{}&\le (\mu /\mu ^+)( \eta + \alpha (\sqrt{{\bar{\nu }}} + k_{\bar{s}} + \eta k_{\bar{x}}) ) \nonumber \\&\le (1-\alpha )^{-1}( 1-\alpha \eta k_{\bar{x}} {\bar{\nu }}^{-1} )^{-1} ( \eta + \alpha (\sqrt{{\bar{\nu }}} + k_{\bar{s}} + \eta k_{\bar{x}}) ) \nonumber \end{aligned}$$
(40)

where we used (35), (38), (39) and the assumption \(\Vert \psi \Vert _{\bar{x}}^* \le \eta \mu \). Now the reader can verify that for \(\eta \le 1/6\) and \({\bar{\nu }}\ge 2\), we have the implication

$$\begin{aligned} \alpha \le \alpha _3 := \frac{1}{11 \sqrt{{\bar{\nu }}}} = \Omega (1/\sqrt{\bar{\nu }}) \Rightarrow q{}^2/(1-q{})^2 \le 2\eta \le 1/3 \end{aligned}$$
(41)

which also implies \(q< 1\). Now by (37), we see that (41) implies \( \Vert \psi ^+ \Vert _{{\bar{x}}^+}^* \le 2\eta \mu ^+\) and hence \(z^+ \in \mathcal {N}(2\eta )\) which finishes the proof of Lemma 5.

Appendix 4: Correction phase

Assume \(\Vert \psi ({\bar{x}},{\bar{s}},\mu ) \Vert _{\bar{x}}^* \le \beta \mu \) where \(\mu := \mu (z)\) with \(z = ({\bar{x}},y,{\bar{s}})\). The equations defining the correction step \((\delta _{{\bar{x}}},\delta _{y},\delta _{{\bar{s}}})\) are

$$\begin{aligned} G(\delta _{y}; \delta _{{\bar{x}}})-(0; \delta _{{\bar{s}}})&= 0 \end{aligned}$$
(42)
$$\begin{aligned} \delta _{{\bar{s}}} + \mu H_{{\bar{x}}} \delta _{{\bar{x}}}&= -\psi ({\bar{x}},{\bar{s}},\mu ) \end{aligned}$$
(43)

and the next point is then \(({\bar{x}}^+,y^+,{\bar{s}}^+) := ({\bar{x}},y,{\bar{s}}) + \hat{\alpha }(\delta _{{\bar{x}}},\delta _{y},\delta _{{\bar{s}}})\). Left-multiplying (42) by \((\delta _{y},\delta _{{\bar{x}}})^T\), we get \(\delta _{{\bar{x}}}^T \delta _{{\bar{s}}} = 0\). From (43), we then have

$$\begin{aligned} (\Vert \delta _{{\bar{s}}} \Vert _{\bar{x}}^*)^2,\; \mu ^2\Vert \delta _{{\bar{x}}} \Vert _{\bar{x}}^2 \le (\Vert \delta _{{\bar{s}}} \Vert _{\bar{x}}^*)^2 + \mu ^2 \Vert \delta _{{\bar{x}}} \Vert _{\bar{x}}^2 = (\Vert \psi ({\bar{x}},{\bar{s}},\mu ) \Vert _{\bar{x}}^*)^2 \le \beta ^2 \mu ^2 \end{aligned}$$

and therefore

$$\begin{aligned} \Vert \delta _{{\bar{x}}} \Vert _{\bar{x}}&\le \beta , \;\; \Vert \delta _{{\bar{s}}} \Vert _{\bar{x}}^* \le \beta \mu . \end{aligned}$$
(44)

From (43), we also have

$$\begin{aligned} \Vert \psi ({\bar{x}},{\bar{s}},\mu ) + \hat{\alpha } \delta _{{\bar{s}}} \Vert _{\bar{x}}^*&= \Vert (1-\hat{\alpha }) \psi ({\bar{x}},{\bar{s}},\mu ) + \hat{\alpha } \mu H_{\bar{x}} \delta _{{\bar{x}}} \Vert _{\bar{x}}^* \nonumber \\&\le (1-\hat{\alpha }) \Vert \psi ({\bar{x}},{\bar{s}},\mu ) \Vert _{\bar{x}}^* + \hat{\alpha } \mu \Vert \delta _{{\bar{x}}} \Vert _{\bar{x}} \nonumber \\&\le (1-\hat{\alpha }) \beta \mu + \hat{\alpha } \mu \beta = \beta \mu \end{aligned}$$
(45)

where we used (44). Now define \(q{}=(\mu ^+)^{-1}\Vert {\bar{s}}^+ + \mu ^+ g_{{\bar{x}}} \Vert _{\bar{x}}^*\). Then estimating similarly to (40), we get

$$\begin{aligned} \mu ^+ q{}&\le \Vert \psi ({\bar{x}},{\bar{s}},\mu ) + (\mu ^+ - \mu ) g_{\bar{x}} + \hat{\alpha } \delta _{{\bar{s}}} \Vert _{\bar{x}}^* \\&\le \beta \mu (1 + \hat{\alpha } (\beta {\bar{\nu }}^{-1/2} + 1) ) \end{aligned}$$

and similarly to the computation in (39), we therefore find

$$\begin{aligned} \mu /\mu ^+ \le (1-\hat{\alpha } {\bar{\nu }}^{-1} \beta ^2)^{-1} \end{aligned}$$

so that altogether

$$\begin{aligned} q{}&\le \beta (1-\hat{\alpha } {\bar{\nu }}^{-1} \beta ^2)^{-1} (1 + \hat{\alpha } (\beta {\bar{\nu }}^{-1/2} + 1) ). \end{aligned}$$
(46)

Now we can apply inequality (36) with \(v_0 = {\bar{s}}^+\), \(t_2=\mu ^+\) and \({\bar{x}}_2 = {\bar{x}}^+\):

$$\begin{aligned} \Vert \psi ({\bar{x}}^+,{\bar{s}}^+,\mu ^+) \Vert _{{\bar{x}}^+}^* \le \mu ^+ \left( \frac{q{}}{1-q{}} \right) ^2 \end{aligned}$$
(47)

The reader can verify that for \(\hat{\alpha } \le 1/84\), \({\bar{\nu }} \ge 2\) and \(\beta \le 2\eta \le 1/3\), the bound (46) implies that applying (47) twice in succession yields

$$\begin{aligned} \Vert \psi ({\bar{x}}^+,{\bar{s}}^+,\mu ^+) \Vert _{{\bar{x}}^+}^* \le \frac{1}{2} \beta \le \eta \end{aligned}$$

and therefore \(z^+ \in \mathcal {N}(\eta )\) which proves Lemma 6.

Appendix 5: Algorithm complexity

From Lemma 3, we have that the linear residuals \(G(y; {\bar{x}})-(0; {\bar{s}})\) are reduced by a factor \((1-\alpha )\) in each iteration. Since we can always take \(\alpha = \Omega (1/\sqrt{\bar{\nu }})\), we see that \(G(y; {\bar{x}})-(0; {\bar{s}})\) decreases geometrically with a rate of \((1-\Omega {(1/\sqrt{\bar{\nu }})})\) which implies that

$$\begin{aligned} \Vert G(y; {\bar{x}})-(0; {\bar{s}}) \Vert \le \epsilon \Vert G(y^0; {\bar{x}}^0)-(0; {\bar{s}}^0) \Vert \end{aligned}$$

in \(\mathcal {O}{(\sqrt{{\bar{\nu }}}\log {(1/\epsilon )})} = \mathcal {O}{(\sqrt{{\nu }}\log {(1/\epsilon )})}\) iterations.
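The counting argument is elementary: a quantity multiplied by \((1 - c/\sqrt{{\bar{\nu }}})\) per iteration drops below \(\epsilon \) times its initial value after at most \(c^{-1}\sqrt{{\bar{\nu }}}\log {(1/\epsilon )}\) iterations, since \(-\log (1-\delta ) \ge \delta \). A sketch with a hypothetical constant \(c = 1\) (chosen only for illustration):

```python
import math

nu_bar, eps, c = 100.0, 1e-8, 1.0  # c is a hypothetical Omega(.) constant
rate = 1.0 - c / math.sqrt(nu_bar)

# Simulate the geometric reduction of the (relative) residual.
r, iters = 1.0, 0
while r > eps:
    r *= rate
    iters += 1

# O(sqrt(nu) log(1/eps)) bound: -log(1 - d) >= d gives at most
# sqrt(nu)/c * log(1/eps) iterations.
bound = math.ceil(math.sqrt(nu_bar) / c * math.log(1.0 / eps))
assert 0 < iters <= bound
```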

To see that the same holds for \(\mu (z)\), let us briefly use the following notation: \(z\) is the starting point, \(z^+\) is the point after prediction and \(z^{(j)}\) is the point after applying \(j\) correction steps starting in \(z^+\). Then from Lemma 3 and (34), we have

$$\begin{aligned} \mu (z^+)&\le (1-\alpha )\mu (z) + \alpha (1-\alpha ){\bar{\nu }}^{-1}\mu \eta k_{\bar{x}} \nonumber \\&\le \mu (z)(1-\alpha )(1+\alpha \eta k_{\bar{x}} {\bar{\nu }}^{-1})\nonumber \\&= \mu (z)(1-\Omega {(1/\sqrt{\bar{\nu }})}) \end{aligned}$$
(48)

Since \(\delta _{{\bar{x}}}^T \delta _{{\bar{s}}} = 0\), we see from (43) that

$$\begin{aligned} ({\bar{x}}^+)^T\delta _{{\bar{s}}} = \mu (z^+) \delta _{{\bar{x}}}^T g_{{\bar{x}}^+} = \delta _{{\bar{x}}}^T \psi ({\bar{x}}^+,{\bar{s}}^+,\mu (z^+) ) - \delta _{{\bar{x}}}^T {\bar{s}}^+ \end{aligned}$$
(49)

Therefore

$$\begin{aligned} {\bar{\nu }} \mu (z^{(1)})&= ({\bar{x}}^+ + \hat{\alpha } \delta _{{\bar{x}}})^T({\bar{s}}^+ + \hat{\alpha } \delta _{{\bar{s}}}) \mathop {=}\limits ^{(49)} ({\bar{x}}^+)^T({\bar{s}}^+) + \hat{\alpha } \delta _{{\bar{x}}}^T \psi ({\bar{x}}^+,{\bar{s}}^+,\mu (z^+) ) \\&\le {\bar{\nu }} \mu (z^+) + \hat{\alpha } \beta ^2 \mu (z^+) \\&= {\bar{\nu }} \mu (z^+)(1+\hat{\alpha } \beta ^2 {\bar{\nu }}^{-1}) \end{aligned}$$

and hence

$$\begin{aligned} \mu (z^{(2)})&\le \mu (z^+)(1+\hat{\alpha } \beta ^2 {\bar{\nu }}^{-1})^2 \\&\mathop {\le }\limits ^{(48)} \mu (z)(1-\Omega {(1/\sqrt{\bar{\nu }})})(1+\hat{\alpha } \beta ^2{\bar{\nu }}^{-1})^2 \\&= \mu (z)(1-\Omega {(1/\sqrt{\bar{\nu }})}) \end{aligned}$$

which shows that \(\mu (z)\) also decreases geometrically at the rate \((1-\Omega {(1/\sqrt{\bar{\nu }})})\). Therefore \( \mu (z) \le \epsilon \mu (z^0) \) in \(\mathcal {O}{(\sqrt{{\bar{\nu }}}\log {(1/\epsilon )})} = \mathcal {O}{(\sqrt{{\nu }}\log {(1/\epsilon )})}\) iterations, finishing the proof of Theorem 1.


Cite this article

Skajaa, A., Ye, Y. A homogeneous interior-point algorithm for nonsymmetric convex conic optimization. Math. Program. 150, 391–422 (2015). https://doi.org/10.1007/s10107-014-0773-1
