
A homogeneous interior-point algorithm for nonsymmetric convex conic optimization

  • Full Length Paper
  • Series A

Mathematical Programming

Abstract

A homogeneous interior-point algorithm for solving nonsymmetric convex conic optimization problems is presented. Starting each iteration from the vicinity of the central path, the method steps in the approximate tangent direction and then applies a correction phase to locate the next well-centered primal–dual point. Features of the algorithm include that it makes use only of the primal barrier function, that it is able to detect infeasibilities in the problem and that no phase-I method is needed. We prove convergence to \(\epsilon \)-accuracy in \({\mathcal {O}}(\sqrt{\nu } \log {(1/\epsilon )})\) iterations. To improve performance, the algorithm employs a new Runge–Kutta type second order search direction suitable for the general nonsymmetric conic problem. Moreover, quasi-Newton updating is used to reduce the number of factorizations needed, implemented so that data sparsity can still be exploited. Extensive and promising computational results are presented for the \(p\)-cone problem, the facility location problem, entropy maximization problems and geometric programs; all formulated as nonsymmetric convex conic optimization problems.


Notes

  1. A positive semidefinite matrix with all non-negative entries is called doubly non-negative.

  2. See Appendix 1 for a list of properties of this class of functions.

References

  1. Andersen, E.D., Andersen, K.D.: The MOSEK interior point optimizer for linear programming: an implementation of the homogeneous algorithm. In: Frenk, H., Roos, K., Terlaky, T., Zhang, S. (eds.) High Performance Optimization, pp. 197–232. Kluwer, Boston (1999)


  2. Andersen, E.D., Roos, C., Terlaky, T.: On implementing a primal–dual interior-point method for conic quadratic optimization. Math. Program. 95(2), 249–277 (2003)


  3. Andersen, E.D., Ye, Y.: On a homogeneous algorithm for the monotone complementarity problem. Math. Program. 84(2), 375–399 (1999)


  4. Ben-Tal, A., Nemirovski, A.S.: Lectures on Modern Convex Optimization: Analysis, Algorithms and Engineering Applications. SIAM, Philadelphia (2001)


  5. Boyd, S., Kim, S.J., Vandenberghe, L., Hassibi, A.: A tutorial on geometric programming. Optim. Eng. 8, 67–127 (2007)


  6. Butcher, J.C.: Numerical Methods for Ordinary Differential Equations, 2nd edn. Wiley, New York (2008)


  7. Chares, P.R.: Cones and interior-point algorithms for structured convex optimization involving powers and exponentials. PhD thesis, Université catholique de Louvain (2009)

  8. Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming, version 1.21 (2010). http://cvxr.com/cvx

  9. Güler, O.: Barrier functions in interior point methods. Math. Oper. Res. 21, 860–885 (1996)


  10. Karmarkar, N.: A new polynomial-time algorithm for linear programming. Combinatorica 4, 373–395 (1984)


  11. Luo, Z.Q., Sturm, J.F., Zhang, S.: Conic convex programming and self-dual embedding. Optim. Methods Softw. 14, 169–218 (2000)


  12. Mehrotra, S.: On the implementation of a primal–dual interior point method. SIAM J. Optim. 2, 575–601 (1992)


  13. MOSEK optimization software: developed by MOSEK ApS. www.mosek.com

  14. Nesterov, Y.E.: Constructing self-concordant barriers for convex cones. CORE Discussion Paper (2006/30)

  15. Nesterov, Y.E.: Towards nonsymmetric conic optimization. Optim. Methods Softw. 27, 893–917 (2012)


  16. Nesterov, Y.E., Nemirovski, A.S.: Interior-Point Polynomial Algorithms in Convex Programming. SIAM, Philadelphia (1994)


  17. Nesterov, Y.E., Todd, M.J.: Self-scaled barriers and interior-point methods for convex programming. Math. Oper. Res. 22, 1–42 (1997)


  18. Nesterov, Y.E., Todd, M.J.: Primal–dual interior-point methods for self-scaled cones. SIAM J. Optim. 8, 324–364 (1998)


  19. Nesterov, Y.E., Todd, M.J., Ye, Y.: Infeasible-start primal–dual methods and infeasibility detectors for nonlinear programming problems. Math. Program. 84, 227–267 (1999)


  20. Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. Springer, Berlin (2006)


  21. Renegar, J.: A Mathematical View of Interior-Point Methods in Convex Optimization. SIAM, Philadelphia (2001)


  22. Sturm, J.F.: Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw. 12, 625–653 (1999)


  23. Sturm, J.F.: Implementation of interior point methods for mixed semidefinite and second order cone optimization problems. Optim. Methods Softw. 17, 1105–1154 (2002)


  24. Tuncel, L.: Primal–dual symmetry and scale invariance of interior-point algorithms for convex optimization. Math. Oper. Res. 23, 708–718 (1998)


  25. Tuncel, L.: Generalization of primal–dual interior-point methods to convex optimization problems in conic form. Found. Comput. Math. 1, 229–254 (2001)


  26. Xu, X., Hung, P.F., Ye, Y.: A simplified homogeneous and self-dual linear programming algorithm and its implementation. Ann. Oper. Res. 62, 151–171 (1996)


  27. Xue, G., Ye, Y.: An efficient algorithm for minimizing a sum of \(p\)-norms. SIAM J. Optim. 10, 551–579 (1999)



Acknowledgments

The authors thank Erling D. Andersen and Joachim Dahl of MOSEK ApS for many insights and for supplying test problems for the geometric programs and the entropy problems. The authors also thank the reviewers for many helpful comments.

Corresponding author

Correspondence to Anders Skajaa.

Electronic supplementary material

Supplementary material 1 (pdf 324 KB)

Appendices

Appendix 1: Properties of the barrier function

Here we list some properties of logarithmically homogeneous self-concordant barriers (lhscb) that we use in this paper. Many more properties and proofs can be found in [17, 18].

Let \(\mathcal {K}^{\circ }\) denote the interior of \(\mathcal {K}\). We assume that \(F: \mathcal {K}^{\circ } \rightarrow \mathbb {R}\) is a lhscb for \(\mathcal {K}\) with barrier parameter \(\nu \). This means that for all \(x \in \mathcal {K}^{\circ }\) and \(t>0\),

$$\begin{aligned} F(tx) = F(x) - \nu \log {t}. \end{aligned}$$

It follows that the conjugate of \(F\), denoted \(F^*\) and defined for \(s \in (\mathcal {K}^*)^{\circ }\) by

$$\begin{aligned} F^*(s) = \sup _{x \in \mathcal {K}}\{ - s^T x - F(x) \} \end{aligned}$$

is a lhscb for the dual cone \(\mathcal {K}^*\). Similarly to the notation used in [17, 18], we write the local Hessian norms on \(\mathcal {K}\) and \(\mathcal {K}^*\) as:

$$\begin{aligned} \Vert g \Vert _x&= \Vert H_x^{1/2} g \Vert , \quad \text {for }x \in \mathcal {K}^{\circ } \\ \Vert h \Vert _s^*&= \Vert (H_s^*)^{1/2} h \Vert , \quad \text {for }s \in (\mathcal {K}^*)^\circ \\ \Vert h \Vert _x^*&= \Vert H_x^{-1/2} h \Vert , \quad \text {for }x \in \mathcal {K}^\circ , \end{aligned}$$

where \(H_s^* = \nabla ^2 F^*(s)\). Notice the different definitions of \(\Vert \cdot \Vert _y^*\) depending on whether \(y\) is in \(\mathcal {K}\) or \(\mathcal {K}^*\). Using this convention and that \(-g_x \in (\mathcal {K}^*)^\circ \) and \(H_{-g_x}^* = H_x^{-1}\), we see that

$$\begin{aligned} \Vert s \Vert ^*_{-g_x} = \Vert (H_{-g_x}^*)^{1/2} s \Vert = \Vert H_x^{-1/2} s \Vert = \Vert s \Vert _x^*. \end{aligned}$$
(18)
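Identity (18) can be checked concretely for the logarithmic barrier \(F(x) = -\sum _i \log x_i\) on the nonnegative orthant, a lhscb with \(\nu = n\) whose conjugate is \(F^*(s) = -\sum _i \log s_i - n\); here \(g_x = (-1/x_i)_i\), \(H_x = {{\mathrm{diag}}}(x_i^{-2})\) and \(H_s^* = {{\mathrm{diag}}}(s_i^{-2})\). The following numerical sketch is illustrative only and not part of the argument:

```python
import math
import random

random.seed(0)
n = 5
x = [random.uniform(0.5, 2.0) for _ in range(n)]
s = [random.uniform(0.5, 2.0) for _ in range(n)]

# For F(x) = -sum(log x_i): g_x = (-1/x_i)_i, H_x = diag(1/x_i^2).
minus_g = [1.0 / xi for xi in x]
# Conjugate barrier F*(s) = -sum(log s_i) - n has H*_s = diag(1/s_i^2),
# so H*_{-g_x} = diag(x_i^2), which equals H_x^{-1}.
H_star_at_minus_g = [1.0 / v**2 for v in minus_g]
H_inv = [xi**2 for xi in x]
assert all(abs(a - b) < 1e-12 for a, b in zip(H_star_at_minus_g, H_inv))

# (18): ||s||^*_{-g_x} = ||(H*_{-g_x})^{1/2} s|| equals ||H_x^{-1/2} s|| = ||s||_x^*.
lhs = math.sqrt(sum(h * si**2 for h, si in zip(H_star_at_minus_g, s)))
rhs = math.sqrt(sum((xi * si) ** 2 for xi, si in zip(x, s)))
assert abs(lhs - rhs) < 1e-9
```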

For \(x \in \mathcal {K}^\circ \), \(F\) satisfies

$$\begin{aligned} H_x x&= -g_x \end{aligned}$$
(19)
$$\begin{aligned} x^T g_x&= -\nu \end{aligned}$$
(20)
$$\begin{aligned} \Vert x \Vert _x^2&= \nu . \end{aligned}$$
(21)
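For the same logarithmic barrier (with \(\nu = n\)), the logarithmic homogeneity and the identities (19)–(21) are easy to verify numerically. The sketch below is purely illustrative:

```python
import math
import random

random.seed(1)
n = 6
x = [random.uniform(0.1, 3.0) for _ in range(n)]
nu = n  # barrier parameter of F(x) = -sum(log x_i)

F = lambda v: -sum(math.log(vi) for vi in v)
g = [-1.0 / xi for xi in x]    # gradient of F
H = [1.0 / xi**2 for xi in x]  # diagonal of the Hessian of F

# Logarithmic homogeneity: F(t x) = F(x) - nu * log(t).
t = 2.7
assert abs(F([t * xi for xi in x]) - (F(x) - nu * math.log(t))) < 1e-10

# (19): H_x x = -g_x
assert all(abs(Hi * xi + gi) < 1e-12 for Hi, xi, gi in zip(H, x, g))
# (20): x^T g_x = -nu
assert abs(sum(xi * gi for xi, gi in zip(x, g)) + nu) < 1e-12
# (21): ||x||_x^2 = x^T H_x x = nu
assert abs(sum(Hi * xi * xi for Hi, xi in zip(H, x)) - nu) < 1e-12
```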

The Dikin ellipsoids are feasible [4]. That is:

$$\begin{aligned} x \in \mathcal {K}^\circ&\Rightarrow W(x) = \{ u : \Vert u - x \Vert _x \le 1 \} \subseteq \mathcal {K} \end{aligned}$$
(22)
$$\begin{aligned} s \in (\mathcal {K}^*)^\circ&\Rightarrow W^*(s) = \{h : \Vert h - s \Vert _s^* \le 1 \} \subseteq \mathcal {K}^*. \end{aligned}$$
(23)
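For the logarithmic barrier the local norm is \(\Vert v \Vert _x = \Vert (v_i/x_i)_i \Vert \), so (22) says that the unit Dikin ellipsoid around \(x > 0\) stays in the nonnegative orthant. A quick randomized check (illustrative only):

```python
import math
import random

random.seed(2)
n = 4
x = [random.uniform(0.2, 2.0) for _ in range(n)]

for _ in range(1000):
    # Sample v with local norm ||v||_x = ||(v_i/x_i)_i|| = r < 1.
    d = [random.gauss(0, 1) for _ in range(n)]
    norm = math.sqrt(sum(di**2 for di in d))
    r = random.random()  # radius in [0, 1)
    v = [r * di / norm * xi for di, xi in zip(d, x)]
    u = [xi + vi for xi, vi in zip(x, v)]
    # (22): every point of the unit Dikin ellipsoid lies in K = R^n_+.
    assert all(ui >= 0 for ui in u)
```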

Appendix 2: The homogeneous and self-dual model

1.1 Optimality and infeasibility certificate

Let \(G\) be defined by (5) and notice that \(G\) is skew-symmetric: \(G = -G^T\).

  1. 1.

    Observe that we can write (hsd) as \(G(y; x; \tau ) - (0; s;\kappa ) = 0\). Pre-multiplying this equation by \((y; x; \tau )^T\) gives \(x^T s + \tau \kappa = 0\).

  2. 2.

    \(\tau > 0\) implies \(\kappa = 0\) and hence \(b^T (y/\tau ) - c^T (x/\tau ) = 0\) and therefore \(x^T s = 0\). Dividing the first two linear feasibility equations of (hsd) by \(\tau \), we obtain the linear feasibility equations of (1). Thus \((x,y,s)/\tau \) is optimal for (pd).

  3. 3.

    If \(\kappa > 0\) then \(\tau = 0\) so \(Ax=0\) and \(A^T y + s = 0\). Further \(c^T x - b^T y = -\kappa < 0\), so \(c^T x\) and \(-b^T y\) cannot both be non-negative. Assume \(-b^T y < 0\). If (pd) is primal-feasible then there exists \(\bar{x} \in \mathcal {K}\) such that \(A \bar{x} = b\). But then \(0 > -b^T y = -\bar{x}^T A^T y = \bar{x}^T s \ge 0\), a contradiction. We can argue similarly if \(c^T x < 0\),

and this completes the proof of Lemma 1.
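The only structural fact used in step 1 above is that \(v^T G v = 0\) for every \(v\) whenever \(G = -G^T\). The sketch below illustrates this numerically with random data \(A, b, c\) standing in for the problem data, using the standard skew-symmetric block pattern of a homogeneous embedding (the precise \(G\) is given by (5) in the paper):

```python
import random

random.seed(3)
m, n = 3, 5
A = [[random.gauss(0, 1) for _ in range(n)] for _ in range(m)]
b = [random.gauss(0, 1) for _ in range(m)]
c = [random.gauss(0, 1) for _ in range(n)]

# Skew-symmetric block matrix G = [[0, A, -b], [-A^T, 0, c], [b^T, -c^T, 0]].
N = m + n + 1
G = [[0.0] * N for _ in range(N)]
for i in range(m):
    for j in range(n):
        G[i][m + j] = A[i][j]
        G[m + j][i] = -A[i][j]
    G[i][m + n] = -b[i]
    G[m + n][i] = b[i]
for j in range(n):
    G[m + j][m + n] = c[j]
    G[m + n][m + j] = -c[j]

assert all(abs(G[i][j] + G[j][i]) < 1e-12 for i in range(N) for j in range(N))

v = [random.gauss(0, 1) for _ in range(N)]  # v = (y; x; tau)
quad = sum(v[i] * G[i][j] * v[j] for i in range(N) for j in range(N))
# v^T G v = 0, which is what forces x^T s + tau*kappa = 0 at (hsd)-feasible points.
assert abs(quad) < 1e-9
```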

1.2 Self-duality

The dual problem of (hsd) is

$$\begin{aligned} \max _{\hat{y}_1,\hat{y}_2,\hat{y}_3,\hat{s}}\qquad&0 \nonumber \\ \text {s.t.}\qquad&\left( \begin{array}{c@{\quad }c@{\quad }c} A^T &{} 0 &{} -c \\ 0 &{} c^T &{} -b^T \\ 0 &{} -I &{} 0 \\ -1 &{} 0 &{} 0 \end{array} \right) \left( \begin{array}{c} {\hat{y}_1} \\ {\hat{y}_2} \\ {\hat{y}_3} \end{array} \right) + \left( \begin{array}{c} {\hat{s}_1} \\ {\hat{s}_2} \\ {\hat{s}_3} \\ {\hat{s}_4} \end{array} \right) = 0 \end{aligned}$$
(24)
$$\begin{aligned}&-A \hat{y}_2 + b \hat{y}_3 = 0 \end{aligned}$$
(25)
$$\begin{aligned}&\hat{s} \in (\mathcal {K}\times \mathbb {R}_+ \times \mathcal {K}^*\times \mathbb {R}_+)^*, \quad \hat{y} \text { free}. \end{aligned}$$
(26)

After a few eliminations, we see that (24)–(26) are equivalent to

$$\begin{aligned}&A\hat{s}_3 - b\hat{s}_4 = 0 \\&-A^T \hat{y}_1 + c\hat{s}_4 - \hat{s}_1 = 0 \\&b^T \hat{y}_1 - c^T \hat{s}_3 - \hat{s}_2 = 0 \\& (\hat{s}_3,\hat{s}_4) \in \mathcal {K}\times \mathbb {R}_+,\;\; (\hat{s}_1,\hat{s}_2) \in \mathcal {K}^*\times \mathbb {R}_+,\;\; \hat{y}_1 \in \mathbb {R}^m. \nonumber \end{aligned}$$
(27)

Through the following identification of variables

$$\begin{aligned} \hat{s}_1&\sim s, \quad \hat{s}_2 \sim \kappa , \quad \hat{s}_3 \sim x, \quad \hat{s}_4 \sim \tau , \quad \hat{y}_1 \sim y, \end{aligned}$$

it is clear that the constraints (27) are equivalent to those of the problem (hsd). Since the objective function in both problems is constant zero, the two problems are identical and this proves Lemma 2.

Appendix 3: Prediction

The direction \(d_{z}\) is defined by

$$\begin{aligned} G(d_{y}; d_{{\bar{x}}}) - (0; d_{{\bar{s}}})&= -\left( G(y; {\bar{x}}) - (0; {\bar{s}}) \right) \end{aligned}$$
(28)
$$\begin{aligned} d_{{\bar{s}}} + \mu H_{\bar{x}} d_{{\bar{x}}}&= -{\bar{s}} \end{aligned}$$
(29)

1.1 Reduction of residuals

We first show:

$$\begin{aligned} \text {1.}\&{\bar{s}}^T d_{{\bar{x}}} + {\bar{x}}^T d_{{\bar{s}}} + {\bar{x}}^T {\bar{s}} = \psi (z)^T d_{{\bar{x}}} \end{aligned}$$
(30)
$$\begin{aligned} \text {2.}\&({\bar{x}}+d_{{\bar{x}}})^T ({\bar{s}}+d_{{\bar{s}}}) = 0 \end{aligned}$$
(31)
$$\begin{aligned} \text {3.}\&d_{{\bar{x}}}^T d_{{\bar{s}}} = -\psi (z)^T d_{{\bar{x}}}. \end{aligned}$$
(32)
  1. 1.

    We get \({\bar{s}}^T d_{{\bar{x}}} + {\bar{x}}^T d_{{\bar{s}}} + {\bar{x}}^T {\bar{s}} \mathop {=}\limits ^{(29)} {\bar{s}}^T d_{{\bar{x}}} + {\bar{x}}^T(-{\bar{s}}-\mu H_{\bar{x}} d_{{\bar{x}}}) + {\bar{x}}^T {\bar{s}}\), which, after reduction, gives \(d_{{\bar{x}}}^T ({\bar{s}} - \mu H_{\bar{x}} {\bar{x}}) = \psi (z)^T d_{{\bar{x}}}\).

  2. 2.

    Equation (28) is equivalent to \(G(y+d_{y}; {\bar{x}}+d_{{\bar{x}}}) - (0; {\bar{s}}+d_{{\bar{s}}}) = 0\). Pre-multiplying this equation by \((y+d_{y}; {\bar{x}}+d_{{\bar{x}}})^T\) gives (31).

  3. 3.

    Follows from expanding (31) and using (30).

Now the lemma follows readily: We simply note that the first equation follows directly from elementary linear algebra. To show the second:

$$\begin{aligned} {\bar{\nu }}\mu (z^+)&= ({\bar{x}}+\alpha d_{{\bar{x}}})^T ({\bar{s}}+\alpha d_{{\bar{s}}}) \\&= {\bar{x}}^T {\bar{s}} + \alpha ({\bar{s}}^T d_{{\bar{x}}} + {\bar{x}}^T d_{{\bar{s}}}) + \alpha ^2 d_{{\bar{x}}}^Td_{{\bar{s}}} \\&\mathop {=}\limits ^{(30){-}(32)} {\bar{x}}^T {\bar{s}} + \alpha (-{\bar{x}}^T {\bar{s}} + \psi (z)^T d_{{\bar{x}}}) + \alpha ^2(-\psi (z)^T d_{{\bar{x}}}) \\&= (1-\alpha ){\bar{x}}^T {\bar{s}} + \alpha (1-\alpha ) \psi (z)^T d_{{\bar{x}}} \end{aligned}$$

which after division by \({\bar{\nu }}\) proves Lemma 3.
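Note that identity (30) relies only on the centering equation (29) and the definition \(\psi (z) = {\bar{s}} - \mu H_{\bar{x}} {\bar{x}}\); it holds for an arbitrary \(d_{{\bar{x}}}\). This can be checked numerically with a diagonal stand-in Hessian (illustrative only, not the algorithm itself):

```python
import random

random.seed(4)
n = 5
x = [random.uniform(0.5, 2.0) for _ in range(n)]
s = [random.uniform(0.5, 2.0) for _ in range(n)]
H = [1.0 / xi**2 for xi in x]  # log-barrier Hessian diagonal, as a stand-in
mu = 0.7
dx = [random.gauss(0, 1) for _ in range(n)]  # arbitrary direction

# (29): d_s = -s - mu * H * d_x
ds = [-si - mu * Hi * di for si, Hi, di in zip(s, H, dx)]
# psi(z) = s - mu * H * x
psi = [si - mu * Hi * xi for si, Hi, xi in zip(s, H, x)]

dot = lambda u, v: sum(a * b for a, b in zip(u, v))
# (30): s^T d_x + x^T d_s + x^T s = psi(z)^T d_x
lhs = dot(s, dx) + dot(x, ds) + dot(x, s)
assert abs(lhs - dot(psi, dx)) < 1e-9
```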

1.2 Bounds on \({\bar{s}}\), \(d_{{\bar{s}}}\) and \(d_{{\bar{x}}}\)

Assume \(\Vert \psi \Vert _{\bar{x}}^* \le \eta \mu \). By definition, \(\psi = {\bar{s}} - \mu H_{\bar{x}} {\bar{x}}\), which after left-multiplication by \(H_{\bar{x}}^{-1/2}\), taking norms and squaring both sides gives

$$\begin{aligned} (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2&= (\Vert \psi \Vert _{\bar{x}}^*)^2 + \mu ^2 \Vert {\bar{x}} \Vert _{\bar{x}}^2 + 2\mu {\bar{x}}^T \psi \nonumber \\&= (\Vert \psi \Vert _{\bar{x}}^*)^2 + \mu ^2 {\bar{\nu }} \le \mu ^2 ({\bar{\nu }} + \eta ^2) \nonumber \\ \Vert {\bar{s}} \Vert _{\bar{x}}^*&\le \mu \sqrt{\eta ^2 + {\bar{\nu }}} \end{aligned}$$
(33)

where we used (21) and \({\bar{x}}^T \psi = 0\).
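Inequality (33) uses only \(\Vert \psi \Vert _{\bar{x}}^* \le \eta \mu \), \(\Vert {\bar{x}} \Vert _{\bar{x}}^2 = {\bar{\nu }}\) and \({\bar{x}}^T \psi = 0\). The sketch below constructs \({\bar{s}} = -\mu g_{\bar{x}} + \psi \) with these properties for the log-barrier stand-in and checks the bound numerically (illustrative only):

```python
import math
import random

random.seed(5)
n = 6
eta, mu = 1.0 / 6.0, 0.8
x = [random.uniform(0.5, 2.0) for _ in range(n)]

# Build psi with x^T psi = 0, scaled so ||psi||_x^* = ||diag(x) psi|| = 0.9*eta*mu.
p = [random.gauss(0, 1) for _ in range(n)]
proj = sum(xi * pi for xi, pi in zip(x, p)) / sum(xi * xi for xi in x)
psi = [pi - proj * xi for pi, xi in zip(p, x)]
scale = 0.9 * eta * mu / math.sqrt(sum((xi * pi) ** 2 for xi, pi in zip(x, psi)))
psi = [scale * pi for pi in psi]

# s = -mu * g_x + psi = mu/x + psi  (log barrier: g_x = -1/x)
s = [mu / xi + pi for xi, pi in zip(x, psi)]

s_norm = math.sqrt(sum((xi * si) ** 2 for xi, si in zip(x, s)))  # ||s||_x^*
# (33): ||s||_x^* <= mu * sqrt(eta^2 + nu), here nu = n
assert s_norm <= mu * math.sqrt(eta**2 + n) + 1e-12
```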

This bound allows us to obtain bounds on \(d_{{\bar{x}}}\) and \(d_{{\bar{s}}}\): Left-multiplying (29) by \(H_{\bar{x}}^{-1/2}\), taking norms and squaring both sides gives

$$\begin{aligned} (\Vert d_{{\bar{s}}} \Vert _{\bar{x}}^*)^2 + \mu ^2 \Vert d_{{\bar{x}}} \Vert _{\bar{x}}^2&= (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 - 2 \mu d_{{\bar{x}}}^T d_{{\bar{s}}} \mathop {=}\limits ^{(32)} (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 + 2 \mu d_{{\bar{x}}}^T \psi \\&\le (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 + 2 \mu \Vert d_{{\bar{x}}} \Vert _{\bar{x}} \Vert \psi \Vert _{\bar{x}}^* \end{aligned}$$

by the Cauchy–Schwarz inequality. Therefore: \( \mu ^2 \Vert d_{{\bar{x}}} \Vert _{\bar{x}}^2 \le (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 + 2 \mu \Vert d_{{\bar{x}}} \Vert _{\bar{x}} \Vert \psi \Vert _{\bar{x}}^* \). Now subtracting \(2 \mu \Vert d_{{\bar{x}}} \Vert _{\bar{x}} \Vert \psi \Vert _{\bar{x}}^*\) and adding \((\Vert \psi \Vert _{\bar{x}}^*)^2\) to both sides, we get

$$\begin{aligned} \left( \mu \Vert d_{{\bar{x}}} \Vert _{\bar{x}} - \Vert \psi \Vert _{\bar{x}}^* \right) ^2 \le (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 + ( \Vert \psi \Vert _{\bar{x}}^* )^2 \end{aligned}$$

or

$$\begin{aligned} \Vert d_{{\bar{x}}} \Vert _{\bar{x}}&\le \mu ^{-1} \left( \Vert \psi \Vert _{\bar{x}}^* + \sqrt{(\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 + ( \Vert \psi \Vert _{\bar{x}}^* )^2} \right) \nonumber \\&\le \mu ^{-1}(\eta \mu + \sqrt{\mu ^2(\eta ^2 + {\bar{\nu }}) + \eta ^2 \mu ^2}) = \eta + \sqrt{2\eta ^2 + {\bar{\nu }}} =: k_{\bar{x}}. \end{aligned}$$
(34)

For \(d_{{\bar{s}}}\), we similarly have

$$\begin{aligned} (\Vert d_{{\bar{s}}} \Vert _{\bar{x}}^*)^2&\le (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 + 2 \mu \Vert d_{{\bar{x}}} \Vert _{\bar{x}} \Vert d_{{\bar{s}}} \Vert _{\bar{x}}^* \nonumber \\ (\Vert d_{{\bar{s}}} \Vert _{\bar{x}}^* - \mu \Vert d_{{\bar{x}}} \Vert _{\bar{x}} )^2&\le (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 + \mu ^2 \Vert d_{{\bar{x}}} \Vert _{\bar{x}}^2 \nonumber \\ \Vert d_{{\bar{s}}} \Vert _{\bar{x}}^*&\le k_{\bar{x}} \mu + \sqrt{\mu ^2(\eta ^2 + {\bar{\nu }}) + k_{\bar{x}}^2 \mu ^2} = k_{\bar{s}} \mu \end{aligned}$$
(35)

where \(k_{\bar{s}} := k_{\bar{x}} + \sqrt{(\eta ^2 + {\bar{\nu }}) + k_{\bar{x}}^2}\).

1.3 Feasibility of \(z^+\)

Define \(\alpha _1 := k_{\bar{x}}^{-1} = \Omega (1/\sqrt{\bar{\nu }})\). Then for any \(\alpha \le \alpha _1\), we have

$$\begin{aligned} \Vert {\bar{x}} - ({\bar{x}}+\alpha d_{{\bar{x}}}) \Vert _{\bar{x}}&= \alpha \Vert d_{{\bar{x}}} \Vert _{\bar{x}} \mathop {\le }\limits ^{(34)} \alpha k_{\bar{x}} \le 1 \end{aligned}$$

and so from (22), we conclude \({\bar{x}}+\alpha d_{{\bar{x}}} = {\bar{x}}^+ \in \bar{\mathcal {K}}\).

Now, define \(\alpha _2 := (1-\eta )k_{\bar{s}}^{-1} = \Omega (1/\sqrt{\bar{\nu }})\). Then for \(\alpha \le \alpha _2\), we have

$$\begin{aligned} \mu ^{-1} \Vert {\bar{s}}^+ + \mu g_{\bar{x}} \Vert _{-g_{\bar{x}}}^*&= \mu ^{-1}\Vert {\bar{s}} + \alpha d_{{\bar{s}}} + \mu g_{\bar{x}} \Vert _{-g_{\bar{x}}}^* =\mu ^{-1}\Vert \psi + \alpha d_{{\bar{s}}} \Vert _{-g_{\bar{x}}}^* \\&\mathop {\le }\limits ^{(18)} \mu ^{-1}\Vert \psi \Vert _{\bar{x}}^* + \mu ^{-1}\alpha \Vert d_{{\bar{s}}} \Vert _{{\bar{x}}}^* \mathop {\le }\limits ^{(35)} \eta + \alpha k_{\bar{s}} \le 1. \end{aligned}$$

Since \(-g_{\bar{x}} \in \bar{\mathcal {K}}^*\), we have by (23) that \(\mu ^{-1}{\bar{s}}^+ \in \bar{\mathcal {K}}^*\) and therefore \({\bar{s}}^+ \in \bar{\mathcal {K}}^*\). Therefore, Lemma 4 holds with \(\alpha = \min \{\alpha _1,\alpha _2\} = \Omega (1/\sqrt{\bar{\nu }}) = \Omega (1/\sqrt{\nu })\).

1.4 Bound on \(\psi ^+\)

First recall the definition (6): \(\psi ({\bar{x}},{\bar{s}},t) = {\bar{s}} + t g_{\bar{x}}\). Now consider for a fixed \(v_0\) the function

$$\begin{aligned} \Phi _t({\bar{x}}) = {\bar{x}}^T v_0 + t F\left( {\bar{x}} \right) \end{aligned}$$

which is self-concordant with respect to \({\bar{x}}\). Define its Newton step by \( n_t({\bar{x}}) := -\nabla ^2 \Phi _t({\bar{x}})^{-1} \nabla \Phi _t({\bar{x}}) \). Define also \(q{}=\Vert n_{t_2}({\bar{x}}) \Vert _{\bar{x}}\). From the general theory of self-concordant functions, the following inequality holds: if \(q{} < 1\), then

$$\begin{aligned} \Vert n_{t_2}({\bar{x}}_2) \Vert _{{\bar{x}}_2}&\le \left( \frac{q{}}{1-q{}}\right) ^2. \end{aligned}$$
(36)

For a proof of this relation, see e.g. Theorem 2.2.4 in [21]. With \(v_0 = {\bar{s}}^+, t_2=\mu ^+\) and \({\bar{x}}_2 = {\bar{x}}^+\), the inequality (36) is

$$\begin{aligned} \Vert \psi ^+ \Vert _{{\bar{x}}^+}^*&\le \mu ^+ \left( \frac{q{}}{1-q{}}\right) ^2. \end{aligned}$$
(37)

where \(\mu ^+ q{} = \Vert H_{\bar{x}}^{-1}({\bar{s}}^+ + \mu ^+ g_{\bar{x}}) \Vert _{\bar{x}} = \Vert {\bar{s}}^+ + \mu ^+ g_{\bar{x}} \Vert _{\bar{x}}^*\). From Lemma 3 and (34):

$$\begin{aligned} | \mu - \mu ^+ |&= | -\alpha \mu + \alpha (1-\alpha ) {\bar{\nu }}^{-1} \psi ^T d_{{\bar{x}}} | \nonumber \\&\le \mu \alpha \left( 1 + (1-\alpha )\eta k_{\bar{x}} {\bar{\nu }}^{-1} \right) . \end{aligned}$$
(38)

By the assumption \(\Vert \psi \Vert _{\bar{x}}^* \le \eta \mu \) combined with (34), we have \(\psi ^T d_{{\bar{x}}} \ge -\eta k_{\bar{x}} \mu \). Therefore

$$\begin{aligned} \mu ^+&= (1-\alpha )\mu + \alpha (1-\alpha ){\bar{\nu }}^{-1}\psi ^T d_{{\bar{x}}} \nonumber \\&\ge \mu (1-\alpha )( 1-\alpha \eta k_{\bar{x}} {\bar{\nu }}^{-1} ) \nonumber \\ \mu /\mu ^+&\le (1-\alpha )^{-1}( 1-\alpha \eta k_{\bar{x}} {\bar{\nu }}^{-1} )^{-1} \end{aligned}$$
(39)

Let us now obtain a bound on \(q{}\).

$$\begin{aligned} \mu ^+ q{}&= \Vert {\bar{s}}^+ + \mu ^+ g_{\bar{x}} \Vert _{\bar{x}}^* = \Vert \psi - (\mu - \mu ^+) g_{\bar{x}} + \alpha d_{{\bar{s}}} \Vert _{\bar{x}}^* \\&\le \Vert \psi \Vert _{\bar{x}}^* + |\mu - \mu ^+| \Vert g_{\bar{x}} \Vert _{\bar{x}}^* + \alpha \Vert d_{{\bar{s}}} \Vert _{\bar{x}}^* \nonumber \\&\le \eta \mu + \mu \alpha \left( 1 + (1-\alpha )\eta k_{\bar{x}} {\bar{\nu }}^{-1} \right) \sqrt{{\bar{\nu }}} + \alpha k_{\bar{s}} \mu \nonumber \\&\le \mu \left( \eta + \alpha k_{\bar{s}} + \alpha (1 + (1-\alpha ){\bar{\nu }}^{-1}\eta k_{\bar{x}})\sqrt{{\bar{\nu }}} \right) \nonumber \\ q{}&\le (\mu /\mu ^+)( \eta + \alpha (\sqrt{{\bar{\nu }}} + k_{\bar{s}} + \eta k_{\bar{x}}) ) \nonumber \\&\le (1-\alpha )^{-1}( 1-\alpha \eta k_{\bar{x}} {\bar{\nu }}^{-1} )^{-1} ( \eta + \alpha (\sqrt{{\bar{\nu }}} + k_{\bar{s}} + \eta k_{\bar{x}}) ) \nonumber \end{aligned}$$
(40)

where we used (35), (38), (39) and the assumption \(\Vert \psi \Vert _{\bar{x}}^* \le \eta \mu \). Now the reader can verify that for \(\eta \le 1/6\) and \({\bar{\nu }}\ge 2\), we have the implication

$$\begin{aligned} \alpha \le \alpha _3 := \frac{1}{11 \sqrt{{\bar{\nu }}}} = \Omega (1/\sqrt{\bar{\nu }}) \Rightarrow q{}^2/(1-q{})^2 \le 2\eta \le 1/3 \end{aligned}$$
(41)

which also implies \(q< 1\). Now by (37), we see that (41) implies \( \Vert \psi ^+ \Vert _{{\bar{x}}^+}^* \le 2\eta \mu ^+\) and hence \(z^+ \in \mathcal {N}(2\eta )\) which finishes the proof of Lemma 5.

Appendix 4: Correction phase

Assume \(\Vert \psi ({\bar{x}},{\bar{s}},\mu ) \Vert _{\bar{x}}^* \le \beta \mu \) where \(\mu := \mu (z)\) with \(z = ({\bar{x}},y,{\bar{s}})\). The equations defining the correction step \((\delta _{{\bar{x}}},\delta _{y},\delta _{{\bar{s}}})\) are

$$\begin{aligned} G(\delta _{y}; \delta _{{\bar{x}}})-(0; \delta _{{\bar{s}}})&= 0 \end{aligned}$$
(42)
$$\begin{aligned} \delta _{{\bar{s}}} + \mu H_{{\bar{x}}} \delta _{{\bar{x}}}&= -\psi ({\bar{x}},{\bar{s}},\mu ) \end{aligned}$$
(43)

and the next point is then \(({\bar{x}}^+,y^+,{\bar{s}}^+) := ({\bar{x}},y,{\bar{s}}) + \hat{\alpha }(\delta _{{\bar{x}}},\delta _{y},\delta _{{\bar{s}}})\). Left-multiplying (42) by \((\delta _{y},\delta _{{\bar{x}}})^T\), we get \(\delta _{{\bar{x}}}^T \delta _{{\bar{s}}} = 0\). From (43), we then have

$$\begin{aligned} (\Vert \delta _{{\bar{s}}} \Vert _{\bar{x}}^*)^2,\; \mu ^2\Vert \delta _{{\bar{x}}} \Vert _{\bar{x}}^2 \le (\Vert \delta _{{\bar{s}}} \Vert _{\bar{x}}^*)^2 + \mu ^2 \Vert \delta _{{\bar{x}}} \Vert _{\bar{x}}^2 = (\Vert \psi ({\bar{x}},{\bar{s}},\mu ) \Vert _{\bar{x}}^*)^2 \le \beta ^2 \mu ^2 \end{aligned}$$

and therefore

$$\begin{aligned} \Vert \delta _{{\bar{x}}} \Vert _{\bar{x}}&\le \beta , \;\; \Vert \delta _{{\bar{s}}} \Vert _{\bar{x}}^* \le \beta \mu . \end{aligned}$$
(44)

From (43), we also have

$$\begin{aligned} \Vert \psi ({\bar{x}},{\bar{s}},\mu ) + \hat{\alpha } \delta _{{\bar{s}}} \Vert _{\bar{x}}^*&= \Vert (1-\hat{\alpha }) \psi ({\bar{x}},{\bar{s}},\mu ) + \hat{\alpha } \mu H_{\bar{x}} \delta _{{\bar{x}}} \Vert _{\bar{x}}^* \nonumber \\&\le (1-\hat{\alpha }) \Vert \psi ({\bar{x}},{\bar{s}},\mu ) \Vert _{\bar{x}}^* + \hat{\alpha } \mu \Vert \delta _{{\bar{x}}} \Vert _{\bar{x}} \nonumber \\&\le (1-\hat{\alpha }) \beta \mu + \hat{\alpha } \mu \beta = \beta \mu \end{aligned}$$
(45)

where we used (44). Now define \(q{}=(\mu ^+)^{-1}\Vert {\bar{s}}^+ + \mu ^+ g_{{\bar{x}}} \Vert _{\bar{x}}^*\). Then estimating similarly to (40), we get

$$\begin{aligned} \mu ^+ q{}&\le \Vert \psi ({\bar{x}},{\bar{s}},\mu ) + (\mu ^+ - \mu ) g_{\bar{x}} + \hat{\alpha } \delta _{{\bar{s}}} \Vert _{\bar{x}}^* \\&\le \beta \mu (1 + \hat{\alpha } (\beta {\bar{\nu }}^{-1/2} + 1) ) \end{aligned}$$

and similarly to the computation in (39), we therefore find

$$\begin{aligned} \mu /\mu ^+ \le (1-\hat{\alpha } {\bar{\nu }}^{-1} \beta ^2)^{-1} \end{aligned}$$

so that altogether

$$\begin{aligned} q{}&\le \beta (1-\hat{\alpha } {\bar{\nu }}^{-1} \beta ^2)^{-1} (1 + \hat{\alpha } (\beta {\bar{\nu }}^{-1/2} + 1) ). \end{aligned}$$
(46)

Now we can apply inequality (36) with \(v_0 = {\bar{s}}^+\), \(t_2=\mu ^+\) and \({\bar{x}}_2 = {\bar{x}}^+\):

$$\begin{aligned} \Vert \psi ({\bar{x}}^+,{\bar{s}}^+,\mu ^+) \Vert _{{\bar{x}}^+}^* \le \mu ^+ \left( \frac{q{}}{1-q{}} \right) ^2 \end{aligned}$$
(47)

The reader can verify that for \(\hat{\alpha } \le 1/84\), \({\bar{\nu }} \ge 2\) and \(\beta \le 2\eta \le 1/3\), the bound (46) implies that applying (47) twice in succession yields

$$\begin{aligned} \Vert \psi ({\bar{x}}^+,{\bar{s}}^+,\mu ^+) \Vert _{{\bar{x}}^+}^* \le \frac{1}{2} \beta \le \eta \end{aligned}$$

and therefore \(z^+ \in \mathcal {N}(\eta )\) which proves Lemma 6.

Appendix 5: Algorithm complexity

From Lemma 3, we have that the linear residuals \(G(y; {\bar{x}})-(0; {\bar{s}})\) are reduced by a factor \((1-\alpha )\) in each iteration. Since we can always take \(\alpha = \Omega (1/\sqrt{\bar{\nu }})\), we see that \(G(y; {\bar{x}})-(0; {\bar{s}})\) decreases geometrically with a rate of \((1-\Omega {(1/\sqrt{\bar{\nu }})})\) which implies that

$$\begin{aligned} \Vert G(y; {\bar{x}})-(0; {\bar{s}}) \Vert \le \epsilon \Vert G(y^0; {\bar{x}}^0)-(0; {\bar{s}}^0) \Vert \end{aligned}$$

in \(\mathcal {O}{(\sqrt{{\bar{\nu }}}\log {(1/\epsilon )})} = \mathcal {O}{(\sqrt{{\nu }}\log {(1/\epsilon )})}\) iterations.
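The counting argument is elementary: a quantity multiplied by \((1 - c/\sqrt{{\bar{\nu }}})\) per iteration drops below \(\epsilon \) times its initial value after at most \(c^{-1}\sqrt{{\bar{\nu }}}\log {(1/\epsilon )}\) iterations, since \(-\log (1-\delta ) \ge \delta \). A sketch with a hypothetical constant \(c = 1\) (chosen only for illustration):

```python
import math

nu_bar, eps, c = 100.0, 1e-8, 1.0  # c is a hypothetical Omega(.) constant
rate = 1.0 - c / math.sqrt(nu_bar)

# Simulate the geometric reduction of the (relative) residual.
r, iters = 1.0, 0
while r > eps:
    r *= rate
    iters += 1

# O(sqrt(nu) log(1/eps)) bound: -log(1 - d) >= d gives at most
# sqrt(nu)/c * log(1/eps) iterations.
bound = math.ceil(math.sqrt(nu_bar) / c * math.log(1.0 / eps))
assert 0 < iters <= bound
```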

To see that the same holds for \(\mu (z)\), let us briefly use the following notation: \(z\) is the starting point, \(z^+\) is the point after prediction and \(z^{(j)}\) is the point after applying \(j\) correction steps starting in \(z^+\). Then from Lemma 3 and (34), we have

$$\begin{aligned} \mu (z^+)&\le (1-\alpha )\mu (z) + \alpha (1-\alpha ){\bar{\nu }}^{-1}\mu \eta k_{\bar{x}} \nonumber \\&\le \mu (z)(1-\alpha )(1+\alpha \eta k_{\bar{x}} {\bar{\nu }}^{-1})\nonumber \\&= \mu (z)(1-\Omega {(1/\sqrt{\bar{\nu }})}) \end{aligned}$$
(48)

Since \(\delta _{{\bar{x}}}^T \delta _{{\bar{s}}} = 0\), we see from (43) that

$$\begin{aligned} ({\bar{x}}^+)^T\delta _{{\bar{s}}} = \mu (z^+) \delta _{{\bar{x}}}^T g_{{\bar{x}}^+} = \delta _{{\bar{x}}}^T \psi ({\bar{x}}^+,{\bar{s}}^+,\mu (z^+) ) - \delta _{{\bar{x}}}^T {\bar{s}}^+ \end{aligned}$$
(49)

Therefore

$$\begin{aligned} {\bar{\nu }} \mu (z^{(1)})&= ({\bar{x}}^+ + \hat{\alpha } \delta _{{\bar{x}}})^T({\bar{s}}^+ + \hat{\alpha } \delta _{{\bar{s}}}) \mathop {=}\limits ^{(49)} ({\bar{x}}^+)^T({\bar{s}}^+) + \hat{\alpha } \delta _{{\bar{x}}}^T \psi ({\bar{x}}^+,{\bar{s}}^+,\mu (z^+) ) \\&\le {\bar{\nu }} \mu (z^+) + \hat{\alpha } \beta ^2 \mu (z^+) \\&= {\bar{\nu }} \mu (z^+)(1+\hat{\alpha } \beta ^2 {\bar{\nu }}^{-1}) \end{aligned}$$

and hence

$$\begin{aligned} \mu (z^{(2)})&\le \mu (z^+)(1+\hat{\alpha } \beta ^2 {\bar{\nu }}^{-1})^2 \\&\mathop {\le }\limits ^{(48)} \mu (z)(1-\Omega {(1/\sqrt{\bar{\nu }})})(1+\hat{\alpha } \beta ^2{\bar{\nu }}^{-1})^2 \\&= \mu (z)(1-\Omega {(1/\sqrt{\bar{\nu }})}) \end{aligned}$$

which shows that \(\mu (z)\) also decreases geometrically at the rate \((1-\Omega {(1/\sqrt{\bar{\nu }})})\). Therefore \( \mu (z) \le \epsilon \mu (z^0) \) in \(\mathcal {O}{(\sqrt{{\bar{\nu }}}\log {(1/\epsilon )})} = \mathcal {O}{(\sqrt{{\nu }}\log {(1/\epsilon )})}\) iterations, finishing the proof of Theorem 1.


Cite this article

Skajaa, A., Ye, Y. A homogeneous interior-point algorithm for nonsymmetric convex conic optimization. Math. Program. 150, 391–422 (2015). https://doi.org/10.1007/s10107-014-0773-1
