Abstract
A homogeneous interior-point algorithm for solving nonsymmetric convex conic optimization problems is presented. Starting each iteration from the vicinity of the central path, the method steps in the approximate tangent direction and then applies a correction phase to locate the next well-centered primal–dual point. Features of the algorithm include that it makes use only of the primal barrier function, that it is able to detect infeasibilities in the problem and that no phase-I method is needed. We prove convergence to \(\epsilon \)-accuracy in \({\mathcal {O}}(\sqrt{\nu } \log {(1/\epsilon )})\) iterations. To improve performance, the algorithm employs a new Runge–Kutta type second order search direction suitable for the general nonsymmetric conic problem. Moreover, quasi-Newton updating is used to reduce the number of factorizations needed, implemented so that data sparsity can still be exploited. Extensive and promising computational results are presented for the \(p\)-cone problem, the facility location problem, entropy maximization problems and geometric programs; all formulated as nonsymmetric convex conic optimization problems.
Notes
A positive semidefinite matrix with all non-negative entries is called doubly non-negative.
See Appendix 1 for a list of properties of this class of functions.
References
Andersen, E.D., Andersen, K.D.: The MOSEK interior point optimizer for linear programming: an implementation of the homogeneous algorithm. In: Frenk, H., Roos, K., Terlaky, T., Zhang, S. (eds.) High Performance Optimization, pp. 197–232. Kluwer, Boston (1999)
Andersen, E.D., Roos, C., Terlaky, T.: On implementing a primal–dual interior-point method for conic quadratic optimization. Math. Program. 95(2), 249–277 (2003)
Andersen, E.D., Ye, Y.: On a homogeneous algorithm for the monotone complementarity problem. Math. Program. 84(2), 375–399 (1999)
Ben-Tal, A., Nemirovski, A.S.: Lectures on Modern Convex Optimization: Analysis, Algorithms and Engineering Applications. SIAM, Philadelphia (2001)
Boyd, S., Kim, S.J., Vandenberghe, L., Hassibi, A.: A tutorial on geometric programming. Optim. Eng. 8, 67–127 (2007)
Butcher, J.C.: Numerical Methods for Ordinary Differential Equations, 2nd edn. Wiley, New York (2008)
Chares, P.R.: Cones and interior-point algorithms for structured convex optimization involving powers and exponentials. PhD thesis, Université catholique de Louvain (2009)
Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming, version 1.21 (2010). http://cvxr.com/cvx
Güler, O.: Barrier functions in interior point methods. Math. Oper. Res. 21, 860–885 (1996)
Karmarkar, N.: A new polynomial-time algorithm for linear programming. Combinatorica 4, 373–395 (1984)
Luo, Z.Q., Sturm, J.F., Zhang, S.: Conic convex programming and self-dual embedding. Optim. Method Softw. 14, 169–218 (2000)
Mehrotra, S.: On the implementation of a primal–dual interior point method. SIAM J. Optim. 2, 575–601 (1992)
MOSEK optimization software: developed by MOSEK ApS. www.mosek.com
Nesterov, Y.E.: Constructing self-concordant barriers for convex cones. CORE Discussion Paper (2006/30)
Nesterov, Y.E.: Towards nonsymmetric conic optimization. Optim. Method Softw. 27, 893–917 (2012)
Nesterov, Y.E., Nemirovski, A.S.: Interior-Point Polynomial Algorithms in Convex Programming. SIAM, Philadelphia (1994)
Nesterov, Y.E., Todd, M.J.: Self-scaled barriers and interior-point methods for convex programming. Math. Oper. Res. 22, 1–42 (1997)
Nesterov, Y.E., Todd, M.J.: Primal–dual interior-point methods for self-scaled cones. SIAM J. Optim. 8, 324–364 (1998)
Nesterov, Y.E., Todd, M.J., Ye, Y.: Infeasible-start primal–dual methods and infeasibility detectors for nonlinear programming problems. Math. Program. 84, 227–267 (1999)
Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. Springer, Berlin (2006)
Renegar, J.: A Mathematical View of Interior-Point Methods in Convex Optimization. SIAM, Philadelphia (2001)
Sturm, J.F.: Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optim. Method Softw. 12, 625–653 (1999)
Sturm, J.F.: Implementation of interior point methods for mixed semidefinite and second order cone optimization problems. Optim. Method Softw. 17, 1105–1154 (2002)
Tuncel, L.: Primal–dual symmetry and scale invariance of interior-point algorithms for convex optimization. Math. Oper. Res. 23, 708–718 (1998)
Tuncel, L.: Generalization of primal–dual interior-point methods to convex optimization problems in conic form. Found. Comput. Math. 1, 229–254 (2001)
Xu, X., Hung, P.F., Ye, Y.: A simplified homogeneous and self-dual linear programming algorithm and its implementation. Ann. Oper. Res. 62, 151–171 (1996)
Xue, G., Ye, Y.: An efficient algorithm for minimizing a sum of \(p\)-norms. SIAM J. Optim. 10, 551–579 (1999)
Acknowledgments
The authors thank Erling D. Andersen and Joachim Dahl of MOSEK ApS for many insights and for supplying us with test problems for the geometric programs and the entropy problems. The authors also thank the reviewers for many helpful comments.
Appendices
Appendix 1: Properties of the barrier function
Here we list some properties of logarithmically homogeneous self-concordant barriers (lhscb) that we use in this paper. Many more properties and proofs can be found in [17, 18].
Let \(\mathcal {K}^{\circ }\) denote the interior of \(\mathcal {K}\). We assume that \(F: \mathcal {K}^{\circ } \rightarrow \mathbb {R}\) is a lhscb for \(\mathcal {K}\) with barrier parameter \(\nu \). This means that \(F\) is a self-concordant barrier for \(\mathcal {K}\) and that for all \(x \in \mathcal {K}^{\circ }\) and \(t>0\), \(F(tx) = F(x) - \nu \log {t}\).
It follows that the conjugate of \(F\), denoted \(F^*\) and defined for \(s \in (\mathcal {K}^*)^{\circ }\) by \(F^*(s) = \sup _{x \in \mathcal {K}^{\circ }} \{ -s^T x - F(x) \}\),
is a lhscb for the dual cone \(\mathcal {K}^*\). Similarly to the notation used in [17, 18], we write the local Hessian norms on \(\mathcal {K}\) and \(\mathcal {K}^*\) as:
where \(H_s^* = \nabla ^2 F^*(s)\). Notice the different definitions of \(\Vert \cdot \Vert _y^*\) depending on whether \(y\) is in \(\mathcal {K}\) or \(\mathcal {K}^*\). Using this convention and that \(-g_x \in (\mathcal {K}^*)^\circ \) and \(H_{-g_x}^* = H_x^{-1}\), we see that
For \(x \in \mathcal {K}^\circ \), \(F\) satisfies the logarithmic-homogeneity identities \(H_x x = -g_x\), \(x^T g_x = -\nu \) and \(\Vert x \Vert _x^2 = (\Vert g_x \Vert _x^*)^2 = \nu \).
The Dikin ellipsoids are feasible [4]. That is: \(\{ u : \Vert u - x \Vert _x \le 1 \} \subseteq \mathcal {K}\).
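As an illustrative numerical check (not part of the paper), the lhscb identities above and the Dikin-ellipsoid feasibility can be verified for the standard log barrier \(F(x) = -\sum_i \log x_i\) on the nonnegative orthant, which has barrier parameter \(\nu = n\):

```python
# Sanity check of lhscb identities for F(x) = -sum(log(x_i)) on K = R^n_+,
# which has nu = n, g_x = -1/x and H_x = diag(1/x_i^2). Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n = 5
x = rng.uniform(0.5, 2.0, n)       # a point in the interior of K

g = -1.0 / x                        # gradient g_x
H = np.diag(1.0 / x**2)             # Hessian H_x

assert np.isclose(x @ g, -n)        # x^T g_x = -nu
assert np.allclose(H @ x, -g)       # H_x x = -g_x
assert np.isclose(x @ H @ x, n)     # ||x||_x^2 = nu

# Dikin ellipsoid feasibility: ||u - x||_x <= 1 implies u in K.
for _ in range(1000):
    d = rng.normal(size=n)
    d /= np.sqrt(d @ H @ d)         # normalize so ||d||_x = 1
    u = x + rng.uniform(0, 1) * d   # any point with ||u - x||_x <= 1
    assert np.all(u >= 0)
```

For this barrier, \(\Vert d \Vert_x \le 1\) means \(\sum_i d_i^2/x_i^2 \le 1\), hence \(|d_i| \le x_i\) for each \(i\), so \(x + d\) stays nonnegative, matching the general statement.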
Appendix 2: The homogeneous and self-dual model
1.1 Optimality and infeasibility certificate
Let \(G\) be defined by (5) and notice that \(G\) is skew-symmetric: \(G = -G^T\).
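The identity behind the lemma below is that \(u^T G u = 0\) for any skew-symmetric \(G\) and any vector \(u\). A small numerical sketch, assuming \(G\) has the block structure standard for homogeneous models, \(G = \bigl[\begin{smallmatrix} 0 & A & -b \\ -A^T & 0 & c \\ b^T & -c^T & 0 \end{smallmatrix}\bigr]\) (the exact definition is (5) in the paper):

```python
# Any G built with this antisymmetric block pattern satisfies G = -G^T,
# and hence u^T G u = 0 for every u. Data here is random and illustrative.
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 4
A = rng.normal(size=(m, n))
b = rng.normal(size=(m, 1))
c = rng.normal(size=(n, 1))

G = np.block([
    [np.zeros((m, m)), A,                 -b],
    [-A.T,             np.zeros((n, n)),   c],
    [b.T,              -c.T,               np.zeros((1, 1))],
])

assert np.allclose(G, -G.T)      # skew-symmetry
u = rng.normal(size=m + n + 1)
assert abs(u @ G @ u) < 1e-9     # u^T G u = 0
```

This is exactly why pre-multiplying the (hsd) equations by \((y; x; \tau)^T\) annihilates the \(G\) term and leaves \(x^T s + \tau \kappa = 0\).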
1. Observe that we can write (hsd) as \(G(y; x; \tau ) - (0; s; \kappa ) = 0\). Pre-multiplying this equation by \((y; x; \tau )^T\) and using the skew-symmetry of \(G\) gives \(x^T s + \tau \kappa = 0\).
2. \(\tau > 0\) implies \(\kappa = 0\) and hence \(b^T (y/\tau ) - c^T (x/\tau ) = 0\) and therefore \(x^T s = 0\). Dividing the first two linear feasibility equations of (hsd) by \(\tau \), we obtain the linear feasibility equations of (1). Thus \((x,y,s)/\tau \) is optimal for (pd).
3. If \(\kappa > 0\) then \(\tau = 0\), so \(Ax=0\) and \(A^T y + s = 0\). Further, \(c^T x - b^T y = -\kappa < 0\), so \(c^T x\) and \(-b^T y\) cannot both be non-negative. Assume \(-b^T y < 0\). If (pd) is primal-feasible, then there exists \(\bar{x} \in \mathcal {K}\) such that \(A \bar{x} = b\). But then \(0 > -b^T y = -\bar{x}^T A^T y = \bar{x}^T s \ge 0\), a contradiction. We can argue similarly if \(c^T x < 0\), and this completes the proof of Lemma 1.
1.2 Self-duality
The dual of the problem (hsd) is
After a few eliminations, we see that (24)–(26) are equivalent to
Through the following identification of variables
it is clear that the constraints (27) are equivalent to those of the problem (hsd). Since the objective function in both problems is constant zero, the two problems are identical and this proves Lemma 2.
Appendix 3: Prediction
The direction \(d_{z}\) is defined by
1.1 Reduction of residuals
We first show:
1. We get \({\bar{s}}^T d_{{\bar{x}}} + {\bar{x}}^T d_{{\bar{s}}} + {\bar{x}}^T {\bar{s}} \mathop {=}\limits ^{(29)} {\bar{s}}^T d_{{\bar{x}}} + {\bar{x}}^T(-{\bar{s}}-\mu H_{\bar{x}} d_{{\bar{x}}}) + {\bar{x}}^T {\bar{s}}\), which, after reduction, gives \(d_{{\bar{x}}}^T ({\bar{s}} - \mu H_{\bar{x}} {\bar{x}}) = \psi (z)^T d_{{\bar{x}}}\).
2. Equation (28) is equivalent to \(G(y+d_{y}; {\bar{x}}+d_{{\bar{x}}}) - (0; {\bar{s}}+d_{{\bar{s}}}) = 0\). Pre-multiplying this equation by \((y+d_{y}; {\bar{x}}+d_{{\bar{x}}})^T\) gives (31).
3. Now the lemma follows readily: we simply note that the first equation follows directly from elementary linear algebra. To show the second:
which after division by \({\bar{\nu }}\) proves Lemma 3.
1.2 Bounds on \({\bar{s}}\), \(d_{{\bar{s}}}\) and \(d_{{\bar{x}}}\)
Assume \(\Vert \psi \Vert _{\bar{x}}^* \le \eta \mu \). By definition, \(\psi = {\bar{s}} - \mu H_{\bar{x}} {\bar{x}}\), which after left-multiplication by \(H_{\bar{x}}^{-1/2}\), taking norms and squaring both sides gives
where we used (21) and \({\bar{x}}^T \psi = 0\).
This bound allows us to obtain bounds on \(d_{{\bar{x}}}\) and \(d_{{\bar{s}}}\): Left-multiplying (29) by \(H_{\bar{x}}^{-1/2}\), taking norms and squaring both sides gives
by the Cauchy–Schwarz inequality. Therefore: \( \mu ^2 \Vert d_{{\bar{x}}} \Vert _{\bar{x}}^2 \le (\Vert {\bar{s}} \Vert _{\bar{x}}^*)^2 + 2 \mu \Vert d_{{\bar{x}}} \Vert _{\bar{x}} \Vert \psi \Vert _{\bar{x}}^* \). Now subtracting \(2 \mu \Vert d_{{\bar{x}}} \Vert _{\bar{x}} \Vert \psi \Vert _{\bar{x}}^*\) and adding \((\Vert \psi \Vert _{\bar{x}}^*)^2\) to both sides, we get
or
For \(d_{{\bar{s}}}\), we similarly have
where \(k_{\bar{s}} := k_{\bar{x}} + \sqrt{(\eta ^2 + {\bar{\nu }}) + k_{\bar{x}}^2}\).
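The completing-the-square step used in both bounds is elementary; with \(a = \mu \Vert d_{\bar{x}} \Vert_{\bar{x}}\), \(b = \Vert \bar{s} \Vert_{\bar{x}}^*\) and \(c = \Vert \psi \Vert_{\bar{x}}^*\) (symbols chosen here for illustration), it reads:

```latex
a^2 \le b^2 + 2ac
\;\Longrightarrow\;
(a - c)^2 = a^2 - 2ac + c^2 \le b^2 + c^2
\;\Longrightarrow\;
a \le c + \sqrt{b^2 + c^2},
```

which is the pattern visible in the definition of \(k_{\bar{s}}\) above.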
1.3 Feasibility of \(z^+\)
Define \(\alpha _1 := k_{\bar{x}}^{-1} = \Omega (1/\sqrt{\bar{\nu }})\). Then for any \(\alpha \le \alpha _1\), we have
and so from (22), we conclude \({\bar{x}}+\alpha d_{{\bar{x}}} = {\bar{x}}^+ \in \bar{\mathcal {K}}\).
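Spelling out this step (a sketch, assuming (33) gives \(\Vert d_{\bar{x}} \Vert_{\bar{x}} \le k_{\bar{x}}\)): for \(\alpha \le \alpha_1 = k_{\bar{x}}^{-1}\),

```latex
\Vert ({\bar{x}} + \alpha d_{{\bar{x}}}) - {\bar{x}} \Vert_{\bar{x}}
= \alpha \, \Vert d_{{\bar{x}}} \Vert_{\bar{x}}
\le k_{\bar{x}}^{-1} \, k_{\bar{x}} = 1,
```

so \({\bar{x}}^+\) lies in the Dikin ellipsoid around \({\bar{x}}\), which is contained in \(\bar{\mathcal{K}}\).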
Now, define \(\alpha _2 := (1-\eta )k_{\bar{s}}^{-1} = \Omega (1/\sqrt{\bar{\nu }})\). Then for \(\alpha \le \alpha _2\), we have
Since \(-g_{\bar{x}} \in \bar{\mathcal {K}}^*\), we have by (23) that \(\mu ^{-1}{\bar{s}}^+ \in \bar{\mathcal {K}}^*\) and therefore \({\bar{s}}^+ \in \bar{\mathcal {K}}^*\). Therefore, Lemma 4 holds with \(\alpha = \min \{\alpha _1,\alpha _2\} = \Omega (1/\sqrt{\bar{\nu }}) = \Omega (1/\sqrt{\nu })\).
1.4 Bound on \(\psi ^+\)
First recall the definition (6): \(\psi ({\bar{x}},{\bar{s}},t) = {\bar{s}} + t g_{\bar{x}}\). Now consider for a fixed \(v_0\) the function
which is self-concordant with respect to \({\bar{x}}\). Define its Newton step by \( n_t({\bar{x}}) := -\nabla ^2 \Phi _t({\bar{x}})^{-1} \nabla \Phi _t({\bar{x}}) \). Define also \(q = \Vert n_{t_2}({\bar{x}}) \Vert _{\bar{x}}\). From the general theory of self-concordant functions, the following inequality holds. If \(q \le 1\), then
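For reference, the classical quadratic-convergence statement for self-concordant functions (of which the inequality here is an instance; the paper's exact formulation of (36) may differ in notation) is: if \(q = \Vert n_t({\bar{x}}) \Vert_{\bar{x}} < 1\) and \({\bar{x}}_+ := {\bar{x}} + n_t({\bar{x}})\), then

```latex
\Vert n_t({\bar{x}}_+) \Vert_{{\bar{x}}_+}
\;\le\;
\left( \frac{q}{1-q} \right)^{2}.
```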
For a proof of this relation, see e.g. Theorem 2.2.4 in [21]. With \(v_0 = {\bar{s}}^+, t_2=\mu ^+\) and \({\bar{x}}_2 = {\bar{x}}^+\), the inequality (36) is
where \(\mu ^+ q = \Vert H_{\bar{x}}^{-1}({\bar{s}}^+ + \mu ^+ g_{\bar{x}}) \Vert _{\bar{x}} = \Vert {\bar{s}}^+ + \mu ^+ g_{\bar{x}} \Vert _{\bar{x}}^*\). From Lemma 3 and (34):
By the assumption \(\Vert \psi \Vert _{\bar{x}}^* \le \eta \mu \) combined with (34), we have \(\psi ^T d_{{\bar{x}}} \ge -\eta k_{\bar{x}} \mu \). Therefore
Let us now obtain a bound on \(q\).
where we used (35), (38), (39) and the assumption \(\Vert \psi \Vert _{\bar{x}}^* \le \eta \mu \). Now the reader can verify that for \(\eta \le 1/6\) and \({\bar{\nu }}\ge 2\), we have the implication
which also implies \(q< 1\). Now by (37), we see that (41) implies \( \Vert \psi ^+ \Vert _{{\bar{x}}^+}^* \le 2\eta \mu ^+\) and hence \(z^+ \in \mathcal {N}(2\eta )\) which finishes the proof of Lemma 5.
Appendix 4: Correction phase
Assume \(\Vert \psi ({\bar{x}},{\bar{s}},\mu ) \Vert _{\bar{x}}^* \le \beta \mu \) where \(\mu := \mu (z)\) with \(z = ({\bar{x}},y,{\bar{s}})\). The equations defining the correction step \((\delta _{{\bar{x}}},\delta _{y},\delta _{{\bar{s}}})\) are
and the next point is then \(({\bar{x}}^+,y^+,{\bar{s}}^+) := ({\bar{x}},y,{\bar{s}}) + \hat{\alpha }(\delta _{{\bar{x}}},\delta _{y},\delta _{{\bar{s}}})\). Left-multiplying (42) by \((\delta _{y},\delta _{{\bar{x}}})^T\), we get \(\delta _{{\bar{x}}}^T \delta _{{\bar{s}}} = 0\). From (43), we then have
and therefore
From (43), we also have
where we used (44). Now define \(q = (\mu ^+)^{-1}\Vert {\bar{s}}^+ + \mu ^+ g_{{\bar{x}}} \Vert _{\bar{x}}^*\). Then, estimating similarly to (40), we get
and similarly to the computation in (39), we therefore find
so that altogether
Now we can apply (36) with \(v_0 = {\bar{s}}^+\), \(t=\mu \) and \({\bar{x}}_2 = {\bar{x}}^+\):
The reader can verify that for \(\hat{\alpha } \le 1/84, {\bar{\nu }} \ge 2, \beta \le 2\eta \le 1/3\), the bound (46) implies that when recursively using (47) twice, we obtain
and therefore \(z^+ \in \mathcal {N}(\eta )\) which proves Lemma 6.
Appendix 5: Algorithm complexity
From Lemma 3, we have that the linear residuals \(G(y; {\bar{x}})-(0; {\bar{s}})\) are reduced by a factor \((1-\alpha )\) in each iteration. Since we can always take \(\alpha = \Omega (1/\sqrt{\bar{\nu }})\), we see that \(G(y; {\bar{x}})-(0; {\bar{s}})\) decreases geometrically with a rate of \((1-\Omega {(1/\sqrt{\bar{\nu }})})\) which implies that
in \(\mathcal {O}{(\sqrt{{\bar{\nu }}}\log {(1/\epsilon )})} = \mathcal {O}{(\sqrt{{\nu }}\log {(1/\epsilon )})}\) iterations.
To see that the same holds for \(\mu (z)\), let us briefly use the following notation: \(z\) is the starting point, \(z^+\) is the point after prediction and \(z^{(j)}\) is the point after applying \(j\) correction steps starting in \(z^+\). Then from Lemma 3 and (34), we have
Since \(\delta _{{\bar{x}}}^T \delta _{{\bar{s}}} = 0\), we see from (43) that
Therefore
and hence
which shows that also \(\mu (z)\) is decreased geometrically with a rate of \((1-\Omega {(1/\sqrt{\bar{\nu }})})\). Therefore \( \mu (z) \le \epsilon \mu (z^0) \) in \(\mathcal {O}{(\sqrt{{\bar{\nu }}}\log {(1/\epsilon )})} = \mathcal {O}{(\sqrt{{\nu }}\log {(1/\epsilon )})}\) iterations, finishing the proof of Theorem 1.
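The step from the geometric rate to the iteration bound can be checked numerically. A small sketch (the constant \(c\) in the rate \(1-c/\sqrt{\bar{\nu}}\) is hypothetical): since \(-\log(1-r) \ge r\), the number of iterations with \((1-c/\sqrt{\bar{\nu}})^k \le \epsilon\) satisfies \(k \le (\sqrt{\bar{\nu}}/c)\log(1/\epsilon) + 1\).

```python
# Count iterations for a quantity (like mu(z)) decreasing geometrically at
# rate (1 - c/sqrt(nu)), and compare with the O(sqrt(nu) log(1/eps)) bound.
import math

def iterations_to_eps(nu: float, eps: float, c: float = 0.5) -> int:
    """Smallest k with (1 - c/sqrt(nu))**k <= eps, computed by simulation."""
    rate = 1.0 - c / math.sqrt(nu)
    mu, k = 1.0, 0
    while mu > eps:
        mu *= rate
        k += 1
    return k

for nu in (10, 100, 1000):
    k = iterations_to_eps(nu, 1e-8)
    bound = (math.sqrt(nu) / 0.5) * math.log(1e8)  # (sqrt(nu)/c) log(1/eps)
    assert k <= bound + 1                          # matches the O(.) estimate
```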
Skajaa, A., Ye, Y. A homogeneous interior-point algorithm for nonsymmetric convex conic optimization. Math. Program. 150, 391–422 (2015). https://doi.org/10.1007/s10107-014-0773-1
Keywords
- Convex optimization
- Nonsymmetric conic optimization
- Homogeneous self-dual model
- Interior-point algorithm