Abstract
The Riemannian barycentre (or Fréchet mean) is the workhorse of data analysis for data taking values in Riemannian manifolds. The Riemannian barycentre of a probability distribution P on a Riemannian manifold M is a possible generalisation of the concept of expected value, at least when the barycentre is unique. Knowing when the barycentre of P is unique is of fundamental importance for its interpretation and computation. Existing results can only guarantee this uniqueness by assuming P is supported inside a convex geodesic ball \(B(x^*,\delta ) \subset M\). This assumption is overly restrictive since many distributions have support equal to M yet are sufficiently concentrated within a convex geodesic ball that they nevertheless have a unique barycentre. This paper studies the concentration of Gibbs distributions on Riemannian manifolds and gives conditions for the barycentre to be unique. Specifically, consider the Gibbs distribution \(P =P_{\scriptscriptstyle T}\) with unnormalised density \(\exp \left( -U/T\right) \) for some potential \(U:M\rightarrow {\mathbb {R}}\) and some temperature \(T > 0\). If M is a simply connected compact Riemannian symmetric space, and U has a unique global minimum at \(x^*\), then for each \(\delta < \frac{1}{2}r_{\scriptscriptstyle cx}\) (\(r_{\scriptscriptstyle cx}\) the convexity radius of M), there exists a critical temperature \(T_{\scriptscriptstyle \delta }\) such that \(T < T_{\scriptscriptstyle \delta }\) implies \(P_{\scriptscriptstyle T}\) has a unique Riemannian barycentre \({\bar{x}}_{\scriptscriptstyle T}\) and this \({\bar{x}}_{\scriptscriptstyle T}\) belongs to the geodesic ball \(B(x^*,\delta )\). Moreover, if U is invariant by geodesic symmetry about \(x^*\), then \({\bar{x}}_{\scriptscriptstyle T} = x^*\). Remarkably, this conclusion does not require the potential U to be smooth and therefore serves as the foundation of a new general algorithm for black-box optimisation. 
This algorithm is briefly illustrated with two numerical experiments.
References
Fréchet, M.R.: Les éléments aléatoires de nature quelconque dans un espace distancié. Ann. Inst. H. Poincaré 10(4), 215–310 (1948)
Said, S., Hajri, H., Bombrun, L., Vemuri, B.C.: Gaussian distributions on Riemannian symmetric spaces: statistical learning with structured covariance matrices. IEEE Trans. Inf. Theory 64(2), 752–772 (2018)
Chakraborty, R., Vemuri, B.C.: Statistics on the compact Stiefel manifold: theory and applications. Ann. Stat. 47(1), 415–438 (2019)
Afsari, B.: Riemannian \(L^p\) center of mass: existence, uniqueness, and convexity. Proc. Am. Math. Soc. 139(2), 655–673 (2010)
Petersen, P.: Riemannian Geometry, 2nd edn. Springer, New York (2006)
Karcher, H.: Riemannian center of mass and mollifier smoothing. Commun. Pure Appl. Math. 30(5), 509–541 (1977)
Mardia, K.V., Jupp, P.E.: Directional Statistics. Wiley, Chichester (2000)
Kendall, D.G.: Shape manifolds, Procrustean metrics, and complex projective spaces. Bull. Lond. Math. Soc. 16(2), 82–121 (1984)
Srivastava, A., Klassen, E.: Bayesian and geometric subspace tracking. Adv. Appl. Probab. 36(1), 43–56 (2004)
Buss, S.R., Fillmore, J.P.: Spherical averages and applications to spherical splines and interpolation. ACM Trans. Graph. 20(2), 95–126 (2001)
Kantorovich, L.V., Akilov, G.P.: Functional Analysis, 2nd edn. Pergamon Press, Oxford (1982)
Villani, C.: Optimal Transport, Old and New, 2nd edn. Springer, Berlin (2009)
Wong, R.: Asymptotic Approximations of Integrals. Society for Industrial and Applied Mathematics, Philadelphia (2001)
Chavel, I.: Riemannian Geometry, A Modern Introduction. Cambridge University Press, Cambridge (2006)
Rifford, L.: A Morse–Sard theorem for the distance function on Riemannian manifolds. Manuscripta Math. 113(2), 251–265 (2004)
Helgason, S.: Differential Geometry, Lie Groups, and Symmetric Spaces. American Mathematical Society, Providence (2001)
Beals, R., Wong, R.: Special Functions, A Graduate Text. Cambridge University Press, Cambridge (2010)
Roberts, G.O., Rosenthal, J.S.: General state space Markov chains and MCMC algorithms. Probab. Surv. 1, 20–71 (2004)
Arnaudon, M., Dombry, C., Phan, A., Yang, L.: Stochastic algorithms for computing means of probability measures. Stoch. Proc. Appl. 122, 1437–1455 (2012)
Durmus, A., Jiménez, P., Moulines, E., Said, S., Wai, H.T.: Convergence analysis of Riemannian stochastic approximation schemes. arXiv:2005.13284
Robert, C.P., Casella, G.: Introducing Monte Carlo methods with R. Springer, New York (2010)
Nesterov, Yu., Spokoiny, V.G.: Random gradient-free minimisation for convex functions. Found. Comput. Math. 17, 527–566 (2017)
Rall, L.B.: Automatic Differentiation: Techniques and Applications. Springer, Berlin (1981)
Bogachev, V.I.: Measure Theory, vol. I. Springer, Berlin (2007)
Hurewicz, W., Wallman, H.: Dimension Theory. Princeton University Press, Princeton (1941)
Crittenden, R.: Minimum and conjugate points in symmetric spaces. Can. J. Math. 14, 320–328 (1962)
Ferreira, R., Xavier, J., Costeira, J.P., Barroso, V.: Newton algorithms for Riemannian distance related problems on connected locally-symmetric manifolds. IEEE J. Sel. Top. Signal Process. 7(4), 634–645 (2013)
Besse, A.L.: Manifolds All of Whose Geodesics are Closed. Springer, New York (1978)
Appendices
An elementary example
Let \(M = S^2\), the unit sphere in \({\mathbb {R}}^3\). Then, M is a simply connected, compact rank-one symmetric space (see 1.3). It is even a space of constant sectional curvature, equal to \(+1\).
Fix some \(x^* \in M\), and consider the potential function \(U(x) = \frac{1}{2}\,d^2(x,x^*)\). In (3), this gives rise to a “Gaussian” distribution,
with the normalising constant,
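In the notation used below, the density and normalising constant take the following form (a hedged reconstruction of the displays (33a)–(33b), consistent with the integration formula (45c) for \(S^2\); here \(d(x,x^*)\) denotes the geodesic distance):

```latex
% reconstruction (assumption): Gaussian density and normalising constant on S^2
p_{\scriptscriptstyle T}(x) \;=\; \frac{1}{Z(T)}\,
  \exp\!\left(-\,\frac{d^{\,2}(x,x^*)}{2T}\right),
\qquad
Z(T) \;=\; 2\pi \int_0^{\pi} \exp\!\left(-\,\frac{r^2}{2T}\right) \sin r \,\mathrm{d}r .
```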
The concentration of the barycentres of \(P_{\scriptscriptstyle T}\) can be understood using Proposition 1. This proposition may be applied, since U(x) has a unique global minimum, at \(x = x^*\), and verifies (5), with \(\mu _{\min } = \mu _{\max } = 1\). In particular, Inequality (7) takes on the form,
for any \(T\le T_{\scriptscriptstyle W}\) where \(T_{\scriptscriptstyle W} \approx 7\times 10^{\scriptscriptstyle -2}\) (this is found from (12)). Then, using (6), it follows:
\(\diamond \) let \(B(x^*,\eta )\) be any open ball centred at \(x^*\). All Riemannian barycentres, of the Gaussian distribution (33a), lie within \(B(x^*,\eta )\), as soon as \(8\pi T < \left( \eta \big /\pi \right) ^4\).
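This concentration statement can be probed numerically. The sketch below is our own illustration, not part of the paper: the sampler, the function names, and the temperature \(T = 0.1\) are all assumptions. It draws from the Gaussian distribution on \(S^2\) by rejection sampling and computes the Riemannian barycentre by intrinsic gradient descent on the variance function.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 0.1                              # temperature (illustrative choice)
x_star = np.array([0.0, 0.0, 1.0])   # global minimum of the potential U

def dist(x, y):
    # geodesic (great-circle) distance on the unit sphere S^2
    return np.arccos(np.clip(np.dot(x, y), -1.0, 1.0))

def sample_gibbs(n):
    # rejection sampling: uniform proposals on S^2, accepted with
    # probability exp(-d^2(x, x^*) / (2 T)) <= 1
    out = []
    while len(out) < n:
        v = rng.normal(size=3)
        v /= np.linalg.norm(v)
        if rng.random() < np.exp(-dist(v, x_star) ** 2 / (2 * T)):
            out.append(v)
    return np.array(out)

def log_map(x, y):
    # inverse of the Riemannian exponential, Log_x(y), for y outside Cut(x)
    w = y - np.dot(x, y) * x
    nw = np.linalg.norm(w)
    return np.zeros(3) if nw < 1e-12 else dist(x, y) * w / nw

def exp_map(x, v):
    # Riemannian exponential Exp_x(v) on S^2
    nv = np.linalg.norm(v)
    return x if nv < 1e-12 else np.cos(nv) * x + np.sin(nv) * v / nv

def frechet_mean(pts, iters=50):
    # intrinsic gradient descent on the variance function
    x = pts[0].copy()
    for _ in range(iters):
        x = exp_map(x, np.mean([log_map(x, p) for p in pts], axis=0))
    return x

x_bar = frechet_mean(sample_gibbs(500))
print(dist(x_bar, x_star))           # small: the barycentre lies near x*
```

With this seed and temperature, the estimated barycentre lands within a few hundredths of a radian of \(x^*\), in line with the concentration bound above.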
The probability density function in (33a) is not smooth. Indeed, it depends on the squared distance function \(\frac{1}{2}\,d^2(x,x^*)\), whose Hessian diverges on the cut locus \(\mathrm {Cut}(x^*) = \lbrace -x^*\rbrace \) (see 1.2). This appears twice under the integral, defining the variance function
Therefore, it is not immediately clear that the variance function should be smooth. However, this function is indeed smooth, by Proposition 2. In turn, Proposition 2 can be applied, since M is simply connected, so that Lemma 1 holds. In the present case, it is possible to give an elementary proof of this lemma:
\(\diamond \) let \(\gamma : I \rightarrow M = S^2\) be a geodesic, defined on a compact interval I. For each \(t \in I\), \(\mathrm {Cut}(\gamma (t))\) is made up of one point, \(\mathrm {Cut}(\gamma (t)) = \lbrace - \gamma (t)\rbrace \). If \(\mathrm {Cut}(\gamma )\) is the union of \(\mathrm {Cut}(\gamma (t))\), for each \(t \in I\), then \(\mathrm {Cut}(\gamma )\) is itself the image of the geodesic \(-\gamma \) (that is, \(\mathrm {Cut}(\gamma ) = -\gamma (I)\)). Thus, it is clear that \(\mathrm {Cut}(\gamma )\) is of dimension 1, strictly less than the dimension of M. In particular, the Riemannian volume (the spherical surface area) of \(\mathrm {Cut}(\gamma )\) is equal to 0.
While Proposition 2 guarantees \({\mathcal {E}}_{\scriptscriptstyle T}(x)\) is \(C^2\)-smooth throughout M, Proposition 3 implies:
\(\diamond \) for each \(\delta < \frac{\pi }{4}\), there exists \(T_{\scriptscriptstyle \delta }\) such that \(T < T_{\scriptscriptstyle \delta }\) implies: (i) \({\bar{x}}_{\scriptscriptstyle T} = x^*\) is the unique Riemannian barycentre of the Gaussian distribution (33a). (ii) The variance function \({\mathcal {E}}_{\scriptscriptstyle T}(x)\) is strongly convex on \(B(x^*,\delta )\).
The critical temperature \(T_{\scriptscriptstyle \delta }\) may be found from (13). In fact, it may be seen that \({\bar{x}}_{\scriptscriptstyle T} = x^*\) is the unique Riemannian barycentre of the distribution (33a), as soon as \(T < 6\times 10^{\scriptscriptstyle -4}\). This value of the critical temperature seems “small”. In addition, numerical experiments suggest \({\bar{x}}_{\scriptscriptstyle T} = x^*\) is the unique barycentre, for values of T up to ten times larger. This may be explained as follows.
The critical temperature \(T_{\scriptscriptstyle \delta }\) is introduced in Lemma 2, in order to ensure the variance function is strongly convex on \(B(x^*,\delta )\), where \(\delta < \frac{\pi }{4}\). Strong convexity implies the variance function has a unique stationary point in \(B(x^*,\delta )\), necessarily the Riemannian barycentre, \({\bar{x}}_{\scriptscriptstyle T} = x^*\). However, it is possible to drop this requirement of strong convexity, and still maintain the uniqueness of the barycentre. This would allow for the upper bound \(T < T_{\scriptscriptstyle \delta }\) to be relaxed.
Of course, the argument just made (i.e., convexity is not necessary for uniqueness) is of a general nature, not restricted to the present example, \(M = S^2\). Therefore, one hopes that Proposition 3 may still be considerably strengthened, for general, simply connected, compact Riemannian symmetric spaces.
Symmetric spaces of the compact type
This appendix presents the material on Riemannian symmetric spaces of the compact type, needed in the proofs of Propositions 2 and 3. This material is here derived, after some additional work, from Helgason’s monograph [16] (Chapters IV and VII).
In all of the following, let M be a Riemannian symmetric space of the compact type, with Riemannian metric \(\langle \cdot ,\cdot \rangle \). There exists a compact, connected, and semisimple Lie group G, which acts transitively and isometrically on M. Given some \(x \in M\), denote by K the isotropy subgroup of x in G. Then, \(M \simeq G/K\) as a Riemannian homogeneous space.
The Riemannian geometry of M can be described in algebraic terms. Let \({\mathfrak {g}}\) and \({\mathfrak {k}}\) be the Lie algebras of G and K, and denote by B the Killing form of \({\mathfrak {g}}\). If \({\mathfrak {p}}\) is the orthogonal complement of \({\mathfrak {k}}\) with respect to B, then the tangent space \(T_xM\) may be identified with \({\mathfrak {p}}\), and the Riemannian metric of M is given by (possibly after re-scaling)
In addition, the curvature tensor \(R_x\) verifies
where \([\cdot ,\cdot ]\) denotes the Lie bracket. Here, the fact that M is a Riemannian symmetric space guarantees the right-hand side always belongs to \({\mathfrak {p}}\).
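The two displays referred to above, (34a) and (34b), are standard for spaces of the compact type (see [16]); with the normalisation by the Killing form taken as an assumption, they can be sketched as:

```latex
% hedged reconstruction: metric and curvature at x, with T_xM identified with \mathfrak{p}
\langle u, v \rangle_x \;=\; -\,B(u, v),
\qquad u, v \in \mathfrak{p},
\tag{34a}
```
```latex
R_x(u, v)\,w \;=\; -\,[[u, v], w],
\qquad u, v, w \in \mathfrak{p}.
\tag{34b}
```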
Now, it will be very useful to further simplify (34a) and (34b). Precisely, let \({\mathfrak {a}}\) be a maximal Abelian subspace of \({\mathfrak {p}}\), and \(\Delta _{\scriptscriptstyle +}\) the set of positive roots of \({\mathfrak {g}}\) with respect to \({\mathfrak {a}}\). Then, each \(u \in {\mathfrak {p}}\) can be written in the form \(u = \mathrm {Ad}(k)a\) for some \(k \in K\) and \(a \in {\mathfrak {a}}\) (where \(\mathrm {Ad}\) denotes the adjoint representation), and the Riemannian metric of M is given by
where \(m_{\scriptscriptstyle \lambda }\) is the multiplicity of the root \(\lambda \). In addition, the curvature tensor \(R_x\) verifies
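A form of the display (35b) consistent with these conventions (written here for \(u = a \in {\mathfrak {a}}\); the general case follows by applying \(\mathrm {Ad}(k)\) — this reconstruction is an assumption):

```latex
R_x(a, v)\,a \;=\; \sum_{\lambda \in \Delta_+} \lambda(a)^2\, \Pi_{\lambda}\, v,
\qquad a \in \mathfrak{a}, \;\; v \in \mathfrak{p},
\tag{35b}
```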
where each \({\Pi }_{\lambda }\) is an orthogonal projector of rank \(m_\lambda \,\). From (35b) it is possible to see that M has non-negative sectional curvatures, and that \({\Pi }_{{\mathfrak {a}}}\) and the \({\Pi }_{\lambda }\) form a complete system of orthogonal projectors in \({\mathfrak {p}}\) (here, \({\Pi }_{{\mathfrak {a}}}\) is the orthogonal projector onto \({\mathfrak {a}}\)).
With these simplifications, it is possible to obtain the closed form solution of the Jacobi equation. Indeed, assume that \(\gamma \) is a geodesic which leaves x at \(t = 0\), with initial velocity u. Recall the Jacobi equation, whose solution J is a vector field along \(\gamma \) (for example, see [5]),
where the dot denotes the covariant derivative along \(\gamma \). Moreover, recall that the curvature tensor of a symmetric space is parallel. Therefore, if \(j(t) = P^{\scriptscriptstyle -1}_tJ(t)\), where \(P_t\) denotes parallel transport from \(\gamma (0)\) to \(\gamma (t)\), then j(t) solves an ordinary differential equation with constant coefficients. Precisely, from (35b),
where the prime denotes the derivative with respect to t. The solution of (36b) is standard, and immediately provides the solution of (36a). Indeed, for the initial conditions \(J(0) = 0\) and \({\dot{J}}(0) = v\), it follows from the definition of j(t),
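A closed form consistent with (35b) and these initial conditions (our reconstruction; the projectors \(\Pi^k := \mathrm {Ad}(k)\,\Pi \,\mathrm {Ad}(k)^{-1}\) account for \(u = \mathrm {Ad}(k)a\)):

```latex
J(t) \;=\; P_t \left( t\,\Pi^k_{\mathfrak{a}}\, v
  \;+\; \sum_{\lambda \in \Delta_+}
  \frac{\sin\!\big(\lambda(a)\,t\big)}{\lambda(a)}\,\Pi^k_{\lambda}\, v \right),
\qquad J(0) = 0, \;\; \dot{J}(0) = v,
```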
as can be checked directly. This solution of the Jacobi equation will be the main ingredient in the following.
1.1 Integration in polar coordinates
Assume now, as in Propositions 2 and 3, that M is a simply connected compact Riemannian symmetric space. Being compact and simply connected, M has no Euclidean de Rham factor, and is therefore of the compact type [16]. For the purpose of computing integrals on M, it seems desirable to parameterise each point \(y \in M\) by so-called polar coordinates \(k \in K\) and \(a \in {\mathfrak {a}}\),
where \(\mathrm {Exp}\) is the Riemannian exponential mapping. However, these polar coordinates are not really coordinates, because they are not unique.
To obtain a unique parameterisation, let \(S = \left. K\big /K_{{\mathfrak {a}}}\right. \) where \(K_{{\mathfrak {a}}}\) denotes the centraliser of \({\mathfrak {a}}\) in K (that is, the set of \(k \in K\) such that \(\mathrm {Ad}(k)a = a\) for all \(a \in {\mathfrak {a}})\). Moreover, let \(C_{\scriptscriptstyle +}\) denote the set of \(a \in {\mathfrak {a}}\) such that \(\lambda (a) \in (0,\pi )\) for each \(\lambda \in \Delta _{\scriptscriptstyle +}\,\). Then, consider the mapping
where \(\beta (s,a) = \mathrm {Ad}(s)a\), and \({\bar{C}}_{\scriptscriptstyle +}\) is the closure of \(C_{\scriptscriptstyle +}\). It turns out that \(\varphi \) maps \(S\times {\bar{C}}_{\scriptscriptstyle +}\) onto the whole of M, and is a diffeomorphism of \(S\times C_{\scriptscriptstyle +}\) onto its image \(M_{\scriptscriptstyle r\,}\), which is the set of regular values of \(\varphi \,\). The proof of this last statement is not detailed here. It relies on an important result from [26]: if M is a simply connected compact Riemannian symmetric space, then the cut locus of any point x in M is identical to its first conjugate locus.
With this in mind, it is possible to compute the integral of any measurable function f on M. By Sard’s lemma, the set \(M-M_{\scriptscriptstyle r}\) has Riemannian volume equal to zero [24]. Therefore,
Or, since \(\varphi \) is a diffeomorphism onto \(M_{\scriptscriptstyle r\,}\),
where \(f(s,a) = (f\circ \varphi )(s,a)\) and D(a) is the Jacobian determinant of \(\varphi \), and where \(\omega \) is the invariant Riemannian volume induced on S from K.
To make use of (39b), it remains to express D(a). First, by a well-known result from Riemannian geometry (see [5]), in the notation of (37),
where \(d\,\mathrm {Exp}_x(u)\) denotes the derivative of \(\mathrm {Exp}_x\) at u. From (37), the determinant of this derivative is
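A reconstruction of this determinant, obtained by reading it off the Jacobi solution above (the \({\mathfrak {a}}\)-block contributes 1, and each root block contributes a factor per multiplicity; this form is an assumption):

```latex
\det\, d\,\mathrm{Exp}_x(u)
\;=\; \prod_{\lambda \in \Delta_+}
\left( \frac{\sin \lambda(a)}{\lambda(a)} \right)^{m_\lambda},
\qquad u = \mathrm{Ad}(k)\,a .
```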
Second, it can be shown, from the definition of the adjoint representation, that
(for a detailed calculation of the derivative \(d\beta (s,a)\) of \(\beta \) at (s, a), see Page 295 of [16]). Multiplying (40a) by (40b) provides the Jacobian determinant D(a) of \(\varphi \,\). Replacing this into (39b) gives
which is a formula for integration in polar coordinates, on any simply connected compact Riemannian symmetric space. Some examples of this formula are provided in 1.3 below.
1.2 The squared distance function
With the same assumptions as in 1.1, let \(y \in M\), and consider the squared distance function, \(f_y(x) = \frac{1}{2}\,d^2(x,y)\). This function will be \(C^2\) near x, whenever \(y \notin \mathrm {Cut}(x)\), where \(\mathrm {Cut}(x)\) denotes the cut locus of x. As noted above, \(\mathrm {Cut}(x)\) is identical to the first conjugate locus of x [26]. Using this result, \(\mathrm {Cut}(x)\) can be described in the following way.
In the notation of (37), the first conjugate locus of x is the locus of points \(\gamma (t_{\scriptscriptstyle c})\) where \(J(t_{\scriptscriptstyle c}) = 0\) for the first time after \(J(0) = 0\). Clearly, \(t_{\scriptscriptstyle c}\) is given by
where the absolute value can be dropped because it is always possible to choose \(a \in C_{\scriptscriptstyle +\,}\). But, if M is an irreducible symmetric space, then there exists a maximal root \(\kappa \in \Delta _{\scriptscriptstyle +\,}\), so that \(\kappa (a) \ge \lambda (a)\) for \(\lambda \in \Delta _{\scriptscriptstyle +}\) and \(a \in C_{\scriptscriptstyle +}\) [16] (Page 337). In this case, \(t_{\scriptscriptstyle c} = \pi /\kappa (a)\,\). On the other hand, if M is not irreducible, it is a product of irreducible Riemannian symmetric spaces (all of the compact type), say \(M = M_{\scriptscriptstyle 1}\times \cdots \times M_{\scriptscriptstyle s\,}\). Then, if \(\kappa _{\scriptscriptstyle 1}\,,\ldots ,\kappa _{\scriptscriptstyle s}\) are the corresponding maximal roots,
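A reconstruction of the corresponding display, with \(a = a_{\scriptscriptstyle 1} + \cdots + a_{\scriptscriptstyle s}\) the decomposition of a along the factors (this notation is an assumption):

```latex
t_{\scriptscriptstyle c} \;=\; \min_{1 \le \ell \le s}\; \frac{\pi}{\kappa_\ell(a_\ell)} .
```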
Now, from the definition of the Riemannian exponential mapping, it can be seen that
where \(\varphi \) was defined in (38b).
To express the derivatives of \(f_y(x)\), denote by \(G_x(y)\) and \(H_x(y)\) the gradient and the Hessian of this function. If \(y \notin \mathrm {Cut}(x)\) is given by \(y = \varphi (s,a)\) in the notation of (38b), then (with \(\cot \) the cotangent function)
where the orthogonal projectors \({\Pi }^s_{{\mathfrak {a}}}\) and \({\Pi }^s_{\lambda }\) are defined as in (35b). The proofs of (44a) and (44b) are not detailed here, but the reader may find partial proofs of these expressions in [27]. An important observation, in relation to (44b), is that \(H_x(y)\) blows up as y approaches \(\mathrm {Cut}(x)\). This is because \(\lambda (a)\cot \lambda (a)\) diverges to \(-\infty \) as \(\lambda (a)\) approaches \(\pi \,\).
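On \(M = S^2\), where the only root gives the Hessian eigenvalue \(r\cot r\) at distance r, this eigenvalue pattern can be checked by finite differences. The snippet below is our own sanity check, not part of the paper: it differentiates \(f_y\) twice along a direction normal to the geodesic joining x and y.

```python
import numpy as np

def f(x, y):
    # squared distance function f_y(x) = d^2(x, y) / 2 on the unit sphere S^2
    return 0.5 * np.arccos(np.clip(np.dot(x, y), -1.0, 1.0)) ** 2

r = 1.2                                    # distance d(x, y), with 0 < r < pi
y = np.array([0.0, 0.0, 1.0])
x = np.array([np.sin(r), 0.0, np.cos(r)])  # point at distance r from y
v = np.array([0.0, 1.0, 0.0])              # unit tangent at x, normal to the geodesic x -> y

def geodesic(t):
    # geodesic through x with initial velocity v
    return np.cos(t) * x + np.sin(t) * v

h = 1e-3
hess = (f(geodesic(h), y) - 2 * f(x, y) + f(geodesic(-h), y)) / h ** 2
print(hess, r / np.tan(r))                 # both approximately r * cot(r)
```

Taking r closer to \(\pi \) makes the computed value diverge to \(-\infty \), which is exactly the blow-up of \(H_x(y)\) near the cut locus described above.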
Many things can be said about the Riemannian geometry of M, from the above material. The maximum sectional curvature of M is equal to \(\kappa ^2 \,=\,\max \Vert \kappa _{\scriptscriptstyle \ell }\Vert ^2\) over \(\ell \in \lbrace 1,\ldots ,s\rbrace \), where \(\Vert \kappa _{\scriptscriptstyle \ell }\Vert \) denotes the Riemannian norm of \(\kappa _{\scriptscriptstyle \ell } \in {\mathfrak {a}}^*\) (\({\mathfrak {a}}^*\) being the dual space of \({\mathfrak {a}}\)). In addition, the closed (that is, periodic) geodesics of minimal length in M have length \(2\pi \kappa ^{-1}\) (see [16], Page 334). From (43), it is also possible to show the injectivity radius of M is equal to \(\pi \kappa ^{-1}\), which is half the minimal length of a closed geodesic.
The convexity radius of M is \(r_{\scriptscriptstyle cx} = \frac{\pi }{2}\kappa ^{-1\,}\). Indeed, if \(B(y,\delta )\) is a geodesic ball with radius \(\delta < \frac{\pi }{2}\kappa ^{-1}\), the distance between any two points in this ball is strictly less than the injectivity radius of M, so they can be connected by a unique geodesic. This geodesic will lie entirely in \(B(y,\delta )\), ensuring the convexity of this ball. This follows from the fact that \(f_y(x)\) is strongly convex on \(B(y,\delta )\), which in turn follows from (44b). Roughly, since the distance between x and y is equal to \(\Vert a \Vert \), the condition \(\Vert a \Vert< \delta < \frac{\pi }{2}\kappa ^{-1}\) implies all the eigenvalues \(\lambda (a)\cot \lambda (a)\) of \(H_x(y)\) are strictly positive, and uniformly bounded below as x ranges over the ball \(B(y,\delta )\). On the other hand, a geodesic ball \(B(y,\delta )\) with radius \(\delta = \frac{\pi }{2}\kappa ^{-1}\) fails to be convex. This is because this ball contains two antipodal points on a closed geodesic of minimal length which passes through y.
Remark
The most direct route to proving (44b) is through the Riccati equation (called the radial curvature equation in [5]). In particular, this shows that \(H_x(y)\) is diagonal in any orthonormal basis which also diagonalises the linear mapping \(v\mapsto R_x(u,v)u\) in (35b). Then, the eigenvalues \(\lambda (a)\cot \lambda (a)\) of \(H_x(y)\) can be found from the solution of a scalar Riccati equation. This approach to proving (44b) is different from the one employed in [27].
1.3 Compact rank-one symmetric spaces
The rank of the symmetric space M is the dimension of the maximal Abelian subspace \({\mathfrak {a}}\) of \({\mathfrak {p}}\). If this dimension is equal to 1, then M is called a compact rank-one symmetric space. Concretely, M is one of the following: a Euclidean sphere, a real, complex or quaternion projective space, or the Cayley projective plane. All of these are simply connected, except real projective spaces, and all are examples of manifolds all of whose geodesics are closed (this is the title of the book [28]).
Compact rank-one symmetric spaces are the only symmetric spaces of strictly positive sectional curvatures. A sphere or a real projective space has constant sectional curvature \(\kappa ^{\scriptscriptstyle 2}\). For a complex or quaternion projective space, the sectional curvatures range over the interval \([\kappa ^{\scriptscriptstyle 2}/4,\kappa ^{\scriptscriptstyle 2}]\) (\(\kappa > 0\) is a constant which is fixed by choosing the unit used to measure length). In the special case where M is a compact rank-one symmetric space, the material in 1.1 and 1.2 lends itself to several simplifications.
To begin with, M is an irreducible symmetric space. Indeed, if \({\hat{a}}\) is a unit vector in \({\mathfrak {a}}\), then any unit vector \(u \in {\mathfrak {p}}\) can be written \(u = \mathrm {Ad}(k)({\hat{a}})\) for some \(k \in K\). But, this means the adjoint representation is transitive on the unit sphere in \({\mathfrak {p}}\), so M is indeed irreducible.
Now, any \(a \in {\mathfrak {a}}\) is of the form \(r{\hat{a}}\) for some real r. Consider the linear forms
The set \(\Delta _{\scriptscriptstyle +}\) of positive roots is either \(\Delta _{\scriptscriptstyle +} = \lbrace \kappa \rbrace \) or \(\Delta _{\scriptscriptstyle +} = \lbrace \kappa , \lambda \rbrace \). The first case corresponds to a sphere or real projective space of dimension n, where \(m_{\scriptscriptstyle \kappa } = n-1\). A complex or quaternion projective space of dimension n falls within the second case, where \(m_{\scriptscriptstyle \kappa } = 1\) in the complex subcase and \(m_{\scriptscriptstyle \kappa } = 3\) in the quaternion subcase. In general, if \(d = m_{\scriptscriptstyle \kappa } + 1\), then d divides n.
For the polar coordinate parameterisation (38b), note that \(S = \left. K\big /K_{{\mathfrak {a}}}\right. \) can be identified with a unit sphere \(S^{n-1}\). On the other hand, \(C_{\scriptscriptstyle +}\) is the set of \(a = r{\hat{a}}\) where \(\kappa \,r \in (0,\pi )\). Accordingly, if \(s \in S \simeq S^{n-1}\) and \({\hat{s}} = \beta (s,{\hat{a}})\),
Therefore, it is clear that r and \({\hat{s}}\) are Riemannian spherical coordinates. If M is simply connected, then \(\varphi \) is a diffeomorphism of \(S^{n-1}\times (0,\pi \kappa ^{\scriptscriptstyle -1})\) onto its image, which is the set of regular values of \(\varphi \). As in 1.1, this leads to a formula for integration, which is a special case of (41),
where \(r_{\scriptscriptstyle c} \,=\, \pi \kappa ^{\scriptscriptstyle -1}\) and \(\omega _n(d{\hat{s}})\) is the area element of \(S^{n-1\,}\). Formula (45c) remains true if M is not simply connected, but with \(r_{\scriptscriptstyle c} \,=\, \frac{\pi }{2}\kappa ^{\scriptscriptstyle -1}\).
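For \(M = S^2\) (so \(n = 2\), \(\kappa = 1\), \(r_{\scriptscriptstyle c} = \pi \)), this integration formula can be verified numerically: integrating the constant function 1 must return the area \(4\pi \) of the sphere. The snippet below is our own check, not from the paper; it also evaluates the normalising constant of the Gaussian density from the elementary example, at an arbitrarily chosen temperature.

```python
import numpy as np

kappa = 1.0                          # the unit sphere S^2 has curvature kappa^2 = 1
r_c = np.pi / kappa                  # cut-locus radius (M simply connected)
r = np.linspace(0.0, r_c, 200001)
dr = r[1] - r[0]

# volume of S^2 in polar coordinates: length of the unit circle S^1, times the
# radial density sin(kappa r) / kappa (the integrand vanishes at both endpoints,
# so a plain Riemann sum coincides with the trapezoidal rule)
area = 2 * np.pi * np.sum(np.sin(kappa * r) / kappa) * dr
print(area)                          # approximately 4 * pi = 12.566...

# the same formula gives the normalising constant of the "Gaussian" density
# exp(-d^2(x, x^*) / (2 T)) of the elementary example
T = 0.1                              # illustrative temperature (an assumption)
Z = 2 * np.pi * np.sum(np.exp(-r ** 2 / (2 * T)) * np.sin(kappa * r) / kappa) * dr
print(Z)
```

The first printed value recovers the spherical surface area to high accuracy, confirming the Jacobian density; the second is the constant that normalises (33a) at this temperature.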
The cut locus of any point \(x \in M\) is the geodesic sphere centred at x and of radius \(r_{\scriptscriptstyle c\,}\). In other words,
In fact, if M is simply connected, this is a direct result of (42a). The right-hand side of (45d) is known as the antipodal set of x : the set of points in M which lie farthest away from x. In [16] (Page 330), this is shown to be a totally geodesic submanifold of M, and also a compact rank-one symmetric space.
Admitting that \(\mathrm {Cut}(x)\) is a submanifold of M, it is not difficult to see from (45d) that this submanifold has dimension \(n-d\) if M is simply connected, and \(n-1\) otherwise. If M is a sphere, then \(\mathrm {Cut}(x)\) is a single point, while if M is a projective space (real, complex, or quaternion), then \(\mathrm {Cut}(x)\) is a projective space (real, complex, or quaternion)—see [28].
In relation to the remark made after Lemma 1 in Paragraph 2.2, it can here be seen why the conclusion of this lemma does not hold in the case where M is a real projective space. In this case, if \(\gamma (t_{\scriptscriptstyle 0}) = x\), for some \(t_{\scriptscriptstyle 0} \in I\), then \(\mathrm {Cut}(x)\) is a submanifold of dimension \(n-1\), as just discussed. But, then \(\mathrm {Cut}(\gamma )\) is an immersed submanifold in M, locally diffeomorphic to \((-1,1) \times \mathrm {Cut}(x)\), so that it has dimension n, equal to the dimension of M. This also shows the Riemannian volume of \(\mathrm {Cut}(\gamma )\) is non-zero.
Said, S., Manton, J.H. Riemannian barycentres of Gibbs distributions: new results on concentration and convexity in compact symmetric spaces. Info. Geo. 4, 329–362 (2021). https://doi.org/10.1007/s41884-021-00055-5