Large deviations in renewal models of statistical mechanics

Published 14 November 2019 © 2019 IOP Publishing Ltd
Citation: Marco Zamparo 2019 J. Phys. A: Math. Theor. 52 495004. DOI: 10.1088/1751-8121/ab523f

1751-8121/52/49/495004

Abstract

In Zamparo (2019 (arXiv:1903.03527)) the author recently established sharp large deviation principles for cumulative rewards associated with a discrete-time renewal model, supposing that each renewal involves a broad-sense reward taking values in a separable Banach space. The renewal model was identified there with constrained and non-constrained pinning models of polymers, which amount to Gibbs changes of measure of a classical renewal process. In this paper we show that the constrained pinning model is the mathematical structure common to the Poland–Scheraga model of DNA denaturation and to some relevant one-dimensional lattice models of statistical mechanics, such as the Fisher–Felderhof model of fluids, the Wako–Saitô–Muñoz–Eaton model of protein folding, and the Tokar–Dreyssé model of strained epitaxy. Then, in the framework of the constrained pinning model, we develop an analytical characterization of the large deviation principles for cumulative rewards corresponding to multivariate deterministic rewards that are uniquely determined by, and at most of the order of magnitude of, the time elapsed between consecutive renewals. In particular, we outline the explicit calculation of the rate functions and subsequently identify the conditions that prevent them from being analytic and that underlie affine stretches in their graphs. Finally, we apply the general theory to the number of renewals. From the point of view of equilibrium statistical physics and statistical mechanics, cumulative rewards of the above type are the extensive observables that enter the thermodynamic description of the system. The number of renewals, which turns out to be the commonly adopted order parameter for the Poland–Scheraga model and also for the other renewal models of statistical mechanics, is one of these observables.


1. Introduction

Renewal models describe events that are randomly renewed over time. Extensive use of renewal models as classical stochastic processes is made in different areas of applied mathematics, including queueing theory [2], insurance [3], and finance [4] among others. With a different interpretation of the time coordinate, these models also enter equilibrium statistical physics through the phenomena of polymer pinning and melting of DNA. Indeed, the thermodynamics of a polymer that is pinned by a substrate at certain monomers, regarded as renewed events along the polymer chain, is studied by a renewal model called the pinning model [5, 6]. Similarly, DNA denaturation upon heating has been investigated by Poland and Scheraga [7, 8] through a renewal model where renewed events identify base pairs along the DNA sequence. Formally, the Poland–Scheraga model is a constrained pinning model obtained from the pinning model under the condition that one of the renewals occurs at a predetermined position corresponding to the DNA size [5]. Although it is generally not recognized, the constrained pinning model is also the mathematical essence of some significant one-dimensional lattice models of statistical mechanics. They are the cluster model of fluids proposed by Fisher and Felderhof [9–13], the model of protein folding introduced independently by Wako and Saitô first [14, 15] and Muñoz and Eaton later [16–18], and the model of strained epitaxy considered by Tokar and Dreyssé [19–21]. These models have attracted the interest of many researchers due to exact solvability, often encouraging generalizations such as in the case of the Wako–Saitô–Muñoz–Eaton model [22–29].

In the framework of discrete-time renewal models, identified with constrained and non-constrained homogeneous pinning models, the author [1] has recently established large deviation principles for cumulative rewards, supposing that each renewal involves a broad-sense reward taking values in a separable Banach space. Deterministic rewards that are uniquely determined by, and at most of the order of magnitude of, the time elapsed between consecutive renewals constitute a special class of rewards for which the theory can be further developed in an analytical direction. This class of rewards deserves attention from the point of view of equilibrium statistical physics and statistical mechanics, because the corresponding cumulative rewards are the extensive observables that enter the thermodynamic description of the system. So far, analytical characterizations of large deviation principles for macroscopic observables have been provided only for a few lattice models of statistical mechanics, including the Curie–Weiss model [30], the Curie–Weiss–Potts model [31], the mean-field Blume–Emery–Griffiths model [32], and the Ising model to some extent [33–36].

The present paper reconsiders the constrained pinning model as defined in [1] with a dual purpose. First of all, it aims to propose a unified formulation of the Poland–Scheraga model, the Fisher–Felderhof model, the Wako–Saitô–Muñoz–Eaton model, and the Tokar–Dreyssé model as a constrained pinning model. The latter three models are customarily presented in terms of binary occupation numbers that are here interpreted as indicators of hypothetical renewals, thus constituting the so-called regenerative phenomenon associated by Kingman with a renewal process [37]. To the best of our knowledge, the mapping of these models to renewal systems has never been shown before. Second, the paper aims to characterize analytically, within the constrained pinning model, the rate functions associated with large deviation principles for cumulative rewards corresponding to multivariate deterministic rewards, thus providing a portrayal of the (possibly joint) fluctuations of macroscopic observables. In doing this, the conditions that prevent the rate functions from being analytic and that underlie affine stretches in their graphs are identified. The connection between the singular behavior of rate functions and critical phenomena has been gaining considerable interest in the physics community, as demonstrated by several recent works [38–49], two of which deal with renewal processes [47, 48]. Renewal models supply a perfect framework to probe this connection as they are able to account for phase transitions of any order [5], while at the same time making explicit results feasible, in contrast to most models of statistical mechanics.

The paper is organized as follows. In section 2 we introduce the framework of pinning models together with deterministic rewards. In this section we also report, specialized to deterministic rewards, the large deviation principle obtained in [1] for constrained pinning models. In section 3 we explain the role of the constrained pinning model in statistical mechanics, briefly reviewing the Fisher–Felderhof model, the Wako–Saitô–Muñoz–Eaton model, and the Tokar–Dreyssé model as well as the Poland–Scheraga model. The rate functions corresponding to the constrained pinning model are studied in section 4, where their explicit calculation is outlined and their main analytical properties are classified. Here we also single out a critical constrained pinning model where persistent large fluctuations of extensive observables lead to subexponential decays of probabilities that cannot be captured by a large deviation principle. An example concerning the number of renewals is finally proposed to show how the analytical theory developed in the section works in practice. The major mathematical proofs are reported in the appendices in order to not interrupt the flow of the presentation.

2. Pinning models, deterministic rewards, and large deviations

In this section we review the framework of pinning models as defined in [1]. Then, we introduce a class of deterministic rewards in the Euclidean d-space $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ and, focusing on constrained pinning models, we specialize to such class the large deviation principle established in [1] for cumulative rewards associated with general rewards in separable Banach spaces.

2.1. Pinning models

The pinning model considered in [1] calls for a probability space $(\Omega,\mathcal{F},\mathbb{P})$ and random variables $S_1,S_2,\ldots$ on it that take values in $\{1,2,\ldots\}\cup\{\infty\}$ and form an independent and identically distributed sequence. In the classical theory of renewal processes, the variable $S_i$ is regarded as the waiting time for the $i$th occurrence at the renewal time $T_i:=S_1+\cdots+S_i$ of some event that is continuously renewed over time. Instead, here we imagine that a polymer consisting of $t\geqslant 1$ monomers is pinned by a substrate at the monomers $T_1,T_2,\ldots$ in such a way that the monomer $T_i$ contributes an energy $-v(S_i)$ provided that $T_i\leqslant t$. The real function $v$ is called the potential. The state of the polymer is described by the law $\mathbb{P}_t$ defined on the measurable space $(\Omega,\mathcal{F})$ by the Gibbs change of measure

$$\frac{{\rm d}\mathbb{P}_t}{{\rm d}\mathbb{P}}:=\frac{{\rm e}^{H_t}}{Z_t},$$

where $H_t:=\sum\nolimits_{i\geqslant 1}v(S_i)\mathbb{1}_{\{T_i\leqslant t\}}$ is the Hamiltonian and the normalization constant $Z_t:=\mathbb{E}[{\rm e}^{H_t}]$ is the partition function. The model $(\Omega,\mathcal{F},\mathbb{P}_t)$ is precisely the pinning model, which we supply with the hypotheses of aperiodicity and extensivity. The waiting time distribution $p:=\mathbb{P}[S_1=\cdot\,]$ is said to be aperiodic if its support $\mathcal{S}:=\{s\geqslant 1:p(s)>0\}$ is nonempty and there does not exist an integer $\tau>1$ with the property that $\mathcal{S}$ includes only multiples of $\tau$. We observe that p can be made aperiodic by simply changing the time unit whenever $\mathbb{P}[S_1<\infty]>0$.

Assumption 1. The waiting time distribution p is aperiodic.

The potential $v$ is said to be extensive if there exists a real number $z_o$ such that ${\rm e}^{v(s)}p(s)\leqslant {\rm e}^{z_os}$ for all s. Extensivity is necessary to make the thermodynamic limit of the pinning model meaningful since $Z_t\geqslant\mathbb{E}[{\rm e}^{H_t}\mathbb{1}_{\{S_1=t\}}]={\rm e}^{v(t)}p(t)$.

Assumption 2. The potential $v$ is extensive.

This paper focuses on the constrained pinning model where the last monomer is always pinned by the substrate. The constrained pinning model as introduced in [1] corresponds to the law $\mathbb{P}_t^c$ defined on the measurable space $(\Omega,\mathcal{F})$ through the change of measure

$$\frac{{\rm d}\mathbb{P}_t^c}{{\rm d}\mathbb{P}}:=\frac{U_t{\rm e}^{H_t}}{Z_t^c},$$

$U_t:=\sum\nolimits_{i\geqslant 1}\mathbb{1}_{\{T_i=t\}}$ being the renewal indicator that takes value 1 if t is a renewal and value 0 otherwise, and $ \newcommand{\Ex}{\mathbb{E}} Z_t^c:=\Ex[U_t{\rm e}^{H_t}]$ being the partition function. Aperiodicity of the waiting time distribution gives $Z_t^c>0$ for all sufficiently large t [1], thus ensuring that the constrained pinning model is well-defined at least for such t.
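The definition $Z_t^c:=\mathbb{E}[U_t{\rm e}^{H_t}]$ lends itself to a simple numerical evaluation. The sketch below is not taken from [1]: it assumes the standard renewal-equation argument (conditioning on the first waiting time), which gives $Z_t^c=\sum_{s=1}^t{\rm e}^{v(s)}p(s)Z_{t-s}^c$ with $Z_0^c:=1$. The geometric waiting time distribution and the zero potential are hypothetical choices made only for illustration.

```python
import math


def constrained_partition(t_max, p, v):
    """Return [Z_0^c, ..., Z_{t_max}^c] from the renewal recursion
    Z_t^c = sum_{s=1}^t exp(v(s)) p(s) Z_{t-s}^c with Z_0^c = 1 (assumed)."""
    z = [1.0] + [0.0] * t_max
    for t in range(1, t_max + 1):
        z[t] = sum(math.exp(v(s)) * p(s) * z[t - s] for s in range(1, t + 1))
    return z


def p_geometric(s):
    """Hypothetical aperiodic waiting time distribution p(s) = 2^{-s}."""
    return 0.5 ** s


def v_zero(s):
    """Hypothetical potential v = 0, so that Z_t^c = P[U_t = 1]."""
    return 0.0


Z = constrained_partition(10, p_geometric, v_zero)
print(Z[1:])
```

With this particular choice the partition function reduces to the renewal probability $\mathbb{P}[U_t=1]$, which is positive for every $t\geqslant 1$, in line with the remark that aperiodicity gives $Z_t^c>0$ for all sufficiently large t.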

2.2. Deterministic rewards and large deviation principles

The cumulative reward by the integer time t is $W_t:=\sum\nolimits_{i\geqslant 1}X_i\mathbb{1}_{\{T_i\leqslant t\}}$, supposing that the $i$th renewal involves a reward $X_i$ valued in a vector space and possibly dependent on $S_i$. Notice that $W_t$ reduces to the number $N_t:=\sum\nolimits_{\tau=1}^tU_\tau$ of renewals by t when $X_i=1$ for all i. The large deviation theory developed in [1] describes the fluctuations of $W_t$ within constrained and non-constrained pinning models for rewards that are generic random variables valued in a real separable Banach space. In this paper we deepen the study for the special case of deterministic rewards of the form $X_i:=f(S_i)$ for each i, where f is a function on $\{1,2,\ldots\}\cup\{\infty\}$ that takes values in the Euclidean d-space $\mathbb{R}^d$ and satisfies the following assumption.

Assumption 3. If the support $\mathcal{S}$ of the waiting time distribution is infinite, then $f(s)/s$ has a limit $ \newcommand{\Rl}{\mathbb{R}} r\in\Rl^d$ when s goes to infinity through $\mathcal{S}$ .

Under this assumption, there exists a positive constant $M<\infty$ such that $\|f(s)\|\leqslant Ms$ for every $s\in\mathcal{S}$ , meaning that f is at most of the order of magnitude of the waiting time. From now on, $u\cdot v$ denotes the usual dot product between u and $v$ in $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ and $\|u\|:=\sqrt{u\cdot u}$ is the Euclidean norm of u.

The large deviation principle for cumulative rewards in constrained pinning models stated by theorem 1 of [1] can be specialized to deterministic rewards as follows. With reference to the formalism of [1], it is convenient here to identify a linear functional $\varphi$ on $\mathbb{R}^d$ with the unique $k\in\mathbb{R}^d$ such that $\varphi(w)=k\cdot w$ for all w. Let z be the function that maps each point $k\in\mathbb{R}^d$ to the extended real number $z(k)$ defined by

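$$z(k):=\inf\bigg\{\zeta\in\mathbb{R}\,:\,\sum_{s\geqslant 1}{\rm e}^{k\cdot f(s)+v(s)-\zeta s}\,p(s)\leqslant 1\bigg\}, \tag{1}$$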

where the infimum over the empty set is customarily interpreted as $\infty$ . Denote by I the Fenchel–Legendre transform of $z-z(0)$ , which associates every vector $ \newcommand{\Rl}{\mathbb{R}} w\in\Rl^d$ with the extended real number $I(w)$ defined by

$$I(w):=\sup_{k\in\mathbb{R}^d}\big\{w\cdot k-z(k)+z(0)\big\}. \tag{2}$$

We point out that the function z is finite everywhere under assumptions 1–3. Indeed, given any $k\in\mathbb{R}^d$, assumption 1 entails $\sum\nolimits_{s\geqslant 1}p(s)>0$ and hence yields $\sum\nolimits_{s\geqslant 1}{\rm e}^{k\,\cdot f(s)+v(s)-\zeta s}\,p(s)>1$ for all sufficiently negative $\zeta$, so that $z(k)>-\infty$. At the same time, the bounds ${\rm e}^{v(s)}p(s)\leqslant {\rm e}^{z_os}$ for each s with some real number $z_o$ by assumption 2 and $\|f(s)\|\leqslant M s$ for every $s\in\mathcal{S}$ with some constant $M<\infty$ by assumption 3 give $\sum\nolimits_{s\geqslant 1}{\rm e}^{k\cdot f(s)+v(s)-\zeta s}\,p(s)\leqslant\sum\nolimits_{s\geqslant 1}2^{-s}=1$ for all $\zeta\geqslant z_o+M\|k\|+\ln2$, thus implying $z(k)\leqslant z_o+M\|k\|+\ln2<\infty$. The finiteness of z allows us to obtain the following strong version of theorem 1 of [1], which extends Cramér's theorem to the cumulative reward $W_t$ within the constrained pinning model $(\Omega,\mathcal{F},\mathbb{P}_t^c)$.
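The finiteness argument above also suggests a way to evaluate z numerically. The following sketch assumes that (1) defines $z(k)$ as the infimum of the numbers $\zeta$ for which $\sum_{s\geqslant 1}{\rm e}^{k\cdot f(s)+v(s)-\zeta s}p(s)\leqslant 1$, as the discussion indicates, and that the support of p is finite, so that the sum is continuous and strictly decreasing in $\zeta$ and bisection applies. The function name `z_of_k`, the bracketing interval, and the two-point waiting time distribution are hypothetical choices.

```python
import math


def z_of_k(k, support, p, v, f, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection for z(k) = inf{zeta : sum_s exp(k.f(s) + v(s) - zeta*s) p(s) <= 1}.

    Assumes a finite support and that [lo, hi] brackets the solution,
    i.e. the sum exceeds 1 at lo and is at most 1 at hi.
    """
    def total(zeta):
        return sum(
            math.exp(sum(ki * fi for ki, fi in zip(k, f(s))) + v(s) - zeta * s) * p(s)
            for s in support
        )

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total(mid) <= 1.0:
            hi = mid      # mid is admissible: the infimum is not larger
        else:
            lo = mid      # mid is not admissible: the infimum is larger
    return hi


# Hypothetical example: p(1) = p(2) = 1/2, v = 0, and f(s) = 1 (number of renewals).
support = [1, 2]
z0 = z_of_k((0.0,), support, lambda s: 0.5, lambda s: 0.0, lambda s: (1.0,))
z1 = z_of_k((1.0,), support, lambda s: 0.5, lambda s: 0.0, lambda s: (1.0,))
print(z0, z1)
```

For this example the defining condition reads ${\rm e}^k(x+x^2)/2\leqslant 1$ with $x:={\rm e}^{-\zeta}$, so that $z(k)=-\ln\big[(\sqrt{1+8{\rm e}^{-k}}-1)/2\big]$ and in particular $z(0)=0$, which the bisection reproduces.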

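Theorem 1. The following conclusions hold under assumptions 1–3:

  • (a)  
    the function z is finite everywhere and convex. The function I is lower semicontinuous and proper convex;
  • (b)  
    if $G\subseteq\mathbb{R}^d$ is an open set, then $$\liminf_{t\uparrow\infty}\frac{1}{t}\ln\mathbb{P}_t^c\bigg[\frac{W_t}{t}\in G\bigg]\geqslant-\inf_{w\in G}I(w);$$
  • (c)  
    if $F\subseteq\mathbb{R}^d$ is either a closed set or a Borel convex set, then $$\limsup_{t\uparrow\infty}\frac{1}{t}\ln\mathbb{P}_t^c\bigg[\frac{W_t}{t}\in F\bigg]\leqslant-\inf_{w\in F}I(w).$$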

The lower bound in part (b) and the upper bound in part (c) are called, respectively, large deviation lower bound and large deviation upper bound [50, 51]. When a lower semicontinuous function I exists so that the large deviation lower bound holds for each open set G and the large deviation upper bound holds for each closed set F, then $W_t$ is said to satisfy a large deviation principle with rate function I [50, 51]. Theorem 1 states that the cumulative reward $W_t$ satisfies a large deviation principle with rate function I given by (2) within the constrained pinning model. We observe that the rate function I has compact level sets, and is thus a good rate function [50, 51]. Indeed, the level set $\{w\in\mathbb{R}^d:I(w)\leqslant a\}$ for a given positive real number a is closed by the lower semicontinuity of I. It is also bounded because, if $w\ne 0$ belongs to this set, then the bounds $w\cdot k-z(k)+z(0)\leqslant I(w)\leqslant a$ and $z(k)\leqslant z_o+M\|k\|+\ln2$ together imply $\|w\|\leqslant a+z_o+M+\ln2-z(0)$ when the choice $k:=w/\|w\|$ is made.
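To illustrate how the rate function can be evaluated in practice, the sketch below approximates the Fenchel–Legendre transform $I(w)=\sup_k\{wk-z(k)+z(0)\}$ on a grid for d = 1. It relies on a hypothetical example, namely $p(1)=p(2)=1/2$, $v=0$, and $f=1$, for which the defining condition of z can be solved in closed form as $z(k)=-\ln\big[(\sqrt{1+8{\rm e}^{-k}}-1)/2\big]$. The grid bounds and resolution are arbitrary choices and the maximization is only a crude approximation of the supremum.

```python
import math


def z(k):
    """Closed form of z(k) for the hypothetical example p(1) = p(2) = 1/2, v = 0, f = 1."""
    x = (-1.0 + math.sqrt(1.0 + 8.0 * math.exp(-k))) / 2.0   # x = exp(-z(k))
    return -math.log(x)


def rate(w, k_max=20.0, n=4001):
    """Grid approximation of I(w) = sup_k {w*k - z(k) + z(0)} over k in [-k_max, k_max]."""
    ks = [-k_max + 2.0 * k_max * i / (n - 1) for i in range(n)]
    return max(w * k - z(k) + z(0.0) for k in ks)


w_typical = 2.0 / 3.0          # reciprocal of the mean waiting time 3/2
print(rate(w_typical), rate(0.9))
```

In this example the number of renewals per unit time concentrates around the reciprocal of the mean waiting time, so the rate function vanishes at $w=2/3$ and is strictly positive elsewhere, as the numerical values confirm.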

3. Uses of the constrained pinning model in statistical mechanics

This section revolves around the binary process $\{U_t\}_{t\geqslant 0}$ of renewal indicators, with $U_0:=1$. We recall that $U_t:=1$ if t is a renewal and $U_t:=0$ otherwise for each $t\geqslant 1$. From a mathematical point of view, the finite-dimensional marginals of the process $\{U_t\}_{t\geqslant 0}$ with respect to the constrained pinning model coincide with the finite-volume Gibbs state associated with the Fisher–Felderhof model of fluids, the Wako–Saitô–Muñoz–Eaton model of protein folding, and the Tokar–Dreyssé model of strained epitaxy. Here we determine the finite-dimensional marginals of $\{U_t\}_{t\geqslant 0}$ with respect to constrained and non-constrained pinning models. Then, we briefly review the above models and the Poland–Scheraga model, sketching the mapping to the constrained pinning model.

3.1. Kingman's regenerative phenomena and pinning models

The binary process $\{U_t\}_{t\geqslant 0}$ is a discrete-time regenerative phenomenon according to Kingman [37], because it satisfies the following property. This property is proved in appendix A and comes from the fact that a renewal process forgets the past and starts over at every renewal.

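Proposition 1. For any $m\geqslant 1$ and instants $0=:\tau_0<\tau_1<\cdots<\tau_m$

$$\mathbb{P}\big[U_{\tau_1}=\cdots=U_{\tau_m}=1\big]=\prod_{i=1}^m\mathbb{P}\big[U_{\tau_i-\tau_{i-1}}=1\big].$$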
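Assuming that the property in question is the standard factorization $\mathbb{P}[U_{\tau_1}=\cdots=U_{\tau_m}=1]=\prod_{i=1}^m\mathbb{P}[U_{\tau_i-\tau_{i-1}}=1]$ that defines a regenerative phenomenon, it can be checked by brute force on a small example. The sketch below is only a numerical sanity check, not a proof: the defective waiting time distribution stored in `p` and the instants 3 and 7 are hypothetical choices.

```python
from itertools import product

# Hypothetical waiting time distribution with P[S_1 = infinity] = 0.2.
p = {1: 0.2, 2: 0.5, 3: 0.1}


def renewal_prob(t):
    """P[U_t = 1] from the renewal equation, with P[U_0 = 1] = 1."""
    r = [1.0] + [0.0] * t
    for n in range(1, t + 1):
        r[n] = sum(p.get(s, 0.0) * r[n - s] for s in range(1, n + 1))
    return r[t]


def joint_prob(instants):
    """P[U_tau = 1 for every tau in instants] by enumerating all binary paths
    u_1, ..., u_t up to the last instant t, where a renewal is required."""
    t = max(instants)
    total = 0.0
    for u in product((0, 1), repeat=t):          # u[tau - 1] plays the role of u_tau
        if any(u[tau - 1] == 0 for tau in instants):
            continue
        weight, last = 1.0, 0
        for tau in range(1, t + 1):
            if u[tau - 1] == 1:
                weight *= p.get(tau - last, 0.0)  # waiting time between consecutive renewals
                last = tau
        total += weight
    return total


lhs = joint_prob((3, 7))
rhs = renewal_prob(3) * renewal_prob(7 - 3)
print(lhs, rhs)
```

The two numbers agree up to rounding, which reflects the fact that the renewal process starts over at the renewal occurring at the instant 3.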

The finite-dimensional marginals of the process $\{U_t\}_{t\geqslant 0}$ with respect to constrained and non-constrained pinning models can be determined through the following argument. Fix a time $t\geqslant 1$ and binary numbers $u_1,\ldots,u_t$ that are supposed to contain $n:=\sum\nolimits_{\tau=1}^tu_\tau\geqslant 1$ ones in certain positions, the $i$th of which is written as $s_1+\cdots+s_i$. The distance $s\leqslant t$ between consecutive ones is attained a number of times equal to $\sum\nolimits_{i=1}^n\mathbb{1}_{\{s_i=s\}}$, which can be explicitly expressed in terms of $u_1,\ldots,u_t$ and $u_0=1$ as

$$\#_{s|t}(u_0,\ldots,u_t):=\sum_{\tau=s}^tu_{\tau-s}\prod_{k=\tau-s+1}^{\tau-1}(1-u_k)\,u_\tau, \tag{3}$$

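where the intermediate factor is not present when $s=1$. The distance between the position of the last one and t is $t-s_1-\cdots-s_n=\sum\nolimits_{\tau=1}^t\prod\nolimits_{k=\tau}^t(1-u_k)$. The condition $U_\tau=u_\tau$ for each $\tau\leqslant t$ is tantamount to the condition $S_i=s_i$ for each $i\leqslant n$ and $S_{n+1}>t-s_1-\cdots-s_n$, provided that $u_0=1$ since $U_0:=1$. It follows that

$$\mathbb{P}\big[U_0=u_0,\ldots,U_t=u_t\big]=\prod_{i=1}^np(s_i)\cdot\mathbb{P}\big[S_1>t-s_1-\cdots-s_n\big]=\prod_{s=1}^tp(s)^{\#_{s|t}(u_0,\ldots,u_t)}\cdot\mathbb{P}\bigg[S_1>\sum_{\tau=1}^t\prod_{k=\tau}^t(1-u_k)\bigg].$$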

This formula also holds for $n=0$, which corresponds to the case $u_1=\cdots=u_t=0$ that gives $\#_{s|t}(u_0,\ldots,u_t)=0$ for all s, since the probability that $U_1=\cdots=U_t=0$ is $\mathbb{P}[S_1>t]$. This way, we find that the finite-dimensional marginals of the process $\{U_t\}_{t\geqslant 0}$ with respect to the pinning model are expressed for every integer $t\geqslant 1$ and binary numbers $u_0,\ldots,u_t$ by

$$\mathbb{P}_t\big[U_0=u_0,\ldots,U_t=u_t\big]=\frac{u_0}{Z_t}\prod_{s=1}^t\big[{\rm e}^{v(s)}p(s)\big]^{\#_{s|t}(u_0,\ldots,u_t)}\cdot\mathbb{P}\bigg[S_1>\sum_{\tau=1}^t\prod_{k=\tau}^t(1-u_k)\bigg].$$

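As far as the constrained pinning model is concerned, adding the condition $U_t=1$ in this expression we get

$$\mathbb{P}_t^c\big[U_0=u_0,\ldots,U_t=u_t\big]=\frac{u_0u_t}{Z_t^c}\prod_{s=1}^t\big[{\rm e}^{v(s)}p(s)\big]^{\#_{s|t}(u_0,\ldots,u_t)}. \tag{4}$$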

The corresponding law for waiting times is obtained by noticing that n renewals occur by the time $t\geqslant 1$, namely $N_t=n$, and a renewal occurs exactly at the time t if and only if $T_n=\sum\nolimits_{i=1}^nS_i=t$. This argument gives for all positive integers n and $s_1,\ldots,s_n$ the formula

$$\mathbb{P}_t^c\big[S_1=s_1,\ldots,S_n=s_n,N_t=n\big]=\frac{\mathbb{1}_{\{s_1+\cdots+s_n=t\}}}{Z_t^c}\prod_{i=1}^n{\rm e}^{v(s_i)}p(s_i). \tag{5}$$

The probability distribution (4) is exactly the finite-volume Gibbs state associated with the Fisher–Felderhof model, the Wako–Saitô–Muñoz–Eaton model, and the Tokar–Dreyssé model, whereas the probability distribution (5) is the Poland–Scheraga model. The following four subsections illustrate such connections.
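The counting of distances between consecutive ones that underlies these marginals can be made concrete with a short script. The sketch below computes the number of times each distance s is attained in two ways: directly from the positions of the ones, and through the product expression $\sum_{\tau=s}^tu_{\tau-s}\prod_{k=\tau-s+1}^{\tau-1}(1-u_k)\,u_\tau$, which is assumed here to be the content of (3). The function names and the sample configuration are hypothetical.

```python
def gap_counts_direct(u):
    """u = (u_0, ..., u_t) with u_0 = 1: count the distances between consecutive ones."""
    counts, last = {}, 0
    for tau in range(1, len(u)):
        if u[tau] == 1:
            counts[tau - last] = counts.get(tau - last, 0) + 1
            last = tau
    return counts


def gap_count_formula(s, u):
    """Assumed form of (3): sum over tau of u_{tau-s} * prod_{k}(1 - u_k) * u_tau,
    with k running over the sites strictly between tau - s and tau."""
    t = len(u) - 1
    total = 0
    for tau in range(s, t + 1):
        term = u[tau - s] * u[tau]
        for k in range(tau - s + 1, tau):
            term *= 1 - u[k]
        total += term
    return total


u = (1, 0, 1, 1, 0, 0, 1)      # u_0 = 1 and u_t = 1 with t = 6
t = len(u) - 1
counts = gap_counts_direct(u)
print(counts)
print([gap_count_formula(s, u) for s in range(1, t + 1)])
```

For a configuration with $u_t=1$ the counts add up to the number of ones among $u_1,\ldots,u_t$, and the counts weighted by s add up to t. These are the two identities used later to recast the lattice models in the form (4).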

3.2. The model by Poland and Scheraga for melting of DNA

Most DNA molecules consist of two strands made up of nucleotide monomers. Monomers on one strand are bound to a specific matching monomer on the other strand, constituting the so-called Watson–Crick pairs that hold together the two strands in a double helix. Thermal denaturation of DNA is the process by which the two strands unravel upon heating, melting into bubbles where they are apart. The formation of a bubble results in an entropic gain $\sigma_l$, which is observed experimentally to depend on the length l of the denatured fragment as $\sigma_l\sim al+b-c\ln l$ with positive coefficients a, b, and c [8]. The logarithmic dependence can be explained as a consequence of loop closure since the bubble can be regarded as a loop of length 2l [8]. The Poland–Scheraga model is a simplified model that aims to describe thermal denaturation of DNA as a phase transition [7, 8]. The model considers a partially melted DNA molecule as being composed of an alternating sequence of bound segments and denatured segments that do not interact with one another. A bound segment of length $l\geqslant 1$ is favored by the energetic gain $\epsilon l$, the binding energy $\epsilon<0$ being taken to be the same for all matching monomers, whereas a denatured segment of length $l\geqslant 1$ is favored by a certain entropic gain $\sigma_l>0$. Here we suppose that $\epsilon$ is measured in units of $k_{\rm B}T$, where $k_{\rm B}$ is the Boltzmann constant and T is the absolute temperature. The mathematical construction of the Poland–Scheraga model with t monomers per strand assumes that there is a variable number n of consecutive stretches of positive lengths $s_1,\ldots,s_n$ that span a chain of t monomers: $1\leqslant n\leqslant t$ and $\sum\nolimits_{i=1}^ns_i=t$. The $i$th stretch is imagined to consist of just one bound monomer if $s_i=1$ and one denatured segment of length $s_i-1$ followed by one bound monomer if $s_i>1$. Within this scheme, monomers $s_1,s_1+s_2,\ldots,s_1+\cdots +s_n$ are bound and there is actually a bound segment of length $l\geqslant 2$ starting at position $s_1+\cdots+s_i$ if $s_i>1$ when $i>1$, $s_{i+1}=\cdots=s_{i+l-1}=1$, and $s_{i+l}>1$ when $i+l\leqslant n$. We notice that the last monomer is always bound under this construction. The probability that the Poland–Scheraga model assigns to a configuration with n stretches of lengths $s_1,\ldots,s_n$ reads

$${\rm PS}_t(n;s_1,\ldots,s_n):=\frac{1}{\Xi_t}\prod_{i=1}^n{\rm e}^{-\epsilon+\sigma_{s_i-1}}, \tag{6}$$

where $\sigma_0:=0$ and $\Xi_t$ is the partition function:

A similarity between the probability distributions (6) and (5) is evident and can be made tight as follows. Focusing on the configuration with only one large denatured segment we get $\ln \Xi_t\geqslant -\epsilon+\sigma_{t-1}$, so that a necessary condition for the free energy $\ln \Xi_t$ to be extensive in t is $\eta_o:=\limsup_{l\uparrow\infty}\sigma_l/l<\infty$. Under this condition, a real number $\eta\geqslant\eta_o$ can be found in such a way that $\sum_{s\geqslant 1}p(s)\leqslant 1$ with $p(s):={\rm e}^{\sigma_{s-1}-\eta s}$ for each s, and a constrained pinning model with waiting time distribution p and potential $v(s):=-\epsilon$ for every s can be devised. The distribution p is clearly aperiodic because $p(s)>0$ for all s. Such a constrained pinning model gives rise to the identities ${\rm PS}_t(n;s_1,\ldots,s_n)=\mathbb{P}_t^c[S_1=s_1,\ldots,S_n=s_n,N_t=n]$ whenever $\sum_{i=1}^ns_i=t$ and $\ln\Xi_t=\ln Z_t^c+\eta t$. Thus, we get an interpretation of the Poland–Scheraga model as a constrained pinning model where renewal times mark bound monomers. It is worth noting here that if $\sum_{s\geqslant 1}{\rm e}^{\sigma_{s-1}-\eta_o s}\geqslant 1$, then $\eta$ can be chosen in such a way that $\sum_{s\geqslant 1}p(s)=1$, giving $\mathbb{P}[S_1=\infty]=0$. If on the contrary $\sum_{s\geqslant 1}{\rm e}^{\sigma_{s-1}-\eta_o s}<1$, then any choice of $\eta$ entails $\mathbb{P}[S_1=\infty]>0$.
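The identity $\ln\Xi_t=\ln Z_t^c+\eta t$ can be verified numerically on a short chain. The sketch below makes several assumptions that go beyond the text: the partition function $\Xi_t$ is taken to be the sum of $\prod_{i=1}^n{\rm e}^{-\epsilon+\sigma_{s_i-1}}$ over all sequences of stretches spanning the chain, the asymptotic form $al+b-c\ln l$ is used as if it were exact for every $l\geqslant 1$, and $Z_t^c$ is computed through the renewal recursion. All numerical parameters are hypothetical.

```python
import math
from itertools import product

A, B, C = 1.0, 0.5, 2.0   # hypothetical loop-entropy coefficients a, b, c
EPS = -1.5                # hypothetical binding energy in units of k_B T
ETA = 2.0                 # hypothetical eta, large enough that sum_s p(s) <= 1


def sigma(l):
    """Loop entropy with sigma_0 = 0; the asymptotic law is taken as exact (assumption)."""
    return 0.0 if l == 0 else A * l + B - C * math.log(l)


def p(s):
    """Waiting time distribution p(s) = exp(sigma_{s-1} - eta*s)."""
    return math.exp(sigma(s - 1) - ETA * s)


def z_constrained(t):
    """Z_t^c for the potential v(s) = -EPS, via the renewal recursion with Z_0^c = 1."""
    z = [1.0] + [0.0] * t
    for n in range(1, t + 1):
        z[n] = sum(math.exp(-EPS) * p(s) * z[n - s] for s in range(1, n + 1))
    return z[t]


def xi_brute_force(t):
    """Sum of prod_i exp(-EPS + sigma_{s_i - 1}) over all stretches s_1 + ... + s_n = t."""
    total = 0.0
    for cuts in product((0, 1), repeat=t - 1):     # cuts[j] = 1: monomer j + 1 is bound
        ends = [j + 1 for j, c in enumerate(cuts) if c] + [t]   # last monomer always bound
        weight, last = 1.0, 0
        for e in ends:
            weight *= math.exp(-EPS + sigma(e - last - 1))
            last = e
        total += weight
    return total


t = 8
print(math.log(xi_brute_force(t)), math.log(z_constrained(t)) + ETA * t)
```

The agreement of the two numbers does not depend on the value of $\eta$, which only serves to turn the loop weights into a waiting time distribution with total mass at most 1.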

Extensive observables involved in the thermodynamic description of the system are, for example, the number $N_t$ of bound monomers per strand and the total loop entropy. They are the cumulative rewards $W_t$ corresponding to the deterministic rewards $X_i:=f(S_i)$ for each i with, respectively, $f(s)=1$ and $f(s)=\sigma_{s-1}$ for any s. The joint fluctuations of the number of bound monomers and the total loop entropy can be investigated by taking $f(s)=(1,\sigma_{s-1})$ for all s. As an application of the analytical theory developed in section 4, in section 4.4 we shall review the phase transition of the Poland–Scheraga model and investigate the fluctuations of $N_t$. In particular, we will see that there are situations where the fraction of bound monomers changes from being positive to being zero at a certain finite value of the binding energy while increasing $\epsilon$.

3.3. The model by Fisher and Felderhof for fluids

In 1970 Fisher and Felderhof published a series of papers where they introduced a many-body cluster interaction model of a one-dimensional continuum classical fluid [9–12]. In the thermodynamic limit, the model was found to exhibit a phase transition from a gas-like phase containing clusters of particles of all sizes to a liquid-like phase consisting essentially of a single macroscopic cluster [9, 10]. The discrete counterpart was later considered by Roepstorff [13], who formalized a lattice version of the model where, in a nutshell, if some site is not occupied by a particle, then the particles on the left do not interact with those on the right. This means that particles interact only when they fill a cluster of contiguous sites, contributing a certain energy $E_l<0$ when the cluster has size $l\geqslant 1$. The model by Fisher and Felderhof on a lattice is a lattice-gas model that can be introduced as follows by taking advantage of (3). If particles are arranged on t lattice sites and if the binary variable $u_\tau$ is associated with the site $\tau$ in such a way that $u_\tau=1$ denotes a hole and $u_\tau=0$ denotes a particle, then $\#_{s|t}(1,u_1,\ldots,u_t)$ defined by (3) counts the number of clusters with $s-1$ particles provided that $u_t=1$. This way, assuming that the last site is always a hole, namely $u_t=1$, the probability that the Fisher–Felderhof model with a chemical potential $\mu$ assigns to the configuration $u_1,\ldots,u_t$ can be written as

Equation (7)

where $\Xi_t$ is the partition function:

Here parameters are supposed to be expressed in units of $k_{\rm B}T$.

From the mathematical point of view, the probability distribution (7) is nothing but (4). This fact is understood by observing at first that the identities $\sum\nolimits_{s=1}^t\#_{s|t}(1,u_1,\ldots,u_t)=\sum\nolimits_{\tau=1}^tu_\tau$ and $\sum\nolimits_{s=1}^ts\,\#_{s|t}(1,u_1,\ldots,u_t)=t$, which are valid when $u_t=1$, allow us to recast (7) in the form

where the convention $E_0:=0$ has been made and an arbitrary number $\eta$ has been introduced. Second, let us notice that a necessary condition for the free energy $\ln \Xi_t$ to be extensive in t is $\eta_o:=\liminf\nolimits_{l\uparrow\infty}E_l/l>-\infty$. Under this condition, a real number $\eta\leqslant\eta_o$ exists with the property that $\sum\nolimits_{s\geqslant 1}p(s)\leqslant 1$ with $p(s):={\rm e}^{\eta s-E_{s-1}}$ for each $s\geqslant 1$. Then, the constrained pinning model with aperiodic waiting time distribution p and potential $v(s):=-\mu$ for every s satisfies ${\rm FF}_t(u_1,\ldots,u_t)=\mathbb{P}_t^c[U_0=u_0,\ldots,U_t=u_t]$ whenever $u_0=1$ and $\ln\Xi_t=\ln Z_t^c+(\mu-\eta)t$. This way, the Fisher–Felderhof model can be interpreted as a constrained pinning model where renewal times mark holes. As for the Poland–Scheraga model, $\eta$ can be chosen so that $\mathbb{P}[S_1=\infty]=0$ only when $\sum\nolimits_{s\geqslant 1}{\rm e}^{\eta_os-E_{s-1}}\geqslant 1$. The number $N_t$ of holes and the total energy are extensive observables entering the thermodynamic description of the system, the latter being the cumulative reward $W_t$ associated with the deterministic rewards $X_i:=E_{S_i-1}$ for every i. Similarly to the Poland–Scheraga model, the Fisher–Felderhof model can have a gas–liquid phase transition. We shall review this phase transition in section 4.4, showing that there are situations where the fraction of holes changes from being positive to being zero at a certain finite value of the chemical potential while increasing $\mu$.

3.4. The model by Wako, Saitô, Muñoz, and Eaton for protein folding

Most proteins consist of a long chain of amino acid monomers held together by peptide bonds. At physiological temperatures, peptide bonds are planar, rigid, covalent bonds that allow adjacent monomers to perform two rotations. Native values of the dihedral angles associated with these two rotations identify the functional three-dimensional structure of the protein. Protein folding is the cooperative process by which a polypeptide chain folds into its native shape from a random coil. The model by Wako and Saitô [14, 15] and Muñoz and Eaton [16–18] is a simplified discrete model that aims to describe the process of protein folding as a first-order-like phase transition. This model considers a protein made up of $t+1$ monomers as a sequence of t peptide bonds. A configuration of the protein is identified by associating the $i$th peptide bond with a binary variable $u_i$ taking value 0 if the dihedral angles are native and value 1 otherwise. Bonds i and $j>i$ are supposed to interact only if all intervening bonds along the chain are native, namely only if $u_i=\cdots=u_j=0$. For homogeneous systems like homopolymers [14], their interaction contributes the energy $\epsilon_{j-i}\leqslant 0$, which we express here in units of $k_{\rm B}T$. In order to incorporate the principle of minimal frustration of proteins, the model imposes the condition $\epsilon_{j-i}=0$ unless bonds i and j are known to be in spatial proximity in the native three-dimensional structure of the protein. The model also takes into account the entropic loss $\sigma>0$ of fixing one peptide unit in the native conformation [14, 18]. Assuming for our convenience that the last bond is always disordered, so that $u_t=1$, the probability that the Wako–Saitô–Muñoz–Eaton model assigns to the configuration $u_1,\ldots,u_t$ reads

Equation (8)

where $\Xi_t$ is the partition function:

The probability distribution (8) can be recast in the form (4) by making the number $\#_{s|t}(1,u_1,\ldots,u_t)$ defined by (3) appear. To this aim, set $E_0:=0$ and $E_l:=\sum\nolimits_{s=1}^l(l-s)\,\epsilon_s$ for each $l\geqslant 1$. We notice that $E_l$ is the energetic contribution of a stretch of l consecutive native bonds. Then, the three simple identities $\sum\nolimits_{s=1}^t E_{s-1}\,\#_{s|t}(1,u_1,\ldots,u_t)=\sum\nolimits_{i=1}^{t-1}\sum\nolimits_{j=i+1}^t\epsilon_{j-i}\prod\nolimits_{k=i}^{\,j}(1-u_k)$, $\sum\nolimits_{s=1}^t\#_{s|t}(1,u_1,\ldots,u_t)=\sum\nolimits_{i=1}^tu_i$, and $\sum\nolimits_{s=1}^ts\,\#_{s|t}(1,u_1,\ldots,u_t)=t$, which hold when $u_t=1$, allow us to introduce an arbitrary number $\eta$ and rewrite (8) as

We exploit $\eta$ to define a waiting time distribution. A necessary condition for the free energy $\ln \Xi_t$ to be extensive in t is $\eta_o:=\liminf\nolimits_{l\uparrow\infty}E_l/l>-\infty$. Under this condition, we can repeat the arguments made above for the Fisher–Felderhof model to conclude that $\eta\leqslant\eta_o$ exists so that the aperiodic waiting time distribution p and the potential $v$ defined by $p(s):={\rm e}^{\eta s-E_{s-1}}$ and $v(s):=\sigma$ for any s give rise to a constrained pinning model whose marginal distribution fulfills ${\rm WSME}_t(u_1,\ldots,u_t)=\mathbb{P}_t^c[U_0=u_0,\ldots,U_t=u_t]$ if $u_0=1$ and $\ln\Xi_t=\ln Z_t^c-\eta t$. Thus, the Wako–Saitô–Muñoz–Eaton model results in a constrained pinning model where renewal times mark peptide bonds that do not take their native conformation. The number of these bonds is the extensive observable $N_t$. Another extensive observable is the total energy, which is the cumulative reward $W_t$ associated with the deterministic rewards $X_i:=E_{S_i-1}$ for every i. In section 4.4 we shall see that, while decreasing $\sigma$ or the energetic contributions $E_l$, the model can exhibit a phase transition from a denatured phase to the native state where the fraction of native bonds is 1.
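The identity between the pairwise energy $\sum_{i<j}\epsilon_{j-i}\prod_{k=i}^{j}(1-u_k)$ and the cluster energy $\sum_sE_{s-1}\,\#_{s|t}(1,u_1,\ldots,u_t)$ with $E_l=\sum_{s=1}^l(l-s)\,\epsilon_s$ can be checked exhaustively on a short chain. The sketch below does so for all configurations with $u_t=1$ and t = 6. The contact energies stored in `EPS` are hypothetical values, and the cluster side is evaluated by scanning the distances between consecutive ones rather than through (3).

```python
from itertools import product

# EPS[s] plays the role of epsilon_s (hypothetical values); EPS[0] is unused.
EPS = [0.0, -0.7, -0.3, 0.0, -1.1, -0.2]


def energy(l):
    """E_l = sum_{s=1}^{l} (l - s) * epsilon_s, which gives E_0 = 0."""
    return sum((l - s) * EPS[s] for s in range(1, l + 1))


def cluster_side(u):
    """sum_s E_{s-1} #_{s|t}: add E_{s-1} for every distance s between consecutive ones."""
    total, last = 0.0, 0
    for tau in range(1, len(u)):
        if u[tau] == 1:
            total += energy(tau - last - 1)
            last = tau
    return total


def pair_side(u):
    """sum_{i<j} epsilon_{j-i} * prod_{k=i}^{j} (1 - u_k) over bonds 1, ..., t."""
    t = len(u) - 1
    total = 0.0
    for i in range(1, t):
        for j in range(i + 1, t + 1):
            if all(u[k] == 0 for k in range(i, j + 1)):   # all bonds from i to j native
                total += EPS[j - i]
    return total


T = 6
max_gap = max(
    abs(cluster_side((1,) + u + (1,)) - pair_side((1,) + u + (1,)))
    for u in product((0, 1), repeat=T - 1)                # u_0 = 1 and u_T = 1 are fixed
)
print(max_gap)
```

The check works because a distance s between consecutive disordered bonds encloses a stretch of $s-1$ native bonds, within which exactly $l-s$ pairs of bonds are at separation s when the stretch has length l.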

3.5. The model by Tokar and Dreyssé for strained epitaxy

Epitaxy is the growth process of a crystal film on a crystalline substrate used in nanotechnology and in semiconductor fabrication. In most cases where the film material is different from the substrate material, the strain of the crystal film to accommodate the lattice geometry of the substrate leads to the self-assembly of coherent nanostructures. The model by Tokar and Dreyssé [19–21] for strained epitaxy is a simplified lattice-gas model that aims to describe the size distribution of these atomic structures assuming that atoms interact effectively only when they belong to the same cluster. In the one-dimensional case, clusters of $l\geqslant 1$ contiguous atoms contribute the energy $E_l<0$ [19–21]. When atoms are arranged on t lattice sites and the configuration of the system is described by binary variables taking value 1 for holes and value 0 for atoms, then the Tokar–Dreyssé model endowed with a chemical potential $\mu$ formally is the Fisher–Felderhof model. This way, the Tokar–Dreyssé model can be identified with a constrained pinning model associated with an aperiodic waiting time distribution under the condition $\liminf\nolimits_{l\uparrow\infty}E_l/l>-\infty$.

4. Rate functions within the constrained pinning model

This section addresses the study of the rate function I defined by (2) under assumptions 1–3, which will be made from now on. The computation of I by means of methods from convex analysis is discussed first. Second, we classify the main analytical properties of I, identifying the conditions that prevent I from being analytic and connecting the presence of affine stretches in its graph with the existence of some point where the function z given by (1) is not differentiable. Third, we show that the large deviation principle stated by theorem 1 loses its effectiveness to describe the way the probabilities of the cumulative reward decay when the origin is a point where z is not differentiable. This fact leads us to define the critical constrained pinning model where the decay of probabilities is subexponential. Finally, we exemplify the analytical theory developed in this section by describing the fluctuations of the number $N_t:=\sum\nolimits_{\tau=1}^tU_\tau$ of renewals by t in the Poland–Scheraga model and in the other renewal models of statistical mechanics. Within these models, the extensive observable $N_t$ is commonly regarded as a natural order parameter in order to identify a phase transition.

Hereafter, we denote by $ \newcommand{\inter}{{\rm int}\,} \inter A$ and $ \newcommand{\cl}{{\rm cl}\,} \cl A$ the interior and the closure, respectively, of a set A in $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ . The interior which results when A is regarded as a subset of its affine hull is the relative interior, denoted by $ \newcommand{\ri}{{\rm ri}\,} \ri A$ . Clearly, $ \newcommand{\inter}{{\rm int}\,} \newcommand{\ri}{{\rm ri}\,} \ri A=\inter A$ if A is of full dimension, i.e. has the whole space $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ as its affine hull. A function $\varphi$ defined on an open set $ \newcommand{\Rl}{\mathbb{R}} A\subseteq\Rl^d$ is analytic on A if it can be represented by a convergent power series in some neighborhood of any point $x\in A$ . A vector field $\nu$ on A is analytic on A if each of its components is analytic on A. Given a convex function $\varphi$ on $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ , we denote by $ \newcommand{\Rl}{\mathbb{R}} \newcommand{\dom}{{\rm dom}\,} \dom \varphi:=\{x\in\Rl^d:\varphi(x)<\infty\}$ its effective domain and by $ \newcommand{\Rl}{\mathbb{R}} \partial \varphi(x):=\{g\in\Rl^d:g{{\rm ~is~a~subgradient~of~}} \varphi {{\rm ~at~}} x\}$ its subdifferential at x. The main results from convex analysis that will be used in the sequel are recalled in appendix B.

4.1. Computing the rate function I

In principle, the computation of the rate function I is feasible once the effective domain of I and the subdifferentials of z are known. This can be deduced from the following proposition, which collects together standard results from convex analysis (see appendix B, propositions B.1, B.2, B.6, and B.7). We point out that such results apply because z and I are proper convex functions by theorem 1, z being continuous and I being lower-semicontinuous. Continuity of z is a consequence of the fact that z is a convex function that is finite on the whole space $\mathbb{R}^d$.

Proposition 2. The following conclusions hold:

  • (a)  
    $I(w)=w\cdot k-z(k)+z(0)$ for all points w and k in $\mathbb{R}^d$ such that $w\in\partial z(k)$;
  • (b)  
    $w\in\partial z(k)$ if and only if $k\in\partial I(w)$ ;
  • (c)  
    for any $ \newcommand{\ri}{{\rm ri}\,} \newcommand{\dom}{{\rm dom}\,} w\in\ri(\dom I)$ there exists $ \newcommand{\Rl}{\mathbb{R}} k\in\Rl^d$ with the property that $w\in\partial z(k)$ ;
  • (d)  
    for each $ \newcommand{\cl}{{\rm cl}\,} \newcommand{\dom}{{\rm dom}\,} w\in\cl(\dom I)$ , $ \newcommand{\ri}{{\rm ri}\,} \newcommand{\dom}{{\rm dom}\,} u\in\ri(\dom I)$ , and $\lambda\in[0,1)$ the vector $\lambda w+(1-\lambda)u$ belongs to $ \newcommand{\ri}{{\rm ri}\,} \newcommand{\dom}{{\rm dom}\,} \ri(\dom I)$ and $I(w)=\lim\nolimits_{\lambda\uparrow 1}I(\lambda w+(1-\lambda)u)$ .

We are thus led to investigate the effective domain of I and the subdifferentials of z. Let $\mathcal{S}:=\{s\geqslant 1:p(s)>0\}$ be the support of the waiting time distribution p as defined in section 2 and recall that f denotes the function that identifies deterministic rewards. The set $\mathcal{C}$ of all convex combinations of the elements from $\{\,f(s)/s\}_{s\in\mathcal{S}}$ is the smallest convex set that contains $\{\,f(s)/s\}_{s\in\mathcal{S}}$ , namely its convex hull. The interest in the set $\mathcal{C}$ stems from the fact that the effective domain of I differs very little from $\mathcal{C}$ , as stated by the next proposition that is proved in appendix C. In particular, the proposition entails $ \newcommand{\cl}{{\rm cl}\,} \newcommand{\dom}{{\rm dom}\,} \cl(\dom I)=\cl\mathcal{C}$ and $ \newcommand{\ri}{{\rm ri}\,} \newcommand{\dom}{{\rm dom}\,} \ri(\dom I)=\ri\mathcal{C}$ .

Proposition 3. Let $\mathcal{C}$ be the convex hull of $\{\,f(s)/s\}_{s\in\mathcal{S}}$ . Then, $ \newcommand{\cl}{{\rm cl}\,} \newcommand{\dom}{{\rm dom}\,} \mathcal{C}\subseteq\dom I\subseteq\cl\mathcal{C}$ .

In order to determine the subdifferentials of the function z, we first need to make z explicit. To this aim, we set $\ell:=\limsup\nolimits_{s\uparrow\infty}(1/s)\ln {\rm e}^{v(s)}p(s)$ and we stress that $-\infty\leqslant\ell\leqslant z_o<\infty$ by assumption 2. We also recall that if $\mathcal{S}$ is an infinite set, then $f(s)/s$ has a limit $r\in\mathbb{R}^d$ when s goes to infinity through $\mathcal{S}$ by assumption 3. When $\mathcal{S}$ is finite, then $\ell=-\infty$ and we set $r:=f(s_o)/s_o$ for future convenience, $s_o$ being an arbitrarily given element of $\mathcal{S}$. Invoking the Cauchy–Hadamard theorem we find that the series $\sum\nolimits_{s\geqslant 1} {\rm e}^{k\cdot f(s)+v(s)-\zeta s}\,p(s)$ converges if $\zeta>k\cdot r+\ell$ and, in the case $\ell>-\infty$, diverges if $\zeta<k\cdot r+\ell$. The properties of the function z crucially depend on the behavior of this series at $\zeta=k\cdot r+\ell$, so that it is helpful to introduce the extended real number $\theta(k)$ defined by

$$\theta(k):=\sum_{s\geqslant 1}{\rm e}^{k\cdot f(s)+v(s)-(k\cdot r+\ell)s}\,p(s).$$

It is understood that $\theta(k)=\infty$ for all k if $\ell=-\infty$. The function $\theta$ that maps each $k\in\mathbb{R}^d$ to $\theta(k)$ is convex and lower semicontinuous because, in the case $\ell>-\infty$, it is a sum of finite positive convex functions. As a consequence, the possibly empty level set

$$\Theta:=\big\{k\in\mathbb{R}^d:\theta(k)\leqslant 1\big\}$$

is convex and closed, and its complement $\Theta^c$ is open. Necessary conditions for $\Theta$ to be nonempty are that $\mathcal{S}$ is an infinite set and $ \newcommand{\e}{{\rm e}} \ell>-\infty$ . It is a simple exercise to verify that $ \newcommand{\Rl}{\mathbb{R}} \Theta=\Rl^d$ if and only if $ \newcommand{\e}{{\rm e}} \sum\nolimits_{s\geqslant 1} {\rm e}^{v(s)-\ell s}\,p(s)\leqslant 1$ and $f(s)=rs$ for all $s\in\mathcal{S}$ . The following lemma provides z explicitly and is proved in appendix D.

Lemma 1. Pick $k\in\mathbb{R}^d$. Then

$$z(k)=\begin{cases}k\cdot r+\ell & {\rm if~}k\in\Theta,\\ \zeta & {\rm if~}k\in\Theta^c,\end{cases}$$

where $ \newcommand{\e}{{\rm e}} \zeta>k\cdot r+\ell$ is the unique number that satisfies $\sum\nolimits_{s\geqslant 1}{\rm e}^{k\cdot f(s)+v(s)-\zeta s}\,p(s)=1$ .
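As a numerical illustration of lemma 1, the root $\zeta$ of the normalization condition can be located by bisection, since the series is decreasing in $\zeta$. The sketch below is restricted to scalar rewards and to a waiting time distribution with finite support, so that $\Theta$ is empty and $z(k)$ always equals the root; the dictionary encoding of p and the function names are assumptions made only for this example.

```python
import math

def series(k, zeta, p, f, v):
    """sum_s exp(k*f(s) + v(s) - zeta*s) * p(s) over the finite support of p."""
    return sum(math.exp(k * f(s) + v(s) - zeta * s) * ps for s, ps in p.items())

def z(k, p, f, v, tol=1e-12):
    """Unique root zeta of series(k, zeta) = 1, located by bisection."""
    lo, hi = -1.0, 1.0
    # The series is decreasing in zeta: widen the bracket until it straddles 1.
    while series(k, lo, p, f, v) < 1.0:
        lo *= 2.0
    while series(k, hi, p, f, v) > 1.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if series(k, mid, p, f, v) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, with `p = {1: 0.5, 2: 0.5}`, `f = lambda s: 1.0`, and `v = lambda s: 0.0`, the condition reduces to a quadratic equation in ${\rm e}^{-\zeta}$, which offers an independent check of the bisection.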

We are now ready to supply a complete description of the subdifferentials of z, which is the task of the next proposition whose proof is given in appendix E. The proposition states that z is analytic on the open set $\Theta^c$, so that z is in particular differentiable on $\Theta^c$. To connect subdifferentiability and differentiability, we recall that z is differentiable at a certain point $k\in\mathbb{R}^d$ with gradient $\nabla z(k)$ if and only if $\partial z(k)$ is a singleton containing only the vector $\nabla z(k)$ (see appendix B, proposition B.4). In order to understand the content of the proposition, we also point out that the vector

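$$\nu(k):=\frac{\sum_{s\geqslant 1}f(s)\,{\rm e}^{k\cdot f(s)+v(s)-z(k)s}\,p(s)}{\sum_{s\geqslant 1}s\,{\rm e}^{k\cdot f(s)+v(s)-z(k)s}\,p(s)}\tag{9}$$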

exists whenever $\sum\nolimits_{s\geqslant 1} s\,{\rm e}^{k\cdot f(s)+v(s)-z(k) s}\,p(s)<\infty$ because $\|f(s)\|\leqslant Ms$ for some constant $M<\infty$ and all $s\in\mathcal{S}$ by assumption 3. This is certainly the case if $k\in\Theta^c$ because $ \newcommand{\e}{{\rm e}} z(k)>k\cdot r+\ell$ when $k\in\Theta^c$ . From now on, we think of a vector $ \newcommand{\Rl}{\mathbb{R}} u\in\Rl^d$ as a column vector and we denote by $u^{\rm T}$ its transpose.

Proposition 4. The following conclusions hold:

  • (a)  
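    z is analytic on $\Theta^c$ and $\nabla z(k)=\nu(k)$ for all $k\in\Theta^c$. The vector field $\nu$ that associates any $k\in\Theta^c$ with $\nu(k)$ is analytic and its Jacobian matrix $J(k)$ at k, which obviously is the Hessian matrix of z at k, is given by
    $$J(k)=\frac{\sum_{s\geqslant 1}[\,f(s)-\nu(k)s][\,f(s)-\nu(k)s]^{\rm T}\,{\rm e}^{k\cdot f(s)+v(s)-z(k)s}\,p(s)}{\sum_{s\geqslant 1}s\,{\rm e}^{k\cdot f(s)+v(s)-z(k)s}\,p(s)};$$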
  • (b)  
    if $k\in\Theta$ and $\theta(k)=1$, then $\partial z(k)=\{r\}$ or $\partial z(k)=\{(1-\alpha)r+\alpha\nu(k)\}_{\alpha\in[0,1]}$ according to whether the series $\sum\nolimits_{s\geqslant 1} s\,{\rm e}^{k\cdot f(s)+v(s)-z(k) s}\,p(s)$ diverges or converges;
  • (c)  
    if $k\in\Theta$ and $\theta(k)<1$ , then z is differentiable at k and $\nabla z(k)=r$ .
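Part (a) of the proposition can be illustrated on a case where z is available in closed form. The sketch below assumes geometric waiting times $p(s)=(1-a)a^{s-1}$ with $f(s)=1$ and $v(s)=0$, for which summing the geometric series in the normalization condition of lemma 1 gives $z(k)=\ln(a+(1-a){\rm e}^k)$ and $\Theta=\emptyset$. It also takes $\nu(k)$ to be the ratio of the series $\sum_s f(s)\,{\rm e}^{k\cdot f(s)+v(s)-z(k)s}p(s)$ and $\sum_s s\,{\rm e}^{k\cdot f(s)+v(s)-z(k)s}p(s)$, which is what implicit differentiation of that condition yields; the value of a, the truncation point, and the function names are assumptions of this example.

```python
import math

A = 0.4  # hypothetical parameter of the geometric law p(s) = (1 - A) * A**(s - 1)

def log_p(s):
    """Logarithm of p(s), used to avoid overflow and underflow in the weights."""
    return math.log(1.0 - A) + (s - 1) * math.log(A)

def z(k):
    """Closed form of z for geometric waiting times, f(s) = 1, v(s) = 0."""
    return math.log(A + (1.0 - A) * math.exp(k))

def nu(k, s_max=2000):
    """Ratio of the two series defining nu(k), truncated at s_max."""
    zk = z(k)
    weights = [math.exp(k - zk * s + log_p(s)) for s in range(1, s_max + 1)]
    return sum(weights) / sum(s * w for s, w in enumerate(weights, start=1))

def dz(k, h=1e-5):
    """Central finite-difference approximation of z'(k)."""
    return (z(k + h) - z(k - h)) / (2.0 * h)
```

Comparing `nu(k)` with `dz(k)` at a few values of k gives agreement up to the finite-difference error, and `nu(0.0)` returns the reciprocal of the mean waiting time, $1-a$.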

To conclude this paragraph, we observe that while proposition 2 states that there exists at least one point k with the property that $w\in\partial z(k)$ when $ \newcommand{\ri}{{\rm ri}\,} \newcommand{\dom}{{\rm dom}\,} w\in\ri(\dom I)$ , nothing is said about how many different points k share this property. The following lemma, which is proved in appendix F, answers the question.

Lemma 2. Let w and k be two points in $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ such that $w\in\partial z(k)$ . Let h be another point in $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ . The following conclusions hold:

  • (a)  
    if $w\ne r$ , then $w\in\partial z(h)$ if and only if $(h-k)\cdot[\,f(s)-rs]=0$ for all $s\in\mathcal{S}$ ;
  • (b)  
    if $w=r$ and $k\in\Theta^c$, then $\Theta=\emptyset$ and $w\in\partial z(h)$ if and only if $(h-k)\cdot[\,f(s)-rs]=0$ for all $s\in\mathcal{S}$;
  • (c)  
    if $w=r$ and $k\in\Theta$, then $w\in\partial z(h)$ if and only if $h\in\Theta$.

4.2. Analytical properties of the rate function I

Proposition 2 offers a method to compute the rate function I that can count on the description of ${\rm dom}\,I$ given by proposition 3 and the expression of $\partial z(k)$ provided for any k by proposition 4. In doing this, proposition 2 allows us to identify the main analytical properties of I. Here we discuss these properties mostly assuming that the set $\{\,f(s)/s\}_{s\in\mathcal{S}}$ is of full dimension, i.e. its affine hull is $\mathbb{R}^d$. In this case, ${\rm dom}\,I$ is full dimensional by proposition 3. Since the affine hull of any subset of $\mathbb{R}^d$ is closed, the affine hull of $\{\,f(s)/s\}_{s\in\mathcal{S}}$ contains r and hence equals $\mathbb{R}^d$ if and only if there are exactly d linearly independent vectors in the set $\{\,f(s)-rs\}_{s\in\mathcal{S}}$. This is the situation we expect to face in common real applications and to which, however, we can always reduce the problem$^1$.

If the set $\{\,f(s)/s\}_{s\in\mathcal{S}}$ is of full dimension and $w\in{\rm int}({\rm dom}\,I)={\rm ri}({\rm dom}\,I)$, then there exists $k\in\mathbb{R}^d$ such that $w\in\partial z(k)$ by part (c) of proposition 2. If $w\ne r$, then such k is unique by part (a) of lemma 2. Thus, $\partial I(w)$ is a singleton by part (b) of proposition 2 and I is differentiable at w (see appendix B, proposition B.4). This way, I is differentiable on ${\rm int}({\rm dom}\,I)$ with the only possible exception of the point r, and hence it is continuously differentiable there by convexity (see appendix B, proposition B.5). Still assuming that $\{\,f(s)/s\}_{s\in\mathcal{S}}$ is full dimensional, if there exists $k\in\Theta^c$ such that $r\in\partial z(k)$, then I is differentiable at r because $I(r)=r\cdot k-z(k)+z(0)<\infty$ and $\partial I(r)=\{k\}$ by part (b) of lemma 2. If instead there is no point $k\in\Theta^c$ such that $r\in\partial z(k)$, then $r\in\partial z(h)$ if and only if $h\in\Theta$ by proposition 4, so that $\partial I(r)=\Theta$ by part (b) of proposition 2. It is not excluded here that $\Theta=\emptyset$. These arguments prove the following lemma.

Lemma 3. Suppose $\{\,f(s)/s\}_{s\in\mathcal{S}}$ , and hence $ \newcommand{\dom}{{\rm dom}\,} \dom I$ , is of full dimension. Then, I is continuously differentiable on $ \newcommand{\inter}{{\rm int}\,} \newcommand{\dom}{{\rm dom}\,} \inter(\dom I)$ possibly deprived of r. If I is not differentiable at r, then $\partial I(r)=\Theta$ .

More can be said about the smoothness properties of I. Suppose $\{\,f(s)/s\}_{s\in\mathcal{S}}$ is of full dimension. Then, the condition $f(s)=rs$ for all $s\in\mathcal{S}$ cannot hold true and $\Theta\ne\mathbb{R}^d$ as a consequence. This way, $\Theta^c\ne\emptyset$ and the restriction of $\nu$ to $\Theta^c$ is an injective continuous map by parts (a) and (b) of lemma 2 since $\partial z(k)=\{\nu(k)\}$ for every $k\in\Theta^c$. It follows by invariance of domain (see [52], theorem 10.3.7) that

$$\mathcal{U}:=\big\{\nu(k):k\in\Theta^c\big\}$$

is a nonempty open subset of $\mathbb{R}^d$. Injectivity of $\nu$ also entails the invertibility of the Jacobian matrix $J(k)$ for each $k\in\Theta^c$ and the existence of an inverse map $\mu:\mathcal{U}\to\Theta^c$ such that $\nu(\mu(w))=w$ for all $w\in\mathcal{U}$. The real analytic inverse function theorem (see [53], theorem 2.5.1) tells us that $\mu$ is analytic. For each $w\in\mathcal{U}$ we have $w=\nu(\mu(w))=\nabla z(\mu(w))$ and part (a) of proposition 2 gives $I(w)=w\cdot \mu(w)-z(\mu(w))+z(0)<\infty$, which shows that $\mathcal{U}\subseteq{\rm dom}\,I$ and that I is analytic on $\mathcal{U}$ thanks to analyticity of both $\mu$ and z. We have thus proved the following result.

Lemma 4. If $\{\,f(s)/s\}_{s\in\mathcal{S}}$ is of full dimension, then $\mathcal{U}$ is a nonempty open subset of $ \newcommand{\dom}{{\rm dom}\,} \dom I$ on which I is analytic.
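The construction behind lemma 4, namely inverting $\nu$ and evaluating $I(w)=w\cdot\mu(w)-z(\mu(w))+z(0)$, can be sketched numerically when $d=1$. The example below again assumes geometric waiting times $p(s)=(1-a)a^{s-1}$ with $f(s)=1$ and $v(s)=0$, for which $z(k)=\ln(a+(1-a){\rm e}^k)$ and $\nu=z'$ are explicit. In this special case renewals form a Bernoulli process with success probability $1-a$, so the result can be compared with the classical relative-entropy rate function; the parameter value and the function names are illustrative assumptions.

```python
import math

A = 0.4  # hypothetical geometric parameter: p(s) = (1 - A) * A**(s - 1)

def z(k):
    """Closed form of z for geometric waiting times, f(s) = 1, v(s) = 0."""
    return math.log(A + (1.0 - A) * math.exp(k))

def nu(k):
    """Derivative of z, which maps the real line onto (0, 1)."""
    e = (1.0 - A) * math.exp(k)
    return e / (A + e)

def mu(w, lo=-50.0, hi=50.0, tol=1e-12):
    """Inverse of nu by bisection; nu is increasing because z is convex."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if nu(mid) < w:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rate(w):
    """I(w) = w * mu(w) - z(mu(w)) + z(0) for w in (0, 1)."""
    k = mu(w)
    return w * k - z(k) + z(0.0)

def bernoulli_rate(w):
    """Relative entropy of Bernoulli(w) with respect to Bernoulli(1 - A)."""
    return w * math.log(w / (1.0 - A)) + (1.0 - w) * math.log((1.0 - w) / A)
```

The two functions `rate` and `bernoulli_rate` agree on $(0,1)$ up to the bisection tolerance, and both vanish at $w=1-a$.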

Now, besides the set $\{\,f(s)/s\}_{s\in\mathcal{S}}$ being of full dimension, suppose that the function z is differentiable everywhere. In this case, part (c) of proposition 2 tells us that for any $ \newcommand{\inter}{{\rm int}\,} \newcommand{\dom}{{\rm dom}\,} w\in\inter(\dom I)$ there exists $ \newcommand{\Rl}{\mathbb{R}} k\in\Rl^d$ such that $\nabla z(k)=w$ . Proposition 4 entails that if $w\ne r$ , then necessarily $k\in\Theta^c$ and $w=\nu(k)$ , so that $w\in\mathcal{U}$ . It follows that $ \newcommand{\inter}{{\rm int}\,} \newcommand{\dom}{{\rm dom}\,} \mathcal{U}\subseteq\inter(\dom I)$ differs from $ \newcommand{\inter}{{\rm int}\,} \newcommand{\dom}{{\rm dom}\,} \inter(\dom I)$ by at most the point r, which implies that I is analytic on $ \newcommand{\inter}{{\rm int}\,} \newcommand{\dom}{{\rm dom}\,} \inter(\dom I)$ possibly deprived of r. If $r\notin\mathcal{U}$ , then $r\in\partial z(k)$ if and only if $k\in\Theta$ by proposition 4, so that $\partial I(r)=\Theta$ by part (b) of proposition 2 and r could be a singular point of I. We point out that I is strictly convex on $ \newcommand{\inter}{{\rm int}\,} \newcommand{\dom}{{\rm dom}\,} \inter(\dom I)$ if z is differentiable everywhere, even when $\{\,f(s)/s\}_{s\in\mathcal{S}}$ is not of full dimension (see appendix B, proposition B.8). This way, we have proved the following theorem.

Theorem 2. Suppose $\{\,f(s)/s\}_{s\in\mathcal{S}}$ is of full dimension and z is differentiable throughout $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ . Then, I is strictly convex on $ \newcommand{\inter}{{\rm int}\,} \newcommand{\dom}{{\rm dom}\,} \inter(\dom I)$ and analytic on the open set $\mathcal{U}$ , which differs from $ \newcommand{\inter}{{\rm int}\,} \newcommand{\dom}{{\rm dom}\,} \inter(\dom I)$ by at most the point r. If $r\notin\mathcal{U}$ , then $\partial I(r)=\Theta$ .

To conclude, let us investigate situations where z is not differentiable everywhere. If z is not differentiable at some point, then the set $\Theta$ is necessarily nonempty by proposition 4, so that $r\notin\mathcal{U}$ by part (b) of lemma 2 and $\partial I(r)=\Theta$ . In this case, the rate function I cannot be strictly convex on $ \newcommand{\ri}{{\rm ri}\,} \newcommand{\dom}{{\rm dom}\,} \ri(\dom I)$ and affine stretches in its graph must emerge. In fact, if k is a point where z is not differentiable, then $\partial z(k)$ must not be a singleton and proposition 4 consequently tells us that $k\in\Theta$ , $\sum\nolimits_{s\geqslant 1} s\,{\rm e}^{k\cdot f(s)+v(s)-z(k) s}\,p(s)<\infty$ , $\nu(k)\ne r$ , and $\partial z(k)=\{(1-\alpha)r+\alpha\nu(k)\}_{\alpha\in[0,1]}$ . This way, part (a) of proposition 2 stating that $I(w)=w\cdot k-z(k)+z(0)$ for all $w\in\partial z(k)$ shows that I maps affinely the closed line segment in $ \newcommand{\Rl}{\mathbb{R}} \Rl^d $ from r to $\nu(k)$ onto the closed segment in $ \newcommand{\Rl}{\mathbb{R}} \Rl$ from $I(r)$ to $I(\nu(k))$ . In spite of the lack of strict convexity, I is however continuously differentiable on $ \newcommand{\inter}{{\rm int}\,} \newcommand{\dom}{{\rm dom}\,} \inter(\dom I)$ possibly deprived of r and analytic on $\mathcal{U}$ , as stated by lemmas 3 and 4. We have thus demonstrated the following theorem.

Theorem 3. Suppose $\{\,f(s)/s\}_{s\in\mathcal{S}}$ is of full dimension and z is not differentiable on the whole $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ . Then, I is not strictly convex on $ \newcommand{\inter}{{\rm int}\,} \newcommand{\dom}{{\rm dom}\,} \inter(\dom I)$ but it is continuously differentiable on $ \newcommand{\inter}{{\rm int}\,} \newcommand{\dom}{{\rm dom}\,} \inter(\dom I)$ possibly deprived of r, where $\partial I(r)=\Theta$ . I is analytic on $\mathcal{U}$ and $r\notin\mathcal{U}$ .

4.3. The critical constrained pinning model

We have seen that non-differentiability of z at some point causes an affine stretch in the graph of I. If this point is the origin, then the large deviation principle stated by theorem 1 does not completely describe the way the probabilities of the cumulative reward $W_t$ decay. In fact, here we demonstrate that $W_t/t$ always converges in probability to some constant vector $\rho\in\mathbb{R}^d$ but the convergence is necessarily subexponential if $\partial z(0)$ is not a singleton. Let us observe that the formula $I(w)=w\cdot k-z(k)+z(0)$ for $w\in\partial z(k)$ by part (a) of proposition 2 yields $I(w)=0$ when $w\in\partial z(0)$. The converse is also true because if $I(w)=0$, then the bound $w\cdot k-z(k)+z(0)\leqslant I(w)=0$ valid for every k shows that w is a subgradient of z at the origin. These arguments give $I(w)=0$ if and only if $w\in\partial z(0)$ and I turns out to have more than one zero whenever $\partial z(0)$ is not a singleton.

The probability that the scaled cumulative reward $W_t/t$ fluctuates over closed sets that do not contain zeros of I always decays exponentially fast in t, as stated by the following lemma that is proved in appendix G.

Lemma 5. Let $ \newcommand{\Rl}{\mathbb{R}} F\subset\Rl^d$ be a closed set disjoint from $\partial z(0)$ . Then, there exists a real number $\lambda>0$ such that $ \newcommand{\prob}{\mathbb{P}} \prob_t^c[W_t/t\in F]\leqslant {\rm e}^{-\lambda t}$ for all sufficiently large t.

According to Ellis [30], we say that $W_t/t$ converges exponentially to a constant vector $\rho$ if for any $\delta>0$ there exists a real number $\lambda>0$ with the property that for all sufficiently large t

$$\mathbb{P}_t^c\big[\|W_t/t-\rho\|\geqslant\delta\big]\leqslant{\rm e}^{-\lambda t}.\tag{10}$$

Lemma 5 implies that $W_t/t$ converges exponentially to a constant vector $\rho$ if $\partial z(0)$ is a singleton containing $\rho$. Proposition 4 tells us that such $\rho$ equals r if $\theta(0)\leqslant 1$ and equals $\nu(0)$ if $\theta(0)>1$, with $\nu(0)$ defined by (9). The following theorem, which is proved in appendix H, completes the picture about the convergence in probability of $W_t/t$ towards an appropriate constant vector $\rho$.

Theorem 4. Set $\rho:=r$ if $\theta(0)\leqslant 1$ and $\partial z(0)$ is a singleton and set $\rho:=\nu(0)$ if $\theta(0)>1$ or $\partial z(0)$ is not a singleton. The following conclusions hold:

  • (a)  
    $ \newcommand{\prob}{\mathbb{P}} \lim\nolimits_{t\uparrow\infty}\,\prob_t^c[\|W_t/t-\rho\|\geqslant \delta]=0$ for any $\delta>0$ ;
  • (b)  
    $W_t/t$ converges exponentially to $\rho$ if and only if $\partial z(0)$ is a singleton.

Theorem 4 tells us that the scaled cumulative reward $W_t/t$ exhibits a complex behavior if $\partial z(0)$ is not a singleton, whereby convergence in probability to $\rho$ is slower than exponential. Lemma 5 teaches us that exponential convergence is prevented by persistent fluctuations of $W_t/t$ over the set $\partial z(0)$ where the rate function takes value zero. Observing that $z(0)=\ell$ when $\theta(0):=\sum\nolimits_{s\geqslant 1} {\rm e}^{v(s)-\ell s}\,p(s)=1$ according to lemma 1, proposition 4 states that necessary and almost sufficient conditions for $\partial z(0)$ not to be a singleton are

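$$\ell>-\infty,\qquad\sum_{s\geqslant 1}{\rm e}^{v(s)-\ell s}\,p(s)=1,\qquad\sum_{s\geqslant 1}s\,{\rm e}^{v(s)-\ell s}\,p(s)<\infty.\tag{11}$$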

We are led to call critical a constrained pinning model that satisfies such conditions, which, we notice, do not involve the function f defining the deterministic rewards. Under the situation identified by (11), $\partial z(0)$ is not a singleton, and is precisely the closed line segment in $\mathbb{R}^d$ connecting r to $\nu(0)$ by part (b) of proposition 4, whatever the function f is except for those peculiar f satisfying $\nu(0)=r$. According to the literature on large deviation principles in statistical mechanics [33, 38], we call such a line segment the phase transition segment. A critical constrained pinning model is thus a constrained pinning model for which most scaled cumulative rewards display persistent fluctuations over their phase transition segment, which lead to subexponential decays of probabilities that cannot be captured by a large deviation principle.

4.4. Large fluctuations of $N_t$

The theory developed in the last three paragraphs is well exemplified in the case of the number $N_t:=\sum\nolimits_{\tau=1}^tU_\tau$ counting the renewals by t, which is the cumulative reward corresponding to the deterministic rewards identified by the function $f(s)=1$ for all $s\geqslant 1$. Large deviation principles for $N_t$ within non-constrained renewal processes have been previously investigated by Glynn and Whitt [54] under the regularity conditions of the Gärtner-Ellis theorem and by Lefevere et al [55] in general through a theory for the fluctuations of the empirical measures of backward and forward recurrence times. In this paragraph we study the fluctuations of $N_t$ within the renewal models of statistical mechanics described in section 3, which are constrained pinning models where $p(s)>0$ and $v(s)=\beta$ for every s, $\beta$ being a control parameter that can drive a phase transition. Apart from the sign, this control parameter is the binding energy $\epsilon$ in the Poland–Scheraga model, the chemical potential $\mu$ in the Fisher–Felderhof model and in the Tokar–Dreyssé model, and the entropic loss $\sigma$ in the Wako–Saitô–Muñoz–Eaton model. We point out that assumption 1 is satisfied because $\mathcal{S}$ consists of all positive integers and assumption 3 is verified with $r=0$. According to the physical arguments of section 3, we suppose that $\limsup\nolimits_{s\uparrow\infty}(1/s)\ln p(s)<\infty$, which guarantees that assumption 2 is also fulfilled.

Some preliminary considerations are in order. It is convenient here to make the dependence on $\beta$ explicit by writing $z_\beta$, $\nu_\beta$, and $I_\beta$ in place of z, $\nu$, and I. According to definition (1), we have $z_\beta(k):=\inf\{\zeta\in\mathbb{R}\,:\,\sum\nolimits_{s\geqslant 1}{\rm e}^{k+\beta-\zeta s}\,p(s)\leqslant 1\}=z_0(k+\beta)$ for every $\beta$ and k. It follows immediately from definition (2) that $I_\beta(w)=I_0(w)-w\beta+z_0(\beta)-z_0(0)$ for all $\beta$ and w, which is useful to disentangle $\beta$ and w. Setting

$$\beta_c:=-\ln\sum_{s\geqslant 1}{\rm e}^{-\ell s}\,p(s),$$

we find $\theta(k)=\sum\nolimits_{s\geqslant 1} {\rm e}^{k+\beta-\ell s}\,p(s)={\rm e}^{k+\beta-\beta_c}>1$ if $k>\beta_c-\beta$ and $\theta(k)\leqslant 1$ if $k\leqslant\beta_c-\beta$ provided that $\beta_c>-\infty$. We notice that $\ell$ equals $\limsup\nolimits_{s\uparrow\infty}(1/s)\ln p(s)$ due to the finiteness of $v$. Since $r=0$, lemma 1 tells us that $z_\beta(k)>\ell$ satisfies the identity $\sum\nolimits_{s\geqslant 1}{\rm e}^{k+\beta-z_\beta(k) s}\,p(s)=1$ for any $k>\beta_c-\beta$ and that $z_\beta(k)=\ell$ for each $k\leqslant\beta_c-\beta$ when $\beta_c>-\infty$. Proposition 4 states that $z_\beta$ is differentiable at all points k with the only possible exception of $k=\beta_c-\beta$ in the case $\beta_c>-\infty$. Within this case, part (b) of proposition 4 tells us that $z_\beta$ is differentiable at $k=\beta_c-\beta$ when $\sum\nolimits_{s\geqslant 1} s\,{\rm e}^{-\ell s}\,p(s)=\infty$ and is not differentiable at $k=\beta_c-\beta$ when $\sum\nolimits_{s\geqslant 1} s\,{\rm e}^{-\ell s}\,p(s)<\infty$. In both situations we can write $\partial z_\beta(\beta_c-\beta)=[0,w_c]$ with

$$w_c:=\frac{\sum_{s\geqslant 1}{\rm e}^{-\ell s}\,p(s)}{\sum_{s\geqslant 1}s\,{\rm e}^{-\ell s}\,p(s)},$$

where $w_c:=0$ is understood when the series in the denominator diverges.

We stress that differentiability of $z_\beta$ and the value of $w_c$ are independent of $\beta$.
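To make these quantities concrete, the following Python sketch estimates $\beta_c$ and $w_c$ by truncated sums for a hypothetical waiting time distribution $p(s)\propto{\rm e}^{-as}s^{-c}$ with $a>0$, for which $\ell=-a$. It assumes the expressions $\beta_c=-\ln\sum_{s}{\rm e}^{-\ell s}p(s)$ and $w_c=\sum_{s}{\rm e}^{-\ell s}p(s)/\sum_{s}s\,{\rm e}^{-\ell s}p(s)$, which are the ones consistent with the relation $\theta(k)={\rm e}^{k+\beta-\beta_c}$ and with part (b) of proposition 4; the family of distributions, the truncation point, and the function name are choices made only for this illustration.

```python
import math

def critical_parameters(c, a, s_max=200_000):
    """Truncated-sum estimates of beta_c and w_c for p(s) ~ exp(-a*s) * s**(-c).

    Assumes a > 0 so that p is normalizable and ell = -a.
    """
    norm = sum(math.exp(-a * s) * s ** (-c) for s in range(1, s_max + 1))
    # With ell = -a, exp(-ell*s) * p(s) reduces to s**(-c) / norm.
    tilted = sum(s ** (-c) for s in range(1, s_max + 1)) / norm
    tilted_mean = sum(s ** (1 - c) for s in range(1, s_max + 1)) / norm
    beta_c = -math.log(tilted)
    w_c = tilted / tilted_mean
    return beta_c, w_c
```

For $c=3$ the estimate of $w_c$ approaches the ratio of the Riemann zeta values $\zeta(3)/\zeta(2)$. For $c\leqslant 2$ the series in the denominator diverges, so the truncated value of `w_c` only decreases towards 0 as `s_max` grows, and for $c\leqslant 1$ the same happens to the series defining $\beta_c$, whose limit is $-\infty$.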

4.4.1. The phase transition.

Theorem 4 shows that $N_t/t$ converges in probability to a certain $\rho_\beta$ as t is sent to infinity. This convergence is exponential unless the model is critical, namely unless $\beta_c>-\infty$, $\beta=\beta_c$, and $\sum\nolimits_{s\geqslant 1}s\,{\rm e}^{-\ell s}\,p(s)<\infty$. The real number $\rho_\beta$ is given explicitly by

$$\rho_\beta=\frac{\sum_{s\geqslant 1}{\rm e}^{-z_\beta(0)s}\,p(s)}{\sum_{s\geqslant 1}s\,{\rm e}^{-z_\beta(0)s}\,p(s)}$$

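if $\beta_c=-\infty$ and by

$$\rho_\beta=\begin{cases}0 & {\rm if~}\beta<\beta_c,\\ w_c & {\rm if~}\beta=\beta_c,\\ \dfrac{\sum_{s\geqslant 1}{\rm e}^{-z_\beta(0)s}\,p(s)}{\sum_{s\geqslant 1}s\,{\rm e}^{-z_\beta(0)s}\,p(s)} & {\rm if~}\beta>\beta_c\end{cases}$$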

when $\beta_c>-\infty$. We observe that the identity $z_\beta(0)=z_0(\beta)$ valid for all $\beta$ in combination with part (a) of proposition 4 tells us that $z_\beta(0)$ and $\rho_\beta=z_0'(\beta)$ are analytic functions of $\beta$ in the region $\beta>\beta_c$, where $\rho_\beta$ is increasing with $\beta$ because $z_0$ is strictly convex.

The way $\rho_\beta$ depends on $\beta$ reveals a phase transition whenever $\beta_c>-\infty$, whereby the fraction of times that are renewals changes from being zero to being positive at $\beta_c$ while increasing $\beta$. We have $\lim\nolimits_{\beta\downarrow\beta_c}\rho_\beta=w_c$, which shows that the phase transition is continuous or discontinuous according to whether the series $\sum\nolimits_{s\geqslant 1}s\,{\rm e}^{-\ell s}\,p(s)$ diverges or converges. The limit $\lim\nolimits_{\beta\downarrow\beta_c}\rho_\beta=w_c$ is due to Abel's theorem when $\sum\nolimits_{s\geqslant 1}s\,{\rm e}^{-\ell s}\,p(s)<\infty$. If instead $\sum\nolimits_{s\geqslant 1}s\,{\rm e}^{-\ell s}\,p(s)=\infty$, then the fact that $z_\beta(0)>\ell$ allows us to write for every $\beta>\beta_c$ and $t\geqslant 1$ the bound

It follows from here that $\lim\nolimits_{\beta\downarrow\beta_c}\rho_\beta=0$ by sending $\beta$ to $\beta_c$ first and t to infinity later. We point out that, according to our definition of critical constrained pinning model, the model is critical at $\beta=\beta_c$ only if it undergoes a discontinuous phase transition. In the case of the Poland–Scheraga model with loop entropy $\sigma_l=al+b-c\ln l$ for all $l\geqslant 1$ we recover well-known facts [7]: there is a phase transition corresponding to $\beta_c>-\infty$ only if $c>1$, and the phase transition is continuous when $1<c\leqslant 2$ and discontinuous when $c>2$.

4.4.2. The rate function.

We describe now the rate function $I_\beta$. The convex hull $\mathcal{C}$ of the full dimensional set $\{1/s\}_{s\in\mathcal{S}}$ is the half-open interval $(0,1]$ and proposition 3 gives that ${\rm dom}\,I_\beta$ differs from the set $(0,1]$ by at most the point 0. In any case, we have ${\rm int}({\rm dom}\,I_\beta)=(0,1)$, $r=0\notin{\rm int}({\rm dom}\,I_\beta)$, and $I_\beta(w)=\infty$ for all $w\notin[0,1]$. Let us identify $\mathcal{U}$, which is an open subset of $(0,1)$ by lemma 4. If $\beta_c=-\infty$ or $\beta_c>-\infty$ and $\sum\nolimits_{s\geqslant 1} s\,{\rm e}^{-\ell s}\,p(s)=\infty$, then $z_\beta$ is differentiable throughout $\mathbb{R}$ and theorem 2 tells us that $\mathcal{U}=(0,1)$. In order to tackle the case $\beta_c>-\infty$ and $\sum\nolimits_{s\geqslant 1} s\,{\rm e}^{-\ell s}\,p(s)<\infty$, let us pick $w\in(0,1)$ and let us observe that there exists a point k such that $w\in\partial z_\beta(k)$ by part (c) of proposition 2. Since $w\ne r=0$, proposition 4 tells us that $k\geqslant\beta_c-\beta$, in such a way that either $k>\beta_c-\beta$ and hence $w\in\mathcal{U}$ or $k=\beta_c-\beta$ and hence $w\in(0,w_c]$. In the second case $w\notin\mathcal{U}$ by the uniqueness of k stated by part (a) of lemma 2. These arguments show that necessarily $\mathcal{U}=(w_c,1)$.

Let us determine $I_\beta(w)$ when $w\in(0,1)$. If $w\in(0,1)$ but $w\notin\mathcal{U}$, then we have necessarily $\beta_c>-\infty$, $\sum\nolimits_{s\geqslant 1} s\,{\rm e}^{-\ell s}\,p(s)<\infty$, and $w\in(0,w_c]$, in such a way that $I_\beta(w)=w(\beta_c-\beta)-\ell+z_\beta(0)$ by part (a) of proposition 2. If instead $w\in\mathcal{U}$, then there exists $k>\beta_c-\beta$ with the property that $w=\nu_\beta(k)$ and $I_\beta(w)=wk-z_\beta(k)+z_\beta(0)$ follows again by part (a) of proposition 2. This formula can be simplified. Consider the function $\mathcal{V}$ that maps each $\zeta>\ell$ to

$$\mathcal{V}(\zeta):=\frac{\sum_{s\geqslant 1}{\rm e}^{-\zeta s}\,p(s)}{\sum_{s\geqslant 1}s\,{\rm e}^{-\zeta s}\,p(s)}.$$

It is not difficult to verify that $m^2\frac{\partial\mathcal{V}}{\partial\zeta}(\zeta)=\sum\nolimits_{s\geqslant 1}(s-m){}^2q(s)>0$ for any $\zeta$ with $m:=\sum\nolimits_{s\geqslant 1}s\,q(s)$ and $q(s):={\rm e}^{-\zeta s}\,p(s)/\sum\nolimits_{\sigma\geqslant 1}{\rm e}^{-\zeta \sigma}\,p(\sigma)$ for all s, so that the function $\mathcal{V}$ is increasing. The condition $\nu_\beta(k)=w$ reads $\mathcal{V}(z_\beta(k))=w$ , so that stating that there exists $k>\beta_c-\beta$ with the property that $\nu_\beta(k)=w$ is tantamount to saying that there exists a real number $\zeta>\ell$ independent of $\beta$ such that $\mathcal{V}(\zeta)=w$ and $z_\beta(k)=\zeta$ for some $k>\beta_c-\beta$ . The number $\zeta$ is unique because $\mathcal{V}$ is increasing. This way, since ${\rm e}^{-k}=\sum\nolimits_{s\geqslant 1}{\rm e}^{\beta-z_\beta(k) s}\,p(s)$ for $k>\beta_c-\beta$ by construction, we can express $I_\beta(w)$ in terms of $\zeta$ as $I_\beta(w)=-w\ln\sum\nolimits_{s\geqslant 1}{\rm e}^{\beta-\zeta s}\,p(s)-\zeta+z_\beta(0)$ .
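The recipe just described can be tried numerically. The following is only a minimal sketch under stated assumptions: the waiting-time law `p_example` has finite support (so that all series are finite sums, unlike the infinite-support setting of this section), $\mathcal{V}(\zeta)$ is taken to be the ratio $\sum_{s}{\rm e}^{-\zeta s}p(s)/\sum_{s}s\,{\rm e}^{-\zeta s}p(s)$ suggested by the identity $\nu_\beta(k)=\mathcal{V}(z_\beta(k))$ , and the names `bisect_increasing`, `make_rate_function`, and the bracket `[zeta_lo, zeta_hi]` are hypothetical helpers introduced for illustration.

```python
import math

def bisect_increasing(func, lo, hi, tol=1e-12, max_iter=200):
    """Locate the zero of an increasing function on [lo, hi] by bisection."""
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if func(mid) < 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def make_rate_function(p, beta, zeta_lo=-40.0, zeta_hi=40.0):
    """Return (rate, V, z0) for a waiting-time law p given as {s: p(s)}.

    Assumption: p has finite support and the bracket [zeta_lo, zeta_hi]
    contains every root that is searched for.
    """
    def laplace(zeta):
        # sum_s e^{-zeta s} p(s)
        return sum(math.exp(-zeta * s) * ps for s, ps in p.items())

    def V(zeta):
        weighted = sum(s * math.exp(-zeta * s) * ps for s, ps in p.items())
        return laplace(zeta) / weighted

    # z_beta(0) solves e^beta * sum_s e^{-zeta s} p(s) = 1 (left side decreases in zeta).
    z0 = bisect_increasing(lambda z: 1.0 - math.exp(beta) * laplace(z), zeta_lo, zeta_hi)

    def rate(w):
        # zeta is the unique solution of V(zeta) = w, then the formula in zeta is applied.
        zeta = bisect_increasing(lambda z: V(z) - w, zeta_lo, zeta_hi)
        return -w * (beta + math.log(laplace(zeta))) - zeta + z0

    return rate, V, z0

p_example = {1: 0.5, 2: 0.3, 3: 0.2}      # illustrative law, not taken from the paper
rate, V, z0 = make_rate_function(p_example, beta=0.0)
w_typical = V(z0)                          # point where the rate function should vanish
```

With this choice the rate function is expected to vanish at `w_typical` and to be positive elsewhere in the interior of its domain, which gives a quick sanity check of the formula in terms of $\zeta$ .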

In order to completely find out $I_\beta$ , it remains to compute $I_\beta(0)$ and $I_\beta(1)$ . We demonstrate at first that $ \newcommand{\Rl}{\mathbb{R}} \newcommand{\e}{{\rm e}} \inf\nolimits_{k\in\Rl}\{z_\beta(k)\}=\ell$ , which gives the result $ \newcommand{\Rl}{\mathbb{R}} \newcommand{\e}{{\rm e}} I_\beta(0)=\sup\nolimits_{k\in\Rl}\{-z_\beta(k)+z_\beta(0)\}=-\ell+z_\beta(0)$ and shows that $ \newcommand{\dom}{{\rm dom}\,} 0\in\dom I_\beta$ if and only if $ \newcommand{\e}{{\rm e}} \ell>-\infty$ . Recall that $ \newcommand{\e}{{\rm e}} z_\beta(k)>\ell$ if $k>\beta_c-\beta$ and $ \newcommand{\e}{{\rm e}} z_\beta(k)=\ell$ if $k\leqslant\beta_c-\beta$ provided that $\beta_c>-\infty$ . Thus, $ \newcommand{\Rl}{\mathbb{R}} \newcommand{\e}{{\rm e}} \inf\nolimits_{k\in\Rl}\{z_\beta(k)\}=\ell$ is trivial when $\beta_c>-\infty$ . When $\beta_c=-\infty$ , then we get $ \newcommand{\Rl}{\mathbb{R}} \newcommand{\e}{{\rm e}} \inf\nolimits_{k\in\Rl}\{z_\beta(k)\}=\ell$ as a consequence of the limit $ \newcommand{\e}{{\rm e}} \lim\nolimits_{k\downarrow -\infty}z_\beta(k)=\ell$ . In fact, $z_\beta$ is an increasing function that satisfies $\sum\nolimits_{s\geqslant 1}{\rm e}^{k+\beta-z_\beta(k) s}\,p(s)=1$ for all $ \newcommand{\Rl}{\mathbb{R}} k\in\Rl$ when $\beta_c=-\infty$ . If it were $ \newcommand{\e}{{\rm e}} \lim\nolimits_{k\downarrow -\infty}z_\beta(k)=:\ell_o>\ell$ , then we would find $ \newcommand{\e}{{\rm e}} \lim\nolimits_{k\downarrow -\infty}\sum\nolimits_{s\geqslant 1}{\rm e}^{-z_\beta(k) s}\,p(s)=\sum\nolimits_{s\geqslant 1}{\rm e}^{-\ell_os}\,p(s)<\infty$ by Abel's theorem, thus obtaining $\lim\nolimits_{k\downarrow -\infty}\sum\nolimits_{s\geqslant 1}{\rm e}^{k+\beta-z_\beta(k) s}\,p(s)=0$ and contradicting the fact that $\sum\nolimits_{s\geqslant 1}{\rm e}^{k+\beta-z_\beta(k) s}\,p(s)=1$ for every k.

As far as the value of $I_\beta(1)$ is concerned, part (d) of proposition 2 with any $u\in(0,1)$ gives $I_\beta(1)=\lim\nolimits_{w\uparrow 1}I_\beta(w)$ . In order to compute this limit, we observe that $\zeta$ solving the equation $\mathcal{V}(\zeta)=w$ is an increasing function of $w\in\mathcal{U}$ that goes to infinity when w is sent to 1. In fact, $\mathcal{V}$ is an increasing function that is bounded away from 1 on compact intervals. As a consequence, $(1-w)\zeta=[1-\mathcal{V}(\zeta)]\zeta$ goes to 0 when w is sent to 1 because for positive $\zeta>\zeta_o>\ell$

Then, writing $I_\beta(w)=-w\ln\sum\nolimits_{s\geqslant 1}{\rm e}^{\beta-\zeta (s-1)}\,p(s)-(1-w)\zeta+z_\beta(0)$ for every $w\in\mathcal{U}$ we realize that $I_\beta(1)=\lim\nolimits_{w\uparrow 1}I_\beta(w)=-\ln[{\rm e}^\beta p(1)]+z_\beta(0)$ .

In conclusion, by putting the pieces together, in the case $\beta_c=-\infty$ we find

Recalling that $w_c:=0$ if $\sum\nolimits_{s\geqslant 1} s\,{\rm e}^{-\ell s}\,p(s)=\infty$ , in the case $\beta_c>-\infty$ we can write the universal expression

By lemma 4, the rate function $I_\beta$ is analytic on $(0,1)$ except for a singularity at $w=w_c$ , with an affine stretch on $(0,w_c]$ , when $\beta_c>-\infty$ and $\sum\nolimits_{s\geqslant 1} s\,{\rm e}^{-\ell s}\,p(s)<\infty$ . It is however continuously differentiable on $(0,1)$ by lemma 3. The conditions $\beta_c>-\infty$ , $\beta=\beta_c$ , and $\sum\nolimits_{s\geqslant 1} s\,{\rm e}^{-\ell s}\,p(s)<\infty$ that make the model critical give $z_\beta(0)=\ell$ and $I_\beta(w)=0$ for all $w\in[0,w_c]$ as a consequence.

Acknowledgments

The author is grateful to Aernout van Enter for suggesting to include the model by Fisher and Felderhof among renewal models of statistical mechanics.

Appendix A. Proof of proposition 1

For any positive integers $\tau$ and $\delta$ the variable $U_{\tau+\delta}$ is independent of $U_1,\ldots,U_\tau$ and distributed as $U_\delta$ conditional on the event that $\tau$ is a renewal, namely conditional on $U_\tau=1$ . This argument with $\tau:=\tau_{m-1}$ and $\delta:=\tau_m-\tau_{m-1}$ yields $ \newcommand{\prob}{\mathbb{P}} \prob[U_{\tau_1}=\cdots=U_{\tau_m}=1]=\prob[U_{\tau_1}=\cdots=U_{\tau_{m-1}}=1]\cdot\prob[U_{\tau_m-\tau_{m-1}}=1]$ , which proves the proposition after iteration over m. To see formally that $U_{\tau+\delta}$ is independent of $U_1,\ldots,U_\tau$ and distributed as $U_\delta$ when $\tau$ is a renewal it suffices to observe that if $\tau=T_n$ for some positive integer n, then $T_i\leqslant\tau$ for each $i\leqslant n$ and $T_i>\tau$ for any $i>n$ . It follows that $U_t$ with $t\leqslant\tau$ takes the expression $\sum\nolimits_{i=1}^n\mathbb{1}_{\{T_i=t\}}$ that depends only on $S_1,\ldots,S_n$ . At the same time, we find $U_{\tau+\delta}=\sum\nolimits_{i\geqslant n+1}\mathbb{1}_{\{T_i=\tau+\delta\}}=\sum\nolimits_{i\geqslant 1}\mathbb{1}_{\{S_{n+1}+\cdots+S_{n+i}=\delta\}}$ , showing that $U_{\tau+\delta}$ depends only on $S_{n+1},S_{n+2},\ldots$ through the same formula that connects $U_\delta$ to $S_1,S_2,\ldots$ .
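The factorization can be checked on a small example. The sketch below is an illustration only: it assumes a waiting-time law `p` with finite support given as a dictionary, computes $\mathbb{P}[U_t=1]$ from the standard renewal equation, and obtains the joint probability by brute-force summation over sequences of waiting times. The function names are hypothetical and are not taken from the paper.

```python
def renewal_mass(p, t_max):
    """u[t] = P[U_t = 1], from the renewal equation u[t] = sum_s p(s) u[t-s] with u[0] = 1."""
    u = [1.0] + [0.0] * t_max
    for t in range(1, t_max + 1):
        u[t] = sum(ps * u[t - s] for s, ps in p.items() if s <= t)
    return u

def joint_renewal_probability(p, targets):
    """P[U_tau = 1 for every tau in targets], summing over waiting-time sequences."""
    targets = sorted(targets)

    def walk(position, index):
        if index == len(targets):
            return 1.0                      # every target time has been hit
        if position > targets[index]:
            return 0.0                      # the renewal sequence jumped over a target
        if position == targets[index]:
            return walk(position, index + 1)
        # Otherwise draw the next waiting time s with probability p(s).
        return sum(ps * walk(position + s, index) for s, ps in p.items())

    return walk(0, 0)

p = {1: 0.5, 2: 0.3, 3: 0.2}                # illustrative law, not taken from the paper
u = renewal_mass(p, 6)
joint = joint_renewal_probability(p, [2, 5])
product = u[2] * u[5 - 2]                   # right-hand side of the factorization
```

Here `joint` and `product` should agree up to rounding, in line with the proposition for $m=2$ , $\tau_1=2$ , and $\tau_2=5$ .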

Appendix B. Convex analysis considerations in $\mathbb{R}^d$

This appendix lists the technical results from convex analysis that are used in section 4 and for which we refer to, e.g. [56]. Hereafter, let C be a convex set in $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ and let $\varphi$ be a proper convex function on $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ . The following is a fundamental property of closures and relative interiors of convex sets.

Proposition B.1. Pick any $ \newcommand{\cl}{{\rm cl}\,} x\in\cl C$ and $ \newcommand{\ri}{{\rm ri}\,} y\in\ri C$ . Then, $ \newcommand{\ri}{{\rm ri}\,} \lambda x+(1-\lambda)y\in\ri C$ for all $\lambda\in[0,1)$ .

The convex function $\varphi$ is continuous on the relative interior $ \newcommand{\ri}{{\rm ri}\,} \newcommand{\dom}{{\rm dom}\,} \ri(\dom\varphi)$ of its effective domain $ \newcommand{\dom}{{\rm dom}\,} \dom\varphi$ , so that a convex function finite on all of $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ is necessarily continuous. The following result states certain continuity properties of $\varphi$ on $ \newcommand{\dom}{{\rm dom}\,} \dom\varphi$ .

Proposition B.2. Let $\varphi$ be a lower semicontinuous convex function and let y be a point in $ \newcommand{\ri}{{\rm ri}\,} \newcommand{\dom}{{\rm dom}\,} \ri(\dom\varphi)$ . Then, $\lim\nolimits_{\lambda\uparrow 1}\varphi(\lambda x+(1-\lambda)y)=\varphi(x)$ for all $ \newcommand{\cl}{{\rm cl}\,} \newcommand{\dom}{{\rm dom}\,} x\in\cl(\dom \varphi)$ .

A practical way to determine subgradients of convex functions relies on directional derivatives. The one-sided directional derivative $\varphi'(x;u)$ of $\varphi$ at x with respect to a vector $ \newcommand{\Rl}{\mathbb{R}} u\in\Rl^d$ is defined by

$$\varphi'(x;u):=\lim_{\epsilon\downarrow 0}\frac{\varphi(x+\epsilon u)-\varphi(x)}{\epsilon}. \tag{B.1}$$

It exists as an extended real number and $ \newcommand{\e}{{\rm e}} \varphi(x+\epsilon u)\geqslant \varphi(x)+\varphi'(x;u)\epsilon$ for all positive $ \newcommand{\e}{{\rm e}} \epsilon$ because the difference quotient in (B.1) is a non-decreasing function of the parameter $ \newcommand{\e}{{\rm e}} \epsilon>0$ by convexity. The following result holds true.

Proposition B.3. Let x be a point where $\varphi$ is finite. Then, a vector g is a subgradient of $\varphi$ at x if and only if $g\cdot u\leqslant \varphi'(x;u)$ for all $ \newcommand{\Rl}{\mathbb{R}} u\in\Rl^d$ .
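As an illustration of proposition B.3, the following minimal sketch works in dimension $d=1$ with the convex function $\varphi(x)=|x|+x^2/2$ , which is our choice and does not appear in the paper. It assumes the usual definition of the one-sided directional derivative as the limit of the difference quotient $[\varphi(x+\epsilon u)-\varphi(x)]/\epsilon$ as $\epsilon\downarrow 0$ and approximates that limit with a small fixed step; all function names are hypothetical.

```python
def phi(x):
    """Convex test function with a kink at the origin (illustrative choice)."""
    return abs(x) + 0.5 * x * x

def difference_quotient(func, x, u, eps):
    return (func(x + eps * u) - func(x)) / eps

def directional_derivative(func, x, u, eps=1e-8):
    """Crude numerical stand-in for the one-sided limit eps -> 0+."""
    return difference_quotient(func, x, u, eps)

def is_subgradient(func, x, g, directions, tol=1e-6):
    """Check g*u <= phi'(x;u) on the given directions (enough in d = 1 by homogeneity)."""
    return all(g * u <= directional_derivative(func, x, u) + tol for u in directions)

# The difference quotient at x = 0 in the direction u = 1 for decreasing steps.
quotients = [difference_quotient(phi, 0.0, 1.0, eps) for eps in (1.0, 0.5, 0.1, 0.01)]
```

At the origin the quotients decrease as the step shrinks, as convexity requires, and the test accepts exactly the slopes $g\in[-1,1]$ up to the numerical tolerance.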

Subgradients and directional derivatives are related to differentiability in the following way.

Proposition B.4. Let x be a point where $\varphi$ is finite. Then, the following three conditions are equivalent to each other:

  • (a)  
    $\varphi$ is differentiable at x with gradient $\nabla\varphi(x)$ ;
  • (b)  
    $\varphi$ has a unique subgradient g at x;
  • (c)  
    there exists a vector $\nu$ (necessarily unique) such that $\varphi'(x;u)=\nu\cdot u$ for every $ \newcommand{\Rl}{\mathbb{R}} u\in\Rl^d$ .

If one of these conditions is fulfilled, then all these conditions are satisfied and $\nabla\varphi(x)=g=\nu$ .

According to the next result, the gradient mapping $x\mapsto\nabla\varphi(x)$ is continuous on open sets where $\varphi$ is differentiable.

Proposition B.5. Let $ \newcommand{\Rl}{\mathbb{R}} A\subseteq\Rl^d$ be an open set. If $\varphi$ is differentiable at any point of A, then $\varphi$ is actually continuously differentiable on A.

The Fenchel–Legendre transform, or conjugate, of $\varphi$ is the lower semicontinuous proper convex function $\varphi^\star$ defined for all $ \newcommand{\Rl}{\mathbb{R}} x^\star\in\Rl^d$ by

$$\varphi^\star(x^\star):=\sup_{x\in\mathbb{R}^d}\big\{x^\star\cdot x-\varphi(x)\big\}.$$

Subdifferentials enter the theory of conjugate functions through the following result.

Proposition B.6. The following two conditions are equivalent to each other for any vectors x and $x^\star$ :

  • (a)  
    $x^\star\in\partial\varphi(x)$ ;
  • (b)  
    $x^\star\cdot y-\varphi(y)$ achieves its supremum in y at $y=x$ .

If $\varphi$ is lower semicontinuous, then one more condition can be added to this list:

  • (c)  
    $x\in\partial\varphi^\star(x^\star)$ .
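The equivalence between conditions (a) and (b) can be visualized on a grid. This is only a sketch under assumptions of ours: the conjugate is taken to be the usual supremum $\varphi^\star(x^\star)=\sup_x\{x^\star\cdot x-\varphi(x)\}$ , the supremum is replaced by a maximum over a finite grid in dimension $d=1$ , and the test function $\varphi(x)=x^2/2$ , whose conjugate is $(x^\star)^2/2$ , is an illustrative choice. The helper names are hypothetical.

```python
def conjugate_on_grid(phi, grid):
    """Grid approximation of the Fenchel-Legendre transform of phi."""
    def phi_star(x_star):
        return max(x_star * x - phi(x) for x in grid)
    return phi_star

def maximizer_on_grid(phi, grid, x_star):
    """Grid point y where x_star*y - phi(y) is largest, cf. condition (b)."""
    return max(grid, key=lambda y: x_star * y - phi(y))

def quadratic(x):
    return 0.5 * x * x

grid = [i / 100.0 for i in range(-500, 501)]   # points -5.00, -4.99, ..., 5.00
quadratic_star = conjugate_on_grid(quadratic, grid)

# For phi(x) = x^2/2 the only subgradient at x is x itself, so with x_star = 1.5
# the supremum in condition (b) should be attained at y = 1.5.
y_best = maximizer_on_grid(quadratic, grid, 1.5)
```

The grid is a design shortcut: it keeps the example dependency-free, at the price of an error of the order of the grid spacing whenever the maximizer is not a grid point.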

Importantly, the subdifferentials of $\varphi$ are very close to covering the effective domain of $\varphi^\star$ when $\varphi$ is lower semicontinuous according to the following result.

Proposition B.7. If $\varphi$ is lower semicontinuous, then $ \newcommand{\Rl}{\mathbb{R}} \newcommand{\ri}{{\rm ri}\,} \newcommand{\dom}{{\rm dom}\,} \ri(\dom\varphi^\star)\subseteq\cup_{x\in\Rl^d}\partial\varphi(x)\subseteq\dom\varphi^\star$ .

Differentiability of $\varphi$ everywhere is related to strict convexity of $\varphi^\star$ by the following last result.

Proposition B.8. If $\varphi$ is finite everywhere, then $\varphi^\star$ is strictly convex on $ \newcommand{\ri}{{\rm ri}\,} \newcommand{\dom}{{\rm dom}\,} \ri(\dom\varphi^\star)$ if and only if $\varphi$ is differentiable throughout $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ .

Appendix C. Proof of proposition 3

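Let $\mathcal{H}$ be the closed convex set in $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ defined by

$$\mathcal{H}:=\Big\{w\in\mathbb{R}^d\,:\,k\cdot w\leqslant\sup\nolimits_{s\in\mathcal{S}}\{k\cdot f(s)/s\}\ \text{for all}\ k\in\mathbb{R}^d\Big\}.$$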

We prove in the order that $ \newcommand{\dom}{{\rm dom}\,} \mathcal{C}\subseteq\dom I$ , that $ \newcommand{\cl}{{\rm cl}\,} \cl\mathcal{C}=\mathcal{H}$ , and that $ \newcommand{\dom}{{\rm dom}\,} \dom I\subseteq\mathcal{H}$ .

To begin with, let us observe that $\sum\nolimits_{s\geqslant 1} {\rm e}^{k\cdot f(s)+v(s)-\zeta s}\,p(s)$ is lower semicontinuous in $\zeta$ for any given k because it is the sum of continuous positive functions. This fact gives $\sum\nolimits_{s\geqslant 1} {\rm e}^{k\cdot f(s)+v(s)-z(k) s}\,p(s)\leqslant 1$ by definition (1), which in turn implies $k\cdot f(s)-z(k)s\leqslant-v(s)-\ln p(s)$ for all $s\in\mathcal{S}$ . If w is a convex combination of the elements from $\{\,f(s)/s\}_{s\in\mathcal{S}}$ , then there exist integers $s_1,\ldots,s_m$ in the support $\mathcal{S}$ of p and positive real numbers $\lambda_1,\ldots,\lambda_m$ such that $w=\sum\nolimits_{l=1}^m\lambda_lf(s_l)/s_l$ and $\sum\nolimits_{l=1}^m\lambda_l=1$ . Let $M<\infty$ be a constant satisfying $-v(s_l)-\ln p(s_l)\leqslant M s_l$ for every l and pick an arbitrary point $ \newcommand{\Rl}{\mathbb{R}} k\in\Rl^d$ . Then, $k\cdot f(s_l)-z(k)s_l\leqslant -v(s_l)-\ln p(s_l)\leqslant Ms_l$ for all l and, recalling that $\sum\nolimits_{l=1}^m\lambda_l=1$ and $\lambda_l>0$ for each l, we find

$$k\cdot w-z(k)=\sum\nolimits_{l=1}^m\frac{\lambda_l}{s_l}\big[k\cdot f(s_l)-z(k)s_l\big]\leqslant\sum\nolimits_{l=1}^m\lambda_l M=M.$$

The arbitrariness of k results in $I(w)\leqslant M+z(0)<\infty$ , showing that $ \newcommand{\dom}{{\rm dom}\,} w\in\dom I$ . The arbitrariness of w implies $ \newcommand{\dom}{{\rm dom}\,} \mathcal{C}\subseteq\dom I$ .

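If $w=\sum\nolimits_{l=1}^m\lambda_l f(s_l)/s_l$ is as above with $\lambda_l>0$ for each l and $\sum\nolimits_{l=1}^m\lambda_l=1$ , then for all $ \newcommand{\Rl}{\mathbb{R}} k\in\Rl^d$ we get

$$k\cdot w=\sum\nolimits_{l=1}^m\lambda_l\,k\cdot\frac{f(s_l)}{s_l}\leqslant\sup\nolimits_{s\in\mathcal{S}}\{k\cdot f(s)/s\}.$$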

This gives $w\in\mathcal{H}$ . Thus, $\mathcal{C}\subseteq\mathcal{H}$ is deduced from the arbitrariness of w and $ \newcommand{\cl}{{\rm cl}\,} \cl\mathcal{C}\subseteq\mathcal{H}$ follows since $\mathcal{H}$ is closed. In order to show that $ \newcommand{\cl}{{\rm cl}\,} \cl\mathcal{C}=\mathcal{H}$ it remains to prove that $ \newcommand{\cl}{{\rm cl}\,} \mathcal{H}\subseteq\cl\mathcal{C}$ . By contradiction, if there exists $u\in\mathcal{H}$ that is not contained in $ \newcommand{\cl}{{\rm cl}\,} \cl\mathcal{C}$ , then we can find a point h and a number $ \newcommand{\e}{{\rm e}} \epsilon>0$ such that $ \newcommand{\e}{{\rm e}} h\cdot w+\epsilon\leqslant h\cdot u$ for all $ \newcommand{\cl}{{\rm cl}\,} w\in\cl\mathcal{C}$ by the Hahn–Banach separation theorem. In particular, as $ \newcommand{\cl}{{\rm cl}\,} f(s)/s\in\cl\mathcal{C}$ for all $s\in\mathcal{S}$ , we obtain $ \newcommand{\e}{{\rm e}} h\cdot f(s)/s+\epsilon\leqslant h\cdot u$ for each $s\in\mathcal{S}$ . This contradicts the fact that $k\cdot u\leqslant \sup\nolimits_{s\in\mathcal{S}}\{k\cdot f(s)/s\}$ for every $ \newcommand{\Rl}{\mathbb{R}} k\in\Rl^d$ because $u\in\mathcal{H}$ .
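The identity $\cl\mathcal{C}=\mathcal{H}$ can be made concrete in dimension $d=1$ . The sketch below assumes that $\mathcal{H}$ is the set of points w with $k\cdot w\leqslant\sup_{s\in\mathcal{S}}\{k\cdot f(s)/s\}$ for all k, as its use in this appendix indicates, and takes $f(s)=1$ with the support truncated to $s\leqslant 50$ , which is a simplification of ours. In one dimension only the directions $k=\pm 1$ matter by homogeneity. The function names are hypothetical.

```python
def support_function(points, k):
    """sup over the given points x of k*x (a maximum, since the set is finite)."""
    return max(k * x for x in points)

def in_half_space_intersection(points, w, directions=(-1.0, 1.0), tol=1e-12):
    """True when k*w <= sup_x k*x for each tested direction k."""
    return all(k * w <= support_function(points, k) + tol for k in directions)

# Ratios f(s)/s = 1/s for s = 1, ..., 50 (truncated support, illustrative).
points = [1.0 / s for s in range(1, 51)]
```

With this truncation the test accepts exactly the closed interval $[1/50,1]$ , which is the closed convex hull of the points; letting the truncation level grow recovers the closure $[0,1]$ of the interval $(0,1]$ that appears for the number of renewals.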

To conclude, we prove that $ \newcommand{\dom}{{\rm dom}\,} \dom I\subseteq\mathcal{H}$ . Assume for a moment to know that $z(k)\leqslant \sup\nolimits_{s\in\mathcal{S}}\{k\cdot f(s)/s\}+|z_o|+\ln 2$ for all $ \newcommand{\Rl}{\mathbb{R}} k\in\Rl^d$ with $z_o$ given by assumption 2. Then, definition (2) yields for each point k and positive real number $\eta$

$$I(w)\geqslant\eta k\cdot w-z(\eta k)+z(0)\geqslant\eta k\cdot w-\eta\sup\nolimits_{s\in\mathcal{S}}\{k\cdot f(s)/s\}-|z_o|-\ln 2+z(0).$$

This way, if $I(w)<\infty$ , then dividing by $\eta$ first and sending $\eta$ to infinity later we obtain $k\cdot w\leqslant \sup\nolimits_{s\in\mathcal{S}}\{k\cdot f(s)/s\}$ for any k. This shows that $w\in\mathcal{H}$ whenever $ \newcommand{\dom}{{\rm dom}\,} w\in\dom I$ . It remains to verify that $z(k)\leqslant \sup\nolimits_{s\in\mathcal{S}}\{k\cdot f(s)/s\}+|z_o|+\ln 2$ for all $ \newcommand{\Rl}{\mathbb{R}} k\in\Rl^d$ . To this aim, fix a point k in $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ and a real number $\zeta<z(k)$ , so that $\sum\nolimits_{s\geqslant 1} {\rm e}^{k\cdot f(s)+v(s)-\zeta s}\,p(s)>1$ by definition of $z(k)$ . As ${\rm e}^{v(s)}p(s)\leqslant {\rm e}^{z_os}$ for all s by assumption 2, an integer $t\in\mathcal{S}$ exists with the property that $k\cdot f(t)-\zeta t>-z_o-\ln 2$ . It follows that $\zeta<k\cdot f(t)/t+(z_o+\ln 2)/t\leqslant\sup\nolimits_{s\in\mathcal{S}}\{k\cdot f(s)/s\}+|z_o|+\ln 2$ , giving $z(k)\leqslant \sup\nolimits_{s\in\mathcal{S}}\{k\cdot f(s)/s\}+|z_o|+\ln 2$ after $\zeta$ is sent to $z(k)$ .

Appendix D. Proof of lemma 1

Assume for a moment that $ \newcommand{\e}{{\rm e}} \Theta\ne\emptyset$ and pick $k\in\Theta$ . In this case $ \newcommand{\e}{{\rm e}} \ell>-\infty$ and we have $\sum\nolimits_{s\geqslant 1} {\rm e}^{k\cdot f(s)+v(s)-\zeta s}\,p(s)=\infty$ or $\sum\nolimits_{s\geqslant 1} {\rm e}^{k\cdot f(s)+v(s)-\zeta s}\,p(s)\leqslant \theta(k)\leqslant 1$ according to $ \newcommand{\e}{{\rm e}} \zeta<k\cdot r+\ell$ or $ \newcommand{\e}{{\rm e}} \zeta\geqslant k\cdot r+\ell$ . This way, a glance at definition (1) immediately tells us that $ \newcommand{\e}{{\rm e}} z(k)=k\cdot r+\ell$ .

Suppose now that $ \newcommand{\Rl}{\mathbb{R}} \Theta\ne\Rl^d$ and fix $k\in\Theta^c$ . The series $\sum\nolimits_{s\geqslant 1} {\rm e}^{k\cdot f(s)+v(s)-\zeta s}\,p(s)$ defines a non-increasing function in the variable $\zeta$ . In the region $ \newcommand{\e}{{\rm e}} \zeta>k\cdot r+\ell$ , such function is finite, continuous, and strictly decreasing to zero as $\zeta$ goes to infinity. Moreover, it satisfies $ \newcommand{\e}{{\rm e}} \lim\nolimits_{\zeta\downarrow k\cdot r+\ell}\sum\nolimits_{s\geqslant 1} {\rm e}^{k\cdot f(s)+v(s)-\zeta s}\,p(s)=\theta(k)>1$ by Abel's theorem. Then, we deduce that there exists a unique real number $ \newcommand{\e}{{\rm e}} \zeta>k\cdot r+\ell$ solving the equation $\sum\nolimits_{s\geqslant 1}{\rm e}^{k\cdot f(s)+v(s)-\zeta s}\,p(s)=1$ and the value of z at k is exactly such number $\zeta$ by definition (1).
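The characterization of $z(k)$ as the unique root of $\sum\nolimits_{s\geqslant 1}{\rm e}^{k\cdot f(s)+v(s)-\zeta s}\,p(s)=1$ lends itself to a simple numerical treatment. The sketch below is illustrative and rests on assumptions of ours: the dimension is $d=1$ , the support of p is finite (so that $\Theta$ is empty and every k falls in the case just discussed), the bracket `[lo, hi]` contains the root, and the names `G`, `z_of_k`, and `nu_of_k` are hypothetical. The last function evaluates the ratio $\sum_s f(s)\,c_0(s)/\sum_s s\,c_0(s)$ that appendix E identifies with $\nu(k)$ .

```python
import math

def G(k, zeta, p, f, v):
    """sum_s e^{k f(s) + v(s) - zeta s} p(s) for a finite-support law p = {s: p(s)}."""
    return sum(math.exp(k * f(s) + v(s) - zeta * s) * ps for s, ps in p.items())

def z_of_k(k, p, f, v, lo=-40.0, hi=40.0, tol=1e-12):
    """Root in zeta of G(k, zeta) = 1, found by bisection since G decreases in zeta."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if G(k, mid, p, f, v) > 1.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def nu_of_k(k, p, f, v):
    """Ratio sum_s f(s) c_0(s) / sum_s s c_0(s) with c_0(s) = e^{k f(s)+v(s)-z(k)s} p(s)."""
    zeta = z_of_k(k, p, f, v)
    weights = {s: math.exp(k * f(s) + v(s) - zeta * s) * ps for s, ps in p.items()}
    return sum(f(s) * c for s, c in weights.items()) / sum(s * c for s, c in weights.items())

p = {1: 0.5, 2: 0.3, 3: 0.2}       # illustrative law, not taken from the paper
f = lambda s: 1.0                  # reward counting the number of renewals
v = lambda s: 0.0                  # no extra potential

h = 1e-4
finite_difference = (z_of_k(0.3 + h, p, f, v) - z_of_k(0.3 - h, p, f, v)) / (2.0 * h)
```

In this example `finite_difference` should be close to `nu_of_k(0.3, p, f, v)`, and $z(0)$ should vanish because p is normalized and v is zero.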

Appendix E. Proof of proposition 4

E.1. Part (a)

Let G be the function that associates $k\in\Theta^c$ and $\zeta>k\cdot r+\ell$ with the real number $G(k,\zeta):=\sum\nolimits_{s\geqslant 1}{\rm e}^{k\cdot f(s)+v(s)-\zeta s}\,p(s)$ . We prove that G is analytic. To this aim, fix $k\in\Theta^c$ and $\zeta>k\cdot r+\ell$ and set for brevity $c_0(s):={\rm e}^{k\cdot f(s)+v(s)-\zeta s}\,p(s)$ for all $s\geqslant 1$ . Denote by $f_i$ , $k_i$ , $x_i$ , and $\nu_i$ the ith component respectively of f, k, a vector x in $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ , and $\nu(k)$ . As $\zeta>k\cdot r+\ell$ , there exists $\delta_o>0$ such that $\zeta\geqslant k\cdot r+\ell+(\|r\|+1)\delta_o$ . It follows that if $ \newcommand{\Rl}{\mathbb{R}} x\in\Rl^d$ and $ \newcommand{\Rl}{\mathbb{R}} y\in\Rl$ satisfy $\|x\|<\delta_o$ and $|y|<\delta_o$ , then $|x\cdot r|+|y|<(\|r\|+1)\delta_o$ and by Cauchy's criterion the series

is convergent. This way, Fubini's theorem allows us to freely rearrange the order of summation to get

This formula shows that G can be represented by a convergent power series in an open neighborhood of the arbitrary given point $(k,\zeta)$ , thus demonstrating the analyticity of G. Moreover, it gives

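$$\frac{\partial^{m_1+\cdots+m_d+n}G}{\partial k_1^{m_1}\cdots\partial k_d^{m_d}\partial\zeta^n}(k,\zeta)=\sum\nolimits_{s\geqslant 1}f_1^{m_1}(s)\cdots f_d^{m_d}(s)\,(-s)^n\,{\rm e}^{k\cdot f(s)+v(s)-\zeta s}\,p(s) \tag{E.1}$$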

for all non-negative integers $m_1,\ldots,m_d$ and n.

In particular, formula (E.1) yields $\frac{\partial G}{\partial \zeta}(k,\zeta)=-\sum\nolimits_{s\geqslant 1}s\,{\rm e}^{k\cdot f(s)+v(s)-\zeta s}\,p(s)\ne 0$ for all $k\in\Theta^c$ and $\zeta>k\cdot r+\ell$ . This way, since $z(k)>k\cdot r+\ell$ and $G(k,z(k))=1$ for each $k\in\Theta^c$ , the real analytic implicit function theorem (see [53], theorem 2.3.5) tells us that z is analytic on $\Theta^c$ . By taking the derivative of $G(k,z(k))=1$ with respect to $k_i$ we get for every index i and point $k\in\Theta^c$

$$\frac{\partial z}{\partial k_i}(k)=\frac{\sum\nolimits_{s\geqslant 1}f_i(s)\,{\rm e}^{k\cdot f(s)+v(s)-z(k) s}\,p(s)}{\sum\nolimits_{s\geqslant 1}s\,{\rm e}^{k\cdot f(s)+v(s)-z(k) s}\,p(s)}=\nu_i(k).$$

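The vector field $\nu$ that associates k with $\nu(k)$ turns out to be analytic on $\Theta^c$ inheriting this property from z. As far as the Jacobian matrix $J(k)$ of $\nu$ at k is concerned, by taking the derivative of $G(k,z(k))=1$ with respect to $k_i$ and $k_j$ we find for all indices i and j and every point $k\in\Theta^c$

$$J_{ij}(k)=\frac{\partial^2 z}{\partial k_i\partial k_j}(k)=\frac{\sum\nolimits_{s\geqslant 1}[\,f_i(s)-\nu_i(k)s][\,f_j(s)-\nu_j(k)s]\,{\rm e}^{k\cdot f(s)+v(s)-z(k) s}\,p(s)}{\sum\nolimits_{s\geqslant 1}s\,{\rm e}^{k\cdot f(s)+v(s)-z(k) s}\,p(s)}.$$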

E.2. Part (b)

Assume that $\mathcal{S}$ is an infinite set and that $ \newcommand{\e}{{\rm e}} \ell>-\infty$ , otherwise $ \newcommand{\e}{{\rm e}} \Theta=\emptyset$ and there is nothing to prove, and pick $k\in\Theta$ . As discussed in appendix B, a practical way to determine the subgradients of the convex function z at the point k relies on the one-sided directional derivative $z'(k;u)$ with respect to a vector $ \newcommand{\Rl}{\mathbb{R}} u\in\Rl^d$ . We have $ \newcommand{\e}{{\rm e}} z(k+\epsilon u)\geqslant z(k)+z'(k;u)\epsilon$ for all positive $ \newcommand{\e}{{\rm e}} \epsilon$ and the vector g is a subgradient of z at k if and only if $g\cdot u\leqslant z'(k;u)$ for all $ \newcommand{\Rl}{\mathbb{R}} u\in\Rl^d$ (see appendix B, proposition B.3). The function z is differentiable at k if and only if a vector $\nu$ (necessarily unique) exists so that $z'(k;u)=\nu\cdot u$ for every u (see appendix B, proposition B.4). If such a $\nu$ exists, then the gradient $\nabla z(k)$ of z at k is equal to $\nu$ . Let us examine what happens when a vector $\nu\ne r$ exists so that $z'(k;u)=\max\{r\cdot u, \nu\cdot u\}$ for all $ \newcommand{\Rl}{\mathbb{R}} u\in\Rl^d$ . In this case, $g:=(1-\alpha)r+\alpha \nu$ with any $\alpha\in[0,1]$ satisfies $g\cdot u\leqslant z'(k;u)$ for each u, thus resulting in a subgradient of z at k. Conversely, if g is a subgradient of z at k, then $g\cdot u\leqslant\max\{r\cdot u, \nu\cdot u\}$ for every u. It follows that $(g-r)\cdot u\leqslant\max\{0, (\nu-r)\cdot u\}=0$ for all vectors u orthogonal to $\nu-r$ , showing that a number $\alpha$ exists such that $g-r=\alpha(\nu-r)$ , and hence $g=(1-\alpha)r+\alpha \nu$ . Taking $u=-(\nu-r)$ first and $u=\nu-r$ later in $g\cdot u\leqslant\max\{r\cdot u, \nu\cdot u\}$ we find that $\alpha\geqslant 0$ and that $\alpha\leqslant 1$ , respectively. 
In conclusion, if $z'(k;u)=\max\{r\cdot u, \nu\cdot u\}$ for any $ \newcommand{\Rl}{\mathbb{R}} u\in\Rl^d$ , then g is a subgradient of z at k if and only if there exists $\alpha\in[0,1]$ such that $g=(1-\alpha)r+\alpha \nu$ . This is true even in the case $\nu=r$ , to which a function z differentiable at k corresponds. These arguments tell us that in order to prove part (b) of the proposition it suffices to check that if $\theta(k)=1$ , then $z'(k;u)=r\cdot u$ or $z'(k;u)=\max\{r\cdot u, \nu(k)\cdot u\}$ for any given $ \newcommand{\Rl}{\mathbb{R}} u\in\Rl^d$ depending on whether the series $\sum\nolimits_{s\geqslant 1} s\,{\rm e}^{k\cdot f(s)+v(s)-z(k) s}\,p(s)$ diverges or converges. Similarly, part (c) follows if we prove that $z'(k;u)=r\cdot u$ for all u when $\theta(k)<1$ .

Let us fix an arbitrary vector u in $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ . In order to simplify next formulas, we set $ \newcommand{\e}{{\rm e}} c_\epsilon(s):={\rm e}^{(k+\epsilon u)\cdot f(s)+v(s)-z(k+\epsilon u) s}\,p(s)$ and $ \newcommand{\e}{{\rm e}} \Delta_\epsilon(s):=\epsilon u\cdot f(s)-z(k+\epsilon u)s+z(k) s$ for each $s\geqslant 1$ and $ \newcommand{\e}{{\rm e}} \epsilon\geqslant 0$ . We notice that $ \newcommand{\e}{{\rm e}} c_\epsilon(s)={\rm e}^{\Delta_\epsilon(s)}c_0(s)$ for all s and $ \newcommand{\e}{{\rm e}} \epsilon$ and that the equality $\nu(k)=\sum\nolimits_{s\geqslant 1}f(s)\,c_0(s)/\sum\nolimits_{s\geqslant 1}s\,c_0(s)$ holds true according to (9). Given any $ \newcommand{\e}{{\rm e}} \epsilon$ , lemma 1 tells us that $ \newcommand{\e}{{\rm e}} \sum\nolimits_{s\geqslant 1}c_\epsilon(s)\leqslant 1$ and that $ \newcommand{\e}{{\rm e}} \sum\nolimits_{s\geqslant 1}c_\epsilon(s)=1$ if $ \newcommand{\e}{{\rm e}} k+\epsilon u\in\Theta^c$ . In addition, we have $\sum\nolimits_{s\geqslant 1}c_0(s)=\theta(k)$ because $ \newcommand{\e}{{\rm e}} z(k)=k\cdot r+\ell$ when $k\in\Theta$ . Finally, we observe that $ \newcommand{\e}{{\rm e}} \lim\nolimits_{\epsilon\downarrow 0}c_\epsilon(s)=c_0(s)$ for all s thanks to the continuity of z.

Assuming that $\theta(k)=1$ , so that $\sum\nolimits_{s\geqslant 1}c_0(s)=\theta(k)=1$ , we now prove that $z'(k;u)=r\cdot u$ or $z'(k;u)=\max\{r\cdot u, \nu(k)\cdot u\}$ depending on whether the series $\sum\nolimits_{s\geqslant 1} s\,{\rm e}^{k\cdot f(s)+v(s)-z(k) s}\,p(s)$ diverges or converges. The fact that $ \newcommand{\e}{{\rm e}} z(h)\geqslant h\cdot r+\ell$ for all $ \newcommand{\Rl}{\mathbb{R}} h\in\Rl^d$ gives $ \newcommand{\e}{{\rm e}} z(k+\epsilon u)-z(k)\geqslant \epsilon r\cdot u$ for any $ \newcommand{\e}{{\rm e}} \epsilon$ , showing that $z'(k;u)\geqslant r\cdot u$ . Furthermore, the bound $e^y\geqslant 1+y$ valid for every real number y implies that if $\sum\nolimits_{s\geqslant 1} s\,c_0(s)<\infty$ , then the series $ \newcommand{\e}{{\rm e}} \sum\nolimits_{s\geqslant 1} \Delta_\epsilon(s)\,c_0(s)$ exists for each $ \newcommand{\e}{{\rm e}} \epsilon>0$ and

This yields $ \newcommand{\e}{{\rm e}} (1/\epsilon)\sum\nolimits_{s\geqslant 1} \Delta_\epsilon(s)\,c_0(s)\leqslant 0$ and sending $ \newcommand{\e}{{\rm e}} \epsilon$ to zero we obtain from here that $z'(k;u)\geqslant \nu(k)\cdot u$ whenever $\sum\nolimits_{s\geqslant 1} s\,c_0(s)<\infty$ . These arguments prove that $z'(k;u)\geqslant r\cdot u$ or $z'(k;u)\geqslant \max\{r\cdot u, \nu(k)\cdot u\}$ according to $\sum\nolimits_{s\geqslant 1} s\,c_0(s)=\infty$ or $\sum\nolimits_{s\geqslant 1} s\,c_0(s)<\infty$ .

Let us deduce the opposite bounds $z'(k;u)\leqslant r\cdot u$ if $\sum\nolimits_{s\geqslant 1} s\,c_0(s)=\infty$ and $z'(k;u)\leqslant\max\{r\cdot u, \nu(k)\cdot u\}$ if $\sum\nolimits_{s\geqslant 1} s\,c_0(s)<\infty$ , which conclude the proof of part (b) of the proposition. Pick a number $ \newcommand{\e}{{\rm e}} \eta>0$ and observe that the assumption that $f(s)/s$ has a limit $ \newcommand{\Rl}{\mathbb{R}} r\in\Rl^d$ when s goes to infinity through $\mathcal{S}$ ensures us that a positive integer $\tau_o$ can be found with the property that $ \newcommand{\e}{{\rm e}} [\,f(s)-r s]\cdot u\leqslant \eta s$ for each $s\in\mathcal{S}$ larger than $\tau_o$ . Then, fix a number $ \newcommand{\e}{{\rm e}} \epsilon>0$ and suppose for a moment that the condition $ \newcommand{\e}{{\rm e}} k+\epsilon u\in\Theta^c$ is satisfied. Under this condition, z is differentiable at $ \newcommand{\e}{{\rm e}} k+\epsilon u$ as stated by part (a) of the present proposition and it follows from convexity that for all $\tau\geqslant\tau_o$

We get from here that for any integer $t\geqslant 1$

This inequality holds even if $ \newcommand{\e}{{\rm e}} k+\epsilon u\in\Theta$ , and hence whatever $ \newcommand{\e}{{\rm e}} k+\epsilon u$ is, since $ \newcommand{\e}{{\rm e}} z(k+\epsilon u)-z(k)=\epsilon r\cdot u$ when $ \newcommand{\e}{{\rm e}} k+\epsilon u\in\Theta$ . This way, we can send $ \newcommand{\e}{{\rm e}} \epsilon$ to zero finding that for all $\tau\geqslant\tau_o$ and $t\geqslant 1$

Recall that $\lim\nolimits_{\epsilon\downarrow 0}c_\epsilon(s)=c_0(s)$ for each s. At this point, sending first t to infinity, then $\tau$ to infinity, and finally $\eta$ to zero, we get $z'(k;u)\leqslant r\cdot u$ or $z'(k;u)\leqslant\max\{r\cdot u, \nu(k)\cdot u\}$ according to whether the series $\sum\nolimits_{s\geqslant 1} s\,c_0(s)$ diverges or converges.

E.3. Part (c)

Supposing that $\theta(k)<1$ , so that $\sum\nolimits_{s\geqslant 1}c_0(s)=\theta(k)<1$ , here we show that $z'(k;u)=r\cdot u$ . If there exists a real number $ \newcommand{\e}{{\rm e}} \epsilon_o>0$ with the property that $ \newcommand{\e}{{\rm e}} k+\epsilon_o u\in\Theta$ , then $ \newcommand{\e}{{\rm e}} k+\epsilon u\in\Theta$ for every $ \newcommand{\e}{{\rm e}} \epsilon\in(0,\epsilon_o)$ since $ \newcommand{\e}{{\rm e}} k+\epsilon u=(1-\epsilon/\epsilon_o)k+(\epsilon/\epsilon_o)(k+\epsilon_o u)$ and both k and $ \newcommand{\e}{{\rm e}} k+\epsilon_o u$ belong to the convex set $\Theta$ . Consequently, $ \newcommand{\e}{{\rm e}} z(k+\epsilon u)=(k+\epsilon u)\cdot r+\ell$ for all $ \newcommand{\e}{{\rm e}} \epsilon\in(0,\epsilon_o)$ and $z'(k;u)=r\cdot u$ follows immediately. The non trivial case is when the number $ \newcommand{\e}{{\rm e}} \epsilon_o$ does not exist, namely when $ \newcommand{\e}{{\rm e}} k+\epsilon u\in\Theta^c$ for all $ \newcommand{\e}{{\rm e}} \epsilon>0$ . However, in such case there exists a subsequence $\{s_i\}_{i\geqslant 1}$ of $\mathcal{S}$ diverging to infinity with the property that $z'(k;u)s_i<u\cdot f(s_i)$ for any i as we shall prove in a moment. This fact yields

On the other hand, since $ \newcommand{\e}{{\rm e}} z(h)\geqslant h\cdot r+\ell$ for all $ \newcommand{\Rl}{\mathbb{R}} h\in\Rl^d$ and $ \newcommand{\e}{{\rm e}} z(k)=k\cdot r+\ell$ , we also have $z'(k;u)\geqslant r\cdot u$ . This way, $z'(k;u)=r\cdot u$ even when the above $ \newcommand{\e}{{\rm e}} \epsilon_o$ does not exist and the proof of part (c) of the proposition is concluded.

Assume that $ \newcommand{\e}{{\rm e}} k+\epsilon u\in\Theta^c$ for all $ \newcommand{\e}{{\rm e}} \epsilon>0$ . We prove that there exists a subsequence $\{s_i\}_{i\geqslant 1}$ of $\mathcal{S}$ diverging to infinity such that $z'(k;u)s_i<u\cdot f(s_i)$ for any i by contradiction. Once again, we set $ \newcommand{\e}{{\rm e}} c_\epsilon(s):={\rm e}^{(k+\epsilon u)\cdot f(s)+v(s)-z(k+\epsilon u) s}\,p(s)$ and $ \newcommand{\e}{{\rm e}} \Delta_\epsilon(s):=\epsilon u\cdot f(s)-z(k+\epsilon u)s+z(k) s$ for each $s\geqslant 1$ and $ \newcommand{\e}{{\rm e}} \epsilon\geqslant 0$ . If an integer $t\geqslant 1$ with the property that $z'(k;u)s\geqslant u\cdot f(s)$ for all $s\in\mathcal{S}$ larger than t exists, then $ \newcommand{\e}{{\rm e}} z(k+\epsilon u)-z(k)\geqslant z'(k;u)\epsilon\geqslant \epsilon u\cdot f(s)/s$ for all those s and $ \newcommand{\e}{{\rm e}} \epsilon>0$ by convexity. This means that $ \newcommand{\e}{{\rm e}} \Delta_\epsilon(s)\leqslant 0$ for any $s\in\mathcal{S}$ larger than t and $ \newcommand{\e}{{\rm e}} \epsilon>0$ . On the other hand, we have $ \newcommand{\e}{{\rm e}} \sum\nolimits_{s\geqslant 1} c_\epsilon(s)=1$ for every $ \newcommand{\e}{{\rm e}} \epsilon>0$ as $ \newcommand{\e}{{\rm e}} k+\epsilon u\in\Theta^c$ by hypothesis. It follows that $ \newcommand{\e}{{\rm e}} 1=\sum\nolimits_{s\geqslant 1} c_\epsilon(s)\leqslant\sum\nolimits_{s=1}^t {\rm e}^{\Delta_\epsilon(s)}c_0(s)+\sum\nolimits_{s=t+1}^\infty c_0(s)$ for every $ \newcommand{\e}{{\rm e}} \epsilon>0$ . This way, sending $ \newcommand{\e}{{\rm e}} \epsilon$ to zero we get $\sum\nolimits_{s\geqslant 1} c_0(s)\geqslant 1$ , which contradicts the fact that $\sum\nolimits_{s\geqslant 1} c_0(s)<1$ .

Appendix F. Proof of lemma 2

F.1. Part (a)

Fix $w\ne r$ and $ \newcommand{\Rl}{\mathbb{R}} k\in\Rl^d$ in such a way that $w\in\partial z(k)$ . By proposition 4, the fact that $w\ne r$ implies $\theta(k)\geqslant 1$ and $\sum\nolimits_{s\geqslant 1}s\,{\rm e}^{k\cdot f(s)+v(s)-z(k) s}\,p(s)<\infty$ , so that $\nu(k)$ is well defined. Let h be an arbitrary point in $ \newcommand{\Rl}{\mathbb{R}} \Rl^d$ . If the condition $(h-k)\cdot[\,f(s)-rs]=0$ for every $s\in\mathcal{S}$ is satisfied, then $\partial z(h)=\partial z(k)$ and hence $w\in\partial z(h)$ . Indeed, it is a simple exercise to verify that such condition entails $\theta(h)=\theta(k)\geqslant 1$ , $z(h)=z(k)+(h-k)\cdot r$ based on definition (1), $\sum\nolimits_{s\geqslant 1}s\,{\rm e}^{h\cdot f(s)+v(s)-z(h) s}\,p(s)=\sum\nolimits_{s\geqslant 1}s\,{\rm e}^{k\cdot f(s)+v(s)-z(k) s}\,p(s)<\infty$ , and $\nu(h)=\nu(k)$ . In particular, the results $\theta(h)\geqslant 1$ , $\sum\nolimits_{s\geqslant 1}s\,{\rm e}^{k\cdot f(s)+v(s)-z(k) s}\,p(s)<\infty$ , and $\nu(h)=\nu(k)$ combined with proposition 4 necessarily yield $\partial z(h)=\partial z(k)$ .

Assume now that $w\in\partial z(h)$ . We prove that $(h-k)\cdot[\,f(s)-rs]=0$ for every $s\in\mathcal{S}$ as a consequence. We have $z(x)\geqslant z(h)+w\cdot(x-h)$ for all $ \newcommand{\Rl}{\mathbb{R}} x\in\Rl^d$ since w is a subgradient of z at h and, similarly, $z(x)\geqslant z(k)+w\cdot(x-k)$ . It follows in particular that $z(h)-z(k)=w\cdot(h-k)$ and that $z(\lambda h+(1-\lambda)k)\geqslant z(k)+\lambda w\cdot(h-k)$ for all $ \newcommand{\Rl}{\mathbb{R}} \lambda\in\Rl$ . These two relations give $z(\lambda h+(1-\lambda)k)\geqslant \lambda z(h)+(1-\lambda)z(k)$ , which combined with convexity entails $z(\lambda h+(1-\lambda)k)=\lambda z(h)+(1-\lambda)z(k)$ for every $\lambda\in[0,1]$ . Suppose for a moment that $k\in\Theta^c$ and recall that z is analytic on $\Theta^c$ . Then, $\lambda h+(1-\lambda)k\in\Theta^c$ for all sufficiently small $\lambda$ as $\Theta^c$ is open and by taking the second derivative of $z(\lambda h+(1-\lambda)k)=\lambda z(h)+(1-\lambda)z(k)$ with respect to $\lambda$ and sending $\lambda$ to zero we find $(h-k)\cdot J(k)(h-k)=0$ , $J(k)$ being the Hessian matrix of z at k. By part (a) of proposition 4, the condition $(h-k)\cdot J(k)(h-k)=0$ is tantamount to

It follows from here that $(h-k)\cdot[\,f(s)-\nu(k)s]=0$ whenever $s\in\mathcal{S}$, which gives $(h-k)\cdot[\,f(s)/s-f(\sigma)/\sigma]=0$ for each s and $\sigma$ in $\mathcal{S}$. By setting $\sigma$ equal to $s_o$ if $\ell=-\infty$ or by sending $\sigma$ to infinity if $\ell>-\infty$, we get $(h-k)\cdot[\,f(s)-rs]=0$ for all $s\in\mathcal{S}$. The same conclusion is achieved by exchanging k with h if $h\in\Theta^c$.

It remains to tackle the case where both h and k belong to $\Theta$. In this case, $\ell>-\infty$ because $\Theta\ne\emptyset$. Moreover, $z(h)=h\cdot r+\ell$ and $z(k)=k\cdot r+\ell$, so that the above condition $z(h)-z(k)=w\cdot(h-k)$ becomes $(w-r)\cdot(h-k)=0$. Since $h\in\Theta$, $w\in\partial z(h)$, and $w\ne r$, proposition 4 tells us that necessarily there exists $\alpha>0$ such that $w=(1-\alpha)r+\alpha\nu(h)$. Similarly, there exists $\beta>0$ such that $w=(1-\beta)r+\beta\nu(k)$. These two identities combined with $(w-r)\cdot(h-k)=0$ yield $[\nu(h)-r]\cdot(h-k)=[\nu(k)-r]\cdot(h-k)=0$. Write $\bar{f}(s):=f(s)-rs$ for all s, $\bar{\nu}(k):=\nu(k)-r$, and $\bar{\nu}(h):=\nu(h)-r$ for brevity. Let $\mathcal{S}_+$ and $\mathcal{S}_-$ be the subsets of $\mathcal{S}$ where $(h-k)\cdot\bar{f}(s)\geqslant 0$ and $(h-k)\cdot\bar{f}(s)\leqslant 0$, respectively. Using first the fact that $(h-k)\cdot\bar{\nu}(h)=0$, namely $\sum\nolimits_{s\geqslant 1}(h-k)\cdot\bar{f}(s)\,{\rm e}^{h\cdot \bar{f}(s)+v(s)-\ell s}\,p(s)=0$, and later the fact that $(h-k)\cdot\bar{\nu}(k)=0$, namely $\sum\nolimits_{s\geqslant 1}(h-k)\cdot\bar{f}(s)\,{\rm e}^{k\cdot \bar{f}(s)+v(s)-\ell s}\,p(s)=0$, we get

This bound can be recast as

which shows that $(h-k)\cdot\bar{f}(s)=0$ for each $s\in\mathcal{S}_+$. A similar argument gives $(h-k)\cdot\bar{f}(s)=0$ for every $s\in\mathcal{S}_-$, so that $(h-k)\cdot\bar{f}(s)=(h-k)\cdot[\,f(s)-rs]=0$ for all $s\in\mathcal{S}$.

F.2. Part (b)

Suppose that there exists $k\in\Theta^c$ such that $r\in\partial z(k)=\{\nu(k)\}$ and bear in mind that $z(k)-k\cdot r>\ell$ and $\sum\nolimits_{s\geqslant 1}{\rm e}^{k\cdot f(s)+v(s)-z(k) s}\,p(s)=1$ for such k by lemma 1. The condition $\nu(k)=r$ is tantamount to $\sum\nolimits_{s\geqslant 1}\bar{f}(s)\,{\rm e}^{k\cdot f(s)+v(s)-z(k) s}\,p(s)=0$ with $\bar{f}(s):=f(s)-rs$ for all s. This way, making use of the bound $z(k)-k\cdot r>\ell$ first and of the bound ${\rm e}^y\geqslant 1+y$ valid for all $y\in\mathbb{R}$ later, we find for each $h\in\mathbb{R}^d$

It follows from here that $\Theta=\emptyset$.

Let h be a point in $\mathbb{R}^d=\Theta^c$. If the condition $(h-k)\cdot[\,f(s)-rs]=0$ for every $s\in\mathcal{S}$ is satisfied, then it is immediate to verify that $\nu(h)=\nu(k)=r$. If instead $r=\nu(h)$, then $z(\lambda h+(1-\lambda)k)=\lambda z(h)+(1-\lambda)z(k)$ for all $\lambda\in[0,1]$ as before. This way, by taking the second derivative with respect to $\lambda$ and by repeating the previous arguments, we find $(h-k)\cdot[\,f(s)-rs]=0$ for all $s\in\mathcal{S}$.

F.3. Part (c)

Suppose that $w=r$ and that $k\in\Theta$. The latter in particular means that $\Theta\ne\emptyset$. Pick $h\in\mathbb{R}^d$. If $h\in\Theta$, then $r\in\partial z(h)$ by parts (b) and (c) of proposition 4. If $r\in\partial z(h)$ and $h\in\Theta^c$, then $\Theta=\emptyset$ by part (b), which is a contradiction. This way, $r\in\partial z(h)$ implies $h\in\Theta$.

Appendix G.: Proof of lemma 5

The lemma is due to part (c) of theorem 1 and the fact that $\inf\nolimits_{w\in F}\{I(w)\}>0$ since $F\cap\partial z(0)=\emptyset$. The latter is obvious if $I(w)\geqslant 1$ for all $w\in F$. If instead $I(w)<1$ for some $w\in F$, then the nonempty set $K:=\{w\in F:I(w)\leqslant 1\}$ is compact because I is a good rate function and, as a consequence, I attains a minimum over K by lower semicontinuity. This means that there exists $u\in K$ such that $I(w)\geqslant I(u)$ for all $w\in K$, and $I(w)\geqslant I(u)$ for all $w\in F$ follows as $I(u)\leqslant 1$. We find $\inf\nolimits_{w\in F}\{I(w)\}=I(u)>0$ because $u\notin\partial z(0)$ when $u\in K$.

Appendix H.: Proof of theorem 4

H.1. Part (b)

We prove part (b) first. We already know from lemma 5 that if $\partial z(0)$ is a singleton, then $W_t/t$ converges exponentially to $\rho:=r$ when $\theta(0)\leqslant 1$ and to $\rho:=\nu(0)$ when $\theta(0)>1$. Conversely, if (10) holds for a fixed $\delta>0$ and the corresponding $\lambda>0$, then part (b) of theorem 1 shows that $-I(w)\leqslant \liminf\nolimits_{t\uparrow\infty}(1/t)\ln\mathbb{P}_t^c[\|W_t/t-\rho\|>\delta]\leqslant-\lambda$ whenever $\|w-\rho\|>\delta$. This implies that if $w\in\partial z(0)$, so that $I(w)=0$, then $\|w-\rho\|\leqslant\delta$ and the arbitrariness of $\delta$ gives $w=\rho$.

H.2. Part (a)

Part (a) follows from part (b) when $\partial z(0)$ is a singleton, so that it remains to verify part (a) when z is not differentiable at the origin. Set $p_o(s):={\rm e}^{v(s)-\ell s}\,p(s)$ for all $s\geqslant 1$. Proposition 4 states that necessary conditions for z not to be differentiable at the origin are $\ell>-\infty$, $\theta(0)=\sum\nolimits_{s\geqslant 1}p_o(s)=1$, and $\sum\nolimits_{s\geqslant 1}s\,p_o(s)<\infty$ as $z(0)=\ell$ by lemma 1 when $\theta(0)=1$. Let us consider for a moment a new probability space $(\Omega_o,\mathcal{F}_o,\mathbb{P}_o)$ where a sequence $\{S_i\}_{i\geqslant 1}$ of independent waiting times distributed according to the new distribution $p_o$ is given. Denoting by $\mathbb{E}_o$ the expectation under $\mathbb{P}_o$, we have $\mathbb{E}_o[S_1]=\sum\nolimits_{s\geqslant 1} s\,p_o(s)<\infty$ and $\mathbb{E}_o[\|f(S_1)\|]<\infty$ since $\|f(S_1)\|\leqslant MS_1$ with some positive constant $M<\infty$ and full probability by assumption 3. We observe that $\mathbb{E}_o[\,f(S_1)]/\mathbb{E}_o[S_1]=\nu(0)$. The probability space $(\Omega_o,\mathcal{F}_o,\mathbb{P}_o)$ fulfills the following important properties: $\lim\nolimits_{t\uparrow\infty}\mathbb{E}_o[U_t]=1/\mathbb{E}_o[S_1]$ and

Equation (H.1)

for any $\delta>0$, $N_t$ being the number of renewals by t. Since $\sum\nolimits_{s\geqslant 1} p_o(s)=1$, the limit $\lim\nolimits_{t\uparrow\infty}\mathbb{E}_o[U_t]=1/\mathbb{E}_o[S_1]$ is established by applying the renewal theorem (see [57], theorem 1 in Chapter XIII.10) to the renewal equation $\mathbb{E}_o[U_t]=\sum\nolimits_{s=1}^tp_o(s)\,\mathbb{E}_o[U_{t-s}]$ valid for every $t\geqslant 1$. This equation is deduced by conditioning on $T_1=S_1$ and then by using the fact that a renewal process starts over at every renewal. The limit (H.1) is due to the strong law of large numbers. In fact, the strong law of large numbers tells us that $\lim\nolimits_{n\uparrow\infty}(1/n)\sum\nolimits_{i=1}^n S_i=\mathbb{E}_o[S_1]$ and $\lim\nolimits_{n\uparrow\infty}(1/n)\sum\nolimits_{i=1}^n f(S_i)=\mathbb{E}_o[\,f(S_1)]$ $\mathbb{P}_o$-almost surely since $\mathbb{E}_o[S_1]<\infty$ and $\mathbb{E}_o[\|f(S_1)\|]<\infty$. On the other hand, $\lim\nolimits_{t\uparrow\infty}N_t=\infty$ $\mathbb{P}_o$-almost surely because the event where one of the waiting times is infinite has probability zero with respect to the probability measure $\mathbb{P}_o$. This way, we get $\lim\nolimits_{t\uparrow\infty}\sum\nolimits_{i=1}^{N_t} f(S_i)/\sum\nolimits_{i=1}^{N_t} S_i=\mathbb{E}_o[\,f(S_1)]/\mathbb{E}_o[S_1]=\nu(0)$ $\mathbb{P}_o$-almost surely and (H.1) follows from the fact that almost sure convergence implies convergence in probability.
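The renewal equation and the limit $1/\mathbb{E}_o[S_1]$ can be illustrated numerically. The sketch below is only an illustration under assumptions: the waiting-time law `p_o` is a hypothetical aperiodic distribution with finite support, and the convention that a renewal occurs at time $0$ (so that the value at $t=0$ equals $1$) is assumed for the recursion.

```python
def renewal_mass(p_o, t_max):
    """Solve u[t] = sum_{s=1}^{t} p_o(s) u[t-s] for t >= 1, with u[0] = 1.

    u[t] stands for the expectation of U_t, i.e. the probability that
    some renewal falls exactly at time t.
    """
    u = [1.0] + [0.0] * t_max
    for t in range(1, t_max + 1):
        u[t] = sum(p_o.get(s, 0.0) * u[t - s] for s in range(1, t + 1))
    return u

# Hypothetical waiting-time law (not from the paper); it sums to one.
p_o = {1: 0.2, 2: 0.5, 3: 0.3}
mean_wait = sum(s * q for s, q in p_o.items())   # expected waiting time

u = renewal_mass(p_o, 200)
print(u[200], 1.0 / mean_wait)
```

For this toy law the recursion settles quickly, and the two printed values should coincide to many digits, in line with the renewal theorem.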

The features of the probability space $(\Omega_o,\mathcal{F}_o,\mathbb{P}_o)$ allow us to prove the theorem as follows. The event $U_t=1$ with $t\geqslant 1$ is tantamount to the condition that an integer $n\geqslant 1$ exists so that $T_n=t$, which in particular yields $N_t=n$. Then, observing that $\prod_{i=1}^n {\rm e}^{v(s_i)}p(s_i)={\rm e}^{\ell t}\prod\nolimits_{i=1}^n p_o(s_i)$ whenever $s_1+\cdots+s_n=t$, for any Borel set $\mathcal{B}$ in $\mathbb{R}^d$ we have

Equation (H.2)

Equation (H.3)

The identity (H.2) with $\mathcal{B}=\mathbb{R}^d$ gives $Z_t^c={\rm e}^{\ell t}\,\mathbb{E}_o[U_t]$, which shows that ${\rm e}^{\ell t}\leqslant 2\mathbb{E}_o[S_1]Z_t^c$ for all sufficiently large t because of the limit $\lim\nolimits_{t\uparrow\infty}\mathbb{E}_o[U_t]=1/\mathbb{E}_o[S_1]$. This way, using ${\rm e}^{\ell t}\leqslant 2\mathbb{E}_o[S_1]Z_t^c$ in the bound (H.3) specialized to the closed set $\mathcal{B}:=\{w\in\mathbb{R}^d:\|w-\nu(0)\|\geqslant\delta\}$ and dividing by $Z_t^c$, we find that for each $\delta>0$ and all sufficiently large t

We obtain $\lim\nolimits_{t\uparrow\infty}\,\mathbb{P}_t^c[\|W_t/t-\nu(0)\|\geqslant \delta]=0$ from here thanks to (H.1).
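The change of measure behind $Z_t^c={\rm e}^{\ell t}\,\mathbb{E}_o[U_t]$ can be checked on a toy example. This is a hedged sketch, not the construction of the proof: it assumes that $Z_t^c$ is the sum over all $n\geqslant 1$ and all $s_1+\cdots+s_n=t$ of $\prod_{i}{\rm e}^{v(s_i)}p(s_i)$, the law `p_o` is hypothetical, and the constant `c` merely stands in for $\ell$ (the product identity holds for any constant, so a finite-support example suffices).

```python
import math

def constrained_sum(weight, t_max):
    """Z[t] = sum over n >= 1 and s_1 + ... + s_n = t of prod_i weight(s_i); Z[0] = 1."""
    Z = [1.0] + [0.0] * t_max
    for t in range(1, t_max + 1):
        Z[t] = sum(weight.get(s, 0.0) * Z[t - s] for s in range(1, t + 1))
    return Z

c = 0.7                                   # hypothetical constant standing in for ell
p_o = {1: 0.2, 2: 0.5, 3: 0.3}            # hypothetical tilted law, sums to one
# weight(s) plays the role of e^{v(s)} p(s) = e^{c s} p_o(s)
weight = {s: math.exp(c * s) * q for s, q in p_o.items()}
mean_wait = sum(s * q for s, q in p_o.items())

t_max = 40
Z = constrained_sum(weight, t_max)        # stands for Z_t^c
u = constrained_sum(p_o, t_max)           # stands for the expectation of U_t
ratios = [Z[t] / (math.exp(c * t) * u[t]) for t in range(1, t_max + 1)]
print(min(ratios), max(ratios))
```

Both printed values should be close to one. The same arrays also let one inspect the bound ${\rm e}^{\ell t}\leqslant 2\mathbb{E}_o[S_1]Z_t^c$, which in this example fails at $t=1$ and holds for larger $t$, consistent with the qualifier "for all sufficiently large t".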

Footnotes

  • Assume that there are at most $d_o<d$ linearly independent vectors of the form $f(s_1)-rs_1,\ldots,f(s_{d_o})-rs_{d_o}$ with $s_1,\ldots,s_{d_o}$ in $\mathcal{S}$. By expanding $f(s)-rs$ on these vectors, we can define a function $f_o:\{1,2,\ldots\}\cup\{\infty\}\to\mathbb{R}^{d_o}$ with the property that $f(s)-rs=Af_o(s)$ for all $s\in\mathcal{S}$, where $A\in\mathbb{R}^{d\times d_o}$ is a matrix whose $l$th column is $f(s_l)-rs_l$. By construction, the matrix A has rank $d_o$, the values of $f_o$ on $\mathcal{S}$ are uniquely determined, and $f_o(s_1),\ldots,f_o(s_{d_o})$ is the canonical basis of $\mathbb{R}^{d_o}$. Moreover, $f(s_o)/s_o=0$ if $\mathcal{S}$ is finite or $f_o(s)/s$ has limit 0 when s goes to infinity through $\mathcal{S}$ if $\mathcal{S}$ is infinite. It follows that $A^{\rm T}A\in\mathbb{R}^{d_o\times d_o}$ is invertible, where $A^{\rm T}$ denotes the transpose of A, and that the affine hull of $\{\,f_o(s)/s\}_{s\in\mathcal{S}}$ is $\mathbb{R}^{d_o}$. Let z and I be the functions that (1) and (2) associate to f and let $z_o$ and $I_o$ be the functions that (1) and (2) associate to $f_o$. It is a simple exercise based solely on definitions (1) and (2) to verify that $I(w)= I_o(A_ow+a_o)$ for all $w\in{\rm dom}\,I$, where $A_o:=(A^{\rm T}A){}^{-1}A^{\rm T}$ and $a_o:=-A_or$. This way, the rate function I can be computed starting from the rate function $I_o$ associated with a set $\{\,f_o(s)/s\}_{s\in\mathcal{S}}$ of full dimension.
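The change of variable $w\mapsto A_ow+a_o$ can be made concrete with a small computation. The sketch below is only an illustration under assumptions: the matrix `A` (with $d=3$ and $d_o=2$), the vector `r`, and the coordinates `y` are hypothetical, and `left_inverse` is a helper written for this example that handles two-column matrices only. It checks that a point of the form $w=r+Ay$ is sent back to $y$.

```python
def left_inverse(A):
    """Return A_o = (A^T A)^{-1} A^T for a d x 2 matrix A of rank 2 (list of rows)."""
    g11 = sum(row[0] * row[0] for row in A)
    g12 = sum(row[0] * row[1] for row in A)
    g22 = sum(row[1] * row[1] for row in A)
    det = g11 * g22 - g12 * g12           # nonzero because A has rank 2
    inv = [[g22 / det, -g12 / det], [-g12 / det, g11 / det]]
    # Entry (i, j) of inv @ A^T is sum_l inv[i][l] * A[j][l].
    return [[inv[i][0] * row[0] + inv[i][1] * row[1] for row in A] for i in range(2)]

def apply(M, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# Hypothetical data (not from the paper): d = 3, d_o = 2.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
r = [0.5, -1.0, 2.0]
y = [0.3, -0.7]

A_o = left_inverse(A)
a_o = [-c for c in apply(A_o, r)]                      # a_o := -A_o r
w = [ri + ai for ri, ai in zip(r, apply(A, y))]        # w = r + A y
y_back = [u + s for u, s in zip(apply(A_o, w), a_o)]   # A_o w + a_o
print(y_back)
```

The recovered vector should match `y` up to rounding, since $A_oA$ is the identity on $\mathbb{R}^{d_o}$ whenever A has full column rank.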
