1 Introduction

A standard capacity expansion model, which is a special case of the model studied by Kobila [34], can be described as follows. We model market uncertainty by means of the geometric Brownian motion given by

$$\begin{aligned} dX_t^0 = bX_t^0 \, dt + \sqrt{2} \sigma X_t^0 \, dW_t , \quad X_0^0 = x >0 , \end{aligned}$$
(1)

for some constants b and \(\sigma \ne 0\), where W is a standard one-dimensional Brownian motion. The random variable \(X_t^0\) can represent an economic indicator such as the price of or the demand for one unit of a given investment project’s output at time t. The firm behind the project can invest additional capital at proportional costs at any time, but cannot disinvest from the project. We denote by y the project’s initial capital at time 0 and by \(\zeta _t\) the total additional capital invested by time t. We assume that there is no capital depreciation, so the total capital invested at time t is

$$\begin{aligned} Y_t = y + \zeta _t , \quad Y_0 = y \ge 0 . \end{aligned}$$
(2)

The investor’s objective is to maximise the total expected discounted payoff resulting from the project’s management, which is given by the performance index

$$\begin{aligned} J_{x,y}^0 (\zeta ) = {\mathbb {E}}\left[ \int _0^\infty e^{-rt} h(X_t^0, Y_t) \, dt - K \int _{[0, \infty [} e^{-rt} \, d\zeta _t \right] , \end{aligned}$$
(3)

over all capacity expansion strategies \(\zeta \). The discounting rate \(r>0\) and the cost of each additional unit of capital \(K>0\) are constants, while h is an appropriate running payoff function.

Under suitable assumptions on the problem data, the solution to this stochastic control problem is characterised by a threshold given by a strictly increasing free-boundary function \(G^0: {\mathbb R}_+ \rightarrow {\mathbb R}_+\). In the special case that arises when \(h(x,y) = x^\alpha y^\beta \), for some \(\alpha > 0\) and \(\beta \in ]0,1[\), namely, when h is a so-called Cobb-Douglas production function,

$$\begin{aligned} G^0 (y) = \left( \frac{rK (\alpha - m)}{-m\beta } \right) ^\frac{1}{\alpha } y^\frac{1-\beta }{\alpha } \quad \text {for } y \ge 0 , \end{aligned}$$

where \(m < 0\) is an appropriate constant. If the initial condition \((x,y)\) is strictly below the graph of the function \(G^0\) in the x-y plane, then it is optimal to invest so that the joint process \((X^0,Y)\) has a jump at time 0 that positions it on the graph of \(G^0\). Otherwise, it is optimal to take minimal action so that the state process \((X^0,Y)\) does not fall below the graph of \(G^0\), which amounts to reflecting it in \(G^0\) in the positive y-direction.
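
For readers who wish to check the Cobb-Douglas threshold numerically, the following Python sketch evaluates \(G^0\); all constants here are assumptions chosen purely for illustration, and m is computed as the negative root of the quadratic (11)–(12) introduced in Sect. 3.

```python
import math

# Illustrative constants (assumed for this sketch, not taken from the text)
b, sigma, r, K = 0.05, 0.3, 0.1, 2.0
alpha, beta = 0.5, 0.8  # Cobb-Douglas exponents with beta in ]0,1[

# m is the negative root of sigma^2 l^2 + (b - sigma^2) l - r = 0, cf. (11)-(12)
disc = math.sqrt((b - sigma**2) ** 2 + 4 * sigma**2 * r)
m = (-(b - sigma**2) - disc) / (2 * sigma**2)

def G0(y):
    # free-boundary function of the no-impact model for h(x,y) = x^alpha y^beta
    return (r * K * (alpha - m) / (-m * beta)) ** (1 / alpha) * y ** ((1 - beta) / alpha)
```

Since \((1-\beta )/\alpha > 0\), the computed boundary is strictly increasing in y, in line with the description above.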

Irreversible capacity expansion models have attracted considerable interest and can be traced back to Manne [38] (see Van Mieghem [47] for a survey). Models more closely related to this paper have been studied by several authors in the economics literature: see Dixit and Pindyck [17, Chapter 11] and references therein. Related models that have been studied in the mathematics literature include Davis, Dempster, Sethi and Vermes [13], Arntzen [4], Øksendal [42], Wang [48], Chiarolla and Haussmann [11], Bank [6], Alvarez [2, 3], Løkka and Zervos [35], Steg [45], Chiarolla and Ferrari [9], De Angelis, Federico and Ferrari [15], and references therein. Furthermore, capacity expansion models with costly reversibility were introduced by Abel and Eberly [1], and were further studied by Guo and Pham [22], Merhi and Zervos [40], Guo and Tomecek [23, 24], Guo, Kaminsky, Tomecek and Yuen [21], Løkka and Zervos [36], De Angelis and Ferrari [16], and Federico and Pham [19].

In the model that we have briefly discussed above, additional investment does not influence the underlying economic indicator, which is unrealistic if one considers supply and demand issues. The nature of the optimal strategy is such that, if \(b < \sigma ^2\), then \(\lim _{t \rightarrow \infty } X_t^0 = 0\) and the investment’s maximal optimal capacity level remains finite for realistic choices of the problem data. On the other hand, if \(b \ge \sigma ^2\), then \(\limsup _{t \rightarrow \infty } X_t^0 = \infty \) and the optimal capacity level typically converges to \(\infty \) as \(t \rightarrow \infty \). (Note that, with the normalisation \(\sqrt{2} \sigma \) of the noise in (1), the process \(X^0\) satisfies \(\ln X_t^0 = \ln x + (b - \sigma ^2) t + \sqrt{2} \sigma W_t\), so \(b = \sigma ^2\) is the relevant threshold.)

The model that we study here assumes that additional investment has a strictly negative effect on the value of the underlying economic indicator process X. We assume that increasing the project’s capacity by a very small amount \(\Delta \zeta _t = \varepsilon \) at time t affects the process X linearly, namely,

$$\begin{aligned} \Delta X_t \equiv X_{t+} - X_t = - c \varepsilon X_t \quad \Rightarrow \quad X_{t+} = (1 - c \varepsilon ) X_t \simeq e^{- c \varepsilon } X_t , \end{aligned}$$

for some constant \(c>0\), where we have taken X to be càglàd. Furthermore, we assume that increasing the project’s capacity by an amount \(\Delta \zeta _t > 0\) at time t has the same effect on the process X as increasing the project’s capacity N times in immediate succession, each time by an amount \(\Delta \zeta _t / N\), for every choice of N, which gives rise to the identities

$$\begin{aligned} X_{t+} = e^{- c (\Delta \zeta _t / N) N} X_t = e^{- c \Delta \zeta _t} X_t . \end{aligned}$$

These considerations suggest the modelling of market uncertainty by the solution to the SDE

$$\begin{aligned} dX_t = bX_t \, dt - X_t \circ d\zeta _t + \sqrt{2} \sigma X_t \, dW_t , \quad X_0 = x >0 , \end{aligned}$$
(4)

where

$$\begin{aligned} \int _0^t X_s \circ d\zeta _s = c \int _0^t X_s \, d\zeta _s^\mathrm {c}+ \sum _{0 \le s < t} X_s \bigl ( 1 - e^{-c \Delta \zeta _s} \bigr ) , \end{aligned}$$
(5)

where \(\zeta ^\mathrm {c}\) denotes the continuous part of the increasing process \(\zeta \). At this point, it is worth noting that Guo and Zervos [25] have considered the same state dynamics in the optimal execution problem that they study. The objective is to maximise over all admissible capacity expansion strategies \(\zeta \) the performance criterion

$$\begin{aligned} J_{x,y} (\zeta ) = {\mathbb {E}}\left[ \int _0^\infty e^{-rt} h(X_t,Y_t) \, dt - K \int _{[0, \infty [} e^{-rt} \, d\zeta _t \right] , \end{aligned}$$
(6)

where \(r, K > 0\) are constants and the running payoff function h satisfies Assumption 1 in the next section.

The solution to this problem is again characterised by a threshold defined by a strictly increasing free-boundary function G. Informally, the optimal strategy can be described in the same way as the one in the problem defined by (1)–(3). However, reflection in the free-boundary G is oblique rather than in the positive y-direction (see Figs. 1, 2, 3). Furthermore, the negative effect that additional investment has on the underlying economic indicator X results in a maximal optimal capacity level that is bounded in cases of special interest, such as the ones arising, e.g., when the running payoff function h is a Cobb-Douglas production function (see Example 2).

Fig. 1
figure 1

Graph of the free-boundary function G in the general context

Fig. 2
figure 2

Graph of G when h is a Cobb–Douglas function with \(\beta \in ]0,1[\)

Fig. 3
figure 3

Graph of G when h is a Cobb–Douglas function with \(\beta = 1\)

From a stochastic control theoretic perspective, the problem that we solve has the features of singular stochastic control, which was introduced by Bather and Chernoff [7] who considered a simplified model of spaceship control. In their seminal paper, Beneš, Shepp and Witsenhausen [8] were the first to solve rigorously an example of a finite-fuel singular control problem. Since then, the area has attracted considerable interest in the literature. Apart from references that we have discussed in the context of capacity expansion models, Bahlali et al. [5], Chiarolla and Haussmann [10], Chow, Menaldi and Robin [12], Davis and Zervos [14], Fleming and Soner [20, Chapter VIII], Haussmann and Suo [27, 28], Harrison and Taksar [26], Jack, Johnson and Zervos [29], Jacka [30, 31], Karatzas [32], Ma [37], Menaldi and Robin [39], Øksendal [42], Shreve et al. [43], Soner and Shreve [44], Sun [46] and Zhu [49], provide an alphabetically ordered list of further contributions.

In the references discussed above, the controlled process affects the state dynamics in a purely additive way: the change of the state process due to control action does not depend on the state process itself. Singular stochastic control models in which changes of the state process due to control action may depend on the state process were introduced and studied by Dufour and Miller [18] and Motta and Sartori [41]. To the best of our knowledge, problems with state dynamics such as the ones given by (4)–(5) have not been considered in the literature before. Furthermore, the problem that we solve is the very first one in the singular stochastic control literature that involves control action that does not affect the state dynamics in a purely additive way and admits an explicit solution (see also Remark 1 in the next section).

2 Problem Formulation and Assumptions

We fix a probability space \((\Omega , {\mathcal F},{\mathbb {P}})\) equipped with a filtration \(({\mathcal F}_t)\) satisfying the usual conditions of right continuity and augmentation by \({\mathbb {P}}\)-negligible sets, and carrying a standard one-dimensional \(({\mathcal F}_t)\)-Brownian motion W. We denote by \(\mathcal Z\) the family of all càglàd \(({\mathcal F}_t)\)-adapted increasing processes \(\zeta \) such that \(\zeta _0 = 0\).

The state space of the control problem that we study is defined by

$$\begin{aligned} {\mathcal S} = \bigl \{ (x,y) \in {\mathbb R}^2 \mid \ x>0 \text { and } 0 \le y \le \bar{y} \bigr \} , \end{aligned}$$

where \(\bar{y} \in ]0, \infty ]\) is the maximal capital that can be invested in the project, namely, the maximum capacity level that can be achieved. Given a capacity expansion process \(\zeta \in {\mathcal Z}\), we consider the capacity process Y defined by (2) and the economic indicator process X given by (4)–(5). Using Itô’s formula, we can verify that

$$\begin{aligned} X_t = X_t^0 e^{-c \zeta _t^\mathrm {c}} \prod _{0 \le s < t} e^{-c \Delta \zeta _s} = X_t^0 e^{-c\zeta _t} , \end{aligned}$$
(7)

where \(X^0\) is the geometric Brownian motion defined by (1).
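
The closed-form expression (7) can be checked against a direct Euler discretisation of the SDE (4)–(5). The sketch below is an illustration under assumed parameter values and an assumed strategy \(\zeta \) (continuous investment at unit rate on [0, 1/2] plus one lump investment of size 0.3 at time 3/4): it applies the jump rule \(X_{t+} = e^{-c \Delta \zeta _t} X_t\) and compares the terminal value with \(X_T^0 e^{-c \zeta _T}\).

```python
import math
import random

random.seed(42)
# Illustrative parameters (assumed)
b, sigma, c, x = 0.05, 0.2, 0.5, 1.0
T, n = 1.0, 100_000
dt = T / n

X = x        # Euler scheme for the controlled indicator (4)
W = 0.0      # Brownian path, reused for the closed form below
zeta = 0.0   # cumulative investment
for i in range(n):
    dzc = dt if i < n // 2 else 0.0   # continuous part of zeta: unit rate on [0, 1/2]
    dW = random.gauss(0.0, math.sqrt(dt))
    X += b * X * dt - c * X * dzc + math.sqrt(2) * sigma * X * dW
    W += dW
    zeta += dzc
    if i == 3 * n // 4:               # lump investment of 0.3 at t = 3/4
        X *= math.exp(-c * 0.3)       # jump rule from (5)
        zeta += 0.3

# closed form (7): X_T = X_T^0 e^{-c zeta_T}, with X^0 the GBM solving (1)
X0_T = x * math.exp((b - sigma**2) * T + math.sqrt(2) * sigma * W)
rel_err = abs(X - X0_T * math.exp(-c * zeta)) / (X0_T * math.exp(-c * zeta))
```

The Euler error vanishes as the step size shrinks; on this grid the relative discrepancy between the two computations should be well below one percent.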

Definition 1

The set \(\mathcal A\) of all admissible capacity expansion strategies is the family of all processes \(\zeta \in {\mathcal Z}\) such that

$$\begin{aligned} {\mathbb {E}}\left[ \int _{[0, \infty [} e^{-rt} \, d\zeta _t \right] < \infty . \end{aligned}$$
(8)

\( \Box \)

The objective of the control problem is to maximise the performance index \(J_{x,y}\) defined by (6) over all admissible strategies \(\zeta \in {\mathcal A}\), for each initial condition \((x,y) \in {\mathcal S}\). Accordingly, we define the problem’s value function v by

$$\begin{aligned} v(x,y) = \sup _{\zeta \in {\mathcal A}} J_{x,y} (\zeta ) , \quad \text {for } (x,y) \in {\mathcal S} . \end{aligned}$$
(9)

Remark 1

In view of (7), we can see that the stochastic optimisation problem we solve is equivalent to maximising

$$\begin{aligned} J_{x,y} (\zeta ) = {\mathbb {E}}\left[ \int _0^\infty e^{-rt} h(e^{cy} X_t^0 e^{-cY_t} ,Y_t) \, dt - K \int _{[0, \infty [} e^{-rt} \, d\zeta _t \right] \end{aligned}$$

over all admissible strategies \(\zeta \in {\mathcal A}\), where the dynamics of the state process \((X^0, Y)\) are given by (1)–(2). At first glance, this observation puts us in the context of the standard singular stochastic control theory because control action affects the dynamics of \((X^0, Y)\) in a purely additive way. However, such a reformulation is of limited theoretical value because the problem’s initial condition y enters non-trivially in the description of the performance criterion, which is a situation that is typically associated with time-inconsistent control problems. \(\Box \)

Our analysis involves the general solution to the second-order Euler ODE

$$\begin{aligned} \sigma ^2 x^2 u''(x) + bx u' (x) - ru(x) = 0 , \end{aligned}$$
(10)

which is given by

$$\begin{aligned} u(x) = A x^n + B x^m , \end{aligned}$$

for some \(A, B \in {\mathbb R}\), where the constants \(m<0<n\) are the solutions to the quadratic equation

$$\begin{aligned} \sigma ^2 \lambda ^2 + ( b-\sigma ^2 ) \lambda - r =0 , \end{aligned}$$
(11)

given by

$$\begin{aligned} m,n = \frac{-(b-\sigma ^2) \pm \sqrt{(b-\sigma ^2 )^2 + 4 \sigma ^2 r}}{2\sigma ^2} . \end{aligned}$$
(12)
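
The two roots can be computed and verified directly; the coefficient values below are assumptions chosen for illustration (any b, \(\sigma \ne 0\) and \(r > 0\) give \(m< 0 < n\), because the product of the roots equals \(-r/\sigma ^2 < 0\)).

```python
import math

# Sample coefficients (assumed for illustration)
b, sigma, r = 0.05, 0.3, 0.1

def quadratic(lam):
    # left-hand side of (11)
    return sigma**2 * lam**2 + (b - sigma**2) * lam - r

disc = math.sqrt((b - sigma**2) ** 2 + 4 * sigma**2 * r)
n_root = (-(b - sigma**2) + disc) / (2 * sigma**2)
m_root = (-(b - sigma**2) - disc) / (2 * sigma**2)
```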

Our analysis also involves the function H defined by

$$\begin{aligned} H(x,y) = h_y (x,y) - cx h_x (x,y) - rK , \quad \text {for } x>0 \text { and } y \in ]0,\bar{y}[ . \end{aligned}$$
(13)

This function has a natural economic interpretation. Indeed, increasing capacity by a small amount \(\varepsilon > 0\) causes the joint process (XY) to jump from a value (xy) to the value \((x - cx \varepsilon , y + \varepsilon )\). Noting that

$$\begin{aligned} h(x - cx \varepsilon , y + \varepsilon ) - h(x,y) \simeq \bigl [ h_y (x,y) - cx h_x (x,y) \bigr ] \varepsilon \quad \text {and} \quad K = \int _0^\infty e^{-rt} rK \, dt , \end{aligned}$$

we can see that \(H(x,y)\) represents the project’s marginal running payoff rate in excess of the marginal cost of capital rate. In view of standard economics theory, this interpretation suggests that (a) the function \(H(\cdot , y)\) should be increasing for all \(y \ge 0\) because higher values of the underlying economic indicator X, which models the price of or the demand for one unit of the project’s output, should reflect higher values of marginal running payoff, and (b) the function \(H(x, \cdot )\) should be decreasing for all \(x>0\) because the project’s payoff rate should be concave in the volume of its output due to the balancing of supply and demand. These observations suggest requirements (17)–(19) in the following assumption. In fact, the conditions reflected by (17)–(19) are much weaker than the ones suggested by the above considerations. However, the relaxations involved present no added complications in our analysis whatsoever. The underlying economics theory also suggests that the running payoff function h should be increasing in the value of the underlying economic indicator X for each fixed value of the project’s capacity, which is captured by condition (14). The rest of the conditions appearing in the following assumption, which is admittedly rather long to state, are of a purely technical nature. It is worth noting that (15) is equivalent to the probabilistic condition

$$\begin{aligned} {\mathbb {E}}\left[ \int _0^\infty e^{-rt} \left| h(X_t^0, y) \right| dt \right] < \infty \quad \text {for all } x > 0 \text { and } y \in [0, \bar{y}] \cap {\mathbb R}\end{aligned}$$

(see (77)–(78) in Appendix 2).

Assumption 1

The constants r, K are strictly positive, the function h is \(C^3\),

$$\begin{aligned}&h(\cdot , y) \text { is increasing for all } y \in [0, \bar{y}] \cap {\mathbb R}, \end{aligned}$$
(14)
$$\begin{aligned}&\int _0^x s^{-m-1} \left| h(s,y) \right| ds + \int _x^\infty s^{-n-1} \left| h(s,y) \right| ds < \infty \quad \text {for all } x > 0 \text { and } y \in [0, \bar{y}] \cap {\mathbb R}. \nonumber \\ \end{aligned}$$
(15)

There exists a point \(x_0 \ge 0\) and a continuous strictly increasing function \(y^\dagger : ]x_0, \infty [ \rightarrow {\mathbb R}_+\) such that

$$\begin{aligned} 0 \le y_0 := \lim _{x \downarrow x_0} y^\dagger (x) < \lim _{x \rightarrow \infty } y^\dagger (x) =: y_\infty \le \bar{y} , \quad y_0 = 0 \text { if } x_0 > 0 , \end{aligned}$$
(16)
$$\begin{aligned} H(x,y) {\left\{ \begin{array}{ll} < 0 , &{} \text {if } (x,y) \in {\mathcal H}_- ,\\ = 0 , &{} \text {if } (x,y) \in {\mathcal S} \setminus ({\mathcal H}_- \cup {\mathcal H}_+) , \\ > 0 , &{} \text {if } (x,y) \in {\mathcal H}_+ , \end{array}\right. } \end{aligned}$$
(17)
$$\begin{aligned} \liminf _{x \rightarrow \infty } H(x,y) > 0 \quad \text {for all } y \in ]y_0, y_\infty [ , \end{aligned}$$
(18)
$$\begin{aligned} \text{ the } \text{ function } H(x, \cdot ) \text { is strictly decreasing for all } y \in ]y_0, y_\infty [ , \end{aligned}$$
(19)

where

$$\begin{aligned} {\mathcal H}_- = \bigl \{ (x,y) \in {\mathcal S} \mid \ x \le x_0 \text { or } \bigl ( x > x_0 \text { and } y > y^\dagger (x) \bigr ) \bigr \} , \\ {\mathcal H}_+ = \bigl \{ (x,y) \in {\mathcal S} \mid \ x > x_0 \text { and } y < y^\dagger (x) \bigr \} . \end{aligned}$$

Also, there exists a decreasing function \(\Psi : ]y_0, y_\infty [ \rightarrow ]0,\infty [\) such that \(\lim _{y \downarrow 0} \Psi (y) < \infty \) if \(x_0 > 0\) as well as constants \(C_0>0\) and \(\vartheta \in ]0,n[\) such that

$$\begin{aligned} - C_0 (1+y) \le h(x,y) \le C_0 (1+y) \bigl ( 1+x^{n-\vartheta } \bigr ) \quad \text {for all } (x,y) \in \mathcal{S} , \end{aligned}$$
(20)
$$\begin{aligned} H(x,y) \le \Psi (y) \bigl ( 1+x^{n-\vartheta } \bigr ) \quad \text {for all } x>0 \text { and } y \in ]0,\bar{y}[ . \end{aligned}$$
(21)

\(\Box \)

We denote by \(x^\dagger \) the inverse of the function \(y^\dagger \) that is defined by

$$\begin{aligned} x^\dagger (y) = {\left\{ \begin{array}{ll} 0 , &{} \text {if } 0 \le y < y_0 , \\ (y^\dagger )^{-1} (y) , &{} \text {if } y_0 \le y < y_\infty , \\ \infty , &{} \text {if } y_\infty \le y < \bar{y} . \end{array}\right. } \end{aligned}$$
(22)

Example 1

Suppose that \(\bar{y} = \infty \) and h is a so-called Cobb-Douglas function, given by

$$\begin{aligned} h(x,y) = x^\alpha y^\beta , \quad \text {for } (x,y) \in {\mathcal S} , \end{aligned}$$
(23)

where \(\alpha \in ]0,n[\) and \(\beta \in ]0,1]\) are constants. In this case, we can check that

$$\begin{aligned} H(x,y) = \bigl ( \beta y^{-1} - c\alpha \bigr ) x^\alpha y^\beta - rK . \end{aligned}$$

If we define

$$\begin{aligned} y_0 = 0 , \quad y_\infty = \frac{\beta }{c\alpha } \quad \text {and} \quad x_0 = {\left\{ \begin{array}{ll} (rK)^{1/\alpha } , &{} \text {if } \beta =1 ,\\ 0 , &{} \text {if } \beta \in ]0,1[ , \end{array}\right. } \end{aligned}$$

then we can see that the calculations

$$\begin{aligned} \frac{\partial H (x,y)}{\partial x}&= \alpha \bigl ( \beta y^{-1} - c\alpha \bigr ) x^{\alpha - 1} y^\beta {\left\{ \begin{array}{ll} > 0 , &{} \text {for all } y \in ]y_0, y_\infty [ , \\ < 0 , &{} \text {for all } y \ge y_\infty , \end{array}\right. } \\ \lim _{x \downarrow 0} H(x,y)&= -rK < 0 \text { for all } y > 0 \quad \text {and} \quad \lim _{x \rightarrow \infty } H(x,y) = {\left\{ \begin{array}{ll} \infty , &{} \text {for all } y \in ]y_0, y_\infty [ , \\ -\infty , &{} \text {for all } y \ge y_\infty , \end{array}\right. } \end{aligned}$$

imply that there exists a unique function \(y^\dagger : ]x_0, \infty [ \rightarrow {\mathbb R}_+\) such that (16)–(17) hold true. Furthermore, differentiating the identity \(H \bigl ( x, y^\dagger (x) \bigr ) = 0\) with respect to x, we can see that

$$\begin{aligned} \dot{y}^\dagger (x) = \frac{\alpha y (\beta - c\alpha y)}{\beta x \bigl [ (1-\beta ) + c\alpha y \bigr ]} > 0 \quad \text {for all } y \in ]y_0, y_\infty [ , \end{aligned}$$

so \(y^\dagger \) is indeed strictly increasing. Also, it is straightforward to check that (18)–(19) and (20)–(21) are all satisfied for \(\vartheta = n - \alpha \) and

$$\begin{aligned} \Psi (y) = {\left\{ \begin{array}{ll} 1, &{} \text {if } \beta = 1 , \\ y^{-(1-\beta )} , &{} \text {if } \beta \in ]0,1[ . \end{array}\right. } \end{aligned}$$

\(\square \)
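
The sign structure (17) and the monotonicity of \(y^\dagger \) in this example can also be checked numerically. The sketch below (parameter values are illustrative assumptions, with \(\beta \in ]0,1[\) so that \(x_0 = 0\)) locates \(y^\dagger (x)\) by bisection, using the fact that \(H(x, \cdot )\) is strictly decreasing with \(H(x, 0+) = +\infty \) and \(H(x, y_\infty ) = -rK < 0\).

```python
import math

# Illustrative Cobb-Douglas parameters (assumed)
alpha, beta, c, r, K = 0.5, 0.8, 0.4, 0.1, 2.0
y_inf = beta / (c * alpha)   # here y_inf = 4.0

def H(x, y):
    return (beta / y - c * alpha) * x**alpha * y**beta - r * K

def y_dagger(x):
    # bisection for the unique root of H(x, .) on ]0, y_inf[
    lo, hi = 1e-9, y_inf - 1e-9
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if H(x, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

xs = [0.5, 1.0, 2.0, 4.0]
ys = [y_dagger(xv) for xv in xs]
```

Along the sample points, \(y^\dagger \) is strictly increasing and H is positive below its graph and negative above it, as (16)–(17) require.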

3 The Solution to the Control Problem

We solve the stochastic control problem that we consider by constructing an appropriate classical solution \(w: {\mathcal S} \rightarrow {\mathbb R}\) to the Hamilton-Jacobi-Bellman (HJB) equation

$$\begin{aligned} \max \Bigl \{ \sigma ^2 x^2 w_{xx} (x,y) + bx w_x (x,y)&- r w(x,y) + h(x,y) , \nonumber \\&w_y (x,y) - c x w_x (x,y) - K \Bigr \} = 0 , \quad (x,y) \in {\mathcal S} , \end{aligned}$$
(24)

where \(w_y (x,0) = \lim _{y \downarrow 0} w_y (x,y)\). To obtain qualitative understanding of this equation, we consider the following heuristic arguments. At time 0, the project’s management has two options. The first one is to wait for a short time \(\Delta t\) and then continue optimally. Bellman’s principle of optimality implies that this option, which is not necessarily optimal, is associated with the inequality

$$\begin{aligned} v(x,y) \ge {\mathbb {E}}\left[ \int _0^{\Delta t} e^{-rt} h(X_t^0,y) \, dt + e^{-r\Delta t} v \bigl ( X_{\Delta t}^0,y \bigr ) \right] . \end{aligned}$$

Applying Itô’s formula to the second term in the expectation, and dividing by \(\Delta t\) before letting \(\Delta t \downarrow 0\), we obtain

$$\begin{aligned} \sigma ^2 x^2 v_{xx} (x,y) + bx v_x (x,y) - rv(x,y) + h(x,y) \le 0 . \end{aligned}$$
(25)

The second option is to increase capacity by \(\varepsilon > 0\), and then continue optimally. This action is associated with the inequality

$$\begin{aligned} v(x,y) \ge v(x-cx\varepsilon , y+\varepsilon ) - K \varepsilon . \end{aligned}$$

Rearranging terms and letting \(\varepsilon \downarrow 0\), we obtain

$$\begin{aligned} v_y (x, y) -c x v_x (x, y) - K \le 0 . \end{aligned}$$
(26)

Furthermore, the Markovian character of the problem implies that one of these options should be optimal and one of (25), (26) should hold with equality at any point in the state space \(\mathcal S\). It follows that the problem’s value function v should identify with an appropriate solution w to the HJB equation (24).

To construct the solution w to (24) that identifies with the value function v, we first consider the existence of a strictly increasing function \(G: ]y_0, y_\infty [ \rightarrow ]0, \infty [\) that partitions the state space \(\mathcal S\) into two regions, the “waiting” region \(\mathcal W\) and the “investment” region \(\mathcal I\) defined by

$$\begin{aligned} {\mathcal W}&=\bigl \{ (x,0) \mid \ 0 < x \le x_0 \text { if } x_0 > 0 \bigr \} \\&\quad \; \cup \bigl \{ (x,y) \mid \ y \in ]y_0, y_\infty [ \text { and } 0 < x \le G(y) \bigr \} \nonumber \\&\quad \; \cup \bigl \{ (x,y) \mid \ x > 0 \text { and } y \in [y_\infty , \bar{y}] \cap {\mathbb R}\bigr \} , \nonumber \\ {\mathcal I}&= \bigl \{ (x,0) \mid \ x > x_0 \text { if } x_0 > 0 \bigr \} \\&\quad \; \cup \bigl \{ (x,y) \mid \ x > 0 \text { and } y \in [0, y_0] \text { if } y_0 > 0\bigr \} \nonumber \\&\quad \; \cup \bigl \{ (x,y) \mid y \in ]y_0, y_\infty [ \text { and } x > G(y) \bigr \} . \end{aligned}$$

In view of the interpretation of the function H defined by (13) as the project’s marginal running payoff rate in excess of the marginal cost of capital rate, which we have discussed in the previous section, we can see that increasing capacity cannot be optimal whenever the state process takes values \((x,y) \in {\mathcal S}\) such that \(H(x,y) < 0\). This observation, (17) in Assumption 1 and (22) suggest that the inequality

$$\begin{aligned} x^\dagger (y) < G(y) \quad \text {for all } y \in ]y_0, y_\infty [ \end{aligned}$$

should hold true. Figures 1, 2, and 3 depict possible configurations of the waiting and the investment regions.

Inside the region \(\mathcal {W}\), the heuristic arguments that we have briefly discussed above suggest that w should satisfy the differential equation

$$\begin{aligned} \sigma ^2 x^2 w_{xx}(x,y) + bx w_x (x,y) - rw(x,y) + h(x,y) = 0 . \end{aligned}$$
(27)

In light of the theory that we review in Appendix 2 and the intuitive idea that the value function should remain bounded as \(x \downarrow 0\), every relevant solution to this ODE is given by

$$\begin{aligned} w(x,y) = A(y) x^n + R(x,y) , \end{aligned}$$
(28)

for some function A, where n is given by (12) and \(R(\cdot , y)\) is defined by (79) for \(k = h(\cdot , y)\), namely,

$$\begin{aligned} R(x,y) = \frac{1}{\sigma ^2 (n-m)} \left[ x^{m} \int _0^x s^{-m-1} h(s,y) \, ds + x^n \int _x^\infty s^{-n-1} h(s,y) \, ds \right] .\nonumber \\ \end{aligned}$$
(29)
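
For the Cobb-Douglas payoff of Example 1, both integrals in (29) are elementary (their convergence requires \(\alpha < n\), in line with (15)), and the resulting expression can be checked against the classical resolvent identity \({\mathbb {E}} [ \int _0^\infty e^{-rt} h(X_t^0, y) \, dt ] = x^\alpha y^\beta / (r - b \alpha - \sigma ^2 \alpha (\alpha - 1))\) for the geometric Brownian motion (1). The following sketch performs this consistency check under assumed parameter values.

```python
import math

# Illustrative parameters (assumed); alpha < n is needed for (29) to converge
b, sigma, r = 0.05, 0.3, 0.1
alpha, beta = 0.5, 0.8

disc = math.sqrt((b - sigma**2) ** 2 + 4 * sigma**2 * r)
n_ = (-(b - sigma**2) + disc) / (2 * sigma**2)
m_ = (-(b - sigma**2) - disc) / (2 * sigma**2)

def R(x, y):
    # (29) with h(s,y) = s^alpha y^beta, both integrals evaluated explicitly
    return (y**beta / (sigma**2 * (n_ - m_))
            * (x**m_ * x**(alpha - m_) / (alpha - m_)
               + x**n_ * x**(alpha - n_) / (n_ - alpha)))

def R_resolvent(x, y):
    # E[ int_0^infty e^{-rt} (X_t^0)^alpha y^beta dt ] for X_0^0 = x
    return x**alpha * y**beta / (r - b * alpha - sigma**2 * alpha * (alpha - 1))
```

The two expressions agree because \(\sigma ^2 (\alpha - m)(n - \alpha ) = r - b\alpha - \sigma ^2 \alpha (\alpha - 1)\), which follows from the root relations of the quadratic (11).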

On the other hand, w should satisfy

$$\begin{aligned} w_y (x,y) - cx w_x (x,y) = K , \quad \text {for } (x,y) \in {\mathcal I} , \end{aligned}$$
(30)

which implies that

$$\begin{aligned} w_{yx} (x,y) - c x w_{xx} (x,y) - c w_x (x,y) = 0 , \quad \text {for } (x,y) \in {\mathcal I} . \end{aligned}$$
(31)

To determine A and G, we postulate that w is \(C^{2,1}\), in particular, along the free-boundary G. Such a requirement and (28)–(31) yield the system of equations

$$\begin{aligned} \bigl [ \dot{A} (y) - nc A(y) \bigr ] G^n (y)= & {} - \Bigl [ R_y \bigl ( G(y),y \bigr ) - c G(y) R_x \bigl ( G(y),y \bigr ) - K \Bigr ] , \end{aligned}$$
(32)
$$\begin{aligned} \bigl [ \dot{A} (y) - nc A(y) \bigr ] G^n (y)= & {} - \frac{G(y)}{n} \Bigl [ R_{yx} \bigl ( G(y),y \bigr ) \nonumber \\&- \,c G(y) R_{xx} \bigl ( G(y),y \bigr ) - c R_x \bigl ( G(y),y \bigr ) \Bigr ] . \end{aligned}$$
(33)

In view of the definition (29) of R, the associated expression (84) for the function \(x \mapsto xR_x (x,y)\) and (83), we can see that this system is equivalent to

$$\begin{aligned} q \bigl ( G(y),y \bigr ) = 0, \end{aligned}$$
(34)
$$\begin{aligned} \dot{A} (y) = nc A(y) - \frac{1}{\sigma ^2 (n-m)} \int _{G(y)}^{\infty } s^{-n-1} H(s,y) \, ds , \end{aligned}$$
(35)

where H is defined by (13) and

$$\begin{aligned} q(x,y) = \int _0^x s^{-m-1} H(s,y) \, ds . \end{aligned}$$
(36)

We can also check that the solution to (35) is given by

$$\begin{aligned} A(y) = \frac{e^{c n y}}{\sigma ^2 (n-m)} \int _y^{y_\infty } e^{-c n u} \int _{G(u)}^\infty s^{-n-1} H(s,u) \, ds \, du , \quad \text {for } y_0 < y < y_\infty , \end{aligned}$$
(37)

if the integrals converge.

The following result, the proof of which we develop in Appendix 1, is concerned with the solution to the system of equations (34)–(35).

Lemma 1

Suppose that Assumption 1 holds true. The equation \(q(x,y)=0\) for \(x>0\) defines uniquely a strictly increasing \(C^1\) function \(G: ]y_0, y_\infty [ \rightarrow ]0,\infty [\), which satisfies

$$\begin{aligned}&x^\dagger (y) < G(y) \text { for all } y \in ]y_0, y_\infty [ , \quad \lim _{y \downarrow y_0} G(y) = 0 , \text { if } y_0 > 0 , \quad \text {and} \quad \nonumber \\&\quad \lim _{y \uparrow y_\infty } G(y) = \infty , \end{aligned}$$
(38)

where \(x^\dagger \) is defined by (22). Furthermore, the function A given by (37) is well-defined and real-valued, and there exists a constant \(C_1 > 0\) such that

$$\begin{aligned} 0 < A(y) G^n (y) \le C_1 \Psi (y) \left[ 1 + G^{n-\vartheta } (y) \right] \quad \text {for all } y \in ]y_0, y_\infty [ , \end{aligned}$$
(39)

where the decreasing function \(\Psi \) and the constant \(\vartheta > 0\) are as in (21), and

$$\begin{aligned} g^{-1} (x) + \left[ 1 + g^{-1} (x) \right] G^{n-\vartheta } \bigl ( g^{-1} (x) \bigr ) \le C_1 \bigl [ 1+ x^{n-\vartheta } \bigr ] \quad \text {for all } x > x_0 , \end{aligned}$$
(40)

where \(g^{-1}\) is the inverse of the strictly increasing function g that is defined by

$$\begin{aligned} g(y) = e^{cy} G(y) , \quad \text {for } y \in ]y_0, y_\infty [ . \end{aligned}$$
(41)

Remark 2

The last limit in (38) implies that, under the optimal strategy, if \(\bar{y} < \infty \), then the maximal capacity level \(\bar{y}\) is never reached. This result is due to the assumption that the function \(y^\dagger \) appearing in Assumption 1 is such that \(y^\dagger (\chi ) < \lim _{x \rightarrow \infty } y^\dagger (x) \equiv y_\infty \le \bar{y}\) for all \(\chi \in ]x_0, \infty [\). Our analysis could be trivially modified to allow for the possibility that \(\bar{y} < \infty \) and \(\lim _{y \uparrow \bar{y}} G(y) < \infty \), which would give rise to the situation where the maximal capacity level \(\bar{y}\) is reached in finite time with strictly positive probability. Such a relaxation would simply involve allowing for the strictly increasing function \(y^\dagger \) to be such that \(\lim _{x \rightarrow \infty } y^\dagger (x) \equiv y_\infty > \bar{y}\). However, we have opted against such a relaxation because this would complicate the notation and the proof of Lemma 1 substantially. \(\square \)

Example 2

Suppose that h is a Cobb-Douglas function given by (23) in Example 1. In this case, we can check that

$$\begin{aligned} G(y) = \left[ \frac{rK (\alpha -m)}{-m} \frac{y^{1-\beta }}{\beta -\alpha c y} \right] ^{1 / \alpha } ,\quad \text {for } y \in ]y_0, y_\infty [ \equiv ]0, \beta /c\alpha [ . \end{aligned}$$
(42)

Figures 2 and 3 illustrate this example. \(\square \)
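
Expression (42) can be verified by direct computation: for the Cobb-Douglas payoff, the integrand \(s^{-m-1} H(s,y)\) in (36) has an elementary antiderivative, so both \(q ( G(y), y ) = 0\) and the inequality \(x^\dagger (y) < G(y)\) of Lemma 1 can be checked numerically. The sketch below does so under assumed parameter values with \(\beta \in ]0,1[\).

```python
import math

# Illustrative parameters (assumed)
b, sigma, r, K = 0.05, 0.3, 0.1, 2.0
alpha, beta, c = 0.5, 0.8, 0.4

disc = math.sqrt((b - sigma**2) ** 2 + 4 * sigma**2 * r)
m = (-(b - sigma**2) - disc) / (2 * sigma**2)
y_inf = beta / (c * alpha)

def G(y):
    # the free boundary (42)
    return (r * K * (alpha - m) / (-m)
            * y**(1 - beta) / (beta - alpha * c * y)) ** (1 / alpha)

def q(x, y):
    # q(x,y) = int_0^x s^{-m-1} H(s,y) ds, evaluated in closed form
    return ((beta / y - c * alpha) * y**beta * x**(alpha - m) / (alpha - m)
            - r * K * x**(-m) / (-m))

def x_dagger(y):
    # unique root of H(., y) = 0 for y in ]0, y_inf[, cf. Example 1
    return (r * K * y**(1 - beta) / (beta - c * alpha * y)) ** (1 / alpha)

ysamples = [0.5, 1.0, 2.0, 3.0]
```

Note also that, as \(c \downarrow 0\), we have \(y_\infty \rightarrow \infty \) and (42) reduces to the boundary \(G^0\) of the no-impact model discussed in Sect. 1.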

To complete the construction of the solution w to the HJB equation (24) that identifies with the problem’s value function v, we note that there exists a mapping \(z : {\mathcal I} \rightarrow {\mathbb R}_+\) such that

$$\begin{aligned} z(x,y) \in ](y_0 -y)^+, y_\infty -y[ \quad \text {and} \quad xe^{-cz(x,y)} = G \bigl ( y + z(x,y) \bigr ) \quad \text {for all } (x,y) \in {\mathcal I} . \end{aligned}$$
(43)

Indeed, this claim follows immediately from the calculations

$$\begin{aligned}&\lim _{z \uparrow y_\infty - y} \Bigl [ xe^{-cz} - G(y+z) \Bigr ] = - \infty , \\&\frac{\partial }{\partial z} \Bigl [ xe^{-cz} - G(y+z) \Bigr ] = - cx e^{-cz} - G' (y+z) < 0 , \quad \text {for } z \in ] (y_0-y)^+, y_\infty -y[ , \\&\lim _{z \downarrow (y_0 -y)^+} \Bigl [ xe^{-cz} - G(y+z) \Bigr ] = \left. {\left\{ \begin{array}{ll} xe^{-c (y_0-y)} - \lim _{u \downarrow y_0} G(u) , &{} \text {if } y \le y_0 , \\ x - G(y) , &{} \text {if } y > y_0 \end{array}\right. } \right\} > 0 , \end{aligned}$$

in which we have used (38) and the fact that G is increasing. We prove the following result in Appendix 1.

Lemma 2

Suppose that Assumption 1 holds true. The function w defined by

$$\begin{aligned} w(x,y) = {\left\{ \begin{array}{ll} R(x,y) , &{} \text {if } (x,y) \in {\mathcal W} \cap \bigl ( {\mathbb R}_+ \times [y_\infty , \bar{y}] \bigr ) , \\ A(y) x^n + R(x,y) , &{} \text {if } (x,y) \in {\mathcal W} \cap \bigl ( {\mathbb R}_+ \times [y_0, y_\infty [ \bigr ), \\ w \bigl ( xe^{-cz(x,y)}, y+z(x,y) \bigr ) - K z(x,y) , &{} \text {if } (x,y) \in {\mathcal I} , \end{array}\right. } \end{aligned}$$
(44)

where A is defined by (37) and z is given by (43), is a \(C^{2,1}\) solution to the HJB equation (24). Furthermore, the function \(w(\cdot ,y)\) is increasing and there exists a constant \(C_2 > 0\) such that

$$\begin{aligned} - C_2 (1+y) \le w(x,y) \quad \text {for all } (x,y) \in {\mathcal S} , \end{aligned}$$
(45)
$$\begin{aligned} w \bigl ( G(y),y \bigr ) \le C_2 [\Psi (y) + y] \bigl [ 1 + G^{n-\vartheta } (y) \bigr ] \quad \text {for all } y \in ]y_0, y_\infty [ , \end{aligned}$$
(46)

where the decreasing function \(\Psi \) is as in (20)–(21).
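
The mapping z of (43), on which the definition of w on the investment region in (44) relies, is straightforward to compute by bisection, because \(x e^{-cz} - G(y+z)\) is strictly decreasing in z. The following sketch works in the Cobb-Douglas setting of Example 2 (parameter values and the sample point are assumptions, with \(y_0 = 0\)).

```python
import math

# Illustrative Cobb-Douglas parameters (assumed)
b, sigma, r, K = 0.05, 0.3, 0.1, 2.0
alpha, beta, c = 0.5, 0.8, 0.4

disc = math.sqrt((b - sigma**2) ** 2 + 4 * sigma**2 * r)
m = (-(b - sigma**2) - disc) / (2 * sigma**2)
y_inf = beta / (c * alpha)

def G(y):
    # the free boundary (42)
    return (r * K * (alpha - m) / (-m)
            * y**(1 - beta) / (beta - alpha * c * y)) ** (1 / alpha)

def z_of(x, y):
    # phi(z) = x e^{-cz} - G(y+z) decreases strictly from phi(0+) > 0
    # (for x > G(y)) to -infinity as z -> y_inf - y, so bisection applies
    lo, hi = 1e-12, y_inf - y - 1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if x * math.exp(-c * mid) - G(y + mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x0_, y0_ = 2.0, 1.0            # a point in the investment region: x0_ > G(y0_)
z_star = z_of(x0_, y0_)
resid = abs(x0_ * math.exp(-c * z_star) - G(y0_ + z_star))
```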

We can now establish the main result of the paper.

Theorem 1

Suppose that Assumption 1 holds true. The value function v of the control problem formulated in Sect. 2 identifies with the solution w to the HJB equation (24) given by (44) in Lemma 2 and the optimal capacity expansion strategy \(\zeta ^\star \) is given by

$$\begin{aligned} \zeta _t^\star = {\left\{ \begin{array}{ll} 0 , &{} \text {if } y > y_0 \text { and } e^{cy} \sup _{0 \le s \le t} X_s^0 \le \overline{g} (y) , \\ g^{-1} \left( e^{cy} \sup _{0 \le s \le t} X_s^0 \right) - y , &{} \text {if } y < y_\infty \text { and } e^{cy} \sup _{0 \le s \le t} X_s^0 > \overline{g} (y) , \end{array}\right. } \quad \text {for } t>0 , \end{aligned}$$
(47)

where

$$\begin{aligned} \overline{g} (y) = {\left\{ \begin{array}{ll} 0 , &{} \text {if } y_0 > 0 \text { and } y \le y_0 , \\ g(y) , &{} \text {if } y \in ]y_0, y_\infty [ , \\ \infty , &{} \text {if } y \in [y_\infty , \bar{y}] \cap {\mathbb R}_+ , \end{array}\right. } \end{aligned}$$
(48)

g is defined by (41), and \(X^0\) is the geometric Brownian motion given by (1).
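
To illustrate Theorem 1 numerically, the following sketch simulates \(X^0\) exactly on a time grid, tracks its running maximum, and evaluates the capacity increase prescribed by (47) in the Cobb-Douglas setting of Example 2; the invested amount is computed as \(g^{-1} ( e^{cy} \sup _{0 \le s \le t} X_s^0 ) - y\), so that it vanishes exactly at the threshold \(\overline{g} (y)\). All parameters, the initial condition, the grid and the random seed are assumptions of this illustration.

```python
import math
import random

random.seed(7)
# Illustrative Cobb-Douglas parameters (assumed)
b, sigma, r, K = 0.05, 0.3, 0.1, 2.0
alpha, beta, c = 0.5, 0.8, 0.4

disc = math.sqrt((b - sigma**2) ** 2 + 4 * sigma**2 * r)
m = (-(b - sigma**2) - disc) / (2 * sigma**2)
y_inf = beta / (c * alpha)

def G(y):
    # the free boundary (42)
    return (r * K * (alpha - m) / (-m)
            * y**(1 - beta) / (beta - alpha * c * y)) ** (1 / alpha)

def g(y):
    # (41)
    return math.exp(c * y) * G(y)

def g_inv(v):
    # bisection inverse of the strictly increasing g on ]0, y_inf[
    lo, hi = 1e-12, y_inf - 1e-12
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if g(mid) < v:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x, y = 1.0, 0.5
T, steps = 1.0, 5_000
dt = T / steps
X0, sup0 = x, x
X0s, zetas = [], []
for _ in range(steps):
    X0 *= math.exp((b - sigma**2) * dt
                   + math.sqrt(2) * sigma * random.gauss(0.0, math.sqrt(dt)))
    sup0 = max(sup0, X0)
    v = math.exp(c * y) * sup0
    zetas.append(g_inv(v) - y if v > g(y) else 0.0)   # invested amount per (47)
    X0s.append(X0)
```

Along the simulated path the control is non-decreasing, and it keeps the controlled state \(X_t = X_t^0 e^{-c \zeta _t^\star }\) on or below the boundary, \(X_t \le G(Y_t)\), in line with the oblique reflection described in Sect. 1.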

Proof

Fix any initial condition \((x,y) \in {\mathcal S}\) and any admissible strategy \(\zeta \in {\mathcal A}\). In view of the Itô-Tanaka-Meyer formula and the left-continuity of the processes X, Y, we can see that

$$\begin{aligned}&e^{-rT} w(X_{T+} , Y_{T+}) \\&\quad = w(x,y) + \int _0^T e^{-rt} \bigl [ \sigma ^2 X_t^2 w_{xx} (X_t,Y_t) + bX_t w_x (X_t,Y_t) - rw(X_t,Y_t) \bigr ] \, dt \\&\qquad + \int _{[0,T]} e^{-rt} \bigl [ w_y (X_t, Y_t) - cX_t w_x (X_t, Y_t) \bigr ] \, d\zeta _t^\mathrm {c}+ M_T \\&\qquad + \sum _{0 \le t \le T} e^{-rt} \bigl [ w( X_{t+}, Y_{t+} ) - w(X_t, Y_t) \bigr ] , \end{aligned}$$

where

$$\begin{aligned} M_T = \sqrt{2} \sigma \int _0^T e^{-rt} X_t w_x(X_t, Y_t) \, dW_t . \end{aligned}$$
(49)

Combining this calculation with the observation that

$$\begin{aligned}&w( X_{t+}, Y_{t+} ) - w(X_t, Y_t)\\&\quad \mathop {=}\limits ^{(7)} \int _0^{\Delta \zeta _t} \frac{dw \bigl ( X_t e^{-cs}, Y_t +s \bigr )}{ds} \, ds , \\&\quad = \int _0^{\Delta \zeta _t} \bigl [ w_y \bigl ( X_t e^{-cs}, Y_t + s \bigr ) - cX_t e^{-cs} w_x \bigl ( X_t e^{-cs}, Y_t + s \bigr ) \bigr ] \, ds , \end{aligned}$$

we obtain

$$\begin{aligned}&\int _0^T e^{-rt} h(X_t, Y_t) \, dt - K \int _{[0,T]} e^{-rt} \, d\zeta _t + e^{-rT} w(X_{T+}, Y_{T+}) \nonumber \\&\quad = w(x,y) + \int _0^T e^{-rt} \bigl [ \sigma ^2 X_t^2 w_{xx} (X_t,Y_t) + bX_t w_x (X_t,Y_t) - rw(X_t,Y_t) \nonumber \\&\qquad + h(X_t, Y_t) \bigr ] \, dt + \int _{[0,T]} \bigl [ w_y (X_t, Y_t) - cX_t w_x (X_t, Y_t) - K \bigr ] \, d\zeta _t^\mathrm {c}+ M_T \nonumber \\&\qquad + \sum _{0 \le t \le T} e^{-rt} \int _0^{\Delta \zeta _t} \bigl [ w_y \bigl ( X_t e^{-cs}, Y_t +s \bigr ) - cX_t e^{-cs} w_x \bigl ( X_t e^{-cs}, Y_t +s \bigr ) - K \bigr ] \, ds . \end{aligned}$$
(50)

Since w satisfies the HJB equation (24), it follows that

$$\begin{aligned} \int _0^T e^{-rt} h(X_t, Y_t) \, dt - K \int _{[0,T]} e^{-rt} \, d\zeta _t + e^{-rT} w(X_{T+}, Y_{T+}) \le w(x,y) + M_T . \end{aligned}$$
(51)

In view of the integration by parts formula and (2), we can see that

$$\begin{aligned} e^{-rT} Y_{T+} - y = -r \int _0^T e^{-rt} Y_t \, dt + \int _{[0,T]} e^{-rt} \, d\zeta _t . \end{aligned}$$
(52)
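The identity (52) is elementary but easy to misapply when \(\zeta \) jumps at time 0, since the integral over \([0,T]\) then picks up the jump. The following sanity check, for a hypothetical strategy consisting of a jump of size z0 at time 0 followed by linear growth, \(\zeta _t = z_0 + at\) (all parameter values illustrative), confirms the identity numerically:

```python
import numpy as np

# Numerical check of the integration-by-parts identity (52) for a
# hypothetical strategy zeta_t = z0 + a*t (jump z0 at time 0, then
# linear growth); r, y, z0, a, T are illustrative values.
r, y, z0, a, T = 0.1, 1.0, 0.5, 0.3, 2.0

t = np.linspace(0.0, T, 200_001)
Y = y + z0 + a * t                      # Y_t = y + zeta_t after the jump

def trapezoid(f, t):
    """Composite trapezoidal rule for samples f on the grid t."""
    return float(np.sum((f[1:] + f[:-1]) * np.diff(t)) / 2.0)

lhs = np.exp(-r * T) * (y + z0 + a * T) - y
# integral of e^{-rt} d zeta_t over [0,T] = jump at t = 0 + continuous part
rhs = (-r * trapezoid(np.exp(-r * t) * Y, t)
       + z0 + a * trapezoid(np.exp(-r * t), t))
```

Both sides agree to within the quadrature error of the time grid.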

This identity, the admissibility condition (8) in Definition 1 and the monotone convergence theorem imply that

$$\begin{aligned} {\mathbb {E}}\left[ \int _0^\infty e^{-rt} Y_t \, dt \right]&= \lim _{T \rightarrow \infty } {\mathbb {E}}\left[ \int _0^T e^{-rt} Y_t \, dt \right] \nonumber \\&\le \lim _{T \rightarrow \infty } \left( \frac{y}{r} + \frac{1}{r} {\mathbb {E}}\left[ \int _{[0,T]} e^{-rt} \, d\zeta _t \right] \right) \nonumber \\&= \frac{y}{r} + \frac{1}{r} {\mathbb {E}}\left[ \int _{[0,\infty [} e^{-rt} \, d\zeta _t \right] \nonumber \\&< \infty , \end{aligned}$$
(53)

which implies that

$$\begin{aligned} \liminf _{T \rightarrow \infty } {\mathbb {E}}\left[ e^{-rT} Y_{T+} \right] = 0 . \end{aligned}$$
(54)

The lower bound in (20), the estimate (45) and (52) imply that

$$\begin{aligned}&\int _0^T e^{-rt} h(X_t, Y_t) \, dt - K \int _{[0,T]} e^{-rt} \, d\zeta _t + e^{-rT} w(X_{T+}, Y_{T+}) \\&\quad \ge - C_0 \int _0^T e^{-rt} (1 + Y_t) \, dt - K \int _{[0,T]} e^{-rt} \, d\zeta _t - C_2 e^{-rT} (1 + Y_{T+}) \\&\quad \ge - C_0 \int _0^T e^{-rt} (1 + Y_t) \, dt - (K+C_2) \int _{[0,T]} e^{-rt} \, d\zeta _t - C_2 (1+y) \\&\quad \ge - \left( \frac{C_0}{r} + C_2 + C_2 y \right) - C_0 \int _0^\infty e^{-rt} Y_t \, dt - (K+C_2) \int _{[0,\infty [} e^{-rt} \, d\zeta _t . \end{aligned}$$

The admissibility condition (8) and (53) imply that the random variable on the right-hand side of these inequalities has finite expectation. Combining this observation with (51), we can see that \({\mathbb {E}}\left[ \inf _{T \ge 0} M_T \right] > - \infty \). Therefore, the stochastic integral M is a supermartingale and \({\mathbb {E}}\left[ M_T \right] \le 0\) for all \(T>0\). Furthermore, Fatou’s lemma implies that

$$\begin{aligned} J_{x,y} (\zeta ) \le \liminf _{T \rightarrow \infty } {\mathbb {E}}\left[ \int _0^T e^{-rt} h(X_t, Y_t) \, dt - K \int _{[0,T]} e^{-rt} \, d\zeta _t \right] . \end{aligned}$$

Taking expectations in (51) and passing to the limit, we obtain

$$\begin{aligned} J_{x,y} (\zeta ) \le w(x,y) + \liminf _{T \rightarrow \infty } e^{-rT} {\mathbb {E}}\left[ - w(X_{T+} , Y_{T+}) \right] . \end{aligned}$$

The inequality \(J_{x,y} (\zeta ) \le w(x,y)\) now follows because the estimate (45) implies that

$$\begin{aligned} \liminf _{T \rightarrow \infty } e^{-rT} {\mathbb {E}}\bigl [ - w(X_{T+} , Y_{T+}) \bigr ]&\le \lim _{T \rightarrow \infty } C_2 e^{-rT} + C_2 \liminf _{T \rightarrow \infty } e^{-rT} {\mathbb {E}}\left[ Y_{T+} \right] \mathop {=}\limits ^{(54)} 0 . \end{aligned}$$

Thus, we have proved that \(v(x,y) \le w(x,y)\).

To prove the reverse inequality and establish the optimality of the process \(\zeta ^\star \) given by (47), we first consider the possibility that \([y_\infty , \bar{y}] \cap {\mathbb R}_+ \ne \emptyset \) and \(y \in [y_\infty , \bar{y}]\). In this case, \(\zeta _t^\star = 0\) for all \(t \ge 0\), and

$$\begin{aligned} J_{x,y} (\zeta ^\star ) = {\mathbb {E}}\left[ \int _0^\infty e^{-rt} h(X_t^0, y) \, dt \right] \mathop {=}\limits ^{(29), (81)} R(x,y) \mathop {=}\limits ^{(44)} w(x,y) , \end{aligned}$$

which establishes the required claims.

In the rest of the proof, we assume that \(y < y_\infty \). In this case,

$$\begin{aligned} Y_t^\star = {\left\{ \begin{array}{ll} y , &{} \text {if } y \in ]y_0, y_\infty [ \text { and } e^{cy} \sup _{0 \le s \le t} X_s^0 \le \overline{g}(y) , \\ g^{-1} \left( e^{cy} \sup _{0 \le s \le t} X_s^0 \right) , &{} \text {if } e^{cy} \sup _{0 \le s \le t} X_s^0 > \overline{g}(y) , \end{array}\right. } \end{aligned}$$
(55)

for all \(t>0\), and, apart from a possible initial jump of size \((g^{-1} (e^{cy}x) - y)^+\) at time 0, the process \((e^{cy} X^0 , Y^\star )\) is reflecting in the free-boundary g in the positive direction. In particular,

$$\begin{aligned} Y_t^\star \in [y_0, y_\infty [ , \quad e^{cy} X_t^0 \le g (Y_t^\star ) \quad \text {and} \quad \zeta _t^\star - \zeta _0^\star = \int _{]0,t[} \mathbf{1} _{\{ e^{cy} X_s^0 = g(Y_s^\star ) \}} \, d\zeta _s^\star \quad \text {for all } t > 0. \end{aligned}$$

In view of (7) and the definition (41) of g, we can see that

$$\begin{aligned} e^{cy} X_t^0 \le g (Y_t^\star ) \ \Leftrightarrow \ X_t^\star \le G (Y_t^\star ) \quad \text {and} \quad \{ e^{cy} X_t^0 = g(Y_t^\star ) \} = \{ X_t^\star = G(Y_t^\star ) \} , \end{aligned}$$

where \(X^\star \) is the solution to (4) given by (7). It follows that the process \((X^\star , Y^\star )\) satisfies

$$\begin{aligned} Y_t^\star \in [y_0, y_\infty [ , \quad X_t^\star \le G (Y_t^\star ) \quad \text {and} \quad \zeta _t^\star - \zeta _0^\star = \int _{]0,t[} \mathbf{1} _{\{ X_s^\star = G(Y_s^\star ) \}} \, d\zeta _s^\star \quad \text {for all } t > 0 . \end{aligned}$$
(56)

Since the function g is strictly increasing, \(\zeta _0^\star > 0\) if and only if \(xe^{cy} > g(y) \mathop {=}\limits ^{(41)} e^{cy} G(y)\). Therefore,

$$\begin{aligned} \zeta _0^\star = \bigl ( g^{-1} (e^{cy}x) - y \bigr )^+ > 0 \text { if and only if } (x,y) \in {\mathcal I} . \end{aligned}$$
(57)

Furthermore, given any \((x,y) \in {\mathcal I}\), we note that

$$\begin{aligned} z = g^{-1} (xe^{cy}) - y \ \Leftrightarrow \ xe^{cy} = e^{c(y+z)} G(y+z) \ \Leftrightarrow \ xe^{-cz} = G(y+z) , \end{aligned}$$

which implies that \(\zeta _0^\star = z(x,y)\), where the function z is given by (43). It follows that

$$\begin{aligned} w(X_{0+}^\star , Y_{0+}^\star ) - w(x,y) = w \bigl ( xe^{-cz(x,y)}, y+z(x,y) \bigr ) - w(x,y) \mathop {=}\limits ^{(44)} K z(x,y) . \end{aligned}$$
(58)
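Computationally, the jump size \(z(x,y)\) characterised by the equivalences above is the unique root of the decreasing function \(z \mapsto xe^{-cz} - G(y+z)\). A minimal bisection sketch, for a hypothetical increasing boundary G of the Cobb-Douglas shape from the Introduction and illustrative values of c, x, y with \((x,y)\) in the investment region:

```python
import numpy as np

# Illustrative solve of x e^{-cz} = G(y + z) for the initial jump size z.
# G below is a hypothetical increasing boundary (Cobb-Douglas shape);
# c and the constant A are illustrative values, not derived from (41).
c, A = 0.2, 0.5
G = lambda y: A * np.sqrt(y)

def initial_jump(x, y, tol=1e-12):
    """Bisection for z >= 0 with x*exp(-c*z) = G(y + z)."""
    f = lambda z: x * np.exp(-c * z) - G(y + z)   # strictly decreasing in z
    if f(0.0) <= 0.0:                             # (x, y) not in the investment region
        return 0.0
    lo, hi = 0.0, 1.0
    while f(hi) > 0.0:                            # bracket the unique root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) > 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

z = initial_jump(2.0, 1.0)    # jump size for the illustrative point (2, 1)
```

Uniqueness of the root follows because the left-hand side of \(xe^{-cz} = G(y+z)\) is decreasing and the right-hand side increasing in z, which is what makes bisection reliable here.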

In light of (56)–(58) and the construction of the solution w to the HJB equation (24), we can see that (50) implies that

$$\begin{aligned} \int _0^T e^{-rt} h \bigl (X_t^\star , Y_t^\star \bigr ) \, dt - K \int _{[0,T]} e^{-rt} \, d\zeta _t^\star + e^{-rT} w \bigl ( X_T^\star , Y_T^\star \bigr ) = w(x,y) + M_T^\star \end{aligned}$$
(59)

for all \(T>0\), where the local martingale \(M^\star \) is defined as in (49).

To show that \(\zeta ^\star \) is indeed admissible, we use (40) and (55) to calculate

$$\begin{aligned} Y_t^\star = y \mathbf{1} _{\{ Y_t^\star = y \}} + g^{-1} \left( e^{cy} \sup _{0 \le s \le t} X_s^0 \right) \mathbf{1} _{\{ Y_t^\star > y \}} \le y + C_1 + C_1 e^{c(n-\vartheta )y} \left( \sup _{0 \le s \le t} X_s^0 \right) ^{n-\vartheta }. \end{aligned}$$

Combining these inequalities with the first estimate in (76), we can see that

$$\begin{aligned} \lim _{T \rightarrow \infty } {\mathbb {E}}\left[ e^{-rT} Y_T^\star \right] = 0 \quad \text {and} \quad {\mathbb {E}}\left[ \int _0^\infty e^{-rt} Y_t^\star \, dt \right] < \infty . \end{aligned}$$

It follows that

$$\begin{aligned} {\mathbb {E}}\left[ \int _{[0,\infty [} e^{-rt} \, d\zeta _t^\star \right]&= \lim _{T \rightarrow \infty } {\mathbb {E}}\left[ \int _{[0,T]} e^{-rt} \, d\zeta _t^\star \right] \nonumber \\&\mathop {=}\limits ^{(52)} \lim _{T \rightarrow \infty } \left( {\mathbb {E}}\left[ e^{-rT} Y_T^\star \right] + r {\mathbb {E}}\left[ \int _0^T e^{-rt} Y_t^\star \, dt \right] - y \right) \nonumber \\&< \infty , \end{aligned}$$
(60)

which proves that \(\zeta ^\star \in {\mathcal A}\).

To proceed further, we note that the inequality in (56), the fact that \(w(\cdot , y)\) is increasing and the bound given by (46) imply that, given any \(t>0\),

$$\begin{aligned} w(X_t^\star , Y_t^\star )&\le w \bigl ( G(Y_t^\star ), Y_t^\star \bigr ) \\&\le C_2 \bigl [ \Psi (Y_t^\star ) + Y_t^\star \bigr ] \bigl [ 1 + G^{n-\vartheta } (Y_t^\star ) \bigr ] \le C_2 \bigl [ \Psi (Y_{0+}) + Y_t^\star \bigr ] \bigl [ 1 + G^{n-\vartheta } (Y_t^\star ) \bigr ] , \end{aligned}$$

the last inequality following because \(\Psi \) is decreasing. Also, (20) and (56) imply that

$$\begin{aligned} h(X_t^\star , Y_t^\star ) \le C_0 (1 + Y_t^\star ) (1 + {X_t^\star } ^{n-\vartheta }) \le C_0 (1 + Y_t^\star ) \bigl [ 1 + G^{n-\vartheta } (Y_t^\star ) \bigr ] . \end{aligned}$$

The estimate (40) and (55) imply that

$$\begin{aligned}&(1 + Y_t^\star ) G^{n-\vartheta } (Y_t^\star )\\&\quad = (1+y) G^{n-\vartheta } (y) \mathbf{1} _{\{ Y_t^\star = y \}} \\&\qquad + \left[ 1 + g^{-1} \left( e^{cy} \sup _{0 \le s \le t} X_s^0 \right) \right] G^{n-\vartheta } \left( g^{-1} \left( e^{cy} \sup _{0 \le s \le t} X_s^0 \right) \right) \mathbf{1} _{\{ Y_t^\star > y \}} \\&\quad \le (1+y) G^{n-\vartheta } (y) \mathbf{1} _{\{ y>y_0 \}} + C_1 + C_1 e^{c(n-\vartheta )y} \left( \sup _{0 \le s \le t} X_s^0 \right) ^{n-\vartheta }. \end{aligned}$$

It follows that there exists a constant \(C_3 = C_3 (y)\) such that

$$\begin{aligned}&w(X_t^\star , Y_t^\star ) \le C_3 \left[ 1 + \left( \sup _{0 \le s \le t} X_s^0 \right) ^{n-\vartheta } \right] \quad \text {and} \quad \\&\quad h(X_t^\star , Y_t^\star ) \le C_3 \left[ 1 + \left( \sup _{0 \le s \le t} X_s^0 \right) ^{n-\vartheta } \right] \end{aligned}$$

for all \(t>0\). These inequalities and the estimates (76) imply that

$$\begin{aligned}&{\mathbb {E}}\left[ \sup _{T > 0} \left( \int _0^T e^{-rt} h \bigl ( X_t^\star , Y_t^\star \bigr ) \, dt + e^{-rT} w \bigl ( X_T^\star , Y_T^\star \bigr ) \right) \right] \nonumber \\&\quad \le C_3 \left( \frac{(1+r)}{r} + \int _0^\infty {\mathbb {E}}\left[ e^{-rt} \left( \sup _{0 \le s \le t} X_s^0 \right) ^{n-\vartheta } \right] dt\right. \nonumber \\&\left. \qquad +\, {\mathbb {E}}\left[ \sup _{T > 0} e^{-rT} \left( \sup _{0 \le s \le T} X_s^0 \right) ^{n-\vartheta } \right] \right) \nonumber \\&\quad < \infty , \end{aligned}$$
(61)

and

$$\begin{aligned} \liminf _{T \rightarrow \infty } e^{-rT} {\mathbb {E}}\bigl [ - w(X_T^\star , Y_T^\star ) \bigr ] \ge - C_3 \lim _{T \rightarrow \infty } e^{-rT} \left( 1 + {\mathbb {E}}\left[ \left( \sup _{0 \le s \le T} X_s^0 \right) ^{n-\vartheta } \right] \right) = 0 . \end{aligned}$$
(62)

In view of (59) and (61), we can see that \({\mathbb {E}}\left[ \sup _{T>0} M_T^\star \right] < \infty \). Therefore, the stochastic integral \(M^\star \) is a submartingale and \({\mathbb {E}}\left[ M_T^\star \right] \ge 0\) for all \(T>0\). Furthermore, Fatou’s lemma implies that

$$\begin{aligned} J_{x,y} (\zeta ^\star ) \ge \limsup _{T \rightarrow \infty } {\mathbb {E}}\left[ \int _0^T e^{-rt} h(X_t^\star , Y_t^\star ) \, dt - K \int _{[0,T]} e^{-rt} \, d\zeta _t^\star \right] . \end{aligned}$$

In view of these observations and (62), we can take expectations in (59) and pass to the limit to obtain

$$\begin{aligned} J_{x,y} (\zeta ^\star ) \ge w(x,y) + \limsup _{T \rightarrow \infty } e^{-rT} {\mathbb {E}}\left[ - w(X_T^\star , Y_T^\star ) \right] \ge w(x,y). \end{aligned}$$

This result and the inequality \(v(x,y) \le w(x,y)\) that we have proved above imply that \(v(x,y) = w(x,y)\) and that \(\zeta ^\star \) is optimal. \(\square \)