1 Introduction

Microeconomic theory sets up a solid foundation for the estimation of systems of demand equations. This theory, in its most transparent form, states that such demand equations should be consistent with the maximization of utility subject to a budget constraint. Accordingly, a demand system is deemed to be regular if it satisfies the restrictions imposed by the paradigm of rational consumer choice. In the context of Marshallian demand systems, this means the demand systems expressing quantities demanded as functions of expenditure and prices satisfy the properties of nonnegativity, homogeneity, Engel aggregation, Cournot aggregation, and the symmetry and negative semi-definiteness of the Slutsky matrix (Deaton and Muellbauer 1980a).

To translate these restrictions into empirical application, three approaches may be identified. In the first approach, the demand equations are derived literally by specifying a direct utility function and solving the constrained maximization problem. Although this approach leads to demand systems which satisfy the above regularity conditions by construction, the need to derive analytical solutions to the first order conditions restricts its application to quite specific utility functions. To our knowledge, the derivation and estimation of consumer demand systems from globally regular direct utility functions which satisfy everywhere the neoclassical monotonicity and curvature conditions are restricted to minor variants of Cobb–Douglas and constant elasticity of substitution (CES) forms, and they come at the price of inflexibility. For instance, in a Leontief system the ratios of quantities of commodities are always in fixed proportions, irrespective of price or income; in a Cobb–Douglas system income, own-price and cross-price elasticities are a priori constrained to be unity, minus one and zero, respectively.

The second approach is the Rotterdam methodology, which attempts to impose the regularity restrictions on log-differential approximations to the demand equations. This approach, first proposed by Theil (1965) and Barten (1966), has frequently been used to test the theory. In many ways it is very similar to the method of Stone (1954), but it works in differentials rather than in levels of logarithms. The main question raised by the Rotterdam methodology is whether, given the numerous parameterizations used in applications, the particular parameterization chosen corresponds to a legitimate parameterization of preferences, i.e. of the direct or indirect utility function; in other words, are the functional forms used integrable? Answering this question is not a trivial task, and to our knowledge, it has not yet been properly answered in the literature.

The third approach exploits the theory of duality among direct utility functions, indirect utility functions, and cost functions, and the regularity conditions on these functions which make them equivalent representations of the underlying preferences. Duality theory allows systems of demand equations to be derived from these dual representations via simple differentiation, according to Roy’s identity or Shephard’s lemma. This approach was popularized by Diewert (1974, 1982) and led to the use of flexible functional forms, such as the generalized Leontief of Diewert (1971) and the translog of Christensen et al. (1975). These forms are flexible in that they do not impose any prior restrictions on slopes or elasticities at a point of approximation, and hence, they can potentially be regular at this point. In other words, they possess enough free parameters to attain arbitrary elasticities at a point in price–expenditure space, by providing a second-order approximation to an arbitrary twice continuously differentiable cost or indirect utility function (Diewert and Wales 1987).

However, this flexibility at a point comes at the cost that such systems generally satisfy globally only homogeneity with respect to prices and expenditures and often violate monotonicity and, particularly, curvature restrictions, either within the sample or at points close to the sample. Lau (1986) discusses the characterization of regularity of such systems and shows that the domain of regularity is rather limited.

In an attempt to resolve this conflict, much work has sought to improve regularity, particularly with respect to curvature. Examples include the Fourier expansions used by Gallant (1981, 1984), Elbadawi et al. (1983) and Gallant and Golub (1984), Barnett’s minflex Laurent expansion (Barnett 1983, 1985; Barnett and Lee 1985), the Generalized McFadden and Generalized Barnett cost functions of Diewert and Wales (1987) and the Asymptotically Ideal Model of Barnett and Yue (1988). However, Barnett (2002) shows that without satisfaction of both curvature and monotonicity, the second-order conditions for optimizing behaviour fail, and duality theory fails.

A convenient compromise is the class of “effectively globally regular” demand systems (Cooper and McLaren 1992, 1996; McLaren and Wong 2009). By “effectively globally regular” is meant that there exists a unit cost function (or price index) P(p) such that the regularity properties are satisfied for all expenditure–price combinations satisfying \(c\ge P(p)\), where c indicates expenditure and p represents the vector of prices. Therefore, the regularity region is an unbounded region in price–expenditure space, potentially including all points in the sample, and all points corresponding to higher levels of “real income”. A well-known example of such a system is the linear expenditure system (LES), which is regular over an unbounded region but only for sufficiently high expenditure levels.

This paper, in the spirit of the third approach, introduces a class of demand systems based on simple parametric specifications of the indirect utility functions, but allowing for the parsimonious imposition of global regularity. This class of demand systems follows the steps of the almost ideal demand system (AIDS) due to Deaton and Muellbauer (1980a), the quadratic almost ideal demand system (QUAIDS) due to Banks et al. (1997), the Modified almost ideal demand system (MAIDS) due to Cooper and McLaren (1992, 1996), and a more recent rank-four demand system due to Lewbel (2003). Members of this class can be specified to acquire as large a rank as required for empirical work, following the definition of rank due to Lewbel (1991), which generalizes Gorman’s rank to all demand systems. They also exhibit a clear and valid homothetic asymptotic behaviour, as income approaches infinity. Furthermore, by using unit cost functions such as those suggested by Diewert and Wales (1987), this approach can also allow complete price flexibility.

The layout of the paper is as follows. The parametric representation of the generic indirect utility function in terms of unit cost functions is introduced in Sect. 2, where conditions for global regularity are specified. Section 3 details possible specifications for the unit cost functions and presents four specific examples, two of which are rank two and the rest are rank three. For the purpose of illustration, in Sect. 4, using data from the latest 2009–2010 Australian Household Expenditure Survey (HES), the two rank-two and two rank-three examples presented in Sect. 3 are estimated and compared with their existing counterparts in the literature, namely LES, AIDS, and QUAIDS.

2 The representation of preferences

Let m represent the number of goods, \(p\in \varOmega _+^m \) represent the corresponding vector of prices, and let \(c>0\) represent total expenditure (cost), where \(\varOmega _+^m \) is the positive orthant. Based on duality theory, the approach to specifying demand systems based on dual representation of preferences is typified by AIDS, a rank-two demand system. In particular, the indirect utility function associated with the AIDS can be specified as (see Footnote 1):

$$\begin{aligned} V(c,p)=[\ln (c/P_1 (p))]/P_2 (p) \end{aligned}$$
(1)

where p is an m-vector of commodity prices, c is total expenditure, and \(P_1\) and \(P_2 \) are, respectively, specified as translog and homogeneous of degree zero (HD0) Cobb–Douglas functions of prices. AIDS is a special case of the price-independent generalized logarithmic (PIGLOG) specification, in which \(P_1 \) and \(P_2\) are explicitly specified, and PIGLOG is itself a special case of the price-independent generalized linear (PIGL) specification of Muellbauer (1975).

In spite of its dominance in empirical application, probably due to the very convenient form of its share equations, especially for the linearized AIDS model, AIDS is not globally regular. One may think that the violation of regularity by AIDS arises from the translog component, since it is well known that a translog function cannot be globally regular, and that the imposition of local (sample) regularity is a non-trivial task (Diewert and Wales 1987). However, even if \(P_1 \) were specified along more regular lines following Diewert and Wales (1987), the functional form (1) would still exhibit regularity violations. Briefly, the problem is that \(P_2 \) cannot be simultaneously HD0, non-decreasing, and concave in p. Euler’s theorem rules out the HD0 and non-decreasing combination, and the HD0 and concave combination, for any non-trivial functions. The full details can be found in Cooper and McLaren (1992).

Another popular model is QUAIDS, which is rank three, and extends the AIDS share equations to include a quadratic term of logarithmic real expenditure. The QUAIDS indirect utility function can be specified as:

$$\begin{aligned} V(c,p)=\left\{ \left[ \frac{\ln (c/P_1 (p))}{P_2 (p)}\right] ^{-1}+P_3 (p)\right\} ^{-1} \end{aligned}$$
(2)

where \(P_1 \) and \(P_2 \) are specified in the same way as in AIDS and \(P_3 \) is specified as another HD0 Cobb–Douglas function. QUAIDS nests AIDS as a special case when \(P_3 \equiv 0\). Although it attempts to build consistency with observed Engel curves that require quadratic terms in the logarithm of expenditure, QUAIDS does not gain any better regularity properties than AIDS. In fact, by adding another HD0 Cobb–Douglas function of prices, it makes the checking of regularity conditions even more difficult. The more recent rank-four demand system, due to Lewbel (2003), further extends QUAIDS along the same line by including a third Cobb–Douglas function, this one HD1, and is not regular either. Its indirect utility function can be specified as:

$$\begin{aligned} V(c,p)=\left\{ \left[ \frac{\ln [(c-P_4 (p))/P_1 (p)]}{P_2 (p)}\right] ^{-1}+P_3 (p)\right\} ^{-1} \end{aligned}$$
(3)

where \(P_1 \), \(P_2 \), and \(P_3 \) are specified as in QUAIDS and \(P_4 (p)\) is specified as a HD1 Cobb–Douglas.

By contrast, the class of “effectively globally regular” demand systems improves regularity by proposing an indirect utility function which is comprised of expenditure and unit cost functions, yet possesses an unbounded regularity region in price–expenditure space, potentially including all points in the sample, and all points corresponding to higher levels of “real income”, provided all the component unit cost functions satisfy several sufficient regularity conditions. As an example of this class, the indirect utility function of the Generalized exponential form due to Cooper and McLaren (1996) can be specified as:

$$\begin{aligned} V(c,p)=\frac{\left( \frac{c}{\kappa P_1 (p)}\right) ^{\mu }-1}{\mu }\left( \frac{c}{P_2 (p)}\right) ^{\eta } \end{aligned}$$
(4)

where \(P_1 \) and \(P_2 \) are two unit cost functions satisfying several sufficient regularity conditions, and \(0\le \eta \le 1, \quad \mu \ge -1,\) and \(\kappa >0.\) The corresponding Marshallian demand equations are regular over an unbounded region \(\{(c,p):c>\kappa P_1 (p)\}\). This effectively globally regular demand system is rank two.

Following this line in the literature, we propose a class of demand systems whose associated indirect utility functions are simple and parametric, and for which a simple set of sufficient conditions ensures global regularity, that is, full consistency with the maximization of utility subject to a budget constraint over the entire price–expenditure region. In addition, members of this class of demand systems are fully flexible in rank, that is, they can acquire as large a rank as required for empirical work.

The generic indirect utility function of this class is specified as follows:

$$\begin{aligned} V^{R}(c,p)=\left( \frac{c}{P_0 (p)}\right) ^{\alpha }-\sum _{k=1}^n {\kappa _k \left( \frac{P_k (p)}{c}\right) ^{\beta _k }} \end{aligned}$$
(5)

where parameters \(\alpha \), \(\beta ^{\prime }s\), and \(\kappa ^{\prime }s\) satisfy \(\alpha >0\) and \(0<\beta _k \le 1, \kappa _k >0,k=1,\ldots ,n.\)
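As a minimal sketch of how (5) can be evaluated, the following Python fragment implements \(V^{R}(c,p)\) for arbitrary price-index callables. The Cobb–Douglas and linear indices, the function names, and all parameter values are illustrative assumptions rather than estimates from the paper; the two checks at the end illustrate homogeneity of degree zero in (c, p) and monotonicity in c at one point.

```python
import math

def cobb_douglas(p, gamma):
    """Cobb-Douglas price index: prod p_i^gamma_i, with sum(gamma) = 1."""
    return math.prod(pi ** gi for pi, gi in zip(p, gamma))

def linear_index(p, eta):
    """Linear price index: sum eta_i * p_i, with sum(eta) = 1."""
    return sum(ei * pi for ei, pi in zip(eta, p))

def indirect_utility(c, p, P0, Pk, alpha, kappa, beta):
    """Evaluate V^R(c, p) in (5).

    P0 is a callable price index, Pk a list of n callables,
    kappa and beta are length-n parameter lists.
    """
    v = (c / P0(p)) ** alpha
    for P, k, b in zip(Pk, kappa, beta):
        v -= k * (P(p) / c) ** b
    return v

# Hypothetical parameter values for m = 3 goods and n = 1.
gamma = [0.5, 0.3, 0.2]
eta = [0.2, 0.3, 0.5]
P0 = lambda p: cobb_douglas(p, gamma)
P1 = lambda p: linear_index(p, eta)
alpha, kappa, beta = 0.8, [1.5], [0.6]

p = [1.2, 0.9, 1.1]
c = 10.0
v = indirect_utility(c, p, P0, [P1], alpha, kappa, beta)

# Scaling c and p by the same factor should leave V unchanged (HD0).
v_scaled = indirect_utility(3.0 * c, [3.0 * x for x in p],
                            P0, [P1], alpha, kappa, beta)

# Raising c at fixed prices should raise V (non-decreasing in c).
v_richer = indirect_utility(2.0 * c, p, P0, [P1], alpha, kappa, beta)
```

Passing the price indices as callables keeps the evaluation of (5) independent of any particular specification of the \(P_k (p)\) functions.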

According to duality theory, an indirect utility function V(c, p) is a valid representation of preferences if it satisfies the following regularity conditions: (i) continuous in (c, p), and twice continuously differentiable everywhere except possibly at a set of specific price–expenditure vectors of measure zero; (ii) HD0 in (c, p); (iii) non-increasing in p; (iv) non-decreasing in c; and (v) quasi-convex in p. If V(c, p) satisfies these regularity conditions over the entire positive orthant \(\varOmega _+^{m+1} =\{(c, p): c>0, p>0\}\), V(c, p) is said to be globally regular. If V(c, p) satisfies these regularity conditions over a region \(G\subset \varOmega _+^{m+1} \), then V(c, p) is said to be locally regular. Flexible functional forms such as the translog and generalized Leontief typically have rather restricted locally regular regions, and in particular those regular regions are often bounded from above in the direction of real income.

For the specification in (5), sufficient conditions for global regularity will depend on the properties of the \(P_k (p)\) functions, \(k\in \{0,\ldots ,n\}\). These \(P_k (p)\) functions can be interpreted as unit cost functions or price indices. The properties that a function P(p) should satisfy to qualify as a unit cost function are as follows: (i) P(p) is continuous in p, and twice continuously differentiable almost everywhere; (ii) \(P(p)>0\) for \(p\in \varOmega _+^m \); (iii) P(p) is HD1; (iv) P(p) is non-decreasing in p; (v) P(p) is concave in p; and (vi) \(P(1)=1\).

In Appendix 1, the global regularity of the proposed generic indirect utility function is proved. Specifically, it is shown that provided that its component \(P_k (p)\) functions, \(k\in \{0,\ldots ,n\}\), qualify as unit cost functions (i.e. satisfy all the properties presented in the previous paragraph), this generic indirect utility function \(V^{R}(c,p)\) in (5) satisfies all the regularity conditions of an indirect utility function implied by the maximization of a utility function subject to a budget constraint, over the entire price–expenditure space, and hence, the corresponding Marshallian demand equations are globally regular.

Another interesting characteristic of demand systems is their rank. Gorman (1981) defined the rank of a demand system as the dimension of the space spanned by its Engel curves. Lewbel (1991) extended the definition of rank to non-aggregable systems and also showed that rank is equivalent to the minimum number of price indices in the indirect utility function. Thus, the Cobb–Douglas system is rank one, while, as pointed out above, AIDS is rank two and QUAIDS is rank three. For the proposed generic indirect utility function in (5), it is therefore straightforward to see that the rank of this specification is \(n+1\). As n is an arbitrary integer, this model is fully flexible in rank (i.e. can potentially acquire as large a rank as required in empirical work). Empirically, when the model is applied to a specific data set, the nonparametric procedures proposed by Donald (1997) and Cragg and Donald (1996) can be employed as a pre-specification rank test to determine the rank. One may notice that this model seems to resemble a series expansion approach, such as Barnett’s AIM. However, the ability of this system specification to support arbitrary price functions, and thus arbitrary rank, distinguishes it from a series expansion approach. Moreover, as expenditure c goes to infinity, the first part of the indirect utility function \((c/P_0 (p))^{\alpha }\) dominates, and thus, the rank of the corresponding demand system degenerates to one. In other words, as income goes to infinity, the preferences become homothetic, which can be regarded as a fairly reasonable assumption about asymptotic (in c) consumption behaviour.

Demand equations are most easily represented in share form. Application of Roy’s Identity to (5) gives the associated share equations as:

$$\begin{aligned} W_i^R (c,p)= & {} p_i \frac{\sum \nolimits _{k=0}^n {R_k (c,p)\textit{DP}_{ki} (p)} }{\sum \nolimits _{k=0}^n {R_k (c,p)P_k (p)} }=\frac{\sum \nolimits _{k=0}^n {R_k (c,p)P_k (p)EP_{ki} (p)} }{\sum \nolimits _{k=0}^n {R_k (c,p)P_k (p)} }\nonumber \\= & {} \sum \limits _{k=0}^n {w_k EP_{ki} (p)} , i=1,\ldots ,m \end{aligned}$$
(6)

where \(R_0 (c,p)=\alpha c^{\alpha }P_0 (p)^{-\alpha -1}\); \(R_k (c,p)=\kappa _k \beta _k c^{-\beta _k }P_k (p)^{\beta _k -1}\), \(k=1,\ldots ,n\); \(\textit{DP}_{ki} =\partial P_k /\partial p_i \), \(k=0,\ldots ,n\); \(p_i \) is the price of good i; \(EP_{ki} (p)=\partial \ln P_k (p)/\partial \ln p_i \), \(k=0,\ldots ,n\) (the elasticity of price index k with respect to the price of good i); and \(w_k ={R_k (c,p)P_k (p)}\big /{\sum \nolimits _{j=0}^n {R_j (c,p)P_j (p)} }\), \(k=0,\ldots ,n\).
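The step from (5) to (6) can be sketched as follows, using only the definitions of \(R_k \) and \(\textit{DP}_{ki} \) given above. Differentiating (5) with respect to \(p_i \) and c gives

$$\begin{aligned} \frac{\partial V^{R}}{\partial p_i }=-\sum _{k=0}^n {R_k (c,p)\textit{DP}_{ki} (p)} ,\qquad \frac{\partial V^{R}}{\partial c}=\frac{1}{c}\sum _{k=0}^n {R_k (c,p)P_k (p)} , \end{aligned}$$

so that Roy’s identity yields the Marshallian demand for good i as \(-(\partial V^{R}/\partial p_i )/(\partial V^{R}/\partial c)=c\sum \nolimits _k {R_k \textit{DP}_{ki} } /\sum \nolimits _k {R_k P_k } \). Multiplying by \(p_i /c\) gives the first expression in (6), and the second follows from \(p_i \textit{DP}_{ki} (p)=P_k (p)EP_{ki} (p)\).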

From (6), it can be seen that in cases where Engel rank is less than or equal to two, the share equations resemble the general share functional form of Lewbel’s fractional demand systems in Lewbel (1987). Lewbel noted that fractional demands provide a parsimonious way of increasing the range of Engel curve responses and conjectured that they have enhanced regularity properties. System (6) demonstrates how such global regularity properties can be imposed by restricting the component functions of prices to satisfy properties other than just homogeneity.

It is noteworthy that (6) expresses a share, which is naturally bounded to the unit interval, as a weighted average of \(n+1\) functions of prices \(EP_{ki} (p)\) which are themselves bounded to the unit interval (the elasticities of non-decreasing, homogeneous of degree one, price indices), with weights \(w_k \), \(k=0,\ldots ,n\). It can be seen that provided \(\alpha >0\), \(0<\beta _k \le 1, \quad \kappa _k >0,\) for \(k=1,\ldots ,n\), and \(P_k (p)>0\), for \(k=0,\ldots ,n\), all the weights \(w_k\) (\(k=0,\ldots ,n\)) are positive. From the definition of \(w_k \), \(k=0,\ldots ,n\), it is also straightforward to see that these positive weights sum to one, so that each of them is bounded to the unit interval. Therefore, over the entire price–expenditure space, the right-hand side of (6) is guaranteed to be within the unit interval. This property is particularly important for applied work, since policy evaluations are often implemented towards the end of the sample or post-sample, at higher levels of real expenditure.
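The weighted-average structure of (6) can be illustrated numerically. The sketch below computes the weights \(w_k \) from the products \(R_k P_k \) and forms the shares at a fixed price vector for several expenditure levels. The index values, elasticity vectors, and parameters are hypothetical numbers chosen only so that each elasticity vector lies in the unit interval and sums to one, as Euler’s theorem requires for an HD1 index.

```python
def weights(c, P, alpha, kappa, beta):
    """Weights w_k in (6), given index values P = [P_0, ..., P_n].

    Uses R_0 * P_0 = alpha * (c / P_0)^alpha and
    R_k * P_k = kappa_k * beta_k * (P_k / c)^beta_k for k >= 1.
    """
    RP = [alpha * (c / P[0]) ** alpha]
    RP += [k * b * (Pk / c) ** b for Pk, k, b in zip(P[1:], kappa, beta)]
    total = sum(RP)
    return [x / total for x in RP]

def shares(w, EP):
    """Shares W_i = sum_k w_k * EP[k][i], the last expression in (6)."""
    m = len(EP[0])
    return [sum(wk * EPk[i] for wk, EPk in zip(w, EP)) for i in range(m)]

# Hypothetical index values and elasticities at one price vector (n = 2, m = 3).
P = [1.05, 0.98, 1.10]
EP = [[0.5, 0.3, 0.2],
      [0.2, 0.3, 0.5],
      [0.1, 0.6, 0.3]]
alpha, kappa, beta = 0.8, [1.5, 0.7], [0.6, 1.0]

# Shares at very low, moderate, and very high expenditure levels.
results = {c: shares(weights(c, P, alpha, kappa, beta), EP)
           for c in (0.01, 1.0, 100.0, 1e6)}
```

In this sketch the shares stay inside the unit interval at every expenditure level tried, because they are convex combinations of the rows of `EP`.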

By contrast, the share equations of AIDS, i.e. \(W_i =EP_{1i} +EP_{2i} \ln (c/P_1 )\), and those of QUAIDS, i.e. \(W_i =EP_{1i} +EP_{2i} \ln (c/P_1 )+EP_{3i} \cdot (P_3 /P_2 )\cdot [\ln (c/P_1 )]^{2}\), will necessarily violate the unit interval, as real expenditure grows, which, from one aspect, demonstrates their inability to be globally regular (Banks et al. 1997). Some other examples of fractional demand systems (Barnett and Jonas 1983; Cooper and McLaren 1996) have share equations satisfying the zero-to-one range, whereas none with rank higher than two have been specified and implemented empirically.
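A small numerical sketch shows how the AIDS share equation leaves the unit interval as expenditure grows. It evaluates \(W_i =EP_{1i} +EP_{2i} \ln (c/P_1 )\) at a fixed price vector. The elasticity values are hypothetical, and the normalization \(P_1 =1\) at that price vector is an assumption made purely for illustration.

```python
import math

def aids_shares(c, EP1, EP2, P1=1.0):
    """AIDS shares W_i = EP_1i + EP_2i * ln(c / P_1) at a fixed price vector."""
    return [a + b * math.log(c / P1) for a, b in zip(EP1, EP2)]

# Hypothetical elasticities: EP1 sums to one (P_1 is HD1),
# EP2 sums to zero (P_2 is HD0), so the shares always add up to one.
EP1 = [0.5, 0.3, 0.2]
EP2 = [0.05, -0.02, -0.03]

low = aids_shares(10.0, EP1, EP2)     # all shares inside [0, 1]
high = aids_shares(1000.0, EP1, EP2)  # third share turns negative

# Expenditure level at which the third share reaches zero.
c_star = math.exp(-EP1[2] / EP2[2])
```

Because the shares are linear in \(\ln c\), any good with a non-zero \(EP_{2i} \) must eventually cross zero or one, whatever the parameter values.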

Let \(Q_i^R (c,p)\) denote the Marshallian demand equations, and thus, \(Q_i^R ={W_i^R \cdot c}/{p_i }\). For this generic system (6), expenditure elasticities are given by:

$$\begin{aligned} E_i= & {} {\partial \ln Q_i^R } /{\partial \ln c}\nonumber \\= & {} 1+ c\frac{\left( \sum \nolimits _k {D_c R_k \textit{DP}_{ki} } \right) \left( \sum \nolimits _k {R_k P_k } \right) -\left( \sum \nolimits _k {R_k \textit{DP}_{ki} } \right) \left( \sum \nolimits _k {D_c R_k P_k } \right) }{\left( \sum \nolimits _k {R_k \textit{DP}_{ki} } \right) \left( \sum \nolimits _k {R_k P_k } \right) },\nonumber \\&\quad i=1,\ldots ,m \end{aligned}$$
(7)

where \(D_c R_k ={\partial R_k }/{\partial c}\) and \(\textit{DP}_{ki} =\partial P_k /\partial p_i \), and a typical term of the Slutsky matrix can be expressed as:

$$\begin{aligned} S_{ij}= & {} c \frac{\left( \sum \nolimits _k {D_i R_k \textit{DP}_{ki} } +\sum \nolimits _k {R_k \textit{DP}_{ki}^2 } \right) \left( \sum \nolimits _k {R_k P_k } \right) -\left( \sum \nolimits _k {R_k \textit{DP}_{ki} } \right) \left( \sum \nolimits _k {D_i R_k P_k } +\sum \nolimits _k {R_k \textit{DP}_{ki} } -\sum \nolimits _k {R_k \textit{DP}_{kj} } \right) }{\left( \sum \nolimits _k {R_k P_k } \right) ^{2}}\nonumber \\&+\,c^{2} \frac{\left( \sum \nolimits _k {R_k \textit{DP}_{kj} } \right) \left[ \left( \sum \nolimits _k {D_c R_k \textit{DP}_{ki} } \right) \left( \sum \nolimits _k {R_k P_k } \right) -\left( \sum \nolimits _k {R_k \textit{DP}_{ki} } \right) \left( \sum \nolimits _k {D_c R_k P_k } \right) \right] }{\left( \sum \nolimits _k {R_k P_k } \right) ^{3}},\quad i=1,\ldots ,m \end{aligned}$$
(8)

where \(D_i R_k ={\partial R_k }/{\partial p_i }\), \(D_c R_k ={\partial R_k }/{\partial c}\), \(\textit{DP}_{ki} =\partial P_k /\partial p_i \) and \(\textit{DP}_{ki}^2 ={\partial ^{2}P_k }/{\partial p_i^2 }\).
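The expenditure elasticities in (7) can be checked against Engel aggregation, \(\sum \nolimits _i {W_i E_i } =1\). The sketch below does this for a two-index case, assuming a Cobb–Douglas \(P_0 \) and a linear \(P_1 \) with hypothetical parameter values. For these forms \(D_c R_0 =\alpha R_0 /c\) and \(D_c R_1 =-\beta _1 R_1 /c\), which follow directly from the definitions of \(R_0 \) and \(R_k \).

```python
import math

def shares_and_elasticities(c, p, gamma, eta, alpha, kappa1, beta1):
    """Shares from (6) and expenditure elasticities from (7) for n = 1,
    assuming Cobb-Douglas P_0 and linear P_1 (illustrative choice)."""
    P0 = math.prod(pi ** gi for pi, gi in zip(p, gamma))
    P1 = sum(ei * pi for ei, pi in zip(eta, p))
    P = [P0, P1]
    R = [alpha * c ** alpha * P0 ** (-alpha - 1.0),
         kappa1 * beta1 * c ** (-beta1) * P1 ** (beta1 - 1.0)]
    DcR = [alpha * R[0] / c, -beta1 * R[1] / c]          # dR_k / dc
    DP = [[gi * P0 / pi for gi, pi in zip(gamma, p)],    # dP_0 / dp_i
          list(eta)]                                     # dP_1 / dp_i
    sRP = sum(r * pk for r, pk in zip(R, P))
    sDcRP = sum(d * pk for d, pk in zip(DcR, P))
    W, E = [], []
    for i in range(len(p)):
        sRDP = sum(R[k] * DP[k][i] for k in range(2))
        sDcRDP = sum(DcR[k] * DP[k][i] for k in range(2))
        W.append(p[i] * sRDP / sRP)
        E.append(1.0 + c * (sDcRDP * sRP - sRDP * sDcRP) / (sRDP * sRP))
    return W, E

# Hypothetical parameter values.
W, E = shares_and_elasticities(10.0, [1.2, 0.9, 1.1],
                               [0.5, 0.3, 0.2], [0.2, 0.3, 0.5],
                               0.8, 1.5, 0.6)
engel_sum = sum(w * e for w, e in zip(W, E))
```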

3 Specification of unit cost functions

Recall that m represents the number of goods, \(p\in \varOmega _+^m \) represents the corresponding vector of prices, and \(c>0\) represents total expenditure (cost), where \(\varOmega _+^m \) is the positive orthant. The globally regular indirect utility function can be made operational by specifying functional forms for the unit cost \(P_k (p)\) functions, \(k\in \{0,\ldots ,n\}\). As proved in Appendix 1, global regularity will be assured if these \(P_k (p)\) functions are chosen to satisfy the properties of a unit cost function P(p): P(p) is continuous in p, and twice continuously differentiable almost everywhere; \(P(p)>0\) for \(p\in \varOmega _+^m \); P(p) is HD1; P(p) is non-decreasing in p; P(p) is concave in p; and \(P(1)=1\).

One may think that the specification of regular unit cost functions raises many of the same difficulties as the specification of a regular indirect utility function itself. Nevertheless, one obvious advantage of the specification of unit cost functions is that the testing for, or imposition of, concavity is usually more straightforward than the testing or imposition of quasi-convexity. A second advantage is that it is well known that positive linear combinations of positive non-decreasing concave functions are positive non-decreasing concave functions, and a non-decreasing concave transformation of a non-decreasing concave function is still a non-decreasing concave function. Following these properties, new valid and possibly complex unit cost functions can be constructed from known simple ones, substantially extending the latitude of choice. In contrast, quasi-convexity is not preserved when taking linear combinations.

In general, the choice of unit cost functions involves the usual trade-off between regularity and flexibility. To allow complete price flexibility, one unit cost function could be specified as translog or more regular alternatives, such as the Generalized McFadden and the Generalized Barnett, introduced by Diewert and Wales (1987). Even though some regularity is sacrificed in this price flexible specification of our model, it is still inherently more regular than its existing price flexible counterparts, such as AIDS or QUAIDS. From the comparison in the next section, using the 2009–2010 Australian HES data, it will be seen that the price flexible specification of this model outperforms its existing price flexible counterparts in the literature.

For a rank-two specification of this model, a set of obvious and parsimonious initial representations of the unit cost functions are the linear and Cobb–Douglas specifications which are also used in LES. The details of this choice are presented below as Model 1. To allow for complete price flexibility, one of the unit cost functions can be specified using one of the flexible functions suggested by Diewert and Wales (1987), which can also be constrained to satisfy curvature conditions globally.

Accordingly, Model 2 below is based on using the Generalized McFadden as the specification for one of the two unit cost functions in a rank-two example of our model. Since \(P_0 (p)\) describes asymptotic behaviour, while \(P_1 (p)\) can be interpreted as describing the local behaviour of a particular sample, it seems natural to consider the specification of \(P_1 (p)\) using a flexible functional form. In Model 3, a rank-three example, \(P_0 (p)\) is specified as Cobb–Douglas, and \(P_1 (p)\) and \(P_2 (p)\) are specified as CES, which nests the Cobb–Douglas and linear specifications as special cases, while in Model 4, in order to allow comparability with QUAIDS, \(P_1 (p)\) and \(P_2 (p)\) are, respectively, specified as Cobb–Douglas and Generalized McFadden.

Model 1: Cobb–Douglas \(P_0 (p)\) and linear \(P_1 (p)\):

$$\begin{aligned}&P_0 (p)=\prod _{i=1}^m {p_i ^{\gamma _i }},\quad \sum _{i=1}^m {\gamma _i } =1,\quad \gamma _i \ge 0, \quad i=1,\ldots ,m \end{aligned}$$
(9)
$$\begin{aligned}&P_1 (p)=\sum _{i=1}^m {\eta _i p_i },\quad \sum _{i=1}^m {\eta _i } =1,\quad \eta _i \ge 0, \quad i=1,\ldots ,m . \end{aligned}$$
(10)

Hence, the specific indirect utility function is as follows:

$$\begin{aligned} V^{R}(c,p)=\left( \frac{c}{P_0 (p)}\right) ^{\alpha }-\kappa _1 \left( \frac{P_1 (p)}{c}\right) ^{\beta _1 }, \qquad \alpha >0, 0<\beta _1 \le 1, \kappa _1 >0. \end{aligned}$$
(11)

The restrictions on parameters \(\gamma _i \ge 0,\eta _i \ge 0, i=1,\ldots ,m, \sum \nolimits _{i=1}^m {\gamma _i } =1\) and \( \sum \nolimits _{i=1}^m {\eta _i } =1\) are sufficient to ensure that \(P_0 \) and \(P_1 \) globally satisfy all the properties of a valid unit cost function (see Footnotes 2 and 3), and hence, the corresponding share equations:

$$\begin{aligned} W_i =\frac{\gamma _i R_0 P_0 +\eta _i p_i R_1 }{R_0 P_0 +R_1 P_1 }, \quad i=1,\ldots ,m \end{aligned}$$
(12)

with

$$\begin{aligned} R_0 (c,p)=\alpha c^{\alpha }P_0 (p)^{-\alpha -1},\quad R_1 (c,p)=\kappa _1 \beta _1 c^{-\beta _1 }P_1 (p)^{\beta _1 -1} \end{aligned}$$
(13)

constitute a rank-two globally regular demand system, which can thus be called RDS2.
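As a sketch of how RDS2 could be coded, the fragment below implements the indirect utility function (11) and the share equations (12)–(13), and compares the analytic shares with shares obtained from Roy’s identity by central finite differences. The parameter values, function names, and step size are illustrative assumptions. The last line evaluates the shares at a very high expenditure level, where they should approach the Cobb–Douglas weights \(\gamma _i \).

```python
import math

def rds2_utility(c, p, gamma, eta, alpha, kappa1, beta1):
    """Indirect utility (11) with Cobb-Douglas P_0 (9) and linear P_1 (10)."""
    P0 = math.prod(pi ** gi for pi, gi in zip(p, gamma))
    P1 = sum(ei * pi for ei, pi in zip(eta, p))
    return (c / P0) ** alpha - kappa1 * (P1 / c) ** beta1

def rds2_shares(c, p, gamma, eta, alpha, kappa1, beta1):
    """Share equations (12), with R_0 and R_1 as in (13)."""
    P0 = math.prod(pi ** gi for pi, gi in zip(p, gamma))
    P1 = sum(ei * pi for ei, pi in zip(eta, p))
    R0 = alpha * c ** alpha * P0 ** (-alpha - 1.0)
    R1 = kappa1 * beta1 * c ** (-beta1) * P1 ** (beta1 - 1.0)
    denom = R0 * P0 + R1 * P1
    return [(g * R0 * P0 + e * pi * R1) / denom
            for g, e, pi in zip(gamma, eta, p)]

def roy_shares(c, p, params, h=1e-6):
    """Shares W_i = -p_i V_{p_i} / (c V_c), via central finite differences."""
    V = lambda cc, pp: rds2_utility(cc, pp, *params)
    dV_dc = (V(c + h, p) - V(c - h, p)) / (2.0 * h)
    out = []
    for i in range(len(p)):
        up, dn = list(p), list(p)
        up[i] += h
        dn[i] -= h
        dV_dpi = (V(c, up) - V(c, dn)) / (2.0 * h)
        out.append(-p[i] * dV_dpi / (c * dV_dc))
    return out

# Hypothetical parameters: (gamma, eta, alpha, kappa_1, beta_1).
params = ([0.5, 0.3, 0.2], [0.2, 0.3, 0.5], 0.8, 1.5, 0.6)
p, c = [1.2, 0.9, 1.1], 10.0

analytic = rds2_shares(c, p, *params)
numeric = roy_shares(c, p, params)
far = rds2_shares(1e12, p, *params)  # approaches gamma as c grows
```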

Model 2: Cobb–Douglas \(P_0 (p)\) and the Generalized McFadden \(P_1 (p)\):

To overcome the problems of imposing curvature conditions on popular flexible functional forms such as the translog and the generalized Leontief, Diewert and Wales (1987) proposed a number of flexible functional forms which are more amenable to the imposition and testing of curvature conditions. In this example, we will use the Generalized McFadden defined for a time invariant unit cost function by:

$$\begin{aligned} P_1 (p)=(1/2)p_m^{-1} \sum _{i=1}^{m-1} {\sum _{j=1}^{m-1} {c_{ij} p_i p_j } } +\sum _{i=1}^m {b_i p_i } \end{aligned}$$
(14)

where \(c_{ij} =c_{ji} \). The Hessian matrix of \(P_1 (p)\) will be negative semi-definite, and hence, \(P_1 (p)\) will be concave, for all \(p\in \varOmega _+^m \), if and only if C, which is defined as the \((m-1)\times (m-1)\) matrix of \(c_{ij}^{\prime }s\), is negative semi-definite. Accordingly, the global concavity of \(P_1 (p)\) can be easily tested. If, after an unconstrained estimation procedure, \(P_1 (p)\) turns out not to be concave, it is then relatively straightforward to impose concavity on C by means of a technique introduced by Wiley et al. (1973), which re-parameterizes the matrix C such that \(C=-AA^{T}\) where \(A=[a_{ij} ]\); \(a_{ij} =0\) for \(i<j\); \(i, j=1,\ldots ,m-1\).
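The re-parameterization \(C=-AA^{T}\) can be sketched in a few lines. The helper names and the entries of A below are hypothetical; in estimation, the free parameters would be the lower-triangular elements \(a_{ij} \) rather than the \(c_{ij} \) themselves.

```python
def neg_semidefinite_from_lower(A):
    """Return C = -A A^T for a lower-triangular A given as a list of rows."""
    n = len(A)
    return [[-sum(A[i][k] * A[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

def quad_form(C, x):
    """Quadratic form x^T C x."""
    n = len(x)
    return sum(x[i] * C[i][j] * x[j] for i in range(n) for j in range(n))

# Hypothetical lower-triangular A for m - 1 = 2 (a_ij = 0 for i < j).
A = [[0.3, 0.0],
     [-0.1, 0.2]]
C = neg_semidefinite_from_lower(A)

# x^T C x = -||A^T x||^2, so it should be non-positive for any x.
test_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -3.0]]
forms = [quad_form(C, x) for x in test_vectors]
```

By construction C is symmetric and negative semi-definite for any real values of the \(a_{ij} \), so concavity holds without inequality constraints on the parameters.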

In Model 2, the functional form (14) is used for \(P_1 (p)\) in (11), and \(P_0 (p)\) is maintained to be Cobb–Douglas as in RDS2. This parameterization will generate an indirect utility function which is potentially globally convex, and thus quasi-convex, i.e. the Slutsky matrix of the implied demand system is globally negative semi-definite. However, although (14) achieves full price flexibility and concavity in p, \(P_1 (p)\) fails to satisfy all the properties of a valid unit cost function, in particular monotonicity and nonnegativity (Cooper et al. 1994).

Specifying \(P_0 (p)\) to be Cobb–Douglas as in (9), and \(P_1 (p)\) as in (14), the corresponding share equations are as follows:

$$\begin{aligned} W_i =p_i \frac{R_0 \textit{DP}_{0i} +R_1 \textit{DP}_{1i} }{R_0 P_0 +R_1 P_1 }, \quad i=1,\ldots ,m \end{aligned}$$
(15)

with

$$\begin{aligned} R_0 (c,p)= & {} \alpha c^{\alpha }P_0 (p)^{-\alpha -1},\quad R_1 (c,p)=\kappa _1 \beta _1 c^{-\beta _1 }P_1 (p)^{\beta _1 -1}, \nonumber \\ \textit{DP}_{0i} (p)= & {} \frac{\gamma _i }{p_i }P_0 (p),\quad i=1,\ldots ,m \nonumber \\ \textit{DP}_{1i} (p)= & {} \left\{ {\begin{array}{l} p_m^{-1} \sum \nolimits _{j=1}^{m-1} {c_{ji} p_j } +b_i ,\quad i=1,\ldots ,m-1 \\ -\frac{1}{2}p_m^{-2} \sum \nolimits _{j=1}^{m-1} {\sum \nolimits _{l=1}^{m-1} {c_{jl} p_j p_l } } +b_m ,\quad i=m \\ \end{array}} \right. \end{aligned}$$
(16)

and constitute a rank-two flexible demand system, which is called RDS2_M. Note that this model can be regarded as a counterpart of the AIDS model, as AIDS employs similar specifications for its component price functions and is also a rank-two demand system.

The specification of \(P_1 (p)\) has to be normalized to satisfy \(P_1 (1)=1\), ensuring that \(P_1 (p)\) has the same base as the component relative prices. Specifically, the normalization is as follows:

$$\begin{aligned} \sum _{i=1}^m {b_i } +\frac{1}{2}\sum _{i=1}^{m-1} {\sum _{j=1}^{m-1} {c_{ij} } } =1. \end{aligned}$$
(17)
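A minimal sketch of the Generalized McFadden unit cost function (14) and its gradient is given below, with hypothetical values of C and b chosen to satisfy the normalization (17). The derivative with respect to \(p_m \) is obtained by differentiating (14) directly, which gives a factor \(p_m^{-2} \) on the quadratic term. Since (14) is HD1, Euler’s theorem requires \(\sum \nolimits _i {p_i \textit{DP}_{1i} (p)} =P_1 (p)\), which serves as a check on the gradient.

```python
def gm_unit_cost(p, Cmat, b):
    """Generalized McFadden unit cost function (14).

    Cmat is the symmetric (m-1) x (m-1) matrix of c_ij; b has length m.
    """
    m = len(p)
    quad = sum(Cmat[i][j] * p[i] * p[j]
               for i in range(m - 1) for j in range(m - 1))
    return 0.5 * quad / p[m - 1] + sum(bi * pi for bi, pi in zip(b, p))

def gm_gradient(p, Cmat, b):
    """Partial derivatives of (14) with respect to each price."""
    m = len(p)
    quad = sum(Cmat[i][j] * p[i] * p[j]
               for i in range(m - 1) for j in range(m - 1))
    grad = [sum(Cmat[j][i] * p[j] for j in range(m - 1)) / p[m - 1] + b[i]
            for i in range(m - 1)]
    # Derivative with respect to the last price p_m.
    grad.append(-0.5 * quad / p[m - 1] ** 2 + b[m - 1])
    return grad

# Hypothetical parameters for m = 3; Cmat is negative semi-definite.
Cmat = [[-0.09, 0.03],
        [0.03, -0.05]]
b = [0.40, 0.34, 0.30]   # sum(b) + 0.5 * sum(c_ij) = 1.04 - 0.04 = 1

p = [1.2, 0.9, 1.1]
value = gm_unit_cost(p, Cmat, b)
grad = gm_gradient(p, Cmat, b)
euler = sum(pi * gi for pi, gi in zip(p, grad))   # should equal value
base = gm_unit_cost([1.0, 1.0, 1.0], Cmat, b)     # should equal 1 under (17)
```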

Model 3: Cobb–Douglas \(P_0 (p)\), and CES \(P_1 (p)\) and \(P_2 (p)\):

$$\begin{aligned} P_0 (p)= & {} \prod _{i=1}^m {p_i ^{\gamma _i }} , \quad \sum _{i=1}^m {\gamma _i } =1, \quad \gamma _i \ge 0,\quad i=1,\ldots ,m \end{aligned}$$
(18)
$$\begin{aligned} P_1 (p)= & {} \left[ \sum _{i=1}^m {\tau _i p_i ^{-\rho }} \right] ^{-1/\rho },\quad \sum _{i=1}^m {\tau _i } =1,\quad \tau _i \ge 0,\quad i=1,\ldots ,m,\quad \rho \in [0,1] \end{aligned}$$
(19)
$$\begin{aligned} P_2 (p)= & {} \left[ \sum _{i=1}^m {\eta _i p_i ^{\rho }} \right] ^{1/\rho }, \quad \sum _{i=1}^m {\eta _i } =1,\quad \eta _i \ge 0,\quad i=1,\ldots ,m,\quad \rho \in [0,1] .\nonumber \\ \end{aligned}$$
(20)

Hence, the associated indirect utility function is as follows:

$$\begin{aligned} V^{R}(c,p)= & {} \left( \frac{c}{P_0 (p)}\right) ^{\alpha }-\kappa _1 \left( \frac{P_1 (p)}{c}\right) ^{\beta _1 }-\kappa _2 \left( \frac{P_2 (p)}{c}\right) ^{\beta _2 }, \nonumber \\&\quad \alpha >0; 0<\beta _k \le 1, \kappa _k >0,\quad k=1,2. \end{aligned}$$
(21)

The restrictions on parameters \(\gamma _i \), \(\tau _i \), \(\eta _i \) in (18)–(20) are sufficient to ensure \(P_0 \), \(P_1 \), and \(P_2 \) globally satisfy all the properties of a valid unit cost function (see Footnotes 4 and 5), and hence, the corresponding share equations:

$$\begin{aligned} W_i =p_i \frac{R_0 \textit{DP}_{0i} +R_1 \textit{DP}_{1i} +R_2 \textit{DP}_{2i} }{R_0 P_0 +R_1 P_1 +R_2 P_2 },\quad i=1,\ldots ,m \end{aligned}$$
(22)

with

$$\begin{aligned} R_0 (c,p)= & {} \alpha c^{\alpha }P_0 (p)^{-\alpha -1}, \quad R_1 (c,p)=\kappa _1 \beta _1 c^{-\beta _1 }P_1 (p)^{\beta _1 -1}, \nonumber \\ R_2 (c,p)= & {} \kappa _2 \beta _2 c^{-\beta _2 }P_2 (p)^{\beta _2 -1}\nonumber \\ \textit{DP}_{0i} (p)= & {} \frac{\gamma _i }{p_i }P_0 (p),\quad i=1,\ldots ,m \nonumber \\ \textit{DP}_{1i} (p)= & {} \tau _i p_i^{-\rho -1} \left[ \sum _{j=1}^m {\tau _j p_j ^{-\rho }} \right] ^{-1/\rho -1}, \quad i=1,\ldots ,m \nonumber \\ \textit{DP}_{2i} (p)= & {} \eta _i p_i^{\rho -1} \left[ \sum _{j=1}^m {\eta _j p_j ^{\rho }} \right] ^{1/\rho -1},\quad i=1,\ldots ,m \end{aligned}$$
(23)

constitute a rank-three globally regular demand system, which can thus be called RDS3.
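As an illustration only, the share equations (22) with the components (18)–(20) and (23) can be evaluated numerically as in the following minimal Python sketch. The function name `rds3_shares` and all parameter values are hypothetical choices made for this example, not estimates from the paper; the sketch assumes \(\rho >0\) strictly, since (19) and (20) are not defined at \(\rho =0\) as written.

```python
import math

def rds3_shares(c, p, gamma, tau, eta, rho, alpha, kappa, beta):
    """Budget shares of Eq. (22) under the RDS3 specification (18)-(21).

    kappa and beta are pairs (kappa_1, kappa_2) and (beta_1, beta_2).
    Assumes rho > 0 strictly.
    """
    # Component price functions, Eqs. (18)-(20)
    P0 = math.prod(pi ** g for pi, g in zip(p, gamma))
    S1 = sum(t * pi ** (-rho) for t, pi in zip(tau, p))
    S2 = sum(e * pi ** rho for e, pi in zip(eta, p))
    P1 = S1 ** (-1.0 / rho)
    P2 = S2 ** (1.0 / rho)
    # Weights R_k and price derivatives DP_ki, Eq. (23)
    R0 = alpha * c ** alpha * P0 ** (-alpha - 1.0)
    R1 = kappa[0] * beta[0] * c ** (-beta[0]) * P1 ** (beta[0] - 1.0)
    R2 = kappa[1] * beta[1] * c ** (-beta[1]) * P2 ** (beta[1] - 1.0)
    DP0 = [g / pi * P0 for g, pi in zip(gamma, p)]
    DP1 = [t * pi ** (-rho - 1.0) * S1 ** (-1.0 / rho - 1.0) for t, pi in zip(tau, p)]
    DP2 = [e * pi ** (rho - 1.0) * S2 ** (1.0 / rho - 1.0) for e, pi in zip(eta, p)]
    denom = R0 * P0 + R1 * P1 + R2 * P2
    return [pi * (R0 * d0 + R1 * d1 + R2 * d2) / denom
            for pi, d0, d1, d2 in zip(p, DP0, DP1, DP2)]

# Hypothetical inputs for three goods (illustrative values only)
example_shares = rds3_shares(
    c=50.0, p=[1.2, 0.8, 1.5],
    gamma=[0.5, 0.3, 0.2], tau=[0.2, 0.5, 0.3], eta=[0.3, 0.3, 0.4],
    rho=0.5, alpha=0.7, kappa=(1.0, 2.0), beta=(0.5, 1.0),
)
```

Because each \(P_k \) is homogeneous of degree one, \(\sum _i p_i \textit{DP}_{ki} =P_k \), so the computed shares add up to one, and under the stated sign restrictions every term is nonnegative.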

Model 4: Cobb–Douglas \(P_0 (p)\) and \(P_1 (p)\), and the Generalized McFadden \(P_2 (p)\):

$$\begin{aligned} P_0 (p)= & {} \prod _{i=1}^m {p_i ^{\gamma _i }} , \quad \sum _{i=1}^m {\gamma _i } =1, \gamma _i \ge 0, \quad i=1,\ldots ,m \end{aligned}$$
(24)
$$\begin{aligned} P_1 (p)= & {} \prod _{i=1}^m {p_i ^{\tau _i }} , \quad \sum _{i=1}^m {\tau _i } =1, \tau _i \ge 0, \quad i=1,\ldots ,m \end{aligned}$$
(25)
$$\begin{aligned} P_2 (p)= & {} (1/2)p_m^{-1} \sum _{i=1}^{m-1} {\sum _{j=1}^{m-1} {c_{ij} p_i p_j } } +\sum _{i=1}^m {b_i p_i } ,\quad c_{ij} =c_{ji} ,\nonumber \\&\sum _{i=1}^m {b_i } +\frac{1}{2}\sum _{i=1}^{m-1} {\sum _{j=1}^{m-1} {c_{ij} } } =1. \end{aligned}$$
(26)

This model, called RDS3_M, is a rank-three flexible demand system and can be regarded as a counterpart of the QUAIDS model, as QUAIDS is also rank three and uses similar specifications for its component price functions. The corresponding share equations are as follows:

$$\begin{aligned} W_i =p_i \frac{R_0 \textit{DP}_{0i} +R_1 \textit{DP}_{1i} +R_2 \textit{DP}_{2i} }{R_0 P_0 +R_1 P_1 +R_2 P_2 },\quad i=1,\ldots ,m \end{aligned}$$
(27)

with

$$\begin{aligned} R_0 (c,p)= & {} \alpha c^{\alpha }P_0 (p)^{-\alpha -1},\quad R_1 (c,p)=\kappa _1 \beta _1 c^{-\beta _1 }P_1 (p)^{\beta _1 -1}, \nonumber \\ R_2 (c,p)= & {} \kappa _2 \beta _2 c^{-\beta _2 }P_2 (p)^{\beta _2 -1}, \nonumber \\ \textit{DP}_{0i} (p)= & {} \frac{\gamma _i }{p_i }P_0 (p),\quad i=1,\ldots ,m \nonumber \\ \textit{DP}_{1i} (p)= & {} \frac{\tau _i }{p_i }P_1 (p), \quad i=1,\ldots ,m \nonumber \\ \textit{DP}_{2i} (p)= & {} \left\{ {\begin{array}{l} p_m^{-1} \sum \nolimits _{j=1}^{m-1} {c_{ji} p_j } +b_i ,\quad i=1,\ldots ,m-1 \\ -\frac{1}{2}p_m^{-2} \sum \nolimits _{k=1}^{m-1} {\sum \nolimits _{l=1}^{m-1} {c_{kl} p_k p_l } } +b_m ,\quad i=m. \\ \end{array}} \right. \end{aligned}$$
(28)
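The Generalized McFadden component (26) and its price derivatives can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the names `gm_price` and `gm_gradient` and the coefficient values are hypothetical, and the gradient is obtained by differentiating (26) directly with respect to each price. The matrix of \(c_{ij} \) is chosen negative semi-definite in the example, which is one way of obtaining a concave \(P_2 \).

```python
def gm_price(p, b, C):
    """Generalized McFadden unit cost function, Eq. (26).

    p, b have length m; C is the symmetric (m-1) x (m-1) matrix [c_ij].
    The last good m is the numeraire in the quadratic term.
    """
    m = len(p)
    quad = sum(C[i][j] * p[i] * p[j] for i in range(m - 1) for j in range(m - 1))
    return 0.5 * quad / p[-1] + sum(bi * pi for bi, pi in zip(b, p))

def gm_gradient(p, b, C):
    """Partial derivatives of gm_price with respect to each price."""
    m = len(p)
    quad = sum(C[i][j] * p[i] * p[j] for i in range(m - 1) for j in range(m - 1))
    # Goods 1..m-1: derivative of the quadratic form plus the linear term
    grad = [sum(C[j][i] * p[j] for j in range(m - 1)) / p[-1] + b[i]
            for i in range(m - 1)]
    # Good m: the quadratic term is divided by p_m, hence the p_m^{-2} factor
    grad.append(-0.5 * quad / p[-1] ** 2 + b[-1])
    return grad

# Hypothetical coefficients satisfying the normalization in (26):
# sum(b) + 0.5 * sum(c_ij) = 1.15 - 0.15 = 1
C_ex = [[-0.2, 0.1], [0.1, -0.3]]
b_ex = [0.4, 0.35, 0.4]
p_ex = [1.2, 0.8, 1.5]
P_ex = gm_price(p_ex, b_ex, C_ex)
grad_ex = gm_gradient(p_ex, b_ex, C_ex)
```

Two properties are useful for debugging an implementation: at unit prices the function equals the normalization constant, and, by linear homogeneity, \(\sum _i p_i \textit{DP}_{2i} =P_2 \) at any price vector.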

4 An empirical comparison with several existing alternatives

The models that we have discussed so far relate to individuals or households. Hence, to place emphasis on the shape of the Engel curves, the following application uses a relatively homogeneous subsample taken from the 2009–2010 Australian HES. The selection criterion is as follows: one-couple households, without any children or students, living in the capital cities of the eastern states of Australia, i.e. Victoria, Queensland, and New South Wales. It is noteworthy that this class of models can be straightforwardly extended to accommodate households’ demographic heterogeneity without destroying global regularity, using techniques introduced in Pollak and Wales (1992), for instance the demographic scaling technique.

In this application, total food expenditures of households are classified into six aggregated commodities: Bread and Cereal products; Meat and Seafoods; Dairy and Related products; Fruit and Vegetables; Non-alcoholic beverages; and Other.

One of the difficulties here, as in any study using household-level purchase data, is the question of how to deal with households that record zero purchases. In the 2009–2010 Australian HES, for most types of expenditure, data were taken from diaries in which survey respondents recorded their household expenditure over a 2-week period, beginning from the day of initial contact. The zero-purchase difficulty is therefore pervasive in these data. However, it is necessary to note that for important and largely necessary categories such as those used here, most zero purchases do not actually reflect zero consumption. It simply happens that the household does not purchase the commodity during the short survey period. Other surveys that have longer survey periods typically find far fewer zero records. Therefore, irrespective of some recent progress in modelling zero consumption (Heien and Wessells 1990; Shonkwiler and Yen 1999; Yen et al. 2003; Dong et al. 2004; Meyerhoefer et al. 2005; Sam and Zheng 2010), such models cannot properly describe the current data.

In this study, we therefore simply confine attention to households that record positive purchases. As argued in Deaton (1988), this is admissible if all households consume the good, while purchases are randomly distributed over time with a distribution that is unaffected by prices or other variables that determine purchases. In contrast with Deaton (1987), where rural households are likely to substitute between own and market consumption in response to price fluctuations, our sample includes only urban households, and the aggregated commodity groups are important and largely necessary. It is thus not implausible to assume that purchases are randomly distributed over time. The resulting sample contains 1017 observations in total.

As with other applications using micro survey data, no price data are provided by the 2009–2010 Australian HES. Therefore, prices are constructed from the CPI. The CPI is provided by the Australian Bureau of Statistics (ABS) as a general measure of changes in prices of consumer goods and services purchased by Australian urban households. The ABS provides a quarterly CPI series, which is consistent with the quarterly based HES data. Aggregate indices such as the CPI, however, exhibit little price variation. One remedy for this difficulty was introduced by Lewbel (1989) and further exploited by Hoderlein and Mihaleva (2008), who compare the results of using the usual aggregate price indices and the Stone–Lewbel (SL) price indices in food demand estimation and conclude that the SL price indices greatly increase the precision of the estimates in both parametric and nonparametric modelling. Accordingly, in this study, the SL price indices are constructed for individual households and used in estimation. Even though the within-group utility function is assumed to be Cobb–Douglas, there is no restriction on the form of the between-group utility function, and in this study, the between-group utility function is specified as a member of the class of globally regular systems. This is the usual practice in applied work, where complicated, flexible group-demand models are estimated using simple Laspeyres or Paasche price indices.
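For readers unfamiliar with the construction, the following Python sketch shows one common statement of the SL index under Cobb–Douglas within-group preferences: the group price for a household is \(k^{-1}\prod _j (p_j /w_j )^{w_j }\), where \(w_j \) are the household's within-group budget shares, \(p_j \) are the prices of the items within the group, and \(k\) is a scaling constant computed from a reference household. This is a sketch under assumptions, not the exact procedure used for the estimates reported here; the function name, the choice of reference household, and the numerical inputs are all hypothetical.

```python
import math

def sl_price_index(within_shares, sub_prices, ref_shares):
    """Stone-Lewbel price index for one household and one commodity group.

    within_shares: the household's budget shares of the items inside the group
                   (nonnegative, summing to one)
    sub_prices:    prices (e.g. CPI sub-indices) of those items
    ref_shares:    within-group shares of a reference household, used for scaling
    """
    def log_term(w, price):
        # (price / w)^w tends to 1 as w -> 0, so zero shares contribute nothing
        return 0.0 if w == 0 else w * math.log(price / w)

    log_index = sum(log_term(w, q) for w, q in zip(within_shares, sub_prices))
    # Scaling constant k = prod(w_ref^(-w_ref)): the reference household
    # facing unit prices then has an index equal to one
    log_k = sum(-w * math.log(w) for w in ref_shares if w > 0)
    return math.exp(log_index - log_k)

# Hypothetical example: three items inside one food group
ref = [0.5, 0.3, 0.2]
household = [0.6, 0.4, 0.0]      # this household bought nothing of item 3
prices = [1.05, 0.98, 1.10]
index_h = sl_price_index(household, prices, ref)
```

Because the household's own within-group shares enter the index, two households facing the same item prices generally obtain different group prices, which is the source of the additional price variation.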

For the purposes of comparison, three popular existing demand systems, LES, AIDS and QUAIDS, are also estimated. To give AIDS and QUAIDS the same flexible components as RDS2_M and RDS3_M, we modify them by using the Generalized McFadden functional form in place of the original translog; the resulting models are named AIDS_M and QUAIDS_M.

Specifically, for AIDS_M, the indirect utility function is shown as Eq. (1), with specifications of the component price functions as:

$$\begin{aligned} P_1 (p)= & {} (1/2)p_m^{-1} \sum _{i=1}^{m-1} {\sum _{j=1}^{m-1} {c_{ij} p_i p_j } } +\sum _{i=1}^m {b_i p_i },\quad c_{ij} =c_{ji} \end{aligned}$$
(29)
$$\begin{aligned} P_2 (p)= & {} \prod _{i=1}^m {p_i ^{\beta _i }} ,\quad \sum _{i=1}^m {\beta _i } =0. \end{aligned}$$
(30)

The corresponding share equations are as follows:

$$\begin{aligned} W_i =\frac{p_i }{P_1 }\left( p_m^{-1} \sum _{j=1}^{m-1} {c_{ji} p_j } +b_i \right) +\beta _i \ln \left( \frac{c}{P_1 }\right) , \quad i=1,\ldots ,m-1. \end{aligned}$$
(31)

Note that, as discussed in Deaton and Muellbauer (1980b) and Banks et al. (1997), while estimating AIDS and QUAIDS, to facilitate identification, \(P_1 (p)\) has to be normalized. Since \(P_1 (p)\) can be interpreted as the outlay required for a minimal standard of living in the base period when prices are all unity (Deaton and Muellbauer 1980b), our choice of the normalization constant follows the discussion in the literature and is thus chosen to be just below the lowest value of c in our data (Banks et al. 1997). The normalization applied is as follows:

$$\begin{aligned} \sum _{i=1}^m {b_i } +\frac{1}{2}\sum _{i=1}^{m-1} {\sum _{j=1}^{m-1} {c_{ij} } } =27. \end{aligned}$$
(32)

As for QUAIDS_M, the indirect utility function is shown as Eq. (2), with specifications of the component price functions as:

$$\begin{aligned} P_1 (p)= & {} (1/2)p_m^{-1} \sum _{i=1}^{m-1} {\sum _{j=1}^{m-1} {c_{ij} p_i p_j } } +\sum _{i=1}^m {b_i p_i } ,\,\, c_{ij} =c_{ji} , \nonumber \\&\sum _{i=1}^m {b_i } +\frac{1}{2}\sum _{i=1}^{m-1} {\sum _{j=1}^{m-1} {c_{ij} } } =27 \end{aligned}$$
(33)
$$\begin{aligned} P_2 (p)= & {} \prod _{i=1}^m {p_i ^{\beta _i }} ,\quad \sum _{i=1}^m {\beta _i } =0 \end{aligned}$$
(34)
$$\begin{aligned} P_3 (p)= & {} \sum _{i=1}^m {\lambda _i \ln p_i } ,\quad \sum _{i=1}^m {\lambda _i } =0 . \end{aligned}$$
(35)

So, assuming m goods, the corresponding share equations are as follows:

$$\begin{aligned} W_i= & {} \frac{p_i }{P_1 }\left( p_m^{-1} \sum _{j=1}^{m-1} {c_{ji} p_j } +b_i \right) \nonumber \\&+\,\beta _i \ln \left( \frac{c}{P_1 }\right) +\frac{\lambda _i }{P_2 }\left[ \ln \left( \frac{c}{P_1 }\right) \right] ^{2}, \quad i=1,\ldots ,m-1 \end{aligned}$$
(36)
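The share equations (36) are stated for the first \(m-1\) goods; the share of good \(m\) follows from adding-up. A minimal Python sketch of this calculation is given below. The function name `quaids_m_shares` and the coefficient values are hypothetical; the example coefficients are chosen only so that the normalization in (33) holds, i.e. \(P_1 =27\) at unit prices.

```python
import math

def quaids_m_shares(c, p, b, C, beta, lam):
    """Budget shares of QUAIDS_M, Eq. (36), for all m goods.

    b, beta, lam have length m (beta and lam each sum to zero);
    C is the symmetric (m-1) x (m-1) matrix [c_ij].
    """
    m = len(p)
    quad = sum(C[i][j] * p[i] * p[j] for i in range(m - 1) for j in range(m - 1))
    P1 = 0.5 * quad / p[-1] + sum(bi * pi for bi, pi in zip(b, p))  # Eq. (33)
    P2 = math.prod(pi ** bi for pi, bi in zip(p, beta))             # Eq. (34)
    log_real = math.log(c / P1)  # requires P1 > 0
    shares = []
    for i in range(m - 1):
        dP1 = sum(C[j][i] * p[j] for j in range(m - 1)) / p[-1] + b[i]
        shares.append(p[i] / P1 * dP1
                      + beta[i] * log_real
                      + lam[i] / P2 * log_real ** 2)
    # Share of good m recovered from adding-up
    shares.append(1.0 - sum(shares))
    return shares

# Hypothetical coefficients: sum(b) + 0.5 * sum(c_ij) = 28.5 - 1.5 = 27
C_q = [[-2.0, 1.0], [1.0, -3.0]]
b_q = [10.0, 9.0, 9.5]
beta_q = [0.05, -0.02, -0.03]
lam_q = [0.01, -0.004, -0.006]
shares_q = quaids_m_shares(60.0, [1.2, 0.8, 1.5], b_q, C_q, beta_q, lam_q)
```

Setting all \(\lambda _i =0\) reduces the function to the AIDS_M shares in (31). Note that, unlike the shares in (22), nothing in this functional form keeps the computed shares inside the unit interval for all values of \(c\) and \(p\).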

As a result, AIDS_M and QUAIDS_M are both more regular than their translog counterparts AIDS and QUAIDS. While LES and AIDS_M are rank two, QUAIDS_M is rank three.

In this study, maximum likelihood estimation (MLE) is implemented in the software R. As the estimation of the class of globally regular demand systems involves interval constraints on parameters, the L-BFGS-B algorithm, due to Byrd et al. (1995), is used for optimization. It is well known that standard asymptotic theory requires the true parameter value to lie away from the boundary of the parameter space. Therefore, in cases where estimates lie on or almost on the boundary, the asymptotic theory does not apply, and the inverse of the negative of the Hessian matrix fails to provide a practically useful approximation to the variance–covariance matrix. Accordingly, in such cases, the standard errors of the estimates are derived by bootstrapping. Specifically, 500 new samples are randomly drawn with replacement from the original data, each with the same sample size as the original one. The same estimation procedure is implemented on these new samples, and 500 new sets of parameter estimates are derived. The standard deviation of these estimates is taken as the standard error.
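The bootstrap step described above can be sketched in a few lines of Python. This is an illustration of the resampling logic only: `bootstrap_se` is a hypothetical helper, and the estimator passed to it here is the sample mean, used as a simple stand-in for the bounded maximum likelihood routine, which is not reproduced.

```python
import random
import statistics

def bootstrap_se(data, estimate, n_boot=500, seed=0):
    """Nonparametric bootstrap standard errors.

    data:     list of observations (e.g. one entry per household)
    estimate: callable mapping a sample to a list of parameter estimates
    n_boot:   number of bootstrap samples (500 in the text)
    """
    rng = random.Random(seed)
    n = len(data)
    draws = []
    for _ in range(n_boot):
        # Resample n observations with replacement from the original data
        resample = [data[rng.randrange(n)] for _ in range(n)]
        draws.append(estimate(resample))
    n_params = len(draws[0])
    # Standard deviation of each parameter across the bootstrap estimates
    return [statistics.stdev([d[j] for d in draws]) for j in range(n_params)]

# Stand-in example: the "estimator" is just the sample mean
gen = random.Random(123)
sample = [gen.gauss(10.0, 2.0) for _ in range(200)]
se = bootstrap_se(sample, lambda s: [statistics.fmean(s)])
```

For the sample mean the result can be compared with the textbook value \(s/\sqrt{n}\), which gives a quick sanity check of the resampling code before it is applied to a costly estimator.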

The maximum likelihood estimates and corresponding t values are summarized in Table 3 in Appendix 2. For comparison, the AIC and BIC values for each model are presented in Table 1. Since concavity is imposed by construction for LES, RDS2, and RDS3 and cannot be relaxed and thereby tested, for comparison purposes concavity is also imposed on the component Generalized McFadden unit cost functions of AIDS_M, RDS2_M, and QUAIDS_M, without testing for it a priori.

Table 1 AIC and BIC

As shown in Table 1, the order in terms of AIC is as follows: (1) RDS3_M; (2) RDS3; (3) RDS2_M; (4) QUAIDS_M; (5) AIDS_M; (6) RDS2; and (7) LES. It can be seen that the flexible rank-three model RDS3_M performs better than QUAIDS_M, its existing counterpart in the literature. The flexible rank-two model RDS2_M outperforms its counterpart AIDS_M. One might have thought that, having more parameters, the existing flexible rank-three model QUAIDS_M would have stronger explanatory power than RDS3, which is globally regular but not fully flexible in prices. However, the penalty for the additional parameters of QUAIDS_M apparently outweighs the extra information that it might have gained by increasing the number of free parameters in the model. When free parameters are penalized more strongly, the order in terms of BIC is as follows: (1) RDS3; (2) RDS2; (3) RDS3_M; (4) RDS2_M; (5) LES; (6) AIDS_M; and (7) QUAIDS_M. According to this order, our rank-three globally regular model RDS3 performs the best, and all four examples of our class outperform the three existing alternatives. Therefore, to sum up, this empirical evidence seems to favour global regularity over flexibility.
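The difference between the two rankings comes from the penalty terms: AIC charges 2 per free parameter, whereas BIC charges \(\ln n\), which is about 6.92 for the 1017 observations used here. The Python sketch below makes this concrete using the standard definitions of the two criteria. The log-likelihoods and parameter counts in it are purely hypothetical and are not taken from Table 1.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln L (smaller is better)."""
    return k * math.log(n) - 2 * loglik

n_obs = 1017  # sample size reported in the text

# Hypothetical models: B fits better but uses 15 more parameters than A
model_a = {"loglik": 5000.0, "k": 20}
model_b = {"loglik": 5020.0, "k": 35}

aic_a, aic_b = aic(**model_a), aic(**model_b)
bic_a, bic_b = bic(n=n_obs, **model_a), bic(n=n_obs, **model_b)
# AIC prefers B (gain of 40 in 2 ln L exceeds the extra penalty of 30),
# while BIC prefers A (extra penalty of about 15 * 6.92 = 103.9 exceeds 40)
```

In this constructed case the more heavily parameterized model wins under AIC and loses under BIC, the same kind of reversal as that between the flexible and the parsimonious models in the two orderings above.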

From Table 1, it can also be seen that moving from a rank-two system to a rank-three system, for example, from RDS2 to RDS3 or from RDS2_M to RDS3_M, which improves the Engel flexibility of a system, helps achieve a better performance in terms of both AIC and BIC. This evidence seems to imply that the true Engel curve might have a rank of three or higher. To determine the rank, a more rigorous pre-specification rank test would have to be implemented, such as the nonparametric procedures proposed by Donald (1997) and Cragg and Donald (1996); this is beyond the scope of this paper.

For illustration of elasticities, income and Hicksian elasticities are estimated for LES, RDS2, AIDS_M, and RDS2_M, all of which are rank-two demand systems, and are presented in Table 2. Note that AIDS_M and RDS2_M are fully flexible in prices; in other words, they possess enough parameters to attain arbitrary price elasticities at a point in price–expenditure space.
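For any system written in share form, these elasticities can be approximated numerically from the share function alone, using \(e_i =1+\partial \ln W_i /\partial \ln c\) for income elasticities, \(e_{ij}^{M} =\partial \ln W_i /\partial \ln p_j -\delta _{ij} \) for Marshallian price elasticities, and the Slutsky relation \(e_{ij}^{H} =e_{ij}^{M} +e_i W_j \). The Python sketch below is one possible implementation and not necessarily how the entries of Table 2 were computed; the helper name `elasticities` is hypothetical, and the code assumes strictly positive shares at the evaluation point.

```python
import math

def elasticities(share_fn, c, p, h=1e-5):
    """Income and Hicksian price elasticities by central differences in logs.

    share_fn(c, p) must return a list of strictly positive budget shares.
    Returns (income, hicks), where hicks[i][j] is the elasticity of good i
    with respect to the price of good j.
    """
    m = len(p)
    w = share_fn(c, p)
    up, dn = share_fn(c * math.exp(h), p), share_fn(c * math.exp(-h), p)
    income = [1.0 + (math.log(up[i]) - math.log(dn[i])) / (2 * h)
              for i in range(m)]
    hicks = [[0.0] * m for _ in range(m)]
    for j in range(m):
        p_up = list(p); p_up[j] *= math.exp(h)
        p_dn = list(p); p_dn[j] *= math.exp(-h)
        w_up, w_dn = share_fn(c, p_up), share_fn(c, p_dn)
        for i in range(m):
            marshall = ((math.log(w_up[i]) - math.log(w_dn[i])) / (2 * h)
                        - (1.0 if i == j else 0.0))
            # Slutsky equation in elasticity form
            hicks[i][j] = marshall + income[i] * w[j]
    return income, hicks

# Check on Cobb-Douglas demands, whose shares are constant
cd_shares = lambda c, p: [0.5, 0.3, 0.2]
inc_cd, hicks_cd = elasticities(cd_shares, 50.0, [1.2, 0.8, 1.5])
```

For the Cobb–Douglas check, the income elasticities equal one, the Hicksian own-price elasticity of good \(i\) equals \(W_i -1\), and the Hicksian cross-price elasticity with respect to the price of good \(j\) equals \(W_j \), so each row of the Hicksian matrix sums to zero.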

As shown in Table 2, all the models produce similar estimates of income elasticities for all six goods, except that the income elasticity for the good “Other” is estimated slightly above 1 by AIDS_M, which means that “Other” is classified as a luxury good by AIDS_M, while the other models classify it as a necessity. Regarding price elasticities, all four systems present similar negative Hicksian own-price elasticities for all six goods. As for Hicksian cross-price elasticities, comparison between AIDS_M and RDS2_M, both of which are fully flexible in prices and thus comparable, shows that four pairs of goods are classified as complements by AIDS_M but as substitutes by RDS2_M: “Bread and Cereal products” and “Meat and Seafoods”; “Non-alcoholic beverages” and “Meat and Seafoods”; “Dairy and Related products” and “Other”; and “Fruit and Vegetables” and “Other”. However, it is noteworthy that none of these four negative cross-price elasticity estimates from AIDS_M is significant, even at the 10% significance level.

Table 2 Income and Hicksian elasticities

5 Conclusion

In this paper, we have introduced a new class of demand systems based on a simple parametric specification of the indirect utility function, but allowing for the parsimonious imposition of global regularity, i.e. fully consistent with the maximization of utility subject to a budget constraint over the entire price–expenditure region. They also exhibit a clear and reasonable homothetic asymptotic behaviour, as income approaches infinity. In addition, this class of demand systems is potentially fully flexible in rank, i.e. can acquire as large a rank as required for empirical work. In an empirical application using Australian household expenditure data, according to AIC and BIC, the four examples of this class outperform their popular existing counterparts in the literature, and this empirical evidence therefore seems to favour global regularity over flexibility.

Furthermore, income and Hicksian own- and cross-price elasticities are estimated for all the rank-two demand systems illustrated in this study, namely LES, RDS2, AIDS_M, and RDS2_M, and are compared with one another. All four systems produce similar estimates of income elasticities and Hicksian own-price elasticities. As for Hicksian cross-price elasticities, although four pairs of goods are classified as complements by AIDS_M and as substitutes by RDS2_M, none of the four negative cross-price elasticity estimates from AIDS_M is significant, even at the 10% significance level.