Elsevier

Automatica

Volume 38, Issue 9, September 2002, Pages 1539-1547

Brief Paper
Non-asymptotic confidence ellipsoids for the least-squares estimate

https://doi.org/10.1016/S0005-1098(02)00064-X

Abstract

In this paper, we consider the finite sample properties of least-squares system identification, and derive non-asymptotic confidence ellipsoids for the estimate. The shape of the confidence ellipsoids is similar to the shape of the ellipsoids derived using asymptotic theory, but unlike asymptotic theory, they are valid for a finite number of data points. The probability that the estimate belongs to a certain ellipsoid has a natural dependence on the volume of the ellipsoid, the data generating mechanism, the model order and the number of data points available.

Introduction

In this paper, we consider the properties of least-squares system identification when only a finite number of data points are available. The asymptotic properties of least-squares identification are well understood, see e.g. Ljung (1999) or Söderström and Stoica (1989), but it is only recently that results addressing the finite sample properties have started appearing, e.g. Weyer, Williamson, and Mareels (1999), Weyer and Campi (1999), Campi and Weyer (2002), and Weyer (2000).

In applications such as evaluation of model uncertainty it is common to use the asymptotic confidence regions for the parameter estimate, even when only a finite number of data points are available. In this paper, we derive non-asymptotic confidence ellipsoids for the least-squares estimate. It is shown that the confidence ellipsoids depend in a natural way on factors such as the model and system order, the pole locations and the number of data points available.

Our main tools are exponential inequalities, which we use to bound the differences between expected values and empirical values. Earlier, using different techniques, Spall (1995) considered uncertainty bounds for general M-estimators for a finite number of data points. His results are, however, difficult to use in the situation we consider here.

It should also be mentioned that finite sample properties have been studied in the deterministic set membership and worst case identification settings. In this context, the identification algorithms deliver all models which are in agreement with the observed data, so that finite-sample results are automatically included in the identification result. In contrast to these settings, the present paper concentrates on the standard least-squares identification method in a stochastic framework.

The paper is organised as follows. In the next section we introduce the identification setting. In Section 3 we first bound the difference between the expected and empirical values of the matrix and vector which make up the normal equation (Theorem 1), and then we use this result to derive the non-asymptotic confidence ellipsoids (Theorem 2). Technical results are given in the appendices.

The data generation mechanism

We assume that the observed data are generated by a linear system

$$y(t) = G_0(q^{-1})u(t) + H_0(q^{-1})e(t),$$

where the input signal $u(t)$ is stochastic and generated by

$$u(t) = V_0(q^{-1})w(t),$$

where $G_0(q^{-1})$, $H_0(q^{-1})$ and $V_0(q^{-1})$ are transfer functions in the backward shift operator $q^{-1}$, i.e. $q^{-1}y(t) = y(t-1)$; however, for the sake of readability, we omit throughout the dependence on $q^{-1}$. $w(t)$ and $e(t)$ are sequences of independent Gaussian random variables, independent of each other, with zero mean and variance $\sigma_w^2$ and $\sigma_e^2$
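As an illustration of this data generation mechanism, the following minimal sketch simulates input and output sequences in plain Python. The first-order filters chosen for $G_0$, $H_0$ and $V_0$, the noise levels, and the names `filter_first_order` and `generate_data` are assumptions made for the example only; they are not taken from the paper.

```python
import random


def filter_first_order(b0, a1, x):
    """Apply y(t) = -a1*y(t-1) + b0*x(t), i.e. the filter b0 / (1 + a1 q^-1)."""
    y, prev = [], 0.0
    for xt in x:
        prev = -a1 * prev + b0 * xt
        y.append(prev)
    return y


def generate_data(n, sigma_w=1.0, sigma_e=0.5, seed=0):
    """Simulate y = G0 u + H0 e with u = V0 w (hypothetical first-order filters)."""
    rng = random.Random(seed)
    # Independent zero-mean Gaussian sequences w(t) and e(t).
    w = [rng.gauss(0.0, sigma_w) for _ in range(n)]
    e = [rng.gauss(0.0, sigma_e) for _ in range(n)]
    # Assumed V0 = 1 / (1 - 0.5 q^-1): the input is filtered white noise.
    u = filter_first_order(1.0, -0.5, w)
    # Assumed G0 = 0.8 q^-1 / (1 - 0.7 q^-1): one-sample delay, then the filter.
    u_delayed = [0.0] + u[:-1]
    g_out = filter_first_order(0.8, -0.7, u_delayed)
    # Assumed H0 = 1 / (1 - 0.7 q^-1).
    h_out = filter_first_order(1.0, -0.7, e)
    y = [g + h for g, h in zip(g_out, h_out)]
    return u, y


if __name__ == "__main__":
    u, y = generate_data(10)
    for t, (ut, yt) in enumerate(zip(u, y)):
        print(t, round(ut, 3), round(yt, 3))
```

With these particular choices $G_0$ and $H_0$ share a denominator, so the simulated system is of ARX form, which is the case where least squares is most natural; other filters can be substituted in the same way.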

The main result

In this section, we present the ellipsoidal confidence regions for the least-squares estimate. First we bound the probability that the differences $R_N - R$ and $f_N - f$ exceed a certain value (Theorem 1), and then we use these results to bound $\hat{\theta}_N - \theta$ (Theorem 2).
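To make the quantities in the normal equation concrete, here is a minimal sketch for a first-order ARX model, under assumptions that are not taken from the paper: the empirical matrix and vector are formed as sample averages of $\varphi(t)\varphi(t)^T$ and $\varphi(t)y(t)$ with regressor $\varphi(t) = [y(t-1), u(t-1)]^T$, the true parameters are 0.7 and 0.8, and the ellipsoid is the quadratic form weighted by the empirical matrix. All function names are hypothetical.

```python
import random


def simulate_arx(n, a=0.7, b=0.8, sigma_e=0.5, seed=0):
    """Simulate y(t) = a*y(t-1) + b*u(t-1) + e(t) with white Gaussian u and e."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    y = [0.0]
    for t in range(1, n + 1):
        y.append(a * y[t - 1] + b * u[t - 1] + rng.gauss(0.0, sigma_e))
    return u, y


def normal_equation(u, y):
    """Return the empirical 2x2 matrix R_N and vector f_N (sample averages)."""
    n = len(y) - 1
    r11 = r12 = r22 = f1 = f2 = 0.0
    for t in range(1, n + 1):
        p1, p2 = y[t - 1], u[t - 1]  # regressor phi(t)
        r11 += p1 * p1
        r12 += p1 * p2
        r22 += p2 * p2
        f1 += p1 * y[t]
        f2 += p2 * y[t]
    return ((r11 / n, r12 / n), (r12 / n, r22 / n)), (f1 / n, f2 / n)


def solve_2x2(R, f):
    """Solve R * theta = f for a symmetric 2x2 matrix R by Cramer's rule."""
    (r11, r12), (_, r22) = R
    det = r11 * r22 - r12 * r12
    return ((r22 * f[0] - r12 * f[1]) / det, (r11 * f[1] - r12 * f[0]) / det)


def in_ellipsoid(theta_hat, theta, R, radius):
    """Check (theta_hat - theta)^T R (theta_hat - theta) <= radius."""
    d0, d1 = theta_hat[0] - theta[0], theta_hat[1] - theta[1]
    q = R[0][0] * d0 * d0 + 2.0 * R[0][1] * d0 * d1 + R[1][1] * d1 * d1
    return q <= radius


if __name__ == "__main__":
    u, y = simulate_arx(500)
    R_N, f_N = normal_equation(u, y)
    theta_hat = solve_2x2(R_N, f_N)
    print("estimate:", theta_hat)
    print("inside ellipsoid:", in_ellipsoid(theta_hat, (0.7, 0.8), R_N, 0.01))
```

The radius passed to `in_ellipsoid` is arbitrary here; the probability that the theorems attach to a given ellipsoid is not computed by this sketch.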

It is important to put our results in the right perspective. We have not made any attempt to optimise the bounds, and in some places we have made the bounds more conservative in order to get relatively simple expressions. The bounds are

Concluding remarks

In this paper, we have derived non-asymptotic confidence ellipsoids for the least-squares estimate. The shape of the ellipsoids is similar to that obtained using asymptotic theory, although the probabilities we assign to the ellipsoids can be quite different. The probability that the estimate belongs to a certain ellipsoid has a natural dependence on the volume of the ellipsoid, the data generating mechanism, the model order and the number of data points available.

Our results are worst case in

Cited by (22)

  • Identification of linear systems with multiplicative noise from multiple trajectory data

    2022, Automatica
    Citation Excerpt :

    These algorithms can be applied to identification of linear systems with multiplicative noise, provided that certain conditions of system stability and noise hold. Non-asymptotic performance analysis of identification methods can be found in Campi and Weyer (2002, 2005) and Weyer and Campi (2002). It has once again attracted attention from different domains and been investigated more extensively, because of recent development of random matrix theories, self-normalized martingales, and so on (see Dean et al. (2019), Matni and Tu (2019) and Zheng and Li (2020) and references therein).

  • Parameter estimation in nonlinear chemical and biological processes with unmeasured variables from small data sets

    2011, Chemical Engineering Science
    Citation Excerpt :

    In situations where large amounts of data are available, algorithms based on maximum likelihood provide acceptable parameter and covariance estimates. However, it is well-known that maximum likelihood estimates show significant bias if the available data sets are small (Casella and Berger, 1990; Ninness and Hjalmarsson, 2004) and that the uncertainty estimates based on the Cramer–Rao bound are poor (Weyer et al., 1999; Weyer and Campi, 2002; Campi et al., 2004). In light of these problems with MLE, we develop a Bayesian inference based approach.

  • Contributions to the Theory and Implementation of the LSCR Method

    2010, RIAI - Revista Iberoamericana de Automatica e Informatica Industrial

  • Non-asymptotic confidence regions for model parameters in the presence of unmodelled dynamics

    2009, Automatica
    Citation Excerpt :

    Moreover, asymptotic approaches have little validity when the number of data points is small, and hence finite sample methods as developed in this paper are of great interest. For further discussions on model quality evaluation and confidence sets for the parameters of dynamical systems, the readers are referred to Campi, Ooi, and Weyer (2004), Campi and Weyer (2002), Weyer and Campi (2002), Douma and Van den Hof (2006), Hjalmarsson and Ninness (2006) and den Dekker, Bombois and Van den Hof (2008). In the next subsection we give a simple preview example of the developed procedure which illustrates the main ideas and shows the generality of the approach.

Erik Weyer received the Siv. Ing. degree in 1988 and Ph.D. in 1993, both from the Norwegian Institute of Technology, Trondheim, Norway. During his Ph.D. studies he spent two years at the Australian National University.

From 1994 to 1996 he was a Research Fellow at the University of Queensland, and since 1997 he has been with the Department of Electrical and Electronic Engineering, the University of Melbourne, where he is currently a Senior Research Fellow.

His research interests are in the area of system identification and control.

Marco C. Campi was born in Tradate, Italy, in 1963. He received the Doctor degree in electronic engineering from the Politecnico di Milano, Milano, Italy, in 1988.

From 1988 to 1989, he was a Research Assistant in the Department of Electrical Engineering, the Politecnico di Milano. From 1989 to 1992, he worked as a researcher at the Centro di Teoria dei Sistemi of the National Research Council in Milano. Since 1992, he has been with the University of Brescia, Italy, where he is currently Professor of Automatic Control.

Dr. Campi is an Associate Editor of Automatica, Systems and Control Letters, and the European Journal of Control. He serves as Vice-Chair of the IFAC Technical Committee on Stochastic Systems (SS) and is a member of the IFAC Technical Committee on Modeling, Identification and Signal Processing (MISP). Moreover, he is a Distinguished Lecturer under the IEEE CSS Program.

He has held visiting positions at many universities including the Australian National University, Canberra, Australia; the University of Illinois at Urbana-Champaign, USA; the Centre for Artificial Intelligence and Robotics, Bangalore, India; and the University of Melbourne, Australia.

His current research interests include: system identification, control relevant system identification, adaptive and iterative control, learning theory, and stochastic systems.

This paper was not presented at any IFAC meeting. This paper was recommended for publication in revised form by Associate Editor Antonio Vicino under the direction of Editor Torsten Söderström.