Elsevier

Automatica

Volume 87, January 2018, Pages 238-250

Particle filters for partially-observed Boolean dynamical systems

https://doi.org/10.1016/j.automatica.2017.10.009

Abstract

Partially-observed Boolean dynamical systems (POBDS) are a general class of nonlinear models with application in estimation and control of Boolean processes based on noisy and incomplete measurements. The optimal minimum mean square error (MMSE) algorithms for POBDS state estimation, namely the Boolean Kalman filter (BKF) and Boolean Kalman smoother (BKS), are intractable for large systems because of their computational and memory requirements. To address this, we introduce approximate MMSE filtering and smoothing algorithms based on the auxiliary particle filter (APF) method, called APF–BKF and APF–BKS, respectively. For simultaneous state and parameter estimation in POBDS models, the APF–BKF is combined with maximum-likelihood (ML) methods. When the unknown parameters are discrete, the proposed ML adaptive filter consists of multiple APF–BKFs running in parallel, in a manner reminiscent of the Multiple Model Adaptive Estimation (MMAE) method in classical linear filtering theory. When the parameters are continuous, the proposed ML adaptive filter is based on an efficient particle-based expectation maximization (EM) algorithm for the POBDS model, which uses a modified Forward Filter Backward Simulation (FFBSi) in combination with the APF–BKS. The performance of the proposed particle-based adaptive filters is assessed through numerical experiments using a POBDS model of the well-known cell cycle gene regulatory network observed through noisy RNA-Seq time series data.

Introduction

Partially-observed Boolean dynamical systems consist of a Boolean state process, also known as a Boolean network, observed through an arbitrary noisy mapping to a measurement space (Braga-Neto, 2011; Imani & Braga-Neto, 2015a, 2015b, 2016a, 2017a, 2018). Instances of POBDSs abound in fields such as genomics (Kauffman, 1969), robotics (Roli, Manfroni, Pinciroli, & Birattari, 2011), and digital communication systems (Messerschmitt, 1990). The optimal recursive minimum mean-square error (MMSE) state estimators for this model are called the Boolean Kalman Filter (BKF) (Braga-Neto, 2011; Imani & Braga-Neto, 2017a) and the Boolean Kalman Smoother (BKS) (Imani & Braga-Neto, 2015b). These filters have many desirable properties; in particular, it can be shown that the MMSE estimate of the state vector provides both the MMSE and the maximum-a-posteriori (MAP) estimates of each state vector component.
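
The componentwise property stated above can be illustrated with a small worked example. The following is a minimal sketch, not the BKF itself: it assumes the posterior distribution over all Boolean states is already available, and the function name `boolean_mmse_estimate` and the posterior values are hypothetical. It thresholds the posterior mean of each component at 1/2, which selects the more probable value of that component.

```python
from itertools import product

def boolean_mmse_estimate(states, posterior):
    """Threshold the posterior mean of a Boolean vector at 1/2, component by component."""
    d = len(states[0])
    # The posterior mean of component i equals P(X_i = 1 | data).
    mean = [sum(p * x[i] for x, p in zip(states, posterior)) for i in range(d)]
    # Choosing the more probable value of each component minimizes the expected
    # number of wrong components, i.e. the mean square error over Boolean estimates.
    estimate = tuple(1 if m > 0.5 else 0 for m in mean)
    return mean, estimate

# Hypothetical posterior over the four states of a two-variable system.
states = list(product([0, 1], repeat=2))   # (0,0), (0,1), (1,0), (1,1)
posterior = [0.4, 0.0, 0.3, 0.3]

mean, estimate = boolean_mmse_estimate(states, posterior)
joint_map = states[posterior.index(max(posterior))]
print(mean, estimate, joint_map)
```

In this example the estimate is (1, 0), whereas the single most probable joint state is (0, 0): the MMSE estimate agrees with the MAP estimate of each component, which need not coincide with the MAP estimate of the whole vector.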

The BKF and BKS are exact algorithms, which is unusual in the class of general partially-observable nonlinear dynamical systems, of which POBDS is a special case. However, for systems with a large number of state variables, exact computation of the BKF and BKS becomes impractical due to large computational and memory requirements. For general nonlinear systems, various approximations to the optimal estimator have been developed. The classical approach is to linearize at each time point through a first-order Taylor series expansion and then apply the traditional Kalman filter solution; this approach is called the Extended Kalman Filter (EKF) (Jazwinski, 1970). The EKF cannot be applied to Boolean dynamical systems because the Boolean transition functions are not differentiable. There is a class of schemes that can be applied to system functions without derivatives, collectively called derivativeless filters (Van Der Merwe, 2004), which include the Unscented Kalman Filter (UKF) (Julier, Uhlmann, & Durrant-Whyte, 1995), the Central Difference Filter (CDF) (Ito & Xiong, 2000), and the Divided Difference Filter (DDF) (Nørgaard, Poulsen, & Ravn, 2000). These derivativeless filters are special cases of the general class of Sigma-Point Kalman Filters (SPKF) in Van Der Merwe (2004). However, SPKF theory has not been developed for discrete distributions such as the ones involved in Boolean dynamical systems. The only well-known approximation that can also be applied to POBDS is the class of Sequential Monte-Carlo (SMC) methods (Doucet et al., 2001; Kantas et al., 2015). In Braga-Neto (2013), an approximate sequential Monte-Carlo algorithm was proposed to compute the BKF using sequential importance resampling (SIR). By contrast, we propose here SMC algorithms for both the BKF and the fixed-interval BKS using the more efficient auxiliary particle filter (APF) algorithm (Pitt & Shephard, 1999), which are called the APF–BKF and APF–BKS algorithms, respectively.
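
To make the auxiliary particle filter idea concrete, the following is a minimal sketch of one generic APF step in the style of Pitt & Shephard (1999), written for a Boolean state. It is not the APF–BKF of this paper. The model is an assumption made for illustration: the next state is a network function `f` applied to the current state, with each bit flipped independently with probability `p_flip`, and the observation is the state plus Gaussian noise of standard deviation `sigma`. All function names and numerical values are hypothetical.

```python
import math
import random

def likelihood(y, x, sigma):
    """Unnormalized p(y | x) for the assumed model y = x + Gaussian noise."""
    sq_err = sum((yi - xi) ** 2 for yi, xi in zip(y, x))
    return math.exp(-0.5 * sq_err / sigma ** 2)

def apf_step(particles, weights, y, f, p_flip, sigma, rng):
    """One auxiliary particle filter step; returns new particles and normalized weights."""
    n = len(particles)
    # Predicted state of each particle: the mode of the transition when p_flip < 0.5.
    predicted = [f(x) for x in particles]
    pred_lik = [likelihood(y, mu, sigma) for mu in predicted]
    # First stage: favor particles whose prediction already agrees with y.
    first_stage = [w * l for w, l in zip(weights, pred_lik)]
    parents = rng.choices(range(n), weights=first_stage, k=n)
    new_particles, new_weights = [], []
    for i in parents:
        # Propagate: flip each predicted bit independently with probability p_flip.
        x_new = tuple(b ^ int(rng.random() < p_flip) for b in predicted[i])
        new_particles.append(x_new)
        # Second stage: correct for the look-ahead used in the first stage.
        new_weights.append(likelihood(y, x_new, sigma) / pred_lik[i])
    total = sum(new_weights)
    return new_particles, [w / total for w in new_weights]

def weighted_estimate(particles, weights):
    """Approximate MMSE estimate: threshold the weighted particle mean at 1/2."""
    d = len(particles[0])
    mean = [sum(w * x[i] for x, w in zip(particles, weights)) for i in range(d)]
    return tuple(1 if m > 0.5 else 0 for m in mean)

def shift(x):
    """Hypothetical 3-variable network function: cyclic shift of the bits."""
    return x[-1:] + x[:-1]

rng = random.Random(0)
n = 400
particles = [tuple(rng.randint(0, 1) for _ in range(3)) for _ in range(n)]
weights = [1.0 / n] * n
true_state = (1, 0, 0)
for _ in range(10):
    true_state = shift(true_state)                      # truth simulated without bit flips
    y = [b + rng.gauss(0.0, 0.1) for b in true_state]   # noisy observation
    particles, weights = apf_step(particles, weights, y, shift, 0.05, 0.1, rng)
estimate = weighted_estimate(particles, weights)
print(estimate, true_state)
```

The difference from SIR lies in the first stage: particles are resampled using the current observation before they are propagated, so that fewer particles are spent on states the observation makes implausible.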

The BKF and BKS require for their application that all system parameters be known. In the case where noise intensities, the network topology, or observational parameters are not known or only partially known, an adaptive scheme to simultaneously estimate the state and parameters of the system is required. An exact adaptive filtering framework to accomplish that task was proposed recently in Imani and Braga-Neto (2017a), which is based on the BKF and BKS in conjunction with maximum-likelihood estimation of the parameters. In this paper, we develop an accurate and efficient particle filtering implementation of the adaptive filtering framework in Imani and Braga-Neto (2017a), which is suitable for large systems.

Several exact and approximate adaptive filters for dual and joint state and parameter estimation for general Hidden Markov Models (HMM) have been proposed (Chen et al., 2005; Kantas et al., 2015; Liu & West, 2001). However, these methods are quite sensitive to initialization and do not necessarily result in maximum likelihood estimates of the parameters. Another class of methods is provided by gradient-based maximization techniques (DeJong et al., 2013; Hürzeler & Künsch, 2001; Malik & Pitt, 2011), which try to directly maximize the log-likelihood function using particle-based techniques. However, most of these methods suffer from unreliable approximation of the likelihood function or high computational complexity. The expectation maximization (EM) algorithm is a very popular alternative procedure for likelihood maximization, originally introduced by Dempster, Laird, and Rubin (1977). Applications of EM to linear/Gaussian state space models (Ghahramani, 1996), general nonlinear/non-Gaussian state space models (Gopaluni, 2008; Schön et al., 2011; Wills et al., 2013), and the POBDS model (Imani & Braga-Neto, 2017a) have been proposed. Gaussian smoothing and sigma-point based approximations in an EM context have also been discussed in Väänänen et al. (2012). Particle-based implementations of EM filters have shown more numerical stability than gradient-based techniques (Kantas et al., 2015). For more information about inference methods for HMM, the reader is referred to Kantas et al. (2015).

For POBDS with a discrete (finite) parameter space, the adaptive filter we propose in this paper uses a bank of particle filters (APF–BKFs) in parallel, which is reminiscent of the Multiple Model Adaptive Estimation (MMAE) procedure in classical linear filtering (Maybeck & Hanlon, 1995). The log-likelihood approximated by the APF–BKFs is used to obtain the ML estimate of the parameters, and the approximate MMSE state estimate of the selected filter yields the estimate of the state. On the other hand, if the parameter space is continuous, the available particle-based EM methods for general HMM with continuous state space do not result in efficient estimation in POBDS models. To address this, we propose an efficient particle-based EM method for POBDS based on a modified Forward Filter Backward Simulation (FFBSi) algorithm (Godsill, Doucet, & West, 2012). The proposed filter yields the following advantages in comparison to the original EM method introduced in Imani and Braga-Neto (2017a):

  • (1) Smaller computational complexity of smoothing at the E-step.

  • (2) Smaller memory requirement to store the required matrices and vectors (e.g. the posterior probability vectors) from the E-step to the M-step.

  • (3) Reduced complexity of each iteration in the M-step, in which several function evaluations are required.
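
The discrete-parameter scheme described before the list, in which a bank of filters runs in parallel and the candidate with the largest approximate log-likelihood is selected, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the proposed algorithm: a plain bootstrap (SIR) filter stands in for the APF–BKF to keep the code short, the unknown parameter is taken to be the network function itself, and the model (bit-flip transition noise with probability `p_flip`, Gaussian observation noise with standard deviation `sigma`) and all names are hypothetical.

```python
import math
import random

def likelihood(y, x, sigma):
    """Unnormalized p(y | x) for the assumed model y = x + Gaussian noise."""
    sq_err = sum((yi - xi) ** 2 for yi, xi in zip(y, x))
    return math.exp(-0.5 * sq_err / sigma ** 2)

def bootstrap_log_likelihood(observations, f, p_flip, sigma, n, rng):
    """Particle estimate of the log-likelihood of the data under network function f."""
    d = len(observations[0])
    particles = [tuple(rng.randint(0, 1) for _ in range(d)) for _ in range(n)]
    log_lik = 0.0
    for y in observations:
        # Propagate every particle through the candidate dynamics with bit-flip noise.
        particles = [tuple(b ^ int(rng.random() < p_flip) for b in f(x))
                     for x in particles]
        # Floor the weights so resampling never sees an all-zero weight vector.
        liks = [max(likelihood(y, x, sigma), 1e-300) for x in particles]
        log_lik += math.log(sum(liks) / n)
        particles = rng.choices(particles, weights=liks, k=n)
    return log_lik

def select_model(observations, candidates, p_flip, sigma, n, rng):
    """Run one filter per candidate and return the one with the highest score."""
    scores = {name: bootstrap_log_likelihood(observations, f, p_flip, sigma, n, rng)
              for name, f in candidates.items()}
    return max(scores, key=scores.get), scores

def shift(x):
    return x[-1:] + x[:-1]   # hypothetical candidate: cyclic shift

def identity(x):
    return x                 # hypothetical candidate: every gene keeps its value

# Synthetic data generated by the shift network, without transition noise.
rng = random.Random(1)
state, observations = (1, 0, 0), []
for _ in range(20):
    state = shift(state)
    observations.append([b + rng.gauss(0.0, 0.3) for b in state])

candidates = {"shift": shift, "identity": identity}
best, scores = select_model(observations, candidates, 0.05, 0.3, 300, rng)
print(best, scores)
```

Because the same observation model is used for every candidate, the omitted Gaussian normalizing constant shifts all scores equally and does not affect which candidate is selected. In this sketch the filters run sequentially; since they do not interact, they could equally be run in parallel.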

The application of interest in this paper is the modeling of Boolean gene regulatory networks (Kauffman, 1969; Schmulevich et al., 2002) observed through a single time series of RNA-seq data (Marguerat & Bahler, 2010). Using the POBDS model, we employ the proposed approximate adaptive ML algorithm to estimate the gene expression state simultaneously with the inference of the network topology and of the noise and expression parameters. Performance is assessed through a series of numerical experiments using the well-known cell-cycle gene regulatory model (Faure, Naldi, Chaouiya, & Thieffry, 2006). The influence of transition noise, expression parameters, and RNA-seq measurement noise (data dispersion) on performance is studied, and the consistency of the adaptive ML filter (i.e., convergence to true parameter values) is empirically established.

The article is organized as follows. In Section 2, the POBDS signal model, the Boolean Kalman Filter, and the Boolean Kalman Smoother are reviewed, while in Section 3, a detailed description of the APF-based filtering and smoothing algorithms proposed in this paper is provided. In Section 4, the particle-based ML adaptive filter is developed for discrete and continuous parameter spaces. A POBDS model for gene regulatory networks observed through RNA-seq measurements is reviewed in Section 5. Results for the numerical experiments with the cell-cycle network are presented in Section 6. Finally, Section 7 contains concluding remarks.

Section snippets

Optimal state estimators for POBDS

In this section, we review the POBDS model and exact algorithms for computation of its optimal state estimators. For more details, see Imani & Braga-Neto (2017a, 2017b, 2017c, 2018) and McClenny et al. (2017a, 2017b, 2017c). For a proof of optimality of the BKF, see Imani & Braga-Neto (2017a).

We assume that the system is described by a state process $\{X_k;\ k = 0, 1, \ldots\}$, where $X_k \in \{0,1\}^d$ is a Boolean vector of

Particle filters for state estimation

When the number of states is large, the exact computation of the BKF and the BKS becomes intractable due to the large size of the matrices involved, each of which contains $2^{2d}$ elements, and approximate methods must be used, such as sequential Monte-Carlo methods, also known as particle filter algorithms. In the next subsections, we describe particle filter implementations of the BKF and BKS.
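
The growth of these matrices with the number of state variables $d$ can be made concrete with a short calculation. The sketch below assumes dense storage with 8-byte floating-point entries, which is an illustrative choice rather than a statement about any particular implementation; the function name is hypothetical.

```python
def dense_matrix_footprint(d, bytes_per_entry=8):
    """Entry count and byte size of a dense 2^d-by-2^d matrix."""
    num_states = 2 ** d             # number of distinct Boolean state vectors
    num_entries = num_states ** 2   # 2^(2d) entries
    return num_entries, num_entries * bytes_per_entry

for d in (10, 20, 30):
    entries, size = dense_matrix_footprint(d)
    print(f"d={d}: {entries:.3e} entries, {size / 2**30:.3e} GiB")
```

Under these assumptions, a single matrix for $d = 10$ has about one million entries (8 MiB), while for $d = 20$ it has about $10^{12}$ entries (8 TiB), which illustrates why exact computation is limited to small networks.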

Particle filters for maximum-likelihood adaptive estimation

Suppose that the nonlinear signal model in (1) is incompletely specified. For example, the deterministic functions $f_k$ and $h_k$ may be only partially known, or the statistics of the noise processes $n_k$ and $v_k$ may need to be estimated. By assuming that the missing information can be coded into a finite-dimensional vector parameter $\theta \in \Theta$, where $\Theta$ is the parameter space, we propose next particle filtering approaches for simultaneous state and parameter estimation for POBDS. For simplicity and conciseness, we

Gene regulatory network and RNA-Seq measurement models

The algorithms developed in the previous section apply to the general POBDS signal model in (1). In this section, we describe a specific instance of that model, which allows the application of the methodology to Boolean gene regulatory networks observed through noisy gene-expression data. There are several gene-expression measurement technologies currently in use, such as cDNA microarrays (Chen, Dougherty, & Bittner, 1997) and live cell imaging-based assays (Hua, Sima, Cypert, Gooden, Shack,

Numerical experiments

In this section, we carry out detailed numerical experiments to assess the performance of the developed particle-based methods. We base our experiments on the well-known Mammalian Cell-Cycle network (Fauré, Naldi, Chaouiya, & Thieffry, 2006). The pathway diagram for this network is presented in Fig. 4. The state vector is $x =$ (CycD, Rb, p27, E2F, CycE, CycA, Cdc20, Cdh1, UbcH10, CycB). The gene interaction parameters $a_{ij}$ can be read off Fig. 4 easily. As an example, Rb is activated by p27, and is

Conclusion

In this paper, we introduced approximate particle-based algorithms for state and simultaneous state and parameter estimation for large partially-observed Boolean dynamical systems. For approximate state estimation, filtering and smoothing methods based on auxiliary particle filtering (APF) were developed to approximate the optimal BKF and BKS. These algorithms are called APF–BKF and APF–BKS, and are original contributions of this work.

Moreover, we considered the case where some of the

References (52)

  • Chen, Yidong et al. (1997). Ratio-based decisions and the quantitative analysis of cDNA microarray images. Journal of Biomedical Optics.
  • DeJong, David N. et al. (2013). Efficient likelihood evaluation of state-space representations. Review of Economic Studies.
  • Dempster, A.D. et al. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B.
  • Doucet, Arnaud et al. (2001). Sequential Monte Carlo methods in practice.
  • Doucet, Arnaud et al. (2000). On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing.
  • Faure, A. et al. (2006). Dynamical analysis of a generic Boolean model for the control of the mammalian cell cycle. Bioinformatics.
  • Fauré, Adrien et al. (2006). Dynamical analysis of a generic Boolean model for the control of the mammalian cell cycle. Bioinformatics.
  • Ghaffari, Noushin et al. (2013). Modeling the next generation sequencing sample processing pipeline for the purposes of classification. BMC Bioinformatics.
  • Ghahramani, Zoubin (1996). Parameter estimation for linear dynamical systems. Technical report.
  • Godsill, Simon J. et al. (2012). Monte Carlo smoothing for nonlinear time series. Journal of the American Statistical Association.
  • Gopaluni, R.B. (2008). A particle filter approach to identification of nonlinear processes under missing observations. The Canadian Journal of Chemical Engineering.
  • Hua, Jianping et al. (2012). Dynamical analysis of drug efficacy and mechanism of action using GFP reporters. Journal of Biological Systems.
  • Hürzeler, Markus et al. (1998). Monte Carlo approximations for general state-space models. Journal of Computational and Graphical Statistics.
  • Hürzeler, Markus et al. Approximating and maximising the likelihood for a general state-space model.
  • Imani, M., & Braga-Neto, U. M. (2015a). Optimal gene regulatory network inference using the Boolean Kalman filter and...
  • Imani, M., & Braga-Neto, U. M. (2015b). Optimal state estimation for Boolean dynamical systems using a Boolean Kalman...
    Mahdi Imani (S’15) received the B.Sc. degree in mechanical engineering and the M.Sc. degree in electrical engineering, both from the University of Tehran, Tehran, Iran, in 2012 and 2014, respectively. He is currently working toward the Ph.D. degree in the Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, USA. His current research interests include estimation and control of stochastic dynamical systems, with applications in genomic signal processing.

    Ulisses M. Braga-Neto (S’97-M’04-SM’11) received the Ph.D. degree in electrical and computer engineering from The Johns Hopkins University, Baltimore, MD, USA. He is an associate professor in the Department of Electrical and Computer Engineering and a member of the Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX, USA. He has held postdoctoral positions at the University of Texas M.D. Anderson Cancer Center, Houston, TX, and at the Oswaldo Cruz Foundation, Recife, Brazil. His research interests include pattern recognition and statistical signal processing. He is the author of the textbook Error Estimation for Pattern Recognition (IEEE-Wiley, 2015) and has received the NSF CAREER Award for his work in this area.

    The authors acknowledge the support of the National Science Foundation, through NSF award CCF-1320884. The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Thomas Bo Schön under the direction of Editor Torsten Söderström.
