Particle filters for partially-observed Boolean dynamical systems☆
Introduction
Partially-observed Boolean dynamical systems (POBDS) consist of a Boolean state process, also known as a Boolean network, observed through an arbitrary noisy mapping to a measurement space (Braga-Neto, 2011; Imani & Braga-Neto, 2015a, 2015b, 2016a, 2017a, 2018). Instances of POBDS abound in fields such as genomics (Kauffman, 1969), robotics (Roli, Manfroni, Pinciroli, & Birattari, 2011), and digital communication systems (Messerschmitt, 1990). The optimal recursive minimum mean-square error (MMSE) state estimators for this model are called the Boolean Kalman Filter (BKF) (Braga-Neto, 2011; Imani & Braga-Neto, 2017a) and the Boolean Kalman Smoother (BKS) (Imani & Braga-Neto, 2015b). These filters have many desirable properties; in particular, it can be shown that the MMSE estimate of the state vector provides both the MMSE and the maximum-a-posteriori (MAP) estimates of each state vector component.
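The component-wise property mentioned above can be illustrated with a small numerical sketch. It assumes that a posterior probability vector over all 2^d Boolean states is already available (how it is computed is not shown here), that states are enumerated in lexicographic order, and that ties at exactly 1/2 are resolved to 0; the function name `componentwise_estimate` is hypothetical and not taken from the paper.

```python
import itertools
import numpy as np

def componentwise_estimate(posterior, d):
    """Threshold the posterior mean of a Boolean vector at 1/2.

    posterior: length-2**d array with P(X = x | observations), where the
    states x are listed in lexicographic order (an assumption of this sketch).
    """
    # All 2**d Boolean states as rows of a (2**d, d) array.
    states = np.array(list(itertools.product([0, 1], repeat=d)))
    # E[X(i) | obs] equals P(X(i) = 1 | obs) because X(i) is 0/1 valued.
    mean = posterior @ states
    # Each component is Bernoulli, so its MAP value is 1 exactly when
    # P(X(i) = 1 | obs) exceeds 1/2 (ties go to 0 here, by convention).
    estimate = (mean > 0.5).astype(int)
    return mean, estimate

# Toy posterior over the states 00, 01, 10, 11.
posterior = np.array([0.1, 0.2, 0.3, 0.4])
mean, estimate = componentwise_estimate(posterior, d=2)
print(mean, estimate)
```

The thresholded vector need not coincide with the single most probable joint state; the sketch only concerns the per-component estimates.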
The BKF and BKS are exact algorithms, which is unusual in the class of general partially-observable nonlinear dynamical systems, of which POBDS is a special case. However, for systems with a large number of state variables, exact computation of the BKF and BKS becomes impractical due to large computational and memory requirements. In the general case, various approximations to the optimal estimator have been developed. The classical approach is to linearize at each time point through a first-order Taylor series expansion, and then apply the traditional Kalman filter solution; this approach is called the Extended Kalman Filter (EKF) (Jazwinski, 1970). The EKF cannot be applied to Boolean dynamical systems because the Boolean transition functions are not differentiable. There is a class of schemes that can be applied to system functions without derivatives, collectively called derivativeless filters (Van Der Merwe, 2004), which include the Unscented Kalman Filter (UKF) (Julier, Uhlmann, & Durrant-Whyte, 1995), the Central Difference Filter (CDF) (Ito & Xiong, 2000), and the Divided Difference Filter (DDF) (Nørgaard, Poulsen, & Ravn, 2000). These derivativeless filters are special cases of the general class of Sigma-Point Kalman Filters (SPKF) in Van Der Merwe (2004). However, SPKF theory has not been developed for discrete distributions such as the ones involved in Boolean dynamical systems. The only well-known approximation that can also be applied to POBDS is the class of Sequential Monte-Carlo (SMC) methods (Doucet et al., 2001; Kantas et al., 2015). In Braga-Neto (2013), an approximate sequential Monte-Carlo algorithm was proposed to compute the BKF using sequential importance resampling (SIR). By contrast, we propose here SMC algorithms for both the BKF and the fixed-interval BKS using the more efficient auxiliary particle filter (APF) algorithm (Pitt & Shephard, 1999); these are called the APF–BKF and APF–BKS algorithms, respectively.
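To make the SMC idea concrete, the following is a minimal sketch of a plain SIR (bootstrap) particle filter for a Boolean state, not the APF-based algorithms proposed in the paper. Everything model-specific is an assumption made for illustration: the state evolves as a network function followed by independent bit flips with probability `p`, the observation is the state plus Gaussian noise with standard deviation `sigma`, and the network function `shift` is a made-up cyclic shift.

```python
import numpy as np

def shift(x):
    # Hypothetical network function: cyclic shift of the Boolean vector.
    return np.roll(x, 1, axis=-1)

def sir_boolean_filter(ys, f, d, p, sigma, n_particles=500, seed=0):
    """Bootstrap (SIR) particle filter returning one Boolean estimate per step."""
    rng = np.random.default_rng(seed)
    # Uniform prior over Boolean states (an assumption of this sketch).
    particles = rng.integers(0, 2, size=(n_particles, d))
    estimates = []
    for y in ys:
        # Propagate: apply the network function, then flip bits with prob. p.
        flips = rng.random((n_particles, d)) < p
        particles = np.logical_xor(f(particles), flips).astype(int)
        # Weight by the (assumed) Gaussian observation likelihood.
        logw = -0.5 * np.sum((y - particles) ** 2, axis=1) / sigma**2
        w = np.exp(logw - logw.max())
        w /= w.sum()
        # Weighted mean thresholded at 1/2 approximates the Boolean estimate.
        estimates.append((w @ particles > 0.5).astype(int))
        # Multinomial resampling to counter weight degeneracy.
        particles = particles[rng.choice(n_particles, size=n_particles, p=w)]
    return np.array(estimates)

# Simulate a short trajectory from the same assumed model and filter it.
rng = np.random.default_rng(1)
d, T = 3, 20
x = rng.integers(0, 2, size=d)
xs, ys = [], []
for _ in range(T):
    x = np.logical_xor(shift(x), rng.random(d) < 0.05).astype(int)
    xs.append(x)
    ys.append(x + 0.3 * rng.normal(size=d))
est = sir_boolean_filter(np.array(ys), shift, d, p=0.05, sigma=0.3)
print("fraction of bits recovered:", np.mean(est == np.array(xs)))
```

An auxiliary particle filter differs from this sketch in that it uses the incoming observation to pre-select which particles to propagate, which is the source of the efficiency gain cited above.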
The BKF and BKS require for their application that all system parameters be known. In the case where noise intensities, the network topology, or observational parameters are not known or only partially known, an adaptive scheme to simultaneously estimate the state and parameters of the system is required. An exact adaptive filtering framework to accomplish that task was proposed recently in Imani and Braga-Neto (2017a), which is based on the BKF and BKS in conjunction with maximum-likelihood estimation of the parameters. In this paper, we develop an accurate and efficient particle filtering implementation of the adaptive filtering framework in Imani and Braga-Neto (2017a), which is suitable for large systems.
Several exact and approximate adaptive filters for dual and joint state and parameter estimation for general Hidden Markov Models (HMM) have been proposed (Chen et al., 2005; Kantas et al., 2015; Liu & West, 2001). However, these methods are quite sensitive to initialization and do not necessarily result in maximum-likelihood estimates of the parameters. Another class of methods is provided by gradient-based maximization techniques (DeJong et al., 2013; Hürzeler & Künsch, 2001; Malik & Pitt, 2011), which try to directly maximize the log-likelihood function using particle-based techniques. However, most of these methods suffer from unreliable approximation of the likelihood function or high computational complexity. The expectation maximization (EM) algorithm is a very popular alternative procedure for likelihood maximization, originally introduced by Dempster, Laird, and Rubin (1977). Applications of EM to linear/Gaussian state space models (Ghahramani, 1996), general nonlinear/non-Gaussian state space models (Gopaluni, 2008; Schön et al., 2011; Wills et al., 2013) and the POBDS model (Imani & Braga-Neto, 2017a) have been proposed. Gaussian smoothing and sigma-point based approximations in an EM context have also been discussed in Väänänen et al. (2012). Particle-based implementations of EM filters have shown more numerical stability than gradient-based techniques (Kantas et al., 2015). For more information about inference methods for HMMs, the reader is referred to Kantas et al. (2015).
For POBDS with a discrete (finite) parameter space, the adaptive filter we propose in this paper uses a bank of particle filters (APF–BKFs) in parallel, which is reminiscent of the Multiple Model Adaptive Estimation (MMAE) procedure in classical linear filtering (Maybeck & Hanlon, 1995). The log-likelihood approximated by each APF–BKF is used to obtain the ML estimate of the parameters, and the approximate MMSE state estimate of the selected filter yields the estimate of the state. On the other hand, if the parameter space is continuous, the available particle-based EM methods for general HMMs with continuous state space do not result in efficient estimation in POBDS models. To address this, we propose an efficient particle-based EM method for POBDS based on a modified Forward Filter Backward Simulation (FFBSi) algorithm (Godsill, Doucet, & West, 2012). The proposed filter yields the following advantages in comparison to the original EM method introduced in Imani and Braga-Neto (2017a):
- (1) Smaller computational complexity of smoothing at the E-step.
- (2) Smaller memory requirement to store the required matrices and vectors (e.g. the posterior probability vectors) from the E-step to the M-step.
- (3) Reduced complexity of each iteration in the M-step, in which several function evaluations are required.
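For the discrete-parameter case described before the list above, the bank-of-filters idea can be sketched as follows: run one particle filter per candidate parameter value, accumulate an approximate log-likelihood in each, and select the candidate with the largest value. This is only an illustration under stated assumptions. It uses a plain bootstrap filter rather than the APF–BKF, the unknown parameter is taken to be a bit-flip probability `p`, observations are assumed Gaussian with known `sigma`, and the network function `shift` and all function names are hypothetical. The state estimate of the selected filter is omitted for brevity.

```python
import numpy as np

def shift(x):
    # Hypothetical network function: cyclic shift of the Boolean vector.
    return np.roll(x, 1, axis=-1)

def simulate(T, d, p, sigma, rng):
    """Draw observations from the assumed model (bit flips + Gaussian noise)."""
    x = rng.integers(0, 2, size=d)
    ys = []
    for _ in range(T):
        x = np.logical_xor(shift(x), rng.random(d) < p).astype(int)
        ys.append(x + sigma * rng.normal(size=d))
    return np.array(ys)

def pf_loglik(ys, p, sigma, n_particles=1000, seed=0):
    """Bootstrap particle filter estimate of log p(y_1, ..., y_T | p)."""
    rng = np.random.default_rng(seed)
    d = ys.shape[1]
    x = rng.integers(0, 2, size=(n_particles, d))
    loglik = 0.0
    for y in ys:
        flips = rng.random((n_particles, d)) < p
        x = np.logical_xor(shift(x), flips).astype(int)
        logw = (-0.5 * np.sum((y - x) ** 2, axis=1) / sigma**2
                - d * np.log(sigma * np.sqrt(2.0 * np.pi)))
        m = logw.max()
        w = np.exp(logw - m)
        # Mean unnormalized weight approximates p(y_k | y_1, ..., y_{k-1}).
        loglik += m + np.log(w.mean())
        x = x[rng.choice(n_particles, size=n_particles, p=w / w.sum())]
    return loglik

def select_ml(ys, candidates, sigma):
    """Run one filter per candidate and return the approximate ML choice."""
    scores = {p: pf_loglik(ys, p, sigma) for p in candidates}
    return max(scores, key=scores.get), scores

rng = np.random.default_rng(1)
ys = simulate(T=100, d=4, p=0.05, sigma=0.4, rng=rng)
candidates = [0.01, 0.05, 0.2, 0.4]
best, scores = select_ml(ys, candidates, sigma=0.4)
print("selected p:", best)
print(scores)
```

Because the filters do not interact, the loop over candidates can be run in parallel, which matches the "bank of filters" description.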
The application of interest in this paper is to model Boolean gene regulatory networks (Kauffman, 1969; Shmulevich et al., 2002) observed through a single time series of RNA-seq data (Marguerat & Bahler, 2010). Using the POBDS model, we employ the proposed approximate adaptive ML algorithm to estimate the gene expression state simultaneously with the inference of the network topology and the noise and expression parameters. Performance is assessed through a series of numerical experiments using the well-known cell-cycle gene regulatory model (Fauré, Naldi, Chaouiya, & Thieffry, 2006). The influence of transition noise, expression parameters, and RNA-seq measurement noise (data dispersion) on performance is studied, and the consistency of the adaptive ML filter (i.e., convergence to true parameter values) is empirically established.
The article is organized as follows. In Section 2, the POBDS signal model and the Boolean Kalman Filter and Boolean Kalman Smoother are reviewed, while in Section 3, a detailed description of the APF-based filtering and smoothing algorithms proposed in this paper is provided. In Section 4, the particle-based ML adaptive filter is developed for discrete and continuous parameter spaces. A POBDS model for gene regulatory networks observed through RNA-seq measurements is reviewed in Section 5. Results for the numerical experiments with the cell-cycle network are presented in Section 6. Finally, Section 7 contains concluding remarks.
Optimal state estimators for POBDS
In this section, we review the POBDS model and exact algorithms for computation of its optimal state estimators. For more details, see Imani & Braga-Neto (2017a), Imani & Braga-Neto (2017b), Imani & Braga-Neto (2017c), Imani & Braga-Neto (2018), McClenny et al. (2017a), McClenny et al. (2017b), McClenny et al. (2017c). For a proof of optimality of the BKF, see Imani & Braga-Neto (2017a).
We assume that the system is described by a state process, where is a Boolean vector of
Particle filters for state estimation
When the number of states is large, the exact computation of the BKF and the BKS becomes intractable, due to the large size of the matrices involved, in which each contains elements, and approximate methods must be used, such as sequential Monte-Carlo methods, also known as particle filter algorithms. In the next subsections, we describe particle filter implementation of the BKF and BKS.
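The growth in storage can be checked with simple arithmetic. The sketch below assumes that each matrix is dense, is indexed by pairs of Boolean states (so it has 2^d rows and 2^d columns for a state vector with d components), and stores 8-byte floating-point entries; these are assumptions for illustration, and the helper name `exact_filter_storage` is hypothetical.

```python
def exact_filter_storage(d, bytes_per_entry=8):
    """Return (number of states, matrix entries, bytes) for one dense matrix.

    Assumes a matrix indexed by pairs of Boolean states of dimension d.
    """
    n_states = 2 ** d            # number of distinct Boolean vectors
    n_entries = n_states ** 2    # one entry per ordered pair of states
    return n_states, n_entries, n_entries * bytes_per_entry

for d in (10, 20, 30):
    n_states, n_entries, n_bytes = exact_filter_storage(d)
    print(f"d={d}: {n_states} states, {n_entries:.3e} entries, "
          f"{n_bytes / 2**30:.3g} GiB")
```

Under these assumptions, a 10-variable network needs a few megabytes per matrix, while a 20-variable network already needs terabytes, which is why sampling-based approximations become necessary.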
Particle filters for maximum-likelihood adaptive estimation
Suppose that the nonlinear signal model in (1) is incompletely specified. For example, the deterministic functions and may be only partially known, or the statistics of the noise processes and may need to be estimated. By assuming that the missing information can be coded into a finite-dimensional vector parameter, where is the parameter space, we propose next particle filtering approaches for simultaneous state and parameter estimation for POBDS. For simplicity and conciseness, we
Gene regulatory network and RNA-Seq measurement models
The algorithms developed in the previous section apply to the general POBDS signal model in (1). In this section, we describe a specific instance of that model, which allows the application of the methodology to Boolean gene regulatory networks observed through noisy gene-expression data. There are several gene-expression measurement technologies currently in use, such as cDNA microarrays (Chen, Dougherty, & Bittner, 1997) and live cell imaging-based assays (Hua, Sima, Cypert, Gooden, Shack,
Numerical experiments
In this section, we carry out detailed numerical experiments to assess the performance of the developed particle-based methods. We base our experiments on the well-known Mammalian Cell-Cycle network (Fauré, Naldi, Chaouiya, & Thieffry, 2006). The pathway diagram for this network is presented in Fig. 4. The state vector is (CycD, Rb, p27, E2F, CycE, CycA, Cdc20, Cdh1, UbcH10, CycB). The gene interaction parameters can be read off Fig. 4 easily. As an example, Rb is activated by p27, and is
Conclusion
In this paper, we introduced approximate particle-based algorithms for state and simultaneous state and parameter estimation for large partially-observed Boolean dynamical systems. For approximate state estimation, filtering and smoothing methods based on auxiliary particle filtering (APF) were developed to approximate the optimal BKF and BKS. These algorithms are called APF–BKF and APF–BKS, and are original contributions of this work.
Moreover, we considered the case where some of the
Mahdi Imani (S’15) received the B.Sc. degree in mechanical engineering and the M.Sc. degree in electrical engineering, both from the University of Tehran, Tehran, Iran, in 2012 and 2014, respectively. He is currently working toward the Ph.D. degree in the Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, USA. His current research interests include estimation and control of stochastic dynamical systems, with applications in genomic signal processing.
References
- et al. Particle filters for state and parameter estimation in batch processes. Journal of Process Control (2005).
- Metabolic stability and epigenesis in randomly constructed genetic nets. Journal of Theoretical Biology (1969).
- et al. Particle filters for continuous likelihood evaluation and maximisation. Journal of Econometrics (2011).
- et al. New developments in state estimation for nonlinear systems. Automatica (2000).
- et al. System identification of nonlinear state-space models. Automatica (2011).
- et al. Identification of Hammerstein–Wiener models. Automatica (2013).
- et al. On approximate maximum-likelihood methods for blind identification: how to cope with the curse of dimensionality. IEEE Transactions on Signal Processing (2009).
- et al. Improving ultimate convergence of an augmented Lagrangian method. Optimization Methods & Software (2008).
- Optimal state estimation for Boolean dynamical systems.
- Braga-Neto, U. M. (2013). Particle filtering approach to state estimation in Boolean dynamical systems. In Proceedings...
- Ratio-based decisions and the quantitative analysis of cDNA microarray images. Journal of Biomedical Optics.
- Efficient likelihood evaluation of state-space representations. Review of Economic Studies.
- Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B.
- Sequential Monte Carlo methods in practice.
- On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing.
- Dynamical analysis of a generic Boolean model for the control of the mammalian cell cycle. Bioinformatics.
- Modeling the next generation sequencing sample processing pipeline for the purposes of classification. BMC Bioinformatics.
- Parameter estimation for linear dynamical systems. Technical report.
- Monte Carlo smoothing for nonlinear time series. Journal of the American Statistical Association.
- A particle filter approach to identification of nonlinear processes under missing observations. The Canadian Journal of Chemical Engineering.
- Dynamical analysis of drug efficacy and mechanism of action using GFP reporters. Journal of Biological Systems.
- Monte Carlo approximations for general state-space models. Journal of Computational and Graphical Statistics.
- Approximating and maximising the likelihood for a general state-space model.
Ulisses M. Braga-Neto (S’97-M’04-SM’11) received the Ph.D. degree in electrical and computer engineering from The Johns Hopkins University, Baltimore, MD, USA. He is an associate professor in the Department of Electrical and Computer Engineering and a member of the Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX, USA. He has held postdoctoral positions at the University of Texas M.D. Anderson Cancer Center, Houston, TX, and at the Oswaldo Cruz Foundation, Recife, Brazil. His research interests include pattern recognition and statistical signal processing. He is the author of the textbook Error Estimation for Pattern Recognition (IEEE-Wiley, 2015) and has received the NSF CAREER Award for his work in this area.
☆ The authors acknowledge the support of the National Science Foundation, through NSF award CCF-1320884. The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Thomas Bo Schön under the direction of Editor Torsten Söderström.