Imprecise global sensitivity analysis using Bayesian multimodel inference and importance sampling

https://doi.org/10.1016/j.ymssp.2020.107162

Highlights

  • A fully probabilistic multimodel approach for imprecise GSA is proposed.

  • This approach employs Bayesian multimodel inference to quantify uncertainties in model inputs.

  • The multimodel inference is combined with an importance reweighting scheme for estimation of imprecise Sobol indices.

  • The methodology can be used to assess confidence in computed sensitivity indices and inform testing and data collection efforts.

Abstract

Global Sensitivity Analysis (GSA) aims to understand the relative importance of uncertain input variables to model response. Conventional GSA involves calculating sensitivity (Sobol’) indices for a model with known model parameter distributions. However, model parameters are affected by aleatory and epistemic uncertainty, with the latter often caused by lack of data. We propose a new framework to quantify uncertainty in probability model-form and model parameters resulting from small datasets and integrate these uncertainties into Sobol’ index estimates. First, the process establishes, through Bayesian multimodel inference, a set of candidate probability models and their associated probabilities. Imprecise Sobol’ indices are calculated from these probability models using an importance sampling reweighting approach. This results in probabilistic Sobol’ indices, whose distribution characterizes uncertainty in the sensitivity resulting from small dataset size. The imprecise Sobol’ indices thus provide a measure of confidence in the sensitivity estimate and, moreover, can be used to inform data collection efforts targeted to minimize the impact of uncertainties. Through an example studying the parameters of a Timoshenko beam, we show that these probabilistic Sobol’ indices converge to the true/deterministic Sobol’ indices as the dataset size increases and hence, distribution-form uncertainty reduces. The approach is then applied to assess the sensitivity of the out-of-plane properties of an E-glass fiber composite material to its constituent properties. This second example illustrates the approach for an important class of materials with wide-ranging applications when data may be lacking for some input parameters.

Introduction

Computational simulation is widely used to better understand complex mathematical and physical systems. Typically these simulations aim to model the system response, specifically targeting some quantity of interest (QoI) and its dependence on some set of input model parameters. In many cases, these input parameters are not precisely known or deterministic because of inherent randomness in the system and/or a lack of knowledge/data. In traditional analyses, uncertainty is quantified probabilistically and propagated through the computational model. Consequently, the corresponding QoI is uncertain.

Sensitivity analysis is the quantitative assessment of the influence of variations in input parameters on variations in the output QoI. Sensitivity analysis therefore enables the assessment of the relative importance of each model input to the specific QoI [1], [2]. Generally speaking, there are two classes of sensitivity analysis: Local Sensitivity Analysis (LSA) and Global Sensitivity Analysis (GSA). LSA focuses on the influence of small variations in the input parameters around some nominal values and is typically quantified by evaluating local gradients of the system solution with respect to the input parameters. GSA, on the other hand, examines the effect of variations in the input parameters across the entire range of the parameter space. We are specifically interested in GSA in this work. Two prominent methods have been developed for GSA. The first is the one-factor-at-a-time approach, in which each parameter is perturbed in turn while all other parameters are held fixed at their nominal values. This approach has been shown to break down when the model deviates from linearity [3] and cannot account for a probability density over the parameter domain. The second is based on the estimation of variance-based sensitivity indices (so-called Sobol’ indices) [4], which prove more robust to model form and rely on each parameter having an associated probability measure.

A variety of methods have been proposed for variance-based GSA that employ, for example, Fourier analysis as in the FAST (Fourier Amplitude Sensitivity Test) algorithm [5], [6], Monte Carlo methods [7], [8], and surrogate modeling methods such as polynomial chaos [9], [10], Gaussian processes/Kriging [11], and support vector machines [12] considering both independent and correlated input parameters [13]. Moreover, the range of recent applications for GSA spans across physical systems of all types from chemistry [14] to structural acoustics [15] and structural vibration of nuclear reactor assemblies [16]. In this work, we employ Monte Carlo estimates of global sensitivity indices because they are robust, simple to use, and couple conveniently with the multimodel uncertainty quantification methods for imprecise probabilities proposed herein. Extension of the concepts proposed herein to utilize more efficient GSA algorithms thus remains an important research objective.

Conventional GSA methods are built from a classical probabilistic framework in which input uncertainty is characterized through a known probability measure on the input parameters. Hence, the first step of GSA is to identify or assume a reasonable probability distribution for the input variables. However, as commonly occurs when data characterizing the input parameters are sparse, it may be impossible to identify the appropriate distribution. This can lead to assumptions that may seem reasonable but are nonetheless subjective and may yield sensitivity indices that are inaccurate, non-conservative, or even potentially unreasonable with no quantitative error metrics. These errors arise from so-called epistemic uncertainty – or uncertainty that arises from a lack of knowledge or data and can therefore be reduced with additional data collection [17]. The inclusion of epistemic uncertainty gives rise to two distinct objectives for imprecise global sensitivity analysis:

  1. To assess confidence in estimates of conventional global sensitivity indices.

  2. To specifically understand the influence of epistemic uncertainty on system response, and therefore inform data collection efforts such that they reduce the impact of uncertainty in system response.

To address both objectives in this work, we propose a method that infers a probability distribution for the Sobol’ indices. Thus, toward the first objective, the method provides a natural measure of confidence. Toward the second objective, the inferred distribution identifies parameters that may contribute significantly to system response and also have large uncertainty. Therefore, it is useful in identifying parameters that should be targeted for further data collection efforts. Generally speaking, several theories have been proposed to address the various forms of epistemic uncertainty. It has been argued that epistemic uncertainty requires a different mathematical treatment than aleatory uncertainty [18], which arises from irreducible randomness and is naturally treated probabilistically. However, no scientific consensus has been achieved in terms of what that treatment should be. The larger problem is that a unified mathematical treatment of epistemic uncertainty must be general enough to accommodate the many types of epistemic uncertainty – such as those that arise from vague, qualitative, or conflicting data, total or near-total ignorance, small datasets that give limited information about a probability measure, and the “unknown unknown” among others. Different mathematical theories have been developed which account for some of these epistemic uncertainties, typically by generalizing the measure used to quantify events in the algebra of possibilities. Specific examples include possibility theory, which introduces the concepts of possibility and necessity measures [19], Dempster-Shafer/Evidence theory, which introduces the concepts of belief and plausibility measures [20], [21], Choquet capacities, which introduce conjugate measures of lower and upper probability [22], fuzzy sets [23], [24] and fuzzy measures [25], random sets and sets of probability measures [26], [27], and interval methods [28] including interval probabilities and probability boxes (p-boxes) [18]. In the intervening years, efforts have been made to unify these theories under an over-arching theory of imprecise probabilities, most notably through the works of Walley [29], [30].

Perhaps more relevant for our purposes, many recent efforts have been made in the engineering community to translate these theories of imprecise probabilities into the practice of uncertainty quantification and stochastic analysis in computational modeling. This includes efforts to: extend Monte Carlo simulations [31], [32], [33], develop non-intrusive stochastic simulation methods for imprecise probability models [34], [35], [36], [37], construct p-boxes and Dempster-Shafer belief functions [38], understand the effects of imprecision on reliability/probability of failure [39], [40], [41], [42], [43], propagate p-boxes [44], [45], and perform Bayesian model averaging [46], to name just a few. A concise review of imprecise probability methods in engineering can be found in [47]. Among the many studies of epistemic uncertainty in engineering, relatively few have considered the problem of imprecise global sensitivity analysis. To the authors’ knowledge, such investigations date back a little more than a decade, beginning with the work of Helton et al., who merged GSA with evidence theory to estimate imprecise sensitivity measures [48], and Hall, who estimated the upper and lower bounds of sensitivity indices when the uncertainty in the model inputs is expressed as closed convex sets of probability measures [49]. Among the most comprehensive studies of imprecise GSA is that conducted by Oberguggenberger et al. [50], who compared classical GSA with GSA for problems with uncertainties defined by random sets, fuzzy sets, and intervals. Other recent works include those of Song et al. [51], who devised GSA for input uncertainties characterized by p-boxes and computed the imprecise sensitivity indices using extended Monte Carlo simulation (EMCS) [31]. Li and Mahadevan [52] devised a scheme to estimate Sobol’ indices when both aleatory and epistemic uncertainty are present in time series data. More recently, Wei et al. [53] proposed a probabilistic framework where epistemic input uncertainties are characterized by second-order probability models to compare the relative importance of influential and non-influential input variables using EMCS. Schöbi and Sudret [54] used polynomial chaos expansions to develop interval Sobol’ indices when input uncertainties are characterized by p-boxes, and Hart and Gremaud [55] provided a theoretical analysis of the robustness of the Sobol’ indices to changes in the distribution of the uncertain variables.

In this study, we investigate imprecise GSA where epistemic uncertainty specifically results from sparse data sets, which may be compiled from disparate data sources. This focus is motivated by the difficulty of data collection under complex conditions in engineering practice. In many cases, experimental or validated simulation data for quantifying parameter uncertainties are strictly limited. As a result, it is impossible to assign an objective and accurate probability distribution to the input variables of a computational model and to precisely estimate their impact on the response output. The specific contributions of this study can be summarized as follows:

  • We derive a formulation for computing main effect and total sensitivity indices using importance sampling.

  • We apply a Bayesian multimodel inference process in conjunction with the importance sampling-based GSA to estimate the distribution of imprecise Sobol’ indices whose uncertainty results from lack of data quantifying input uncertainty.

  • We demonstrate how to obtain the optimal sampling density to efficiently employ the proposed multimodel GSA for engineering systems.

  • We illustrate how to use the results of the proposed imprecise GSA to assess confidence in computed sensitivity indices and inform future testing and data collection.

It is important to emphasize that the novelty of the work presented here lies in bringing the components of importance sampling based GSA and Bayesian multimodel inference together for a robust and efficient approach to imprecise GSA. The development of the individual components of the approach is not the primary intent. Indeed previous works have, for example, derived formulations for computing sensitivity indices using importance sampling (see e.g. [53]) – although the formulation provided here has some distinct differences and advantages that are discussed below.

The paper is structured as follows. We begin in Section 2 by presenting a brief review of GSA, particularly variance-based methods and Sobol’ indices. Section 3 introduces the Bayesian multimodel inference methodology and its application for identifying imprecise probability models given limited data. An efficient importance sampling-based Monte Carlo method for imprecise global sensitivity analysis is then proposed in Section 4. The effectiveness of the proposed algorithm is illustrated by a closed-form numerical example in Section 5. Section 6 presents an application of the proposed method to predicting the sensitivity of composite material properties to the properties of its constituents. Finally, some concluding remarks are provided in Section 7.

Section snippets

Variance-based methods for GSA

Variance-based GSA decomposes the variance of the model output into fractions which can be attributed to input variables within a probabilistic framework. Variance-based measures of sensitivity are attractive as they are applicable over the whole space of input random variables, and they can also deal with nonlinear responses and measure the effect of interactions in nonadditive systems [8]. Here we review some of the basic principles of variance-based GSA.
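As a minimal sketch of how such variance-based indices can be estimated by Monte Carlo, the Python snippet below uses a pick-freeze scheme with two independent sample matrices. It is an illustration under stated assumptions, not the estimator used in this paper: the inputs are assumed independent and uniform on [0, 1), the first-order estimator follows the form commonly attributed to Saltelli et al. and the total-effect estimator the form attributed to Jansen, and the names `sobol_indices` and `linear_model` are hypothetical.

```python
import random


def sobol_indices(model, d, n, rng):
    """Pick-freeze Monte Carlo estimates of first-order and total Sobol' indices.

    Assumption of this sketch: d independent inputs, each uniform on [0, 1).
    Cost: n * (d + 2) model evaluations.
    """
    a = [[rng.random() for _ in range(d)] for _ in range(n)]
    b = [[rng.random() for _ in range(d)] for _ in range(n)]
    ya = [model(x) for x in a]
    yb = [model(x) for x in b]

    # Output mean and variance estimated from both base samples.
    pooled = ya + yb
    mean_y = sum(pooled) / len(pooled)
    var_y = sum((y - mean_y) ** 2 for y in pooled) / len(pooled)

    first, total = [], []
    for i in range(d):
        # AB_i: rows of A with column i replaced by column i of B.
        yab = [model(xa[:i] + [xb[i]] + xa[i + 1:]) for xa, xb in zip(a, b)]
        # First-order effect; centering f(B) only reduces estimator variance.
        v_i = sum((fb - mean_y) * (fab - fa)
                  for fa, fb, fab in zip(ya, yb, yab)) / n
        # Total effect: half the mean squared change when only x_i is resampled.
        vt_i = sum((fa - fab) ** 2 for fa, fab in zip(ya, yab)) / (2 * n)
        first.append(v_i / var_y)
        total.append(vt_i / var_y)
    return first, total


def linear_model(x):
    """Hypothetical additive test model with coefficients 1, 2, 3."""
    return x[0] + 2.0 * x[1] + 3.0 * x[2]


first, total = sobol_indices(linear_model, d=3, n=20000, rng=random.Random(0))
print("first-order:", [round(s, 3) for s in first])
print("total:      ", [round(t, 3) for t in total])
```

For an additive model such as `linear_model` with identically distributed inputs, the first-order and total indices coincide and each equals a_i^2 / sum_j a_j^2, which gives a quick sanity check on the estimates.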

Bayesian multimodel inference for imprecise probabilities from limited data

Methods for global sensitivity analysis, as presented above, employ samples that are drawn from a known probability density. However, identifying a probability density requires either a large data set characterizing the distribution or some assumptions. In engineering practice, however, it is common that only small/limited data are available such that a unique probability distribution cannot be identified without significant (and potentially problematic) assumptions. In this section, we review
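To make the idea of candidate probability models with associated probabilities concrete, the following Python sketch assigns posterior probabilities to three candidate families fitted to a small dataset. Everything specific here is an assumption made for illustration: the candidate set (normal, lognormal, exponential), the equal prior model probabilities, and above all the use of the BIC approximation to the model evidence, which may differ from how the paper computes model probabilities. The function names are hypothetical, and the sketch covers model-form uncertainty only, not the parameter uncertainty within each model.

```python
import math
import random


def normal_loglik(data):
    """Maximized log-likelihood of a normal model (closed-form MLE)."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def lognormal_loglik(data):
    """Maximized log-likelihood of a lognormal model; data must be positive."""
    logs = [math.log(x) for x in data]
    # Lognormal density of x = normal density of log(x) divided by x.
    return normal_loglik(logs) - sum(logs)


def exponential_loglik(data):
    """Maximized log-likelihood of an exponential model; data must be positive."""
    n = len(data)
    rate = n / sum(data)
    return n * math.log(rate) - n


def model_probabilities(data, priors=None):
    """Approximate posterior model probabilities from BIC (an assumption here)."""
    n = len(data)
    candidates = {  # name: (maximized log-likelihood, number of parameters)
        "normal": (normal_loglik(data), 2),
        "lognormal": (lognormal_loglik(data), 2),
        "exponential": (exponential_loglik(data), 1),
    }
    if priors is None:
        priors = {name: 1.0 / len(candidates) for name in candidates}
    # log evidence is approximated by -BIC/2 = loglik - (k / 2) * log(n).
    log_post = {name: ll - 0.5 * k * math.log(n) + math.log(priors[name])
                for name, (ll, k) in candidates.items()}
    shift = max(log_post.values())  # subtract the max for numerical stability
    unnorm = {name: math.exp(v - shift) for name, v in log_post.items()}
    norm = sum(unnorm.values())
    return {name: u / norm for name, u in unnorm.items()}


rng = random.Random(3)
small_data = [rng.lognormvariate(0.0, 0.5) for _ in range(10)]
large_data = [rng.lognormvariate(0.0, 0.5) for _ in range(1000)]
print("n = 10:  ", model_probabilities(small_data))
print("n = 1000:", model_probabilities(large_data))
```

With only a handful of points the probabilities are typically spread over several families, whereas with a large sample they tend to concentrate on the generating family. This mirrors the reduction in distribution-form uncertainty with dataset size described in the abstract.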

Imprecise global sensitivity analysis

If the conventional Monte Carlo method is used to estimate Sobol’ indices for each model in P, the total computational cost is

$$C_{MC} = \underbrace{N_c \times N_c \times \cdots \times N_c}_{d} \times (d+1) \times m = N_c^{d} \times (d+2) \times m$$

$N_c$ must be sufficiently large to adequately represent the overall uncertainties in both model form and model parameters. In fact, as previously stated, we want to allow $N_c$ to be arbitrarily large. This leads to a huge number of simulations, making estimation of Sobol’ indices intractable and cost prohibitive.
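The cost argument above can be illustrated with a reweighting sketch in Python. One set of model runs is generated from a single sampling density, and first-order indices for each candidate input distribution are then obtained from importance weights alone, so the number of model evaluations does not grow with the number of candidate models. This is a simplified illustration, not the formulation derived in the paper: the inputs are assumed independent and zero-mean normal, the sampling density is an arbitrarily chosen wider normal (not an optimal density), the estimator is a self-normalized weighted pick-freeze estimator, and all names are hypothetical.

```python
import math
import random


def normal_pdf(x, sigma):
    """Density of a zero-mean normal with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def joint_weight(x, target_sigmas, sampling_sigma):
    """Ratio p(x) / q(x) for independent zero-mean normal inputs."""
    w = 1.0
    for xi, s in zip(x, target_sigmas):
        w *= normal_pdf(xi, s) / normal_pdf(xi, sampling_sigma)
    return w


def reweighted_first_order(model, candidates, sampling_sigma, d, n, rng):
    """First-order indices for every candidate from one shared set of runs."""
    # Model runs happen once: n * (d + 2) evaluations in total.
    a = [[rng.gauss(0.0, sampling_sigma) for _ in range(d)] for _ in range(n)]
    b = [[rng.gauss(0.0, sampling_sigma) for _ in range(d)] for _ in range(n)]
    ya = [model(x) for x in a]
    yb = [model(x) for x in b]
    yab = [[model(xa[:i] + [xb[i]] + xa[i + 1:]) for xa, xb in zip(a, b)]
           for i in range(d)]

    results = {}
    for name, sigmas in candidates.items():
        # Reweighting needs density evaluations only, no new model runs.
        w = [joint_weight(xa, sigmas, sampling_sigma)
             * joint_weight(xb, sigmas, sampling_sigma)
             for xa, xb in zip(a, b)]
        w_sum = sum(w)
        mean = sum(wj * y for wj, y in zip(w, ya)) / w_sum
        var = sum(wj * (y - mean) ** 2 for wj, y in zip(w, ya)) / w_sum
        results[name] = [
            sum(wj * (fb - mean) * (fab - fa)
                for wj, fa, fb, fab in zip(w, ya, yb, yab[i])) / w_sum / var
            for i in range(d)
        ]
    return results


def additive_model(x):
    """Hypothetical test model."""
    return x[0] + 2.0 * x[1]


# Hypothetical candidate input models: standard deviations of (x1, x2).
candidates = {"model_1": [1.0, 1.0], "model_2": [1.2, 0.8]}
results = reweighted_first_order(additive_model, candidates,
                                 sampling_sigma=1.5, d=2, n=40000,
                                 rng=random.Random(1))
for name, indices in results.items():
    print(name, [round(s, 3) for s in indices])
```

For this additive test model the exact first-order indices are (0.2, 0.8) under `model_1` and (0.36, 0.64) under `model_2`, which the reweighted estimates should approach as `n` grows. The sampling density is deliberately wider than every candidate so that the weights stay bounded.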

Numerical example

In this section, we use a thick cantilever beam example [60], shown in Fig. 1, to illustrate the proposed methodology for the estimation of imprecise sensitivity indices. This example computes the beam’s deflection δ analytically using Timoshenko beam theory [76]:

$$\delta = \frac{P}{6EI}\left[(4+5\nu)\frac{h^{2}L}{4} + 2L^{3}\right] \tag{35}$$

where $I = bh^{3}/12$. The statistical information for the model inputs in Eq. (35) is listed in Table 1 [60].
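The deflection expression above is cheap to evaluate, which is what makes it convenient as a test model. A minimal Python sketch follows. The function name and the numerical inputs are hypothetical placeholders chosen for illustration, since Table 1 is not reproduced here, and a rectangular cross-section of width b and height h is assumed, consistent with I = bh^3/12.

```python
def timoshenko_deflection(P, E, nu, b, h, L):
    """Tip deflection of a thick cantilever beam under an end load P.

    P: load, E: Young's modulus, nu: Poisson's ratio,
    b: section width, h: section height, L: beam length (consistent units).
    """
    I = b * h ** 3 / 12.0  # second moment of area of the rectangular section
    shear_term = (4.0 + 5.0 * nu) * h ** 2 * L / 4.0
    bending_term = 2.0 * L ** 3
    return P / (6.0 * E * I) * (shear_term + bending_term)


# Hypothetical inputs in SI units, NOT the values of Table 1.
delta = timoshenko_deflection(P=1.0e3, E=200.0e9, nu=0.3, b=0.05, h=0.1, L=1.0)
print(f"deflection = {delta:.3e} m")
```

For a slender beam (h much smaller than L) the shear term becomes negligible and the expression reduces to the Euler-Bernoulli result PL^3/(3EI), which is a convenient check on an implementation.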

Table 1 provides a distribution for each random model input, as given in [60]. We assume that

Application: IGSA for out-of-plane composite lamina properties

Here, we explore the influence of the constituent (fiber and matrix) material properties on the out-of-plane elastic properties of a unidirectional composite lamina.

Conclusion

This work investigates the effect of uncertainties associated with small data sets for quantifying model inputs on the global sensitivity analysis of engineering systems. An effective method is presented to estimate imprecise first-order Sobol’ indices. These imprecise Sobol’ indices take a probabilistic form, such that instead of yielding known Sobol’ indices for a given problem, the method produces distributions for the Sobol’ indices reflecting the underlying epistemic uncertainty associated

CRediT authorship contribution statement

Jiaxin Zhang: Conceptualization, Formal analysis, Investigation, Methodology, Software, Writing - original draft, Writing - review & editing, Validation, Visualization. Stephanie TerMaath: Data curation, Investigation, Validation, Writing - review & editing. Michael D. Shields: Conceptualization, Methodology, Writing - review & editing, Funding acquisition, Project administration, Supervision, Validation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The work presented herein has been supported by the Office of Naval Research under Award Numbers N00014-16-1-2582 and N00014-16-1-2370 with Dr. Paul Hess as the program officer. The work of J. Zhang was supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under contract ERKJ352; and by the Artificial Intelligence Initiative at the Oak Ridge National Laboratory (ORNL). ORNL is operated by UT-Battelle, LLC.,

References (108)

  • J.-L. Christen et al.

    Global sensitivity analysis and uncertainties in SEA models of vibroacoustic systems

    Mech. Syst. Signal Process.

    (2017)
  • G.A. Banyay et al.

    Efficient global sensitivity analysis for flow-induced vibration of a nuclear reactor assembly using kriging surrogates

    Nucl. Eng. Des.

    (2019)
  • S. Ferson et al.

    Different methods are needed to propagate ignorance and variability

    Reliab. Eng. Syst. Saf.

    (1996)
  • L.A. Zadeh

    Fuzzy sets

    Inf. Control

    (1965)
  • T. Fetz et al.

    Propagation of uncertainty through multivariate functions in the framework of sets of probability measures

    Reliab. Eng. Syst. Saf.

    (2004)
  • P. Walley

    Towards a unified theory of imprecise probability

    Int. J. Approximate Reasoning

    (2000)
  • T. Fetz et al.

    Imprecise random variables, random sets, and Monte Carlo simulation

    Int. J. Approximate Reasoning

    (2016)
  • J. Zhang et al.

    On the quantification and efficient propagation of imprecise probabilities resulting from small datasets

    Mech. Syst. Signal Process.

    (2018)
  • P. Wei et al.

    Non-intrusive stochastic analysis with parameterized imprecise probability models: I. performance estimation

    Mech. Syst. Signal Process.

    (2019)
  • P. Wei et al.

    Non-intrusive stochastic analysis with parameterized imprecise probability models: II. reliability and rare events analysis

    Mech. Syst. Signal Process.

    (2019)
  • J. Song et al.

    Non-intrusive imprecise stochastic simulation by line sampling

    Struct. Saf.

    (2020)
  • J. Song et al.

    Generalization of non-intrusive imprecise stochastic simulation for mixed uncertain variables

    Mech. Syst. Signal Process.

    (2019)
  • R. Zhang et al.

    Integration of computation and testing for reliability estimation

    Reliab. Eng. Syst. Saf.

    (2001)
  • H. Zhang et al.

    Structural reliability analysis on the basis of small samples: an interval quasi-Monte Carlo method

    Mech. Syst. Signal Process.

    (2013)
  • S. Nannapaneni et al.

    Reliability analysis under epistemic uncertainty

    Reliab. Eng. Syst. Saf.

    (2016)
  • M.A. Valdebenito et al.

    Fuzzy failure probability estimation applying intervening variables

    Struct. Saf.

    (2020)
  • R. Schöbi et al.

    Uncertainty propagation of p-boxes using sparse polynomial chaos expansions

    J. Comput. Phys.

    (2017)
  • S. Bi et al.

    The Bhattacharyya distance: enriching the p-box in stochastic sensitivity analysis

    Mech. Syst. Signal Process.

    (2019)
  • S. Sankararaman et al.

    Distribution type uncertainty due to sparse and imprecise data

    Mech. Syst. Signal Process.

    (2013)
  • M. Beer et al.

    Imprecise probabilities in engineering analyses

    Mech. Syst. Signal Process.

    (2013)
  • J.C. Helton et al.

    Sensitivity analysis in conjunction with evidence theory representations of epistemic uncertainty

    Reliab. Eng. Syst. Saf.

    (2006)
  • J.W. Hall

    Uncertainty-based sensitivity indices for imprecise probability distributions

    Reliab. Eng. Syst. Saf.

    (2006)
  • M. Oberguggenberger et al.

    Classical and imprecise probability methods for sensitivity analysis in engineering: a case study

    Int. J. Approximate Reasoning

    (2009)
  • C. Li et al.

    Relative contributions of aleatory and epistemic uncertainty sources in time series prediction

    Int. J. Fatigue

    (2016)
  • P. Wei et al.

    A probabilistic procedure for quantifying the relative importance of model inputs characterized by second-order probability models

    Int. J. Approximate Reasoning

    (2018)
  • T. Homma et al.

    Importance measures in global sensitivity analysis of nonlinear models

    Reliab. Eng. Syst. Saf.

    (1996)
  • C. Li et al.

    An efficient modularized sample-based method to estimate the first order Sobol index

    Reliab. Eng. Syst. Saf.

    (2016)
  • I.M. Sobol

    On quasi-Monte Carlo integrations

    Math. Computers Simul.

    (1998)
  • M.D. Shields et al.

    The generalization of Latin hypercube sampling

    Reliab. Eng. Syst. Saf.

    (2016)
  • J. Zhang et al.

    The effect of prior probabilities on quantification and propagation of imprecise probabilities resulting from small datasets

    Comput. Methods Appl. Mech. Eng.

    (2018)
  • L. Martino et al.

    Effective sample size for importance sampling based on discrepancy measures

    Signal Processing

    (2017)
  • J. Zhang et al.

    Efficient Monte Carlo resampling for probability measure changes from Bayesian updating

    Probab. Eng. Mech.

    (2019)
  • C. Sun et al.

    Prediction of composite properties from a representative volume element

    Compos. Sci. Technol.

    (1996)
  • J. Zhang et al.

    On the quantification and efficient propagation of imprecise probabilities with copula dependence

    Int. J. Approximate Reasoning

    (2020)
  • Z.H. Karadeniz et al.

    A numerical study on the coefficients of thermal expansion of fiber reinforced composite materials

    Compos. Struct.

    (2007)
  • M.K. Chati et al.

    Prediction of elastic properties of fiber-reinforced unidirectional composites

    Eng. Anal. Boundary Elements

    (1998)
  • Z.-M. Huang

    Micromechanical prediction of ultimate strength of transversely isotropic fibrous composites

    Int. J. Solids Struct.

    (2001)
  • A. Wongsto et al.

    Micromechanical FE analysis of UD fibre-reinforced composites with fibres distributed at random over the transverse cross-section

    Compos. Part A: Appl. Sci. Manuf.

    (2005)
  • A. Saltelli et al.

    Global Sensitivity Analysis: The Primer

    (2008)
  • I.M. Sobol

    Sensitivity estimates for nonlinear mathematical models

    Math. Model. Comput. Exp.

    (1993)