Non-parametric multivariate comparisons of soil fungal composition: Sensitivity to thresholds and indications of structural redundancy in T-RFLP data

https://doi.org/10.1016/j.soilbio.2008.01.008

Abstract

Complex soil microbial data produced by molecular profiling techniques, such as terminal restriction fragment length polymorphism (T-RFLP), are often analysed using multivariate statistical methods. Despite this, there has been little evaluation of the sensitivity of multivariate methods to routine data manipulations such as the application of noise thresholds. We examine effects of three percentage area thresholds (0.1%, 0.5% and 1.0%; i.e. ranging from low to commonly applied) on non-metric multidimensional scaling (NMDS) ordinations of soil fungal T-RFLP profiles. We then interpret threshold effects by introducing the concept of structural redundancy in T-RFLP data.

NMDS ordinations compared 20 soil samples (encompassing seven sites, two forest types, and three locations) for T-RFLP presence/absence data, which were produced using primers specific to the rDNA internal transcribed spacer (ITS) region, and three separate restriction enzymes (HinfI, AluI and TaqI). Increasing thresholds from 0.1% to 1.0% led to decreases in the average number of detected reproducible fragments per site of 57% (HinfI), 75% (AluI), and 69% (TaqI). However, despite the removal of many fragments unique to site/forest type/location, the application of increasing thresholds did not significantly change NMDS patterns for any enzyme, either in the complete data or in a more weakly structured sub-group (involving one forest type and two locations). Thus, our data indicate that application of area thresholds up to 1.0% would have minimal impact on NMDS-based comparisons of soil fungal composition.

Robustness of NMDS patterns to considerable fragment loss with increasing threshold suggested a degree of ‘structural redundancy’ in our T-RFLP data – that is, multiple fragments per soil sample were interchangeable in the way they contributed to between-sample NMDS patterns. Indeed, for each restriction enzyme, we found 3–4 peels of data (i.e. mutually exclusive sub-sets) that were capable of reproducing overall patterns in presence/absence data at the 0.1% threshold. Despite potential in the T-RFLP method for production of more than one fragment per fungal species/phylotype, we present arguments to support at least some translation of structural redundancy in T-RFLP data to structural redundancy in fungal community comparisons (i.e. the between-soil similarity patterns can be explained by, or are imprinted in, multiple groups of fungal phylotypes). Our analyses highlight structural redundancy as an efficient method for identifying key phylotypes responsible for differences in community composition among soils, and as a promising starting point for improved understanding of many aspects of spatial soil ecology.

Introduction

Characterisation of soil microbial communities increasingly relies on DNA-based profiling techniques such as terminal restriction fragment length polymorphism (T-RFLP; Thies, 2007). In brief, T-RFLP involves: extraction of soil community DNA/RNA, PCR (polymerase chain reaction) amplification of gene targets with fluorescently labelled primers, restriction endonuclease digestion of PCR products, and separation by length of fluorescently tagged terminal fragments using automated sequencing systems – with the number and relative size of fragments indicating phylogenetic composition of the target sequence (Marsh, 2005). Predicted numbers of unique terminal fragments per primer–restriction enzyme combination are in the hundreds for small-subunit ribosomal RNA (rRNA) genes (and true numbers will be even greater given deficiencies in sequence databases; Kim and Marsh, 2004). In addition, true fragments must be distinguished from background sequencer noise and small spurious peaks arising from unknown DNA components (Dunbar et al., 2001). As such, analysis of soil T-RFLP data is frequently a complex task, with ongoing issues relating to choice of noise thresholds and to their differential effects on data interpretation (e.g. Osborne et al., 2006; Blackwood et al., 2007).

Application of uniform thresholds to T-RFLP data is considered important on the basis that greater relative loading of DNA on some electrophoresis gels can increase the likelihood of detecting minor fragments and noise peaks. This, in turn, can unduly influence between-sample comparisons, particularly those based on presence/absence data, which give equal weighting to less abundant fragments (e.g. Blackwood et al., 2003a). Recently, Blackwood et al. (2007) demonstrated the effects of thresholds on T-RFLP diversity indices (e.g. Shannon diversity H′) using simulated microbial communities. They found that while T-RFLP indices adequately represented true diversity at low relative abundance thresholds (0.1%), they were inaccurate at higher, more routinely applied thresholds (1.0%). This inaccuracy of univariate indices at routine thresholds led them to recommend multivariate rather than univariate methods to analyse microbial community differences (Blackwood et al., 2007). Given that multivariate methods are well-established in the broader literature, it is notable that this recommendation to microbial ecologists is so recent. That it is needed is reflected in the relatively low uptake of multivariate statistics by microbial ecologists (Ramette, 2007), and in the consequent infancy of debates regarding which statistical methods are best suited to analysing trends in complex microbial data (Thies, 2007).
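The way a percentage area threshold converts a profile into presence/absence data can be illustrated with a minimal sketch. The function name `apply_area_threshold`, the example profile, and the choice to express each peak as a percentage of the total (unfiltered) profile area with an inclusive cut-off are assumptions made for illustration; they are not taken from the protocol used in this study.

```python
def apply_area_threshold(peak_areas, threshold_pct):
    """Return the set of fragment lengths retained at a percentage area threshold.

    peak_areas    -- dict mapping fragment length (bp) to peak area
    threshold_pct -- minimum share of total profile area, in percent (e.g. 0.1, 0.5, 1.0)

    Assumption: percentages are relative to the total area of the unfiltered
    profile, and a peak exactly at the threshold is kept.
    """
    total = sum(peak_areas.values())
    if total <= 0:
        return set()
    return {
        length
        for length, area in peak_areas.items()
        if 100.0 * area / total >= threshold_pct
    }


# Hypothetical profile: fragment length (bp) -> peak area (arbitrary units).
profile = {72: 5000, 118: 3000, 240: 1500, 301: 420, 355: 60, 412: 20}

for pct in (0.1, 0.5, 1.0):
    kept = apply_area_threshold(profile, pct)
    print(f"{pct}% threshold: {len(kept)} fragments retained -> {sorted(kept)}")
```

Because presence/absence data weight every retained fragment equally, the minor peaks dropped at higher thresholds count as much as the dominant ones in any subsequent comparison.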

Multivariate methods have been used to support arguments for applying standardising protocols and associated thresholds in T-RFLP data analysis (Dunbar et al., 2001; Osborne et al., 2006). However, there has been minimal evaluation of the sensitivity of multivariate methods themselves to changes in microbial composition with increasing thresholds. As such, potential for errors in interpretation of compositional differences at different thresholds – analogous to those found for univariate diversity indices by Blackwood et al. (2007) – cannot be adequately gauged. It is beyond the scope of this paper to evaluate relative performance of all the various multivariate methods recommended for T-RFLP data analysis (see Blackwood et al., 2003b; Grant and Ogilvie, 2003; Thies, 2007). Rather, we focus on ordination methods as these provide strong visual displays of between-sample compositional differences and are a sensible option for exploring many T-RFLP data sets where a clear a priori structure cannot be defined (Grant and Ogilvie, 2003). In particular, we examine non-metric multidimensional scaling (NMDS) because of its proven strength in discriminating between samples in recent soil microbial studies (Mills et al., 2006; Nelson and Mele, 2006; Wu et al., 2007), and because it can accommodate a range of data characteristics including joint absences and non-linear relationships between variables (Quinn and Keough, 2002).
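NMDS operates on the rank order of pairwise dissimilarities between samples, so the choice of dissimilarity coefficient determines how joint absences are treated. The sketch below assumes the Jaccard coefficient, a common choice for presence/absence data in which fragments absent from both samples contribute nothing; this excerpt does not name the coefficient used in the study, and the sample names and fragment lengths are hypothetical.

```python
from itertools import combinations


def jaccard_dissimilarity(a, b):
    """Jaccard dissimilarity between two sets of fragment lengths.

    Only fragments present in at least one of the two samples enter the
    calculation, so joint absences are ignored. Two empty profiles are
    treated as identical (dissimilarity 0.0), which is an assumption.
    """
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def dissimilarity_table(samples):
    """Map each unordered pair of sample names to its Jaccard dissimilarity."""
    return {
        (i, j): jaccard_dissimilarity(samples[i], samples[j])
        for i, j in combinations(sorted(samples), 2)
    }


# Hypothetical presence/absence profiles (sets of fragment lengths in bp).
samples = {
    "A1": {72, 118, 240},
    "A2": {72, 118, 301},
    "B1": {240, 355, 412},
}

for pair, d in dissimilarity_table(samples).items():
    print(pair, round(d, 3))
```

An ordination routine would then place samples so that the rank order of distances in the plot matches the rank order of these values as closely as possible.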

Rees et al. (2004) described NMDS as a powerful method for detecting differences in sediment bacterial populations among three stream sites. In a rare but cursory examination of threshold effects, they found minimal change in NMDS patterns with increasing area threshold from 1.0% to 5.0% despite removal of up to 50% of fragments (Rees et al., 2004). While suggesting a certain ‘robustness’ of NMDS to threshold effects, these data also indicate that ordination patterns in the complete data (i.e. at the 1.0% threshold) could be reproduced by a sub-set of these data (i.e. at the 5.0% threshold). Indeed, a similar phenomenon has been found in macrobenthos communities, in which NMDS patterns of 110 species could be adequately reproduced by any randomly selected sub-set of 19 species (Gray et al., 1988). This observation gave rise to the concept of ‘structural redundancy’ in marine macrobenthos communities – namely, that many sub-sets of species are interchangeable in the way they characterise or ‘explain’ multivariate patterns of samples (Clarke and Warwick, 2001). Its measurement was then expressed in terms of the number of mutually exclusive sub-sets (or ‘peels’) whose multivariate response patterns closely match that of the full community (Clarke and Warwick, 1998). The structural redundancy concept has proven useful in identifying those species that contribute most to macrobenthos patterns, and in exploring the more difficult concepts of functional redundancy and functional compensation (Clarke and Warwick, 1998, 2001). Given ongoing interest in spatial patterning and redundancy in soil biology (Ettema and Wardle, 2002; Fitter, 2005), we believe it is timely to explore the relevance of this concept to soil microbial ecology.
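The idea of counting peels can be sketched as a simplified greedy procedure: select fragments until their between-sample dissimilarities match those of the full data, set that sub-set aside, and repeat on the remainder. This is an illustrative stand-in rather than the procedure of Clarke and Warwick (1998); the Jaccard coefficient, the Spearman rank correlation as the matching criterion, the 0.95 cut-off, the forward-only selection, and the toy data are all assumptions.

```python
from itertools import combinations


def jaccard(a, b):
    union = a | b
    return 0.0 if not union else 1.0 - len(a & b) / len(union)


def dissimilarities(samples, fragments):
    """Pairwise Jaccard dissimilarities using only the given fragments."""
    names = sorted(samples)
    return [jaccard(samples[i] & fragments, samples[j] & fragments)
            for i, j in combinations(names, 2)]


def ranks(values):
    """Ranks starting at 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in range(pos, end + 1):
            out[order[k]] = (pos + end) / 2 + 1
        pos = end + 1
    return out


def spearman(x, y):
    """Spearman rank correlation (0.0 if either input has no variation)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5


def find_peels(samples, cutoff=0.95):
    """Greedily extract mutually exclusive fragment sub-sets ('peels')."""
    pool = set().union(*samples.values())
    reference = dissimilarities(samples, pool)  # pattern in the full data
    peels = []
    while pool:
        subset, best = set(), -1.0
        while best < cutoff and subset != pool:
            # Add the remaining fragment giving the best match to the full data.
            best, frag = max(
                (spearman(reference, dissimilarities(samples, subset | {f})), f)
                for f in pool - subset)
            subset.add(frag)
        if best < cutoff:
            break  # remaining fragments cannot reproduce the pattern
        peels.append(subset)
        pool -= subset
    return peels


# Toy data: three groups of fragments carry the same gradient across samples.
samples = {
    "S1": {1, 11, 21},
    "S2": {1, 2, 11, 12, 21, 22},
    "S3": {2, 3, 12, 13, 22, 23},
    "S4": {3, 13, 23},
}
peels = find_peels(samples)
print(len(peels), "peels:", [sorted(p) for p in peels])
```

In this contrived example the between-sample pattern is imprinted three times over, so several disjoint sub-sets each recover it; real profiles would be noisier, and a greedy search is not guaranteed to find the smallest matching sub-set.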

This study examined the effects of thresholds on multivariate patterns of soil fungal T-RFLP data. We used three restriction enzymes to produce independent ‘combinations’ of fungal presence/absence data (Osborne et al., 2006) from 20 soil samples encompassing seven sites, two forest types, and three locations. NMDS and associated statistical procedures were applied to the strongly structured complete data, as well as a more weakly structured sub-group, for each enzyme at each of three peak area thresholds ranging from low (0.1%) to more routinely applied (1.0%). We aimed to examine potential consequences of interpreting microbial compositional patterns based on NMDS ordinations at different thresholds, and to test the generality of a previous observation that NMDS was ‘robust’ to threshold effects (Rees et al., 2004). After examining threshold effects, we quantified structural redundancy in the soil fungal data relevant to each enzyme and threshold. That is, we analysed data sets for successive and unique sub-sets of T-RFLP fragments that successfully reproduced NMDS patterns in the full data. To our knowledge, this is the first application of the structural redundancy concept to soil microbial communities.

Study sites

To meet our aim of examining threshold effects on multivariate patterns in T-RFLP data, we utilised a real data set that displayed strong patterns in NMDS ordinations. For context we provide brief descriptions of the study sites. Greater detail, including more comprehensive information on soil properties, is included in a separate paper (Kasel et al., 2008), which examines the relative importance of different ecological mechanisms behind data patterns.

The complete data were based on composite

Changes in profile area and fragment number with increasing threshold

Increasing thresholds from 0.1% to 0.5% and 1.0% decreased average remaining profile area from 98% to 91% and 83% for HinfI-digested profiles, from 97% to 86% and 74% for AluI profiles, and from 99% to 89% and 79% for TaqI profiles (Fig. 1). Concomitant decreases in average numbers of detected reproducible fragments per site were – for HinfI profiles: from 34–55 at the 0.1% threshold to 20–35 at 0.5%, and 15–24 at 1.0%; for AluI profiles: from 46–68 at 0.1% to 21–34 at 0.5%, and 10–18 at 1.0%;

NMDS – robust or insensitive?

Despite decreases in fragment numbers of 57% (HinfI) to 75% (AluI; from Fig. 1) with application of our highest threshold, there were only minor changes in NMDS ordinations and ANOSIM tests for both the full data and a more weakly structured sub-group (Fig. 2; Tables 1 and 2). Low two-dimensional stress values for NMDS ordinations (<0.16) and support from superimposed cluster groups (e.g. Fig. 2) indicated that NMDS ordination distances were a reliable representation of the underlying

Acknowledgements

This work was made possible through funding from the Victorian Government Department of Sustainability and Environment, and through the collaborative efforts of several SFES colleagues. In particular, we thank Gerd Bossinger, Alan York, Tina Bell, Steven Livesley and Stefan Arndt for providing laboratory facilities, field sites or soil samples. Thanks also to Amanda Ashton, Josie Lawrence, Kelly Whyte, Najib Ahmedy, and Frank Jones for technical assistance.

References (42)

  • P.G. Avis et al. (2006). A ‘dirty’ business: testing the limitations of terminal restriction fragment length polymorphism (TRFLP) analysis of soil fungi. Molecular Ecology.
  • M.-S. Benitez et al. (2007). Multiple statistical approaches of community fingerprint data reveal bacterial populations associated with general disease suppression arising from the application of different organic field management strategies. Soil Biology & Biochemistry.
  • C.B. Blackwood et al. (2003). Terminal restriction fragment length polymorphism data analysis for quantitative comparison of microbial communities. Applied and Environmental Microbiology.
  • C.B. Blackwood et al. (2003). Terminal restriction fragment length polymorphism data analysis: author's reply. Applied and Environmental Microbiology.
  • C.B. Blackwood et al. (2007). Interpreting ecological diversity indices applied to terminal restriction fragment length polymorphism data: insights from simulated microbial communities. Applied and Environmental Microbiology.
  • C.-L. Chen et al. (2004). Microbial community structure in a thermophilic anaerobic hybrid reactor degrading terephthalate. Microbiology.
  • K.R. Clarke (1993). Non-parametric multivariate analyses of changes in community structure. Australian Journal of Ecology.
  • K.R. Clarke et al. (1998). Quantifying structural redundancy in ecological communities. Oecologia.
  • K.R. Clarke et al. (2001). Change in Marine Communities: an Approach to Statistical Analysis and Interpretation.
  • P.D. Countway et al. (2005). Protistan diversity estimates based on 18S rDNA from seawater incubations in the Western North Atlantic. Journal of Eukaryotic Microbiology.
  • I. Dickie et al. (2007). Using terminal restriction fragment length polymorphism (T-RFLP) to identify mycorrhizal fungi: a methods review. Mycorrhiza.
  • J. Dunbar et al. (2001). Phylogenetic specificity and reproducibility and new method for analysis of terminal restriction fragment profiles of 16S rRNA genes from bacterial communities. Applied and Environmental Microbiology.
  • M. Egert et al. (2003). Formation of pseudo-terminal restriction fragments, a PCR-related bias affecting terminal restriction fragment length polymorphism analysis of microbial community structure. Applied and Environmental Microbiology.
  • A.L. Engelbrektson et al. (2006). Analysis of treatment effects on the microbial ecology of the human intestine. FEMS Microbiology Ecology.
  • C.H. Ettema et al. (2002). Spatial soil ecology. Trends in Ecology and Evolution.
  • A.H. Fitter (2005). Darkness visible: reflections on underground ecology. Journal of Ecology.
  • M. Gardes et al. (1993). ITS primers with enhanced specificity for basidiomycetes – application to the identification of mycorrhizae and rusts. Molecular Ecology.
  • M. Glen et al. (2001). Interspecific and intraspecific variation of ectomycorrhizal fungi associated with Eucalyptus ecosystems as revealed by ribosomal DNA PCR–RFLP. Mycological Research.
  • A. Grant et al. (2003). Terminal restriction fragment length polymorphism data analysis. Applied and Environmental Microbiology.
  • J.S. Gray et al. (1988). Analysis of community attributes of the benthic macrofauna of Frierfjord/Langesundfjord and in a mesocosm experiment. Marine Ecology – Progress Series.
  • J.L. Green et al. (2004). Spatial scaling of microbial eukaryote diversity. Nature.