Recommendations of the DNA Commission of the International Society for Forensic Genetics (ISFG) on quality control of autosomal Short Tandem Repeat allele frequency databasing (STRidER)

https://doi.org/10.1016/j.fsigen.2016.06.008

Highlights

  • STRidER is a curated, freely accessible forensic autosomal STR database.

  • STRidER enables reliable allele frequency estimates from high-quality data.

  • STRidER innovatively offers quality control for autosomal STR data.

  • Autosomal STR datasets can be traced and metrics are provided.

  • STRidER can accommodate length-based STR alleles and sequence data.

Abstract

The statistical evaluation of autosomal Short Tandem Repeat (STR) genotypes is based on allele frequencies. These are empirically determined from sets of randomly selected human samples, compiled into STR databases that have been established in the course of population genetic studies. There is currently no agreed procedure for performing quality control of STR allele frequency databases, and the reliability and accuracy of the data rest largely on the individual contributing research groups. It has been demonstrated with databases of haploid markers (EMPOP for mitochondrial DNA, and YHRD for Y-chromosomal loci) that centralized quality control and data curation are essential to minimize error. The concepts employed for quality control involve software-aided likelihood-of-genotype, phylogenetic, and population genetic checks that allow researchers to compare novel data to established datasets and, thus, maintain the high quality required in forensic genetics.

Here, we present STRidER (http://strider.online), a publicly available, centrally curated online allele frequency database and quality control platform for autosomal STRs. STRidER expands on the previously established ENFSI DNA WG STRbASE and applies standard concepts established for haploid and autosomal markers as well as novel tools to reduce error and increase the quality of autosomal STR data. The platform constitutes a significant improvement and innovation for the scientific community, offering autosomal STR data quality control and reliable STR genotype estimates.


Status quo of autosomal STR quality control and databasing

Short Tandem Repeats (STRs), also known as microsatellites, are polymorphic DNA regions that are widespread throughout the human genome [1]. They typically consist of simple, compound or complex DNA motifs that are 2–7 base pairs (bp) in length and show high variability between unrelated individuals. The forensic field adopted autosomal STRs in the early 1990s [1], and STRs have become the primary workhorse for individual identification in criminal casework, kinship analyses, and identification

Rationale and concept

Available STR population datasets may contain errors that either went unaddressed or were introduced during the publication process. Some such errors are easy to spot, such as incorrect allele nomenclature (e.g., “22.4” in a tetrameric STR marker) or incorrectly prepared frequency estimates (e.g., where the reported STR allele frequencies for a marker do not add up to 1) (see [28]). In just one example, re-typing of a widely applied population dataset after 16 years revealed a certain number of
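The two easy-to-spot error types named above can be illustrated with a minimal Python sketch. It is not STRidER's actual QC implementation: the function names, the input layout, and the rounding tolerance of 0.005 are assumptions made for illustration only. It relies on the standard convention that an allele written "x.y" denotes x complete repeats plus y additional bases, so y must be smaller than the repeat-unit length.

```python
from math import isclose


def invalid_allele_labels(alleles, repeat_length):
    """Return labels whose microvariant suffix is impossible for the repeat length.

    Hypothetical helper: "x.y" is read as x full repeats plus y extra bases,
    so a valid suffix satisfies 1 <= y < repeat_length.
    """
    bad = []
    for label in alleles:
        whole, _, partial = label.partition(".")
        if not whole.isdigit():
            bad.append(label)
            continue
        if partial and not (partial.isdigit() and 1 <= int(partial) < repeat_length):
            bad.append(label)
    return bad


def frequencies_sum_to_one(freqs, tolerance=0.005):
    """Check that the reported frequencies for one marker add up to 1.

    The tolerance is an assumed allowance for rounding in published tables.
    """
    return isclose(sum(freqs.values()), 1.0, abs_tol=tolerance)


# "22.4" cannot occur in a tetrameric marker: four extra bases form a full repeat.
print(invalid_allele_labels(["9.3", "22.4", "15"], repeat_length=4))
print(frequencies_sum_to_one({"6": 0.25, "9.3": 0.75}))
```

Checks of this kind catch only formal inconsistencies; they say nothing about whether the underlying genotypes were typed correctly.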

Benefits to the scientific community from services offered by STRidER

The STRidER database and QC platform constitutes a significant augmentation of services accessible to both scientists and practitioners in forensics and beyond. Providing detailed information on high-quality samples, STRidER may serve other fields interested in human genome variation and its geographic patterns, such as anthropology and population genetics. STRidER is offered at no cost to promote data quality, but should be referenced in any publication of validated data. STRidER does not

Outlook: digging deeper into variation – STR sequence data in STRidER

Variation at STR loci is currently almost exclusively assessed from electrophoretic size-based categories and interpreted as repeat numbers. The long-recognized sequence variation in repeat units and flanking regions [4], [5], [44], [45] was hitherto only rarely studied at the population level (e.g., [46]) until massively parallel sequencing (MPS) techniques recently provided access to this information (e.g., [47], [48], [49], [50], [51], [52]). It was demonstrated that sequencing of STRs

Acknowledgements

The authors are grateful to Volker Weirich (Rampe, Germany), Martin Eckert, and Marie-Luise Sonntag (Wiesbaden, Germany) for valuable discussion and contributions, and to Nicole Huber, Gregor Kofler, Martin Pircher, and Stefan Troger (Innsbruck, Austria) for excellent technical and software support. This project received funding from the Prevention of and Fight against Crime Programme of the European Union, the ENFSI DNA Working Group, and from the Institute of Legal Medicine, Medical University of

References (54)

  • B. Zimmermann et al., Application of a west Eurasian-specific filter for quasi-median network analysis: sharpening the blade for mtDNA error detection, Forensic Sci. Int. Genet. (2011)

  • A. Carracedo et al., New guidelines for the publication of genetic population data, Forensic Sci. Int. Genet. (2013)

  • P. Gill et al., A comparison of adjustment methods to test the robustness of an STR DNA database comprised of 24 European populations, Forensic Sci. Int. (2003)

  • J. Amigo et al., pop.STR—an online population frequency browser for established and new forensic STRs, Forensic Sci. Int.: Genet. Suppl. Ser. (2009)

  • J.P. Pamplona et al., A worldwide database of autosomal markers used by the forensic community, Forensic Sci. Int.: Genet. Suppl. Ser. (2008)

  • J. Buckleton et al., Population-specific FST values for forensic STR markers: a worldwide survey, Forensic Sci. Int. Genet. (2016)

  • Y.G. Yao et al., A call for mtDNA data quality control in forensic science, Forensic Sci. Int. (2004)

  • A. Salas et al., A practical guide to mitochondrial DNA error prevention in clinical, forensic, and population genetics, Biochem. Biophys. Res. Commun. (2005)

  • H.J. Bandelt et al., The fingerprint of phantom mutations in mitochondrial DNA data, Am. J. Hum. Genet. (2002)

  • W. Parson et al., The EDNAP mitochondrial DNA population database (EMPOP) collaborative exercises: organisation, results and perspectives, Forensic Sci. Int. (2004)

  • M. Bodner et al., Inspecting close maternal relatedness: towards better mtDNA population samples in forensic databases, Forensic Sci. Int. Genet. (2011)

  • W. Parson et al., DNA Commission of the International Society for Forensic Genetics: revised and extended guidelines for mitochondrial DNA typing, Forensic Sci. Int. Genet. (2014)

  • T. Restrepo et al., Database sample size effect on minimum allele frequency estimation: database comparison analysis of samples of 4652 and 560 individuals for 22 microsatellites in Colombian population, Forensic Sci. Int.: Genet. Suppl. Ser. (2011)

  • D.J. Balding et al., DNA profile match probability calculation: how to allow for population stratification, relatedness, database selection and single bands, Forensic Sci. Int. (1994)

  • K.B. Gettings et al., STR allele sequence variation: current knowledge and future issues, Forensic Sci. Int. Genet. (2015)

  • C. Borsting et al., Next generation sequencing and its applications in forensic genetics, Forensic Sci. Int. Genet. (2015)

  • C. Van Neste et al., My-Forensic-Loci-queries (MyFLq) framework for analysis of forensic STR data generated by massive parallel sequencing, Forensic Sci. Int. Genet. (2014)