ENCODE leads the way on big data

Gerstein, Mark

doi:10.1038/489208b

Correspondence
Published: 12 September 2012

Genomics

Mark Gerstein¹

Nature volume 489, page 208 (2012)

The ENCODE project offers a fresh perspective on big data by providing an organized framework for genomics (www.nature.com/encode). Other big-data efforts tend to focus on rapidly locating needles in petabyte-sized haystacks (finding the Higgs boson, for instance), whereas ENCODE aims to supply a structured overview.

ENCODE's organization of information is hierarchical, with raw data at the bottom and layers of annotation above. The processed summaries become progressively broader — for example, starting at the level of signals representing the degree to which DNA is bound by transcription factors, moving on to the locations of sites where these factors bind, and then to overviews of regulatory networks. At the summit are the linked publications documenting the annotation.
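The layering described above can be pictured as a simple chain in which each level records the level it was derived from. The following Python sketch is purely illustrative: the `Layer` class, the `build_hierarchy` and `trace_to_raw` functions, and the layer names are assumptions made for this example, not part of any ENCODE software or data format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Layer:
    """One level of the hierarchy (hypothetical type for illustration)."""
    name: str
    derived_from: Optional["Layer"] = None  # the layer directly beneath this one


def build_hierarchy() -> Layer:
    """Build the chain from raw data up to publications and return the top layer."""
    raw = Layer("raw data")
    signals = Layer("transcription-factor binding signals", raw)
    sites = Layer("binding-site locations", signals)
    networks = Layer("regulatory network overviews", sites)
    return Layer("linked publications", networks)


def trace_to_raw(layer: Layer) -> list:
    """Walk down from a layer to the raw data, collecting layer names."""
    names = []
    current: Optional[Layer] = layer
    while current is not None:
        names.append(current.name)
        current = current.derived_from
    return names


if __name__ == "__main__":
    # Print the hierarchy from the summit down, indenting each lower level.
    for depth, name in enumerate(trace_to_raw(build_hierarchy())):
        print("  " * depth + name)
```

Storing only a pointer to the layer beneath keeps the sketch minimal; a real resource would presumably attach data files and metadata to every level.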

The ENCODE data model could be useful in other fields: for example, astronomy and Earth science are in the process of organizing their reams of data (M. J. Raddick and A. S. Szalay Science 329, 1028–1029; 2010), but don't yet compare with ENCODE in terms of the level of integration.

Author information

Authors and Affiliations

Yale University, New Haven, Connecticut, USA
Mark Gerstein

Corresponding author

Correspondence to Mark Gerstein.

About this article

Cite this article

Gerstein, M. ENCODE leads the way on big data. Nature 489, 208 (2012). https://doi.org/10.1038/489208b

Issue Date: 13 September 2012
DOI: https://doi.org/10.1038/489208b
