An effective non-parametric method for globally clustering genes from expression profiles

Hou, Jingyu; Shi, Wei; Li, Gang; Zhou, Wanlei

doi:10.1007/s11517-007-0271-1

Original Article
Published: 18 October 2007

Volume 45, pages 1175–1185, (2007)

Medical & Biological Engineering & Computing

Jingyu Hou¹, Wei Shi², Gang Li¹ & Wanlei Zhou¹

Abstract

Clustering is widely used in bioinformatics to find gene correlation patterns. Although many algorithms have been proposed, they usually struggle to deliver both automation and high clustering quality. In this paper, we propose a novel algorithm for clustering genes from their expression profiles. The algorithm has two distinguishing features: it uses global, rather than local, gene correlation information in the clustering process; and it incorporates clustering quality measurement into the clustering process to achieve non-parametric, automatic and globally optimal gene clustering. Evaluation on simulated and real gene data sets demonstrates the effectiveness of the algorithm.
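
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea it names: letting a clustering quality measure, rather than a user-supplied parameter, decide the final partition. It assumes Pearson correlation distance, average-linkage merging and the mean silhouette width as stand-ins; none of these choices is confirmed as the authors' method, and the function names (`pearson_distance`, `silhouette`, `cluster_genes`) are hypothetical.

```python
import math
from itertools import combinations

def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 1.0  # flat profile: treat as uncorrelated (assumption)
    return 1.0 - cov / (sx * sy)

def silhouette(dist, clusters):
    """Mean silhouette width of a partition (higher is better)."""
    if len(clusters) < 2:
        return -1.0  # a single cluster has no defined silhouette; rank it last
    total, count = 0.0, 0
    for ci, members in enumerate(clusters):
        for i in members:
            count += 1
            if len(members) == 1:
                continue  # a singleton contributes 0 by convention
            # a: mean distance to the gene's own cluster
            a = sum(dist[i][j] for j in members if j != i) / (len(members) - 1)
            # b: mean distance to the nearest other cluster
            b = min(
                sum(dist[i][j] for j in other) / len(other)
                for cj, other in enumerate(clusters) if cj != ci
            )
            total += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return total / count

def cluster_genes(profiles):
    """Merge clusters by average linkage and keep the best-scoring partition."""
    n = len(profiles)
    dist = [[pearson_distance(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]
    best_score, best = -1.0, [c[:] for c in clusters]
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average pairwise distance.
        p, q = min(
            combinations(range(len(clusters)), 2),
            key=lambda pq: sum(dist[i][j]
                               for i in clusters[pq[0]]
                               for j in clusters[pq[1]])
            / (len(clusters[pq[0]]) * len(clusters[pq[1]])),
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # q > p, so index p is still valid
        score = silhouette(dist, clusters)
        if score > best_score:
            best_score, best = score, [sorted(c) for c in clusters]
    return best, best_score

# Toy data: three rising and three falling profiles (made up for illustration).
profiles = [
    [1, 2, 3, 4, 5], [2, 4, 6, 8, 10.5], [1, 2.1, 2.9, 4.2, 5],
    [5, 4, 3, 2, 1], [10, 8.2, 6, 4, 2], [5, 3.9, 3, 2.1, 1],
]
best, best_score = cluster_genes(profiles)
print(best, round(best_score, 3))
```

In this sketch the number of clusters is never passed in: every partition produced during merging is scored, and the highest-scoring one is returned. The score is computed over all genes at once, which is one simple way to read "global" here; the paper's own quality measure may differ.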

Acknowledgments

This research is partially supported by the Starting Grant of Faculty of Science and Technology, Deakin University, Australia. We thank Dr Yang Xiang for his contribution to some evaluations. The authors also thank the anonymous reviewers for their valuable comments.

Author information

Authors and Affiliations

School of Engineering and Information Technology, Deakin University, 221 Burwood Highway, Burwood, VIC, 3125, Australia
Jingyu Hou, Gang Li & Wanlei Zhou
The Walter and Eliza Hall Institute of Medical Research (WEHI), Parkville, VIC, 3050, Australia
Wei Shi

Corresponding author

Correspondence to Jingyu Hou.

About this article

Cite this article

Hou, J., Shi, W., Li, G. et al. An effective non-parametric method for globally clustering genes from expression profiles. Med Bio Eng Comput 45, 1175–1185 (2007). https://doi.org/10.1007/s11517-007-0271-1

Received: 05 June 2007
Accepted: 22 September 2007
Published: 18 October 2007
Issue Date: December 2007
DOI: https://doi.org/10.1007/s11517-007-0271-1
