Definition
In its rawest form, correlation clustering is a graph optimization problem. Consider a clustering C to be a mapping from the elements to be clustered, V, to the set \(\{1,\ldots,\vert V \vert \}\), so that u and v are in the same cluster if and only if C[u] = C[v]. Given a collection of items in which each pair (u, v) has two weights \(w_{uv}^{+}\) and \(w_{uv}^{-}\), we must find a clustering C that minimizes
\[\sum_{C[u] = C[v]} w_{uv}^{-} + \sum_{C[u] \neq C[v]} w_{uv}^{+},\]
or, equivalently, maximizes
\[\sum_{C[u] = C[v]} w_{uv}^{+} + \sum_{C[u] \neq C[v]} w_{uv}^{-}.\]
Note that although \(w_{uv}^{+}\) and \(w_{uv}^{-}\) may be thought of as positive and negative evidence towards coassociation, the actual weights are nonnegative.
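As a minimal sketch of the two objectives, the Python below scores a given clustering. It assumes the usual reading of the definition: a pair placed in the same cluster pays its negative weight \(w_{uv}^{-}\), and a pair split across clusters pays its positive weight \(w_{uv}^{+}\). The function names, the dictionary representation of the weights (keyed by sorted pairs, with missing pairs treated as weight 0), and the three-item example are illustrative assumptions, not part of the entry.

```python
from itertools import combinations


def disagreements(C, w_plus, w_minus):
    """Weight of disagreements: w^- for co-clustered pairs, w^+ for split pairs."""
    cost = 0.0
    for u, v in combinations(sorted(C), 2):
        if C[u] == C[v]:
            cost += w_minus.get((u, v), 0.0)
        else:
            cost += w_plus.get((u, v), 0.0)
    return cost


def agreements(C, w_plus, w_minus):
    """Weight of agreements: w^+ for co-clustered pairs, w^- for split pairs."""
    score = 0.0
    for u, v in combinations(sorted(C), 2):
        if C[u] == C[v]:
            score += w_plus.get((u, v), 0.0)
        else:
            score += w_minus.get((u, v), 0.0)
    return score


# Hypothetical example: three items, nonnegative weights keyed by sorted pair.
w_plus = {("a", "b"): 2.0, ("b", "c"): 0.5}
w_minus = {("a", "c"): 1.0, ("b", "c"): 1.5}
C = {"a": 1, "b": 1, "c": 2}  # a and b together, c alone

print(disagreements(C, w_plus, w_minus))  # 0.5
print(agreements(C, w_plus, w_minus))     # 4.5
```

For every clustering, the two scores add up to the total weight over all pairs (here 5.0), which does not depend on C. That is why minimizing disagreements and maximizing agreements select the same optimal clusterings.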
Motivation and Background
The notion of clustering with advice, that is...
Copyright information
© 2017 Springer Science+Business Media New York
About this entry
Cite this entry
Wirth, A. (2017). Correlation Clustering. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_176
DOI: https://doi.org/10.1007/978-1-4899-7687-1_176
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4899-7685-7
Online ISBN: 978-1-4899-7687-1
eBook Packages: Computer Science, Reference Module Computer Science and Engineering