Cluster analysis of heterogeneous rank data

Authors:
Ludwig M. Busse, ETH Zurich, Zurich, Switzerland
Peter Orbanz, ETH Zurich, Zurich, Switzerland
Joachim M. Buhmann, ETH Zurich, Zurich, Switzerland

ICML '07: Proceedings of the 24th international conference on Machine learning, June 2007, Pages 113–120
https://doi.org/10.1145/1273496.1273511

Published: 20 June 2007

ABSTRACT

Cluster analysis of ranking data, which occurs in consumer questionnaires, voting forms or other inquiries of preferences, attempts to identify typical groups of rank choices. Empirically measured rankings are often incomplete, i.e. different numbers of filled rank positions cause heterogeneity in the data. We propose a mixture approach for clustering of heterogeneous rank data. Rankings of different lengths can be described and compared by means of a single probabilistic model. A maximum entropy approach avoids hidden assumptions about missing rank positions. Parameter estimators and an efficient EM algorithm for unsupervised inference are derived for the ranking mixture model. Experiments on both synthetic data and real-world data demonstrate significantly improved parameter estimates on heterogeneous data when the incomplete rankings are included in the inference process.
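
To make the mixture idea concrete, the following is a minimal sketch for complete rankings only. It assumes each mixture component is a distance-based model in which the probability of a ranking decays exponentially with its Kendall distance from a central ranking; the abstract does not name the component distribution, so this choice, along with the names `kendall_distance`, `mallows_prob`, `responsibilities` and the parameter `lam`, is an illustrative assumption. The treatment of incomplete rankings, which is the paper's contribution, is not reproduced here.

```python
from itertools import combinations, permutations
from math import exp


def kendall_distance(a, b):
    """Count the item pairs that rankings a and b place in opposite order."""
    pos_b = {item: i for i, item in enumerate(b)}
    # combinations(a, 2) yields pairs (x, y) with x ranked before y in a
    return sum(1 for x, y in combinations(a, 2) if pos_b[x] > pos_b[y])


def mallows_prob(ranking, center, lam):
    """Probability proportional to exp(-lam * distance to center).

    The normalizer is computed by brute force over all permutations,
    so this is only practical for a handful of items.
    """
    z = sum(exp(-lam * kendall_distance(p, center)) for p in permutations(center))
    return exp(-lam * kendall_distance(ranking, center)) / z


def responsibilities(ranking, centers, lams, weights):
    """E-step for one ranking: posterior probability of each component."""
    joint = [
        w * mallows_prob(ranking, c, l)
        for c, l, w in zip(centers, lams, weights)
    ]
    total = sum(joint)
    return [j / total for j in joint]


if __name__ == "__main__":
    centers = [("a", "b", "c"), ("c", "b", "a")]
    print(responsibilities(("a", "b", "c"), centers, [1.0, 1.0], [0.5, 0.5]))
```

In a full EM loop, these responsibilities would weight each ranking's contribution when the central rankings, dispersion parameters and mixture weights are re-estimated in the M-step.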

Published in
ICML '07: Proceedings of the 24th international conference on Machine learning
June 2007
1233 pages
ISBN: 9781595937933
DOI: 10.1145/1273496
Editor: Zoubin Ghahramani, University of Cambridge, United Kingdom
Copyright © 2007 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Acceptance Rates
Overall Acceptance Rate: 140 of 548 submissions, 26%