Abstract
We address optimization problems in which we are given contradictory pieces of input information and the goal is to find a globally consistent solution that minimizes the extent of disagreement with the respective inputs. Specifically, the problems we address are rank aggregation, the feedback arc set problem on tournaments, and correlation and consensus clustering. We show that for all these problems (and various weighted versions of them), we can obtain improved approximation factors using essentially the same remarkably simple algorithm. Additionally, we almost settle a long-standing conjecture of Bang-Jensen and Thomassen and show that unless NP⊆BPP, there is no polynomial-time algorithm for the minimum feedback arc set problem in tournaments.
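The abstract does not spell out the algorithm it refers to. As a rough illustration of the feedback arc set problem on tournaments, the sketch below assumes a quicksort-style random-pivot scheme: pick a pivot, place every vertex with an arc into the pivot on its left and the rest on its right, then recurse. The names `pivot_rank`, `backward_arcs`, and `beats` are hypothetical and chosen only for this example; it is not presented as the authors' exact procedure.

```python
import random


def pivot_rank(vertices, beats, rng=random):
    """Order the vertices of a tournament by random pivoting (illustrative sketch).

    beats(u, v) is assumed to return True when the tournament has the arc u -> v.
    """
    if len(vertices) <= 1:
        return list(vertices)
    pivot = rng.choice(vertices)
    # Vertices with an arc into the pivot go left, all others go right.
    left = [v for v in vertices if v != pivot and beats(v, pivot)]
    right = [v for v in vertices if v != pivot and not beats(v, pivot)]
    return pivot_rank(left, beats, rng) + [pivot] + pivot_rank(right, beats, rng)


def backward_arcs(order, beats):
    """Count arcs that point from a later vertex to an earlier one in the order."""
    return sum(
        1
        for i, u in enumerate(order)
        for v in order[i + 1:]
        if beats(v, u)
    )
```

On a tournament that is already consistent (for example `beats = lambda u, v: u < v`), every pivot choice recovers the underlying order and `backward_arcs` returns 0. On a directed 3-cycle, any ordering leaves exactly one backward arc.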
References
- Ailon, N. 2008. Aggregation of partial rankings, p-ratings and top-m lists. Algorithmica. DOI 10.1007/s00453-008-9211-1.
- Ailon, N., and Charikar, M. 2005. Fitting tree metrics: Hierarchical clustering and phylogeny. In Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS) (Pittsburgh, PA). IEEE Computer Society Press, Los Alamitos, CA, 73--82.
- Ailon, N., Charikar, M., and Newman, A. 2005. Aggregating inconsistent information: Ranking and clustering. In Proceedings of the 37th Annual Symposium on the Theory of Computing (STOC) (Boston, MA). ACM, New York, 684--693.
- Ailon, N., and Mohri, M. 2008. An efficient reduction of ranking to classification. In Conference on Learning Theory (COLT) (Helsinki, Finland).
- Alon, N. 2006. Ranking tournaments. SIAM J. Disc. Math. 20, 1, 137--142.
- Alon, N., and Spencer, J. H. 1992. The Probabilistic Method. Wiley, New York.
- Arora, S., Frieze, A., and Kaplan, H. 1996. A new rounding procedure for the assignment problem with applications to dense graph arrangement problems. In Proceedings of the 37th Annual Symposium on the Foundations of Computer Science (FOCS) (Burlington, VT). IEEE Computer Society Press, Los Alamitos, CA, 24--33.
- Balcan, M.-F., Bansal, N., Beygelzimer, A., Coppersmith, D., Langford, J., and Sorkin, G. B. 2007. Robust reductions from ranking to classification. In Proceedings of the Conference on Learning Theory (COLT). Lecture Notes in Computer Science, vol. 4539. Springer-Verlag, New York, 604--619.
- Bang-Jensen, J., and Thomassen, C. 1992. A polynomial algorithm for the 2-path problem in semicomplete graphs. SIAM J. Disc. Math. 5, 3, 366--376.
- Bansal, N., Blum, A., and Chawla, S. 2004. Correlation clustering. Mach. Learn. J. (Special Issue on Theoretical Advances in Data Clustering) 56, 1--3, 89--113. (Extended abstract appeared in FOCS 2002, pages 238--247.)
- Bartholdi, J., Tovey, C. A., and Trick, M. 1989. Voting schemes for which it can be difficult to tell who won the election. Social Choice Welf. 6, 2, 157--165.
- Borda, J. C. 1781. Mémoire sur les élections au scrutin. Histoire de l'Académie Royale des Sciences.
- Cai, M.-C., Deng, X., and Zang, W. 2001. An approximation algorithm for feedback vertex sets in tournaments. SIAM J. Comput. 30, 6, 1993--2007.
- Charikar, M., Guruswami, V., and Wirth, A. 2003. Clustering with qualitative information. In Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science (FOCS) (Boston, MA). IEEE Computer Society Press, Los Alamitos, CA, 524--533.
- Chaudhuri, K., Chen, K., Mihaescu, R., and Rao, S. 2006. On the tandem duplication-random loss model of genome rearrangement. In Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'06). ACM, New York, 564--570.
- Condorcet, M.-J. 1785. Essai sur l'application de l'analyse à la probabilité des décisions rendues à la pluralité des voix.
- Coppersmith, D., Fleischer, L., and Rudra, A. 2006. Ordering by weighted number of wins gives a good ranking for weighted tournaments. In Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'06). ACM, New York, 776--782.
- Diaconis, P., and Graham, R. 1977. Spearman's footrule as a measure of disarray. J. Roy. Stat. Soc., Ser. B 39, 2, 262--268.
- Dinur, I., and Safra, S. 2002. On the importance of being biased. In Proceedings of the 34th Annual Symposium on the Theory of Computing (STOC). ACM, New York, 33--42.
- Dwork, C., Kumar, R., Naor, M., and Sivakumar, D. 2001a. Rank aggregation methods for the web. In Proceedings of the 10th International Conference on the World Wide Web (WWW10) (Hong Kong, China), 613--622.
- Dwork, C., Kumar, R., Naor, M., and Sivakumar, D. 2001b. Rank aggregation revisited. Manuscript. (Available from: http://www.eecs.harvard.edu/~michaelm/CS222/rank2.pdf.)
- Even, G., Naor, J. S., Sudan, M., and Schieber, B. 1998. Approximating minimum feedback sets and multicuts in directed graphs. Algorithmica 20, 2, 151--174.
- Fagin, R., Kumar, R., and Sivakumar, D. 2003. Efficient similarity search and classification via rank aggregation. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data (San Diego, CA). ACM, New York, 301--312.
- Filkov, V., and Skiena, S. 2003. Integrating microarray data by consensus clustering. In Proceedings of International Conference on Tools with Artificial Intelligence (ICTAI) (Sacramento, CA). 418--425.
- Frieze, A., and Kannan, R. 1999. Quick approximations to matrices and applications. Combinatorica 19, 2, 175--220.
- Gionis, A., Mannila, H., and Tsaparas, P. 2005. Clustering aggregation. In Proceedings of the 21st International Conference on Data Engineering (ICDE) (Tokyo, Japan).
- Håstad, J. 2001. Some optimal inapproximability results. J. ACM 48, 798--859.
- Karp, R. M. 1972. Reducibility among combinatorial problems. In Complexity of Computer Computations. Plenum Press, New York, 85--104.
- Kemeny, J. G. 1959. Mathematics without numbers. Daedalus 88, 571--591.
- Kemeny, J., and Snell, J. 1962. Mathematical Models in the Social Sciences. Blaisdell, New York. (Reprinted by MIT Press, Cambridge, 1972.)
- Kenyon-Mathieu, C., and Schudy, W. 2007. How to rank with few errors. In Proceedings of the 39th Annual ACM Symposium on Theory of Computing (STOC) (New York, NY). ACM, New York, 95--103.
- Newman, A. 2000. Approximating the maximum acyclic subgraph. M.S. thesis, Massachusetts Institute of Technology, Cambridge, MA.
- Newman, A., and Vempala, S. 2001. Fences are futile: On relaxations for the linear ordering problem. In Proceedings of the 8th Conference on Integer Programming and Combinatorial Optimization (IPCO). 333--347.
- Potts, C. N. 1980. An algorithm for the single machine sequencing problem with precedence constraints. Math. Prog. 13, 78--87.
- Seymour, P. 1995. Packing directed circuits fractionally. Combinatorica 15, 281--288.
- Speckenmeyer, E. 1989. On feedback problems in digraphs. Graph Theoretic Concepts in Computer Science, Lecture Notes in Computer Science, vol. 411, Springer-Verlag, New York, 218--231.
- Strehl, A. 2002. PhD dissertation. Ph.D. thesis, University of Texas at Austin.
- Wakabayashi, Y. 1998. The complexity of computing medians of relations. Resenhas 3, 3, 323--349.
- Williamson, D. P., and van Zuylen, A. 2007. Deterministic algorithms for rank aggregation and other ranking and clustering problems. In Proceedings of the 5th Workshop on Approximation and Online Algorithms (WAOA).