DOI: 10.1145/1007568.1007587
Article

FARMER: finding interesting rule groups in microarray datasets

Published: 13 June 2004

ABSTRACT

Microarray datasets typically contain a large number of columns but a small number of rows. Association rules have proved useful in analyzing such datasets. However, most existing association rule mining algorithms are unable to efficiently handle datasets with a large number of columns. Moreover, the number of association rules generated from such datasets is enormous because of the large number of possible column combinations.

In this paper, we describe a new algorithm called FARMER that is specially designed to discover association rules from microarray datasets. Instead of finding individual association rules, FARMER finds interesting rule groups, each of which is essentially a set of rules generated from the same set of rows. Unlike conventional rule mining algorithms, FARMER searches for interesting rules in the row enumeration space and exploits all user-specified constraints, including minimum support, confidence and chi-square, to support efficient pruning. Several experiments on real bioinformatics datasets show that FARMER is orders of magnitude faster than previous association rule mining algorithms.
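
To make the terms in the abstract concrete, the sketch below computes support, confidence and a 2x2 chi-square statistic for a rule of the form "itemset implies class", and then groups itemsets by the set of rows that contain them, which is one way to picture a rule group. It is not the FARMER algorithm: it enumerates column combinations by brute force, which is exactly what becomes infeasible on wide datasets, and it applies no pruning. The toy rows, the class labels and the names `row_support_set`, `rule_measures` and `rule_groups` are all assumptions made for illustration, and the paper's exact definitions of the measures may differ.

```python
from itertools import combinations

# Hypothetical toy data: each row is (set of discretized items, class label).
ROWS = [
    ({"a", "b", "c"}, "C"),
    ({"a", "b"}, "C"),
    ({"a", "b", "d"}, "notC"),
    ({"c", "d"}, "notC"),
    ({"d"}, "notC"),
]


def row_support_set(itemset, rows=ROWS):
    """Return the indices of the rows that contain every item in `itemset`."""
    return frozenset(i for i, (items, _) in enumerate(rows) if itemset <= items)


def rule_measures(itemset, target="C", rows=ROWS):
    """Support (row count), confidence and chi-square for `itemset -> target`."""
    n = len(rows)
    covered = row_support_set(itemset, rows)
    x = len(covered)                                      # rows matching the itemset
    y = sum(1 for i in covered if rows[i][1] == target)   # ...that are also in the class
    m = sum(1 for _, label in rows if label == target)    # rows in the class overall
    confidence = y / x if x else 0.0

    # 2x2 contingency table: itemset present/absent vs. class/non-class.
    observed = [[y, x - y], [m - y, (n - m) - (x - y)]]
    row_totals = [x, n - x]
    col_totals = [m, n - m]
    chi2 = 0.0
    for r in range(2):
        for c in range(2):
            expected = row_totals[r] * col_totals[c] / n
            if expected > 0:
                chi2 += (observed[r][c] - expected) ** 2 / expected
    return y, confidence, chi2


def rule_groups(rows=ROWS):
    """Group all non-empty itemsets by the exact set of rows containing them.

    Brute force over column combinations, for illustration only.
    """
    all_items = sorted(set().union(*(items for items, _ in rows)))
    groups = {}
    for k in range(1, len(all_items) + 1):
        for combo in combinations(all_items, k):
            supporting_rows = row_support_set(set(combo), rows)
            if supporting_rows:
                groups.setdefault(supporting_rows, []).append(combo)
    return groups


if __name__ == "__main__":
    print(rule_measures({"a", "b"}))
    for supporting_rows, itemsets in rule_groups().items():
        print(sorted(supporting_rows), itemsets)
```

In this toy data the itemsets {a}, {b} and {a, b} occur in exactly the same three rows, so the rules built from them share the same support and confidence. Reporting them once as a group, rather than as three separate rules, is the kind of reduction the abstract describes.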


Published in: SIGMOD '04: Proceedings of the 2004 ACM SIGMOD international conference on Management of data, June 2004, 988 pages
ISBN: 1581138598
DOI: 10.1145/1007568

Copyright © 2004 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Qualifiers: Article

Overall acceptance rate: 785 of 4,003 submissions, 20%
