Summarizing Online User Reviews Using Bicliques

  • Conference paper
SOFSEM 2016: Theory and Practice of Computer Science (SOFSEM 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9587)

Abstract

With vast amounts of text, such as news and social media, available in electronic form, automatic multi-document summarization can help extract the most important information. We present and evaluate a novel method for automatic extractive multi-document summarization. The method is purely combinatorial, based on bicliques in the bipartite word-sentence occurrence graph. It is particularly suited to collections of very short, independently written texts (often single sentences) with many repeated phrases, such as customer reviews of products. The method can run in time subquadratic in the number of documents, which matters when it is applied to large collections.
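To make the central object concrete, the following is a minimal sketch of a bipartite word-sentence occurrence graph and of bicliques in it, where a biclique is a set of words together with a set of sentences such that every word occurs in every sentence. It is not the method of the paper: the function names, the whitespace tokenization, and the `min_words` threshold are assumptions made for illustration, and the pairwise enumeration shown here is deliberately naive and quadratic in the number of sentences, unlike the subquadratic running time the abstract mentions.

```python
from collections import defaultdict
from itertools import combinations


def occurrence_graph(sentences):
    """Map each word to the set of indices of the sentences it occurs in."""
    word_to_sentences = defaultdict(set)
    for idx, sentence in enumerate(sentences):
        # Assumption: lowercase whitespace tokenization, no stemming.
        for word in sentence.lower().split():
            word_to_sentences[word].add(idx)
    return word_to_sentences


def sentences_containing(words, word_to_sentences):
    """Return the indices of all sentences that contain every given word."""
    sentence_sets = [word_to_sentences.get(w, set()) for w in words]
    return set.intersection(*sentence_sets) if sentence_sets else set()


def pairwise_bicliques(sentences, min_words=2):
    """Naive quadratic enumeration (illustrative only).

    For each pair of sentences, take their common words and pair them
    with every sentence that contains all of those words.
    """
    graph = occurrence_graph(sentences)
    word_sets = [set(s.lower().split()) for s in sentences]
    found = {}
    for i, j in combinations(range(len(sentences)), 2):
        common = frozenset(word_sets[i] & word_sets[j])
        if len(common) >= min_words and common not in found:
            found[common] = sentences_containing(common, graph)
    return found


if __name__ == "__main__":
    reviews = [
        "battery life is great",
        "great battery life",
        "screen is too dim",
        "the battery life is short",
    ]
    for words, sentence_ids in pairwise_bicliques(reviews).items():
        print(sorted(words), sorted(sentence_ids))
```

In this toy input, repeated phrases such as "battery life" show up as word sets shared by several sentences, which is the kind of redundancy the abstract describes for customer reviews.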

Acknowledgments

This work has been supported by Grant IIS11-0089 from the Swedish Foundation for Strategic Research (SSF), for the project “Data-driven secure business intelligence”. We thank our former master’s students Emma Bogren and Johan Toft for drawing our attention to similarity joins, and the members of our Algorithms group and collaborators at the companies Recorded Future and Findwise for many discussions.

Author information

Correspondence to Peter Damaschke.

Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Muhammad, A.S., Damaschke, P., Mogren, O. (2016). Summarizing Online User Reviews Using Bicliques. In: Freivalds, R., Engels, G., Catania, B. (eds) SOFSEM 2016: Theory and Practice of Computer Science. SOFSEM 2016. Lecture Notes in Computer Science, vol 9587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-49192-8_46

  • DOI: https://doi.org/10.1007/978-3-662-49192-8_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-49191-1

  • Online ISBN: 978-3-662-49192-8

  • eBook Packages: Computer Science, Computer Science (R0)