DOI: 10.1145/3462757.3466066 · ACM Conferences · ICAIL Conference Proceedings
research-article
Open Access

Context-aware legal citation recommendation using deep learning

Published: 27 July 2021

ABSTRACT

Lawyers and judges spend a large amount of time researching the proper legal authority to cite while drafting decisions. In this paper, we develop a citation recommendation tool that can help improve efficiency in the process of opinion drafting. We train four types of machine learning models, including a citation-list based method (collaborative filtering) and three context-based methods (text similarity, BiLSTM and RoBERTa classifiers). Our experiments show that leveraging local textual context improves recommendation, and that deep neural models achieve decent performance. We show that non-deep text-based methods benefit from access to structured case metadata, but deep models only benefit from such access when predicting from context of insufficient length. We also find that, even after extensive training, RoBERTa does not outperform a recurrent neural model, despite its benefits of pretraining. Our behavior analysis of the RoBERTa model further shows that predictive performance is stable across time and citation classes.
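To make the idea of a context-based text-similarity recommender concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, so this example assumes plain bag-of-words cosine similarity, and every name in it (`build_profiles`, `recommend`, the `AUTH_*` labels, and the toy sentences) is hypothetical. Each candidate authority is represented by the words that surrounded its past citations, and a new drafting context is ranked against those profiles.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_profiles(examples):
    """Aggregate the words seen around each cited authority.

    examples: iterable of (context_text, cited_id) pairs.
    Returns a dict mapping cited_id -> Counter of context tokens.
    """
    profiles = defaultdict(Counter)
    for context, cited_id in examples:
        profiles[cited_id].update(tokenize(context))
    return dict(profiles)


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(context, profiles, k=3):
    """Return up to k cited_ids ranked by similarity to the context."""
    query = Counter(tokenize(context))
    scored = [(cosine(query, prof), cid) for cid, prof in profiles.items()]
    # Sort by descending score, breaking ties by id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [cid for score, cid in scored[:k] if score > 0]


# Toy training data: made-up contexts and made-up authority labels.
training = [
    ("the claimant is entitled to the benefit of the doubt", "AUTH_A"),
    ("benefit of the doubt applies when evidence is in equipoise", "AUTH_A"),
    ("the duty to assist requires obtaining medical records", "AUTH_B"),
    ("service connection requires a nexus between disability and service", "AUTH_C"),
]
profiles = build_profiles(training)

draft = "evidence is in equipoise so the benefit of the doubt applies"
print(recommend(draft, profiles))
```

A citation-list-based method such as collaborative filtering would instead score candidates from the other authorities already cited in the draft, without looking at the surrounding text; the neural classifiers named in the abstract replace the hand-built term vectors with learned representations.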

