ABSTRACT
Lawyers and judges spend considerable time researching the proper legal authority to cite while drafting decisions. In this paper, we develop a citation recommendation tool that can help improve efficiency in the process of opinion drafting. We train four types of machine learning models: a citation-list based method (collaborative filtering) and three context-based methods (text similarity, BiLSTM, and RoBERTa classifiers). Our experiments show that leveraging local textual context improves recommendation, and that deep neural models achieve decent performance. We show that non-deep text-based methods benefit from access to structured case metadata, whereas deep models benefit from such access only when predicting from context of insufficient length. We also find that, even after extensive training, RoBERTa does not outperform a recurrent neural model, despite the benefits of pretraining. Our behavior analysis of the RoBERTa model further shows that its predictive performance is stable across time and citation classes.
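To make the idea of context-based recommendation concrete, the following is a minimal sketch of a text-similarity recommender: given a passage of draft text, it ranks authorities by how closely the passage resembles contexts in which each authority was previously cited. It is an illustration under stated assumptions, not the authors' implementation. TF-IDF cosine similarity stands in for whichever similarity measure the paper uses, and the function names, the toy training pairs, and the `AUTHORITY_*` labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase whitespace tokenization; a real system would use a proper tokenizer.
    return text.lower().split()


def build_index(training_pairs):
    """Index (context_text, cited_authority) pairs and compute smoothed IDF weights."""
    docs = [(Counter(tokenize(ctx)), cite) for ctx, cite in training_pairs]
    doc_freq = Counter()
    for counts, _ in docs:
        doc_freq.update(counts.keys())
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in doc_freq.items()}
    return docs, idf


def vectorize(counts, idf):
    # L2-normalized TF-IDF vector; terms unseen at indexing time are ignored.
    vec = {t: c * idf[t] for t, c in counts.items() if t in idf}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def recommend(query_context, docs, idf, k=3):
    """Rank authorities by their best cosine similarity to the query context."""
    q = vectorize(Counter(tokenize(query_context)), idf)
    best = defaultdict(float)
    for counts, cite in docs:
        d = vectorize(counts, idf)
        score = sum(w * d.get(t, 0.0) for t, w in q.items())
        best[cite] = max(best[cite], score)
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return [cite for cite, _ in ranked[:k]]


# Hypothetical toy data: text preceding a citation, paired with the cited authority.
training = [
    ("entitlement to service connection for hearing loss", "AUTHORITY_A"),
    ("duty to assist the veteran in obtaining records", "AUTHORITY_B"),
    ("service connection requires evidence of a current disability", "AUTHORITY_A"),
    ("the board must provide adequate reasons and bases", "AUTHORITY_C"),
]
docs, idf = build_index(training)
top = recommend("claim for service connection for tinnitus and hearing loss", docs, idf)
print(top[0])
```

A citation-list based method such as collaborative filtering would instead score candidates by which authorities tend to co-occur in the same documents, without looking at the local text at all.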