Abstract: We propose an adaptive environment (CABINET) to support caselaw analysis (identifying key argument elements) based on a novel cognitive computing framework that carefully matches various machine learning (ML) capabilities to the proficiency of a user. CABINET supports law students in their learning as well as professionals in their work. The results of our experiments, focused on the feasibility of the proposed framework, are promising. We show that the system is capable of identifying a potential error in the analysis with a very low false positive rate (2.0-3.5%), as well as predicting the key argument element type (e.g., an issue or a holding) with a reasonably high F1-score (0.74).
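To give a concrete picture of the prediction task mentioned above, the following is a minimal, purely illustrative sketch rather than CABINET itself: a TF-IDF plus logistic regression baseline that assigns argument element types such as "issue", "holding", or "fact" to sentences. The example sentences, labels, and model choice are assumptions made for demonstration only.

```python
# Illustrative sketch (not the CABINET implementation): predicting key argument
# element types for sentences with a TF-IDF + logistic regression baseline.
# The sentences and labels below are toy examples, not data from the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline

sentences = [
    "The issue is whether the contract was validly formed.",
    "We hold that the agreement is unenforceable.",
    "The plaintiff signed the document on March 3.",
    "The question presented is whether notice was adequate.",
    "Accordingly, the judgment of the lower court is affirmed.",
    "The defendant delivered the goods two weeks late.",
]
labels = ["issue", "holding", "fact", "issue", "holding", "fact"]

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
clf.fit(sentences, labels)

predicted = clf.predict(sentences)
# Weighted F1 on the training sentences themselves; a toy illustration only.
print(f1_score(labels, predicted, average="weighted"))
```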
Abstract: Machine learning research typically starts with a fixed data set created early in the process. The focus of the experiments is finding a model and training procedure that result in the best possible performance in terms of some selected evaluation metric. This paper explores how changes in a data set influence the measured performance of a model. Using three publicly available data sets from the legal domain, we investigate how changes to their size, the train/test splits, and the human labelling accuracy impact the performance of a trained deep learning classifier. We assess the overall performance (weighted average) as well as the per-class performance. The observed effects are surprisingly pronounced, especially when the per-class performance is considered. We investigate how the "semantic homogeneity" of a class, i.e., the proximity of its sentences in a semantic embedding space, influences the difficulty of its classification. The presented results have far-reaching implications for efforts related to data collection and curation in the field of AI & Law. The results also indicate that enhancements to a data set could be considered, alongside the advancement of the ML models, as an additional path for increasing classification performance on various tasks in AI & Law. Finally, we discuss the need for an established methodology to assess the potential effects of data set properties.
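To make the notion of "semantic homogeneity" concrete, the sketch below computes the mean pairwise cosine similarity of a class's sentence embeddings. It is an illustration rather than the paper's exact measure, and it assumes a sentence encoder such as sentence-transformers' all-MiniLM-L6-v2; the example class is invented.

```python
# Illustrative measure of class "semantic homogeneity": the mean pairwise
# cosine similarity of the class's sentence embeddings (an assumption, not
# necessarily the exact formulation used in the paper).
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("all-MiniLM-L6-v2")

def class_homogeneity(sentences):
    """Mean pairwise cosine similarity over all sentence pairs in one class."""
    embeddings = model.encode(sentences)
    sims = cosine_similarity(embeddings)
    upper = np.triu_indices(len(sentences), k=1)  # each pair counted once
    return sims[upper].mean()

issue_sentences = [
    "The issue is whether the contract was validly formed.",
    "The question presented is whether notice was adequate.",
    "At issue is whether the statute of limitations had run.",
]
# Higher values suggest a tighter, presumably easier-to-classify class.
print(class_homogeneity(issue_sentences))
```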
Abstract: Human-performed annotation of sentences in legal documents is an important prerequisite to many machine learning based systems supporting legal tasks. Typically, the annotation is done sequentially, sentence by sentence, which is often time-consuming and, hence, expensive. In this paper, we introduce a proof-of-concept system for annotating sentences "laterally." The approach is based on the observation that sentences that are similar in meaning often have the same label in terms of a particular type system. We use this observation to allow annotators to quickly view and annotate sentences that are semantically similar to a given sentence, across an entire corpus of documents. Here, we present the interface of the system and empirically evaluate the approach. The experiments show that lateral annotation has the potential to make the annotation process quicker and more consistent.
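The retrieval step behind lateral annotation can be sketched as follows; this is an assumption-laden illustration, not the system's implementation. The corpus is embedded once, and sentences are ranked by cosine similarity to a seed sentence so the annotator can label the closest candidates in one pass. The encoder choice and the toy corpus are assumptions.

```python
# Illustrative sketch of the retrieval step behind "lateral" annotation:
# rank corpus sentences by similarity to a seed sentence so semantically
# close sentences can be reviewed and labelled together.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "We hold that the statute does not apply retroactively.",
    "The court holds that the provision applies only prospectively.",
    "The plaintiff filed the complaint in 2015.",
    "It is held that the regulation has no retroactive effect.",
]
seed = "We hold that the statute does not apply retroactively."

model = SentenceTransformer("all-MiniLM-L6-v2")
corpus_embeddings = model.encode(corpus)
seed_embedding = model.encode([seed])

scores = cosine_similarity(seed_embedding, corpus_embeddings)[0]
for idx in np.argsort(-scores)[:3]:
    # Candidates likely to share the seed sentence's label.
    print(f"{scores[idx]:.2f}  {corpus[idx]}")
```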
Abstract: In this paper, we present a method of building strong, explainable classifiers in the form of Boolean search rules. We developed an interactive environment called CASE (Computer Assisted Semantic Exploration) which exploits word co-occurrence to guide human annotators in the selection of relevant search terms. The system seamlessly facilitates iterative evaluation and improvement of the classification rules. The process enables the human annotators to leverage the benefits of statistical information while incorporating their expert intuition into the creation of such rules. We evaluate classifiers created with our CASE system on four datasets, and compare the results to machine learning methods, including SKOPE rules, Random Forest, Support Vector Machine, and fastText classifiers. The results drive the discussion on the trade-offs between the superior compactness, simplicity, and intuitiveness of the Boolean search rules and the better performance of state-of-the-art machine learning models for text classification.
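As a hedged illustration of the kind of classifier CASE produces, the sketch below evaluates a hand-crafted Boolean search rule side by side with a standard ML baseline. The rule, the toy sentences, and their labels are invented for demonstration and are not taken from the paper.

```python
# Illustrative comparison: a Boolean search rule used as a binary classifier
# versus a TF-IDF + logistic regression baseline (toy data, in-sample scores).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline

def rule_classifier(text):
    """Boolean rule: ("issue" OR "question presented") AND "whether"."""
    t = text.lower()
    return int(("issue" in t or "question presented" in t) and "whether" in t)

sentences = [
    "The issue is whether the notice was timely.",
    "The question presented is whether the statute applies.",
    "The defendant shipped the goods in June.",
    "We affirm the judgment of the trial court.",
]
labels = [1, 1, 0, 0]  # 1 = the sentence states an issue

rule_predictions = [rule_classifier(s) for s in sentences]
ml = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
ml.fit(sentences, labels)
ml_predictions = ml.predict(sentences)

print("rule F1:", f1_score(labels, rule_predictions))
print("ML F1:  ", f1_score(labels, ml_predictions))
```

The rule is fully inspectable and editable by an annotator, which illustrates the compactness and intuitiveness discussed above, while the learned model's decision function is not directly readable.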