Han Qin

Use Image Clustering to Facilitate Technology Assisted Review

Dec 16, 2021

Haozhen Zhao, Fusheng Wei, Hilary Quatinetz, Han Qin, Adam Dabrowski

Abstract: During the past decade, breakthroughs in GPU hardware and deep neural network technologies have revolutionized the field of computer vision, making image analytics accessible to a range of real-world applications. Technology Assisted Review (TAR) in electronic discovery, though it has traditionally dealt predominantly with textual content, is witnessing a rising need to bring multimedia content into scope. We have developed innovative image analytics applications for TAR in recent years, such as image classification, image clustering, and object detection. In this paper, we discuss the use of image clustering applications to facilitate TAR based on our experience serving clients. We describe our general workflow for leveraging image clustering in review tasks and use statistics from real projects to showcase the effectiveness of image clustering in TAR. We also summarize lessons learned and best practices for using image clustering in TAR.

* 2021 IEEE International Conference on Big Data (Big Data)

Application of Deep Learning in Recognizing Bates Numbers and Confidentiality Stamping from Images

Feb 05, 2021

Christian J. Mahoney, Katie Jensen, Fusheng Wei, Haozhen Zhao, Han Qin, Shi Ye

Abstract: In eDiscovery, it is critical to ensure that each page produced in legal proceedings conforms with the requirements of court or government agency production requests. Errors in productions could have severe consequences in a case, putting a party in an adverse position. The volume of pages produced continues to increase, and tremendous time and effort have been spent on quality control of document productions. This has historically been a manual and laborious process. This paper demonstrates a novel automated production quality control application that leverages deep learning-based image recognition technology to extract Bates Numbers and Confidentiality Stamping from legal case production images and validate their correctness. The effectiveness of the method is verified with an experiment using real-world production data.

* 2020 IEEE International Conference on Big Data (Big Data)

Image Analytics for Legal Document Review: A Transfer Learning Approach

Dec 19, 2019

Nathaniel Huber-Fliflet, Fusheng Wei, Haozhen Zhao, Han Qin, Shi Ye, Amy Tsang

Abstract: Though technology assisted review in electronic discovery has focused on text data, the need for advanced analytics to facilitate the review of multimedia content is on the rise. In this paper, we present several applications of deep learning in computer vision to Technology Assisted Review of image data in the legal industry. These applications include image classification, image clustering, and object detection. We use transfer learning techniques to leverage established pretrained models for feature extraction and fine-tuning. These applications are the first of their kind in the legal industry for image document review. We demonstrate the effectiveness of these applications by solving real-world business challenges.

* 2019 IEEE International Conference on Big Data (Big Data)

Empirical Comparisons of CNN with Other Learning Algorithms for Text Classification in Legal Document Review

Dec 19, 2019

Robert Keeling, Rishi Chhatwal, Nathaniel Huber-Fliflet, Jianping Zhang, Fusheng Wei, Haozhen Zhao, Shi Ye, Han Qin

Abstract: Research has shown that Convolutional Neural Networks (CNN) can be effectively applied to text classification as part of a predictive coding protocol. That said, most research to date has been conducted on data sets with short documents that do not reflect the variety of documents in real-world document reviews. Using data from four actual reviews with documents of varying lengths, we compared CNN with other popular machine learning algorithms for text classification, including Logistic Regression, Support Vector Machine, and Random Forest. For each data set, classification models were trained with different training sample sizes using different learning algorithms. These models were then evaluated using a large randomly sampled test set of documents, and the results were compared using precision and recall curves. Our study demonstrates that CNN performed well, but that no single algorithm performed best across all combinations of data sets and training sample sizes. These results will help advance research into the legal profession's use of machine learning algorithms that maximize performance.

* 2019 IEEE International Conference on Big Data (Big Data)
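
The abstract above compares models by their precision and recall curves on a held-out test set. As a rough illustration of how such a curve can be computed from a model's scores, here is a minimal sketch in plain Python. The function name `precision_recall_points` and the toy labels and scores are assumptions made for illustration; they are not the authors' code or data.

```python
from typing import List, Tuple


def precision_recall_points(
    labels: List[int], scores: List[float]
) -> List[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) for each distinct score cutoff.

    labels: 1 for a relevant document, 0 otherwise (hypothetical test set).
    scores: model scores, higher meaning "more likely relevant".
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank documents from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    points = []
    true_pos = 0
    for rank, (score, label) in enumerate(ranked, start=1):
        true_pos += label
        # Emit one point per distinct score, at the last document of a tie group.
        is_last_of_group = rank == len(ranked) or ranked[rank][0] != score
        if is_last_of_group:
            points.append((score, true_pos / rank, true_pos / total_pos))
    return points


if __name__ == "__main__":
    # Made-up example: six test documents with labels and model scores.
    toy_labels = [1, 0, 1, 1, 0, 0]
    toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
    for threshold, precision, recall in precision_recall_points(toy_labels, toy_scores):
        print(f"cutoff={threshold:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Plotting the resulting points for each algorithm on the same axes gives the kind of side-by-side comparison the abstract describes; which curve is "better" can differ by recall level, which is consistent with the finding that no single algorithm wins everywhere.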

Empirical Study of Deep Learning for Text Classification in Legal Document Review

Apr 03, 2019

Fusheng Wei, Han Qin, Shi Ye, Haozhen Zhao

Abstract: Predictive coding has been widely used in legal matters to find relevant or privileged documents in large sets of electronically stored information. It saves significant time and cost. Logistic Regression (LR) and Support Vector Machines (SVM) are two popular machine learning algorithms used in predictive coding. Recently, deep learning has received a lot of attention in many industries. This paper reports our preliminary studies on using deep learning in legal document review. Specifically, we conducted experiments to compare deep learning results with results obtained using an SVM algorithm on four datasets from real legal matters. Our results showed that CNN performed better with larger volumes of training data and should be a suitable method for text classification in the legal industry.

* 2018 IEEE International Conference on Big Data (Big Data)
