Machine Learning with Legal Texts

Kevin D. Ashley

doi:10.1017/9781316761380.008

8 - Machine Learning with Legal Texts

from PART II - LEGAL TEXT ANALYTICS

Published online by Cambridge University Press: 13 July 2017


Kevin D. Ashley, University of Pittsburgh


Summary

INTRODUCTION

In the examples of ML so far, a program has learned from data about judges, trends, or cases as in the Supreme Court Database, but not from the texts of cases or other legal documents. This chapter introduces applying ML algorithms to corpora of legal texts, discusses how ML models implicitly represent users’ hypotheses about relevance, illustrates how ML can improve full-text legal information retrieval, and explains its role in conceptual information retrieval and in cognitive computing. The chapter also distinguishes between supervised and unsupervised ML from text and discusses techniques for automating learning of structure and semantics from legal documents.

Along the way, the chapter answers the following questions: How can ML be applied to textual data? What is the difference between supervised and unsupervised ML from texts? What is predictive coding? How well does predictive coding work? What is “information extraction” from text? How are texts represented for purposes of applying ML? What is a “support vector machine (SVM)” and why use one with textual data?

APPLYING MACHINE LEARNING TO TEXTUAL DATA

ML algorithms identify patterns in data, summarize the patterns in a model, and use the model to make predictions by identifying the same patterns in new data (see Kohavi and Provost, 1998).

A model is a structure that summarizes the patterns in data in a statistical or logical form that can be applied to new data (see Kohavi and Provost, 1998). This book has already introduced some examples of ML models, such as the decision tree for bail decisions in Figure 4.2 or the random forests of decision trees referred to in Section 4.4.

The models capture the strength of the association in the patterns between observed features and an outcome feature. For example, the decision on bail is an outcome feature, and the observed features included whether the offense involved drugs or the offender had a prior record. Similarly, the Supreme Court's decision to affirm or not is an outcome feature, and the observed features included a justice's gender or the appointing president's party. The model captures the strength of the association between observed and outcome features statistically, logically, or in some combination of the two.
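The relationship between observed features, an outcome feature, and a model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the training records are invented toy data, not data from the book, and the feature names `drug_offense` and `prior_record`, as well as the functions `fit`, `outcome_probabilities`, and `predict`, are hypothetical. The sketch uses simple conditional frequencies as the statistical summary; it is not the decision tree of Figure 4.2.

```python
from collections import Counter, defaultdict

# Hypothetical toy data, invented for illustration only.
# Each record pairs observed features with an outcome feature (the bail decision).
TRAINING_CASES = [
    ({"drug_offense": True, "prior_record": True}, "deny"),
    ({"drug_offense": True, "prior_record": True}, "deny"),
    ({"drug_offense": True, "prior_record": False}, "grant"),
    ({"drug_offense": True, "prior_record": False}, "deny"),
    ({"drug_offense": True, "prior_record": False}, "grant"),
    ({"drug_offense": False, "prior_record": True}, "deny"),
    ({"drug_offense": False, "prior_record": True}, "grant"),
    ({"drug_offense": False, "prior_record": True}, "deny"),
    ({"drug_offense": False, "prior_record": False}, "grant"),
    ({"drug_offense": False, "prior_record": False}, "grant"),
]


def _key(features):
    """Turn a feature dict into a hashable, order-independent key."""
    return tuple(sorted(features.items()))


def fit(cases):
    """Summarize the patterns: count outcomes per combination of observed features."""
    model = defaultdict(Counter)
    for features, outcome in cases:
        model[_key(features)][outcome] += 1
    return dict(model)


def outcome_probabilities(model, features):
    """Strength of association: relative frequency of each outcome for these features."""
    counts = model.get(_key(features))
    if not counts:
        return {}  # this combination of features was never observed
    total = sum(counts.values())
    return {outcome: n / total for outcome, n in counts.items()}


def predict(model, features):
    """Apply the model to new data: return the most frequent outcome, or None if unseen."""
    probabilities = outcome_probabilities(model, features)
    if not probabilities:
        return None
    return max(probabilities, key=probabilities.get)


if __name__ == "__main__":
    model = fit(TRAINING_CASES)
    new_case = {"drug_offense": True, "prior_record": False}
    print(outcome_probabilities(model, new_case))
    print(predict(model, new_case))
```

Here `fit` plays the role of the learning algorithm, the returned table of counts is the model, and `predict` applies that model to a new case. A decision tree or random forest would summarize the same kind of association in a logical rather than purely statistical form.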

Information

Type: Chapter
Information: Artificial Intelligence and Legal Analytics: New Tools for Law Practice in the Digital Age, pp. 234-258

DOI: https://doi.org/10.1017/9781316761380.008

Publisher: Cambridge University Press

Print publication year: 2017
