Guest article: Predicting the future of disclosure
Here's a case study on the effectiveness of predictive coding technology to determine relevance in e-disclosure/ediscovery projects by Dominic Lacey and Jamie Tanner of Eversheds LLP and James Moeskops of Millnet Limited
What is Predictive Coding Technology?
Predictive coding technology uses statistical sampling techniques to score the relevance of documents. A sub-set of documents is subjected to human review. This human review creates a model that is then applied to the wider set of documents, each of which is scored against the model for relevance.
In this case the software used was Equivio Relevance. Documents for review are loaded into the Equivio Relevance software, which looks only at document text, not family relationships etc. Accordingly, families are separated and all documents de-duplicated at attachment level. The human review then commences. This is an iterative process. Initially random batches are presented to the reviewer by the software. Each document is scored by the reviewer as either relevant or irrelevant to the matters in issue. Over time a set of rules or a model is created. Once a threshold of stability is reached, the model can be applied to all of the documents, with the software scoring the relevance of each one. The technique is based on the text of documents, so it has limitations in respect of plans/drawings, photographs and number-only documents, eg spreadsheets. The methodology works by linking associations and words in the text. It is much more than key word searching.
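To make the idea of a model built from reviewer decisions more concrete, here is a minimal sketch in Python. It is not Equivio Relevance's algorithm, which this article does not describe. It assumes a simple naive Bayes word model, and the function names (tokenize, train, score) and the sample documents are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    # Crude tokeniser: lower-case and split on whitespace.
    return text.lower().split()

def train(reviewed_docs):
    """Build word counts from (text, is_relevant) pairs coded by the reviewer."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in reviewed_docs:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab

def score(model, text):
    """Return a relevance score between 0 and 100 for one document."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_prob = {}
    for label in (True, False):
        # Class prior and word likelihoods, both with add-one smoothing.
        lp = math.log((doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                lp += math.log((word_counts[label][word] + 1)
                               / (total_words + len(vocab)))
        log_prob[label] = lp
    # Convert the log-odds into a percentage, clamped to avoid overflow.
    diff = max(min(log_prob[False] - log_prob[True], 700.0), -700.0)
    return 100.0 / (1.0 + math.exp(diff))

# Hypothetical reviewer decisions (assumption: not taken from the case study).
reviewed = [
    ("delay to the piling works on site", True),
    ("piling contractor claims extension of time", True),
    ("lunch menu for friday canteen", False),
    ("office party friday invitation", False),
]
model = train(reviewed)
```

With this toy model, score(model, "extension of time for piling delay") comes out above 50 and score(model, "friday lunch invitation") below 50. A document with no text at all simply receives the prior, which illustrates why text-free items such as photographs cannot be meaningfully ranked by this kind of technique.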
The Case Study
The initial harvesting of documents was done with reference to custodians, date ranges and key word filters. Predictive coding is a purely second stage process. In this case, in excess of 250,000 top level documents were harvested. These documents were then de-duplicated at top level and loaded into Equivio Relevance. The lead reviewer then started the process. The first step was to review 1000 documents selected by the software on an entirely random basis. Of the 1000 documents presented, the lead reviewer considered only 42 documents to have any relevance to the matters in issue.
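The initial random sample also gives a rough estimate of how many relevant documents the whole set contains. The sketch below shows that arithmetic under stated assumptions: the 42 and 1000 come from the review described above, while the corpus size of 250,000 is only the lower bound quoted in this article (the de-duplicated figure is not given), and the choice of a Wilson score interval and the function names are ours.

```python
import math

def prevalence_interval(relevant, sample_size, z=1.96):
    """Point estimate and Wilson score interval for the relevance rate.

    z=1.96 corresponds to an approximate 95% confidence level.
    """
    if sample_size <= 0 or not 0 <= relevant <= sample_size:
        raise ValueError("need 0 <= relevant <= sample_size and sample_size > 0")
    p = relevant / sample_size
    denom = 1 + z ** 2 / sample_size
    centre = (p + z ** 2 / (2 * sample_size)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / sample_size + z ** 2 / (4 * sample_size ** 2)
    ) / denom
    return p, centre - half_width, centre + half_width

def projected_relevant(rate, corpus_size):
    """Scale a relevance rate up to a whole corpus (rounded to whole documents)."""
    return round(rate * corpus_size)

# 42 relevant documents out of the 1000 randomly selected for first review.
rate, low, high = prevalence_interval(42, 1000)

# Assumption: 250,000 is used as an illustrative corpus size only.
estimate = projected_relevant(rate, 250_000)
```

Here rate is 0.042, so on the assumed corpus size the point estimate is about 10,500 relevant documents; low and high show how much uncertainty a sample of 1000 leaves around that figure.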
The next stage in the iterative review process was to review batches of documents. The software presented the reviewer with batches of documents, each document being classified by the reviewer as relevant or not relevant. Each time a batch was completed, the software applied statistical algorithms to analyse the text of the documents in that batch against those classified as relevant and not relevant in all previous batches. In total the lead reviewer reviewed circa 2000 documents: the initial random 1000 and a further 1000 in multiple batches.
Through the batch review process the software learns. The process could be seen to be working because, towards the end, roughly half of the documents in each batch presented for review were relevant, compared with 42 in 1000 in the initial random sample (the software deliberately continues to present documents it classifies as irrelevant to ensure the reviewer is kept alert and consistent). Through the review process the model is refined until the consistency of review produces a stable model to apply to all documents.
The application of the model to the whole set of documents takes a number of hours to run and in our case was done overnight. The software does not decide between relevance and irrelevance. The result is that each document is given a percentage relevance score between 0 and 100. The documents were presented grouped into ten score bands, eg X number of documents are shown to be in the 0% to 10% band, Y in the 11% to 20% band etc. Documents with a 0% score are documents without text, such as photographs.
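The banding described above can be sketched as follows. This is an illustration only: it assumes whole-number scores, the band edges follow the 0-10%, 11-20% ... 91-100% pattern given in this article, and the function names and sample scores are hypothetical.

```python
import math
from collections import Counter

def band_label(score):
    """Map a whole-number score from 0 to 100 to its band, eg 11-20%."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # Round up to the next multiple of ten; scores of 0 to 10 share a band.
    upper = max(10, math.ceil(score / 10) * 10)
    lower = 0 if upper == 10 else upper - 9
    return f"{lower}-{upper}%"

def band_counts(scores):
    """Count how many documents fall in each band."""
    return Counter(band_label(s) for s in scores)

# Hypothetical scores for seven documents (not data from the case study).
counts = band_counts([0, 3, 10, 11, 55, 91, 100])
```

For these sample scores, three documents fall in the 0-10% band, one in 11-20%, one in 51-60% and two in 91-100%.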
The decision is then where to make the cut. We carried out a sampling review of the extremes of the results. This confirmed that documents scoring 0-10% were almost entirely irrelevant and those scoring 91-100% almost entirely relevant. Documents in the middle ranges were then sampled to determine where the boundary should lie. The cut made resulted in some 7000 documents being considered relevant for further review and the balance being treated as irrelevant. This was a conservative judgement, but it significantly reduced the number of documents for manual review: 7000 was a manageable number compared to the starting pool of over 250,000. These documents were manually reviewed for privilege and relevance prior to disclosure.
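One way to think about placing the cut is as a rule applied to the sampling results for each band. The sketch below is not the method used in this matter, where the boundary was a lawyer's judgement. It assumes a hypothetical rule (cut at the lowest band whose sampled relevance rate exceeds a chosen tolerance), and the tolerance, function name and sample figures are all invented for illustration.

```python
def choose_cut(band_samples, tolerance=0.02):
    """Return the lower bound of the lowest band to send for manual review.

    band_samples: (band_lower_bound, documents_sampled, documents_relevant)
    tuples in any order. Bands are checked from the lowest score upwards and
    the cut is placed at the first band whose sampled relevance rate exceeds
    the tolerance. Returns None if no band exceeds it.
    """
    for lower, sampled, relevant in sorted(band_samples):
        if sampled and relevant / sampled > tolerance:
            return lower
    return None

# Hypothetical sampling results (assumption: not figures from the case study).
samples = [
    (0, 200, 1),     # 0.5% relevant in the sample
    (11, 200, 2),    # 1.0%
    (21, 200, 9),    # 4.5%, the first band above the 2% tolerance
    (31, 200, 30),
    (91, 200, 196),
]
cut = choose_cut(samples)
```

With these invented figures the cut falls at the band starting at 21%, so every document scoring 21% or more would go forward to manual review. A lower tolerance gives a more conservative cut and a larger review set.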
A quality audit process was also carried out on the results to verify the effectiveness of Equivio Relevance. This took place over a number of weeks. It was not a rushed audit under pressure but a proper and carefully considered review. Of the documents deemed irrelevant, a sample of 20% were manually reviewed. Of the documents manually reviewed, 1.7% were judged relevant and had therefore been incorrectly classified as irrelevant by Equivio Relevance. Put another way, more than 98% of the sampled documents deemed irrelevant were correctly coded as irrelevant.
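The audit arithmetic can be sketched as below. The 20% sample and the 1.7% rate are taken from the audit described above. The size of the set deemed irrelevant is an assumption: 243,000 is simply 250,000 less the 7000 documents that made the cut, and the true de-duplicated figure is not given in this article. The function name is hypothetical.

```python
def audit_summary(discard_size, sample_fraction, relevant_rate_in_sample):
    """Summarise an audit of the documents deemed irrelevant."""
    sample_size = round(discard_size * sample_fraction)
    return {
        # Number of discarded documents manually re-reviewed.
        "sample_size": sample_size,
        # Relevant documents found in that sample.
        "relevant_found": round(sample_size * relevant_rate_in_sample),
        # Projection of the sample rate across the whole discarded set.
        "projected_missed": round(discard_size * relevant_rate_in_sample),
        # Share of sampled documents correctly coded as irrelevant.
        "correctly_excluded_pct": round((1 - relevant_rate_in_sample) * 100, 1),
    }

# Assumption: 243,000 is an illustrative size for the discarded set.
summary = audit_summary(243_000, 0.20, 0.017)
```

On these assumed figures the audit sample would be 48,600 documents, and 98.3% of them were correctly coded as irrelevant, which matches the "more than 98%" stated above. The projected_missed value shows what the same 1.7% rate would imply across the whole discarded set.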
The standard for predictive coding is not perfection (ie perfect categorisation of documents as relevant/non relevant for a particular matter). The standard is whether the results can be shown to have been effective. Human review is not the gold standard for disclosure. A 2010 study* comparing the results of human review against computer review found that the computer review was at least as accurate as manual review.
Indeed, a study by Grossman & Cormack** concluded “the myth that exhaustive manual review is the most effective – and therefore, the most defensible – approach to document review is strongly refuted. Technology assisted review can (and does) yield more accurate results than exhaustive manual review, with much lower effort.”
The particular processes found to be superior in the study are both interactive, employing a combination of computer and human input. The study considered whether manual reviewers would have had the same error rate had they reviewed the entire disclosure rather than batches, and found that reviewers would tend to miss “needles in the haystack” due to fatigue, inattention, boredom and other related human factors.
Costs Implications & Limitations
It is estimated that to carry out a human review for relevance of 250,000 documents would take many hundreds of lawyer days and cost well in excess of £1,000,000. The process of Equivio Relevance assisted review was completed for a fraction of this cost and in a much faster time frame. The obvious limitations of predictive coding technology are that it is a textual analysis. It is ineffective with drawings, photographs etc and has limitations with regard to spreadsheets.
The use of predictive coding software is downstream from the preservation, collection, processing and other initial review and filtering stages in the electronic disclosure process. It is therefore critical that the approaches, judgement calls and technologies used to identify the sub-set of documents to which predictive coding technology is applied are robust and defensible.
Predictive coding is lawyer led. The effectiveness of the process is determined by the ability of the person undertaking the sampling review. It is strongly suggested that this person should be the partner with responsibility for the conduct of the matter. There must be some degree of considered quality control checking. On large datasets this will involve sampling and document reviewing from the relevant/non-relevant categories and may involve iterative adjustments to improve the effectiveness of the predictive coding results.
Court’s Approach to Predictive Coding Technology
There has been considerable commentary and some case studies (particularly in the US) on the use of predictive coding, although to date there has not been a single judgment in any jurisdiction relating specifically to the use of predictive coding. However, in Goodale v The Ministry of Justice, Senior Master Whitaker endorsed the use of predictive coding: “this [case] is a prime candidate for the application of software that providers now have, which can de-duplicate that material and render it down to a more reasonable size and search it by computer to produce a manageable corpus for human review – which is of course the most expensive part of the exercise. Indeed, when it comes to review, I am aware of software that will effectively score each document as to its likely relevance and which will enable a prioritisation of categories within the entire document set.”
The new Practice Direction 31B CPR, which applies to proceedings started on or after 1 October 2010, deals specifically with disclosure of electronic documents. It includes a general principle that “technology should be used in order to ensure that document management activities are undertaken efficiently and effectively.” In addition, parties are required to discuss the “tools and techniques (if any) which should be considered to reduce the burden and cost of disclosure of electronic documents” which includes the use of “agreed software tools.” These provisions make it clear that technology may be employed in the e-disclosure process, subject always to the overriding principles of reasonableness and proportionality.
* Document Categorization in Legal E-Discovery: Computer Classification vs. Manual Review, Herb Roitblat, Anne Kershaw and Patrick Oot, January 2010
** Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, Maura R. Grossman & Gordon V. Cormack