A few weeks ago we featured an article defining the key terms used by those in the predictive coding field and explaining them from a technical perspective. It turns out the article stimulated a hearty discussion among our authors, resulting in this piece – an examination of the same key terms but from the perspective of a lawyer and providing context for legal teams as they interact with a service provider.
When litigating a multijurisdictional case involving European entities, the costs of discovery can be alarmingly higher than for a domestic U.S. entity because of the additional cost of complying with the relevant European data protection laws. European entities are prohibited from disclosing personal data to third parties unless there is statutory permission or the consent of all persons to whom the data relates, which in many cases makes it necessary to manually redact personal data in each document before disclosure. Because of these exacerbated costs, and despite improvements in automatic redaction tools, European entities have to find ways to reduce the costs of discovery without jeopardizing compliance with the relevant rules.
Incorporating some level of predictive coding is one solution. Nevertheless, for European entities predictive coding is far from mainstream. One of the many possible reasons is that entities open to the use of predictive coding are quickly confronted with statistical terminology that, without full understanding, prevents further internal consideration, hinders the chances of endorsement from stakeholders and, even worse, prevents in-house counsel’s true participation in the project. How can terms like “precision” and “recall” be described to a U.S. or European entity that has never used predictive coding? How do these terms relate to proportionality?
What the Algorithm Does and Why We Care About Recall and Precision
In a manual review of electronically stored information, parties are required to produce documents according to the scope of discovery as defined in Federal Rule of Civil Procedure (“FRCP”) 26. We have to satisfy ourselves, opposing counsel and the judge that we have adequately searched for and produced a substantial portion of responsive documents based on the document requests. When parties use predictive coding, the duties and requirements of the law do not change. Simply put, predictive coding will rank the universe of documents from “most likely relevant” at the top to “least likely relevant” at the bottom. The algorithm learns to recognize what is “most likely relevant” based on the coding of a small set of documents that were reviewed and tagged by an attorney. If you think of the result of predictive coding as a list, you can agree with opposing counsel and the judge on a cutoff point within that list where you will stop producing documents. Most entities choose to manually review documents from the highest likelihood of responsiveness down to the agreed cutoff point. It is also common for sampling to take place below the cutoff point to confirm the algorithm’s output. Explaining predictive coding to a client includes an honest assessment of whether the project is large enough, and involves the right file types, to justify its use. As with any type of discovery project, documentation along the way is crucial.
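The ranking-and-cutoff idea above can be sketched in a few lines of Python. The document names, relevance scores and cutoff value here are invented for illustration; in practice a predictive coding tool derives the scores from the attorney-coded training set.

```python
# Hypothetical relevance scores from a trained model (1.0 = most likely relevant).
scored_docs = [
    ("memo_014.msg", 0.96),
    ("contract_draft.docx", 0.91),
    ("lunch_plans.msg", 0.12),
    ("board_minutes.pdf", 0.83),
    ("newsletter.pdf", 0.05),
]

# Rank the universe of documents from "most likely relevant" down.
ranked = sorted(scored_docs, key=lambda d: d[1], reverse=True)

# The parties agree on a cutoff point; documents above it go to manual
# review, documents below it may only be sampled for confirmation.
cutoff_score = 0.50
review_set = [name for name, score in ranked if score >= cutoff_score]
print(review_set)  # ['memo_014.msg', 'contract_draft.docx', 'board_minutes.pdf']
```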
It is important to understand the concepts of recall and precision because these two tangible figures empower a party to argue proportionality more effectively. Remember that the newly proposed FRCP 26 centers the scope of discovery on proportionality. Rule 26 lists the factors judges should consider with regard to proportionality: “the importance of the issues at stake in the action, the amount in controversy, the parties’ relative access to relevant information, the parties’ resources, the importance of the discovery in resolving the issues, and whether the burden or expense of the proposed discovery outweighs its likely benefit.” A producing party who understands recall can more persuasively articulate proportionality and therefore advocate for the most efficient set of documents to be produced to opposing counsel – the minimal universe of data needing manual review by the producing party. Being able to reduce the number of documents needing manual review, especially in the case of European clients, reduces the costs of discovery.
Recall Tells Me Whether I Found a Substantial Portion of Responsive Documents
A producing party will be motivated to argue for production of the smallest number of documents. If you imagine the ranking of the universe of documents in order of “most likely relevant” (green) to “least likely relevant” (red) on a target, as the illustration below portrays, a producing party will argue that it should only be required to produce the documents within the smallest bullseye possible. The receiving party will argue that the bullseye should be significantly larger. The recall measurement helps guide the judge in understanding whether you are agreeing to review and produce the appropriate number of documents, i.e., whether the bullseye should be small or large to catch a substantial portion of potentially relevant documents. The wider you expand the bullseye, the closer you get to 100% recall, but you only reach 100% recall if you review every document – exactly what we are trying to avoid by using predictive coding.
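Recall is simply the fraction of all truly responsive documents that fall inside the bullseye. A minimal sketch, using invented document identifiers and an invented review set:

```python
# Hypothetical ground truth and a hypothetical "bullseye" (the review set).
responsive = {"d1", "d2", "d3", "d4", "d5"}   # all truly responsive documents
bullseye = {"d1", "d2", "d3", "d7", "d9"}     # documents inside the cutoff

# Recall: of everything responsive, how much did the bullseye capture?
recall = len(responsive & bullseye) / len(responsive)
print(recall)  # 0.6 — we found 3 of the 5 responsive documents
```

Widening the bullseye can only keep this number the same or push it toward 1.0, which is why 100% recall ultimately means reviewing everything.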
Precision Shows Me How Well the Algorithm Worked
If we continue using the bullseye example to understand precision, we are focusing on how well the algorithm collected documents within the bullseye. Precision provides a tangible figure for both parties and the judge to evaluate whether the algorithm collected the “most likely responsive” documents in the bullseye. The bullseye is important to a party concerned with costs because these are the documents needing manual review. This means that for every document incorrectly within the bullseye, the producing party loses money: the document must still be manually reviewed, only to be recognized as non-responsive and thrown out of the production set. For this reason, as well as a desire not to overproduce data, the producing party is encouraged to seek high precision from the algorithm. In truth, the receiving party also hopes for high precision, because it has little desire to review irrelevant documents. This concept of focusing on the documents that are “most likely responsive” fits well with our new FRCP. The move to a scope of discovery focused on proportionality encourages a precise production of documents, which leads the parties to address the specified issues in dispute.
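Precision is the mirror image of recall: of the documents inside the bullseye, how many are actually responsive? Using the same invented sets as the recall discussion:

```python
# Hypothetical ground truth and a hypothetical "bullseye" (the review set).
responsive = {"d1", "d2", "d3", "d4", "d5"}   # all truly responsive documents
bullseye = {"d1", "d2", "d3", "d7", "d9"}     # documents inside the cutoff

# Precision: of everything in the bullseye, how much is truly responsive?
precision = len(responsive & bullseye) / len(bullseye)
print(precision)  # 0.6 — 3 of the 5 documents in the bullseye are responsive
```

Every non-responsive document inside the bullseye (here, d7 and d9) is a document the producing party pays to review and then discards.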
By putting these two concepts together, we can see that a bullseye containing only “most likely responsive” documents, with the tightest ring around those documents, would be the optimal outcome for a producing party. The illustrations above make it easier to understand that as recall increases (as we expand the size of the bullseye), precision decreases. This is because as you widen the bullseye, there is a higher chance that a “less likely relevant” or “least likely relevant” document falls within it. It may take a few iterations of modifying the training sets to reach recall and precision rates that are agreeable to the parties and the judge, but it makes sense to remain patient during this stage of the process.
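The tradeoff can be made concrete by widening the cutoff over a small ranked universe. The ranking and the responsive/non-responsive labels below are invented for illustration:

```python
# Ranked universe, best-scoring first: (document, truly_responsive).
ranked = [("d1", True), ("d2", True), ("d3", False), ("d4", True),
          ("d5", False), ("d6", False), ("d7", True), ("d8", False)]
total_responsive = sum(1 for _, r in ranked if r)  # 4 responsive documents

results = []
for cutoff in (2, 4, 8):                 # progressively widen the bullseye
    inside = ranked[:cutoff]
    hits = sum(1 for _, r in inside if r)
    recall = hits / total_responsive
    precision = hits / cutoff
    results.append((cutoff, recall, precision))
    print(f"cutoff={cutoff}: recall={recall:.2f}, precision={precision:.2f}")
# cutoff=2: recall=0.50, precision=1.00
# cutoff=4: recall=0.75, precision=0.75
# cutoff=8: recall=1.00, precision=0.50
```

Widening the bullseye from 2 to 8 documents raises recall from 50% to 100% while precision falls from 100% to 50% – the tension the parties negotiate when agreeing on a cutoff.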
Existing technology has the potential to reduce costs and document overproduction. Analytical tools are already built into most review platforms (though they vary among providers), and many litigation service providers no longer charge separately for the use of analytics as they have in the past. Helping clients understand the technology available to best handle their case may involve a discussion about predictive coding and an explanation of the terms recall and precision, and can be supplemented by a demonstration from a service provider. Your conversation with the client, judge or opposing party will hit the bullseye if you can explain the terms in an easy-to-understand fashion.