Originally published in October 2015, this was among the most-viewed posts following Relativity Fest that year—which goes to show that analytics has been top-of-mind for e-discovery teams for quite a while. Cristin's insights still resonate, so we thought we'd share them again in light of the new email threading innovations coming to Relativity in 2016.
The best way to tackle data is to take an organized approach. Analytics features such as email threading help case teams get the most out of their case data by using metadata to automatically organize documents, streamlining review from the start.
Today’s General Counsel recently featured an article entitled “Threading is the New Global De-duplication” from Cristin Traylor, counsel at McGuireWoods, on just that subject. To get more insight into how and why text analytics has become part and parcel of her team’s approach to e-discovery, we sat down with Cristin.
Sam: To quickly recap your article for Today’s General Counsel, what makes email threading a no-brainer for your e-discovery projects?
Cristin: In a nutshell, email threading saves time and money for our clients. There are fewer documents to review, and the review decisions themselves are more consistent. There is really no reason not to use it.
What objections do you most commonly hear to using email threading? How do you address them?
The most common objection is the cost to run analytics. However, when I explain that expending a small amount up front to run the technology will save the client more on review costs in the long run, everyone jumps on board. The other concern (I wouldn’t call it an objection) is that many times the case team doesn’t understand what threading is. Once I explain it to them in language they can understand, they immediately see the benefits. An example I give is that if I send them an email and then we reply back and forth multiple times, we don’t need to review each part of the conversation separately and then all together again. We just need to review the last email between us, since it includes the rest of the conversation as well. That is a way of describing threading that most people can visualize.
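To make the "review only the last email" idea concrete, here is a minimal sketch in Python. It is an illustration under simplifying assumptions, not how any particular analytics product identifies inclusive emails: each email is represented by the set of messages whose text it contains, attachments are ignored, and the names `inclusive_emails`, `thread`, and the IDs are hypothetical.

```python
def inclusive_emails(thread):
    """Return the emails whose content is not fully contained in another email.

    `thread` maps a (hypothetical) email ID to the set of message IDs whose
    text appears in that email: its own message plus everything quoted below.
    """
    inclusive = []
    for email_id, contents in thread.items():
        # An email is redundant if another email contains everything it does
        # and more (i.e. its contents are a proper subset of the other's).
        covered = any(
            contents < other
            for other_id, other in thread.items()
            if other_id != email_id
        )
        if not covered:
            inclusive.append(email_id)
    return inclusive


thread = {
    "E1": {"m1"},                # original email
    "E2": {"m1", "m2"},          # first reply, quoting the original
    "E3": {"m1", "m2", "m3"},    # last email of the main conversation
    "E4": {"m1", "m2", "m4"},    # a reply that branches off after m2
}

print(inclusive_emails(thread))  # ['E3', 'E4']
```

In this toy thread, only E3 and E4 would need review: E1 and E2 are fully quoted in later emails, while the branch ending at E4 carries content that E3 does not.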
Opposing counsel only object when they don’t understand what threading entails or, more generally, when they are less sophisticated when it comes to e-discovery matters. Again, once you explain what it is, tell them that they will not be missing any substance, and let them know that, in fact, they’ll have fewer documents to review on their end after the production comes through, they usually agree to it. When they don’t agree, it’s often because they don’t want to cooperate due to the contentious nature of the dispute.
Does case size matter when it comes to the benefits of email threading?
When it comes to threading, size doesn’t matter. The technology is beneficial whether you have 1,000 documents or 1 million documents. Whether you are narrowing the population to review only the inclusive documents or just using threading to accelerate the review and improve consistency, threading can be useful. Threading can also assist in performing quality control on any size population. For example, you can set up a search for privileged documents and then include thread groups. This will allow you to QC the documents in the same thread group that are not marked privileged, in order to check the accuracy of the calls.
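The privilege QC check described above can be sketched in a few lines of Python. This is an illustration of the logic only, assuming a simple list of coded documents; the field names (`doc_id`, `thread_group`, `privileged`) and the function `find_privilege_inconsistencies` are hypothetical and do not correspond to any specific review platform's API.

```python
from collections import defaultdict


def find_privilege_inconsistencies(documents):
    """Return thread groups that mix privileged and non-privileged calls.

    The result maps each such thread group to the IDs of its documents
    that are NOT marked privileged, i.e. the ones worth a second look.
    """
    by_thread = defaultdict(list)
    for doc in documents:
        by_thread[doc["thread_group"]].append(doc)

    flagged = {}
    for thread_group, docs in by_thread.items():
        privileged = [d["doc_id"] for d in docs if d["privileged"]]
        not_privileged = [d["doc_id"] for d in docs if not d["privileged"]]
        # Only threads with at least one privileged document are in scope.
        if privileged and not_privileged:
            flagged[thread_group] = not_privileged
    return flagged


documents = [
    {"doc_id": "DOC-001", "thread_group": "T1", "privileged": True},
    {"doc_id": "DOC-002", "thread_group": "T1", "privileged": False},
    {"doc_id": "DOC-003", "thread_group": "T2", "privileged": False},
    {"doc_id": "DOC-004", "thread_group": "T3", "privileged": True},
    {"doc_id": "DOC-005", "thread_group": "T3", "privileged": True},
]

qc_queue = find_privilege_inconsistencies(documents)
print(qc_queue)  # {'T1': ['DOC-002']}
```

A mixed call within a thread is not necessarily an error, since a later reply can add or drop privileged content, so the output is best treated as a queue for human review rather than a list of mistakes.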
Internal investigations are a prime candidate for email threading, as well. Typically, you are trying to get a handle on a certain issue and have a large amount of data that you need to review and analyze quickly. With threading, you can narrow the population for review but at the same time have the ability to really delve into the email conversations. It may be very important to see what was happening in a certain email thread and who branched off to have a side conversation. Email threading allows you to easily group the threads together for this more in-depth analysis.
Are there any other text analytics features you find particularly useful?
Near-duplicate identification can play a role in quality control, as well. You want to make sure that you are treating duplicate or near-duplicate documents the same for privilege and responsiveness, if warranted. Near-duplicate identification allows us to easily group those documents together for analysis.
For projects with foreign documents, language identification can be useful to segregate the documents that contain text in other languages and develop a separate workflow for those documents. That could mean either having lawyers fluent in the other language review the documents, or having the documents translated into English.
Clustering can also be very beneficial. We combine clustering and other features to get the best of all worlds. For example, we may first run threading to identify the inclusive emails and exclude duplicate spares. Then we would cluster the resulting set so that we are grouping similar documents together. When setting up batches for assignment, we set the Batch Unit field as the clusters and the Family field as Email Thread Group. That way, the documents in the batches are conceptually similar but still include all the emails in a thread. We find this helps reviewers with both speed and consistency.
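The batching approach described above can be sketched roughly as follows. This is a simplified Python illustration, not a reproduction of Relativity's batching logic: it assumes each document already carries a cluster and an email thread group, and the names `build_batches`, `cluster`, `thread_group`, and `batch_size` are hypothetical. Each batch draws from a single cluster, and a thread group within that cluster is never split across batches, even if that means a batch runs over the target size.

```python
from collections import defaultdict


def build_batches(documents, batch_size):
    """Group documents into batches by cluster, keeping thread groups intact."""
    # cluster -> thread group -> list of document IDs
    by_cluster = defaultdict(lambda: defaultdict(list))
    for doc in documents:
        by_cluster[doc["cluster"]][doc["thread_group"]].append(doc["doc_id"])

    batches = []
    for cluster, threads in by_cluster.items():
        current = []
        for thread_docs in threads.values():
            # Close the current batch if adding this thread would overflow it.
            if current and len(current) + len(thread_docs) > batch_size:
                batches.append({"cluster": cluster, "doc_ids": current})
                current = []
            current.extend(thread_docs)
        if current:
            batches.append({"cluster": cluster, "doc_ids": current})
    return batches


documents = [
    {"doc_id": "D1", "cluster": "A", "thread_group": "T1"},
    {"doc_id": "D2", "cluster": "A", "thread_group": "T1"},
    {"doc_id": "D3", "cluster": "A", "thread_group": "T2"},
    {"doc_id": "D4", "cluster": "A", "thread_group": "T2"},
    {"doc_id": "D5", "cluster": "B", "thread_group": "T3"},
]

batches = build_batches(documents, batch_size=3)
for batch in batches:
    print(batch["cluster"], batch["doc_ids"])
```

With a target size of three, the two threads in cluster A land in separate batches rather than being split, and cluster B gets its own batch, so each reviewer sees conceptually similar documents with whole conversations.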
Aside from text analytics features such as email threading, what do you see becoming the norm in e-discovery?
There are so many brilliant people trying to solve these challenges that I am sure we will continue to see more sophisticated workflow options. For now, plain, old-fashioned cooperation among parties goes a long way toward reducing data sets. It would be wonderful if “reasonableness” became the norm.
Cristin Traylor is counsel at McGuireWoods, where she advises clients on e-discovery, records management, and information governance. She oversees the firm's e-Discovery Review Center, handling all aspects of discovery using cutting-edge technology.
Sam Bock is a member of the marketing communications team at kCura and serves as editor of the Relativity blog.