This is Part Two of Two in a set of Tips of the Month addressing email archives. The Tip of the Month for May addressed initial considerations and functionality relating to email archives.

A large multinational corporation journals all incoming and outgoing employee emails in an email archive system in order to better manage email volume and the potential complications associated with backup media in its frequent litigation. For each legal matter requiring the production of documents, the corporation must preserve, search, retrieve and produce large volumes of email from the archive. The corporation is beginning to realize just how time-consuming, cumbersome and error-prone this process can be, and is evaluating ways to better manage the process in future litigation.

Understanding the Challenges of Email Archives in Legal Matters
Email archives can be useful to an organization in many ways. For example, they can assist with regulatory compliance and with email consolidation to streamline preservation and collection in connection with legal matters. However, many email archives were designed primarily to enable consolidation and retention of email, not to preserve an individual employee’s email, to search and retrieve emails for litigation, or to provide a defensible email deletion program that is consistent with an organization’s policies and legal obligations. These design limitations can make it difficult to manage the preservation, collection and production of email from email archives in a cost-effective way.

First, discovery, whether in federal or state court, remains largely focused on the concept of data associated with a “custodian.” A custodian is traditionally understood to be an individual employee of the organization who has access to, and stores, information related to the operations of the organization. Parties to a litigation often spend significant time and effort identifying relevant custodians and discussing how to associate responsive data with each custodian. Email is a classic example of data considered to be associated with a particular custodian.

This use of the term custodian, however, is a misnomer when it comes to email archives. Email archives do not generally store email by custodian. Instead, they generally store email based on applicable retention periods and limited duplication. In the case of an email archive, therefore, the “custodian” is the archive itself, not the individual employee. When an organization searches for an employee’s email in an email archive, what it is really doing is running searches on certain metadata fields in the archive, using that employee’s name or email addresses as keywords.
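
To make the point concrete, the following is a minimal sketch of what a "custodian" search amounts to in an archive that is not organized by custodian. The ArchivedEmail record, its field names and the search_by_employee function are hypothetical illustrations, not the interface of any particular archive product.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ArchivedEmail:
    """Hypothetical archive record: one stored copy, not filed under any employee."""
    message_id: str
    sender: str
    to: List[str] = field(default_factory=list)
    cc: List[str] = field(default_factory=list)
    bcc: List[str] = field(default_factory=list)


def search_by_employee(archive: List[ArchivedEmail], addresses: List[str]) -> List[ArchivedEmail]:
    """Return messages whose address metadata mentions any of the given addresses.

    The employee's addresses act as keywords matched against the
    sender/to/cc/bcc fields; the archive has no per-employee mailbox to open.
    """
    wanted = {a.lower() for a in addresses}
    hits = []
    for msg in archive:
        participants = {p.lower() for p in [msg.sender, *msg.to, *msg.cc, *msg.bcc]}
        if wanted & participants:
            hits.append(msg)
    return hits


# Illustrative data only.
archive = [
    ArchivedEmail("m1", "alice@example.com", to=["bob@example.com"]),
    ArchivedEmail("m2", "carol@example.com", to=["dave@example.com"], cc=["alice@example.com"]),
    ArchivedEmail("m3", "carol@example.com", to=["dave@example.com"]),
]
alice_hits = search_by_employee(archive, ["alice@example.com"])
```

In this sketch, a message on which the employee was merely copied is returned alongside messages the employee sent, because the search runs on address fields rather than on a custodian's mailbox.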

Second, the search capabilities of email archives are often limited, allowing for only the most basic of searches, such as searches by date or simple keywords. In addition, some archives are limited in the types of documents they can search. For example, some email archives search the text of an email but not the content of any email attachments, while others can search some attachments but not certain types (e.g., older PDFs or facsimiles that are not readable).

Third, searches in email archives must often be conducted sequentially, i.e., the email archives cannot conduct more than one search at a time. Nor can an email archive “tag” or otherwise identify a document that has already been collected for a particular legal matter. This means that when an organization conducts more than one data collection for the same legal matter using employee names/email addresses or keywords, the organization invariably is collecting duplicate data. This result can translate into significant additional costs to the organization in the processing of that data for review. 
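
One way to picture the duplication problem, and a possible workaround, is to track collected messages outside the archive. The sketch below assumes, hypothetically, that each archived message exposes a stable unique identifier and that the organization keeps its own per-matter record of what has already been collected; neither assumption comes from any specific archive product.

```python
from typing import Iterable, List, Set


def collect_new(search_results: Iterable[str], already_collected: Set[str]) -> List[str]:
    """Return only message IDs not yet collected for this matter, and record them.

    The archive itself cannot "tag" collected items, so the tracking set
    lives outside the archive (for example, in a per-matter log).
    """
    fresh = []
    for message_id in search_results:
        if message_id not in already_collected:
            already_collected.add(message_id)
            fresh.append(message_id)
    return fresh


# Two sequential collections for the same matter with overlapping results.
collected_for_matter: Set[str] = set()
first_pass = collect_new(["m1", "m2", "m3"], collected_for_matter)   # search by employee address
second_pass = collect_new(["m2", "m3", "m4"], collected_for_matter)  # later keyword search
```

Here the second pass yields only the one message that the first pass did not already collect, so only that message would be sent on for processing.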

Fourth, in many cases, organizations use email archives that are maintained and serviced by a third-party vendor. While the organization may technically “control” the email in a legal sense, the organization is nonetheless limited in its ability to expedite or control the search and retrieval. Given the large volumes of data that are often maintained in an email archive, the search limitations imposed by the architecture of such email archives and the effort required to extract that data, the collection and retrieval process can take weeks or months for just one search and collection request.

Finally, email archives are, in essence, large databases of information. Those databases are fallible. Most email archives experience the corruption of a certain percentage of the data they retain in the ordinary course of business. Email archives do not, therefore, contain 100 percent of all emails flowing through the organization’s systems. Moreover, when searching through massive amounts of information, and retrieving vast quantities of data, technical failures may arise. This often means that the search and retrieval of emails from an archive is a manual process, requiring significant hours by employees to ensure that the archive is functioning correctly. 

Best Practices for Managing Search and Retrieval from Email Archives
Understanding the operation and limitations of email archives is an important step in managing those archives when it comes to search and retrieval of data for legal purposes. It is also critical in properly communicating with opposing counsel or the court when it comes to collection and production in connection with a particular legal matter.

  • Plan Your Collections. Once you understand the operation of the organization’s email archive, plan your collections accordingly. Consider how to minimize duplication in collections and how to most efficiently and effectively search for responsive emails. And be prepared to discuss your collection methodology—including any limitations to the email archive’s search capabilities and how your methodology addresses those limitations—with opposing counsel.  
  • Custodians. In many document productions, the parties negotiate the deduplication of the collected data and how custodian information may be preserved, collected and produced in conjunction with such deduplication. Be prepared to discuss the concept of “custodian” when it comes to email archives with the court or opposing counsel. It may be less burdensome—and more accurate—to identify the archive itself as the custodian of any email retrieved (in the custodian metadata field of the load file), and to rely on the names listed in the “to,” “from,” “cc” and “bcc” fields to identify who sent or received the email. Collecting the email data from the archive for a group of employees at once (which may eliminate the need for deduplication) may also confuse your outside e-discovery vendor and opposing counsel, and impair your processing and review workflow. Planning is critical to avoid an ad hoc approach during the pendency of a litigation or investigation. 
  • Timing. Be aware of the time it is likely to take to retrieve email from the organization’s email archive, and be sure to communicate that potential time frame to legal counsel. It is critical to make sure that any representations to the court or opposing counsel take into consideration the time it will take to actually obtain the necessary data.
  • Investing in Technology. Most archive vendors are developing new technology to help manage the legal hold and collection process. There may be a significant return on investment for such technology, but be sure to understand how the new technology functions and whether it actually makes the way your organization presently manages legal holds and collections more cost effective.
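
As an illustration of the custodian approach described above, the following sketch writes a simple delimited load file in which the archive itself is named as custodian and the address fields identify who sent or received each email. The column names, the "Email Archive" label and the input dictionary layout are hypothetical; actual load file specifications are typically negotiated with opposing counsel and the e-discovery vendor.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List

ARCHIVE_CUSTODIAN = "Email Archive"  # hypothetical label for the custodian field
FIELDNAMES = ["DocID", "Custodian", "From", "To", "CC", "BCC"]


def load_file_rows(messages: Iterable[Dict]) -> Iterator[Dict[str, str]]:
    """Build one load file row per message, with the archive as custodian."""
    for m in messages:
        yield {
            "DocID": m["id"],
            "Custodian": ARCHIVE_CUSTODIAN,
            "From": m["from"],
            "To": "; ".join(m.get("to", [])),
            "CC": "; ".join(m.get("cc", [])),
            "BCC": "; ".join(m.get("bcc", [])),
        }


def write_load_file(messages: List[Dict]) -> str:
    """Return the load file as CSV text (header row plus one row per message)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(load_file_rows(messages))
    return buf.getvalue()


# Illustrative data only.
sample = [{
    "id": "DOC0001",
    "from": "alice@example.com",
    "to": ["bob@example.com", "carol@example.com"],
    "cc": [],
    "bcc": [],
}]
load_file_text = write_load_file(sample)
```

Because each message appears once with the archive as custodian, reviewers would rely on the From, To, CC and BCC columns, rather than the custodian column, to see which employees were involved.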

For inquiries related to this Tip of the Month, please contact Anthony J. Diana or Therese Craparo.

Learn more about Mayer Brown’s Electronic Discovery & Records Management practice or contact Anthony J. Diana, Michael E. Lackey or Ed Sautter.
