A putative class action complaint has just been filed against a large company. After distributing a legal hold, the general counsel considers who should be in charge of the eventual collection of documents, including electronically stored information (ESI), when the collection process should begin, and how the collection process should operate in order to be efficient and effective.

Why Worry About Collecting Data?
Collecting data, including ESI, can be a daunting task. Counsel’s collection and production obligations can often extend far beyond e-mail servers and archives. Depending on the claims and defenses in a dispute, a company may have to produce ESI from a wide variety of additional computer systems and media, including: hard drives of laptop and desktop computers; PDAs and smartphones; various removable media such as flash drives; personal network storage locations assigned to individuals; shared network storage locations assigned to departments or business units; various software applications and associated databases; and sometimes even telephonic and voice mail systems or instant messages.

Ensuring that ESI is properly collected requires a plan designed to meet basic discovery objectives, overcome claims of under-collection, allow for the processing of data for review and production, and ensure the admissibility of the evidence if necessary.

Who Should Collect Data?
The first choice that a collecting party faces is whether to do the collection internally or to outsource the process to a specialized vendor. The appropriate decision will vary based on the frequency with which the party finds itself in litigation, the financial stakes of the current litigation, and the breadth of the issues in play.

A frequent litigant may want to bring ESI collection in house, where an investment in software tools that enable document collection and personnel to manage the process can ultimately save money over the course of repeated litigations. An infrequent litigant may be better served by bringing in an outside vendor retained to collect data in an efficient and defensible manner, rather than diverting IT staff, who may be unfamiliar with the objectives of collecting data, from their regular functions.

Outsourcing the collection process is not an all-or-nothing decision, however. As a middle ground, companies may invest in ESI collection software and contract with the software vendor or another entity to provide the personnel and technology for certain tasks, such as collections from relational databases or other large scale collections. In this way, the litigant can outsource some portions of the collection process while retaining direct control over others.

When Should Data Be Collected?
Unlike in the days of exclusively paper discovery, when it was common practice to await discovery requests before collecting evidence, the current Federal Rules (and the nature of ESI) require a much earlier commitment to collection of data. Indeed, given the difficulty and risk involved in enforcing a legal hold that relies on preserving ESI “in place,” litigants increasingly are “collecting to preserve” at the outset of litigation.

This approach combines the identification of potentially relevant information with the storage of that information in an appropriate and defensible manner. In addition, given that a more detailed review of individual sources of ESI typically accompanies the collection effort, early collection can reveal unforeseen sources of data. 

Front-loading collection can come at a cost, including the expense of organizing and maintaining data long before it is produced, if it is ever produced. A litigant might therefore collect some data early in the litigation, such as data needed for Rule 26(a) initial disclosures, and defer other data, such as data relevant to a putative class that has yet to be certified, until later in the case.

How Should Data Be Collected?
A litigant need not collect every e-mail, electronic document, backup tape or shred of paper it possesses. Understanding the kinds of data to be collected helps a litigant to prepare a collection plan. For example, the plan should account for how the particular data is kept in the ordinary course of business, for two reasons. First, the processing for review and production of different types of ESI (e-mails vs. spreadsheets, for example) can differ. Second, unless the requesting party specifies a form or the parties agree otherwise, the Federal Rules call for ESI to be produced in the form in which it is ordinarily maintained or in a reasonably usable form. The collection process should also consider whether steps should be taken to collect any metadata associated with collected files. In some instances, the collection of metadata is essential; for other kinds of ESI, the collection of metadata is either unnecessary or unduly burdensome.
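Where metadata does matter, one way to capture it is to record a file's file-system metadata before the file is copied. The following is a minimal sketch in Python, not a description of any particular collection tool. The function name collect_with_metadata and the fields in the returned record are hypothetical, and the sketch covers only file-system metadata (size and last-modified time), not metadata embedded inside documents or e-mail messages.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict


def collect_with_metadata(src: Path, dest_dir: Path) -> Dict[str, object]:
    """Copy src into dest_dir and return a record of its file-system metadata.

    Hypothetical helper for illustration only.
    """
    # Read the metadata first, before the copy touches the file.
    stat = src.stat()
    dest_dir.mkdir(parents=True, exist_ok=True)
    # shutil.copy2 copies the contents and also attempts to preserve
    # timestamps, to the extent the platform allows.
    dest = Path(shutil.copy2(src, dest_dir / src.name))
    return {
        "source_path": str(src),
        "collected_path": str(dest),
        "size_bytes": stat.st_size,
        "modified_utc": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc
        ).isoformat(),
    }
```

Keeping the record separately from the copied file means the original timestamps remain documented even if a later processing step alters them on the copy.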

Further, especially when collection is done early in the case, it is not always possible to limit the collection with a defensible set of search terms. Therefore, litigants often opt to collect all ESI from key custodians, and to cull unneeded files later. However, one can limit the ESI collected by using “exclusionary” restrictions; that is, collecting all files except those that satisfy particular filtering criteria. For instance, it is generally appropriate to exclude the collection of system files and executables, as these rarely contain responsive information. Likewise, it is often appropriate to exclude data that has been “deleted” but still resides in fragmented form or in slack space on a hard drive.
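An exclusionary restriction of this kind can be sketched as a simple filter that collects everything under a custodian's folder except files matching the exclusion criteria. The example below is illustrative only: the list of excluded extensions and the function names are assumptions, and a real exclusion list would be settled with counsel and documented. Filtering by extension is also a simplification, since commercial tools may identify file types by other means.

```python
from pathlib import Path
from typing import List

# Hypothetical exclusion list: system files and executables.
EXCLUDED_SUFFIXES = {".exe", ".dll", ".sys", ".com", ".bat"}


def is_excluded(path: Path) -> bool:
    """Return True if the file matches an exclusionary criterion."""
    return path.suffix.lower() in EXCLUDED_SUFFIXES


def files_to_collect(root: Path) -> List[Path]:
    """List every regular file under root except the excluded ones."""
    return sorted(
        p for p in root.rglob("*") if p.is_file() and not is_excluded(p)
    )
```

Because the filter names what to leave out rather than what to keep, any file type not on the exclusion list is collected by default, which errs on the side of over-collection rather than under-collection.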

Once collected, the data is best placed in a “write once, read many” format, such as a recordable CD or DVD, to ensure that the files and associated metadata, if any, are not inadvertently modified. Collected data is then typically processed, either internally or by an outside vendor, for review and production.
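Write-once media guards against modification. A complementary check, which is an addition here and not something the discussion above prescribes, is to record a cryptographic hash of each collected file so that any later change can be detected. The sketch below assumes SHA-256 and uses hypothetical function names; it is one possible approach, not a required procedure.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return files that were added, removed, or altered since the manifest."""
    current = build_manifest(root)
    return sorted(
        name
        for name in manifest.keys() | current.keys()
        if manifest.get(name) != current.get(name)
    )
```

A manifest built at collection time can be stored with the collected set and rechecked before processing or production; an empty result from changed_files indicates that the contents of the files match what was originally collected.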

What’s the Takeaway?
Planning for collection of ESI ideally starts before litigation. Planning early allows a company to choose an optimal combination of in-house resources and outside vendors to efficiently handle collection of ESI. Once litigation has started, it is often best to collect ESI early and to use a collection methodology that maintains the integrity of the data while keeping the data relatively easy and inexpensive to process for review and production. Employing exclusionary restrictions to limit collection can be effective when used at an early stage of litigation.
