The Journal of The DuPage County Bar Association

Vol. 21 (2008-09)

Metadata Mining During e-Discovery: The Potential for Finding a Diamond
in the Rough

By Erin E. Wright


Metadata is an intriguing aspect of e-discovery. Some view it as a wealth of valuable information, while others view it as a drain on time and resources. Metadata is often described as "data about data."1 The advisory notes to the Federal Rules of Civil Procedure define it as "information describing the history, tracking, or management of an electronic file."2 The advisory notes are the official commentary on the Federal Rules, but the Sedona Conference’s Guidelines were the impetus for incorporating the topic of e-discovery into the 2006 amendments. The Sedona Conference Working Group brought together judges and attorneys to develop uniform recommendations on electronic document production, including metadata.3 The Sedona Conference explains that metadata is "information about a particular data set which describes how, when and by whom it was collected, created, accessed, or modified and how it is formatted (including data demographics such as size, location, storage requirements and media information)."4 Some types of metadata, like file size and date, are readily visible. Other types are hidden or embedded and difficult for lay people to access,5 and may not be visible at all once a document is printed or converted to an image file.6
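To make the "readily visible" category concrete, the following is a minimal sketch in Python of reading the file-system metadata that most users can already see: a file's name, size, and last-modified and last-accessed times. The function name visible_metadata and the sample file memo.txt are hypothetical and chosen only for illustration. A real collection effort would rely on forensic tools that avoid altering access times.

```python
import os
import tempfile
from datetime import datetime, timezone


def visible_metadata(path):
    """Return the file-system metadata an ordinary user can already see."""
    info = os.stat(path)

    def as_utc(timestamp):
        # Convert a raw POSIX timestamp into a readable UTC string.
        return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()

    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified_utc": as_utc(info.st_mtime),
        "accessed_utc": as_utc(info.st_atime),
    }


# Demonstration on a throwaway file (hypothetical name and content).
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "memo.txt")
    with open(sample, "w", encoding="utf-8") as handle:
        handle.write("Draft memo")
    record = visible_metadata(sample)
    print(record)
```

Hidden or embedded metadata, by contrast, lives inside the file itself and does not appear in a listing of this kind.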

Practically speaking, metadata exists where electronically stored information ("ESI") exists. ESI exists in many forms and is an integral part of today’s practice of law. For example, lawyers rely on Microsoft Word, Excel, PowerPoint, e-mail, and .pdf documents every day. The list of potentially discoverable ESI is endless and can include computer hard drives, disks, flash drives, digital tapes, microfilm, and other types of electronic data storage devices.7 Beyond the types of discoverable ESI, the sources from which ESI can be extracted are also endless: network servers, BlackBerrys, voicemail messages, employees’ home or laptop computers, websites, and portable drives are all viable possibilities.8 While the laundry list of discoverable ESI and corresponding metadata may appear daunting, the metadata associated with each individual piece of ESI can be revealing.

Consider an attorney who represents a company. In the regular course of business, the company’s employees produce documents in Excel and Word. When the company is sued and is forced to produce its files during discovery, the metadata attached to the files may reveal important, or even confidential, information. In the event the requesting attorney reviews the metadata, it would disclose who drafted each document, when it was drafted, when the employees accessed the file, and the number of document versions. The metadata would also reveal the file’s size and that it was stored in a public folder on a document management system. The requesting attorney would learn that the file contained pictures and a .pdf attachment.
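As an illustration of how a reviewer might read such embedded information, the following is a minimal sketch in Python. It assumes the file is a modern Word document in the .docx format, which is a zip package that stores author, date, and revision properties in a part named docProps/core.xml. Older binary .doc files would require different tooling. The function name read_core_properties and the sample values (J. Doe, R. Roe, the dates, the revision number) are hypothetical. The sample package is built in memory only so that the example is self-contained.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def read_core_properties(docx_file):
    """Read author, editor, dates, and revision count from a .docx package."""
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))

    def text_of(tag):
        node = root.find(tag, NS)
        return node.text if node is not None else None

    return {
        "author": text_of("dc:creator"),
        "last_modified_by": text_of("cp:lastModifiedBy"),
        "created": text_of("dcterms:created"),
        "modified": text_of("dcterms:modified"),
        "revision": text_of("cp:revision"),
    }


# Hypothetical stand-in for a produced document, built in memory for the demo.
SAMPLE_CORE_XML = (
    '<cp:coreProperties'
    ' xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"'
    ' xmlns:dc="http://purl.org/dc/elements/1.1/"'
    ' xmlns:dcterms="http://purl.org/dc/terms/"'
    ' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    '<dc:creator>J. Doe</dc:creator>'
    '<cp:lastModifiedBy>R. Roe</cp:lastModifiedBy>'
    '<cp:revision>7</cp:revision>'
    '<dcterms:created xsi:type="dcterms:W3CDTF">2008-01-15T09:30:00Z</dcterms:created>'
    '<dcterms:modified xsi:type="dcterms:W3CDTF">2008-02-20T16:45:00Z</dcterms:modified>'
    '</cp:coreProperties>'
)

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("docProps/core.xml", SAMPLE_CORE_XML)

props = read_core_properties(buffer)
for field, value in props.items():
    print(f"{field}: {value}")
```

None of these fields survives in a paper printout, which is why a hard-copy production can look complete while omitting the drafting history.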

This is a significant amount of information that is not available in hard-copy form, and it may prove valuable during discovery for those who know how to use it. For example, the requesting party may scour the metadata and learn that the file size of the hard-copy document produced does not match the file size recorded in the metadata. From that mismatch, the requesting party may discover that the producing party removed documents from its tangible production but failed to explain the reason for doing so.

Through a process known as "mining" the metadata, it is possible to discover changes and comments that have been made to a document as well as images embedded within it.9 The trouble with this scenario is that the disclosing attorney may have spent a substantial amount of time sifting through the hard-copy documents to determine the appropriate items to disclose. If the disclosing attorney neglects the metadata and fails to explain an inconsistency between the metadata and the hard copies, both the document production and the electronic production are undermined. In a case of first impression, the Northern District of Illinois in Autotech Technologies Ltd. v., Inc., ruled upon the topic of metadata and explained how parties should approach it in this jurisdiction.

Autotech Technologies Ltd. v., Inc. The Northern District of Illinois recently ruled that a party need only produce that for which its adversary asks. That is, a party adequately complies with an adversary’s discovery requests if it produces electronic documents but omits the corresponding metadata.

In Autotech Technologies v., the plaintiff, a manufacturer of touch screen technologies, filed a state court claim against the defendant ("ADC"), a marketer, for its development of a competing touch screen. Through the course of discovery, Autotech produced a document titled EZTouch File Structure in both .pdf format, which was burned onto a CD-ROM, and paper format.10 Two of Autotech’s employees signed affidavits stating that they saved the files directly from Autotech’s engineering server, which was maintained in the ordinary course of business, and saved two Microsoft Word documents to a CD-ROM. Before the mid-1990s, Autotech used a "PGI" file structure to create the EZTouch panel.11 During the mid-1990s, Autotech switched to Microsoft Word and began saving its EZTouch documents in this form. Autotech’s employees burned both the PGI documents and the Word documents onto CDs for ADC. The Autotech employees stated that no changes were made to the Word documents, their contents, or their metadata during the burning process.12 However, ADC sought to compel production of an electronic copy of the EZTouch File Structure because it wanted the metadata to determine whether the production was comprehensive.

ADC argued that Autotech failed to produce the EZTouch documents in the form in which they were ordinarily maintained because the documents at issue were converted from Word documents to .pdf.13 ADC also argued that it needed the documents in their native format, which includes metadata, because this form reveals when a document was created, when it was modified, and when it was designated confidential.14 According to ADC, the electronic Word documents (converted into .pdf form) and the hard copies were insufficient because neither contained the metadata as it was stored in Autotech’s engineering computers.15

The court considered whether Autotech’s electronic disclosures satisfied Federal Rule of Civil Procedure 34(b)(2)(E), the section that controls the production of ESI. The Federal Rules require a party to produce documents or ESI as they are kept in the usual course of business.16 A party may produce ESI in the form in which it is ordinarily maintained, or in a reasonably useable form, unless the opposing party specifies a form.17

The court pointed out that ADC did not specify the form of production and, therefore, Autotech had the choice of producing the ESI in the form in which it was ordinarily maintained or in a reasonably useable form.18 On one hand, ADC alleged that the electronic documents had been altered when converted from Word to .pdf, but offered no affidavits or other evidence to support its claim. On the other, the court pointed out, Autotech offered affidavits from two of its employees that stated no changes had been made during the conversion.19 The court explained that accepting these affidavits would end the dispute because Autotech would have complied with Federal Rule 34(b)(2)(E)(ii). If any changes were made to the documents, however, then the question before the court was whether either the hard copies or the .pdfs were "reasonably useable forms."20

Based on the arguments presented, the court held that, absent a special request for metadata and a prior order demonstrating need, Autotech’s production of paper copies and .pdf documents complied with the ordinary meaning of Rule 34.21 Citing the seminal case on the topic of metadata, the court acknowledged that "there should be a modest legal presumption in most cases that the producing party need not take special efforts to preserve or produce metadata."22

The court pointed out that ADC did not request that Autotech include metadata in its document production and admonished it because the notion of metadata came to ADC shortly after it received the hard copies of Autotech’s file structure documents.23 "[C]ourts will not compel the production of metadata when a party did not make that a part of its request."24 Accordingly, the court concluded, "ADC was the master of its production requests; it must be satisfied with what it asked for."25

Conclusion. The issue for many lawyers is whether engaging in e-discovery is worthwhile. At first glance, the notion of metadata may seem only to complicate an already complex process. However, the Sedona Conference issued a new set of Guidelines in 2007 that comment on the extent to which metadata should be preserved and produced.26 The Guidelines encourage lawyers and their clients to consider what metadata is ordinarily maintained, the potential relevance of the metadata to the dispute, and the value of reasonably accessible metadata in facilitating the parties’ review, production, and use of the information.27

Once lawyers determine that metadata will be useful to their case, Autotech teaches that they should specify a form for producing ESI. The best approach is to request that the documents be produced in their native format with metadata intact. If the requesting party does not specify a form, the Federal Rules permit the producing party to choose one. Under Rule 34, the disclosing party may produce documents in the form in which they are ordinarily maintained or, alternatively, in a reasonably usable form.28 While there is nothing deficient about documents in either of these forms, the requesting party may be missing pivotal information not apparent from the hard-copy production.

To alleviate the burdens of e-discovery, the requesting party should request the native-format metadata in a form that is searchable.29 The Sedona Conference’s 2007 Guidelines recommend that production account for "the need to produce reasonably accessible metadata that will enable the receiving party to have the same ability to access, search, and display the information as the producing party where appropriate or necessary in light of the nature of the information and the needs of the case."30 While this Guideline has not yet been adopted by courts, it may very well be adopted in the near future. In the meantime, metadata may prove to be a significant advantage during the course of litigation. Metadata may in fact be the diamond in the e-discovery rough.

1 Williams v. Sprint/United Mgmt. Co., 230 F.R.D. 640, 646 (D. Kan. 2005).

2 Fed. R. Civ. P. 26(f), advisory committee’s note to the 2006 amendments.

3 Williams, 230 F.R.D. at 648 (citing The Sedona Guidelines: Best Practice Guidelines & Commentary for Managing Information & Records in the Electronic Age, 5-6 (2002)).

4 Id. at 646 (quoting The Sedona Guidelines: Best Practice Guidelines & Commentary for Managing Information & Records in the Electronic Age, Appendix F (2002)).

5 Autotech Techs. Ltd. P’ship v., Inc., 248 F.R.D. 556, 557 n.1 (N.D. Ill. 2008) (citing Scotts Co. L.L.C. v. Liberty Mut. Ins. Co., 2007 WL 1723509, *3 n.2 (S.D. Ohio 2007)).

6 Williams, 230 F.R.D. at 646.

7 Daniel R. Murray, Timothy J. Chorvat, & Chad E. Bell, Taking a Byte Out of Discovery: How the Properties of Electronically Stored Information Have Shaped E-Discovery Rules, 41 U.C.C. L.J. 35, 38 (2008).

8 Id. at 38, 42.

9 Marcia Coyle, Where Do the Footprints of Metadata Lead?, Nat’l L.J. (Feb. 20, 2008).

10 Autotech Techs. Ltd. P’ship, 248 F.R.D. at 557.

11 Id. at 558.

12 Id.

13 Id. at 559.

14 Id. at 557.

15 Id. at 559.

16 Fed. R. Civ. P. 34(b)(2)(E)(i).

17 Fed. R. Civ. P. 34(b)(2)(E)(ii).

18 Autotech Techs. Ltd. P’ship, 248 F.R.D. at 558.

19 Id. at 559.

20 Id.

21 Id. at 560.

22 Id. (citing Williams, 230 F.R.D. at 651).

23 Id. at 559.

24 Id.

25 Id. at 560.

26 See The Sedona Guidelines: Best Practice Guidelines & Commentary for Managing Information & Records in the Electronic Age, 60 (2007).

27 Id.

28 Fed. R. Civ. P. 34(b)(2)(E)(ii).

29 Murray, Chorvat, & Bell, supra note 7, at 43.

30 The Sedona Guidelines: Best Practice Guidelines & Commentary for Managing Information & Records in the Electronic Age, 61 cmt. 12a (2007).

Erin E. Wright graduated from The Ohio State University, magna cum laude with Honors, in 2005 and from The Ohio State University Moritz College of Law in 2008. During law school, Ms. Wright served as Editor-in-Chief of I/S: A Journal of Law and Policy for the Information Society. She is currently an associate at Swanson, Martin & Bell, LLP and practices general litigation and intellectual property law.
