Modern email systems include a deleted items folder that receives email messages from a user's inbox, sent items folder or other folders when a message is to be deleted. Some email systems also include a tombstone folder that receives the content of the deleted items folder when the deleted items folder is emptied. The tombstone folder, also known as a dumpster folder, provides a means for a user to recover email messages that were removed from the deleted items folder inadvertently.
Electronic discovery (also known as E-Discovery) refers to any process in which electronic data is sought, located, secured and searched with the intent of using it as evidence in civil or criminal litigation. The data is typically sought from electronic devices such as personal computers and email and other servers. While it is possible to search for certain email messages stored in a user's mailbox, such as those stored in the inbox, sent items folder and deleted items folder, dumpster folders, which are typically not accessible to a user, cannot be searched.
Embodiments of the invention are directed to searching for email messages on a server computer. A request is received on the server computer to search for one or more email messages in one or more mailboxes on the server computer. Each of the one or more mailboxes is associated with a specific user. Each of the one or more mailboxes includes a dumpster folder. The request includes search criteria including a parameter indicating whether the dumpster folder associated with a mailbox should be searched. The dumpster folder stores one or more email messages that have been deleted from a user's mailbox.
One or more mailboxes that satisfy the search criteria in the request are identified. If the parameter indicates that the dumpster folder should be searched, the dumpster folder of each of the identified mailboxes that satisfy the search criteria is queried and any email messages in each dumpster folder that satisfy the search criteria are identified.
The details of one or more techniques are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of these techniques will be apparent from the description, drawings, and claims.
The present application is directed to systems and methods for searching the dumpster folder in a user's mailbox in an email system. In the systems and methods, each email message in the dumpster folder is indexed so that it can be located by a search. A search request includes a parameter, typically a Boolean flag, which when set permits the dumpster folder to be searched. The dumpster folder is typically searched during E-Discovery to ensure that all personal data on a computer system is obtained. The parameter is typically not set during normal use, permitting normal searches of the email folders to occur without the overhead generated by an E-Discovery search.
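By way of illustration only, the effect of the Boolean parameter can be sketched as follows. The names `Message`, `Mailbox`, `search_mailbox` and `include_dumpster`, and the folder keys, are hypothetical and do not appear in the described system; the subject-keyword match merely stands in for whatever search criteria a request carries.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Message:
    sender: str
    subject: str


@dataclass
class Mailbox:
    owner: str
    # Folder name -> messages. "dumpster" holds items removed from "deleted_items".
    folders: Dict[str, List[Message]] = field(default_factory=dict)


def search_mailbox(mailbox: Mailbox, keyword: str,
                   include_dumpster: bool = False) -> List[Tuple[str, Message]]:
    """Return (folder, message) pairs whose subject contains the keyword.

    The dumpster folder is skipped unless include_dumpster is set,
    mirroring the Boolean flag carried in the search request.
    """
    hits = []
    for folder, messages in mailbox.folders.items():
        if folder == "dumpster" and not include_dumpster:
            continue
        for msg in messages:
            if keyword.lower() in msg.subject.lower():
                hits.append((folder, msg))
    return hits


mailbox = Mailbox("jdoe", {
    "inbox": [Message("mary", "Q3 contract draft")],
    "dumpster": [Message("mary", "Contract terms (old)")],
})

normal = search_mailbox(mailbox, "contract")                             # flag not set
ediscovery = search_mailbox(mailbox, "contract", include_dumpster=True)  # flag set
```

With the flag left at its default, the search never touches the dumpster folder, so normal use does not incur the additional cost of an E-Discovery search.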
The example mailbox server 104 includes a plurality of mailboxes and each mailbox can have a plurality of folders. As shown in
In a typical email system, a user can delete an email message in the inbox folder 302, sent items folder 304 or one of the miscellaneous folders 310 by moving it to the deleted items folder 306. Typically, items remain in the deleted items folder 306 until the user or a system administrator empties the folder. Emails can be emptied from the deleted items folder 306 one at a time or in bulk. Occasionally, a user inadvertently removes one or more email messages from the deleted items folder 306 and later wishes to retrieve them. For this reason, some email systems include a tombstone folder, such as example dumpster folder 308. The example dumpster folder 308 automatically receives all email messages removed from the deleted items folder 306 and stores these email messages for a predetermined period of time, which can be adjusted by a system administrator. In example embodiments, the deleted items folder 306 may be bypassed and the dumpster folder 308 may receive items directly from other mailbox folders. The system administrator can retrieve any items from the example dumpster folder 308 at the request of a user and move those items into a user accessible folder.
In order to accommodate a search of the example dumpster folder 308, the email messages in the dumpster folder are indexed. The email messages can be indexed during a crawl mode when a content index for the mailbox is created or updated. In the crawl mode, all email messages in the dumpster folder, as well as all email messages in the other mailbox folders, are indexed with an identifier that permits the email messages to be located during a search.
Email messages are also indexed during a notifications mode. Whenever a new email message is added to a user's mailbox, whenever an existing email message is changed by the user and whenever an email message is deleted, an event is generated and the email message is indexed. A new index identifier is created for a new email message and the existing index is updated for changed or deleted email messages. Notifications occur for changed and added messages in the dumpster folder as well as for other mailbox folders. So if an email message is deleted from an email folder and transferred to the dumpster folder, indexing for the deleted email message is maintained.
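The two indexing modes described above can be sketched, under simplifying assumptions, as a single in-memory index. The class `ContentIndex`, its methods, and the event names "added", "changed" and "deleted" are hypothetical; an actual content index would be persistent and would tokenize message content rather than store raw text.

```python
class ContentIndex:
    """Toy content index keyed by a per-message index identifier."""

    def __init__(self):
        self.entries = {}  # message id -> {"folder": str, "text": str}

    def crawl(self, folders):
        """Crawl mode: (re)build the index over every folder, dumpster included."""
        self.entries.clear()
        for folder, messages in folders.items():
            for msg_id, text in messages.items():
                self.entries[msg_id] = {"folder": folder, "text": text}

    def on_event(self, kind, msg_id, folder=None, text=None):
        """Notifications mode: update the index in response to one event."""
        if kind == "added":
            # A new message gets a new index entry.
            self.entries[msg_id] = {"folder": folder, "text": text}
        elif kind == "changed":
            # An existing entry is updated in place.
            self.entries[msg_id]["text"] = text
        elif kind == "deleted":
            # The message moves to the dumpster; its entry is kept, not dropped.
            self.entries[msg_id]["folder"] = "dumpster"
        else:
            raise ValueError(f"unknown event kind: {kind}")

    def find(self, keyword):
        """Return the ids of indexed messages whose text contains the keyword."""
        return sorted(i for i, e in self.entries.items() if keyword in e["text"])


index = ContentIndex()
index.crawl({"inbox": {"m1": "contract draft"}, "dumpster": {"m2": "old contract"}})
index.on_event("added", "m3", folder="inbox", text="contract addendum")
index.on_event("deleted", "m1")
```

After the "deleted" event, message `m1` remains searchable; only its recorded location changes to the dumpster folder.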
The example API 400 is called when a mailbox search is initiated during an E-Discovery or other search. An E-Discovery search can only be done by authorized individuals, such as members of a legal department, whose access is granted by a system administrator. The system administrator or the authorized individuals may create a command string with search information using a command line application such as a cmdlet. The cmdlet permits a command name and parameter data to be entered in string form. When the cmdlet is executed, the example search API 400 is called. Because the example mailbox server 104 is organized at the mailbox level, the cmdlet may result in several API calls, each call querying a specific mailbox. Thus, multiple mailboxes may be queried during an E-Discovery search. Alternatively, command and parameter data may be entered by authorized individuals via a graphical user interface.
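As a rough sketch of how one command string may expand into several per-mailbox API calls, consider the following. The command name `search-mail`, its switches, and the functions `parse_command` and `plan_api_calls` are invented for illustration and are not the syntax of any actual cmdlet or of the example API 400.

```python
import shlex


def parse_command(command):
    """Split a command string into a command name and a parameter dict."""
    tokens = shlex.split(command)
    name, args = tokens[0], tokens[1:]
    params = {"mailboxes": [], "query": "", "include_dumpster": False}
    i = 0
    while i < len(args):
        if args[i] == "-mailboxes":
            params["mailboxes"] = args[i + 1].split(",")
            i += 2
        elif args[i] == "-query":
            params["query"] = args[i + 1]
            i += 2
        elif args[i] == "-include-dumpster":
            params["include_dumpster"] = True  # Boolean switch, takes no value
            i += 1
        else:
            raise ValueError(f"unknown parameter: {args[i]}")
    return name, params


def plan_api_calls(params):
    """Produce one search-API call description per mailbox named in the command."""
    return [
        {
            "mailbox": mailbox,
            "query": params["query"],
            "include_dumpster": params["include_dumpster"],
        }
        for mailbox in params["mailboxes"]
    ]


name, params = parse_command(
    'search-mail -mailboxes jdoe,msmith -query "contract terms" -include-dumpster'
)
calls = plan_api_calls(params)  # two calls, one for each mailbox
```

Each element of `calls` corresponds to one invocation of the search API against a single mailbox, which reflects the mailbox-level organization of the server.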
When searching for E-Discovery information, it is often necessary to do string searches for specific dates. An example E-Discovery request may be to obtain all emails from John Doe to Mary Smith from Jun. 21, 2004 to Sep. 30, 2004. Some email systems represent and index date fields numerically, making it difficult to do string searches on these fields. However, it is possible to represent numbers by a set of strings whose lengths reflect the place value of digits in the number. Representing numbers in this manner permits a more efficient numerical search.
A date may be thought of as three numeric properties: year, month and day. For example, for the date May 11, 2008, the year 2008 can be represented by three hexadecimal digits (nibbles), 7D8, whose value equals 2008. Similarly, the month 05 can be represented as 5 in hexadecimal and the day 11 can be represented as B in hexadecimal.
Further, each nibble can be represented as a string comprising a prefix string and a string of letters corresponding to the value of each nibble. For example, the nibble 7 can be represented by the prefix string “d3a2t1e” plus “qqqqqqq”, where the letter “q” appears seven times in a string, corresponding to the value 7 in the nibble. Similarly, the nibble D can be represented by the prefix string “d3a2t1e” plus “qqqqqqqqqqqqq”, where the letter “q” appears 13 times in a string, corresponding to the hexadecimal value D in the nibble, and the nibble 8 can be represented by the prefix string “d3a2t1e” plus “qqqqqqqq”, where the letter “q” appears eight times in a string, corresponding to the value 8 in the nibble. Likewise, the nibble B, which represents the day 11, can be represented by the prefix string “d3a2t1e” plus “qqqqqqqqqqq”, where the letter “q” appears 11 times in a string, corresponding to the hexadecimal value B in the nibble. In this manner, a search for the year 2008 can be done by performing matches on the strings “d3a2t1eqqqqqqq”, “d3a2t1eqqqqqqqqqqqqq” and “d3a2t1eqqqqqqqq”. A different year may have a different prefix string.
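The nibble-to-string encoding can be sketched as follows. The function names are hypothetical, and the sketch assumes a single prefix for every nibble; it does not attempt to distinguish the position of a nibble within a number, which an actual index would need to do, for example through differing prefixes.

```python
PREFIX = "d3a2t1e"  # prefix taken from the example in the text


def nibbles(value):
    """Return the hexadecimal digits of a non-negative integer, most significant first."""
    return [int(digit, 16) for digit in format(value, "X")]


def encode_nibble(nibble, prefix=PREFIX, letter="q"):
    """Encode one nibble as the prefix followed by `nibble` repetitions of the letter."""
    if not 0 <= nibble <= 15:
        raise ValueError("a nibble must be in the range 0..15")
    return prefix + letter * nibble


def encode_number(value, prefix=PREFIX):
    """Encode each nibble of a number as a searchable string."""
    return [encode_nibble(n, prefix) for n in nibbles(value)]


seven = encode_nibble(7)       # "d3a2t1e" followed by seven "q" characters
day = encode_number(11)        # day 11 is the single nibble B
year_terms = encode_number(2008)
```

A string search for a number then reduces to matching each of the returned terms against the indexed field.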
The example flowchart in
With reference to
The computing device 104 may have additional features or functionality. For example, the computing device 104 may also include additional data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape. Such additional storage is illustrated in
The computing device 104 may also contain communication connections 618 that allow the device to communicate with other computing devices 620, such as over a network in a distributed computing environment, for example, an intranet or the Internet. Communication connection 618 is one example of communication media. Communication media may typically be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. The term computer readable media as used herein includes both storage media and communication media.
The various embodiments described above are provided by way of illustration only and should not be construed as limiting. Various modifications and changes may be made to the embodiments described above without departing from the true spirit and scope of the disclosure.