System and method for determining document version genealogy

Description

FIELD OF INVENTION

The present invention generally relates to the field of electronic document management where multiple versions of one or more documents have a complex genealogy.

BACKGROUND

In many business situations it is common for multiple versions of one or more documents to be created. Some businesses use tools such as Document Management Systems (DMS) or other content repositories to try to track and store each version of the document that is created. Even when such systems are in use, versions tend to be created and/or stored in locations outside the DMS when copies of the document are sent by email, received from 3rd party contributors, copied for offline editing, etc. This problem is most acute for document formats that encourage editing (such as Microsoft™ Office™ format documents) as opposed to document formats which are largely used for presentation of a final copy (such as Adobe™ PDF documents).

The problem facing a document author or collaborator is often this: having received or found a new version of a document, how do they decide what to do with it? Was the version of a document that has arrived in an email message created by editing the most recent version stored in the DMS? Was it created by editing an older version of the document? Is it just a duplicate of some other version of the document? Depending on the answers to these questions, different actions are required—for instance in the first case of the document being created by editing the latest DMS version it is likely enough just to save the received version as a new version into the DMS. In the second case it is likely that the changes made to the received version need to be merged into the latest DMS version, while in the last case no action at all may be required.

In these circumstances, a software tool capable of determining the genealogical relationships between document versions automatically would provide great value as it would provide the document author/collaborator with relevant information allowing them to make a proper decision on the action needed when new versions of a document are located or received. In order to be useful in the situations described above, the tool must be capable of determining genealogical relationships based on the content of the documents only, as other meta-information such as DMS version information, file names, file timestamps, etc., may not be present or may be modified in some or all versions located outside the DMS—for instance copied files may have altered names or timestamps and files sent via email may have lost their original timestamp.

A tool capable of determining document genealogy from content only would also be useful in the context of document forensics—in cases where large collections of documents and versions of documents have been collected and investigators wish to piece together the history of the document or documents involved.

SUMMARY

One embodiment of the invention applies to word processing documents in the RTF, DOC, DOCX and DOCM formats, which are most frequently edited using Microsoft Word™. Recent versions of Microsoft Word (since at least Word 2003) have included a feature where a random integer of up to 4 bytes in length, named a Revision Sequence ID or RSID, is added to the document for every editing session that the document undergoes. Microsoft Word itself uses this information to help in the process of merging documents, for example to determine whether a change noted between two versions was an insertion by author ‘A’ or a deletion by author ‘B’. However, the list of RSIDs also provides information that can be used to accurately recreate the genealogy of a set of documents.

The storage of RSIDs in the different document file formats (RTF, DOC, DOCX, DOCM, etc.) is specified in the freely available Microsoft documentation for these file formats. Therefore, one embodiment of the invention includes a module that examines a set of RTF, DOC, DOCX, DOCM or other files provided as input and extracts the RSIDs for those files (101). This embodiment builds a data structure that tabulates an identifier for each file with that file's extracted RSIDs (104). This data structure is then used by the rest of the embodiment of the invention.
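As an illustration of such an extraction module, the following is a minimal sketch for the DOCX and DOCM formats only. It assumes the RSIDs appear as 8-digit hexadecimal values in the `w:rsids` list of `word/settings.xml` and in `w:rsid…` attributes in `word/document.xml`, and that the package uses the conventional `w:` namespace prefix. The function name `extract_rsids_docx` is hypothetical, and a production module would use a namespace-aware XML parser and would also cover the RTF and DOC formats.

```python
import re
import zipfile

# <w:rsidRoot w:val="00A1B2C3"/> and <w:rsid w:val="00A1B2C3"/> in settings.xml
_RSID_ELEMENT = re.compile(r'<w:rsid(?:Root)?\s+w:val="([0-9A-Fa-f]{8})"')
# Attributes such as w:rsidR="00A1B2C3" or w:rsidRDefault="00A1B2C3" in document.xml
_RSID_ATTRIBUTE = re.compile(r'\sw:rsid\w*="([0-9A-Fa-f]{8})"')

# Assumption: only these two package parts are scanned in this sketch.
_PARTS = ("word/settings.xml", "word/document.xml")


def extract_rsids_docx(source):
    """Return the set of RSIDs (as integers) found in a DOCX/DOCM package.

    `source` may be a file path or a binary file-like object.
    """
    rsids = set()
    with zipfile.ZipFile(source) as package:
        for name in package.namelist():
            if name not in _PARTS:
                continue
            xml = package.read(name).decode("utf-8", errors="replace")
            for pattern in (_RSID_ELEMENT, _RSID_ATTRIBUTE):
                rsids.update(int(value, 16) for value in pattern.findall(xml))
    return rsids
```

The returned set can be stored against a file identifier to form the tabulated data structure (104) described above.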

The use of RSIDs within each document format is actually quite complicated, but for the purposes of determining document version genealogy, all that is required is the complete set of all RSIDs present in the document version of interest. Although the specifications for the file formats seem to allow for RSIDs to take an integer value of 4 bytes length (i.e. between 0 and 2^32-1), in practice Microsoft Word only seems to allocate values of up to 3 bytes in length (i.e. between 0 and 2^24-1). This may be an implementation detail that could change in future versions of Microsoft Word and in any case the range allowed for the RSID values does not impact the methods described here other than the size of the RSID data structure and the speed of execution of embodiments of the invention.

In practice, Microsoft Word may assign more than one new RSID for each editing session, tests indicating that one is added when the document is opened and another each time it is saved to disk. Note that if the document is opened, but not modified or saved, even if new RSIDs are created within the memory of the Microsoft Word application, they will not be stored to the document file (as it is not saved) and are thus discarded without trace when the document is closed. The fact that the number of RSIDs added to the document per editing session may be greater than one does not affect the techniques described here.

Given that new RSIDs are added to the document each time it is modified and saved, it follows that if two document versions A and B are encountered, the invention can determine that version B is an ancestor of version A when the following two conditions hold true:

- There are at least some RSID values in common between the two versions; and
- The set of RSIDs derived from version B is a proper subset of the set of RSIDs derived from version A.

Based upon this principle and other similar derivations the invention can determine the genealogy of a set of documents from their RSIDs.
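The principle can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of Microsoft Word's actual behavior: the helper `editing_session` is hypothetical, it adds two random 3-byte RSIDs per saved session (mirroring the open-and-save observation above), and RSID sets are represented as Python frozensets.

```python
import random


def editing_session(rsids, rng, new_ids=2):
    """Return the RSID set after one saved editing session (hypothetical model)."""
    result = set(rsids)
    while len(result) < len(rsids) + new_ids:
        result.add(rng.randrange(1, 2**24))  # 3-byte values, as observed in practice
    return frozenset(result)


def is_ancestor(b, a):
    """True when version B is an ancestor of version A under the two conditions."""
    return bool(a & b) and b < a  # shared RSIDs, and B a proper subset of A


rng = random.Random(0)
v1 = editing_session(frozenset(), rng)   # first saved version
v2 = editing_session(v1, rng)            # edited from v1
v3 = editing_session(v2, rng)            # edited from v2
branch = editing_session(v1, rng)        # separately edited copy of v1

print(is_ancestor(v1, v3))      # v1 is an ancestor of v3
print(is_ancestor(v3, v1))      # the reverse does not hold
print(is_ancestor(v1, branch))  # v1 is also an ancestor of the branch
```

Barring an unlikely random collision, `branch` and `v3` are not ancestors of one another, since each holds RSIDs the other lacks.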

DESCRIPTION OF THE FIGURES

The headings provided herein are for convenience only and do not necessarily affect the scope or meaning of the claimed invention. In the drawings, the same reference numbers and any acronyms identify elements or acts with the same or similar structure or functionality for ease of understanding and convenience. To easily identify the discussion of any particular element or act, the most significant digit or digits in a reference number refer to the Figure number in which that element is first introduced (e.g., element 204 is first introduced and discussed with respect to FIG. 2).

FIG. 1: Example system architecture for an embodiment of the invention.

FIG. 2: Example flow chart for an embodiment of the invention.

DETAILED DESCRIPTION

Various examples of the invention will now be described. The following description provides specific details for a thorough understanding and enabling description of these examples. One skilled in the relevant art will understand, however, that the invention may be practiced without many of these details. Likewise, one skilled in the relevant art will also understand that the invention can include many other features not described in detail herein. Additionally, some well-known structures or functions may not be shown or described in detail below, so as to avoid unnecessarily obscuring the relevant description. The terminology used below is to be interpreted in its broadest reasonable manner, even though it is being used in conjunction with a detailed description of certain specific examples of the invention. Indeed, certain terms may even be emphasized below; however, any terminology intended to be interpreted in any restricted manner will be overtly and specifically defined as such in this Detailed Description section.

Consider document versions A, B, C, etc. Let the set of RSIDs associated with each version be R_A, R_B, R_C, etc. Let the mathematical symbol ‘<’ be used to denote ancestry, so that A<B can be read as ‘Version A is an ancestor of Version B’. Let T(A,B) indicate that versions A and B belong to the same genealogical version tree. Let E(A,B) indicate that the versions A and B have equal RSIDs and therefore cannot be distinguished by this methodology. Therefore we have the following logical conditions:

R_A≡R_B→E(A,B) (1)
R_A∩R_B≠∅→T(A,B) (2)
R_A⊂R_B→A<B (3)
T(A,B) and R_A⊈R_B and R_B⊈R_A→∃C:C<A,C<B,R_C=R_A∩R_B (4)

These four logical equations can be interpreted as

- (1) If two versions A and B have identical sets of RSIDs then they are equivalent from the point of view of determining genealogy.
- (2) If two versions A and B have any RSIDs in common then they belong to the same genealogical version tree. Conversely, if two versions A and B have no RSIDs in common then they belong in different version trees.
- (3) If the RSIDs of version A are a proper subset (i.e. not including equality) of the RSIDs of version B then version A is an ancestor of version B.
- (4) If two versions A and B belong in the same tree, but neither has RSIDs which are a subset of the other then there must have existed a version C that is a common ancestor of both versions A and B, i.e. the RSIDs of C are a subset of those of both A and B.
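The four rules map directly onto set operations. The following minimal sketch assumes RSID sets are held as Python sets; the function names are illustrative only, and rule (4) follows the interpretation given in item (4) above (neither set is a subset of the other).

```python
def equivalent(r_a, r_b):
    """Rule (1): identical RSID sets cannot be distinguished."""
    return r_a == r_b


def same_tree(r_a, r_b):
    """Rule (2): any shared RSID places both versions in the same tree."""
    return bool(r_a & r_b)


def is_ancestor(r_a, r_b):
    """Rule (3): A < B when R_A is a proper subset of R_B."""
    return r_a < r_b


def inferred_common_ancestor(r_a, r_b):
    """Rule (4): RSID set of the inferred common ancestor C, or None."""
    if same_tree(r_a, r_b) and not r_a <= r_b and not r_b <= r_a:
        return r_a & r_b
    return None


r_a = {1, 2, 3}
r_b = {1, 2, 4}
print(same_tree(r_a, r_b))                 # True
print(is_ancestor({1, 2}, r_a))            # True
print(inferred_common_ancestor(r_a, r_b))  # {1, 2}
```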

The preliminary steps before the construction of a genealogical tree for a set of documents proceed as follows:

- The contents of each document to be considered are passed to the RSID extraction module (there may be multiple RSID extraction modules to handle different source file formats). (101). The RSIDs extracted by the module are stored in a list in a data structure along with meta-information about the file they were extracted from (for instance the source location of that file). (104). The data structure is stored in computer memory and a mass storage device (102).
- In an alternative embodiment, equivalent versions are detected and removed from consideration by the use of equation 1. Documents with equal RSID sets may be detected by comparing the RSID set of each document under consideration with the RSID set of each other document, or, more efficiently, by using a hash table or other similar optimized data structure. If equivalent versions are detected the duplicates may simply be removed with only one of the set of equivalent versions being retained. Alternatively information about the duplicates may be attached to the data structure representing the version that is retained for later use.
- The documents under consideration are grouped into related groups by the use of equation 2. Initially the collection of related groups is empty. Each document is tested against each group in the collection of related groups and if it has any RSIDs in common with any document in the group, it is added to that group. If the document fails to match any existing group, it is used to create a new group. Membership in a group can be indicated in the data structure by an entry that when null indicates no membership and when not null is a label value the module creates to be associated with the group.
  - In some circumstances the documents supplied may be known to all belong to a single group, in which case this step can be skipped.
  - Use of standard document templates within an organization may lead to all documents in the organization sharing a certain number of RSIDs, namely those that they inherit from the standard template. In that case the condition for belonging to the same version tree (and hence being added to the same related group) may be altered to require a number of matching RSIDs that is greater than some non-zero threshold. In an alternative embodiment, the RSID values associated with the template may be ignored for purposes of applying the genealogy logic presented above. In other words, the fact that two documents share only the RSID values that exist in a document template provides no useful information as to the genealogy of the document versions.
  - Other information may be used to assist related group construction, for instance document fingerprinting techniques such as Robust Winnowing (http://theory.stanford.edu/~aiken/publications/papers/sigmod03.pdf) may be used to help divide the original set of all input documents into groups.
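The duplicate removal and grouping steps above can be sketched as follows. This is an illustration under assumptions rather than a definitive implementation: documents are given as a dictionary from a file identifier to its RSID set, and the names `deduplicate`, `group_related`, `template_rsids` and `min_shared` are hypothetical. As in the description, a document joins the first group it matches; merging two groups that a later document happens to bridge is not covered.

```python
def deduplicate(docs):
    """Keep one document per distinct RSID set (equation 1), using a hash table.

    Returns (kept, duplicates): kept maps a retained name to its RSID set and
    duplicates maps that name to the names of its equivalent versions.
    """
    by_rsids = {}
    for name, rsids in docs.items():
        by_rsids.setdefault(frozenset(rsids), []).append(name)
    kept = {names[0]: key for key, names in by_rsids.items()}
    duplicates = {names[0]: names[1:] for names in by_rsids.values()}
    return kept, duplicates


def group_related(docs, template_rsids=frozenset(), min_shared=1):
    """Group documents sharing at least min_shared non-template RSIDs (equation 2)."""
    groups = []
    for name, rsids in docs.items():
        own = frozenset(rsids) - template_rsids  # ignore RSIDs inherited from a template
        for group in groups:
            if any(len(own & other) >= min_shared for other in group.values()):
                group[name] = own
                break
        else:
            groups.append({name: own})  # no match: start a new related group
    return [sorted(group) for group in groups]


docs = {
    "a.docx": {1, 2, 10},
    "a_copy.docx": {1, 2, 10},
    "b.docx": {1, 2, 10, 11},
    "c.docx": {1, 2, 20},
}
kept, duplicates = deduplicate(docs)
print(duplicates["a.docx"])                                   # ['a_copy.docx']
print(group_related(kept, template_rsids=frozenset({1, 2})))  # [['a.docx', 'b.docx'], ['c.docx']]
```

Without the `template_rsids` argument, all three retained documents fall into one group because they share the template RSIDs 1 and 2.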

Once a set of related groups has been constructed, a genealogical tree is determined for each group which has more than one member document. This is accomplished by the logic module (103) that applies logic rules to the extracted RSIDs. Note that all members of a given related document group are already determined to be versions of the same document, so they will be referred to as versions throughout the remaining description. This step is the primary step of the invention and proceeds as follows:

- 1. Each version in the group, A, is tested against each other version in the group, B. (203). If the RSID set of A is a proper subset of the RSID set of B, then by equation 3, A is an ancestor of B. (204). In this case, version A is added to a list of ‘potential parents’ stored in the data structure representing version B. (205). After this process is complete, each document version has a list of potential parent versions.
  - There can be no loops in this structure because it is derived from the RSIDs being proper subsets—i.e. consider the list of versions that can be reached from a version A by following each of its possible parents, then each of the possible parents of those versions, etc. It is impossible for A itself to appear in this list.
- 2. Each version A in the group that has not had its parent version determined already is tested to determine if it has only one version in its list of potential parents. (206). If there is one and only one version P in the list of potential parents then the following steps are taken:
  - Version A's parent version is now determined to be version P. (207). Version P is stored as the parent version in the data structure associated with version A. Version A is added to the data structure associated with version P as one of P's child versions.
  - Each other version C in the group is checked to see if both A and P appear in C's list of potential parents. If both appear then P is removed from the list.
- 3. Step 2 is repeated until it fails to determine any more parent versions.
- 4. The number of versions in the group without a determined parent version is counted. If the count is 1 then the determination of the genealogical tree is complete and the algorithm terminates (the version without a determined parent is the version that is at the root of the tree). (208) Note that it is impossible to find zero versions with no parent determined, as this would involve a loop in the parent-child relationship.
- 5. At this stage, equation 4 is used to infer the existence of missing versions that must have once existed but are not included in the related group because they have been lost, deleted or simply not presented to the algorithm.
  - The list of versions with no parent determined is calculated and each possible pair of versions A, B from this list is considered. Equation 4 would allow us to infer the existence of a missing version C from each pair A, B, but instead the algorithm at this stage only deduces the existence of a single missing version C from all the possible C's such that the number of RSIDs in R_C is a maximum. Once this new version C has been constructed it is added to the related group.
- 6. Newly created version C is tested against each other version in the group, D. If R_D⊂R_C then D is added to the potential parent list of C. If R_C⊂R_D then C is added to the potential parent list of D. This updates the potential parent lists of all versions to be consistent with the state that would have been achieved if C was part of the related group at step 1.
- 7. The algorithm now returns to step 2.
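Steps 1 to 7 can be sketched as follows. This is a minimal illustration under stated assumptions, not the definitive implementation: the group is given as a dictionary from version name to RSID set, equivalent versions have already been removed, and the names `build_tree` and `missing-N` are hypothetical. Two guards are added that the description does not discuss: only pairs where neither RSID set contains the other are considered in step 5, and an intersection already present in the group is not synthesized again. If no candidate remains, the sketch stops and leaves more than one version without a parent.

```python
from itertools import combinations


def build_tree(versions):
    """Return (parent, rsids) for one related group of versions."""
    rsids = {name: frozenset(r) for name, r in versions.items()}
    # Step 1: every version with a proper-subset RSID set is a potential parent.
    potential = {b: {a for a in rsids if rsids[a] < rsids[b]} for b in rsids}
    parent = {}
    synthesized = 0
    while True:
        # Steps 2 and 3: fix parents wherever exactly one candidate remains.
        progress = True
        while progress:
            progress = False
            for a in rsids:
                if a not in parent and len(potential[a]) == 1:
                    (p,) = potential[a]
                    parent[a] = p
                    progress = True
                    for c in rsids:
                        # P is only an indirect ancestor of C if A is also one.
                        if c != a and a in potential[c] and p in potential[c]:
                            potential[c].discard(p)
        # Step 4: finished when a single root remains.
        orphans = [v for v in rsids if v not in parent]
        if len(orphans) <= 1:
            break
        # Step 5: infer one missing version with the largest shared RSID set.
        known = set(rsids.values())
        candidates = [
            rsids[a] & rsids[b]
            for a, b in combinations(orphans, 2)
            if not rsids[a] <= rsids[b]
            and not rsids[b] <= rsids[a]
            and (rsids[a] & rsids[b]) not in known
        ]
        if not candidates:
            break  # assumption: leave the remaining versions unresolved
        best = max(candidates, key=len)
        synthesized += 1
        c = f"missing-{synthesized}"
        rsids[c] = best
        # Step 6: bring the potential parent lists up to date for C.
        potential[c] = {d for d in rsids if rsids[d] < best}
        for d in rsids:
            if best < rsids[d]:
                potential[d].add(c)
        # Step 7: loop back to step 2.
    return parent, rsids


parent, rsids = build_tree({"a": {1, 2, 3}, "b": {1, 2, 4}, "c": {1, 5}})
print(parent)
```

In this example no supplied version is an ancestor of another, so the sketch synthesizes `missing-1` with RSIDs {1, 2} as the parent of `a` and `b`, then `missing-2` with RSIDs {1} as the parent of `c` and `missing-1`.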

At the end of this procedure, there will only be a single version remaining (the root version) with no parent version determined. The procedure may have created several ‘missing versions’ where it can determine that two versions are related to each other as siblings but that their common ancestor has not been presented to the algorithm. The choice in step 5 of creating a single synthesized missing version such that the number of RSIDs in R_C is at a maximum is important, as it ensures that the fewest children are attached to each synthesized version and thus that the most detailed tree possible is generated. Constructing a synthesized missing version from the minimum number of intersecting RSIDs would instead lead to a tree where many child versions attached themselves to that new version, making the tree very wide but less deep and containing less information regarding detailed ancestry.

In yet another embodiment of the invention, the system is adapted to rely on codes extracted from the content of the versions themselves. This would be useful in situations where RSIDs are not available, for example, for text documents extracted from scanned data and the like. In this embodiment, numerical values called fingerprints are extracted from each document. The relative distance in value between fingerprints can provide an indication of the relative differences in the documents. By means of these distances, a relative genealogy of the document versions can be determined automatically. String matching algorithms can be used to identify identical sections of the documents. One logical rule in this embodiment is that two versions that have a high number of identical strings are more likely to be closely related than two with fewer. The relative distance between document versions can be used to determine a hierarchy that is the expected genealogy of the document versions.
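One way such fingerprints and distances might be computed is sketched below. The specific choices are assumptions made for illustration and are not specified by this description: fingerprints are CRC-32 hashes of every 5-character window of the normalized text (a simplification that keeps all windows rather than selecting a subset as winnowing does), and the distance is the Jaccard distance between fingerprint sets. The function names are hypothetical.

```python
import zlib


def fingerprints(text, k=5):
    """Hash every k-character window of the whitespace-normalized, lowercased text."""
    normalized = " ".join(text.lower().split())
    windows = range(max(1, len(normalized) - k + 1))
    return {zlib.crc32(normalized[i:i + k].encode("utf-8")) for i in windows}


def distance(fp_a, fp_b):
    """Jaccard distance: 0.0 for identical fingerprint sets, 1.0 for disjoint ones."""
    union = fp_a | fp_b
    if not union:
        return 0.0
    return 1.0 - len(fp_a & fp_b) / len(union)


draft = "The supplier shall deliver the goods within thirty days."
revised = "The supplier shall deliver the goods within sixty days."
unrelated = "Quarterly revenue grew in every region."

near = distance(fingerprints(draft), fingerprints(revised))
far = distance(fingerprints(draft), fingerprints(unrelated))
print(near < far)  # the lightly edited version is closer to the draft
```

Pairwise distances of this kind could then feed a hierarchical arrangement of the versions, with closely matching versions placed nearer to each other in the expected genealogy.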

Operating Environment:

The system and method described herein can be executed using a computer system, generally comprised of a central processing unit (CPU) that is operatively connected to a memory device, data input and output circuitry (I/O) and computer data network communication circuitry. A video display device may be operatively connected through the I/O circuitry to the CPU. Components that are operatively connected to the CPU using the I/O circuitry include microphones, for digitally recording sound, and video camera, for digitally recording images or video. Audio and video may be recorded simultaneously as an audio visual recording. The I/O circuitry can also be operatively connected to an audio loudspeaker in order to render digital audio data into audible sound. Audio and video may be rendered through the loudspeaker and display device separately or in combination. Computer code executed by the CPU can take data received by the data communication circuitry and store it in the memory device. In addition, the CPU can take data from the I/O circuitry and store it in the memory device. Further, the CPU can take data from a memory device and output it through the I/O circuitry or the data communication circuitry. The data stored in memory may be further recalled from the memory device, further processed or modified by the CPU in the manner described herein and restored in the same memory device or a different memory device operatively connected to the CPU including by means of the data network circuitry. The memory device can be any kind of data storage circuit or magnetic storage or optical device, including a hard disk, optical disk or solid state memory.

The remote computer may be a laptop or desktop type of personal computer. It can also be a cell phone, smart phone or other handheld device, including a tablet. The precise form factor of the user's computer does not limit the claimed invention. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, personal computers, server computers, hand-held, laptop or mobile computer or communications devices such as cell phones and PDA's, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.

Those skilled in the relevant art will appreciate that the invention can be practiced with other communications, data processing, or computer system configurations, including: wireless devices, Internet appliances, hand-held devices (including personal digital assistants (PDAs)), wearable computers, all manner of cellular or mobile phones, multi-processor systems, microprocessor-based or programmable consumer electronics, set-top boxes, network PCs, mini-computers, mainframe computers, and the like. Indeed, the terms “computer,” “server,” and the like are used interchangeably herein, and may refer to any of the above devices and systems.

The computer can display on the display screen operatively connected to the I/O circuitry the appearance of a user interface. Various shapes, text and other graphical forms are displayed on the screen as a result of the computer generating data that causes the pixels comprising the display screen to take on various colors and shades. The user interface also displays a graphical object referred to in the art as a cursor. The object's location on the display indicates to the user a selection of another object on the screen. The cursor may be moved by the user by means of another device connected by I/O circuitry to the computer. This device detects certain physical motions of the user, for example, the position of the hand on a flat surface or the position of a finger on a flat surface. Such devices may be referred to in the art as a mouse or a track pad. In some embodiments, the display screen itself can act as a trackpad by sensing the presence and position of one or more fingers on the surface of the display screen. When the cursor is located over a graphical object that appears to be a button or switch, the user can actuate the button or switch by engaging a physical switch on the mouse or trackpad or computer device or tapping the trackpad or touch sensitive display. When the computer detects that the physical switch has been engaged (or that the tapping of the track pad or touch sensitive screen has occurred), it takes the apparent location of the cursor (or in the case of a touch sensitive screen, the detected position of the finger) on the screen and executes the process associated with that location. As an example, not intended to limit the breadth of the disclosed invention, a graphical object that appears to be a 2 dimensional box with the word “enter” within it may be displayed on the screen. 
If the computer detects that the switch has been engaged while the cursor location (or finger location for a touch sensitive screen) was within the boundaries of a graphical object, for example, the displayed box, the computer will execute the process associated with the “enter” command. In this way, graphical objects on the screen create a user interface that permits the user to control the processes operating on the computer.

The system may also be comprised of a central server that is connected by a data network to a user's computer. The central server may be comprised of one or more computers connected to one or more mass storage devices. The precise architecture of the central server does not limit the claimed invention. In addition, the data network may operate with several levels, such that the user's computer is connected through a firewall to one server, which routes communications to another server that executes the disclosed methods. The precise details of the data network architecture do not limit the claimed invention.

A server may be a computer comprised of a central processing unit with a mass storage device and a network connection. In addition a server can include multiple such computers connected together with a data network or other data transfer connection, or multiple computers on a network with network accessed storage, in a manner that provides such functionality as a group. Practitioners of ordinary skill will recognize that functions that are accomplished on one server may be partitioned and accomplished on multiple servers that are operatively connected by a computer network by means of appropriate inter process communication. Practitioners of ordinary skill will recognize that the invention may be executed on one or more computer processors that are linked using a data network, including, for example, the Internet. In another embodiment, different steps of the process can be executed by one or more computers and storage devices that are geographically separated but connected by a data network in a manner so that they operate together to execute the process steps.

In one embodiment, a user's computer can run an application that causes the user's computer to transmit a stream of one or more data packets across a data network to a second computer, referred to here as a server. The server, in turn, may be connected to one or more mass data storage devices where the database is stored. A data message and data upload or download can be delivered over the Internet using typical protocols, including TCP/IP, HTTP, TCP, UDP, SMTP, RPC, FTP or other kinds of data communication protocols that permit processes running on two remote computers to exchange information by means of digital network communication.

As a result a data message can be one or more data packets transmitted from or received by a computer containing a destination network address, a destination process or application identifier, and data values that can be parsed at the destination computer located at the destination network address by the destination process in order that the relevant data values are extracted and used by the destination process.

The server can execute a program that receives the transmitted packets and interprets them in order to extract database query information. The server can then execute the remaining steps of the invention by means of accessing the mass storage devices to derive the desired result of the query. Alternatively, the server can transmit the query information to another computer that is connected to the mass storage devices, and that computer can execute the invention to derive the desired result. The result can then be transmitted back to the user's computer by means of another stream of one or more data packets appropriately addressed to the user's computer.

In addition, the user's computer may obtain data from the server that is considered a website, that is, a collection of data files that when retrieved by the user's computer and rendered by a program running on the user's computer, displays on the display screen of the user's computer text, images, video and in some cases outputs audio.

The access of the website can be by means of a client program running on a local computer that is connected over a computer network accessing a secure or public page on the server using an Internet browser or by means of running a dedicated application that interacts with the server, sometimes referred to as an “app.” The data messages may comprise a data file that may be an HTML document (or other hypertext formatted document file), commands sent between the remote computer and the server and a web-browser program or app running on the remote computer that interacts with the data received from the server. The command can be a hyper-link that causes the browser to request a new HTML document from another remote data network address location. The HTML can also have references that result in other code modules being called up and executed, for example, Flash, scripts or other code. The HTML file may also have code embedded in the file that is executed by the client program as an interpreter, in one embodiment, Javascript. As a result a data message can be a data packet transmitted from or received by a computer containing a destination network address, a destination process or application identifier, and data values or program code that can be parsed at the destination computer located at the destination network address by the destination application in order that the relevant data values or program code are extracted and used by the destination application.

Computer program logic implementing all or part of the functionality previously described herein may be embodied in various forms, including, but in no way limited to, a source code form, a computer executable form, and various intermediate forms (e.g., forms generated by an assembler, compiler, linker, or locator). Source code may include a series of computer program instructions implemented in any of various programming languages (e.g., an object code, an assembly language, or a high-level language such as Javascript, C, C++, JAVA, or HTML or scripting languages that are executed by Internet web browsers) for use with various operating systems or operating environments. The source code may define and use various data structures and communication messages. The source code may be in a computer executable form (e.g., via an interpreter), or the source code may be converted (e.g., via a translator, assembler, or compiler) into a computer executable form.

The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types. The computer program and data may be fixed in any form (e.g., source code form, computer executable form, or an intermediate form) either permanently or transitorily in a tangible storage medium, such as a semiconductor memory device (e.g., a RAM, ROM, PROM, EEPROM, or Flash-Programmable RAM), a magnetic memory device (e.g., a diskette or fixed hard disk), an optical memory device (e.g., a CD-ROM or DVD), a PC card (e.g., PCMCIA card), or other memory device. The computer program and data may be fixed in any form in a signal that is transmittable to a computer using any of various communication technologies, including, but in no way limited to, analog technologies, digital technologies, optical technologies, wireless technologies, networking technologies, and internetworking technologies. The computer program and data may be distributed in any form as a removable storage medium with accompanying printed or electronic documentation (e.g., shrink wrapped software or a magnetic tape), preloaded with a computer system (e.g., on system ROM or fixed disk), or distributed from a server or electronic bulletin board over the communication system (e.g., the Internet or World Wide Web.)

It is appreciated that any of the software components of the present invention may, if desired, be implemented in ROM (read-only memory) form. The software components may, generally, be implemented in hardware, if desired, using conventional techniques. In some instances, especially where a mobile computing device is used to access web content through the network (e.g., when a 3G or an LTE service of a mobile phone is used to connect to the network), the network may be any type of cellular, IP-based or converged telecommunications network, including but not limited to Global System for Mobile Communications (GSM), Time Division Multiple Access (TDMA), Code Division Multiple Access (CDMA), Orthogonal Frequency Division Multiple Access (OFDM), General Packet Radio Service (GPRS), Enhanced Data GSM Environment (EDGE), Advanced Mobile Phone System (AMPS), Worldwide Interoperability for Microwave Access (WiMAX), Universal Mobile Telecommunications System (UMTS), Evolution-Data Optimized (EVDO), Long Term Evolution (LTE), Ultra Mobile Broadband (UMB), or Voice over Internet Protocol (VoIP), Unlicensed Mobile Access (UMA).

The described embodiments of the invention are intended to be exemplary and numerous variations and modifications will be apparent to those skilled in the art. All such variations and modifications are intended to be within the scope of the present invention as defined in the appended claims. Although the present invention has been described and illustrated in detail, it is to be clearly understood that the same is by way of illustration and example only, and is not to be taken by way of limitation. It is appreciated that various features of the invention which are, for clarity, described in the context of separate embodiments may also be provided in combination in a single embodiment. Conversely, various features of the invention which are, for brevity, described in the context of a single embodiment may also be provided separately or in any suitable combination. It is appreciated that the particular embodiment described in the specification is intended only to provide an extremely detailed disclosure of the present invention and is not intended to be limiting.

It should be noted that the flow diagrams are used herein to demonstrate various aspects of the invention, and should not be construed to limit the present invention to any particular logic flow or logic implementation. The described logic may be partitioned into different logic blocks (e.g., programs, modules, functions, or subroutines) without changing the overall results or otherwise departing from the true scope of the invention. Oftentimes, logic elements may be added, modified, omitted, performed in a different order, or implemented using different logic constructs (e.g., logic gates, looping primitives, conditional logic, and other logic constructs) without changing the overall results or otherwise departing from the true scope of the invention.

Also, while processes or blocks are at times shown as being performed in series, these processes or blocks may instead be performed or implemented in parallel, or may be performed at different times.

Claims

1. A computer system comprised of at least one computer and at least one data storage device for determining the relative genealogy of a predetermined set of versions of a document data file stored on the system, said system comprising: an extraction module adapted by logic to extract a set of at least one RSID value from each of the predetermined set of document versions, construct, generate and store in computer memory a data structure that encodes the genealogy of the predetermined set of document versions; a logic module adapted by logic to select at least one pair of RSID sets comprised of a first RSID set and a second RSID set corresponding to at least one selected pair of versions of the document associated with the corresponding at least one pair of selected RSID sets, and apply at least one of a predetermined set of logic rules to the selected pair of RSID sets, said logic rules being at least one of: (i) testing whether the pair of RSID sets are identical; (ii) testing whether any of the RSID values in the first set are the same as any RSID values in the second set; (iii) testing whether all of the RSID values in the first set are a proper subset of the RSID values of the second set; or (iv) testing whether neither the first set of RSID values is a subset of the second set of RSID values nor the second set of RSID values is a subset of the first set of RSID values, and to update the stored data structure with an identifier entry for one of the versions in the corresponding at least one pair of versions referring to its genealogical relation to the other version.
2. The system of claim 1 further comprising a module adapted by logic to apply logical rules to the extracted RSID sets of values in order to determine which of the selected document versions are within a group and to update the data structure to indicate for each document version, membership in the group, whereby the logical module is further adapted to check for potential parentage within the group and determine version genealogy within the group.
3. The system of claim 1 where the logic module is further adapted by logic to determine if there are missing versions of the document that are not included in the set of versions being checked for their genealogy.
4. The system of claim 3 where the logic module is further adapted by logic to determine for the selected pair of document versions in the set, A, B, whether versions A and B belong in the same genealogical tree, but neither A nor B has RSID values which are a subset of the other version in order to determine a logic state representing the condition that there must have existed a version C that is a common ancestor of both versions A and B.
5. The system of claim 3 further adapted by logic to create and store at least one RSID set of values corresponding to at least one of the missing document versions and applying a logic rule to the created RSID set to determine its position in the version genealogy, said logic rule being a test whether for one of any document version in the set, whether the created RSID set corresponding to the missing document version is a subset of the RSID set of the one document, or that the RSID set of the one document is a subset of the created RSID set corresponding to the missing document version.
6. The system of claim 1 where the system is adapted by logic to ignore equivalent versions of the document by means of determining that a pair of the document versions have identical sets of RSID values.
7. The system of claim 1 where the logic module is further adapted by logic to determine whether for each pair of document versions in the set, whether the pair of document versions have any RSID values in common.
8. The system of claim 1 where the logic module is further adapted by logic to ignore a predetermined sub-set of the same RSID values extracted from the selected pair of document versions.
9. The system of claim 1 where the logic module is further adapted by logic to determine potential parentage by checking whether the extracted pair of RSID sets of a first pair of document versions is a proper subset of the extracted RSID values of a second pair of document versions.
10. The system of claim 1 further adapted by logic to determine that the genealogy within the set of versions is complete when only one version of the document in the set is assigned a null value for potential parentage identifiers.
11. A method executed by a computer system comprised of at least one computer and at least one data storage device for determining the relative genealogy of a predetermined set of versions of a document data file stored on the system comprising: extracting a set of at least one RSID value from each of the predetermined set of document versions, generating and storing in computer memory a data structure for encoding the genealogy of the predetermined set of document versions; selecting at least one pair of RSID sets comprised of a first RSID set and a second RSID set corresponding to a selected pair of versions of the document; applying at least one of a predetermined set of logic rules to the selected at least one pair of RSID sets, said logic rules being at least one of: (i) testing whether the pair of RSID sets are identical; (ii) testing whether any of the RSID values in the first set are the same as any RSID values in the second set; (iii) testing whether all of the RSID values in the first set are a proper subset of the RSID values of the second set; or (iv) testing whether neither the first set of RSID values is a subset of the second set of RSID values nor the second set of RSID values is a subset of the first set of RSID values, and updating the stored data structure with an identifier entry for one of the versions in the corresponding at least one pair of versions referring to its genealogical relation to the other version.
12. The method of claim 11 further comprising: applying logical rules to the extracted RSID sets of values in order to determine which of the selected document versions are within a group; updating the data structure to indicate for each version in the set of document versions membership in the group; and checking for potential parentage within the group to determine version genealogy within the group.
13. The method of claim 11 further comprising: determining if there are missing versions of the document that are not included in the set of versions being checked for their genealogy.
14. The method of claim 13 further comprising: selecting at least one pair of document versions A, B; determining that neither version A nor B has RSID values which are a subset of the RSID values of the other version in order to determine a logic state representing the condition that there must have existed a version C that is a common ancestor of both versions A and B.
15. The method of claim 13 further comprising: creating and storing at least one RSID set of values corresponding to at least one of the missing document versions; and applying a logic rule to the created RSID set to determine its position in the version genealogy, said logic rule being a test whether for one of any document version in the set, whether the created RSID set corresponding to the missing document version is a subset of the RSID set of the one document, or that the RSID set of the one document is a subset of the created RSID set corresponding to the missing document version.
16. The method of claim 11 further comprising: ignoring equivalent versions of the document by means of determining that a pair of the document versions have identical sets of RSID values.
17. The method of claim 11 further comprising: determining whether for each pair of document versions, whether the versions have any RSID values in common.
18. The method of claim 11 further comprising: ignoring a predetermined sub-set of the same RSID values extracted from the selected document versions.
19. The method of claim 11 further comprising: determining potential parentage by checking whether the extracted RSID set of a first of the pair of document versions is a proper subset of the extracted RSID set of values of a second of the pair of document versions.
20. The method of claim 11 further comprising the step of determining that the genealogy within the set of versions is complete when only one version of the document in the set is assigned a null value for potential parentage identifiers.
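
By way of illustration only, and not by way of limitation, the pairwise logic rules (i) through (iv) recited above may be sketched in Python as follows. The sketch assumes that each document version has already been reduced to a set of RSID strings, and that a version whose RSID set is a proper subset of another's is a potential ancestor of it. The names `Relation`, `classify`, `closest_parents`, `infer_common_ancestor` and `is_complete` are hypothetical and are introduced only for this example. Using the intersection of two RSID sets as the created RSID set of a missing common ancestor is likewise an assumption of the sketch.

```python
from enum import Enum
from typing import Dict, Optional, Set


class Relation(Enum):
    IDENTICAL = "identical"    # rule (i): the two RSID sets are equal
    UNRELATED = "unrelated"    # rule (ii) fails: no RSID value in common
    ANCESTOR = "ancestor"      # rule (iii): first is a proper subset of second
    DESCENDANT = "descendant"  # rule (iii) with the pair reversed
    DIVERGED = "diverged"      # rule (iv): overlap, but neither is a subset


def classify(first: Set[str], second: Set[str]) -> Relation:
    """Apply rules (i)-(iv) to one pair of RSID sets."""
    if first == second:
        return Relation.IDENTICAL
    if not (first & second):
        return Relation.UNRELATED
    if first < second:
        return Relation.ANCESTOR
    if second < first:
        return Relation.DESCENDANT
    return Relation.DIVERGED


def closest_parents(versions: Dict[str, Set[str]]) -> Dict[str, Optional[str]]:
    """Map each version to its nearest potential parent, or None.

    Assumption: the nearest parent is the version with the largest RSID
    set that is still a proper subset of the child's RSID set.
    """
    parents: Dict[str, Optional[str]] = {}
    for name, rsids in versions.items():
        candidates = [
            (len(other_rsids), other)
            for other, other_rsids in versions.items()
            if other != name and other_rsids < rsids
        ]
        parents[name] = max(candidates)[1] if candidates else None
    return parents


def infer_common_ancestor(a: Set[str], b: Set[str]) -> Set[str]:
    """Assumed RSID set of a missing common ancestor of diverged versions."""
    return a & b


def is_complete(parents: Dict[str, Optional[str]]) -> bool:
    """True when exactly one version has a null parent."""
    return sum(1 for parent in parents.values() if parent is None) == 1


if __name__ == "__main__":
    example = {
        "v1": {"a"},
        "v2": {"a", "b"},
        "v3": {"a", "b", "c"},
        "v4": {"a", "b", "d"},
    }
    print(classify(example["v3"], example["v4"]))
    print(closest_parents(example))
```

In this example, versions `v3` and `v4` share RSID values but neither set contains the other, so the pair falls under rule (iv). Both are assigned `v2` as their nearest potential parent, and `v1` is the only version left with a null parent.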

Parent Case Info

Priority Claim: This application claims priority as a nonprovisional continuation to U.S. Provisional Patent Application No. 62/097,190 filed on Dec. 29, 2014, which is incorporated herein for all that it teaches.

Related Publications (1)

	Number	Date	Country
	20160232158 A1	Aug 2016	US

Provisional Applications (1)

	Number	Date	Country
	62097190	Dec 2014	US
