The present invention relates in general to computer anti-virus detection and distribution and, in particular, to a system and method for distributing portable computer virus definition records with binary file conversion.
Computer viruses are program code usually causing malicious and often destructive results. All computer viruses are self-replicating. More precisely, computer viruses include any form of self-replicating computer code which can be stored, disseminated, and directly or indirectly executed. Computer viruses can be disguised as application programs, functions, macros, electronic mail attachments, and even applets and hypertext links.
Computer viruses travel between machines via infected media or over network connections disguised as legitimate files or messages. The earliest computer viruses infected boot sectors and files. Over time, computer viruses evolved into numerous forms and types, including cavity, cluster, companion, direct action, encrypting, multipartite, mutating, polymorphic, overwriting, self-garbling, and stealth viruses, such as described in “MCAFEE.com: Virus Glossary of Terms,” NETWORKS ASSOCIATES TECHNOLOGY, Inc., (2000), the disclosure of which is incorporated by reference. Most recently, macro viruses have become increasingly popular. These viruses are written in macro programming languages and are attached to document templates or as electronic mail attachments.
Historically, anti-virus solutions have reflected the sophistication of the viruses being combated. The first anti-virus solutions were stand-alone programs for identifying and disabling viruses. Eventually, anti-virus solutions grew to include specialized functions and parameterized variables that could be stored in a data file. During operation, the data file was read by an anti-virus engine operating on a client computer. Finally, the specialized functions evolved into full-fledged anti-virus languages for defining virus scanning and cleaning, including removal and disablement, instructions.
Presently, most anti-virus companies store the anti-virus language code for each virus definition into data files. For efficiency, the source code is compiled into object code at the vendor site. The virus definitions, including the object code, are then stored into the data files. To speed virus detection, the virus definitions are organized for efficient retrieval often as unstructured binary data.
Anti-virus companies discover new computer viruses on a daily basis and must periodically distribute anti-virus software updates. Each update augments the data file with new computer virus definitions, as well as replacing or deleting old virus definitions. Over time, however, the data files tend to become large and can take excessive amounts of time to download. Long download times are particularly problematic on low bandwidth connections or in corporate computing environments having a large user base.
Consequently, one prior art approach to decreasing anti-virus data file downloading times determines and transfers only the changes between old and new data files. The anti-virus company first compares old and new data files and forms a binary delta file. The delta file is downloaded by users and a patching utility program converts the old data file into the new data file by replacing parts of the binary data file. While this approach can often decrease the amount of data to be downloaded, the sizes of the delta files are arbitrary and vary greatly, depending upon the differences in binary data. In the worst case, the old and new data files are completely different and the delta file effectively replicates the new data file, thereby saving no download time.
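The binary delta approach described above can be illustrated with a minimal sketch. It assumes a simple copy/data operation list built with Python's `difflib.SequenceMatcher`; the function names `make_binary_delta`, `apply_binary_delta`, and `download_size` are hypothetical and do not represent any actual patching utility.

```python
from difflib import SequenceMatcher


def make_binary_delta(old: bytes, new: bytes) -> list:
    """Describe `new` as ("copy", start, end) and ("data", bytes) operations."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Bytes already present in the old data file are only referenced.
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Bytes that differ must be carried inside the delta file.
            ops.append(("data", new[j1:j2]))
        # "delete" needs no operation: the old bytes are simply not copied.
    return ops


def apply_binary_delta(old: bytes, ops: list) -> bytes:
    """Rebuild the new data file from the old data file and the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def download_size(ops: list) -> int:
    """Count the literal bytes the delta must carry."""
    return sum(len(op[1]) for op in ops if op[0] == "data")
```

When the old and new files share no bytes at all, the only operation is a single data block as large as the new file, which is the worst case noted above.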
While the use of delta files can affect throughput, changing the format of data files, particularly in a corporate computing environment, to avoid the use of delta files would create a further concern with respect to maintaining backward compatibility. Any new data file format change would necessitate replacing the existing data files on fielded client computers at potentially high cost due to downloads and installation.
Therefore, there is a need for an approach to efficiently distributing virus definitions that allows updating in a backward compatible manner. Preferably, such an approach would store virus definitions as indexed records in a database management system, coupled with the ability to convert the virus definitions between formats. Such an approach would allow efficient virus definition updating while preserving existing data file formats.
The present invention provides a system and method for sharing computer virus definition data in a backward compatible manner using a structured virus database. On a client, a structured virus database is maintained for storing virus definition records. Each record has a unique identifier, one or more virus names, and object code “sentences” defining operations for detecting the presence of and for removing a computer virus. The records are converted into a virus data file storing virus definitions. Each definition includes binary sentences defining virus detection and cleaning operations and identifies the computer viruses by name. Periodically, delta sets of virus definition records are retrieved. Each of the records is processed to add, delete, or replace records in the database after which the database is again converted into an updated virus data file.
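The add, delete, and replace processing of a delta set can be sketched as follows. This is only an illustration: it assumes the client database is modeled as a dictionary keyed by the unique identifier and that each delta entry is a hypothetical `(operation, identifier, record)` tuple, neither of which is specified by the description.

```python
def apply_delta_set(database: dict, delta_set: list) -> None:
    """Update `database` in place from (operation, identifier, record) entries."""
    for operation, virus_id, record in delta_set:
        if operation in ("add", "replace"):
            # Both operations store the supplied record under its identifier.
            database[virus_id] = record
        elif operation == "delete":
            # Deleting an identifier that is already absent is not an error here.
            database.pop(virus_id, None)
        else:
            raise ValueError(f"unknown delta operation: {operation!r}")
```

Because each entry addresses a record by identifier, the size of the update depends on the number of changed records rather than on where their bytes fall in a data file.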
An embodiment of the present invention is a system and method for distributing portable computer virus definition records with binary file conversion. One or more virus definition records are stored into a structured virus database. Each virus definition record includes an identifier uniquely identifying a computer virus, at least one virus name associated with the computer virus, a virus definition sentence comprising object code providing operations to detect the identified computer virus within a computer system, and a virus removal sentence comprising object code providing operations to clean the identified computer virus from the computer system. At least one updated virus definition record is stored into the structured virus database indexed by the identifier and the at least one virus name for each virus definition record. The virus definition records stored in the structured virus database are converted into a virus data file. The virus data file includes virus definition sets. Each virus definition set includes binary data encoding instructions to detect the computer virus within a computer system, instructions to clean the computer virus from the computer system, and names associated with the computer virus.
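The record layout recited above might be modeled as in the following sketch. It assumes an in-memory SQLite database with two hypothetical tables, `definitions` and `names`, so that records can be looked up by identifier or by any associated virus name; the schema and function names are illustrative assumptions rather than the schema of the described embodiment.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class VirusDefinitionRecord:
    virus_id: int            # identifier uniquely identifying the virus
    names: list              # one or more associated virus names
    detect_sentence: bytes   # object code for detection
    clean_sentence: bytes    # object code for removal


def open_database() -> sqlite3.Connection:
    """Create an empty in-memory database with the assumed two-table schema."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE definitions ("
        "virus_id INTEGER PRIMARY KEY, detect BLOB NOT NULL, clean BLOB NOT NULL)"
    )
    db.execute(
        "CREATE TABLE names ("
        "name TEXT NOT NULL, virus_id INTEGER NOT NULL, "
        "PRIMARY KEY (name, virus_id))"
    )
    return db


def store(db: sqlite3.Connection, rec: VirusDefinitionRecord) -> None:
    """Insert a record, or replace the record that has the same identifier."""
    db.execute(
        "INSERT OR REPLACE INTO definitions VALUES (?, ?, ?)",
        (rec.virus_id, rec.detect_sentence, rec.clean_sentence),
    )
    db.execute("DELETE FROM names WHERE virus_id = ?", (rec.virus_id,))
    db.executemany(
        "INSERT INTO names VALUES (?, ?)", [(n, rec.virus_id) for n in rec.names]
    )


def find_by_name(db: sqlite3.Connection, name: str):
    """Return the record associated with a virus name, or None if unknown."""
    row = db.execute(
        "SELECT d.virus_id, d.detect, d.clean FROM definitions d "
        "JOIN names n ON n.virus_id = d.virus_id WHERE n.name = ?",
        (name,),
    ).fetchone()
    if row is None:
        return None
    names = [
        r[0]
        for r in db.execute(
            "SELECT name FROM names WHERE virus_id = ? ORDER BY name", (row[0],)
        )
    ]
    return VirusDefinitionRecord(row[0], names, row[1], row[2])
```

Keeping the names in their own table lets a single record carry several aliases while remaining addressable by each of them.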
Still other embodiments of the present invention will become readily apparent to those skilled in the art from the following detailed description, wherein are described embodiments of the invention by way of illustrating the best mode contemplated for carrying out the invention. As will be realized, the invention is capable of other and different embodiments and its several details are capable of modifications in various obvious respects, all without departing from the spirit and the scope of the present invention. Accordingly, the drawings and detailed description are to be regarded as illustrative in nature and not as restrictive.
The server 11 includes a persistent store kept on a file system 18 maintained on a server storage device 14. Individual directories, files, and databases are stored in the file system 18. Suitable persistent storage devices include randomly accessible devices, such as hard drives and rewriteable media, although other forms of persistent storage devices could also be used by or incorporated into the server 11. Similarly, the client 12 also includes a persistent store kept on a file system 19 maintained on a client storage device 15.
The client 12 can potentially be exposed to computer viruses by virtue of having interconnectivity with outside machines. As protection, the client 12 includes an anti-virus system 17 (AVS) that executes operations to scan for the presence of and to clean off any computer viruses. An exemplary anti-virus system 17 is the VirusScan product, licensed by Networks Associates Technology, Inc., Santa Clara, Calif. As further described below beginning with reference to
The virus database must be periodically updated with new computer virus definitions to enable the anti-virus system 17 to continue to provide up-to-date anti-virus protection. Thus, the server 11 includes an anti-virus support system 16 (AVSS) that executes an updating service. The client 12 can connect to the server 11 and download updated virus definition records from the anti-virus support system 16.
The individual computer systems, including server 11 and client 12, are general purpose, programmed digital computing devices consisting of a central processing unit (CPU), random access memory (RAM), non-volatile secondary storage, such as a hard drive or CD ROM drive, network interfaces, and peripheral devices, including user interfacing means, such as a keyboard and display. Program code, including software programs, and data are loaded into the RAM for execution and processing by the CPU and results are generated for display, output, transmittal, or storage.
The anti-virus system 17 consists of two functional modules: a converter 34 and a database engine 35. The anti-virus support system 16 consists of two functional modules: a compiler 31 and a database engine 32. Each module is a computer program, procedure or module written as source code in a conventional programming language, such as the C++ programming language, and is presented for execution by the CPU as object or byte code, as is known in the art. The various implementations of the source code and object and byte codes can be held on a computer-readable storage medium or embodied on a transmission medium in a carrier wave. The anti-virus support system 16 and the anti-virus system 17 operate in accordance with a sequence of process steps, as further described below with reference to
The database engine 35 maintains a structured virus database 40 storing virus definition records. The converter 34 converts the stored virus definition records into conventional virus definitions for use in scanning for and cleaning off computer viruses. The structured virus database 40 is preferably organized as a relational database, as further described below with reference to
The anti-virus support system 16 provides virus definition updates through the database engine 32. The updated virus definition records are selected from the logical sets of structured master virus databases 37. The structured master virus databases 37 are also preferably organized as relational databases, as further described below with reference to
The structured master virus databases 37 are generated by the compiler 31 and database engine 32 from raw virus definitions 37. Each virus definition includes source code written in an anti-virus language for defining virus scanning and cleaning, including removal and disablement, instructions. The compiler 31 converts each set of source code instructions into object code sentences for execution by an anti-virus engine. Preferably, one object code sentence for virus detection and a second object code sentence for virus cleaning are generated. The database engine 32 then builds the virus definition records of the structured master virus databases 37 and populates each virus definition record with the object code sentences.
In an alternate embodiment, the server 11 and client 12 also respectively include a decompiler 33, 36. The decompilers 33, 36 on both systems convert each virus definition set in the virus data file into a virus definition record for incorporation into their respective structured databases. Thus, the decompilers 33, 36 provide an additional layer of backward compatibility, allowing the virus definitions stored in old virus data files 39, 41 to be reused.
Upon a periodic update cycle, the database engine 32 determines (step 60) the delta set of virus definition records 61 between structured master virus database ‘A’ 58 and structured master virus database ‘B’ 59. The client 12 (also shown in
In the described example, the structured master virus database ‘A’ 58 and structured master virus database ‘B’ 59 are separate database files. However, in practice, these two databases would preferably be maintained as a single database file and each updated virus database would be logically defined by selecting out new, changed, or deleted virus definition records.
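Determining a delta set between two versions of the master database can be sketched as below. The sketch assumes each version is a dictionary mapping an identifier to a record tuple and emits hypothetical `(operation, identifier, record)` entries; the actual selection logic of the database engine 32 is not specified here.

```python
def compute_delta_set(old: dict, new: dict) -> list:
    """List the add, replace, and delete entries that turn `old` into `new`."""
    delta_set = []
    for virus_id in sorted(new.keys() - old.keys()):
        # Present only in the newer database: a new record.
        delta_set.append(("add", virus_id, new[virus_id]))
    for virus_id in sorted(new.keys() & old.keys()):
        # Present in both: include it only if its contents changed.
        if new[virus_id] != old[virus_id]:
            delta_set.append(("replace", virus_id, new[virus_id]))
    for virus_id in sorted(old.keys() - new.keys()):
        # Present only in the older database: a deleted record.
        delta_set.append(("delete", virus_id, None))
    return delta_set
```

Unchanged records never appear in the result, so two identical databases yield an empty delta set.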
Upon a periodic update cycle, the database engine 32 determines (step 84) the delta set of virus definition records 85 between structured master virus database ‘A’ 82 and structured master virus database ‘B’ 83. The client 12 (also shown in
To further optimize performance, the individual virus definitions 94–108 are ordered within their respective virus definition set for optimal retrieval. Thus, the scan virus definition set 91 stores the virus definitions 94–98 in order of first, third, second, fourth, and fifth viruses. Similarly, the clean virus definition set 92 stores virus definitions 99–103 in order of third, fourth, first, fifth, and second viruses, while the names virus definition set 93 stores virus definitions 104–108 in order of fourth, fifth, third, first, and second viruses. Other orderings or forms of organization are feasible.
The prior art data file 90 is divided and organized to optimize virus scanning and cleaning performance. However, this format is difficult to maintain due to the arbitrary orderings of virus definitions within their respective virus definition sets and by virtue of the binary nature of the stored data. As new virus definitions can be inserted into any arbitrary location within each virus definition set, binary patch utilities often end up replacing a substantially large portion of a virus definition set.
In an alternate embodiment, the virus definitions are stored in virus data files 39 (also shown in
Next, the local virus database 40 (shown in
For backward compatibility, the virus definition records stored in the updated structured virus database 40 are converted into virus definitions and stored into a set of virus data files 41, after which the routine ends. Note that this routine is also restarted whenever necessary, and preferably on a periodic basis, to update the structured master virus database 40 with new virus definitions 37.
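The conversion of database records into scan, clean, and names definition sets might look like the following sketch. The binary layout (a little-endian count followed by identifier, length, and payload for each entry) and the ordering by identifier are assumptions made purely for illustration; they are not the format of any actual virus data file.

```python
import struct


def encode_definition_set(entries: list) -> bytes:
    """Pack (virus_id, payload) pairs as: count, then id + length + payload."""
    out = bytearray(struct.pack("<I", len(entries)))
    for virus_id, payload in entries:
        out += struct.pack("<II", virus_id, len(payload))
        out += payload
    return bytes(out)


def convert_to_data_files(database: dict) -> dict:
    """Convert records into three binary definition sets.

    `database` maps virus_id -> (names, detect_sentence, clean_sentence).
    """
    ids = sorted(database)  # assumed ordering; a real converter may differ
    return {
        "scan": encode_definition_set([(i, database[i][1]) for i in ids]),
        "clean": encode_definition_set([(i, database[i][2]) for i in ids]),
        "names": encode_definition_set(
            # Multiple names for one virus are joined with a NUL separator.
            [(i, "\0".join(database[i][0]).encode("utf-8")) for i in ids]
        ),
    }
```

Since the data files are regenerated from the database after every update, their internal ordering can be chosen freely without affecting how updates are distributed.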
As with the server routine 150, in an alternate embodiment, the delta set 61 can be encrypted (blocks 172–173) and compressed (blocks 174–175). Finally, in a further alternate embodiment, the virus data files set 41 can be encrypted (blocks 186–187) for heightened security.
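The optional compression of a delta set can be sketched as follows, assuming a JSON serialization with hex-encoded sentences and `zlib` compression. The serialization format and function names are hypothetical, and the encryption step is omitted because the Python standard library provides no general-purpose cipher.

```python
import json
import zlib


def pack_delta_set(delta_set: list) -> bytes:
    """Serialize and compress (operation, identifier, record) entries.

    Each record is (names, detect_sentence, clean_sentence), or None for deletes.
    """
    items = []
    for operation, virus_id, record in delta_set:
        body = None
        if record is not None:
            names, detect, clean = record
            # Object code is hex-encoded because JSON cannot hold raw bytes.
            body = {"names": list(names), "detect": detect.hex(), "clean": clean.hex()}
        items.append({"op": operation, "id": virus_id, "record": body})
    return zlib.compress(json.dumps(items).encode("utf-8"))


def unpack_delta_set(blob: bytes) -> list:
    """Reverse pack_delta_set on the receiving side."""
    delta_set = []
    for item in json.loads(zlib.decompress(blob).decode("utf-8")):
        body = item["record"]
        record = None
        if body is not None:
            record = (
                tuple(body["names"]),
                bytes.fromhex(body["detect"]),
                bytes.fromhex(body["clean"]),
            )
        delta_set.append((item["op"], item["id"], record))
    return delta_set
```

In this arrangement, an encryption step would be applied to the compressed bytes before transmission and reversed before decompression.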
While the invention has been particularly shown and described as referenced to the embodiments thereof, those skilled in the art will understand that the foregoing and other changes in form and detail may be made therein without departing from the spirit and scope of the invention.