Spam can generally be described as the use of electronic messaging systems to send unsolicited and typically unwanted bulk messages. Spam spans many electronic services, including e-mail spam, instant messaging spam, Usenet newsgroup spam, Web search engine spam, spam in blogs, wiki spam, online classified ad spam, mobile device spam, Internet forum spam, social networking spam, etc. Spam detection and protection systems attempt to identify and control spam communications.
Current spam detection systems use basic content filtering techniques, such as regular expressions or keyword matching, as part of detecting spam. However, these systems are unable to catch all types of spam and other unwanted communications. For example, spammers commonly reuse HTML/literal templates for sending spam. Adding to the detection and elimination problem, spamming techniques are continuously evolving in attempts to bypass in-place spam detection and/or exclusion techniques. Moreover, scalability and performance issues arise with the deployment of certain spam detection systems. Unfortunately, conventional methods and systems for identifying and excluding unwanted communications can be resource intensive, and additional prevention measures can be difficult to implement.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended as an aid in determining the scope of the claimed subject matter.
Embodiments provide unwanted communication detection and/or management features, including using one or more commonality measures as part of generating templates for fingerprinting and comparison operations, but the embodiments are not so limited. In an embodiment, a computing architecture includes components configured to generate templates and associated fingerprints for known unwanted communications, wherein the template fingerprints can be compared to unknown communication fingerprints as part of determining whether the unknown communications are based on similar templates and can be properly classified as unwanted or potentially unsafe communications for further analysis and/or blocking. A method of one embodiment operates to use a number of template fingerprints to detect and classify unknown communications as spam, phishing, and/or other unwanted communications. Other embodiments are included.
These and other features and advantages will be apparent from a reading of the following detailed description and a review of the associated drawings. It is to be understood that both the foregoing general description and the following detailed description are explanatory only and are not restrictive of the invention as claimed.
In an embodiment, components of the architecture 100 can be used as part of monitoring messages over a communication pipeline, including identifying unwanted communications based in part on one or more known unwanted communication template fingerprints. For example, template fingerprints can be generated and grouped according to various factors, such as by a known spamming entity. Known unwanted communication template fingerprints can be representative of a defined group or grouping of known unwanted communications. As described below, false positive and/or false negative feedback communications can be used as part of maintaining aspects of a template fingerprint repository, such as deleting/removing and/or adding/modifying template fingerprints.
In one embodiment, templates can be generated based in part on extracting first portions of a number of unwanted communications based in part on a first commonality measure and extracting second portions of the number of unwanted communications based in part on a second commonality measure. For example, a template generating process can operate to identify and extract portions of a first group of electronic messages based in part on a first commonality measure that indicates little or no commonality between the identified portions of the first group of electronic messages. Continuing the example, the template generating process can also operate to identify and extract portions of a second group (e.g., spanning multiple groups) of electronic messages based in part on a second commonality measure that indicates high or significant commonality (e.g., very common markup structure across multiple messages) between the identified portions of the second group of electronic messages. Once the portions have been extracted, fingerprints can be generated for use in detecting unwanted communications, as discussed below.
In another embodiment, templates can be generated based in part on the use of custom string parsers configured to extract defined portions of a number of unwanted communications including hypertext markup language (HTML) as part of generating templates for fingerprinting. A template generator of an embodiment can be configured to extract all literals and markup attributes from an unwanted communication data structure, exposing basic tags (e.g., <html>, <a>, <table>, etc.). For example, a template generator can use custom parsers to remove literals from MIME message portions and then apply regular expressions to remaining portions to extract pure tags as part of generating templates for fingerprinting and use in message characterization operations.
With continuing reference to
As shown in
As an example of an unknown message characterization operation, a collection of e-mail messages can be grouped together based on indications of a spam campaign (done via source IP address, source domain, similarity scoring, etc.) and template processing operations can be used to provide templates for fingerprinting. For example, Microsoft Forefront Online Protection for Exchange (FOPE) maintains a list of IP addresses that are known to send spam, wherein templates can be generated according to IP address groupings. In one embodiment, messages associated with the known IP addresses are used to capture live spam emails for use by the template generator 102 when generating templates for fingerprinting.
The template generator 102 is configured to generate electronic templates based in part on aspects of one or more source communications, but is not so limited. For example, the template generator 102 can generate unwanted communication templates based in part on aspects of known spam or other unwanted communications composed of a markup language and data (e.g., an HTML template including literals). The template generator 102 of an embodiment can generate electronic templates based in part on aspects of one or more electronic communications, including the use of one or more commonality measures to identify communication portions for extraction. Remaining portions can be fingerprinted and used as part of identifying unwanted communications or unwanted communication portions.
The template generator 102 of one embodiment can operate to generate unwanted communication templates by extracting first communication portions based in part on a first commonality measure and extracting second communication portions based in part on a second commonality measure. Once the portions have been extracted, the fingerprinting component 104 can generate fingerprints for use in detecting unwanted communications, as discussed below. For example, the template generator 102 can operate to identify and extract portions of a first group of electronic messages based in part on a first commonality measure, indicating little or no commonality between identified portions of the first group of electronic messages (e.g., a majority of e-mails in a group, grouped according to known spamming IP addresses, do not contain the identified first portions).
Commonality can be identified based in part on the inspection of message HTML and literals, a collection of the disjoint “tuples” or word units of a message using a lossless set intersection, and/or other automatic methods for identifying differences between the messages. Continuing the example above, the template generating process can also identify and extract portions of a second group (e.g., spanning multiple groups) of electronic messages based in part on a second commonality measure, indicating high or significant commonality between the associated portions of the second group of electronic messages.
As one example, very common portions can be identified using the second commonality measure defined as message parts that occur in ten (10) percent of all messages and include an inverse document frequency (IDF) measure beyond a basic value (e.g. <!DOCTYPE html PUBLIC “-//W3C//DTD XHTML 1.0 Transitional//EN” “http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd”>). Note that these very common identified portions likely span multiple groups and/or repositories. In one embodiment, the very common portions can be identified by compiling a standard listing or by dynamically generating a list based on sample messages, thereby improving the selectivity of the fingerprinting process. Any remaining portions (e.g., HTML and literals) can be defined as a template for fingerprinting by the fingerprinting component 104.
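As an illustrative sketch of the second commonality measure (a document-frequency cutoff over message parts; the treatment of parts as lines of markup and the exact fraction are assumptions for illustration), very common portions might be identified as follows:

```python
from collections import Counter

def very_common_parts(messages, min_fraction=0.10):
    # Count, once per message, the parts (here, lines) each message contains.
    doc_freq = Counter()
    for msg in messages:
        doc_freq.update(set(msg.splitlines()))
    # Parts appearing in at least min_fraction of all messages are treated as
    # very common boilerplate and excluded when defining a template.
    cutoff = min_fraction * len(messages)
    return {part for part, count in doc_freq.items() if count >= cutoff}
```

Parts returned by such a function would be stripped from messages before the remainder is defined as a template for fingerprinting.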
In another embodiment, the template generator 102 can operate to generate templates based in part on the use of custom string parsers configured to extract defined portions of a number of unwanted communications as part of generating templates for fingerprinting. A template generator of an embodiment can be configured to extract all literals and HTML attributes from an unwanted communication data structure and leave basic HTML tags (e.g., <html>, <a>, <table>, etc.). For example, the template generator can use custom parsers to remove literals from text of MIME message portions and then apply regular expressions to remaining portions to extract pure tags as part of generating templates for fingerprinting and use in message characterization operations.
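A minimal sketch of the tag-skeleton extraction described above, assuming a single regular-expression pass in place of the custom MIME parsers; only bare tag names survive, while literals and attributes are discarded:

```python
import re

# Match any opening or closing tag, capturing an optional "/" and the tag name;
# attributes and any malformed remainder fall into the discarded [^>]* span.
TAG_RE = re.compile(r"<\s*(/?)\s*([a-zA-Z][a-zA-Z0-9]*)[^>]*>")

def html_skeleton(html: str) -> str:
    # Keep only bare tag names; literals between tags are dropped entirely.
    return "".join(f"<{slash}{name.lower()}>"
                   for slash, name in TAG_RE.findall(html))
```

For example, an anchor tag with a randomized URL and randomized link text reduces to the same `<a></a>` skeleton across a campaign.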
The fingerprinting component 104 is configured to generate electronic fingerprints based in part on an underlying source, such as a known spam template or unknown inbound message for example, using a fingerprinting algorithm. The fingerprinting component 104 of an embodiment operates to generate electronic fingerprints based in part on a hashing technique and aspects of electronic communications including aspects of generated electronic templates classified as spam and at least one other unknown electronic communication.
In one embodiment, the fingerprinting component 104 can generate fingerprints for use in determining a similarity measure between known and unknown communications using a minwise hashing calculation. Minwise hashing of an embodiment involves generating sets of hash values based on word units of electronic communications, and using selected hash values from the sets for comparison operations. B-bit minwise hashing includes a comparison of a number of truncated bits of the selected values. Fingerprinting new, unknown messages does not require removal or modification of any portions before fingerprinting, due in part to the asymmetric comparison provided by using a containment factor or coefficient, discussed further below.
A type of word unit can be defined and used as part of a minwise hashing calculation. A choice of word unit corresponds to a unit used in a hashing operation. For example, a word unit for hashing can include a single word or term, or two or more consecutive words or terms. A word unit can also be based on a number of consecutive characters. In such an embodiment, the number of consecutive characters can be based on all text characters (such as all ASCII characters), or the number of characters can exclude non-alphabetic or non-numeric characters, such as spaces or punctuation marks.
Extracting word units can include extracting all text within an electronic communication, such as an e-mail template for example. Extraction of word pairs can be used as an example for extracting word units. When word pairs are extracted, each word (except for the first word and the last word) can be included in word pairs. For example, consider a template that begins with the words “Patent Disclosure Document. This is a summary paragraph, Abstract, Claims, etc.” The word pairs for this template include “Patent Disclosure”, “Disclosure Document”, “Document This”, “This is”, etc. Each term appears as both a first term in a pair and a second term in a pair to avoid the possibility that similar messages might appear different due to being offset by a single term.
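The word-pair extraction described above can be sketched as follows (stripping punctuation is an assumption, consistent with the example pairs in the text):

```python
import re

def word_pairs(text: str) -> list[str]:
    # Strip punctuation so shingles depend only on the word sequence.
    words = re.findall(r"[A-Za-z0-9]+", text)
    # Every word (except the first and last) appears as both the first and the
    # second term of a pair, guarding against single-term offsets.
    return [f"{a} {b}" for a, b in zip(words, words[1:])]
```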
A hash function can be used to generate a set of hash values based on extracted word units. In an embodiment where the word unit is a word pair, the hash function is used to generate a hash value for each word pair. Using a hash function on each word pair (or other word unit parsing) results in a set of hash values for an electronic communication. Suitable hash functions allow word units to be converted to a number that can be expressed as an n-bit value. For example, a number can be assigned to each character of a word unit, such as an ASCII number.
A hash function can then be used to convert summed values into a hash value. In another embodiment, a hash value can be generated for each character, and the hash values summed to generate a single value for a word unit. Other methods can be used such that the hash function converts a word unit into an n-bit value. Hash functions can also be selected so that the various hash functions used are min-wise independent of each other. In one embodiment, several different types of hash functions can be selected, so that the resulting collection of hash functions is approximately min-wise independent.
Hashing of word units can be repeated using a plurality of different hash functions such that each of the plurality of hash functions allows for creation of different set of hash values. The hash functions can be used in a predetermined sequence, such that a same sequence of hash functions can be used on each message being compared. Certain hash functions may differ based on the functional format of the hash function. Other hash functions may have similar functional formats, but include different internal constants used with the hash function. The number of different hash functions used on a document can vary, and can be related to the number of words (or characters) in a word unit. The result of using the plurality of hash functions is a plurality of sets of hash values. The size of each set is based on the number of word units. The number of sets is based on the number of hash functions. As noted above, the plurality of hash functions can be applied in a predetermined sequence, so that the resulting hash value sets correspond to an ordered series or sequence of hash value sets.
In an embodiment, for each set of hash values, a characteristic value can be selected from the set. For example, one choice for a characteristic value can be the minimum value from the set of hash values. The minimum value from a set of numbers does not depend on the size of the set or the location of the minimum value within the set of numbers. The maximum value of a set could be another example of a characteristic value. Still another option can be to use any technique that is consistent in producing a total ordering of the set of hash values, and then selecting a characteristic value based on aspects of the ordered set.
In one embodiment, a characteristic value can be used as the basis for a fingerprint value. A characteristic value can be used directly, or transformed to a fingerprint value. The transformation can be a transformation that modifies the characteristic value in a predictable manner, such as performing an arithmetic operation on the characteristic value. Another example includes truncating the number of bits in the characteristic value, such as by using only the least significant b bits of an associated characteristic value.
Fingerprint values generated from a group of hash functions can be assembled into a set of fingerprint values for a message, ordered based on the original predetermined sequence used for the hash values. As described below, fingerprint values representative of a message fingerprint can be used to determine a similarity value and/or containment coefficient for electronic communications. Fingerprints comprising an ordered set of fingerprint values can be easily stored in the fingerprint repository 108 and compared with other fingerprints, including fingerprints of unknown messages. Storing fingerprints rather than underlying sources (e.g., templates, original source communications, etc.) requires much less memory and imposes fewer processing demands. In an embodiment, hashing operations are not reversible. For example, original text cannot be reconstructed from resulting hashes.
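A compact sketch of assembling a b-bit minwise fingerprint from a message's word units; the seeded-MD5 hash family here is an illustrative stand-in for the prime-seeded hash functions described later, and the parameter defaults are assumptions:

```python
import hashlib

def hash_unit(word_unit: str, seed: int) -> int:
    # Illustrative seeded hash family: salt MD5 with the seed, keep 64 bits.
    digest = hashlib.md5(f"{seed}:{word_unit}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def fingerprint(word_units, num_hashes=64, b=8):
    units = set(word_units)
    mask = (1 << b) - 1
    # For each hash function in sequence, the characteristic value is the
    # minimum hash over all word units; only its b least significant bits
    # are kept (b-bit minwise hashing).
    return [min(hash_unit(u, seed) for u in units) & mask
            for seed in range(num_hashes)]
```

Because only the minimum per hash function is kept, the fingerprint does not depend on the order or multiplicity of word units in the message.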
The characterization component 106 of one embodiment is configured to perform characterization operations using electronic fingerprints based in part on a similarity and containment factor process. In an embodiment, the characterization component 106 uses a template fingerprint and an unknown (e.g., new spam/phishing campaign) communication fingerprint to identify and vet spam, phishing, and other unwanted communications. As described above, a word unit type is used as part of the fingerprinting process. A shingle represents n contiguous words of some reference text or corpus. Research has indicated that a set of shingles can accurately represent text when performing set similarity calculations. As an example, consider the message “the red fox runs far.” This would produce a set of shingles or word units as follows: {“the red”, “red fox”, “fox runs”, “runs far”}.
The characterization component 106 of one embodiment uses the following algorithm as part of characterizing unknown communication fingerprints, where:
Fingerprint: the fingerprint that represents St for purposes of template detection and effectively represents a sequence of hash values.
Fingerprint (i): returns the ith value in the fingerprint.
WordUnitCountt: the number of word units contained in a template (e.g., HTML template) dependent on template generation method.
Sc: the set of word units in an unknown communication (e.g., live e-mail).
R: R represents the set resemblance or similarity.
hash: hash is a unique hash function with random dispersion.
min: min(S) finds the lowest value in S.
bb(b,v1,v2): is equal to one (1) if last b bits of v1 and v2 are equal; otherwise, equal to zero (0).
Cr: the Containment Coefficient, or the fraction of one document, file, or other structure found in another document, file, or other structure. When Cr meets a threshold, the template is contained in the unknown communication, and the text of St is therefore a subset of Sc.
If St⊂Sc, then the unknown communication is based on the template and can be identified as unwanted (e.g., mail headers can be stamped accordingly).
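The asymmetric comparison can be sketched as follows: the set resemblance R is estimated from matching fingerprint positions, and the containment coefficient Cr is then derived from R and the word-unit counts of the template and the unknown communication (the position-match estimate is a simplification of the b-bit comparison):

```python
def resemblance(fp_t, fp_c):
    # Fraction of matching fingerprint positions estimates the Jaccard
    # similarity R between the template set St and the unknown set Sc.
    return sum(a == b for a, b in zip(fp_t, fp_c)) / len(fp_t)

def containment(r, word_unit_count_t, word_unit_count_c):
    # With I the intersection size and U the union size, R = I / U and
    # U = count_t + count_c - I, so I = R * (count_t + count_c) / (1 + R).
    intersection = r * (word_unit_count_t + word_unit_count_c) / (1.0 + r)
    # Cr is the fraction of the template found in the unknown communication.
    return intersection / word_unit_count_t
```

For example, if a 10-word-unit template is fully contained in a 40-word-unit message, R is 0.25 while Cr evaluates to 1.0, which is why containment rather than resemblance is used for template detection.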
An exemplary unique hashing algorithm with random dispersion can be defined as follows:
1) Use message-digest algorithm 5 (Md5) and a corresponding word unit to produce a 128 bit integer representation of the word unit.
2) Take 64 bits from this 128 bit representation (e.g., the 64 least significant bits).
3) Take an established large prime number "seed" from a consistent collection of large prime numbers (e.g., the jth hash function would use the jth prime number seed from the collection).
4) Take an established small prime number "seed" from a collection of small prime numbers (following the same process as (3)).
5) Take the lower 32 bits of the 64 bits from the Md5.
6) Multiply the value from (5) by the small prime number and take the 59 most significant bits; multiply the value from (5) by the small prime number and take the 5 least significant bits; "OR" these values.
7) Multiply the value from (6) by the large prime number from (3).
8) Take the upper 32 bits of the 64 bits from the Md5 and multiply that by the small prime number and take the 59 most significant bits; multiply the upper 32 bits of the 64 bits from the Md5 by the small prime number and take the 5 least significant bits; "OR" these values.
9) Add the values from (7) and (8) to produce a minwise independent value.
The hashing function can be deterministically reused to produce minwise independent values by modifying the prime number seeds from (3) and (4) above.
The hashing function can be implemented in a functional language such as F#, for example.
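As a sketch of the nine steps above (in Python rather than F#), under stated assumptions: the prime seed collections are placeholders, and the "59 most significant bits OR 5 least significant bits" operation is interpreted as a 5-bit rotation within a 64-bit word:

```python
import hashlib

MASK64 = (1 << 64) - 1
# Placeholder seed collections; the text calls for established collections of
# large and small primes indexed by the hash-function number j.
LARGE_PRIMES = [2305843009213693951, 162259276829213363391578010288127]
SMALL_PRIMES = [31, 61, 127, 251]

def rotate_mix(value: int) -> int:
    # Steps (6)/(8): "OR" the 59 most significant bits with the 5 least
    # significant bits of the 64-bit product (read here as a 5-bit rotation).
    return ((value >> 5) | ((value & 0x1F) << 59)) & MASK64

def minwise_hash(word_unit: str, j: int) -> int:
    digest = int.from_bytes(
        hashlib.md5(word_unit.encode("utf-8")).digest(), "big")  # (1) 128-bit
    v64 = digest & MASK64                         # (2) 64 least significant bits
    large = LARGE_PRIMES[j % len(LARGE_PRIMES)]   # (3) large prime seed
    small = SMALL_PRIMES[j % len(SMALL_PRIMES)]   # (4) small prime seed
    lo = v64 & 0xFFFFFFFF                         # (5) lower 32 bits
    mixed_lo = rotate_mix((lo * small) & MASK64)  # (6)
    scaled = (mixed_lo * large) & MASK64          # (7)
    hi = v64 >> 32                                # upper 32 bits for (8)
    mixed_hi = rotate_mix((hi * small) & MASK64)  # (8)
    return (scaled + mixed_hi) & MASK64           # (9)
```

Varying j deterministically selects different seeds, yielding the family of minwise independent hash functions referenced above.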
When the containment coefficient Cr is greater than a threshold value, the smaller St can be considered to be a subset (or substantially a subset) of Sc. If St is a subset or substantially a subset of Sc, then St can be considered as a template for Sc. The threshold value can be set to a higher or lower value, depending on the desired degree of certainty that St is a subset of Sc. A suitable value for a threshold can be at least about 0.50, or at least about 0.60, or at least about 0.75, or at least about 0.80, as a few examples. Other methods are available for determining a fingerprint and/or a similarity, and using these values to determine a containment coefficient.
Other variations on the minwise hashing procedure described above may be available for calculating fingerprints. Another option could be to use other known methods for calculating a resemblance, such as “Locality Sensitive Hashing” (LSH) methods. These can include the 1-bit methods known as sign random projections (or simhash), and the Hamming distance LSH algorithm. More generally, other techniques that can determine a Jaccard Similarity Coefficient can be used for determining the set resemblance or similarity. After determining a set resemblance or similarity, a containment coefficient can be determined based on the cardinality of the smaller and larger sets.
The fingerprint repository 108 of an embodiment includes memory and a number of stored fingerprints. The fingerprint repository 108 can be used to store electronic fingerprints classified as spam, phishing, and/or other unwanted communications for use in comparison with other unknown electronic communications by the characterization component 106 when characterizing unknown communications, such as unknown e-mails being delivered using a single communication pipeline. The knowledge manager 110 can be used to manage aspects of the fingerprint repository 108, including using false positive and false negative feedback communications as part of maintaining an accurate collection of known unwanted communication fingerprints to increase identification accuracy of the characterization component 106.
The knowledge manager 110 can provide a tool for spam analysts to determine whether the false positive/false negative (FP/FN) feedback was accurate (for example, many people incorrectly report newsletters as spam). After validating that the messages are truly false positives or false negatives, the anti-spam rules can be updated to improve characterization accuracy. Thus, analysts can specify an HTML/literal template for a given spam campaign, reducing analysis time and improving spam identification accuracy. Rule updates and certification can be used to validate that updated rules (e.g., regular expressions and/or templates) do not adversely harm the health of a service (e.g., cause a large number of false positives). If a rule passes the validation, it can then be released to production servers, for example.
The functionality described herein can be used by or as part of a hosted system, application, or other resource. In one embodiment, the architecture 100 can be communicatively coupled to a messaging system, virtual web, network(s), and/or other components as part of providing unwanted communication monitoring operations. An exemplary computing system includes suitable processing and memory resources for operating in accordance with a method of identifying unwanted communications using generated template and unknown communication fingerprints. Suitable programming means include any means for directing a computer system or device to execute the steps of a method, including, for example, systems comprised of processing units and arithmetic-logic circuits coupled to computer memory, where the computer memory includes electronic circuits configured to store data and program instructions. An exemplary computer program product is usable with any suitable data processing system. While a certain number and types of components are described above, it will be appreciated that other numbers, types, and/or configurations can be included according to various embodiments. Accordingly, component functionality can be further divided and/or combined with other component functionalities according to desired implementations.
At 306, the process 300 operates to generate an unwanted communication template fingerprint for the generated unwanted communication template. In one embodiment, a b-bit minwise technique is used to generate fingerprints. At 308, unwanted communication template fingerprints are stored in a repository, such as a fingerprint database for example. At 310, the process 300 operates to generate a fingerprint for an unknown communication, such as an unknown e-mail message for example. At 312, the process 300 operates to compare the unwanted communication template fingerprints and the unknown communication fingerprint. Based in part on the comparison, the unknown communication can be characterized or classified as not unwanted and allowed to be delivered at 314, or classified as unwanted and prevented from being delivered at 316. For example, a previously unknown message determined to be spam can be used to block the associated e-mails, and the sender(s), service provider(s), and/or other parties can be notified of the unwanted communication, including a reason to restrict future communications without prior authorization.
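Steps 312 through 316 can be sketched as follows, assuming template fingerprints are stored alongside their word-unit counts and using an assumed containment threshold of 0.75:

```python
THRESHOLD = 0.75  # assumed; the text suggests values from roughly 0.50 to 0.80

def characterize(unknown_fp, unknown_count, template_repo):
    # template_repo: iterable of (template_fingerprint, template_word_unit_count).
    for template_fp, count_t in template_repo:
        # Estimate resemblance from matching fingerprint positions, then the
        # containment coefficient from the set cardinalities.
        r = sum(a == b for a, b in zip(template_fp, unknown_fp)) / len(template_fp)
        containment = r * (count_t + unknown_count) / ((1.0 + r) * count_t)
        if containment >= THRESHOLD:
            return "unwanted"    # classified as spam and blocked (316)
    return "acceptable"          # allowed to be delivered (314)
```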
As described above, feedback communications can be used to reclassify an unwanted communication as acceptable, and the process 300 can operate to remove any associated unwanted communication fingerprint from the repository at 320, and move on to processing another unknown communication at 318. However, if an unknown communication has been correctly identified as spam, the process proceeds to 318. While a certain number and order of operations is described for the exemplary flow of
In another embodiment, the process 400 at 404 can be used to extract HTML attributes and literals as part of generating templates consisting essentially of HTML tags. In one embodiment, the process 400 at 404 uses remaining HTML tags to form a string data structure for each template. The information contained in the tag string or generated template provides a similarity measure for the HTML template for use in detecting unwanted messages (e.g., similarity across a spam campaign). Such a template includes relatively static HTML for each spam campaign, since the HTML requires a structure and cannot be easily randomized. Moreover, the literals can be ignored since this text can be randomized (e.g., via newsreader, dictionary, etc.). Such a string-based template can also provide exploitation of malformed headers (see “<i#mg>” in
At 406, the process 400 operates to generate and/or store unwanted communication fingerprints in computer memory. At 408, the template fingerprints can be used as comparative fingerprints along with unknown communication fingerprints to identify unwanted communications. In one embodiment, a validation process is first used to verify that the associated unwanted communication or communications are actually known as being unwanted before using the template fingerprint as a comparative fingerprint along with an unknown communication fingerprint to identify unwanted communications. Otherwise, at 410, the template fingerprint can be removed from memory if the unwanted communication is determined to be an acceptable communication (e.g., not spam). While a certain number and order of operations is described for the exemplary flow of
At 710, the process 700 operates to fingerprint an inbound and unknown message, generating an unknown message fingerprint. In one embodiment, the process 700 at 710 uses a shingling process, an unknown message (e.g., using all markup and/or content), and a hashing algorithm to generate a corresponding communication fingerprint. If no template fingerprints match the unknown communication fingerprint, the flow proceeds to 712, and the unknown message is classified as good and released. In one embodiment, a regex engine can be used as a second layer of security to process messages classified as good to further ensure that a communication is not spam or unwanted.
If a template fingerprint matches the unknown message, the flow proceeds to 714, and the unknown message is classified as spam and blocked, and the flow proceeds to 716. At 716, the process 700 operates to receive false positive feedback, such as when an e-mail is wrongly classified as spam for example. Based on an analysis of the feedback communication and/or other information, the template fingerprint can be marked as spam related at 718 and continue to be used in unknown message characterization operations. Otherwise, the template fingerprint can be marked as not being spam related at 720 and/or removed from a fingerprint repository and/or reference database. While a certain number and order of operations is described for the exemplary flow of
The Virus component 808 performs basic anti-virus scanning operations and can block delivery if malware is detected. If a message is blocked by the Virus component 808, it may be more expensive to process using FOPE, which may include sending back non-delivery and/or other notifications, etc. The Policy component 810 performs filtering operations and takes actions on messages based on authored rules (e.g., customer-authored rules such as: if a message is from an employee and uses vulgar words, block that message). The SPAM (Regex) component 812 provides anti-spam features and functionalities, such as keywords 814 and hybrid 816 features.
The Mail Extractor and Analyzer 906 operates to remove a message body and headers for storing in a database. Removing content from the raw message can save processing time later. The extracted content, along with existing anti-spam rules, can be stored in the Mails & Spam Rules Storage component 908. The knowledge engineering (KE) studio component 910 can be used as a spam analysis tool as part of determining whether FP/FN feedback was accurate (for example, routinely incorrectly reporting newsletters as spam). After validating that the messages are truly false positives or false negatives, the Rule Updates component 911 can update anti-spam rules to improve detection accuracy. A Rules Certification component 912 can be used to certify that the updated rules are valid before providing the updated rules to a mail filtering system 914 (e.g., FOPE). For example, rules updates and certification operations can be used to validate that the updated rules (e.g., regular expressions or templates) do not adversely harm the health of a service (e.g., cause a lot of false positives). If the rule passes validation, it can be released to production servers.
While certain embodiments are described herein, other embodiments are available, and the described embodiments should not be used to limit the claims. Exemplary communication environments for the various embodiments can include the use of secure networks, unsecure networks, hybrid networks, and/or some other network or combination of networks. By way of example, and not limitation, the environment can include wired media such as a wired network or direct-wired connection, and/or wireless media such as acoustic, radio frequency (RF), infrared, and/or other wired and/or wireless media and components. In addition to computing systems, devices, etc., various embodiments can be implemented as a computer process (e.g., a method), an article of manufacture, such as a computer program product or computer readable media, a computer readable storage medium, and/or as part of various communication architectures.
The term computer readable media as used herein may include computer storage media. Computer storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data. System memory, removable storage, and non-removable storage are all computer storage media examples (i.e., memory storage). Computer storage media may include, but is not limited to, RAM, ROM, electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store information and which can be accessed by a computing device. Any such computer storage media may be part of a device. By way of example, and not limitation, communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, and other wireless media.
The embodiments and examples described herein are not intended to be limiting and other embodiments are available. Moreover, the components described above can be implemented as part of a networked, distributed, and/or other computer-implemented environment. The components can communicate via a wired network, a wireless network, and/or a combination of communication networks. Network components and/or couplings between components can include any type, number, and/or combination of networks, and the corresponding network components include, but are not limited to, wide area networks (WANs), local area networks (LANs), metropolitan area networks (MANs), proprietary networks, backend networks, etc.
Client computing devices/systems and servers can be any type and/or combination of processor-based devices or systems. Additionally, server functionality can include many components and include other servers. Components of the computing environments described in the singular may include multiple instances of such components. While certain embodiments include software implementations, they are not so limited and also encompass hardware or mixed hardware/software solutions. Other embodiments and configurations are available.
Referring now to
Generally, program modules include routines, programs, components, data structures, and other types of structures that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the invention may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
Referring now to
The mass storage device 14 is connected to the CPU 8 through a mass storage controller (not shown) connected to the bus 10. The mass storage device 14 and its associated computer-readable media provide non-volatile storage for the computer 2. Although the description of computer-readable media contained herein refers to a mass storage device, such as a hard disk or CD-ROM drive, it should be appreciated by those skilled in the art that computer-readable media can be any available media that can be accessed or utilized by the computer 2.
By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. Computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, digital versatile disks (“DVD”), or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computer 2.
According to various embodiments of the invention, the computer 2 may operate in a networked environment using logical connections to remote computers through a network 4, such as a local network or the Internet, for example. The computer 2 may connect to the network 4 through a network interface unit 16 connected to the bus 10. It should be appreciated that the network interface unit 16 may also be utilized to connect to other types of networks and remote computing systems. The computer 2 may also include an input/output controller 22 for receiving and processing input from a number of other devices, including a keyboard, mouse, etc. (not shown). Similarly, the input/output controller 22 may provide output to a display screen, a printer, or other type of output device.
As mentioned briefly above, a number of program modules and data files may be stored in the mass storage device 14 and RAM 18 of the computer 2, including an operating system 24 suitable for controlling the operation of a networked personal computer, such as the WINDOWS operating systems from MICROSOFT CORPORATION of Redmond, Wash. The mass storage device 14 and RAM 18 may also store one or more program modules. In particular, the mass storage device 14 and the RAM 18 may store application programs, such as word processing, spreadsheet, drawing, e-mail, and other applications and/or program modules, etc.
It should be appreciated that various embodiments of the present invention can be implemented (1) as a sequence of computer implemented acts or program modules running on a computing system and/or (2) as interconnected machine logic circuits or circuit modules within the computing system. The implementation is a matter of choice dependent on the performance requirements of the computing system implementing the invention. Accordingly, logical operations including related algorithms can be referred to variously as operations, structural devices, acts, or modules. It will be recognized by one skilled in the art that these operations, structural devices, acts, and modules may be implemented in software, firmware, special purpose digital logic, or any combination thereof without deviating from the spirit and scope of the present invention as recited within the claims set forth herein.
Although the invention has been described in connection with various exemplary embodiments, those of ordinary skill in the art will understand that many modifications can be made thereto within the scope of the claims that follow. Accordingly, it is not intended that the scope of the invention in any way be limited by the above description, but instead be determined entirely by reference to the claims that follow.