Automatic categorization of network events

Information

  • Patent Application
    20070233650
  • Publication Number
    20070233650
  • Date Filed
    March 29, 2006
  • Date Published
    October 04, 2007
Abstract
A system and method to facilitate automatic categorization of events in a network, wherein one or more keywords are retrieved from a keyword database, each retrieved keyword matching a corresponding event unit of an event input by a user over a network. A dominant keyword corresponding to a highest parameter value calculated for each retrieved keyword is then selected. Finally, the event is categorized based on one or more categories associated with the dominant keyword. The dominant keyword may be selected based on one or more keyword categories associated with each retrieved keyword and an ambiguity parameter value calculated for each keyword. Alternatively, the dominant keyword may be selected based on a highest-ranked output value calculated for each retrieved keyword. One or more categories associated with the dominant keyword are subsequently retrieved from the keyword database and the event is categorized based on the category associated with the dominant keyword.
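The flow summarized in the abstract (match event units against a keyword database, pick the dominant keyword by highest parameter value, then take that keyword's categories) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the application: the `KeywordEntry` type, the in-memory `KEYWORD_DB`, the whitespace parser, and the numeric `score` values standing in for the "parameter value".

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class KeywordEntry:
    """Hypothetical keyword record; 'score' stands in for the parameter value."""
    keyword: str
    categories: Tuple[str, ...]
    score: float


# Hypothetical in-memory stand-in for the keyword database.
KEYWORD_DB: Dict[str, KeywordEntry] = {
    "jaguar": KeywordEntry("jaguar", ("Automotive", "Animals"), 0.40),
    "dealership": KeywordEntry("dealership", ("Automotive",), 0.90),
}


def parse_event(event: str) -> List[str]:
    """Split an event (e.g. a search query) into event units.

    Assumption: one unit per whitespace-separated, lowercased token.
    """
    return event.lower().split()


def categorize_event(event: str, db: Dict[str, KeywordEntry] = KEYWORD_DB) -> Tuple[str, ...]:
    """Return the categories of the dominant keyword, or () if nothing matches."""
    retrieved = [db[unit] for unit in parse_event(event) if unit in db]
    if not retrieved:
        return ()
    # The dominant keyword is the retrieved keyword with the highest value.
    dominant = max(retrieved, key=lambda entry: entry.score)
    return dominant.categories
```

With these assumed values, `categorize_event("Jaguar dealership near me")` resolves the ambiguous unit "jaguar" in favor of the categories attached to "dealership". How the parameter value is actually calculated is left to the embodiments; the sketch only fixes it as a constant per keyword.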
Description

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example, and is not intended to be limited, by the figures of the accompanying drawings, in which like references indicate similar elements, and in which:



FIG. 1 is a flow diagram illustrating an event processing sequence, according to one embodiment of the invention;



FIG. 2 is a block diagram illustrating an exemplary network-based entity containing a system to facilitate automatic categorization of events, according to one embodiment of the invention;



FIG. 3 is a block diagram illustrating the system to facilitate automatic categorization of events within the network-based entity, according to one embodiment of the invention;



FIG. 4 is a flow diagram illustrating a method to facilitate automatic categorization of events in a network, according to one embodiment of the invention;



FIG. 5 is a flow diagram illustrating a method to facilitate automatic categorization of events in a network, according to an alternate embodiment of the invention;



FIG. 6 is a block diagram illustrating a generalized behavioral targeting system;



FIG. 7 is a diagrammatic representation of a machine in the exemplary form of a computer system within which a set of instructions may be executed.


Claims
  • 1. A method comprising: retrieving at least one keyword from a keyword database, each retrieved keyword matching a corresponding event unit of an event input by a user over a network; selecting a dominant keyword of said at least one retrieved keyword corresponding to a highest parameter value calculated for said each retrieved keyword; and categorizing said event based on at least one category associated with said dominant keyword.
  • 2. The method according to claim 1, further comprising storing said categorized event within said keyword database.
  • 3. The method according to claim 1, further comprising parsing said event to obtain at least one event unit.
  • 4. The method according to claim 1, wherein said selecting further comprises: identifying at least one keyword category associated with said each retrieved keyword in said keyword database; and selecting said dominant keyword based on said at least one keyword category and an ambiguity parameter value calculated for said each keyword as a factor of a conditional probability that said at least one keyword category is associated with said dominant keyword.
  • 5. The method according to claim 4, wherein said selecting further comprises: retrieving predetermined event processing rules from a rules database; and applying said predetermined processing rules to rank said each retrieved keyword in conjunction with said calculated ambiguity parameter value.
  • 6. The method according to claim 1, wherein said selecting further comprises: selecting said dominant keyword based on a highest-ranked output value calculated for said each retrieved keyword; and retrieving said at least one category associated with said dominant keyword from said keyword database.
  • 7. The method according to claim 6, wherein said selecting further comprises: retrieving a set of statistical parameters corresponding to said each retrieved keyword; assembling a vector containing said set of statistical parameters for said each retrieved keyword; and calculating said output value for said each retrieved keyword based on said vector, said output value indicating a probability that a corresponding retrieved keyword is selected as said dominant keyword.
  • 8. The method according to claim 1, wherein said selecting further comprises: selecting a keyword of a pair of retrieved keywords, said selection based on an output value calculated for said each keyword of said pair of keywords, said selected keyword having a highest-ranked output value; repeating said selecting for at least one subsequent pair of retrieved keywords including said selected keyword to obtain said dominant keyword; and retrieving said at least one category associated with said dominant keyword from said keyword database.
  • 9. The method according to claim 8, wherein said output value indicates a probability that said selected keyword is said dominant keyword.
  • 10. The method according to claim 1, wherein said each keyword is associated with at least one keyword category, and wherein said each keyword and said at least one keyword category are stored hierarchically within said keyword database.
  • 11. A system comprising: at least one database; and a token analysis module coupled to said at least one database, said token analysis module to retrieve at least one keyword from a keyword database, each retrieved keyword matching a corresponding event unit of an event input by a user over a network, to select a dominant keyword of said at least one retrieved keyword corresponding to a highest parameter value calculated for said each retrieved keyword, and to categorize said event based on at least one category associated with said dominant keyword.
  • 12. The system according to claim 11, wherein said token analysis module further stores said categorized event within said keyword database.
  • 13. The system according to claim 11, further comprising a parser module coupled to said token analysis module, said parser module to parse said event to obtain at least one event unit.
  • 14. The system according to claim 11, further comprising: an ambiguity processing module coupled to said token analysis module, said ambiguity processing module to calculate an ambiguity parameter value for said each keyword as a factor of a conditional probability that said at least one keyword category is associated with said dominant keyword; said token analysis module to identify at least one keyword category associated with said each retrieved keyword and to select said dominant keyword based on said at least one keyword category and said ambiguity parameter value.
  • 15. The system according to claim 14, wherein said token analysis module further retrieves predetermined event processing rules from a rules database, and applies said predetermined processing rules to rank said each retrieved keyword in conjunction with said calculated ambiguity parameter value.
  • 16. The system according to claim 11, wherein said token analysis module further selects said dominant keyword based on a highest-ranked output value calculated for said each retrieved keyword and retrieves said at least one category associated with said dominant keyword from said keyword database.
  • 17. The system according to claim 16, further comprising: a machine-learning unit coupled to said token analysis module; said token analysis module to retrieve a set of statistical parameters corresponding to said each retrieved keyword and to assemble a vector containing said set of statistical parameters for said each retrieved keyword; said machine-learning unit to calculate said output value for said each retrieved keyword based on said vector, said output value indicating a probability that a corresponding retrieved keyword is selected as said dominant keyword.
  • 18. The system according to claim 11, wherein said token analysis module further selects a keyword of a pair of retrieved keywords, said selection based on an output value calculated for said each keyword of said pair of keywords, said selected keyword having a highest-ranked output value, repeats said selecting for at least one subsequent pair of retrieved keywords including said selected keyword to obtain said dominant keyword, and retrieves said at least one category associated with said dominant keyword from said keyword database.
  • 19. The system according to claim 18, wherein said output value indicates a probability that said selected keyword is said dominant keyword.
  • 20. The system according to claim 11, wherein said each keyword is associated with at least one keyword category, and wherein said each keyword and said at least one keyword category are stored hierarchically within said keyword database.
  • 21. A computer readable medium containing executable instructions, which, when executed in a processing system, cause said processing system to perform a method comprising: retrieving at least one keyword from a keyword database, each retrieved keyword matching a corresponding event unit of an event input by a user over a network; selecting a dominant keyword of said at least one retrieved keyword corresponding to a highest parameter value calculated for said each retrieved keyword; and categorizing said event based on at least one category associated with said dominant keyword.
  • 22. The computer readable medium according to claim 21, wherein said selecting further comprises: identifying at least one keyword category associated with said each retrieved keyword in said keyword database; and selecting said dominant keyword based on said at least one keyword category and an ambiguity parameter value calculated for said each keyword as a factor of a conditional probability that said at least one keyword category is associated with said dominant keyword.
  • 23. The computer readable medium according to claim 21, wherein said selecting further comprises: selecting said dominant keyword based on a highest-ranked output value calculated for said each retrieved keyword; and retrieving said at least one category associated with said dominant keyword from said keyword database.
  • 24. The computer readable medium according to claim 23, wherein said selecting further comprises: retrieving a set of statistical parameters corresponding to said each retrieved keyword; assembling a vector containing said set of statistical parameters for said each retrieved keyword; and calculating said output value for said each retrieved keyword based on said vector, said output value indicating a probability that a corresponding retrieved keyword is selected as said dominant keyword.
  • 25. The computer readable medium according to claim 21, wherein said selecting further comprises: selecting a keyword of a pair of retrieved keywords, said selection based on an output value calculated for said each keyword of said pair of keywords, said selected keyword having a highest-ranked output value; repeating said selecting for at least one subsequent pair of retrieved keywords including said selected keyword to obtain said dominant keyword; and retrieving said at least one category associated with said dominant keyword from said keyword database.
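
The pairwise selection recited in claims 8, 18, and 25, combined with the vector-based output value of claims 7, 17, and 24, might be sketched in Python as follows. This is only one possible reading. The logistic scoring function, the `WEIGHTS` and `BIAS` constants, and the two-element feature vectors are hypothetical stand-ins for whatever machine-learning unit and statistical parameters an embodiment would use, and each keyword is scored independently of the keyword it is paired with.

```python
import math
from typing import Dict, Sequence

# Hypothetical model parameters; a real machine-learning unit would learn these.
WEIGHTS = (1.5, -2.0)
BIAS = 0.0


def output_value(vector: Sequence[float]) -> float:
    """Map a vector of statistical parameters to a value in (0, 1).

    Assumption: a logistic function over a weighted sum, read as the
    probability that the keyword is the dominant keyword.
    """
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, vector))
    return 1.0 / (1.0 + math.exp(-z))


def select_dominant(keywords: Sequence[str],
                    stats: Dict[str, Sequence[float]]) -> str:
    """Compare keywords pair by pair, carrying the winner into the next pair."""
    if not keywords:
        raise ValueError("at least one retrieved keyword is required")
    selected = keywords[0]
    for challenger in keywords[1:]:
        # Keep whichever keyword of the current pair has the higher output value.
        if output_value(stats[challenger]) > output_value(stats[selected]):
            selected = challenger
    return selected


# Hypothetical statistical parameters per keyword:
# (normalized frequency, ambiguity indicator).
EXAMPLE_STATS = {
    "jaguar": (0.6, 1.0),
    "dealership": (0.4, 0.0),
    "cheap": (0.9, 1.0),
}
```

Calling `select_dominant(["jaguar", "dealership", "cheap"], EXAMPLE_STATS)` first compares "jaguar" with "dealership", then compares the winner with "cheap". The categories of the resulting keyword would then be retrieved from the keyword database as the claims describe.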