1. Field of the Invention
The disclosure relates generally to measurement of behavior characteristics and, more particularly, to systems and methods that measure behavior characteristics of users according to semantics and interaction behaviors.
2. Description of the Related Art
With the expansion of the Internet, applications such as BBSs (Bulletin Board Systems) allow users to browse and post comments. Users can publish articles via a specific web site or a dedicated web page. Currently, blog (web log) applications are popular; their owners can publish material and reference material on other users' blogs. In this way, interaction behaviors between different users and/or articles are generated, implying behavior characteristics of users.
To strengthen user loyalty and provide various enhanced services, service providers try to explore behavior characteristics of users from network content. US Application 2005/0108281 A1 analyzes email content in enterprises according to semantic hints using NLP (Natural Language Processing) technology to recognize domain experts. In US Application 2006/0053156 A1, interested and trustworthy experts are recognized according to publications and comment records toward specific articles in enterprise article databases. Because these methods analyze only the semantics of email content or the behavior records of users, they cannot be applied to an open network environment, such as a blog environment, that has a large number of interaction behaviors. Additionally, since only related experts are recognized, behavior characteristics of users, such as personal interests, specialty, and other individual characteristics, still cannot be explored from network content for service providers to develop related enhanced applications for users.
Systems and methods for measuring behavior characteristics are provided.
An embodiment of a system for measuring behavior characteristics comprises a database and a processing module. The database stores network content for a plurality of users, where the network content comprises a plurality of key terms and a plurality of interaction behaviors therebetween. For at least one specific user, the processing module calculates a first concern score for respective key terms according to use frequency of respective key terms corresponding to the specific user and use frequency of respective key terms corresponding to all users. The processing module calculates a first relation matrix for at least one specific key term according to at least two users corresponding to respective interaction behaviors and a type weighting corresponding to respective interaction behaviors. The processing module uses an algorithm to calculate a first interaction score for the specific user regarding the specific key term according to the first relation matrix. The processing module calculates a first characteristic score for the specific user regarding the specific key term according to the first concern score and the first interaction score.
In an embodiment of a method for measuring behavior characteristics, a database is provided. The database stores network content for a plurality of users, where the network content comprises a plurality of key terms, and a plurality of interaction behaviors therebetween. For at least one specific user, a first concern score for respective key terms is calculated according to use frequency of respective key terms corresponding to the specific user and use frequency of respective key terms corresponding to all users. A first relation matrix for at least one specific key term is calculated according to at least two users corresponding to respective interaction behaviors and a type weighting corresponding to respective interaction behaviors. A first interaction score for the specific user regarding the specific key term is calculated according to the first relation matrix using an algorithm. A first characteristic score for the specific user regarding the specific key term is calculated according to the first concern score and the first interaction score.
Systems and methods for measuring behavior characteristics may take the form of program code embodied in a tangible medium. When the program code is loaded into and executed by a machine, the machine becomes an apparatus for practicing the disclosed method.
The invention will become more fully understood by referring to the following detailed description with reference to the accompanying drawings.
The system for measuring behavior characteristics 100 comprises a database 110, a domain hierarchy 120, a term-concept association matrix 130, and a processing module 140. It is understood that type weightings 150 corresponding to respective interaction behaviors and participation weightings 160 of interaction behaviors for characteristic score calculation can be set in the system.
The database 110 stores network content for users, such as network articles in the network interaction environment, particularly, the blog environment. The network content can be fetched from the Internet via a data collection unit (not shown), or can be acquired via a data access interface provided by blog service providers. The network content comprises key terms, and interaction behaviors therebetween. The key terms may be tags and/or categories used to disclose the basic semantics of articles. Additionally, the interaction behaviors comprise comments, trackbacks, links, subscriptions, recommendations, blogrolls, and others.
The domain hierarchy 120 comprises a plurality of concepts and associations therebetween. The concepts may be taken from an ontology, such as DMOZ or WordNet, or may be terms defined in a concept hierarchy. The term-concept association matrix 130 defines association degrees for respective key terms toward respective concepts. If m key terms and n defined concepts are provided, an m×n term-concept relation matrix M is generated, where Mij represents the association degree between the ith key term and the jth concept, and 0≦Mij≦1. It is understood that the term-concept association matrix 130 can be established in any manner. For example, tags and categories in network content are determined as key terms to be processed. A term association hierarchy map among the key terms is first established. Each term that appears in both the term association hierarchy map and the ontology is set as a connection point, and the association degrees of respective terms and respective concepts are calculated using ontology merging techniques to obtain the term-concept association matrix 130. The processing module 140 performs the methods for measuring behavior characteristics of the invention, as discussed later.
In step S310, for at least one specific user, a concern score for respective concepts is calculated according to use frequency of respective key terms corresponding to the specific user, use frequency of respective key terms corresponding to all users, and the term-concept relation matrix M. It is understood that while the invention can evaluate for respective users, measurement of behavior characteristics for a specific user is provided in this embodiment for explanation purposes.
In this step, a vector of use frequency for key terms fU={f1, f2, . . . , fm} of length m is constructed for the specific user, where fi is the frequency of the ith key term for representing article semantics in articles published by the specific user. In other words, fi represents use frequency of the ith key term corresponding to the specific user. Additionally, a vector of use frequency for key terms FALL={F1, F2, . . . , Fm} of length m is constructed, where Fi is the frequency of the ith key term for representing article semantics in articles published by all users. In other words, Fi represents use frequency of the ith key term corresponding to all users. Thereafter, the use characteristic of key terms of the specific user to all users is calculated, and the characteristic is converted to concept level to obtain a concern score vector GU for the specific user toward concepts (domains). The concern score vector GU is calculated as follows:
where GU={G1, G2, . . . , Gn}, n is the number of concepts, and Gj represents the concern score of the jth concept corresponding to the specific user.
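The expression for GU is not reproduced in the text above, so the following Python sketch is offered only as one possible reading. It assumes that the use characteristic of the ith key term is the ratio fi/Fi and that this ratio is projected to concept level through the matrix M, giving Gj as the sum over i of (fi/Fi)×Mij. The function name concern_scores and this assumed form are illustrative and are not taken from the disclosure.

```python
def concern_scores(f_user, f_all, M):
    """Sketch of a concern score vector under an ASSUMED formula.

    Assumption (not from the source text): G_j = sum_i (f_i / F_i) * M[i][j].
    f_user: use frequencies of the m key terms for the specific user.
    f_all:  use frequencies of the m key terms for all users.
    M:      m x n term-concept association matrix, 0 <= M[i][j] <= 1.
    """
    m = len(f_user)
    if len(f_all) != m or len(M) != m:
        raise ValueError("f_user, f_all and M must all cover the same m key terms")
    n = len(M[0]) if m else 0

    # Relative use of each key term; a term nobody uses contributes nothing.
    ratios = [f / F if F > 0 else 0.0 for f, F in zip(f_user, f_all)]

    # Project the key-term ratios onto the n concepts.
    return [sum(ratios[i] * M[i][j] for i in range(m)) for j in range(n)]
```

Under this assumption, a user who accounts for a large share of the overall use of key terms strongly associated with a concept receives a high concern score for that concept.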
In step S320, a relation matrix for at least one specific concept is calculated according to users corresponding to respective interaction behaviors, a type weighting corresponding to respective interaction behaviors, and association degree for key terms used in the interaction behaviors toward the specific concept. Similarly, while the invention can calculate relation matrices for respective concepts, the calculation of relation matrix for a specific concept is provided in this embodiment for explanation purposes.
As described, type weightings 150 corresponding to respective interaction behaviors can be set.
In this embodiment, each interaction behavior is represented as (UA, UB, S, IC). UA and UB are the two users in an interaction behavior, where UA is the user initiating the interaction behavior, and UB is the user receiving the interaction behavior. S is the type weighting of the interaction behavior. IC is a semantic concept involved in the interaction behavior, where IC is represented by (CN, AD), CN is the concept name, and AD is the association degree of key term toward the concept. It is noted that several semantic concepts may be involved in an interaction behavior. The relation matrix corresponding to a specific concept is calculated as follows:
wherein Rij represents interaction relation strength for the ith user toward the jth user under the specific concept.
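The expression for the relation matrix is likewise not reproduced in the text above. As a hedged sketch, the following Python code assumes that Rij accumulates, over all interaction behaviors initiated by the ith user and received by the jth user that involve the specific concept, the product of the type weighting S and the association degree AD. The function name relation_matrix, the tuple layout, and the summation form are assumptions made for illustration.

```python
def relation_matrix(interactions, users, concept):
    """Sketch of a per-concept relation matrix under an ASSUMED formula.

    Assumption (not from the source text): R[i][j] = sum of S * AD over all
    interaction behaviors from user i to user j that involve `concept`.
    interactions: iterable of (UA, UB, S, IC_list), where IC_list is a list
                  of (CN, AD) pairs, since several concepts may be involved.
    users:        ordered list of user identifiers (defines row/column order).
    """
    index = {user: k for k, user in enumerate(users)}
    R = [[0.0] * len(users) for _ in users]

    for ua, ub, s, concepts in interactions:
        for cn, ad in concepts:
            if cn == concept:
                # Row = initiating user UA, column = receiving user UB.
                R[index[ua]][index[ub]] += s * ad
    return R
```

Because UA initiates and UB receives the interaction behavior, the resulting matrix is generally not symmetric, which is what allows a hub value and an authority value to differ for the same user.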
In step S330, at least one interaction score for the specific user regarding the specific concept is calculated according to the relation matrix using an algorithm such as the HITS (Hypertext-Induced Topic Search) algorithm. In this embodiment, a hub score and an authority score are obtained by the HITS algorithm. The HITS algorithm takes a graph relation matrix as input and, after processing, assigns a hub value and an authority value to each node, where the hub value represents the strength of the node's outward connections and the authority value represents the strength of the node's inward (received) connections. The HITS algorithm is well known, and its details are omitted here. The interaction score for the specific user regarding the specific concept can thus be calculated by applying the HITS algorithm to the relation matrix corresponding to the specific concept.
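For illustration, a minimal Python sketch of the standard HITS iteration applied to a weighted relation matrix is given below. The fixed iteration count and the Euclidean normalization are assumed choices for this sketch; the disclosure does not specify them, and the function name hits is hypothetical.

```python
import math


def hits(R, iterations=50):
    """Power-iteration HITS on a weighted adjacency matrix R.

    R[i][j] is the strength of the link from node i to node j.
    Returns (hubs, authorities), each normalized to unit Euclidean length.
    The iteration count and normalization are assumptions of this sketch.
    """
    n = len(R)
    hubs = [1.0] * n
    auths = [1.0] * n

    for _ in range(iterations):
        # Authority of j: weighted sum of the hub values of nodes linking to j.
        auths = [sum(R[i][j] * hubs[i] for i in range(n)) for j in range(n)]
        norm = math.sqrt(sum(a * a for a in auths)) or 1.0
        auths = [a / norm for a in auths]

        # Hub of i: weighted sum of the authority values of nodes i links to.
        hubs = [sum(R[i][j] * auths[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(h * h for h in hubs)) or 1.0
        hubs = [h / norm for h in hubs]

    return hubs, auths
```

Applied to a relation matrix for a specific concept, the hub entry for a user would reflect how strongly that user initiates interaction behaviors, and the authority entry how strongly that user receives them.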
In step S340, a characteristic score for the specific user regarding the specific concept is calculated according to the formula:
BU=GU+k×IAU, wherein BU represents the characteristic score, GU represents the concern score, IAU represents the interaction score, and k is a participation weighting for interaction behaviors. Similarly, participation weightings can be set according to respective behavior characteristics.
For the interest characteristic, an interest characteristic score is calculated according to the formula IU=GU+α×HU, where IU is the interest characteristic score, GU is the concern score, HU is the hub score in the interaction score, and α is the participation weighting of interaction behaviors in the interest characteristic. For the specialty characteristic, a specialty characteristic score is calculated according to the formula EU=GU+β×AU, where EU is the specialty characteristic score, GU is the concern score, AU is the authority score in the interaction score, and β is the participation weighting of interaction behaviors in the specialty characteristic. It is understood that, in this embodiment, the characteristic score of a user toward a specific concept is calculated at the concept level. In some embodiments without the domain hierarchy and the term-concept association matrix, however, the characteristic score of a user toward a specific key term can be calculated directly at the key term level.
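The formulas BU=GU+k×IAU, IU=GU+α×HU, and EU=GU+β×AU can be sketched in Python as follows. The function names and the sample weightings in the usage note are hypothetical and chosen only for illustration.

```python
def characteristic_score(concern, interaction, weighting):
    """BU = GU + k * IAU for one user and one concept (or key term)."""
    return concern + weighting * interaction


def interest_and_specialty(concern, hub, authority, alpha, beta):
    """Return (IU, EU), where IU = GU + alpha * HU and EU = GU + beta * AU."""
    interest = characteristic_score(concern, hub, alpha)
    specialty = characteristic_score(concern, authority, beta)
    return interest, specialty
```

For example, with a concern score of 0.5, a hub score of 0.25, an authority score of 0.75, α=2.0, and β=4.0, the sketch yields an interest characteristic score of 1.0 and a specialty characteristic score of 3.5.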
In this example, the use frequency of key terms corresponding to respective users is shown in
Interaction behaviors in
The invention measures behavior characteristics of users according to semantics and interaction behaviors in an interactive network environment. Service providers can develop and provide related enhanced applications based on the characteristic scores of users toward respective concepts.
Systems and methods for measuring behavior characteristics, or certain aspects or portions thereof, may take the form of program code (i.e., executable instructions) embodied in tangible media, such as floppy diskettes, CD-ROMS, hard drives, or any other machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine, such as a computer, the device thereby becomes an apparatus for practicing the methods. The methods may also be embodied in the form of program code transmitted over some transmission medium, such as electrical wiring or cabling, through fiber optics, or via any other form of transmission, wherein, when the program code is received and loaded into and executed by a machine, such as a computer, the device becomes an apparatus for practicing the disclosed methods. When implemented on a general-purpose processor, the program code combines with the processor to provide a unique apparatus that operates analogously to application specific logic circuits.
While the invention has been described by way of example and in terms of a preferred embodiment, it is to be understood that the invention is not limited thereto. Those skilled in this technology can still make various alterations and modifications without departing from the scope and spirit of this invention. Therefore, the scope of the invention shall be defined and protected by the following claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
95140026 A | Oct 2006 | TW | national |
Number | Date | Country | |
---|---|---|---|
20080103721 A1 | May 2008 | US |