Systems and methods for measuring behavior characteristics

Information

  • Patent Grant
  • 8055663
  • Patent Number
    8,055,663
  • Date Filed
    Wednesday, December 20, 2006
  • Date Issued
    Tuesday, November 8, 2011
Abstract
Systems and methods for measuring behavior characteristics. For at least one specific user, a first concern score for respective key terms is calculated according to use frequency of respective key terms of network content corresponding to the specific user and all users. A first relation matrix for at least one specific key term is calculated according to at least two users corresponding to respective interaction behaviors between the key terms and a type weighting corresponding to respective interaction behaviors. A first interaction score for the specific user regarding the specific key term is calculated according to the first relation matrix. A first characteristic score for the specific user regarding the specific key term is calculated according to the first concern score and the first interaction score.
Description
BACKGROUND OF THE INVENTION

1. Field of the Invention


The disclosure relates generally to measurement of behavior characteristics, and, more particularly to systems and methods that measure behavior characteristics of users according to semantics and interaction behaviors.


2. Description of the Related Art


With the expansion of the Internet, applications developed for users allow browsing and posting of comments via BBSs (Bulletin Board Systems). Users can publish articles via a specific web site or a dedicated web page. Currently, blog (web log) applications are popular, on which owners can publish material and reference material on other users' blogs. In this way, interaction behaviors between different users and/or articles are generated, implying behavior characteristics of users.


To strengthen loyalty and provide various enhanced services, service providers try to explore behavior characteristics of users from network content. US Application 2005/0108281 A1 analyzes email content in enterprises according to semantic hints using NLP (Natural Language Processing) technology to recognize domain experts. In US Application 2006/0053156 A1, interested and trustworthy experts are recognized according to publications and comment records on specific articles in enterprise article databases. Because these methods analyze only the semantics of email content or the behavior records of users, they cannot be applied to an open network environment, such as a blog environment, that has a large number of interaction behaviors. Additionally, since only related experts are recognized, behavior characteristics of users, such as personal interests, specialty, and other individual characteristics, still cannot be explored from network content for service providers to develop related enhanced applications for users.


BRIEF SUMMARY OF THE INVENTION

Systems and methods for measuring behavior characteristics are provided.


An embodiment of a system for measuring behavior characteristics comprises a database and a processing module. The database stores network content for a plurality of users, where the network content comprises a plurality of key terms and a plurality of interaction behaviors therebetween. For at least one specific user, the processing module calculates a first concern score for respective key terms according to use frequency of respective key terms corresponding to the specific user and use frequency of respective key terms corresponding to all users. The processing module calculates a first relation matrix for at least one specific key term according to at least two users corresponding to respective interaction behaviors and a type weighting corresponding to respective interaction behaviors. The processing module uses an algorithm to calculate a first interaction score for the specific user regarding the specific key term according to the first relation matrix. The processing module calculates a first characteristic score for the specific user regarding the specific key term according to the first concern score and the first interaction score.


In an embodiment of a method for measuring behavior characteristics, a database is provided. The database stores network content for a plurality of users, where the network content comprises a plurality of key terms, and a plurality of interaction behaviors therebetween. For at least one specific user, a first concern score for respective key terms is calculated according to use frequency of respective key terms corresponding to the specific user and use frequency of respective key terms corresponding to all users. A first relation matrix for at least one specific key term is calculated according to at least two users corresponding to respective interaction behaviors and a type weighting corresponding to respective interaction behaviors. A first interaction score for the specific user regarding the specific key term is calculated according to the first relation matrix using an algorithm. A first characteristic score for the specific user regarding the specific key term is calculated according to the first concern score and the first interaction score.


Systems and methods for measuring behavior characteristics may take the form of program code embodied in tangible media. When the program code is loaded into and executed by a machine, the machine becomes an apparatus for practicing the disclosed method.





BRIEF DESCRIPTION OF THE DRAWINGS

The invention will become more fully understood by referring to the following detailed description with reference to the accompanying drawings, wherein:



FIG. 1 is a schematic diagram illustrating an embodiment of a system for measuring behavior characteristics;



FIG. 2 is a schematic diagram illustrating an example of interaction behaviors among network content;



FIG. 3 is a flowchart of an embodiment of a method for measuring behavior characteristics;



FIG. 4 is a type weighting table for interaction behaviors;



FIG. 5 shows an example of behavior characteristic measurement;



FIG. 6 shows use frequency of key terms corresponding to users;



FIG. 7 shows the interaction behaviors in the example of FIG. 5;



FIGS. 8A, 8B and 8C show relation matrices corresponding to respective concepts, respectively; and



FIG. 9 shows characteristic scores for respective users regarding respective concepts.





DETAILED DESCRIPTION OF THE INVENTION

Systems and methods for measuring behavior characteristics are provided.



FIG. 1 illustrates an embodiment of a system for measuring behavior characteristics.


The system for measuring behavior characteristics 100 comprises a database 110, a domain hierarchy 120, a term-concept association matrix 130, and a processing module 140. It is understood that type weightings 150 corresponding to respective interaction behaviors and participation weightings 160 of interaction behaviors for characteristic score calculation can be set in the system.


The database 110 stores network content for users, such as network articles in a network interaction environment, particularly a blog environment. The network content can be fetched from the Internet via a data collection unit (not shown), or can be acquired via a data access interface provided by blog service providers. The network content comprises key terms, and interaction behaviors therebetween. The key terms may be tags and/or categories used to disclose the basic semantics of articles. Additionally, the interaction behaviors comprise comments, trackbacks, links, subscriptions, recommendations, blogrolls, and others. FIG. 2 illustrates an example of interaction behaviors among network content. As shown in FIG. 2, network content comprises blogs B1 and B2. In blog B1, user U1 publishes article A1 (201). In blog B2, user U2 publishes article A2 (202). Article A1 tracks back to article A2 (203), and links to article A2 (204). Article A1 further links to user U2 (205). Additionally, user U1 posts a comment on article A2 (206), and recommends article A2 (207). Further, user U2 blogrolls blog B1 of user U1 in blog B2 (208), and subscribes to articles of blog B1 of user U1 (209). The articles and related key terms in the network content, and the interaction behaviors between the network content, can be retrieved in advance for further processing.
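The interaction behaviors of FIG. 2 can be pictured as simple records. The following Python sketch is only an illustration: the `Interaction` class, the `OWNER` lookup, and the `user_pairs` helper are hypothetical names that do not appear in the disclosure, and the sketch assumes each article or blog resolves to the user who owns it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    kind: str    # behavior type, e.g. "trackback", "link", "comment"
    source: str  # initiating user or article
    target: str  # receiving user, article, or blog
    ref: int     # reference numeral used in FIG. 2


# The seven behaviors described for FIG. 2 (203 to 209).
FIG2_INTERACTIONS = [
    Interaction("trackback", "A1", "A2", 203),
    Interaction("link", "A1", "A2", 204),
    Interaction("link", "A1", "U2", 205),
    Interaction("comment", "U1", "A2", 206),
    Interaction("recommendation", "U1", "A2", 207),
    Interaction("blogroll", "U2", "B1", 208),
    Interaction("subscription", "U2", "B1", 209),
]

# Hypothetical ownership lookup: article/blog/user -> owning user.
OWNER = {"A1": "U1", "A2": "U2", "B1": "U1", "B2": "U2", "U1": "U1", "U2": "U2"}


def user_pairs(interactions):
    """Resolve each behavior to (initiating user, receiving user, kind)."""
    return [(OWNER[i.source], OWNER[i.target], i.kind) for i in interactions]


if __name__ == "__main__":
    for pair in user_pairs(FIG2_INTERACTIONS):
        print(pair)
```

Resolving every behavior to a pair of users is what later allows a user-by-user relation matrix to be built.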


The domain hierarchy 120 comprises a plurality of concepts and associations therebetween. The concepts may be from an Ontology, such as DMOZ, WordNet, or terms defined in a concept hierarchy. The term-concept association matrix 130 defines association degrees for respective key terms toward respective concepts. If m key terms and n defined concepts are provided, an m×n term-concept relation matrix M is generated, where Mij represents the association degree between the ith key term and the jth concept, and 0≦Mij≦1. It is understood that the term-concept association matrix 130 can be established in any manner. For example, tags and categories in network content are determined as key terms to be processed. A term association hierarchy map among the key terms is first established. A term that appears in both the term association hierarchy map and the Ontology is set as a connection point, and the association degrees of respective terms and respective concepts are calculated using related Ontology merging technologies to obtain the term-concept association matrix 130. The processing module 140 performs the methods for measuring behavior characteristics of the invention, as discussed later.



FIG. 3 is a flowchart of an embodiment of a method for measuring behavior characteristics.


In step S310, for at least one specific user, a concern score for respective concepts is calculated according to use frequency of respective key terms corresponding to the specific user, use frequency of respective key terms corresponding to all users, and the term-concept relation matrix M. It is understood that while the invention can evaluate for respective users, measurement of behavior characteristics for a specific user is provided in this embodiment for explanation purposes.


In this step, a vector of use frequency for key terms fU={f1, f2, . . . , fm} of length m is constructed for the specific user, where fi is the frequency with which the ith key term is used to represent article semantics in articles published by the specific user. In other words, fi represents use frequency of the ith key term corresponding to the specific user. Additionally, a vector of use frequency for key terms FALL={F1, F2, . . . , Fm} of length m is constructed, where Fi is the frequency with which the ith key term is used to represent article semantics in articles published by all users. In other words, Fi represents use frequency of the ith key term corresponding to all users. Thereafter, the use characteristic of key terms of the specific user relative to all users is calculated, and the characteristic is converted to the concept level to obtain a concern score vector GU for the specific user toward concepts (domains). The concern score vector GU is calculated as follows:








$$
G_U = \frac{f_U / |f_U|}{F_{ALL} / |F_{ALL}|} \times M \quad \text{if } |f_U| \neq 0; \qquad G_U \text{ is a 0 vector} \quad \text{if } |f_U| = 0,
$$




where GU={G1, G2, . . . , Gn}, and Gj represents the concern score of the jth concept corresponding to the specific user.
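The concern score calculation can be sketched in Python as follows. This is a minimal sketch rather than the claimed implementation. It assumes that the quotient of the two normalized frequency vectors is taken element by element (the reading that reproduces the numbers in the FIG. 6 example), that a key term never used by any user contributes nothing, and that M is supplied as a list of m rows of n association degrees. The function name `concern_score` and the sample values are hypothetical.

```python
import math


def concern_score(f_user, f_all, M):
    """Return G_U = ((f_U/|f_U|) / (F_ALL/|F_ALL|)) x M, or a 0 vector if |f_U| = 0.

    f_user, f_all: use-frequency vectors of length m (one entry per key term).
    M: m x n term-concept association matrix as a list of rows.
    """
    n_concepts = len(M[0])
    norm_user = math.sqrt(sum(x * x for x in f_user))
    if norm_user == 0:
        return [0.0] * n_concepts
    norm_all = math.sqrt(sum(x * x for x in f_all))
    ratio = []
    for fu, fa in zip(f_user, f_all):
        # Assumption: a term with zero overall frequency contributes nothing.
        ratio.append(0.0 if fa == 0 else (fu / norm_user) / (fa / norm_all))
    # Convert from key-term level to concept level via M.
    return [
        sum(ratio[i] * M[i][j] for i in range(len(ratio)))
        for j in range(n_concepts)
    ]


if __name__ == "__main__":
    # Hypothetical data: 3 key terms mapped onto 2 concepts.
    M = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
    print(concern_score([2, 0, 1], [4, 3, 2], M))
```

Normalizing both vectors before dividing means the score reflects how a user's mix of key terms differs from the overall mix, not how many articles the user has published.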


In step S320, a relation matrix for at least one specific concept is calculated according to users corresponding to respective interaction behaviors, a type weighting corresponding to respective interaction behaviors, and association degree for key terms used in the interaction behaviors toward the specific concept. Similarly, while the invention can calculate relation matrices for respective concepts, the calculation of relation matrix for a specific concept is provided in this embodiment for explanation purposes.


As described, type weightings 150 corresponding to respective interaction behaviors can be set. FIG. 4 shows a type weighting table 400 for interaction behaviors. In this example, the behavior characteristics comprise interest, participation, specialty and popularity, and respective interaction behaviors have different type weightings for respective behavior characteristics. In this example, the type weightings of trackback to interest, participation, specialty and popularity are 0.9, 0.6, 0.9 and 0.6, respectively. The type weightings of blogroll to interest, participation, specialty and popularity are 0.4, 0.7, 0.4 and 0.7, respectively. The type weightings of link to interest, participation, specialty and popularity are 0.5, 0.7, 0.5 and 0.7, respectively. The type weightings of subscription to interest, participation, specialty and popularity are 0.8, 0.5, 0.8 and 0.5, respectively. The type weightings of comment to interest, participation, specialty and popularity are 0.4, 0.6, 0.4 and 0.6, respectively. It is understood that the type weighting table 400 is an example, and the disclosure is not limited thereto.
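For illustration, the example values of type weighting table 400 can be held in a nested mapping. The Python sketch below uses the figures quoted in the paragraph above; the names `TYPE_WEIGHTS` and `type_weight` are hypothetical and not part of the disclosure.

```python
# Example type weightings from table 400: behavior -> characteristic -> weighting.
TYPE_WEIGHTS = {
    "trackback":    {"interest": 0.9, "participation": 0.6, "specialty": 0.9, "popularity": 0.6},
    "blogroll":     {"interest": 0.4, "participation": 0.7, "specialty": 0.4, "popularity": 0.7},
    "link":         {"interest": 0.5, "participation": 0.7, "specialty": 0.5, "popularity": 0.7},
    "subscription": {"interest": 0.8, "participation": 0.5, "specialty": 0.8, "popularity": 0.5},
    "comment":      {"interest": 0.4, "participation": 0.6, "specialty": 0.4, "popularity": 0.6},
}


def type_weight(behavior, characteristic):
    """Look up the type weighting S for one behavior and one characteristic."""
    return TYPE_WEIGHTS[behavior][characteristic]


if __name__ == "__main__":
    print(type_weight("trackback", "interest"))
```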


In this embodiment, each interaction behavior is represented as (UA, UB, S, IC). UA and UB are the two users in an interaction behavior, where UA is the user initiating the interaction behavior, and UB is the user receiving the interaction behavior. S is the type weighting of the interaction behavior. IC is a semantic concept involved in the interaction behavior, where IC is represented by (CN, AD), CN is the concept name, and AD is the association degree of the key term toward the concept. It is noted that several semantic concepts may be involved in an interaction behavior. The relation matrix corresponding to a specific concept is calculated as follows:







$$
R_{ij} = S \times AD, \quad UA = i \text{ and } UB = j,
$$







wherein Rij represents interaction relation strength for the ith user toward the jth user under the specific concept.
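The construction of a relation matrix from (UA, UB, S, IC) records can be sketched in Python as follows. The sketch makes one assumption that the text does not state explicitly: when several interaction behaviors connect the same pair of users under the same concept, their S×AD products are added together. The function name `relation_matrix`, the user labels, and the sample records are hypothetical.

```python
def relation_matrix(users, interactions, concept):
    """Build R for one concept, where R[i][j] is the strength from user i to user j.

    interactions: iterable of (UA, UB, S, ICs), where ICs is a list of
    (CN, AD) pairs naming each concept involved and its association degree.
    """
    index = {user: k for k, user in enumerate(users)}
    R = [[0.0] * len(users) for _ in users]
    for ua, ub, s, concepts in interactions:
        for cn, ad in concepts:
            if cn == concept:
                # Assumption: repeated behaviors between the same pair accumulate.
                R[index[ua]][index[ub]] += s * ad
    return R


if __name__ == "__main__":
    users = ["u1", "u2", "u3"]
    records = [
        ("u2", "u1", 0.4, [("music", 1.0)]),                # e.g. a comment
        ("u3", "u1", 0.9, [("music", 0.5), ("art", 1.0)]),  # e.g. a trackback
    ]
    for row in relation_matrix(users, records, "music"):
        print(row)
```

Because UA indexes the row and UB the column, the matrix is directed: a comment from one user on another's article strengthens only the commenter's row.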


In step S330, at least one interaction score for the specific user regarding the specific concept is calculated according to the relation matrix using an algorithm such as the HITS (Hypertext-Induced Topic Search) algorithm. In this embodiment, a hub score and an authority score are obtained by the HITS algorithm. The HITS algorithm takes a graph relation matrix as input and assigns a hub value and an authority value to each node, where the hub value represents the strength of the node's outward connections, and the authority value represents the strength of the node's inbound connections. The HITS algorithm is well known, and its details are omitted here. The interaction score for the specific user regarding the specific concept can be calculated according to the relation matrix corresponding to the specific concept using the HITS algorithm.
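Since the details of HITS are omitted, the following Python sketch shows one common formulation, a power iteration over a weighted relation matrix. It is an illustration only: the Euclidean normalization and the fixed iteration count are assumptions, and the disclosure's own examples appear to scale the resulting vectors differently. The function name `hits` is hypothetical.

```python
import math


def hits(R, iterations=50):
    """Return (hub, authority) score lists for a weighted relation matrix R.

    R[i][j] is the strength of the connection from node i to node j.
    """
    n = len(R)
    hub = [1.0] * n
    auth = [1.0] * n
    for _ in range(iterations):
        # Authority of j: weighted sum of hub scores of nodes pointing to j.
        auth = [sum(R[i][j] * hub[i] for i in range(n)) for j in range(n)]
        norm = math.sqrt(sum(a * a for a in auth)) or 1.0
        auth = [a / norm for a in auth]
        # Hub of i: weighted sum of authority scores of nodes i points to.
        hub = [sum(R[i][j] * auth[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(h * h for h in hub)) or 1.0
        hub = [h / norm for h in hub]
    return hub, auth


if __name__ == "__main__":
    # Hypothetical graph: nodes 1 and 2 both point to node 0.
    R = [[0.0, 0.0, 0.0], [0.4, 0.0, 0.0], [0.9, 0.0, 0.0]]
    hub, auth = hits(R)
    print(hub, auth)
```

In this formulation only the relative sizes of the scores are meaningful, so any consistent rescaling of the hub or authority vector carries the same information.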


In step S340, a characteristic score for the specific user regarding the specific concept is calculated according to the formula:


BU=GU+k×IAU, wherein BU represents the characteristic score, GU represents the concern score, IAU represents the interaction score, and k is a participation weighting for interaction behaviors. Similarly, participation weightings can be set according to respective behavior characteristics.


For the interest characteristic, an interest characteristic score is calculated following the formula: IU=GU+α×HU, where IU is the interest characteristic score, GU is the concern score, HU is the hub score in the interaction score, and α is the participation weighting for the interaction behaviors as a whole in the interest characteristic. For the specialty characteristic, a specialty characteristic score is calculated following the formula: EU=GU+β×AU, where EU is the specialty characteristic score, GU is the concern score, AU is the authority score in the interaction score, and β is the participation weighting for the interaction behaviors as a whole in the specialty characteristic. It is understood that, in this embodiment, the characteristic score for a user toward a specific concept is calculated at the concept level. However, in some embodiments without the domain hierarchy and the term-concept association matrix, the characteristic score for a user toward a specific key term can be calculated directly at the key term level.
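The three formulas share one shape, so a single helper covers them. The Python sketch below assumes that the concern scores and the interaction scores of one user are both given per concept, in the same concept order; the function names and the sample values are hypothetical.

```python
def characteristic_score(concern, interaction, k):
    """B_U = G_U + k x IA_U, evaluated concept by concept for one user."""
    return [g + k * ia for g, ia in zip(concern, interaction)]


def interest_score(concern, hub, alpha):
    """I_U = G_U + alpha x H_U (hub score as the interaction score)."""
    return characteristic_score(concern, hub, alpha)


def specialty_score(concern, authority, beta):
    """E_U = G_U + beta x A_U (authority score as the interaction score)."""
    return characteristic_score(concern, authority, beta)


if __name__ == "__main__":
    # Hypothetical per-concept scores for one user over two concepts.
    print(interest_score([1.0, 2.0], [0.5, 0.0], 0.5))
    print(specialty_score([1.0, 2.0], [0.0, 1.0], 0.5))
```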



FIG. 5 shows an example of behavior characteristic measurement. As shown, user A publishes an article 510. Article 510 has key terms “Travel”, “Taiwan”, and “Culture”, and a link (501) linking to an article 520 published by user D. Article 520 has key terms “Taiwan” and “Culture”. Additionally, user B publishes an article 530 to comment (502) on article 510 published by user A. User C publishes an article 540, which references (Trackback) (503) article 510 published by user A. In this example, it is assumed that the key terms are defined concepts in the domain hierarchy.


In this example, the use frequency of key terms corresponding to respective users is shown in FIG. 6, where user A had used “Travel”, “Taiwan” and “Culture” once, respectively. User D had used “Taiwan” and “Culture” once, respectively. Therefore, fA=(1, 1, 1), |fA|=√3, fD=(0, 1, 1), |fD|=√2, and FALL=(1, 2, 2), |FALL|=3, and GA=(1.73, 0.87, 0.87), GD=(0, 1.06, 1.06), and GB=GC=(0, 0, 0) are obtained according to the formula for concern score vector GU and the term-concept relation matrix M.
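The arithmetic behind these values can be written out as follows. The derivation assumes that M is the 3×3 identity matrix, since the key terms in this example are themselves the defined concepts, and that the quotient of the normalized vectors is taken element by element; both readings are assumptions chosen because they reproduce the stated results.

$$
\frac{f_A}{|f_A|} = \frac{1}{\sqrt{3}}(1, 1, 1), \qquad
\frac{f_D}{|f_D|} = \frac{1}{\sqrt{2}}(0, 1, 1), \qquad
\frac{F_{ALL}}{|F_{ALL}|} = \frac{1}{3}(1, 2, 2)
$$

$$
G_A = \left( \frac{3}{\sqrt{3}}, \frac{3}{2\sqrt{3}}, \frac{3}{2\sqrt{3}} \right) \approx (1.73, 0.87, 0.87), \qquad
G_D = \left( 0, \frac{3}{2\sqrt{2}}, \frac{3}{2\sqrt{2}} \right) \approx (0, 1.06, 1.06)
$$

Users B and C have not used any key term, so |fB|=|fC|=0 and their concern score vectors are 0 vectors.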


Interaction behaviors in FIG. 5 are shown in FIG. 7. Relation matrices corresponding to “Travel”, “Taiwan” and “Culture” are generated according to the interaction behaviors in FIG. 7, and respectively shown in FIGS. 8A, 8B and 8C. Thereafter, interaction score vectors corresponding to respective relation matrices are respectively calculated according to the relation matrices in FIGS. 8A, 8B and 8C using HITS algorithm, where the authority score vector and hub score vector corresponding to the relation matrix in FIG. 8A are A=(1, 0, 0, 0) and H=(0, 1, 2.25, 0), the authority score vector and hub score vector corresponding to the relation matrix in FIG. 8B are A=(1, 0, 0, 0) and H=(0, 1, 2.25, 0), and the authority score vector and hub score vector corresponding to the relation matrix in FIG. 8C are A=(1, 0, 0, 0) and H=(0, 0, 1, 0). It is noted that the representation of the interaction score vector is the interaction scores of respective users toward a specific concept. Finally, the characteristic scores of respective users toward respective concepts are obtained according to the concern scores and the interaction scores, as shown in FIG. 9. It is understood that the participation weightings for respective behavior characteristics are both 0.5 in this example.
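As a hedged check on the vectors quoted for FIG. 8A, the “Travel” case can be worked by hand. The figures themselves are not reproduced here, so the following is an assumption-based reconstruction: users are ordered (A, B, C, D), the interest-column type weightings apply (comment 0.4, trackback 0.9), and AD=1 because “Travel” is both a key term and a concept. Under those assumptions only the comment from B and the trackback from C involve “Travel”, both directed at A:

$$
R^{Travel} =
\begin{pmatrix}
0 & 0 & 0 & 0 \\
0.4 & 0 & 0 & 0 \\
0.9 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}, \qquad
A = (1, 0, 0, 0), \qquad
R^{Travel} A^{T} = (0, 0.4, 0.9, 0)^{T}
$$

Scaling the result so that user B's entry equals 1 gives (0, 1, 2.25, 0), which agrees with the hub score vector stated for FIG. 8A.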


The invention measures behavior characteristics of users according to semantics and interaction behaviors in an interactive network environment. Service providers can develop and provide related enhanced application based on characteristic scores of users toward respective concepts.


Systems and methods for measuring behavior characteristics, or certain aspects or portions thereof, may take the form of program code (i.e., executable instructions) embodied in tangible media, such as floppy diskettes, CD-ROMS, hard drives, or any other machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine, such as a computer, the device thereby becomes an apparatus for practicing the methods. The methods may also be embodied in the form of program code transmitted over some transmission medium, such as electrical wiring or cabling, through fiber optics, or via any other form of transmission, wherein, when the program code is received and loaded into and executed by a machine, such as a computer, the device becomes an apparatus for practicing the disclosed methods. When implemented on a general-purpose processor, the program code combines with the processor to provide a unique apparatus that operates analogously to application specific logic circuits.


While the invention has been described by way of example and in terms of preferred embodiment, it is to be understood that the invention is not limited thereto. Those who are skilled in this technology can still make various alterations and modifications without departing from the scope and spirit of this invention. Therefore, the scope of the invention shall be defined and protected by the following claims and their equivalents.

Claims
  • 1. A system for measuring behavior characteristics stored in a non-transitory machine-readable storage medium, comprising: a domain hierarchy comprising a plurality of concepts and associations therebetween; a database storing network content for a plurality of users, where the network content comprises a plurality of key terms, and a plurality of interaction behaviors therebetween, wherein each of the interaction behaviors involves at least two users, and has a corresponding type weighting, in which different interaction behaviors have different type weightings; a term-concept association matrix recording association degrees for respective key terms toward respective concepts; and a processing module, for at least one specific user, calculating a concern score for respective concepts according to use frequency of respective key terms corresponding to the specific user, use frequency of respective key terms corresponding to all users, and the term-concept association matrix, calculating a relation matrix for at least one specific concept according to the at least two users corresponding to respective interaction behaviors, the type weighting corresponding to respective interaction behaviors, and the association degree for at least one key term used in the interaction behaviors toward the specific concept, calculating an interaction score for the specific user regarding the specific concept according to the relation matrix using an algorithm, and calculating a characteristic score for the specific user regarding the specific concept according to the concern score and the interaction score, wherein the processing module calculates the relation matrix according to the formula, wherein “x” in the formula means a cross product:
  • 2. The system of claim 1 wherein the network content comprises a plurality of network articles.
  • 3. The system of claim 2 wherein the key terms comprise at least a tag or category of respective network articles.
  • 4. The system of claim 2 wherein the interaction behaviors comprise comments, trackbacks, links, subscriptions, recommendations or blogrolls.
  • 5. The system of claim 1 wherein the processing module calculates the concern score according to the formula, wherein “x” in the formula means a cross product:
  • 6. The system of claim 1 wherein the algorithm comprises a HITS (Hypertext-Induced Topic Search) algorithm.
  • 7. The system of claim 6 wherein the processing module calculates the characteristic score according to the formula, wherein “x” in the formula means a cross product: BU=GU+k×IAU, wherein BU represents the characteristic score, GU represents the concern score, IAU represents the interaction score, and k is a participation weighting for the interaction behaviors.
  • 8. The system of claim 7 wherein the interaction score comprises a hub score or an authority score.
  • 9. A computer-implemented method for measuring behavior characteristics for use in a computer, wherein the computer is programmed to perform the steps of: providing a domain hierarchy comprising a plurality of concepts and associations therebetween; providing network content for a plurality of users, where the network content comprises a plurality of key terms, and a plurality of interaction behaviors therebetween, wherein each of the interaction behaviors involves at least two users, and has a corresponding type weighting, in which different interaction behaviors have different type weightings; providing a term-concept association matrix recording association degrees for respective key terms toward respective concepts; for at least one specific user, calculating a concern score for respective concepts according to use frequency of respective key terms corresponding to the specific user, use frequency of respective key terms corresponding to all users, and the term-concept association matrix; calculating a relation matrix for at least one specific concept according to the at least two users corresponding to respective interaction behaviors, the type weighting corresponding to respective interaction behaviors, and the association degree for at least one key term used in the interaction behaviors toward the specific concept; calculating an interaction score for the specific user regarding the specific concept according to the relation matrix using an algorithm; and calculating a characteristic score for the specific user regarding the specific concept according to the concern score and the interaction score, wherein the relation matrix is calculated according to the formula, wherein “x” in the formula means a cross product:
  • 10. The method of claim 9 wherein the network content comprises a plurality of network articles.
  • 11. The method of claim 10 wherein the key terms comprise at least a tag or category of respective network articles.
  • 12. The method of claim 10 wherein the interaction behaviors comprise comments, trackbacks, links, subscriptions, recommendations or blogrolls.
  • 13. The method of claim 9 further comprising calculating the concern score according to the formula, wherein “x” in the formula means a cross product:
  • 14. The method of claim 9 wherein the algorithm comprises a HITS (Hypertext-Induced Topic Search) algorithm.
  • 15. The method of claim 14 further comprising calculating the characteristic score according to the formula, wherein “x” in the formula means a cross product: BU=GU+k×IAU, wherein BU represents the characteristic score, GU represents the concern score, IAU represents the interaction score, and k is a participation weighting for the interaction behaviors.
  • 16. The method of claim 15 wherein the interaction score comprises a hub score or an authority score.
  • 17. A non-transitory machine-readable storage medium comprising a computer program, which, when executed, causes a device to perform a method for measuring behavior characteristics, the method comprising: providing a domain hierarchy comprising a plurality of concepts and associations therebetween; providing network content for a plurality of users, where the network content comprises a plurality of key terms, and a plurality of interaction behaviors therebetween, wherein each of the interaction behaviors involves at least two users, and has a corresponding type weighting, in which different interaction behaviors have different type weightings; providing a term-concept association matrix recording association degrees for respective key terms toward respective concepts; for at least one specific user, calculating a concern score for respective concepts according to use frequency of respective key terms corresponding to the specific user, use frequency of respective key terms corresponding to all users, and the term-concept association matrix; calculating a relation matrix for at least one specific concept according to the at least two users corresponding to respective interaction behaviors, the type weighting corresponding to respective interaction behaviors, and the association degree for at least one key term used in the interaction behaviors toward the specific concept; calculating an interaction score for the specific user regarding the specific concept according to the relation matrix using an algorithm; and calculating a characteristic score for the specific user regarding the specific concept according to the concern score and the interaction score, wherein the relation matrix is calculated according to the formula, wherein “x” in the formula means a cross product:
Priority Claims (1)
Number Date Country Kind
95140026 A Oct 2006 TW national
Related Publications (1)
Number Date Country
20080103721 A1 May 2008 US