According to Metcalfe's Law, the value of a network grows in proportion to the square of the number of nodes in the network. This premise holds true for people networks as well as digital networks. Reed's law further suggests that communities are composed of all the combinations of groups that can be formed within the overall population. Extracting the network value, however, can be a significant challenge. For instance, in an organization such as a medium or large corporation, much of the knowledge of the organization may be held by individuals, who may be considered subject matter experts (SMEs). When members of an organization need to solve a problem, they seek out SMEs, typically relying on their own personal networks, or extending to their associates' networks. It is often the case that there is a relevant SME with the necessary knowledge, but that expert is outside the set of personal contacts reachable by the person seeking the knowledge. The knowledge or expertise of the SME is, therefore, not leveraged, and the optimal solution is either not achieved, or achieved at a greater cost and over a longer time. Also, as technologies develop and become more complex, solving a problem often requires the involvement of multiple experts from different disciplines. This requirement is often hindered by the typical organizational hierarchies, which limit the contacts among the right people, who might not even know of each other's existence. Additionally, the faster pace of business and global competition requires faster development of solutions, further underscoring the need for quickly connecting the right people to address an opportunity.
In embodiments of the invention described below, a framework for developing and exploring a community network based on intellectual capital is provided. The network development framework is especially useful for connecting people in an organization based on the digital content objects they have generated. The framework includes an analytic approach for identifying related content objects and connecting people with similar interests or expertise via the connections between the content objects authored by the people. Content objects, as a form of digital assets of the organization in which the community network is to be built, may be in various forms. For instance, content objects may include white papers, patents, invention disclosures, technical reports, emails, etc. In some embodiments, concepts representing fields of expertise or interests may be automatically inferred from the content objects. The premise is that people implicitly report on their expertise or interests in the documents they create and in their communications. In this regard, the frequency of such references to the concepts pertaining to the expertise or interests may be reliably used to indicate how strongly the individuals are associated with the concepts.
Referring now to
Turning now to
By way of example.
There are various ways of evaluating the similarities among digital documents within a corpus. Based on a taxonomy, which can be manually constructed or automatically derived from the documents, each document can be fully or partially associated with various concepts. One document similarity assessment method is the Vector Space Model (VSM). Under VSM, each document is represented as a vector in the space of all available words. The ith entry of the vector holds the number of times the ith word appears in the document. All the document vectors form a document matrix D (see
Another similarity evaluation method, which is a modification of the VSM method, is Latent Semantic Indexing (LSI) or Latent Semantic Analysis (LSA). LSA computes the singular vectors that correspond to the largest singular values of the matrix that includes all documents represented as columns using VSM. Then, a new representation of a document is formed by calculating its projections onto those first singular vectors. The similarity between two documents is defined as the cosine distance between the two document vectors represented as projections onto the first singular vectors.
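The VSM and LSA steps described above can be sketched as follows. This is a minimal illustration and not the implementation of any embodiment: the whitespace tokenizer, the toy documents, and the function names `term_count_matrix`, `lsa_projections`, and `cosine` are assumptions introduced here, and NumPy is assumed to be available.

```python
import numpy as np

def term_count_matrix(docs):
    """Build a VSM matrix: one row per word, one column per document."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    D = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            D[index[w], j] += 1  # count of word i in document j
    return D, vocab

def lsa_projections(D, m):
    """Project each document column onto the m leading left singular vectors."""
    U, _, _ = np.linalg.svd(D, full_matrices=False)
    return U[:, :m].T @ D  # shape (m, n_docs)

def cosine(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical three-document corpus, for illustration only.
docs = [
    "graph network analysis of social graph",
    "social network graph analysis",
    "laser printer toner cartridge",
]
D, vocab = term_count_matrix(docs)
P = lsa_projections(D, m=2)
sim_01 = cosine(P[:, 0], P[:, 1])  # two documents on the same topic
sim_02 = cosine(P[:, 0], P[:, 2])  # documents with no words in common
```

In this toy corpus the first two documents share most of their vocabulary and the third shares none, so `sim_01` comes out much larger than `sim_02`.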
Another embodiment of the invention utilizes a document similarity method that leverages the idea of LSI, and enhances it with semantic topics computed by a Principal Atoms Recognition In Sets (PARIS) approach. The PARIS approach handles words as sets. Given a large number of sets, PARIS detects principal sets of elements that tend to frequently appear together in the data. The PARIS approach allows non-exact repetitions of the detected patterns in the data, and allows additional elements in the input sets that are not covered by any of the detected sets. Applying PARIS to the documents in the corpus results in sets of words that tend to appear together in many documents. These sets of words could be used to represent “concepts” discussed in the documents in the given corpus.
For the similarity calculation, the corpus of documents is represented as a binary matrix D, such that each document appears as a column, $\{D_i\}_{i=1}^{n}$. An entry $D_i(j)$ equals 1 if the word j appears in the document i. As in LSI, the first M singular vectors of this matrix, corresponding to the largest singular values, are computed and denoted by $\{L_m\}_{m=1}^{M}$. A representation $P_i$ of the ith document over the singular vectors is computed by projecting the relevant column on those singular vectors, resulting in M coefficients $P_i(m)$. In addition, the PARIS analysis is applied to the representing matrix D, which results in sets of words $\{A_i\}_{i=1}^{K}$ that frequently appear together. Each such set of words $A_i$ is referred to as an atom, where $A_i(j)=1$ if the jth word is included in the ith atom. For illustration,
In one embodiment, document similarity is computed as the cosine distance between the vectors that represent the documents over the latent concepts and the atoms. Specifically, first, the average support of each atom AS over the whole corpus is computed by
An element $AS(j)$ is the average over all documents in the corpus of the ratio of words from the jth atom that appear in the documents, raised to the power of τ. The relative atoms' frequency of the ith document, $RF_i$, is defined as the relative support of all atoms in the ith document, computed by
A representation of each document in the corpus is defined by $R_i=[P_i, \rho \cdot RF_i]$, where $P_i$ is the LSI projection, and ρ is the constant that specifies the weight ratio between the LSI coefficients and the PARIS support. The similarity between documents i and j in the corpus is then computed as the cosine distance between the two representations,
The similarity computation may be updated whenever the document corpus evolves so as to take into account the new items. It should be noted that the similarity computation described above is only one approach to evaluating the similarity (or relevance) between two documents in a given corpus, and the invention may be implemented using other methods of similarity computation to link content objects in the intellectual capital graph.
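As a rough sketch of how the combined LSI and atom representation might be assembled, the code below follows the verbal definitions given above. The atoms are supplied by hand, since the PARIS detection step itself is not reproduced here. The exact form of the relative atom frequency is an assumption (each document's atom support divided by the corpus-average support), and the function names and toy matrices are hypothetical.

```python
import numpy as np

def atom_support(D, A, tau=1.0):
    """support[i, j] = (fraction of atom j's words present in document i) ** tau.

    D: binary matrix (n_words, n_docs); A: binary matrix (n_atoms, n_words).
    """
    hits = A @ D                          # atom words found in each document
    sizes = A.sum(axis=1, keepdims=True)  # number of words in each atom
    return ((hits / sizes) ** tau).T      # shape (n_docs, n_atoms)

def representations(D, A, m, rho=1.0, tau=1.0):
    """Stack LSI coefficients and weighted atom frequencies: R_i = [P_i, rho * RF_i]."""
    U, _, _ = np.linalg.svd(D, full_matrices=False)
    P = (U[:, :m].T @ D).T                # (n_docs, m) LSI coefficients
    S = atom_support(D, A, tau)
    AS = S.mean(axis=0)                   # average support of each atom over the corpus
    # ASSUMPTION: relative frequency = document support / average support.
    RF = S / np.where(AS > 0, AS, 1.0)
    return np.hstack([P, rho * RF])

def cosine_matrix(R):
    """Pairwise cosine between the rows of R."""
    norms = np.linalg.norm(R, axis=1, keepdims=True)
    unit = R / np.where(norms > 0, norms, 1.0)
    return unit @ unit.T

# Hypothetical corpus: 5 words, 3 documents, 2 hand-written atoms.
D = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 1, 0],
              [0, 0, 1],
              [0, 0, 1]], dtype=float)
A = np.array([[1, 1, 0, 0, 0],
              [0, 0, 0, 1, 1]], dtype=float)
R = representations(D, A, m=2, rho=0.5)
S = cosine_matrix(R)
```

Here ρ is exposed as `rho` so that the balance between the LSI coefficients and the atom support can be tuned. Re-running `representations` on the enlarged matrix is the simplest way to refresh the similarities when the corpus changes.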
Once the IC graph is constructed, information regarding social networking inside the organization can be derived using the graph. Interests can be inferred through content objects produced by the individuals. People are related to other people and/or concepts via paths on the IC graph that go through the content objects. In other words, people are connected to each other and to concepts by means of the content objects they created, and one person is related to another if they create similar content.
In some embodiments of the invention, an interest flow analysis is applied to the IC graph to answer networking questions or queries related to the intellectual capital of the organization. For example, the networking questions may be: “Who is relevant to me in terms of common interests or expertise?”, “Who are the experts on the topics represented by documents X, Y, and Z?”, etc. The interest flow computation starts from a “focus node” or a set of “focus nodes,” and propagates along a path or paths to a “query node.” By way of illustration.
As mentioned above, each node or each edge may be assigned a certain weight, and the interest flow from one node to others can take into account the weights. The functional dependence on the weight of each edge or node passed in the interest flow process can be selected depending on the type of edge or node, and may be adjusted based on the data being analyzed. For instance, when the interest flows through an edge, the weight of the edge may function as a simple multiplier to the interest flow. Alternatively, as an example, the edge weight to the Nth power may be used as a multiplier. This tends to have the effect of magnifying the differences in the weights of edges, and may be useful for differentiating the edge connections when their weights are similar. Other types of functional dependence may be chosen based on the nature of the edge and other factors.
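The effect of raising an edge weight to the Nth power can be seen in a short sketch. The function name `edge_multiplier` and the sample weights are hypothetical and serve only to illustrate the two functional forms mentioned above.

```python
def edge_multiplier(weight, power=1.0):
    """Multiplier applied to interest crossing an edge.

    power=1 uses the weight directly; power=N uses the weight to the Nth power.
    """
    return weight ** power

# Two edges with similar weights (illustrative values).
weights = [0.50, 0.45]
linear = [edge_multiplier(w) for w in weights]
cubed = [edge_multiplier(w, power=3) for w in weights]

# Ratio between the stronger and the weaker edge under each form.
ratio_linear = linear[0] / linear[1]
ratio_cubed = cubed[0] / cubed[1]
```

With these values the ratio between the two edges grows from about 1.11 to about 1.37 when the cube is used, which is the magnifying effect described above.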
For example, in the graph of
To illustrate how the interest flow process is used to compute relevancy between two people, a simple numerical example is provided below with reference to
The interest then flows from each of these content nodes to other content nodes through the similarity edges. In this example, the content node 152 is connected to the content node 157 by a similarity edge with a weight of 0.4, and the content node 153 is connected to the content node 157 by a similarity edge with a weight of 0.25. The interest flowing from Tim to the content node 157 is 0.2*0.4+0.1*0.25=0.105. The content node 152 is also connected to the content node 158 via a similarity edge with a weight of 0.125, so the interest reaching the content node 158 is 0.2*0.125=0.025. Both content objects 157 and 158 are authored by Mey, and are connected to the people node 170 for Mey by author edges with weights of 0.66 and 0.33, respectively. The total interest that has flowed from Tim to Mey is then 0.105*0.66+0.025*0.33≈0.078. Thus, the relevance of Mey to Tim is indicated by the value 0.078. This interest can further flow from Mey to his manager Ruth via the manager edge 172, which has a weight of 0.33 (1 divided by the three individuals reporting to Ruth). As a result, the interest flow from Tim via Mey to Ruth is 0.078*0.33≈0.026.
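The arithmetic of this example can be reproduced with a small hop-by-hop propagation sketch. The dictionary-based graph and the `propagate` helper are assumptions made for illustration. The 0.125 similarity edge is attached to the content node that carries an interest of 0.2, as the term 0.2*0.125 implies, and edge weights act as simple multipliers.

```python
def propagate(interest, edges):
    """Push interest one hop: each target receives sum(source interest * edge weight)."""
    out = {}
    for (src, dst), weight in edges.items():
        if src in interest:
            out[dst] = out.get(dst, 0.0) + interest[src] * weight
    return out

# Interest already placed on Tim's content nodes (values from the example).
tim_content = {"152": 0.2, "153": 0.1}

similarity_edges = {
    ("152", "157"): 0.4,
    ("153", "157"): 0.25,
    ("152", "158"): 0.125,  # ASSUMPTION: source is the node holding 0.2 of interest
}
author_edges = {("157", "Mey"): 0.66, ("158", "Mey"): 0.33}
manager_edges = {("Mey", "Ruth"): 0.33}

mey_content = propagate(tim_content, similarity_edges)  # interest on nodes 157 and 158
mey = propagate(mey_content, author_edges)              # relevance of Mey to Tim
ruth = propagate(mey, manager_edges)                    # interest reaching Ruth
```

Rounded to three decimals, the sketch gives 0.078 for Mey and 0.026 for Ruth, in line with the values in the example.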
The interest flow analysis on the IC graph can be the foundation of many different types of social networking tools. For instance, a tool may be provided to suggest a list of people in the organization that a focus person may be interested in talking to or collaborating with. This list may be compiled, for example, by applying the interest flow analysis to compute values of relevancy of other people to the focus person. A selected number (e.g., 10) of people with the highest relevancy scores may be identified, and a filter may be applied so that people that the focus person already knows well, such as the coauthors, manager, and direct reports of the focus person, are not included in the list. This list of people with common interests or expertise may then be presented to the focus person. In this regard, graphical user interface applications may be employed to assist the user in visualizing the networking information and further exploring the network.
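A list of this kind might be compiled along the following lines. The function name `suggest_people`, the score dictionary, and the names "Ana" and "Lee" are hypothetical. The sketch removes already-known people before taking the top entries, which is one possible ordering of the two steps and keeps the list at full length.

```python
def suggest_people(relevancy, already_known, k=10):
    """Return up to k people with the highest relevancy, skipping known contacts."""
    candidates = [
        (person, score)
        for person, score in relevancy.items()
        if person not in already_known and score > 0
    ]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [person for person, _ in candidates[:k]]

# Relevancy of other people to the focus person (illustrative scores).
scores = {"Mey": 0.078, "Ruth": 0.026, "Ana": 0.110, "Lee": 0.041}
known = {"Ana"}  # e.g. a coauthor of the focus person
suggestions = suggest_people(scores, known, k=2)
```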
For example,
In one embodiment, a tool (e.g., the matching engine 240 in
To compile such a list, the tool first computes the interest score I0 between each pair of conference attendees, as described above. Next, the tool sets to zero the interest values between people with the same second-level manager and between coauthors. The interest score between persons x and y is then turned into a symmetric score by defining I(x,y)=I0(x,y)+I0(y,x). Doing so reflects the assumption that the benefit to the organization of introducing persons x and y to each other is the sum of the interests that flow from x to y and from y to x. The interest matrix I for the conference attendees is now a symmetric N×N matrix representing a clique graph with weighted edges, where the edge between people nodes i and j reflects the “potential benefit” for the organization from introducing these two persons. The tool then generates the suggested attendee introductions list using the interest matrix. This is done by detecting a sub-graph that consists of all nodes, in which each node has an out-degree of K, and that results in the maximal benefit for the organization. The individual list of K suggested introductions for each conference attendee can be sent, for example, by email to that attendee prior to the conference, so that the attendee can contact the people on the list and make plans to meet them at the conference. It should be noted that this approach of suggesting pairwise introductions is not limited to meeting people at conferences and can be applied to various contexts involving social gatherings. For example, it can be used for dinner placement, grouping for crowdsourcing, etc.
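The steps above can be sketched as follows. The sketch reads the out-degree constraint literally, so each attendee simply receives the K partners with the highest symmetric score. A stricter reading that requires every introduction to be mutual would need a different solver, which is not shown. The function name `introduction_lists`, the 4×4 score matrix, and the excluded pair are hypothetical, and NumPy is assumed.

```python
import numpy as np

def introduction_lists(I0, excluded_pairs, k):
    """Build the symmetric interest matrix and a top-k introduction list per attendee."""
    I0 = np.array(I0, dtype=float)
    for x, y in excluded_pairs:       # e.g. coauthors or same second-level manager
        I0[x, y] = 0.0
        I0[y, x] = 0.0
    I = I0 + I0.T                     # I(x, y) = I0(x, y) + I0(y, x)
    np.fill_diagonal(I, 0.0)          # nobody is introduced to themselves
    order = np.argsort(-I, axis=1, kind="stable")
    lists = []
    for person, row in enumerate(order):
        picks = [int(j) for j in row if j != person and I[person, j] > 0][:k]
        lists.append(picks)
    return I, lists

# Directed interest scores between four attendees (illustrative values).
I0 = [
    [0.0, 0.3, 0.1, 0.2],
    [0.1, 0.0, 0.4, 0.0],
    [0.2, 0.1, 0.0, 0.5],
    [0.0, 0.3, 0.1, 0.0],
]
excluded = [(0, 1)]                   # attendees 0 and 1 already know each other
I, lists = introduction_lists(I0, excluded, k=2)
```

Each entry of `lists` holds the indices of the attendees suggested to the corresponding person, ordered from the highest to the lowest symmetric score.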
The networking analyzer 226 provides various functions to allow a user to explore the IC graph 236 to derive various types of networking information, such as people of relevancy, suggested attendee introductions, and people with a particular type of expertise, as described above. To that end, the networking analyzer 226 includes network analytics tools to generate the desired networking information by analyzing the IC graph. For instance, the networking analyzer 226 includes an interest flow analyzer 258 for applying interest flow analyses to the IC graph. The networking analyzer 226 also includes a matching engine 240 for grouping persons with similar expertise, identifying people with similar expertise to a focus person, finding a network of related experts, etc. The networking analyzer 226 further includes graphical user interface tools 242 for providing graphic representations 254 of the networking information on a display device 256 for viewing by a user.
The IC networking services module 222 can be implemented as machine-readable instructions stored on a storage medium and executable on a processor 252. The processor 252 is connected to the storage medium 262 and to a network interface 250. The storage medium 262 can be implemented as one or more computer-readable or machine-readable storage devices, including DRAMs, SRAMs, flash drives, hard drives, optical storage devices, etc. The computer-executable instructions of the IC networking services module 222 may be stored in the storage medium 262, or on a separate storage medium that is non-transitory. The storage medium 262 may be used to store the input data for the IC networking services module 222, such as the content objects and directory information, as well as the output data of the IC networking services module, such as the IC graph, the networking information generated by the networking service tools, and the visual display data for display by the display device. Alternatively, the input and output data of the IC networking services module 222 may be received from and transmitted to a data network 260, such as the intranet of an organization or the internet, or a combination thereof.
As described above, a networking framework based on intellectual capital is provided to enable people to find and interact with other people in an organization based on expertise and common interests. Besides the multiple social networking scenarios described above, the expertise identification and inter-person relevancy evaluation capabilities can be useful in many other situations, such as forming optimal teams for complex crowd sourcing problems, forming teams to review inventions and attend invention workshops, identifying mentors for human resource purposes, etc. The possible ways of benefiting from this intellectual capital networking approach are too many to enumerate here.
In the foregoing description, numerous details are set forth to provide an understanding of the present invention. However, it will be understood by those skilled in the art that the present invention may be practiced without these details. While the invention has been disclosed with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover such modifications and variations as fall within the true spirit and scope of the invention.
This application claims the priority of U.S. Provisional Application 61/494,239, filed Jun. 7, 2011.
Number | Date | Country
---|---|---
61494239 | Jun 2011 | US