1. Field
This application relates generally to cloud computing, and more specifically to a system, method and apparatus for retrieval from online conversations and for finding relevant content for online conversations.
2. Related Art
Social networks can be used in an enterprise for employees to exchange and/or discover knowledge ‘nuggets’. A knowledge nugget can be represented as an experience or documented as content. In one example, an employee can search for successful customer RFP responses, content that includes a “to do” and “not to do” list, technical knowledge, etc. An employee may wish to solidify conversations based on topics. Here the challenge can be to isolate topics in conversations that are transient in nature. A topic of discussion may evolve as new feeds/posts come into the conversation. An employee may wish to track seemingly divergent discussions that may or may not converge. For example, in an archived or an ongoing conversation, isolated parts may converge or be identified as potentially converging or diverging. In a retrieved conversation, a method may be needed to identify and highlight converged parts and/or grey out diverged parts. This could help users to quickly retrieve relevant information. Based on the conversation, there could be content in a user's repository or an enterprise repository that could be shared in the conversation.
In one embodiment, a computer-implemented method of retrieval from online conversations and of finding relevant content for online conversations can include the step of continuously associating mined attributes with a conversation. The method can include the step of identifying a portion of a conversation based on the continuous association and the step of providing a retrieval mechanism for the portion of the conversation. A real-time recommendation for knowledge sharing across an enterprise or for a particular user as part of the conversation can be provided. Optionally, the conversation comprises a current conversation or an archived conversation.
The Figures described above are a representative set, and are not exhaustive with respect to embodying the invention.
Disclosed are a system, method, and article of manufacture for knowledge retrieval from online conversations and for finding relevant content for online conversations. The following description is presented to enable a person of ordinary skill in the art to make and use the various embodiments. Descriptions of specific devices, techniques, and applications are provided only as examples. Various modifications to the examples described herein can be readily apparent to those of ordinary skill in the art, and the general principles defined herein may be applied to other examples and applications without departing from the spirit and scope of the various embodiments.
Reference throughout this specification to “one embodiment,” “an embodiment,” “one example,” or similar language means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases “in one embodiment,” “in an embodiment,” and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.
Furthermore, the described features, structures, or characteristics of the invention may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided, such as examples of programming, software modules, user selections, network transactions, database queries, database structures, hardware modules, hardware circuits, hardware chips, etc., to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art can recognize, however, that the invention may be practiced without one or more of the specific details, or with other methods, components, materials, and so forth. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring aspects of the invention.
The schematic flow chart diagrams included herein are generally set forth as logical flow chart diagrams. As such, the depicted order and labeled steps are indicative of one embodiment of the presented method. Other steps and methods may be conceived that are equivalent in function, logic, or effect to one or more steps, or portions thereof, of the illustrated method. Additionally, the format and symbols employed are provided to explain the logical steps of the method and are understood not to limit the scope of the method. Although various arrow types and line types may be employed in the flow chart diagrams, they are understood not to limit the scope of the corresponding method. Indeed, some arrows or other connectors may be used to indicate only the logical flow of the method. For instance, an arrow may indicate a waiting or monitoring period of unspecified duration between enumerated steps of the depicted method. Additionally, the order in which a particular method occurs may or may not strictly adhere to the order of the corresponding steps shown.
Example definitions for some embodiments are now provided.
Content blocks can be structured or unstructured digital information of any size that includes text, pictures, video, audio, and other modalities. As an example, a document can be a content block or a section of a document with its structure and any multi-media information that is present.
Feature vector can be an n-dimensional vector of numerical features that represent some object.
Hierarchical feature vector can be an n-dimensional vector where each element is itself a feature vector representing a certain aspect of the object.
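As a non-limiting illustration of the two preceding definitions, the following Python sketch shows one possible in-memory representation. The class name, the aspect names ("keywords", "author"), and the numeric values are hypothetical and are chosen only for the example; they are not taken from any described embodiment.

```python
from dataclasses import dataclass
from typing import Dict, List

# A flat feature vector: n numerical features that represent some object.
FeatureVector = List[float]


@dataclass
class HierarchicalFeatureVector:
    """Each named element is itself a feature vector for one aspect of the object."""
    aspects: Dict[str, FeatureVector]

    def flatten(self) -> FeatureVector:
        # Concatenate the aspect vectors in sorted-name order so that the
        # resulting flat vector has a stable layout.
        flat: FeatureVector = []
        for name in sorted(self.aspects):
            flat.extend(self.aspects[name])
        return flat


# Hypothetical example: one post described by two aspects.
post = HierarchicalFeatureVector(
    aspects={
        "keywords": [0.8, 0.1, 0.0],  # assumed key-phrase scores
        "author": [1.0, 0.0],         # assumed encoding of the contributor's role
    }
)
flat_post = post.flatten()
```

Flattening is optional; it is shown only because many similarity measures operate on flat vectors.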
Information retrieval (IR) can include the science of searching for information in or as documents or databases.
Online social network can be a platform to build social networks or social relations among people who share interests, activities, backgrounds or real-life connections. A social network service can include a representation of each user (e.g. as a profile), the user's social links, various messaging services (e.g. instant messaging, updates, microblog posts, etc.) and/or a variety of additional services. Social network sites can be web-based services. Example online social networks can include, inter alia: Facebook®, LinkedIn®, Twitter®, etc.
Request for proposal (RFP) can be a solicitation, often made through a bidding process, by an agency or company interested in procurement of a commodity, service or valuable asset, to potential suppliers to submit business proposals.
Search engine can include an information retrieval system designed to help find information stored on a computer system.
Strong similarity measure can be a similarity measure above a specified value. A similarity measure (and/or similarity function) can be a real-valued function that quantifies the similarity between two objects. Although no single definition of a similarity measure exists, usually similarity measures are in some sense the inverse of distance metrics: they take on large values for similar objects and either zero or a negative value for very dissimilar objects. In some examples, a cosine similarity can be used as a similarity measure.
Strong dissimilarity measure can be a dissimilarity measure above a specified value (e.g. a metric that produces a higher value as corresponding values in two compared vectors X and Y become less dependent and/or less alike).
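As a non-limiting illustration of the strong similarity and strong dissimilarity definitions, the following Python sketch computes a cosine similarity and compares it with a specified value. The threshold of 0.8, the use of cosine distance (one minus cosine similarity) as the dissimilarity measure, and the function names are assumptions made for the example only.

```python
import math
from typing import Sequence


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # convention assumed here for all-zero vectors
    return dot / (norm_x * norm_y)


def is_strongly_similar(x, y, threshold: float = 0.8) -> bool:
    # "Strong" means the similarity exceeds the specified value.
    return cosine_similarity(x, y) > threshold


def is_strongly_dissimilar(x, y, threshold: float = 0.8) -> bool:
    # Cosine distance grows as the two vectors become less alike.
    return (1.0 - cosine_similarity(x, y)) > threshold
```

In practice the specified value would be tuned per deployment; nothing in the sketch implies a particular value is preferred.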
Text-based search can include techniques for searching a single computer-stored document or a collection in a full text database. Text-based search can include a full-text search and/or searches based on metadata (e.g. titles, abstracts, selected sections, or bibliographical references, etc.) and/or on parts of the original texts represented in databases. For example, a search engine can examine all of the words in every stored document as it tries to match search criteria (e.g. text specified by a user).
Additional example definitions are provided herein.
Example Processes
The current invention relates to the fields of content mining in online social networks and discovery of relevant content in (e.g. enterprise) social networks. The nature of a conversation can be identified by incrementally looking at the changes in the conversation and finding relevant information snippets. The relevant information snippets can be posted as responses to online conversations (e.g. an email conversation, an instant messaging conversation, a text messaging conversation, microblog posts, online social network messages, any combination thereof, etc.). In some embodiments, mined attributes can be continuously associated with a conversation, and parts of a current or an archived conversation can be identified based on the continuous association. A retrieval mechanism can be provided for parts of a conversation. Additionally, a real-time recommendation for knowledge sharing across the enterprise or for a particular user can be provided as part of the conversation.
It is noted that searching for content can include two dimensions. One dimension can include determining if content is available for the employee to reuse. Another dimension can include searching the employee's own repository for relevant content that could enhance the conversation or allow knowledge sharing that improves productivity in enterprises. In such scenarios, enterprises can provide an archival system that allows search capabilities for finding relevant prior conversations. An enterprise can also allow users to look at current conversations for knowledge or for contributing to the thread.
One example embodiment can be focussed on enterprise social networks, though it is also applicable to consumer social networks. An enterprise social network can be used in an enterprise for internal discussions. Via the enterprise social network, the enterprise employees exchange and/or discover knowledge ‘nuggets’. A knowledge nugget can be represented as an experience or documented as content.
In one example, an enterprise employee, such as a service engineer or a sales person, can look at their enterprise online networks to find out the “to do” and the “not to do” list. There are two aspects to finding this out. One aspect is to search a whole or a part of a conversation to find content available for reuse. A second aspect can be to search in their own repository for relevant content that could enhance the conversation or allow knowledge sharing that improves productivity in the enterprise. For an effective retrieval or contribution, especially in large enterprises with active social conversations, the ability to search or find relevance to parts of a conversation is crucial. Similarly, enterprise employees can share technical knowledge or best practices.
Another example is sales people responding to a request for proposal (RFP). Often these RFPs consist of customer requirements in the form of questions, and sales people respond to them by looking at the content they or their colleagues in the enterprise have. A typical salesperson responds to many such closely related but not identical RFPs. Some of the RFP questions could have already been answered in earlier RFPs, with the responses (especially those that were successful and those that were not) shared in a common repository or in a discussion. Such knowledge or access to such knowledge enables sales people to quickly and effectively respond to RFPs. The ability to search partial or whole conversations based on the questions and reuse content or access pointers to content from an online enterprise network or to contribute to such networks will improve the productivity of the sales teams in an enterprise.
Accordingly, the enterprise can provide the following. The enterprise can deploy an archival system. The archival system can provide search capabilities for finding relevant prior conversations. The archival system can enable users to look at current conversations for knowledge and/or for contributing to the current conversation. Additionally, the archival system can solidify conversations based on topics. This topic association tracks convergence of the conversation and associates topics with parts of the conversation for effective retrieval of the relevant parts of a conversation. For example, various topics can be isolated for conversations that are transient in nature. It is noted that a topic of discussion may evolve as new feeds/posts come into the conversation. The archival system can track seemingly divergent discussions that may or may not converge. Convergence of a topic depends on features such as the relative entropy of key phrases over a contiguous period of time, elimination of spurious user comments within a time frame based on the reputation of the commenting users, the (enterprise hierarchical) role of the contributing user, etc.
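As a non-limiting illustration of one of the convergence features named above, the following Python sketch computes the relative entropy (Kullback-Leibler divergence) between the key-phrase distributions of consecutive time windows and treats a topic as converged when every such value stays below a threshold. The add-one smoothing, the threshold of 0.1, the rule that at least two windows are needed, and all function names are assumptions for the example; the reputation-based and role-based features are not modelled here.

```python
import math
from collections import Counter
from typing import Dict, List


def phrase_distribution(phrases: List[str], vocabulary: List[str],
                        smoothing: float = 1.0) -> Dict[str, float]:
    """Smoothed probability of each vocabulary phrase within one time window."""
    counts = Counter(phrases)
    total = len(phrases) + smoothing * len(vocabulary)
    return {p: (counts[p] + smoothing) / total for p in vocabulary}


def relative_entropy(p: Dict[str, float], q: Dict[str, float]) -> float:
    """KL divergence D(p || q) over a shared vocabulary, in nats."""
    return sum(p[k] * math.log(p[k] / q[k]) for k in p)


def has_converged(windows: List[List[str]], threshold: float = 0.1) -> bool:
    """True if key-phrase usage is stable across every pair of adjacent windows."""
    if len(windows) < 2:
        return False  # assumed rule: stability needs at least two windows
    vocabulary = sorted({phrase for window in windows for phrase in window})
    dists = [phrase_distribution(w, vocabulary) for w in windows]
    return all(
        relative_entropy(current, previous) < threshold
        for previous, current in zip(dists, dists[1:])
    )
```

A window here is simply the list of key phrases extracted from the posts that arrived in one time interval; how key phrases are extracted is outside the scope of the sketch.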
In one example, in an archived or an ongoing conversation, the archival system can isolate and identify parts of said conversation that have converged, are yet to converge and/or are diverging. In this way, the archival system can identify and highlight converged parts of a conversation. The archival system can also grey out diverged parts. This information can then be presented via a computer interface to help users to quickly retrieve relevant information. For example, based on a present conversation, there could be content in a user's repository or an enterprise repository that could be shared in the conversation. This information can be retrieved by the archival system and presented to the user for incorporation into the present conversation.
Various methods to mine conversations by isolating parts of conversations are now provided. These methods can retrieve relevant conversations and/or sub-parts of a conversation in an intuitive way. Additionally, these methods can offer an intuitive way for users to find the areas of discussion of interest both for consumption as well as sharing in a larger conversation. It is noted that conversations can be of a transient nature.
One embodiment can be broken into three phases.
Continuous Association of Meta-Data to Incremental Conversations
One aspect of process 200 is the capture of other meta-data that is generated during subsequent time intervals such as t2, t3. This meta-data can be relative to previous time intervals such as t1, t2, respectively. For example, in the case of keyword vectors, with each new part of the conversation the key phrase values and other contributor-based parameters change (e.g. because the new parts of the conversation may or may not use those words/phrases). Process 200 associates the change in values as a meta-data vector along with the keyword vector. With this step, process 200 has an association that maps meta-data (the keyword vector and the change in keyword scores between the current time interval and the previous time interval) to the conversation with a time-stamp.
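As a non-limiting illustration of the association described above, the following Python sketch stores, for each time interval, a time-stamp, the keyword vector, and the change in each keyword score relative to the previous interval. The record layout, the function name, and the convention that a keyword absent from an interval has a score of zero (so the first interval's change equals its scores) are assumptions for the example and are not the data structures of process 200.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class IntervalRecord:
    """Meta-data associated with one time interval of a conversation."""
    timestamp: float
    keyword_vector: Dict[str, float]  # key phrase -> score in this interval
    delta_vector: Dict[str, float]    # change in score versus the previous interval


def associate(timestamp: float,
              keyword_vector: Dict[str, float],
              previous: Optional[IntervalRecord]) -> IntervalRecord:
    """Build the record for the current interval relative to the previous one."""
    prev_scores = previous.keyword_vector if previous else {}
    # Cover keywords that appear in either interval; a missing score counts as 0.0.
    keys = set(keyword_vector) | set(prev_scores)
    delta = {k: keyword_vector.get(k, 0.0) - prev_scores.get(k, 0.0) for k in keys}
    return IntervalRecord(timestamp, dict(keyword_vector), delta)


# Hypothetical intervals t1 and t2 with assumed scores.
t1 = associate(1.0, {"rfp": 0.5}, None)
t2 = associate(2.0, {"rfp": 0.75, "pricing": 0.25}, t1)
```

A negative entry in the change vector indicates a key phrase that is fading from the conversation, which is the kind of signal a convergence check could consume.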
Retrieving and Recommendation
Second, step 406 proposes a system that analyses content on behalf of a user (e.g. per user, per group, or for the entire enterprise). As new feeds/posts, comments, and replies come in, in step 408, a watcher functionality uses the above indexing method and the above analysis to find content nuggets that could be related to the contents of the discussion. In step 410, the system automatically posts a link, or suggests a link to the user, that contains a ranked list of content blocks that could be relevant to the real-time discussion.
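As a non-limiting illustration of the ranking performed for steps 408 and 410, the following Python sketch scores content blocks against the posts received so far and returns a ranked list. Whitespace tokenisation, term-count vectors, cosine similarity as the score, and all names are assumptions for the example; the indexing method and analysis referred to above are not reproduced here.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def term_vector(text: str) -> Counter:
    # Naive tokenisation on whitespace; a real system would use its own indexing.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_content_blocks(posts: List[str], blocks: Dict[str, str],
                        top_k: int = 3) -> List[Tuple[str, float]]:
    """Return up to top_k (block id, score) pairs, most relevant first."""
    conversation = term_vector(" ".join(posts))
    scored = [(block_id, cosine(conversation, term_vector(text)))
              for block_id, text in blocks.items()]
    relevant = [pair for pair in scored if pair[1] > 0.0]  # drop unrelated blocks
    return sorted(relevant, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical repository and conversation.
blocks = {
    "rfp-guide": "rfp response checklist",
    "holiday": "office holiday schedule",
}
ranked = rank_content_blocks(["need help with rfp response"], blocks)
```

A watcher could call such a function each time a new post, comment, or reply arrives and post or suggest the resulting list.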
In step 412, the contents of the link are dynamic and follow the pattern of change in the conversation. The link posted in a conversation can give different results at different times based on how the conversation is proceeding. For example, if the query or conversation gets more specific, the proposed system tracks those changes as described above and populates the contents of the link appropriately. This recommendation link is real-time and reacts to the current time interval.
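As a non-limiting illustration of the dynamic link of step 412, the following Python sketch models a link whose contents are computed when the link is opened rather than when it is posted. The class name, the shared-list representation of the conversation, and the simple term-overlap score are hypothetical and stand in for whatever tracking and ranking an embodiment actually uses.

```python
from typing import Dict, List


class DynamicRecommendationLink:
    """A link whose ranked contents are recomputed each time it is resolved."""

    def __init__(self, conversation: List[str], blocks: Dict[str, str]):
        # Keep a reference (not a copy) so later posts are visible to the link.
        self._conversation = conversation
        self._blocks = blocks

    def resolve(self, top_k: int = 3) -> List[str]:
        """Rank content blocks against the conversation as it stands right now."""
        terms = set(" ".join(self._conversation).lower().split())
        scores = {
            block_id: len(terms & set(text.lower().split()))
            for block_id, text in self._blocks.items()
        }
        # Highest overlap first; ties broken by block id for a stable order.
        ranked = sorted(
            (block_id for block_id in scores if scores[block_id] > 0),
            key=lambda block_id: (-scores[block_id], block_id),
        )
        return ranked[:top_k]


# Hypothetical conversation that shifts topic after the link is posted.
conversation = ["question about pricing"]
repository = {
    "pricing-faq": "pricing tiers and discounts",
    "sla-doc": "uptime sla terms",
}
link = DynamicRecommendationLink(conversation, repository)
first_view = link.resolve()
conversation.append("what uptime sla do we offer")
second_view = link.resolve()
```

The same link therefore yields different ranked lists at different times, which mirrors the behaviour described for step 412.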
Example Systems
Although the present embodiments have been described with reference to specific example embodiments, various modifications and changes can be made to these embodiments without departing from the broader spirit and scope of the various embodiments. For example, the various devices, modules, etc. described herein can be enabled and operated using hardware circuitry, firmware, software or any combination of hardware, firmware, and software (e.g., embodied in a machine-readable medium).
In addition, it can be appreciated that the various operations, processes, and methods disclosed herein can be embodied in a machine-readable medium and/or a machine accessible medium compatible with a data processing system (e.g., a computer system), and can be performed in any order (e.g., including using means for achieving the various operations). Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense. In some embodiments, the machine-readable medium can be a non-transitory form of machine-readable medium.