The disclosure relates generally to monitoring and predicting the health of an online community.
Customer communities are becoming an integral part of most enterprises. IDC predicts that by 2017, 80% of Fortune 500 companies will have an active customer community. Consequently, it is important to measure the performance and vibrancy of a community. The traditional approach involves looking at many individual metrics in isolation. However, simple activity-counter metrics, viewed in isolation, are often insufficient to give the big picture, because a community has so many moving parts. Moreover, optimizing one set of metrics can inadvertently hurt other metrics.
A disclosed community health index (CHI) is a single score that allows a user to quickly gauge how a community is doing. The disclosed community health index includes multiple health factor scores, such as traffic, content, membership, liveliness, interaction, and responsiveness. This is a powerful diagnostic tool because whenever CHI drops, community administrators can examine the health factor scores to figure out the problem. Furthermore, the drill down capability allows community managers to identify which part of the community is causing the problem. Some implementations prescribe a course of action to alleviate the problem.
Disclosed are methods for creating and maintaining a vibrant and resilient online social community.
People participate in social media communities to share their interests, to provide recommendations and preferences, and to look for help or advice from their peers. These online communities in turn have provided businesses ways to cut down on costs related to product support and to provide brand marketing and advocacy. It is useful for businesses to maintain vibrant and resilient communities where people can continually participate and share their experiences about products and/or services. This disclosure presents methods for maintaining vibrancy and resiliency for an online social community.
In some implementations, the methods include: (a) measuring specific activities within an online community and storing the data as activity metrics; (b) computing several Community Health Index factors, where distinct sets of activity metrics are grouped to compute each aggregated factor, and the factors are further aggregated to compute an overall Community Health Index; (c) defining a set of tolerance levels to monitor for each of the aggregated Community Health Index factors, which specify when a community manager should take action; and (d) defining a set of actions to take to remedy problems associated with any aggregated Community Health Index factor.
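Steps (c) and (d) can be sketched as a lookup of tolerance levels and remedial actions. This is a minimal illustration, not the disclosed implementation: the function name `actions_needed`, the factor names, the threshold values, and the action text are all hypothetical, and the sketch assumes that each factor score has already been normalized so that lower values indicate worse health.

```python
def actions_needed(factor_scores, tolerances, actions):
    """Return a remedial action for every factor that falls below its tolerance.

    factor_scores: factor name -> normalized score (assumed: lower is worse)
    tolerances:    factor name -> minimum acceptable score (hypothetical values)
    actions:       factor name -> suggested remedy (hypothetical text)
    """
    needed = {}
    for name, score in factor_scores.items():
        minimum = tolerances.get(name)
        # Factors without a configured tolerance are not monitored.
        if minimum is not None and score < minimum:
            needed[name] = actions.get(name, "review this factor")
    return needed
```

A community manager would call this once per reporting period and act only on the factors that are returned.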
Previously, community managers tasked with keeping a community vibrant would have to examine many individual metrics (e.g., 10-50 metrics). By optimizing one metric, they may end up hurting another metric. Disclosed systems and methods provide a holistic view of community health and provide drill down capability. With this, community managers can focus on the recommendations that make a community healthy rather than figuring out what is wrong with the community.
In accordance with some implementations, a method of measuring the health of online communities is performed at a computing device (e.g., one or more servers 1000) having one or more processors and memory. The memory stores one or more programs configured for execution by the one or more processors. The process computes a plurality of health factors for an online community. Each health factor is computed based on historical data of the online community, and each health factor measures human interaction with the online community. The process normalizes each of the health factors and combines the health factors to compute a community health index. The process then displays the community health index to a user on a display associated with the computing device, thereby enabling the user to predict the future health of the online community.
In some implementations, the plurality of health factors includes one or more of: (1) growth in membership of the online community; (2) human page views of web pages for the online community; (3) quantity and quality of posts to the online community; (4) liveliness of the online community; (5) interaction with the online community; and (6) responsiveness of the online community.
In some implementations, the plurality of health factors includes a health factor that tracks a growth rate of members registered with the online community.
In some implementations, the plurality of health factors includes a health factor that tracks a number of page views of web pages in the online community.
In some implementations, the plurality of health factors includes a health factor that estimates quantity and quality of posts to the online community by computing a product of a first number representing number of posts to the online community and a second number representing number of page views of web pages in the online community.
In some implementations, the plurality of health factors includes a health factor that measures liveliness of the online community. In some implementations, liveliness is computed as a function whose argument is number of posts to the online community divided by number of boards in the online community. In some implementations, the function includes a second argument that represents an expected number of posts per board during a specified unit of time.
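One simple function with these two arguments is the ratio of observed posts per board to expected posts per board. The sketch below assumes that ratio form; the function actually used in a given implementation may differ, and the name `liveliness` and its parameter names are illustrative only.

```python
def liveliness(num_posts: int, num_boards: int, expected_posts_per_board: float) -> float:
    """Ratio of observed to expected posts per board for one time period.

    Assumption: a plain ratio, so the result is 1.0 when the community
    posts at exactly the expected rate.
    """
    if num_boards <= 0:
        raise ValueError("num_boards must be positive")
    if expected_posts_per_board <= 0:
        raise ValueError("expected_posts_per_board must be positive")
    posts_per_board = num_posts / num_boards
    return posts_per_board / expected_posts_per_board
```

With this form, values above 1 indicate more posts per board than expected and values below 1 indicate fewer.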
In some implementations, the plurality of health factors includes a health factor that measures interaction of the online community. In some implementations, interaction is computed as an aggregated function of posting threads, including both the number of replies in each thread and the number of distinct users in each thread.
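As a rough sketch of such an aggregation, the code below scores each thread by multiplying its reply count by its number of distinct repliers and then averages the scores over all threads. Both the per-thread product and the use of a mean are assumptions made for illustration; the text above specifies only that replies and distinct users both contribute. The `Thread` type and its field name are hypothetical.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass
class Thread:
    # Author identifier of each reply in the thread, in posting order.
    reply_authors: list[str]


def interaction(threads: list[Thread]) -> float:
    """Mean over threads of (number of replies) * (number of distinct repliers)."""
    if not threads:
        return 0.0
    scores = [len(t.reply_authors) * len(set(t.reply_authors)) for t in threads]
    return fmean(scores)
```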
In some implementations, the plurality of health factors includes a health factor that measures responsiveness of the online community. In some implementations, responsiveness is computed as an average response time in posting threads.
In some implementations, normalizing each of the health factors includes applying a statistical distribution model.
In some implementations, normalizing each of the health factors includes computing a quantile for each health factor that ranks the health factor for the online community against other online communities.
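A quantile of this kind can be sketched as the fraction of peer communities whose value for the same factor is less than or equal to the value being ranked. The helper name `quantile_rank` and the "less than or equal" convention are assumptions; other tie-handling conventions are equally valid.

```python
from bisect import bisect_right


def quantile_rank(value: float, peer_values: list[float]) -> float:
    """Fraction of peer communities whose factor value is <= value (0.0 to 1.0)."""
    if not peer_values:
        raise ValueError("peer_values must not be empty")
    ordered = sorted(peer_values)
    # bisect_right counts the entries that are <= value.
    return bisect_right(ordered, value) / len(ordered)
```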
In some implementations, combining the health factors to compute a community health index uses a generalized mean function of the plurality of health factors.
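For reference, the generalized (power) mean of n positive values with exponent p is ((x1^p + ... + xn^p) / n)^(1/p), with the geometric mean as the limiting case at p = 0. The sketch below implements that standard definition; the choice of exponent for a community health index is not stated here and is left as a parameter.

```python
import math


def generalized_mean(values: list[float], p: float) -> float:
    """Power mean of positive values; p=1 is arithmetic, p=0 geometric, p=-1 harmonic."""
    if not values:
        raise ValueError("values must not be empty")
    if any(v <= 0 for v in values):
        raise ValueError("values must be positive")
    n = len(values)
    if p == 0:
        # Limit of the power mean as p approaches 0 is the geometric mean.
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** p for v in values) / n) ** (1.0 / p)
```

Smaller exponents pull the result toward the weakest factor, so a single unhealthy factor lowers the combined score more than it would under a plain average.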
In some implementations, combining the health factors to compute a community health index uses a weighted average of the plurality of health factors.
In some implementations, combining the health factors to compute a community health index scales the computed value to a specific range.
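The weighted-average and scaling variants can be sketched together. This sketch assumes that each normalized factor lies between 0 and 1; the weights, the factor names in the usage, and the 0 to 1000 output range are hypothetical choices made for illustration.

```python
def weighted_average(factors: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of normalized factor scores, keyed by factor name."""
    total_weight = sum(weights[name] for name in factors)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(factors[name] * weights[name] for name in factors) / total_weight


def scale_to_range(score: float, low: float = 0.0, high: float = 1000.0) -> float:
    """Map a score assumed to lie in [0, 1] onto [low, high], clipping outliers."""
    clipped = min(max(score, 0.0), 1.0)
    return low + clipped * (high - low)
```

For example, `scale_to_range(weighted_average(factors, weights))` yields a single index on the chosen scale.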
In some implementations, displaying the community health index includes providing a user interface that shows both the community health index and the individual health factors that were used to compute the community health index.
In some implementations, displaying the community health index includes providing a user interface that shows a graph of how the community health index has changed over time.
In accordance with some implementations, a computing device includes one or more processors and memory. The memory stores one or more programs configured for execution by the one or more processors. The one or more programs include instructions for performing any of the methods described above.
In accordance with some implementations, a non-transitory computer readable storage medium stores one or more programs configured for execution by a computing device having one or more processors and memory. The one or more programs include instructions for performing any of the methods described above.
Disclosed implementations provide methods and systems for computing the health of an online community, both retrospectively and prospectively. Implementations compute a community health index (CHI), which can be compared against other communities or historical values for the same community. Because of the acronym CHI, the community health index is sometimes represented by the Greek letter χ.
Some implementations for computing a community health index have been built on data warehouse technology, where the metrics that are used as input to the CHI algorithms are stored in MySQL tables. This solution uses simple counters that are incremented when certain code sequences are reached or executed. These metrics are prone to interpretation errors because there is little contextual data available. In addition, such a process typically does not scale well.
As illustrated in
In some implementations, the events are stored as text strings in a Hadoop Distributed File System (HDFS). Some implementations derive the metrics that are used to compute CHI from these events using Hive and user defined functions (UDFs).
Because each event includes the user agent string, implementations can determine whether a page view action was initiated by an automated bot or by a human (e.g., using the WURFL package). The human contributed page views are aggregated and used as the input to a “traffic” health factor in some community health index computations.
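The filtering step can be sketched as follows. The sketch deliberately does not use WURFL; it substitutes a simple substring check as a stand-in for a real device-detection library. The marker list, the function names, and the event field names `action` and `user_agent` are all hypothetical.

```python
# Illustrative substrings that commonly appear in crawler user agents.
# A production system would rely on a maintained detection library instead.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_human_page_view(user_agent: str) -> bool:
    """Heuristic: treat an empty or bot-like user agent as non-human."""
    ua = user_agent.lower()
    return bool(ua) and not any(marker in ua for marker in BOT_MARKERS)


def count_human_page_views(events: list[dict]) -> int:
    """Count page-view events whose user agent does not look automated."""
    return sum(
        1
        for event in events
        if event.get("action") == "page_view"
        and is_human_page_view(event.get("user_agent", ""))
    )
```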
As illustrated in
is 1 when the posts per board is at the average for a healthy online community, is greater than 1 when the posts per board is above average, and is less than 1 when the posts per board is less than average. In particular, the computed value for L in the formula 1600 is 1 when the posts per board is average.
The second formula 2002 computes the average tR of the response times over all threads, where Θ again denotes the total number of threads. In this example, the average weights all of the threads equally, but in some implementations, the threads are weighted differently. For example, in some implementations, threads that are longer are weighted more heavily. In some implementations, the weights are based on content analysis of the threads.
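The equally weighted and differently weighted averages described above can be sketched with one function. The name `average_response_time` and the idea of passing weights as a parallel list are illustrative assumptions; how the weights are derived (for example, from thread length or content analysis) is outside this sketch.

```python
def average_response_time(response_times, weights=None):
    """Average response time over threads, optionally weighted per thread.

    response_times: one response time per thread (any consistent unit)
    weights:        optional per-thread weights; defaults to equal weighting
    """
    if not response_times:
        raise ValueError("response_times must not be empty")
    if weights is None:
        weights = [1.0] * len(response_times)
    if len(weights) != len(response_times):
        raise ValueError("weights and response_times must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(t * w for t, w in zip(response_times, weights)) / total_weight
```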
The third formula 2004 computes the responsiveness R of the community by comparing the average response time tR to an expected healthy response time te. Because lower response times indicate better health, the formula computes
As illustrated in
In these examples, each of the six rows 2150-1 to 2150-6 represents a different health factor. The first row 2150-1 corresponds to the calculations performed by the community traffic health factor module 1044. The second row 2150-2 corresponds to the post quantity/quality health factor module 1046. The third row 2150-3 corresponds to the membership growth health factor module 1042. The fourth row 2150-4 corresponds to the liveliness health factor module 1048. The fifth row 2150-5 corresponds to the interaction health factor module 1050. The sixth row 2150-6 corresponds to the responsiveness health factor module 1052.
In the first column 2110 in each row, raw health factors are computed for each board in each community. As noted above, the data is typically grouped into time periods such as weeks or months. The second column 2112 in each row illustrates how the data is aggregated for a community (e.g., aggregated over all boards within a community). The aggregation is performed differently based on the health factor. As illustrated, for the “traffic” and “content” health factors, the aggregation is performed by summing (illustrated with the symbol Σ). For the membership health factor, there is no aggregation because membership growth is typically measured at the community level (this is illustrated with the symbol =). Finally, the “liveliness,” “interaction,” and “responsiveness” health factors are aggregated by averaging (e.g., computing the mean over all boards in a community). The averaging is illustrated by the symbol <•>. The third column 2114 in each row indicates the results of the aggregation specified in the second column 2112, and indicates a Greek or Roman letter used to refer to the aggregated calculation. For example, “L” is the aggregated liveliness health factor and μ is the community membership growth factor.
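The per-factor aggregation rules in this paragraph can be sketched as a lookup from factor name to aggregation function. The dictionary and function names are hypothetical; the mapping itself (sum for traffic and content, mean for liveliness, interaction, and responsiveness, no board-level aggregation for membership) follows the description above.

```python
from statistics import fmean

# Board-to-community aggregation rule for each health factor.
AGGREGATORS = {
    "traffic": sum,
    "content": sum,
    "liveliness": fmean,
    "interaction": fmean,
    "responsiveness": fmean,
}


def aggregate_boards(factor: str, board_values: list[float]) -> float:
    """Combine per-board values of one factor into a community-level value."""
    if factor == "membership":
        # Membership growth is already a community-level quantity.
        raise ValueError("membership is not aggregated over boards")
    return AGGREGATORS[factor](board_values)
```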
The fourth column 2116 in each row illustrates creating a log-transformed histogram of the data, which includes the data for many different time periods and/or many different communities. The histograms can be used to determine how well the factor for one community compares to a baseline average.
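A log-transformed histogram can be sketched by binning the logarithm of each value rather than the value itself. The base-10 logarithm, the bin width, and the decision to skip non-positive values are assumptions made for this illustration.

```python
import math
from collections import Counter


def log_histogram(values, bin_width=0.5):
    """Histogram of log10(value), returned as {bin index: count}.

    Bin k covers log10 values in [k * bin_width, (k + 1) * bin_width).
    Non-positive values are skipped because their logarithm is undefined.
    """
    counts = Counter()
    for v in values:
        if v <= 0:
            continue
        counts[math.floor(math.log10(v) / bin_width)] += 1
    return dict(sorted(counts.items()))
```

Pooling values from many time periods or many communities into one such histogram gives the baseline against which a single community's factor can be compared.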
The fifth column 2118 in each row illustrates a statistical distribution model that may be applied. Although
The last box 2120 in each row (see
As illustrated in
As illustrated in
In some implementations, the memory 1014 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other random access solid state memory devices. In some implementations, the memory 1014 includes non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid state storage devices. In some implementations, the memory 1014 includes one or more storage devices remotely located from the CPU(s) 1002. The memory 1014, or alternately the non-volatile memory device(s) within the memory 1014, comprises a non-transitory computer readable storage medium. In some implementations, the memory 1014, or the computer readable storage medium of the memory 1014, stores the following programs, modules, and data structures, or a subset thereof:
In some implementations, the health factor modules 1040 include a membership growth health factor module 1042, which computes the overall membership growth of an online social community. This is illustrated above in
In some implementations, the health factor modules 1040 include a community traffic health factor module 1044, which computes the number of human page views for an online social community. This is illustrated above in
In some implementations, the health factor modules 1040 include a post content health factor module 1046, which measures both the quantity and the quality of posts to an online social community. This is illustrated above in
In some implementations, the health factor modules 1040 include a liveliness health factor module 1048, which measures perception of activity level in an online social community. This is illustrated above in
In some implementations, the health factor modules 1040 include an interaction health factor module 1050, which measures the scope of engagement for an online social community. This is illustrated above in
In some implementations, the health factor modules 1040 include a responsiveness health factor module 1052, which measures the quality of engagement for an online social community. This is illustrated above in
Each of the above identified elements in
Although
In some implementations, the memory 314 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other random access solid state memory devices. In some implementations, the memory 314 includes non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid state storage devices. In some implementations, the memory 314 includes one or more storage devices remotely located from the CPU(s) 302. The memory 314, or alternately the non-volatile memory device(s) within the memory 314, comprises a non-transitory computer readable storage medium. In some implementations, the memory 314, or the computer readable storage medium of the memory 314, stores the following programs, modules, and data structures, or a subset thereof:
Each of the above identified executable modules, applications, or sets of procedures may be stored in one or more of the previously mentioned memory devices and corresponds to a set of instructions for performing a function described above. The above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures, or modules, and thus various subsets of these modules may be combined or otherwise re-arranged in various implementations. In some implementations, the memory 314 stores a subset of the modules and data structures identified above. Furthermore, the memory 314 may store additional modules or data structures not described above.
Although
The terminology used in the description of the implementations herein is for the purpose of describing particular implementations only and is not intended to be limiting of the invention. As used in the description of the invention and the appended claims, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof.
The foregoing description, for purpose of explanation, has been described with reference to specific implementations. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The implementations described herein were chosen and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best utilize the invention and various implementations with various modifications as are suited to the particular use contemplated.
This application claims priority to U.S. Provisional Application Ser. No. 62/072,929, filed Oct. 30, 2014, entitled “Systems and Methods to Monitor Health of Online Social Communities,” which is incorporated by reference herein in its entirety.
Number | Date | Country
---|---|---
62072929 | Oct 2014 | US