This invention relates to computer systems and methods for performing marketing research using social and information analytics based on user-generated content.
In the current information age, companies spend over 100 billion dollars per year advertising their industry, company, and/or products. The products may be advertised by TV, radio, newsprint, magazine, outdoor billboard, or on-line. In order to increase sales, companies and advertisers look for new consumers and try to create advertisements that will reach the company's target audience. However, in order to reach the company's target audience, it is necessary to understand who the company's target audience is and what the interests of the target audience are. Many companies use information gathered by Nielsen Media Research to determine their target audience, to determine the interests of their target audience, and/or to position their products in the marketplace. However, the information gathered by Nielsen Media Research is a statistical representation of data polled from a limited pre-selected group of people. This information may be biased based on the selection of the pre-selected group, the questions posed to the pre-selected group during the polling, and the timing of the polling.
To be more efficient and effective with its advertising, it would be helpful for a company to know what type of advertising (TV, radio, print, etc.) works best for its product and which shows, radio stations, and/or magazines people in its target audience are most interested in and watch, listen to, or read the most. To better reach the company's target audience, it would also be helpful to know when and what new interests are emerging. To increase its number of consumers, it would be helpful for a company to know who is interested in similar or related products. It would also be helpful to know who, in addition to being interested in the company's and/or a competitor's products, actually purchases them, how those purchasers feel about the products, and how strong that feeling is.
Accordingly, there is a need to gather more accurate information about audiences and their interests so that companies can know more about their target audience and be more efficient with their advertising.
One aspect of the present invention includes a computer-implemented method for performing marketing research using social and information analytics. In another aspect of the present invention, a server computer is specifically programmed for performing the method for marketing research. The method for marketing research includes the step of collecting user-generated content that is associated with a user from a data network. The method further includes the steps of, for each of a plurality of predetermined topics, determining if the user-generated content contains the predetermined topic and, if it does, storing the user-generated content and user information associated with the user in an audience database associated with the predetermined topic. In one embodiment, the user-generated content includes following at least one of the plurality of predetermined topics. In one embodiment, the user information includes identity information. The user information may include post information determined from the user-generated content.
In a preferred embodiment of the present invention, the method includes the step of receiving from a researcher computer a master topic. The master topic may include one or more of the plurality of predetermined topics. The method further includes the steps of determining a master user list of user information associated with the master topic and, for each of the plurality of the predetermined topics except for the master topic, determining secondary user lists of user information associated with each of the predetermined topics. The master user list and the secondary user lists may be determined from user information stored in the audience database associated with each predetermined topic. The method includes, for each of the plurality of the predetermined topics except for the master topic, determining a correlation associated with the predetermined topic and storing the predetermined topic and its associated correlation in a master topic affinity table. The method also includes the step of transmitting to the researcher computer marketing research information determined from the master topic affinity table. In one embodiment, the correlation associated with the predetermined topic includes at least one match between user information in the master user list and user information in the secondary user list associated with the predetermined topic.
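The determination of the master topic affinity table described above can be sketched as follows. This is only a minimal illustration, assuming that the correlation is the number of user identities common to the master user list and a secondary user list (one of the embodiments mentioned above). The function name `build_affinity_table` and the in-memory dictionary standing in for the audience databases are hypothetical and are not part of the disclosure.

```python
from typing import Dict, Set


def build_affinity_table(
    audiences: Dict[str, Set[str]], master_topic: str
) -> Dict[str, int]:
    """Count, per predetermined topic, the users shared with the master topic.

    `audiences` maps each predetermined topic to the set of user identities
    stored for it (a hypothetical stand-in for the audience databases).
    """
    master_users = audiences[master_topic]  # master user list
    table: Dict[str, int] = {}
    for topic, secondary_users in audiences.items():
        if topic == master_topic:
            continue  # the master topic is excluded from the table
        # Assumed correlation: number of matching user identities.
        table[topic] = len(master_users & secondary_users)
    return table


# Illustrative data only; the user identities are invented.
example_audiences = {
    "hybrid car": {"@ann", "@bob", "@cy"},
    "food item M": {"@ann", "@bob", "@dee"},
    "beverage item N": {"@cy"},
}
example_table = build_affinity_table(example_audiences, "hybrid car")
```

Other correlation measures, such as the matching count normalized by the size of the master user list, could be substituted for the set intersection without changing the overall structure.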
In another embodiment, the method may further include limiting the predetermined topics stored in the master topic affinity table based on the correlation being greater than or less than a threshold. The threshold may be input by the researcher. The method may further include formatting, or ordering, the marketing research information determined from the master topic affinity table based on the value of the correlation.
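The limiting and ordering just described might look like the following sketch. It assumes the master topic affinity table is available as a mapping from predetermined topic to a numeric correlation; the function name `limit_and_order` and its parameters are hypothetical names chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple


def limit_and_order(
    affinity: Dict[str, float],
    min_correlation: Optional[float] = None,
    max_correlation: Optional[float] = None,
) -> List[Tuple[str, float]]:
    """Keep topics whose correlation passes the thresholds, highest first."""
    kept = [
        (topic, corr)
        for topic, corr in affinity.items()
        # A threshold of None means that bound was not requested.
        if (min_correlation is None or corr > min_correlation)
        and (max_correlation is None or corr < max_correlation)
    ]
    # Order by descending correlation; ties are broken alphabetically.
    return sorted(kept, key=lambda item: (-item[1], item[0]))
```

A researcher-supplied threshold would be passed as `min_correlation` or `max_correlation`, depending on whether the researcher wants strongly or weakly correlated topics.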
In another embodiment, the user information may include post information determined from the user-generated content and the method further includes storing an accumulation of the post information in the master topic affinity table. The post information may include a sentiment associated with the predetermined topic, characteristics of language associated with the predetermined topic, purchase intent information associated with the predetermined topic, the klout of the user, demographic data about the user associated with the user-generated content, and/or a timeframe of the user-generated content. The accumulation of the post information may be a sum or an average of the post information over the items of user-generated content. The method may include receiving from the researcher computer a user information selection and formatting the marketing research information based on the user information selection. Furthermore, the method may include receiving from the researcher computer an accumulation selection, in which case the accumulation of the user information is based on the accumulation selection. The accumulation selection may be, for example, a time limitation or a demographic limitation that causes only user information meeting the accumulation selection to be accumulated in the master topic affinity table.
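As a rough sketch of such an accumulation, the following example sums or averages a sentiment value over posts that meet an accumulation selection. The `Post` record, its fields, the numeric sentiment scale, and the function name `accumulate_sentiment` are all assumptions made for illustration; the disclosure does not prescribe a data layout.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Iterable, Optional


@dataclass
class Post:
    """Hypothetical record of post information for one item of content."""
    user: str
    sentiment: float  # assumed scale: negative values are negative sentiment
    gender: str       # example of demographic data
    day: int          # simplified timeframe: days since a reference date


def accumulate_sentiment(
    posts: Iterable[Post],
    how: str = "average",
    selection: Optional[Callable[[Post], bool]] = None,
) -> Optional[float]:
    """Sum or average sentiment over posts meeting the accumulation selection."""
    selected = [p.sentiment for p in posts if selection is None or selection(p)]
    if not selected:
        return None  # nothing met the selection
    if how == "sum":
        return sum(selected)
    if how == "average":
        return mean(selected)
    raise ValueError(f"unknown accumulation: {how!r}")
```

A demographic limitation could then be expressed as `selection=lambda p: p.gender == "F"`, and a time limitation as `selection=lambda p: p.day >= 30`.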
The aforementioned and other aspects, features and advantages can be more readily understood from the following detailed description with reference to the accompanying drawings wherein:
Before the present invention is described, it is to be understood that this disclosure is not limited to the particular embodiments described, as these may vary. It is also to be understood that the terminology used in the description is for purposes of describing the particular versions or embodiments only, and is not intended to limit the scope. It is to be understood that each specific element includes all technical equivalents that operate in a similar manner. In addition, a detailed description of known functions and configurations will be omitted when it may obscure the subject matter of the present invention.
Referring now to the drawings, wherein like reference numerals designate identical or corresponding parts throughout the several views, one aspect of the present invention is directed to methods of performing marketing research. The methods, as described below, include collecting user-generated content and determining audiences of predetermined topics from the user-generated content; and determining information about the audience of the predetermined topic selected by a researcher and transmitting the information to the researcher. As used in this disclosure, user-generated content is defined as any content provided by any user to any data network; an audience of a topic is defined as one or more users who are interested in the topic, where interest is correlated to any information on a data network related to the topic; and information about the audience is defined as any knowledge about the audience gleaned from collected user-generated content.
Shown in
As shown in
The server computer 5 is configured to isolate, into different audience databases 40, 45, and 50 (also called buckets), for example, (a) viewers of different television shows, (b) advocates for different brands and products, and/or (c) people actively discussing different topics, based on the user-generated content 225 received by the server computer 5. As an example of (a), the server computer 5 may process the user-generated content 225 to determine all the users that have viewed and/or discussed a particular television show. For example, the server computer 5 may collect any information by users discussing a particular basketball sports update show from news websites, social networking websites (such as Facebook), bulletin board websites, websites with discussion forums and/or comments sections, blog websites, editorial websites, status update websites (such as Twitter), personal websites, and so forth. The information by the users (and any other information corresponding to these users that can be obtained) is collected into a bucket corresponding to the basketball sports update show. Then the process is repeated for a different type of television show, such as a Peruvian cooking television show. While these examples of (a) refer to television shows, it should be understood that a similar process can be employed to group into buckets information from (b) advocates for different brands and products (such as a drink, food, phone or car), and/or (c) people actively discussing any variety of different topics.
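The grouping into buckets described above can be illustrated with a short sketch. It assumes, purely for illustration, that a topic is "contained" in an item of user-generated content when the topic text appears in it as a case-insensitive substring; an actual system might use more sophisticated matching. The function name `bucket_content` and the tuple layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# One item of user-generated content: (user identity, text).
Item = Tuple[str, str]


def bucket_content(
    items: Iterable[Item], topics: Iterable[str]
) -> Dict[str, List[Item]]:
    """Place each item into the bucket of every predetermined topic it mentions."""
    topic_list = list(topics)
    buckets: Dict[str, List[Item]] = defaultdict(list)
    for user, text in items:
        lowered = text.lower()
        for topic in topic_list:
            # Assumed matching rule: case-insensitive substring test.
            if topic.lower() in lowered:
                buckets[topic].append((user, text))
    return dict(buckets)
```

An item that mentions several predetermined topics is stored in each corresponding bucket, which is what later allows audiences of different topics to be compared.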
Each predetermined topic audience database 40, 45, and 50 contains user-generated content 225 and user information 15. Referring to
The user information 15 may also include post information 250, such as klout of the user 255, sentiment of user-generated content 260, characteristics of language of the user-generated content 265, timeframe of the user-generated content 270, demographic data of the user 275, and/or purchase intent 280. It should be understood that the post information 250 may include anything that describes the user or the user-generated content 225. The post information 250 may be gathered and provided to the server computer 5 by a social media monitoring service as described above, may be collected by the server computer 5 by spidering different websites, or may be determined by the server computer 5. For example, the klout of a user 255, which may be the number of people that follow the user on Twitter or the number of people the user is connected with on LinkedIn or is friends with on Facebook, may be provided to the server computer 5 from different websites, while the characteristics of language of the user-generated content 265 or purchase intent 280 may be determined by the server computer 5 using NLP analysis on the user-generated content 225, as known by one skilled in the art. In the embodiment where the user follows a predetermined topic 25, the user-generated content 225 may only be an indication of the following rather than actual content. Similarly, the post information 250 may only include information about the user. It should be understood that the post information 250 may be gathered from previously stored user-generated content 225 and other sources. Any information where the user identity 230 is the same may be captured and stored as post information 250. Knowledge of the user identity 230 allows the present invention to categorize interests of the user that may be gathered at various times and from various sources. 
The gathered interests of actual users, rather than a time correlation of topics that assumes a relationship, allow the researcher to discover new interests and emerging trends that cannot be discovered by looking at small time windows. As described above, the user identity 230 may be a Twitter handle or any other identifying name (typically not the user's actual name). User-generated content (conversations, tweets, followings, etc.) from a particular user can be collected in real-time or can be retrieved from stored user-generated content previously collected, for example from the past few years.
Referring to
By way of example, the storage and processing of user-generated content 225 from Twitter API and Twitter Decahose will be described. The server computer 5 accesses the Twitter website and downloads Twitter API data for a topic. The Twitter API data contains a list of the identity of all current followers of the topic. The server computer 5 compares the list of the current followers with a stored list containing the identity of followers 230 of the predetermined topic previously downloaded and adds or deletes any changes. The server computer 5 may insert or update, depending on whether the follower is newly stored in the predetermined topic audience databases 40, 45, or 50, the user identity 230 in the audience database 40 and any post information 250 about the follower 25 that was previously collected and stored in any of the audience databases 40, 45, and 50. The Twitter Decahose is provided by GNIP and allows the server computer 5 to download and store all tweets and information related to the tweets. The server computer 5 parses the tweet from the information received from GNIP, determines if any of the predetermined topics in the predetermined topics list is in a tweet, and stores (or tags) the tweet and the user identity information 15 associated with the tweet 20 in the audience database associated with the predetermined topic 40, 45, and 50. The server computer 5 may also build a map that identifies a list of all predetermined topics a particular user mentions in any tweet 20. The server computer 5 may also store post information 250 downloaded directly from the Twitter Decahose or information that has been determined from the tweet 20. As described above, the post information 250 may include demographic and psychographic information 275, characteristics of language 265, sentiment 260, purchase intent 280, etc. The demographic information 275 may include profession, gender, age, family status, race/ethnicity, employment status, and location. 
The psychographic information may include likes and interests, account categories followed, accounts followed, Twitter activity, klout score, Twitter Followers, time on Twitter, Tweets count, and platform used to tweet. The sentiment 260, purchase intent 280 and characteristics of language 265 may be determined by analyzing the tweet 20. Sentiment analysis may be performed as described in U.S. Pat. No. 7,996,210, the disclosure of which is incorporated by reference herein.
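Two of the bookkeeping steps in the Twitter example, reconciling a stored follower list with a newly downloaded one and building the map of topics mentioned by each user, can be sketched as below. The sketch deliberately makes no calls to any Twitter or GNIP interface; it assumes the follower identities have already been downloaded by some other means. The function names `sync_followers` and `topics_by_user` are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def sync_followers(
    stored: Set[str], current: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Update `stored` in place to match `current`; return (added, removed)."""
    current_set = set(current)
    added = current_set - stored     # followers not yet stored
    removed = stored - current_set   # stored followers who no longer follow
    stored |= added
    stored -= removed
    return added, removed


def topics_by_user(audiences: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Invert topic -> users into user -> topics the user is associated with."""
    mapping: Dict[str, Set[str]] = {}
    for topic, users in audiences.items():
        for user in users:
            mapping.setdefault(user, set()).add(topic)
    return mapping
```

Returning the added and removed identities separately lets a caller decide whether to insert a new record or update an existing one, in line with the insert-or-update behavior described above.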
Referring to
In another embodiment, the master topic 205 may be a combination of predetermined topics that the researcher enters into the researcher computer 200. Referring to
Referring back to
Referring to
Once the master user list and the secondary user lists are determined, a correlation is determined between the master user list and each secondary user list. Referring to
The marketing research information may be equivalent to the master topic affinity table or it may be a formatted version of the master topic affinity table that can be displayed by the researcher computer 200. The method may include in some embodiments ordering or formatting the marketing research information determined from the master topic affinity table (step 430) (or the master topic affinity table may be what is ordered). The ordering or ranking may be based on the correlation. For example, if fans of the hybrid car talk about food item M twice as much as alcoholic beverage item N, then the ranked marketing research information may reflect that food item M has a higher ranking than alcoholic beverage item N. This may indicate to the researcher (e.g., an advertiser) that food item M may generate more successful results from advertising to fans of hybrid cars. As described above, hybrid cars may be a combination of predetermined topics. Similarly, ordering or ranking may be based on the post information 250. For example, if fans of the hybrid car talk about food item M in a more positive manner than alcoholic beverage item N, then the ranked marketing research information may reflect that food item M has a higher ranking than alcoholic beverage item N. This may indicate to the researcher that food item M may generate more successful results from advertising to fans of the hybrid car.
Furthermore, the researcher may limit the correlation and/or the post information 250. For example, the researcher may only want to consider user-generated information from users who are women or who are over the age of 18. For example, although many people may have opinions about hybrid cars, food item M, and beverage N, the researcher may be targeting women purchasers and therefore would want to limit the opinions about hybrid cars, food item M, and beverage N specifically to those of women. Similarly, the researcher may want to consider user-generated content 225 at a time before or after an event. For example, the researcher may be interested in finding out if fans of Seinfeld pay attention to advertisements on TV more than advertisements on a website such as Hulu. The ranking of an item can be determined after a commercial airs on the Hulu website showing Jerry Seinfeld promoting the item, and the ranking of the item can be determined at another time after a commercial airs on TV showing Jerry Seinfeld promoting the item. One skilled in the art would understand that, similar to the predetermined topic audience databases, the user lists and the master topic affinity table may be tagged memory locations rather than actual lists and tables stored in contiguous memory locations.
Accordingly, the method of the present invention may provide marketing research information that consists simply of a list of predetermined topics and, for each predetermined topic, the number of users that are associated with the master topic 205 that are also associated with the predetermined topic. In addition, the method may provide marketing research information that is much more intricate, incorporating a multitude of characteristics of the user-generated content and allowing a researcher to select, to limit, and to order the marketing research information based on the researcher's needs. This versatility allows the researcher to better understand its target audience, the interests of the target audience, and the emergence of new interests. The method of the present invention also allows the researcher to collect unbiased information because the user-generated content 225 is unsolicited.
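The comparison of user-generated content after two different events, as in the commercial example above, could be approximated as in the following sketch. It assumes each mention of the item is available as a pair of user identity and timestamp, and it counts distinct users in a fixed window after each event; the function name `users_in_window`, the window length, and the dates are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple

# One mention of the item: (user identity, time of the content).
Mention = Tuple[str, datetime]


def users_in_window(
    mentions: Iterable[Mention], start: datetime, length: timedelta
) -> int:
    """Count distinct users who mentioned the item in [start, start + length)."""
    end = start + length
    return len({user for user, when in mentions if start <= when < end})


# Illustrative comparison of two airings (invented dates and users).
example_mentions = [
    ("@ann", datetime(2011, 8, 1, 20, 5)),
    ("@bob", datetime(2011, 8, 1, 21, 0)),
    ("@ann", datetime(2011, 8, 8, 20, 30)),
]
window = timedelta(hours=24)
after_first_airing = users_in_window(example_mentions, datetime(2011, 8, 1, 20, 0), window)
after_second_airing = users_in_window(example_mentions, datetime(2011, 8, 8, 20, 0), window)
```

The two counts could then feed the ranking of the item for each time period, so that the researcher can compare the response to each airing.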
Shown in
Although the system implementing the method of the present invention has been described as a server computer 5, it should be understood that the method of the present invention may be executed on any computer system, client terminal and/or network-connected device. The computer system may include a data store that can comprise one or more structural or functional parts that have or support a storage function. For example, the data store can be, or can be a component of, a source of electronic data, such as a document access apparatus, a backend server connected to a document access apparatus, an e-mail server, a file server, a multi-function peripheral device (MFP or MFD), a voice data server, an application server, a computer, a network apparatus, a terminal etc. It should be appreciated that the term “electronic document” or “electronic data” as used herein, in its broadest sense, can comprise any data that a user may wish to access, retrieve, review, etc.
The data network 10 may be provided via one or more of a secure intranet or extranet local area network, a wide area network (WAN), any type of network that allows secure access, etc., or a combination thereof. Further, other secure communications links (such as a virtual private network, a wireless link, etc.) may be used as well as the network connections. In addition, the data network 10 may use TCP/IP (Transmission Control Protocol/Internet Protocol), but other protocols such as SNMP (Simple Network Management Protocol) and HTTP (Hypertext Transfer Protocol) can also be used. How devices can connect to and communicate over the networks is well-known in the art and is discussed for example, in “How Networks Work”, by Frank J. Derfler, Jr. and Les Freed, (Que Corporation 2000), and “How Computers Work”, by Ron White, (Que Corporation 1999), the entire contents of each of which is incorporated herein by reference.
The server computer 5 may be a special purpose device (such as including one or more application specific integrated circuits or an appropriate network of conventional component circuits) or it may be software configured on a conventional personal computer or computer workstation with sufficient memory, processing and communication capabilities to operate as a terminal and/or server, as will be appreciated by those skilled in the relevant arts. The server computer 5 includes a data network 10 interface for communications through a network, such as communications through the network 10. However, it should be appreciated that the subject matter of this disclosure is not limited to such configuration. For example, the computer may communicate with client terminals through direct connections and/or through a network to which some components are not connected. As another example, the devices need not be provided by a server that services terminals, but rather may communicate with the devices on a peer basis, or in another fashion. The system is not limited to a computer or server, but can be manifested in any of various devices that can be configured to communicate over a network and/or the Internet. The system may be any network-connected device including but not limited to a personal, notebook or workstation computer, a terminal, a kiosk, a PDA (personal digital assistant), a tablet computing device, a smartphone, a scanner, a printer, a facsimile machine, a multi-function device (MFD), a server, a mobile phone or handset, another information terminal, etc. Each device may be configured with software allowing the device to communicate through networks with other devices.
An example of a configuration of a multi-function device (MFD) includes a central processing unit (CPU), and various elements connected to the CPU by an internal bus. The CPU services multiple tasks while monitoring the state of the device. The elements connected to the CPU may include a scanner unit, a printer unit, an image processing device, a read only memory (for example, ROM, PROM, EPROM, EEPROM, etc.), a random access memory (RAM), a hard disk drive (HDD), portable media (for example, floppy disk, optical disc, magnetic discs, magneto-optical discs, semiconductor memory cards, etc.) drives, a communication interface (I/F), a modem unit, and an operation panel. Program code instructions for the device can be stored on the read only memory, on the HDD, or on portable media and read by the portable media drive, transferred to the RAM and executed by the CPU to carry out the instructions. These instructions can include the instructions to the device to perform specified ones of its functions and permit the device to interact with other network-connected devices. The operation panel includes a display screen that displays information allowing the user of the device to operate the device. The display screen can be any of various conventional displays (such as a liquid crystal display, a plasma display device, a cathode ray tube display, etc.), but is preferably equipped with a touch sensitive display (for example, liquid crystal display), and configured to provide the GUI based on information input by an operator of the device, so as to allow the operator to conveniently take advantage of the services provided by the system. The display screen does not need to be integral with, or embedded in, the operation panel, but may simply be coupled to the operation panel by either a wire or a wireless connection. The operation panel may include keys for inputting information or requesting various operations. 
Alternatively, the operation panel and the display screen may be operated by a keyboard, a mouse, a remote control, touching the display screen, voice recognition, or eye-movement tracking, or a combination thereof. The device may be a multifunction device (with scanner, printer and image processing) and in addition can be utilized as a terminal to download documents from a network.
Although the preferred embodiments of the invention have been described above by way of example only, it will be understood by those skilled in the art that modifications may be made to the disclosed embodiments without departing from the scope of the invention. For example, the post information 250 may include many other measurements and characteristics. The determination of the secondary user list (steps 310 and 320) may be performed prior to determining the master user list (step 305). Additionally, the marketing research information may be transmitted to a different computer on the data network 10 instead of or in addition to the researcher computer 200.
Furthermore, various embodiments described herein or portions thereof can be combined without departing from the present invention. For example, the collected user information may be determined from collected and stored user-generated content 225. Likewise, the master topic affinity table may include a variety of post information 250 and some of the post information 250 may be limited by the researcher while other post information 250 may not be limited by the researcher.
The above-described embodiments of the present invention are presented for purposes of illustration and not of limitation, and the present invention is limited only by the claims that follow.
This application claims the benefit of the filing date of U.S. Provisional Patent Application No. 61/524,243 filed on Aug. 16, 2011, the disclosure of which is hereby incorporated herein by reference.
Number | Date | Country
---|---|---
61524243 | Aug 2011 | US