Through web-based media services like Twitter and Facebook, a user is exposed to a vast number of messages from hundreds if not thousands of online sources and friends, resulting in massive information overload. Because the distinctions between social networks are not entirely clear, users feel obligated to juggle different applications and social networks just to keep up and be heard everywhere.
It would be one thing if all our social messages were part of a single, parsable, filtered stream. But instead, they come from all different directions. The situation is aggravated by social streams that originate in many competing silos. Users spend nearly as much time hopping between networks as they do meaningfully digesting and engaging with the content within. Furthermore, cross-posting across networks exacerbates the noise and redundancy of the various networks and services.
One or more embodiments of the present invention are illustrated by way of example and are not limited by the figures of the accompanying drawings, in which like references indicate similar elements.
The following description and drawings are illustrative and are not to be construed as limiting. Numerous specific details are described to provide a thorough understanding of the disclosure. However, in certain instances, well-known or conventional details are not described in order to avoid obscuring the description. References to one or an embodiment in the present disclosure can be, but not necessarily are, references to the same embodiment; and, such references mean at least one of the embodiments.
Reference in this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the disclosure. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments but not other embodiments.
The terms used in this specification generally have their ordinary meanings in the art, within the context of the disclosure, and in the specific context where each term is used. Certain terms that are used to describe the disclosure are discussed below, or elsewhere in the specification, to provide additional guidance to the practitioner regarding the description of the disclosure. For convenience, certain terms may be highlighted, for example using italics and/or quotation marks. The use of highlighting has no influence on the scope and meaning of a term; the scope and meaning of a term is the same, in the same context, whether or not it is highlighted. It will be appreciated that the same thing can be said in more than one way.
Consequently, alternative language and synonyms may be used for any one or more of the terms discussed herein, nor is any special significance to be placed upon whether or not a term is elaborated or discussed herein. Synonyms for certain terms are provided. A recital of one or more synonyms does not exclude the use of other synonyms. The use of examples anywhere in this specification including examples of any terms discussed herein is illustrative only, and is not intended to further limit the scope and meaning of the disclosure or of any exemplified term. Likewise, the disclosure is not limited to various embodiments given in this specification.
Without intent to further limit the scope of the disclosure, examples of instruments, apparatus, methods and their related results according to the embodiments of the present disclosure are given below. Note that titles or subtitles may be used in the examples for convenience of a reader, which in no way should limit the scope of the disclosure. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains. In the case of conflict, the present document, including definitions, will control.
Embodiments of the present disclosure include systems and methods for natural language processing optimized for microcontent.
Architecture Overview
The social intelligence system 100 can also have client applications 132 running on client nodes 130. Messages from third-party networks 150 come in through sync connectors 140 which run on the server nodes 110 as well as client nodes 130. Many services allow direct messaging pipelines from the client nodes 130 to the external services from the third-party networks 150.
Storage on the server nodes does not take place until an action is done on these messages (e.g. like, annotate, repost). This avoids storing vast amounts of messages for each user which can become very costly when thousands of messages come in per user per day.
The client applications 132 can run multiple layers of stream analytics. In one embodiment, all layers of stream analytics run in the client nodes 130 to reduce the CPU burden on the server nodes 110. In another embodiment, the social intelligence system can be made more decentralized by enabling client-to-client messaging between client nodes. In yet another embodiment, the client nodes 130 can be configured to run as a stand-alone agent in a cloud computer platform 160.
Data Layer
All data coming in from external services are normalized based on a standard. In one embodiment, the data are normalized based on the Activity Streams Open Standard (“ASOS”). The normalization process makes sure that all messages are stored in a structured way and that there is a common vocabulary for communicating about social objects. For example, a “User Timeline on Twitter” is normalized to “A Person's Activities”.
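The normalization step can be illustrated with a minimal sketch, written here in Python for brevity. The shape of the raw payload and the function name normalize_tweet are hypothetical; the output field names (actor, verb, object, published, provider) loosely follow the Activity Streams JSON vocabulary and are an assumption about how an embodiment might structure a normalized message.

```python
def normalize_tweet(raw):
    """Map a hypothetical raw Twitter-like payload onto an actor/verb/object activity."""
    return {
        "actor": {"objectType": "person", "displayName": raw["user"]},
        "verb": "post",
        "object": {"objectType": "note", "content": raw["text"]},
        "published": raw["created_at"],
        "provider": {"displayName": "twitter"},
    }


# Example input is invented for illustration only.
activity = normalize_tweet(
    {"user": "alice", "text": "Hello world", "created_at": "2012-01-01T00:00:00Z"}
)
```

Once every external service is mapped onto the same structure, downstream layers can treat a message from any network identically.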
External services can be queried by using a query language. In one embodiment, the external services are queried by using Activity Stream Query Language (ASQL). Using ASQL, data can be pushed and pulled between services. To support a new external service, the social intelligence system can implement a common Activity Stream interface for that service using ASQL.
After normalization of the message data, all messages are enriched with metadata (details of the metadata will be discussed in the following section). The social intelligence system then filters these messages by matching the metadata of the messages against specific rules. In one embodiment, the social intelligence system can use a specialized rule language for this which allows complex conditional statements in filters.
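A minimal sketch of metadata-based filtering follows. The source does not specify the rule language, so the tuple format (field, operator, value) and the operator names eq, contains, and gte are hypothetical stand-ins for whatever conditional syntax an embodiment uses.

```python
# Hypothetical operators; the actual rule language is not specified in the source.
OPS = {
    "eq": lambda actual, value: actual == value,
    "contains": lambda actual, value: value in (actual or []),
    "gte": lambda actual, value: actual is not None and actual >= value,
}


def matches(rule, metadata):
    """Return True when every (field, operator, value) condition holds."""
    return all(OPS[op](metadata.get(field), value) for field, op, value in rule)


rule = [("language", "eq", "en"), ("topics", "contains", "Japan")]
messages = [
    {"language": "en", "topics": ["Japan", "earthquake"]},
    {"language": "nl", "topics": ["Japan"]},
]
kept = [m for m in messages if matches(rule, m)]
```

Here only the first message passes the filter, since the second fails the language condition.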
Natural Language Processing (“NLP”) and Annotation Capabilities
The social intelligence system utilizes a natural language processing stack optimized for microcontents. A microcontent is a small group of words that can be skimmed by a person to get a clear idea of the content of a content container such as a web page. Examples of microcontent include article headlines, page titles, subject lines, e-mail headings, instant messages, blog posts, RSS feeds, and abstracts. Such microcontent may be taken out of context and displayed on a directory, search result page, bookmark list, etc. Microcontents (e.g. Twitter messages, Facebook messages, and short message service (SMS) messages) are often written in a casual way. Such microcontents contain micro-syntax like repost directives and hashtags.
Parsing messages for the real-time web requires dealing with vast numbers of microcontents (e.g. small messages). That requires an efficient handling of the microcontents. In one embodiment, the natural language processing stack can be implemented in JavaScript. The natural language processing stack can run in any modern JavaScript environment (e.g. Webkit, NodeJS, Internet Explorer, etc.).
In one embodiment, the natural language processing stack extracts different types of metadata from the messages including topics, types, categories, languages, and others. The topics metadata include keywords that are most relevant to the messages. In some embodiments, the natural language processing stack assigns confidence scores to each of these keywords. The types metadata include the status of the messages, the mood of the messages, whether the message is an offer, whether the message is a service, and whether the message is a news item. The categories metadata can include business, technology, entertainment, etc. The languages metadata indicate the language that the message's content is written in. Other metadata include uniform resource locators (“URLs”), mentions, hashtags, repost content, emoticons, content identification keys, etc.
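As an illustration of the micro-syntax metadata listed above, the following sketch pulls URLs, mentions, hashtags, and a repost directive out of a message with regular expressions. The patterns and the "RT @user:" repost convention are assumptions for illustration, not the parser of any embodiment.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")
REPOST_RE = re.compile(r"^RT\s+@(\w+)")  # assumed repost convention


def extract_microsyntax(text):
    """Return URLs, mentions, hashtags, and the reposted author (if any)."""
    # Remove URLs first so that URL fragments are not mistaken for hashtags.
    stripped = URL_RE.sub(" ", text)
    repost = REPOST_RE.match(text)
    return {
        "urls": URL_RE.findall(text),
        "mentions": MENTION_RE.findall(stripped),
        "hashtags": HASHTAG_RE.findall(stripped),
        "repost_of": repost.group(1) if repost else None,
    }


meta = extract_microsyntax("RT @alice: Loving the new album #music http://example.com/a")
```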
Then the messages go through a part-of-speech (“PoS”) tagger layer 430. Part-of-speech tagging, also called grammatical tagging or word-category disambiguation, is the process of marking up a word in a text as corresponding to a particular part of speech, based on both its definition and its context, i.e. its relationship with adjacent and related words in a phrase, sentence, or paragraph. For instance, a form of PoS tagging is the identification of words as nouns, verbs, adjectives, adverbs, etc. In one embodiment, the PoS tagger layer 430 can use an error-driven transformation-based tagger, such as a Brill tagger, that is streamlined towards picking out nouns and proper nouns.
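The idea behind a transformation-based tagger can be sketched as two passes: an initial tag per word, followed by contextual rules that correct likely errors. The tiny lexicon and the single rule below are hypothetical; in a real Brill tagger the rules are learned from an annotated corpus.

```python
# Hypothetical toy lexicon; a real tagger would use a much larger one.
LEXICON = {"i": "PRP", "want": "VBP", "to": "TO", "book": "NN", "a": "DT", "flight": "NN"}

# Transformation rules: (from_tag, to_tag, required_previous_tag).
RULES = [("NN", "VB", "TO")]


def tag(tokens):
    """Assign initial tags, then apply contextual transformation rules."""
    tags = []
    for tok in tokens:
        if tok.lower() in LEXICON:
            tags.append(LEXICON[tok.lower()])
        elif tok[0].isupper():
            tags.append("NNP")  # unknown capitalized word: guess proper noun
        else:
            tags.append("NN")   # unknown lowercase word: guess common noun
    for from_tag, to_tag, prev_tag in RULES:
        for i in range(1, len(tags)):
            if tags[i] == from_tag and tags[i - 1] == prev_tag:
                tags[i] = to_tag
    return list(zip(tokens, tags))


tagged = tag("I want to book a flight to Tokyo".split())
```

In this example the word "book" is first tagged as a noun and then retagged as a verb because it follows "to", while the unknown word "Tokyo" is guessed to be a proper noun.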
The fourth layer is a topic extractor layer 440. The topic extractor is unique in that it can do entity recognition without needing existing databases. For instance, topics like the name of Lady Gaga's latest album can be identified without having a prior dictionary of known entities. The topic extractor 440 extracts topics from the messages and assigns confidence scores to the extracted topics based on capitalization and other factors. This approach is different from conventional entity extraction, in which the extraction relies on dictionary lookups. For example, the latest album by Lady Gaga that was just released would not show up in such a dictionary and therefore would not be extracted as a topic by the conventional entity extraction approach. The topic extractor 440 can capture such valuable information that cannot be identified by a dictionary.
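A dictionary-free extraction of this kind can be sketched by grouping runs of capitalized tokens and scoring each run. The scoring weights below (a base of 0.5, a bonus for multi-word runs, a penalty for sentence-initial position) are invented for illustration; the source states only that capitalization and other factors are used.

```python
import re


def extract_topics(text):
    """Return (topic, confidence) pairs for runs of capitalized words."""
    tokens = re.findall(r"[A-Za-z][A-Za-z'\-]*", text)
    topics, run, run_start = [], [], 0
    for i, tok in enumerate(tokens + [""]):  # empty sentinel flushes the last run
        if tok[:1].isupper():
            if not run:
                run_start = i
            run.append(tok)
            continue
        if run:
            score = 0.5
            if len(run) > 1:
                score += 0.3  # multi-word capitalized runs look more like names
            if run_start == 0:
                score -= 0.2  # a sentence-initial capital is weak evidence
            topics.append((" ".join(run), round(score, 2)))
            run = []
    return topics


topics = extract_topics("Listening to Lady Gaga on repeat")
```

No lookup table is consulted, so a newly released name is extracted as readily as a long-established one.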
The fifth layer is a type classification layer 450. In order to classify types and identify categories of messages, the natural language processing stack utilizes a database (also referred to as ontology) of classes together with a database of annotation rules. These annotation rules are composed of lists of names and patterns that help to assign the proper type tree to a message. For instance, more than a hundred message classes can be defined in the database with rules to detect these automatically.
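The interplay between the ontology and the annotation rules can be sketched as follows. The four-class ontology and the two patterns are hypothetical examples; an embodiment may define more than a hundred classes with far richer rules.

```python
import re

# Hypothetical ontology: each type maps to its parent (None marks a root).
ONTOLOGY = {"message": None, "commerce": "message", "offer": "commerce", "news": "message"}

# Hypothetical annotation rules: a pattern and the type it signals.
RULES = [
    (re.compile(r"\b(\d+% off|discount|sale)\b", re.I), "offer"),
    (re.compile(r"\b(breaking|reports?)\b", re.I), "news"),
]


def type_tree(type_name):
    """Return the path from the ontology root down to type_name."""
    path = []
    while type_name is not None:
        path.append(type_name)
        type_name = ONTOLOGY[type_name]
    return list(reversed(path))


def classify(text):
    """Return the type tree of every rule whose pattern matches the text."""
    return [type_tree(t) for pattern, t in RULES if pattern.search(text)]


result = classify("Flash sale: 20% off all shoes today")
```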
In addition to the type classification process, the natural language processing stack can further perform sentiment analysis to classify the sentiment of each message. Sentiment can be positive, negative or neutral. The sentiment analysis can run fully on the client nodes (as well as server nodes), allowing for high scalability. In one embodiment, to make this possible and lightweight, the natural language processing stack includes a sentiment classifier implemented as a Naive Bayesian classifier, which is trained offline on an annotated set of positive, negative and neutral messages. The resulting probabilistic model is then sent to the client node for the sentiment classification.
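The offline-training and client-side-classification split can be sketched as below. The six training messages are invented, add-one smoothing is an assumption, and JSON serialization stands in for whatever format an embodiment uses to ship the model to the client node.

```python
import json
import math
from collections import Counter, defaultdict


def train(samples):
    """Offline step: count words per label and return a plain-dict model."""
    word_counts, label_counts = defaultdict(Counter), Counter()
    for text, label in samples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = sorted({w for counts in word_counts.values() for w in counts})
    return {
        "labels": dict(label_counts),
        "words": {label: dict(c) for label, c in word_counts.items()},
        "vocab": vocab,
    }


def classify(model, text):
    """Client step: pick the label with the highest log-probability."""
    total = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    best, best_score = None, -math.inf
    for label, n in model["labels"].items():
        counts = model["words"][label]
        denom = sum(counts.values()) + vocab_size  # add-one smoothing
        score = math.log(n / total)
        for word in text.lower().split():
            score += math.log((counts.get(word, 0) + 1) / denom)
        if score > best_score:
            best, best_score = label, score
    return best


samples = [
    ("i love this", "positive"), ("great happy day", "positive"),
    ("i hate this", "negative"), ("awful sad day", "negative"),
    ("the meeting is at noon", "neutral"), ("train leaves at five", "neutral"),
]
model_json = json.dumps(train(samples))  # what would be sent to the client
model = json.loads(model_json)           # what the client would load
label = classify(model, "i hate mondays")
```

Because the model is just a table of counts, classification on the client requires only lookups and additions.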
Stream Rank
After messages are enriched with the metadata, the messages and their metadata are put through a stream rank analyzer. The stream rank analyzer takes the messages for a given context and produces new intelligence in the form of metadata for a stream of the context. For example, when the topic Japan gets mentioned very often, this suggests that something significant is happening. The stream rank analyzer takes the messages for the topic Japan and produces new intelligence in the form of metadata for a stream regarding Japan.
The stream rank analyzer can include two components. One is a clustering component for analyzing co-occurrences of metadata in a stream, and producing a graph data structure that can be used to recognize clusters of related data. Another is a profiling component for producing a flat list of most occurring and most trending (highest velocity or momentum) metadata.
The stream rank analyzer can look at any attribute of a message and rank the message. Examples of the attributes are: topics, types, mentioned people, authors, hashtags, links, media, keywords, author information, source information, etc. Each of these metadata attributes is counted and weighted in the clustering and profiling components. This results in a new data set of the most trending and relevant metadata items for a stream. In both the clustering and profiling components, the time factor is used to determine which metadata items are gaining momentum (i.e. velocity).
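The two components can be sketched with simple counters. The "kind:value" labels for metadata items are a hypothetical encoding, weighting and the time factor are omitted, and the functions below are illustrative rather than the analyzer of any embodiment.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(messages):
    """Clustering component: count how often two metadata items co-occur."""
    edges = Counter()
    for items in messages:
        for a, b in combinations(sorted(set(items)), 2):
            edges[(a, b)] += 1
    return edges


def profile(messages, top=3):
    """Profiling component: flat list of the most frequently occurring items."""
    counts = Counter(item for items in messages for item in set(items))
    return counts.most_common(top)


stream = [
    ["topic:Japan", "topic:earthquake", "hashtag:news"],
    ["topic:Japan", "topic:earthquake"],
    ["topic:Japan", "hashtag:news"],
]
graph = cooccurrence_graph(stream)
top_items = profile(stream, top=1)
```

Edges with high counts in the graph indicate metadata items that belong to the same cluster, such as the topics Japan and earthquake in this toy stream.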
The results of stream rank analyzer can be used to provide all sorts of new intelligence for streams, including but not limited to: profiling interests, search personalization, targeted advertising, alerts of trending global events, etc.
Cloud and Trend Database
In one embodiment, the stream rank analyzer can run in a cloud computing platform in which the metadata of each message is converted into a trend database record. These trend database records have fields that store counts for specific conditions in which the message occurred in a given timeframe. For example, the message “I hate Christmas” can result in the following metadata: Topic=Christmas, Sentiment=Negative. Hence, for the trend database record with topic name “Christmas”, the stream rank analyzer increases the ‘sentiment negative counter’ by 1. As time progresses and more messages flow through the stream rank analyzer, the trend database reflects the state of all the trends that are happening in a stream. Queries regarding the topics can be performed on this trend database. For instance, a list of the topics with the most negative sentiment can be generated by sorting on the count of negative sentiment.
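The counter update described above can be sketched with an in-memory structure. The key layout (topic, timeframe), the counter names, and the timeframe label are hypothetical; an embodiment would use a real database rather than a dictionary.

```python
from collections import Counter, defaultdict

# (topic, timeframe) -> counters; stands in for a trend database table.
trend_db = defaultdict(Counter)


def record(metadata, timeframe):
    """Increment the counters of every topic mentioned in a message."""
    for topic in metadata["topics"]:
        rec = trend_db[(topic, timeframe)]
        rec["volume"] += 1
        rec["sentiment_" + metadata["sentiment"]] += 1


frame = "t0"  # hypothetical timeframe label
record({"topics": ["Christmas"], "sentiment": "negative"}, frame)
record({"topics": ["Christmas"], "sentiment": "positive"}, frame)
record({"topics": ["Monday"], "sentiment": "negative"}, frame)
record({"topics": ["Monday"], "sentiment": "negative"}, frame)

# Query: topics ordered by their negative-sentiment count.
most_negative = sorted(
    trend_db.items(), key=lambda kv: kv[1]["sentiment_negative"], reverse=True
)
```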
Trend database records are created for any piece of metadata regarding a message, including topics, types, hashtags, mentioned people, author, links, geographic segments, ethnicity, gender, etc.
For instance, the stream rank analyzer can use the following attributes as counters for the trend database record:
For each of these counters except for volume, the stream rank analyzer can also use relative counters. These relative counters are percentages of the volume. So for instance, if the “Gender Male” count is 2, and the total volume is 4, the “Gender Male Percentage” counter would be “50%”.
Also, the stream rank analyzer can use acceleration counters associated with the relative counters. An acceleration counter compares the count for the current timeframe with the count for the preceding timeframe. For instance, if there were 1000 impressions in the preceding timeframe, and 3000 impressions in the current timeframe, the “impressions acceleration counter” would be “+2000”.
Thirdly, the stream rank analyzer can store specific ratio counters for some of the fields. These ratio counters allow the easy retrieval of specific ratio trends. For instance, a “Gender Female Ratio” helps the retrieval of “Most Masculine Topics”, “Most Feminine Geo Regions”, “Most Feminine Links”, etc. Some example ratio counters are Gender (Male VS Female), Ethnicity (Black VS White, Black VS Asian, etc.), Engagement (Volume VS Impressions), and Network (Twitter VS Facebook, etc.).
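The three derived counters can be written as small functions. The relative and acceleration counters follow the worked numbers in the text; the ratio formula (the share of the first count in the sum of both) is an assumption, since the source does not define how a ratio counter is computed.

```python
def relative(count, volume):
    """Relative counter: the count as a percentage of the total volume."""
    return 100.0 * count / volume if volume else 0.0


def acceleration(current, previous):
    """Acceleration counter: change from the preceding timeframe."""
    return current - previous


def ratio(a, b):
    """Ratio counter (assumed definition): share of a in a + b."""
    return a / (a + b) if (a + b) else 0.0


gender_male_pct = relative(2, 4)          # "Gender Male Percentage"
impressions_acc = acceleration(3000, 1000)  # "impressions acceleration counter"
gender_female_ratio = ratio(1, 3)         # hypothetical female vs male counts
```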
Audience Segments
In one embodiment, the stream rank analyzer can be used to detect trends in any stream of messages. One such stream includes messages from an audience segment, i.e. messages by a group of people that match certain criteria, for example “Female Soccer Fans” or “Republican Beer Drinkers”. By using the stream rank analyzer, one can find trends in these segments and gain new insights to answer specific marketing research and business intelligence questions.
For any given topic (e.g. a brand called “Heineken”), the stream rank analyzer can compare the ranks of streams between different audience segments. In that way, the stream rank analyzer can show how the topic is trending and performing across different segments. This includes the ability to drill down into specific attributes of the trends, for example, how is the brand performing across different ethnicities in each segment, or how is sentiment across segments, or how opinionated is the brand in different segments.
Another example of an audience segment is “Everyone”. When the stream rank analyzer is used for the stream of all global mentions and messages, the stream rank analyzer can show what topics people around the world are focusing on.
Quadrant Visualization of Stream Rank Trends
The stream rank analyzer can further include a quadrant visualizer for plotting the current “Trend State” on a scatter plot which is divided into 4 quadrants, based on the recorded stream rank trends.
Each quadrant of the scatter plot has a different meaning. The Waves quadrant indicates small trends that are spiking right now, but have a low momentum. The Bubbles quadrant indicates no trend yet (low acceleration, and low momentum). The Currents quadrant indicates sustained trends that have a low acceleration. The Tsunamis quadrant indicates trends that have high momentum, high acceleration.
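The quadrant assignment can be sketched as a pair of threshold tests. The cut-off values of 0.5 and the assumption that momentum and acceleration are normalized to a common scale are hypothetical; the source does not state where the quadrant boundaries lie.

```python
def quadrant(momentum, acceleration, momentum_cut=0.5, acceleration_cut=0.5):
    """Name the quadrant for a trend given its momentum and acceleration."""
    high_momentum = momentum >= momentum_cut
    high_acceleration = acceleration >= acceleration_cut
    if high_momentum and high_acceleration:
        return "Tsunamis"
    if high_momentum:
        return "Currents"  # sustained trend, low acceleration
    if high_acceleration:
        return "Waves"     # spiking now, low momentum
    return "Bubbles"       # no trend yet


labels = [quadrant(m, a) for m, a in [(0.9, 0.9), (0.9, 0.1), (0.1, 0.9), (0.1, 0.1)]]
```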
For each timeframe a plot of this kind can be visualized. These plots can be animated over time to reflect how the trend is changing across different quadrants over a time period, as illustrated in
Stream Rank Derived Trend Activity Events
The stream rank analyzer can further generate a trend activity record for each attribute that changed inside a stream rank trend. For example, if the volume of topic X dropped 25% in a given timeframe, the stream rank analyzer can generate a trend activity record such as “topic X, volume −25%”. Any attribute mentioned in the previous sections can be used to generate a trend activity record. For each of these trend activity events, the stream rank analyzer can assign an impact score based on the severity of the event. For instance, an event with a volume change of 40% has a higher impact score than an event with a volume change of 4%.
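Generating trend activity events from two consecutive timeframes can be sketched as follows. Using the absolute percentage change as the impact score is an assumption; the source says only that the score is based on severity. The record field names are hypothetical.

```python
def trend_activities(topic, previous, current):
    """Emit one event per changed attribute, ordered by descending impact."""
    events = []
    for attr, old in previous.items():
        new = current.get(attr, 0)
        if old == 0 or new == old:
            continue  # skip unchanged attributes and undefined percentages
        change = 100.0 * (new - old) / old
        events.append(
            {"topic": topic, "attribute": attr, "change_pct": change, "impact": abs(change)}
        )
    return sorted(events, key=lambda e: e["impact"], reverse=True)


events = trend_activities(
    "X",
    {"volume": 200, "impressions": 1000},
    {"volume": 150, "impressions": 1040},
)
```

In this example the 25% drop in volume is ranked ahead of the 4% rise in impressions.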
Decentralized Attention Indexing
In order to analyze messages on a global level, the stream rank analyzer includes a strategy for building an index of real-time social networking data. Conventional scraping and spidering approaches employed by search engines such as Google rely on software programs that find, crawl and download web pages using a large server infrastructure. This approach does not work for the real-time data needs of social networks. By the time the crawl would be finished, all data can be outdated.
The stream rank analyzer provides a solution by including application programs for indexing messages as part of the browser. The stream rank analyzer includes a browser extension or other form of software called an attention tracker that can be installed within the browser.
The keywords inside the indexing commands can include globally popular topics, but can also include common words or expressions on social networks like “A”, “http”, “the”. Some of these common keywords represent a large part of the activity on social networks. For example, 60% of all messages on Twitter include the word “http”. By continuously searching for this on the Twitter website with a random date-range interval, it is possible to siphon a large portion of the global stream with a relatively small number of attention trackers. Also, this mechanism bypasses API access controls and cannot be blocked in the way that conventional indexing bots are blocked.
Another benefit is that implicit attention data can be added to the attention index. Examples of the implicit attention data include browsing behavior (e.g. which links were visited, and which pieces of metadata did those links have) or user behavior inside social networks (e.g. how long did a person look at a message, which messages were not seen, etc.). All of these implicit attention data can be used in building a map of the user population's attention in high detail.
Real-Time Applications
In one embodiment, the social intelligence system includes an application layer for developing, running and managing real-time applications. Developers can code apps in HTML5 and JavaScript for this layer. Developers do not need to worry about integrating with hundreds of APIs. The rich metadata and structure around messages give developers the power to build highly domain specific tools and new interactive experiences around the stream.
The application layer includes a framework that allows plugins (also referred to as micro apps) to be developed very quickly using a technique called real-time coding. A developer can use tools provided by the social intelligence system to change the current running instance of the social intelligence system. The framework can rapidly hot-swap the changed pieces of running code.
The applications and servers of the social intelligence system can be implemented using various techniques, as readily understood by a person having ordinary skill in the art. For instance, in one embodiment, for the applications running on the client nodes in a browser, browser-side routing frameworks are used to ensure that UI flow is neatly structured and that user interaction is speedy. HTML5 and modern JavaScript APIs are used to allow access to storage, audio, rendering capabilities and web workers. For mobile applications, the HTML and/or JavaScript programs are wrapped in a wrapper that allows communication with the mobile operating system.
In one embodiment, the server nodes of the social intelligence system use a minimalist web framework (called Express) that runs on top of a server-side software system for writing scalable internet applications such as NodeJS. The server nodes utilize the framework in combination with a JavaScript library such as SockJS to provide a real-time communication pipeline to the client-side applications of the social intelligence system.
In one embodiment, all software on the server nodes is written in JavaScript which runs in a NodeJS environment. Every message that gets posted through the client-side application of the social intelligence system will be stored with metadata on a central server implemented as a scalable, high-performance, database, such as MongoDB. The server nodes can be operated by a standard Ubuntu Linux distribution.
Because of the heavy use of client-side capacity, the social intelligence system's hardware needs are relatively low. In one embodiment, the social intelligence system includes multiple database servers and multiple application servers running NodeJS. Each of these servers can have a standard multi-core CPU, high memory and solid-state drive configuration. In one embodiment, a third-party content delivery platform, e.g. Amazon's CloudFront CDN, can be used to rapidly serve all code, media assets and static data to client-side applications of the social intelligence system.
In process 1530, the system performs a language detection on the microcontent message to detect language metadata. Then in process 1535, the system associates the detected language metadata with the microcontent message. In one embodiment, the language detection is based on a dictionary-based language detector or an NGram language detector.
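A dictionary-based detector of the kind mentioned above can be sketched by counting how many tokens of a message appear in each language's word list. The three short stop-word lists are hypothetical and far too small for real use; they serve only to show the mechanism.

```python
# Hypothetical, deliberately tiny word lists per language code.
STOPWORDS = {
    "en": {"the", "and", "is", "to", "of", "in"},
    "nl": {"de", "het", "en", "is", "van", "een"},
    "es": {"el", "la", "y", "es", "de", "en"},
}


def detect_language(text):
    """Return the language whose word list matches the most tokens, or None."""
    tokens = text.lower().split()
    scores = {
        lang: sum(token in words for token in tokens)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


language = detect_language("the cat is in the garden")
```

Very short messages may contain no listed words at all, in which case the sketch returns None rather than guessing.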
In process 1540, the system performs part-of-speech tagging on the microcontent message to identify nouns in the microcontent message. In one embodiment, the part-of-speech tagging is based on a Brill tagger method.
In process 1550, the system performs a topic extraction on the microcontent message to extract topic metadata. Then in process 1552, the system assigns a confidence score for each piece of the extracted topic metadata. In process 1554, the system associates the topic metadata and confidence scores with the microcontent message. In one embodiment, the topic extraction is based on factors including capitalization.
In process 1560, the system identifies type metadata of the microcontent message from an ontology of predetermined microcontent types by applying a database of annotation rules to the text tokens of the microcontent message. In one embodiment, the process 1560 can include compiling a matching index including a set of expressions based on the ontology of predetermined microcontent types or the database of annotation rules; and matching the expressions of the matching index with the microcontent message.
In one embodiment, the ontology of predetermined microcontent types is organized as a relational database including trees of the predetermined microcontent types. The annotation rules include lists of names and patterns that help assign a tree of the trees of the predetermined microcontent types to the microcontent message. In process 1565, the system associates the identified type metadata with the microcontent message.
In process 1570, the system identifies sentiment metadata by performing a sentiment analysis on the microcontent message. In process 1575, the system associates the sentiment metadata with the microcontent message. In one embodiment, the sentiment analysis is based on a Naïve Bayesian classifier that is trained offline on an annotated set of positive, negative and neutral microcontent messages.
In process 1640, the system produces a trend database. Each record of the trend database includes a count for a specific condition based on the metadata attributes in which a message occurred in a timeframe. In one embodiment, a record of the trend database can further include a relative counter, an acceleration counter or a ratio counter. In process 1650, the system presents a list of trending metadata attributes based on the counts of the records of the trend database. In one embodiment, the metadata attributes include topics, types, hashtags, mentioned people, author, links, geographic segments, ethnicity, or genders.
In process 1660, the system identifies an audience segment correlated with a trending topic based on the messages from users of the audience segment. In process 1670, the system presents the trending topics as scatters in a quadrant scatter plot, wherein a volume of each scatter indicates a metadata attribute of the topics. Then in process 1680, the system produces an animation of the quadrant scatter plot over a time period to reflect how users' attention to the trending topics changes.
In alternative embodiments, the machine operates as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server or a client machine in a client-server network environment, or as a peer machine in a peer-to-peer (or distributed) network environment.
The machine may be a server computer, a client computer, a personal computer (PC), a user device, a tablet PC, a laptop computer, a set-top box (STB), a personal digital assistant (PDA), a cellular telephone, an iPhone, an iPad, a Blackberry, a processor, a telephone, a web appliance, a network router, switch or bridge, a console, a hand-held console, a (hand-held) gaming device, a music player, any portable, mobile, hand-held device, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine.
While the machine-readable medium or machine-readable storage medium is shown in an exemplary embodiment to be a single medium, the term “machine-readable medium” and “machine-readable storage medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The term “machine-readable medium” and “machine-readable storage medium” shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the presently disclosed technique and innovation.
In general, the routines executed to implement the embodiments of the disclosure, may be implemented as part of an operating system or a specific application, component, program, object, module or sequence of instructions referred to as “computer programs.” The computer programs typically comprise one or more instructions set at various times in various memory and storage devices in a computer, and that, when read and executed by one or more processing units or processors in a computer, cause the computer to perform operations to execute elements involving the various aspects of the disclosure.
Moreover, while embodiments have been described in the context of fully functioning computers and computer systems, those skilled in the art will appreciate that the various embodiments are capable of being distributed as a program product in a variety of forms, and that the disclosure applies equally regardless of the particular type of machine or computer-readable media used to actually effect the distribution.
Further examples of machine-readable storage media, machine-readable media, or computer-readable (storage) media include, but are not limited to, recordable type media such as volatile and non-volatile memory devices, floppy and other removable disks, hard disk drives, optical disks (e.g., Compact Disk Read-Only Memory (CD ROMS), Digital Versatile Disks, (DVDs), etc.), among others, and transmission type media such as digital and analog communication links.
The network interface device enables the machine 1100 to mediate data in a network with an entity that is external to the host server, through any known and/or convenient communications protocol supported by the host and the external entity. The network interface device can include one or more of a network adaptor card, a wireless network interface card, a router, an access point, a wireless router, a switch, a multilayer switch, a protocol converter, a gateway, a bridge, bridge router, a hub, a digital media receiver, and/or a repeater.
The network interface device can include a firewall which can, in some embodiments, govern and/or manage permission to access/proxy data in a computer network, and track varying levels of trust between different machines and/or applications. The firewall can be any number of modules having any combination of hardware and/or software components able to enforce a predetermined set of access rights between a particular set of machines and applications, machines and machines, and/or applications and applications, for example, to regulate the flow of traffic and resource sharing between these varying entities. The firewall may additionally manage and/or have access to an access control list which details permissions including for example, the access and operation rights of an object by an individual, a machine, and/or an application, and the circumstances under which the permission rights stand.
Other network security functions that can be performed by or included in the functions of the firewall include, for example, but are not limited to, intrusion prevention, intrusion detection, next-generation firewall, personal firewall, etc., without deviating from the novel art of this disclosure.
Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise,” “comprising,” and the like are to be construed in an inclusive sense, as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to.” As used herein, the terms “connected,” “coupled,” or any variant thereof, mean any connection or coupling, either direct or indirect, between two or more elements; the coupling or connection between the elements can be physical, logical, or a combination thereof. Additionally, the words “herein,” “above,” “below,” and words of similar import, when used in this application, shall refer to this application as a whole and not to any particular portions of this application. Where the context permits, words in the above Detailed Description using the singular or plural number may also include the plural or singular number respectively. The word “or,” in reference to a list of two or more items, covers all of the following interpretations of the word: any of the items in the list, all of the items in the list, and any combination of the items in the list.
The above detailed description of embodiments of the disclosure is not intended to be exhaustive or to limit the teachings to the precise form disclosed above. While specific embodiments of, and examples for, the disclosure are described above for illustrative purposes, various equivalent modifications are possible within the scope of the disclosure, as those skilled in the relevant art will recognize. For example, while processes or blocks are presented in a given order, alternative embodiments may perform routines having steps, or employ systems having blocks, in a different order, and some processes or blocks may be deleted, moved, added, subdivided, combined, and/or modified to provide alternatives or subcombinations. Each of these processes or blocks may be implemented in a variety of different ways. Also, while processes or blocks are at times shown as being performed in series, these processes or blocks may instead be performed in parallel, or may be performed at different times. Further, any specific numbers noted herein are only examples: alternative implementations may employ differing values or ranges.
The teachings of the disclosure provided herein can be applied to other systems, not necessarily the system described above. The elements and acts of the various embodiments described above can be combined to provide further embodiments.
Any patents and applications and other references noted above, including any that may be listed in accompanying filing papers, are incorporated herein by reference. Aspects of the disclosure can be modified, if necessary, to employ the systems, functions, and concepts of the various references described above to provide yet further embodiments of the disclosure.
These and other changes can be made to the disclosure in light of the above Detailed Description. While the above description describes certain embodiments of the disclosure, and describes the best mode contemplated, no matter how detailed the above appears in text, the teachings can be practiced in many ways. Details of the system may vary considerably in implementation, while still being encompassed by the subject matter disclosed herein. As noted above, particular terminology used when describing certain features or aspects of the disclosure should not be taken to imply that the terminology is being redefined herein to be restricted to any specific characteristics, features, or aspects of the disclosure with which that terminology is associated. In general, the terms used in the following claims should not be construed to limit the disclosure to the specific embodiments disclosed in the specification, unless the above Detailed Description section explicitly defines such terms. Accordingly, the actual scope of the disclosure encompasses not only the disclosed embodiments, but also all equivalent ways of practicing or implementing the disclosure under the claims.
While certain aspects of the disclosure are presented below in certain claim forms, the inventors contemplate the various aspects of the disclosure in any number of claim forms. For example, while only one aspect of the disclosure is recited as a means-plus-function claim under 35 U.S.C. §112, ¶6, other aspects may likewise be embodied as a means-plus-function claim, or in other forms, such as being embodied in a computer-readable medium. (Any claims intended to be treated under 35 U.S.C. §112, ¶6 will begin with the words “means for”.) Accordingly, the applicant reserves the right to add additional claims after filing the application to pursue such additional claim forms for other aspects of the disclosure.
This application claims the benefit of U.S. Provisional Application No. 61/600,553 under 35 USC 119(e), entitled “NATURAL LANGUAGE PROCESSING OPTIMIZED FOR MICROCONTENT,” filed on Feb. 17, 2012, the contents of which are incorporated herein by reference. This application also claims the benefit of U.S. Provisional Application No. 61/723,280 under 35 USC 119(e), entitled “SYSTEMS AND METHOD FOR CONTINUOUS AND REAL-TIME OR NEAR REAL-TIME TARGETING OF SOCIAL NETWORK ADVERTISEMENTS AND OTHER PROMOTIONAL CONTENT,” filed on Nov. 6, 2012, the contents of which are incorporated herein by reference. This application is related to U.S. patent application Ser. No. 13/403,937, entitled “SYSTEM AND METHOD FOR ANALYZING MESSAGES IN A NETWORK OR ACROSS NETWORKS,” filed on Feb. 23, 2012, the contents of which are incorporated herein by reference. This application is related to U.S. patent application Ser. No. 13/752,333, entitled “TRENDING OF AGGREGATED PERSONALIZED INFORMATION STREAMS AND MULTI-DIMENSIONAL GRAPHICAL DEPICTION THEREOF,” filed on Jan. 28, 2013, the contents of which are incorporated herein by reference. This application is related to U.S. patent application Ser. No. 13/752,343, entitled “TARGETED ADVERTISING BASED ON TRENDING OF AGGREGATED PERSONALIZED INFORMATION STREAMS,” also filed on Jan. 28, 2013, the contents of which are incorporated herein by reference. This application is related to U.S. patent application Ser. No. 13/403,948, entitled “ADAPTIVE SYSTEM ARCHITECTURE FOR IDENTIFYING POPULAR TOPICS FROM MESSAGES,” filed on Feb. 23, 2012, the contents of which are incorporated herein by reference. This application is related to U.S. patent application Ser. No. 13/403,962, entitled “SYSTEMS AND METHODS FOR RECOMMENDING ADVERTISEMENT PLACEMENT BASED ON IN NETWORK AND CROSS NETWORK ONLINE ACTIVITY ANALYSIS,” also filed on Feb. 23, 2012, the contents of which are incorporated herein by reference.
Number | Name | Date | Kind |
---|---|---|---|
6151585 | Altschuler et al. | Nov 2000 | A |
6286005 | Cannon | Sep 2001 | B1 |
6373483 | Becker et al. | Apr 2002 | B1 |
7512612 | Akella et al. | Mar 2009 | B1 |
7636714 | Lamping et al. | Dec 2009 | B1 |
7885844 | Cohen et al. | Feb 2011 | B1 |
8073850 | Hubbard et al. | Dec 2011 | B1 |
8086605 | Xu et al. | Dec 2011 | B2 |
8166026 | Sadler | Apr 2012 | B1 |
8166925 | Suggett et al. | May 2012 | B2 |
8352549 | Sacco et al. | Jan 2013 | B2 |
20020082923 | Merriman et al. | Jun 2002 | A1 |
20030110056 | Berghofer et al. | Jun 2003 | A1 |
20030229531 | Heckerman et al. | Dec 2003 | A1 |
20040225592 | Churquina | Nov 2004 | A1 |
20050060312 | Curtiss et al. | Mar 2005 | A1 |
20050108630 | Wasson et al. | May 2005 | A1 |
20050114786 | Decombe | May 2005 | A1 |
20050131897 | Grasso et al. | Jun 2005 | A1 |
20050154690 | Nitta et al. | Jul 2005 | A1 |
20050192957 | Newbold | Sep 2005 | A1 |
20050256866 | Lu et al. | Nov 2005 | A1 |
20060041543 | Achlioptas | Feb 2006 | A1 |
20060059055 | Lin | Mar 2006 | A1 |
20060074836 | Gardner et al. | Apr 2006 | A1 |
20060112111 | Tseng et al. | May 2006 | A1 |
20060112146 | Song et al. | May 2006 | A1 |
20060173957 | Robinson et al. | Aug 2006 | A1 |
20060218577 | Goodman et al. | Sep 2006 | A1 |
20060248073 | Jones et al. | Nov 2006 | A1 |
20060282328 | Gerace et al. | Dec 2006 | A1 |
20060294085 | Rose et al. | Dec 2006 | A1 |
20060294086 | Rose et al. | Dec 2006 | A1 |
20060294134 | Berkhim et al. | Dec 2006 | A1 |
20070027744 | Carson et al. | Feb 2007 | A1 |
20070078699 | Scott et al. | Apr 2007 | A1 |
20070121843 | Atazky et al. | May 2007 | A1 |
20070124432 | Holtzman et al. | May 2007 | A1 |
20070150398 | Rossen et al. | Jun 2007 | A1 |
20070156636 | Norton et al. | Jul 2007 | A1 |
20070168533 | Canright et al. | Jul 2007 | A1 |
20070174057 | Genly | Jul 2007 | A1 |
20070208613 | Backer | Sep 2007 | A1 |
20070219863 | Park et al. | Sep 2007 | A1 |
20070226248 | Darr | Sep 2007 | A1 |
20070282867 | McAllister et al. | Dec 2007 | A1 |
20070297405 | He | Dec 2007 | A1 |
20080004942 | Calabria | Jan 2008 | A1 |
20080010253 | Sidhu et al. | Jan 2008 | A1 |
20080059466 | Luo et al. | Mar 2008 | A1 |
20080104225 | Zhang et al. | May 2008 | A1 |
20080133426 | Porat et al. | Jun 2008 | A1 |
20080177813 | Wang et al. | Jul 2008 | A1 |
20080215429 | Ramer et al. | Sep 2008 | A1 |
20080215557 | Ramer et al. | Sep 2008 | A1 |
20080215571 | Huang et al. | Sep 2008 | A1 |
20080275833 | Zhou et al. | Nov 2008 | A1 |
20080275861 | Baluja et al. | Nov 2008 | A1 |
20080288305 | LaLuzerne et al. | Nov 2008 | A1 |
20080288596 | Smith et al. | Nov 2008 | A1 |
20080313011 | Rose et al. | Dec 2008 | A1 |
20090037521 | Zilca et al. | Feb 2009 | A1 |
20090049038 | Gross | Feb 2009 | A1 |
20090049308 | Jin et al. | Feb 2009 | A1 |
20090106224 | Roulland et al. | Apr 2009 | A1 |
20090106244 | Dash et al. | Apr 2009 | A1 |
20090119173 | Parsons et al. | May 2009 | A1 |
20090234828 | Tu | Sep 2009 | A1 |
20090319342 | Shilman et al. | Dec 2009 | A1 |
20090327275 | Walker et al. | Dec 2009 | A1 |
20100030648 | Manolescu et al. | Feb 2010 | A1 |
20100042944 | Robinson et al. | Feb 2010 | A1 |
20100049852 | Whitnah et al. | Feb 2010 | A1 |
20100058196 | Krishnan et al. | Mar 2010 | A1 |
20100070448 | Omoigui | Mar 2010 | A1 |
20100083124 | Druzgalski et al. | Apr 2010 | A1 |
20100100537 | Druzgalski et al. | Apr 2010 | A1 |
20100114946 | Kumar et al. | May 2010 | A1 |
20100119053 | Goeldi | May 2010 | A1 |
20100121707 | Goeldi | May 2010 | A1 |
20100121817 | Meyer et al. | May 2010 | A1 |
20100121839 | Meyer et al. | May 2010 | A1 |
20100174692 | Meyer et al. | Jul 2010 | A1 |
20100281041 | Almeida | Nov 2010 | A1 |
20100306049 | Kakade et al. | Dec 2010 | A1 |
20100306192 | Kapur et al. | Dec 2010 | A1 |
20100312769 | Bailey et al. | Dec 2010 | A1 |
20110004465 | Rose et al. | Jan 2011 | A1 |
20110022602 | Luo et al. | Jan 2011 | A1 |
20110029636 | Smyth et al. | Feb 2011 | A1 |
20110067030 | Isard et al. | Mar 2011 | A1 |
20110078156 | Koss | Mar 2011 | A1 |
20110093459 | Dong et al. | Apr 2011 | A1 |
20110096915 | Nemer | Apr 2011 | A1 |
20110184806 | Chen et al. | Jul 2011 | A1 |
20110191372 | Kaushansky et al. | Aug 2011 | A1 |
20110206198 | Freedman et al. | Aug 2011 | A1 |
20110212717 | Rhoads et al. | Sep 2011 | A1 |
20110213661 | Milana et al. | Sep 2011 | A1 |
20110238496 | Gurbuxani et al. | Sep 2011 | A1 |
20110238647 | Ingram et al. | Sep 2011 | A1 |
20110264599 | Dalton | Oct 2011 | A1 |
20110307429 | Probst et al. | Dec 2011 | A1 |
20110320715 | Ickman et al. | Dec 2011 | A1 |
20120005224 | Ahrens et al. | Jan 2012 | A1 |
20120166438 | Wu et al. | Jun 2012 | A1 |
20120272160 | Spivack et al. | Oct 2012 | A1 |
20120278164 | Spivack et al. | Nov 2012 | A1 |
20120296920 | Sahni et al. | Nov 2012 | A1 |
20130013801 | Leeder | Jan 2013 | A1 |
20130018698 | Parnaby et al. | Jan 2013 | A1 |
20130046603 | Grigg et al. | Feb 2013 | A1 |
20130144682 | Dhara et al. | Jun 2013 | A1 |
20130159507 | Mason et al. | Jun 2013 | A1 |
20130198204 | Williams et al. | Aug 2013 | A1 |
20130304818 | Brumleve et al. | Nov 2013 | A1 |
20140035920 | Duwenhorst | Feb 2014 | A1 |
Number | Date | Country |
---|---|---|
2003288437 | Oct 2003 | JP |
2006227965 | Aug 2006 | JP |
20090074108 | Jul 2009 | KR |
WO-2006073977 | Jul 2006 | WO |
WO-2006116516 | Nov 2006 | WO |
WO-2008006059 | Jan 2008 | WO |
WO-2009007754 | Jan 2009 | WO |
WO-2009035618 | Mar 2009 | WO |
Entry |
---|
International Search Report & Written Opinion for Application No. PCT/US2012/026410, Date of filing: Feb. 23, 2012, Applicant: Spivack Nova et al., Date of Mailing: Sep. 23, 2012, 11 pages. |
International Search Report & Written Opinion for Application No. PCT/US2012/026405, Date of filing: Feb. 23, 2012, Applicant: Spivack Nova et al., Date of Mailing: Dec. 28, 2012, 15 pages. |
International Search Report & Written Opinion for Application No. PCT/US2013/023504, Date of filing: Jan. 28, 2013, Applicant: Bottlenose Inc., Date of Mailing: Jun. 11, 2013, 20 pages. |
International Search Report & Written Opinion for Application No. PCT/US2013/26763, Date of filing: Feb. 19, 2013, Applicant: Bottlenose, Inc., Date of Mailing: Apr. 23, 2013, 8 pages. |
International Search Report & Written Opinion for Application No. PCT/US2013/26528, Date of filing: Feb. 15, 2013, Applicant: Bottlenose Inc., Date of Mailing: Apr. 18, 2013, 9 pages. |
International Search Report and Written Opinion for Application No. PCT/US2013/52981, Date of Filing: Jul. 31, 2013, Applicant: Bottlenose, Inc., Date of Mailing: Oct. 8, 2013, 10 pages. |
Brdiczka et al., “Automatic Detection of Interaction Groups,” ICMI, 2005, 5 pages. |
Wyatt et al., “Conversation and Speaker Segmentation in Privacy-Sensitive Situated Speech Data,” Interspeech, 2007, 4 pages. |
International Search Report and Written Opinion for Application No. PCT/US2013/68810, Date of Filing: Dec. 6, 2013, Applicant: Bottlenose Inc., Date of Mailing: May 7, 2014, 10 pages. |
Ashkan et al., “Classifying and Characterizing Query Intent,” University of Waterloo, Canada, Emory University, USA, Apr. 6, 2009, pp. 578-586. |
Dai et al., “Mining Anchor Text Trends for Retrieval,” Department of Computer Science and Engineering, Lehigh University, USA, Mar. 28, 2010, pp. 127-139. |
Dietz et al., “Unsupervised Prediction of Citation Influences,” Proceedings of the 24th International Conference on Machine Learning. Jun. 20, 2007, pp. 233-240. |
Fan et al., “Suggesting Topic-Based Query Terms as You Type,” 2010 12th International Asia-Pacific Web Conference, Apr. 2010, 7 pages. |
Garfield, “Citation Analysis as a Tool in Journal Evaluation,” Essays of an Information Scientist, vol. 1, 1972, 24 pages. |
Guha et al., “Propagation of trust and distrust,” In Proceedings of the 13th International Conference on World Wide Web, New York, NY, USA, May 17-20, 2004, WWW '04. ACM. New York, NY, pp. 403-412. |
Huang et al., “Conversational Tagging in Twitter,” Proceedings of the 21st ACM Conference on Hypertext and Hypermedia, Jun. 13, 2010, 5 pages. |
Johnson, Celeb, “Likebutton.me Reveals What Your Friends ‘Like’ Across the Web,” from the internet, http://www.switched.com, Apr. 26, 2010, 8 pages. |
Liu et al., “Co-authorship networks in the digital library research community,” 2005, Elsevier, Information Processing and Management 41, 2005, pp. 1462-1480. |
Lu et al., “Finding Query Suggestions for PubMed,” 2009, AMIA 2009 Symposium Proceedings, 5 pages. |
Makkonen et al., “Topic Detection and Tracking with Spatio-Temporal Evidence,” Department of Computer Science, University of Helsinki, Finland, published Spring 2003, 15 pages. |
Massa et al., “Using Trust in Recommender Systems: An Experimental Analysis,” International Graduate School in Information and Communication Technologies, University of Trento, 38050 Povo, Italy, Feb. 26, 2004, 15 pages. |
Page et al., “The PageRank Citation Ranking: Bringing Order to the Web,” from the internet, http://dbpubs.stanford.edu/pub, 1998. |
Tao et al., “Selectivity Estimation of Predictive Spatio-Temporal Queries,” Proceedings 19th International Conference on Data Engineering, ICDE'2003, Bangalore, India, Mar. 5-8, 2003, pp. 417-428. |
von der Weth et al., “Towards an Objective Assessment of Centrality Measures in Reputation Systems,” 2007 IEEE, 8 pages. |
Zhou et al., “PowerTrust: A Robust and Scalable Reputation System for Trusted Peer-To-Peer Computing,” 2007 IEEE, 30 pages. |
Ziegler et al., “Propagation Models for Trust and Distrust in Social Networks,” Information Systems Frontiers 7, Dec. 2005, pp. 337-358. |
European Search Report for 14151134.5, Applicant: Bottlenose, Inc., Date of Mailing: May 27, 2014, 8 pages. |
European Search Report for 14151154.3, Applicant: Bottlenose, Inc., Date of Mailing: May 27, 2014, 8 pages. |
Number | Date | Country |
---|---|---|
20130290317 A1 | Oct 2013 | US |
Number | Date | Country |
---|---|---|
61600553 | Feb 2012 | US |
61723280 | Nov 2012 | US |