Many advertisers seek to deliver relevant content to Internet users based on user preferences and surfing habits. However, known methods fail to deliver adequately personalized advertising content to Internet users, and user participation therefore diminishes over time.
My invention is a relevance engine for delivering increasingly relevant content to Internet users over time. My invention is adaptable to computers as well as cell phones, PDAs and other similar wireless devices. It is a push-based content delivery means in which the user can passively receive desired content without having to surf and search the Internet. The invention also provides an incentive system for the user to view advertising material by offering tickets for lifetime prize draws. The relevance engine learns and predicts the user's ad preferences, such as the frequency of viewing ads, the viewer's interests, and the time of day for viewing ads. A novel aspect of my invention is that the relevance engine acts like an adjustable digital valve that controls the rate of advertising delivered to the user. The frequency of ads sent to the user is controlled by the user's preferences and the relevance of the information carried in the ads.
My invention has a number of advantages which contribute to its novelty and inventiveness.
The Relevance Engine
One novel aspect of my invention is the “relevance engine”. The relevance engine is a digital means that learns and predicts the user's ad content preferences based upon accumulated data about the user. There is an enrollment means whereby the user is able to submit personal demographic information such as age, sex and occupation; explicit preferences as to ad content; and a set of participant-generated taxonomic keywords or “folksonomy” tags to attract relevant content for the user. There is also a system recordal means that monitors and records data relating to the user's dynamic response to delivered content, for example, click-through rates, response time and time of day. The user is also able to weight ads according to relevance by clicking through to the ad, viewing the ad and then rating the ad's relevance. The relevance engine is therefore rule-based in predicting user preferences and uses a degree of artificial intelligence to refine those predictions. With the relevance engine, the user is able to receive increasingly relevant content over time, which promotes continued and increased participation in the system. The relevance of content to the user can be determined in a number of ways. For example, the user may wish information on consumer goods, and so content that falls into the category of consumer goods is relevant to the user. If the user is interested in information about a particular location, then the relevant content is categorized based on the geographic or spatial nature of the user's interest. Finally, if the user is interested in making a purchase of, say, a house in a certain location and within a certain time frame, then the category is classified as to goods as well as to the spatial and temporal nature of the requirement.
In another embodiment of my invention there is included a reward system for viewing and interacting with the delivered content. One example of a reward system is the awarding of points that can be redeemed for material goods or goods having extrinsic or intrinsic value. Another embodiment of the reward system would be a prize draw system that would award tickets that would not expire. The number of tickets would continue to accumulate over time, thereby incentivising the user to continue to use the system over the long term. This in effect is a lifetime lottery.
Various explanatory samples of my invention follow.
Description
The relevance engine may consist of the following components:
Inputs to the System
The system takes the following inputs:
Outputs from the System
It generates the following from those inputs:
User Synaptic Map
A User Synaptic Map is a set of labels tied to a unique identity element representing a user. A user synaptic map looks like
Each relation consists of a synaptic weight that ranges from −1.0 to +1.0. A synaptic strength >0 is excitatory while a strength <0 is inhibitory. The relation weight is represented as a 2-dimensional vector where each dimension represents the synaptic strength in one direction (U to label, or label to U).
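A minimal sketch of such a map, assuming a Python representation that the description does not prescribe. The names `Relation` and `SynapticMap`, the field names, and the sample weights (including the label "golf") are illustrative assumptions only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Relation:
    """Two-component synaptic weight between an identity element and a label."""
    to_label: float = 0.0    # strength in the direction identity -> label
    from_label: float = 0.0  # strength in the direction label -> identity


@dataclass
class SynapticMap:
    """A set of labels tied to one identity element (a user, an ad, ...)."""
    identity: str
    relations: Dict[str, Relation] = field(default_factory=dict)

    def set_weight(self, label: str, to_label: float, from_label: float) -> None:
        self.relations[label] = Relation(to_label, from_label)

    def excitatory_labels(self) -> List[str]:
        """Labels whose identity -> label strength is greater than zero."""
        return sorted(l for l, r in self.relations.items() if r.to_label > 0)

    def inhibitory_labels(self) -> List[str]:
        """Labels whose identity -> label strength is less than zero."""
        return sorted(l for l, r in self.relations.items() if r.to_label < 0)


# Hypothetical weights for a user U.
user = SynapticMap("U")
user.set_weight("soccer", 0.8, 0.6)
user.set_weight("bellydancing", 0.4, 0.3)
user.set_weight("golf", -0.5, -0.2)
```

Keeping the two directions as separate fields mirrors the 2-dimensional relation weight; an ad or advertiser map could reuse the same structure with a different identity element.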
The map is developed through a number of methods:
User Synaptic Maps enable the system to learn correlations between user interests. In the example above, there may exist a correlation between users that have an interest in “soccer” and users that have an interest in “bellydancing”.
Ad Synaptic Map
An Ad Synaptic Map is a set of labels tied to a unique identity element representing an ad. An ad synaptic map looks like
An ad map is generated for each ad input into the system. An ad map is developed through any or all of the following methods:
As an ad is delivered to users of the system, synaptic weights are adjusted based on the responses of those users. When user responses resonate highly with the synaptic map, weights are strengthened. When user responses do not resonate highly, they are weakened.
Ad Synaptic Maps enable the system to learn synonyms and similarities about things. In the example above, “nike” and “shoes” have a strong similar relation.
Advertiser Synaptic Map
An Advertiser Synaptic Map is built up over successive ad synaptic maps that correspond to the same advertiser. The synaptic strengths in the map depend on the similarity or resonance of ads from that advertiser. An Advertiser Synaptic Map looks like
An advertiser synaptic map is principally used to suggest labels when new ads are inputted for a known advertiser.
Label Synaptic Map
A Label Synaptic Map is derived from large sets of user and ad synaptic maps. Based on commonly occurring relations and correlations of labels, a label synaptic map is learned. It principally answers the question on how labels are related. A label synaptic map is shown in
Label synaptic maps learn from every (1) ad entered, (2) new user, (3) existing user change, and (4) ad response.
A label synaptic map has the advantage that it follows a natural associative memory model.
Label synaptic maps can be enhanced further by clustering them into groups based on semantic relations. For example, all consumer brands would be clustered into one group based on analyzing similarities in their map structure.
Label synaptic maps can also be polymorphic based on a particular attribute. For example, a synaptic map could be geospatially sensitive in that its structure would be different in the US than in Canada. Synaptic relations to the brand “Tim Hortons”, which does not exist in the US, would cause a polymorphic map.
Labels can also be associated into a multi-dimensional, nonlinear hierarchy so that all types of sports would be classified under the label “sports”. By the same token, sports might be classified under the label “Nike”. However, “Nike” might be classified under the label “basketball” which is also under the label “sports”. This creates a circular hierarchy but one that is in fact acceptable and desirable.
Learning and Feedback
All synaptic map weights are modified when any one of the following activities occurs:
Ad Synaptic Maps are modified by:
Advertiser Synaptic Maps are modified by:
Mathematical Model
The relevance engine defines two algorithms for determining relevance, including (1) a learning algorithm, and (2) a resonance algorithm.
The learning algorithm builds upon the principles of unsupervised, auto-associative, and hetero-associative learning derived from the artificial intelligence domain.
The resonance algorithm builds upon the concepts of mechanical resonance in physics, and applies algorithms from statistics and fuzzy logic in the model.
Mathematical Model—Resonance Algorithm
The resonance algorithm computes the similarity between two synaptic maps, principally a user synaptic map and an ad synaptic map. This is illustrated mathematically in Dirac bracket notation to facilitate readability in this section.
Each user has a minimum of two (2) synaptic maps that can be represented in vector form. One for hard preferences (initially specified explicitly by the user) and one for soft preferences (learned implicitly from user behavior and other implicit sources).
Let $|u_i^H\rangle = (w_1, w_2, w_3, \ldots)$ be the hard synaptic map for user $i$,
where the vector has one dimension for each label known in the system
and where vector entries $w_k$ represent the synaptic weight to each label for user $i$.
For hard maps, $w_k$ can take one of three values in the set $\{-1, 0, +1\}$.
Similarly, let $|u_i^S\rangle$ be the soft synaptic map for user $i$.
For soft maps, $w_k$ can be any real value in $[-1, +1]$.
Note that these vectors are necessarily sparse.
Each ad also has a minimum of two (2) synaptic maps that can be represented in vector form: one for hard preferences (initially specified when the ad is inputted into the system by either a machine or a human) and one for soft preferences (learned implicitly from user behavior and other implicit sources), with one dimension for each label known in the system.
The generalized similarity of user $i$ to ad $j$ can be computed as follows:
$$S_{ij} = \frac{1}{N_a} \langle u_i^H + u_i^S \mid a_j^H + a_j^S \rangle$$
where $N_a$ is the number of nonzero entries in $|a_j^H + a_j^S\rangle$.
Note that generalized similarity is computed using both soft and hard synaptic maps. A hard similarity can also be computed by using only hard synaptic maps. Likewise, a soft similarity can be computed using only soft synaptic maps.
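The generalized similarity can be sketched as follows, assuming sparse maps are stored as Python dictionaries from label to weight (absent labels count as zero). The function names and the choice to return 0.0 for an ad with no nonzero labels are assumptions, not part of the description.

```python
from typing import Dict

SparseMap = Dict[str, float]  # label -> synaptic weight; absent labels are 0


def combine(hard: SparseMap, soft: SparseMap) -> SparseMap:
    """Add a hard and a soft synaptic map label by label."""
    out = dict(hard)
    for label, weight in soft.items():
        out[label] = out.get(label, 0.0) + weight
    return out


def similarity(user_hard: SparseMap, user_soft: SparseMap,
               ad_hard: SparseMap, ad_soft: SparseMap) -> float:
    """Inner product of the combined user and ad maps, divided by N_a."""
    u = combine(user_hard, user_soft)
    a = combine(ad_hard, ad_soft)
    n_a = sum(1 for weight in a.values() if weight != 0)
    if n_a == 0:
        return 0.0  # assumption: an ad with no labels resonates with nobody
    inner = sum(weight * u.get(label, 0.0) for label, weight in a.items())
    return inner / n_a


# Hypothetical maps: the user explicitly likes "shoes" and has a learned
# half-strength interest in "nike"; the ad is hard-labelled with both.
s = similarity({"shoes": 1.0}, {"nike": 0.5},
               {"nike": 1.0, "shoes": 1.0}, {})
```

Passing empty dictionaries for the soft maps yields the hard similarity, and passing empty dictionaries for the hard maps yields the soft similarity.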
If a label importance matrix $L$ is available, then similarity becomes:
$$S_{ij} = \frac{1}{N_a} \langle u_i^H + u_i^S \mid L \mid a_j^H + a_j^S \rangle$$
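A sketch of the weighted form, under the assumption that the label importance matrix is stored sparsely as a dictionary keyed by (row label, column label) and that missing entries are zero. The inputs `u` and `a` stand for the already-combined hard-plus-soft maps; all names and sample values are illustrative.

```python
from typing import Dict, Tuple

SparseMap = Dict[str, float]
ImportanceMatrix = Dict[Tuple[str, str], float]  # (row label, column label) -> value


def weighted_similarity(u: SparseMap, a: SparseMap,
                        importance: ImportanceMatrix) -> float:
    """Compute the bilinear form u^T L a, divided by the nonzero count of a."""
    n_a = sum(1 for weight in a.values() if weight != 0)
    if n_a == 0:
        return 0.0  # assumption: no ad labels means no resonance
    total = 0.0
    for (row, col), value in importance.items():
        total += u.get(row, 0.0) * value * a.get(col, 0.0)
    return total / n_a


# Hypothetical diagonal matrix that counts "nike" twice as heavily as "shoes".
u = {"shoes": 1.0, "nike": 0.5}
a = {"nike": 1.0, "shoes": 1.0}
importance = {("shoes", "shoes"): 1.0, ("nike", "nike"): 2.0}
s_weighted = weighted_similarity(u, a, importance)
```

With this storage choice a label that has no entry in the matrix contributes nothing, so an identity matrix must list every label explicitly to reproduce the unweighted similarity.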
To find ads with the highest similarity to push out to user i, the following algorithm is employed:
Mathematical Model—Synaptic Learning Algorithm
The synaptic learning algorithm is applied once an ad is delivered to a user. It principally learns about both user preferences and ad attributes by modifying a user's soft synaptic map and an ad's soft synaptic map.
Its other major function is to “harden” soft labels by promoting them into the hard synaptic map from the soft synaptic map.
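After ad $j$ is pushed to user $i$, the following algorithm is applied:
$$S_{ij} = \frac{1}{N_a} \langle u_i^H + u_i^S \mid L \mid a_j^H + a_j^S \rangle$$
$$|u_i^S\rangle = |\hat{I} - \alpha\rangle \langle u_i^S| \pm |\alpha\rangle \langle a_j^H|$$
where $\hat{I} = (1, 1, 1, \ldots)^T$,
$|\alpha\rangle = \frac{c}{a_{jk}} |a_j^H\rangle$ with $c \le 1$,
and $a_{jk}$ is the $k$th element of $a_j^H$.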
This process modifies all labels in the synaptic map for user $i$ that appear in the synaptic map of ad $j$. As a result, the synaptic map of ad $j$ is imprinted faintly on user $i$.
In this equation, the vector $|\alpha\rangle$ represents the learning rate. Its numerical values depend on the type of action taken by the user (explicit rating, ignored push, view, click-through, etc.). If the action is positive (the user thought the ad was relevant) then it is an additive equation. If the action is negative then it is a subtractive equation.
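One possible reading of the user update, sketched in Python. It assumes the products in the equation act label by label, and that the learning rate equals the constant c on every label present in the ad's hard map and zero elsewhere, which matches the statement that only labels appearing in the ad are modified. The function name, the lower bound of 0 on c, and the sample values are assumptions.

```python
from typing import Dict

SparseMap = Dict[str, float]  # label -> synaptic weight; absent labels are 0


def learn(user_soft: SparseMap, ad_hard: SparseMap,
          c: float, positive: bool) -> SparseMap:
    """Return a new soft map with the ad's hard map faintly imprinted on it."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("learning constant c is assumed to lie in [0, 1]")
    sign = 1.0 if positive else -1.0  # additive or subtractive form
    updated = dict(user_soft)
    for label, a_k in ad_hard.items():
        if a_k == 0:
            continue  # the rate is zero for labels the ad does not carry
        old = updated.get(label, 0.0)
        updated[label] = (1.0 - c) * old + sign * c * a_k
    return updated


# Hypothetical case: a click-through (positive action) with c = 0.2.
before = {"nike": 0.5, "opera": 0.3}
after = learn(before, {"nike": 1.0, "shoes": 1.0}, c=0.2, positive=True)
```

Labels outside the ad ("opera" here) are left untouched, while "shoes" enters the soft map at a small weight. The ad's soft map could be updated with the same function by swapping the roles of the two maps.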
In addition to modifying the user soft synaptic map, the ad soft synaptic map is also modified using a similar technique:
$$|a_j^S\rangle = |\hat{I} - \alpha\rangle \langle a_j^S| \pm |\alpha\rangle \langle u_i^H|$$
After each application of the synaptic learning algorithm, or in batch, the last phase is to examine any soft labels that are candidates for hardening or unhardening.
Any soft labels with synaptic weight $|w_k| > T_H$ are hardened to either +1 or −1 and represent a learned label. Any previously hardened labels with synaptic weight $|w_k| < T_L$ are unhardened and represent a forgotten but previously learned label.
Parameters $T_H$ and $T_L$ represent the promotion and demotion thresholds respectively. They are tuning parameters of the algorithm that indicate how quickly new labels are learned and forgotten. Fundamentally they dictate the system trade-off between prediction accuracy and prediction latency.
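The hardening phase might look like the sketch below. It assumes the weight compared against both thresholds is the soft weight, and it only examines labels that are present in the soft map, so explicitly entered hard labels with no soft counterpart are left alone. The function name and threshold values are illustrative.

```python
from typing import Dict

SparseMap = Dict[str, float]  # label -> synaptic weight; absent labels are 0


def harden(hard: SparseMap, soft: SparseMap,
           t_high: float, t_low: float) -> SparseMap:
    """Promote strong soft labels into the hard map and demote faded ones."""
    new_hard = dict(hard)
    for label, weight in soft.items():
        if abs(weight) > t_high:
            new_hard[label] = 1.0 if weight > 0 else -1.0  # learned label
        elif abs(weight) < t_low and new_hard.get(label, 0.0) != 0:
            del new_hard[label]  # forgotten but previously learned label
    return new_hard


# Hypothetical thresholds: promote above 0.8, demote below 0.1.
hard = {"golf": 1.0}
soft = {"nike": 0.9, "golf": 0.05, "opera": -0.85, "tea": 0.4}
result = harden(hard, soft, t_high=0.8, t_low=0.1)
```

Raising `t_high` makes the system slower to commit to a new label, and raising `t_low` makes it quicker to forget one, which is the accuracy versus latency trade-off the thresholds control.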
Enabling Learning via Staged Delivery
It is not valuable to deliver ads to all users immediately as it does not give the system a chance to learn about the relevance of the ad. Therefore the system will build in staged delivery concepts as follows:
Steps 1-3 will repeat so long as the ad is deemed relevant to users.
For example, an ad could be delivered to the top 1 percent of users with the highest resonance. Based on these users' receptivity to the ad (as measured by click-through rates, page-views, etc.), the ad could then be rolled out to a larger percentage of users.
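The 1 percent example could be sketched as below. The growth factor, the minimum click-through rate, and both function names are assumptions chosen for illustration; the description does not fix how receptivity is scored or how quickly the audience widens.

```python
from typing import Dict, List, Optional


def select_top(scores: Dict[str, float], percent: int) -> List[str]:
    """Return the `percent` percent of users with the highest resonance (at least one)."""
    count = max(1, len(scores) * percent // 100)
    ranked = sorted(scores, key=lambda user: scores[user], reverse=True)
    return ranked[:count]


def next_percent(current: int, click_through_rate: float,
                 min_rate: float = 0.02, growth: int = 5) -> Optional[int]:
    """Widen the audience if the ad was well received, otherwise stop (None)."""
    if click_through_rate < min_rate:
        return None
    return min(100, current * growth)


# Hypothetical resonance scores for 100 users.
scores = {f"user{i}": i / 100 for i in range(100)}
first_wave = select_top(scores, 1)
second_wave = select_top(scores, 5)
```

A caller would deliver to `first_wave`, measure the response, and call `next_percent` to decide whether another, larger wave is warranted.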
Semantic Equivalencies
Second-order analysis can be performed on Label Synaptic Maps to derive semantic equivalents. This will be done to improve the usability and intelligence of the engine. For example, if “house” and “music” are very highly correlated, a semantic equivalent to “house music” will automatically be generated.
Description of Lottery System
Specific Description of How Lottery System Awards Tickets
Tickets earned for different actions may be entered into different incentive prize draws. The implication is that the Lottery system must have a way of differentiating tickets awarded for different actions. One method of differentiating tickets is to use a taxonomic system that generates unique identifiers for each ticket. For example, a ticket number may consist of a string of numbers and/or letters that encode information such as: unique user identification number, the date on which tickets were awarded, type of action user was engaged in when the ticket was awarded, etc.
For example, the ticket number may look like “012345-20060708-154”, where “012345” is the unique user identification number, “20060708” is the date on which the ticket was awarded, and “154” is the code for the action the user was engaged in to earn the ticket.
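A small sketch of encoding and decoding a ticket number in this form. It assumes the three fields are separated by hyphens and the date is written as year, month, day; the `Ticket` type and function names are hypothetical.

```python
from datetime import date, datetime
from typing import NamedTuple


class Ticket(NamedTuple):
    user_id: str       # unique user identification number
    awarded_on: date   # date on which the ticket was awarded
    action_code: str   # action the user was engaged in to earn the ticket


def format_ticket(ticket: Ticket) -> str:
    """Encode a ticket as user-date-action, with the date as YYYYMMDD."""
    return f"{ticket.user_id}-{ticket.awarded_on:%Y%m%d}-{ticket.action_code}"


def parse_ticket(number: str) -> Ticket:
    """Decode a ticket number; raises ValueError if it is malformed."""
    user_id, ymd, action_code = number.split("-")
    return Ticket(user_id, datetime.strptime(ymd, "%Y%m%d").date(), action_code)


ticket = parse_ticket("012345-20060708-154")
```

Keeping the user identifier and action code as strings preserves leading zeros such as the one in “012345”.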
Description of How Prizes are Awarded
Prize draws may follow the standard format where a winning ticket is randomly selected from all of the eligible tickets for the draw. For example, only the tickets awarded for engaging in a certain action may be eligible for a particular prize draw.
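A draw restricted to one action could be sketched as follows, assuming ticket numbers end with a hyphen followed by the action code. The function name and the sample tickets are illustrative; a seeded `random.Random` is used here only to keep the example repeatable, and a production draw would more plausibly use `random.SystemRandom`.

```python
import random
from typing import List, Optional


def draw_winner(tickets: List[str], eligible_action: str,
                rng: random.Random) -> Optional[str]:
    """Pick one ticket uniformly at random among those earned by the given action."""
    eligible = [t for t in tickets if t.rsplit("-", 1)[-1] == eligible_action]
    if not eligible:
        return None  # nobody earned a ticket for this action
    return rng.choice(eligible)


# Hypothetical pool: two tickets for action 154 and one for action 200.
tickets = [
    "012345-20060708-154",
    "012345-20060709-200",
    "067890-20060708-154",
]
winner = draw_winner(tickets, "154", random.Random(0))
```

Because tickets do not expire, the same pool can simply be passed to every later draw for which the action remains eligible.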
If a user possesses a winning ticket, he or she may be contacted to verify a shipping address for delivery of a prize.
Advantages of Lottery System Coupled with Relevance Engine
When the user submits his or her address information for the purposes of claiming a prize, this allows the system to verify that the user's address information is valid and correct (it is well known that users often submit false addresses in order to conceal their identities or remain anonymous; this can reduce the effectiveness of other reward systems).
As a method of incenting users to take particular actions, users may be rewarded with lottery tickets. Multiple tickets can be awarded for each action. Actions that provide rewards will be determined dynamically by the relevance engine, based on actions it would like a user to take.
Dealing with Stale Data
In addition to incenting users to use the system, the lottery can also be used for the purpose of incenting users to take actions that positively impact the relevance engine. For example, if a user has not recently reviewed her User Synaptic Map, the system may incent her by offering a large number of lottery tickets.
To deal with stale or inaccurate data, the system will employ self-correcting algorithms that periodically scan for bad or unknown data (ads, users, etc.). To better qualify that bad or unknown data, the system will entice users to provide feedback via lottery ticket offers attached to actions.
Flow Rate
In the proposed system, the rate at which ads are delivered to the user is fully under user control. This flow rate is controlled by a digital valve that can be adjusted by the user to match his or her preferences. The flow rate is also a learned quantity that the system can fine-tune in response to a user's change in behavior. For example, the user could set the delivery of content to three times a week instead of twice a week.
Tweaking the Learning Rate
It is proposed that the system learn about the user by assembling sets of Synaptic Maps. Much as with human beings, however, the rate at which effective neuron connections are strengthened or weakened (forgotten) depends very much on how quickly an individual user consumes ads.
For example, a user that consumes 10 ads every week should learn faster and forget faster than a user that consumes 2 ads every week. Therefore, in our invention, the learning rate is proportional to the flow rate of ads to the user.
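As a sketch of this proportionality, the learning constant could be derived from the weekly flow rate. The proportionality constant `k`, the cap of 1.0, and the function name are assumptions made for illustration.

```python
def learning_rate(ads_per_week: float, k: float = 0.02, cap: float = 1.0) -> float:
    """Learning constant proportional to the ad flow rate, capped at `cap`."""
    if ads_per_week < 0:
        raise ValueError("flow rate cannot be negative")
    return min(cap, k * ads_per_week)


fast = learning_rate(10)  # user consuming 10 ads per week
slow = learning_rate(2)   # user consuming 2 ads per week
```

Under these assumed values the 10-ad user learns and forgets five times as fast as the 2-ad user, until the cap is reached.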
Consider User A that has just begun to use the system. As a starting point, she enters the following labels to describe her preferences in a User Synaptic Map.
Assuming the system has already developed a set of Label Synaptic Maps, User A's User Synaptic Map would be extended by resonance with a set of Label Synaptic Maps.
This would create a virtual User Synaptic Map that looks as
Notice that the system understands how to extend a user's map by leveraging a label map. In this example, the system has made a second order inference that User A is interested in purses due to her liking of Louis Vuitton. In addition, the system has a weaker third order inference that she may have an interest in Paris, France.
Over time t, the user receives N ads that resonate with her preferences. Of the N ads, she responds (clicks) to M of them. Of those M responses, m are strongly favorable and (M − m) are not favorable.
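One way such an extension might work is sketched below. It assumes inferred weights are obtained by multiplying weights along a path through the label map, so each further step is weaker, and that inferences below a floor are discarded. The propagation rule, the function name, the path from purses to Paris, and all numeric weights are assumptions, not taken from the description.

```python
from typing import Dict

SparseMap = Dict[str, float]
LabelMap = Dict[str, Dict[str, float]]  # label -> related label -> relation strength


def extend(user_map: SparseMap, label_map: LabelMap,
           depth: int = 2, floor: float = 0.1) -> SparseMap:
    """Add labels reachable through the label map, weakening with each step."""
    extended = dict(user_map)
    frontier = dict(user_map)
    for _ in range(depth):
        discovered: SparseMap = {}
        for label, weight in frontier.items():
            for neighbor, strength in label_map.get(label, {}).items():
                inferred = weight * strength
                if neighbor in extended or abs(inferred) < floor:
                    continue  # keep explicit labels and drop faint inferences
                if abs(inferred) > abs(discovered.get(neighbor, 0.0)):
                    discovered[neighbor] = inferred
        extended.update(discovered)
        frontier = discovered
    return extended


# Hypothetical label map and starting preference for User A.
label_map = {"Louis Vuitton": {"purses": 0.8}, "purses": {"Paris": 0.5}}
virtual = extend({"Louis Vuitton": 1.0}, label_map)
```

Here "purses" is inferred at 0.8 and "Paris" at the weaker 0.4, reflecting the idea that higher-order inferences carry less weight.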
This implies that:
Let Δt represent the time interval over which a single response occurs. Then, after each Δt, the user's synaptic map is modified via resonance with the ad responded to within that interval. In this example, after N ads, User A's synaptic map becomes as shown in
Notice that the synaptic weights between User A's original map shown in
The process of resonance with the N ads has also added new labels based on common occurrences in viewed ads. Also, some negative labels have appeared based on labels contained in unfavorable responses.
The initial ad map shown in
Notice that the map has evolved to include the “exotic cars” classification and has reduced the strength of the weighting to the “sports cars” label. This reflects the fact that the ad is better classified under the “exotic cars” label than the “sports cars” label.
Consider an evolved Ad Synaptic Map after resonance with a large set of responders as shown in
For example, we can say that “Porsche GT” and “exotic cars” have a similarity of 0.91×0.5=0.455 for this ad.
To derive a Label Synaptic Map over N ads, this can be generalized as follows:
$$S(w_i, w_j) = \frac{1}{N} \sum_{N} w_i \, w_j, \quad N = \{\text{Users} \;\forall\; w_i \neq 0\}$$
That is, the similarity of two labels is given by the function $S$, which indicates the strength of the similarity in the sample set. The default sample set is all users where either $w_i$ or $w_j$ is non-zero. Note that the exact form of the similarity function can be tweaked depending on use. A single form of the equation is shown here.
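The averaging can be sketched as below, taking the sample set to be every map in which either label has a non-zero weight, as the text describes. The function name and the second sample map are assumptions; the first sample map reuses the 0.91 and 0.5 weights from the single-ad example.

```python
from typing import Dict, List

SparseMap = Dict[str, float]  # label -> synaptic weight; absent labels are 0


def label_similarity(maps: List[SparseMap], label_i: str, label_j: str) -> float:
    """Average product of two label weights over maps where either is non-zero."""
    sample = [m for m in maps
              if m.get(label_i, 0.0) != 0 or m.get(label_j, 0.0) != 0]
    if not sample:
        return 0.0  # assumption: labels never seen are treated as unrelated
    total = sum(m.get(label_i, 0.0) * m.get(label_j, 0.0) for m in sample)
    return total / len(sample)


# One map carries both labels; a second, hypothetical map carries only one.
maps = [
    {"Porsche GT": 0.91, "exotic cars": 0.5},
    {"Porsche GT": 1.0},
    {"purses": 0.7},
]
single = label_similarity(maps[:1], "Porsche GT", "exotic cars")
overall = label_similarity(maps, "Porsche GT", "exotic cars")
```

Maps that mention neither label are excluded from the sample, so unrelated maps do not dilute the result, while maps that mention only one of the two labels pull the similarity toward zero.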
Instead of similarities, we can also derive a Label Synaptic Map of label correlations. This is done by using User Synaptic Maps, instead of Ad Synaptic Maps as above.
Consider the example shown in
To derive a Label Synaptic Map over N users, this can be generalized as follows:
$$C(w_i, w_j) = \frac{1}{N} \sum_{N} w_i \, w_j, \quad N = \{\text{Users} \;\forall\; w_i \neq 0 \text{ or } w_j \neq 0\}$$
That is, the correlation of two labels is given by the function $C$, which indicates the strength of the correlation over the sample set. The default sample set is all users where either $w_i$ or $w_j$ is non-zero. Note that the exact form of the correlation function can be tweaked depending on use. A single form of the equation is shown here.
Beyond email, the system can deliver ads over any communication channel so a user could direct ads via RSS, mobile SMS/MMS, SIP, voice channels, or other content distribution channels. For example, the user could direct all ads to their RSS reader or mobile phone.
The system could also intelligently decide when to route messages to different mediums by learning about user behavior or incorporating user presence from mobile networks, instant messaging networks, calendars, phone activity, or other sources of presence information. For location-sensitive ads, the system could leverage location (GPS) services to intelligently route ads of interest to users in specific areas. For example, if the system knows that User X has a strong interest in clothes from Store Y, then the system could direct ads from Store Y via SMS/MMS to that user when in the vicinity of Store Y.
Although the description above contains many specifics, these should not be construed as limiting the scope of the invention but as merely providing illustrations of some of the presently preferred embodiments of this invention. Thus the scope of the invention should be determined by the appended claims and their legal equivalents.