The present invention relates generally to advertising, more specifically to the optimization of an advertisement delivery plan for allocating advertisements to a contract in a network-based environment.
Since the widespread acceptance of the Internet, advertising as a source of revenue has proven to be both effective and lucrative. Advertising on the Internet provides the possibility of allowing advertisers to cost-effectively reach highly specific target audiences as opposed to traditional print and “hard copy” advertising media that reach only broadly definable target audiences (e.g. radio listeners in the greater Memphis area).
The very nature of the Internet facilitates a two-way flow of information between users and advertisers and allows these transactions to be conducted in real time or near-to-real time. For example, a user may request an ad and may intentionally, or inherently, transmit various pieces of data describing himself or herself. Additionally, an advertising management system may be able to intelligently determine which ads to place on a given website requesting advertisement content, thus increasing the revenue for the parties involved and increasing user satisfaction by eliminating “nuisance” ads.
Current systems, however, fail to fully exploit the interactive aspects of the Internet in the advertising realm. Most current advertising systems need to coordinate a number of components such as components for forecasting web traffic, for procuring ad placements based on target demographics, and for delivering display ads. Within this architecture, each component relies on the cooperative and reliable performance of the others. Unfortunately, current advertising systems are decoupled. A decoupled system results in a number of inconsistencies with respect to contracts for the promised placement and delivery of advertisements. Even just a slight overestimation of future web traffic may jeopardize an advertising system's ability to deliver the advertisements promised. Likewise, an underestimation of future web traffic hurts advertisers and publishers alike because of lost opportunities for ad placements.
Current systems create a strict and artificial separation between display inventory of available advertisement placements that is sold many months in advance in a guaranteed fashion (e.g. guaranteed delivery), and display inventory that is sold using a real-time auction in a market or through other means (e.g. non-guaranteed delivery). For instance, the Yahoo!® advertising system may serve guaranteed contracts their desired quota before serving non-guaranteed contracts, creating a possibly unnecessary and also possibly inefficient bias toward guaranteed contracts. While acceptable in the past, the shift in the advertising industry demands an efficient mix of guaranteed and non-guaranteed contracts.
Another flaw with the decoupled advertising system is the failure to take advantage of the stores of information available when pricing contracts and allocating advertisements to advertisement placements. For example, the current pricing systems use advertising information and contract information at a granularity and specificity that is coarse and untargeted. The failure to mine and use information regarding how advertisement placements may be allocated at a more granular level creates a gap between the price paid for an advertisement placement and the actual value that a contract derives from the advertisements placed.
This failure leads to the inability of legacy systems to provide more refined and targeted advertisements. To the contrary, increased refinement in targeting allows advertisers to reach a more relevant customer base. The frustration of advertisers moving from broad targeting parameters (e.g. “1 million Yahoo! Finance users) to more fine-grained parameters (e.g. “100,000 Yahoo! Finance users from September 2008-December 2008 who are California males between the ages of 20-35 working in the healthcare industry”) is evident. Unfortunately, the increased refinement and targeting is not computationally pragmatic within the context of legacy system designs.
Accordingly, there exists a need for a more unified marketplace for the optimization of an advertisement plan and allocation of advertisements to a contract in a network-based environment.
A computer-implemented Internet advertising method and advertising server network for serving impression opportunities in a system for delivery of display advertising. A probability mass (a numeric value) models the likelihood that a booked contract could be served by a future forecasted user visit. The probability mass is calculated using a collection of booked contracts, and a sample of forecasted user visits; a resulting probability mass is associated with each of the booked contracts. The relative sizes of the probability masses of a plurality of eligible contracts is used a selector in conjunction with a generated pseudo random number. In exemplary embodiments, a server is configured for receiving an event predicate as a result of a user visit to a web site. Based on the received event predicate, a set of eligible contracts is assembled. The eligible contracts are assigned to exactly one interval selected from a range, the size of the interval corresponding to the probability mass of the eligible contract. A generated pseudo-random number is used for selecting a single interval corresponding to an eligible contract, and the advertisement of the selected eligible contract is served. Thus, eligible contracts are served to user visits probabilistically.
The novel features of the invention are set forth in the appended claims. However, for purpose of explanation, several embodiments of the invention are set forth in the following figures.
In the following description, numerous details are set forth for purpose of explanation. However, one of ordinary skill in the art will realize that the invention may be practiced without the use of these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to not obscure the description of the invention with unnecessary detail.
Guaranteed delivery display advertising is a form of online advertising whereby advertisers can buy a fixed number of targeted user visits in advance, and publishers guarantee these user visits. In case the guarantee is not met, the publisher incurs some penalty (monetary or otherwise), so it is in the best interest of the publisher to try and meet the guarantees. For example, a sports shoe manufacturer (an advertiser) can buy one hundred million user visits for Males in California who visit Yahoo! Sports between 1 Jun. 2010 and 15 Jun. 2010, and Yahoo! (as a publisher) will guarantee these user visits even though the duration of interest occurs several months later than the current date. The guaranteed delivery model of online display advertising is prevalent among the major advertising networks such as Yahoo!, AOL, and MSN, and represents a multi-billion dollar industry.
Therefore, multiple competing advertisers might elect to bid in a market via an exchange auction engine server 107 in order to win the most prominent spot, or an advertiser might enter into a contract (e.g. with the internet property, or with an advertising agency, or with an advertising network, etc) to purchase the desired spots for some time duration (e.g. all top spots in all impressions of the web page empirestate.com/hotels for all of 2010). Such an arrangement, and variants as used herein, is termed a contract.
In embodiments of the systems within environment 100, components of the additional content server perform processing such that, given an advertisement opportunity (e.g. an impression opportunity profile predicate), processing determines which (if any) contract(s) match the advertisement opportunity. In some embodiments, the environment 100 might host a variety of modules to serve management and control operations (e.g. an objective optimization module 110, a forecasting module 111, a data gathering and statistics module 112, an advertisement serving module 113, an automated bidding management module 114, an admission control and pricing module 115, a probability mass assignment module 116, a probability mass serving policy module 117, etc) pertinent to serving advertisements to users, including serving ads under guaranteed delivery terms and conditions. In particular, the modules, network links, algorithms, assignment techniques, serving policies, and data structures embodied within the environment 100 might be specialized so as to perform a particular function or group of functions reliably while observing capacity and performance requirements. For example, an additional content server 108, possibly in conjunction with a probability mass serving policy module 117 might be employed to implement a mass-based approach for serving impressions in guaranteed delivery advertising.
In a guaranteed delivery setting the publisher faces two major problems. The first is that of accurate booking—the publisher's goal is to sell all of its inventory to guaranteed contracts. The leftover inventory is typically sold on a non-guaranteed marketplace and fetches much lower prices. The second is that of accurate serving—given a set of booked contracts and a visit by a user, decide which of the eligible contracts to show to the user so that all of the contracts are satisfied.
A simple approach addressing these problems might be worded as follows: At the time of booking, solve an allocation problem using forecasted user visits and existing booked contracts to see if the addition of a new booked contract is still feasible; if so, admit the new contract, else reject it. Similarly, at the time of serving, solve the same allocation problem using current and predicted future user visits and booked contracts, and serve the ad corresponding to the contract allocated to the current user visit. In some cases, this approach may be impractical because it may not be practical to obtain and optimize for all future visits at the time of serving. Note that if the ad server serves using a different—say, greedy—serving policy instead of solving the allocation problem, then it will under-deliver because it may not be able to find the feasible solution as was found in the earier timeframe by the booking system.
Another approach might be to solve the allocation problem at the time of booking, and then send the solution to the ad server so that it can simply “follow” the solution. For instance, the solution produced by the booking system might look like:
The ad server can then simply follow the solution and serve the ad corresponding to Contract 1 to the first visit of User A, and so on. However, this approach may not be feasible because (a) the solution is extremely large given tens of billions of impressions processed per day and, (b) while it is possible to predict the overall distribution of user visits, it is impossible to reliably predict which specific users are going to visit at a specific time.
Given the limitations of the aforeto discussed simple approaches, the question that arises becomes: Is there a way to leverage the time, resources, and approximate long-term forecast information available at booking time to produce a compact and generalizable plan for the ad server that can be used in real-time to serve ads for actual user visits? The key requirements here are compactness, which ensures that the information can be meaningfully stored in the ad server, generalizability, which ensures that decisions made on approximate forecast information translate to meaningful actions on real user visits, and real-time execution, which requirement demands that the ad server can make reliable decisions (and within a span of time on the order of hundreds of milliseconds).
Let I be the set of user visits and J be the set of contracts. Then denote a user visit using subscript i and denote a contract using subscript j. Each user visit i can be represented as a collection (e.g. set, vector) of attribute-value pairs (e.g. target predicates) that include the properties of the user, the properties of the web pages the user is visiting, and the time of the visit. For instance, a visit by a Male user from New York interested in travel and visiting a Sports page on 31 Jan. 2010 at 10 pm might have a target predicate represented as: Gender=Male, Location=New York, Category=Sports, Interests=Travel, Time=31 Jan. 2010 10 pm. Similarly, each contract j can represented as one or more Boolean expressions characterizing the user visit attributes (e.g. event predicates). For instance, a contract that targets Males visiting Sports pages in the month of January might have an event predicate represented as: Gender e {Male}̂Categoryε{Sports}̂Durationε[1 Jan. 2010, 31 Jan. 2010]. In addition, each contract j further specifies its demand, i.e. the number of user visits that are guaranteed to be shown, which is denoted by dj. A plurality of contracts might be represented in an inverted index such that one or more contracts might be retrieved via the index using one or more target predicates.
In more formal terms, one might say that a user visit iεI is eligible for contract jεJ if (and only if) it satisfies the target predicates of j; it can also sometimes be said that j is eligible for i in this case. Thus, a bipartite eligibility graph can be constructed.
The left-hand vertices (depicted as circles) consist of I (i.e. a supply of impressions); the right-hand vertices (depicted as rectangles) consist of J (i.e. demand from contracts). The edge-set, E, consists of edges (i, j) such that i is eligible for contract j. The set of user visits eligible for contract j is denoted by E(j). Likewise, the set of contracts eligible for i is denoted by E(i). Note that the eligibility graph shows the target predicates set annotated beside the contracts.
In an exemplary allocation problem, a publisher may be associated with a set of booked contracts, and the publisher may posess information about future user visits, which forecast might be obtained from a forecasting module 111. One possible allocation problem goal can be described as: Find an allocation of user visits to contracts such that every user visit is allocated to at most one contract, and each contract j is allocated to at least d1 impressions. Let xε{0,1}E denote the allocation; set xij=1 to mean that the ad associated with contract j is shown for the impression i, and xij=0 otherwise.
The publisher may have some objective function, H:{0,1}E→R, over the set of feasible allocations. Such an objective function generally relates the goals of revenue, advertiser satisfaction, and user happiness, though other objective functions are reasonable and envisioned.
Thus, the allocation problem may be formally written as:
However, the allocation problem itself presents many difficulties. A bipartite eligibility graph 300 corresponding to commercially reasonable characteristics might include billions of user visits (e.g. impression opportunities 350), and tens of thousands of contracts, resulting in trillions of edges in the bipartite eligibility graph 300.
One way to make the problem more tractable is to reduce the size of the overall problem by sampling from the set of user visits. For example, a sampling might be comprised from a uniform sample of, for example, 10% of user visits, and scale each of the demands appropriately (in this example dividing them by a factor of 10). Although a sampling may not be a perfect representation of the whole set sampled, the resulting problem is smaller by an order of magnitude and, thus, might be easier to solve (especially for a small bipartite eligibility graph). However, even after sampling, the bipartite eligibility graph might still include many hundreds of thousands of edges, and the solution might become long, and might involve significant computing cycles.
A second complementary way of reducing the solution-time problem is to relax the integrality constraint, replacing it with the more flexible constraint, 0≦xij>1 thus expressing xij as a probability of allocating i to j. Then in the allocation problem, the demand constraint holds in expectation.
A Chernoff bound may be used in this randomized algorithm to determine a bound on the number of runs necessary to determine a value by majority agreement—up to a specified probability. Since the typical demand is on the order of hundreds of thousands of impressions, an application of Chernoff bounds proves that any integral realization of the fractional solution will violate the demand by at most 1% with high probability. Using the above two techniques, the reduced allocation problem is usually solvable in practice (e.g. using one or more modules within an additional content server 108).
In the booking problem, the publisher has a set of already booked contracts and certain statistical predictions as well as other information about future user visits. In this booking problem, an advertiser wishes to book a new contract j′ targeting a specific subset of users, iεE(j′). The goal of the publisher is to find the maximum amount of inventory that they can allocate to the new contract. That is, the publisher needs to solve the following variant of the allocation problem:
The above booking problem may be expressed as a bi-partite graph network flow, and thus solved quickly using modern computing techniques.
In the serving problem, the publisher wishes to implement a series of decisions that inplement a feasible solution to the allocation problem. That is, as each user visit occurs, the publisher (or agent for the publisher) must make an immediate and irrevocable decision as to which contract to serve. The goal is to make the series of serving decisions such that, at least approximately, the planned allocation is achieved. Of course, the challenge here lies in the dearth of information and lack of resources available at serving time.
For example, a serving policy that precomputes an allocation for each user visit may underperform as it may be impossible to forecast exactly how many times a user will appear. Furthermore, a desired allocation plan should be general enough to be able to handle new users that have never been part of the system before (and thus not considered in earlier forecasting).
One possible solution to the serving problem is to run an offline optimization to produce an allocation plan, which can then be interpreted by an advertisement serving module 113.
One way to generate an allocation plan is to observe an objective function for meeting guarantees of the guaranteed delivery contracts. In some embodiments, the essence of an allocation plan resides in a single number for each contract, called its mass.
When the ad server processes a user visit, it first finds the set of contracts eligible for the user visit. It then probabilistically allocates the user visit to one of the eligible contracts, where the probability of allocating the user visit to a contract is proportional to the mass of the contract. That is, if the user visit is eligible for k contracts with masses m1 . . . mk, then the user visit is allocated to contract j with probability mj/Σimi,
The method 400 then probabilistically allocates the user visit to one of the eligible contracts, where the probability of allocating the user visit to a contract is proportional to the mass of the contract (see operation 430).
In further detail, and following earlier disclosure, every contract is assigned a mass. A mass may be represented as a single positive number. At serving time, when a user visit arrives, first find the event predicates (see operation 410) and then find the set of eligible contracts (see operation 420). Then allocate the user visit to one of these contracts at random, with probability proportional to the contract's mass. That is, if the user visit eligible to be served to k contracts with masses m1 . . . mk, then the user visit is allocated to contract j with probability mj/Σimi. In some cases there might exist more supply than can be consumed by the set of eligible guaranteed contracts. In such a case, an artificial contract (a ‘ghost contract’) can be added to the set of eligible contracts, the ghost contract serving as a proxy for a set of non-guaranteed contracts. Thus, when the ad server allocates a visit to this ghost contract, it in effect allocates the user visit to a non-guaranteed contract.
Now described is an iterative algorithm to calculate the masses that are then assigned to contracts. Initially, construct the left side of a graph similar to the form of the bipartite eligibility graph 300, and construct the right side of a graph similar to the form of the bipartite eligibility graph 300. For each contract on the right side of the graph, initialize the mass of each contract to equal 1. Simulate what the delivery to each contract would be (in expectation) if each user visit appearing in the linked eligibility graph is served, based on the then current masses. In particular, for any setting of the masses, {circumflex over (m)}, define delivery j({circumflex over (m)}) to be the expected delivery to contract j. That is,
where for each i, Mi=Σj′εE
For each contract j, increase its mass in proportion to its demand divided by its expected delivery, delivery j. If a contract j is underdelivering, then iterate the simulation and update until all demands are satisfied. In some embodiments, the demand may be padded by a padding value ε to ensure better convergence. The pseudo-code is given in Algorithm 1 below.
By virtue of its stopping condition, Algorithm 1 (below) is guaranteed to produce an allocation plan that ensures every contract meets its demand (in expectation), so long as the algorithm actually stops. In fact, it can be shown that it is guaranteed to converge, so long as the demands of all contracts (padded by (1+2ε)) are feasible. In practice, the demand of contracts can be trimmed somewhat to ensure feasibility.
Then a series of iterations commences (see operation 550 through operation 570) for simulating a series of supply events, and iteratively calculating a probability mass for each of the booked contracts, wherein said calculating includes a simulation of a series of supply events. More specifically, the iterations may include simulating the delivery of the user visits to each contract based on the current probability masses, which probability masses generally vary over the iterations (see operation 550). Then, based on the demand divided by expected delivery (as described supra), increase the probability mass (see operation 560). The method as shown continues the iterations until all demands are satisfied (see operation 570).
Any node of the network 1200 may comprise a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof capable to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices (e.g. a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration, etc).
In alternative embodiments, a node may comprise a machine in the form of a virtual machine (VM), a virtual server, a virtual client, a virtual desktop, a virtual volume, a network router, a network switch, a network bridge, a personal digital assistant (PDA), a cellular telephone, a web appliance, or any machine capable of executing a sequence of instructions that specify actions to be taken by that machine. Any node of the network may communicate cooperatively with another node on the network. In some embodiments, any node of the network may communicate cooperatively with every other node of the network. Further, any node or group of nodes on the network may comprise one or more computer systems (e.g. a client computer system, a server computer system) and/or may comprise one or more embedded computer systems, a massively parallel computer system, and/or a cloud computer system.
The computer system 1250 includes a processor 1208 (e.g. a processor core, a microprocessor, a computing device, etc), a main memory 1210 and a static memory 1212, which communicate with each other via a bus 1214. The machine 1250 may further include a display unit 1216 that may comprise a touch-screen, or a liquid crystal display (LCD), or a light emitting diode (LED) display, or a cathode ray tube (CRT). As shown, the computer system 1250 also includes a human input/output (I/O) device 1218 (e.g. a keyboard, an alphanumeric keypad, etc), a pointing device 1220 (e.g. a mouse, a touch screen, etc), a drive unit 1222 (e.g. a disk drive unit, a CD/DVD drive, a tangible computer readable removable media drive, an SSD storage device, etc), a signal generation device 1228 (e.g. a speaker, an audio output, etc), and a network interface device 1230 (e.g. an Ethernet interface, a wired network interface, a wireless network interface, a propagated signal interface, etc).
The drive unit 1222 includes a machine-readable medium 1224 on which is stored a set of instructions (i.e. software, firmware, middleware, etc) 1226 embodying any one, or all, of the methodologies described above. The set of instructions 1226 is also shown to reside, completely or at least partially, within the main memory 1210 and/or within the processor 1208. The set of instructions 1226 may further be transmitted or received via the network interface device 1230 over the network bus 1214.
It is to be understood that embodiments of this invention may be used as, or to support, a set of instructions executed upon some form of processing core (such as the CPU of a computer) or otherwise implemented or realized upon or within a machine- or computer-readable medium. A machine-readable medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g. a computer). For example, a machine-readable medium includes read-only memory (ROM); random access memory (RAM); magnetic disk storage media; optical storage media; flash memory devices; electrical, optical or acoustical or any other type of media suitable for storing information.
While the invention has been described with reference to numerous specific details, one of ordinary skill in the art will recognize that the invention can be embodied in other specific forms without departing from the spirit of the invention. Thus, one of ordinary skill in the art would understand that the invention is not to be limited by the foregoing illustrative details, but rather is to be defined by the appended claims.