This invention relates to computer systems and data processing. In particular, methods and apparatus are provided for efficiently indexing large quantities of data.
Computer systems and services that serve electronic content usually index the content using indices that are specific to the type of content being served. Therefore, an index for a web server will be designed and optimized to locate and serve web pages (e.g., .html files), an index for an ad server will be optimized to select and serve advertisements (e.g., images), a file server may be optimized to locate and serve documents, and so on. An index designed for one type of content cannot be used with other types of content.
Specialized indices generally do not permit simultaneous reading and writing. Therefore, whenever the index must be maintained (e.g., to add or remove an entry), the entire index may be temporarily locked or taken offline, and cannot be used to locate and serve content during the maintenance. Depending on how long the index is locked for writing, searches may be delayed for unacceptable periods of time.
Yet further, scanning or searching a specialized index can be relatively slow when entries in the index contain some or all the semantic content (e.g., in textual format) that must be read, parsed and compared with some target parameters in order to determine which index entries are relevant. In particular, an index entry representing a web page, an advertisement, a news story or other type of item may contain textual content of, or metadata regarding, the item. When a query is executed against the index, the textual content or metadata must be compared with the query, which can involve execution of a relatively slow pattern-matching algorithm for comparing text.
Even further, a specialized index for serving just one type of content may not be optimized to find the best or most valuable content first. Instead, such indices may be organized such that the entire index may need to be searched in order to ensure that the most valuable content is located.
In some embodiments of the invention, methods and apparatus are provided for efficiently indexing content to be served to users via an electronic system (e.g., an online service). The service may comprise a social networking service, a web server, a portal and/or some other type of service, and the content may be of multiple types (e.g., advertisements, résumés, status updates, job listings).
In these embodiments, the index is composed of multiple “slices,” each of which is formatted to contain multiple index entries, with each entry corresponding to an item of content. One slice may undergo maintenance (e.g., to add a new entry, to change or remove an entry) at the same time other slices continue being read and used to identify or select content to be served.
An entry for a particular content item contains a list, array or other collection of integer values representing attributes or characteristics of the content item. Each unique integer maps to a unique name/value pair comprising an attribute (e.g., age, gender, location) and a corresponding value (e.g., 21-25, female, Southern California).
Similarly, the target attributes or characteristics of a query or request for content are formatted as integers. Thus, when a query is to be applied to the index, its integer values can be quickly compared with integer values of index entries. Content items corresponding to matching index entries can then be ranked to reduce the number of results, if necessary, and the winning content items served.
In some embodiments of the invention, within each entry, the integers representing the corresponding content item's attributes are ordered so that the integers representing the most distinguishing attributes (or the attributes least likely to match) are scanned first when a query is applied. Thus, if a given entry's content item does not match the terms of a query, the mismatch will be detected quickly and the query can jump to the next entry.
Further, within a slice, index entries may be ordered according to the values of their corresponding content items. In some implementations, the value of a content item reflects the revenue earned (or estimated to be earned) by the system when it serves the content item, the observed or estimated performance of the content item (e.g., how frequently users act upon the content item), and/or other measures of effectiveness.
The following description is presented to enable any person skilled in the art to make and use the invention. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the scope of the present invention. Thus, the present invention is not intended to be limited to the embodiments shown.
In some embodiments of the invention, methods and apparatus are provided for indexing electronic content. The content may be any type of electronic data formatted for presentation via a browser, application program or other user interface. The indexed content may include complete compositions presented individually, such as web pages, documents or videos, or may be components that can be presented as part of a web page or other composition, such as advertisements, job listings, notifications, status updates, news, documents, sports information, images, videos and so on. In short, electronic content items indexed in embodiments of the invention may include any type of content that can be presented to a user on a communications or computing device.
Because an index provided herein can accommodate multiple types of content, separate indices are not necessary for each type. Embodiments of the invention may be implemented as part of virtually any online service that serves data, whether it is a social network service, a web server, a portal site, a search engine, etc. The content may be indexed on a set of one or more computer systems, and may be presented to users operating portable and/or stationary devices.
In systems that serve electronic content for presentation to users, some information about a target user to whom a content item is to be presented is often provided as part of a query or content request. For example, when a user of a social network service connects to the service's site, and navigates to a page of the site, a web browser or other display engine may generate a query to a data server operated by the site, to identify and/or obtain content to present to the user. The content query may include or be accompanied by one or more attributes or characteristics of the user (e.g., sex, age, location, employment status).
Similarly, content items that have been stored and that are to be served to users of electronic services and applications may have associated attributes that identify target audiences of the content items. For example, an advertisement designed to promote sales of a particular product, or a job listing regarding a new job opening, may be received with information identifying types of users to whom the item should be presented (e.g., sex, age, location, employment status).
In response to a query or request for content to be served to a particular user, the data server searches for appropriate content, by comparing known attributes of the user to recorded attributes of the target audiences of the stored content items. Some number of appropriate content items are identified and delivered for presentation to the user.
In some embodiments of the invention, attributes used to characterize content items, and/or to characterize a target audience of the content items, are stored in an index as integers (or integer tokens) instead of as text. For example, one attribute that may be used to select content items for serving to a target user is age, and a provider of a content item may specify the age (or a range of ages) of people to whom the content item is targeted. When the content item is indexed, within the content's entry in the index a particular integer is stored to indicate that age (or age range).
Thus, if the content item is an advertisement targeted at people between the ages of 21 and 25, the age attribute may be stored as a first integer, such as “2045”. For another content item targeted at people between the ages of 46 and 50, the age attribute may be stored as a second integer, such as “8749.” Another integer, such as “8” may represent an “intent” attribute having the value “job seeker” and may be stored within entries corresponding to content items that the providers want to have presented to people looking for a job.
In these embodiments, a content item's entry in an index contains any number of integers representing name/value pairs of specific attributes and values for those attributes. In some implementations, each unique attribute/value pair maps to a unique integer, and vice versa, meaning that a given integer within an index entry corresponds only to one specific attribute having one specific value.
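By way of illustration only, the one-to-one mapping between attribute/value pairs and integers might be sketched as follows. The class name `AttributeCodec`, its methods, and the policy of assigning sequential integers are assumptions made for this example; the description does not prescribe how integers are chosen.

```python
from typing import Dict, Tuple

AttributePair = Tuple[str, str]  # (attribute name, value), e.g. ("age", "21-25")


class AttributeCodec:
    """Hypothetical two-way table between attribute/value pairs and integers."""

    def __init__(self) -> None:
        self._to_int: Dict[AttributePair, int] = {}
        self._to_pair: Dict[int, AttributePair] = {}

    def encode(self, name: str, value: str) -> int:
        """Return the integer for a pair, assigning the next free one if new."""
        pair = (name, value)
        if pair not in self._to_int:
            # Sequential assignment is an arbitrary policy chosen for the sketch.
            token = len(self._to_int) + 1
            self._to_int[pair] = token
            self._to_pair[token] = pair
        return self._to_int[pair]

    def decode(self, token: int) -> AttributePair:
        """Return the attribute/value pair that a given integer stands for."""
        return self._to_pair[token]


codec = AttributeCodec()
age_token = codec.encode("age", "21-25")
intent_token = codec.encode("intent", "job seeker")
```

Because each pair receives exactly one integer and each integer decodes to exactly one pair, the same table can serve both indexing (pairs to integers) and reporting (integers back to pairs).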
In other embodiments of the invention, however, a given integer may be unique to a particular type of content (e.g., advertisement, subscription plan, status update, news feed). In these embodiments, the meaning of a particular integer stored within a content item's entry will depend upon the type of content item, and the same integer found within index entries for two content items of different types may or may not map to the same attribute/value pair.
In some embodiments of the invention, an integer may represent multiple attributes and their associated values. For example, an integer such as “24” may represent an age in the range of 26 to 30, combined with a geographic location attribute having the value “Northern California.” Also, an integer may represent a negative or “not” value. For example, the integer “19452” may correspond to an industry of targeted users, with the value “not farming,” in which case an entry with this integer would match a query for content items that target industries other than farming.
By using an integer-based scheme for coding attributes provided in an embodiment of the invention described herein, index entries can be searched very rapidly to find content items appropriate for a target user having particular values for those attributes. When the user's characteristics are received (e.g., his age, his geographic region, his status within an online service), they are converted into corresponding integers (if not received as such), and some or all index entries can be rapidly scanned for matching integers.
Because the search mainly or only involves comparing integers—a set of integers representing attribute/value pairs of a target user, against collections of integers found within entries of the index and representing attribute/value pairs of the corresponding content items—it may be done without the overhead associated with a semantic-laden textual search. In other words, comparing integer values against each other is inherently simpler and faster than parsing and searching text for a particular pattern of text characters.
Within memory 102, index 104 comprises multiple slices 110 (i.e., slices 110a-110m) each storing a subset of all index entries of index 104. Each entry 114 within a slice (e.g., entries 114b-1 to 114b-n of slice 110b), when populated, corresponds to one content item. The content items may be stored on the same device that memory 102 is part of, or on one or more different devices.
Although each slice 110 is portrayed as having the same number of entries 114 in
In some implementations, a new content item (or some portion of the content item) is hashed in some manner to identify which slice the item will be stored in. Content items of the same type (e.g., advertisements, job listings) may map to the same slice or different slices, and similarly, content items relating to the same thing (e.g., a product, a service, a person, a company) may map to the same slice or different slices.
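A minimal sketch of such slice selection is given below. The function name `slice_for`, the use of a content item identifier as the hashed portion, and the choice of SHA-256 are all assumptions for illustration; any stable hash could be substituted.

```python
import hashlib


def slice_for(content_id: str, num_slices: int) -> int:
    """Map a content item identifier to a slice number in [0, num_slices)."""
    digest = hashlib.sha256(content_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce modulo
    # the number of slices so the result is stable across processes.
    return int.from_bytes(digest[:8], "big") % num_slices
```

A cryptographic digest is used here rather than Python's built-in `hash()` because the latter is randomized per process for strings, which would send the same item to different slices on different machines.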
Each slice includes metadata 112 (e.g., metadata 112b for slice 110b). As shown in the expanded view, metadata 112a of slice 110a comprises slice identifier 150a, lock 152a and status 154a. Identifier 150a uniquely identifies slice 110a, while lock 152a is used by reader and writer entities to lock the slice as needed.
Status 154a of metadata 112a of slice 110a provides status information, which may include an indication as to whether the slice is online (can be searched) or offline (not available for new searches), may indicate or identify entries that are free (or not free), may identify a number of entries in the slice, etc. Illustratively, a slice may be taken offline while it is being maintained (as described below). In implementations in which populated entries are packed toward the front of a slice, status 154a may identify the first free entry in slice 110a, and this pointer or reference would be updated as new entries are stored and old ones emptied.
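For illustration, slice metadata of the kind just described (an identifier, a lock and status information) might be modeled as shown below. The names `SliceMetadata`, `Slice`, `online`, `first_free` and `add_entry` are hypothetical, and the sketch assumes the implementation in which populated entries are packed toward the front of the slice.

```python
import threading
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SliceMetadata:
    slice_id: int                      # uniquely identifies the slice
    lock: threading.Lock = field(default_factory=threading.Lock)
    online: bool = True                # False while the slice is unavailable
    first_free: int = 0                # position of the first free entry


@dataclass
class Slice:
    metadata: SliceMetadata
    entries: List[Optional[List[int]]]  # None marks an unpopulated entry

    def add_entry(self, tokens: List[int]) -> int:
        """Store an entry at the first free position and advance the pointer."""
        with self.metadata.lock:
            pos = self.metadata.first_free
            if pos >= len(self.entries):
                raise IndexError("slice is full")
            self.entries[pos] = tokens
            self.metadata.first_free = pos + 1
            return pos
```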
As shown in illustrative entry 114m-1 of slice 110m, and described above, an entry may comprise a collection of integers representing attributes of the target audience of the content item corresponding to the entry. Therefore, each integer listed in the array of integers of entry 114m-1 may map to a unique attribute/value pair.
In an illustrative embodiment of the invention, index 104 may have approximately 10 slices, and each slice may store approximately 100,000 integers, the length of each of which may be 16 bits, 32 bits, 64 bits or some other length. For purposes of maintaining the index (described below), index 104 and memory 102 may include an extra or spare slice, or one of the illustrated slices may be used as a spare.
In these embodiments, similar to the embodiments reflected in
Entry 214a-1 includes entry metadata 250a-1, which stores information about the content item corresponding to entry 214a-1 and which is described further below. Metadata 250a-1 need not be stored at the front of the entry.
In embodiments of the invention reflected in
Each disjunction is composed of one or more terms (Xn) connected by the OR operator (∨), all the disjunctions are coupled by AND operators (∧), and there may be one or more disjunctions in the conjunction. Each disjunction is represented as a count field 260 and one or more integer fields 262. Count field 260 identifies the number of terms in a disjunction (i.e., the number of integers), and the corresponding integer fields store the integer terms of the disjunction.
The conjunction of the three disjunctions shown in entry 214a-1 may be represented textually as:
Replacing the integers with illustrative attribute/value pairs they may represent, this conjunction of disjunctions could correspond to a content item whose target audience includes people who:
Terms within a single disjunction need not correspond to the same attribute. For example, the disjunction (59 OR 74) may instead map to attributes/values such as marital status=married [59] OR education=bachelor's degree [74].
Any attribute or characteristic that can be known about a person or that can be used to describe a target of an item of content, along with its corresponding value for a specific person (e.g., a user of an online service) or a specific content item (e.g., a job posting for a software engineer), can be represented and stored as an integer. A database, table or other data structure for mapping attribute/value pairs to integers, and vice versa, may be maintained in the same memory as an index described herein, or may be stored elsewhere.
Together, populated count fields 260 and integer fields 262 of entry 214a-1 may be referred to as the targeting data or targeting section of the entry or of the corresponding content item, because they identify targets of the content item.
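One way to picture a targeting section built from count fields and integer fields is as a flat sequence of integers in which each disjunction is written as its term count followed by its terms. The sketch below assumes that layout; the function names and the sample integers are hypothetical and are not taken from any figure.

```python
from typing import List


def encode_targeting(disjunctions: List[List[int]]) -> List[int]:
    """Flatten a conjunction of disjunctions into [count, terms..., count, ...]."""
    flat: List[int] = []
    for terms in disjunctions:
        if not terms:
            raise ValueError("a disjunction needs at least one term")
        flat.append(len(terms))  # count field
        flat.extend(terms)       # integer fields
    return flat


def decode_targeting(flat: List[int]) -> List[List[int]]:
    """Recover the list of disjunctions from the flat layout."""
    disjunctions: List[List[int]] = []
    i = 0
    while i < len(flat):
        count = flat[i]
        disjunctions.append(flat[i + 1:i + 1 + count])
        i += 1 + count
    return disjunctions


# (59 OR 74) AND (8) AND (2045 OR 8749), using illustrative integers only.
section = encode_targeting([[59, 74], [8], [2045, 8749]])
```

Storing the count ahead of the terms lets a reader skip an entire disjunction without inspecting each term, which is convenient once one of its terms has already matched.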
The metadata section of an entry (e.g., entry metadata 250a-1 of entry 214a-1) may store information such as, but not limited to: an identifier of the content item corresponding to entry 214a-1, a storage location of the content item, a type of the content item (e.g., advertisement, job posting, video, status notification), a score (or rating, value or other measure of effectiveness) of the content item, the number of disjunctions in the entry, a budget for serving the content item (e.g., a budget for a period of time, remaining unused budget for a period of time), etc.
The score, rating or other measure of performance or effectiveness of a content item may indicate how often the item has been served, how successful it has been (e.g., how frequently users who receive the content item act upon it), its value to the system or service that serves the content items, etc. The score may encompass all servings of the item, meaning that it may indicate how many times users have acted on the item, over all those servings, without regard for different types of users (e.g., users having different attribute values).
Also, or alternatively, a score or rating corresponding to one or more types of user may be stored in metadata 250a-1 or elsewhere. For example, ratings of content items regarding individual users and/or groups of users sharing a common attribute may be stored in the same computer system or a different one.
Therefore the effectiveness of a given content item may be quickly determined, over all types of users to whom it has been served, and/or for users having specific attributes. Illustratively, measures of effectiveness may be maintained for all relevant integers—that is, for each integer (i.e., each attribute/value pair) that matches at least one user to whom the content item has been served, the performance of the content item may be tracked.
In some embodiments of the invention, content items' scores or measures of effectiveness may be used to rank or filter content items identified when a query is executed against index 204. Execution of a query may result in many (e.g., hundreds or thousands of) matching content items, but by considering the scores of those items (and/or other information), the top X (e.g., one, three) content items can be identified, which may be those that are most likely to elicit action on the part of the target user, based on historical performance of the items.
Some or all of the metadata in entry metadata 250a-1 may be represented as integers, as done with the targeting information of the rest of the entry. In particular, metadata that may need to be searched as part of a query execution (e.g., content item identifier, content item type, remaining budget for the content item, score) may be stored as integers. A particular integer may be unique across an entry's entry metadata field and integer fields, or integers used in entry metadata fields may be mapped to a different set of name/value pairs than integers within the entry's targeting section.
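Selecting the top X matching items by score can be sketched with a bounded heap selection, as below. The function name `top_items` and the representation of a match as an (identifier, score) pair are assumptions for the example.

```python
import heapq
from typing import List, Tuple


def top_items(matches: List[Tuple[str, float]], x: int) -> List[str]:
    """Return the identifiers of the x highest-scoring matches, best first."""
    best = heapq.nlargest(x, matches, key=lambda match: match[1])
    return [item_id for item_id, _score in best]
```

For small X relative to the number of matches, this avoids sorting the full result set.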
Also shown in
Index 204 and queue(s) 220, and possibly other data structures (e.g., one or more tables mapping integers to attribute/value pairs) may reside within one computer system's memory, thereby promoting rapid operation of the index. These structures may be replicated across multiple different computer systems, however, to provide distributed processing, load-balancing, redundancy and/or other benefits.
In some embodiments of the invention, additional optimizations may be employed for an index, beyond the use of integers to represent attribute/value pairs. One such optimization involves sorting the integer tokens of an entry so that those representing the most distinguishing or narrowest attributes, or those representing the attributes least likely to match a query, are listed (and scanned) first.
With this optimization, when a content query or request for content is received, and integers representing a target audience are used to search the index for matching entries, the scan of each entry will begin with integers that are least likely to match the query. Therefore, as soon as a comparison fails, because the index entry contains an integer (attribute/value pair) that conflicts with those of the query, the scan of that entry can be aborted and the search can move to the next entry.
In other words, if it is determined that a particular set of attributes provide the most effective or fastest targeting, integers representing those attributes may be positioned within index entries such that they are the first ones scanned when searching for content items to serve to users. Integers representing attributes that are likely to match many queries (e.g., language=English) may be positioned toward the end of index entries, because they are least likely to provide a meaningful differentiation between different content items and are more likely to match many queries.
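As a hedged sketch of this ordering, the disjunctions of an entry could be sorted by an estimate of how often each one matches a query, least likely first. The function name `order_by_selectivity`, the `match_rate` table of per-integer match frequencies, and the use of a capped sum as the estimate are all assumptions; the description does not specify how likelihood of matching is measured.

```python
from typing import Dict, List


def order_by_selectivity(
    disjunctions: List[List[int]], match_rate: Dict[int, float]
) -> List[List[int]]:
    """Place the disjunctions least likely to match a query first."""

    def estimate(terms: List[int]) -> float:
        # Upper bound on the chance that any term matches. Integers with no
        # recorded rate are treated as always matching, so they sort last.
        return min(1.0, sum(match_rate.get(t, 1.0) for t in terms))

    return sorted(disjunctions, key=estimate)
```

With a rate table such as `{1: 0.9, 2: 0.05, 3: 0.2, 4: 0.1}`, where integer 1 stands for a broad attribute like language=English, the entry `[[1], [3, 4], [2]]` would be reordered so that `[2]` is scanned first and `[1]` last.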
In embodiments of the invention in which a content item's attributes are stored as a conjunction of disjunctions (e.g., as shown in
The manner in which entries' targeting data (i.e., integer tokens) are sorted may depend on the type of content item represented by the entries. In particular, for one type of content (e.g., advertisements), one collection of attributes may be particularly distinguishing and less likely to match a future query, and therefore may be placed at the front of the targeting section. For another type of content (e.g., job announcements), a different collection of attributes may be more distinguishing, and so on. The type of content represented by an entry may be identified in the entry's metadata field and/or elsewhere.
Another optimization that may be applied in some embodiments of the invention involves sorting index entries within a slice. In particular, entries may be sorted according to the values of their corresponding content items to the system or service that serves the content items. For example, for content items that consist of advertisements or other sponsored content that the system is paid to serve (e.g., job listings, company announcements, status updates), index entries for those items that yield the most revenue (or that are estimated to yield the most revenue) may be positioned earlier in the index than other entries.
In some embodiments of the invention, a search of an index slice may by default terminate after a predetermined period of time (e.g., 25 milliseconds), even if not all entries in the slice have been searched. Only content items corresponding to matching entries found during that time will be considered for serving. This allows the system to identify and serve content items quickly, with less latency from the time the requests for content items are received. By organizing the index entries within a slice according to the value of the content items, the system can also promote high revenue.
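A time-bounded scan of this kind might look like the following sketch, which assumes the entries of the slice are already sorted with the most valuable first. The names `scan_slice`, `budget_s` and `clock` are hypothetical; the clock is passed in as a parameter only so that the behavior can be exercised deterministically.

```python
import time
from typing import Callable, List, Set, Tuple

Entry = Tuple[str, List[List[int]]]  # (content id, targeting disjunctions)


def scan_slice(
    entries: List[Entry],
    query: Set[int],
    budget_s: float = 0.025,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Return ids of matching entries found before the time budget runs out."""
    deadline = clock() + budget_s
    found: List[str] = []
    for content_id, disjunctions in entries:
        if clock() >= deadline:
            break  # unscanned entries are simply not considered
        # An entry matches when every disjunction has a term in the query.
        if all(any(t in query for t in terms) for terms in disjunctions):
            found.append(content_id)
    return found
```

Because the scan proceeds in stored order, cutting it short discards the least valuable candidates rather than an arbitrary subset.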
In operation 302, updates to the index are received and queued. Illustratively, the computer system memory in which the index is implemented may host one or more queues for storing the updates. An update to the index may be a new index entry to be stored for a newly received content item, a modification to the target attributes of an existing index entry (e.g., because the provider of the corresponding content item redefined the target audience), a directive or criteria for sorting the index entries within a slice, and/or other changes to the index.
In operation 304, a maintenance cycle commences. In some implementations, a new cycle may commence on a regular basis (e.g., every several seconds, every couple of minutes). The more frequently the maintenance cycles commence, the less time each cycle takes to execute and the shorter any disruption to the content serving process. For example, if a new cycle is initiated every seven seconds, an entire cycle may last on the order of one second (or less).
In operation 306, one slice of the index is marked or tagged as being offline. Illustratively, a flag in a metadata portion of the slice (e.g., in status field 154a of metadata 112a of slice 110a of index 104 of
In operation 308, the contents of the slice are copied to a spare slice (e.g., with a point-in-time copy operation). In some implementations, an index comprises some number of active slices (e.g., ten) and one or more spare slices for assisting with maintenance of the index.
In operation 310, updates that affect the current slice are retrieved from the queue of updates and applied to the copy. For updates that modify existing entries (e.g., to change targeting, to update a content item's score or measure of effectiveness, to adjust an item's budget), the affected entries are located and changed accordingly. Entries that are to be removed (e.g., because the corresponding content items have been purged) are cleared.
Some updates may comprise new entries to be stored in the slice. In some implementations, new entries may be received in the queue fully formed, that is, ready to be written to the slice. In other implementations, some processing may be required to generate an entry from the update that requires creation of a new entry.
For example, a new content item or information regarding a new content item may be received, including a definition of its target audience. The targeting data may be expressed textually and need to be converted into corresponding integers for storage in the entry's targeting section, and pieces of metadata (e.g., daily budget, score, content item identifier) may need to be converted into integers, if appropriate, for storage in the entry's metadata field.
Other types of updates may involve sorting or ordering the entries of the slice, and/or ordering or re-ordering the contents of one or more entries' targeting sections. In some implementations, the slice's entries may be automatically sorted (according to default or specified criteria) at the end of operation 310, to pack them, place the most valuable at the head of the slice, etc.
In operation 312, the slice copy is brought online, with its identity set to match that of the slice taken offline in operation 306. The slice that was formerly a spare, and that received the copied contents of the slice being maintained, thus takes the place of the offline slice.
After the slice is brought online, the maintenance cycle may pause (e.g., for one or two times the average length of time needed to read the slice), so that any reader entities that were reading the slice that was taken offline have time to complete their operations. The slice that was taken offline in operation 306 may be used as the spare slice for the maintenance of the next slice of the index, and so it is beneficial to ensure that the readers have completed their searches before the offline slice is used to maintain another slice.
In operation 314, it is determined whether all slices have been maintained during the present maintenance cycle. If so, the illustrated method ends. Otherwise, the method returns to operation 306 to select the next slice to be maintained.
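The copy, update and swap steps of a maintenance cycle can be sketched as below. This is a simplified illustration under stated assumptions: slices are modeled as dictionaries keyed by a content identifier, updates are modeled as callables that mutate a slice copy, and the offline flag and the pause for in-flight readers are omitted. The names `maintain`, `Entries` and `Update` are hypothetical.

```python
import copy
from typing import Callable, Dict, List

Entries = Dict[str, List[int]]        # content id -> integer tokens
Update = Callable[[Entries], None]    # a queued change applied to a slice copy


def maintain(index: Dict[int, Entries], queued: Dict[int, List[Update]]) -> None:
    """Run one maintenance cycle over every slice of the index."""
    for slice_id in list(index):
        updates = queued.pop(slice_id, [])
        working = copy.deepcopy(index[slice_id])  # copy the slice to a spare
        for apply_update in updates:              # apply the queued updates
            apply_update(working)
        index[slice_id] = working                 # bring the copy online
```

Readers that obtained a reference to the old slice before the swap continue to see its unmodified contents, which is the property that allows reads to proceed while a slice is being maintained.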
It may be noted that updates to the index may be continually received and queued, throughout its operation and maintenance. Operation 302 of
In some implementations, when an update is received at the content-serving system that employs the index, the content item associated with the update (or the content item being delivered with the update) is hashed to identify the slice in which it should be stored. When a content item is modified by an update, this may or may not cause it to be moved to a different slice (e.g., by placing an appropriate entry in the queue of updates). In other implementations, a content item's corresponding entry in the index persists in the same slice from the time it is first indexed until it is removed.
Content items that may be served in the illustrated embodiments of the invention include content that can be presented to a user electronically, via a portable or stationary communication or computing device (e.g., smart phone, tablet computer, laptop, desktop computer), within a browser or other program. Illustrative types of content items that may be indexed for serving include, but are not limited to: advertisements, subscription offers (e.g., subscriptions to enhanced access to a system or service), status updates (e.g., regarding individuals, products, companies, other organizations), job postings or listings, résumés, user profiles or components of user profiles, etc.
In embodiments of the invention reflected in
Each campaign has a target audience, which may be identified using attributes and corresponding values. Illustrative attributes that a source of a campaign or content item may use to define its target(s) may include, but are not limited to: age, gender, industry in which a person works, employer, education level, education institution(s), geographic area of residence, geographic area of work, job title, job description, rank, seniority, years of employment, income, marital status, job status (e.g., unemployed, searching), skills, achievements, qualifications, licenses, membership in organizations, religion, political affiliation, and many more.
As described above, these attribute/value pairs are converted into unique integers that are used within campaign queries (or requests for campaigns) to identify a target user to whom a content item of the requested campaign will be presented. The integers are also used in index entries representing individual campaigns to represent the target audiences of those campaigns.
In operation 402, campaigns are received and indexed. An index described above, or a similar structure, may be used. The content items corresponding to the indexed campaigns may be stored on the same system or machine as the index, or on another system or machine.
The system or service executing the illustrated method may encompass or operate all computing devices associated with indexing the campaigns and serving content items in response to requests or, alternatively, just the devices involved in indexing the campaigns.
In operation 404, a request or query for one or more campaigns is received at the index system. For the purpose of describing the illustrated method of the invention, the terms “request” and “query” may be used interchangeably.
In some implementations, the system or service operating the index executes the illustrated method but does not maintain or serve the associated content items (i.e., content items associated with campaigns identified by the index system in response to queries). In these implementations, the index system simply receives a request for campaigns matching a specified set of attribute/value pairs, which may be expressed as integers or converted into integers, as described above, and returns zero or more campaigns (or identifiers of zero or more campaigns).
A request to the index system in these implementations may come from an aggregator, content server, web server or other entity that receives requests for content items to serve to users (e.g., from web browsers and application programs executing on user-operated devices). Although the index system operates separately from the content serving system, they may be managed or operated by a common entity. Hardware entities (e.g., computer systems) and/or software entities (e.g., computer program modules) of the index system and the content serving system may be completely separate or may overlap to some degree.
In some other implementations, the index system and the content serving system are co-located or one is part of the other. In these implementations, the combined system receives a request for some number of content items, from a website, portal, application, web browser, communication service provider or other entity that will receive responsive content items from the system and present them to a user.
For example, when a user connects to a social network website, a page of the website will begin loading on the user's device. As part of composing the page, some number (e.g., three) of content items is needed for presentation within the page, and so a request for three content items is issued to the combined system, and the combined system queries the index system to identify suitable campaigns. From the identified campaigns, three content items will be served by the combined system, for presentation to the target user.
In operation 406, attributes of a target user to whom the results of the request will be served are extracted from the query. If not already in integer form, the attributes are converted into their integer equivalents, using a mapping table or other data structure maintained by the index system or the entity that issues the request to the index system (e.g., a content serving system).
When a content request is received at a content serving system, or a combined content serving and index system, it may include one or more attributes of the target user to whom the served content items will be presented. The content request may also include information about the page, frame or other construct in which the content items will be presented. Such information may indicate a nature of the page or the website (e.g., social network service, search engine, a professional sports team's site, employment recruiter), characterizations of other content that will be presented in the page (e.g., job listings, news articles), and/or other attributes of the page or website in which the content items will be presented.
Some or all of the attributes of the target user and/or the environment in which the content items will be presented may be received in integer form or may be converted into integer form upon receipt of the request. Any or all of these integer representations of attribute/value pairs may be used to search the index.
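The conversion of attribute/value pairs into integer form may be illustrated with a minimal sketch. The class name AttributeTokenMap, its method names and the policy of assigning tokens sequentially on first use are assumptions made for illustration only; the mapping table of a given embodiment may be organized differently.

```python
from typing import Dict, List, Tuple

AttributeValue = Tuple[str, str]


class AttributeTokenMap:
    """Hypothetical two-way table between attribute/value pairs and integer tokens."""

    def __init__(self) -> None:
        self._to_token: Dict[AttributeValue, int] = {}
        self._to_pair: List[AttributeValue] = []

    def token_for(self, attribute: str, value: str) -> int:
        """Return the token for a pair, assigning the next free integer if it is new."""
        pair = (attribute, value)
        token = self._to_token.get(pair)
        if token is None:
            token = len(self._to_pair)  # assumption: tokens are assigned sequentially
            self._to_token[pair] = token
            self._to_pair.append(pair)
        return token

    def pair_for(self, token: int) -> AttributeValue:
        """Reverse lookup from an integer token to its attribute/value pair."""
        return self._to_pair[token]

    def encode(self, attributes: Dict[str, str]) -> List[int]:
        """Convert the attributes of a request into a sorted list of integer tokens."""
        return sorted(self.token_for(a, v) for a, v in attributes.items())
```

In such a sketch, a request carrying attributes of the target user and of the page could be passed through encode() once, so that the subsequent search of the index compares only integers.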
In operation 408, attributes of the query are supplied, preferably in integer form, to reader entities configured to read the index's slices. A reader entity may be a hardware or software module configured to search a slice's entries to find matches for queries. A reader may be tied to a specific slice, or a pool of readers may be maintained for use in reading any slice.
In operation 410, at each slice that is online, a timer is started and at least one reader begins scanning the slice to find entries having attributes that match those of the query. As each entry is searched for the query's attributes, as soon as an integer token is encountered that conflicts with the query's attributes, the search advances to the next entry.
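The early advance to the next entry may be sketched as follows. This is an illustration under stated assumptions, not a definitive reader implementation: each entry is assumed to be a sequence of integer tokens, and a token is assumed to conflict with the query when it is absent from the query's set of tokens. The function names entry_matches and scan_slice are hypothetical.

```python
from typing import FrozenSet, List, Sequence


def entry_matches(entry_tokens: Sequence[int], query_tokens: FrozenSet[int]) -> bool:
    """Return False as soon as one token conflicts with the query.

    Assumption: a token conflicts when it is not among the query's tokens.
    """
    for token in entry_tokens:
        if token not in query_tokens:
            return False  # stop reading this entry and advance to the next one
    return True


def scan_slice(entries: Sequence[Sequence[int]], query_tokens: FrozenSet[int]) -> List[int]:
    """Return the positions of the entries in a slice that match the query."""
    return [i for i, tokens in enumerate(entries) if entry_matches(tokens, query_tokens)]
```

Under these assumptions an entry with no tokens matches every query, and each comparison is an integer set-membership test rather than a textual pattern match.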
In operation 412, the search of each slice terminates when it reaches a threshold duration of time (e.g., 25 milliseconds), as long as it has identified a threshold number of entries (e.g., 10). If the threshold number of entries has not been identified by that time threshold, the search may continue until (a) all entries have been searched, (b) the threshold number of entries has been identified or (c) a second time threshold is reached. For each identified entry, the corresponding campaign and/or its storage location is identified (e.g., from the entry's metadata field).
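The termination conditions of operation 412 may be sketched as a single loop. The function name timed_search, the parameter names, the injectable clock and the 50 millisecond default for the second time threshold are assumptions made for illustration; the description above gives example values only for the first time threshold and the threshold number of entries.

```python
import time
from typing import Callable, FrozenSet, List, Sequence


def timed_search(
    entries: Sequence[Sequence[int]],
    query_tokens: FrozenSet[int],
    soft_limit_s: float = 0.025,   # first time threshold (e.g., 25 milliseconds)
    hard_limit_s: float = 0.050,   # second time threshold (assumed value)
    min_matches: int = 10,         # threshold number of entries
    clock: Callable[[], float] = time.monotonic,
) -> List[int]:
    """Scan one slice and return positions of matching entries."""
    start = clock()
    matches: List[int] = []
    for position, tokens in enumerate(entries):
        elapsed = clock() - start
        if elapsed >= hard_limit_s:
            break  # (c) the second time threshold is reached
        if elapsed >= soft_limit_s and len(matches) >= min_matches:
            break  # first threshold has passed and enough entries were identified
        if all(token in query_tokens for token in tokens):
            matches.append(position)
    return matches  # also reached when (a) all entries have been searched
```

Passing the clock as a parameter is a design choice of this sketch only; it allows the time thresholds to be exercised deterministically without waiting on a real timer.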
It may be noted that any slice that is offline at the time execution of the query against the index begins may be omitted from the search. Because only one slice will be offline at a time during a maintenance cycle, the number of campaigns not searched because of their slice being offline is kept to a minimum. Also, as described above, maintenance of individual slices is performed expeditiously, and any search that was initiated before the slice was marked offline will be able to complete normally.
In operation 414, the results of the searches of the slices are returned. The results may comprise a collection of campaigns (or identifiers of campaigns), and may be aggregated by the index system or by the entity that submitted the request to the index system.
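The omission of offline slices and the aggregation of per-slice results may be sketched as follows. The Entry and Slice classes, the online flag and the function search_index are hypothetical names introduced for illustration, and the campaign_id field merely stands in for the entry's metadata field.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple


@dataclass
class Entry:
    tokens: Tuple[int, ...]   # integer tokens of the entry's targeting data
    campaign_id: str          # stands in for the entry's metadata field


@dataclass
class Slice:
    entries: List[Entry] = field(default_factory=list)
    online: bool = True       # assumed flag cleared while the slice is maintained


def search_index(slices: List[Slice], query_tokens: FrozenSet[int]) -> List[str]:
    """Search every online slice and aggregate the matching campaign identifiers."""
    campaigns: List[str] = []
    for index_slice in slices:
        if not index_slice.online:
            continue  # a slice that is offline when the query begins is omitted
        for entry in index_slice.entries:
            if all(token in query_tokens for token in entry.tokens):
                campaigns.append(entry.campaign_id)
    return campaigns
```

This sketch scans the slices sequentially for simplicity; an embodiment in which a reader is tied to each slice could instead scan the online slices concurrently and merge their results.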
In some embodiments of the invention, results of a search are ranked or filtered to reduce the number of matches. For example, the results may be ranked based on scores of the campaigns (e.g., as noted in the corresponding entries' metadata), measures of effectiveness (e.g., over all users or over all users that match the target user's attributes), etc.
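Ranking by campaign score may be sketched as a simple top-N selection. The function name top_campaigns and the representation of each result as a pair of campaign identifier and score are assumptions for illustration; other ranking criteria, such as measures of effectiveness, could be substituted for the score.

```python
import heapq
from typing import List, Tuple


def top_campaigns(results: List[Tuple[str, float]], limit: int) -> List[str]:
    """Return up to `limit` campaign identifiers, highest score first."""
    best = heapq.nlargest(limit, results, key=lambda result: result[1])
    return [campaign_id for campaign_id, _score in best]
```

For instance, given results for campaigns "a", "b" and "c" with scores 0.2, 0.9 and 0.5, a limit of two would retain "b" and "c" in that order.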
Index system 500 of
Memory 504 stores one or more indices of electronic content; the content may be stored on and served from index system 500, or may be stored on and served from one or more other systems coupled to index system 500.
Storage 506 of the index system stores logic that may be loaded into memory 504 for execution by processor 502. Such logic includes attribute/integer conversion logic 522, reader logic 524, maintenance logic 526 and optional content items/campaigns 528. In other embodiments of the invention, any or all of these logic modules or other content may be combined or divided to aggregate or separate their functionality as desired.
Attribute/Integer conversion logic 522 comprises processor-executable instructions for mapping between attribute/value pairs (e.g., of a target audience, of a target user) and corresponding integer tokens. Logic 522 may include or be accompanied by one or more tables or indices for mapping between a given integer and its corresponding attribute/value pair.
Reader logic 524 comprises processor-executable instructions for searching index entries to find matches with a query. Logic 524 may therefore be designed to perform comparisons between integer tokens of a query and integer tokens stored in index entries.
Maintenance logic 526 comprises processor-executable instructions for updating and maintaining an index stored in memory 504. In particular, logic 526 will apply updates queued for the index, and maintain each of multiple slices in turn. As described previously, updating the index may involve removing an entry, updating the metadata and/or targeting data of an entry, and writing new entries into the index.
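Maintaining each slice in turn may be sketched as follows. This is a simplified illustration under stated assumptions: the Slice class, its online flag, the shape of a queued update (slice number, campaign identifier, and new tokens or None for removal) and the function maintain are all hypothetical, and the sketch does not model searches that are already in progress when a slice is marked offline.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Optional, Tuple

Tokens = Tuple[int, ...]
# Assumed update shape: (slice number, campaign id, new tokens or None to remove)
Update = Tuple[int, str, Optional[Tokens]]


@dataclass
class Slice:
    entries: Dict[str, Tokens] = field(default_factory=dict)  # campaign id -> tokens
    online: bool = True


def maintain(slices: List[Slice], queue: Deque[Update]) -> None:
    """Apply queued updates, taking only one slice offline at a time."""
    pending: Dict[int, List[Tuple[str, Optional[Tokens]]]] = {}
    while queue:
        slice_no, campaign_id, tokens = queue.popleft()
        pending.setdefault(slice_no, []).append((campaign_id, tokens))

    for slice_no, index_slice in enumerate(slices):
        index_slice.online = False          # new queries omit this slice
        try:
            for campaign_id, tokens in pending.get(slice_no, []):
                if tokens is None:
                    index_slice.entries.pop(campaign_id, None)   # remove an entry
                else:
                    index_slice.entries[campaign_id] = tokens    # add or update an entry
        finally:
            index_slice.online = True       # restore the slice before moving on
```

Because each slice is returned online before the next is taken offline, at most one slice is unavailable to queries at any point in the cycle.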
Content items and campaigns 528 include content items and/or campaign descriptions. These may be stored on a system coupled to index system 500 if they are not stored on system 500.
The environment in which some embodiments of the invention are executed may incorporate a general-purpose computer or a special-purpose device such as a hand-held computer or communication device. Details of such devices (e.g., processor, memory, data storage, display) may be omitted for the sake of clarity.
Data structures and code described in this detailed description are typically stored on a non-transitory computer-readable storage medium, which may be any device or medium that can store code and/or data for use by a computer system. Non-transitory computer-readable storage media include, but are not limited to, volatile memory, non-volatile memory, magnetic and optical storage devices such as disk drives, magnetic tape, CDs (compact discs), DVDs (digital versatile discs or digital video discs), or other non-transitory computer-readable media now known or later developed.
The methods and processes described in the detailed description can be embodied as code and/or data, which can be stored in a non-transitory computer-readable storage medium as described above. When a processor or computer system reads and executes the code and/or data stored on the medium, the processor or computer system performs the methods and processes embodied as data structures and code and stored within the medium.
Furthermore, the methods and processes described above can be included in hardware modules. For example, the hardware modules may include, but are not limited to, application-specific integrated circuit (ASIC) chips, field-programmable gate arrays (FPGAs) and other programmable-logic devices now known or later developed. When the hardware modules are activated, the hardware modules perform the methods and processes included within the hardware modules.
The foregoing descriptions of embodiments of the invention have been presented for purposes of illustration and description only. They are not intended to be exhaustive or to limit the invention to the forms disclosed. Accordingly, many modifications and variations will be apparent to practitioners skilled in the art. The scope of the invention is defined by the appended claims, not the preceding disclosure.