Ad and Offer Targeting with Couchbase

Selecting an ad to display or an offer to present on a web page is a choice with direct revenue impact, and one that must be made quickly to minimize page load times. To make a revenue-maximizing decision, targeting logic must consider the behavioral, demographic and psychographic characteristics known or inferred about the recipient, and weigh those against the current status of running campaigns, contract commitments and goals.

AOL Advertising runs one of the largest online ad serving operations, serving billions of impressions each month to hundreds of millions of people. A combination of real-time database technology and Big Data analytics helps them address the challenges inherent in operating an ad platform.

AOL faced three data management challenges when building their ad serving platform:

  • How to analyze billions of user-related events, presented as a mix of structured and unstructured data, to infer demographic, psychographic and behavioral characteristics that are encapsulated into hundreds of millions of user profiles
  • How to make hundreds of millions of user profiles available to their targeting platform with sub-millisecond latency on random reads
  • How to keep the user profiles fresh and current

The solution was to integrate two data management systems: one optimized for high-throughput data analysis (the “analytics” system), the other for low-latency random access (the “transactional” system). After alternatives were evaluated, the final architecture paired Hadoop with Couchbase.

In this architecture, (1) click-stream data and other events are fed into Hadoop from a wide variety of sources; (2) the data is analyzed using Hadoop MapReduce to generate hundreds of millions of user profiles and, based on which ad campaigns are running, selected user profiles are loaded into Couchbase; and (3) ad targeting logic queries Couchbase with sub-millisecond latency to get the data needed to make optimized decisions about real-time ad placement. The targeting system also writes to Couchbase after each decision is made, updating historical and statistical information that helps shape subsequent targeting decisions.
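Step (3) and the write-back that follows it can be sketched as a small read-decide-write loop. The sketch below is illustrative only: ProfileStore is a plain in-memory dictionary standing in for the key-value store, not the Couchbase SDK, and the key format ("user::<id>"), the profile fields ("segments", "impressions"), the Campaign type and the highest-bid rule are all assumptions made for the example rather than details of AOL's system.

```python
import json
from dataclasses import dataclass


class ProfileStore:
    """Hypothetical stand-in for a low-latency key-value store (not the Couchbase SDK)."""

    def __init__(self):
        self._docs = {}

    def get(self, key):
        # Return the stored JSON document as a dict, or None if the key is absent.
        raw = self._docs.get(key)
        return json.loads(raw) if raw is not None else None

    def upsert(self, key, doc):
        # Insert or overwrite the document stored under this key.
        self._docs[key] = json.dumps(doc)


@dataclass
class Campaign:
    campaign_id: str
    segment: str   # audience segment the campaign targets (assumed field)
    bid: float     # value of showing this campaign's ad (assumed field)


def choose_ad(store, user_id, campaigns):
    """Read the profile, pick the highest-bid eligible campaign, write back a counter."""
    key = f"user::{user_id}"
    profile = store.get(key)                      # random read by key
    if profile is None:
        return None                               # no profile loaded for this user
    segments = profile.get("segments", [])
    eligible = [c for c in campaigns if c.segment in segments]
    if not eligible:
        return None
    winner = max(eligible, key=lambda c: c.bid)   # simplistic decision rule (assumption)
    # Write back per-campaign impression counts to inform later decisions.
    shown = profile.setdefault("impressions", {})
    shown[winner.campaign_id] = shown.get(winner.campaign_id, 0) + 1
    store.upsert(key, profile)
    return winner.campaign_id


store = ProfileStore()
store.upsert("user::42", {"segments": ["sports", "travel"]})
campaigns = [
    Campaign("c1", "sports", 1.5),
    Campaign("c2", "travel", 2.0),
    Campaign("c3", "autos", 9.0),   # highest bid, but the user is not in this segment
]
choice = choose_ad(store, "42", campaigns)
```

A production targeting system would weigh far more signals (campaign pacing, contract commitments, frequency caps), but the shape of the interaction is the same: one keyed read, a decision, and one keyed write.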

A number of intrinsic characteristics and capabilities of Couchbase make it a strong fit for the real-time data management needs of modern ad and offer targeting systems:

  • Production proven in large-scale ad and offer targeting systems. Couchbase runs in production at organizations including Chango, ShareThis, Context Web, Delta Project’s Ad Action, Media Mind, Ad Scale and AOL Advertising.
  • Schema-free data model. No need to define (or redefine) a database schema before inserting data. Targeting algorithms and approaches can change rapidly and often require changes in input data.
  • Elastic scaling. Scales out to hold billions of data items for hundreds of millions of users, on commodity hardware or cloud computing instances.
  • Consistent sub-millisecond random read and write latency. Delivers this latency across the entire data set, supporting not only optimized decision making but also finely targeted personalization of ad and offer content within tight decision time windows.
  • Hadoop integration. Available Sqoop and Flume connectors provide “off the shelf” bi-directional connectivity between Hadoop and Couchbase.
  • Built-in transparent caching. Stores data supporting active campaigns in main memory for deterministic sub-millisecond latency, while automatically migrating data items not currently required to disk for lower-cost storage.
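The idea behind the last point, keeping active data in memory and moving inactive items to cheaper storage, can be illustrated with a toy two-tier store. This is a conceptual sketch under simple assumptions (a fixed item count as the memory limit, least-recently-used eviction, a second dictionary standing in for disk); TieredStore and its methods are hypothetical names, and the sketch does not describe how Couchbase implements its caching layer.

```python
from collections import OrderedDict


class TieredStore:
    """Toy memory-first store: hot items stay in memory, cold items move to a 'disk' tier."""

    def __init__(self, memory_capacity):
        self.memory_capacity = memory_capacity
        self.memory = OrderedDict()   # hot tier, ordered from least to most recently used
        self.disk = {}                # stand-in for slower, lower-cost storage

    def put(self, key, value):
        self.disk.pop(key, None)      # a fresh write supersedes any cold copy
        self.memory[key] = value
        self.memory.move_to_end(key)  # mark as most recently used
        self._evict()

    def get(self, key):
        if key in self.memory:        # fast path: served from memory
            self.memory.move_to_end(key)
            return self.memory[key]
        if key in self.disk:          # slow path: promote the item back into memory
            value = self.disk.pop(key)
            self.memory[key] = value
            self._evict()
            return value
        return None

    def _evict(self):
        # Move least recently used items to the disk tier until memory fits its limit.
        while len(self.memory) > self.memory_capacity:
            cold_key, cold_value = self.memory.popitem(last=False)
            self.disk[cold_key] = cold_value


tiers = TieredStore(memory_capacity=2)
tiers.put("user::1", {"segments": ["sports"]})
tiers.put("user::2", {"segments": ["travel"]})
tiers.put("user::3", {"segments": ["autos"]})   # pushes user::1 out to the disk tier
profile = tiers.get("user::1")                  # promoted back; user::2 becomes the cold item
```

The caller never chooses a tier: reads and writes use the same key-based interface, and placement is decided by recent access, which is what makes the caching transparent to the targeting logic.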