Membase-Cloudera Integration Joins Leading Hadoop Distribution and Real-Time NoSQL Database
Bi-directional connection between Membase and Cloudera’s Distribution for Hadoop is revolutionizing ad, offer and content targeting at AOL Advertising and ShareThis
New York – October 12, 2010 – Hadoop World – Membase, Inc. (now Couchbase) and Cloudera today announced they have executed a partnership agreement and completed an integration of Membase Server with Cloudera’s Distribution for Hadoop (CDH). At Hadoop World today, AOL Advertising and ShareThis will deliver a presentation outlining how this integration has accelerated and increased the effectiveness of their ad targeting and serving platforms.
- Membase Server is a simple, fast, elastic, distributed NoSQL database management system, optimized for low-latency, high-volume data access by web applications.
- CDH is the most comprehensive platform available for accelerating the deployment of Apache Hadoop.
- Ad (and other content) targeting systems must make complex decisions in a very small window of time – typically between 40 and 100 milliseconds.
- In consumer-facing web systems, many of these decisions are made in parallel.
- Minimizing input data load time leaves more time on the clock to make intelligent targeting decisions; with enough time, even complex real-time customization of ad content is possible.
- User, or cookie, profiles are standard input data to targeting systems.
“The integration of Membase Server and Cloudera’s Distribution for Hadoop dramatically increases the performance and effectiveness of ad targeting platforms like those in use at AOL and ShareThis, where sub-millisecond random access to a very large data set can lead to measurable increases in advertising effectiveness,” said James Phillips, co-founder and SVP of Products at Membase. “No other database system can maintain the low latency and high throughput characteristics of Membase. For a 2KB user profile, Membase can sustain mean random read and write latency of 300 microseconds with 99th percentile latency under 800 microseconds, and it can do it while scaling from a single node to a multi-hundred node cluster.”
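To make the time-budget argument concrete, the following is a minimal arithmetic sketch in Python. It uses only the figures quoted above (a 40 to 100 millisecond decision window, 300 microsecond mean latency, and 800 microseconds as the 99th percentile bound). The function names are hypothetical and chosen for illustration; nothing here is part of Membase or CDH.

```python
def remaining_budget_ms(window_ms, profile_read_us):
    """Time left for targeting logic after one profile read, in milliseconds."""
    return window_ms - profile_read_us / 1000.0


def budget_fraction_used(window_ms, profile_read_us):
    """Share of the decision window consumed by the profile read."""
    return (profile_read_us / 1000.0) / window_ms


# Figures quoted in the release: window of 40-100 ms, read latency of
# 300 microseconds (mean) and 800 microseconds (99th percentile bound).
for window_ms in (40, 100):
    for label, read_us in (("mean", 300), ("p99 bound", 800)):
        left = remaining_budget_ms(window_ms, read_us)
        used = budget_fraction_used(window_ms, read_us)
        print(f"window={window_ms} ms, {label}: "
              f"{left:.1f} ms left, {used:.2%} of window used")
```

Even in the tightest case considered here (a 40 millisecond window and an 800 microsecond read), a single profile read consumes about 2 percent of the window, which is the sense in which fast profile loading leaves more time for the targeting decision itself.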
CDH and Membase together provide the technology underpinnings to support ad, offer, and content targeting scenarios:
- User profiles are generated using CDH.
- A stream of events associated with a given cookie or user is fed to CDH from Membase and other sources.
- Scheduled MapReduce jobs are used to process and transform these event streams into user profiles, which are fed into Membase.
- Membase speeds delivery of the user profile data to the targeting logic, maximizing the amount of time the ad serving platform has for decision making and ad customization.
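The flow described in the list above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual integration: the map and reduce steps run in-process rather than on Hadoop, a plain dictionary stands in for Membase, and the event fields (`cookie`, `category`) and all function names are hypothetical.

```python
from collections import defaultdict


def map_event(event):
    """Map step: emit a (cookie_id, category) pair for one raw event."""
    yield event["cookie"], event["category"]


def reduce_profile(cookie_id, categories):
    """Reduce step: fold one cookie's categories into a profile of counts."""
    counts = defaultdict(int)
    for category in categories:
        counts[category] += 1
    return cookie_id, dict(counts)


def build_profiles(events):
    """Batch job: group mapped events by cookie, then reduce each group."""
    grouped = defaultdict(list)
    for event in events:
        for cookie_id, category in map_event(event):
            grouped[cookie_id].append(category)
    return dict(reduce_profile(k, v) for k, v in grouped.items())


def load_profiles(profiles, store):
    """Feed the generated profiles into the key-value store."""
    store.update(profiles)


def lookup_profile(store, cookie_id):
    """Serving path: a single key lookup at ad-decision time."""
    return store.get(cookie_id, {})


# Hypothetical event stream; a dict stands in for the key-value store.
events = [
    {"cookie": "c1", "category": "sports"},
    {"cookie": "c2", "category": "travel"},
    {"cookie": "c1", "category": "sports"},
    {"cookie": "c1", "category": "autos"},
]
profile_store = {}
load_profiles(build_profiles(events), profile_store)
print(lookup_profile(profile_store, "c1"))
```

The design point is the split between the two paths: the expensive aggregation runs as a scheduled batch job, while the serving path reduces to one key lookup per request.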
“AOL serves billions of impressions per day from our ad serving platforms, and any incremental improvement in processing time translates to huge benefits in our ability to more effectively serve the ads needed to meet our contractual commitments,” said Pero Subasic, Chief Architect, AOL. “Traditional databases lack the scalability required to support our goal of five milliseconds per read/write. Creating user profiles with Hadoop, then serving them from Membase, reduces profile read and write access to under a millisecond, leaving the bulk of the processing time budget for improved targeting and customization.”
“Integrating Membase Server with Cloudera’s Distribution for Hadoop adds complementary functionality that customers are interested in,” said Mike Olson, Cloudera CEO. “The result is a highly optimized data delivery system with virtually no lag time. This real-time processing capability is essential for any solution on which split-second decisions must be made, including ad targeting and social gaming.”
AOL Chief Architect Pero Subasic and ShareThis Architect Manu Mukerji will join Membase co-founder James Phillips at Hadoop World later today to present “Better ad, offer and content targeting using Membase with Hadoop.” The session will be held at 1:45pm in Sutton South at the Hilton Hotel, NYC.
Membase, Inc. (formerly NorthScale) is the company behind Membase, the simple, fast, elastic NoSQL database technology. The company provides products and services that enable customers to dramatically lower costs while simultaneously improving the scalability and performance of their interactive web applications. Membase is behind some of the world’s busiest web applications, including popular social games by Zynga, played by millions of users daily. It provides a shared data management platform for NHN, Korea’s largest web application operator with nearly 70 million unique users. Membase is also available through cloud service providers such as Heroku and RightScale, supporting thousands of applications of all sizes. Founded in 2009 and headquartered in Mountain View, Calif., Membase is a privately held company funded by Accel Partners, Mayfield Fund and North Bridge Venture Partners. www.membase.com
Cloudera (www.cloudera.com) is a leading provider of Hadoop-based software and services and works with customers in financial services, web, telecommunications, government and other industries. The company’s products, Cloudera Enterprise and Cloudera’s Distribution for Hadoop, help organizations profit from all of their information. Cloudera’s Distribution for Hadoop is the most comprehensive Apache Hadoop-based platform in the industry. Cloudera Enterprise is the most cost-effective way to perform large-scale data storage and analysis and includes the tools, platform and support necessary to use Hadoop in a production environment. Cloudera provides professional services, technical support and training to help any business use the software created by Google, Facebook and Yahoo!. Founded by pioneers in large-scale data and home of the original Apache Hadoop creator, Cloudera is a private company backed by venture investors Accel Partners and Greylock Partners with headquarters in Palo Alto, California.