GSOC 2013 Project Ideas


1. Couchbase Memory Allocator

Description: Couchbase currently uses the system allocator to allocate contiguous memory chunks, and tcmalloc on certain supported platforms. This works well for allocating many variable-sized objects but not very well for large objects. Internally, we don't need contiguous memory allocations, so this task is about replacing the use of the system allocator with a "block allocator" and enhancing tcmalloc to work in such scenarios.

Expected results: After this project is completed, Couchbase's memory allocator should be a block allocator.

Knowledge : C++
Difficulty : Hard
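
To illustrate the idea of storing a large object without one contiguous allocation, here is a minimal sketch of a fixed-size block allocator. It is written in Python for brevity, whereas the project itself targets C++. The names `BlockAllocator`, `store`, `load` and `release` are hypothetical and do not reflect Couchbase's or tcmalloc's actual design.

```python
class BlockAllocator:
    """Toy fixed-size block allocator (hypothetical illustration).

    An object is stored as a list of block indices that need not be
    adjacent in the arena, so no contiguous region is required.
    """

    def __init__(self, num_blocks, block_size=64):
        self.block_size = block_size
        self.arena = bytearray(num_blocks * block_size)
        # Stack of free block indices; pop() hands out the lowest index first.
        self.free = list(range(num_blocks - 1, -1, -1))

    def store(self, data):
        # Ceiling division; an empty object still occupies one block.
        needed = max(1, -(-len(data) // self.block_size))
        if needed > len(self.free):
            raise MemoryError("not enough free blocks")
        blocks = [self.free.pop() for _ in range(needed)]
        for i, block in enumerate(blocks):
            piece = data[i * self.block_size:(i + 1) * self.block_size]
            start = block * self.block_size
            self.arena[start:start + len(piece)] = piece
        return blocks, len(data)  # the handle: block list plus object length

    def load(self, handle):
        blocks, length = handle
        out = bytearray()
        for block in blocks:
            start = block * self.block_size
            out += self.arena[start:start + self.block_size]
        return bytes(out[:length])

    def release(self, handle):
        blocks, _ = handle
        self.free.extend(blocks)
```

After some objects have been released, a later large object can be assembled from whichever blocks happen to be free, which is the property the description asks for.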

3. In-Memory compression technique for Couchbase

Description: Couchbase currently stores keys, metadata and potentially document contents in memory. This project involves coming up with an efficient technique for compressing objects stored in memory.
Expected results: After this project is completed, data in-memory should be compressed efficiently.
Knowledge : C/C++
Difficulty : Medium
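
One possible starting point, shown here only as a sketch, is to compress a value when that actually saves space and to record the choice in a one-byte flag. The example uses Python's standard `zlib` module. The function names `pack` and `unpack`, the flag layout and the `min_size` threshold are assumptions made for illustration, not part of Couchbase.

```python
import zlib

RAW = b"\x00"         # flag: value stored as-is
COMPRESSED = b"\x01"  # flag: value stored zlib-compressed


def pack(value: bytes, min_size: int = 64) -> bytes:
    """Return a 1-byte flag followed by the value.

    The value is compressed only if it is at least min_size bytes long
    and the compressed form is smaller than the original.
    """
    if len(value) >= min_size:
        compressed = zlib.compress(value)
        if len(compressed) < len(value):
            return COMPRESSED + compressed
    return RAW + value


def unpack(stored: bytes) -> bytes:
    """Reverse pack(): inspect the flag and decompress if needed."""
    flag, body = stored[:1], stored[1:]
    return zlib.decompress(body) if flag == COMPRESSED else body
```

Small keys and metadata tend to gain little from general-purpose compression, which is why the sketch leaves short values untouched. A real design would also need to weigh CPU cost against memory saved.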

4. Integrate Google Breakpad

Description: Couchbase relies on core dumps being available on the system in order to send crash information back to engineering. In some deployments, users prefer not to enable core dumps, since dumping core for a binary with a 64GB memory footprint, for instance, takes a fair amount of time and disk space. In these circumstances it would be better to gather some simple diagnostics instead of knowing only that "the program crashed".

The goal of this project is to integrate Google Breakpad into the various components of Couchbase and to make its use configurable.
Expected results: All components of Couchbase may be configured to use Google Breakpad (including documentation on how to enable/disable this), and the crash information is submitted as part of the generated diagnostic sent back from Couchbase.
Knowledge: C/C++ and build systems
Difficulty : Easy

5. CBFS Blob Chunking

Description: CBFS (a.k.a. Couchbase Large Object Store) is built on top of Couchbase Server.
File content in CBFS is represented as a single sequence of bits. Large files require large blobs to move around during replication. Small changes to large blobs require full duplication of the common parts. Simple block-based chunking will make it a lot easier to move bits around and make appends (for example) cheaper.
Expected results: CBFS has file -> [blob, ...] instead of file -> blob
Knowledge: The student should be comfortable working in the Go programming language.
Difficulty : Medium
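
The file -> [blob, ...] mapping can be sketched as follows. This is an illustration in Python rather than Go, and it assumes fixed-size chunks keyed by their SHA-1 digest in a plain dictionary. The names `chunk_file` and `read_file` are hypothetical and are not CBFS functions.

```python
import hashlib


def chunk_file(data: bytes, chunk_size: int, store: dict) -> list:
    """Split data into fixed-size chunks and store each under its digest.

    Returns the ordered list of chunk keys (the file's manifest).
    Chunks already present in the store are not stored again.
    """
    manifest = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        key = hashlib.sha1(chunk).hexdigest()
        store.setdefault(key, chunk)
        manifest.append(key)
    return manifest


def read_file(manifest: list, store: dict) -> bytes:
    """Reassemble a file from its manifest."""
    return b"".join(store[key] for key in manifest)
```

With this layout, appending to a file whose length is a multiple of the chunk size adds only new chunks, and the existing ones are shared between the old and new manifests. Changes in the middle of a file would shift every later chunk, which fixed-size chunking does not handle well.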

6. Couchbase Drupal Adapter

Description: Write an adapter to run the Drupal platform on Couchbase.
Expected results: After this project is completed, Couchbase should be the primary backend platform for Drupal.
Knowledge: PHP
Difficulty: Medium

8. Couchbase Worker Queue

Description: An important part of a distributed application is a means of asynchronously processing work.  Many users have attempted to create their own work queues, but often in problematic ways.  By providing this as a service, users can build reliable, scalable applications significantly faster.

Expected results: Design a well-thought-out wire-level API for work management. Modify a client of your choice to implement it.
Knowledge: Any programming language.
Difficulty : Medium
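
As a way to think about the semantics such an API might need, the sketch below models a lease-based queue in memory: a worker leases a job for a limited time, and a job whose lease expires without an acknowledgement becomes available again. Everything here (`WorkQueue`, `enqueue`, `lease`, `ack`, the lease length) is a hypothetical design for discussion, not an existing Couchbase interface. The current time is passed in explicitly so the behaviour is easy to test.

```python
import itertools


class WorkQueue:
    """In-memory model of lease-based work queue semantics (hypothetical)."""

    def __init__(self, lease_seconds=30.0):
        self.lease_seconds = lease_seconds
        self._ids = itertools.count(1)
        self._ready = []     # job ids waiting to be leased, in FIFO order
        self._leased = {}    # job id -> time at which the lease expires
        self._payloads = {}  # job id -> payload

    def enqueue(self, payload):
        job_id = next(self._ids)
        self._payloads[job_id] = payload
        self._ready.append(job_id)
        return job_id

    def lease(self, now):
        """Hand out the next job, or None if nothing is ready."""
        # Put jobs with expired leases back on the ready list, so work held
        # by a crashed worker is not lost.
        for job_id, expiry in list(self._leased.items()):
            if expiry <= now:
                del self._leased[job_id]
                self._ready.append(job_id)
        if not self._ready:
            return None
        job_id = self._ready.pop(0)
        self._leased[job_id] = now + self.lease_seconds
        return job_id, self._payloads[job_id]

    def ack(self, job_id):
        """Mark a leased job as done; False if the job is not currently leased."""
        if self._leased.pop(job_id, None) is None:
            return False
        del self._payloads[job_id]
        return True
```

This gives at-least-once delivery: a slow worker and its replacement may both process the same job, so job handlers would need to be idempotent.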

9. Couchbase Big Data Testing

Description: Testing large volumes of data for correctness is a challenging task. Here are a few options to consider: (1) sampling smaller datasets and testing their validity, (2) building indexes on existing smaller datasets and verifying the data after the tests are run, (3) injecting a smaller dataset at the start of the test and tracing and monitoring it through to the end, (4) compressing or hashing data and comparing hash values at the end of large-scale testing, and (5) identifying similar patterns in data and identifying outliers. The challenge here is obtaining deterministic and predictable measures from this large volume of data with random sampling.

Expected results: Create measures/tools for finite and deterministic large data tests.
Knowledge: Any programming language.
Difficulty : Medium
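
Options (1) and (4) can be made deterministic by deriving both the sample and the checksum from hashes of the data itself. The sketch below is one way to do that, using only Python's standard `hashlib`. The function names and the XOR-based digest are illustrative assumptions, not an existing Couchbase tool.

```python
import hashlib


def record_hash(key: str, value: bytes) -> int:
    """Hash one key/value pair to a 256-bit integer."""
    digest = hashlib.sha256(key.encode("utf-8") + b"\x00" + value).digest()
    return int.from_bytes(digest, "big")


def dataset_digest(records) -> int:
    """Order-independent digest of a dataset: XOR of per-record hashes.

    Assumes keys are unique; two identical records would cancel out.
    """
    digest = 0
    for key, value in records:
        digest ^= record_hash(key, value)
    return digest


def in_sample(key: str, one_in: int) -> bool:
    """Select roughly 1/one_in of all keys, based only on the key.

    The same key is always in or out of the sample, on every run and
    on every node, so sampled checks are repeatable.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % one_in == 0
```

Because the digest does not depend on iteration order, it can be computed separately on each node or partition and the partial results combined with XOR, then compared before and after a large-scale test.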

10. DataMapper Adapter for Couchbase (Ruby)

Description: DataMapper is a good ORM framework. Currently it supports plenty of storage types but does not have one for Couchbase.

Expected results: Provide a Ruby gem that implements a Couchbase adapter for DataMapper, allowing people to integrate it more smoothly into their applications.

Additional resources:

Couchbase Ruby client, which implements the protocol and networking and also provides a lower-level API to Couchbase Server

Standalone experimental implementation of ORM

Knowledge: Good knowledge of the Ruby programming language. Ability to build good APIs.
Difficulty : Medium

  1. Apr 03, 2013

    asiriwork says:

    I'm interested in several project ideas mentioned above, but I couldn't find a mailing list to subscribe to, a way to contact a mentor, or any discussion forum I can join to get more specific details about the project ideas.
    It would help a lot if those links are given.
    Asiri (

    1. Apr 03, 2013

      asiriwork says:

      Just found the google group.!foru...