Error 400 with filtered replication between PouchDB and Sync Gateway

Hello all,

I’m trying to sync PouchDB to Couchbase via Sync Gateway. It’s all working fine when I sync all docs (within my allowed channels). However, when I try to add a filter, to do filtered replication, PouchDB is giving a 400 error (Bad request).

My filter currently just returns true, so is about as simple as you can get and should just let all docs through. I am adding the filter design doc to PouchDB with the following function:

function addFilterIndex_sync() {
  var designDoc = {
    "_id": "_design/sync_filter",
    "filters": { "sync_filter": function (doc, req) { return true; }.toString() }
  };

  database.put(designDoc).then(function (result) {
  }).catch(function (err) { /* handle error */ });
}

This appears to be successful (at least, there is no error from the put command). I am trying to use it in the sync in the following way:

database.sync(remoteDatabase, {
  live: true,
  retry: true,
  filter: 'sync_filter',
  query_params: { 'project': project }
})….

If I comment out the “filter:” line it all works fine and I get all my docs. With the filter specification I get the 400 error.

This is with PouchDB 5.4.5 and Sync Gateway 1.2.1.

I have already asked this question to the PouchDB community, and received the following reply from Nolan Lawson:
“we don’t run the full test suite against CSG and I know not every API works. You might try syncing PouchDB -> CouchDB -> CSG instead”

I don’t want to use CouchDB as we are already using Couchbase and it’s suited our needs so far. I find it difficult to believe that CSG doesn’t support filtered replication and that others aren’t using this, so I think I’m probably just doing something silly. Can anyone please help?

Thanks,
Giles

Sync Gateway supports filtered replication by channel or doc id. To accomplish the equivalent of the above, you’d need to incorporate your filter into Sync Gateway’s sync function to assign docs to channels, and then use the channel filter.

Details on usage here:
http://developer.couchbase.com/documentation/mobile/1.2/develop/references/sync-gateway/rest-api/database-public/get-changes/index.html
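
A minimal sketch of the sync-function side of this, assuming each document carries a hypothetical `project` field and using a made-up `project_` channel prefix. `channel()` is supplied by Sync Gateway when the function runs as part of its configuration; it is not defined in this snippet.

```javascript
// Sketch of a Sync Gateway sync function that routes docs into one channel per project.
// Assumption: docs have a `project` field; the "project_" prefix is illustrative only.
function syncFunction(doc, oldDoc) {
  if (doc.project) {
    // channel() is provided by Sync Gateway at runtime.
    channel('project_' + doc.project);
  }
}
```

A client could then pull a single project by asking for the matching channel name instead of running a filter function.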

@adamf: You’re thinking of a pull filter, which runs on the server. But it looks like @giles is trying to push, since he’s using a filter function in PouchDB.

@giles: Push filters are entirely local to the peer running the replication, i.e. PouchDB. PouchDB’s push replicator calls the function on every candidate revision to check whether it should push it. If the filter isn’t working, it must be something to do with PouchDB. Have you tried running the exact same sync to CouchDB, or to another PouchDB instance?
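
For the push direction, here is a sketch of a local filter passed to PouchDB as a plain function rather than as a design-document name. The `project` field and the `makeProjectFilter` helper are assumptions for illustration, not something from the thread.

```javascript
// Hypothetical helper: builds a push filter that only lets through docs for one project.
function makeProjectFilter(project) {
  return function (doc) {
    // Let deletions through so that removed docs still replicate.
    if (doc._deleted) { return true; }
    // Skip design docs; they should not be pushed to the server.
    if (doc._id.indexOf('_design/') === 0) { return false; }
    return doc.project === project;
  };
}

// Usage (assumes `database`, `remoteDatabase` and `project` from the question):
// database.replicate.to(remoteDatabase, {
//   live: true,
//   retry: true,
//   filter: makeProjectFilter(project)
// });
```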

(If you’re not trying to push, then please be specific about what you’re doing. The word “sync” is really vague because it doesn’t imply directionality.)

@giles: IIRC the filter name has to be scoped to the design doc name. So given the design doc you posted, the filter property of your replication request should be sync_filter/sync_filter.

Thanks for the replies guys.

@adamf, I already use channels on a more permanent basis (i.e. which projects does the user have access to?) rather than dynamically (which project is active now?). I want to use filters to pull only the docs for the project that the user currently has selected rather than pulling all docs for all projects that the user can access. I will look at using channels for this as well.

I think there are two different possible methods for this:

  • specify the channel in the GET request, as per your link. However, would this mean that I can’t use PouchDB to do the replicating with the sync function, and would need to do the sync myself with REST GET calls?
  • Rewrite the user to include only the current project in the “all_channels” property. This seems like the simplest solution with the least changes as the sync function should all work in the way that I am currently using it, pulling only the current project. Two questions on this: (a) Are there any issues with repeatedly writing the user doc to change channels as they switch between docs, and are there any limits on this? (b) will the sync automatically keep up as the channels change for the current user?

@jens, I am mostly trying to pull. However, the sync function should allow filtering in both directions. If the filter is local, it syncs on the client side, even for a pull. Once this was working I was planning to push the filter to the server, but it appears that it doesn’t work either way.
On your second point, if the filter has the same name as the design doc, I don’t think you need to specify both. So if the design doc is “sync_filter” and the filter name is “on_project”, you would need to specify “sync_filter/on_project”, but if both names are “sync_filter”, you can just specify “sync_filter”. In any case, I have also tried “sync_filter/sync_filter”, and I have tried a different name for the filter.

Thanks all,
Giles

Pull filters are always on the server side. In CouchDB they’re installed in a design document on the server, but Sync Gateway doesn’t support that — it doesn’t scale well at all, which is why we came up with channels.

You can specify channels when pulling by using the filter name sync_gateway/bychannel, and setting the filter parameter named channels to a string containing comma-separated channel names.
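
As a rough sketch, the replication options for such a pull might be built like this. The `channelPullOptions` helper and the channel names are hypothetical; the filter name and the `channels` parameter are the ones described above.

```javascript
// Hypothetical helper: builds PouchDB replication options for a channel-filtered pull.
function channelPullOptions(channels) {
  return {
    live: true,
    retry: true,
    filter: 'sync_gateway/bychannel',
    // The channels parameter is a single comma-separated string, not an array.
    query_params: { channels: channels.join(',') }
  };
}

// Usage (assumes PouchDB is loaded and both databases exist):
// PouchDB.replicate(remoteDatabase, database, channelPullOptions(['project_alpha']));
```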

Thanks Jens,

I have just tried using that filter with a channel name and it works a treat!

I had to switch to using PouchDB.replicate, rather than database.sync, so that the filter is only used for pulling, which makes sense. I can just do a normal replicate without the filter for the push (I don’t need to filter when pushing as I have more control over the records that I write).
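
Roughly, the split into two one-way replications could look like the sketch below. `startProjectSync` and the channel name are made up for illustration; `Pouch` is assumed to be the PouchDB constructor, or anything else exposing a compatible `replicate(source, target, options)`.

```javascript
// Starts a channel-filtered pull and an unfiltered push as two separate replications.
function startProjectSync(Pouch, local, remote, channelName) {
  // Pull: remote -> local, restricted to one channel by Sync Gateway.
  var pull = Pouch.replicate(remote, local, {
    live: true,
    retry: true,
    filter: 'sync_gateway/bychannel',
    query_params: { channels: channelName }
  });
  // Push: local -> remote, no filter.
  var push = Pouch.replicate(local, remote, { live: true, retry: true });
  return { pull: pull, push: push };
}

// Usage (assumes PouchDB, `database` and `remoteDatabase` exist):
// var handles = startProjectSync(PouchDB, database, remoteDatabase, 'project_alpha');
// When the user switches project: handles.pull.cancel(); then start a new pull.
```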

Great advice, thanks very much. Is this filter documented anywhere?

One final question - does the channels list that you specify here override the channels and roles that are specified for the user? I.e. do the user’s all_channels and roles properties just provide the default for this filter if it isn’t specified? Or do the two work together, i.e. do the channels that you specify here also need to be in the allowed channels for the user?

Thanks,
Giles

Actually another final question… :slight_smile:

Is there a recommended maximum limit to the number of channels that can be specified, either for a user/role, or in this replication filter?

Thanks,
Giles

There isn’t a specific recommendation for the maximum number of channels. Increasing the number of channels will result in an increase in memory usage by Sync Gateway, as it maintains an in-memory cache of document metadata for the latest documents written to each channel (where “latest” is configurable using the channel_cache settings described here:
http://developer.couchbase.com/documentation/mobile/current/develop/guides/sync-gateway/configuring-sync-gateway/config-properties/index.html#story-h2-8)

There will be some additional CPU usage on a given changes request if it’s retrieving data for a larger number of channels, but I think memory usage would be the main performance consideration.

Yes. The default with no filter is “all channels I have access to”, so specifying a list of channels is how you pull a subset.

Thanks for the info, @adamf and @jens. Very useful.