Puller hangs in active state, no errors in output

Developing on Android with Xamarin, in 1.4 build 028:

When trying to log in to our app, the pusher goes Idle but the puller hangs in Active state. Upon inspection of the puller, it reports that
"processed 0/0 changes"
“error count = 0”

As far as I can tell it hangs indefinitely, it hasn’t changed to idle after 15 minutes of waiting. This problem happens intermittently and appears to depend on the computer it is being built from. Once it starts happening, it happens continuously. My co-worker using the same code can log in successfully within a few seconds. Tried clean/rebuild, different devices (including ones that work from other computers).

Here is the output log:

[0:] INFO) DATABASE (Database): [6] 2017-1-19 12:23:56.595+08:00 Using Couchbase.Lite.Storage.ForestDB.ForestDBCouchStore for ForestDB implementation
[0:] INFO) DATABASE (Database): [6] 2017-1-19 12:23:56.601+08:00 Using Couchbase.Lite.Storage.SystemSQLite.SqliteCouchStore for SQLite implementation
[0:] INFO) DATABASE (SqliteCouchStore): [7] 2017-1-19 12:23:56.625+08:00 Initialized SQLite store (version 3.8.10.2 (2015-05-20 18:17:19 2ef4f3a5b1d1d0c4338f8243d40a2452cc1f7fe4))
[0:] INFO) DATABASE (Database): [6] 2017-1-19 12:23:56.632+08:00 Using Couchbase.Lite.Storage.ForestDB.ForestDBCouchStore for db at /data/user/0/xxxxxxxxx/files/.local/share/de531717e7f76a7ea09b86061717df3f7.cblite2; upgrade=False
[0:] INFO) TASK SCHEDULING (SingleTaskThreadpoolScheduler): [7] 2017-1-19 12:23:56.694+08:00 New scheduler created
Loaded assembly: System.Runtime.Serialization.dll [External]
Loaded assembly: Anonymously Hosted DynamicMethods Assembly [External]
[0:] INFO) TASK SCHEDULING (SingleTaskThreadpoolScheduler): [6] 2017-1-19 12:23:57.148+08:00 New scheduler created
Thread started: #7
Thread started: #8
[0:] INFO) SYNC (Replication): [9] 2017-1-19 12:23:57.244+08:00 Attempting to start puller (ea9b284f-5539-46b3-a486-dd10981c1448)
[0:] INFO) SYNC (Replication): [9] 2017-1-19 12:23:57.250+08:00 Attempting to start pusher (891f43e4-c292-466a-bc0c-478182cde632)
[0:] INFO) SYNC (Replication): [10] 2017-1-19 12:23:57.511+08:00 Beginning replication process…
[0:] INFO) SYNC (Replication): [9] 2017-1-19 12:23:57.622+08:00 Pusher 891f43e4-c292-466a-bc0c-478182cde632 successfully logged in!
Thread started: #9
[0:] INFO) SYNC (Replication): [11] 2017-1-19 12:23:57.732+08:00 Beginning replication process…
[0:] INFO) SYNC (Replication): [9] 2017-1-19 12:23:57.745+08:00 Puller ea9b284f-5539-46b3-a486-dd10981c1448 successfully logged in!
[0:] INFO) SYNC (Replication): [11] 2017-1-19 12:23:57.824+08:00 Puller ea9b284f-5539-46b3-a486-dd10981c1448 no local checkpoint, not getting remote one
[0:] INFO) SYNC (Replication): [11] 2017-1-19 12:23:57.830+08:00 Pusher 891f43e4-c292-466a-bc0c-478182cde632 no local checkpoint, not getting remote one
[0:] INFO) SYNC (Puller): [9] 2017-1-19 12:23:57.837+08:00 Puller ea9b284f-5539-46b3-a486-dd10981c1448 will use MaxOpenHttpConnections(8), MaxRevsToGetInBulk(50)
[0:] INFO) CHANGE TRACKER (WebSocketChangeTracker): [11] 2017-1-19 12:23:57.882+08:00 Starting WebSocketChangeTracker[sync_gateway]…
01-19 12:23:59.677 I/cbforest(18793): Database: beginTransaction
01-19 12:23:59.707 I/cbforest(18793): Database: commit transaction
[0:] INFO) DATABASE (Database): [10] 2017-1-19 12:23:59.727+08:00 No pending doc expirations

im having the same problem, with the same build

Is data is visible at SG admin panel.Please check it at localhost:4986/_admin. If data is not present here than it clearly show that there is communication problem between CB server and sync_gateway.

Are the versions of Xamarin Android different on each of the machines? Do the projects happen to have different permission requests? That’s the only thing that I can think of off the top of my head for things being different between two environments.

That can be better answer by Xamarin Android.

No the Xamarin Android versions are exactly the same, synced from the same git repo. The projects have the same permissions going on. Is there any way I can see what the Puller is waiting for? This is really frustrating, I haven’t been able to access the database for an entire day.

Can you provide the url of your sync gateway… I will check pull at my end.

The puller is waiting for some information from the changes feed but it seems like they are never coming.

How you put the data in server. Through SG or you manually enter that data?

@borrrden how can I check what data the puller is waiting for? Or are there any other ways I can debug this thing?

I don’t know if this is relevant or not, but if I disable wifi while my app is running I get an error relating to the puller. This stack trace shows the transition from online to offline, then the exception that happens. It’s strange to see the pusher comes back online again even though wifi is still disconnected.

[0:] INFO) SYNC (Replication): [10] 2017-1-20 09:10:20.692+08:00 Attempting to start pusher (660a1ffb-ca8f-478f-a9ad-d6330b88f3ff)
[0:] INFO) SYNC (Replication): [10] 2017-1-20 09:10:20.702+08:00 Attempting to start puller (44cf1935-a81d-4462-ac60-cc870104c881)
[0:] INFO) SYNC (Replication): [11] 2017-1-20 09:10:20.944+08:00 Beginning replication process...
[0:] INFO) SYNC (Replication): [10] 2017-1-20 09:10:21.070+08:00 Pusher 660a1ffb-ca8f-478f-a9ad-d6330b88f3ff successfully logged in!
01-20 09:10:21.101 D/DalAuthEventCOMMON( 8433): Active
Thread started: <Thread Pool> #10
[0:] INFO) SYNC (Replication): [12] 2017-1-20 09:10:21.195+08:00 Beginning replication process...
[0:] INFO) SYNC (Replication): [10] 2017-1-20 09:10:21.214+08:00 Puller 44cf1935-a81d-4462-ac60-cc870104c881 successfully logged in!
01-20 09:10:21.211 D/DalAuthEventCOMMON( 8433): Active
[0:] INFO) SYNC (Replication): [11] 2017-1-20 09:10:21.300+08:00 Puller 44cf1935-a81d-4462-ac60-cc870104c881 no local checkpoint, not getting remote one
[0:] INFO) SYNC (Replication): [11] 2017-1-20 09:10:21.310+08:00 Pusher 660a1ffb-ca8f-478f-a9ad-d6330b88f3ff no local checkpoint, not getting remote one
[0:] INFO) SYNC (Puller): [10] 2017-1-20 09:10:21.319+08:00 Puller 44cf1935-a81d-4462-ac60-cc870104c881 will use MaxOpenHttpConnections(8), MaxRevsToGetInBulk(50)
[0:] INFO) CHANGE TRACKER (WebSocketChangeTracker): [11] 2017-1-20 09:10:21.381+08:00 Starting WebSocketChangeTracker[sync_gateway]...
01-20 09:10:21.391 D/DalAuthEventCOMMON( 8433): Active
01-20 09:10:22.891 I/cbforest( 8433): Database: beginTransaction
01-20 09:10:22.921 I/cbforest( 8433): Database: commit transaction
[0:] INFO) DATABASE (Database): [10] 2017-1-20 09:10:22.947+08:00 No pending doc expirations
01-20 09:10:26.951 D/ViewRootImpl( 8433): ViewPostImeInputStage processPointer 0
01-20 09:10:27.361 D/ViewRootImpl( 8433): ViewPostImeInputStage processPointer 1
[0:] INFO) SYNC (Replication): [9] 2017-1-20 09:10:28.847+08:00 Puller 44cf1935-a81d-4462-ac60-cc870104c881 going offline
[0:] INFO) CHANGE TRACKER (WebSocketChangeTracker): [10] 2017-1-20 09:10:28.870+08:00 WebSocketChangeTracker[sync_gateway] requested to stop
[0:] INFO) SYNC (Replication): [10] 2017-1-20 09:10:28.901+08:00 Pusher 660a1ffb-ca8f-478f-a9ad-d6330b88f3ff going offline
01-20 09:10:28.931 D/DalAuthEventCOMMON( 8433): Offline
[0:] INFO) QUERY (Query): [10] 2017-1-20 09:10:28.971+08:00 [Query: Database=Database[/data/user/0/com.alertedh.health/files/.local/share/de531717e7f76a7ea09b86061717df3f7.cblite2], Limit=2147483647, Skip=0, Descending=True, StartKey=<redacted>,
EndKey=<redacted>, StartKeyDocId=<redacted>, EndKeyDocId=<redacted>, InclusiveStart=True, InclusiveEnd=True,
IndexUpdateMode=Before, AllDocsMode=AllDocs, Keys=, MapOnly=True, GroupLevel=0, Prefetch=False,
IncludeDeleted=False, PostFilter=, PrefixMatchLevel=0] running...
01-20 09:10:28.971 D/DalAuthEventCOMMON( 8433): Offline
01-20 09:10:28.971 I/cbforest( 8433): Database: beginTransaction
01-20 09:10:29.001 D/ViewRootImpl( 8433): #1 mView = android.widget.LinearLayout{b801a45 V.E...... ......I. 0,0-0,0}
01-20 09:10:29.011 I/cbforest( 8433): Database: beginTransaction
01-20 09:10:29.021 I/cbforest( 8433): DB 0x7f5cf8fb40: added 01 --> 07050000000000000000050000000000000000062b0004400fffffffffffff05000000000000000005401400000000000005000000000000000000 (meta ) (seq 5)
01-20 09:10:29.021 I/cbforest( 8433): Database: commit transaction
01-20 09:10:29.031 D/ViewRootImpl( 8433): MSG_RESIZED_REPORT: ci=Rect(0, 0 - 0, 0) vi=Rect(0, 0 - 0, 0) or=1
01-20 09:10:29.031 I/cbforest( 8433): Database: commit transaction
[0:] INFO) QUERY (View): [11] 2017-1-20 09:10:29.050+08:00 Query recordings: Returning iterator
01-20 09:10:29.061 V/RenderScript( 8433): 0x7f54b54000 Launching thread(s), CPUs 8
01-20 09:10:30.971 D/ViewRootImpl( 8433): #1 mView = android.widget.LinearLayout{3ea659a V.E...... ......I. 0,0-0,0}
01-20 09:10:30.981 D/ViewRootImpl( 8433): #3 mView = null
01-20 09:10:31.031 D/ViewRootImpl( 8433): MSG_RESIZED_REPORT: ci=Rect(0, 0 - 0, 0) vi=Rect(0, 0 - 0, 0) or=1
01-20 09:10:32.521 D/ViewRootImpl( 8433): #3 mView = null
Thread finished: <Thread Pool> #8
The thread 'Unknown' (0x8) has exited with code 0 (0x0).
[0:] INFO) SYNC (Replication): [5] 2017-1-20 09:10:36.591+08:00 Puller 44cf1935-a81d-4462-ac60-cc870104c881 going online
[0:] INFO) SYNC (Replication): [5] 2017-1-20 09:10:36.607+08:00 Pusher 660a1ffb-ca8f-478f-a9ad-d6330b88f3ff going online
[0:] INFO) SYNC (Puller): [5] 2017-1-20 09:10:36.618+08:00 Puller 44cf1935-a81d-4462-ac60-cc870104c881 will use MaxOpenHttpConnections(8), MaxRevsToGetInBulk(50)
[0:] INFO) CHANGE TRACKER (WebSocketChangeTracker): [5] 2017-1-20 09:10:36.626+08:00 Starting WebSocketChangeTracker[sync_gateway]...
01-20 09:10:36.631 D/DalAuthEventCOMMON( 8433): Offline
[0:] ERROR) SYNC (Replication): [12] 2017-1-20 09:10:36.643+08:00 State machine error:
System.NullReferenceException: Object reference not set to an instance of an object.
  at Couchbase.Lite.Internal.WebSocketChangeTracker.Start () [0x000ac] in <52da0b1f3186477b8bd33a888bd1e89c>:0 
  at Couchbase.Lite.Replicator.Puller.StartChangeTracker () [0x0023c] in <52da0b1f3186477b8bd33a888bd1e89c>:0 
  at Couchbase.Lite.Replicator.Puller.BeginReplicating () [0x000eb] in <52da0b1f3186477b8bd33a888bd1e89c>:0 
  at Couchbase.Lite.Replicator.Puller.PerformGoOnline () [0x00006] in <52da0b1f3186477b8bd33a888bd1e89c>:0 
  at Couchbase.Lite.Replication.<InitializeStateMachine>m__A (Stateless.StateMachine`2+Transition[TState,TTrigger] transition) [0x00049] in <52da0b1f3186477b8bd33a888bd1e89c>:0 
  at Stateless.StateMachine`2+ExitActionBehavior+Sync[TState,TTrigger].Execute (Stateless.StateMachine`2+Transition[TState,TTrigger] transition) [0x00000] in <3986113ba7814d109bc72451439ee2d0>:0 
  at Stateless.StateMachine`2+StateRepresentation[TState,TTrigger].ExecuteExitActions (Stateless.StateMachine`2+Transition[TState,TTrigger] transition) [0x00020] in <3986113ba7814d109bc72451439ee2d0>:0 
  at Stateless.StateMachine`2+StateRepresentation[TState,TTrigger].Exit (Stateless.StateMachine`2+Transition[TState,TTrigger] transition) [0x00036] in <3986113ba7814d109bc72451439ee2d0>:0 
  at Stateless.StateMachine`2[TState,TTrigger].InternalFireOne (TTrigger trigger, System.Object[] args) [0x0005b] in <3986113ba7814d109bc72451439ee2d0>:0 
  at Stateless.StateMachine`2[TState,TTrigger].InternalFire (TTrigger trigger, System.Object[] args) [0x0002f] in <3986113ba7814d109bc72451439ee2d0>:0 
  at Stateless.StateMachine`2[TState,TTrigger].Fire (TTrigger trigger) [0x00000] in <3986113ba7814d109bc72451439ee2d0>:0 
  at Couchbase.Lite.Replication+<FireTrigger>c__AnonStorey0.<>m__0 () [0x00000] in <52da0b1f3186477b8bd33a888bd1e89c>:0 
[0:] INFO) SYNC (Replication): [12] 2017-1-20 09:10:36.660+08:00 Pusher 660a1ffb-ca8f-478f-a9ad-d6330b88f3ff successfully logged in!
[0:] INFO) SYNC (Replication): [12] 2017-1-20 09:10:36.664+08:00 Pusher 660a1ffb-ca8f-478f-a9ad-d6330b88f3ff no local checkpoint, not getting remote one

This is looking a lot like https://github.com/couchbase/couchbase-lite-net/issues/803 which I thought was fixed but maybe I’m mistaken…

I will double check the fix for it.

I added in another null check but it occurs to me that this problem was fixed very recently and so I wonder if you actually have build 28 on the device. It could be stale thanks to Xamarin Android’s less than stellar record of leaving behind old files. That would also explain why your colleague’s device is working. The logs should say near the beginning the version of the assemblies that are loaded.

Seems to be working after deleting all the things, thanks.