I am getting this error in the response when a query is executed through a Java application:
Prepared name in encoded plan parameter is not [127.0.0.1:8091]b217fd0b75bc740ab648222d07cdc6814eee0e9f
This started today only; however, I am not aware of any change in the configuration of the Couchbase server which could cause an encoded plan to get corrupted.
I checked query.log but couldn’t find any log entry related to this error.
Can anyone tell me what this error actually means and how I can debug it further?
@vsr1, is there any way I can retrieve the encoded plan that is being sent by the SDK?
Also, can you explain the flow of execution? I am interested in how this works internally, so that I can visualize the issue better.
Hi Khrishnan, let me say this.
In 5.5.x there should never be a case in which your applications resort to the encoded_plan REST API parameter; N1QL should be able to handle prepared statements without ever having to look at it.
If N1QL has to go to the last resort of even inspecting encoded_plan, something nasty is happening to your prepared cache, which it shouldn’t.
It seems to me like your developers are churning the cache beyond control. The N1QL prepared cache by default holds 16k statements managed by LRU, meaning that an old prepared statement goes away once 16383 new prepared statements have been created since its last use.
Either that, or you have something that’s periodically deleting from system:prepareds.
Or somehow your prepared limit has been shrunk from the default.
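To make the LRU behaviour above concrete, here is a toy sketch (not the actual N1QL implementation) of a size-bounded, least-recently-used prepared cache; a capacity of 4 stands in for the real default of 16384:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of an LRU-bounded prepared cache. Once more statements are
// prepared than the cache can hold, the least recently used entry is evicted.
public class PreparedCacheSketch {
    static final int CAPACITY = 4; // stand-in for the default 16384

    // accessOrder=true keeps iteration order least-recently-used first;
    // removeEldestEntry evicts once the capacity is exceeded.
    static Map<String, String> cache = new LinkedHashMap<>(16, 0.75f, true) {
        @Override
        protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
            return size() > CAPACITY;
        }
    };

    public static void main(String[] args) {
        for (int i = 1; i <= 5; i++) {
            cache.put("stmt" + i, "plan" + i); // prepare five statements
        }
        // stmt1 was least recently used, so it has been evicted:
        System.out.println(cache.containsKey("stmt1")); // false
        System.out.println(cache.containsKey("stmt5")); // true
    }
}
```

A client that later tries to execute `stmt1` by name gets a cache miss, which is exactly the situation where encoded_plan comes into play.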
If the first or third case applies, you can increase the prepared cache size.
In the first case, you should also investigate why so many prepared statements are being created that the cache cannot hold them.
In general terms I would say that such a churn is not normal.
In my 20 years in relational engine development, the norm I have seen is a few thousand prepared statements system-wide.
If something is deleting from system:prepareds - well, that process needs to stop because it is affecting your operations.
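If the cache does need to grow, the query service exposes a node-level prepared-limit setting via its /admin/settings endpoint on port 8093 (per the Couchbase query settings documentation). A minimal sketch that only builds the request; the host, credentials, and the new limit of 65536 are placeholders:

```java
// Builds the node-level admin request that raises the prepared cache size.
// Host, port, credentials, and the new limit are illustrative placeholders.
public class PreparedLimitRequest {
    static String adminUrl(String host) {
        return "http://" + host + ":8093/admin/settings";
    }

    static String body(int newLimit) {
        return "{\"prepared-limit\": " + newLimit + "}";
    }

    public static void main(String[] args) {
        // Equivalent to:
        //   curl -u Administrator:password -X POST \
        //        -d '{"prepared-limit": 65536}' http://127.0.0.1:8093/admin/settings
        System.out.println("POST " + adminUrl("127.0.0.1"));
        System.out.println(body(65536));
    }
}
```

This must be applied on each query node, since the prepared cache is per node.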
@Marco_Greco, I’m not sure what makes you think that the N1QL prepared cache is not storing as many statements as expected. I just verified that system:prepareds holds 16384 statements on our dev environment, as expected. Also, there is no process that periodically deletes from system:prepareds. Still, the issue persisted.
Can you explain how this relates to the bug you resolved by fixing your code?
Since I had to do something to unblock our testing team, I deleted the old cached plans and restarted all the nodes. For now, the issue is not visible, but I’m not sure that will hold for long.
I’d appreciate it if you could explain what’s going on behind the curtain, so that I don’t have to shoot in the dark.
Also, I was experimenting with the REST API when this issue first appeared, and found that I was able to run the query when I sent the parameters name and encoded_plan to the query service via Postman. The prepared statement was available on the server for that document, but the service was still throwing the same error.
the encoded_plan REST API parameter (the one we are not using correctly) is only used if the prepared statement cannot be found in the cache.
That means, that out of the 16384 statements that you have in the cache, the failing statement was not there.
The cache is managed by LRU, meaning that the most recently used statements are kept at the top, and the least recently used ones eventually get deleted to make space for new statements.
If the statement that you are trying to execute is not found (and it isn’t because you are resorting to encoded_plan), then the cache quite simply is not big enough, and you should increase it as per my suggestion.
But you should also investigate why you do have so many prepared statements that they do not actually fit in the cache in the first place.
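The lookup order described above can be sketched as follows. This is a toy model, not the server's actual code: the cache is consulted first, the client-supplied encoded_plan is only decoded on a cache miss, and if the name baked into the decoded plan does not match the requested name, the request fails with an error like the one in this thread. The `decodeName` helper is hypothetical, standing in for decompressing and unmarshalling the plan:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the prepared-statement lookup order: cache first,
// encoded_plan only as a last resort, with a name-consistency check.
public class PreparedLookupSketch {
    static Map<String, String> cache = new HashMap<>();

    static String execute(String name, String encodedPlan) {
        String plan = cache.get(name);
        if (plan != null) {
            return "executed from cache: " + plan;      // normal path
        }
        if (encodedPlan == null) {
            return "error: no such prepared statement"; // miss, nothing to fall back on
        }
        // Last resort: use the client's plan, but verify the name it carries.
        String decodedName = decodeName(encodedPlan);
        if (!decodedName.equals(name)) {
            return "error: prepared name in encoded plan parameter is not " + name;
        }
        cache.put(name, encodedPlan);                   // re-cache and run
        return "executed from encoded_plan";
    }

    // Hypothetical decoder: the "encoded plan" here is just "name:plan-body".
    static String decodeName(String encodedPlan) {
        return encodedPlan.split(":")[0];
    }
}
```

In other words, merely seeing the encoded_plan error means the statement was not found in the cache at the moment of execution.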
@Marco_Greco thanks for clarifying the process. I have two more questions:
As per my last comment, when I was checking a document for which the error was occurring, the statement was available in system:prepareds. When I ran the REST API call manually it worked, yet the service was still throwing the same error. Why was this happening?
I observed that for one query, one parameter in the WHERE clause keeps changing, and for each of those parameter values a different prepared statement is saved in system:prepareds. Is this normal?
If encoded_plan is being inspected when the prepared is available, then what that means is that the two plans are mismatching.
I would suspect that you have clients that have been running for a long time, a schema change has intervened (e.g. an index was dropped and recreated), and the N1QL service has generated a new plan while the application still carries the old one.
If you restart the applications, they will carry the new plan, and everything will match.
You can also disable the encoded_plan REST API parameter at the application level, and then there would never be any mismatch. @daschl, could you remind me how to do that?
Also, yes you can have duplicated statements in the prepared cache, it’s all down to how your application is written (eg named statements vs anonymous statements, placeholders vs embedding values in the statement text, etc).
If you find that you have lots of prepareds that differ only by one value, you could suggest to your developers to use placeholders.
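To illustrate the churn, here is a small sketch; the statement texts are hypothetical, and the sets stand in for the server's prepared cache keys. Embedding the value in the statement text yields a distinct prepared statement per value, while a placeholder yields a single reusable one:

```java
import java.util.HashSet;
import java.util.Set;

// Contrasts two ways of building the same query 1000 times: value embedded
// in the text (one cache entry per value) vs a placeholder (one entry total).
public class PlaceholderSketch {
    static Set<String> embedded = new HashSet<>();
    static Set<String> parameterized = new HashSet<>();

    public static void main(String[] args) {
        for (int id = 0; id < 1000; id++) {
            // Anti-pattern: a new statement text (and cache entry) per value.
            embedded.add("SELECT * FROM orders WHERE orderId = " + id);
            // Placeholder: one statement text, values supplied at execute time.
            parameterized.add("SELECT * FROM orders WHERE orderId = $id");
        }
        System.out.println(embedded.size());      // 1000 distinct prepareds
        System.out.println(parameterized.size()); // 1
    }
}
```

With placeholders, 1000 executions reuse one cached plan instead of flooding the cache with near-duplicates, which is exactly the churn that pushes older statements out via LRU.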
@Marco_Greco, I suspected the same for #1, but after I restarted the application responsible for executing N1QL queries in Couchbase along with the Couchbase server, all saved plans were removed from the server. I could not verify the same on the application side, as I don’t know how to do that. The error was still coming.
Now, the only frequent change I know of is cluster rebalancing, which occurs every few months on various environments (removing/adding nodes) and requires indexes to be dropped and recreated. But this issue occurred only for one particular query. Shouldn’t all queries be affected by the same issue?