Thanks @ingenthr, @keshav_m, @deepkaran.salooja for the great suggestions and tools. We will try that for sure.
We had 5.0 EE beta2 running for some weeks (with Standard GSI instead of Plasma), but we want to start with EE for now.
Today we found that one of the 3 nodes had a high CPU load. Some CB-related process was constantly using a lot of CPU and was probably slowing things down; we are not sure which process it was. We suspect that the query sometimes hit that node and sometimes did not.
A colleague restarted (re-installed) that node and re-created the indexes, and query response times now seem to have stabilized at around 0.5s.
We will look at how to improve our monitoring so that we notice this kind of problem earlier and have a quick way to turn on the profiling tools you suggested. We will now continue adding more load and documents.