In the first installment of this series, we discussed the drivers behind creating microservices, and why Couchbase is the perfect datastore to use with a microservice architecture. Because microservices are stateless, they can be deployed anywhere and horizontally scaled as needed. While you could write a microservice in any language, to fit into an agile workflow where you need to deliver functionality fast, you should choose a language that most developers know and that allows for rapid development. Since Python and JavaScript are two popular languages, either would be a good fit. In this blog series, we are focusing on Python.

Distributed performance considerations

Application performance requirements are important, but they are sometimes difficult to quantify. A simple example is a report that runs in X hours; however, it should run in Y minutes due to business requirements. That example is easy to quantify and provides a clear target for improvement. When you have a geographically distributed architecture, knowing that a single component can support X requests per second tells you little on its own. Considering the number of variables in a geographically distributed application, you need to take a step back and start with something you can quantify; for example, you want a smartphone app to fully load in X seconds, or you want a web page to fully load in Y seconds. Then work backward to see what is needed to make that happen.

Python microservice performance considerations

When you create a microservice in Python, you have some options to consider before you start coding. You can either write all your own Python code to create the service or use a Python API framework such as Flask or FastAPI. In the first installment of this series, I provided an example of the first option. I will call this the “full code” option. In this second part of the series, I will introduce an implementation of the simple User Profile service using FastAPI. I chose FastAPI over Flask for this post because it is generally considered faster, and I thought it would be fun to give it a try.

But first, let’s focus on our original “full code” example. We used Python’s HTTPServer class to create a basic Web Server to respond to our API calls. For our API, we decided to use paths (as opposed to parameters or posting a JSON body) as it is fast and easy to parse. Our simple User Profile API does not need to provide much – just a few methods to look up a user profile and get the data. I included options for lookup by ID, Nickname or Username. In real life, requirements will vary based on how the upstream application is designed.

The full code program has two logical areas – code that is executed once and the code executed for each request – namely the do_GET function. For expedience, we won’t focus on the limited execution code, but we will focus on the do_GET function and its satellite functions. With Python’s HTTPServer class, this function will be called with each request. The request path will be in the class and accessible via self.path and the headers are in self.headers. If you are just getting started with Python, self is like this in Java – it references the calling instance of the class.
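As a rough illustration of how do_GET sees each request, here is a minimal, self-contained sketch. It is not the service from the first installment: the handler name, the echoed JSON body, and the fetch_once helper are all hypothetical and exist only to show self.path and self.headers in action.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class ProfileHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that echoes what do_GET can see."""

    def do_GET(self):
        # self.path is the request path; self.headers holds the request headers.
        body = json.dumps({
            "path": self.path,
            "has_auth": self.headers.get("Authorization") is not None,
        }).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging for this demo.
        pass


def fetch_once(path):
    """Serve exactly one request on a free local port and return the JSON reply."""
    server = HTTPServer(("127.0.0.1", 0), ProfileHandler)
    worker = threading.Thread(target=server.handle_request, daemon=True)
    worker.start()
    # Bypass any configured proxy so the request stays on the loopback interface.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    url = f"http://127.0.0.1:{server.server_port}{path}"
    with opener.open(url) as resp:
        data = json.loads(resp.read())
    worker.join()
    server.server_close()
    return data


if __name__ == "__main__":
    print(fetch_once("/api/v1/id/1234"))
```

Everything outside do_GET runs once per server; only the body of do_GET is paid for on every request, which is why the rest of this section concentrates on it.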

The service will need to iterate on the contents of the path string so it can do the appropriate lookup and return the data. Thanks to the beauty of Couchbase’s JSON native design, we don’t have to do much, if anything, to the data before we send it. So, we will focus on how to inspect the path. Python has a lot of built-in options for string processing which enable you to write pretty code, but not necessarily the fastest code. Python is an interpreted language, so the choice of statements in code that runs on every request makes a difference.

Let’s look at two options for path string processing – the startswith and split methods.

The split is more expensive, but we are going to have to do it, so it would be best to do it only once. We can then avoid calling anything else by using the array returned from the split as opposed to startswith.
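One way to check this on your own machine is a small timeit comparison. The sketch below is an assumption-laden illustration, not the original benchmark: the path layout /api/v1/id/<value> and both function names are hypothetical, and the timings will vary by Python version and hardware.

```python
import timeit

# Hypothetical request path; the real route layout may differ.
PATH = "/api/v1/id/1234"


def with_startswith(path):
    # Check the prefix first, then split anyway to extract the value.
    if path.startswith("/api/v1/id/"):
        return path.split("/")[-1]
    return None


def with_single_split(path):
    # Split once and reuse the resulting list for routing and for the value.
    parts = path.split("/")
    if len(parts) == 5 and parts[3] == "id":
        return parts[4]
    return None


if __name__ == "__main__":
    for func in (with_startswith, with_single_split):
        seconds = timeit.timeit(lambda: func(PATH), number=200_000)
        print(f"{func.__name__}: {seconds:.4f}s for 200,000 calls")
```

Both functions return the same value for the same path, so any difference in the printed timings comes from the extra method call rather than from different work being done.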

Conditional statements are fast, so while it may not look pretty, we will do a single split and then build an if…elif…else structure to iterate through the path. We will write short helper functions to do a query or key-value get and return the JSON data to the requestor with minimal processing.
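The single-split dispatch could look something like the following sketch. The route layout, the /healthz path, and the returned tuples are assumptions made for illustration; in particular, treating the ID lookup as a key-value get and the other two as queries is a guess at a sensible mapping, not a description of the original code.

```python
def route(path):
    """Map a request path to (operation, field, value) using one split.

    Assumed layout: /api/v1/<kind>/<value> plus a /healthz check path.
    """
    parts = path.split("/")  # the only split performed per request
    if len(parts) == 5 and parts[1] == "api" and parts[2] == "v1":
        kind, value = parts[3], parts[4]
        if kind == "id":
            # A document ID can be fetched directly with a key-value get.
            return ("kv_get", "id", value)
        elif kind == "nickname":
            return ("query", "nickname", value)
        elif kind == "username":
            return ("query", "username", value)
        else:
            return ("not_found", None, None)
    elif len(parts) == 2 and parts[1] == "healthz":
        return ("health", None, None)
    else:
        return ("not_found", None, None)


if __name__ == "__main__":
    for example in ("/api/v1/id/1234", "/api/v1/nickname/jdoe", "/healthz", "/nope"):
        print(example, "->", route(example))
```

In a real handler, each branch would call a short helper that runs the key-value get or query and writes the JSON straight back to the requestor.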

Also, to make our microservice secure, we will add a Bearer Token. We would use something like OAuth with Bearer and JWT tokens in a real environment. For our example, we will greatly simplify this and add a collection to our schema with a fixed token. The service will query this token on startup and only respond to requests that provide this token as a Bearer token. Finally, if needed, we will add a health check path that responds with an HTTP 200 so we know our service is healthy.
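A fixed-token check can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function name is hypothetical, the expected token is passed in rather than queried from Couchbase, and headers is anything with a get method (a plain dict here, self.headers in an HTTPServer handler).

```python
import hmac


def is_authorized(headers, expected_token):
    """Return True only if the request carries the expected Bearer token."""
    auth = headers.get("Authorization")
    if not auth:
        return False
    scheme, _, token = auth.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.strip().encode(), expected_token.encode())


if __name__ == "__main__":
    # "s3cret" stands in for the token the service would load at startup.
    print(is_authorized({"Authorization": "Bearer s3cret"}, "s3cret"))
    print(is_authorized({"Authorization": "Bearer wrong"}, "s3cret"))
```

A handler would call this before doing any lookup and respond with an HTTP 401 when it returns False.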

Containerizing the Python microservice

I decided to use Kubernetes to test the service, so I had to build containers with the various implementations of the API. There is a published Python container that can be used as a base. Some OS prerequisites will have to be installed before the required Python packages. The Python container is based on Debian so the prerequisite packages can be installed with APT. Then pip can be called to install the required Python packages. The service port will need to be exposed, and finally, the service can be run as it would be run from the command line. To containerize the service, it will need an additional modification to support environment variables as this is the preferred method for passing parameters into a container.
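Reading connection parameters from the environment might look like the sketch below. The variable names (COUCHBASE_HOST and so on) and the defaults are hypothetical placeholders, not the names used by the actual service.

```python
import os


def load_settings(env=None):
    """Collect service settings from environment variables with fallbacks."""
    if env is None:
        env = os.environ
    return {
        # All variable names and defaults below are assumptions.
        "host": env.get("COUCHBASE_HOST", "localhost"),
        "user": env.get("COUCHBASE_USER", "Administrator"),
        "password": env.get("COUCHBASE_PASSWORD", ""),
        "bucket": env.get("COUCHBASE_BUCKET", "default"),
        "port": int(env.get("SERVICE_PORT", "8080")),
    }


if __name__ == "__main__":
    settings = load_settings()
    # Avoid printing the password when showing the effective settings.
    print({k: v for k, v in settings.items() if k != "password"})
```

Accepting the mapping as a parameter keeps the function easy to test, while the container simply relies on the default of os.environ.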

This is an example of the Dockerfile for the full code service:


Some time ago, WSGI (Web Server Gateway Interface) was created for Python web frameworks. It enabled a developer to only focus on building web applications instead of all the other lower-level tasks required with a web server. That standard was extended into ASGI (Asynchronous Server Gateway Interface), which supports asynchronous Python programming and therefore is well suited for stateless applications such as REST APIs.

Uvicorn is an ASGI web server implementation for Python, and FastAPI integrates with Uvicorn to create a rapid API development platform. I decided to use this to create a second API implementation to compare to the full code version. Since it fully supports asynchronous Python, it also plays well with the Couchbase Python SDK, which fully supports asynchronous programming.

Using this framework accelerates development because much less code is needed than the full code version. Some functions to connect to Couchbase are required, but beyond that, decorated app methods are used to interact with the FastAPI instance calling minimal code segments to fetch and return data. As with the full code version, the service connects to Couchbase once and uses the resulting collection methods to get data. The on_event method is used on startup to connect to Couchbase, retrieve the auth token, and set all the needed variables.

Once the startup actions are complete, short functions for each possible request path are invoked through app method calls. The path parameter is extracted from the path and passed to the function, along with a dependency on the function to check the auth token. With this implementation, only environment variables are used to pass connection parameters.

The container for this implementation can use the same base as the full code version and install the same dependencies; however, it will have a few extra Python package requirements and the service is invoked through Uvicorn.

Setting up Node.js to test endpoints

This blog post is about Python, but it is helpful to have a non-Python comparison for the API. For this I decided to use Node.js, as it is asynchronous and works well with APIs. The Node.js implementation uses the Express module to create a web server and, in a similar fashion to FastAPI, it uses the app.get method for all supported paths. It calls a function to check the auth token first and, if successful, calls a function to get the requested data.

There is a module for the Couchbase functions located in a JavaScript file, and the functions for the supported API calls are also in modules in separate JavaScript files. Like with Python, there is a Node container that is used as a base and the NPM utility maintains the dependencies and starts the service.

Kubernetes to spin up Couchbase autonomously

As mentioned earlier, Kubernetes was chosen to test the service implementations. This enabled accelerated testing due to the ability to rapidly deploy and scale the services for different test scenarios. There are two options for using Couchbase with Kubernetes. The Couchbase Autonomous Operator can be used to deploy Couchbase into the Kubernetes environment, or the service can connect to an external cluster. The service was tested with an external cluster that was deployed in the same cloud VPC. All nodes were in the same cloud region, and both the Couchbase cluster nodes and Kubernetes nodes were deployed across availability zones to simulate what would likely be seen in a real-world deployment.

Using Kubernetes to test Couchbase implementations

Three deployment YAML files were created to deploy the three implementations. Each deployment YAML creates a namespace for the service. It uses a secret for the Couchbase password. The service is deployed with 4 replicas initially. As it is a stateless microservice, it can scale up and down as needed. Traffic is directed to the service with a load balancer. As the Kubernetes environment used was integrated with a cloud provider, each deployment also provisioned a cloud load balancer for the service.

Using the deployment YAML files, the service can be deployed and scaled as needed with the Kubernetes CLI. If this were an actual production environment, tools such as autoscaling and advanced load balancing could be used to control and access the deployment.

Cluster performance results

Before testing the services, the Couchbase cluster was tested from the Kubernetes cluster to create a baseline. YCSB workload B was used (which is primarily key-value get operations) and it yielded 156,094 ops/s. API testing was done with Apache JMeter. The ID API call was used to keep it simple, and the JMeter random number generator was leveraged to create test runs against random user profiles. The test scenario was time bound with a run time of three minutes where it would generate unrestrained load against the load balancer service requesting random user profiles with no ramp-up (the load was constant for the whole test duration).

Setting REST service thread groups with JMeter

Testing REST service endpoints with JMeter

For the first set of tests, the JMeter test parameters were unchanged, and what varied was the scale of the three API implementations. Testing started with 4 Pods for each implementation deployment and scaled up to 8 and finally 16 Pods. All the implementations scaled throughput as the Pods scaled in the deployment.

Couchbase Python performance results - requests/sec

Node.js fared the best with this test strategy as it had the lowest average latency. One millisecond is not a lot of latency, nor is 12 milliseconds. But with a fixed number of generator threads creating over 1 million requests in three minutes, milliseconds have a cumulative effect. However, please bear in mind that this is an extreme test. These are just data points. What was surprising was that the full code Python service kept pace with the FastAPI implementation.
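The cumulative effect can be made concrete with some idealized arithmetic. When a fixed number of generator threads each wait for a response before sending the next request, throughput is capped at roughly the thread count divided by the average latency. The sketch below assumes a hypothetical 100 threads; that figure is not taken from the tests, and the output is illustrative rather than a measured result.

```python
def requests_per_second(threads, latency_ms):
    """Upper bound on throughput for a closed-loop load generator.

    Each thread completes one request every latency_ms milliseconds,
    so it can issue at most 1000 / latency_ms requests per second.
    """
    return threads * 1000.0 / latency_ms


if __name__ == "__main__":
    threads = 100  # hypothetical generator thread count
    duration_s = 180  # the three-minute test window
    for latency_ms in (1, 12):
        rps = requests_per_second(threads, latency_ms)
        total = rps * duration_s
        print(f"{latency_ms} ms average latency: "
              f"{rps:,.0f} req/s, {total:,.0f} requests in three minutes")
```

Under these assumptions, an 11 millisecond difference in average latency changes the ceiling on throughput by a factor of twelve, which is why small latency gaps show up so clearly in a fixed-thread test.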

Couchbase Python performance results - latency

Since the first test scenario demonstrated that both the full code Python and FastAPI implementations were scalable, the second round of tests scaled the number of request threads with a fixed number of 32 service Pods. With this test scenario, the Python-based services could scale to nearly 10,000 requests per second.

Couchbase Python performance results - request thread scale


I think that Python is an excellent option for moderate load services. With all the testing done, the Kubernetes cluster nodes had ample available CPU and memory, so there was plenty of headroom to scale the service as needed. For implementations that require massive scale at the lowest latency, Node.js may be a better option. Couchbase supports all the prevalent languages, so just as I was able to easily code three microservice implementations, anyone can use multiple languages and frameworks and integrate Couchbase with ease.

Up Next

In the next post in this blog series, I will talk about generating random test data for the microservice schema.

Random fun fact

HTTP response codes are defined per the protocol specification. The 400 range is reserved for situations where the error seems to have been caused by the client. HTTP 418 is the “I’m a teapot” error. As commonly documented, this client error response code “indicates that the server refuses to brew coffee because it is, permanently, a teapot. A combined coffee/teapot that is temporarily out of coffee should instead return 503.”


Posted by Michael Minichino

Michael Minichino is a Principal Solutions Engineer at Couchbase

One Comment

  1. Cool post!
    I think fastapi suffers a lot from the default serializer, also pydantic while it is fairly fast it is doing validation of output data and probably taking a lot of timing there.
