In this post, we introduce System Catalogs in Couchbase and walk through a specific use case compared to SQL Server.
If you’re like me, after 20 years of working with SQL Server, you tend to pick up on things that you find annoying. You’ll create an elegant workaround, but after changing jobs and not having your workaround, you encounter the same issues again and again. I’ve had a similar experience with annoying issues with system catalog views in SQL Server. Here I show how much simpler this can be using Couchbase.
System catalog views
System catalogs are those internal tables and views that show system-level metadata about objects stored in the database.
In SQL Server, you can look at this internal data using Transact-SQL (T-SQL) to access System Catalog Views and Dynamic Management Objects (DMOs)*.
There are other functions and features you can call, but most users stick with these two as the syntax is easier and there is a lot of reference material available.
The scope of SQL Server System Catalog Views always annoyed me because Microsoft set it to be per database in some cases, as we’ll look at below. Some think this is a good approach, but this can make it harder to compare results across several associated databases because the scope is limited to one database at a time.
* (If you’re used to calling them Dynamic Management Views (DMVs), remember there are also DMFs for Functions as well. DMOs is the generic term to use.)
Couchbase vs. SQL Server Terminology
Before we go any further, let’s compare the key terminology of some basic database concepts between SQL Server and Couchbase. As Couchbase is a NoSQL database with a SQL for JSON query language, there are some overlaps with SQL Server but also a few variations. This table describes the main terms across both systems.
|Cluster||Cluster||A Cluster in SQL Server is for High Availability where each instance is only active on one node. Couchbase benefits from a distributed architecture; each node can contain active data if so configured.|
|Instance||Cluster / Node||The installation of the SQL Server engine is classified as an instance. More than one instance can reside on the same machine. Couchbase applies the idea of separate nodes, where each node can contain multiple types of services. DBAs can also spread out services across several nodes. We call this multidimensional scaling a core, powerful feature of Couchbase.
A SQL Server Instance can be compared to a Couchbase Cluster that potentially contains multiple nodes.
Analyzing indexes using system catalogs
For our specific use case, we want to pull out a list of indexes for all databases to compare and check if any indexes are missing. If you are a business-to-business provider and host many customers on the same instance, this is a common question. Often each customer will have a separate database and apply the same schema and set of indexes.
The following pseudocode outlines one solution. The idea here is not to solve the issue for you but to demonstrate how much more effort SQL Server needs for similar results. As Couchbase catalogs are scoped to the cluster level, it allows you to compare multiple buckets in a cluster together easily.
Here we are not using SQL Server’s built-in function to loop through databases as it used to be a bit buggy. Aaron Bertrand wrote a much better version which you can find on MSSQLTips.
Pseudocode for SQL Server:
- Create a temporary table to hold data
- Start Loop/Cursor based on the number of user databases
- Create dynamic string to enter the scope of that database, Select data from index catalog or DMO along with DB Name, and Insert data into table
- Execute dynamic string
- Next Database
- Then your query (tongue in cheek example below):
SELECT Data FROM Table
WHERE Things = OtherThings
OR Things = 'Broken'
N1QL query for Couchbase:
In contrast, doing the same thing in Couchbase is simple because the catalog is at the cluster level rather than the database level. Here we just need to run a single query. In this scenario, we are using Couchbase 7 and no default scopes are being used. (A scope in Couchbase would be analogous with a Schema in SQL Server.)
The query uses the system catalog to access the active indexes:
COUNT(*) AS Qty
FROM system:indexes AS TabA
GROUP BY TabA.name) TabB
WHERE TabB.Qty < 3
From looking at the query, you will immediately notice its similarity to TSQL. Couchbase spent a lot of effort creating N1QL, our SQL-for-JSON language, so anybody familiar with SQL will be able to pick up N1QL almost instantly.
I created three buckets with one scope and one collection to test the above. I inserted some data and created a few indexes. To make life easier, I just put a hard-coded number in the WHERE clause, but I could have referenced other catalogs to total up the number of buckets, scopes, or collections you may want to compare.
Of course, indices aren’t the only objects we can check. We could also do the same thing with scopes (schemas) and collections (tables) without creating loops as you would with SQL Server because Couchbase system catalogs have access to all objects in the cluster.
As you can see, the simplified access to database objects across Couchbase can be very convenient. Accessing database metadata is important for any DBA and can be done easily with minimal code. I hope you’ll give this a try and see how quickly you can get up and running on Couchbase with this approach.