What is PL/SQL?
PL/SQL is a procedural language designed specifically to embrace SQL statements within its syntax. It includes procedural language elements such as conditions and loops, and can handle exceptions (run-time errors).
PL/SQL is native to Oracle databases, and databases like IBM DB2, PostgreSQL, and MySQL support PL/SQL constructs through compatibility features.
What is a JavaScript UDF?
JavaScript UDF is Couchbase’s alternative to PL/SQL.
JavaScript UDF brings JavaScript’s general-purpose scripting flexibility to databases, allowing for dynamic and powerful operations across modern database systems and enhances flexibility in data querying, processing, and transformation.
Most modern databases like Couchbase, MongoDB, Snowflake, and Google BigQuery support JavaScript UDF.
The problem
A common problem seen by users migrating from Oracle to Couchbase is porting their PL/SQL scripts. Instead of supporting PL/SQL, Couchbase lets users construct user-defined functions in JavaScript (supported since 2021).
JavaScript UDFs allow easy, intuitive manipulation of variant and JSON data. Variant objects passed to a UDF are transformed to native JavaScript types and values.
The unintended consequence of this is that the majority of RDBMS that have been in existence for the last ten years have strongly encouraged developers to access the database using their procedural extensions to SQL (PL/pgSQL, PL/SQL), which support procedural constructs, integration with SQL, error handling, functions and procedures, triggers, and cursors, or at the very least, functions and procedures (like Sakila). For any attempt to move away from them, all of their scripts would need to be rewritten.
Rewriting code is often a tedious task, especially when dealing with PL/SQL scripts that have been written in the 2000s and maintained since then. These scripts can be complex, often extending to thousands of lines, which can be overwhelming for the average enterprise user.
Solution
The ideal approach would be to develop a whole new PL/SQL evaluator, but that would require an excessive amount of engineering hours, and for the same use case, we already have a modern, stable, and fast JSEvaluator; So why support another evaluator?
This makes the problem a perfect use-case to leverage the ongoing advances in AI and LLMs. And that’s what we have done here. We have used Generative AI models to automate the conversion of PL/SQL to JSUDF.
As of June 2024, models have a limited context window, which means longer PL/SQL get hit with the error:
|
1 |
This model‘s maximum context length is 8192 tokens. However, your messages resulted in <More–than–8192> tokens. Please reduce the length of the messages. |
Note, that this is for GPT4.
So do we wait for AI to become more powerful and allow more tokens (like Moore’s Law but for the AI’s context-length-vs-precision)?
No, that’s where ANTLR, a parser generator tool, comes in. ANTLR is well-known to be used for Compiler and Interpreter Development. That way we can break the big script to smaller units that can be translated independently.
So are we now building a transpiler? Well, yes and no.
Stages in a transpiler:
- Lexical Analysis (Tokenization)
- Syntactic Analysis (Parsing)
- Semantic Analysis
- Intermediate Representation (IR) Generation
- Optimization (Optional)
- Target Code Generation
How the AI translator works
Above steps 1,2 are done using ANTLR. We use ANTLR’s Listener interface to grab individual Procedure/Function/Anonymous-block, as they are independent blocks of code. In a case where the Procedure/Function/Anonymous-block are themselves exceeding the context window, we translate at a statement level (where the LLM assumes the existence of use of variables/function calls that aren’t defined here but somewhere before).
Subsequently, steps 3, 4, 5, and 6 are left to the LLM (e.g., GPT), i.e., translating each PL/SQL block into a JavaScript function to the best of its ability that also preserves the operational semantics of the block and is syntactically accurate.
The results are surprisingly quite positive; the translation is 80-85% accurate.
Another benefit of the solution is that we reduce hallucination by focusing on one task at a time, resulting in more accurate translations.
To visualize:
How to use the tool
- Download the executable from Couchbase Labs GitHub and access the README.
The executable expects the following command-line arguments:
-u: capella signin email
-p: capella signin password
-cpaddr: capella-url for chat-completions api
-orgid: organisation id in the chat-completions api path
-cbhost: node-ip: cbcluster node
-cbuser: cluster-user-name: cbcluster user, added through database-acess
-cbpassword: cluster-password: cbcluster password, added through database-access
-cbport: query-service tls port: usually 18093
filepath , i.e path to the PL/SQL script that has to be translated
output-> In the output directory, a file with the same name as the plsql file is generated with translated JavaScript Library code.
An example:
cat example1.sql
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
DECLARE x NUMBER := 0; counter NUMBER := 0; BEGIN FOR i IN 1..4 LOOP x := x + 1000; counter := counter + 1; INSERT INTO temp VALUES (x, counter, ‘in OUTER loop’); —start an inner block DECLARE x NUMBER := 0; — this is a local version of x BEGIN FOR i IN 1..4 LOOP x := x + 1; — this increments the local x counter := counter + 1; INSERT INTO temp VALUES (x, counter, ‘inner loop’); END LOOP; END; END LOOP; COMMIT; END; |
To briefly explain the above script:
- An outer loop runs for 4 iterations, incrementing x by 1000, counter by 1
- Inner loop runs for 4 iterations, incrementing x by 1, counter by 1
Running the translator on the test PL/SQL:
|
1 |
./plsql–to–jsudf –u «capella–signin–mailid» –p «capella–signin–password» –cpaddr https://api.cloud.couchbase.com -orgid «capella-organisation-id» -cbhost «hostname of data node» -cbuser «cbcluster username» -cbpassword «cbcluster password» -cbport 18093 ./translator/test/plsql/example1.sql |
Output JSUDF:
cat output/example1.js
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
function nestedloop(){ var x = 0; var counter = 0; var querybegin = BEGIN WORK; querybegin.close(); for (var i = 1; i <= 4; i++){ x = x + 1000; counter = counter + 1; var params = [x, counter]; var query = N1QL(‘INSERT INTO test.testscope.temp VALUES (uuid(),{“val1″:$1,”val2″:$2,”val3″:”in OUTER loop”})’,params); query.close(); var x_inner = 0; for (var j = 1; j <= 4; j++){ x_inner = x_inner + 1; counter = counter + 1; var params_inner = [x_inner, counter]; var query_inner = N1QL(‘INSERT INTO test.testscope.temp VALUES (uuid(),{“val1″:$1,”val2″:$2,”val3″:”inner loop”})’,params_inner); query_inner.close(); } } var querycommit = COMMIT WORK; querycommit.close(); } |
The translated script has a function nestedloop (name generated by LLM) that does exactly what the original anonymous PL/SQL block specifies.
Side note: for named functions/procedures translated JS functions will have the same name. For anonymous blocks LLM uses a name it comes up with.
Known issues
PL/SQL and JS are 2 different languages, and the way they are supported in Oracle and Couchbase doesn’t allow for a clean direct mapping between the 2.
Below are some limitations we discovered and the workarounds we have implemented for the same:
1. console.log is not supported
DBMS_OUTPUT.PUT builtin procedure and other 2 similar builtins, DBMS_OUTPUT.PUT_LINE and DBMS_OUTPUT.NEW_LINE are translated to console.log(), but console.log is a browser API and is not supported by Couchbase’s JavaScript evaluation implementation. This has been a frequent ask, considering the Couchbase eventing function does support print() statements but not in JavaScript UDFs.
Workaround:
Users are expected to create a logging bucket.
Logs are inserted as part of a document INSERT into default.default collection. The document would look something like:
|
1 2 3 4 5 |
{ “udf”: «func–name», “log”: «argument to console.log», // the actual log line “time”: «current ISO time string» } |
User can lookup his logs by selecting on logging:
|
1 2 |
SELECT * FROM logging WHERE udf= “«func-name»”; SELECT * FROM logging WHERE time BETWEEN “«date1»” AND “«date2»”; |
An example:
The original PL/SQL
|
1 2 3 |
BEGIN DBMS.OUTPUT.PUT(“Hello world!”); END; |
Translated to JavaScript UDF
|
1 2 3 4 5 6 7 8 |
function helloWorld() { // workaround for console.log(“Hello world!”); var currentDate = new Date(); var utcISOString = currentDate.toISOString(); var params = [utcISOString,‘anonymousblock1’,“Hello world!”]; var logquery = N1QL(‘INSERT INTO logging VALUES(UUID(),{“udf”:$2, “log”:$3, “time”:$1}, {“expiration”: 5*24*60*60 })’, params); logquery.close(); } |
This is already implemented in the tool.
To view the log:
|
1 2 3 4 5 6 7 8 9 10 11 12 13 |
EXECUTE FUNCTION helloWorld(); “results”: [ null ] CREATE PRIMARY INDEX ON logging; “results”: [ ] SELECT * FROM logging; “results”: [ {“logging”:{“log”:“Hello world!”,“time”:“2024-06-26T09:20:56.000Z”,“udf”:“anonymousblock1”}} ] |
2. Cross Package function calls
Procedures/Functions listed in the package specification are Global, and can be used from other packages via «package_name».«public_procedure/function». But the same is not true for a JavaScript Library in Couchbase, as import-export constructs are not supported by Couchbase’s JavaScript evaluation implementation.
Workaround:
In case of an interlibrary function call «lib_name».«function»(), user is expected to have the referenced library «lib_name» already created; you can verify this via GET /evaluator/v1/libraries
The referenced function «function» also is expected to be created as a global UDF; this can be verified via GET /admin/functions_cache or select system:functions keyspace. This way we can access the function via SQL++/N1QL.
An example:
math_utils Package
|
1 2 3 4 5 6 7 8 9 10 11 12 13 |
CREATE OR REPLACE PACKAGE math_utils AS — Public function to add two numbers FUNCTION add_numbers(p_num1 NUMBER, p_num2 NUMBER) RETURN NUMBER; END math_utils; / CREATE OR REPLACE PACKAGE BODY math_utils AS FUNCTION add_numbers(p_num1 NUMBER, p_num2 NUMBER) RETURN NUMBER IS BEGIN RETURN p_num1 + p_num2; END add_numbers; END math_utils; / |
show_sum Package
0
Translated code:
1
It is auto-handled by the program, with a warning that it should be verified by a human set of eyes!
3. Global Variables
PL/SQL supports package level and session level global variables, but global variables are not supported in JSUDF deliberately by design as this causes concern for memory leaks.
Suggested workaround requires manual tweaking of the generated translation. For example:
2
Any function that modifies a global variable must accept it as an argument and return it to the caller.
increment_counter:
3
Any function that only reads a global can accept it as an argument.
show_globals:
4
Package to Library
This section shows an end-to-end package-to-library conversion using the tool.
Sample PL/SQL package:
5
Translation:
6
Code:
7
Let’s insert a new employee document
Create employee collection:
8
Insert an employee:
9
This errors out, and that’s ok we can fix it manually.
Reading the reason and exception: Cannot INSERT non-string key 1 of type value.intValue, Ah! the key is always expected to be a string, passing insert_employee(“1”, “joe”, “briggs”, 10000) would do the trick, but it is unintuitive to expect employee_id to be a string.
Let’s alter the generated code:
0
And recreate the UDF:
1
Trying to insert again:
2
Update an employee:
Shoot! There’s a goof up, employee 1 isn’t Joe, it’s Emily.
Let’s update employee 1
3
View the employee:
4
Delete the employee:
Emily left.
5
Again an error with the generated code, looking at the reason and exception we can confirm that translated code encloses delete in a transaction, which wasn’t the case in the original.
For transactions, buckets need to have durability set, but this requires more than one data server, hence the error.
The fix here is to alter the code to remove the enclosing translation:
6
7
8
Now, all functions in the original PL/SQL work in Couchbase via JavaScript UDFs. Yes, the example is pretty trivial but you get the gist on how to go about using the tool to migrate your PL/SQL scripts with little manual supervision.
Remember the tool is supposed to take you 80% , the other 20% still needs to be done by you, but much better than writing all of that code yourself!
The future
This project is open source, so feel free to contribute. Some ideas that have been suggested:
- Critic AI that can criticize generated code to ensure manual intervention is not needed at all
- Currently the source code is code that just works; no thoughts for parallelism or code reuse were put to use.
And also include the limitations discussed earlier.
Resources
- Couchbase Labs GitHub – PL/SQL to JSUDF
- ANTLR parser generator
Finally, I’d like to thank Kamini Jagtiani for guiding me and Pierre Regazzoni for helping me test the conversion tool.


Leave a comment
You must be logged in to post a comment.