Introduction

The last major Python upgrade, to version 3, arrived in December 2008, nearly 12 years ago. And yet there is a good chance that you are still working on Python 2 product or test code. If so, you may be seeing the deprecation message below as a reminder to update the Python version you’re working with.

“DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won’t be maintained after that date. A future version of pip will drop support for Python 2.7.”

Please take this seriously and plan accordingly: Python updates to 3.x are not backward compatible. What you write with Python 2.x versions may not function properly when using 3.x.

Be sure to also read the fine print. According to the website for the programming language, the date of the final Python 2.7 release is still TBD: “Being the last of the 2.x series, 2.7 will receive bugfix support until 2020. Support officially stops January 1 2020, but the final release will occur after that date.” [1]

So, Python 2 becomes unsupported at the end of this year. If you haven’t yet done so, now is a good time to migrate your current Python 2 code to Python 3 syntax and stick to Python 3 going forward.

Why don’t teams just get a jump start on this Python 2 to 3 migration? One of the major hurdles is that the majority of working code simply breaks (read more at why-was-python-3-made-incompatible-with-python-2), either because of the language syntax itself or because of issues with third-party APIs. Let’s be fair here: few of us would bother with migration if the new Python releases were backward compatible. Instead, version 2 will go unsupported, forcing many, including us here at Couchbase, to prioritize migration. Even if your team misses the bug-fix support deadline, that is OK, because your code will keep working. As a team, we decided it was better to migrate as close to this date as possible so that we are on the same page as other Python community members and can learn alongside them.

This document is a collection of tips and tricks we learned while upgrading to Python 3, along with common problems we encountered during the Couchbase test infra migration process. As you’ll see, we kick-started the work with an automated conversion and then made manual updates from the command line. Your approach may be different. Regardless, start as soon as you can. Updating Python from version 2 to version 3 is important.

Couchbase is an open source Enterprise-class MultiCloud to Edge NoSQL Database. The Couchbase functional testing framework, TestRunner, was developed in Python 2. The TestRunner git repository can be found at https://github.com/couchbase/testrunner . Our goal now is to switch completely to the Python 3 runtime instead of running Python 3 and Python 2 side by side.

As part of the Python upgrade process, we have identified the major changes needed to port successfully to version 3. Some of the problems you’ll read about were identified during the porting process itself. Our aim in sharing what we learned is to help you with your own migration. You can pick the latest Python 3.x version (which one depends on the pre-release, stable, or security-fix versions available on your platform; for us, 3.7 or 3.6). We refer to it as Python 3 throughout this blog. See more details on the release at Python releases download and Python 3 documentation.

Cheat Sheet

 

Major Changes: Python 2 vs. Python 3

To get an idea of the key changes, here is a summary of the code changes needed when moving from Python 2 to Python 3.

Text
  Python 2: Text is a byte string (for example, UTF-8): str
  Python 3: Text is Unicode: str, u''

Binary data
  Python 2: Binary is the same as text: bytes/str
    Example: file.read(6) == 'GIF89a'
  Python 3: Binary data is a separate type, written with the b prefix: bytes, b''
    Use decode() to get a string and encode() to get bytes.
    Examples:
    file.read(6) == b'GIF89a'
    b'hello'.decode() → 'hello'
    'hello'.encode() → b'hello'
    str(b'hello') → "b'hello'"

Print
  Python 2: Print statement
    Example: print ' '
  Python 3: Print function
    Example: print(' ')

Integer division
  Python 2: Integer division with a single slash
    Example: 5/2 = 2
  Python 3: Floor division. Use two slashes
    Example: 5//2 = 2 and 5/2 = 2.5

Float division
  Python 2: Float division needs a float operand
    Example: 5/2.0 = 2.5 or 5.0/2 = 2.5
  Python 3: Float division. Use a single slash
    Example: 5/2 = 2.5

Long integers
  Python 2: The long type is different from int: long
  Python 3: There is no long type. It is the same as int

Ranges
  Python 2: xrange()
  Python 3: range()

Iteration functions
  Python 2: Iteration functions had the iter prefix: iterxxx()
    Example: iteritems()
  Python 3: The iter prefix is dropped: xxx()
    Example: items()

Lists and lazy loading
  Python 2: Calls such as map(), filter(), zip(), range() and dict.keys() return lists (all elements are loaded into memory at once)
    Example: for i in d.keys()
  Python 3: The same calls return lazy iterators or views (an element is produced only when it is accessed). Wrap the call in list() when a real list is needed
    Example: for i in list(d.keys())

Dictionary comparison
  Python 2: Dictionaries can be compared with each other by default, so a list of dictionaries sorts directly
    Example: sorted(dicts)
  Python 3: Dictionaries cannot be ordered against each other directly. sorted() needs a key
    Example: sorted(expected_result, key=(lambda x: x[bucket.name]['name']))
    For general dict/list comparison, you can use the following:
      from deepdiff import DeepDiff
      diffs = DeepDiff(actual_result['results'], expected_result['results'], ignore_order=True)
      if diffs:
          self.assertTrue(False, diffs)
    Bytes and strings as values:
      diffs = DeepDiff(set(actual_indexes), set(indexes_names), ignore_order=True, ignore_string_type_changes=True)

String replace
  Python 2: string.replace(data[i], …)
  Python 3: data[i].replace(..)

URL and HTTP modules
  Python 2: urllib.urlencode()
  Python 3: New modules:
    • http.client
    • urllib.request, urllib.error, urllib.parse
    • sgmllib3k
    Example: urllib.parse.urlencode()

String constants
  Python 2: string.lowercase
  Python 3: Attributes:
    string.ascii_lowercase
    string.ascii_uppercase

See the testrunner py3 commits  for changes
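
Several of the cheat sheet entries can be checked quickly in a Python 3 interpreter. The following is a minimal sketch that exercises a few of them; the sample values (such as the `settings` dictionary and the `bucket` parameter) are illustrative only and are not taken from TestRunner.

```python
import string
from urllib.parse import urlencode

# Text vs. binary: str is Unicode text, bytes is raw data.
assert b"hello".decode() == "hello"
assert "hello".encode() == b"hello"
assert str(b"hello") == "b'hello'"   # str() on bytes gives the repr, not the text

# Division: / is true division, // is floor division.
assert 5 / 2 == 2.5
assert 5 // 2 == 2

# int has arbitrary precision; there is no separate long type.
big = 2 ** 64
assert isinstance(big, int)

# range() is lazy (like Python 2 xrange); items() returns a view.
settings = {"b": 2, "a": 1}
assert list(range(3)) == [0, 1, 2]
assert sorted(settings.items()) == [("a", 1), ("b", 2)]

# Renamed module-level helpers.
assert "x-y".replace("-", "_") == "x_y"           # was string.replace(s, ...)
assert string.ascii_lowercase.startswith("abc")   # was string.lowercase
query = urlencode({"bucket": "default"})          # was urllib.urlencode()
print(query)
```

If every assertion passes, the script prints the encoded query string and exits silently otherwise.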

Python 3 Setup

To set up Python 3 from scratch, run the commands below on a new host on any of the major supported platforms.
Later, at runtime, use either the python3 command or python inside a Python 3 virtual environment. Use either pip3 or pip3.x (pip3.6, for example) to install packages, depending on the installed Python 3 version.

Mac OS
(Example: Your laptop)
Direct setup (pip3 automatically installed):(https://wsvincent.com/install-python3-mac/)

Virtual environment setup:

Install required libraries:

For now, the below modification to the common Python 3 http client is required; otherwise, you will hit an error.

 
CentOS  
(Example node: Jenkins Slave)
Direct setup and virtual environment:

Install required libraries:

 

Perform Couchbase CSDK and Python SDK installation on new slave:

For now, the below modification to the common Python 3 http client is required; otherwise, you will hit an error.

 

Ubuntu slave used for Python 3 runtime verification
Direct Setup:

 

Install the required libraries:

Install the CSDK and Python SDK: (Ref: https://docs.couchbase.com/c-sdk/2.10/start-using-sdk.html )

For now, the below modification to the common Python 3 http client is required; otherwise, you will hit an error.

 

Windows
Download and install: https://www.python.org/ftp/python/3.7.4/python-3.7.4.exe

Install required libraries:

 

Porting Process

At a high level, porting is a three-step process: 1) auto conversion, 2) manual changes, 3) runtime validation and fixes.

First, clone the original repository and apply the basic automatic conversion changes. Check in the changes to a new repository until the full conversion is done. This way, the current regression cycles can continue without interruption.

1. Auto conversion

There is an automated tool called 2to3, provided by the Python team, that takes care of a few common patterns such as print, exceptions, list wrapping, and relative imports.

You can start with a single directory in the locally cloned workspace as a double check. Later, the conversion can be run on the entire code base so that the basic porting is taken care of.

Below are some sample 2to3 conversion commands on macOS. In the last command, note that all idioms were applied. This way, the first conversion pass can take care of the key changes.
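
To give a sense of what this first pass rewrites, here is a hedged sketch of Python 3 code in the shape 2to3 typically produces, with the Python 2 original shown in comments. The functions `summarize` and `parse_port` and their data are hypothetical and are not taken from TestRunner.

```python
def summarize(stats):
    """Return sorted 'name=value' strings for a stats dict."""
    lines = []
    # Python 2: for name, value in stats.iteritems():
    for name, value in stats.items():
        lines.append("%s=%s" % (name, value))
    # Python 2: keys = stats.keys()
    keys = list(stats.keys())
    # Python 2: print "keys:", keys
    print("keys:", keys)
    return sorted(lines)


def parse_port(text):
    """Convert text to an int port number, or return None on bad input."""
    try:
        return int(text)
    # Python 2: except ValueError, e:
    except ValueError as e:
        # Python 2: print "bad port: %s" % e
        print("bad port: %s" % e)
        return None
```

Changes like these are purely syntactic, which is why the runtime issues described later still need manual attention.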

 

2. Manual changes

The auto conversion doesn’t complete the porting. Beyond the common syntax changes made by the 2to3 tool, you might run into the common problems below during the porting process.

Run the test class, see if there are any errors, and fix them appropriately: decide whether to switch from bytes to str or from str to bytes, or, for a sort/comparison issue, fix the key name in the sorted function. This is an iterative process that continues until all of the code has been validated at runtime.

Once a common pattern is clear, you can use grep and sed to replace it across many class files. If you are not sure about other code until runtime, defer the change until that test class is executed.

Third-party libraries or modules might also have changed; search the web for the Python 3 equivalents and use them appropriately.

Make sure all code paths are covered by running across all supported platforms and parameters.

3. Runtime Validation and Fix

Once the conversion is done, run the code extensively, because Python is a dynamic language. Changes based only on visual, static code inspection can break things. You can start with basic sanity tests and acceptance tests, and then select the full tests from a single module.

Once you’re comfortable, continue with all other modules one by one. Keep checking in the changes to the new repository. In addition, make sure the ported changes in this new repository cause no regressions by running sanity tests on the newer builds. The validation should also include all the supported platforms with Python 3.

 

Python 3 Ported Code and Status

Below is where to find the new repository for the Python 3 ported code until it is merged into the main repository. The plan is to do the porting in one cycle, or to take changes from the main repo in the interim and merge them manually into this one.

https://github.com/couchbaselabs/testrunner-py3/

(Branch: master)

Many common changes are already done, but the work is not complete because other runtime issues may remain. Fixes in common code can also regress earlier fixes because of assumptions about input value type conversions. Some ported code still needs to be validated with Python 3, and the effort is in progress.

Now, let me show you the common issues that occurred during runtime validation. You can use this as a reference when you hit an issue, to see if you are having a similar one. You can apply the same solution and see if it works for you. And if you have any new ideas, you can put them in the comments.

Common Runtime Problems

 

1. Problem(s):

  • You might get some of the below TypeErrors during runtime like str instead of bytes and bytes instead of str
  • Error#1. TypeError: can’t concat str to bytes
  • Error#2. TypeError: must be str, not bytes
  • Error#3. TypeError: a bytes-like object is required, not ‘str’
  • Error#4. TypeError: Cannot mix str and non-str arguments

Solution(s):

Check the types of the variables in the statement and use xxx.encode() to get bytes, xxx.decode() to get a string, the b prefix, or str(). Sometimes the input type is not known in advance; in that case, use try: x.encode() except AttributeError: pass
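
The encode/decode advice above can be wrapped in two small helpers. This is a minimal sketch under the assumption that UTF-8 is the right encoding; the names `to_bytes` and `to_text` and the sample values are hypothetical.

```python
def to_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding it first if it is a str."""
    try:
        return value.encode(encoding)
    except AttributeError:
        # No encode() method: typically already bytes, so return it unchanged.
        return value


def to_text(value, encoding="utf-8"):
    """Return value as str, decoding it first if it is bytes."""
    if isinstance(value, (bytes, bytearray)):
        return value.decode(encoding)
    return value


# Mixing types directly would raise "can't concat str to bytes".
header = b"GET " + to_bytes("/pools/default")
message = "status: " + to_text(b"healthy")
print(header, message)
```

Normalizing at the boundary where data enters (socket reads, subprocess output, file reads) tends to keep the rest of the code working with a single type.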


2. Problem(s):

TypeError: root – ERROR – ——->installation failed: a bytes-like object is required, not ‘str’

Solution(s): 

In this case, add b as prefix to the string under comparison or change the byte type to string type. Example: lib/remote/remote_util.py

Surround the code with try-except to find the exact line causing the error (say, the above TypeError).

Sample output after calling traceback.print_exc() to see the full stack trace, similar to Java.

Fix with changes to lib/remote/remote_util.py as below.
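
As a generic illustration of this pattern (not the actual change made in remote_util.py), the sketch below compares bytes with bytes by using the b prefix, and wraps the call in try-except with traceback.print_exc() to locate a failing line. The function names and the sample output are hypothetical.

```python
import traceback


def is_installed(output):
    """Check command output (bytes) for a marker, comparing bytes to bytes."""
    # With bytes output, "installed" in output would raise TypeError,
    # so the literal carries the b prefix.
    return b"installed" in output


def check(output):
    """Run the check and print the full stack trace if the types do not match."""
    try:
        return is_installed(output)
    except TypeError:
        # Prints the traceback to stderr, showing the exact failing line.
        traceback.print_exc()
        return False


print(check(b"package installed"))
```

Passing a str instead of bytes to `check` triggers the TypeError path, which prints the trace and returns False.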

3. Problem(s):

 

Solution(s):

 

4. Problem(s):

AttributeError suite_setUp() or suite_tearDown() are missing for some testsuites.

Solution(s):

Add the dummy suite_setUp() and suite_tearDown() methods. 
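
A minimal sketch of such dummy methods follows. It assumes, as the error suggests, that the test runner looks up suite_setUp() and suite_tearDown() by name on each test class; the class `SampleTests` is hypothetical.

```python
import unittest


class SampleTests(unittest.TestCase):
    # Hypothetical test class; only the two suite-level hooks matter here.

    def suite_setUp(self):
        # Intentionally empty: nothing to prepare at the suite level.
        pass

    def suite_tearDown(self):
        # Intentionally empty: nothing to clean up at the suite level.
        pass

    def test_example(self):
        self.assertEqual(1 + 1, 2)
```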

 

5. Problem(s):

 

Solution(s):

 

6. Problem(s):

AttributeError: ‘Transport’ object has no attribute ‘_Thread__stop’

Solution(s):

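Python 3 has no supported way to directly stop a non-daemon thread. The Python 2 private method _Thread__stop() no longer exists; syntax-wise, t._stop() is the closest name, but it is a private method and should not be relied upon. The recommendation is a graceful shutdown: set a global flag and check it in the thread’s run() method to break out of the loop.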

(https://stackoverflow.com/questions/27102881/python-threading-self-stop-event-object-is-not-callable)
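
The graceful shutdown described above can be sketched with threading.Event as the flag. The class `Worker` and its attribute names are hypothetical. Note that the event is deliberately not named `_stop`, since that would shadow an internal attribute of threading.Thread.

```python
import threading
import time


class Worker(threading.Thread):
    def __init__(self):
        super().__init__()
        self._stop_requested = threading.Event()
        self.iterations = 0

    def run(self):
        # Loop until another thread asks us to stop.
        while not self._stop_requested.is_set():
            self.iterations += 1
            # wait() doubles as an interruptible sleep.
            self._stop_requested.wait(0.01)

    def stop(self):
        self._stop_requested.set()


worker = Worker()
worker.start()
time.sleep(0.05)
worker.stop()
worker.join(timeout=1)
print(worker.is_alive())
```

Because the loop checks the flag on every pass, the thread exits on its own shortly after stop() is called.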

7. Problem(s):

Test expirytests.ExpiryTests.test_expired_keys was not found: module ‘string’ has no attribute ‘translate’

Solution(s):

Rewrite using the str methods (str.maketrans() and str.translate()). Python 3 has no direct equivalent of the old way of getting all characters, so we kept the earlier logic and built the full character set explicitly.

vi lib/membase/api/tap.py 
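
For reference, here is a minimal sketch of the Python 3 replacements for string.maketrans() and string.translate(). It is a generic illustration with made-up input, not the change made in tap.py.

```python
# Python 2: table = string.maketrans("abc", "xyz"); string.translate(s, table, "-")
table = str.maketrans("abc", "xyz", "-")   # third argument lists characters to delete
result = "a-b-c-d".translate(table)
print(result)

# For bytes, the equivalents are bytes.maketrans() and bytes.translate().
btable = bytes.maketrans(b"abc", b"xyz")
bresult = b"a-b-c-d".translate(btable, b"-")
print(bresult)
```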

 

8. Problem(s):

TabError: inconsistent use of tabs and spaces in indentation

 

Solution(s):

Search for tab characters and replace with space characters. 

For the above issue, remove tab characters.
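
One way to locate and fix the offending lines is sketched below. The helper names are hypothetical, and the tab width of 4 is an assumption that should match your code style. Note that str.expandtabs() also expands tabs inside string literals, so review the result before checking it in.

```python
def find_tab_lines(source):
    """Return 1-based line numbers whose indentation contains a tab."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            hits.append(number)
    return hits


def tabs_to_spaces(source, width=4):
    """Expand every tab to spaces using the given tab width."""
    return source.expandtabs(width)


sample = "def f():\n\tx = 1\n    return x\n"
print(find_tab_lines(sample))
fixed = tabs_to_spaces(sample)
```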

 

9. Problem(s):

Solution(s):

Case-sensitivity issue. Fixed by changing the key from x_couchbase_meta to X_Couchbase_Meta.

 

10. Problem(s):

  • Error#1. TypeError: ‘<‘ not supported between instances of ‘dict’ and ‘dict’
  • Error#2. TypeError: ‘cmp’ is an invalid keyword argument for this function

Solution(s):
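
Both errors usually come from sorting calls written for Python 2. The sketch below shows two common fixes, assuming made-up documents in `docs`: sorting dictionaries with a key function, and wrapping an old comparison function with functools.cmp_to_key().

```python
from functools import cmp_to_key

docs = [{"name": "b", "age": 30}, {"name": "a", "age": 25}]

# Error#1: sorted(docs) fails because dicts define no ordering; sort by a field.
by_name = sorted(docs, key=lambda d: d["name"])


# Error#2: sorted(docs, cmp=...) is gone; wrap the old comparison function.
def compare_age(left, right):
    # Return negative, zero, or positive, as a Python 2 cmp function did.
    return (left["age"] > right["age"]) - (left["age"] < right["age"])


by_age = sorted(docs, key=cmp_to_key(compare_age))
print([d["name"] for d in by_name], [d["age"] for d in by_age])
```

A plain key function is usually simpler; cmp_to_key() is mainly useful when an existing comparison function is too involved to rewrite.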

   

11. Problem(s):

Solution(s):

 

12. Problem(s):

Solution(s):

 

13. Problem(s):

Solution(s):

Here, it should return an int, as Python 3 doesn’t compare automatically the way Python 2 did.

 

14. Problem(s):

Solution(s):

 

15. Problem(s):

Solution(s):

Converted the key to a string so that ch is a string instead of an int, which is what iterating over a binary key produces. See the file.
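
The underlying behavior is easy to reproduce: iterating over bytes yields ints in Python 3. A minimal sketch with a made-up key:

```python
key = b"k1"

# Iterating bytes yields ints in Python 3 (it yielded 1-char strings in Python 2).
codes = [ch for ch in key]
print(codes)

# Decoding first gives one-character strings again.
text_key = key.decode()
chars = [ch for ch in text_key]
print(chars)
```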

 

16. Problem(s):

TypeError: ‘FileNotFoundError’ object is not subscriptable

Solution(s):

This changed in Python 3: FileNotFoundError is not subscriptable. Use the errno attribute instead: e.errno
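
A minimal sketch of the attribute-based check, assuming the path below does not exist on your machine (it is a made-up path):

```python
import errno

try:
    open("/nonexistent/dir/file.txt")
except FileNotFoundError as e:
    # Python 2 style e[0] raises TypeError in Python 3; use the attribute.
    code = e.errno
    print(code == errno.ENOENT)
```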

 

17. Problem(s):

Solution(s):

The nested dictionary/list comparison stopped working because the earlier sorted() behavior, which could order nested structures completely, is no longer available. Use the DeepDiff class from the deepdiff module to do the comparison.

 

18. Problem(s):

AttributeError: module ‘string’ has no attribute ‘replace’

Solution(s):

Call replace() directly on the str variable to fix the issue.
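
For example, with made-up values in `data`:

```python
data = ["bucket-1", "bucket-2"]

# Python 2: string.replace(data[i], "-", "_")
renamed = [item.replace("-", "_") for item in data]
print(renamed)
```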

 

19. Problem(s):

Solution(s):

Use str or int function appropriately.

 

20. Problem(s):

NameError: name ‘cmp’ is not defined

Solution(s):

Use deepdiff module and DeepDiff class to do object comparison.
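
For simple values that support < and >, a small stand-in for the removed built-in is an alternative worth knowing. This sketch is not what the article's fix used, and it does not work for dictionaries, which is where DeepDiff remains the better fit.

```python
def cmp(a, b):
    """Stand-in for the Python 2 built-in: returns -1, 0, or 1."""
    return (a > b) - (a < b)


print(cmp(1, 2), cmp(2, 2), cmp(3, 2))
```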

 

21. Problem(s):

Solution(s):

Convert str to int as below for the above type error issue.

—-

That’s all for now on the list of problems to watch for when you upgrade Python version 2 to Python version 3. We will post more learnings in future blog posts. In the meantime, good luck migrating!

 

Further readings

The following references helped us. You can read further at the links below to get more details and improve your code porting to Python 3.

  1. https://www.python.org/dev/peps/pep-0373/
  2. https://wiki.python.org/moin/Python2orPython3
  3. https://www.toptal.com/python/python-3-is-it-worth-the-switch
  4. https://weknowinc.com/blog/running-multiple-python-versions-mac-osx
  5. https://docs.python.org/3/howto/pyporting.html
  6. https://wsvincent.com/install-python3-mac/
  7. http://python3porting.com/pdfs/SupportingPython3-screen-1.0-latest.pdf
  8. https://riptutorial.com/Download/python-language.pdf
  9. https://docs.couchbase.com/python-sdk/2.5/start-using-sdk.html
  10. https://docs.couchbase.com/c-sdk/2.10/start-using-sdk.html
  11. https://pypi.org/project/deepdiff/
  12. https://buildmedia.readthedocs.org/media/pdf/portingguide/latest/portingguide.pdf
  13. http://ptgmedia.pearsoncmg.com/imprint_downloads/informit/promotions/python/python2python3.pdf

Hope you had a good time reading!

Disclaimer: Please view this as a quick reference for your Python 3 upgrade, rather than a complete guide to resolving porting issues. Our intent here is to help you at some level and give you a jump start on the porting process. Please feel free to share if you learned something new that can help us. Your positive feedback is appreciated!

 

Thanks to Raju Suravarjjala and Keshav Murthy for their key inputs and feedback.

 

Author

Posted by Jagadesh Munta, Principal Software Engineer, Couchbase

Jagadesh Munta is a Principal Software Engineer at Couchbase Inc., USA. Prior to this, he spent 19 years at Sun Microsystems and Oracle combined. Jagadesh holds a Master’s in Software Engineering from San Jose State University, USA, and a B.Tech. in Computer Science and Engineering from JNTU, India. He is the author of “Software Quality and Java Automation Engineer Survival Guide”, written to help software developers and quality automation engineers.
