Date: 2024-07-15

Fix file contents TypeError when migrating from Python 2 to 3

David Y.

The Problem

I’m attempting to migrate old Python 2.7 code to Python 3. I have this piece of code that worked fine in Python 2.7:

with open(file_name, 'rb') as file:
    lines = [line.strip() for line in file.readlines()]

new_lines = []
for line in lines:
    new_line = line.lower()
    if 'DELETE ME' in new_line:
        print "Skipping a line."
        continue

    new_lines.append(new_line)

When I try to run this code in Python 3, after changing the print statement to use parentheses, I get the following error message:

    if 'DELETE ME' in new_line:
       ^^^^^^^^^^^^^^^^^^^^^^^
TypeError: a bytes-like object is required, not 'str'

I don’t understand this error at all. What’s happening here and how do I fix it?

The Solution

This error occurs because the file is opened in binary mode, as requested by the b in open(file_name, 'rb'). In binary mode, the lines returned by file.readlines() are of type bytes rather than type str. The membership test 'DELETE ME' in new_line therefore asks whether a str is contained in a bytes object. Python 3 refuses to mix the two types and raises a TypeError, because the in operator on a bytes object only accepts a bytes-like object (or an integer) on its left-hand side.
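The failure can be reproduced without any file at all. The snippet below is a minimal sketch that uses a hard-coded bytes value as a stand-in for one line read in binary mode; the variable names are illustrative only.

```python
# A bytes value standing in for one line read from a file opened with 'rb'.
line = b"please DELETE ME now\n".strip()

try:
    "DELETE ME" in line  # str on the left, bytes on the right
    message = None
except TypeError as exc:
    message = str(exc)

# The membership test works once both operands have the same type.
found_as_bytes = b"DELETE ME" in line
found_as_text = "DELETE ME" in line.decode("utf-8")

print(message)
print(found_as_bytes, found_as_text)
```

Either making the search term a bytes literal or decoding the line to str removes the type mismatch; the fix below takes the second route by letting open() do the decoding.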

To fix the script, remove the b from the mode argument so that the file is read as text (str) rather than as bytes:

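# Open in text mode ('r') so that each line is a str, not bytes.
with open(file_name, 'r') as file:
    lines = [line.strip() for line in file]

new_lines = []
for line in lines:
    new_line = line.lower()
    # new_line has been lowercased, so the marker must be lowercase to match.
    if 'delete me' in new_line:
        print("Skipping a line.")
        continue

    new_lines.append(new_line)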
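In text mode, open() decodes the file with a platform-dependent default encoding unless an encoding argument is given. The sketch below wraps the same logic in a function and passes the encoding explicitly. It assumes the file is UTF-8, and the function name clean_lines, its parameters, and the sample file contents are hypothetical, chosen only so the example runs on its own.

```python
import os
import tempfile


def clean_lines(file_name, marker="delete me", encoding="utf-8"):
    """Return lowercased, stripped lines, skipping any that contain marker."""
    kept = []
    # Text mode with an explicit encoding: every line is a str.
    with open(file_name, "r", encoding=encoding) as file:
        for line in file:
            new_line = line.strip().lower()
            # Compare in lowercase, since new_line has been lowercased.
            if marker.lower() in new_line:
                continue
            kept.append(new_line)
    return kept


# Demo on a throwaway file so the sketch runs anywhere.
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".txt", delete=False
) as tmp:
    tmp.write("Keep This\nplease DELETE ME\nCaf\u00e9 Menu\n")
    path = tmp.name

result = clean_lines(path)
os.remove(path)
print(result)
```

If the real file uses a different encoding, pass that name instead of "utf-8".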

In Python 2, the original code worked because the str type was just bytes, so the lines read from the file and the literal 'DELETE ME' had the same type and could be compared directly. This made it necessary to use a different type, unicode, when dealing with strings that may contain Unicode characters. In Python 3, the old str type has been renamed to bytes and the old unicode type has been renamed to str. While this change introduced an incompatibility between Python 2 and 3, it also greatly improved string handling in Python 3, allowing seamless use of Unicode characters.
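The relationship between the two Python 3 types can be seen by converting between them. This is a small illustrative sketch; the sample word and the choice of UTF-8 are assumptions made for the example.

```python
text = "caf\u00e9"            # str: a sequence of Unicode characters
data = text.encode("utf-8")   # bytes: the encoded form, as stored on disk

print(type(text).__name__, len(text))   # str 4
print(type(data).__name__, len(data))   # bytes 5
print(data)                             # b'caf\xc3\xa9'
print(data.decode("utf-8") == text)     # True
```

The accented character is one character in the str but two bytes in the UTF-8 encoded bytes, which is why the two types are kept separate.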

Binary mode is useful for reading non-text files, such as images or executable binaries, but standard text mode should be preferred for reading text files.
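As an example of a case where binary mode is the right choice, the sketch below checks whether a file begins with the 8-byte PNG signature. The helper name looks_like_png is hypothetical, and the temporary file holds only the signature plus a few filler bytes, not a real image.

```python
import os
import tempfile

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def looks_like_png(file_name):
    """Return True if the file starts with the PNG signature."""
    # Binary mode: read() returns raw bytes with no decoding applied.
    with open(file_name, "rb") as file:
        return file.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


# Demo on a throwaway file so the sketch runs anywhere.
with tempfile.NamedTemporaryFile("wb", suffix=".bin", delete=False) as tmp:
    tmp.write(PNG_SIGNATURE + b"\x00\x01\x02")
    path = tmp.name

is_png = looks_like_png(path)
os.remove(path)
print(is_png)
```

Here both sides of the comparison are bytes, so no TypeError can arise.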

