Read a text file into a string and strip newlines in Python

David Y.

The Problem

I have a text file (dna.txt) containing multiple lines, for example:

ATCAGTGGAAACCCAGTGCTA
GAGGATGGAATGACCTTAAAT
CAGGGACGATATTAAACGGAA

Using Python, how do I read it into a string variable as one long line, i.e. removing newlines? I want the final string to look like this:

ATCAGTGGAAACCCAGTGCTAGAGGATGGAATGACCTTAAATCAGGGACGATATTAAACGGAA

The Solution

We can achieve this using the following Python code:

with open("dna.txt", "r") as file:
    dna = file.read().replace("\n", "")

print(dna)

# will print ATCAGTGGAAACCCAGTGCTAGAGGATGGAATGACCTTAAATCAGGGACGATATTAAACGGAA

In the above code:

  • open("dna.txt", "r") opens the file in read mode (r). We use Python’s with statement to automatically close the file at the end of the block.
  • file.read() reads the entire contents of the file into a string.
  • replace("\n", "") is a string method that replaces all newline characters in our string with empty strings.
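The steps above can be wrapped in a small helper. This is only a sketch: the function name read_without_newlines and its replacement parameter are hypothetical, and the sample file is created in a temporary location so the example runs without an existing dna.txt.

```python
import os
import tempfile


def read_without_newlines(path, replacement=""):
    """Return the file's contents with every newline replaced by `replacement`.

    Hypothetical helper for illustration; not part of any library.
    """
    with open(path, "r") as file:
        return file.read().replace("\n", replacement)


# Create a small sample file so the sketch is self-contained.
sample = "ATCAGTGGAAACCCAGTGCTA\nGAGGATGGAATGACCTTAAAT\nCAGGGACGATATTAAACGGAA\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

dna = read_without_newlines(sample_path)
os.remove(sample_path)  # clean up the temporary file

print(dna)
```

Passing a different replacement, for example read_without_newlines(path, " "), would substitute that string for each newline instead of removing it.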

In some cases, we may prefer to replace newlines with other characters, such as a single space. We can do this with a slight modification to the above code:

with open("dna.txt", "r") as file:
    dna = file.read().replace("\n", " ")  # replace newline with space

print(dna)

# will print ATCAGTGGAAACCCAGTGCTA GAGGATGGAATGACCTTAAAT CAGGGACGATATTAAACGGAA

An alternative but less explicit way to produce the same output is to use str.splitlines and str.join. splitlines creates a list containing each line in the file, without the line endings, and join then combines that list into a single string with a specified delimiter. We can use an empty string as the delimiter to remove the newlines entirely:

with open("dna.txt", "r") as file:
    dna = "".join(file.read().splitlines())

print(dna)

# will print ATCAGTGGAAACCCAGTGCTAGAGGATGGAATGACCTTAAATCAGGGACGATATTAAACGGAA

Alternatively, we could join on any other string to separate the lines with it:

with open("dna.txt", "r") as file:
    dna = " ".join(file.read().splitlines())  # separate lines with a single space

print(dna)

# will print ATCAGTGGAAACCCAGTGCTA GAGGATGGAATGACCTTAAAT CAGGGACGATATTAAACGGAA

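For this file, both approaches produce the same output when removing newlines, though they can differ slightly when joining with a space if the file ends with a newline. The second approach may also be harder to follow for readers unfamiliar with Python.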
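One case where the two approaches differ is a file that ends with a newline. A minimal sketch, using an in-memory string (an assumption, standing in for the result of file.read()) rather than dna.txt itself:

```python
# Stand-in for file.read() on a file whose last line ends with a newline.
text = "ATCAGTGGAAACCCAGTGCTA\nGAGGATGGAATGACCTTAAAT\nCAGGGACGATATTAAACGGAA\n"

# Approach 1: every newline, including the final one, becomes a space.
with_replace = text.replace("\n", " ")

# Approach 2: splitlines drops the line endings, so join only adds
# spaces between lines and nothing after the last one.
with_join = " ".join(text.splitlines())

print(repr(with_replace))
print(repr(with_join))
print(with_replace == with_join + " ")
```

If the trailing space matters, calling .strip() on the string before replace, or using the splitlines and join version, avoids it.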
