Change a column type in a DataFrame in Python Pandas

David Y.

The Problem

How can I change the data type of a column in a Pandas DataFrame?

The Solution

There are a few different ways to do this in Pandas. Which one to use will depend on the data types we’re converting from and to.

  1. If we want to convert a column from any data type to one specific data type (e.g. integer, float, string), we should use the astype method.
  2. If we want to convert a column to a sensible numeric data type (integer or float), we should use the to_numeric function.
  3. If we want Pandas to decide which data types to use for each column, we should use the convert_dtypes method.

Each of these methods is detailed in the subsections below.
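As a quick orientation before the details, the minimal sketch below applies all three approaches to the same column of numeric strings and prints the resulting data type. The Series `s` and the variable names are made up for illustration.

```python
import pandas as pd

# A hypothetical column of numeric strings
s = pd.Series(['1', '2', '3'])

# 1. astype: we choose the target type explicitly
as_float = s.astype(float)

# 2. to_numeric: Pandas picks a numeric type based on the values
as_numeric = pd.to_numeric(s)

# 3. convert_dtypes: Pandas picks a type that supports pd.NA,
#    but does not parse the strings into numbers
inferred = s.convert_dtypes()

print(as_float.dtype)
print(as_numeric.dtype)
print(inferred.dtype)
```

Only the first two calls produce numeric columns here; the third keeps the values as strings.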

1. Conversion with astype

The first and most versatile method to use is the astype method. When called on a Pandas DataFrame or Series, this method will attempt to cast the values within to the specified type. We can use this method to change the type of one or more columns at a time, as shown in the example below:

    import pandas as pd

    # Create and print DataFrame
    df = pd.DataFrame({
        'A': ['1', '2', '3'],
        'B': ['4', '5', '6'],
        'C': ['7', '8', '9']
    })
    print(df)

    # Print data types of each column in DataFrame
    print("\n")
    print(df.dtypes)

    # Change column A's values to floats
    df['A'] = df['A'].astype(float)

    # Change column B and C's values to integers
    df = df.astype({'B': int, 'C': int})

    print("\nConverted:\n")

    # Print altered DataFrame
    print(df)

    # Print data types of each column in DataFrame
    print("\n")
    print(df.dtypes)

This script produces the following output:

       A  B  C
    0  1  4  7
    1  2  5  8
    2  3  6  9


    A    object
    B    object
    C    object
    dtype: object

    Converted:

         A  B  C
    0  1.0  4  7
    1  2.0  5  8
    2  3.0  6  9


    A    float64
    B      int64
    C      int64
    dtype: object
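Because astype only attempts the cast, it can fail when a value does not fit the target type. The sketch below, using a made-up Series named `mixed`, shows what that looks like: the call raises a ValueError and the original data is left as it was.

```python
import pandas as pd

# A hypothetical column where one value is not a valid integer
mixed = pd.Series(['1', '2', 'three'])

try:
    mixed.astype(int)
    cast_failed = False
except ValueError as exc:
    # astype raises instead of silently skipping the bad value
    cast_failed = True
    print(f"astype could not cast the column: {exc}")

# The original Series still holds the uncast strings
print(mixed.tolist())
```

Wrapping the cast in a try/except like this is one way to detect dirty data before it reaches later processing steps.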

2. Conversion with to_numeric

If we want to convert a column to a numeric type without choosing the type ourselves, we can use the to_numeric function. It infers the type from the data: a column of whole numbers becomes int64, and a column containing decimal values becomes float64.

    import pandas as pd

    # Create and print DataFrame
    df = pd.DataFrame({
        'A': ['1', '2', '3'],
        'B': ['4.0', '5.1', '6.2'],
        'C': ['7', '8', '9']
    })
    print(df)

    # Print data types of each column in DataFrame
    print("\n")
    print(df.dtypes)

    # Change column A's values to a numeric type
    df['A'] = pd.to_numeric(df['A'])

    # Change column B and C's values to a numeric type
    df[['B', 'C']] = df[['B', 'C']].apply(pd.to_numeric)

    print("\nConverted:\n")

    # Print altered DataFrame
    print(df)

    # Print data types of each column in DataFrame
    print("\n")
    print(df.dtypes)

This script produces the following output:

       A    B  C
    0  1  4.0  7
    1  2  5.1  8
    2  3  6.2  9


    A    object
    B    object
    C    object
    dtype: object

    Converted:

       A    B  C
    0  1  4.0  7
    1  2  5.1  8
    2  3  6.2  9


    A      int64
    B    float64
    C      int64
    dtype: object

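Unlike astype, to_numeric works on a single Series (or other one-dimensional input) rather than a whole DataFrame, so we must use the DataFrame's apply method if we want to convert multiple columns at once.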
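Real data often contains values that cannot be parsed as numbers. By default to_numeric raises an error in that case, but its errors parameter can be set to 'coerce' to turn unparseable values into NaN instead. The sketch below combines this with apply, which forwards extra keyword arguments to the function it calls. The column names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical data with a few values that are not numbers
df = pd.DataFrame({
    'price': ['1.50', '2.25', 'n/a'],
    'qty': ['3', 'unknown', '5']
})

# errors='coerce' is passed through apply to pd.to_numeric,
# so unparseable strings become NaN rather than raising
cleaned = df[['price', 'qty']].apply(pd.to_numeric, errors='coerce')

print(cleaned)
print(cleaned.dtypes)
```

Note that a column containing NaN is stored as a float column, even if its remaining values are whole numbers.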

3. Conversion with convert_dtypes

The convert_dtypes DataFrame method will convert all columns to the best possible types that support pd.NA, the Pandas missing value. Note that this method will not convert numeric strings to integers or floats.

    import pandas as pd

    # Create and print DataFrame
    df = pd.DataFrame({
        'A': [1.0, 2.0, 5.3],
        'B': ['z', 'x', 'c'],
        'C': [7, 8, 4],
        'D': ['1', '2', '3']
    })
    print(df)

    # Print data types of each column in DataFrame
    print("\n")
    print(df.dtypes)

    # Change all columns to the appropriate types
    df = df.convert_dtypes()

    print("\nConverted:\n")

    # Print altered DataFrame
    print(df)

    # Print data types of each column in DataFrame
    print("\n")
    print(df.dtypes)

This script produces the following output:

         A  B  C  D
    0  1.0  z  7  1
    1  2.0  x  8  2
    2  5.3  c  4  3


    A    float64
    B     object
    C      int64
    D     object
    dtype: object

    Converted:

         A  B  C  D
    0  1.0  z  7  1
    1  2.0  x  8  2
    2  5.3  c  4  3


    A    Float64
    B     string
    C      Int64
    D     string
    dtype: object
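Since convert_dtypes leaves numeric strings as strings, one possible approach is to parse those columns with to_numeric first and then call convert_dtypes. The sketch below does this and also includes a column with a missing value to show the pd.NA support. The column names and data are made up for illustration.

```python
import pandas as pd

# Hypothetical data: D holds numeric strings, E has a missing value
df = pd.DataFrame({
    'D': ['1', '2', '3'],
    'E': [1.0, None, 3.0]
})

# Parse the numeric strings before letting Pandas pick the dtypes
df['D'] = pd.to_numeric(df['D'])

converted = df.convert_dtypes()

print(converted)
print(converted.dtypes)
```

Column E starts out as a float column because of the missing value. After conversion it uses the nullable Int64 type, with pd.NA in place of NaN.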