Converting Data Types with pandas: to_numeric and to_datetime

pandas.to_numeric

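Converts the argument to a numeric type. The result is float64 or int64 by default, depending on the data; use the downcast parameter to request a smaller dtype. Precision may be lost for very large numbers: values below the int64 minimum or above the uint64 maximum cannot be stored as integers in an ndarray and are likely to be converted to float.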

Syntax

pandas.to_numeric(arg, errors='raise', downcast=None, dtype_backend=_NoDefault.no_default)

Parameters

arg (required) The data to convert: scalar, list, tuple, 1-d array, or Series.

errors (optional, default 'raise') Controls how non-numeric values are handled:

  • 'raise': Raises an exception on invalid parsing
  • 'coerce': Sets invalid values to NaN
  • 'ignore': Returns the original input unchanged (deprecated since pandas 2.2; catch the exception explicitly instead)
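
As a minimal sketch of how the first two settings differ, the snippet below feeds the same Series to both. The sample values and variable names are illustrative only.

```python
import pandas as pd

raw = pd.Series(['1', '2.5', 'oops'])

# errors='raise' (the default) stops at the first value it cannot parse.
try:
    pd.to_numeric(raw)
    raised = False
except ValueError:
    raised = True

# errors='coerce' converts what it can and marks the rest as NaN.
coerced = pd.to_numeric(raw, errors='coerce')

print(raised)
print(coerced.isna().tolist())
```

Coercing is convenient for messy input, but it hides bad values, so counting the NaN entries afterwards is usually worth the extra line.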

downcast (optional, default None) Downcasts to the smallest possible numeric dtype:

  • 'integer' or 'signed': Smallest signed int (min: int8)
  • 'unsigned': Smallest unsigned int (min: uint8)
  • 'float': Smallest float (min: float32)
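
The following sketch shows what each downcast option might produce for small sample data. The values are chosen for illustration: 300 does not fit in 8 bits, so the integer results are 16 bits wide.

```python
import pandas as pd

values = pd.Series([1, 2, 300])

signed = pd.to_numeric(values, downcast='integer')
unsigned = pd.to_numeric(values, downcast='unsigned')
as_float = pd.to_numeric(pd.Series([1.5, 2.5]), downcast='float')

# A negative value cannot be stored in an unsigned dtype,
# so the data keeps its signed integer dtype.
with_negative = pd.to_numeric(pd.Series([-1, 2]), downcast='unsigned')

print(signed.dtype, unsigned.dtype, as_float.dtype, with_negative.dtype)
```

Downcasting happens after the values are converted, and it is skipped silently when the requested dtype cannot hold the data.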

dtype_backend (optional, default unset; added in pandas 2.0) Selects a nullable back end for the result. When left unset, the result uses ordinary NumPy dtypes:

  • 'numpy_nullable': Returns nullable NumPy-backed dtypes such as Int64 or Float64
  • 'pyarrow': Returns pyarrow-backed nullable ArrowDtype data
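
A small sketch of the difference a back end makes when the data contains a missing value. It assumes pandas 2.0 or later; the object-dtype Series is just sample data.

```python
import pandas as pd

raw = pd.Series([1, 2, None], dtype=object)

# Without dtype_backend, the missing value forces a float result (NaN).
default_result = pd.to_numeric(raw)

# With 'numpy_nullable', integers can stay integers and the gap becomes <NA>.
nullable_result = pd.to_numeric(raw, dtype_backend='numpy_nullable')

print(default_result.dtype)
print(nullable_result.dtype)
```

The 'pyarrow' option works the same way but requires the pyarrow package to be installed.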

Return Value

Returns numeric data. The type depends on input:

  • Scalar → scalar
  • List/tuple → ndarray
  • Series → Series with numeric dtype
  • DataFrame → not accepted directly (raises TypeError); convert column by column, for example with DataFrame.apply
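
To make the return types concrete, here is a short sketch that checks what comes back for a scalar, a list, and a Series. The inputs are illustrative.

```python
import numpy as np
import pandas as pd

scalar_out = pd.to_numeric('3.14')
array_out = pd.to_numeric(['1', '2', '3'])
series_out = pd.to_numeric(pd.Series(['1', '2', '3']))

print(type(scalar_out).__name__)
print(type(array_out).__name__)
print(type(series_out).__name__)
```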

Usage Examples

Converting a Series with mixed content:

import pandas as pd

mixed_data = pd.Series(['100', '200', 'xyz', '400'])
result = pd.to_numeric(mixed_data, errors='coerce')
print(result)

Output:

0    100.0
1    200.0
2      NaN
3    400.0
dtype: float64

Converting multiple columns in a DataFrame:

import pandas as pd

data_frame = pd.DataFrame({
    'col_a': ['50', '75', 'invalid', '100'],
    'col_b': ['1.5', '2.7', '3.9', '4.2']
})

converted = data_frame.apply(pd.to_numeric, errors='coerce')
print(converted)

Output:

   col_a  col_b
0   50.0    1.5
1   75.0    2.7
2    NaN    3.9
3  100.0    4.2

Downcasting to reduce memory footprint:

import pandas as pd

large_values = pd.Series([1, 2, 3, 4, 5])
compact = pd.to_numeric(large_values, downcast='integer')
print(compact.dtype)

Output:

int8

pandas.to_datetime

Converts the argument to datetime64[ns] type. Accepts scalars, arrays, Series, or DataFrames containing date-like strings, timestamps, or integers representing time values.

Syntax

pandas.to_datetime(arg, errors='raise', dayfirst=False, yearfirst=False, utc=False, format=None, exact=_NoDefault.no_default, unit=None, infer_datetime_format=_NoDefault.no_default, origin='unix', cache=True)

Parameters

arg (required) The object to convert: int, float, str, datetime, list, tuple, 1-d array, Series, DataFrame, or dict-like.

errors (optional, default 'raise')

  • 'raise': Raises exception on invalid parsing
  • 'coerce': Sets invalid values to NaT
  • 'ignore': Returns original input unchanged (deprecated since pandas 2.2)

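dayfirst (optional, default False) When True, parses ambiguous dates with the day first. For example, '10/11/12' becomes 2012-11-10 (November 10, 2012). The setting is a preference, not a strict rule: a string that cannot be read day-first is still parsed month-first.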

yearfirst (optional, default False) When True, parses ambiguous dates with the year first. For example, '10/11/12' becomes 2010-11-12.
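
A quick sketch of the dayfirst behavior, using the ambiguous string from the description above. Only the default and dayfirst=True are compared here.

```python
import pandas as pd

ambiguous = '10/11/12'

# Default: month first, so this reads as October 11, 2012.
month_first = pd.to_datetime(ambiguous)

# dayfirst=True: day first, so this reads as November 10, 2012.
day_first = pd.to_datetime(ambiguous, dayfirst=True)

print(month_first.date())
print(day_first.date())
```

When the layout of the input is known, passing an explicit format is generally more predictable than relying on these flags.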

utc (optional, default False) When True, always returns timezone-aware UTC timestamps: timezone-naive inputs are localized as UTC, and timezone-aware inputs are converted to UTC.
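
The sketch below illustrates the effect of utc=True on a naive string and on a string that carries its own offset. The timestamps are made up for the example.

```python
import pandas as pd

naive = pd.to_datetime('2024-03-20 12:00')
aware = pd.to_datetime('2024-03-20 12:00', utc=True)

# An input with a +02:00 offset is shifted to the equivalent UTC time.
shifted = pd.to_datetime('2024-03-20 12:00+02:00', utc=True)

print(naive.tz)
print(aware)
print(shifted)
```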

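format (optional, default None) strftime format string used to parse the input, for example '%d/%m/%Y'. Use 'ISO8601' to parse any ISO 8601 string (the elements need not share one exact layout) or 'mixed' to infer the format for each element individually, which is slower and riskier with ambiguous dates.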
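
As a rough illustration of the three styles of format value, the following sketch parses an explicit layout, a list of ISO 8601 strings with different precision, and a list with mixed layouts. It assumes pandas 2.0 or later, where 'ISO8601' and 'mixed' are available.

```python
import pandas as pd

# Explicit strftime layout.
explicit = pd.to_datetime('20/03/2024 14:30', format='%d/%m/%Y %H:%M')

# ISO 8601 strings that do not share one exact layout.
iso = pd.to_datetime(['2024-03-20', '2024-03-20T14:30:00'], format='ISO8601')

# Different layouts per element, inferred one at a time.
mixed = pd.to_datetime(['2024-03-20', 'March 21, 2024'], format='mixed')

print(explicit)
print(list(iso))
print(list(mixed))
```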

exact (optional, default True) When True, the whole string must match format exactly. When False, format may match anywhere in the target string. Cannot be combined with format='ISO8601' or format='mixed'.
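
A minimal sketch of exact in action, assuming a date embedded in surrounding text. The sample string and the non-ISO format are chosen for illustration.

```python
import pandas as pd

text = 'due 20/03/2024 at noon'

# exact=True (the default) rejects the string because of the extra text.
try:
    pd.to_datetime(text, format='%d/%m/%Y')
    strict_failed = False
except ValueError:
    strict_failed = True

# exact=False lets the format match a substring.
loose = pd.to_datetime(text, format='%d/%m/%Y', exact=False)

print(strict_failed)
print(loose)
```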

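unit (optional, default 'ns') Unit of numeric input, counted from origin: 'D', 's', 'ms', 'us', or 'ns'. The signature shows None, which is treated as 'ns'. For example, with unit='ms' and origin='unix', the value is read as milliseconds since the Unix epoch.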

infer_datetime_format (optional, deprecated since pandas 2.0) No longer has any effect; a strict version of format inference is now the default behavior.

origin (optional, default 'unix') Reference date for numeric interpretation:

  • 'unix': Epoch 1970-01-01
  • 'julian': Start of the Julian calendar; Julian day 0 begins at noon on January 1, 4713 BC (unit must be 'D')
  • Timestamp or datetime: Use specified value
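
The sketch below shows how unit and origin combine, first with a custom Timestamp origin and then with Julian day numbers. The day counts are illustrative; 2451544.5 is the Julian date of midnight on January 1, 2000.

```python
import pandas as pd

# Day counts measured from a custom reference date.
since_2000 = pd.to_datetime([0, 1, 366], unit='D',
                            origin=pd.Timestamp('2000-01-01'))

# Julian day number, which requires unit='D'.
from_julian = pd.to_datetime(2451544.5, unit='D', origin='julian')

print(list(since_2000))
print(from_julian)
```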

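cache (optional, default True) When True, converts each unique value once and reuses the result, which can speed up parsing of inputs with many duplicate date strings. The cache is only used when the input has at least 50 values.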

Return Value

  • Scalar → Timestamp (or datetime.datetime)
  • Array-like → DatetimeIndex (or Series of datetime objects)
  • Series → Series of datetime64 dtype
  • DataFrame → Series of datetime64 dtype

Usage Examples

Parsing a single date string:

import pandas as pd

timestamp_str = '2024-03-20'
parsed = pd.to_datetime(timestamp_str)
print(parsed)

Output:

2024-03-20 00:00:00

Converting a list of dates:

import pandas as pd

date_collection = ['2024-01-01', '2024-06-15', '2024-12-31']
index = pd.to_datetime(date_collection)
print(index)

Output:

DatetimeIndex(['2024-01-01', '2024-06-15', '2024-12-31'], dtype='datetime64[ns]', freq=None)

Handling invalid entries gracefully:

import pandas as pd

dates_with_errors = pd.Series(['2024-01-01', 'invalid', '2024-03-15'])
cleaned = pd.to_datetime(dates_with_errors, errors='coerce')
print(cleaned)

Output:

0   2024-01-01
1          NaT
2   2024-03-15
dtype: datetime64[ns]

Specifying a custom day-first format:

import pandas as pd

date_str = '15/06/2024'
# '%d/%m/%Y' already puts the day first, so dayfirst=True is not needed
result = pd.to_datetime(date_str, format='%d/%m/%Y')
print(result)

Output:

2024-06-15 00:00:00

Converting Unix timestamps:

import pandas as pd

epoch_values = pd.Series([1640995200, 1641081600])
timestamps = pd.to_datetime(epoch_values, unit='s')
print(timestamps)

Output:

0   2022-01-01
1   2022-01-02
dtype: datetime64[ns]

Working with DataFrame columns:

import pandas as pd

source = pd.DataFrame({
    'year': [2022, 2023, 2024],
    'month': [1, 2, 3],
    'day': [15, 20, 25]
})
combined = pd.to_datetime(source)
print(combined)

Output:

0   2022-01-15
1   2023-02-20
2   2024-03-25
dtype: datetime64[ns]

Tags: Pandas to_numeric to_datetime data type conversion datetime parsing

Posted on Wed, 17 Jun 2026 16:03:11 +0000 by astaroth