@yjennyli (Contributor) commented Dec 10, 2025

closes #59186

When reading Excel files with mixed data types (strings, dates, numbers in the same column), the calamine and openpyxl engines returned different Python types for datetime values:

| Engine | Returned type |
| --- | --- |
| openpyxl | datetime.datetime |
| calamine | pd.Timestamp |

This is confusing for users who expect consistent behavior regardless of which engine they use.

This PR changes the calamine reader to return standard library types (datetime.datetime, datetime.timedelta) instead of pandas types (pd.Timestamp, pd.Timedelta), matching openpyxl's behavior.

Testing:
- All existing Excel reader tests pass (1467 passed)
- Manual testing confirms that both engines now return the same types

…aders (pandas-dev#59186)

The calamine Excel reader was returning pd.Timestamp and pd.Timedelta
for datetime values in mixed-type columns, while openpyxl returns
standard library datetime.datetime and datetime.timedelta objects.

This change modifies the calamine reader to return standard library
datetime types, ensuring consistent behavior across Excel reader engines.

- Return datetime.datetime as-is instead of converting to pd.Timestamp
- Convert date objects to datetime to match openpyxl behavior
- Return datetime.timedelta as-is instead of converting to pd.Timedelta
- Remove unused pandas import
- Update test helper get_exp_unit() to reflect new consistent behavior
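
The conversion rules listed above can be sketched with the standard library alone. This is a minimal illustration, not the merged pandas code: the function name `convert_cell_sketch` is hypothetical, and promoting a plain date to midnight is an assumption about what "convert date objects to datetime" means here.

```python
from datetime import date, datetime, time, timedelta


def convert_cell_sketch(value):
    """Hypothetical stand-in for the calamine reader's cell conversion."""
    # datetime is a subclass of date, so it has to be checked first.
    if isinstance(value, datetime):
        return value  # returned as-is, not wrapped in pd.Timestamp
    if isinstance(value, date):
        # Assumption: a plain date is promoted to a datetime at midnight.
        return datetime.combine(value, time.min)
    if isinstance(value, timedelta):
        return value  # returned as-is, not wrapped in pd.Timedelta
    return value
```

With this sketch, `convert_cell_sketch(date(2024, 1, 2))` yields a `datetime.datetime` at midnight, while strings and numbers pass through unchanged.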
@yjennyli yjennyli requested a review from rhshadrach as a code owner December 10, 2025 07:33
@yjennyli yjennyli marked this pull request as draft December 10, 2025 15:20
@yjennyli yjennyli marked this pull request as ready for review December 10, 2025 15:21
Comment on lines 115 to 118

         elif isinstance(value, timedelta):
    -        return pd.Timedelta(value)
    +        # Return datetime.timedelta as-is to match openpyxl behavior (GH#59186)
    +        # (previously returned pd.Timedelta)
    +        return value

Member

Can you combine this branch with the isinstance(value, datetime) branch by combining the isinstance checks?

Contributor Author

Done, thanks!
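
The reviewer's suggestion relies on `isinstance` accepting a tuple of types, so two pass-through branches can collapse into one. The sketch below is illustrative only: `convert_sketch` is a hypothetical name, and the float branch is an assumed example of some other branch, not a claim about the merged code.

```python
from datetime import datetime, timedelta


def convert_sketch(value):
    """Hypothetical converter showing a combined isinstance check."""
    # One branch covers both standard-library types that pass through as-is.
    if isinstance(value, (datetime, timedelta)):
        return value
    # Assumed extra branch, for contrast: integral floats become ints.
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value
```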

@mroeschke mroeschke added the IO Excel read_excel, to_excel label Dec 10, 2025
@mroeschke mroeschke added this to the 3.0 milestone Dec 11, 2025
@mroeschke mroeschke merged commit 447249a into pandas-dev:main Dec 11, 2025
41 checks passed
@mroeschke
Member

Thanks @yjennyli

Development

Successfully merging this pull request may close these issues.

BUG: Difference between calamine and openpyxl readers - columns with mixed data types