Describe the issue: I'm not sure whether this is a fastparquet, pyarrow, or pandas issue, but I noticed that a column with pandas categorical dtype is read back as object dtype when the Parquet file is written by the fastparquet engine and then read by the pyarrow engine. The other three writer/reader combinations preserve the dtype.
Thanks for notifying me, sounds like a metadata parsing thing. Whilst it should be easy to fix, I'm not sure when I will get to it.
Interestingly, with the fastparquet API, you can always assert that a given column should be a category type with categories=, but I don't think pyarrow can do that.
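Until the metadata handling is fixed, one engine-agnostic stopgap is to cast the affected columns back to category after reading. The helper below is only a sketch: the name restore_categoricals is made up for illustration and is not part of pandas, fastparquet, or pyarrow. It also assumes that inferring the categories from the values present is acceptable, so unused categories and any custom ordering from the original column are not recovered.

```python
import pandas as pd


def restore_categoricals(df, columns):
    """Return a copy of df with the named columns cast to category dtype.

    Hypothetical helper: categories are inferred from the values present,
    so categories that never occur in the data are not restored.
    """
    return df.astype({name: "category" for name in columns})


if __name__ == "__main__":
    # Stand-in for a frame whose categorical column came back as plain strings.
    loaded = pd.DataFrame({"col": ["a", "b", "a"], "n": [1, 2, 3]})
    fixed = restore_categoricals(loaded, ["col"])
    print(fixed.dtypes)
```

In practice this would be applied to the result of pd.read_parquet(..., engine="pyarrow") for files written by fastparquet.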
Minimal Complete Verifiable Example:
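The example code is missing from this copy of the issue. What follows is not the reporter's original code, only a minimal sketch of the scenario described: write a categorical column with each engine, then read it back with each engine and report the resulting dtype. The column name, the sample values, and the function name roundtrip_dtypes are assumptions made for illustration. Engines that are not installed are skipped.

```python
import importlib.util
import itertools
import os
import tempfile

import pandas as pd

ENGINES = ("pyarrow", "fastparquet")


def roundtrip_dtypes(engines=ENGINES):
    """Map (writer, reader) engine pairs to the dtype read back for a categorical column."""
    # Only use engines that can actually be imported in this environment.
    available = [e for e in engines if importlib.util.find_spec(e) is not None]
    df = pd.DataFrame({"col": pd.Categorical(["a", "b", "a"])})
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for writer, reader in itertools.product(available, repeat=2):
            path = os.path.join(tmp, f"{writer}_{reader}.parquet")
            df.to_parquet(path, engine=writer)
            loaded = pd.read_parquet(path, engine=reader)
            results[(writer, reader)] = str(loaded["col"].dtype)
    return results


if __name__ == "__main__":
    for (writer, reader), dtype in roundtrip_dtypes().items():
        print(f"written by {writer}, read by {reader}: {dtype}")
```

With both engines installed this prints four lines, one per writer/reader combination. The combination reported in this issue is the one written by fastparquet and read by pyarrow.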
Anything else we need to know?:
Environment: