use string dtype for time for now
related issue #124
semio committed Jul 12, 2020
1 parent 2c29497 commit eed2e69
Showing 1 changed file with 6 additions and 3 deletions.
9 changes: 6 additions & 3 deletions ddf_utils/model/ddf.py
@@ -220,9 +220,12 @@ def data(self):
 cols = [*self.dimensions, self.id]
 df = dd.read_csv(self.path, usecols=cols, **self.read_csv_options)
 # handling time columns
-for k, v in self.concept_types.items():
-    if v == 'time':
-        df[k] = parse_time_series(df[k], engine='dask')
+# Because df.query() performance is poor when the df contains pd.Period dtype,
+# we decided not to parse them, which will result in string dtype
+# TODO: use Period when performance becomes better
+# for k, v in self.concept_types.items():
+#     if v == 'time':
+#         df[k] = parse_time_series(df[k], engine='dask')
 return df
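
For callers affected by this change, the following is a minimal sketch of working with time columns that stay as strings. It uses plain pandas rather than dask, and the column names, the sample values, and the `to_period` helper are all hypothetical; none of them come from ddf_utils.

```python
import pandas as pd

# Hypothetical datapoints table; column names and values are illustrative only.
df = pd.DataFrame({
    "geo": ["swe", "swe", "nor", "nor"],
    "year": ["2018", "2019", "2018", "2019"],  # time kept as strings
    "population": [10.2, 10.3, 5.3, 5.4],
})

# With a string time column, query() must compare against a string literal.
recent = df.query("year == '2019'")


def to_period(series, freq="Y"):
    """Hypothetical helper: convert a string time column to periods on demand."""
    return pd.PeriodIndex(series.tolist(), freq=freq)


# Convert only where period arithmetic is actually needed.
periods = to_period(recent["year"])
```

Note that range filters on strings (for example `year >= '2019'`) compare lexicographically, so they match chronological order only when every value has the same fixed-width format.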

