seq() converts pandas DataFrame into Sequence #168

Open
Arshaku opened this issue Nov 30, 2021 · 8 comments

Comments

@Arshaku

Arshaku commented Nov 30, 2021

from functional import seq
from pandas import DataFrame

df = DataFrame({'col1': [1,2,3], 'col2': [4,5,6]})
s = seq([df])
el = s.first()
print(type(el))

This code prints "<class 'functional.pipeline.Sequence'>",
but the expected output is "<class 'pandas.core.frame.DataFrame'>".

@EntilZha
Owner

This is related to #158. The root issue is that, for convenience, I originally decided to wrap a little too aggressively. I think the fix would take two steps: (1) add a configurable option to not wrap elements, with wrapping still the default, and (2) bump the version to 2.x and make not wrapping the default, to avoid breakage. I'd be open to a PR that does this.
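
A minimal sketch of what step (1) could look like, using a toy `MiniSeq` class rather than PyFunctional itself. The class and the `wrap_elements` option name are hypothetical and only illustrate the idea of an opt-out flag; they are not part of the library's API:

```python
from collections.abc import Iterable


class MiniSeq:
    """Toy stand-in for a wrapped sequence; NOT PyFunctional's Sequence."""

    def __init__(self, items, wrap_elements=True):
        self._items = list(items)
        # Hypothetical option: True mimics the current default, False opts out.
        self._wrap_elements = wrap_elements

    def _maybe_wrap(self, value):
        # Wrap iterable elements only when the option is on; leave strings,
        # bytes, dicts, sets and non-iterables untouched.
        if (
            self._wrap_elements
            and isinstance(value, Iterable)
            and not isinstance(value, (str, bytes, dict, set))
        ):
            return MiniSeq(value, self._wrap_elements)
        return value

    def first(self):
        return self._maybe_wrap(self._items[0])

    def __getitem__(self, index):
        return self._maybe_wrap(self._items[index])


rows = [(1, 2), (3, 4)]
wrapped = MiniSeq(rows).first()                   # element comes back as a MiniSeq
raw = MiniSeq(rows, wrap_elements=False).first()  # element comes back unchanged
print(type(wrapped).__name__, type(raw).__name__)
```

With the flag off, both `first()` and `__getitem__()` hand back the original element, which is the behavior the issue asks for with a DataFrame.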

@reklanirs

I see a similar issue during reduce with the latest master.

Expected:

from functools import reduce

reduce(lambda x,y: x.add(y), [df,df])

Out[1]:
  A     B
0  24.0  14.0
1   8.0   4.0

In fact:

seq([df,df]).reduce(lambda x,y: x.add(y))

Out[2]:
[array([24., 14.]), array([8., 4.])]

to_pandas can give some help but the column names will be missing:

seq([df,df]).reduce(lambda x,y: x.add(y)).to_pandas()

Out[3]:
      0     1
0  24.0  14.0
1   8.0   4.0

I hope it can be fixed.

@stale

stale bot commented Feb 11, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Feb 11, 2022
@Arshaku
Author

Arshaku commented Feb 11, 2022

Hi, any chance this issue will be fixed soon?

@stale stale bot removed the stale label Feb 11, 2022
@EntilZha
Owner

I don't have the bandwidth to contribute fixes myself right now, but I'd welcome and review pull requests that fix it roughly as previously outlined.

@swiergot
Contributor

@EntilZha Any reason why __getitem__() also wraps?

@EntilZha
Owner

The reason I originally did it this way is the same. I wanted it to be easy to do something like:

In [1]: from functional import seq

In [2]: seq.range(10).grouped(3)[0].map(lambda x: x * 2)
Out[2]: [0, 2, 4]

In [3]: type(seq.range(10).grouped(3)[0].map(lambda x: x * 2))
Out[3]: functional.pipeline.Sequence

As I mentioned in my prior comments, in retrospect this has three issues: (1) there is no way to configure the behavior, namely to disable it; (2) even if it were configurable, I think it's probably incorrect to wrap by default in most cases, and it should probably wrap more sparingly; and (3) changing this is a breaking change, likely requiring a move to 2.x.

I'd welcome/review PRs that would fix this, but don't have the time to do it myself right now. If you are interested, I can outline how I'd do this in a little more detail.

Thanks!

@swiergot
Contributor

@reklanirs

In fact:

seq([df,df]).reduce(lambda x,y: x.add(y))

Out[2]:
[array([24., 14.]), array([8., 4.])]

This is a different problem, caused by the fact that PyFunctional has special handling for DataFrame: for some reason it extracts the values from it.

to_pandas can give some help but the column names will be missing:

seq([df,df]).reduce(lambda x,y: x.add(y)).to_pandas()

Out[3]:
      0     1
0  24.0  14.0
1   8.0   4.0

How about this:

>>> seq(reduce(lambda x,y: x.add(y), [df,df])).to_pandas(df.columns)
   col1  col2
0     2     8
1     4    10
2     6    12
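
The value extraction described above can be reproduced with pandas alone. The sketch below assumes the special handling amounts to iterating over the frame's `.values` array, which would explain why the result is a list of row arrays and why the column labels have to be passed back in:

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6]})
total = df.add(df)

# Iterating over .values yields one NumPy array per row; the labels are gone.
rows = list(total.values)
print(rows)

# Supplying the original column labels rebuilds a labeled frame.
restored = pd.DataFrame(rows, columns=df.columns)
print(restored)
```

This mirrors the `to_pandas(df.columns)` workaround: the data survive the round trip, but the column names have to be supplied again.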
