H2O version, Operating System and Environment
h2o version 3.46.0.1, Windows 10.0.19045, Python v3.10.14.
I've run into this in several versions of h2o on various versions of Windows and Python. I wasn't able to recreate it in an Ubuntu environment.
Actual and Expected Behavior
When reading a Python dictionary into an h2o dataframe, some rows are duplicated, so the h2o dataframe ends up with more records than the original dictionary. The expected behavior is for the h2o dataframe to have the same number of records as the original dictionary.
Here is an example with random data that recreates the issue, though I unfortunately first ran across it with real data. This example should result in a dataframe with 2,364,350 records, but I end up with 2,364,353:
Here is a screenshot comparing the row count from the h2o dataframe with a pandas dataframe reading the same data:
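For readers without the screenshot, the comparison can be sketched roughly as below. This is not the original reproduction script: the column names, value types, and seed are assumptions chosen for illustration, and only the row count (2,364,350) comes from the report. It relies on `h2o.init()`, `h2o.H2OFrame(...)` and the frame's `nrow` property, plus `pandas.DataFrame`.

```python
import random


def make_data(n_rows, seed=0):
    """Build a column-oriented dict of random values.

    The columns here are hypothetical; the data in the original report is not shown.
    """
    rng = random.Random(seed)
    return {
        "id": list(range(n_rows)),
        "value": [rng.random() for _ in range(n_rows)],
        "group": [rng.choice(["a", "b", "c"]) for _ in range(n_rows)],
    }


def expected_rows(data):
    """Return the number of records in the dict, checking all columns agree."""
    lengths = {len(column) for column in data.values()}
    if len(lengths) != 1:
        raise ValueError("columns have different lengths")
    return lengths.pop()


def compare_row_counts(data):
    """Return (expected, pandas_rows, h2o_rows) for the same dict.

    Requires pandas and h2o to be installed; h2o.init() starts or
    connects to a local cluster.
    """
    import pandas as pd
    import h2o

    h2o.init()
    pandas_frame = pd.DataFrame(data)
    h2o_frame = h2o.H2OFrame(data)
    return expected_rows(data), len(pandas_frame), h2o_frame.nrow
```

Calling `compare_row_counts(make_data(2_364_350))` returns the three counts side by side. If the problem reproduces in a given environment, the h2o count would be larger than the other two; whether this particular random data triggers it is not guaranteed.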