We recently ran into rapidsai/cudf#6763, which is triggered when writing booleans with nulls. Our integration tests cover writing booleans to ORC, but they did not trigger the issue: so few rows are written that each file contains only a single stripe. If the tests had written enough rows to produce more than one stripe, the bug would have been caught.
Based on the explanation in rapidsai/cudf#6763 (comment) and my own experiment, this error (#11736) is reproduced by generating multiple row groups, not multiple stripes.
The point of this issue is not to focus on the specific ORC boolean bug. It was raised because we realized the integration tests do not cover the cases where a single write produces more than one row group or more than one stripe. We need tests for both. Yes, there is an issue with ORC booleans across multiple row groups in a single write, and we'll add a specific unit test for that once it is fixed. However, there is no test that generates multiple row groups or stripes in a single write, regardless of booleans, and we should have tests that cover that case in general. This issue is about a whole category of testing that has been missed, not the specific boolean failure.
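As a rough illustration of how a test could size its input for the row-group case, here is a minimal sketch. It assumes the ORC default row index stride of 10,000 rows (`orc.row.index.stride`). The helper names are hypothetical and not part of the existing test suite. It does not cover stripes, whose boundaries depend on data size and writer settings rather than on a fixed row count.

```python
import random

# Assumption: the writer uses ORC's default row index stride of 10,000 rows.
DEFAULT_ROW_INDEX_STRIDE = 10_000


def min_rows_for_row_groups(num_groups, stride=DEFAULT_ROW_INDEX_STRIDE):
    """Smallest row count that spans `num_groups` row groups (hypothetical helper)."""
    if num_groups < 1:
        raise ValueError("num_groups must be at least 1")
    return (num_groups - 1) * stride + 1


def row_group_count(num_rows, stride=DEFAULT_ROW_INDEX_STRIDE):
    """Number of row groups needed to hold `num_rows` rows (ceiling division)."""
    return -(-num_rows // stride)


def nullable_booleans(num_rows, null_ratio=0.1, seed=0):
    """Deterministic boolean column with roughly `null_ratio` of the values set to None."""
    rng = random.Random(seed)
    return [
        None if rng.random() < null_ratio else rng.random() < 0.5
        for _ in range(num_rows)
    ]


# Enough rows to cross two row-group boundaries in one write.
rows = min_rows_for_row_groups(3)
column = nullable_booleans(rows)
```

A test built this way would pass `column` to whatever data generator the integration tests already use, then write it in a single ORC write and read it back for comparison.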