Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Python 3.11 to GHA #1090

Merged
merged 20 commits into from
Jun 12, 2024
Merged

Add Python 3.11 to GHA #1090

merged 20 commits into from
Jun 12, 2024

Conversation

gliptak
Copy link
Contributor

@gliptak gliptak commented Feb 2, 2024

No description provided.

@gliptak gliptak requested a review from a team as a code owner February 2, 2024 18:11
@taylorfturner taylorfturner changed the base branch from main to dev February 2, 2024 19:15
@taylorfturner
Copy link
Contributor

@gliptak rebased onto dev so that all the branches are going into same branch prior to deployment to main

@gliptak
Copy link
Contributor Author

gliptak commented Feb 2, 2024

@taylorfturner consider setting dev as default UI branch

python-snappy has no Python 3.11 currently intake/python-snappy#129

possible replacement is https://github.com/milesgranger/cramjam/tree/master/cramjam-python

@gliptak
Copy link
Contributor Author

gliptak commented Feb 2, 2024

#1091

@gliptak
Copy link
Contributor Author

gliptak commented Feb 2, 2024

will rebase after #1091 merged

@gliptak
Copy link
Contributor Author

gliptak commented Feb 26, 2024

@taylorfturner taylorfturner changed the base branch from old-old-dev to dev March 7, 2024 12:29
@taylorfturner
Copy link
Contributor

@gliptak rebase onto dev and I'll approve

@gliptak
Copy link
Contributor Author

gliptak commented Mar 7, 2024

@taylorfturner this might already be dev based

#1091 would have to be merged first

* add downloads tile (capitalone#1085)

* Replace snappy with cramjam

* Delete test_no_snappy

---------

Co-authored-by: Taylor Turner <[email protected]>
@taylorfturner
Copy link
Contributor

@gliptak #1091 merged ... rebase this and we'll take a look. Thanks for the contribution! 🎉

@gliptak
Copy link
Contributor Author

gliptak commented Mar 14, 2024

dask packaging changed? https://pypi.org/project/dask/#history

https://github.com/capitalone/DataProfiler/actions/runs/8282560184/job/22663633408?pr=1090

____ ERROR collecting dataprofiler/tests/validators/test_base_validators.py ____
/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/dask/dataframe/__init__.py:22: in _dask_expr_enabled
    import dask_expr  # noqa: F401
E   ModuleNotFoundError: No module named 'dask_expr'

During handling of the above exception, another exception occurred:
dataprofiler/tests/validators/test_base_validators.py:4: in <module>
    from dask import dataframe as dd
/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/dask/dataframe/__init__.py:87: in <module>
    if _dask_expr_enabled():
/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/dask/dataframe/__init__.py:24: in _dask_expr_enabled
    raise ValueError("Must install dask-expr to activate query planning.")
E   ValueError: Must install dask-expr to activate query planning.
=============================== warnings summary ===============================
dataprofiler/tests/profilers/test_histogram_utils.py:35
  /home/runner/work/DataProfiler/DataProfiler/dataprofiler/tests/profilers/test_histogram_utils.py:35: PytestCollectionWarning: cannot collect test class 'TestColumn' because it has a __init__ constructor (from: dataprofiler/tests/profilers/test_histogram_utils.py)
    class TestColumn(NumericStatsMixin):

dataprofiler/tests/profilers/test_numeric_stats_mixin_profile.py:[21](https://github.com/capitalone/DataProfiler/actions/runs/8282560184/job/22663633408?pr=1090#step:6:22)
  /home/runner/work/DataProfiler/DataProfiler/dataprofiler/tests/profilers/test_numeric_stats_mixin_profile.py:21: PytestCollectionWarning: cannot collect test class 'TestColumn' because it has a __init__ constructor (from: dataprofiler/tests/profilers/test_numeric_stats_mixin_profile.py)
    class TestColumn(NumericStatsMixin):

dataprofiler/tests/profilers/test_numeric_stats_mixin_profile.py:[34](https://github.com/capitalone/DataProfiler/actions/runs/8282560184/job/22663633408?pr=1090#step:6:35)
  /home/runner/work/DataProfiler/DataProfiler/dataprofiler/tests/profilers/test_numeric_stats_mixin_profile.py:34: PytestCollectionWarning: cannot collect test class 'TestColumnWProps' because it has a __init__ constructor (from: dataprofiler/tests/profilers/test_numeric_stats_mixin_profile.py)
    class TestColumnWProps(TestColumn):

@taylorfturner
Copy link
Contributor

@gliptak yeah I just started seeing this yesterday due to the package change by dask on the 12th. Haven't had the bandwidth to research why. I'd imagine a simple tag to not allow for this version would be a temporary fix to unblock

@taylorfturner
Copy link
Contributor

seeing on #1115 too from @carlsonp

@gliptak
Copy link
Contributor Author

gliptak commented Mar 14, 2024

@gliptak
Copy link
Contributor Author

gliptak commented Mar 14, 2024

@taylorfturner corrected dask modules install

now there is a Keras(?) error https://github.com/capitalone/DataProfiler/actions/runs/8283138450/job/22665615645?pr=1090

Copy link
Contributor

@taylorfturner taylorfturner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rebase onto dev

@gliptak
Copy link
Contributor Author

gliptak commented Jun 7, 2024

dask/dask#11038

The advised solution is to upgrade to Dask version 2024.4.1.

@taylorfturner am I to proceed with Dask bump as per above?

https://github.com/capitalone/DataProfiler/actions/runs/9421219740/job/25954837467?pr=1090

=========================== short test summary info ============================
ERROR dataprofiler/tests/validators/test_base_validators.py - TypeError: descriptor '__call__' for 'type' objects doesn't apply to a 'property' object
!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!

auto-merge was automatically disabled June 9, 2024 18:04

Head branch was pushed to by a user without write access

@gliptak
Copy link
Contributor Author

gliptak commented Jun 9, 2024

https://github.com/capitalone/DataProfiler/actions/runs/9438503984/job/25995558872?pr=1090

___________________ TestEvaluateAccuracy.test_save_conf_mat ____________________
self = <dataprofiler.tests.labelers.test_labeler_utils.TestEvaluateAccuracy testMethod=test_save_conf_mat>
mock_dataframe = <MagicMock name='DataFrame' id='140514062390544'>
mock_report = <MagicMock name='classification_report' id='[140](https://github.com/capitalone/DataProfiler/actions/runs/9438503984/job/25995558872?pr=1090#step:6:141)514060067088'>
    @mock.patch("dataprofiler.labelers.labeler_utils.classification_report")
    @mock.patch("pandas.DataFrame")
    def test_save_conf_mat(self, mock_dataframe, mock_report):
    
        # ideally mock out the actual contents written to file, but
        # would be difficult to get this completely worked out.
        expected_conf_mat = np.array(
            [
                [1, 0, 1],
                [1, 0, 0],
                [0, 1, 2],
            ]
        )
        expected_row_col_names = dict(
            columns=["pred:PAD", "pred:UNKNOWN", "pred:OTHER"],
            index=["true:PAD", "true:UNKNOWN", "true:OTHER"],
        )
>       mock_instance_df = mock.Mock(spec=pd.DataFrame)()
dataprofiler/tests/labelers/test_labeler_utils.py:255: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/unittest/mock.py:1106: in __init__
    _safe_super(CallableMixin, self).__init__(
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/unittest/mock.py:457: in __init__
    self._mock_add_spec(spec, spec_set, _spec_as_instance, _eat_self)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <[AttributeError('_mock_methods') raised in repr()] Mock object at 0x7fcc6aa22410>
spec = <MagicMock name='DataFrame' id='140514062390544'>, spec_set = None
_spec_as_instance = False, _eat_self = False
    def _mock_add_spec(self, spec, spec_set, _spec_as_instance=False,
                       _eat_self=False):
        if _is_instance_mock(spec):
>           raise InvalidSpecError(f'Cannot spec a Mock object. [object={spec!r}]')
E           unittest.mock.InvalidSpecError: Cannot spec a Mock object. [object=<MagicMock name='DataFrame' id='140514062390544'>]
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/unittest/mock.py:508: InvalidSpecError

@gliptak gliptak mentioned this pull request Jun 9, 2024
python-snappy>=0.5.4,
cramjam>=2.7.0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be reverted 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should match requirements.txt

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python-snappy>=0.7.1 supports Python 3.11 by switching over to cramjam and removing libsnappy-dev dependency (I contributed intake/python-snappy#130)

I will work dependencies in a minute

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we reverted these cramjam changes due to failing tests locally though in #1122.

  • do we have to include cramjam + 3.11 in same pr?
  • I'm game for the 3.11 add
  • is the test suite running locally for you using DATAPROFILER_SEED=0 python3 -m unittest discover -p "test*.py" and tox -e py311?

dataprofiler/__init__.py Show resolved Hide resolved
@taylorfturner taylorfturner enabled auto-merge (squash) June 10, 2024 14:59
@gliptak
Copy link
Contributor Author

gliptak commented Jun 10, 2024

this is the above outstanding test fail python/cpython#87644

mock_instance_df = mock.Mock(spec=pd.DataFrame)()

@taylorfturner
Copy link
Contributor

this is the above outstanding test fail python/cpython#87644

mock_instance_df = mock.Mock(spec=pd.DataFrame)()

I see -- you are welcome to propose a fix for this as part of this PR (instead of a separate PR). If you get something operational, we can include this in the 0.12.0 release; otherwise, I will need to deploy without. Thanks, @gliptak!

auto-merge was automatically disabled June 10, 2024 20:51

Head branch was pushed to by a user without write access

@gliptak
Copy link
Contributor Author

gliptak commented Jun 10, 2024

@taylorfturner I rewrote the test and it ran green locally. please review

also let me know if separate bump PRs would work better (and guide on how you would like to split)

python-snappy>=0.5.4,
cramjam>=2.7.0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we reverted these cramjam changes due to failing tests locally though in #1122.

  • do we have to include cramjam + 3.11 in same pr?
  • I'm game for the 3.11 add
  • is the test suite running locally for you using DATAPROFILER_SEED=0 python3 -m unittest discover -p "test*.py" and tox -e py311?

@@ -55,7 +55,7 @@ repos:
pyarrow>=1.0.1,
chardet>=3.0.4,
fastavro>=1.0.0.post1,
python-snappy>=0.5.4,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why remove here? it is kept in requirements.txt file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

synced with requirements

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome

@taylorfturner taylorfturner enabled auto-merge (squash) June 11, 2024 12:40
@taylorfturner
Copy link
Contributor

@taylorfturner I rewrote the test and it ran green locally. please review

also let me know if separate bump PRs would work better (and guide on how you would like to split)

Thanks, @gliptak !

  • keep this PR but only do 3.11 add and the test update
  • remove the cramjam / snappy(?)

auto-merge was automatically disabled June 11, 2024 13:39

Head branch was pushed to by a user without write access

@gliptak
Copy link
Contributor Author

gliptak commented Jun 11, 2024

python-snappy>=0.7.1 bump is required for Python 3.11

@@ -55,7 +55,7 @@ repos:
pyarrow>=1.0.1,
chardet>=3.0.4,
fastavro>=1.0.0.post1,
python-snappy>=0.5.4,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome

dataprofiler/__init__.py Show resolved Hide resolved
dataprofiler/tests/test_data_profiler.py Show resolved Hide resolved
@taylorfturner taylorfturner enabled auto-merge (squash) June 11, 2024 17:54
@taylorfturner taylorfturner merged commit 4e4450a into capitalone:dev Jun 12, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants