Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Poetry deps w/ python version and platform, and multiple constraints dependencies #592

Conversation

lorepirri
Copy link
Contributor

@lorepirri lorepirri commented Dec 15, 2024

Description

When reading Poetry dependencies from a pyproject.toml, Grayskull does not read the requirements of python version and platform.

Also, it does not support multiple constraints dependencies and it crashes (see #463 #535 and an open PR #523 by @xhochy ).

This PR aims to add some support for these two missing features.

As mentioned in the two issues, one could see grayskull crashing on:

Tests are added, including some basic regression tests for databricks-sql-connector, xypattern, ersilia, and nannyml.

Tagging also for visibility people active in the issues: @xylar @darynwhite @kcpevey @DhanshreeA @thewchan

Known limits

Platform

From Poetry dependencies documentation, the requirement platform seems to have only two values linux and darwin, probably windows but not sure (respectively rendered as linux, osx, and win selectors).

It is unclear if one could use platform = "!=darwin" to express # [not osx].

Python version

python version is rendered naively and differently than for pypi.py and py_base.py (I did not get the function generic_py_ver_to with py2k andpy3k, because a lot going on in that function and I am not familiar with those selectors). If the version is >=3.10 this will simply result in a selector py>=310.

If this is not wanted or bloating the code, I am willing to re-use functions from py_base.py or pypi.py with the right guidance (thanks).

It does expand caret and tilde (e.g. ^3.10 -> py>=310 and py<4), and always removes patch, and also minor if 0 (e.g. 3.11.1 -> py==311, 3.12.0 -> py==312, and 4.0.0 -> py==4).

Multiple version specifiers are at the moment always rendered in and. For example >=3.10.0,<3.11.0 with render py>=310 and py<311. Same as conda-build is doing. Ref: https://docs.conda.io/projects/conda-build/en/latest/resources/package-spec.html#package-match-specifications

If one would want to express <3.7 or >=3.10 should use the pipe operator <3.7|>=3.10. In fact, <3.7|>=3.10 renders py<37 or py>=310 and <3.8|>=3.10,!=3.11 renders py<38 or py>=310 and py!=311 (, and and take precedence over | and or).

It does handle compatible release operator ~= correctly and the wildcard .* suffix in the version as well (not very common though as the resulting selectors are not interesting).

Others

👉 Fixes a minor bug: the compatible release operator ~= in the dependency version could not be used because the parser was jumping to expanding the tilde ~ and crashing on = (or returning the wrong version containing =). Therefore I suspect that the ~= operator was never handled (maybe never used, or nobody reported this bug).

What does not add:

  • The requirement markers, extras, and others, are not supported (ignored). Nevertheless, the ground is set, and it is easier now to add support for this.
  • Expanded dependency specification syntax not touched. I don't know if the current version of grayskull is able to handle it.

xref #463 #535 / PR #523

@lorepirri lorepirri marked this pull request as ready for review December 15, 2024 20:50
@lorepirri lorepirri requested a review from a team as a code owner December 15, 2024 20:50

# handle exact version specifiers correctly
>>> encode_poetry_python_version_to_selector_item("3.8")
"py==38"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pytest is running your doctests here as well, that is why it is failing on ci

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting, I've fixed the examples output format. Thanks for having a look at it!

Copy link
Contributor

@xylar xylar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you currently have or where there should be and, e.g. >=3.10.0,<3.11.0 --> py>=310 and py<311.

# handle caret operator correctly
>>> encode_poetry_python_version_to_selector_item("^3.10")
# renders '>=3.10.0,<4.0.0'
"py>=310 or py<4"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"py>=310 or py<4"
"py>=310 and py<4"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

# handle tilde operator correctly
>>> encode_poetry_python_version_to_selector_item("~3.10")
# renders '>=3.10.0,<3.11.0'
"py>=310 or py<311"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"py>=310 or py<311"
"py>=310 and py<311"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

operator, version = parse_python_version(conda_clause)
version_selector = version.replace(".", "")
conda_selectors.append(f"py{operator}{version_selector}")
selectors = " or ".join(conda_selectors)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
selectors = " or ".join(conda_selectors)
selectors = " and ".join(conda_selectors)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@xylar
Copy link
Contributor

xylar commented Dec 16, 2024

I think it may eventually be important to support ~= but I would be happy with a solution that doesn't support that for now. Same with .*.

Comment on lines 323 to 327
# Split into major, minor, and discard the rest (patch or additional parts)
major, minor, *_ = version.split(".")

# Return only major if minor is "0", otherwise return major.minor
return operator, major if minor == "0" else f"{major}.{minor}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we probably need this to also support:

parse_python_version('>=3')

When I tried that, I see (as I expected I might):

  File "/home/xylar/code/grayskull/poetry-deps-w-python-ver-platform-and-multiple-deps/./test.py", line 6, in <module>
    print(parse_python_version('>=3'))
          ~~~~~~~~~~~~~~~~~~~~^^^^^^^
  File "/home/xylar/code/grayskull/poetry-deps-w-python-ver-platform-and-multiple-deps/grayskull/strategy/parse_poetry_version.py", line 324, in parse_python_version
    major, minor, *_ = version.split(".")
    ^^^^^^^^^^^^^^^^
ValueError: not enough values to unpack (expected at least 2, got 1)

Could you add a test for this situation and make the necessary changes to support a single digit as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added support for 1 digit and three dots case (e.g. 3.8.0.1, never seen for Python version specifier but valid syntax). Thanks for checking that out!

@xylar
Copy link
Contributor

xylar commented Dec 16, 2024

@lorepirri, I should have said before diving into the details. I really appreciate this work a ton! The lack for support for these python selectors has been bothering me a lot for a long time, but I've felt a little overwhelmed when I looked into trying to fix them. This is some great work! Not surprisingly, there are still a few wrinkles to iron out.

@lorepirri
Copy link
Contributor Author

I think it may eventually be important to support ~= but I would be happy with a solution that doesn't support that for now. Same with .*.

@xylar ~= and .* are not much more work to add. I will take some time for it this week.

@lorepirri
Copy link
Contributor Author

I believe you currently have or where there should be and, e.g. >=3.10.0,<3.11.0 --> py>=310 and py<311.

oh yeah... sorry about that! Nice catch. The or got stuck in my head... I worked the whole time with py<37 or py>=310 in mind (mostly coming from the skip: True), then did not test that case.

I'm left wondering how Poetry will interpret <3.7,>=3.10 🤔 is every comma an and, or do they check whether the lower/upper bounds converge? Probably.

@lorepirri
Copy link
Contributor Author

lorepirri commented Dec 17, 2024

@lorepirri, I should have said before diving into the details. I really appreciate this work a ton! The lack for support for these python selectors has been bothering me a lot for a long time, but I've felt a little overwhelmed when I looked into trying to fix them. This is some great work! Not surprisingly, there are still a few wrinkles to iron out.

Thank you for the nice words and the review for this PR @xylar . grayskull started to fail too many times on this (maybe more projects moved to Poetry) and I had to fix it ^^ I wanted to see the traction and if my changes made sense before trying to solve all of the issues with Poetry (I am not familiar with all the code in grayskull and I'm sure I could have reused some functions, e.g. the one for the selectors).

@xylar
Copy link
Contributor

xylar commented Dec 17, 2024

I'm left wondering how Poetry will interpret <3.7,>=3.10 🤔 is every comma an and, or do they check whether the lower/upper bounds converge? Probably.

I don't believe that's valid. That would be >3.7 and >=3.10 and could not be satisfied.

@xylar
Copy link
Contributor

xylar commented Dec 17, 2024

(I am not familiar with all the code in grayskull and I'm sure I could have reused some functions, e.g. the one for the selectors).

I'm not either. Hopefully @marcelotrevisani can provide that expertise.

@lorepirri
Copy link
Contributor Author

I'm left wondering how Poetry will interpret <3.7,>=3.10 🤔 is every comma an and, or do they check whether the lower/upper bounds converge? Probably.

I don't believe that's valid. That would be >3.7 and >=3.10 and could not be satisfied.

Indeed,
I see that conda-build uses always and for the comma , operator. That's probably what Poetry does as well.

Ref: https://docs.conda.io/projects/conda-build/en/latest/resources/package-spec.html#package-match-specifications

If one would want to express <3.7 or >=3.10 should use the pipe <3.7|>=3.10. Let me see how that renders.

@lorepirri
Copy link
Contributor Author

Added support for or. Updated the description of the PR.

Multiple version specifiers are at the moment always rendered in and. For example >=3.10.0,<3.11.0 with render py>=310 and py<311. Same as conda-build is doing. Ref: https://docs.conda.io/projects/conda-build/en/latest/resources/package-spec.html#package-match-specifications

If one would want to express <3.7 or >=3.10 should use the pipe operator <3.7|>=3.10.

<3.7|>=3.10 renders py<37 or py>=310 and <3.8|>=3.10,!=3.11 renders py<38 or py>=310 and py!=311 (, and and take precedence over | and or).

@lorepirri
Copy link
Contributor Author

When there are any extras in a dependency requirement for pypi the config.is_arch is set to True: https://github.com/conda/grayskull/blob/main/grayskull/strategy/pypi.py#L355

I did not explore the reason yet.

Shall it be the same for Poetry? Maybe this is another question @marcelotrevisani can help with.

@marcelotrevisani
Copy link
Member

When there are any extras in a dependency requirement for pypi the config.is_arch is set to True: https://github.com/conda/grayskull/blob/main/grayskull/strategy/pypi.py#L355

I did not explore the reason yet.

Shall it be the same for Poetry? Maybe this is another question @marcelotrevisani can help with.

that is set to True because when that function is called it means selectors are in use, and if we have selectors the recipe cannot be no arch. However, a few things have changed in conda-forge and that is not entirely true anymore. We allow a few selectors in some scenarios, but as a rule of thumb, if you don't want to dive into it, that is fine to still consider to be arch if it has selectors.

And thanks a lot for your contribution! I really appreciate it, I believe @xylar have done a good review and I am also happy with the current state of the PR, unless you want to improve a few bits more, but that can also be done in a second PR if you wish :)

@xylar
Copy link
Contributor

xylar commented Dec 18, 2024

I imagine there could be some growing pains with this update. There are a lot of cases to consider and @lorepirri, you've done quite a good job of that. Even so, if there are some bugs that come up after we send this into the wild, I'm more than happy to help with them in the future.

@xylar
Copy link
Contributor

xylar commented Dec 18, 2024

@lorepirri, it sounded like you wanted to tackle ~= and I think that would be great. Once that's in, I'm happy to approve.

@lorepirri
Copy link
Contributor Author

lorepirri commented Dec 22, 2024

add "~=" and ".*" support

@xylar I've added "~=" and ".*" support it was a bit more effort than I thought mostly because ".*" expands to two selectors in "or". The PR starts to be a bit crowded, I would not add anything else here. There are a few changes to review, unfortunately.

I imagine there could be some growing pains with this update. There are a lot of cases to consider and @lorepirri, you've done quite a good job of that. Even so, if there are some bugs that come up after we send this into the wild, I'm more than happy to help with them in the future.

Thanks for your support! I made the parser a little bit more robust and now it accepts a PEP440 version in the Python version even though I would not expect to ever receive 2!3.12.2.post1 in it (it will work and render py==312, though).

When there are any extras in a dependency requirement for pypi the config.is_arch is set to True: https://github.com/conda/grayskull/blob/main/grayskull/strategy/pypi.py#L355
I did not explore the reason yet.
Shall it be the same for Poetry? Maybe this is another question @marcelotrevisani can help with.

that is set to True because when that function is called it means selectors are in use, and if we have selectors the recipe cannot be no arch. However, a few things have changed in conda-forge and that is not entirely true anymore. We allow a few selectors in some scenarios, but as a rule of thumb, if you don't want to dive into it, that is fine to still consider to be arch if it has selectors.

And thanks a lot for your contribution! I really appreciate it, I believe @xylar have done a good review and I am also happy with the current state of the PR, unless you want to improve a few bits more, but that can also be done in a second PR if you wish :)

Thanks for having created grayskull in the first place @marcelotrevisani and thank you for the support! I've changed a few things to add "~=" and ".*" support. I hope the code did not become cumbersome.

As for config.is_arch=True, let me see if I can quickly bring in config because it's not there yet.

Copy link
Contributor

@xylar xylar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lorepirri, I'm very happy with the work you have done on the ~= and .* support. It's clear that some choices need to be made about how these get supported in a way that makes sense for python and I think the choices you've documented make a lot of sense. Hopefully, no one will be silly enough to use some of the weirder cases you added support for but it's nice to have some sensible fallback even if they do rather than just having grayskull crash.

@lorepirri
Copy link
Contributor Author

PR description updated (removed the TODOs and strikethrough leftovers).

It does handle compatible release operator ~= correctly and the wildcard .* suffix in the version as well (not very common though as the resulting selectors are not interesting).

👉 Fixes a minor bug: the compatible release operator ~= in the dependency version could not be used because the parser was jumping to expanding the tilde ~ and crashing on = (or returning the wrong version containing =). Therefore I suspect that the ~= operator was never handled (maybe never used, or nobody reported this bug).

@lorepirri
Copy link
Contributor Author

lorepirri commented Dec 22, 2024

@lorepirri, I'm very happy with the work you have done on the ~= and .* support. It's clear that some choices need to be made about how these get supported in a way that makes sense for python and I think the choices you've documented make a lot of sense. Hopefully, no one will be silly enough to use some of the weirder cases you added support for but it's nice to have some sensible fallback even if they do rather than just having grayskull crash.

Thank you @xylar for having had a look at it once again! I'm glad that I could add some value to the project. I decided to simplify the cases for the ~= and .* anticipating how those were actually rendered, in the specific case of the Python version specifier rendered to conda selector.

In case there will be the need to fully expand .* for example for the dependency versions, one could use this function to get the full left and right bounds (I leave it here because I wrote it but it was not useful at the end, other than testing my theory):

def wildcard_version_bounds(wildcard_version: str) -> tuple[str | None, str | None]:
    """

    >>> wildcard_version_bounds("*")
    (None, None)

    >>> wildcard_version_bounds("1.*")
    ('1.0a0', '2')

    >>> wildcard_version_bounds("1.1.*")
    ('1.1.0a0', '1.2')

    >>> wildcard_version_bounds("1.1.1.*")
    ('1.1.1.0a0', '1.1.2')

    # Python versions
    >>> wildcard_version_bounds("3.*")
    ('3.0a0', '4')

    >>> wildcard_version_bounds("3.12.*")
    ('3.12.0a0', '3.13')

    >>> wildcard_version_bounds("3.9.1.*")
    ('3.9.1.0a0', '3.9.2')

    """
    if "*" not in wildcard_version:
        raise ValueError("The version must contain a wildcard '*'.")

    # Remove the wildcard and parse the base version
    base = wildcard_version.rstrip(".*")
    lower_bound = f"{base}.0a0" if base else None  # Lowest possible version

    if not base:
        upper_bound = None  # No restriction for `*`
    else:
        # Parse base as a Version object and increment the last meaningful part
        base_version = Version(base)
        parts = list(base_version.release)
        parts[-1] += 1  # Increment the last meaningful part
        # Construct the upper bound by resetting all lower components to 0
        upper_bound = ".".join(map(str, parts))

    return lower_bound, upper_bound

and expand only for:
"~=3.8" -> ">=3.8, ==3.*" -> ">=3.8, <4.0a" -> "py>=38 and py<4"
"~=3.0" -> ">=3.0, ==3.*" -> ">=3.0, <4.0a" -> "py>=3 and py<4"
in the rest of the cases it's a simple conversion to ">=" operator.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
in the rest of the cases it's a simple conversion to ">=" operator.
in the rest of the cases it's a simple conversion to ">="/"==" operator.

Copy link
Member

@marcelotrevisani marcelotrevisani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for your contribution!

@marcelotrevisani marcelotrevisani merged commit fe9f0b7 into conda:main Dec 23, 2024
8 checks passed
@lorepirri
Copy link
Contributor Author

Thank you for merging @marcelotrevisani . We could close #463 #535 and the PR #523 (it seems stale).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants