Jupyter Scheduler 2.10.0 Source Distribution tar built too large causing PyPI upload failure #558

Open
andrii-i opened this issue Nov 13, 2024 · 10 comments

@andrii-i
Collaborator

andrii-i commented Nov 13, 2024

Description

The initial upload of the Jupyter Scheduler 2.10.0 source distribution failed because of PyPI's source distribution size limit (~150 MB): the built tar is drastically larger than it was before jupyter-releaser was introduced.

The built distribution upload went through; the npm upload did not, as it comes later in the script.

How to reproduce

Expected behavior

  • ✅ A source distribution of normal size is available on PyPI, and @jupyterlab/scheduler 2.10.0 is released on npm.
  • The workflow does not fail: a source distribution of reasonable size is built and uploaded to PyPI, and the upload to npm happens.
@andrii-i andrii-i added the bug Something isn't working label Nov 13, 2024
@andrii-i andrii-i self-assigned this Nov 13, 2024
@dlqqq
Collaborator

dlqqq commented Nov 14, 2024

For reference, here is the relevant log excerpt:

WARNING  Error during upload. Retry with the --verbose option for more details.
ERROR    HTTPError: 400 Bad Request from https://upload.pypi.org/legacy/
         File too large. Limit for project 'jupyter-scheduler' is 100 MB. See
         https://pypi.org/help/#file-size-limit for more information.
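
A guard along these lines could flag an oversized file before the upload step runs. This is only a sketch: the function name, the `dist` directory default, and the decimal interpretation of "100 MB" are assumptions, not part of jupyter-releaser or twine.

```python
from pathlib import Path

# Assumption: the per-file limit reported in the log excerpt (100 MB).
# The log does not say whether 1 MB means 10**6 or 2**20 bytes, so the
# stricter decimal value is used here.
LIMIT_BYTES = 100 * 10**6


def oversized_files(dist_dir, limit_bytes=LIMIT_BYTES):
    """Return (name, size_in_bytes) for every file in dist_dir above the limit."""
    directory = Path(dist_dir)
    if not directory.is_dir():
        return []
    return sorted(
        (path.name, path.stat().st_size)
        for path in directory.iterdir()
        if path.is_file() and path.stat().st_size > limit_bytes
    )


if __name__ == "__main__":
    # Hypothetical usage: run after the build step, before the upload step.
    for name, size in oversized_files("dist"):
        print(f"{name}: {size / 10**6:.1f} MB exceeds the upload limit")
```

Run between the build and upload steps, it would list any file that is too large to upload.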

@andrii-i andrii-i changed the title from “Jupyter Scheduler 2.10.0 Source Distribution is not available” to “Jupyter Scheduler 2.10.0 Source Distribution upload failure” Nov 14, 2024
@andrii-i andrii-i changed the title from “Jupyter Scheduler 2.10.0 Source Distribution upload failure” to “Jupyter Scheduler 2.10.0 Source Distribution tar built too large causing PyPI upload failure” Nov 14, 2024
@andrii-i
Collaborator Author

The Jupyter Scheduler 2.10.0 npm package is now available at https://www.npmjs.com/package/@jupyterlab/scheduler, and the source distribution is now available on PyPI at https://pypi.org/project/jupyter-scheduler/2.10.0/#files.

Let's use this issue to track two things: understanding why the Jupyter Scheduler 2.10.0 source distribution tar was built so large that the PyPI upload failed, and preventing that from happening in the next release.

jupyter_releaser issue on the topic: jupyter-server/jupyter_releaser#592

@krassowski
Contributor

Try building jupyter scheduler PyPI source distribution locally with jupyter-releaser build-python, see its size (>100 Mb)

Out of curiosity, do you know why it produces so big a distribution? From a quick look it seems that you might be missing:

[tool.jupyter-releaser.hooks]
before-build-python = ["jlpm clean:all"]

in the pyproject.toml but that's just a guess.

@andrii-i
Collaborator Author

andrii-i commented Nov 14, 2024

@krassowski no. I've created jupyter-server/jupyter_releaser#592 in jupyter_releaser repo to surface the problem and hopefully get some insight from jupyter_releaser contributors.

Thank you for the suggestion and generally for looking into this.

@krassowski
Contributor

Do you have the contents of the package built locally with jupyter-releaser build-python?

@andrii-i
Collaborator Author

@krassowski
Contributor

It looks like it includes the .yarn and node_modules directories, which I am sure are responsible for a large portion of the size. They obviously should not be included. Also see jupyter-server/jupyter_releaser#592 (comment).
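
For anyone who wants to check which directories dominate a tarball, here is a minimal sketch using only the standard library. The function names are illustrative, and it assumes the usual sdist layout where every member sits under a single `name-version/` root directory.

```python
import tarfile
from collections import Counter


def size_by_directory(sdist_path, depth=2):
    """Sum uncompressed file sizes, grouped by the first `depth` path components.

    With the assumed `name-version/...` layout, depth=2 groups by the
    project's top-level entries (for example `name-version/node_modules`).
    """
    totals = Counter()
    with tarfile.open(sdist_path) as archive:
        for member in archive:
            if member.isfile():
                key = "/".join(member.name.split("/")[:depth])
                totals[key] += member.size
    return totals


def print_largest(sdist_path, top=10):
    """Print the largest groups first, with sizes in decimal megabytes."""
    for key, size in size_by_directory(sdist_path).most_common(top):
        print(f"{size / 10**6:9.2f} MB  {key}")
```

Calling `print_largest` on the built `.tar.gz` in `dist` lists the heaviest top-level entries first. The sizes are uncompressed, so they will not add up to the size of the archive itself.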

I think that, in addition to jlpm clean:all, you should also add:

[tool.hatch.build.targets.sdist]
artifacts = ["jupyter_scheduler/labextension"]
exclude = [".github", "binder"]

so that the binder directory gets excluded.

That said, I already see jupyter_scheduler/labextension in the tarball you shared, and it, along with node_modules, should have been excluded by hatch because it is in your .gitignore.

So why does it include things from the git repo?
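
To keep this from recurring in a later release, a check like the following could run against the built sdist. It is a sketch under assumptions: the set of unwanted directory names is taken from this thread, and the function is hypothetical rather than an existing jupyter-releaser or hatch feature.

```python
import tarfile

# Assumption: directory names that should never ship in the sdist,
# taken from the discussion in this thread; extend as needed.
UNWANTED = frozenset({"node_modules", ".yarn"})


def unwanted_members(sdist_path, unwanted=UNWANTED):
    """Return names of archive members that have an unwanted path component."""
    with tarfile.open(sdist_path) as archive:
        return [
            name
            for name in archive.getnames()
            if unwanted.intersection(name.split("/"))
        ]
```

A non-empty result means that ignored directories leaked into the archive, which a CI step could turn into a failure.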

In the logs of check-release action (https://github.com/jupyter-server/jupyter-scheduler/actions/runs/11809051201/job/32898683727) I see that the releaser is reading configuration from package.json rather than from pyproject.toml. I wonder if this could be related:

build-python

--------------------------------------------------
Using default value for dist_dir: 'dist'
Using default value for python_packages: '['.']'
Using default value for help: 'False'
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Running hooks for before-build-python
jupyter-releaser configuration loaded from package.json.

Also, that one does include the clean hook:

"jupyter-releaser": {
  "hooks": {
    "before-build-npm": [
      "python -m pip install jupyterlab~=4.0",
      "jlpm",
      "jlpm build:prod"
    ],
    "before-build-python": [
      "jlpm clean:all"
    ]
  }
}

Interesting. It looks like it did not use hatch at all?

@krassowski
Contributor

None of that helps yet: #561

I went ahead and triggered a new check-release run on an unrelated project to check whether this is a regression in the ecosystem (rather than a misconfiguration). An older run on variable inspector and the run triggered today both result in 1.53 MB of artifacts, so I do not think that this is a system-wide issue, just a problem with configuration.

@krassowski
Contributor

I tried aligning the scheduler config with other repos using releaser in #561 but nothing helped.

The thing is that jupyter-releaser does not do anything bespoke; it just runs pipx run build (here). That should not result in anything different from python -m build as used by the build action:

@krassowski
Contributor

Running pipx run build locally does not produce such a large tarball for me, just 3.6 MB.
