
When there is a corrupted jar in Maven's local repo, it will fail #7506

Open

panbingkun opened this issue Mar 1, 2024 · 2 comments
panbingkun commented Mar 1, 2024

steps

When we use the local Maven repo as a first-level cache for sbt, a corrupted jar in that repo makes the build fail outright.
This problem has been bothering us for a long time; the same situation has occurred ever since sbt 1.9.4.
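For context, a minimal sketch of the setup in question (not the actual Spark build definition), using the local Maven repository as the first resolver:

```scala
// build.sbt — minimal sketch, assuming a plain sbt project rather than
// the actual Spark build: put the local Maven repository
// (~/.m2/repository) at the front of the resolver chain so it acts as
// a first-level cache during dependency resolution.
externalResolvers := Resolver.mavenLocal +: externalResolvers.value
```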

problem

https://github.com/panbingkun/spark/actions/runs/8105142421/job/22153031672
Starting from sbt 1.9.3, a corrupted jar in Maven's local repo makes the build fail.

expectation

When there is a corrupted jar in Maven's local repo, the build should not fail; it should automatically skip that cache entry instead.

notes

@panbingkun panbingkun added the Bug label Mar 1, 2024
@eed3si9n eed3si9n added the area/library_management label Mar 1, 2024
eed3si9n (Member) commented Mar 1, 2024

@panbingkun Thanks for the report. Please see https://www.scala-sbt.org/1.x/docs/GitHub-Actions-with-sbt.html#Caching for the recommendation on setting up the cache for GitHub Actions. In general I would recommend against the use of Resolver.mavenLocal if it contains any partial external artifacts.
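For illustration, a minimal sketch of that recommendation, assuming Resolver.mavenLocal was added explicitly somewhere in the build:

```scala
// build.sbt — minimal sketch: drop the local Maven repository from the
// external resolver chain so partially-downloaded or corrupted artifacts
// under ~/.m2/repository are never consulted during resolution.
externalResolvers := externalResolvers.value.filterNot(_ == Resolver.mavenLocal)
```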

Also, dependency resolution is delegated to Coursier, so if you would like any action taken, could you please report this issue to coursier/coursier as well?

panbingkun (Author) commented Mar 1, 2024

@eed3si9n Thank you for replying so quickly!
In fact, before reporting this bug here, I made several attempts:

  1. We overrode the default CachePolicy, from Vector(CachePolicy.LocalUpdateChanging, CachePolicy.LocalOnly, CachePolicy.Update) to Vector(CachePolicy.Update), as follows (see the first sketch after this list):
     https://github.com/apache/spark/pull/42961/files#diff-6f545c33f2fcc975200bf208c900a600a593ce6b170180f81e2f93b3efb6cb3eR310
  2. In our Spark build, the local Maven repo was skipped in some workflows, as follows (see the second sketch after this list):
     https://github.com/apache/spark/pull/44516/files#diff-6f545c33f2fcc975200bf208c900a600a593ce6b170180f81e2f93b3efb6cb3eR272
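A sketch of the first workaround, assuming the lmcoursier API that sbt's Coursier integration exposes (an approximation of the linked PR, not a verbatim copy):

```scala
// build.sbt — workaround 1: replace Coursier's default cache policies
// (LocalUpdateChanging, LocalOnly, Update) with Update only, so local
// copies are always re-checked against the remote repositories.
import lmcoursier.definitions.CachePolicy

ThisBuild / csrConfiguration := csrConfiguration.value
  .withCachePolicies(Vector(CachePolicy.Update))
```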
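And a sketch of the second workaround; the environment variable name here is hypothetical, not the one used in the Spark PR:

```scala
// build.sbt — workaround 2: include the local Maven repository as a
// resolver only when a workflow has not opted out via an environment
// variable (SKIP_LOCAL_MAVEN_REPO is an illustrative name).
externalResolvers ++= {
  if (sys.env.contains("SKIP_LOCAL_MAVEN_REPO")) Seq.empty
  else Seq(Resolver.mavenLocal)
}
```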

All of the above can make GitHub Actions pass, but none of them is perfect.

In particular, when we look at some of the affected dependencies, they do not seem to be ones we introduce in Spark, but rather dependencies of the build tools themselves.

This problem has been bothering us for a long time, ever since sbt 1.9.4.

Besides the methods we tried above, is there a better way to work around it?
Thank you very much!
