Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC for bundler checksum verification #50
base: master
Are you sure you want to change the base?
RFC for bundler checksum verification #50
Changes from 6 commits
0029908
3941ef0
5958473
8291281
e713a07
76794b9
7c9b3e8
59224ad
27c23fc
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, if I understand correctly. The only way to opt-into checksums will be to run
bundle lock
explictly?In other words, will
bundle install
record checksums to theGemfile.lock
when it's actually generating it for the first time? I think we'll want to do eventually do this, but it probably doesn't hurt to battle test the feature fist by explicitly opting in.Since users need to explicitly opt-in to this, would it make sense to do it under a
--add-checksums
flag, to make sure there's no change to existing behavior?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is enabled by default and every install or update will save and compare checksums.
However, for most people they are not performing a clean install. If you run bundle install and nothing is installed from gem, you get no checksums. I wanted to provide a way for people to explicitly dive in fully.
I think
bundle lock
should just always add all available checksums from all sources to the lock.The full explicit record and test currently works via
bundle lock --update --bundler
(to avoid gem changes) and thenbundle pristine
which compares every gem to the recorded checksum. I wantlock
to grab checksums without the extra args.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this should also be a hard failure in frozen mode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see an argument either way, but it's in my mind a bit of a stretch of what frozen already means (for missing checksums, at least).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the missing case, I meant only for GEM sources (which if I understood correctly are the only ones adding checksum). And also only if the Gemfile is already using CHECKSUMS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bundler docs:
I think there's 2 ways we could interpret this:
The first quote would suggest that we cannot make any change to the Gemfile.lock, so a frozen bundle should fail if checksums are missing. People use frozen for added security (don't install something different than what was committed and tested) and this fits with added security. If your production environment wants to install the alpine linux version of a bundled gem and you don't have a checksum for it then hard fail.
The second quote seems to say that if there are changes to the Gemfile not reflected in the Gemfile.lock, then fail. A missing checksum for a bundle that would otherwise work in frozen mode is not a clear violation of that rule. With this approach, bundler stays exactly as secure as it was previously in this case. Installing the alpine linux version of a gem without being able to verify the checksum is ok.
Option 1 seems implied to me, but option 2 is fully within existing expectations and will break less CI builds. The question is: should we break those CI builds? Is that what those engineers expected bundler to do? If frozen doesn't break that build, then what will?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My interpretation of frozen has always been 1. For example, if we run under a new platform that's not already in the lockfile, we fail in frozen mode even if there's no Gemfile changes. Basically in frozen mode, everything should be taken from the lockfile.
In addition to this, I think this feature kind of misses the point if we don't fail when we find a mismatch and instead happily install the mismatched gem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally wrote this because it seemed like a nice interface, but now I wonder how I can reliably tell how many gems could have checksums that don't.
Is this useful and we should add it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also brings up the problem of not continually printing this when the source doesn't support checksums. If the source doesn't support it, do we add a special checksum like
unavailable=true
and then use that as a marker for a gem that doesn't need to be verified? We do support just about any key value pair in the checksum field.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I first saw this, I understood
bundle install
by default would only advertise checksums, but not add them, and I thought that was ideal, regardless of showing a number or not. I.e., let people know that this feature is available and tell them how to enable it. And don't change behavior unless a CHECKSUM section is present in the lockfile.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we really can't anticipate the problems of this feature, I'm starting to lean this way.
So the plan would be to build into this feature a line that suggests running
bundle lock --checksums
at the end of every install message until the CHECKSUMS exist in the Gemfile.lock. Until then, we treat the feature as disabled.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! I think a line in
bundle install
output is not too invasive and properly advertises the feature.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this setting also control whether checksums are added to the lockfile or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO no, if we want a way to avoid putting checksums in the lock file (which im also not sure we want), it should be under a different flag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not sure we want it, but I reckon it could be useful for a graceful migration, given that the setting already exists, even if we'll eventually remove the flag.
It'd be interesting to check why we added this flag in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. It was there when I accidentally stole this work from @segiddins. It's a failsafe in case we really mess someone up or they like using hacked and corrupted gems and wish to continue doing so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the behavior for remote servers will be that there's only one level of verification, i.e., that the checksum of whatever gem that was once installed is recored in the lockfile and should be the same in for future installs.
What happens for
bundle lock
without the additional flag? Does it record empty checksums? Should it download gems to cache and compute their checksums without installing anything? Maybe too slow? I guess a mismatch " != " will be ignored and result in the non empty checksum being locked?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it will also be checked against what the server reports in the compact index, if that is present (which is what already happens)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant "for remote servers that don't implement the compact index"!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On first install we will compute, compare and store all checksums for all .gem files that are downloaded. If you run
bundle pristine
get every checksum from every gem that can be installed from .gem source.As for an explicit way to fill in missing gems besides that, I don't know.
I'm actually concerned about another similar case, a broken gem that someone depends on and needs. Theoretically the server sends a bad checksum and the gem continually clashes. Can you ignore just that gem?
Also, missing checksums that keep breaking frozen bundles, like for different platforms or that didn't get recorded. Frozen says "raises if lock would be changed."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand that case very well. What do you mean by "a broken gem"? Maybe a gem with custom patches or something? Even if they are doing that, shouldn't they upload it to some gem server so they can distribute it to all users of the lockfile? Wouldn't all users of the lockfile want the exact same version?
Regarding missing platforms, I plan to lock all platform soon, so that shouldn't be a problem I think.
Lockfiles having only
ruby
platform are a problem because they don't lock specific platform gems. Instead, they pick the most suitable platform specific version at install time. So checksums won't play nice with these lockfiles I believe. And they are still the majority.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right. Anything broken like that should be a fail and we should address the failure from it, not preemptively prepare for it.
This seems like the distinction between frozen (strict) and unfrozen. If you want to use bundler loosely, then you also won't get this extra assurance from checksums.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, if a lockfile is using this loose mode, then you don't want checksums.
If you do explicitly add checksums to this kind of lockfile, that means you opt-in to strict mode, and you'll only get gems without specific platforms locked in there, with their respective checksums, and that's what will be consistently installed, regardless of the platform where the lockfile is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why will they fail? will dependabot not add the new lock file checksum in when it runs
bundle install
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if not, please add a small section with instructions for how automation should be updated to be compatible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dependabot (unfortunately) runs a fixed Bundler version. So until that is updated to a version that knows about checksums, it won't add checksums.
On top of that, Dependabot (also unfortunately) uses Bundler internals a lot, rather than only shelling out to Bundler commands, so it's likely that stuff breaks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dependabot will need updates to work with this feature, and I think that's maybe a good point to make this fully opt-in in the beginning? But I don't understand the point about this leading to users disabling features. As long as Dependabot correctly adds the proper checksums, shouldn't it be fine? That's what Dependabot does for other ecosystems, and it seems fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, I could have made it more clear: "initially, dependabot will not be updated and users may disable the feature with the intention of it being temporary, but it could lead to the feature staying disabled for a long time."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I think this is indeed a risk and another reason to let this feature be battle tested for a while in an opt-in only mode first. I don't understand the hurry to force everyone into checksums before we know the full implications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opt-in vs opt-out is interesting. We'll get a lot more useful (if potentially unhappy) feedback from opt-out, but maybe we need an opt-in period first?
Opt in to me would be triggered by running `bundle lock --checksums" and it would add the checksums. We would consider the feature enabled if the lockfile either didn't exist or already had checksums.
My experience using the feature in testing is that most of the time the checksums are not added right away. I think only the frozen-on-ci situation will cause problems with opt-out, and it will hopefully be found at the same time as the person upgrades bundler, so it will be intentional.
I'm open to either really.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that would be my preferred approach. You get checksums when there's no prior lockfile, or when you explicitly add them.
If a lockfile has no checksums, they can only be added explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically
bundle install
in frozen mode in CI would catch it, since there won't be a local gem cache?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant a command for syncing checksums from the remote to ensure you don't have an erroneous checksum generated from a broken gem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mmmm, so a command that fixes the issue rather than detecting it, right? Maybe
bundle lock
without a local gem cache including that broken gem should do that, either by default or under an optional flag?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that I think about it, I think it should sync and compare everything. This is the chance to print that "the gem you have installed generated a checksum that is different than what the server is saying."
Bundle lock should just pull all checksums every time then? Lock previously doesn't hit the network unless a resolve is necessary. This would make it always hit the network unless
--no-checksums
or--local
is set.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, my explicit concern is this scenario:
If we had a process that could verify your checksums against the server without altering your bundle, it could run in CI as part of tests and help protect against malicious or bad checksums in lockfiles.
This is extra paranoid though. I think you could just run lock in CI on a frozen bundle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine for
bundle lock
to hit the network (or local cache) to look for checksums.I'm not sure I understand your concern, can you elaborate this 3 steps? What does "sets checksum" mean? Manually edit a lockfile changing the checksum of a gem? What does "upload gem" mean in this context? Are you implying a situation where a malicious actor has both control of your lockfile, and your remote server? That sounds wild and I don't see how anything can be done in that situation 🤣
Overall, I think
bundle install --force
in frozen mode should be a strict way to check remote checksums against lockfile checksums, since that bypasses the local cache and should not allow lockfile changes.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hehe, maybe my imagination is running a bit wild. With that access you can do whatever you want.
The presence of bundle install --force should provide options for people wanting to use this feature like that.
The more we talk about it, the more clear it is that we can't anticipate everything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly, we shouldn't care about it since at that point there's no much we can do.