RFC for bundler checksum verification #50

Open
wants to merge 9 commits into
base: master
225 changes: 225 additions & 0 deletions text/0011-gem-checksum-verification.md
@@ -0,0 +1,225 @@
- Feature Name: `gem_checksum_verification`
- Start Date: 2023-08-29
- RFC PR: https://github.com/rubygems/rubygems/pull/6374
- Bundler Issue: (leave this empty)

# Summary

Bundler is adding checksum verification of gems when they are installed. It should be secure by default and easy to use. It should not break assumptions or unnecessarily break deployment or CI unless there is a security problem.

# Motivation

Verifying gem checksums with known checksums at install time is a stronger way to verify that the exact same gem source is being used in every environment where the bundle is installed.
Using checksums, sourced from rubygems.org or from digests of local .gem files, is a stricter way of locking to a specific gem version.
The feature should work transparently in much the same way that bundler already locks to specific versions by ensuring that not just the same version, but the exact same rubygem file data, is installed on each environment.

# Guide-level explanation

Upon first upgrading to this version of Bundler, lockfile checksums will not be enabled.
The feature will be opt-in only at initial release.
It is important that Bundler maintains compatibility and there are edge cases that are difficult to accommodate without wider adoption.

Common Bundler commands like `bundle install`, `bundle update`, `bundle lock`, `bundle add` will now automatically record and verify checksums.
Commands that rely on checksums for verification will silently succeed when checksums match and fail with a unique non-zero exit code when checksums do not match.

If you wish to immediately add all available checksums to your lockfile for your bundled gems, run `bundle lock`.
Bundle lock now fetches checksums from remote sources by default.
If you would like to bypass this behavior, run `bundle lock --no-checksums`.

Example:

```
$ bundle install
Bundle complete! 88 Gemfile dependencies, 256 gems now installed.
Use `bundle info [gemname]` to see where a bundled gem is installed.
```

Running `bundle update` or `bundle add` will record the checksum from the source (e.g. rubygems.org) into the Gemfile.lock, if it is available.
If a checksum is not available from the source because the source does not provide such info (e.g. private gemservers) then a checksum will be created during install using the .gem file.
If a checksum can't be created because the source is a path or git source, then only the gem's name and version will be recorded with a blank checksum.

If you want to ensure that the bundled environment contains only gems matching the checksums in the lockfile, run `bundle pristine`.
The `bundle pristine` command installs every gem fresh from a newly downloaded .gem source.
The pristine install will trigger computation and comparison of the generated SHA256 checksum with the checksum stored in the lockfile.

Example:

```
$ bundle pristine
Installing rake 13.0.6
Installing rspec 3.12.0
42 gems without checksums.
Use `bundle lock` to add checksums to Gemfile.lock.
Comment on lines +52 to +53
Member Author

I originally wrote this because it seemed like a nice interface, but now I wonder how I can reliably tell how many gems could have checksums that don't.

Is this useful and we should add it?

Member Author

This also brings up the problem of not continually printing this when the source doesn't support checksums. If the source doesn't support it, do we add a special checksum like unavailable=true and then use that as a marker for a gem that doesn't need to be verified? We do support just about any key value pair in the checksum field.

Member

When I first saw this, I understood bundle install by default would only advertise checksums, but not add them, and I thought that was ideal, regardless of showing a number or not. I.e., let people know that this feature is available and tell them how to enable it. And don't change behavior unless a CHECKSUM section is present in the lockfile.

Member Author

Given that we really can't anticipate the problems of this feature, I'm starting to lean this way.

So the plan would be to build into this feature a line that suggests running bundle lock --checksums at the end of every install message until the CHECKSUMS exist in the Gemfile.lock. Until then, we treat the feature as disabled.

Member

Yes! I think a line in bundle install output is not too invasive and properly advertises the feature.

$ bundle lock
Writing lockfile to path/to/Gemfile.lock
$ bundle pristine
Installing rake 13.0.6
Installing rspec 3.12.0
$
```

When bundler installs a gem from a `.gem` file, it computes the SHA256 checksum of the file.
If an existing checksum is available from the lockfile or the remote source, it will be compared with the computed checksum at install time.
If the checksums do not match, an error is generated and installation is halted.
If no checksum is recorded in the lockfile, the computed checksum is saved to the lockfile where future installs can verify that the same gem is installed.

Example:

```
$ bundle install
Installing rake 13.0.6
Bundler found mismatched checksums. This is a potential security risk.
rake (13.0.6) sha256=2222222222222222222222222222222222222222222222222222222222222222
from the lockfile CHECKSUMS at Gemfile.lock:21:17
rake (13.0.6) sha256=814828c34f1315d7e7b7e8295184577cc4e969bad6156ac069d02d63f58d82e8
from the gem at path/to/rake-13.0.6.gem

To resolve this issue you can either:
1. remove the gem at path/to/rake-13.0.6.gem
2. run `bundle install`
or if you are sure that the new checksum from the gem at path/to/rake-13.0.6.gem is correct:
1. remove the matching checksum in Gemfile.lock:21:17
2. run `bundle install`

To ignore checksum security warnings, disable checksum validation with
`bundle config set --local disable_checksum_validation true`
Member

Should this setting also control whether checksums are added to the lockfile or not?

Member

IMO no. If we want a way to avoid putting checksums in the lockfile (which I'm also not sure we want), it should be under a different flag.

Member

I'm also not sure we want it, but I reckon it could be useful for a graceful migration, given that the setting already exists, even if we'll eventually remove the flag.

It'd be interesting to check why we added this flag in the first place.

Member Author

Good question. It was there when I accidentally stole this work from @segiddins. It's a failsafe in case we really mess someone up or they like using hacked and corrupted gems and wish to continue doing so.

```
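
The install-time check described above (digest the `.gem` file, compare it with any known checksum, halt on mismatch) can be sketched in a few lines of Ruby. This is a minimal illustration, not Bundler's implementation: `verify_gem_file` and `ChecksumMismatch` are hypothetical names introduced here, and only the standard library's `Digest::SHA256` is assumed.

```ruby
require "digest"

# Hypothetical error class for this sketch; not Bundler's actual exception.
class ChecksumMismatch < StandardError; end

# Compute the SHA256 digest of a .gem file and compare it with an expected
# hex digest, if one is known (from the lockfile or the remote source).
# Returns the computed digest so the caller can record it when no checksum
# was known beforehand.
def verify_gem_file(path, expected_sha256 = nil)
  actual = Digest::SHA256.file(path).hexdigest

  if expected_sha256 && actual != expected_sha256
    raise ChecksumMismatch,
          "#{File.basename(path)}: expected sha256=#{expected_sha256}, " \
          "got sha256=#{actual}"
  end

  actual
end
```

When no expected checksum is passed, the method only computes and returns the digest, which mirrors the case where the computed checksum is saved to the lockfile for future installs.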

Certain checksums will always be unavailable because the source does not provide a checksum.
When a checksum is not available, the gem will be added to the Gemfile.lock CHECKSUMS section without a checksum.

Users should be aware that when bundling on CI or production, new gems can be added for a platform not found in the Gemfile.lock.
Bundler will silently record the new checksums for the missing gem just like on a local development machine.
If you would like to ensure that only lockfile checksums are used, bundle install should use the frozen or deployment configurations.

# Reference-level explanation

Gem checksums are fetched from rubygems.org as part of the compact index.
The checksums are stored in memory during the bundle process.
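
As a rough illustration of where these checksums come from, the sketch below parses one line of a compact index `info/<gem>` response. It assumes the line format `VERSION[-PLATFORM] DEPS|checksum:HEX,ruby:REQ`, which is my understanding of the compact index and is not specified in this RFC; `parse_info_line` is a hypothetical helper, not a Bundler method.

```ruby
# Parse one line of a compact index info response into a small hash.
# Assumed format (not defined by this RFC):
#   "1.15.4-x86_64-linux racc:~> 1.4|checksum:<sha256 hex>,ruby:>= 2.7"
# Returns nil for lines without a requirements part, such as the "---" header.
def parse_info_line(line)
  head, requirements = line.chomp.split("|", 2)
  return nil if requirements.nil?

  version_and_platform = head.split(" ", 2).first
  return nil if version_and_platform.nil?

  # A platform, when present, follows the first hyphen.
  version, platform = version_and_platform.split("-", 2)

  checksum = requirements.split(",")
                         .map { |pair| pair.split(":", 2) }
                         .find { |key, _value| key == "checksum" }
                         &.last

  { version: version, platform: platform || "ruby", sha256: checksum }
end
```

The returned hash carries the three parts of a gem's identity within one info file (version, platform, checksum); the gem name comes from the endpoint that was requested.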

If any gems are recorded from a separate source or installed via .gem file, another checksum is recorded and compared with the original.

If any of the checksums for the same gem name, version and platform differ, an error is raised with instructions for resolving the error.

When the Gemfile.lock is written, a new CHECKSUMS section is written with all the gems in the bundle and their corresponding checksums.

Future compatibility for new checksum algorithms is supported by reading and writing existing checksums for installed gems even if the algorithm is unknown.
Checksum comparisons take into account the algorithm used and raise errors accordingly.

Bundler commands that interact with checksums either fetch checksums from sources and update the Gemfile.lock or compare checksums in the Gemfile.lock with the computed digest of the gem being installed.

Internal storage of checksums indexes the checksums by a gem's NameTuple (name, version, platform) and the checksum's algorithm.
The source of each checksum is stored with the checksum so that errors can describe how to fix conflicting checksums.

When two remote sources have a checksum for the same gem, they are compared.
If they are the same, bundler proceeds as normal.
If they differ, an error is raised as described above.
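
The storage and comparison rules above can be sketched as a small in-memory store. All names here (`NameTuple`, `Checksum`, `ChecksumStore`, `ChecksumConflict`) are hypothetical stand-ins chosen for illustration and should not be read as Bundler's internal API.

```ruby
# Hypothetical value objects for this sketch.
NameTuple = Struct.new(:name, :version, :platform)
Checksum  = Struct.new(:algorithm, :digest, :source)

class ChecksumConflict < StandardError; end

class ChecksumStore
  def initialize
    # Shape: { NameTuple => { "sha256" => Checksum } }
    @store = Hash.new { |hash, key| hash[key] = {} }
  end

  # Record a checksum. If a different digest is already known for the same
  # gem and algorithm, raise an error that names both sources.
  def register(tuple, checksum)
    existing = @store[tuple][checksum.algorithm]
    if existing && existing.digest != checksum.digest
      raise ChecksumConflict, <<~MSG
        #{tuple.name} (#{tuple.version}) #{checksum.algorithm}=#{existing.digest}
          from #{existing.source}
        #{tuple.name} (#{tuple.version}) #{checksum.algorithm}=#{checksum.digest}
          from #{checksum.source}
      MSG
    end
    # Keep the first checksum seen; a matching one adds nothing new.
    @store[tuple][checksum.algorithm] ||= checksum
  end

  # Look up a checksum without creating an empty entry for unknown gems.
  def fetch(tuple, algorithm = "sha256")
    @store.key?(tuple) ? @store[tuple][algorithm] : nil
  end
end
```

Keying the inner hash by algorithm means a `sha256` digest is never compared against a digest from a different algorithm, and keeping the source alongside each digest is what lets the error message say where each conflicting value came from.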

### Example CHECKSUMS section

A freshly created Rails 7.0.7.2 app creates the following CHECKSUMS section (snipped for brevity):

```
CHECKSUMS
actioncable (7.0.7.2) sha256=a921830a59ee314939955c9fc3b922d2b1f3ebc16fdf062370b9078aa0dc28c5
actionmailbox (7.0.7.2) sha256=33aeae209fc876c072e5ad28c7ffc16ace533d391368ad6390bb6183c2b27a24
actionmailer (7.0.7.2) sha256=0e9061159af8c220042b7714a2ba01e2d71d2904f308021ec714793e5f9811a0
actionpack (7.0.7.2) sha256=c441ff3898bf5827540bcab929d2f5be6e75b64c101513629a3c88e269615561
actiontext (7.0.7.2) sha256=d29eabbfbf0f084a0bddcfc6bd7e6245e209ec3a1def200e95b670e0cdfba033
actionview (7.0.7.2) sha256=15ba2612efb484ec80d5b656b4ea16e02d34d3f9980cabc13bd8ac15ccea3f94
activejob (7.0.7.2) sha256=6d8ebd81d29ce65bb57830640fa2d3f01e4cab0d71714a54c2b13763021023a4
activemodel (7.0.7.2) sha256=45ba827986065ac273b59cb3b6c9ab3da412beca5d465f1acf7a51fb5bc032b3
activerecord (7.0.7.2) sha256=425f84edb279c02fe2195eee166b20aabb36f51939087d040fa462859bd6790f
activestorage (7.0.7.2) sha256=8f1d79266f148d74e1cc7fcc91f3f04171e0d10c68f8a31ac95d11644114f4f0
activesupport (7.0.7.2) sha256=62e01393689c8514a65e2cf8be6f4781d1e6c7d9adc25b1056902d8abd659fee
addressable (2.8.5) sha256=63f0fbcde42edf116d6da98a9437f19dd1692152f1efa3fcc4741e443c772117
# ... SNIP ...
xpath (3.2.0) sha256=6dfda79d91bb3b949b947ecc5919f042ef2f399b904013eb3ef6d20dd3a4082e
zeitwerk (2.6.11) sha256=ade72f223a75c91f3b02b2c941a57fb697bc443d615f38c28773185e08698dd7
```

During the `rails new` command, `bundle install` pulled all the checksums from the compact index on rubygems.org, then computed checksums for each gem as it was installed.
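
To make the section format concrete, here is a minimal sketch of reading such a section back into a hash. The line shape `name (version) algorithm=digest` is inferred from the example above rather than from a formal grammar, and `parse_checksums` is a hypothetical helper, not the lockfile parser Bundler uses.

```ruby
# One entry per line: "name (version)" optionally followed by "algo=digest".
# The format is inferred from the example section; treat it as an assumption.
CHECKSUM_LINE = /\A\s*(?<name>\S+) \((?<version>[^)]+)\)(?: (?<algorithm>[^=\s]+)=(?<digest>\S+))?\s*\z/

# Returns { "name (version)" => { "algo" => "digest" } }, with a nil value
# for gems that are listed without a checksum.
def parse_checksums(section)
  section.each_line.with_object({}) do |line, result|
    match = CHECKSUM_LINE.match(line.chomp)
    next unless match # skips the "CHECKSUMS" header, blank lines and comments

    key = "#{match[:name]} (#{match[:version]})"
    result[key] = match[:algorithm] && { match[:algorithm] => match[:digest] }
  end
end
```

The algorithm name is kept as an opaque string, so an entry with an algorithm this code does not know about is still read and can be written back unchanged.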

# Drawbacks

### Excessive failures

If checksum verification failures happen more often than expected, it could cause the feature to be ignored or derided as poorly designed and implemented.
Bundler should progressively update the Gemfile.lock transparently without too much interaction or excessive warnings and failures.
Checksum verification is just a stricter approach to version verification, so this feature fits with the existing expectations of Bundler’s features and should be unobtrusive.
The feature is not a "big deal" that needs lots of warnings or errors to encourage usage.
It should not fail unless bundler is configured to be strict (a frozen bundle) or there is an actual verification failure (a corrupt or malicious gem).

### Increased installation time

Gems will be SHA256 digested during install, slightly adding to the install time.
This could be a larger burden for some bundles on slower machines.

Future proposals could attempt to address this slowdown, if necessary.

### Unverifiable gems

When gems come from private gem servers that do not implement the compact index, checksums will not be available.
Bundler will calculate digests from .gem files from sources that don’t supply checksums.

An additional flag for `bundle lock` could be provided that allows reinstalling, and thus calculating the checksum for, gems that don't have a checksum.
Member

So the behavior for remote servers will be that there's only one level of verification, i.e., that the checksum of whatever gem that was once installed is recorded in the lockfile and should be the same for future installs.

What happens for bundle lock without the additional flag? Does it record empty checksums? Should it download gems to cache and compute their checksums without installing anything? Maybe too slow? I guess a mismatch " != " will be ignored and result in the non empty checksum being locked?

Member

> So the behavior for remote servers will be that there's only one level of verification, i.e., that the checksum of whatever gem that was once installed is recorded in the lockfile and should be the same for future installs.

it will also be checked against what the server reports in the compact index, if that is present (which is what already happens)

Member

Sorry, I meant "for remote servers that don't implement the compact index"!

Member Author

@martinemde, Oct 21, 2023

On first install we will compute, compare and store all checksums for all .gem files that are downloaded. If you run bundle pristine, you get every checksum from every gem that can be installed from a .gem source.

As for an explicit way to fill in missing gems besides that, I don't know.

I'm actually concerned about another similar case, a broken gem that someone depends on and needs. Theoretically the server sends a bad checksum and the gem continually clashes. Can you ignore just that gem?

Also, missing checksums that keep breaking frozen bundles, like for different platforms or that didn't get recorded. Frozen says "raises if lock would be changed."

Member

I don't understand that case very well. What do you mean by "a broken gem"? Maybe a gem with custom patches or something? Even if they are doing that, shouldn't they upload it to some gem server so they can distribute it to all users of the lockfile? Wouldn't all users of the lockfile want the exact same version?

Regarding missing platforms, I plan to lock all platforms soon, so that shouldn't be a problem I think.

Lockfiles having only ruby platform are a problem because they don't lock specific platform gems. Instead, they pick the most suitable platform specific version at install time. So checksums won't play nice with these lockfiles I believe. And they are still the majority.

Member Author

I think you're right. Anything broken like that should be a fail and we should address the failure from it, not preemptively prepare for it.

> So checksums won't play nice with these lockfiles I believe. And they are still the majority.

This seems like the distinction between frozen (strict) and unfrozen. If you want to use bundler loosely, then you also won't get this extra assurance from checksums.

Member

@deivid-rodriguez, Oct 26, 2023

Agreed, if a lockfile is using this loose mode, then you don't want checksums.

If you do explicitly add checksums to this kind of lockfile, that means you opt-in to strict mode, and you'll only get gems without specific platforms locked in there, with their respective checksums, and that's what will be consistently installed, regardless of the platform where the lockfile is used.


Gems from a git source are verified by nature of matching a specific git SHA and should be excluded from checksum verification.

Path source gems cannot be verified because the checksum of the entire path would be complicated to calculate and unreliable.

### GitHub Dependabot and other automations are at risk of failing.

In the normal case with a non-frozen bundle, Dependabot will continue to work as normal.
Dependabot will not immediately add checksums to Dependabot PRs, but this is not an error.
The next user install after a dependabot commit will add the checksum.

However, if CI uses a frozen bundle, then all dependabot pull requests will fail due to missing checksums.
This should be addressed by printing clear messaging about how to fix this problem: check out the branch and run `bundle install`.

Until patched, dependabot update branches will be less useful because they will require manual intervention before the build will work.
This might lead users to disable frozen bundles or disable the checksums feature to avoid the CI failures and continue with their existing workflow.
In this case, leaving these features disabled, maybe for longer than necessary, could impact the security of these users' applications.

Dependabot maintainers will need to update Dependabot to write the corresponding checksum to the Gemfile.lock to prevent CI build failures caused by frozen bundles with missing checksums.
This makes Dependabot the de facto source of the checksums in the Gemfile.lock for updated gems, which is hopefully already clear to users merging these PRs and should be considered part of the trust extended to GitHub already.

We are open to working with the dependabot team to provide the information necessary to add checksums to the lockfile.

### Older versions of Bundler

Old versions of Bundler should ignore the CHECKSUMS section.

# Rationale and Alternatives

Rubygems.org already stores SHA256 checksums for gems and returns them in the compact index response.
All the information is already present for checksum verification on the client side.

Verifying .gem files at install time offers nearly complete protection against hacked, altered and corrupted gems.
Bundler already ensures that only the bundled gems are available to the app.
This feature adds the assurance that only verified gems were installed with the bundle.

One alternative to this solution is including (vendoring) the bundled gems in the repository.
This effectively has the same result, since the gems that will be installed during the production deploy will be verifiably the same gems that were used during CI and development.
The downside of this vendored approach is the increase in the repository size.
The upside is installation that doesn't depend on a remote source.
Checksums allow for a similar level of confidence without the larger repository size of vendoring gems.
It can be enabled by default so that more users will benefit.

# Unresolved questions

### What are the unexpected errors that may happen the first time developers interact with this feature?

In particular, the first deploy for users with a frozen bundle in CI and/or production may produce errors that might not have obvious solutions to someone unfamiliar with the feature.

### How do we handle confusion about the authority of checksums written to the Gemfile.lock?

The source of checksums in the Gemfile.lock becomes a matter of trust once it's written.
Did the checksum come from the API, or was it calculated from a .gem file on a developer's computer?
If a checksum error is resolved by one developer in a way that saves an incorrect checksum, how should people know whether to approve these changes?
It may not be common practice for most teams to look at the Gemfile.lock when reviewing code.
Gemfile.lock changes can be hidden in pull request reviews.
Without a process for checking that the checksums are trustworthy, it's left to every development team to decide on a process.

One solution would be a bundle command that could be run in CI every time the gems are installed that verifies the authenticity of checksums in the Gemfile.lock.
Member

Technically bundle install in frozen mode in CI would catch it, since there won't be a local gem cache?

Member Author

I meant a command for syncing checksums from the remote to ensure you don't have an erroneous checksum generated from a broken gem.

Member

mmmm, so a command that fixes the issue rather than detecting it, right? Maybe bundle lock without a local gem cache including that broken gem should do that, either by default or under an optional flag?

Member Author

Now that I think about it, I think it should sync and compare everything. This is the chance to print that "the gem you have installed generated a checksum that is different than what the server is saying."

Bundle lock should just pull all checksums every time then? Previously, lock didn't hit the network unless a resolve was necessary. This would make it always hit the network unless --no-checksums or --local is set.

Member Author

@martinemde, Oct 21, 2023

Ah, my explicit concern is this scenario:

  1. Malicious developer sets checksum and uploads gem.
  2. Checksum would be invalid if compared with server, but that comparison doesn't happen during install.
  3. Malicious gem is successfully installed. (This is a subtle way to bypass security, when a blatant code change might get noticed)

If we had a process that could verify your checksums against the server without altering your bundle, it could run in CI as part of tests and help protect against malicious or bad checksums in lockfiles.

This is extra paranoid though. I think you could just run lock in CI on a frozen bundle.

Member

I think it's fine for bundle lock to hit the network (or local cache) to look for checksums.

I'm not sure I understand your concern, can you elaborate on these 3 steps? What does "sets checksum" mean? Manually edit a lockfile changing the checksum of a gem? What does "upload gem" mean in this context? Are you implying a situation where a malicious actor has both control of your lockfile, and your remote server? That sounds wild and I don't see how anything can be done in that situation 🤣

Overall, I think bundle install --force in frozen mode should be a strict way to check remote checksums against lockfile checksums, since that bypasses the local cache and should not allow lockfile changes.

Member Author

hehe, maybe my imagination is running a bit wild. With that access you can do whatever you want.

The presence of bundle install --force should provide options for people wanting to use this feature like that.

The more we talk about it, the more clear it is that we can't anticipate everything.

Member

> With that access you can do whatever you want.

Exactly, we shouldn't care about it since at that point there's not much we can do.