Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NodeJS: combined builder and overrides? #276

Open
wmertens opened this issue Aug 30, 2022 · 3 comments
Open

NodeJS: combined builder and overrides? #276

wmertens opened this issue Aug 30, 2022 · 3 comments
Labels

Comments

@wmertens
Copy link
Contributor

wmertens commented Aug 30, 2022

After much consideration I think I found the silver bullet for #195. The older lock files don't provide all the peer dep information for packages, and so those builds should have all packages combined into one derivation. Furthermore, cyclic derivations should always be built together in a single derivation, since they could rely on eachother during the npm run install step.

So I think that the NodeJS builder should make groups of packages, built according to the steps in npm/npm#5919. (copy all in place then build one by one)
If we have a v2 lock file and we have all the peer dep information, we can build individual packages but cyclic deps still need grouping.

Sometimes we'll have multiple versions of the same dependency, in which
case one is picked as canonical (the one that no other dep or only main is
requesting, or the highest version otherwise) and the others are stored as /$name-$version.
Packages that want the different version then get a symlink.

So for v1 lock files, this will result in node_modules dirs like this:

lib/node_modules/mainpkg/node_modules/
+--- dep1/
+--- dep2/
|    `--- node_modules/
|         `--- dep3 -> ../../dep3-1.0.4
+--- dep3/
`--- dep3-1.0.4/

This is a completely flat structure and it should work 100% for everything.

The only problem is with the overrides, since they are per-package. I think the only way to handle this is to first make the copied node_modules as combinedSrc, and then create the build derivations, running them one by one, using the output of the previous as the source for the next. So we'll sometimes have 1000 levels deep derivations. I don't think Nix will have a problem with that though.

However, if we know beforehand which packages have overrides, we can run the install steps for all packages in a single build and then only do the stepwise folding build for the overrides.

Is there a way to get this information (which packages have overrides) from inside the builder? If not, would it be possible to add that?

@DavHau really interested in your thoughts on this.

@DavHau
Copy link
Member

DavHau commented Aug 31, 2022

Sounds like a reasonable strategy to me.

Is there a way to get this information (which packages have overrides) from inside the builder? If not, would it be possible to add that?

You could compare the outputs of produceDerivation {name} {pkg} with just {pkg}, and if they have different hashes/store-paths, you know that the package has been overwritten.
This doubles the number of packages being evaled, but should be suitable for a POC of the new strategy. Optimization can be done later.


I also have an idea for an alternative strategy.
Actually no build system at all can build real circular dependencies. Two distinct procedures X and Y can never depend on each others output. That is an impossible to resolve situation.

The reason why we run into cyclic deps issue in the first place, is because we combine steps, which are actually logically separated.

Example. Package A's preinstall might depend on B's final output. At the same time B's preinstall can never depend on A's final output. That is impossible even in NPMs model and would lead to an error.

So, instead of combining packages, we could split single packages into several derivations, for each install step like preinstall, install, postinstall there could be a separate derivation.

The problem is, that given the graph of all packages individual steps, it might be hard to find out the exact dependencies between them. But maybe it is possible. Even if the automatic detection of the steps dependencies is not 100% acurate, we could provide a simple interface for the user to manipulate this graph slightly to fix issues + useful error messages on what changes might be required to the graph.

@wmertens
Copy link
Contributor Author

You could compare the outputs of produceDerivation {name} {pkg} with just {pkg}, and if they have different hashes/store-paths, you know that the package has been overwritten.

Good trick and it should avoid a ton of extra derivations we'll see what performance is like.

Example. Package A's preinstall might depend on B's final output. At the same time B's preinstall can never depend on A's final output. That is impossible even in NPMs model and would lead to an error.

I think this is actually undefined behavior, I couldn't find any docs around it. I'll just presume that any install-script-time dependencies are source-only, which is what the documented npm algorithm I linked seems to implement.

So I'll go ahead and try it that way. My branch already works for all my builds but this new way should make it very robust for the old locks as well.

@DavHau DavHau added the nodejs label Nov 9, 2022
@DavHau
Copy link
Member

DavHau commented Nov 9, 2022

So I'll go ahead and try it that way. My branch already works for all my builds but this new way should make it very robust for the old locks as well.

As we are proceeding with this, I guess we can close this issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants