Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nodejs: Use more pure symlink builds #195

Draft
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

wmertens
Copy link
Contributor

@wmertens wmertens commented Jul 14, 2022

Note that this adds metadata to the sources (like dev, os and optional) which the yarn translate doesn't add yet.

Status: it works completely except for cycles that have extra nodes between repeats

TODO:

  • make it work
  • make translate pure
  • eslint-plugin-prettier doesn't see eslint-config-prettier - it only mentions it in peerDependenciesMeta.eslint-config-prettier.optional: true
  • for yet unknown reason one of my projects gives an infinite recursion while trying to iterate deps
  • when importing a dep, don't copy cyclic deps that aren't requested
    example: eslint-plugin-es imports eslint-utils 2.1 which clashes with copied eslint+eslint-utils 3.0
  • remove commented code in builder
  • rebase - copy vs symlink strategy needs merging with the node_modules generation. In particular, this branch always creates node_modules separately, and the copy-or-symlink strategy should just copy the generated node_modules, optionally deep as a workaround for broken tooling.
  • implement copy-modules option for devShell
  • move and squash commits to proper branches
  • figure out what passthru dependencies are for
  • figure out how to pass the nodejs pkg to the overrides
  • let package-lock translator call v2 if file is v2
  • put transitive binaries behind a flag. Default on or off?
  • if utils/translator: produce merged cycles #216 can't be a global feature, use it locally

@wmertens wmertens marked this pull request as draft July 14, 2022 00:41
@wmertens
Copy link
Contributor Author

@DavHau I'll pull into nice commits later

@wmertens
Copy link
Contributor Author

@DavHau so this branch works for me, except for an infinite recursion due to #198.

When I build packages with only one cyclic dep at a time, it works fine. The cyclic deps are placed together.

There are a bunch of changes in this PR, I need to group them by splitting the wip commits. The cyclic changes are separate commits.

I saw that nothing used the cyclic api yet, so I changed it to be easier to use in the nodejs builder; I presume this will also help in other subsystems.

The TODO comments also need addressing.

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/yarn-plugnplay-and-direnv-packaging/19759/21

Copy link
Member

@DavHau DavHau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this already.
The handling of cyclic deps ads significant complexity, therefore please provide an example (project+flake), which failed before and now works because of this PR.
This will make it easier to evaluate the solution and possible alternatives.


stripDep = dep: l.removeAttrs dep ["pname" "version" "deps"];
in
utils.simpleTranslate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe consider using simpleTranslate2 here, which has a refined interface and adds unit tests to your translator.
See template at src/templates/translators/pure.nix

npm install --package-lock-only $npmArgs
fi

# resolve packages - TODO move to RunCommandLocal
node ${./resolver.cjs} package-lock.json resolved.json
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if generating a custom format resovled.json here brings us much benefit.
It might be more performant than parsing with nix, but the problem is, than we need a pure nix package-lock.json v2/v3 parser anyways, because otherwise we cannot support on-the-fly translation of upstream projects.

Your current implementation forces the user to use an impure translator whenever they have a v3 lock file, which is not optimal for all scenarios.

Therefore I think it would be better if we had all the lock file parsing logic in nix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but if this were a runcommand, could it be part of pure? It's really hard to do this sort of conversion in Nix :-(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DavHau I tried moving the runcommandlocal to the pure translator but that doesn't get pkgs. Would it not be allowable to run pure commands in translators?

Like I said, it is way easier to resolve the packages in js than in nix, that cyclee inverter took me several hours where it would take me 10 minutes in js.

Copy link
Member

@DavHau DavHau Jul 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed the comment earlier. if you used runCommand, then this would be a IFD/recursive translator. See the docs for translators This still counts as pure, but as it requires IFD it comes with a lot of penalties. The best would be to have it in pure nix language, though I'm fine with IFD as well. We can improve this later on.

@DavHau I tried moving the runcommandlocal to the pure translator but that doesn't get pkgs. Would it not be allowable to run pure commands in translators?

See the haskell stack-lock translator on how to get the pkgs

@@ -125,6 +125,7 @@ in
${coreutils}/bin/rm -rf $TMPDIR
'';

# TODO is this really needed? Seems to make builds slower, why not unpack + build?
Copy link
Member

@DavHau DavHau Jul 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially yes, because it leads to more builds, and nix is bad in launching many small builds fast. But it leads to a caching of the archive extraction, which should improve performance on subsequent iterations on your build.

The main reason for this design is, that currently dream2nix's fetching layer is expected to output raw directory trees.
One reason for this decision was that sources for dependencies can be defined relative to other dependency sources.
For example source-foo can be defined as "${source-bar}/some/dir".
Knowing, we always work with extracted directories, we can assemble these paths during eval time previous to build time.

If we use source archives as build inputs, we'd have to move this logic to build time.
But maybe this would be the better design choice, as we ran into problems like: #158, where workspace subpackges try to access files of parent sources, which did not exist anymore.
Maybe it would be better to just pass the original source, and set sourceRoot to make mkDerivation cd into the correct subdir.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed a bunch of runCommands to runCommandLocal so that at least it wouldn't send unpacked archives back and forth to the binary cache. Saves a whole bunch of lookups on large projects, too.

if [ -f ./tsconfig.json ] \
&& node -e 'require("typescript")' &>/dev/null; then
# configure typescript to resolve symlinks locally
# TODO is this really needed? Node doens't use it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by node doesn't use it?
If this option is not set, then typescript compilation might fail, which we should avoid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this option and its Node equivalent is only necessary if the tree of symlinks is incorrect. The way the node_modules are built now, each module sees only the modules it's supposed to see, and it should work that way in ts as well.

So if typescript needs the flag, it means it's accessing modules that it didn't request.

Not sure what the memory implications of this flag are, if it treats multiple symlinks to the same module as multiple copies of the module. That would also break module-scoped storage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but I believe there are cases where upstream lock files do have incomplete dependency declarations. I think for example in package-lock v1, peerDependencies are not declared explicitly, instead it is assumed that the parent installs them. Therefore preserveSymlinks is required in order to find the parent. Otherwise a ../ will lead to the /nix/store path of the current module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in v2 they are indeed declared. What this PR does is extract all deps each package needs and then resolves them the way node would resolve them. So peer deps are either resolved to the correct version, or missing from the project entirely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's great. But we need to be downward compatible to the old package-lock.json versions as well. So we'd need to find a of solution for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's a solution, the information is just missing and the logic we have in place for v1 doesn't work for e.g. apollo-server-errors, it can't find graphql.

So in that case I would say that the pure translator should just say "please run resolveImpure", which would generate a v2 lockfile.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's a solution, the information is just missing and the logic we have in place for v1 doesn't work for e.g. apollo-server-errors, it can't find graphql.

As explained in #198 (comment) It worked just fine with the old build logic, assuming that preserveSymlinks is used currently by all tooling.
I just tried building apollo-server-errors as well with the default dream2nix flake and it builds just fine.
If I'm wrong please paste a flake, that demonstrates it.

I think what you are doing here is an improvement, that frees us from having to use preserveSymlinks, which is really great, because many build tools don't cope well with it.

But I don't think we should throw away the old installation logic, as long as the new logic only improves upon some use cases, but breaks others. Currently it breaks all lock file v1 projects, and requiring resolveImpure on all these projects is a significant trade-off.

I think we should either extend the build logic to also support v1, or we should keep the old build logic as an alternative.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

point taken, I can actually just do the v2 logic when the file is v2. I'll adjust.

@wmertens
Copy link
Contributor Author

@DavHau eslint has a cyclic dep with eslint-utils, and es-abstract has 3 packages like that. Just installing these in an empty dir is sufficient to test.

But as explained in #198, the cyclic resolver is not working right.

@DavHau
Copy link
Member

DavHau commented Jul 15, 2022

@DavHau eslint has a cyclic dep with eslint-utils, and es-abstract has 3 packages like that. Just installing these in an empty dir is sufficient to test.

But as explained in #198, the cyclic resolver is not working right.

eslint builds and executes just fine with current dream2nix.
Try this flake.nix:

{
  inputs.dream2nix.url = "github:nix-community/dream2nix";
  inputs.src = {flake = false; url = "github:eslint/eslint";};
  outputs = inp:
    let
      l = inp.dream2nix.inputs.nixpkgs.lib // builtins;
    in
    inp.dream2nix.lib.makeFlakeOutputs {
      # modify according to your supported systems
      systems = ["x86_64-linux"];
      config.projectRoot = ./.;
      source = l.path {
        path = inp.src;
        filter = path: _: ! l.hasInfix "/tests/" path;
      };
    };
}

Maybe you are running into issues, because your new build logic is installing dependencies with a different strategy.

Don't misunderstand me, I'm very open to any improvements. But if bigger changes are done to the logic, like with this PR, I'd really like to see at least one example that demonstrates the the benefit.

@wmertens
Copy link
Contributor Author

This now works like magic with the ifd! dream2nix is such a nice project 🤩

So apart from some small todos it just needs a solution for #198 so it doesn't crash on es-abstract. It supports v1 lockfiles again.

no need for package-json.bak, just put the original data aside in the JSON.
make it easier to implement cycle management
- prevent bad permissions if possible
- fix bad permissions with single chmod command
- join multiple jq calls
- make electron wrapping more readable
- works out-of-the-box, no node|tsc settings necessary
- optimal use of storage, composable
- handles cyclic dependencies by co-locating cycles in the same store path

Changes:
- build node_modules as a separate derivation, use everywhere
- if main package has a build script, run it with dev modules and
  afterwards run install script if present, then switch to prod modules
- only run build scripts when necessary, speedup
- optionally add all transitive binaries to the node_modules/.bin
- store binaries in bin/
  NixOS standard location, and .bin should only be used for dependencies,
  not the main package
- in a package, .bin is now a symlink to bin
- in bin name, strip .js ending for string case
- remove now-unnecessary code

This builder requires all peer dependencies for packages to be mentioned and all transitive circular dependencies to be grouped
- discovers peer dependencies correctly
- adds metadata like os, dev and hasInstallScripts to sources
no need for package-json.bak, just put the original data aside in the JSON.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants