Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rework package manager #17392

Merged
merged 39 commits into from
Oct 9, 2023
Merged

rework package manager #17392

merged 39 commits into from
Oct 9, 2023

Conversation

andrewrk
Copy link
Member

@andrewrk andrewrk commented Oct 4, 2023

Organize everything around a Fetch task which does a bunch of stuff in a
worker thread without touching any shared state, and then queues up
Fetch tasks for its dependencies.

This isn't the theoretical optimal package fetching performance because
CPU cores don't necessarily map 1:1 with I/O tasks, and each fetch task
contains a mixture of computations and I/O. However, it is expected for
this to significantly outperform master branch, which fetches everything
recursively with only one thread.

The logic is now a lot more linear and easy to follow. Everything that
is embarassingly parallel is done on the thread pool, and then after
everything is fetched, the worker threads are joined and the main thread
does the finishing touches of stitching together the dependencies.zig
import files. There is only one tiny little critical section.

This also lays the groundwork for #14281 because in system mode, all
this fetching logic will be skipped, but the "finishing touches"
mentioned above still need to be done. With this branch, that logic is
separated out and no longer recursively tangled with fetching stuff.

Additionally, this branch:

Merge Checklist:

  • read environment variable for workaround and set that flag
  • make it an error to omit paths in the root package but allow missing paths in dependencies to retain previous behavior. Keeps existing projects working while ensuring an upgrade over time.
  • apply filter rules in computeHash
  • move Module and a couple other files to inside Package/
  • regression test against ghostty
  • regression test against groove basin
  • regression test against mach
  • restore progress reporting
  • use the Compilation arena for creating @cImport modules
  • zig fetch on a dir is giving a bad error message
  • do something about the regressed Module.getName() compile error message test case

Follow-up issues

@xdBronch
Copy link
Contributor

xdBronch commented Oct 4, 2023

#17094 should be closed in favor of this if it works around the bug

@andrewrk andrewrk mentioned this pull request Oct 4, 2023

[semver](https://semver.org/)

### `dependenices`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
### `dependenices`
### `dependencies`

self.entries.sort(sort_ctx);
pub inline fn sortUnstable(self: *Self, sort_ctx: anytype) void {
if (@sizeOf(ByIndexContext) != 0)
@compileError("Cannot infer context " ++ @typeName(Context) ++ ", call sortContext instead.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@compileError("Cannot infer context " ++ @typeName(Context) ++ ", call sortContext instead.");
@compileError("Cannot infer context " ++ @typeName(Context) ++ ", call sortUnstableContext instead.");

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Organize everything around a Fetch task which does a bunch of stuff in a
worker thread without touching any shared state, and then queues up
Fetch tasks for its dependencies.

This isn't the theoretical optimal package fetching performance because
CPU cores don't necessarily map 1:1 with I/O tasks, and each fetch task
contains a mixture of computations and I/O. However, it is expected for
this to significantly outperform master branch, which fetches everything
recursively with only one thread.

The logic is now a lot more linear and easy to follow. Everything that
is embarassingly parallel is done on the thread pool, and then after
everything is fetched, the worker threads are joined and the main thread
does the finishing touches of stitching together the dependencies.zig
import files. There is only one tiny little critical section and it does
not even have any error handling in it.

This also lays the groundwork for #14281 because in system mode, all
this fetching logic will be skipped, but the "finishing touches"
mentioned above still need to be done. With this branch, that logic is
separated out and no longer recursively tangled with fetching stuff.

Additionally, this branch:
 * Implements inclusion directives in `build.zig.zon` for deciding which
   files belong the package (#14311).
 * Adds basic documentation for `build.zig.zon` files.
 * Adds support for fetching dependencies with the `file://` protocol
   scheme (#17364).
 * Adds a workaround for a Linux/btrfs file system bug (#17282).

This commit is a work-in-progress. Still todo:

1. Hook up the CLI to the new system.
2. Restore the module table creation logic after all the fetching is
   done.
3. Fix compilation errors, get the tests passing, and regression test
   against real world projects.
* start renaming "package" to "module" (see #14307)
  - build system gains `main_mod_path` and `main_pkg_path` is still
    there but it is deprecated.
* eliminate the object-oriented memory management style of what was
  previously `*Package`. Now it is `*Package.Module` and all pointers
  point to externally managed memory.
* fixes to get the new Fetch.zig code working. The previous commit was
  work-in-progress. There are still two commented out code paths, the
  one that leads to `Compilation.create` and the one for `zig build`
  that fetches the entire dependency tree and creates the required
  modules for the build runner.
Finish the work started in 4c4fb839972f66f55aa44fc0aca5f80b0608c731.
Now the compiler compiles again.

Wire up dependency tree fetching code in the CLI for `zig build`.
Everything is hooked up except for `createDependenciesModule` is not yet
implemented.
For path-relative dependencies, they always need to be added to the hash
table at the end. For remote dependencies, they never need to be added.
Only problem is that it looks like `has_build_zig` is being false when
it should be true.

After that is fixed then main.zig needs to create the `@dependencies`
module from the generated source code.
* add Module instances for each package's build.zig and attach it to the
  dependencies.zig module with the hash digest hex string as the name.
* fix incorrectly skipping the wrong packages for creating
  dependencies.zig
* a couple more renaming of "package" to "module"
@andrewrk andrewrk marked this pull request as ready for review October 8, 2023 23:55
@andrewrk andrewrk requested a review from kristoff-it as a code owner October 8, 2023 23:55
Since I'm about to rely on this behavior.
Instead of every file deletion being followed by a recursive attempt to
remove the parent directory, while walking the directory, every
directory that will have nonzero files deleted from it is tracked in an
array hash map.

After all the threads have finished deleting and hashing, the parent
thread sorts the "suspicious" directories by length, descending,
ensuring that children appear before parents, and then iterates over the
array hash map, attempting a rmdir operation on each. Any rmdir that
succeeds appends the parent directory to the map so that it will be
removed if empty.
@andrewrk andrewrk merged commit f7bc55c into master Oct 9, 2023
@andrewrk andrewrk deleted the fetch branch October 9, 2023 18:47
gentoo-bot pushed a commit to gentoo/guru that referenced this pull request Oct 10, 2023
mitchellh added a commit to mitchellh/libxev that referenced this pull request Oct 24, 2023
Small updates for latest Zig updates:

- ziglang/zig#17623
- ziglang/zig#17392

Tested with `zig build install` and `zig build test`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants