Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

introduce Zig Object Notation and use it for the build manifest file (build.zig.zon) #14523

Merged
merged 2 commits into from
Feb 3, 2023

Conversation

andrewrk
Copy link
Member

@andrewrk andrewrk commented Feb 3, 2023

Note: the actual diff is much smaller; I moved parse logic from one file to another, but it is unmodified except for an additional function.

Description

This branch introduces Zig Object Notation which can be parsed using the standard library. The same function used to parse zig files is augmented to also parse zon files, using an enum to toggle between the two modes. I considered using a separate implementation to parse zon files, however, I made the decision to reuse existing logic for the following reasons:

  • Shared logic is nice for maintenance
  • Shared logic keeps the binary size of the compiler smaller
  • Reusing the logic for rendering the AST is nice. We get zig fmt for free.
  • It also lets us reuse a bunch of error reporting logic

Next, this branch switches to build.zig.zon instead of build.zig.ini for the manifest file and enhances the relevant error reporting.

This is the last remaining planned breaking change to the package manager, other than these two:

These will not be quite as invasive breakages, so, after this branch is merged, it starts to become more feasible for people to jump in and start using this new feature of zig.

Closes #14290.

Follow-up work to be done:

@andrewrk andrewrk added enhancement Solving this issue will likely involve adding new logic or components to the codebase. breaking Implementing this issue could cause existing code to no longer compile or have different behavior. zig build system std.Build, the build runner, `zig build` subcommand, package management labels Feb 3, 2023
 * std.zig.parse is moved to std.zig.Ast.parse
 * the new function has an additional parameter that requires passing
   Mode.zig or Mode.zon
 * moved parser.zig code to Parse.zig
 * added parseZon function next to parseRoot function
 * improve error message when build manifest file is missing
 * update std.zig.Ast to support ZON
 * Compilation.AllErrors.Message: make the notes field a const slice
 * move build manifest parsing logic into src/Manifest.zig and add more
   checks, and make the checks integrate into the standard error
   reporting code so that reported errors look sexy

closes #14290
@Jarred-Sumner
Copy link
Contributor

Jarred-Sumner commented Feb 3, 2023

remove the restriction from the zig language about identifiers having null bytes or being empty. this is necessary so that the following two conditions can be met:

Does that mean you could technically have structs which appear callable? I think that could be nice if used sparingly

const log = struct {
     pub var allow_debug: bool = false;

     pub fn (comptime fmt: []const u8, args: anytype) void {
          std.io.getStdOut().writer().print(fmt, args) catch unreachable;
     }

     pub fn debug(comptime fmt: []const u8, args: anytype) void {
         if (!allow_debug) return;
          std.io.getStdOut().writer().print(fmt, args) catch unreachable; 
     }
};

pub fn main() anyerror!void {
   log("i am a callable struct", .{});
   log.debug("i am also callable", .{});
   log.allow_debug = true;
}

@InKryption
Copy link
Contributor

Does that mean you could technically have structs which appear callable?

I don't think that's what's implied. I imagine you'd still have to use arbitrary identifier syntax, so it would have to be '@""', thus your example would be log.@""("I am a callable struct", .{}).

@scheibo
Copy link
Contributor

scheibo commented Feb 3, 2023

On #14290 there appeared to be a decent amount of support for naming the file just build.zon instead of build.zig.zon #14290 (comment). Is this something you would entertain a proposal for?

@andrewrk
Copy link
Member Author

andrewrk commented Feb 3, 2023

The build manifest filename sits in the root of the global namespace of a project source directory, so it must communicate the following pieces of information:

  • That the file is for zig specifically
  • That the file pertains to the build system/package manager

"build.zon" fails the first requirement.

@scheibo
Copy link
Contributor

scheibo commented Feb 3, 2023

The "z" in .zon would probably help with the first requirement (in the same way the "js" in .json ties it to JS in the package.json)? .zig.zon adds stutter and is atypical compared to most filename extensions you encounter and I would argue that its a suboptimal place to spend weirdness budget given some is already being spent on .zon in the first place (though I do personally agree with the decision of using the notation for the config language).

This is perhaps a fairly minor gripe, and I can't argue with .zig.zon "communicating intent precisely", but just felt I'd bring it up since the suggestion seemed well received but left unaddressed on the original proposal. :)

@andrewrk
Copy link
Member Author

andrewrk commented Feb 3, 2023

"package.json" is a great example of a poorly named manifest file. JSON is a standard file format that, while it uses a subset of javascript syntax, it does not mean that the contents of the data are related to javascript. Similarly, ZON is intended to be a standard file format that any project could decide to use, and the contents of the file may not have anything to do with Zig.

"package.json" could very well be the package manifest file for Ruby, Python, Rust, Zig, Go, etc., since ".json" is only the data format, and does not have anything to do with the meaning of the data. Of course we know it to be specifically for nodejs, which already has a habit of naming things poorly ("/usr/bin/node" rather than "/usr/bin/nodejs").

So, I hope you will agree now, "build.zon" fails the first requirement. As an example of an acceptable filename, consider "zig-pkg.zon". Whether to spell it out as "package" should be correlated with the related subcommand, which would probably be zig pkg.

@andrewrk andrewrk merged commit 60935de into master Feb 3, 2023
@andrewrk andrewrk deleted the zon branch February 3, 2023 19:20
@mitchellh
Copy link
Contributor

No one asked or wants my opinion on this lol but I will just say that I think build.zig.zon is fine to me. I like that it will sort nicely and the appendage to “build.zig” shows its an addition or appendix to Zig building to me.

And, aside, I’m excited for Zon in general I think that was a great idea. Ini felt bleh but I had nothing constructive to add there, only complaints, so I kept my mouth shut. Now that Zon was proposed and done, I think this is a really cool direction. I’m going to play around with it more generally as a config format.

@kuon
Copy link
Contributor

kuon commented Feb 4, 2023

I agree with .zig.zon. At first I was hesitant, as I pointed, but then I realized that you don't want to use the extension .zig.zon forzon "in general", it's just specific to build.zig.zon which is zon describing build.zig code (like html.heex being html describer as heex in my example in my comment).

@francescoalemanno
Copy link
Contributor

I would have opted for having .zig and .zigon, but that's perhaps just me.
Anyway, excellent work. Brilliant idea using Zig notation for config files :)

@sarvex
Copy link

sarvex commented Apr 17, 2023

Apologies, I am late to the party, but Apologies for a noob question, but is there any specific reason we have to separate build.zig and build.zig.zon I would have preferred just an upgrade of URI to handle http as well. Something like this in build. Zig

    const libz_dep = b.dependency("libz", .{
        .target = target,
        .optimize = optimize,
        .url = "https://github.com/andrewrk/libz/archive/8bf2a1c61b87773131bee2086737b5f02b7c0888.tar.gz",
        .hash = "12209e851f7e2c6ba2f01de3e11b1771f03e49666065320abd8414aac152bfa75fae",
    });

or better yet

    const libz_dep = b.dependency("https://github.com/andrewrk/libz/archive/8bf2a1c61b87773131bee2086737b5f02b7c0888.tar.gz", .{
        .target = target,
        .optimize = optimize,
        .hash = "12209e851f7e2c6ba2f01de3e11b1771f03e49666065320abd8414aac152bfa75fae",
    });

or the best, hashes automatically calculated by external service providers like https://www.metalinker.org/

    const libz_dep = b.dependency("https://github.com/andrewrk/libz/archive/8bf2a1c61b87773131bee2086737b5f02b7c0888.tar.gz", .{
        .target = target,
        .optimize = optimize,
    });

@mlugg
Copy link
Member

mlugg commented Apr 17, 2023

The problem with specifying dependencies completely within the build.zig is that it requires executing arbitrary build code just to build a dependency graph. That's bad for tooling, both first-party Zig tooling (where you want to be able to, for instance, fetch dependencies without actually running the build script) and third-party tooling (which might be, as a random example, trying to output a HTML document visually representing dependency trees); effectively, it means you have to trust the build script to use any tooling at all

@mlugg
Copy link
Member

mlugg commented Apr 17, 2023

Also, the reason we have to specify the hash is that the package manager is completely decentralised and does not (and should not) rely on any external service, and the hash is necessary for the package manager in its current form to function without redundant fetches every single time you build.

@sarvex
Copy link

sarvex commented Apr 17, 2023

The problem with specifying dependencies completely within the build.zig is that it requires executing arbitrary build code just to build a dependency graph. That's bad for tooling, both first-party Zig tooling (where you want to be able to, for instance, fetch dependencies without actually running the build script) and third-party tooling (which might be, as a random example, trying to output a HTML document visually representing dependency trees); effectively, it means you have to trust the build script to use any tooling at all

I agree with the problem of executing arbitrary build code, but the problem is the same will remain same whether the dependency is mentioned in build.zig.zon or build.zig.

I am not a great fan of JAVA but one problem it solved for masses was the exclusion of ugly HEADER_FILES.

But now it seems that ZIG is trying to bring HEADER_FILES back under the guise of ZON.

ZON basically goes against the first principle of ZIG programming language
A Simple Language
Focus on debugging your application rather than debugging your programming language knowledge.

  • No hidden control flow.
  • No hidden memory allocations.
  • No preprocessor, no macros.

To be precise, ZON is the PREPROCESSOR zig promised to keep away....

@sarvex
Copy link

sarvex commented Apr 17, 2023

Also, the reason we have to specify the hash is that the package manager is completely decentralised and does not (and should not) rely on any external service, and the hash is necessary for the package manager in its current form to function without redundant fetches every single time you build.

You are missing the whole point here... Package Manager is decentralized, you will download the packages as peer dependency, and I am pointing to the fact that SHA256 HASH can also be downloaded, cached and verified completely through decentralization. Check the auto-update features implementation in Scoop for a clear understanding of the concept

@silversquirl
Copy link
Contributor

I agree with the problem of executing arbitrary build code, but the problem is the same will remain same whether the dependency is mentioned in build.zig.zon or build.zig.

No, zon is not code and doesn't need to be executed. It's just a data format, like JSON.

But now it seems that ZIG is trying to bring HEADER_FILES back under the guise of ZON.

You might have to explain this a bit more haha 😅
To me, build.zig.zon doesn't seem analagous to header files at all. It's just like the cargo.toml of Rust, or the packages.json of Node.js.

SHA256 HASH can also be downloaded, cached and verified

The hash serves two purposes:

  1. verification of package files (to ensure nothing changed since you started depending on them)
  2. finding the package in the cache

Both of these require that the hash should be stored in the package configuration, rather than downloaded from anywhere, for different reasons:

  1. if the hash is stored online, a malicious actor who might want to change the package content could also simply update the hash to match
  2. the hash now needs to be cached somehow, but since it's what's used for caching that now means we need an extra caching system for hashes, in addition to the normal package cache

@sarvex
Copy link

sarvex commented Apr 18, 2023

No, zon is not code and doesn't need to be executed. It's just a data format, like JSON.

Preprocessor also does not need to be executed. Check the definition and you will understand. JSON is a poor example, Javascript/ECMAScript is moving to config.js because JSON is bad. Also the build.zig.zon looks nothing like a Zig Structure. If it were a Zig Object, it would have looked like

Build {
    .name = "libffmpeg",
    .version = "5.1.2",

    .dependencies = Dependency {
        .libz = .{
            .url = "https://github.com/andrewrk/libz/archive/8bf2a1c61b87773131bee2086737b5f02b7c0888.tar.gz",
            .hash = "12209e851f7e2c6ba2f01de3e11b1771f03e49666065320abd8414aac152bfa75fae",
        },

        .libmp3lame = Dependency {
            .url = "https://github.com/andrewrk/libmp3lame/archive/b8408fa9dc751baee2a03cd430a996920ca2c5bc.tar.gz",
            .hash = "122025fb57739eb67edbbafed2b270479089ee7395cf4b0849001ab95c16a2bae0d9",
        },
        .libvorbis = Dependency {
            .url = "https://github.com/andrewrk/libvorbis/archive/a1dc13ff76e262007eea91894dd16d74c2147619.tar.gz",
            .hash = "1220d6c5bf73bee37589a086147ab8f7e855ee98becd3f2e26e58a920ca8f766105f",
        },
        .libogg = Dependency {
            .url = "https://github.com/andrewrk/libogg/archive/9f514ae46e28589e47d25017385eb0d6ff4c7e9a.tar.gz",
            .hash = "1220f96a4eaae5bad95ab9391431f125b7cc32edbd6d17397ce066d498f8fc9b63c2",
        },

        // This is used to compile some assembly files into object files for x86.
        // Without this, ffmpeg considers the build "crippled".
        .nasm = Dependency {
            .url = "https://github.com/andrewrk/nasm/archive/5ef3ed029e4917d07c88f265c7b12fcfd81bd6ab.tar.gz",
            .hash = "122032bc8d97d857b7c2f71252da293e4f293a4ea0d162909fb0705ba17c40ae2a87",
        },
    },
}

You might have to explain this a bit more haha 😅 To me, build.zig.zon doesn't seem analogous to header files at all. It's just like the cargo.toml of Rust, or the packages.json of Node.js.

Header It's CompSci 101 not Compiler Design 301, or are they not teaching compiler design anymore?

The same information is repeated, in the code and the header file. Who do you think is going to maintain those? Either you have worked on commercial C/C++ projects or you have not - no one can explain the problem of maintaining Header files.

cargo.toml sure if you want to be rust, but then tell the world that ZIG cannot handle building, don't brag that ZIG is a build system as well. CARGO:RUST:::ZIG - doesn't require a deep neural network based on BoW to solve this equation.

Even if you were to be cargo.toml, Cargo does not differentiate between local and global dependencies, neither should you. To make it clearer there is only one CARGO.TOML - uno, une, ein, #RustFoundation did not create another CARGO.TOML.RON to take care of external dependencies.

package.json sure if you don't want to learn from history, otherwise you should look at something better maybe go get.
Furthermore, there is only one package.json, do you see package.node anythere? Or just for enumeration do you see a npm.json
Again you prove my point internal and external dependencies are managed in a single file.

The hash serves two purposes:

  1. verification of package files (to ensure nothing changed since you started depending on them)
  2. finding the package in the cache

Both of these require that the hash should be stored in the package configuration, rather than downloaded from anywhere, for different reasons:

  1. if the hash is stored online, a malicious actor who might want to change the package content could also simply update the hash to match

LOL, haha 😅. If it could happen, it would have happened already. It is not possible and the proof is Magnet. Are you living under a rock? Let's leave security to the experts now.
Metalink version 3.0 was released in 2005? If the hallucinated hackers have not done this in last 18 years and peer reviewed RFC5854 and RFC6249 are farce, then show us a PoC?

  1. the hash now needs to be cached somehow, but since it's what's used for caching that now means we need an extra caching system for hashes, in addition to the normal package cache

Cached somehow, why do you think burdening programmer instead of a computer / build system / compiler is a better option?
Maybe it is still better to use a Magnetic Needle but I don't think they work on SSD or NVME 🤣😂🤣

If the community or foundation cannot stick to its own principles, I think it is better to go back to #EvilCorporations.
Why is it so that ZIG spends so much time bashing C / C++ and still follows the same design principles?
How is it that I have to copy/paste the contents of website as a reminder to what this is all about?

@silversquirl
Copy link
Contributor

@sarvex please try to be civil, being patronizing and throwing insults around helps nobody :)

JSON is a poor example

JSON is directly comparable to zon. Zon doesn't require execution because it's not a data format. This makes it easier to write tooling that deals with dependency graphs, as @mlugg mentioned earlier.

Also the build.zig.zon looks nothing like a Zig Structure

Actually, Zig struct literals can absolutely take the form .{ .field = value, ... }. The . means "infer the type". You can see this in the docs a little below the example you linked.

The same information is repeated

The only information that's duplicated between build.zig and build.zig.zon is the package names. Hopefully those are short enough that it's not a big deal :)

If it could happen, it would have happened already. It is not possible and the proof is Magnet

I'm not entirely sure how magnet links support your position here - they include a hash directly in the link, meaning you're still putting the hash in the file.

Cached somehow, why do you think burdening programmer instead of a computer / build system / compiler is a better option?

To be fair, you're right - you don't strictly need the hashes in order to do caching effectively. However, given you do need the hashes for security reasons, it makes sense to use them for caching as well :)

@sarvex
Copy link

sarvex commented Apr 18, 2023

JSON is directly comparable to zon. Zon doesn't require execution because it's not a data format. This makes it easier to write tooling that deals with dependency graphs, as @mlugg mentioned earlier.

Dependency Graphs are Dependency Graphs, they do not differentiate between file:// and https:// or git+shh://. If we can agree on that then the discussion can be fruitful. And we need to also agree @silversquirl that doing things right if more important than doing easy things.

Actually, Zig struct literals can absolutely take the form .{ .field = value, ... }. The . means "infer the type". You can see this in the docs a little below the example you linked.

Agreed, I was not aware of that.

The only information that's duplicated between build.zig and build.zig.zon is the package names. Hopefully those are short enough that it's not a big deal :)

The only information repeated in header file is the signature, when we are talking about C / C++.
Only cannot be the excuse for repetition.

I'm not entirely sure how magnet links support your position here - they include a hash directly in the link, meaning you're still putting the hash in the file.

Then we need to have a better understanding of how magnets work and how they can free us from the .hash. Won't you agree?

To be fair, you're right - you don't strictly need the hashes in order to do caching effectively. However, given you do need the hashes for security reasons, it makes sense to use them for caching as well :)

Yeah, so we agree that caching is different than burdening the programmer?

My argument boils down to a single line
When we can merge the files, why segregate them?
Simplicity is hard to achieve and that is the reason why we all started with ZIG!

We must test our underlying beliefs to see if they are really true or not

As the timeless saying goes - Only a fighter falls while fighting on the battlefield. It is not a child who walks on his knees.

@DraagrenKirneh
Copy link
Contributor

DraagrenKirneh commented Apr 18, 2023

@sarvex
The package manager / package handling is far from done. you can track the progress from https://github.com/ziglang/zig/projects/4.

I would say that the current version of zig.build.zon is no more than a lock file right now. The hashes are there to maintain a consistent reproducible build.

I would think based on the tasks in the project that this will evolve such that in most cases users will use the command line zig pkg with its sub-commands to manage package related stuff. the .zon file would just be mostly a data file you interface via these commands so you do not need to manually edit them unless you really really want to. Note also that it most likely would not just be the url and hash present in a dependency node, but it could be other fields as well supplied by that package.

A crude example of what it might look like using zig pkg:

zig pkg init my-package-name
zig pkg add "@zig/std-extra" --version 0.42
zig pkg add "@draagren/fuzzy-matcher" 
zig pkg add "@user/package" 
zig pkg fetch // downloads and verifies that dependencies match
zig pkg version next-minor // updates your own package version
... 
zig pkg check // checks for any updates
zig pkg update "@zig/std-extra"
zig pkg remove "@user/package"

@sarvex
Copy link

sarvex commented Apr 18, 2023

@DraagrenKirneh I was not familiar with the perspective you are sharing now, but if it is a lock file, automatically generated by the tool,

  1. Why publicize something which is going to be generated/maintained automatically. We are opening a pandora's box with ZON
  2. Why reinvent the wheel when lockfiles are already standardized.
  3. Why is MonoRepo ruled out, what are we offering different.
  4. When build.zig does not need a lockfile earlier with local repositories, why to create a lockfile explicitly for external dependencies.
  5. How are we better or different than go imports or deno import.

Even with the updated perspective it is really important to understand that just because we are handling external repositories, we do not need additional files.

zig pkg update "usr_pkg"
zig pkg remove "@user/package"
zig pkg add "../usr/pkg"

Why should these commands be different? And why do the need a separate ZON file, why can't they make changes to build.zig directly? Or coming back to the original design - why cannot these commands be automatically invoked by build.zig?

With a greenfield approach GO and DENO nailed it. the best possible mechanism for importing modules, the most simplistic model possible. When we have seen better things, Why are we regressing to dark ages?

@DraagrenKirneh
Copy link
Contributor

but if it is a lock file, automatically generated by the tool,

  1. Why reinvent the wheel when lockfiles are already standardized.

When build.zig does not need a lockfile earlier with local repositories, why to create a lockfile explicitly for external dependencies.

I think I may have not worded myself good enough in the previous post.
My own opinion of the what I compare the current zig.build.zon to is that of a lockfile of other package managers as it is in this current stage of development. It does not mean that it is a lockfile, nor that it will stay a lockfile in the future as development towards a fully workable package manager progresses.

If you read through some of the ideas / threads on the different tasks in the project you would see that there is more than just using it as a lockfile which are planned / discussed.

Why publicize something which is going to be generated/maintained automatically.

So that it is possible to get feedback, be enhanced, and built upon. For example like getting feedback, bugs, issues on downloading the packages themselves so this part can be improved upon.
The current version of the package system is a MVP/WIP.

Why is MonoRepo ruled out, what are we offering different.

??? what do you mean here, the terminology, or the tool, or something else?

Why should these commands be different? And why do the need a separate ZON file, why can't they make changes to build.zig directly? Or coming back to the original design - why cannot these commands be automatically invoked by build.zig?

There are a lot of pitfalls going down a edit build.zig as it can do whatever, and you would actually need to run the build itself and the code to get the real data needed, consider this mock example in a build.zig:

var list_of_dependencies_not_known_before_runtime = callSomeFunctionToGetData();
for (list_of...) | each | {
 const dep = b.dependency(each.name);
 lib.linkLibrary(dep.artifiact(each.artifact);
}

With a greenfield approach GO and DENO nailed it. the best possible mechanism for importing modules, the most simplistic model possible. When we have seen better things, Why are we regressing to dark ages?

Never used any of those so I cannot say much about them, but again if you read through the proposals it would give you a better idea about the scope this package manager is trying to solve.

@sarvex
Copy link

sarvex commented Apr 18, 2023

I think I may have not worded myself good enough in the previous post. My own opinion of the what I compare the current zig.build.zon to is that of a lockfile of other package managers as it is in this current stage of development. It does not mean that it is a lockfile, nor that it will stay a lockfile in the future as development towards a fully workable package manager progresses.

Sure, opinions are necessary. Either we need the lockfile or we don't - it can't be one for external dependencies i.e. https:// and another for local i.e. file://

If you read through some of the ideas / threads on the different tasks in the project you would see that there is more than just using it as a lockfile which are planned / discussed.

Definitely, I am unaware of these discussion. I stumbled upon this PR because mach wanted to move to new zig package manager. Point me in the right direction and I will definitely go through the discussions and contribute

So that it is possible to get feedback, be enhanced, and built upon. For example like getting feedback, bugs, issues on downloading the packages themselves so this part can be improved upon. The current version of the package system is a MVP/WIP.

Definitely feedback is crucial, but again there is nothing about ZON being self-generated and the answers I received earlier echo the same notion.

Why is MonoRepo ruled out, what are we offering different.
??? what do you mean here, the terminology, or the tool, or something else?

MonoRepo as a concept implemented by bazel, Nx and so on, basically...

  • Local computation caching
  • Local task orchestration
  • Distributed computation caching
  • Distributed task execution
  • Transparent remote execution
  • Detecting affected projects/packages
  • Workspace analysis
  • Dependency graph visualization
  • Source code sharing
  • Consistent tooling
  • Code generation
  • Project constraints and visibility

There are a lot of pitfalls going down a edit build.zig as it can do whatever, and you would actually need to run the build itself and the code to get the real data needed, consider this mock example in a build.zig:

var list_of_dependencies_not_known_before_runtime = callSomeFunctionToGetData();
for (list_of...) | each | {
 const dep = b.dependency(each.name);
 lib.linkLibrary(dep.artifiact(each.artifact);
}

Again, C / C++ started with HEADER_FILES to deal with list_of_dependencies_not_known_before_runtime and the rest is history. If GO and DENO can handle it so can ZIG.

Never used any of those so I cannot say much about them, but again if you read through the proposals it would give you a better idea about the scope this package manager is trying to solve.

It would request you to go through the examples provided in the links to have a better understanding of what if the current State of Craft.

Node has package.json because they do not have build.js, but ZIG already has build.zig and thus cannot have a build.zig.zon. build.zon or build.zig choose one, but not both... simplicity mandates....

@silversquirl
Copy link
Contributor

You haven't as of yet provided any practical solutions for the problems build.zig.zon solves. If you have a simpler solution in mind, please do suggest it, otherwise I don't think continuing this conversation is particularly productive :)

@sarvex
Copy link

sarvex commented Apr 19, 2023

@silversquirl I have provided all the information necessary to solve the problem without using build.zig.zon. But instead of understanding or working on it, everyone is ready to defend build.zig.zon. You can read all my previous comments and will know more about these.

To sum up

  1. Understand what a build system is and how monorepo is essential to build.zig
  2. The information added to build.zig.zon can easily be put in build.zig - my first comment defines the three approaches.
  3. Stick to simplicity

If this is hard to implement then let me know and I will make the necessary changes.

@silversquirl
Copy link
Contributor

The information added in build.zig.zon can easily be put in build.zig - my first comment defines the three approaches.

None of those approaches work, as they require executing code to build the dependency graph. The whole point of build.zig.zon is to remove the need for arbitrary code execution for building dependency graphs, which speeds things up and makes tooling safer and simpler.

@sarvex
Copy link

sarvex commented Apr 19, 2023

The information added in build.zig.zon can easily be put in build.zig - my first comment defines the three approaches.

None of those approaches work, as they require executing code to build the dependency graph. The whole point of build.zig.zon is to remove the need for arbitrary code execution for building dependency graphs, which speeds things up and makes tooling safer and simpler.

Premature optimization is root cause of all evil

What you are calling arbitrary code execution used to be called heuristics

Zig is not the first build system trying to build a dependency graph by arbitrary code execution, it has been done by so many other tools that in 2023 an AI bot can write the code required. Bazel, Nx and TurboRepo even provide visualization tools for dependency graph.

@sarvex
Copy link

sarvex commented Apr 19, 2023

Even at the dumbest level the contents if I put the contents of build.zig.zon inside build.zig we get rid of the the HEADER_FILE

@silversquirl
Copy link
Contributor

silversquirl commented Apr 19, 2023

If you have a suggestion for a safe, simple, fast way to do this with arbitrary code execution I'd love to hear it.
I do not believe this is a case of premature optimization - it makes package fetching significantly faster because it removes the need to run the compiler while fetching dependencies.
You can read more about the justification for it here: #14265 (comment)

@kuon
Copy link
Contributor

kuon commented Apr 19, 2023

Maybe the best way to evaluate this would be to create a real world example? We could create a script that generate a dependency graph and publish it on github in hundred repositories. This would provide a real world example taking into account all real world factors. Allowing for testing multiple approaches. I don't know how useful/realistic this would be, but many times real stress testing helped me evaluating real bottlenecks.

@sarvex
Copy link

sarvex commented Apr 19, 2023

If you have a suggestion for a safe, simple, fast way to do this with arbitrary code execution I'd love to hear it. I do not believe this is a case of premature optimization - it makes package fetching significantly faster because it removes the need to run the compiler while fetching dependencies. You can read more about the justification for it here: #14265 (comment)

@silversquirl why is it becoming very clear that you do not have any clue about Arbitrary Code Execution. Are you a developer or a security consultant? Because I am both, with three decades of experience. Why don't you do a PoC and tell the world how this imaginary ACE will work, because till now Google, Facebook, Microsoft have failed to see this flaw.

I am not here to educate you, still - File does not make code anything faster, it is the slowest mechanism. Code and Data makes fetching faster. And it is the Data which makes Arbitrary Code Execution possible, and thus a file filled with additional data will make ACE possible.

@andrewrk regarding your comment on #14265, if declarative programming makes so much sense why choose imperative programming at all in build.zig. And if you think one is better than another, stick to the better one i.e. migrate build.zig to declarative syntax and move on. Why do two files in the build system follow different paradigms?

@kuon the dependency graphs are already made by monorepp tools such as bazel, nx, turbobuild. But none of them required two different files in two different programming paradigms and worst containing duplicate information that is hard to maintain by hands of programmers.

@sin-ack
Copy link
Contributor

sin-ack commented Apr 19, 2023

  1. Your attitude is very dismissive and I'd like you to stop acting abrasively towards others.

  2. If the build.zig file contained the dependency graph, we would have multiple issues:

    • As was already discussed, extracting the dependency graph would require code execution. This would mean that for every dependency, Zig:

      • Downloads the dependency
      • Extracts it
      • Compiles the build.zig file (!)
      • Runs the build.zig file without any restrictions (!!!)

      Not only is the compilation of the build.zig file very slow (takes at least a few seconds, will improve with self-hosted backends), it means that now every dependency would have unrestricted access to the system. This might not be a problem now, but it will inevitably cause trouble as the package ecosystem grows.

    • Now you can't write a tool that adds dependencies to your project, because that would require editing the build.zig file by the tool itself which is extremely hairy once you consider the the fact that the imperative programming and comptime capabilities of Zig makes it very hard for any automated tool to insert statements. It's not a viable strategy.

    • Similarly, any third party tool would incur the cost of building build.zig and have the same security issues. It's very much not imaginary, because you would have to run build.zig without any restrictions for the time being.

    By separating the dependency metadata into build.zig.zon we put the part that makes the most sense to analyze by automated tools into an execution-free data file.

  3. Imperative programming makes a lot of sense in a build script, because declarative build systems have all inevitably required escape hatches. Rather than contorting to fit the Zig build system's model of the world, the build system instead lets you construct the build graph as you please. Also keep in mind that build.zig doesn't actually build the project; the build function only creates a build graph which the build runner then uses.

  4. All of your examples so far regarding other programming languages (package.json, Cargo.toml etc.) are inert data files without support for defining the build graph of a project imperatively. They have no bearing on how Zig has decided to do its dependency definitions, because they miss a crucial part: the build script. Also keep in mind that if you had custom steps in Rust, you would also have to define that separately from Cargo.toml, rendering your argument moot.

  5. You still have not supplied an alternative that would be better than using the .zon file that would handle the above issues. I don't know about others, but I don't particularly care for whining about parts of the project without producing a viable path forward.

@andrewrk

This comment was marked as off-topic.

@kuon
Copy link
Contributor

kuon commented Apr 19, 2023

I will try to stay technical, and hope for calm.

My suggestion of real world testing is to try to see if the .zon file is "a premature optimization" or just a requirement to have non exponential complexity. My feeling is that resolving a large package graph and run hundreds build.zig is very costly. But I've been wrong in the past, and intuition for this kind of things can often be wrong. Sometimes, companies deploy kubernetes and large cloud solutions, for something that could have been hosted on a raspberry pi with nginx. But the reverse can be true.

But regardless of the performance consideration, which is only important if the bottleneck is really important (we can allow a few seconds to resolve a graph, even maybe a minute, but not 15 minutes). The tooling is IMO the main argument. We want commands like pkg add <something> to be able to @import("something"), and editing build.zig automatically is hard, even impossible. (also, for this reason, I thing we should be able to do something like build_step.addAllPackagesFrom("build.zig.zon") or something similar).

@silversquirl
Copy link
Contributor

Why don't you do a PoC and tell the world how this imaginary ACE will work

Sorry for my unclear wording - by "arbitrary code execution" I mean "you have to run the build script". I'm not referring to any exploit, simply the fact that package-defined (ie. arbitrary) Zig code has to be run in order to build the dependency graph.

@kuon
Copy link
Contributor

kuon commented Apr 19, 2023

Why don't you do a PoC and tell the world how this imaginary ACE will work

Sorry for my unclear wording - by "arbitrary code execution" I mean "you have to run the build script". I'm not referring to any exploit, simply the fact that package-defined (ie. arbitrary) Zig code has to be run in order to build the dependency graph.

I looked a a few of my projects, and some have quite complex build.zig, one also does some network access to get some state, it is not open source, and open source libraries might never execute such code, but it still exist. And do adjust this code, build.zig would need to be split, but thinking about it, I would end in a separation a bit similar to .zon + .zig.

@kassane
Copy link
Contributor

kassane commented May 14, 2023

I don't know if that would be the objective, but zig.zon reminds me a lot of dub (dlang pkg).
https://github.com/dlang/dub/blob/master/ARCHITECTURE.md

e.g.:
dub.json

{
   "name": "header-lib-example",
    "description": "A simple D header library (C binding to libmylib.so)",
    "targetType": "sourceLibrary",
    "libs": ["mylib"]
}

@ERAGON007
Copy link

Can we use this to JUST download some non-Zig dependencies too? e.g Can i for example use it to just download the curl/curl repository? currently there is an error DOWNLOADED_PATH_OF_CURL/build.zig File not found

@andrewrk
Copy link
Member Author

andrewrk commented Aug 1, 2023

Yes, let me file the issue for that. One moment... done: #16643

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Implementing this issue could cause existing code to no longer compile or have different behavior. enhancement Solving this issue will likely involve adding new logic or components to the codebase. zig build system std.Build, the build runner, `zig build` subcommand, package management
Projects
None yet
Development

Successfully merging this pull request may close these issues.

proposal: instead of build.zig.ini, use build.zig.zon