Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Landlock sandboxing support #829

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

MDr164
Copy link

@MDr164 MDr164 commented Aug 7, 2022

Inspired by https://justine.lol/make I attempted to add basic Landlock support to Task. This will obviously only work on Linux and requires a decently recent kernel. This PR should serve as a ground for discussion if this features is something we might want and if my initial attempt is worth extending or if a different route of implementation should be chosen.

@ghostsquad
Copy link
Contributor

this is interesting 🤔

@andreynering
Copy link
Member

Hi @MDr164,

Sorry for not responding earlier.

I'm a bit out-of-the-loop here. Can you explain what is Landlock and why would be useful to have this integrated into Task?

Is this a tool that needs to be installed separately and it's builtin into the Linux kernel?

@andreynering andreynering added the type: proposal A ticket that proposes a new feat/enhancement. label Nov 12, 2022
@MDr164
Copy link
Author

MDr164 commented Nov 15, 2022

Landlock is a Linux kernel LSM where you can find the documentation for here: https://www.kernel.org/doc/html/latest/userspace-api/landlock.html

It allows a userspace process like Task to restrict itself, more details to that are outlined in the blog entry I linked in the PR text. It does not need an additional tool, it simply needs to be enable in the kernel which is the case on most distributions I tested. Though it is Linux specific and apparently needs CGO to compile against a kernel library. This PR would need refactoring for it to be able to go mainline as right now it would break on non-Linux systems.

@andreynering
Copy link
Member

Unfortunately requiring CGO is not ideal, indeed.

@MDr164
Copy link
Author

MDr164 commented Jan 6, 2023

Unfortunately requiring CGO is not ideal, indeed.

Looks like it can also be used in pure-go mode but is then not as powerful as it uses psx from libcap. To quote the official documentation:

For pure Go programs, psx does the same as the syscall.AllThreadsSyscall function in the Go runtime (and that case is straightforward to understand).
For programs linked with cgo, there can be more OS threads than just the ones that were started by the Go runtime.

@MDr164
Copy link
Author

MDr164 commented Jan 6, 2023

I rebased the PR on master and added a build constraint so that the sandboxing will throw an error when attempted to be used outside of Linux. I'm not sure what the best way would be to expose this capability to users yet so I'll keep this PR as a draft until we can decide on that. A good thing to note: Even if the sandbox is turned on but the OS doesn't support it, we will simply not enable the sandbox instead of throwing an error. This could potentially promote the sandbox to be enabled all the time but only be active when enabled in the Kernel.

This commits adds basic support for the Landlock
LSM. This sandboxing is available since Linux 5.13
and allows restricting access to files or paths.
For now only basic support is added and the sandbox
is turned off by default.

Signed-off-by: Marvin Drees <[email protected]>
This commit adds build constraints as Landlock is only supported
on Linux. It will simply return an error everywhere else.
Also use "BestEffort" in order to not throw errors on Linux
systems that don't have Landlock enabled in their Kernel.

Signed-off-by: Marvin Drees <[email protected]>
@MDr164
Copy link
Author

MDr164 commented Mar 24, 2023

Any chance this could get a review (be it positive or negative)? This change allows Task to obtain the same amount of sandboxing as build systems like Bazel while maintaining performance. The conditional compilation makes it possible to only include this code on Linux builds. We can also think about adding sandboxing to other platforms though I'm most familiar with Linux. I'm using these patches sind quite some time now and would love not having to always rebase them against master 😃

@pd93
Copy link
Member

pd93 commented Mar 24, 2023

@MDr164 I've tried to look into this a couple of times since you created the PR. Perhaps I can outline my current understanding and you can correct me if I'm wrong or missing something?

  • This functionality essentially limits task's ability to access the global filesystem. Rather, it only grants access to a specific set of files during execution. i.e. A sandbox.
  • You are setting this list of allowed files to a task's defined sources and generates lists (plus a couple of other standard directories like /tmp and .task.

My questions are:

  1. What is the aim of this functionality? Or rather, who/what is it intended to protect against? I can only assume that since the feature is behind a flag, that it is to protect the user from themselves rather than a full security feature since I could just add random files to my sources/generates or disable the flag and access any file. I think it would help me to understand this better if you could explain why you need this functionality in your project. What requirement are you fulfilling?
  2. We definitely can't enable cgo for functionality that effects such a small percentage of users. Task runs on a wide variety of systems and being able to run without system dependencies such as glibc or musl is important. I know that I've used Task inside an alpine image in CI before for example. I don't think we can change this. Is the lack of cgo a deal-breaker or would you be happy with this merged with cgo disabled?
  3. What happens when a task depends on or calls another task that accesses different files? At the moment you only appear to be passing the sources and generates of the called Task, but not any "child" tasks.
  4. If I check out your branch and try to run something like task --sandbox mod on this repo, I get a permission denied. I assume this is because I'm accessing /usr/lib/go/bin/go. What is your proposal for allowing the user to call commands like this? The go binary is neither a source or a generated file, so adding it to one of those lists feels wrong.

@MDr164
Copy link
Author

MDr164 commented Mar 24, 2023

Thanks for giving it a try!

@MDr164 I've tried to look into this a couple of times since you created the PR. Perhaps I can outline my current understanding and you can correct me if I'm wrong or missing something?

  • This functionality essentially limits task's ability to access the global filesystem. Rather, it only grants access to a specific set of files during execution. i.e. A sandbox.

Correct.

  • You are setting this list of allowed files to a task's defined sources and generates lists (plus a couple of other standard directories like /tmp and .task.

Correct, we could also add a different global directive like allow-list to set the scope of the sandbox.

My questions are:

  1. What is the aim of this functionality? Or rather, who/what is it intended to protect against? I can only assume that since the feature is behind a flag, that it is to protect the user from themselves rather than a full security feature since I could just add random files to my sources/generates or disable the flag and access any file. I think it would help me to understand this better if you could explain why you need this functionality in your project. What requirement are you fulfilling?

I set it behind a flag for now as I didn't want users to wonder why they're suddenly getting access denied. The scope is fairly easy: If you wanna run an unknown task or have a rouge unit test that gets executed you're protecting your rootfs from changes. It's a lot easier to quickly check the scope of the sandbox by looking into the sources/generates instead of trying to understand giant Taskfiles with shell voodoo or look into every test/programm that gets executed. So yes, it kinda protects the user from the user running unknown, likely 3rdparty scripts/code.

  1. We definitely can't enable cgo for functionality that effects such a small percentage of users. Task runs on a wide variety of systems and being able to run without system dependencies such as glibc or musl is important. I know that I've used Task inside an alpine image in CI before for example. I don't think we can change this. Is the lack of cgo a deal-breaker or would you be happy with this merged with cgo disabled?

No problem at all. When CGO is disabled it falls back to the (albeit slower) pure Go implementation of the kernel interfaces. So that got resolved already :)

  1. What happens when a task depends on or calls another task that accesses different files? At the moment you only appear to be passing the sources and generates of the called Task, but not any "child" tasks.

Those "child" tasks would inherit the scope of the sandbox from the parent. But you're right, I think we would need more robust parsing of the allowed paths from included tasks, I'll tackle that next!

  1. If I check out your branch and try to run something like task --sandbox mod on this repo, I get a permission denied. I assume this is because I'm accessing /usr/lib/go/bin/go. What is your proposal for allowing the user to call commands like this? The go binary is neither a source or a generated file, so adding it to one of those lists feels wrong.

Yes I was also planning on finding a better way to declare the scope, any preference? I don't want the user to invest too much effort when writing the sandbox rules but I also don't want to overblock paths of course.

@pd93
Copy link
Member

pd93 commented Mar 24, 2023

Thanks for giving it a try!

No problem, I think I'm on the same page now. I'm generally pretty positive towards this and don't have any issues with the code fundamentals. I think it just needs a bit of polish now.

We could also add a different global directive like allow-list to set the scope of the sandbox

Since the standard terminology for this is "scoped access-control", I would suggest that something like scopes would make sense as a keyword to define which files Task should have access to. By no means do I have final say on naming, but that's just my suggestion 😛 Short, concise and doesn't conflict with anything else.

I set it behind a flag for now as I didn't want users to wonder why they're suddenly getting access denied

Yes, this makes a lot of sense, we definitely can't enable this by default as it would break many (probably most?) Taskfiles. This is something we could perhaps revisit on a major version change, but I think this would be much more compelling if we had x-platform support. We'd also have to consider if this makes writing basic/personal configurations annoying.

Maybe another approach would be creating a TASK_SANDBOX environment variable (or similar) that users can set in their shell environment. This would mean they don't have to add --sandbox every time. We could even go further and add something like --no-sandbox or --unsafe which disables landlock if they've verified/trust the task.

It's a lot easier to quickly check the scope of the sandbox by looking into the sources/generates instead of trying to understand giant Taskfiles with shell voodoo or look into every test/programm that gets executed

This is the selling point for me. It lowers the bar to understand the implications of running a complex task.

When CGO is disabled it falls back to the (albeit slower) pure Go implementation

👍

We would need more robust parsing of the allowed paths from included tasks, I'll tackle that next!

This is probably the biggest blocker. This feature has no value if someone can hide their malicious/dangerous code in a dependant task. Feel free to ping me if/when you get around to fixing this or if you need any help.

Yes I was also planning on finding a better way to declare the scope, any preference?

I mentioned scopes earlier. I think something like this would be nice :)

tasks:
  default:
    deps:
      - mydeptask
    sources:
      - ./my/sources/**/*
    generates:
      - ./my/generated/files/**/*
    scopes:
      - /path/to/binfoo
      - path: binbar
    cmds:
      - task: mychildtask
      - /path/to/binfoo
      - binbar

It also occurred to me after my last comment that user A might have Go installed in /usr/lib/go/bin/go and user B might have it installed in /usr/local/go/bin/go depending on distro and/or installation method. This is why I added the subkey path under scopes. This would do the equivalent of a which binbar. i.e. It does a lookup for the binary in your $PATH and adds the first result to the scopes.

Also, note the subtask and dependant task. The scopes of which would need to be merged into the parent.

@andreynering It would be good to get your very quick thoughts on all of this too. It would be a shame to spend any more time on the PR if you're unsure about adding this functionality.

@andreynering
Copy link
Member

My two cents on this...

This PR has only two up votes (👍) today, while many other issues have a lot more than this.

This makes me think that few users are interested in this at the moment, compared to other priorities.

Because of this, I'm inclined to categorize this feature request as a "maybe in the future".

If in the future we see more upvotes here, and hopefully also comments explaining why they think this is useful, we can reconsider.

Also, this would give us time to think better about the open questions: "what to do for OSes that don't support this", etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: proposal A ticket that proposes a new feat/enhancement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants