Enable SECCOMP_FILTER_FLAG_SPEC_ALLOW per default #938

Closed
3 changes: 3 additions & 0 deletions pkg/seccomp/default_linux.go
@@ -47,6 +47,8 @@ func DefaultProfile() *Seccomp {
	enosys := uint(unix.ENOSYS)
	eperm := uint(unix.EPERM)

	flags := []string{SeccompFilterFlagSpecALlow}

	syscalls := []*Syscall{
		{
			Names: []string{
@@ -882,5 +884,6 @@ func DefaultProfile() *Seccomp {
		DefaultErrnoRet: &enosys,
		ArchMap:         arches(),
		Syscalls:        syscalls,
		Flags:           flags,
	}
}
3 changes: 3 additions & 0 deletions pkg/seccomp/seccomp.json
@@ -1037,5 +1037,8 @@
			},
			"excludes": {}
		}
	],
	"flags": [
		"SECCOMP_FILTER_FLAG_SPEC_ALLOW"
	]
}
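
As a rough illustration of how the new "flags" entry relates to the `json:"flags,omitempty"` tag on the Seccomp struct, the sketch below marshals a profile with and without the flag. The `profile` type and the `encode` helper are hypothetical, trimmed-down stand-ins written for this example; they are not the real types from pkg/seccomp.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// profile is a hypothetical stand-in for the Seccomp struct. Only the Flags
// field and its JSON tag are taken from the diff; everything else is omitted.
type profile struct {
	Flags []string `json:"flags,omitempty"`
}

// encode marshals a profile into its JSON string form.
func encode(p profile) (string, error) {
	b, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// With the flag set, the "flags" array is emitted.
	withFlag, err := encode(profile{Flags: []string{"SECCOMP_FILTER_FLAG_SPEC_ALLOW"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(withFlag)

	// With no flags, omitempty drops the key entirely.
	withoutFlag, err := encode(profile{})
	if err != nil {
		panic(err)
	}
	fmt.Println(withoutFlag)
}
```

Because of `omitempty`, a profile without flags serializes without a "flags" key at all, so the key only shows up in seccomp.json once a flag is actually set.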
12 changes: 12 additions & 0 deletions pkg/seccomp/types.go
@@ -20,6 +20,18 @@ type Seccomp struct {
	Flags []string `json:"flags,omitempty"`
}

const (
	// SeccompFilterFlagLog is the filter flag that causes all filter return
	// actions except SECCOMP_RET_ALLOW to be logged. An administrator may
	// override this filter flag by preventing specific actions from being
	// logged via the /proc/sys/kernel/seccomp/actions_logged file. (since
	// Linux 4.14)
	SeccompFilterFlagLog = "SECCOMP_FILTER_FLAG_LOG"
Member

Is there a risk that a malicious container could fill the log if this is enabled by default?

Member Author

The log message always has a limited line length, according to: https://github.com/torvalds/linux/blob/23d04328444a8fa0ca060c5e532220dac8e8bc26/kernel/auditsc.c#L2946-L2970

The kernel has an audit rate limit as well as backlog limit. Auditd has a file rotation in place as well.

In theory users can still specify SCMP_ACT_LOG as default action, which would log all syscalls and not exclude the allowed ones.

Member Author

The log filter level may have a negative performance impact. I'll do some testing around this.

Member Author

Benchmark results

Environment

  • Linux 5.16.10
  • cri-o 1.23.1
  • Kubernetes 1.23.4

Test pod

apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: test-container
    securityContext:
      seccompProfile:
        type: RuntimeDefault
    image: nginx:1.21.6

Test

ab -n 100000 -c 100 http://$HOST/index.html

Results

| Seccomp Profile | Test time (in sec) | Requests per second |
| --- | --- | --- |
| RuntimeDefault | 5.0 | 19913.1 |
| RuntimeDefault (modified to log sendfile 1x per request) | 46.3 | 2156.4 |
| RuntimeDefault (modified to log close 2x per request) | 93.6 | 1068.7 |
| { "defaultAction": "SCMP_ACT_LOG" } | 521.9 | 191.5 |
| Unconfined | 5.1 | 19323.4 |

It's interesting to see the impact of logging, which is executed in the kernel here:
https://github.com/torvalds/linux/blob/7e57714cd0ad2d5bb90e50b5096a0e671dec1ef3/kernel/auditsc.c#L2946-L2970

On my local machine, audit CPU usage spikes during the test, for example with SCMP_ACT_LOG:
(screenshot)

Audit settings:

> sudo auditctl -s
enabled 1
failure 1
pid 803
rate_limit 0
backlog_limit 64
lost 163
backlog 65
backlog_wait_time 60000
loginuid_immutable 0 unlocked


	// SeccompFilterFlagSpecALlow can be used to disable Speculative Store
	// Bypass mitigation. (since Linux 4.17)
	SeccompFilterFlagSpecALlow = "SECCOMP_FILTER_FLAG_SPEC_ALLOW"
)

// Architecture is used to represent a specific architecture
// and its sub-architectures
type Architecture struct {