From efd45cca5019e57cab878d14378905ac12f0b071 Mon Sep 17 00:00:00 2001 From: Bjorn Neergaard Date: Wed, 2 Nov 2022 09:06:23 -0600 Subject: [PATCH] config: base GID must be present in the supplementary GIDs array Currently, the spec is unclear whether the list of supplementary GIDs[1] used to create a container process should include the 'base' GID implicitly, or whether the config needs to specify this explicitly if desired. While per POSIX[2] it is permissible for a system to include or exclude the base GID from the list of supplementary GIDs, in all Runtime Spec platforms the base GID is always added, and omitting it can have real security consequences[3] as fully dropping a group is not typically allowed in Unix. This recently led to a number of CVEs in OCI Runtime Spec implementations, as it was concluded that it is necessary for a Unix container to always include the base GID in the list of supplementary GIDs, as originally established by 4.4BSD. Some of the CVEs include: * Podman (CVE-2022-2989) * Moby (CVE-2022-36109) * Buildah (CVE-2022-2990) * CRI-O (CVE-2022-2995) Some examples of how existing implementations handle this: * util-linux calls initgroups(3) with the user's primary GID. [4,5] * shadowutils (Linux) calls initgroups(3) with the user's primary GID. [5,6] * FreeBSD calls initgroups(3) with the user's GID from the password file (aka the primary GID). [7,8] * Solaris calls initgroups(3) with the user's primary GID. [9,10] * Z/OS's session creation code is not available; however initgroups(3) specifies a convention of including the real group ID from the user database (aka the primary GID). [11] * OpenSSH[12] calls initgroups(3) with the user's primary GID; on all of the above platforms this will have the same result as a login(1), including the primary GID in the list of supplementary GIDs. While login(1) has generally been used as the example above, the same holds true for su(1) and other methods of starting a new session (including OpenSSH, as explained above). Given this seems clearly desirable and the OCI runtime is effectively the equivalent of login(1)/su(1)/any other program that sets up a new session, the OCI runtime is the best place to ensure that the list of supplementary group IDs contains the base GID. [1]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_378 [2]: https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap03.html#tag_21_03_00_73 [3]: https://www.benthamsgaze.org/2022/08/22/vulnerability-in-linux-containers-investigation-and-mitigation/ [4]: https://github.com/util-linux/util-linux/blob/96ccdc00e1fcf1684f9734a189baf90e00ff0c9a/login-utils/login.c#L1443 [5]: https://man7.org/linux/man-pages/man3/initgroups.3.html [6]: https://github.com/shadow-maint/shadow/blob/eaebea55a495a56317ed85e959b3599f73c6bdf2/libmisc/setugid.c#L55 [7]: https://github.com/freebsd/freebsd-src/blob/eeaf9d562fe137e0c52b8c346742dccfc8bde015/lib/libutil/login_class.c#L486 [8]: https://www.freebsd.org/cgi/man.cgi?initgroups(3) [9]: https://github.com/illumos/illumos-gate/blob/d9c3e05c2d8261e3f133b5e96a300b4fa6c0f1b7/usr/src/cmd/login/login.c#L1926 [10]: https://illumos.org/man/3C/initgroups [11]: https://www.ibm.com/docs/en/zos/2.2.0?topic=functions-initgroups-initialize-supplementary-group-id-list-process [12]: https://github.com/openssh/openssh-portable/blob/25bd659cc72268f2858c5415740c442ee950049f/session.c#L1379 [CVE-2022-2989]: https://access.redhat.com/security/cve/cve-2022-2989 [CVE-2022-36109]: https://github.com/moby/moby/security/advisories/GHSA-rc4r-wh2q-q6c4 [CVE-2022-2990]: https://access.redhat.com/security/cve/cve-2022-2990 [CVE-2022-2995]: https://access.redhat.com/security/cve/cve-2022-2995 Signed-off-by: Bjorn Neergaard Signed-off-by: Cory Snider --- config.md | 48 ++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 38 insertions(+), 10 deletions(-) diff --git a/config.md b/config.md index 6d278d180..3c0c3d3cc 100644 --- a/config.md +++ b/config.md @@ -226,16 +226,40 @@ The user for the process is a platform-specific structure that allows specific c For POSIX platforms the `user` structure has the following fields: -* **`uid`** (int, REQUIRED) specifies the user ID (UID) in the [container namespace](glossary.md#container-namespace). -* **`gid`** (int, REQUIRED) specifies the group ID (GID) in the [container namespace](glossary.md#container-namespace). -* **`umask`** (int, OPTIONAL) specifies the [umask][umask_2] of the user. If unspecified, the umask should not be changed from the calling process' umask. -* **`additionalGids`** (array of ints, OPTIONAL) specifies additional group IDs in the [container namespace](glossary.md#container-namespace) to be added to the list of supplementary group IDs. - -On a POSIX platform, processes have both a 'base' GID (as specified in the `gid` field), and an array of supplementary group IDs as described in [IEEE Std 1003.1-2008][ieee-1003.1.2008-xbd-c3.378]. -Runtimes MUST ensure that all group IDs specified by `gid` and `additionalGids` are present in the array of supplementary group IDs. -Runtimes SHOULD preserve the order of `additionalGids`; if the base GID (as specified in the `gid` field) is absent from `additionalGids`, it SHOULD be positioned at the start of the supplementary group ID array. - -Entities which create a container using a runtime on a POSIX platform SHOULD duplicate the base GID (as specified in the `gid` field) as `additionalGids[0]`; this maximizes compatibility and consistency when using runtimes that target a previous version of this specification. +* **`uid`** (int, REQUIRED) specifies a user ID (UID) in the [container namespace](glossary.md#container-namespace). + The container process MUST be started with the [real user ID, effective user ID and saved set-user-ID][ieee-1003.1-2008-xbd-c3.436] set to the value of `uid`. +* **`gid`** (int, REQUIRED) specifies a group ID (GID) in the [container namespace](glossary.md#container-namespace). + The conatiner process MUST be started with the [real group ID, effective group ID, and saved set-group-ID][ieee-1003.1-2008-xbd-c3.189] set to the value of `gid`. +* **`umask`** (int, OPTIONAL) specifies the [umask][umask.2] of the user. + If unspecified, the umask should not be changed from the calling process' umask. +* **`additionalGids`** (array of ints, OPTIONAL) specifies a list of group IDs in the [container namespace](glossary.md#container-namespace) + to be added to the [supplementary group IDs][ieee-1003.1-2008-xbd-c3.378] of the container process. +* **`sgids`** (array of ints, OPTIONAL) specifies a list of group IDs in the [container namespace](glossary.md#container-namespace). + This field takes precedence over `additionalGids`: + if `sgids` is specified, including if set to the empty array, + the container process MUST be started with its [supplementary group IDs][ieee-1003.1-2008-xbd-c3.378] set + such that a call to [getgroups][getgroups.2] from the container process + would return a _grouplist_ which contains all distinct group IDs in `sgids` and no group IDs not in `sgids`. + The group IDs in _grouplist_ SHOULD be in the same order as `sgids`. + +When the configuration does not define `sgids`, +the container process MUST be started with its [supplementary group IDs][ieee-1003.1-2008-xbd-c3.378] set +such that a call to [getgroups][getgroups.2] from the container process +would return a _grouplist_ which contains all distinct group IDs specified by `additionalGids`. +The order of group IDs in `additionalGids` SHOULD be preserved in _grouplist_. +If the group ID specified by `gid` is not present in `additionalGids`, +the container process _grouplist_ MUST additionally have `gid` as index 0. + +_Note: producers of configuration files which wish to be backwards-compatible +with runtimes that are only compliant with earlier revisions of the specification +should always include the `gid` group ID as the first item in the `additionalGids` array +to ensure that `gid` is a supplementary group ID of the container process. +Otherwise, processes in the container may be able to [bypass certain filesystem access controls.][negative-group-perms]_ + +_Note: producers of configuration files which require full control over the supplementary group IDs of the container process +should specify `sgids` and omit `additionalGids`, +and specify a revision of the specification which defines `sgids` as the configuration `ociVersion`. +As the security implications are subtle, use of the `sgids` field is discouraged._ _Note: symbolic name for uid and gid, such as uname and gname respectively, are left to upper levels to derive (i.e. `/etc/passwd` parsing, NSS, etc)_ @@ -987,12 +1011,16 @@ Here is a full example `config.json` for reference. [selinux]:http://selinuxproject.org/page/Main_Page [no-new-privs]: https://www.kernel.org/doc/Documentation/prctl/no_new_privs.txt [proc_2]: https://www.kernel.org/doc/Documentation/filesystems/proc.txt +[getgroups.2]: https://pubs.opengroup.org/onlinepubs/009695399/functions/getgroups.html [umask.2]: http://pubs.opengroup.org/onlinepubs/009695399/functions/umask.html [semver-v2.0.0]: http://semver.org/spec/v2.0.0.html [ieee-1003.1-2008-xbd-c8.1]: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_01 [ieee-1003.1-2008-functions-exec]: http://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html [naming-a-volume]: https://aka.ms/nb3hqb +[ieee-1003.1-2008-xbd-c3.189]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_189 [ieee-1003.1-2008-xbd-c3.378]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_378 +[ieee-1003.1-2008-xbd-c3.436]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_436 +[negative-group-perms]: https://www.benthamsgaze.org/2022/08/22/vulnerability-in-linux-containers-investigation-and-mitigation/ [capabilities.7]: http://man7.org/linux/man-pages/man7/capabilities.7.html [mount.2]: http://man7.org/linux/man-pages/man2/mount.2.html