Proposal: expose docker cache-from capability #940

Open
1 of 2 tasks
LucasRoesler opened this issue Oct 27, 2022 · 4 comments

@LucasRoesler
Member

Expected Behaviour

When building a function using faas-cli build, it should be possible to reference an external source for the docker build/layer cache. The most common case would be referencing a previous build. When enabled, this would allow infrequently changed steps, for example pip install, to be cached between builds and reduce the total build time.

Current Behaviour

Only the local build cache can be used by faas-cli build. This is most noticeable in CI/CD workflows, where the docker builder is often isolated and new for each build. This seems to be the case in GitHub Actions, for example. Because the build cache starts empty, every layer of the function build must be rebuilt, even if only a small change was made at the end of the Dockerfile. This is very noticeable in Python and NodeJS projects, where the final step is often just copying a small amount of function code but an earlier step is a slow pip install or npm install.

Why do you need this?

This will improve build times in CI/CD environments.

Who is this for?

I work at Contiamo but the feature could benefit any function build

Are you a GitHub Sponsor (Yes/No?)

Check at: https://github.com/sponsors/openfaas

  • Yes
  • No

List All Possible Solutions and Workarounds

  1. One possible solution is to add docker pull as a build step prior to running faas-cli build. This would seed the docker cache with the relevant images. The disadvantage is that it always pulls those images, even when they cannot be used by the cache. This costs time and network bandwidth that was not needed and can make the build longer than if the step had been skipped. Some of the other options below can be more efficient.

  2. Use the --shrinkwrap flag to prepare the build context and then use docker build --cache-from to pass references to the candidate images for the build cache.

  3. Allow passing a flag or set a yaml config for faas-cli build so that it can set the --cache-from flag. This would allow the inline cache mode when the user also passes the BUILDKIT_INLINE_CACHE=1 build arg.

  4. Allow passing arbitrary flag=value pairs to faas-cli build so that the developer can set the appropriate flag: --cache-from, if using docker build, or --cache-from / --cache-to, when using docker buildx.

Which Solution Do You Recommend?

I think options 3 and 4 are the best alternatives. Option 3 is the most focused on just this problem, but option 4 would be the most flexible.

For any solution, I think the feature should be opt-in: additional caching is not enabled by default, and the developer must explicitly enable it.

Option 3 experience / implementation

For option 3 I think we could implement this as just providing the --cache-from flag in faas-cli and then adding a new section to the function spec.

The DX would look like this

faas-cli build -f stack.yaml --cache-from=ghcr.io/lucasroesler/my-function:latest,ghcr.io/lucasroesler/my-other-function:latest --build-arg BUILDKIT_INLINE_CACHE=1

Alternatively, in the YAML it could look like this

version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080
functions:
  telephone:
    lang: python3-flask-debian
    handler: ./telephone
    image: ghcr.io/lucasroesler/my-function:latest
    build_cache:
      from:
        - ghcr.io/lucasroesler/my-function:latest
        - ghcr.io/lucasroesler/other-cache-candidate:latest
    build_args:
      BUILDKIT_INLINE_CACHE: 1

In both configurations, the values should simply be passed to the --cache-from flag as is, without modification. This then allows the advanced options to be used when buildx is explicitly enabled in the environment, for example "type=local,src=path/to/dir".

--cache-from stringArray        External cache sources (e.g., "user/app:cache","type=local,src=path/to/dir")
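
As a rough illustration of the pass-through behavior described above, here is a minimal Python sketch. faas-cli itself is written in Go, and the function name cache_from_flags is hypothetical. It assumes that each configured source becomes its own --cache-from flag and that values are never rewritten.

```python
from typing import List


def cache_from_flags(sources: List[str]) -> List[str]:
    """Turn each configured cache source into its own --cache-from flag.

    Values are passed through untouched, so both plain image references
    and buildx-style "type=...,key=value" strings are accepted.
    """
    flags: List[str] = []
    for source in sources:
        flags.extend(["--cache-from", source])
    return flags


if __name__ == "__main__":
    print(cache_from_flags([
        "ghcr.io/lucasroesler/my-function:latest",
        "type=local,src=path/to/dir",
    ]))
```

Emitting one flag per source avoids having to split on commas, which would otherwise clash with the commas inside values such as "type=local,src=path/to/dir".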

It would be nice to also support the cache-to flag from buildx, but this flag is not supported by the default docker build and would cause an error. However, it allows for much more advanced caching options, such as storing the cache locally in a folder, in blob storage, or in a registry. It also allows "max" mode, which caches all of the build layers, including intermediate layers from multi-stage builds. This provides significantly more cache hit opportunities. If we want to allow this and add the required documentation, I think it could look like this

version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080
functions:
  telephone:
    lang: python3-flask-debian
    handler: ./telephone
    image: ghcr.io/lucasroesler/my-function:latest
    build_cache:
      from:
        - ghcr.io/lucasroesler/my-function:latest
        - ghcr.io/lucasroesler/other-cache-candidate:latest
      to:
        - ghcr.io/lucasroesler/my-function:latest
    build_args:
      BUILDKIT_INLINE_CACHE: 1

or

version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080
functions:
  telephone:
    lang: python3-flask-debian
    handler: ./telephone
    image: ghcr.io/lucasroesler/my-function:latest
    build_cache:
      from:
        - ghcr.io/lucasroesler/my-function:cache
        - ghcr.io/lucasroesler/other-cache-candidate:cache
      to:
        - type=registry,ref=ghcr.io/lucasroesler/my-function:cache,mode=max

Option 4 experience / implementation

Option 4 enables the same experience, but would look like this

faas-cli build -f stack.yaml  --builder-flag "--cache-from=ghcr.io/lucasroesler/my-function:latest,ghcr.io/lucasroesler/my-other-function:latest" --build-arg BUILDKIT_INLINE_CACHE=1

and

version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080
functions:
  telephone:
    lang: python3-flask-debian
    handler: ./telephone
    image: ghcr.io/lucasroesler/my-function:latest
    builder_flags:
      - --cache-from=ghcr.io/lucasroesler/my-function:latest,ghcr.io/lucasroesler/other-cache-candidate:latest
    build_args:
      BUILDKIT_INLINE_CACHE: 1

Caching impact

inline caching

There are several styles and options of docker layer caching enabled by docker and buildkit. Note that buildkit is required for this feature.

The first and simplest is called inline caching. This adds some additional metadata to the image config to indicate that the layers can be reused in build caches. It is only some additional metadata in the docker manifest config and it requires that the image is built with this inline cache enabled. The result has no impact on the actual image or layer sizes because it is only additional metadata that is pushed to the remote registry.
DOCKER_BUILDKIT=1 docker build --build-arg BUILDKIT_INLINE_CACHE=1 -t caching-test:with-cache .

I tested this with an image and the build size was the same with and without the inline cache. This can be tested with any docker image

DOCKER_BUILDKIT=1 docker build -t caching-test:without-cache .
DOCKER_BUILDKIT=1 docker build --build-arg BUILDKIT_INLINE_CACHE=1 -t caching-test:with-cache .
docker images | grep "caching-test"

During subsequent builds, the builder will download just this metadata to determine if a cache hit is possible and then, only when it is useful, download the actual layer data.

other caching modes

Note that there are two caching modes, min and max. The inline caching will use min mode, which is why it has no impact on the final size: it is just a tiny amount of metadata.

With max mode, all build layers, including ephemeral multi-stage build layers, are saved. This clearly results in more data, but it is not supported by the inline cache type. Instead, these layers can be exported to a local folder, blob storage, or a docker registry.

To use these other destinations or the max mode, we would need to enable support for the --cache-to flag.

Additional caching background

This feature is also implemented in the docker build-push-action GitHub Action, see here https://github.com/docker/build-push-action/blob/master/docs/advanced/cache.md. This could provide a good example for how to document the feature.

Relevant docs about docker/buildkit caching:

Context

I have a python function that we build frequently because it bundles a machine learning model in the image. As a result, the last layer is just copying the machine learning model but all of the other layers (the dependencies and the function code) are not frequently changing.

In our CI/CD system (GitHub Actions) the build cache is always empty, which means our builds spend a lot of time on the apt-get and pip install stages even though these are not actually changing and would normally be skipped when built on my local laptop, where the build cache contains previous versions of the function.

Your Environment

  • FaaS-CLI version ( Full output from: faas-cli version ): 0.14.11

  • Docker version ( Full output from: docker version ):

    Client: Docker Engine - Community
    Version:           20.10.14
    API version:       1.41
    Go version:        go1.16.15
    Git commit:        a224086
    Built:             Thu Mar 24 01:47:58 2022
    OS/Arch:           linux/amd64
    Context:           default
    Experimental:      true
    
    Server: Docker Engine - Community
    Engine:
    Version:          20.10.14
    API version:      1.41 (minimum version 1.12)
    Go version:       go1.16.15
    Git commit:       87a90dc
    Built:            Thu Mar 24 01:45:50 2022
    OS/Arch:          linux/amd64
    Experimental:     false
    containerd:
    Version:          1.5.11
    GitCommit:        3df54a852345ae127d1fa3092b95168e4a88e2f8
    runc:
    Version:          1.0.3
    GitCommit:        v1.0.3-0-gf46b6ba
    docker-init:
    Version:          0.19.0
    GitCommit:        de40ad0
    
@alexellis
Member

Thanks for suggesting this feature and for providing the examples too.

It's on our list of things to review / prioritise.

@LucasRoesler
Member Author

@alexellis, this is something I would be willing to work on and support, so if there is a preferred design or an alternative, just let me know.

@kevin-lindsay-1
Sponsor

For the sake of a specific implementation, we use a command that looks like this:

docker build
  ...
  --cache-from=type=registry,ref=$CI_REGISTRY_IMAGE:cache-$TARGET_PLATFORM
  --cache-to=type=registry,ref=$CI_REGISTRY_IMAGE:cache-$TARGET_PLATFORM,mode=max
  --output=type=image,push=true

@LucasRoesler
Member Author

@alexellis After discussing the proposal more in the community calls, we decided it would be good to try and simplify the yaml/flags a little bit so that the most common use cases do not require extensive configuration.

I present two options below.

Initial simplification

First, to simplify the configuration, if build_cache is configured or if any cache flags are sent to the CLI, then we will

  • set the DOCKER_BUILDKIT=1 env variable, and
  • automatically add BUILDKIT_INLINE_CACHE=1 as a build arg.
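
These two defaults could be applied along the following lines. This is only a Python sketch of the idea (faas-cli is written in Go); the function name apply_cache_defaults and the use of plain dictionaries for the environment and build args are assumptions made for illustration.

```python
from typing import Dict, Tuple


def apply_cache_defaults(
    env: Dict[str, str],
    build_args: Dict[str, str],
    cache_enabled: bool,
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return copies of env and build_args with the cache defaults applied.

    When caching is not enabled, both are returned unchanged, which keeps
    the feature opt-in.
    """
    env = dict(env)
    build_args = dict(build_args)
    if cache_enabled:
        env["DOCKER_BUILDKIT"] = "1"               # enable buildkit
        build_args["BUILDKIT_INLINE_CACHE"] = "1"  # embed inline cache metadata
    return env, build_args
```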

Second, the yaml and cli will support two options to control the cache from and to.

In general, if the value is a string with no , or =, then we assume it is the cache image reference and construct the correct parameterized argument for the docker cache flags.

Also, to simplify the configuration, we assume that users of cache-to will want to use max mode to cache the multi-stage steps.

This presents several configuration combinations. Below I outline each configuration, what it means, and the equivalent docker build command for comparison. The main simplification in this proposal is removing the specification of build args and some of the cache parameters for the simple cases.

  1. A simple inline build cache
version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080
functions:
  my-function:
    lang: python3-flask-debian
    handler: ./my-function
    image: ghcr.io/lucasroesler/my-function:latest
    build_cache:
      from:
        - ghcr.io/lucasroesler/my-function:latest

This is the simplest cache option: the cache metadata is stored with the output image. This is equivalent to

DOCKER_BUILDKIT=1 docker build \
  --cache-from ghcr.io/lucasroesler/my-function:latest \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  -t ghcr.io/lucasroesler/my-function:latest \
  .
  2. External cache destination
version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080
functions:
  my-function:
    lang: python3-flask-debian
    handler: ./my-function
    image: ghcr.io/lucasroesler/my-function:latest
    build_cache:
      from:
        - ghcr.io/lucasroesler/my-function:buildcache
      to:
        - ghcr.io/lucasroesler/my-function:buildcache

In this configuration we assume that the registry cache is desired and this becomes the equivalent of

DOCKER_BUILDKIT=1 docker build \
  --cache-from type=registry,ref=ghcr.io/lucasroesler/my-function:buildcache \
  --cache-to type=registry,ref=ghcr.io/lucasroesler/my-function:buildcache,mode=max \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  -t ghcr.io/lucasroesler/my-function:latest \
  .
  3. External cache with parameters: we will always try to parse the parameters if they exist, so this is also valid
version: 1.0
provider:
 name: openfaas
 gateway: http://127.0.0.1:8080
functions:
 my-function:
   lang: python3-flask-debian
   handler: ./my-function
   image: ghcr.io/lucasroesler/my-function:latest
   build_cache:
     from:
       - ghcr.io/lucasroesler/my-function:buildcache
     to:
       - ref=ghcr.io/lucasroesler/my-function:buildcache,mode=max

In this configuration we assume that the registry cache is desired and this becomes the equivalent of

DOCKER_BUILDKIT=1 docker build \
  --cache-from type=registry,ref=ghcr.io/lucasroesler/my-function:buildcache \
  --cache-to type=registry,ref=ghcr.io/lucasroesler/my-function:buildcache,mode=max \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  -t ghcr.io/lucasroesler/my-function:latest \
  .
  4. Using an alternative cache destination: if the user specifies a type parameter, then we use those parameters as is, without any modification. This is the most advanced usage and allows the user to simply follow the documentation from buildkit (if they wish): https://github.com/moby/buildkit#export-cache
version: 1.0
provider:
 name: openfaas
 gateway: http://127.0.0.1:8080
functions:
 my-function:
   lang: python3-flask-debian
   handler: ./my-function
   image: ghcr.io/lucasroesler/my-function:latest
   build_cache:
     from:
       - type=gha
     to:
       - type=gha,mode=max

In this configuration we simply pass the parameters as is

DOCKER_BUILDKIT=1 docker build \
  --cache-from type=gha \
  --cache-to type=gha,mode=max \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  -t ghcr.io/lucasroesler/my-function:latest \
  .
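
The four configurations above can be summarized in a small Python sketch. This is not the faas-cli implementation (which is written in Go); the function names are hypothetical, and the behavior is inferred from the equivalent docker build commands shown above: a bare reference is passed through in the inline case, expanded to a registry cache when a cache destination is set, and anything with an explicit type is left untouched.

```python
from typing import Dict, List, Set


def _keys(value: str) -> Set[str]:
    """Return the parameter names found in a "k=v,k=v" string."""
    return {field.split("=", 1)[0] for field in value.split(",") if "=" in field}


def is_bare_reference(value: str) -> bool:
    """A value with no "," or "=" is treated as a plain image reference."""
    return "," not in value and "=" not in value


def normalize(value: str, export: bool) -> str:
    """Expand one build_cache entry into a full cache flag value."""
    if "type" in _keys(value):
        return value  # configuration 4: explicit type, passed as is
    if is_bare_reference(value):
        value = "ref=" + value  # configuration 2: bare image reference
    expanded = "type=registry," + value  # configurations 2 and 3
    if export and "mode" not in _keys(value):
        expanded += ",mode=max"  # cache-to defaults to max mode
    return expanded


def cache_flags(build_cache: Dict[str, List[str]]) -> List[str]:
    """Build the cache-related docker build flags for one function."""
    sources = build_cache.get("from", [])
    targets = build_cache.get("to", [])
    flags: List[str] = []
    for source in sources:
        # Configuration 1 (inline, no "to"): bare references pass through.
        inline = not targets and is_bare_reference(source)
        flags += ["--cache-from", source if inline else normalize(source, export=False)]
    for target in targets:
        flags += ["--cache-to", normalize(target, export=True)]
    flags += ["--build-arg", "BUILDKIT_INLINE_CACHE=1"]
    return flags


if __name__ == "__main__":
    print(cache_flags({"from": ["type=gha"], "to": ["type=gha,mode=max"]}))
```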

Further simplification

I think that examples (1) and (2) could potentially be further simplified. I believe the two most common configurations will be (a) inline cache to the current image or (b) external max cache to a separate tag.

To simplify these two cases we can accept a single string value for the build_cache or an object for advanced configuration. This creates the three configuration examples:

  1. For inline mode,
version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080
functions:
  my-function:
    lang: python3-flask-debian
    handler: ./my-function
    image: ghcr.io/lucasroesler/my-function:latest
    build_cache: inline

This is equivalent to

DOCKER_BUILDKIT=1 docker build \
  --cache-from ghcr.io/lucasroesler/my-function:latest \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  -t ghcr.io/lucasroesler/my-function:latest \
  .
  2. External cache destination
version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080
functions:
  my-function:
    lang: python3-flask-debian
    handler: ./my-function
    image: ghcr.io/lucasroesler/my-function:latest
    build_cache: max

In this configuration we assume that the registry cache is desired and that it caches from/to the buildcache tag of the function image. This is equivalent to

DOCKER_BUILDKIT=1 docker build \
  --cache-from type=registry,ref=ghcr.io/lucasroesler/my-function:buildcache \
  --cache-to type=registry,ref=ghcr.io/lucasroesler/my-function:buildcache,mode=max \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  -t ghcr.io/lucasroesler/my-function:latest \
  .
  3. Advanced mode: the user specifies the from/to values, matching the configuration options from buildkit https://github.com/moby/buildkit#export-cache
version: 1.0
provider:
 name: openfaas
 gateway: http://127.0.0.1:8080
functions:
 my-function:
   lang: python3-flask-debian
   handler: ./my-function
   image: ghcr.io/lucasroesler/my-function:latest
   build_cache:
     from:
       - type=gha
     to:
       - type=gha,mode=max

In this configuration we simply pass the parameters as is

DOCKER_BUILDKIT=1 docker build \
  --cache-from type=gha \
  --cache-to type=gha,mode=max \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  -t ghcr.io/lucasroesler/my-function:latest \
  .
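
The string shorthand above could be expanded into the from/to form roughly as follows. This is a Python sketch under stated assumptions, not the faas-cli implementation: the names with_tag and expand_build_cache are hypothetical, and the tag-replacement logic is a simplified take on image reference parsing.

```python
from typing import Dict, List, Union

BuildCache = Union[str, Dict[str, List[str]]]


def with_tag(image: str, tag: str) -> str:
    """Return the image reference with its tag replaced by `tag`."""
    name = image.split("@", 1)[0]  # drop a digest, if present
    slash = name.rfind("/")
    colon = name.rfind(":")
    if colon > slash:  # a colon after the last slash is a tag, not a port
        name = name[:colon]
    return f"{name}:{tag}"


def expand_build_cache(image: str, build_cache: BuildCache) -> Dict[str, List[str]]:
    """Expand "inline" or "max" into the explicit from/to configuration."""
    if isinstance(build_cache, dict):
        return build_cache  # advanced mode: used as written
    if build_cache == "inline":
        return {"from": [image]}
    if build_cache == "max":
        ref = with_tag(image, "buildcache")
        return {"from": [ref], "to": [ref]}
    raise ValueError(f"unsupported build_cache value: {build_cache!r}")


if __name__ == "__main__":
    print(expand_build_cache("ghcr.io/lucasroesler/my-function:latest", "max"))
```

Rejecting unknown strings with an error, rather than silently ignoring them, is a design choice of this sketch and not something the proposal specifies.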

User experience

There are a couple of errors that can occur and should probably be handled directly in the CLI.

Docker buildx

The max and other advanced modes require buildx. The typical recommendation would be to use docker buildx install, which causes docker build to become equivalent to docker buildx build.

When cache is enabled, we can either

  1. add a check and add a warning that docker buildx install is required
  2. change the build from docker build to docker buildx build
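
A possible shape for the second alternative is sketched below in Python (faas-cli is written in Go). The function names are hypothetical, and the sketch assumes that buildx is only needed when a cache destination is configured and that the build fails with a message when buildx is missing. The run parameter exists only so the check can be exercised without docker installed.

```python
import subprocess
from typing import Callable, List


def buildx_available(run: Callable = subprocess.run) -> bool:
    """Report whether the docker buildx plugin can be invoked."""
    try:
        result = run(["docker", "buildx", "version"], capture_output=True)
    except FileNotFoundError:
        return False  # the docker binary itself is missing
    return result.returncode == 0


def build_command(uses_cache_to: bool, has_buildx: bool) -> List[str]:
    """Pick the builder command for the configured cache options."""
    if not uses_cache_to:
        return ["docker", "build"]
    if has_buildx:
        return ["docker", "buildx", "build"]
    raise RuntimeError(
        "the build_cache 'to' option requires docker buildx; "
        "see 'docker buildx install'"
    )
```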

Registry errors

Not all registries support cache images. In my experience, this typically shows up as a 400 error during the cache export stage.

When cache mode is enabled and the builder returns an error (doesn't exit cleanly), we should print a warning and a link to the docs about known supported/unsupported registries. For example, GCR does not support cache but Google Artifact Registry does.
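
The warning could be produced by something like the following Python sketch. The function name cache_failure_hint and the message wording are hypothetical, and no documentation link is included because that page is not specified here.

```python
from typing import Optional


def cache_failure_hint(exit_code: int, cache_enabled: bool) -> Optional[str]:
    """Return a warning when a build with caching enabled fails, else None."""
    if exit_code == 0 or not cache_enabled:
        return None
    return (
        "warning: the build failed while build_cache was enabled; "
        "the registry may not support cache images. "
        "See the documentation on supported registries."
    )
```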

3 participants