Commit

Add serverless functionality

robballantyne committed Dec 4, 2023
1 parent 6072aa7 commit a7bffb6
Showing 24 changed files with 260 additions and 54 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1,4 +1,5 @@
workspace
*__pycache__
build/COPY_ROOT_EXTRA/
config/authorized_keys
config/rclone
129 changes: 127 additions & 2 deletions README.md
@@ -90,6 +90,19 @@ Supported PyTorch versions: `2.0.1`

Supported Platforms: `NVIDIA CUDA`, `AMD ROCm`, `CPU`


## Building Images

You can self-build from source by editing `docker-compose.yaml` or `.env` and running `docker compose build`.

It is a good idea to leave the source tree alone and copy any edits you would like to make into `build/COPY_ROOT_EXTRA/...`. The structure within this directory will be overlaid on `/` at the end of the build process.

Because this overlay is applied after the main build, it is easy to add extra files such as ML models and datasets to your images. Rebuilds will also be fast when your file overrides are confined to this directory.

Any directories and files that you add into `opt/storage` will be made available in the running container at `$WORKSPACE/storage`.

This directory is monitored by `inotifywait`. Any items appearing in this directory will be automatically linked to the application directories as defined in `/opt/ai-dock/storage_monitor/etc/mappings.sh`. This is particularly useful if you run several applications that each need access to the stored files.
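
For example, a baked-in model might be added to the overlay like this (a minimal sketch; the model URL is a placeholder):

```bash
# Create the storage path inside the build overlay; it is copied onto /
# at the end of the build and exposed at $WORKSPACE/storage at runtime.
mkdir -p build/COPY_ROOT_EXTRA/opt/storage/stable_diffusion/models/ckpt

# Placeholder URL - substitute the checkpoint you actually want to bake in.
wget -O build/COPY_ROOT_EXTRA/opt/storage/stable_diffusion/models/ckpt/model.safetensors \
    "https://example.org/model.safetensors"

docker compose build
```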

## Run Locally

A 'feature-complete' `docker-compose.yaml` file is included for your convenience. All features of the image are included; simply edit the environment variables in `.env`, save, and then run `docker compose up`.
@@ -162,7 +175,7 @@ You can use the included `cloudflared` service to make secure connections without
| `PROVISIONING_SCRIPT` | URL of a remote script to execute on init. See [note](#provisioning-script). |
| `RCLONE_*` | Rclone configuration - See [rclone documentation](https://rclone.org/docs/#config-file) |
| `SKIP_ACL` | Set `true` to skip modifying workspace ACL |
| `SSH_PORT` | Set a non-standard port for SSH (default `22`) |
| `SSH_PORT_LOCAL` | Set a non-standard port for SSH (default `22`) |
| `SSH_PUBKEY` | Your public key for SSH |
| `WEB_ENABLE_AUTH` | Enable password protection for web services (default `true`) |
| `WEB_USER` | Username for web services (default `user`) |
@@ -203,7 +216,7 @@ The URL must point to a plain text file - GitHub Gists/Pastebin (raw) are suitab
If you are running locally you may instead opt to mount a script at `/opt/ai-dock/bin/provisioning.sh`.

>[!NOTE]
>If configured, `sshd`, `caddy`, `cloudflared`, `rclone`, `port redirector` & `logtail` will be launched before provisioning; Any other processes will launch after.
>If configured, `sshd`, `caddy`, `cloudflared`, `rclone`, `serviceportal`, `storagemonitor` & `logtail` will be launched before provisioning; Any other processes will launch after.

>[!WARNING]
@@ -282,6 +295,24 @@ To manage this service you can use `supervisorctl [start|stop|restart] comfyui`.
>[!NOTE]
>_If you have enabled `CF_QUICK_TUNNELS` a secure `https://[random-auto-generated-sub-domain].trycloudflare.com` link will be created. You can find it at `/var/log/supervisor/quicktunnel-comfyui.log`_
### ComfyUI RP API

This service is available on port `8188` and is used to test the [RunPod serverless](https://link.ai-dock.org/runpod-serverless) API.

You can access the API directly at `/rp-api/runsync`, or you can use the Swagger/OpenAPI playground at `/rp-api/docs` (see the example request below).

There are several example payloads included.

This API is available on all platforms, but the container can only run in serverless mode on RunPod infrastructure.

To learn more about the serverless API see the [serverless section](#runpod-serverless).

<details>
<summary>API Playground</summary>
![openapi playground](https://raw.githubusercontent.com/ai-dock/comfyui/main/.github/images/api1.png)
</details>
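
As a quick smoke test, something like the following should reach the hosted API (a sketch: the payload schema is authoritatively documented at `/rp-api/docs`, and the `handler` and `include_text` fields here are illustrative):

```bash
# POST a test job to the locally hosted serverless API on the ComfyUI port.
curl -s -X POST "http://localhost:8188/rp-api/runsync" \
    -H "Content-Type: application/json" \
    -d '{
          "input": {
            "handler": "Text2Image",
            "include_text": "a photograph of an astronaut riding a horse"
          }
        }'
```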


### Jupyter (with tag `jupyter` only)

The Jupyter server will launch a `lab` instance unless you specify `JUPYTER_MODE=notebook`.
@@ -311,6 +342,19 @@ For each service, you will find a direct link and, if you have set `CF_QUICK_TUN

A simple web-based log viewer and process manager are included for convenience.

<details>
<summary>Service Portal links</summary>
![Service Portal links page](https://raw.githubusercontent.com/ai-dock/comfyui/main/.github/images/serviceportal-links.png)
</details>
<details>
<summary>Service Portal logs</summary>
![Service Portal logs page](https://raw.githubusercontent.com/ai-dock/comfyui/main/.github/images/serviceportal-logs.png)
</details>
<details>
<summary>Service Portal process manager</summary>
![Service Portal processes page](https://raw.githubusercontent.com/ai-dock/comfyui/main/.github/images/serviceportal-processes.png)
</details>

### Cloudflared

The Cloudflare tunnel daemon will start if you have provided a token with the `CF_TUNNEL_TOKEN` environment variable.
@@ -385,6 +429,10 @@ This script follows and prints the log files for each of the above services to s

If you are logged into the container you can follow the logs by running `logtail.sh` in your shell.

### Storage Monitor

This service detects changes to files in `$WORKSPACE/storage` and creates symbolic links to the application directories defined in `/opt/ai-dock/storage_monitor/etc/mappings.sh`.
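
A minimal sketch of the behaviour, using the checkpoint mapping from the default `mappings.sh` (the filename is a placeholder):

```bash
# Drop a model into the monitored storage tree...
cp model.safetensors "${WORKSPACE}/storage/stable_diffusion/models/ckpt/"

# ...and the monitor links it into the mapped application directory.
ls -l /opt/ComfyUI/models/checkpoints/model.safetensors
```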

## Open Ports

Some ports need to be exposed for the services to run or for certain features of the provided software to function.
@@ -452,4 +500,81 @@ A curated list of VM providers currently offering GPU instances:

---

## RunPod Serverless

The container can be used as a [RunPod serverless](https://links.ai-dock.org/runpod-serverless) worker. To enable serverless mode you must run the container with environment variables `SERVERLESS=true` and `WORKSPACE=runpod-volume`.

The handlers will accept a job, process it, and upload your images to S3-compatible storage.

You may either set your S3 credentials as environment variables or pass them to the worker in the payload.

You should set `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_ENDPOINT_URL` and `AWS_BUCKET_NAME`.
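
A minimal sketch of the relevant environment configuration (all values are placeholders):

```bash
# Required for serverless mode
SERVERLESS=true
WORKSPACE=runpod-volume

# S3-compatible storage for uploading results (placeholder values)
AWS_ACCESS_KEY_ID=your-access-key-id
AWS_SECRET_ACCESS_KEY=your-secret-access-key
AWS_ENDPOINT_URL=https://s3.example.com
AWS_BUCKET_NAME=comfyui-output
```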

<details>
<summary>Serverless template example</summary>
![RunPod serverless template](https://raw.githubusercontent.com/ai-dock/comfyui/main/.github/images/runpod-template.png)
</details>

If passed in the payload, these variable names should be lowercase.
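
For example, a payload fragment might carry the credentials like this (key names are assumed to mirror the environment variables above, lowercased; values are placeholders):

```bash
PAYLOAD='{
  "input": {
    "aws_access_key_id": "your-access-key-id",
    "aws_secret_access_key": "your-secret-access-key",
    "aws_endpoint_url": "https://s3.example.com",
    "aws_bucket_name": "comfyui-output"
  }
}'
```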

Failure to correctly set your credentials will not result in job failure; you can still retrieve your images from the network volume.

When used in serverless mode, the container will skip provisioning and will not update ComfyUI or the nodes on start, so you must either ensure everything you need is built into the image (see [Building Images](#building-images)) or first run the container with a network volume to get everything set up before launching your workers.

After launching a serverless worker, any instances of the container launched on the network volume in GPU cloud will also skip auto-updating. All updates must be done manually.

The API is documented in OpenAPI format. You can test it in a running container on the ComfyUI port at `/rp-api/docs`. See [ComfyUI RP API](#comfyui-rp-api) for more information.

---

The API can use multiple handlers, which you may define in the payload. Three handlers are included for your convenience.

### Handler: RawWorkflow

This handler should be passed a full ComfyUI workflow in the payload. It will detect any URLs and download the referenced files into the input directory before replacing each URL value with the local path of the resource. This is very useful when working with image-to-image and ControlNet workflows.

This is the most flexible of all handlers (see the example request after the schema below).

<details>
<summary>RawWorkflow schema</summary>
![RawWorkflow schema](https://raw.githubusercontent.com/ai-dock/comfyui/main/.github/images/api-schema-rawworkflow.png)

[example payload](https://raw.githubusercontent.com/ai-dock/comfyui/main/build/COPY_ROOT/opt/serverless/docs/example_payloads/raw_controlnet_t2i_adapters.json)
</details>
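
A sketch of how such a request might be sent (the payload file is a local placeholder; the linked example payload above shows the authoritative structure):

```bash
# POST a full ComfyUI workflow. Any URL values in it (e.g. an input image)
# are downloaded into the input directory and rewritten to local paths
# before the job runs.
curl -s -X POST "http://localhost:8188/rp-api/runsync" \
    -H "Content-Type: application/json" \
    --data-binary @my_raw_workflow_payload.json
```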


### Handler: Text2Image

This is a basic handler that is bound to a static workflow file (`/opt/serverless/workflows/text2image.json`).

You can define several overrides to modify the workflow before processing (see the example request below).

<details>
<summary>Text2Image schema</summary>
![Text2Image schema](https://raw.githubusercontent.com/ai-dock/comfyui/main/.github/images/api-schema-text2image.png)

[example payload](https://raw.githubusercontent.com/ai-dock/comfyui/main/build/COPY_ROOT/opt/serverless/docs/example_payloads/bound_text2image.json)
</details>
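
An illustrative request might look like this (`exclude_text` appears in the bundled handler; the other field names are assumptions, so check the schema above and the linked example payload):

```bash
curl -s -X POST "http://localhost:8188/rp-api/runsync" \
    -H "Content-Type: application/json" \
    -d '{
          "input": {
            "handler": "Text2Image",
            "include_text": "a watercolour painting of a lighthouse",
            "exclude_text": "blurry, low quality"
          }
        }'
```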

### Handler: Image2Image

This is a basic handler that is bound to a static workflow file (`/opt/serverless/workflows/image2image.json`).

You can define several overrides to modify the workflow before processing (see the example request below).

<details>
<summary>Image2Image schema</summary>
![Image2Image schema](https://raw.githubusercontent.com/ai-dock/comfyui/main/.github/images/api-schema-text2image.png)

[example payload](https://raw.githubusercontent.com/ai-dock/comfyui/main/build/COPY_ROOT/opt/serverless/docs/example_payloads/bound_image2image.json)
</details>
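
An illustrative request (`input_image` and `exclude_text` appear in the bundled handler; the other field names are assumptions):

```bash
curl -s -X POST "http://localhost:8188/rp-api/runsync" \
    -H "Content-Type: application/json" \
    -d '{
          "input": {
            "handler": "Image2Image",
            "input_image": "https://example.org/source.png",
            "exclude_text": "blurry, low quality"
          }
        }'
```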

These handlers demonstrate how you can create a simple endpoint that requires very little frontend work to implement.

You can find example payloads for these handlers [here](https://github.com/ai-dock/comfyui/tree/main/build/COPY_ROOT/opt/serverless/docs/example_payloads).


---


_The author ([@robballantyne](https://github.com/robballantyne)) may be compensated if you sign up to services linked in this document. Testing multiple variants of GPU images in many different environments is both costly and time-consuming; this helps to offset costs._
@@ -0,0 +1,21 @@
[program:comfyui_rp_api]
command=/opt/ai-dock/bin/supervisor-comfyui-rp-api.sh
process_name=%(program_name)s
numprocs=1
directory=/opt/serverless/providers/runpod
priority=1500
autostart=true
startsecs=5
startretries=3
autorestart=unexpected
stopsignal=TERM
stopwaitsecs=10
stopasgroup=true
killasgroup=true
stdout_logfile=/var/log/supervisor/comfyui-rp-api.log
stdout_logfile_maxbytes=10MB
stdout_logfile_backups=1
stderr_logfile=/dev/null
stderr_logfile_maxbytes=0
stderr_logfile_backups=0
environment=PROC_NAME="%(program_name)s"
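
With this program definition in place, the service can be managed from a shell inside the container (a sketch using the program name and log path above):

```bash
supervisorctl status comfyui_rp_api
supervisorctl restart comfyui_rp_api
tail -f /var/log/supervisor/comfyui-rp-api.log
```
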
26 changes: 1 addition & 25 deletions build/COPY_ROOT/opt/ai-dock/bin/preflight.sh
@@ -4,14 +4,12 @@

function preflight_main() {
preflight_copy_notebook
preflight_link_baked_models
preflight_update_comfyui
printf "%s" "${COMFYUI_FLAGS}" > /etc/comfyui_flags.conf
}

function preflight_serverless() {
printf "Refusing to update ComfyUI in serverless mode\n"
preflight_link_baked_models
printf "Skipping ComfyUI updates in serverless mode\n"
printf "%s" "${COMFYUI_FLAGS}" > /etc/comfyui_flags.conf

}
@@ -32,28 +30,6 @@ function preflight_update_comfyui() {
fi
}

# Baked in models cannot exist in /opt/ComfyUI or they will sync with volume mounts
# We don't want that, so they live in /opt/model_repository and get symlinked at runtime
# We force this, because loading from volumes will always be slower, so let's avoid having them there

function preflight_link_baked_models() {
for file in /opt/model_repository/checkpoints/*; do
ln -sf "$file" "${WORKSPACE}ComfyUI/models/checkpoints/"
done
for file in /opt/model_repository/controlnet/*; do
ln -sf "$file" "${WORKSPACE}ComfyUI/models/controlnet/"
done
for file in /opt/model_repository/esrgan/*; do
ln -sf "$file" "${WORKSPACE}ComfyUI/models/upscale_models/"
done
for file in /opt/model_repository/lora/*; do
ln -sf "$file" "${WORKSPACE}ComfyUI/models/loras/"
done
for file in /opt/model_repository/vae/*; do
ln -sf "$file" "${WORKSPACE}ComfyUI/models/vae/"
done
}

if [[ ${SERVERLESS,,} != "true" ]]; then
preflight_main "$@"
else
31 changes: 31 additions & 0 deletions build/COPY_ROOT/opt/ai-dock/bin/supervisor-comfyui-rp-api.sh
@@ -0,0 +1,31 @@
#!/bin/bash

trap cleanup EXIT

LISTEN_PORT=38188
SERVICE_NAME="RunPod Serverless API"

function cleanup() {
kill $(jobs -p) > /dev/null 2>&1
}


function start() {
if [[ ${SERVERLESS,,} = "true" ]]; then
printf "Refusing to start hosted API service in serverless mode\n"
exec sleep 10
fi

printf "Starting %s...\n" ${SERVICE_NAME}

kill $(lsof -t -i:$LISTEN_PORT) > /dev/null 2>&1 &
wait -n

cd /opt/serverless/providers/runpod && \
micromamba run -n serverless python worker.py \
--rp_serve_api \
--rp_api_port $LISTEN_PORT \
--rp_api_host 127.0.0.1
}

start 2>&1
10 changes: 10 additions & 0 deletions build/COPY_ROOT/opt/ai-dock/storage_monitor/etc/mappings.sh
@@ -0,0 +1,10 @@
# Key is relative to $WORKSPACE/storage/

declare -A storage_map
storage_map["stable_diffusion/models/ckpt"]="/opt/ComfyUI/models/checkpoints"
storage_map["stable_diffusion/models/lora"]="/opt/ComfyUI/models/loras"
storage_map["stable_diffusion/models/controlnet"]="/opt/ComfyUI/models/controlnet"
storage_map["stable_diffusion/models/vae"]="/opt/ComfyUI/models/vae"
storage_map["stable_diffusion/models/esrgan"]="/opt/ComfyUI/models/upscale_models"

# Add more mappings for other repository directories as needed
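
Following the pattern above, an extra mapping is a single line; for example (the embeddings paths are hypothetical):

```bash
storage_map["stable_diffusion/models/embeddings"]="/opt/ComfyUI/models/embeddings"
```
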
12 changes: 12 additions & 0 deletions build/COPY_ROOT/opt/caddy/share/service_config_18188
@@ -0,0 +1,12 @@
:!PROXY_PORT {
handle_path /openapi.json {
root * /opt/serverless/docs/swagger/openapi.yaml
file_server
}

handle_path /rp-api* {
reverse_proxy localhost:38188
}

reverse_proxy localhost:!LISTEN_PORT
}
16 changes: 16 additions & 0 deletions build/COPY_ROOT/opt/caddy/share/service_config_18188_auth
@@ -0,0 +1,16 @@
:!PROXY_PORT {
basicauth * {
import /opt/caddy/etc/basicauth
}

handle_path /openapi.json {
root * /opt/serverless/docs/swagger/openapi.yaml
file_server
}

handle_path /rp-api* {
reverse_proxy localhost:38188
}

reverse_proxy localhost:!LISTEN_PORT
}
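
With this variant active, requests through the proxy must carry basic-auth credentials; a sketch (the username defaults to `user` per the `WEB_USER` variable in the README, and the password here is a placeholder):

```bash
# Placeholder credentials; PROXY_PORT is substituted at runtime (8188 in the README).
curl -u "user:password" "http://localhost:8188/rp-api/docs"
```
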
8 changes: 7 additions & 1 deletion build/COPY_ROOT/opt/serverless/handlers/basehandler.py
@@ -4,6 +4,7 @@
import time
import os
import base64
import shutil
from utils.s3utils import s3utils
from utils.network import Network

@@ -142,7 +143,12 @@ def get_result(self, job_id):
for image in outputs[item]["images"]:
original_path = f"{self.OUTPUT_DIR}{image['subfolder']}/{image['filename']}"
new_path = f"{custom_output_dir}/{image['filename']}"
os.rename(original_path, new_path)
# Handle duplicated requests where the output file is not re-generated
if os.path.islink(original_path):
shutil.copyfile(os.path.realpath(original_path), new_path)
else:
os.rename(original_path, new_path)
os.symlink(new_path, original_path)
key = f"{self.request_id}/{image['filename']}"
self.result["images"].append({
"local_path": new_path,
1 change: 0 additions & 1 deletion build/COPY_ROOT/opt/serverless/handlers/image2image.py
@@ -43,7 +43,6 @@ def apply_modifiers(self):
self.prompt["prompt"]["7"]["inputs"]["text"] = self.get_value(
"exclude_text",
"")
self.prompt["prompt"]["9"]["inputs"]["filename_prefix"] = f"{self.request_id}/img-{timestr}"
self.prompt["prompt"]["10"]["inputs"]["image"] = self.get_value(
"input_image",
"")
1 change: 0 additions & 1 deletion build/COPY_ROOT/opt/serverless/handlers/text2image.py
@@ -52,7 +52,6 @@ def apply_modifiers(self):
self.prompt["prompt"]["7"]["inputs"]["text"] = self.get_value(
"exclude_text",
"")
self.prompt["prompt"]["9"]["inputs"]["filename_prefix"] = f"{self.request_id}/img-{timestr}"



7 changes: 4 additions & 3 deletions build/COPY_ROOT/opt/serverless/providers/runpod/worker.py
@@ -21,10 +21,11 @@ def get_handler(payload):
def worker(event):
result = {}
try:
if is_test_job(event):
event["id"] = str(uuid.uuid4())
payload = event["input"]
payload["request_id"] = event["id"]
if is_test_job(event):
payload["request_id"] = str(uuid.uuid4())
else:
payload["request_id"] = event["id"]
handler = get_handler(payload)
result = handler.handle()
except Exception as e:
Empty file. (×10)