Commit

Add serverless functionality

robballantyne committed Dec 4, 2023
1 parent 6072aa7 commit a7bffb6
Showing 24 changed files with 260 additions and 54 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1,4 +1,5 @@
workspace
*__pycache__
build/COPY_ROOT_EXTRA/
config/authorized_keys
config/rclone
129 changes: 127 additions & 2 deletions README.md
@@ -90,6 +90,19 @@ Supported PyTorch versions: `2.0.1`

Supported Platforms: `NVIDIA CUDA`, `AMD ROCm`, `CPU`


## Building Images

You can self-build from source by editing `docker-compose.yaml` or `.env` and running `docker compose build`.

It is a good idea to leave the source tree alone and copy any edits you would like to make into `build/COPY_ROOT_EXTRA/...`. The structure within this directory will be overlaid on `/` at the end of the build process.

Because this overlay is applied after the main build, it is easy to add extra files such as ML models and datasets to your images. Rebuilds will also be fast when your file overrides are confined to this directory.

Any directories and files that you add into `opt/storage` will be made available in the running container at `$WORKSPACE/storage`.

This directory is monitored by `inotifywait`. Any items appearing in this directory will be automatically linked to the application directories as defined in `/opt/ai-dock/storage_monitor/etc/mappings.sh`. This is particularly useful if you run several applications that each need access to the stored files.
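
For example, a baked-in model might be added to the overlay like this (a minimal sketch; the model URL is a placeholder):

```bash
# Create the storage path inside the build overlay; it is copied onto /
# at the end of the build and exposed at $WORKSPACE/storage at runtime.
mkdir -p build/COPY_ROOT_EXTRA/opt/storage/stable_diffusion/models/ckpt

# Placeholder URL - substitute the checkpoint you actually want to bake in.
wget -O build/COPY_ROOT_EXTRA/opt/storage/stable_diffusion/models/ckpt/model.safetensors \
    "https://example.org/model.safetensors"

docker compose build
```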

## Run Locally

A 'feature-complete' `docker-compose.yaml` file is included for your convenience. All features of the image are included; simply edit the environment variables in `.env`, save, and then run `docker compose up`.
@@ -162,7 +175,7 @@ You can use the included `cloudflared` service to make secure connections without
| `PROVISIONING_SCRIPT` | URL of a remote script to execute on init. See [note](#provisioning-script). |
| `RCLONE_*` | Rclone configuration - See [rclone documentation](https://rclone.org/docs/#config-file) |
| `SKIP_ACL` | Set `true` to skip modifying workspace ACL |
| `SSH_PORT` | Set a non-standard port for SSH (default `22`) |
| `SSH_PORT_LOCAL` | Set a non-standard port for SSH (default `22`) |
| `SSH_PUBKEY` | Your public key for SSH |
| `WEB_ENABLE_AUTH` | Enable password protection for web services (default `true`) |
| `WEB_USER` | Username for web services (default `user`) |
@@ -203,7 +216,7 @@ The URL must point to a plain text file - GitHub Gists/Pastebin (raw) are suitab
If you are running locally you may instead opt to mount a script at `/opt/ai-dock/bin/provisioning.sh`.

>[!NOTE]
>If configured, `sshd`, `caddy`, `cloudflared`, `rclone`, `port redirector` & `logtail` will be launched before provisioning; Any other processes will launch after.
>If configured, `sshd`, `caddy`, `cloudflared`, `rclone`, `serviceportal`, `storagemonitor` & `logtail` will be launched before provisioning; Any other processes will launch after.

>[!WARNING]
@@ -282,6 +295,24 @@ To manage this service you can use `supervisorctl [start|stop|restart] comfyui`.
>[!NOTE]
>_If you have enabled `CF_QUICK_TUNNELS` a secure `https://[random-auto-generated-sub-domain].trycloudflare.com` link will be created. You can find it at `/var/log/supervisor/quicktunnel-comfyui.log`_
### ComfyUI RP API

This service is available on port `8188` and is used to test the [RunPod serverless](https://link.ai-dock.org/runpod-serverless) API.

You can access the API directly at `/rp-api/runsync`, or you can use the Swagger/OpenAPI playground at `/rp-api/docs` (see the example request below).

There are several example payloads included.

This API is available on all platforms, but the container can only run in serverless mode on RunPod infrastructure.

To learn more about the serverless API see the [serverless section](#runpod-serverless).

<details>
<summary>API Playground</summary>
![openapi playground](https://raw.githubusercontent.com/ai-dock/comfyui/main/.github/images/api1.png)
</details>
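
As a quick smoke test, something like the following should reach the hosted API (a sketch: the payload schema is authoritatively documented at `/rp-api/docs`, and the `handler` and `include_text` fields here are illustrative):

```bash
# POST a test job to the locally hosted serverless API on the ComfyUI port.
curl -s -X POST "http://localhost:8188/rp-api/runsync" \
    -H "Content-Type: application/json" \
    -d '{
          "input": {
            "handler": "Text2Image",
            "include_text": "a photograph of an astronaut riding a horse"
          }
        }'
```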


### Jupyter (with tag `jupyter` only)

The Jupyter server will launch a `lab` instance unless you specify `JUPYTER_MODE=notebook`.
@@ -311,6 +342,19 @@ For each service, you will find a direct link and, if you have set `CF_QUICK_TUN

A simple web-based log viewer and process manager are included for convenience.

<details>
<summary>Service Portal links</summary>
![Service Portal links page](https://raw.githubusercontent.com/ai-dock/comfyui/main/.github/images/serviceportal-links.png)
</details>
<details>
<summary>Service Portal logs</summary>
![Service Portal logs page](https://raw.githubusercontent.com/ai-dock/comfyui/main/.github/images/serviceportal-logs.png)
</details>
<details>
<summary>Service Portal process manager</summary>
![Service Portal processes page](https://raw.githubusercontent.com/ai-dock/comfyui/main/.github/images/serviceportal-processes.png)
</details>

### Cloudflared

The Cloudflare tunnel daemon will start if you have provided a token with the `CF_TUNNEL_TOKEN` environment variable.
@@ -385,6 +429,10 @@ This script follows and prints the log files for each of the above services to s

If you are logged into the container you can follow the logs by running `logtail.sh` in your shell.

### Storage Monitor

This service detects changes to files in `$WORKSPACE/storage` and creates symbolic links to the application directories defined in `/opt/ai-dock/storage_monitor/etc/mappings.sh`.
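
A minimal sketch of the behaviour, using the checkpoint mapping from the default `mappings.sh` (the filename is a placeholder):

```bash
# Drop a model into the monitored storage tree...
cp model.safetensors "${WORKSPACE}/storage/stable_diffusion/models/ckpt/"

# ...and the monitor links it into the mapped application directory.
ls -l /opt/ComfyUI/models/checkpoints/model.safetensors
```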

## Open Ports

Some ports need to be exposed for the services to run or for certain features of the provided software to function.
@@ -452,4 +500,81 @@ A curated list of VM providers currently offering GPU instances:

---

## RunPod Serverless

The container can be used as a [RunPod serverless](https://links.ai-dock.org/runpod-serverless) worker. To enable serverless mode you must run the container with environment variables `SERVERLESS=true` and `WORKSPACE=runpod-volume`.

The handlers will accept a job, process it, and upload your images to S3-compatible storage.

You may either set your S3 credentials as environment variables or pass them to the worker in the payload.

You should set `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_ENDPOINT_URL` and `AWS_BUCKET_NAME`.
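
A minimal sketch of the relevant environment configuration (all values are placeholders):

```bash
# Required for serverless mode
SERVERLESS=true
WORKSPACE=runpod-volume

# S3-compatible storage for uploading results (placeholder values)
AWS_ACCESS_KEY_ID=your-access-key-id
AWS_SECRET_ACCESS_KEY=your-secret-access-key
AWS_ENDPOINT_URL=https://s3.example.com
AWS_BUCKET_NAME=comfyui-output
```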

<details>
<summary>Serverless template example</summary>
![RunPod serverless template](https://raw.githubusercontent.com/ai-dock/comfyui/main/.github/images/runpod-template.png)
</details>

If passed in the payload, these variable names should be lowercase.
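
For example, a payload fragment might carry the credentials like this (key names are assumed to mirror the environment variables above, lowercased; values are placeholders):

```bash
PAYLOAD='{
  "input": {
    "aws_access_key_id": "your-access-key-id",
    "aws_secret_access_key": "your-secret-access-key",
    "aws_endpoint_url": "https://s3.example.com",
    "aws_bucket_name": "comfyui-output"
  }
}'
```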

Failure to correctly set your credentials will not result in job failure; you can still retrieve your images from the network volume.

When used in serverless mode, the container will skip provisioning and will not update ComfyUI or the nodes on start, so you must either ensure everything you need is built into the image (see [Building Images](#building-images)) or first run the container with a network volume to get everything set up before launching your workers.

After launching a serverless worker, any instances of the container launched on the network volume in GPU cloud will also skip auto-updating. All updates must be done manually.

The API is documented in OpenAPI format. You can test it in a running container on the ComfyUI port at `/rp-api/docs`. See [ComfyUI RP API](#comfyui-rp-api) for more information.

---

The API can use multiple handlers, which you may define in the payload. Three handlers are included for your convenience.

### Handler: RawWorkflow

This handler should be passed a full ComfyUI workflow in the payload. It will detect any URLs and download the referenced files into the input directory before replacing each URL value with the local path of the resource. This is very useful when working with image-to-image and ControlNet workflows.

This is the most flexible of all handlers (see the example request after the schema below).

<details>
<summary>RawWorkflow schema</summary>
![RawWorkflow schema](https://raw.githubusercontent.com/ai-dock/comfyui/main/.github/images/api-schema-rawworkflow.png)

[example payload](https://raw.githubusercontent.com/ai-dock/comfyui/main/build/COPY_ROOT/opt/serverless/docs/example_payloads/raw_controlnet_t2i_adapters.json)
</details>
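
A sketch of how such a request might be sent (the payload file is a local placeholder; the linked example payload above shows the authoritative structure):

```bash
# POST a full ComfyUI workflow. Any URL values in it (e.g. an input image)
# are downloaded into the input directory and rewritten to local paths
# before the job runs.
curl -s -X POST "http://localhost:8188/rp-api/runsync" \
    -H "Content-Type: application/json" \
    --data-binary @my_raw_workflow_payload.json
```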


### Handler: Text2Image

This is a basic handler that is bound to a static workflow file (`/opt/serverless/workflows/text2image.json`).

You can define several overrides to modify the workflow before processing (see the example request below).

<details>
<summary>Text2Image schema</summary>
![Text2Image schema](https://raw.githubusercontent.com/ai-dock/comfyui/main/.github/images/api-schema-text2image.png)

[example payload](https://raw.githubusercontent.com/ai-dock/comfyui/main/build/COPY_ROOT/opt/serverless/docs/example_payloads/bound_text2image.json)
</details>
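
An illustrative request might look like this (`exclude_text` appears in the bundled handler; the other field names are assumptions, so check the schema above and the linked example payload):

```bash
curl -s -X POST "http://localhost:8188/rp-api/runsync" \
    -H "Content-Type: application/json" \
    -d '{
          "input": {
            "handler": "Text2Image",
            "include_text": "a watercolour painting of a lighthouse",
            "exclude_text": "blurry, low quality"
          }
        }'
```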

### Handler: Image2Image

This is a basic handler that is bound to a static workflow file (`/opt/serverless/workflows/image2image.json`).

You can define several overrides to modify the workflow before processing (see the example request below).

<details>
<summary>Image2Image schema</summary>
![Image2Image schema](https://raw.githubusercontent.com/ai-dock/comfyui/main/.github/images/api-schema-text2image.png)

[example payload](https://raw.githubusercontent.com/ai-dock/comfyui/main/build/COPY_ROOT/opt/serverless/docs/example_payloads/bound_image2image.json)
</details>
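
An illustrative request (`input_image` and `exclude_text` appear in the bundled handler; the other field names are assumptions):

```bash
curl -s -X POST "http://localhost:8188/rp-api/runsync" \
    -H "Content-Type: application/json" \
    -d '{
          "input": {
            "handler": "Image2Image",
            "input_image": "https://example.org/source.png",
            "exclude_text": "blurry, low quality"
          }
        }'
```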

These handlers demonstrate how you can create a simple endpoint that requires very little frontend work to implement.

You can find example payloads for these handlers [here](https://github.com/ai-dock/comfyui/tree/main/build/COPY_ROOT/opt/serverless/docs/example_payloads).


---


_The author ([@robballantyne](https://github.com/robballantyne)) may be compensated if you sign up to services linked in this document. Testing multiple variants of GPU images in many different environments is both costly and time-consuming; this helps to offset costs._
@@ -0,0 +1,21 @@
[program:comfyui_rp_api]
command=/opt/ai-dock/bin/supervisor-comfyui-rp-api.sh
process_name=%(program_name)s
numprocs=1
directory=/opt/serverless/providers/runpod
priority=1500
autostart=true
startsecs=5
startretries=3
autorestart=unexpected
stopsignal=TERM
stopwaitsecs=10
stopasgroup=true
killasgroup=true
stdout_logfile=/var/log/supervisor/comfyui-rp-api.log
stdout_logfile_maxbytes=10MB
stdout_logfile_backups=1
stderr_logfile=/dev/null
stderr_logfile_maxbytes=0
stderr_logfile_backups=0
environment=PROC_NAME="%(program_name)s"
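
With this program definition in place, the service can be managed from a shell inside the container (a sketch using the program name and log path above):

```bash
supervisorctl status comfyui_rp_api
supervisorctl restart comfyui_rp_api
tail -f /var/log/supervisor/comfyui-rp-api.log
```
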
26 changes: 1 addition & 25 deletions build/COPY_ROOT/opt/ai-dock/bin/preflight.sh
@@ -4,14 +4,12 @@

function preflight_main() {
preflight_copy_notebook
preflight_link_baked_models
preflight_update_comfyui
printf "%s" "${COMFYUI_FLAGS}" > /etc/comfyui_flags.conf
}

function preflight_serverless() {
printf "Refusing to update ComfyUI in serverless mode\n"
preflight_link_baked_models
printf "Skipping ComfyUI updates in serverless mode\n"
printf "%s" "${COMFYUI_FLAGS}" > /etc/comfyui_flags.conf

}
@@ -32,28 +30,6 @@ function preflight_update_comfyui() {
fi
}

# Baked in models cannot exist in /opt/ComfyUI or they will sync with volume mounts
# We don't want that, so they live in /opt/model_repository and get symlinked at runtime
# We force this, because loading from volumes will always be slower, so let's avoid having them there

function preflight_link_baked_models() {
for file in /opt/model_repository/checkpoints/*; do
ln -sf "$file" "${WORKSPACE}ComfyUI/models/checkpoints/"
done
for file in /opt/model_repository/controlnet/*; do
ln -sf "$file" "${WORKSPACE}ComfyUI/models/controlnet/"
done
for file in /opt/model_repository/esrgan/*; do
ln -sf "$file" "${WORKSPACE}ComfyUI/models/upscale_models/"
done
for file in /opt/model_repository/lora/*; do
ln -sf "$file" "${WORKSPACE}ComfyUI/models/loras/"
done
for file in /opt/model_repository/vae/*; do
ln -sf "$file" "${WORKSPACE}ComfyUI/models/vae/"
done
}

if [[ ${SERVERLESS,,} != "true" ]]; then
preflight_main "$@"
else
31 changes: 31 additions & 0 deletions build/COPY_ROOT/opt/ai-dock/bin/supervisor-comfyui-rp-api.sh
@@ -0,0 +1,31 @@
#!/bin/bash

trap cleanup EXIT

LISTEN_PORT=38188
SERVICE_NAME="RunPod Serverless API"

function cleanup() {
kill $(jobs -p) > /dev/null 2>&1
}


function start() {
if [[ ${SERVERLESS,,} = "true" ]]; then
printf "Refusing to start hosted API service in serverless mode\n"
exec sleep 10
fi

printf "Starting %s...\n" ${SERVICE_NAME}

kill $(lsof -t -i:$LISTEN_PORT) > /dev/null 2>&1 &
wait -n

cd /opt/serverless/providers/runpod && \
micromamba run -n serverless python worker.py \
--rp_serve_api \
--rp_api_port $LISTEN_PORT \
--rp_api_host 127.0.0.1
}

start 2>&1
10 changes: 10 additions & 0 deletions build/COPY_ROOT/opt/ai-dock/storage_monitor/etc/mappings.sh
@@ -0,0 +1,10 @@
# Key is relative to $WORKSPACE/storage/

declare -A storage_map
storage_map["stable_diffusion/models/ckpt"]="/opt/ComfyUI/models/checkpoints"
storage_map["stable_diffusion/models/lora"]="/opt/ComfyUI/models/loras"
storage_map["stable_diffusion/models/controlnet"]="/opt/ComfyUI/models/controlnet"
storage_map["stable_diffusion/models/vae"]="/opt/ComfyUI/models/vae"
storage_map["stable_diffusion/models/esrgan"]="/opt/ComfyUI/models/upscale_models"

# Add more mappings for other repository directories as needed
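
Following the pattern above, an extra mapping is a single line; for example (the embeddings paths are hypothetical):

```bash
storage_map["stable_diffusion/models/embeddings"]="/opt/ComfyUI/models/embeddings"
```
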
12 changes: 12 additions & 0 deletions build/COPY_ROOT/opt/caddy/share/service_config_18188
@@ -0,0 +1,12 @@
:!PROXY_PORT {
handle_path /openapi.json {
root * /opt/serverless/docs/swagger/openapi.yaml
file_server
}

handle_path /rp-api* {
reverse_proxy localhost:38188
}

reverse_proxy localhost:!LISTEN_PORT
}
16 changes: 16 additions & 0 deletions build/COPY_ROOT/opt/caddy/share/service_config_18188_auth
@@ -0,0 +1,16 @@
:!PROXY_PORT {
basicauth * {
import /opt/caddy/etc/basicauth
}

handle_path /openapi.json {
root * /opt/serverless/docs/swagger/openapi.yaml
file_server
}

handle_path /rp-api* {
reverse_proxy localhost:38188
}

reverse_proxy localhost:!LISTEN_PORT
}
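
With this variant active, requests through the proxy must carry basic-auth credentials; a sketch (the username defaults to `user` per the `WEB_USER` variable in the README, and the password here is a placeholder):

```bash
# Placeholder credentials; PROXY_PORT is substituted at runtime (8188 in the README).
curl -u "user:password" "http://localhost:8188/rp-api/docs"
```
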
8 changes: 7 additions & 1 deletion build/COPY_ROOT/opt/serverless/handlers/basehandler.py
@@ -4,6 +4,7 @@
import time
import os
import base64
import shutil
from utils.s3utils import s3utils
from utils.network import Network

@@ -142,7 +143,12 @@ def get_result(self, job_id):
for image in outputs[item]["images"]:
original_path = f"{self.OUTPUT_DIR}{image['subfolder']}/{image['filename']}"
new_path = f"{custom_output_dir}/{image['filename']}"
os.rename(original_path, new_path)
# Handle duplicated requests where the output file is not re-generated
if os.path.islink(original_path):
shutil.copyfile(os.path.realpath(original_path), new_path)
else:
os.rename(original_path, new_path)
os.symlink(new_path, original_path)
key = f"{self.request_id}/{image['filename']}"
self.result["images"].append({
"local_path": new_path,
1 change: 0 additions & 1 deletion build/COPY_ROOT/opt/serverless/handlers/image2image.py
@@ -43,7 +43,6 @@ def apply_modifiers(self):
self.prompt["prompt"]["7"]["inputs"]["text"] = self.get_value(
"exclude_text",
"")
self.prompt["prompt"]["9"]["inputs"]["filename_prefix"] = f"{self.request_id}/img-{timestr}"
self.prompt["prompt"]["10"]["inputs"]["image"] = self.get_value(
"input_image",
"")
1 change: 0 additions & 1 deletion build/COPY_ROOT/opt/serverless/handlers/text2image.py
@@ -52,7 +52,6 @@ def apply_modifiers(self):
self.prompt["prompt"]["7"]["inputs"]["text"] = self.get_value(
"exclude_text",
"")
self.prompt["prompt"]["9"]["inputs"]["filename_prefix"] = f"{self.request_id}/img-{timestr}"



7 changes: 4 additions & 3 deletions build/COPY_ROOT/opt/serverless/providers/runpod/worker.py
@@ -21,10 +21,11 @@ def get_handler(payload):
def worker(event):
result = {}
try:
if is_test_job(event):
event["id"] = str(uuid.uuid4())
payload = event["input"]
payload["request_id"] = event["id"]
if is_test_job(event):
payload["request_id"] = str(uuid.uuid4())
else:
payload["request_id"] = event["id"]
handler = get_handler(payload)
result = handler.handle()
except Exception as e:
Empty file. (×10)