
Add Docker files #117

Open
wants to merge 4 commits into main
Conversation

@bytebone bytebone commented Apr 26, 2024

This PR adds the files required to quickly set up the training UI and Tensorboard with Docker. It is based on the Docker template for Python apps, which is where the comments come from (they can be removed if desired).

To run it, you need the fully cloned git repo. Just like in the InvokeAI main repo, all Dockerfiles and bind mounts (except the existing sample_data folder) live in a docker subfolder. I've kept the approach similar to the main repo's Dockerfiles.

GPU integration is done the same way as in InvokeAI. The training UI and Tensorboard are exposed on neighbouring ports for convenience.

If you're ready to accept the PR, I can also add usage instructions to the README / docs if desired.

@RyanJDick
Collaborator

Thanks for this PR. Sorry that I've been slow in getting around to it. I do think it is worthwhile to get this in. I'll test it out and do a review in the next day or two.

@RyanJDick
Collaborator

I tested this out today. There are a few things that need to be addressed before we can merge.

File Permissions

I ran into some file permission errors when the scripts try to write to the output directory (I'm running on an Ubuntu host). To fix this, I think we need to do two things:

  1. Let's commit empty data/ and output/ directories to the repo. When these are created automatically by docker, they are owned by root rather than the current user. I'm thinking something like this:
.
├── compose.yaml
├── data
│   └── DOCKER_DATA_MOUNT_DIR.txt
├── Dockerfile
└── output
    └── DOCKER_OUTPUT_MOUNT_DIR.txt
  2. We need to use the host UID/GID in the container so that the user in the container can write to output/ with the correct permissions. There are a few ways to accomplish this. One option is to add `user: ${MY_UID-10001}:${MY_GID-10001}` to the compose.yaml and then run it with `MY_UID="$(id -u)" MY_GID="$(id -g)" docker compose up`. We also need to make sure this works on Windows - instructions might have to be slightly different on Windows.
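A minimal sketch of what that second point could look like in compose.yaml (the service name `train-ui`, the build context, and the container-side paths are placeholders, not taken from this PR):

```yaml
# Hypothetical compose.yaml excerpt; service name and paths are placeholders.
services:
  train-ui:
    build: .
    # Falls back to 10001:10001 when MY_UID / MY_GID are not set on the host.
    user: "${MY_UID-10001}:${MY_GID-10001}"
    volumes:
      - ./output:/app/output
```

With this in place, running `MY_UID="$(id -u)" MY_GID="$(id -g)" docker compose up` makes the container user match the host user, so files written to output/ get the right ownership.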

Hugging Face Cache

The hugging face cache should ideally be mounted into the container so that huggingface models can be loaded automatically without re-downloading.
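One way the mount could look, assuming the default host cache location of `~/.cache/huggingface` and a container user whose home is `/home/user` (both assumptions to verify against the actual image):

```yaml
# Hypothetical compose.yaml excerpt; the container-side home path is a placeholder.
services:
  train-ui:
    volumes:
      # Reuse the host's Hugging Face cache so models are not re-downloaded.
      - ${HOME}/.cache/huggingface:/home/user/.cache/huggingface
```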

Documentation

The docs for this feature should include:

  • Prerequisites (docker, nvidia-container-toolkit)
  • Command to launch
  • Explanation of bind mount directory structure
  • Explanation of why there are two containers, and which ports are exposed
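To make the "command to launch" item concrete, a sketch of what the documented commands might look like, assuming the compose file sits in the docker subfolder described in this PR:

```shell
# Prerequisites: docker and, for GPU support, nvidia-container-toolkit.
cd docker
# Pass the host UID/GID so files written to output/ are owned by the invoking user.
MY_UID="$(id -u)" MY_GID="$(id -g)" docker compose up
```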

@bytebone
Author

bytebone commented May 2, 2024

Thanks for the detailed feedback. I'll implement it as soon as possible; it might take me a few days due to other plans.

The points on file permissions make sense, though I'm considering having the user hardcode their UID/GID in the compose file, to reduce the typing required at startup and prevent issues when they forget to set the variables. Alternatively, we could use a startup script that sets the variables on its own, but that feels like overengineering to me.
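If hardcoding wins out, one low-friction variant (an assumption, not part of this PR) is a `.env` file next to compose.yaml, which docker compose reads automatically, so each user edits one small file instead of compose.yaml itself:

```
# .env -- each user sets these once, e.g. to the output of `id -u` and `id -g`.
MY_UID=1000
MY_GID=1000
```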

Can you recall off the top of your head where the HF cache lives? I thought it was placed in the user's home directory, which is already mounted to the ./data folder.

Since I'm building this on a Windows machine, everything works out of the box there. But I am dependent on your testing on Linux, including GPU passthrough.

I'll deal with the docs once we've got everything else locked in.
