refactor: v1.9.0 + scheduler, idle RAM management, Observation rewrite #88

Open · wants to merge 1 commit into base: main
Conversation

@buzsh (Owner) commented on Apr 17, 2024

Goals

  • Model management, editor unification
    • 1-model setup: merge StoredPrompt and PromptModel, child models
    • SwiftData @Model accessors for PromptView
    • SwiftData modern store + save management

With the existing setup, implementing new interface features from the A1111 backend requires a number of moving parts, along with a handful of modifications scattered across the codebase. With this change, we look to merge all of these parts into one model. This will also bring us closer to the goal of direct API → SwiftUI translation for plugin components.
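As a rough illustration of where this is headed, here is a minimal sketch of a unified SwiftData @Model; the property names and child model are placeholders, not the actual StoredPrompt/PromptModel fields in the repo:

```swift
import SwiftData

// Minimal sketch of the unified prompt model. Property names and the child
// model are illustrative placeholders, not the repo's actual fields.
@Model
final class Prompt {
  var title: String
  var positivePrompt: String
  var negativePrompt: String
  var samplingMethod: String
  // Child model as a SwiftData relationship instead of a separate,
  // manually synced type
  @Relationship(deleteRule: .cascade) var generationSettings: GenerationSettings?

  init(title: String = "", positivePrompt: String = "",
       negativePrompt: String = "", samplingMethod: String = "DPM++ 2M") {
    self.title = title
    self.positivePrompt = positivePrompt
    self.negativePrompt = negativePrompt
    self.samplingMethod = samplingMethod
  }
}

@Model
final class GenerationSettings {
  var steps: Int
  var cfgScale: Double

  init(steps: Int = 20, cfgScale: Double = 7.0) {
    self.steps = steps
    self.cfgScale = cfgScale
  }
}
```

Since @Model classes are Observable, PromptView could then read and mutate a Prompt directly via @Query/@Bindable, with the modern store handling saves.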

  • Observation framework (see Apple docs); a migration sketch follows this list

    • Remove need for Combine: ObservableObject, @Published, ...
    • Inherently allow for improved state management
  • A1111 v1.9.0 compatibility (see stable-diffusion-webui releases)

    • Dynamic sampling methods + associated schedulers from the API
    • Implement scheduler selection with a default of .automatic (sketched after this list)
  • (Optional) Idle RAM management for powerful hardware

    • Start Python process for a queue of tasks (load model, start generation process, etc.)
    • End Python process on queue completion (see --api-server-stop in A1111 CLI args)
    • Revert Python process to baseline initialization
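To make the Observation goal concrete, here is a minimal before/after sketch (ScriptManager is the real class; the properties shown are illustrative):

```swift
import SwiftUI
import Observation

// Before (Combine):
//
//   final class ScriptManager: ObservableObject {
//     @Published var consoleOutput: String = ""
//   }

// After (Observation framework); properties are illustrative:
@Observable
final class ScriptManager {
  var consoleOutput: String = ""
  var isRunning: Bool = false
}

// Views read plain properties directly (no @StateObject/@ObservedObject),
// and SwiftUI re-renders only views that read a property that changed.
struct ConsoleView: View {
  let scriptManager: ScriptManager
  var body: some View { Text(scriptManager.consoleOutput) }
}
```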
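Similarly, a sketch of the scheduler model with its .automatic default; the exact payload field and scheduler strings are assumptions to verify against the v1.9.0 release notes:

```swift
import Foundation

// App-side scheduler modeling. A1111 v1.9.0 exposes scheduler selection via
// the API; the payload field and scheduler strings below are assumptions.
enum Scheduler: String, Codable {
  case automatic
  case uniform
  case karras
  case exponential
}

struct Txt2ImgPayload: Codable {
  var prompt: String
  var samplerName: String
  var scheduler: Scheduler = .automatic  // default per the goal above

  enum CodingKeys: String, CodingKey {
    case prompt
    case samplerName = "sampler_name"
    case scheduler
  }
}
```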

The current setup for all stable diffusion clients is as follows: load the selected model into RAM, use that model to generate from the current prompt, and leave the model (weights, prompt dependencies, etc.) in memory until it is either overridden by another model or the process is shut down. This is beneficial for lower-end hardware, as it removes the need to reload the model on each new prompt generation, saving anywhere from 30-90s between prompts.

However, on higher-end hardware (especially the M3 Pro/Max), loading an SDXL model into RAM usually takes at most 2-3s. As a result, these clients will reserve 30-50GB of active memory for as long as the process is running, all to save this particular user a second or two of time (2-3s in worst-case scenarios). Alternatively, you can restart the Python process and reload the previous model into RAM, which results in only ~5GB of idle memory usage and adds a measly 1-2s to each generation queue.

As such, I propose two separate strategies that I plan to implement (as options) within SwiftDiffusion:

| Setup             | Idle RAM usage (SDXL) | Added time (per queue) |
|-------------------|-----------------------|------------------------|
| default (current) | 30-50GB+              | 0s                     |
| restartWithLoad   | 5-6GB                 | 1-2s                   |
| startOnQueue      | 1-2GB                 | 2-3s                   |

After a generation queue has finished successfully:

  • restartWithLoad: end Py process, start Py process, load last model into RAM
  • startOnQueue: end Py process, start Py process with no loaded model

On new generation queue:

  • restartWithLoad: make generation request
  • startOnQueue: load model into RAM, make generation request
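A sketch of how these two strategies could hang off the queue lifecycle; type and method names are illustrative, and the stop/start stubs would ultimately lean on the server-stop behavior that --api-server-stop enables:

```swift
// Sketch of the two strategies hooked into the queue lifecycle. Names are
// illustrative; stopServer/startServer/loadModel are hypothetical stubs.
enum IdleRAMStrategy {
  case `default`        // keep model resident (current behavior)
  case restartWithLoad  // ~5-6GB idle, +1-2s per queue
  case startOnQueue     // ~1-2GB idle, +2-3s per queue
}

final class PythonProcessController {
  var strategy: IdleRAMStrategy = .default
  private(set) var lastLoadedModel: String?

  // After a generation queue has finished successfully.
  func queueDidFinish() async throws {
    switch strategy {
    case .default:
      break                                 // leave model in RAM
    case .restartWithLoad:
      try await stopServer()
      try await startServer()
      try await loadModel(lastLoadedModel)  // reload last model now, not later
    case .startOnQueue:
      try await stopServer()
      try await startServer()               // baseline init, no model loaded
    }
  }

  // On a new generation queue.
  func queueWillStart() async throws {
    if strategy == .startOnQueue {
      try await loadModel(lastLoadedModel)  // load on demand, then generate
    }
  }

  // Stubs; real implementations would manage the Python process and API.
  private func stopServer() async throws {}
  private func startServer() async throws {}
  private func loadModel(_ model: String?) async throws {}
}
```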

Other Planned Improvements

  • (Maybe) Ship with self-contained stable-diffusion-webui release (exclude repository, venv)
    • Manually build repository, venv on first launch
  • Rewrite ScriptManager / PythonProcess implementations
    • Model LoadStates and GenStates as their own respective types (sketched below)
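For the LoadStates/GenStates split, something along these lines (sketched here as enums; case names are assumptions):

```swift
// Sketch: LoadState and GenState as standalone types instead of flags
// scattered through ScriptManager. Case names are assumptions.
enum LoadState {
  case idle
  case loading(model: String)
  case loaded(model: String)
  case failed(Error)
}

enum GenState {
  case idle
  case generating(progress: Double)
  case finished
  case cancelled
}
```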
