refactor: v1.9.0 + scheduler, idle RAM management, Observation rewrite #88

Open · wants to merge 1 commit into base: main
Conversation

@buzsh (Owner) commented on Apr 17, 2024

Goals

  • Model management, editor unification
    • 1-model setup: merge StoredPrompt and PromptModel, child models
    • SwiftData @Model accessors for PromptView
    • SwiftData modern store + save management

With the existing setup, implementing new interface features from the A1111 backend requires a number of moving parts, along with a handful of modifications scattered across the codebase. With this change, we look to merge all of these parts into one model. This will also bring us closer to the goal of direct API → SwiftUI translation for plugin components.
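As a rough illustration of where this is headed, here is a minimal sketch of a unified SwiftData @Model; the property names and child model are placeholders, not the actual StoredPrompt/PromptModel fields in the repo:

```swift
import SwiftData

// Minimal sketch of the unified prompt model. Property names and the child
// model are illustrative placeholders, not the repo's actual fields.
@Model
final class Prompt {
  var title: String
  var positivePrompt: String
  var negativePrompt: String
  var samplingMethod: String
  // Child model as a SwiftData relationship instead of a separate,
  // manually synced type
  @Relationship(deleteRule: .cascade) var generationSettings: GenerationSettings?

  init(title: String = "", positivePrompt: String = "",
       negativePrompt: String = "", samplingMethod: String = "DPM++ 2M") {
    self.title = title
    self.positivePrompt = positivePrompt
    self.negativePrompt = negativePrompt
    self.samplingMethod = samplingMethod
  }
}

@Model
final class GenerationSettings {
  var steps: Int
  var cfgScale: Double

  init(steps: Int = 20, cfgScale: Double = 7.0) {
    self.steps = steps
    self.cfgScale = cfgScale
  }
}
```

Since @Model classes are Observable, PromptView could then read and mutate a Prompt directly via @Query/@Bindable, with the modern store handling saves.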

  • Observation framework (see Apple docs); a migration sketch follows this list

    • Remove need for Combine: ObservableObject, @Published, ...
    • Inherently allow for improved state management
  • A1111 v1.9.0 compatibility (see stable-diffusion-webui releases)

    • Dynamic sampling methods + associated schedulers from the API
    • Implement scheduler selection with a default of .automatic (sketched after this list)
  • (Optional) Idle RAM management for powerful hardware

    • Start Python process for a queue of tasks (load model, start generation process, etc.)
    • End Python process on queue completion (see --api-server-stop in A1111 CLI args)
    • Revert Python process to baseline initialization
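To make the Observation goal concrete, here is a minimal before/after sketch (ScriptManager is the real class; the properties shown are illustrative):

```swift
import SwiftUI
import Observation

// Before (Combine):
//
//   final class ScriptManager: ObservableObject {
//     @Published var consoleOutput: String = ""
//   }

// After (Observation framework); properties are illustrative:
@Observable
final class ScriptManager {
  var consoleOutput: String = ""
  var isRunning: Bool = false
}

// Views read plain properties directly (no @StateObject/@ObservedObject),
// and SwiftUI re-renders only views that read a property that changed.
struct ConsoleView: View {
  let scriptManager: ScriptManager
  var body: some View { Text(scriptManager.consoleOutput) }
}
```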
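Similarly, a sketch of the scheduler model with its .automatic default; the exact payload field and scheduler strings are assumptions to verify against the v1.9.0 release notes:

```swift
import Foundation

// App-side scheduler modeling. A1111 v1.9.0 exposes scheduler selection via
// the API; the payload field and scheduler strings below are assumptions.
enum Scheduler: String, Codable {
  case automatic
  case uniform
  case karras
  case exponential
}

struct Txt2ImgPayload: Codable {
  var prompt: String
  var samplerName: String
  var scheduler: Scheduler = .automatic  // default per the goal above

  enum CodingKeys: String, CodingKey {
    case prompt
    case samplerName = "sampler_name"
    case scheduler
  }
}
```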

The current setup for all stable diffusion clients is as follows: load the selected model into RAM, use that model to generate from the current prompt, and leave the model (weights, prompt dependencies, etc.) in memory until it is either overridden by another model or the process is shut down. This is beneficial for lower-end hardware, as it removes the need to reload the model on each new prompt generation, saving anywhere from 30-90s between prompts.

However, on higher-end hardware (especially the M3 Pro/Max), loading an SDXL model into RAM usually takes at most 2-3s. As a result, these clients will reserve 30-50GB of active memory for as long as the process is running, all to save this particular user a second or two of time (2-3s in worst-case scenarios). Alternatively, you can restart the Python process and reload the previous model into RAM, which results in only ~5GB of idle memory usage and adds a measly 1-2s to each generation queue.

As such, I propose two separate strategies that I plan to implement (as options) within SwiftDiffusion:

| Setup             | Idle RAM usage (SDXL) | Added time (per queue) |
|-------------------|-----------------------|------------------------|
| default (current) | 30-50GB+              | 0s                     |
| restartWithLoad   | 5-6GB                 | 1-2s                   |
| startOnQueue      | 1-2GB                 | 2-3s                   |

After a generation queue has finished successfully:

  • restartWithLoad: end Py process, start Py process, load last model into RAM
  • startOnQueue: end Py process, start Py process with no loaded model

On new generation queue:

  • restartWithLoad: make generation request
  • startOnQueue: load model into RAM, make generation request
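A sketch of how these two strategies could hang off the queue lifecycle; type and method names are illustrative, and the stop/start stubs would ultimately lean on the server-stop behavior that --api-server-stop enables:

```swift
// Sketch of the two strategies hooked into the queue lifecycle. Names are
// illustrative; stopServer/startServer/loadModel are hypothetical stubs.
enum IdleRAMStrategy {
  case `default`        // keep model resident (current behavior)
  case restartWithLoad  // ~5-6GB idle, +1-2s per queue
  case startOnQueue     // ~1-2GB idle, +2-3s per queue
}

final class PythonProcessController {
  var strategy: IdleRAMStrategy = .default
  private(set) var lastLoadedModel: String?

  // After a generation queue has finished successfully.
  func queueDidFinish() async throws {
    switch strategy {
    case .default:
      break                                 // leave model in RAM
    case .restartWithLoad:
      try await stopServer()
      try await startServer()
      try await loadModel(lastLoadedModel)  // reload last model now, not later
    case .startOnQueue:
      try await stopServer()
      try await startServer()               // baseline init, no model loaded
    }
  }

  // On a new generation queue.
  func queueWillStart() async throws {
    if strategy == .startOnQueue {
      try await loadModel(lastLoadedModel)  // load on demand, then generate
    }
  }

  // Stubs; real implementations would manage the Python process and API.
  private func stopServer() async throws {}
  private func startServer() async throws {}
  private func loadModel(_ model: String?) async throws {}
}
```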

Other Planned Improvements

  • (Maybe) Ship with self-contained stable-diffusion-webui release (exclude repository, venv)
    • Manually build repository, venv on first launch
  • Rewrite ScriptManager / PythonProcess implementations
    • Model LoadStates and GenStates as their own respective types (sketched below)
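For the LoadStates/GenStates split, something along these lines (sketched here as enums; case names are assumptions):

```swift
// Sketch: LoadState and GenState as standalone types instead of flags
// scattered through ScriptManager. Case names are assumptions.
enum LoadState {
  case idle
  case loading(model: String)
  case loaded(model: String)
  case failed(Error)
}

enum GenState {
  case idle
  case generating(progress: Double)
  case finished
  case cancelled
}
```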
