
  • Personalizing text-to-image diffusion models such as Stable Diffusion XL (SDXL) often runs into catastrophic forgetting, overfitting, or high computational cost.
  • We propose a pipeline that fine-tunes diffusion models with low-rank adaptation (LoRA) using only 3 to 4 images of the subject.

1. Generative models

A generative model could generate new photos of animals that look like real animals, while a discriminative model could tell a dog from a cat.

2. Diffusion models

A diffusion model is a generative AI model that creates synthetic data, such as images. During training, noise is gradually added to existing data and the model learns to reverse this process; running the learned reversal then reconstructs data or generates new data.
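The noising step and its reversal can be sketched in a few lines. This is a minimal illustration, not the project's training code: the linear noise schedule, the step count, and the names `make_alpha_bar`, `add_noise`, and `estimate_x0` are assumptions chosen for the example, and the true noise stands in for what a trained network would predict.

```python
import math
import random

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative fraction of signal kept at each step (linear schedule, an assumed choice)."""
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar

def add_noise(x0, t, alpha_bar, rng):
    """Forward process: blend clean data with Gaussian noise at step t."""
    signal = math.sqrt(alpha_bar[t])
    sigma = math.sqrt(1.0 - alpha_bar[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * x + sigma * n for x, n in zip(x0, noise)]
    return xt, noise

def estimate_x0(xt, predicted_noise, t, alpha_bar):
    """Reverse direction: recover the clean data given an estimate of the noise."""
    signal = math.sqrt(alpha_bar[t])
    sigma = math.sqrt(1.0 - alpha_bar[t])
    return [(x - sigma * n) / signal for x, n in zip(xt, predicted_noise)]

rng = random.Random(0)
alpha_bar = make_alpha_bar()
x0 = [rng.uniform(-1.0, 1.0) for _ in range(16)]  # stand-in for pixel values
xt, noise = add_noise(x0, t=500, alpha_bar=alpha_bar, rng=rng)
# A trained network would predict the noise from xt; the true noise is used here.
x0_hat = estimate_x0(xt, noise, 500, alpha_bar)
```

With a perfect noise estimate the original data is recovered exactly; training a diffusion model amounts to making that estimate as accurate as possible at every step.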

3. DreamBooth

DreamBooth is a fine-tuning technique that personalizes text-to-image diffusion models using just a few images of a subject.

4. Low-Rank Adaptation (LoRA)

LoRA is a parameter-efficient fine-tuning method that injects trainable low-rank matrices into a frozen pre-trained model.
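As a rough sketch of the idea, the frozen weight matrix W is left untouched and a product of two small matrices, B and A, is added on top of it. The dimensions, the `alpha / rank` scaling, and the zero initialization of B follow the conventions of the original LoRA formulation; they are assumptions for illustration and not settings taken from this project. The helper names `matmul` and `lora_weight` are hypothetical.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_weight(w, a, b, alpha, rank):
    """Effective weight W + (alpha / rank) * (B @ A); W itself is never modified."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

rng = random.Random(0)
d_out, d_in, rank, alpha = 6, 8, 2, 4.0  # toy sizes, assumed for the example

W = [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_out)]  # frozen
A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]  # trainable, rank x d_in
B = [[0.0] * rank for _ in range(d_out)]                                # trainable, d_out x rank

W_eff = lora_weight(W, A, B, alpha, rank)  # equals W while B is still all zeros

full_params = d_out * d_in           # parameters touched by full fine-tuning
lora_params = rank * (d_in + d_out)  # parameters trained with LoRA
```

Because B starts at zero, the adapted model initially behaves exactly like the pre-trained one. The saving grows with layer size: the trainable count scales with `rank * (d_in + d_out)` rather than `d_in * d_out`.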

5. Our pipeline

In this project, we developed a two-stage pipeline: LoRA-based fine-tuning of SDXL's attention layers, followed by a segmentation-driven Img2Img step that inserts the personalized subject while preserving the model's generative capabilities.
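The details of the segmentation-driven step are not given here. As one hedged illustration of how a segmentation mask can confine changes to the subject region, the sketch below blends a generated image into a background pixel by pixel. The function `composite` and the toy grayscale values are hypothetical, and the project's actual mechanism may differ.

```python
def composite(background, generated, mask):
    """Per-pixel blend: mask=1 takes the generated pixel, mask=0 keeps the background."""
    return [
        [m * g + (1.0 - m) * b for b, g, m in zip(b_row, g_row, m_row)]
        for b_row, g_row, m_row in zip(background, generated, mask)
    ]

# Toy 2x3 grayscale images; a real pipeline would use full-resolution arrays.
background = [[0.2, 0.2, 0.2], [0.2, 0.2, 0.2]]
generated = [[0.9, 0.9, 0.9], [0.9, 0.9, 0.9]]
mask = [[0.0, 1.0, 1.0], [0.0, 0.0, 1.0]]  # 1.0 marks the segmented subject region

result = composite(background, generated, mask)
```

Pixels outside the mask are left untouched, which is the property that lets the rest of the scene come from the base model unchanged.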
