• CodeInvasion@sh.itjust.works
    link
    fedilink
    English
    arrow-up
    17
    ·
    9 months ago

    This is done by combining a Diffusion model with ControlNet interface. As long as you have a decently modern Nvidia GPU and familiarity with Python and Pytorch it’s relatively simple to create your own model.

    The ControlNet paper is here: https://arxiv.org/pdf/2302.05543.pdf

    I implemented this paper back in March. It’s as simple as it is brilliant. By using methods originally intended to adapt large pre-trained language models to a specific application, the author’s created a new model architecture that can better control the output of a diffusion model.