Relative newbie here, be gentle with me.

I get really good results on Perchance, but I have a hard time replicating them when running SD locally, even when I copy/paste prompts. Perchance’s are just better, especially with Disney- and anime-related subjects.

Do we know what model/checkpoint/LoRAs the image generator uses, and if so, are they publicly available to download?

  • perchance@lemmy.world (mod) · 1 year ago

    @[email protected]

    The text-to-image plugin backend server is changing often, but I’m always using popular models from civit (sort by highest rating), with the settings they recommend in their model description. You might have recently noticed some changes to results when you use ‘photo’ in your prompt - that’s because I’ve been using realistic vision (v5.1 IIRC) for some photo-related prompts. I think it’s pretty good, but a bit too high-contrast perhaps (apparently this can be fixed with loras). The goal is to have a model per popular category of things people tend to want to generate. E.g. if people start generating a lot of pixel art, I’ll add the best available pixel art model. This multi-model approach may become unnecessary once the base models are good enough to know multiple styles.
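The per-category routing described above could be as simple as a keyword-to-model lookup. A minimal sketch of that idea - the keyword list and the model names besides realistic vision are illustrative guesses, not Perchance’s actual routing table:

```python
# Hypothetical prompt-to-model router. Only "realistic-vision-v5.1" is
# mentioned in the thread; every other name and the keyword mapping itself
# are illustrative placeholders, not Perchance's real configuration.
MODEL_BY_KEYWORD = {
    "photo": "realistic-vision-v5.1",        # admin says photo prompts use this
    "pixel art": "example-pixel-art-model",  # hypothetical
}
DEFAULT_MODEL = "example-general-model"      # hypothetical fallback

def pick_model(prompt: str) -> str:
    """Return the first model whose trigger keyword appears in the prompt."""
    p = prompt.lower()
    for keyword, model in MODEL_BY_KEYWORD.items():
        if keyword in p:
            return model
    return DEFAULT_MODEL
```

A real backend would presumably be more nuanced, but this captures the “one model per popular category” shape.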

    “we could work for months without finding the right settings”

    I thought I already mentioned this in the other thread, but: I haven’t played around with loras yet, no embeddings, no fancy sampler stuff, no hi-res fix, no hidden ‘quality’ prompts (but there is hidden stuff to reduce suggestive/nudity/etc). Step count is usually less than recommended (to make it cheaper), so perchance quality is usually worse than what the settings in the model description would give you. I barely have time to test models right now, let alone use fancy settings - hence my very simple sort-by-rating strategy. I’m spending my days trying to prevent the servers from catching fire (figuratively), and watching my bank account go down (not complaining, I brought it upon myself, and there’s a path to sustainability here, so it’s fine).
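The “hidden stuff to reduce suggestive/nudity/etc” mentioned above is most plausibly server-side additions to the negative prompt. A minimal sketch of what that could look like - the specific terms here are guesses for illustration, not Perchance’s actual list:

```python
# Hypothetical sketch of server-side negative-prompt augmentation.
# The actual hidden terms Perchance uses are not public; these are
# illustrative placeholders only.
HIDDEN_NEGATIVE_TERMS = ["nsfw", "nude", "suggestive"]  # assumed, not confirmed

def build_negative_prompt(user_negative: str) -> str:
    """Append hidden safety terms to the user's negative prompt,
    skipping any terms the user already included."""
    parts = [t.strip() for t in user_negative.split(",") if t.strip()]
    for term in HIDDEN_NEGATIVE_TERMS:
        if term not in parts:
            parts.append(term)
    return ", ".join(parts)
```

For example, `build_negative_prompt("blurry, nsfw")` would yield `"blurry, nsfw, nude, suggestive"`. This also hints at why copying only the visible prompt into a local SD install gives different results.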

    If you sort by popular/rating on civit, close your eyes and click the page, and use the settings they recommend in their model description, and you’re not getting perchance-quality results (or better), then you’re doing something wrong, and you should troubleshoot on a more appropriate forum (e.g. r/stablediffusion perhaps). This forum is for people to share, get help with, and talk about building their perchance generators.

    • AlabasterEntity@lemmy.world · 1 year ago

      Thanks for sharing the model you’ve been using.

      catbat is absolutely right though, the settings and hidden details really do make or break generation. It’d be absolutely stellar if you’d kindly share the full settings used. If nothing else, it’d allow for independent optimisation that could help perchance overall down the line. Cheers :)

      • catbat@lemmy.world · 11 months ago

        Yes, I tried again and again to reproduce the perchance text2img, but there is really something going on behind the scenes. The hidden stuff makes the difference, I think. We know the model (for photo it is alternately deliberatev2 and realistic vision), and he said he uses no loras, “no embeddings, no fancy sampler stuff, no hi-res fix, no hidden ‘quality’ prompts” BUT “there is hidden stuff to reduce suggestive/nudity/etc”. I tried negative prompts like “nsfw”, “porn” and things like that, but I never get exactly the same thing as on perchance. It is very frustrating. Please @[email protected], could you help us? It would literally take you 2 minutes to explain the full settings, and we will all leave you alone about it lol. Thank you very much!!

    • catbat@lemmy.world · 1 year ago

      Thank you for your answer!

      Could you please tell us what the previous model used for photo was (before “realistic vision”, which is, imho, not as good as the previous one), along with the sampling method/steps and the hidden stuff you used to reduce suggestive/nudity/etc (I suppose these are hidden prompts, right?). This “hidden stuff” can make a big difference in the results (SD is very sensitive to small changes)! I don’t get bad results on my local automatic1111; I just want to reproduce the same results with the same prompts, so I only need this simple info (and I think the same goes for the other people who asked). Thank you!
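For anyone doing this kind of side-by-side comparison in automatic1111, one way to pin every knob down explicitly is its txt2img API (`/sdapi/v1/txt2img` is a real endpoint of the webui when launched with `--api`). The specific values below are guesses for illustration only - they are not Perchance’s actual settings:

```python
import json

# Illustrative payload for automatic1111's txt2img API (/sdapi/v1/txt2img).
# Every value here is a placeholder for comparison experiments,
# NOT Perchance's actual configuration.
payload = {
    "prompt": "photo of a lighthouse at sunset",
    "negative_prompt": "nsfw, suggestive",  # guessed hidden terms
    "steps": 20,                            # "less than recommended" per the admin
    "sampler_name": "Euler a",              # A1111 default; nothing fancy
    "cfg_scale": 7,
    "width": 512,
    "height": 512,
    "seed": 1234,                           # fix the seed so runs are comparable
}

# Sending it would look like:
#   requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
print(json.dumps(payload, indent=2))
```

Fixing the seed and varying one field at a time is the only reliable way to isolate which setting (steps, sampler, hidden negative terms) accounts for a difference in output.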