Forgive me if this question is a bit silly, or conceptually incorrect (I’m not fully across how the APIs work and affect ChatGPT implementations, etc.).

I’m looking for a ChatGPT (or similar) desktop app. HemulGM’s ChatGPT app is tempting, especially because it seems to include/support DALL-E and GPT-3 and -4, but I haven’t found much discussion online from users. My biggest concern is security…I chickened out before running the .exe due to Windows’ “unknown dev” warning, and because the dev’s website is a .ru address (hate to say it, but that’s a yellow flag for me in today’s cyber climate).

Does anyone have any thoughts or experience with this particular desktop app? Or better alternatives? Ideally free and open-source, but I know that may be asking a lot and I’m open to suggestions.

Appreciate it!

  • relevants@feddit.de · 1 year ago

    “Runs locally” is a very different requirement and not one you’ll likely be able to find anything for. There are smaller open-source LLMs, but if you are looking for GPT-4-level performance your device will not be able to handle it. Llama is probably your best bet, but unless you have more VRAM than any consumer GPU currently offers, you’ll have to go with smaller models, which have lower-quality output.
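    A quick back-of-envelope sketch of why (rule of thumb only: weight memory ≈ parameter count × bytes per weight, ignoring KV cache and runtime overhead):

    ```python
    # Rough rule of thumb: weight memory ≈ parameters × bytes per weight.
    # Ignores KV cache and runtime overhead, so real usage is somewhat higher.
    def weight_gb(params_billion: float, bits_per_weight: int) -> float:
        return params_billion * 1e9 * bits_per_weight / 8 / 1e9

    for size in (7, 13, 70):
        for bits in (16, 4):
            print(f"{size}B @ {bits}-bit ≈ {weight_gb(size, bits):.0f} GB")

    # A 70B model is ~140 GB at fp16 and still ~35 GB at 4-bit quantization,
    # more than the 24 GB of VRAM on top consumer GPUs, while a quantized
    # 7B or 13B model fits comfortably.
    ```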

    • snipermonkey@lemmy.world (OP) · 1 year ago

      Thank you, I realised that once I installed GPT4All! I’ve got Llama going now, and will look at upgrading my RAM to accommodate a larger model like Falcon if I feel I need it. I’ve learned a lot this morning, it’s been great!
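
      For anyone else following along, this is roughly all it takes with the gpt4all Python bindings (a minimal sketch; the model filename is just an example of a small quantized model and may not match what your install offers):

      ```python
      # Minimal sketch with the gpt4all Python package; the model filename is
      # an example of a small quantized model, not necessarily what you have.
      from gpt4all import GPT4All

      model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf")  # downloaded on first use
      with model.chat_session():
          reply = model.generate("Explain VRAM vs system RAM in one sentence.",
                                 max_tokens=100)
          print(reply)
      ```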

      • relevants@feddit.de · 1 year ago

        Do keep in mind that upgrading your regular RAM will only benefit models running on the CPU, which are far slower than models running on the GPU. So with more RAM you may be able to run bigger models, but they will also be slower by more than an order of magnitude. If you want a response within seconds, you’ll want to run the model on the GPU, where only VRAM counts.
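
        With llama.cpp (via llama-cpp-python, one common way to run GGUF models locally) the CPU/GPU split comes down to a single parameter. A minimal sketch, with a placeholder model path:

        ```python
        # Sketch with llama-cpp-python; the model path is a placeholder.
        # n_gpu_layers controls how many layers are offloaded to VRAM:
        #    0 = CPU-only (uses system RAM, slow)
        #   -1 = offload everything (needs enough VRAM, much faster;
        #        also requires a GPU-enabled build of the library)
        from llama_cpp import Llama

        llm = Llama(
            model_path="./llama-2-13b-chat.Q4_K_M.gguf",
            n_ctx=2048,
            n_gpu_layers=-1,
        )
        out = llm("Q: Why does VRAM matter for local LLMs? A:", max_tokens=64)
        print(out["choices"][0]["text"])
        ```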

        Models that perform much better at consumer scale will probably arrive in the near future, but for now it’s unfortunately still a pretty steep tradeoff, especially since cards with a lot of VRAM haven’t really been in high consumer demand and are therefore much harder to come by.