There may be other out-of-the-box solutions, but this setup really isn’t bad. You can find step-by-step guides for Windows in places like YouTube.
If you are at all interested in learning about software and how to get started using a command line, this would be a good place to start.
Oobabooga is well configured to make installation easy. It only involves a few commands that are unlikely to cause catastrophic errors, and all of the required steps are detailed in the README.md file. You don’t actually need to know or understand everything I described in the last message; that was just background on why the models are named like x/y/z, details I learned by making lots of mistakes. The key here is that I linked to the specific model you need and tried to explain how to choose the right file from it. If you still don’t understand, feel free to ask. Most people here remember what it was like to learn.
Originally posted this to beehaw on another account:
Oobabooga is the main GUI used to interact with models.
https://github.com/oobabooga/text-generation-webui
FYI, you need to find checkpoint models. In the chat-model space, naming can be ambiguous for a few reasons I’m not going to ramble about here. The main source of models is Hugging Face. Start with this model (or get the censored version):
https://huggingface.co/TheBloke/llama2_7b_chat_uncensored-GGML
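The “right file” question mostly comes down to the quantization tag in the filename. As a rough sketch (the filenames below are hypothetical; check the actual file list on the model page), picking by tag looks like this:

```python
# Hypothetical GGML filenames, illustrating the quantization tags you will
# see on a Hugging Face model page (real names vary by upload).
files = [
    "llama2_7b_chat_uncensored.ggmlv3.q2_K.bin",
    "llama2_7b_chat_uncensored.ggmlv3.q4_K_M.bin",
    "llama2_7b_chat_uncensored.ggmlv3.q8_0.bin",
]

def pick_quant(filenames, tag="q4_K_M"):
    """Return the first filename whose quantization tag matches.

    q4_K_M is a common middle ground: much smaller than q8_0 while
    keeping most of the quality, and noticeably better than q2_K.
    """
    for name in filenames:
        if f".{tag}." in name:
            return name
    return None

print(pick_quant(files))
```

Smaller tags (q2, q3) mean smaller downloads and less RAM, at the cost of quality; q8 is close to the original weights but nearly twice the size of q4.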
First, let’s break down the title.
As for options on the landing page or “model card”
The 7B models are roughly like having a conversation with your average teenager. Asking technical questions yielded around 50% accuracy in my experience. A 13B model got around 80% accuracy, and the 30B WizardLM is around 90-95%. I’m still working on getting a 70B running on my computer. A lot of the larger models require compiling tools from source; they won’t work directly with Oobabooga.
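Part of why the larger models are harder to run is simple arithmetic. As a rough rule of thumb (my own back-of-the-envelope estimate, not an exact figure), a quantized model file is about parameter count × bits per weight ÷ 8 bytes, and it has to fit in RAM/VRAM with room to spare:

```python
def approx_size_gb(params_billions, bits_per_weight):
    """Rough size of a quantized model in decimal GB.

    Ignores runtime overhead like the context/KV cache, so treat
    this as a lower bound on memory needed, not an exact requirement.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9

for b in (7, 13, 30, 70):
    print(f"{b}B at 4-bit: ~{approx_size_gb(b, 4):.1f} GB")
```

By this estimate a 4-bit 7B model is around 3.5 GB, while a 4-bit 70B model is around 35 GB, which is why the big ones stop fitting on consumer hardware.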
This is a general list that was shared recently (has google analytics though):
PrivateGPT is on my list to try after someone posted about it a few weeks ago with a how-to article (which embeds a view limit before a paywall) and the GitHub project repo:
What software do you want to run?
I’ve been doing a lot of research on this over the last two weeks. My machine is in the mail, so I haven’t tried anything on my own hardware yet.
For Stable Diffusion, 8 GB of VRAM is usually considered the absolute minimum, and only for very basic work. 16 GB or more of VRAM is the baseline for a decent workflow.
For AMD, I have seen multiple sources saying to avoid it, but there are a few people with working examples in the wild. Apparently AMD only officially supports the 7000-series GPUs with ROCm/HIP for AI workloads.
Officially, Stable Diffusion only supports Nvidia.
Open source project device support pages are always my first stop. If you have access to a git repo for the project, use gource to visualize who is doing what and where within the project recently. This will make it obvious what hardware the main devs are invested in the most, and therefore what will have the best support and user experience.
https://gource.io/