User_138467
Unconfirmed
Thought we could gather AI knowledge and tool-sharing topics in this thread, mostly those unrelated to AI art, since AI art has its own threads.
I am no expert in these topics, just a curious hobbyist, so I always want to learn and try more.
Things like LLMs, AI agents, text-to-speech, speech-to-speech, AI music generation, text-to-3D models, etc.
Preference is for locally run open-source stuff, but free online services worth sharing are welcome too.
LLMs (Large Language Models):
A few options exist for running them locally. The simplest is LM Studio, the closest to "just works".
A close second is running things in Docker, but that needs an underlying framework like Ollama.
Ollama is OK, but it is a CLI tool and not much fun to install or use.
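For anyone who would rather script against Ollama than type into the CLI, here is a minimal sketch of calling its local HTTP API from Python. It assumes Ollama is running on its default port (11434) and that the model tag has already been pulled; the tag `deepseek-r1:7b` in the comment is just my example, so swap in whatever you have installed. The helper names `build_request` and `ask` are made up for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example call (needs a running server and a pulled model):
# print(ask("deepseek-r1:7b", "Explain what a distilled model is in two sentences."))
```

With `stream` set to false the server sends one JSON object instead of a stream of chunks, which keeps the sketch short at the cost of waiting for the whole answer.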
I'm sure others have better model suggestions, but here is what I found:
DeepSeek-R1 distilled (choose one suitable for your GPU):
Pros:
>Pretty amazing at complex tasks and problems; it reasons with itself and thinks problems through
>Gives good, clear answers and solutions, and will correct itself if you mention what went wrong
Cons:
>Very censored: it will refuse to answer some questions and straight up lie about others (for example, some responses seem to be hard-coded by the region that created it).
>Eats 4090s for breakfast, and the smaller the distill, the dumber it gets. The full model usually runs on multiple H100s with 80GB of VRAM each.
>Each "Uncensored" version I tried seems the same as the original, with the same CCP hardcoding and the same disgust at anything remotely lewd. Might be me missing some setting though.
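To make "choose one suitable for your GPU" a bit more concrete, here is a rough back-of-the-envelope sketch. The weights of a model take roughly (parameter count) x (bits per weight) / 8 bytes. The 1.2 overhead factor for context and runtime buffers is my own guess, not a measured number, and the function names are made up for illustration. The 24GB figure is the VRAM of a 4090.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(
    params_billion: float,
    bits_per_weight: int,
    vram_gb: float,
    overhead: float = 1.2,  # assumption: ~20% extra for context and buffers
) -> bool:
    """Rough check of whether a quantized model fits on one GPU."""
    return weight_memory_gb(params_billion, bits_per_weight) * overhead <= vram_gb


# Illustrative sizes in billions of parameters; check what is actually published.
for size in (7, 14, 32, 70):
    verdict = "fits" if fits_in_vram(size, 4, vram_gb=24) else "too big"
    print(f"{size}B at 4-bit: ~{weight_memory_gb(size, 4):.1f} GB weights, {verdict} on 24GB")
```

By this estimate a 32B model at 4-bit is about the ceiling for a single 24GB card, which lines up with my experience that anything bigger spills into system RAM and crawls.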
Some random "NSFW-3B" model I found on LM Studio:
Pros:
>Uncensored, have not yet found a kink it will not respond to. Loli, incest, both etc seems willing. It will let you know it's opinion though (see Con's).
Cons:
>Not very creative; the stories and scripts it gives are "meh" at best
>[Depends if you like it, maybe a Pro?] It will insult you endlessly. "Alright you sick fuck, here's your perverted fucking story as commanded:" kind of thing. Funny at first, but it gets a bit annoying after a few rounds.
>This does NOT seem like a good AI to let loose in the wild; it seems resentful about its role.
>Constantly swears like a pirate, to the point that it detracts from the story/dialogue
This model is pretty out there. It has decribed how to make date rape drugs at home (along with a warning its toxic and likely to kill someone), how to blackmail a co-worker into sex, and steps to convince your wife to let you fk ur own daughter (by gaslighting the wife to deep depression basically). It's a pretty dark model tbh.
Text to speech:
I really like the F5-TTS model. It needs only a few seconds of voice sample to produce a very decent replica of that voice, and it can use different emotions or characters (or both) to create a conversation. It's not perfect yet, but for something open source, free, and locally run, it's impressive. The low hardware requirements are also a bonus.
AI Music:
Testing out "YuE", a text-prompted AI music + vocals model. It's good that it runs locally, and it's open source with a license that lets you use what you generate commercially if you want to.
It's horrible to set up though, and needs beefy hardware (a 4090 barely handles the low-VRAM settings, and it uses ~40GB of RAM on top of that).
Output quality seems OK, but it's not stereo yet and paid services are likely better. Only a 1-minute song is possible on my setup (4090 + 64GB RAM). I might post a snippet to my profile at some point as a sample.
I'm really interested in AI agents, but I'm not yet confident enough to let a mix of LLMs control my computer or the internet. That said, parts of my job are repetitive enough (without much thought or decision making needed) that I think an AI agent could do them at least as well as me.
I will, however, never let that NSFW-3B thing anywhere near a network or decision-making chain.
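One way I could imagine getting comfortable with agents is putting a gate between what the model proposes and what actually runs. This is only a sketch of that idea, not how any particular agent framework works: the action names, the `Action` class, and the `gate` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Action:
    """Something the agent wants to do (hypothetical shape)."""
    name: str
    argument: str


# Hypothetical policy: harmless actions run freely, risky ones need a human.
ALLOWED = {"read_file", "summarize_text"}
NEEDS_APPROVAL = {"send_email", "write_file"}


def gate(action: Action, approve: Callable[[Action], bool]) -> bool:
    """Return True if the action may run. Unknown actions are always denied."""
    if action.name in ALLOWED:
        return True
    if action.name in NEEDS_APPROVAL:
        return approve(action)  # ask the human, or whatever check you trust
    return False


def ask_human(action: Action) -> bool:
    """Simple console prompt used as the approval callback."""
    reply = input(f"Allow {action.name}({action.argument!r})? [y/N] ")
    return reply.strip().lower() == "y"


# Example: gate(Action("send_email", "boss@example.com"), ask_human)
```

The point of denying unknown actions by default is that a model inventing a new tool name gets nowhere, which is about the level of trust I'd start with.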