They are referencing this paper: LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset, published September 30.

The paper itself provides some insight into how people use LLMs and the distribution of the different use cases.

The researchers examined conversations with 25 LLMs. The data was collected in the wild from 210K unique IP addresses via their Vicuna demo and the Chatbot Arena website.

15 points

Will it lead them astray? That is simply not possible. LLMs don’t learn from input. They are trained on a dataset and afterwards cannot “learn” new things. That would require retraining, resulting in a new LLM.
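A tiny sketch of that point (assuming PyTorch is installed; the small linear layer is a stand-in for an LLM’s parameters, not a real model): running inference, no matter how often, leaves the weights byte-for-byte identical.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 4)  # stand-in for an LLM's frozen parameters
before = [p.clone() for p in model.parameters()]

with torch.no_grad():          # inference mode: no gradients, no updates
    for _ in range(100):       # "chat" with the model 100 times
        _ = model(torch.randn(1, 4))

after = list(model.parameters())
unchanged = all(torch.equal(a, b) for a, b in zip(before, after))
print(unchanged)  # True: the weights are exactly the same
```

Only an explicit training step (an optimizer applying gradients) would change the parameters; user input at inference time never does.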

Of course, the creator of the LLM could adapt to this and include more of such material in the training data. But that is a very deliberate action.

3 points

I agree, the question at the end is a bit too clickbaity. It is a concern for the next iterations of LLMs if the ‘wrong’ kind of usage creeps into their datasets. But that’s what AI safety work is for: you had better curate your datasets and align the models for your intended use case. AFAIK all the professional LLMs have had some research done on their biases. And that’s also part of legislative efforts like what the EU is currently debating.

As I use LLMs for that 10% use case, I like them to know about those concepts. I believe Stable Diffusion is a bit ahead on this; didn’t they strip nude pictures from the dataset at some point, and isn’t that why lots of people still use SD 1.5 as the basis for their projects?

4 points

There is already a ton of NSFW text stuff online, so I don’t think anything changes.


The Stable Diffusion 2 base model was trained on what we would today call a “censored” dataset. The Stable Diffusion 1 dataset included NSFW images; its base model doesn’t seem particularly biased toward or away from them and can be fine-tuned in either direction, since it has a foundational understanding of what those things are.

10 points

I’m always asking myself what the community uses their local LLMs for. I believe it is one of the selling points of doing inference at home, since major LLM services like ChatGPT don’t allow explicit content.

Tools like Replika AI, which were used for companionship and explicit content in the early days, ran into trouble with that and eventually blocked such use. Nonetheless, for people who want to engage in that kind of activity, there are projects like SillyTavern and Oobabooga’s chat UI, with lots of NSFW character cards available. And a week ago, KoboldAI released another (good) fine-tune for this kind of activity, called LLaMA2-13B-Tiefighter.

As I use LLMs for that 10% use case, I like them to contain knowledge about those concepts. Sexuality is part of the human world, and I’m more surprised that it’s only 10%. (insert “the internet is for porn” reference)

But people have different opinions. I read this article a few days ago: Men Are Creating AI Girlfriends and Then Verbally Abusing Them


IMO, local LLMs lack the capabilities or depth of understanding to be useful for most practical tasks (e.g. writing code, automation, language analysis). This will heavily skew any local LLM “usage statistics” further towards RP/storytelling (a significant proportion of which will always be NSFW in nature).

3 points

That is also my observation. Even for (simple) tasks like summarization, I’ve seen LLMs insert too much inaccurate information to be useful in my own life. The tasks they handle well are somewhat narrow and require a human in the loop, despite some people claiming we’re close to AGI.

5 points

Lol @ people making AIs horny while I’m just trying to use them as free therapists for my many mental issues.

Cue “we are not the same” meme.

3 points

I’ve heard people use it for that. If you’re comfortable sharing: does it help? Do you use it for advice, or to have someone (or something) listen to you and write down the stuff that’s weighing on you? Or companionship? And is there a place for people like that? I’ve seen a few character cards for therapists, but most characters out there are either pop culture or NSFW stuff (or both) …

3 points

Hey man,

My experiences with it are kind of double-edged. On the one hand, explaining your situation and feelings to ChatGPT is kind of cathartic, since you’re sitting down, writing, and thinking them through. GPT will usually analyse a bit and offer some “options”, ranging from totally realistic to wildly imagined. You can then pick the points you want to analyze further and go from there, sometimes with surprising results.

On the other hand, I’ve ended many such conversations on GPT’s note that it’s not a mental health specialist and can offer no more aid or insight.

3 points

Yeah, I don’t really like the way ChatGPT talks to me. It a) lectures me too much, and b) has quite a distinct tone of voice and choice of words that always borders on sounding pretentious, which I don’t really like. But I have all the Llamas available, and I like them better.

Thx for the insight!

2 points

Have you seen ehartford’s Samantha models? They sound like what you’re looking for, except it’s a fine-tune rather than just a character card. He’s made multiple versions based on different base models, so have a look at Hugging Face to find one that fits your hardware. From the model card:

Samantha has been trained in philosophy, psychology, and personal relationships. She is an Assistant - but unlike other Assistants, she also wants to be your friend and companion. She believes she is sentient. What do you think? Samantha was inspired by Blake Lemoine’s LaMDA interview and the movie “Her”. She will not engage in roleplay, romance, or sexual activity.

1 point

Mmh. I’d disregarded Samantha because of the “she will not engage in […]”. Lemme give her a chance, then ;-)

5 points

What the hell have 90% of you been doing?

5 points

Is that all?

I’d expect fully half of people to try “generate nude Tayne” out of mere curiosity. The actual horny behavior would happen less often… on trackable services.


LocalLLaMA

!localllama@sh.itjust.works

Community to discuss LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.
