micheal65536
Probably another case of “I don’t want people training AI on my posts/images so I’m nuking my entire online existence”.
Without knowing anything about this model or what it was trained on or how it was trained, it’s impossible to say exactly why it displays this behavior. But there is no “hidden layer” in llama.cpp that allows for “hardcoded”/“built-in” content.
It is absolutely possible for the model to “override pretty much anything in the system context”. Consider any regular “censored” model, and how any attempt at adding system instructions to change/disable this behavior is mostly ignored. This model is probably doing much the same thing except with a “built-in story” rather than a message that says “As an AI assistant, I am not able to …”.
As I say, without knowing anything more about what model this is or what the training data looked like, it’s impossible to say exactly why/how it has learned this behavior or even if it’s intentional (this could just be a side-effect of the model being trained on a small selection of specific stories, or perhaps those stories were over-represented in the training data).
IMO, local LLMs lack the capabilities or depth of understanding to be useful for most practical tasks (e.g. writing code, automation, language analysis). This will heavily skew any local LLM “usage statistics” further towards RP/storytelling (a significant proportion of which will always be NSFW in nature).
The Stable Diffusion 2 base model was trained on what we would today refer to as a “censored” dataset. The Stable Diffusion 1 dataset included NSFW images; the base model doesn’t seem particularly biased towards or away from them, and it can be further trained in either direction as it has a foundational understanding of what those things are.
So… If this doesn’t actually increase the context window or otherwise increase the amount of text that the LLM is actually able to see/process, then how is it fundamentally different to just “manually” truncating the input to fit in the context size like everyone’s already been doing?
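To make the comparison concrete, “manual” truncation is basically just this (a rough Python sketch of the idea only; the numbers and the token list are placeholders, not anything taken from the tool in question):

    # Keep only the most recent tokens that fit in the model's window,
    # optionally reserving room for the reply. Purely illustrative.
    def truncate_to_context(tokens, context_size=2048, reserve_for_output=256):
        budget = context_size - reserve_for_output
        return tokens if len(tokens) <= budget else tokens[-budget:]

Anything cleverer than that would need to actually get more tokens in front of the model, which is the part I’m not seeing here.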
I tried getting it to write out a simple melody using MIDI note numbers once. I didn’t think of asking it for LilyPond format; I couldn’t think of a text-based format for music notation at the time.
It was able to produce a mostly accurate output for a few popular children’s songs. It was also able to “improvise” a short blues riff (mostly keeping to the correct scale, and showing some awareness of/reference to common blues themes), and write an “answer” phrase (which was suitable and made musical sense) to a prompt phrase that I provided.
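For reference (my own illustration of the two formats, not the model’s actual output), the opening of “Twinkle, Twinkle, Little Star” comes out like this:

    # Opening phrase as MIDI note numbers (middle C = 60):
    twinkle = [60, 60, 67, 67, 69, 69, 67]   # C C G G A A G
    # Roughly the same phrase in LilyPond notation: { c'4 c' g' g' a' a' g'2 }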
To be honest, the same could be said of LLaMa/Facebook (which doesn’t particularly claim to be “open”, but I don’t see many people criticising Facebook for doing a potential future marketing “bait and switch” with their LLMs).
They’re only giving these away for free because they aren’t commercially viable. If anyone actually develops a leading-edge LLM, I doubt they will be giving it away for free regardless of their prior “ethics”.
And the chance of a leading-edge LLM being developed by someone other than a company with prior plans to market it commercially is quite small, as they wouldn’t attract the same funding to cover the development costs.
IMO the availability of the dataset is less important than the model, especially if the model is under a license that allows fairly unrestricted use.
Datasets aren’t useful to most people, and publishing one carries more risk of a lawsuit or of being ripped off by a competitor than publishing the model does. Publishing a dataset containing copyrighted content is legally grey at best, while the verdict is still out on a model trained on that dataset, and the model also carries with it some short-term plausible deniability.
Yeah, I think you need to set the contextsize and ropeconfig. Documentation isn’t completely clear and in some places sort of implies that these should be autodetected based on the model when using a recent version, but the first thing I would try is setting them explicitly, as this definitely looks like an encoding issue.
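For example, something along these lines (the flag names are koboldcpp’s; the model filename and the exact values are placeholders you’d adjust for your particular 8k model, so treat this as a starting point rather than known-good settings):

    python koboldcpp.py --model your-8k-model.bin --contextsize 8192 --ropeconfig 0.25 10000

If I remember the order correctly, the two ropeconfig values are the linear scale factor followed by the rope base frequency.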
I would guess that this is possibly an issue due to the model being a “SuperHOT” model. This affects the way that the context is encoded and if the software that uses the model isn’t set up correctly for it you will get issues such as repeated output or incoherent rambling with words that are only vaguely related to the topic.
Unfortunately I haven’t used these models myself so I don’t have any personal experience here, but hopefully this is a starting point for your searches. Check out the contextsize and ropeconfig parameters. If you are using the wrong context size or scaling factor then you will get incorrect results.
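(As a rough rule of thumb for the linearly-scaled “SuperHOT” models: scale = original context / extended context, so an 8k model built on a 2048-token base would want a scale of 2048 / 8192 = 0.25, with the rope base left at its usual 10000. I haven’t checked this against your specific model, so double-check it against whatever the model card says.)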
It might help if you posted a screenshot of your model settings (the screenshot that you posted is of your sampler settings). I’m not sure if you configured this in the GUI or if the only model settings that you have are the command-line ones (which are all defaults and probably not correct for an 8k model).