China has released a set of guidelines on labeling internet content that is generated or composed by artificial intelligence (AI) technology, which are set to take effect on Sept. 1.
This is a smart and ethical way to bring AI into everyday use, though I hope the watermarks are not easily removed.
Think a layer deeper about how it can be misused to control narratives.
You read some wild allegation with no AI marks (they’re required to be visible), so it must have been written by someone, right? What if someone, even the government, jumps out and says someone used an illegal AI to generate the text? The question suddenly shifts from verifying whether what the allegation describes actually happened to whether the allegation itself is real. Public sentiment will likely be overwhelmed by “Is this fake news?” or “Is the allegation true?” Compound that with trusted entities, and discrediting anything becomes easier.
Let me give you a real example. Before Covid spread globally, there was a Chinese whistleblower who worked in a hospital and got infected. He posted a video online about how bad it was, and it was quickly taken down by the government. What if that happened today with the regulation in full force? The government can claim it is AI generated. The whistleblower doesn’t exist, nor is the content real. Three days later, they arrest a guy, claiming he spread fake news using AI. They already have a very efficient way to control narratives, and this piece of garbage just gives them an expressway.
You thought that’s only a China thing? No, every entity, governments included, is watching, especially the self-proclaimed friend of Putin and Xi, and the absolute free speech lover. Don’t think it is too far away to reach you.
It’s still a good thing. The alternative is people posting AI content as though it is real content, which is a worldwide problem destroying entire industries. All AI content should by law have to be clearly labeled.
Then what does unlabeled AI-generated slop look like to the naked eye? The label just encourages lazy thinking by acting as an “easy filter.” Slop without a label gets elevated to seeming somewhat real, because the label’s existence exploits that laziness.
Before you say that some AI slop is clearly identifiable: you can’t assume everyone can identify it, or that every piece is that identifiable. And for images that look a little unrealistic, just decrease the resolution until they are very grainy to hide those details. That will work 9 times out of 10. You can’t rule out that the 0.1% of content that passes a sanity check does 99.9% of the damage.
After all, humans are emotional creatures, and sensationalism is real. The urge to share something emotional is why misinformation and disinformation are so common these days. People will overlook details when the urge hits.
Sometimes, labeling can do more harm than good. It just gives a false sense of security.
It will be relatively easy to strip that stuff off. It might help a little bit with internet searches or whatever, but anyone spreading deepfakes will probably not be stopped by that. Still better than nothing, I guess.
it will be relatively easy to strip off
How so? If it’s anything like LLM text-based “watermarks,” the watermark is an integral part of the output. For an LLM it’s about down-weighting certain words in the output. I’m guessing for photos you could do the same with certain colors, so if this variation of teal shows up more than that variation, then it’s made by AI.
I guess the difference with images is that since you’re not doing the “guess the next word” aspect and feeding the output from the previous step into the next one, you can’t generate the red/green list from the previous output.
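For anyone who hasn't seen the red/green list idea, here is a toy sketch of it in Python. Everything in it is made up for illustration: the 50-word vocabulary, the `is_green` hash rule, and the random "model" are assumptions, and real schemes only nudge probabilities toward green words rather than banning red ones outright.

```python
import hashlib
import math
import random

VOCAB = [f"w{i}" for i in range(50)]  # hypothetical toy vocabulary
GREEN_FRACTION = 0.5                  # assumed share of words marked green


def is_green(prev_token: str, token: str) -> bool:
    """Pseudo-randomly put `token` on the green list, keyed by the previous token."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] < 256 * GREEN_FRACTION


def generate(n: int, watermark: bool, seed: int = 0) -> list[str]:
    """Stand-in for a model: sample random words, optionally keeping only green ones."""
    rng = random.Random(seed)
    tokens = ["<s>"]
    while len(tokens) <= n:
        candidate = rng.choice(VOCAB)
        if not watermark or is_green(tokens[-1], candidate):
            tokens.append(candidate)
    return tokens[1:]


def z_score(tokens: list[str]) -> float:
    """How far the green count sits above what unwatermarked text would give."""
    prev = ["<s>"] + tokens[:-1]
    hits = sum(is_green(p, t) for p, t in zip(prev, tokens))
    n = len(tokens)
    expected = GREEN_FRACTION * n
    return (hits - expected) / math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))


print("watermarked:", z_score(generate(200, watermark=True)))
print("plain:      ", z_score(generate(200, watermark=False)))
```

The detector never sees the model, only the text and the hash rule. That is also why heavy paraphrasing weakens it: swap enough words and the green count drifts back toward chance.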
You can use things like steganography to embed data into the AI output.
Imagine a text has certain letters in certain places which can give you a probability rating that it’s AI generated, or errant pixels of certain colors.
Printers already do something like this, printing imperceptible dots on pages.
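A minimal sketch of the "errant pixels" version, assuming the image is just a flat list of 8-bit grayscale values. The `embed` and `extract` helpers are hypothetical names, and this is plain least-significant-bit steganography, not what any particular generator actually ships.

```python
def embed(pixels: list[int], message: bytes) -> list[int]:
    """Hide `message` in the lowest bit of the first len(message) * 8 pixels."""
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message too long for this image")
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit  # clear the low bit, then set it to the message bit
    return out


def extract(pixels: list[int], n_bytes: int) -> bytes:
    """Read `n_bytes` back out of the low bits, most significant bit first."""
    data = bytearray()
    for b in range(n_bytes):
        byte = 0
        for bit_index in range(8):
            byte = (byte << 1) | (pixels[b * 8 + bit_index] & 1)
        data.append(byte)
    return bytes(data)


image = [120] * 64                      # toy 8x8 gray image
marked = embed(image, b"AI")
print(extract(marked, 2))               # recovers the hidden bytes
print(max(abs(a - b) for a, b in zip(image, marked)))  # each pixel moves by at most 1

stripped = [p & ~1 for p in marked]     # zero every low bit
print(extract(stripped, 2))             # the mark is gone
```

Each pixel changes by at most one brightness level, so you can't see it. The last three lines are the catch: a mark this naive disappears as soon as the low bits get rewritten, which is roughly what resizing or lossy compression does.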
I’m going to develop a new AI designed to remove watermarks from AI generated content. I’m still looking for investors if you’re interested! You could get in on the ground floor!
Will be interesting to see how they actually plan on controlling this. It seems unenforceable to me as long as people can generate images locally.
That’s what they want. When people are doing it locally, they can discredit anything as AI generated. The point isn’t enforceability, but whether it can be a tool to control the narrative.
Edit: it doesn’t matter whether people are actually generating locally, only whether they possibly could be. As long as it is plausible, the argument stands and the loop completes.
It’s not like this wasn’t always the issue.
Anything and everything can be labelled as misinformation.
As an exception to most regulations that we hear about from China, this approach actually seems well considered - something that might benefit people and work.
Similar regulations should be considered by other countries. Labeling generated content at the source, hopefully without the metadata being too extensive (this is where China might go overboard), would help avoid at least two things:
- casual deception
- training AI with material generated by another AI, leading to degradation of ability to generate realistic content
Stable Diffusion has the option to include an invisible watermark. I saw this in the settings when I was running it locally. It adds something like a pattern that is easy for machines to detect but impossible to see. The idea was that you could check an image for it before putting it into training sets. Because I never needed to lie about things I generated, I left it on.
That’s something that was really needed.