Just FYI the license in your comment doesn’t actually exist and the creative commons license it links to does not mention AI anywhere.
I’ve tried multiple things to try and help people understand why I paste that there. This is the closest I got with the least amount of comments about it.
Does it apply if you don’t say that you are posting under the license? It may be implied, the intent is reasonably clear, but an argument of ambiguity can be made. You’re merely linking to a license.
Does it apply if the link label mismatches the license? CC by-nc-sa does more than deny commercial AI training. It requires attribution, requires general non-commercial use, and requires share-alike.
Personally, I prefer when it’s at least differently formatted to indicate it as a footer and not comment content. I’ve seen them smaller and IIRC italic on other commenters, which seems more appropriate and less distracting and noisy [for human consumption]. When the comment is no longer than the license footer… well…
Given how many comments I receive in the vein of “are you seriously licensing your comment?”, I think the intent is quite clear. But since people keep asking why, I might add a blurb (since I don’t have a page I can link to) to explain that.
You are free to improve on the format if you like. Maybe I’ll see a comment of yours with a format I agree with and copy that formatting.
Well i understand its to combat ai from training on your comments right, maybe also to poison the data?
I just don’t see what taking a non relevant licensee and giving it a different name is doing to stop that. Trivial to filter stuff like this out in a dataset.
At best an individual data scraping company decides to honor it out of kindness. At worst people think that its a real license and copy it with a false sense of security.
I think the fact that you got the point from the license name and the link fulfilled the purpose of informing the reader.
As for AI training who knows how well they clean their data. Copilot spit out the entire GPL verbatim as well as a few other licenses and got sued. Data cleaning processes clearly vary among companies.
But if you have an idea on how to better indicate that content is licensed at a glance, go ahead and do it.