74 points

Google is planning to roll out a technology that will identify whether a photo was taken with a camera, edited by software like Photoshop, or produced by generative AI models.

So they are going to use AI to detect AI. That should not present any problems.

23 points

They’re going to use AI to train AI*

So nothing new here

9 points

Use AI to train AI to detect AI, got it.

8 points

Yes, it’s called a GAN and has been a fundamental technique in ML for years.

23 points

You may be able to prove that a photo with certain metadata was taken by a camera (my understanding is that that’s the method), but you can’t prove that a photo without it wasn’t, because older cameras won’t have the necessary support, and wiping metadata is trivial anyway. So is it better to have more false negatives than false positives? Maybe. My suspicion is that it won’t make much difference to most people.

11 points

A fair few sites will also wipe image/EXIF metadata for safety reasons, since photo metadata can include things like the location where the photo was taken.

7 points

Even if you assume the images you care about have this metadata, all it takes is a hacked camera (which could be as simple as carefully taking a photo of your AI-generated image) to fake authenticity.

And the vast majority of images you see online are heavily compressed, so they aren’t the 6MB+ digitally signed raw files that this scheme would rely on.

4 points

You don’t even need a hacked camera to edit the metadata; you just need exiftool.

0 points

It’s not that simple. It’s not just a “this is or isn’t AI” boolean in the metadata. You hash the image, then sign the hash with a digital signature key. The signature will be invalid if the image has been tampered with, and you can’t make a new signature without the signing key.

Once the image is signed, you can’t tamper with it and get away with it.
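
The hash-then-sign flow described above can be sketched roughly as follows. This is only an illustration under loud assumptions: it uses an HMAC from the Python standard library as a symmetric stand-in, whereas a real scheme would use an asymmetric signature so that verifiers never hold the secret. `CAMERA_KEY`, `sign_image` and `verify_image` are hypothetical names, not part of any actual camera or Google API.

```python
import hashlib
import hmac

# Hypothetical secret that would live inside the camera's secure hardware.
CAMERA_KEY = b"hypothetical-camera-secret"


def sign_image(image_bytes: bytes, key: bytes = CAMERA_KEY) -> bytes:
    """Hash the image, then authenticate the hash with the key."""
    digest = hashlib.sha256(image_bytes).digest()
    # Stand-in for a real digital signature: HMAC over the image hash.
    return hmac.new(key, digest, hashlib.sha256).digest()


def verify_image(image_bytes: bytes, signature: bytes,
                 key: bytes = CAMERA_KEY) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = sign_image(image_bytes, key)
    return hmac.compare_digest(expected, signature)


original = b"fake-image-bytes-for-illustration"
signature = sign_image(original)
tampered = original + b"!"  # any change to the bytes alters the hash
```

Here `verify_image(original, signature)` succeeds, while `verify_image(tampered, signature)` fails, and a tag produced with a different key does not verify either. None of this addresses the weak point raised next: whatever reaches the signing step gets signed, real or not.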

The vulnerability is, how do you ensure an image isn’t faked before it gets to the signature part? On some level, I think this is a fundamentally unsolvable problem. But there may be ways to make it practically impossible to fake, at least for the average user without highly advanced resources.

22 points

looks dubious

The problem here is that if this is unreliable (and I’m skeptical that Google can produce a system that will work across the board), then you have a synthesized image that Google now attests to be non-synthetic.

Maybe they can make it clear that this is a best-effort system, and that they will only flag some of them.

There are a limited number of ways that I’m aware of to detect whether an image is edited.

  • If the image has been previously compressed via lossy compression, there are ways to modify the image to make the difference in artifacts in different points of the image more visible, or – I’m sure – statistically look for such artifacts.

  • If an image has been previously indexed by something like Google Images and Google has an index sufficient to permit Google to do fuzzy search for portions of the image, then they can identify an edited image because they can find the original.

  • It’s possible to try to identify light sources based on shading and specular highlights in an image, and try to find points of the image that don’t match. There are complexities to this; for example, a surface might simply be shaded in such a way that it looks like light is shining on it, like if you have a realistic poster on a wall. For generation rather than photomanipulation, better generative AI systems will also probably tend to make this go away as they improve; it’s a flaw in the image.
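
To make the fuzzy-search idea in the second bullet a bit more concrete, here is a minimal sketch of a perceptual "average hash". It is an assumption-laden toy, not how Google Images works: it only matches near-duplicate whole images, not portions, and it takes a plain 2D list of grayscale values rather than a decoded image file. `average_hash` and `hamming` are illustrative names.

```python
def average_hash(gray, size=8):
    """Return a size*size-bit fingerprint of a 2D grayscale image.

    gray is a list of rows of 0-255 values, at least size x size.
    """
    h, w = len(gray), len(gray[0])
    cells = []
    for by in range(size):
        for bx in range(size):
            # Average the block of pixels that maps to this cell.
            y0, y1 = by * h // size, (by + 1) * h // size
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [gray[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for c in cells:
        # One bit per cell: brighter than the overall mean or not.
        bits = (bits << 1) | (1 if c > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


# Synthetic 16x16 test images: a horizontal gradient, a slightly
# brightened copy of it, and an unrelated vertical gradient.
horizontal = [[x * 16 for x in range(16)] for _ in range(16)]
brightened = [[v + 3 for v in row] for row in horizontal]
vertical = [[y * 16 for _ in range(16)] for y in range(16)]
```

The brightened copy hashes to the same fingerprint as the original, while the unrelated image lands many bits away, so a small Hamming distance is a hint that two images share an origin. An index of such fingerprints only helps if the original was crawled in the first place.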

But none of these is a surefire mechanism.

For AI-generated images, my guess is that there are some other routes.

  • Some images are going to have metadata attached. That’s trivial to strip, so not very good if someone is actually trying to fool people.

  • Maybe some generative AIs will try doing digital watermarks. I’m not very bullish on this approach. Watermarks are a little harder to remove than metadata, but invariably, any kind of lossy compression is at odds with watermarks that aren’t very visible. As lossy compression gets better, either it automatically tends to strip watermarks (because lossy compression tries to remove data that doesn’t noticeably alter an image, and watermarks rely on hiding data there) or watermarks have to visibly alter the image. And that’s before people actively develop tools to strip them. And you’re never gonna get all the generative AIs out there adding digital watermarks.

  • I don’t know what the right terminology is, but my guess is that latent diffusion models try to approach a minimum error for some model during the iteration process. If you have a copy of the model used to generate the image, you can probably measure the error from what the model would predict – basically, how much one iteration would change an image or part of it. I’d guess that that only works well if you have a copy of the model in question or a model similar to it.
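
The tension between invisible watermarks and lossy compression from the second bullet can be shown with a toy example. This is a deliberately naive sketch: the watermark lives in the least significant bit of each pixel value, and "compression" is simulated by snapping values down to multiples of a step. Real watermarking and real codecs are far more sophisticated; `embed_lsb`, `extract_lsb` and `quantize` are hypothetical helpers.

```python
def embed_lsb(pixels, bits):
    """Hide one watermark bit in the lowest bit of each leading pixel."""
    out = list(pixels)
    for i, b in enumerate(bits):
        out[i] = (out[i] & ~1) | b
    return out


def extract_lsb(pixels, n):
    """Read back the lowest bit of the first n pixels."""
    return [p & 1 for p in pixels[:n]]


def quantize(pixels, step=4):
    """Crude stand-in for lossy compression: drop fine detail."""
    return [(p // step) * step for p in pixels]


pixels = [120, 121, 122, 123, 200, 201, 202, 203]
watermark = [1, 0, 1, 1, 0, 1, 0, 0]

marked = embed_lsb(pixels, watermark)   # changes each value by at most 1
compressed = quantize(marked)           # low bits are thrown away

survived = extract_lsb(marked, len(watermark)) == watermark
after_compression = extract_lsb(compressed, len(watermark))
```

Before quantization the watermark reads back intact; afterwards every low bit is zero and the mark is gone, even though the image values moved by only a few levels. A more robust watermark would have to live in data the compressor keeps, which is exactly the data a viewer can see.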

I don’t think that any of those are likely surefire mechanisms either.

3 points

The problem here is that if this is unreliable…

And the problem if it is reliable is that everyone becomes dependent on Google to literally define reality.

1 point

Fun fact about AI products (or any gold rush economy): they don’t have to work. They just have to sell.

I mean, this is generally true about anything, but it’s particularly bad in these situations. P.T. Barnum had a few thoughts on this as well.

-2 points

I guess this would be a good reason to include some EXIF data when images are hosted on websites. From my limited understanding, it’s one of the only ways to tell whether an image is genuine.

3 points

Exif data can be faked.

-3 points

I guess, but the original image would be somewhere for Google to scrape, compare against, and find an earlier version. That’s why you don’t just look at the single image; you scrape multiple sites looking for others as well.

There are obviously very specific use cases that can take advantage of brand new images that are created on a computer, but there are still ways of detecting those with other methods, as explained by the user I responded to.

1 point

No, the default should be removing everything but maybe the date because of privacy implications.

-3 points

include some EXIF data

That’s what I said.

Date, device, edited. That can all be included; location doesn’t need to be.

11 points

Lol, knowing the post-processing done by your iPhone, this whole thing sounds like an actual joke. Does no one remember the fake moon incident? Your photos have been AI-generated for years and no one noticed. No algorithm on earth could tell the difference between a phone photo and an AI photo, because they are the same thing.

2 points

Are you saying the moon landing was faked or did I miss something?

6 points

You absolutely missed everything. The moon is fake, literally… When you take a picture of the moon, your camera uses AI photo manipulation to change your garbage picture into a completely AI-generated image, because taking pictures of the moon is actually pretty difficult. It makes pictures look much better, and in 99% of cases that is an improvement, but not in edge cases like trying to take a picture of something passing in front of the moon, such as the ISS or a cloud. It may also cause issues if you try to introduce your photos in court, because everything you take is inherently doctored.

1 point

Huh. I thought that was just based on promo “Space zoom” photos from Samsung and it never made it into the wild.

5 points

It’s of course troubling that AI images will go unidentified through this service (I am also not at all confident that Google can do this well/consistently).

However, I’m also worried about the opposite side of this problem: real images being mislabeled as AI. I can see a lot of bad actors using that to discredit legitimate news sources or stories that don’t fit their narrative.
