I see what you mean, and while you raise a few excellent points, you seem to forget that a human looking at mashed potatoes has far more data than a computer looking at an image.
A human gets data about smell, temperature, texture and weight in addition to a simple visual impression.
This is why I picked a book/letter example: I wanted to reduce the variables available to a human to get closer to what a computer has from a photo.
It needn’t be exact. A ballpark calorie/sugar estimate that’s 90% accurate would be sufficient. There’s some research that suggests that’s possible: https://arxiv.org/pdf/2011.01082.pdf
But what use would it be then? You wouldn’t be able to compare one potato to another; both would register the same values.
I think the use case is not people studying potatoes but people who want to lose weight and need to know how many calories are in the piece of cake that’s offered at the office cafeteria.
You are correct, but you are speaking for yourself and not, for example, for the disabled community, who may lack senses or the capacity to calculate a result. While AI is still improving its capabilities, they are the first to benefit.