AI-screened eye pics diagnose childhood autism with 100% accuracy
100%? That’s a fucking lie. Nothing in life is 100%.
A convolutional neural network, a deep learning algorithm, was trained using 85% of the retinal images and symptom severity test scores to construct models to screen for ASD and ASD symptom severity. The remaining 15% of images were retained for testing.
It correctly identified 100% of the testing images. So it’s accurate.
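For context, the 85/15 holdout described above is just a random partition of the images into a training set and a held-out test set. A minimal sketch (the participant count of 958 and the seed are made up; the paper's comments below only say fewer than 1000 participants, and a real study would also stratify so the ASD/TD proportions match across the two sets):

```python
import random

def holdout_split(items, train_frac=0.85, seed=42):
    """Shuffle a copy of `items` and split it into train/test lists.

    Mirrors an 85%/15% holdout: the first 85% of the shuffled data
    is used for training, the remaining 15% is held out for testing.
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical: one image ID per participant for a cohort of 958
# (a made-up number consistent with "fewer than 1000").
train_ids, test_ids = holdout_split(range(958))
```

Note this splits on whole images; if one participant contributed multiple images, the split would have to be done per participant instead, or the test set would leak training information.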
100% accuracy is troublesome. This is literally statistics 101 stuff: they tell you in no uncertain terms to never, never trust 100% accuracy.
You can be certain to some value of p, and that number is never 0. A p of 0.001 is suspicious as fuck, but doable; 0.05 is great if you have a decent sample size.
They had fewer than 1000 participants.
I just don’t trust it. Neither should they. Neither should you. At least not until someone else recreates the experiment and also finds this AI to be 100% accurate.
What they’re saying, as far as I can tell, is that after training the model on 85% of the dataset, the model predicted whether a participant had an ASD diagnosis (as a binary choice) 100% correctly for the remaining 15%. I don’t think this is unheard of, but I’ll agree that a replication would be nice to eliminate systematic errors. If the images from the ASD and TD sets were taken with different cameras, for instance, that could introduce an invisible difference in the datasets that an AI could converge on. I would expect them to control for stuff like that, though.
Then somebody’s lying with creative application of 100% accuracy rates.
The confidence interval of the procedure you describe is not 100%.
From TFA:
For ASD screening on the test set of images, the AI could pick out the children with an ASD diagnosis with a mean area under the receiver operating characteristic (AUROC) curve of 1.00. AUROC ranges in value from 0 to 1. A model whose predictions are 100% wrong has an AUROC of 0.0; one whose predictions are 100% correct has an AUROC of 1.0, indicating that the AI’s predictions in the current study were 100% correct. There was no notable decrease in the mean AUROC, even when 95% of the least important areas of the image – those not including the optic disc – were removed.
They at least define how they get the 100% value, but I’m not an AIologist so I can’t tell if it is reasonable.
Other aspects weren’t 100%, such as identifying the severity (which was around 70%).
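For what it’s worth, the AUROC the quote describes has a concrete interpretation: it’s the probability that the model scores a randomly chosen positive (ASD) case above a randomly chosen negative (TD) case, with ties counting half. A minimal sketch with made-up scores:

```python
from itertools import product

def auroc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative
    (ties count 0.5). AUROC 1.0 means every positive is ranked above
    every negative, i.e. some threshold classifies all of them
    correctly -- the "100%" claim in the quote.
    """
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos_scores, neg_scores)
    )
    return wins / (len(pos_scores) * len(neg_scores))

# Made-up model scores, not real data:
perfect = auroc([0.9, 0.8, 0.7], [0.4, 0.3, 0.2])   # fully separated -> 1.0
imperfect = auroc([0.9, 0.4, 0.7], [0.5, 0.3, 0.2])  # one swapped pair -> < 1.0
```

Note AUROC is a ranking metric, not an accuracy at a fixed threshold, so "AUROC of 1.00" and "100% correct" coincide only because perfect ranking implies a perfect threshold exists.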
But if I gave a model pictures of dogs and traffic lights, I’d not at all be surprised if that model had a 100% success rate at determining if a test image was a dog or a traffic light.
And in the paper they discuss some of the prior research around biological differences between ASD and TD ocular development.
Replication would be nice, and I’m a bit skeptical about their choice to use age-specific models given the sample size, but nothing about this so far seems particularly unlikely to hold up in a replication.