A million tiny decisions can be just as damaging. In my limited experience with several different local and cloud models you have to review basically all output as it can confidently introduce small errors. Often code will compile and run, but it has small errors that can cause output to drift, or the aforementioned long-run overflow type errors.
Those are the errors that junior or lazy coders will never notice and walk away from, causing hard to diagnose failure down the road. And the code “looks fine” so reviewers would need to really go over it with a fine toothed comb, which only happens in critical industries.
I will only use AI to write comments and documentation blocks and to get jumping off points for algorithms I don’t keep in my head. (“Write a function to sort this array”) It’s better than stack exchange for that IMO.