But the actions taken by the model in the virtual environments can always be described as discrete steps.
That’s technically correct, but practically useless information. Neural networks are stochastic by design, and while Turing machines are technically deterministic, most operating systems’ random number generators will try to introduce noise from the environment (current time, input devices data, temperature readings, etc). So because of that randomness, those discrete steps you’d have to walk through would require knowing intimate details of the environment that the PC was in at precisely the time it ran, which isn’t stored. And even if it was or you used a deterministic psuedo-random number generator, you’d still essentially be stuck reverse engineering the world’s worse spaghetti code written entirely in huge matrix multiplications, code that we already know can’t possibly be optimal anyway.
If a software needs guaranteed optimality, then a neural network (or any stochastic algorithm) is simply the wrong tool for the job. No need to shove a square peg in a round hole.
Also I can’t speak for AI devs, in fact I’ve only taken an applied neural networks course myself, but I can tell you that computer architecture was like a prerequisite of a prerequisite of a prerequisite of that course.