Those are all one token each. A token can be a whole word or even a common multi-word chunk. Tokenization is usually done with Byte Pair Encoding (BPE), which is compression-inspired (same family of ideas as LZW): it merges frequently co-occurring sequences into single vocabulary entries, so a recurring phrase like “Once upon a time” can collapse into very few tokens.
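A toy version of the BPE merge loop makes the "merge frequent pairs" idea concrete. This is a simplified sketch for illustration, not any production tokenizer; the function name and the tiny corpus are made up:

```python
from collections import Counter

def byte_pair_merges(corpus, num_merges):
    """Learn BPE merges: repeatedly fuse the most frequent adjacent symbol pair."""
    vocab = Counter(corpus)  # word (tuple of symbols) -> frequency
    merges = []
    for _ in range(num_merges):
        # Count every adjacent pair across the current vocabulary.
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        # Rewrite every word with the chosen pair fused into one symbol.
        new_vocab = Counter()
        for word, freq in vocab.items():
            merged, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    merged.append(word[i] + word[i + 1])
                    i += 2
                else:
                    merged.append(word[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges, vocab

# Made-up corpus: each word repeated by its frequency.
corpus = [tuple("low")] * 5 + [tuple("lower")] * 2 + \
         [tuple("newest")] * 6 + [tuple("widest")] * 3
merges, vocab = byte_pair_merges(corpus, num_merges=4)
```

On this corpus the frequent suffix pieces get merged first ("es", then "est"), which is exactly why common strings end up as single tokens in real tokenizers.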
“Yes” is almost always followed by an explanation of a single idea, while “It depends” is followed by several possible explanations.
I really hate that LLM stuff has been bazinga’d because it’s actually really cool. It’s just not some magical solution to anything beyond finding statistical patterns
Yes, I recognize that it is very advanced statistical analysis at its core. It’s so difficult to get that concept across to people. We have a GenAI tool at work, but when I asked it a single question with easily verifiable, public data, it got it badly wrong: the structure was correct, but all of the figures were made up.