As far as I understand (I haven’t had much experience with high frequencies), it depends on the amount of energy you are willing to push.
A smaller band gap and higher mobility mean less energy is needed to switch, but also higher sensitivity to random noise. If you want to “transmit” a high-frequency signal, you want a larger signal eye, which, for a single signal, you can get by increasing the energy. However, if you want a few billion transistors, each one with “its own signal”, then you want to reduce energy usage first and think about ways to shield them from the noise later.
For reference, a modern CPU running at 0.6V with a 50W TDP could potentially be pulling around 80A internally (50W / 0.6V ≈ 83A). Halving that voltage would, since switching power scales roughly with the square of the voltage, directly allow 4 times more simultaneously active transistors at the same TDP. Having them also switch 20 times faster, if you managed to deal with the noise, would pave the way to 80 times faster multiprocessors for a similar power budget.
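The arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope check, assuming the quantity being halved is the supply voltage and that switching energy scales with the square of the voltage (the usual CV² approximation). The function names are made up for illustration, and the final 80x figure simply takes the premise that the 20x faster switching fits in the same power budget.

```python
def supply_current(tdp_watts, vdd_volts):
    """Current drawn if the whole TDP is delivered at the core voltage (I = P / V)."""
    return tdp_watts / vdd_volts


def relative_switch_energy(voltage_scale):
    """Switching energy relative to baseline, ASSUMING E is proportional to V^2."""
    return voltage_scale ** 2


current = supply_current(50.0, 0.6)         # roughly 83 A
energy = relative_switch_energy(0.5)        # fraction of the baseline switching energy
active_multiplier = 1.0 / energy            # more active transistors at the same TDP
combined_speedup = active_multiplier * 20   # premise: 20x faster switching, same budget

print(f"{current:.1f} A, {active_multiplier:.0f}x active, {combined_speedup:.0f}x combined")
```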
The good news is, if they use less energy, then they also emit less noise, which decays with the square of the distance. So half the emitted noise, even with twice the noise susceptibility, could let you pack them twice as close, meaning 4x as many per area, or 8x as many per volume.
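The packing figures follow from plain geometry, which can be checked with a tiny sketch. It only covers the step from "twice as close" to the density gain, not the noise budget itself, and `density_gain` is a hypothetical helper name.

```python
def density_gain(spacing_scale, dimensions):
    """Factor by which the device count grows when the spacing between
    devices is multiplied by spacing_scale in the given number of dimensions."""
    return (1.0 / spacing_scale) ** dimensions


per_area = density_gain(0.5, 2)     # half the spacing on a 2D die
per_volume = density_gain(0.5, 3)   # half the spacing in a 3D stack

print(per_area, per_volume)
```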