Even if you have encrypted your traffic with a VPN (or the Tor Network), advanced traffic analysis is a growing threat against your privacy. Therefore, we now introduce DAITA.
Through constant packet sizes, random background traffic and data pattern distortion we are taking the first step in our battle against sophisticated traffic analysis.
I think you both are talking past each other. You said “But if nobody else is using those same endpoints.” but @MigratingtoLemmy@lemmy.world said “There’s plenty of people who are going to be renting VPSes and will have their traffic originate from the same IP range as mine”. Reading this thread, it seems like you both have different network setups in mind.
Thanks for pointing that out. I tried to address that when I responded about netflow analysis. Having the same IP range as other people does not let you hide in the crowd, because netflow data identifies the exact IPs.
Hypothetically, what if everybody in the world were using mixnets to obfuscate destination/origin, and then Mullvad’s DAITA to obfuscate traffic timing and size? Would netflow analysis be able to defeat that?
What is a mixnet? Something like Tor? An onion overlay network where the routing goes through multiple hops before it exits the network?
Let’s go through a few scenarios first.
Scenario A: you have a link to a common VPN endpoint that other people also use. On this link you generate a consistent 1 megabyte per second of traffic, up and down, whether or not you have real data to send.
There is now ambiguity about which of the traffic entering and leaving the VPN is yours. An outside observer would not be able to deduce which traffic is yours just by size and timing.
This is the gold standard. You remove all possible signal data.
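To make Scenario A concrete, here is a rough sketch of what a constant-rate link looks like to an observer. This is only a toy model, not how any particular VPN implements it: the function name `shape`, the cell size, and the cells-per-tick value are all made up for illustration (scaled down from the 1 megabyte per second above).

```python
CELL_SIZE = 1000      # bytes per cell on the wire (assumed value for illustration)
CELLS_PER_TICK = 10   # cells sent every tick, busy or idle (assumed value)


def shape(payloads, ticks):
    """Simulate a constant-rate link for `ticks` time steps.

    payloads: dict mapping tick number -> real bytes the user queues at that tick.
    Returns (trace, real_sent, backlog), where trace is what an observer sees:
    one list of cell sizes per tick.
    """
    capacity = CELL_SIZE * CELLS_PER_TICK
    backlog = 0
    real_sent = 0
    trace = []
    for t in range(ticks):
        backlog += payloads.get(t, 0)
        sent = min(backlog, capacity)   # real bytes that fit into this tick
        backlog -= sent
        real_sent += sent
        # The wire always carries the same number of same-sized cells.
        # Whatever space the real data does not fill is padding.
        trace.append([CELL_SIZE] * CELLS_PER_TICK)
    return trace, real_sent, backlog


idle_trace, _, _ = shape({}, 5)
busy_trace, _, _ = shape({0: 3000, 2: 500}, 5)
print(idle_trace == busy_trace)  # the observer cannot tell idle from busy
```

The trade-off shows up in `backlog`: if you queue more than the fixed rate allows, the data waits for later ticks instead of changing the shape of the traffic.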
Scenario B: everyone is using an onion overlay network, and their traffic has a little padding and a little extra delay added at every link. This would reduce the probability that an outside observer could deduce the entire end-to-end flow of your traffic. But the type of your traffic could defeat whatever level of obscuring is happening. Imagine you have a real-time connection to a network, and you’re typing out Morse code (… - - - - sort of thing). Imagine each of those packets has a different size. If I observe the network for long enough, I’m going to notice packets with that Morse-code-like timing and size going through the onion network. There will be some ambiguity, but enough traffic over enough time would give me high confidence that you’re the source. That is because the extra obscuring traffic has a probability, but not a guarantee, of masking the shape of your traffic.
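Here is a toy simulation of why "enough traffic over enough time" matters in Scenario B. It is a sketch under heavy assumptions, not a model of Tor or DAITA: traffic is reduced to packet counts per time bin, the cover traffic is 0 to 3 random extra packets per bin, and the observer simply correlates the exit trace with every user's entry trace. The names `pearson` and `simulate` are hypothetical.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n_bins, n_users=6, seed=0):
    """Return one correlation score per user; user 0 is the real source."""
    rng = random.Random(seed)
    # The source's real traffic: a Morse-like on/off pattern, one value per time bin.
    pattern = [rng.randint(0, 1) for _ in range(n_bins)]
    entry_traces = []
    for user in range(n_users):
        # Every other user sends an unrelated on/off pattern of their own.
        real = pattern if user == 0 else [rng.randint(0, 1) for _ in range(n_bins)]
        # What the observer sees entering the network: real packets plus random cover.
        entry_traces.append([r + rng.randint(0, 3) for r in real])
    # What the observer sees leaving the network: the pattern plus fresh random cover.
    exit_trace = [p + rng.randint(0, 3) for p in pattern]
    return [pearson(trace, exit_trace) for trace in entry_traces]


scores = simulate(20000)
best_guess = max(range(len(scores)), key=scores.__getitem__)
print(best_guess, [round(s, 3) for s in scores])
```

With only a few dozen bins the scores are noisy and the observer can easily guess wrong. As the number of bins grows, the unrelated users' scores shrink toward zero while the real source's score stays put, which is the "high confidence over time" effect described above.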
So Scenario A is the gold standard, and Scenario B would be better than nothing. Having a global onion network has its own issues: now you have to trust many nodes instead of one. It all comes down to your threat model and how much effort you’re willing to put in.