From The Hacker News
A naive conversion from a displayable (dotted-decimal) IPv4 address to a 32-bit integer might look like this:
You have taken the four numbers of the IPv4 address from the text input and converted each of them to an integer; for simplicity’s sake we will say that they were put into variables a, b, c and d. It is common to just use ‘int’ for numbers and not choose something more fitting like ‘uint8’, so in our case the variables will look like this after assignment:
a: 0  b: 0  c: 0  d: 2130706433
Now let’s declare a 32-bit variable R for our result and initialize it with 0 (or with d to skip a step, but optimization is not the point here).
Let’s work from left to right along the original representation. That means we’ll start with variable a. In an ideal case, only the rightmost 8 bits are set (0-255 decimal). Many will just make that (naive) assumption. We shift it to the left, so that the 8 bits that were previously on the very right are now in the leftmost 8 bit positions of variable a, like this: a<<24.
We are working with 32-bit integer variables, which means that in this first case any bits beyond the intended 8 bits were “shifted out” to the left and are now gone. There is no problem here yet. We simply OR it into the result variable. The result R now contains the first 8 bits of the IPv4 address and they are correct.
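A minimal sketch of this first step, assuming Python for illustration. Python integers are unbounded, so the sketch emulates a 32-bit variable with a mask; the names MASK32 and shift_first_octet are made up for this example.

```python
MASK32 = 0xFFFFFFFF  # Python ints never overflow, so emulate a 32-bit variable


def shift_first_octet(a: int) -> int:
    """Move a into the leftmost 8 bits; anything above bit 31 is dropped."""
    return (a << 24) & MASK32


print(hex(shift_first_octet(127)))  # 0x7f000000
print(hex(shift_first_octet(383)))  # 0x7f000000 (383 is 0x17F, the extra bit is shifted out)
```

Whatever was above the lowest 8 bits of a is discarded, so the first octet of the result is always exactly 8 bits wide.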
Problems can appear only when we do the same with variables b, c or d. With b, we only need to shift to the second octet from the left: b<<16. If we simply put our shifted b into the result variable R by OR-ing it with the current result, we add not just 8 bits of information to the result, but possibly 16, because any bits of b above the lowest 8 land in the first octet. In this case the input was 0, so nothing bad has happened yet.
We repeat the same for c by shifting it by only one octet: c<<8. As c is also 0, nothing bad happens when we OR it together with the previous result.
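To see what could go wrong at this step with a non-zero input, here is a small sketch in Python. The value 32512 (0x7F00) is a hypothetical oversized second field chosen for illustration, and or_second_octet is a made-up helper name.

```python
MASK32 = 0xFFFFFFFF  # emulate a 32-bit variable


def or_second_octet(r: int, b: int) -> int:
    """OR b into the second octet of r without checking b's range."""
    return r | ((b << 16) & MASK32)


r = or_second_octet(0, 32512)  # a was 0, so R is still 0 before this step
print(hex(r))   # 0x7f000000
print(r >> 24)  # 127: the first octet was overwritten by b
```

The upper 8 bits of the oversized b end up in the octet that was supposed to belong to a.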
The last step is where the (minimal) example from above gives us problems. In this case we do not need to shift at all (d<<0). Remember that our result R is still 0 at this point. If we now OR 0 and 2130706433 together, we just get 2130706433. This last step added 32 bits of information even though we only intended to add 8.
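The whole walk-through can be sketched as follows. This is an illustration in Python under the assumptions described above, not any particular library's implementation; naive_ipv4_to_int and int_to_dotted are hypothetical names, and the mask emulates the 32-bit variable R.

```python
MASK32 = 0xFFFFFFFF  # emulate the 32-bit result variable R


def naive_ipv4_to_int(text: str) -> int:
    """Naive conversion: no range check on any of the four fields."""
    a, b, c, d = (int(part) for part in text.split("."))
    r = 0
    r |= (a << 24) & MASK32
    r |= (b << 16) & MASK32
    r |= (c << 8) & MASK32
    r |= (d << 0) & MASK32
    return r


def int_to_dotted(value: int) -> str:
    """Render a 32-bit integer in the usual four-octet notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


r = naive_ipv4_to_int("0.0.0.2130706433")
print(r, int_to_dotted(r))  # 2130706433 127.0.0.1
```

Rendered back in the usual notation, the result is 127.0.0.1, which is not what the text form suggested.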
Our converted IPv4 address is valid and some might say we even did a correct conversion, but it is of course not the intuitive result we wanted.
As you can see, an attacker might choose b, c and d as they please to make the notation look irregular, but our conversion simply eats it up and spits out a valid address. That’s why it is important to always work with IP addresses as integer numbers in bit format. Convert first, then filter on that. You never know how somebody may have implemented the conversion and whether they reject odd choices like “0.0.0.2130706433”.
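As a rough illustration of “convert first, then filter”, the sketch below compares a text-based filter with an integer-based one. The loopback range 127.0.0.0/8 is only an assumed example of something you might want to block, and all function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
LOOPBACK_NET = 0x7F000000   # 127.0.0.0
LOOPBACK_MASK = 0xFF000000  # /8 netmask


def naive_ipv4_to_int(text: str) -> int:
    """Stand-in for a lenient conversion somewhere further down the stack."""
    r = 0
    for shift, part in zip((24, 16, 8, 0), text.split(".")):
        r |= (int(part) << shift) & MASK32
    return r


def text_filter_blocks(text: str) -> bool:
    """Filter on the displayable form: easy to sidestep."""
    return text.startswith("127.")


def int_filter_blocks(addr: int) -> bool:
    """Filter on the converted 32-bit value."""
    return (addr & LOOPBACK_MASK) == LOOPBACK_NET


odd = "0.0.0.2130706433"
print(text_filter_blocks(odd))                    # False: slips past the text filter
print(int_filter_blocks(naive_ipv4_to_int(odd)))  # True: caught after conversion
```

The integer filter sees the same value that the rest of the system will end up using, whatever the text looked like.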
You could of course also prevent stuff like this by making the conversion smarter and having it reject any of a, b, c, d greater than 255 or something similar, but then you are just putting band-aids on a simple problem with a simple solution. Our conversion had to work with 4 independent numbers in a loop. It’s best to only do that once and use 32-bit integers beyond that.
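For comparison, a stricter conversion of the kind mentioned above might look like this. It is only a sketch: strict_ipv4_to_int is a hypothetical name, and the exact rules (four fields, ASCII digits only, each at most 255) are assumptions about what “smarter” would mean here.

```python
def strict_ipv4_to_int(text: str) -> int:
    """Convert dotted-decimal text to a 32-bit integer, rejecting odd input."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}")
    r = 0
    for part in parts:
        # Only plain ASCII digits are allowed: no signs, spaces or empty fields.
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"not a decimal number: {part!r}")
        octet = int(part)
        if octet > 255:
            raise ValueError(f"field out of range: {octet}")
        r = (r << 8) | octet  # each field now contributes exactly 8 bits
    return r


print(strict_ipv4_to_int("127.0.0.1"))  # 2130706433
try:
    strict_ipv4_to_int("0.0.0.2130706433")
except ValueError as err:
    print("rejected:", err)
```

Even with a check like this in place, it is still done once at the boundary; everything after it works on the 32-bit integer.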