How Vulnerabilities Happen: Input Validation Problems
We would like to thank Richard Ackroyd of RandomStorm for reporting a critical input validation error in our site to us. As we have done before, here is how it happened so hopefully you can learn from it as well.
Lets start with a bit of background. Our site deals a lot with IPv4 addresses. Most of the time, we store IPv4 addresses as a string. I know this isn't the most efficient way, but well, that decision goes back to "the beginning". To make sorting and indexing simpler, we "pad" IPv4 addresses with zeros, and you may have seen this on the site. 192.0.2.1 becomes 192.000.002.001.
Originally, I used a simple trick to validate IP addresses. I just converted the IP address to its long integer format, and then back to a string. This guarantees that you end up with a valid IP address. Later, we started using more of the standard "FILTER" functions in php to make some input validation easier, and modified the IPv4 validation function to use it. At the same time, to make the code a bit simpler, we also added an "unpad" function to fix up the IP address by removing extra 0s first.
Here is a quick view at the vulnerable code:
if ( is_ipv4($sIPAddress) { ... use $sIPAddress ... } else { ... display error ... } function is_ipv4($sValue) { $sValue=unpadip($sValue); if ( filter_var($sValue,FILTER_VALIDATE_IP,FILTER_FLAG_IPV4) ) { return $sValue; } } function unpadip($sIP) { $aIP=explode('.',$sIP); if ( sizeof($aIP)<4) return false; return sprintf("%d.%d.%d.%d",$aIP[0],$aIP[1],$aIP[2],$aIP[3]); }
So why is this wrong? The big problem is that I am modifying the value (unpad) before validating it, and then use the unmodified value, not the one I modified. At first, that doesn't look too bad in this case. But turns out that the "unpad" function does more then just remove extra 0s. Any other non-numeric character is removed. E.g. try:
printf("%d","123 this is an exploit");
and you will get "123" back. That is part of the point of %d. The end result was that we validated a value that was cleansed by "unpad", but then used the "dirty" value which still included the exploit code.
Our solution for now is twofold:
- add a bit more input validation to the unpad function, just in case we use it unsafely in other parts of the code
- remove the unpad from the is_ipv4 validation function.
The ultimate solution would be to change how we store IPv4 addresses and store them as long integer, which is much more efficient, but will take some time as we got a lot of code that needs to read/write from those tables. For IPv6 we use two 64 bit numbers ("BIGINT" in mysql) which works very well as it splits network and interface part.
Application Security: Securing Web Apps, APIs, and Microservices | Online | US Eastern | Jan 27th - Feb 1st 2025 |
Comments