I believe that professional wrestling is clean and everything else in the world is fixed.
- Frank Deford
The code snippet shown below has a security vulnerability.
Can you spot the vulnerability in this piece of code? If so, feel free to leave a comment. To prevent spoilers, none of the comments will be shown until Friday.
If you are a development manager or an instructor, you can integrate these security source code challenges into your development program or your curriculum.
Vulnerable Code Snippet
    function clean_url( $url, $protocols = null, $context = 'display' ) {
        $original_url = $url;

        if ('' == $url) return $url;
        $url = preg_replace('|[^a-z0-9-~+_.?#=!&;,/:%@$\|*\'()\\x80-\\xff]|i', '', $url);
        $strip = array('%0d', '%0a');
        $url = str_replace($strip, '', $url);
        $url = str_replace(';//', '://', $url);
        if ( strpos($url, ':') === false &&
            substr( $url, 0, 1 ) != '/' && substr( $url, 0, 1 ) != '#' && !preg_match('/^[a-z0-9-]+?\.php/i', $url) )
            $url = 'http://' . $url;

        // Replace ampersands and single quotes only when displaying.
        if ( 'display' == $context ) {
            $url = preg_replace('/&([^#])(?![a-z]{2,8};)/', '&$1', $url);
            $url = str_replace( "'", '', $url );
        }

        if ( !is_array($protocols) )
            $protocols = array('http', 'https', 'ftp', 'ftps', 'mailto', 'news', 'irc', 'gopher', 'nntp', 'feed', 'telnet');
        if ( wp_kses_bad_protocol( $url, $protocols ) != $url )
            return '';

        return apply_filters('clean_url', $url, $original_url, $context);
    }



Argh! Regular expressions!!
Not my forte, but I’ll take some time to decipher them.
BTW, is this a snippet from a WordPress plugin? Although I haven’t looked at much PHP code, the apply_filters scheme looks very similar to the stuff that WordPress plugins are made of (I developed plugins and a theme for WP once).
May I get the entire code?
OK, I can’t resist.
Mainly, this script is backwards: it cleans a few known dangerous characters rather than only allowing known good characters. In fact, it still allows most ASCII characters, including ones that have special meanings to the shell, such as ! and *. It allows %, so a format string attack on printf may be possible. It allows URL-encoded characters other than \n and \r. It allows ASCII/hex encoding, so you could pass a control character (^), ESC, etc. It doesn’t prevent buffer overflows.
Also, it returns the original URL, which hasn’t been cleaned. There is no point in cleaning it and then accessing the tainted variable later. Dump it.
Finally, it’s in PHP.
I generally try to specify which characters I WILL allow, as then I don’t have to worry about which esoteric characters the attacker might come up with.
So, if the vulnerability is in the regex filter, I claim a win.
I’m still looking for a logical vulnerability, though…
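The allow-list approach described in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: `is_allowed_url` is a hypothetical helper that is not part of WordPress, and both the character set and the default scheme list are choices made for this example rather than a recommendation.

```php
<?php
// Hypothetical helper (not a WordPress function): accept a URL only if every
// character is in an explicit allow-list AND its scheme is one we expect.
function is_allowed_url(string $url, array $schemes = ['http', 'https']): bool
{
    // Assumed allow-list: letters, digits and common URL punctuation.
    // Quotes, angle brackets, whitespace and control characters are absent,
    // so any URL containing them is rejected outright rather than "cleaned".
    if (!preg_match('/\A[A-Za-z0-9\-._~:\/?#\[\]@!$&()*+,;=%]+\z/', $url)) {
        return false;
    }

    // parse_url() returns null when the component is missing and false when
    // the URL is seriously malformed; both cases are rejected here.
    $scheme = parse_url($url, PHP_URL_SCHEME);
    if (!is_string($scheme)) {
        return false;
    }

    // Strict comparison against the allowed schemes, case-insensitively.
    return in_array(strtolower($scheme), $schemes, true);
}
```

With these assumptions, a URL such as `javascript:alert(1)` passes the character check but fails the scheme check, while a value containing a double quote or `<` is rejected at the first step.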
I had to look at the WordPress code to understand what this function is expected to do. It is impossible to know what could be a security vulnerability in the code without knowing how the function's output is going to be used.
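To illustrate that last point, here is a minimal sketch of how the same URL string needs different treatment depending on where it ends up. The helper names `for_html_attribute` and `for_query_value` are hypothetical and chosen for this example; `htmlspecialchars` and `rawurlencode` are standard PHP functions.

```php
<?php
// Hypothetical helper: escape a URL for use inside an HTML attribute value,
// e.g. <a href="...">. ENT_QUOTES makes both quote styles safe.
function for_html_attribute(string $url): string
{
    return htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
}

// Hypothetical helper: encode a URL so it can be passed as the value of a
// single query-string parameter, e.g. ?redirect=...
function for_query_value(string $url): string
{
    return rawurlencode($url);
}

// The same input produces different, non-interchangeable output per context.
$url = "http://example.com/?q=a&b='x'";
$as_attribute   = for_html_attribute($url);
$as_query_value = for_query_value($url);
```

A string that is safe in one of these contexts is not automatically safe in the other, which is why the intended use of the return value matters when judging a function like `clean_url`.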