These Pipes are Clean

I believe that professional wrestling is clean and everything else in the world is fixed.
- Frank Deford

The code snippet shown below has a security vulnerability.

Can you spot the vulnerability in this piece of code? If so, feel free to leave a comment. None of the comments will be shown until Friday, to prevent spoilers.

If you are a development manager or an instructor, you can integrate these security source code challenges into your development program or your curriculum.

Vulnerable Code Snippet

function clean_url( $url, $protocols = null, $context = 'display' ) {
    $original_url = $url;

    if ('' == $url) return $url;
    $url = preg_replace('|[^a-z0-9-~+_.?#=!&;,/:%@$\|*\'()\\x80-\\xff]|i', '', $url);
    $strip = array('%0d', '%0a');
    $url = str_replace($strip, '', $url);
    $url = str_replace(';//', '://', $url);

    if ( strpos($url, ':') === false &&
        substr( $url, 0, 1 ) != '/' && substr( $url, 0, 1 ) != '#' && !preg_match('/^[a-z0-9-]+?\.php/i', $url) )
        $url = 'http://' . $url;

    // Replace ampersands and single quotes only when displaying.
    if ( 'display' == $context ) {
        $url = preg_replace('/&([^#])(?![a-z]{2,8};)/', '&#038;$1', $url);
        $url = str_replace( "'", '&#039;', $url );
    }

    if ( !is_array($protocols) )
        $protocols = array('http', 'https', 'ftp', 'ftps', 'mailto', 'news', 'irc', 'gopher', 'nntp', 'feed', 'telnet');
    if ( wp_kses_bad_protocol( $url, $protocols ) != $url )
        return '';

    return apply_filters('clean_url', $url, $original_url, $context);
}

5 comments to These Pipes are Clean

  • Argh! Regular expressions!! :-( Not my forte, but I’ll take some time to decipher them.

    BTW, is this a snippet from a WordPress plugin? Although I haven’t looked at much PHP code, the apply_filters scheme looks very similar to the stuff that WordPress plugins are made of (I developed plugins and a theme for WP once).

  • May I get the entire code?

  • OK, I can’t resist.

    Mainly, this script is backwards: it cleans a few known dangerous characters rather than only allowing known good characters. In fact, it still allows most ASCII characters, including things that have special meanings to the shell, etc., like !, *, etc. It allows %, so one could do a format string attack on printf. It allows URL-encoded characters other than \n and \r. It allows ASCII/hex encoding, so you could pass a control (^), ESC, etc. It doesn’t prevent buffer overflows.

    Also, it returns the original URL, which hasn’t been cleaned. No point in trying to clean it and then accessing the tainted variable later. Dump it.

    Finally, it’s in PHP. :-)

  • I generally try to specify which characters I WILL allow, as then I don’t have to worry about which esoteric characters the attacker might come up with.

    So, if the vulnerability is in the regex filter, I claim a win.

    I’m still looking for a logical vulnerability,though…

  • I had to look at the WordPress code to understand what this function is expected to do. It is impossible to know what could be a security vulnerability in the code without knowing how the function's output is going to be used.
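
The "specify which characters I WILL allow" idea raised in the comments can be sketched as a validator that rejects a URL outright instead of trying to repair it. This is only an illustration under stated assumptions, not WordPress code: the function name is_acceptable_url, its default scheme list, and the exact character set are hypothetical choices made for this example, and whether such a check is sufficient depends on how the output is used.

```php
<?php
// Hypothetical helper (not part of WordPress): validate by rejection.
// Nothing is stripped or rewritten, so there is no second pass to get wrong.
function is_acceptable_url( string $url, array $schemes = array( 'http', 'https' ) ): bool {
    // 1. Every character must come from a fixed allow-list; one stray
    //    character fails the whole URL (an empty string also fails).
    if ( preg_match( '/\A[a-z0-9\-._~:\/?#\[\]@!$&\'()*+,;=%]+\z/i', $url ) !== 1 ) {
        return false;
    }

    // 2. Each percent sign must start a two-digit hex escape, and encoded
    //    CR/LF is refused in either letter case.
    if ( preg_match( '/%(?![0-9a-f]{2})|%0[ad]/i', $url ) === 1 ) {
        return false;
    }

    // 3. The scheme must be present and on the (assumed) allow-list.
    $scheme = parse_url( $url, PHP_URL_SCHEME );
    return is_string( $scheme ) && in_array( strtolower( $scheme ), $schemes, true );
}

var_dump( is_acceptable_url( 'http://example.com/a?b=1&c=2' ) );          // bool(true)
var_dump( is_acceptable_url( 'http://example.com/%0D%0ASet-Cookie:x' ) ); // bool(false)
var_dump( is_acceptable_url( 'javascript:alert(1)' ) );                   // bool(false)
```

Rejecting keeps the check to a single pass over the input; a caller would still need to escape the accepted URL for whatever context it is printed into.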
