2010
08.06

Last weekend, I evaluated a couple of Wordpress plugins that I wanted to add to this site. One of the plugins I looked at was Scripts Gzip: its goal is to reduce the number of requests needed to fetch CSS/JS. As the author puts it:

It does not cache, minify, have a “PRO” support forum that costs money, keep any logs or have any fancy graphics. It only does one thing, merge+compress, and it does it relatively well.

I happened to take a look at the plugin’s code because I was curious how it was fetching the CSS/JS. However, I soon discovered a number of fairly serious security vulnerabilities. I immediately contacted the author with an explanation of what I had found. He responded quickly, acknowledged the issues, and released a new version of the plugin that addressed all of the vulnerabilities I had found.

Since a fixed version of the plugin has now been released, I feel comfortable discussing the flaws publicly. I’ll be giving a brief synopsis along with details about each vulnerability, ordered below from most to least serious.

1. Arbitrary Exposure of File Contents

This vulnerability was quite serious. It allowed an attacker to view the complete contents of any file within the Wordpress directory, even sensitive PHP files like wp-config.php.

The problem came from this function in gzip.php:

<?php
private function collectData($options)
{
    $returnValue = array('data' => '', 'latestModified' => null);
    foreach($options['files'] as $file)
    {
        $oldDirectory = getcwd();
        $parsedURL = parse_url($file);
        $file = $parsedURL['path'];
        $file = preg_replace('/\.\./', '', $file);  // Avoid people trying to get into directories they're not allowed into.
        $parents = '../../../';                 // wp-content / plugins / scripts_gzip
        $realFile =  $parents . $file;
        if (!is_readable($realFile))
            continue;

        if ($options['mime_type'] == 'text/css')
        {
            chdir($parents);
            $base = getcwd() . '/';
            // Import all the files that need importing.
            $contents = $this->import(array(
                'contents' => '',
                'base' => $base,
                'file' => $file,
            ));
            // Now replace all relative urls with ones that are relative from this directory.
            $contents = $this->replaceURLs(array(
                'contents' => $contents,
                'base' => $base,
                'parents' => $parents,
            ));
        }
        else
        {
            $contents = file_get_contents($realFile);
        }

        $returnValue['data'] .= $contents;

        if ($options['mime_type'] == 'application/javascript')
            $returnValue['data'] .= ";\r\n";

        chdir($oldDirectory);

        $returnValue['latestModified'] = max($returnValue['latestModified'], filemtime($realFile));
    }
    return $returnValue;
}

The key to understanding this vulnerability is understanding the inputs. $options is an array containing user data, including the names of the files that the user has asked the script to gzip. Lines 6-13 of the function (from the getcwd() call through the is_readable() check) are sanitization/validation checks that prevent the user from loading files outside of the Wordpress directory or loading non-existent files. However, on line 34 (the file_get_contents() call in the else branch), the file requested by the user is loaded without any further validation. That code path is executed when the user requests that JS be gzipped, and nothing restricts the request to actual JavaScript files.

As a concrete example, http://example.com/wordpress/wp-content/plugins/scripts-gzip/gzip.php?js=wp-config.php would load the Wordpress configuration file on a vulnerable installation.

2. “Limited” Exposure of File Contents

A less serious security vulnerability occurs if the attacker tries loading CSS instead of JS. As we can see from the code above, if the script is loading CSS it calls an import function, which is reproduced in part below.

<?php
/**
 * Import a CSS @import.
 */
private function import($options)
{
    $dirname = dirname($options['file']);
    if (!is_readable($dirname))
        return $options['contents'];

    // SECURITY: file may not be outside our WP install.
    if (!str_startsWith(realpath($options['file']), $options['base']))
        return $options['contents'];

    // SECURITY: file may not be a php file.
    if (strpos($options['file'], '.php') !== false)
        return $options['contents'];

    // Change the directory to the new file.
    chdir($dirname);
    $newContents = file_get_contents(basename($options['file']));

    // The rest of the code has been omitted for space considerations
}

It’s clear that this code path does do some extra validation. However, the validation is not strong enough. It employs a blacklist model (forbidding any file whose name contains .php) rather than a whitelist model (e.g. only allowing files ending in .css). Accordingly, an attacker can request files that are potentially sensitive but not forbidden (e.g. .htaccess), as long as they are within the Wordpress directory.

As another concrete example, http://example.com/wordpress/wp-content/plugins/scripts-gzip/gzip.php?css=.htaccess would load the .htaccess file for Wordpress, if it exists.
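
To make the blacklist/whitelist distinction concrete, here is a minimal sketch of whitelist-style validation. It is written in Python rather than PHP purely for illustration, and it is not the fix that shipped in the plugin. The function name is_allowed, the ALLOWED_EXTENSIONS set, and the file names are all assumptions of mine.

```python
import os
import tempfile

# Hypothetical whitelist: only these extensions may ever be served.
ALLOWED_EXTENSIONS = {".css", ".js"}


def is_allowed(base_dir, requested):
    """Return True only if `requested` is a whitelisted file inside base_dir."""
    base = os.path.realpath(base_dir)
    # Resolve symlinks and ".." segments *before* checking containment.
    target = os.path.realpath(os.path.join(base, requested.lstrip("/")))
    if os.path.commonpath([base, target]) != base:
        return False  # the path escaped the base directory
    extension = os.path.splitext(target)[1].lower()
    return extension in ALLOWED_EXTENSIONS and os.path.isfile(target)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for name in ("style.css", "wp-config.php", ".htaccess"):
            with open(os.path.join(root, name), "w") as handle:
                handle.write("placeholder")
        for name in ("style.css", "wp-config.php", ".htaccess", "../etc/passwd"):
            print(name, is_allowed(root, name))
```

With a whitelist, a file like .htaccess is rejected simply because nobody listed it, so there is no need to anticipate every sensitive file name in advance.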

3. Path Disclosure

This vulnerability is the least serious of the bunch. However, a path disclosure vulnerability can be useful for hackers looking to gain more information about a target. It can also make other sorts of attacks easier to pull off. This vulnerability required an attacker to be able to write a file somewhere within the Wordpress directory (possibly if they’re allowed to upload attachments). If the file contained a line with url(‘…’) and the script were directed to parse it as CSS, the result returned would contain the absolute path to the Wordpress directory.

Summary

All of these vulnerabilities have been resolved in the latest version of the plugin, 0.9.1. I urge anyone using Scripts Gzip to upgrade as soon as possible. I’d especially like to thank the plugin’s author, Edward Hevlund, for his speedy responses to my emails, his attentiveness to security issues, and (of course) for a wonderfully useful plugin.

2010
08.04

My laptop was infected by a virus recently. My Java installation was out of date (so much for auto-update!) and I browsed to a page containing malicious advertisements which downloaded a virus. I immediately stopped what I was working on and cleaned off the computer. After spending a couple hours on it, I was reasonably certain the virus was gone.

Unfortunately, I started getting messages from Symantec Auto-Protect telling me that it had found an infected file in my temporary directory. Since that was where the infection started, I began to worry that I hadn’t completely eliminated the virus. So I rebooted into safe mode, re-scanned, etc. Nothing found!

This was puzzling, but I figured I would just continue to work like normal. Unfortunately, today the popups from Auto-Protect started again. Originally I thought that one of the sites I was visiting was infected: however, when I closed the site in question, the alerts continued. Finally, I Googled, hoping to find a solution. All I knew was that the filename was random-ish (DWH*.tmp) and that Norton was describing the virus as Trojan.gen (Norton’s generic term for “a trojan horse of some kind”).

Luckily, I ended up stumbling upon my answer. It was written by a Symantec employee in response to a topic about the issue:

http://www.symantec.com/connect/forums/generic-trojan-dwhtmp-temp-folder

The DWH files are temp files that are created by our process called defwatch.exe. These files are quarantined threats that we pull out of quarantine to scan during a quick scan. This usually happens when new defs are applied…What we have seen in most cases, is the indexing service, or some other real-time scanner is touching the file and then auto-protect is re-scanning it.

So, mystery solved! The files I was seeing were actually quarantined versions of the viruses I had eliminated earlier. A couple of clicks to empty my quarantine and I wasn’t getting any more alerts. Very satisfying. :-)

2010
08.02

In case anyone else is interested in counting requests per IP without the use of some scary sed/awk, here’s a combination of shell commands that I found very useful:

grep 'text' /path/to/access.log | cut -d' ' -f1 | sort | uniq -c | sort -rn

It breaks down like this:

  1. You grep for whatever string you’re interested in inside your access log. You might want to find a certain path, a certain user-agent, etc. The grep could really be replaced by any command, so long as the result is lines from a standard Apache access log.
  2. The result is piped to cut, which splits each line on spaces and grabs the first field (the IP in the default log format).
  3. The result is piped to sort, which sorts the result, putting identical IPs next to each other. Remember, one line corresponds to one request.
  4. The result is piped to uniq, which collapses adjacent identical IPs into a single line. The -c option causes uniq to also return the number of lines containing the IP. This is important since we’re interested in frequency.
  5. Finally, we pipe the result through sort one more time, which sorts the results by their frequency. The -n option compares the counts as numbers rather than as text, and the -r option puts the most frequent IPs at the top.

Kudos to http://blogs.law.harvard.edu/djcp/2009/04/how-to-extract-uniq-ips-from-apache-via-grep-cut-and-uniq/, which pointed me in the right direction to begin with.

Note: As Frank mentions in the comments, this tip applies equally well to other web servers (e.g. lighttpd, Nginx) when they’re using their default log format. In fact, the basic principles can be applied to just about any data: you don’t have to be looking at IPs in an access log. The cut/sort/uniq/sort chaining should work well no matter what kind of textual data you’re looking at!
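
For anyone who would rather do the counting in a script, the same grep/cut/sort/uniq idea can be sketched in a few lines of Python. This is only an illustration: the function name count_requests_per_ip and the sample log lines are made up, and it assumes the IP is the first space-separated field, as in the default log format.

```python
from collections import Counter


def count_requests_per_ip(lines, needle=""):
    """Count matching log lines per first field, most frequent first."""
    counts = Counter(
        line.split(" ", 1)[0]                # like cut -d' ' -f1
        for line in lines
        if line.strip() and needle in line   # like grep 'text'
    )
    # Counter does the grouping and counting that sort | uniq -c do,
    # and most_common() orders the result by descending frequency.
    return counts.most_common()


if __name__ == "__main__":
    sample = [
        '10.0.0.1 - - [02/Aug/2010:10:00:00 -0400] "GET /index.php HTTP/1.1" 200 512',
        '10.0.0.2 - - [02/Aug/2010:10:00:01 -0400] "GET /index.php HTTP/1.1" 200 512',
        '10.0.0.1 - - [02/Aug/2010:10:00:02 -0400] "GET /about.php HTTP/1.1" 200 128',
    ]
    for ip, hits in count_requests_per_ip(sample, "GET"):
        print(hits, ip)
```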

2010
07.31

Arbitrary PHP execution vulnerabilities are both nasty and powerful. However, there is one aspect of these vulnerabilities that people don’t seem to keep in mind: any code that’s run and any requests that are made originate from the compromised server. That compromised server is a platform for the attacker: once an attacker has compromised one server, he/she is in a better position to compromise more.

Let’s take an example. I set up a server to host some websites for me. I’m really paranoid, so I lock the server down as much as I can: I disable any functions that allow processes to be run outside of PHP (proc_*, exec, passthru, system, etc), I stick it behind a firewall that only allows traffic on port 80, and I lock the webserver down in a jail environment (so it doesn’t have access to any other parts of the filesystem, just its files).

I’m safe, right?

Wrong!

One of my applications has a silly bug: it allows the user to enter code which it parses as PHP. An attacker who uses my application finds the vulnerability and starts exploiting it. He quickly realizes that he’s “locked in”: he can’t execute shell commands and he can’t access most of the filesystem. All he can do is execute PHP: that allows him to download any files the webserver can read and/or to run arbitrary PHP code (say, some very expensive calculations designed to cause a DoS attack).

But there’s nothing else he can do, right?

Wrong!

It turns out that the attacker has much more power than he thinks. He’s now inside the firewall: he can execute whatever connections he wants within the supposedly “safe” environment, as long as he uses PHP to start. A savvy attacker could:

  • Launch a port scan attack against other servers. You can identify the services on other systems and maybe find a more vulnerable one.
  • Gain access to the server via SSH. A pure PHP implementation of SSH can be found at http://phpseclib.sourceforge.net/. If you can upload a large script to the server (all the necessary dependencies combined take up ~400 KB), you have an SSH client. That gives you the opportunity to launch an attack against the server you’re on, or any other servers on the network. Combine that with some social engineering (to gain a username/password without brute force) and you have access to the server.

The best part about these attacks is they’re all internal: a sysadmin looking at the logs would see a server attacking itself or attacking other servers. The attack vector isn’t immediately obvious.

2010
07.22

In no particular order, this is a list of my favorite Firefox extensions. They make my browsing experience awesome and I wouldn’t trade them for anything in the world. :)

  1. Live HTTP Headers. I’ve had this extension for years and it has proved invaluable time and time again. I use it for debugging and security testing. Very simple, very easy to use, very powerful.
  2. Firebug. This one is important for anyone doing web design/development. It’s just an absolutely awesome tool. I can’t imagine ever having developed HTML/CSS without it. It even comes with its own extension system, which I haven’t taken advantage of but which looks incredible.
  3. Web Developer Toolbar
  4. User Agent Switcher. Both #3 and #4 are made by the same developer. These were two of the first extensions I ever installed. I keep using them because of how useful they are for debugging and testing.
  5. View Source With. Allows me to view HTML in my favorite Windows text editor, Notepad2.
  6. BugMeNot. While it has grown less useful over the years (sites are getting better and better at banning their accounts), it’s always a good time-saver to have. It can’t hurt! :-)

There are definitely a couple developers who I need to donate to. BRB! :-P

2010
07.15

Everyone who has worked with PHP should be familiar with the bare fundamentals of its syntax: an opening PHP tag followed by code and (optionally) followed by a closing PHP tag. Incredibly though, there isn’t just one set of tags that can be used to invoke the PHP interpreter on a block of code: there are a total of four separate sets! Each set of tags has a slightly different set of behaviors and awareness of those differences is crucial in preventing certain types of security vulnerabilities.

My goal here is to lay out the four different sets of tags, describe what makes them special, and finally to explain why knowing about them is so important.

“Standard” PHP

This is the syntax that most people are familiar with; accordingly, it’s what you see most often within PHP code. The opening tag is defined to be <?php and the closing tag is ?>. This set of tags is always usable within PHP.

<?php

// code goes here

?>

PHP Short Tags

This is another fairly well-known syntax. The opening tag is shortened to just <?, matching up nicely with the closing tag. There’s even an abbreviated syntax for echoing using the opening tag <?=. Unfortunately, these types of tags conflict with XML documents due to the documents’ use of <?xml.

PHP can be configured to accept or ignore these tags using the short_open_tag directive in php.ini. As a result, they’re considered non-portable; if you’re writing code that’s intended to be used by others or that may be used in environments where you can’t control PHP settings, you’re encouraged to forgo short tags in favor of standard tags.

<?

// code goes here

?>
<?= $var ?>

“ASP-style” Short Tags

This is a less well-known syntax. It behaves in a similar fashion to regular short tags; the only difference is that the opening and closing tags are <% and %>, respectively. This change avoids conflicting with XML documents.

PHP can be configured to accept or ignore these tags using the asp_tags directive in php.ini. As a result, they’re also considered non-portable. They are used much less frequently than regular PHP short tags.

<%

// code goes here

%>

<%= $var %>

PHP <script> Tags

This is possibly the least well known of the four tags: I’ve never used it personally and I’ve only seen it referenced in very old books on PHP. It behaves just like the normal PHP tags though. All PHP installations will parse PHP code written in this format; there is no way to disable that behavior.

<script language="php">

// code goes here

</script>

Why Different Tags Matter

This example is based on an actual security vulnerability I encountered in a live application.

Let’s say you write an application that wants to read in user-supplied data via include/require (generally a bad idea, but there are applications out there that do this). You, as a smart PHP programmer, realize that you have a potential security vulnerability on your hands: all a user needs to do is write an opening PHP tag and they can execute arbitrary code! So, you decide to filter their input: you reject it if it contains <% or <? anywhere. Maybe you even turn asp_tags off in php.ini, since you never plan to use them anyway. That takes care of <%, <?, <?php, and any special echoing syntax that they might provide. You’re safe and secure now, right?

Oh, wait. There’s a fourth set of tags you forgot about.

<script language="php"> is not very well-known and it doesn’t look like the other sets of tags. So, it’s both harder to remember and harder to defend against than any of the other tags. A blacklist like the one above would still allow the <script> tags through, allowing for arbitrary PHP code execution.

Now, although this is a very good argument for why developers shouldn’t rely on blacklists for security and why applications should not call include/require on files created from user input, both of those things do happen. Security-conscious developers need to be aware of all of these risks and loopholes so they can be properly mitigated.
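
The gap in that blacklist is easy to demonstrate. The sketch below is in Python and only models the string check described above; it does not run any PHP. The function name naive_filter_allows and the sample payloads are hypothetical.

```python
# The two substrings that the hypothetical filter from the text rejects.
BLACKLIST = ("<?", "<%")


def naive_filter_allows(user_input):
    """Return True if the input passes the blacklist (i.e. would be accepted)."""
    return not any(token in user_input for token in BLACKLIST)


if __name__ == "__main__":
    payloads = [
        "<?php echo 1; ?>",
        "<?= $var ?>",
        "<% echo 1; %>",
        '<script language="php"> echo 1; </script>',
    ]
    for payload in payloads:
        verdict = "accepted" if naive_filter_allows(payload) else "rejected"
        print(verdict, payload)
```

Only the last payload passes the check, and it is exactly the set of tags the filter’s author forgot about.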

2010
07.11

About a year ago, a couple friends and I were working on a PHP-based web application for one of our classes. The web application needed to take in input from another website (the school’s calendar system), which provided data in CSV format. As a result, I became very familiar with PHP’s (lack of) CSV parsing functionality and there’s one tip in particular that I think is worth sharing.

PHP has two main functions for parsing CSV data: fgetcsv and str_getcsv. fgetcsv complements other functions like fopen, fsockopen, etc: it’s used to read data from files, similar to fread. In contrast, str_getcsv works on strings: there’s no file I/O necessary. This is much more appealing than having to perform low-level reads/writes to fetch data (especially when your data is coming from another website). Unfortunately, it only exists in PHP >= 5.3.0.

So, for our project, we took in data from a website (using an HTTP client class provided by the PHP framework we were using) and stored it in a string. But since we didn’t have PHP 5.3, we needed to turn the data into a file if we wanted to read from it. Luckily, a comment on PHP.net alerted us to a convenient way to “work around the issue”:

<?php
// Assume this variable contains CSV data from some external source, stored as a string
$my_csv_text = get_csv_from_internet();

// Open a memory "file" for read/write...
$f = fopen('php://temp', 'r+');
fwrite($f, $my_csv_text);
rewind($f);

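// fgetcsv returns one row at a time as an array, and FALSE at end of file
while (($data = fgetcsv($f)) !== FALSE)
{
    // do stuff with the data!
}

// Release the temporary stream once every row has been read
fclose($f);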

The code takes advantage of a PHP 5.1 feature: the php://temp stream, which (according to the PHP docs) keeps its data in memory until it grows past a certain size (2 MB by default), at which point it is written to a temporary file on disk instead. That allows the data to be read without any messy cleanup. The only disadvantage is slightly higher memory usage, since the entire CSV file is stored twice (once as a string in PHP, once after being written to php://temp). Anyone using a version earlier than 5.1, or who wants to avoid the extra memory usage, can write to a regular file instead: the rest of the code should work exactly the same.
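
The underlying idea (wrap a string in a file-like object so that a file-oriented CSV parser can read it) is not specific to PHP. As a rough illustration, here is the same pattern in Python, where io.StringIO plays the role of php://temp. The function name parse_csv_text and the sample data are mine.

```python
import csv
import io


def parse_csv_text(csv_text):
    """Parse CSV held in a string by exposing it as an in-memory file."""
    # newline="" leaves line endings untouched so the csv module can
    # correctly handle quoted fields that contain embedded newlines.
    memory_file = io.StringIO(csv_text, newline="")
    return list(csv.reader(memory_file))


if __name__ == "__main__":
    text = 'title,location\r\n"Lecture, Part 1",Room 101\r\n'
    for row in parse_csv_text(text):
        print(row)
```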

2010
07.07

When I was browsing /r/programming earlier this morning, I came across a link to a web application named Tweeter. I played around with it for a while and I think it’s a really awesome application, so I figured I’d write a post about it. :-)

Tweeter is a web application designed for a single purpose: to give people a chance to apply their knowledge of SQL injections to a “real” site. The attacker’s goal is to use his/her knowledge of SQL injections to post as an existing user named agentgill. Once the “hack” is complete, the attacker is directed to a new version of the website, designed with more safeguards and security measures that need to be circumvented. I don’t want to delve into the specifics of the different versions, but there are a total of four levels, each with its own set of challenges that must be overcome.

Screenshot of Tweeter Level 1

I really enjoyed playing with Tweeter. It was a fun challenge and it gave me a chance to reuse some basic SQL injection knowledge I haven’t used in a while. It reminded me a little bit of Jarlsberg, a similar application created by Google to teach people about possible attack vectors in web applications (but which does not demonstrate SQL injections, since it does not use SQL). I believe tools like Tweeter are integral in teaching web application security; learning about SQL injections in class is nowhere near the same experience as being able to exploit them properly on a real website. I’ll definitely be adding it to my bookmarks.

If you’d like to try it out for yourself, you can click on this link to create a new instance on the author’s site.

More information about Tweeter (including a link to download the source) can be found on the author’s blog.

2010
07.03

Recently at work, one of my friends ran into a problem trying to integrate the Google Visualization API and jQuery. His setup seemed fairly straightforward and it didn’t seem like there would be any problems. He wanted the user to select an option from a dropdown box. That selection would trigger a GET request to fetch some JSON (using the getJSON function of jQuery) from his web server. The result of the request would be passed via getJSON’s callback to Google Visualizations, which would then display some nice charts.

Sounds easy, right?

Unfortunately, when he tested his code in Firefox, the visualization he wanted never showed up. All he saw was an empty iFrame where his chart should have been (Google Visualizations draws certain types of charts using SVG in iFrames).

Puzzled, he started debugging. He quickly confirmed that the proper calls to draw the visualization were being made. Yet somehow, they were just returning an empty iFrame instead of one filled with data. Even more puzzling, his code worked fine when he tested it in Google Chrome, Safari, etc: Firefox was the only browser where it failed!

His next step was to try replacing his chart code with Google’s example code, thinking that maybe the issue was with his data. Amazingly, that chart also failed to display properly! However, it also led him to another discovery: the chart drawing would only fail if it was attempted from within getJSON’s callback function. If the chart was drawn before or after, it would work correctly.

Finally, after several people spent hours wrestling with the problem, one of our fellow employees came up with a solution. He created a closure that would call the correct JavaScript functions to draw a graph. Then, he passed that closure to setTimeout with a timeout of 0, which defers the drawing until after getJSON’s callback has returned. It was a fairly elegant workaround: using setTimeout avoided the weird scoping issue (we still don’t understand what the problem was!) and the closure allowed the data to be passed out of the troublesome callback and to the Google Visualization classes. Definitely not the cleanest solution, but very welcome after all that debugging!

2010
06.26

Disclaimer: This is not about hiding or disabling Wordpress’s automatic update notices. If that’s your goal, there are a series of wonderful plugins which will do the job for you (for core, plugin, and theme updates). Instead, this is about making it physically impossible for administrators of a Wordpress installation to use the install/update functionality in the backend of the site.

If you choose to do this, you must make sure to keep up-to-date with new releases and update your site as soon as new versions are released. If you fail to do so, you run an increased risk of security vulnerabilities or performance issues with your website/blog.

Short version

If you create a file named upgrade in the wp-content directory, Wordpress will be unable to download any new plugins/themes or upgrade existing code.

Long version

I needed to figure out how to disable the automatic update feature because I was using SVN to store/deploy the code for several clients’ Wordpress installations. I had a central SVN repository for each client which I pushed changes to and which the websites pulled their code from. Unfortunately, Wordpress deletes all traces of the existing plugin when upgrading; that means the .svn directory inside each plugin’s directory would also be removed, breaking the working copy. If I didn’t disable the feature, someone with administrative access could break the website’s working copy just by upgrading a plugin.

As it so happened, I found an easy solution. All I had to do was create a zero-byte file at wp-content/upgrade (removing the directory with that name if it exists). That directory is used by Wordpress as temporary space when installing/updating code. Placing a file there instead causes the process to fail; Wordpress has no place to store files temporarily. It is sort of a hack, but it works well enough for my purposes; I don’t have to modify the Wordpress core, I just have to create a single file.
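
If you deploy from a script, the step can be automated. The following is a minimal sketch in Python, not something Wordpress provides: the function name block_upgrades is hypothetical, and it assumes the script is allowed to delete whatever is currently inside wp-content/upgrade.

```python
import shutil
import tempfile
from pathlib import Path


def block_upgrades(wp_content):
    """Make sure wp-content/upgrade is a regular file rather than a directory."""
    upgrade = Path(wp_content) / "upgrade"
    if upgrade.is_dir():
        shutil.rmtree(upgrade)   # remove the existing temporary directory first
    upgrade.touch()              # create the placeholder file if it is missing
    return upgrade


if __name__ == "__main__":
    # Demonstrate on a throwaway directory instead of a real installation.
    with tempfile.TemporaryDirectory() as root:
        wp_content = Path(root) / "wp-content"
        (wp_content / "upgrade").mkdir(parents=True)
        placeholder = block_upgrades(wp_content)
        print(placeholder.is_file(), placeholder.stat().st_size)
```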