07.15
Everyone who has worked with PHP should be familiar with the bare fundamentals of its syntax: an opening PHP tag followed by code and (optionally) followed by a closing PHP tag. Incredibly though, there isn’t just one set of tags that can be used to invoke the PHP interpreter on a block of code: there are a total of four separate sets! Each set of tags has a slightly different set of behaviors and awareness of those differences is crucial in preventing certain types of security vulnerabilities.
My goal here is to lay out the four different sets of tags, describe what makes them special, and finally to explain why knowing about them is so important.
“Standard” PHP
This is the syntax that most people are familiar with; accordingly, it’s what you see most often within PHP code. The opening tag is defined to be <?php and the closing tag is ?>. This set of tags is always usable within PHP.
    <?php
    // code goes here
    ?>
PHP Short Tags
This is another fairly well-known syntax. The opening tag is shortened to just <?, matching up nicely with the closing tag. There’s even an abbreviated syntax for echoing using the opening tag <?=. Unfortunately, these types of tags conflict with XML documents due to the documents’ use of <?xml.
PHP can be configured to accept or ignore these tags using the short_open_tag directive in php.ini. (Since PHP 5.4, the <?= echo form is always available regardless of this setting.) As a result, they’re considered non-portable; if you’re writing code that’s intended to be used by others or that may be used in environments where you can’t control PHP settings, you’re encouraged to forgo short tags in favor of standard tags.
    <?
    // code goes here
    ?>
    <?= $var ?>
“ASP-style” Short Tags
This is a less well-known syntax. It behaves in a similar fashion to regular short tags; the only difference is that the opening and closing tags are <% and %>, respectively. This change avoids conflicting with XML documents.
PHP can be configured to accept or ignore these tags using the asp_tags directive in php.ini. As a result, they’re also considered non-portable. They are used much less frequently than regular PHP short tags.
    <%
    // code goes here
    %>
    <%= $var %>
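Since both short_open_tag and asp_tags are php.ini settings, a script can ask the running interpreter how they are set. The following is a minimal sketch, not a definitive implementation: tag_directive_state() is a hypothetical helper name, and it assumes that a directive the interpreter doesn’t know about should be reported as unavailable rather than off. Scalar type declarations are left out so the snippet also runs on older PHP versions.

```php
<?php
// Hypothetical helper: report how a boolean php.ini directive is set.
// Returns 'on', 'off', or 'unavailable' when this PHP build has no such directive.
function tag_directive_state($directive)
{
    $raw = ini_get($directive); // false when the directive does not exist
    if ($raw === false) {
        return 'unavailable';
    }
    // ini_get() returns strings such as "1", "", "On", or "Off" for boolean settings.
    return filter_var($raw, FILTER_VALIDATE_BOOLEAN) ? 'on' : 'off';
}

foreach (array('short_open_tag', 'asp_tags') as $directive) {
    echo $directive, ': ', tag_directive_state($directive), PHP_EOL;
}
```

Running this on each environment you deploy to is a quick way to see whether code that relies on either short form will be parsed there.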
PHP <script> Tags
This is possibly the least well-known of the four sets of tags: I’ve never used it personally and I’ve only seen it referenced in very old books on PHP. It behaves just like the normal PHP tags, though. Every PHP installation before 7.0 will parse PHP code written in this format, and there is no way to disable that behavior; PHP 7.0 removed these tags along with the ASP-style ones.
    <script language="php">
    // code goes here
    </script>
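Which of the four forms a given interpreter actually treats as an opening tag depends on its version and configuration. One way to find out is to ask PHP’s own tokenizer. The sketch below assumes the tokenizer extension is loaded (it is enabled by default); has_php_open_tag() and the sample strings are hypothetical names chosen for illustration.

```php
<?php
// Hypothetical helper: does the running interpreter see an opening PHP tag
// anywhere in $source? Text outside PHP tags is lexed as inline HTML instead.
function has_php_open_tag($source)
{
    foreach (token_get_all($source) as $token) {
        if (is_array($token)
            && in_array($token[0], array(T_OPEN_TAG, T_OPEN_TAG_WITH_ECHO), true)) {
            return true;
        }
    }
    return false;
}

// One sample per tag style described in this post.
$samples = array(
    'standard'  => '<?php echo 1; ?>',
    'short'     => '<? echo 1; ?>',
    'asp-style' => '<% echo 1; %>',
    'script'    => '<script language="php"> echo 1; </script>',
);

foreach ($samples as $name => $source) {
    echo str_pad($name, 10), has_php_open_tag($source) ? 'recognized' : 'ignored', PHP_EOL;
}
```

The standard form should always be reported as recognized; the result for the other three depends on the PHP version and php.ini settings of the machine running the script.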
Why Different Tags Matter
This example is based on an actual security vulnerability I encountered in a live application.
Let’s say you write an application that wants to read in user-supplied data via include/require (generally a bad idea, but there are applications out there that do this). You, as a smart PHP programmer, realize that you have a potential security vulnerability on your hands: all a user needs to do is write an opening PHP tag and they can execute arbitrary code! So, you decide to filter their input: you reject their input if it contains <% or <? anywhere in it. Maybe you even turn asp_tags off in php.ini, since you never plan to use them anyway. That takes care of <%, <?, <?php, and any special echoing syntax that they might provide. You’re safe and secure now, right?
Oh, wait. There’s a fourth set of tags you forgot about. <script language="php"> is not very well-known and it doesn’t look like the other sets of tags. So, it’s both harder to remember and harder to defend against than any of the other tags. A blacklist like the one above would still allow the <script> tags through, allowing for arbitrary PHP code execution.
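To make the gap concrete, here is a rough sketch of the kind of blacklist described above. It is not the code from the application in question; passes_blacklist() and the payload strings are hypothetical, and the check is a plain substring search.

```php
<?php
// Hypothetical filter modeled on the blacklist described above:
// input is rejected if it contains either short opening sequence.
function passes_blacklist($input)
{
    return strpos($input, '<%') === false
        && strpos($input, '<?') === false;
}

// One payload per tag style.
$payloads = array(
    '<?php phpinfo(); ?>',
    '<?= $secret ?>',
    '<% phpinfo(); %>',
    '<script language="php"> phpinfo(); </script>',
);

foreach ($payloads as $payload) {
    echo passes_blacklist($payload) ? 'ALLOWED  ' : 'rejected ', $payload, PHP_EOL;
}
```

The first three payloads are rejected because each contains <? or <%. The <script> payload contains neither sequence, so the filter lets it through.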
Now, although this is a very good argument for why developers shouldn’t rely on blacklists for security and why applications should not call include/require on files created from user input, both of those things do happen. Security-conscious developers need to be aware of all of these risks and loopholes so they can be properly mitigated.
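One mitigation that sidesteps tag filtering altogether is to never hand user-supplied files to the interpreter in the first place. The following is a minimal sketch under the assumption that the application only needs to display the data; render_user_file() is a hypothetical helper, not something from the application discussed above.

```php
<?php
// Hypothetical helper: show a user-supplied file as inert text.
// file_get_contents() only reads bytes; unlike include/require, it never
// passes them to the PHP interpreter, whatever tags they contain.
function render_user_file($path)
{
    $contents = file_get_contents($path);
    if ($contents === false) {
        return '';
    }
    // Escape for HTML output so the browser does not interpret markup either.
    return htmlspecialchars($contents, ENT_QUOTES, 'UTF-8');
}

// Usage: write a hostile payload to a temporary file and render it safely.
$path = tempnam(sys_get_temp_dir(), 'tags');
file_put_contents($path, '<script language="php"> phpinfo(); </script>');
echo render_user_file($path), PHP_EOL;
unlink($path);
```

Because the contents are treated purely as data, it no longer matters which of the four tag styles an attacker tries.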