Preserve encoded slashes in URL

To use a reserved character in a URL without invoking its special meaning, the character must be URL encoded (percent-encoded). For example, a ? separates the path from the query string, an & separates query string parameters, and an = separates a parameter name from its value. These characters must be converted to %3F, %26, and %3D respectively if we want them interpreted literally.
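As a quick illustration of that mapping, here is a minimal sketch using PHP's built-in rawurlencode() and rawurldecode(). The variable names are only for this example.

```php
<?php
// Reserved characters discussed above, plus the forward slash.
$reserved = ['?', '&', '=', '/'];

// rawurlencode() percent-encodes each character.
$encoded = array_map('rawurlencode', $reserved);

foreach ($reserved as $i => $char) {
    echo $char . ' => ' . $encoded[$i] . "\n";
}

// Encoding a whole value and decoding it again gives back the original.
$original = '=/&?';
$safe = rawurlencode($original);
$roundTrip = rawurldecode($safe);

echo $safe . "\n";
echo $roundTrip . "\n";
```

The encoded value is safe to drop into a single path segment or query string value, as long as nothing decodes it before the URL is split into its parts.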

In MVC applications, these special characters are less common because URLs follow a pattern like /Module/Controller/Action/Parameter1/Parameter2. In this case, the most important reserved character is the forward slash. Let's see how to handle slashes in Zend Framework 1 URLs.

Encoding in a basic PHP application

A basic PHP-on-IIS application handles encoded characters well. Let's review the output of phpinfo().

Literal URL: http://localhost/index.php?param=value&%2F%3F%26%3D=%3D%2F%26%3F
_GET["param"] value
_GET["/?&="] =/&?
_SERVER["REQUEST_URI"] /index.php?param=value&%2F%3F%26%3D=%3D%2F%26%3F
_SERVER["QUERY_STRING"] param=value&%2F%3F%26%3D=%3D%2F%26%3F

Note that the characters have been decoded in the $_GET superglobal, but the encoded version remains intact in the $_SERVER superglobal.
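The same split between raw and decoded values can be reproduced outside of a web server. This is a rough sketch that uses parse_str() as a stand-in for the way PHP fills $_GET; the $queryString and $params names are assumptions for the example.

```php
<?php
// Stand-in for $_SERVER['QUERY_STRING']: the raw, still-encoded query string.
$queryString = 'param=value&%2F%3F%26%3D=%3D%2F%26%3F';

// parse_str() splits on & and = first, then decodes each name and value,
// which is why the encoded delimiters do no harm here.
parse_str($queryString, $params);

var_dump($params);       // decoded names and values, like $_GET
var_dump($queryString);  // untouched, like $_SERVER['QUERY_STRING']
```

The order matters: the string is split while the reserved characters are still encoded, and only the individual pieces are decoded.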

Encoding in an MVC PHP application

Compare this behavior to an MVC PHP application hosted on IIS.

Literal URL: http://localhost/param/value/%2F%3F%26%3D/%3D%2F%26%3F
_GET["?"] no value
_SERVER["REQUEST_URI"] /param/value/?&=/=/&?
_SERVER["QUERY_STRING"] &=/=/&?

This time the URL interpretation is all wrong. The encoded characters were decoded before IIS parsed the URL, so they were treated as query string delimiters. None of the PHP superglobals contain the encoded URL that appears in the browser bar. How can we pass encoded characters to the PHP MVC application?
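To see why decoding too early is so destructive, we can imitate it in plain PHP. This is only a simulation of the behavior described above, not IIS's actual code, and the variable names are made up for the example.

```php
<?php
// The URL path as it appears in the browser bar.
$requested = '/param/value/%2F%3F%26%3D/%3D%2F%26%3F';

// Simulate the server decoding the whole URL before splitting it.
$decodedTooEarly = rawurldecode($requested);

// Now the first literal ? looks like the start of a query string.
$parts = explode('?', $decodedTooEarly, 2);
$path = $parts[0];
$query = $parts[1] ?? '';

// Parse the accidental query string the way PHP would fill $_GET.
parse_str($query, $params);

echo $path . "\n";   // this sketch does not collapse the duplicate slash
echo $query . "\n";
var_dump($params);
```

The accidental query string and the lone "?" parameter match the phpinfo() output above. Once the decoded string has been split, there is no way to tell which slashes and delimiters were originally encoded.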

Rewriting to the Rescue

The solution lies in our URL rewriter. On IIS, my tool of choice is Ionic's ISAPI Rewrite Filter (IIRF). To funnel MVC requests to the Zend Framework I use the following rules.

# IIRF.ini
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !\. [I]
RewriteRule ^[^?]*(\?.*|$) /index.php$1 [L]

These rules include some enhancements beyond the ones you might find online. Namely, the RewriteRule carries any query string over to index.php, which lets me use GET forms in my MVC application. But these customizations don't solve the encoding problem.
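To check what the RewriteRule pattern does with a query string, we can try the same regular expression in PHP. This sketch assumes the pattern behaves the same under PHP's PCRE functions as it does in the rewriter, and it ignores the RewriteCond lines. The rewriteToFrontController() helper is hypothetical.

```php
<?php
// Hypothetical helper: applies the RewriteRule's pattern and replacement.
function rewriteToFrontController(string $url): string
{
    // ^[^?]*     everything up to the first ?
    // (\?.*|$)   either the query string (captured) or the end of the URL
    return preg_replace('#^[^?]*(\?.*|$)#', '/index.php$1', $url);
}

// A GET form submission keeps its parameters.
echo rewriteToFrontController('/search/results?q=zend&page=2') . "\n";

// A URL without a query string is simply sent to index.php.
echo rewriteToFrontController('/param/value') . "\n";
```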

To preserve the encoded characters, we need only add a modifier to the RewriteRule.

# IIRF.ini
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !\. [I]
RewriteRule ^[^?]*(\?.*|$) /index.php$1 [L,U]

The U modifier will save the original URL, before IIS decodes it, to the HTTP_X_REWRITE_URL server variable.

Literal URL: http://localhost/param/value/%2F%3F%26%3D/%3D%2F%26%3F
_SERVER["HTTP_X_REWRITE_URL"] /param/value/%2F%3F%26%3D/%3D%2F%26%3F

When the Zend Framework (Zend_Controller_Request_Http::setRequestUri()) parses the URL to determine the route and parameters, it will first look for the HTTP_X_REWRITE_URL variable containing the preserved URL. Voilà! Our MVC application sees our encoded characters.
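If you need the preserved URL outside of the framework's router, the same idea is easy to apply by hand. The following is a minimal sketch, not Zend Framework's own code: preservedRequestUri() and decodePathSegments() are hypothetical helpers, and the $server array stands in for $_SERVER.

```php
<?php
// Hypothetical helper: prefer the URL saved by the rewriter, else REQUEST_URI.
function preservedRequestUri(array $server): string
{
    if (!empty($server['HTTP_X_REWRITE_URL'])) {
        return $server['HTTP_X_REWRITE_URL'];
    }
    return $server['REQUEST_URI'] ?? '/';
}

// Hypothetical helper: split the path on / first, then decode each segment,
// so an encoded slash stays inside its own segment.
function decodePathSegments(string $uri): array
{
    $path = explode('?', $uri, 2)[0];
    $segments = explode('/', trim($path, '/'));
    return array_map('rawurldecode', $segments);
}

// Stand-in for $_SERVER, using the values shown in this post.
$server = [
    'REQUEST_URI'        => '/param/value/?&=/=/&?',
    'HTTP_X_REWRITE_URL' => '/param/value/%2F%3F%26%3D/%3D%2F%26%3F',
];

$segments = decodePathSegments(preservedRequestUri($server));
print_r($segments);
```

Splitting before decoding is the whole trick: the encoded slash in the third segment never gets a chance to act as a path separator.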

Drew

Hi! I'm Drew, the Wimpy Programmer. I'm a software developer and formerly a Windows server administrator. I use this blog to share my mistakes and ideas.