Is XXE the new SQLi?
Many modern applications use XML documents to transfer data between clients and servers. XML is simple and flexible, which is why it is so often used to represent all kinds of data structures.
Typically a rich web application will use XML to send data from the web browser to the server side. This XML document, which might contain various data structures related to the web application, is then processed on the server side. Here we can see a typical problem with untrusted input: since an attacker controls everything on the client side, they can tamper with the XML document that is submitted to the server. Generally this should not be a problem unless the following happens.
Since the web application (on the server side) receives the XML document we just sent, it has to somehow parse it. How it is parsed depends on the framework used on the server side: most often (especially for business applications) this is Java or .NET, each with its own XML libraries, while other frameworks typically rely on libxml2.
The security problem here lies in a special structure defined by the XML standard: the entity. Every entity has certain content, and entities are normally referenced throughout the XML document. However, one specific entity type is particularly dangerous: the external entity.
An external entity can be declared in one of two ways: SYSTEM or PUBLIC. The SYSTEM external entity is the one we are interested in: it lets one specify a URI, and when the entity is dereferenced the parser replaces it with whatever that URI returns. One example of such an entity is shown below:
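To make the substitution mechanism concrete, here is a minimal sketch using a harmless internal entity and Python's standard library parser. The entity name `site`, the element names and the sample text are made up for illustration; they are not from any real application.

```python
import xml.etree.ElementTree as ET

# A document that declares one internal entity and references it twice.
# Element and entity names are hypothetical.
DOC = """<?xml version="1.0"?>
<!DOCTYPE diary [
  <!ENTITY site "SANS Internet Storm Center">
]>
<diary><source>&site;</source><footer>Posted on &site;</footer></diary>"""

root = ET.fromstring(DOC)

# The parser has already replaced every &site; reference with the entity's content.
source_text = root.find("source").text
footer_text = root.find("footer").text

print(source_text)
print(footer_text)
```

The application code never sees the `&site;` reference, only the expanded text. An external entity is referenced in exactly the same way; the difference is where the replacement content comes from.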
<!ENTITY ISCHandler
SYSTEM "https://isc.sans.edu/api/handler">
Now when parsing such an XML document, wherever we have the ISCHandler entity, the parser will replace it with the contents of the retrieved URI. Pretty bad already, isn’t it? But it gets even worse – by exploiting this we can include any local file by simply pointing to it:
<!ENTITY ISCHandler
SYSTEM "file:///etc/passwd">
Or on Windows:
<!ENTITY ISCHandler
SYSTEM "file:///C:/boot.ini">
As long as the process running the parser has privileges to read the requested file, the XML parser will simply retrieve it and insert its contents wherever it finds a reference to the ISCHandler entity (&ISCHandler;).
The impact of such a vulnerability is pretty obvious, but an attacker can do much more than read a single file. Just use your imagination:
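For anyone who wants to see what such a document looks like as a whole, the sketch below wraps the declaration in a complete XML document and uses Python's expat bindings to list the external entities it declares. The element names and the helper `list_external_entities` are hypothetical. No external entity handler is installed, so in this sketch nothing is actually fetched or read from disk.

```python
from xml.parsers import expat

# A complete document carrying the entity from the example above.
# The <diary> and <handler> element names are made up for illustration.
DOC = """<?xml version="1.0"?>
<!DOCTYPE diary [
  <!ENTITY ISCHandler SYSTEM "file:///etc/passwd">
]>
<diary><handler>&ISCHandler;</handler></diary>"""


def list_external_entities(xml_text):
    """Return {entity name: system URI} for external entities declared in xml_text."""
    found = {}
    parser = expat.ParserCreate()

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        # Internal entities carry a value; external ones carry a system identifier.
        if system_id is not None:
            found[name] = system_id

    parser.EntityDeclHandler = on_entity_decl
    # No ExternalEntityRefHandler is set, so the reference is not resolved.
    parser.Parse(xml_text, True)
    return found


declared = list_external_entities(DOC)
print(declared)
```

A check like this can be handy when reviewing captured requests during a test, since it shows which URIs a permissive parser would have been asked to dereference.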
- We can probably DoS the application by reading from a file such as /dev/random or /dev/zero.
- We can port scan internal IP addresses, or at least try to find internal web servers.
Probably the most dangerous “feature” is extraction of data: similarly to how we would pull data from a database with a SQL Injection vulnerability, we can read (almost) any file on the disk. This includes not only the password file but potentially more sensitive files, such as those holding DB connection parameters. Depending on the framework, pointing the entity at a directory will even give us the directory’s listing, so we don’t have to blindly guess file names! Very nasty.
None of this is really new; however, in the last X pentesting engagements XXE vulnerabilities started popping up for some reason. So what is the reason for that? First of all, some libraries allow external entity definitions by default. In such cases, the developer has to explicitly disable inclusion of external entities. This is probably still valid for Java XML libraries, while with .NET 4.0 Microsoft changed the default behavior to not allow external entities (in .NET 3.5 they are allowed).
So, while XXE will not become as dangerous as SQLi (hey, the subject was there to get your attention), we do find XXE vulnerabilities more often than before, which means that we should raise awareness about them.
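As an illustration of what "explicitly disable" can look like, here is a minimal Python sketch that refuses any document carrying a DOCTYPE declaration, which removes entity declarations altogether. The exception class and the function name are hypothetical, and this is one possible approach rather than the recommended setting for any particular framework. In Java the same idea is commonly expressed by enabling the parser's `disallow-doctype-decl` feature.

```python
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when an incoming document carries a DOCTYPE declaration."""


def parse_without_dtd(xml_text):
    """Collect character data, refusing any document that declares a DTD."""
    chunks = []
    parser = expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Entities can only be declared inside a DTD, so rejecting the
        # DOCTYPE stops the parse before any entity declaration is read.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_text, True)
    return "".join(chunks)


print(parse_without_dtd("<order><item>book</item></order>"))

try:
    parse_without_dtd(
        '<!DOCTYPE o [<!ENTITY x SYSTEM "file:///etc/passwd">]><o>&x;</o>'
    )
except DoctypeForbidden as exc:
    print("rejected:", exc)
```

Rejecting the DOCTYPE outright is stricter than only turning off external entities, so it only fits applications that never need a DTD in client-supplied documents.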
Keywords: xxe
Comments
Remote file inclusion on web servers is a problem not just with XML, but with PHP as well. Egress filtering of traffic leaving the servers, at the firewall or at least locally via iptables, is a fabulous idea.
In general... there's no reason web servers should be opening connections and downloading files from random servers :)
E.g.
iptables -A OUTPUT -m owner --uid-owner root -d X.Y.Z.W/24 -j ACCEPT
iptables -A OUTPUT -m owner -p tcp --dport 80:443 --uid-owner avupdates -j ACCEPT
iptables -A OUTPUT -p udp --dport 53 -d [local DNS server] -j ACCEPT
iptables -A OUTPUT -p tcp --dport 25 -d [local smtp server] -j ACCEPT
....
iptables -A OUTPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -j REJECT --reject-with icmp-port-unreachable
Anonymous
Jan 9th 2014
1 decade ago
Anonymous
Jan 9th 2014
1 decade ago
Interesting... I guess people forget that an attacker's first arbitrary command is often wget or curl, used to download a binary exploit file to a temporary directory, or a C program that is then compiled with the GCC already installed on the box.
Breaking outgoing wget commands, making sure non-admins cannot use GCC by restricting access to unnecessary binaries, and mounting all world-writable temporary directories and web content directories noexec will not make the system unhackable, but they are definitely important items for the security hardening list :)
Anonymous
Jan 9th 2014
1 decade ago
Admins and operational accounts shouldn't have development access as a matter of course. Do that kind of thing as an exception.
Anonymous
Jan 10th 2014
1 decade ago
I am not sure we're in disagreement. The web applications running on a server should only have access to the system binaries that are absolutely required to run the application. Any extra binaries on the system could be used by an attacker.
Tools like compilers, assemblers, or even the 'chmod' command can be used to adapt exploit code to the system; without them the attacker would have a much higher burden.
Development activities should not occur on operational web servers, so developers don't even need access to those; they belong on dev servers that accept TCP connections only from the developer who owns them. But compilers are commonly used in the software installation and deployment process, including the building and installation of dependencies, or the building of RPMs from sources used to install dependencies. Root always has access to any installed compilers, but admins will and should use dedicated non-root accounts for compiling software received from 3rd party sources.
Anonymous
Jan 10th 2014
1 decade ago
Anonymous
Jan 17th 2014
1 decade ago