This post follows on from “what’s running that - cloud load balancer ID from hostname”, an exploration of how we might detect the cloud load balancer serving a particular hostname. In the last episode we managed to differentiate between AWS API Gateway, Application Load Balancer, and Network Load Balancer, but relied a lot on patterns in the DNS records AWS uses. In some cases those records aren’t visible, so we want something a bit more robust.
Let’s try and do better! Once again, this’ll be an exploration into what else we can find out, and may not end up anywhere particularly “useful” 🙂
Prior Art
For inspiration I turned to nmap, which can do a bunch of things, including fingerprinting servers. I ran a few different incantations against an HTTPS server running on my LAN to see what it’s up to:
… this produces a lot of output. But one thing caught my eye - one of the things http-server-header does is send an HTTP/1.0 request to the server without a Host header set:
… I wonder if we get anything interesting back in the response headers from the AWS load balancers if we do this? We can try this pretty easily with curl:
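The same experiment can be sketched in Go. This is a minimal illustration rather than the app's actual code: the names http10Request, readHeaders, serverHeaderFromRaw and probeHTTP10 are made up for this example, and it assumes the target listens for TLS on port 443. Note that the hostname is still sent as TLS SNI so the handshake can complete; only the HTTP Host header is left out.

```go
package main

import (
	"bufio"
	"crypto/tls"
	"fmt"
	"net"
	"net/http"
	"os"
	"strings"
	"time"
)

// http10Request is a bare HTTP/1.0 request: a request line and nothing else,
// so no Host header is sent.
const http10Request = "GET / HTTP/1.0\r\n\r\n"

// readHeaders parses one HTTP response from r and returns its headers.
func readHeaders(r *bufio.Reader) (http.Header, error) {
	resp, err := http.ReadResponse(r, nil)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return resp.Header, nil
}

// serverHeaderFromRaw returns the Server header of a raw response string,
// or "" if the header is absent. Handy for trying the parser offline.
func serverHeaderFromRaw(raw string) (string, error) {
	h, err := readHeaders(bufio.NewReader(strings.NewReader(raw)))
	if err != nil {
		return "", err
	}
	return h.Get("Server"), nil
}

// probeHTTP10 opens a TLS connection to host:443 (an assumption of this
// sketch), writes the bare HTTP/1.0 request and returns the response headers.
func probeHTTP10(host string, timeout time.Duration) (http.Header, error) {
	dialer := &net.Dialer{Timeout: timeout}
	conn, err := tls.DialWithDialer(dialer, "tcp", net.JoinHostPort(host, "443"),
		&tls.Config{ServerName: host})
	if err != nil {
		return nil, err
	}
	defer conn.Close()
	if err := conn.SetDeadline(time.Now().Add(timeout)); err != nil {
		return nil, err
	}
	if _, err := conn.Write([]byte(http10Request)); err != nil {
		return nil, err
	}
	return readHeaders(bufio.NewReader(conn))
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: probe <hostname>")
		return
	}
	headers, err := probeHTTP10(os.Args[1], 5*time.Second)
	if err != nil {
		fmt.Fprintln(os.Stderr, "probe failed:", err)
		os.Exit(1)
	}
	fmt.Printf("Server header: %q\n", headers.Get("Server"))
}
```

Writing the request bytes directly onto the connection sidesteps net/http's client, which would otherwise add a Host header for us.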
Interestingly enough, an Edge API Gateway API will return a Server: Cloudfront header! No other combination of load balancer and HTTP version returns a Server header. I reckon we now have enough to identify Edge and Regional API Gateway endpoints without having to use hostnames:
- HTTP/1.1 response includes x-amz-apigw-id and HTTP/1.0 response includes Server: Cloudfront header? -> Edge API Gateway
- HTTP/1.1 response includes apigw-requestid and HTTP/1.0 response does not include Server: Cloudfront header? -> Regional API Gateway
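Written down as code, the two rules might look something like the sketch below. The function classifyAPIGateway and the headers helper are hypothetical names for this example only, and the Server value is compared case-insensitively as a precaution.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// classifyAPIGateway applies the two rules to the headers returned by the
// HTTP/1.1 probe and the HTTP/1.0 (no Host header) probe.
func classifyAPIGateway(http11, http10 http.Header) string {
	viaCloudFront := strings.EqualFold(http10.Get("Server"), "CloudFront")
	switch {
	case http11.Get("x-amz-apigw-id") != "" && viaCloudFront:
		return "Edge API Gateway"
	case http11.Get("apigw-requestid") != "" && !viaCloudFront:
		return "Regional API Gateway"
	default:
		return "unknown"
	}
}

// headers builds an http.Header from alternating key/value pairs.
// It exists only to keep the example short.
func headers(kv ...string) http.Header {
	h := http.Header{}
	for i := 0; i+1 < len(kv); i += 2 {
		h.Set(kv[i], kv[i+1])
	}
	return h
}

func main() {
	// The header values here are placeholders, not real responses.
	edge := classifyAPIGateway(
		headers("x-amz-apigw-id", "placeholder"),
		headers("Server", "CloudFront"),
	)
	regional := classifyAPIGateway(
		headers("apigw-requestid", "placeholder"),
		headers(),
	)
	fmt.Println(edge)
	fmt.Println(regional)
}
```

Anything that matches neither rule falls through to "unknown", which is where ALBs and NLBs would end up with these two probes alone.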
Adding the HTTP/1.0 Probe
By looking at nmap, we’ve learnt that we can get more signal for our load balancer fingerprinting by making different sorts of HTTP requests. Let’s modify our Go app so that we have a pattern for easily adding new probes, in the same fashion as we add different load balancer classifiers; extractDomainInfo is already getting difficult to deal with, and this seems like a good moment.
First I’ll add an interface for the probes themselves:
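As a rough sketch of the shape such an interface could take: the names Probe, ProbeResult, RunProbes and staticProbe below are illustrative assumptions, not necessarily what the app uses.

```go
package main

import (
	"errors"
	"fmt"
)

// ProbeResult holds whatever key/value signals a probe collected.
type ProbeResult map[string]string

// Probe is one way of interrogating a hostname (HTTP/1.1, HTTP/1.0, CNAME...).
type Probe interface {
	Name() string
	Run(hostname string) (ProbeResult, error)
}

// RunProbes runs every probe against hostname and collects the results by
// probe name. A failing probe is simply left out instead of aborting the run.
func RunProbes(hostname string, probes []Probe) map[string]ProbeResult {
	results := make(map[string]ProbeResult, len(probes))
	for _, p := range probes {
		res, err := p.Run(hostname)
		if err != nil {
			continue
		}
		results[p.Name()] = res
	}
	return results
}

// errProbeFailed is a placeholder error for the demo probe below.
var errProbeFailed = errors.New("probe failed")

// staticProbe is a stand-in that returns canned data, so the example runs
// without touching the network.
type staticProbe struct {
	name   string
	result ProbeResult
	err    error
}

func (s staticProbe) Name() string { return s.name }

func (s staticProbe) Run(string) (ProbeResult, error) { return s.result, s.err }

func main() {
	results := RunProbes("example.com", []Probe{
		staticProbe{name: "http10", result: ProbeResult{"server": "CloudFront"}},
		staticProbe{name: "cname", err: errProbeFailed},
	})
	fmt.Println(len(results), results["http10"]["server"])
}
```

Keying the results by probe name means a classifier can ask for exactly the probes it cares about and treat a missing entry as "no signal".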
Then we can take extractDomainInfo, break it apart, and make the existing probes it performs (HTTP/1.1 and CNAME) fit this interface. Something like this:
… and for our CNAME probe:
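For illustration, a CNAME probe along these lines could wrap net.LookupCNAME. The CNAMEProbe type, its Lookup field and the "cname" result key are assumptions made for this sketch; the injectable lookup function is there so the probe can be exercised without real DNS.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strings"
)

// CNAMEProbe resolves the canonical name for a hostname.
type CNAMEProbe struct {
	// Lookup defaults to net.LookupCNAME when nil; tests can swap it out.
	Lookup func(host string) (string, error)
}

func (p CNAMEProbe) Name() string { return "cname" }

// Run returns the canonical name, lowercased and without the trailing dot
// that net.LookupCNAME includes on fully qualified names.
func (p CNAMEProbe) Run(hostname string) (map[string]string, error) {
	lookup := p.Lookup
	if lookup == nil {
		lookup = net.LookupCNAME
	}
	cname, err := lookup(hostname)
	if err != nil {
		return nil, err
	}
	cname = strings.ToLower(strings.TrimSuffix(cname, "."))
	return map[string]string{"cname": cname}, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: cname <hostname>")
		return
	}
	res, err := CNAMEProbe{}.Run(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "lookup failed:", err)
		os.Exit(1)
	}
	fmt.Println(res["cname"])
}
```

Normalising the name inside the probe keeps any later suffix matching in the classifiers simple.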
… and our new HTTP/1.0 probe will follow the same pattern:
Finally, we modify our classifiers to pick out what they need from an aggregated collection of all the probe results. Here we can see how ClassifyRegionalAPI implements the logic outlined earlier and can identify a Regional API Gateway endpoint without relying on the hostname!
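A classifier working from aggregated results might look roughly like this. Only the name ClassifyRegionalAPI comes from the app; its signature here, the ProbeResults type, and the "http11" / "http10" keys with lowercased header names are all assumptions of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// ProbeResults maps a probe name to the lowercased headers (or other
// key/value signals) that probe collected.
type ProbeResults map[string]map[string]string

// ClassifyRegionalAPI reports whether the results look like a Regional API
// Gateway: apigw-requestid on the HTTP/1.1 response, and no CloudFront
// Server header on the HTTP/1.0 response. No DNS data is consulted.
func ClassifyRegionalAPI(results ProbeResults) bool {
	http11, ok11 := results["http11"]
	http10, ok10 := results["http10"]
	if !ok11 || !ok10 {
		// Without both probes we cannot tell "header absent" from "probe failed".
		return false
	}
	_, hasRequestID := http11["apigw-requestid"]
	return hasRequestID && !strings.EqualFold(http10["server"], "cloudfront")
}

func main() {
	// Placeholder values standing in for real probe output.
	results := ProbeResults{
		"http11": {"apigw-requestid": "placeholder"},
		"http10": {},
	}
	fmt.Println(ClassifyRegionalAPI(results))
}
```

Requiring both probe results to be present is a deliberately cautious choice: a timed-out HTTP/1.0 probe should not be read as "no Server header".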
Now that we’ve got a nice structure for adding separate probes, let’s see if we can remove the DNS dependency for API Gateway, so we can identify one without having to look at the DNS at all. This will make us robust in the face of APIs that use aliases to “hide” Amazon’s DNS endpoints. Here’s what that looks like for our API Gateway - Edge API classifier:
What’s next?
This doesn’t help us at all with ALBs or NLBs - we get no extra signal from the HTTP/1.0 probe, so we are still stuck on the CNAME rules here. There are some other hints in the nmap scripts from earlier that might be useful - for instance, trying to cluster around the TCP TTLs delivered back by the server. But that feels a bit thin here [1], and we’ll leave it for another day!
As always, you can find the complete app on GitHub.
Footnotes
1. Maybe it is enough to differentiate between load balancers if you know that you have an LB, but it is hard to imagine a pattern in TTLs is going to be enough to say “of all the things on the internet, this is definitely an AWS NLB”.