Scott's Ramblings
Photo by Christophe Hautier / Unsplash

What's running that endpoint?

/ 8 min read

Spending time reading about the tech companies use to build their stacks is something I enjoy. Lots of the big names like Netflix, Dropbox, Meta, and of course Datadog, maintain engineering blogs that are worth a read. But - sometimes I want to know something very particular, and it’s time to turn to the tools - for instance, cracking open web inspector to pull apart a web page. On this miserable European autumn day that “something very particular” is …

Given a URL, can we identify the cloud load balancer running it?

I think this should be straightforward! You already get a lot of signal out of the DNS resolution itself - e.g. ALBs look like bloop-blup.us-east-2.elb.amazonaws.com, NLBs look like bloop-blup.elb.us-east-2.amazonaws.com, and API Gateways look like bloop.execute-api.us-east-2.amazonaws.com.

… but this all feels a bit meh. Ultimately I hope to be able to identify where a host name leads by seeing how the other side talks, not where it is sitting. This is going to be an article going through my attempt to write a tool to solve this. It may not end up where I want it to go, but let’s see.

To start with, let’s try to tell if the thing listening for TLS on the other side of a hostname is one of AWS’ load balancing services …

  • Application Load Balancer
  • Network Load Balancer
  • API Gateway
  • Something else

We’ll try and get there using the SSL negotiation and server headers, but failing that, we can fall back on hostnames as well. Let’s try!

First steps - Edge vs Regional API GW

To get started, I dived into my AWS account and found both edge and regional API gateway URLs. Starting with two similar-ish things lets me work out what tools might be useful on the CLI before I write any code and bounds the problem space a bit. It also means I avoid spending ~hours spinning up AWS infra for the other endpoints and focus on the problem at hand 😄

Let’s start with curl --verbose and curl --trace-ascii output for the / URL on both; this’ll give me a quick indication of what we can see. I’ve included the heavily-summarized output below -

API Gateway - Regional Endpoint

# abridged curl --verbose output
* Host myhost.execute-api.us-east-2.amazonaws.com:443 was resolved.
* IPv4: 1.2.3.4, 5.6.7.8
* Trying 1.2.3.4:443...
* Connected to myhost.execute-api.us-east-2.amazonaws.com (1.2.3.4) port 443
* ALPN: curl offers h2,http/1.1
* SSL connection using TLSv1.3 / AEAD-AES128-GCM-SHA256
* Server certificate: CN=*.execute-api.us-east-2.amazonaws.com
* No CloudFront headers
* Request completed with HTTP/2 404
# abridged curl --trace-ascii output
== Info: Host myhost.execute-api.us-east-2.amazonaws.com:443 was resolved.
== Info: IPv6: (none)
== Info: IPv4: 1.2.3.4, 5.6.7.8
== Info: Trying 1.2.3.4:443...
== Info: Connected to myhost.execute-api.us-east-2.amazonaws.com (1.2.3.4) port 443
== Info: ALPN: curl offers h2,http/1.1
== Info: (304) (OUT), TLS handshake, Client hello (1):
=> Send SSL data, 351 bytes (0x15f)
== Info: (304) (IN), TLS handshake, Server hello (2):
<= Recv SSL data, 90 bytes (0x5a)

Some things jump out:

  • The SSL handshake in both is identical, and the certificates too.
  • The edge API resolves to more IPs than the regional API
  • The edge API comes back with a couple of CloudFront headers - x-amz-cf-pop rightly points out that I am somewhere near Zürich

My heuristic will therefore be:

  • Hostname matches ^[^.]+\.execute-api\.[^.]+\.amazonaws\.com$ - I don’t care about the AWS region or the API name, but there should be an execute-api in the middle
  • At most 2 IPs and no CloudFront headers? => Regional API
  • At least 4 IPs and x-amz-cf-pop header? => Edge API
  • Something else? => Flag it!

Sketching out a CLI tool

Now that I have some differences I can spot, I’m going to start building a little CLI tool to automagically classify APIs. I’m going to use golang because it should be quick and easy, and I’ve not used it so much lately. I’ll show the interesting bits of code as we go, and link the whole thing at the end.

I’ll start with the entrypoint. I want a CLI I can pass a single domain to, and get a classification printed to the output. I’ll start with classification rules for the two API Gateway types now, but I’ll split apart the “connect and collect information” from the “classify” steps, so that we can easily extend to other gateways later. Something like this:

// Info we need for classification. Will extend as we add more classifiers!
type DomainInfo struct {
	Domain              string            // Domain name
	IPs                 []string          // List of IPs
	HttpResponseHeaders map[string]string // Map of HTTP response headers
}

// Extract relevant information from the HTTP response for classification
func extractDomainInfo(domain string) (*DomainInfo, error) {
	// ....
}

// Classifier function types
type ClassifierFunc func(*DomainInfo) string

// List of classifiers
var classifiers = []ClassifierFunc{
	classifyRegionalAPI,
	classifyEdgeAPI,
}

// Main function
func main() {
	// Get the domain name from the CLI
	// ...
	domain := os.Args[1]

	// Extract domain information
	domainInfo, err := extractDomainInfo(domain)
	if err != nil {
		fmt.Printf("Error fetching domain information: %v\n", err)
		return
	}

	// Classify the domain: the first classifier that returns a non-empty result wins
	classification := "Something else"
	for _, classifier := range classifiers {
		result := classifier(domainInfo)
		if result != "" {
			classification = result
			break
		}
	}
	fmt.Println(classification)
}
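The body of extractDomainInfo is elided above. As a rough sketch of the “connect and collect information” step, it could look something like the following. This is an assumption about the shape of the function rather than the code from the repository: the flattenHeaders helper, the 10 second timeout, and the choice not to follow redirects are all hypothetical, and DomainInfo is repeated only so the snippet compiles on its own.

```go
package main

import (
	"net"
	"net/http"
	"strings"
	"time"
)

// DomainInfo mirrors the struct from the article (repeated so this compiles alone).
type DomainInfo struct {
	Domain              string
	IPs                 []string
	HttpResponseHeaders map[string]string
}

// flattenHeaders (hypothetical helper) collapses Go's multi-valued http.Header
// into a plain map, joining repeated values with ", ". Keys keep whatever form
// they have in the input; headers parsed by net/http are in canonical form,
// e.g. "X-Amz-Cf-Pop".
func flattenHeaders(h http.Header) map[string]string {
	out := make(map[string]string, len(h))
	for name, values := range h {
		out[name] = strings.Join(values, ", ")
	}
	return out
}

// extractDomainInfo resolves the domain and issues a single HTTPS GET for "/".
func extractDomainInfo(domain string) (*DomainInfo, error) {
	// LookupHost returns IPv4 and IPv6 addresses together, so an IP-count
	// heuristic built on this sees both families.
	ips, err := net.LookupHost(domain)
	if err != nil {
		return nil, err
	}

	client := &http.Client{
		Timeout: 10 * time.Second, // assumed value
		// Stop at the first response so we see its headers, not a redirect target's.
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	resp, err := client.Get("https://" + domain + "/")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	return &DomainInfo{
		Domain:              domain,
		IPs:                 ips,
		HttpResponseHeaders: flattenHeaders(resp.Header),
	}, nil
}
```

Keeping the header flattening separate from the network calls means that part can be unit tested without touching DNS or HTTP.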

Our classifiers are then pretty straightforward:

// Classifier for API Gateway Edge API
func classifyEdgeAPI(info *DomainInfo) string {
	// Check if the domain matches the API Gateway pattern
	if !apiGatewayRegex.MatchString(info.Domain) {
		return ""
	}

	// Check if there are at least 4 IPs and the required CloudFront headers are present
	if len(info.IPs) >= 4 && info.HttpResponseHeaders["X-Amz-Cf-Pop"] != "" && info.HttpResponseHeaders["Via"] != "" {
		return "Edge API"
	}
	return ""
}
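For completeness, here is a sketch of the two pieces the edge classifier leans on but which aren’t shown: the apiGatewayRegex pattern and the regional counterpart. The pattern follows the hostname rule from the heuristic list, with the dots escaped; the "Regional API" label and the exact checks are my assumptions, and DomainInfo is repeated only so the snippet compiles on its own.

```go
package main

import "regexp"

// DomainInfo mirrors the struct from the article (repeated so this compiles alone).
type DomainInfo struct {
	Domain              string
	IPs                 []string
	HttpResponseHeaders map[string]string
}

// apiGatewayRegex matches <api-id>.execute-api.<region>.amazonaws.com.
// Neither the API ID nor the region is validated beyond "one DNS label".
var apiGatewayRegex = regexp.MustCompile(`^[^.]+\.execute-api\.[^.]+\.amazonaws\.com$`)

// classifyRegionalAPI applies the "at most 2 IPs and no CloudFront headers" rule.
// The returned label is an assumed name, chosen to sit alongside "Edge API".
func classifyRegionalAPI(info *DomainInfo) string {
	// Check if the domain matches the API Gateway pattern
	if !apiGatewayRegex.MatchString(info.Domain) {
		return ""
	}

	// A missing key in the header map reads as "", so this also covers a nil map
	if len(info.IPs) <= 2 && info.HttpResponseHeaders["X-Amz-Cf-Pop"] == "" {
		return "Regional API"
	}
	return ""
}
```

An endpoint that matches the hostname pattern but neither rule (say, 3 IPs) falls through both classifiers and ends up as “Something else”, which is the “flag it” case from the list.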

I’ve thrown in a few basic unit tests as well to make sure I don’t break anything without noticing, and that’s that! Let’s move on to identifying some of the other load balancer types.

ALBs, NLBs

ALBs always terminate HTTP or HTTPS, whereas NLBs can carry anything over TCP or UDP. If we get an endpoint that talks something other than HTTP and looks like an AWS load balancer URL, we can assume it’s an NLB, but we’ll just ignore that case for now and focus on TLS.

Based on what we used above to differentiate between the two API gateways, here’s what we get from an ALB and NLB:

ALB Salient Bits

URL: bloop.us-east-1.elb.amazonaws.com
IPv4: 1.2.3.4, 2.3.4.5, 3.4.5.6
IPv6: [enormous IPv6] x 4
TLS: SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256 / [blank] / UNDEF
Server Cert: [Custom Cert, Not Amazon]
Response Headers: content-length, location, x-content-type-options, strict-transport-security

NLB Salient Bits

URL: bloob.elb.us-east-1.amazonaws.com
IPv4: 1.2.3.4, 2.3.4.5, 3.4.5.6
IPv6: [enormous IPv6] x 3
TLS: SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256 / [blank] / UNDEF
Server Cert: [Custom Cert, Not Amazon]
Response Headers: location, x-content-type-options, strict-transport-security, date

This is a bit bleak. None of the response headers look load balancer specific - these are almost certainly things the workload on the other side has set.

Thinking harder about this, we’re a bit stuck with the NLB in general. An NLB can be run as a pass-through TCP load balancer back to any arbitrary workload. When it’s running in this TCP proxy mode we can’t infer anything about the load balancer, because it is essentially transparent 1.

So we’re back to hostnames for ALBs and NLBs, unfortunately. Our classifiers end up looking like this:

// Classifier for ALB
func classifyALB(info *DomainInfo) string {
	// Check if the domain matches the ALB pattern
	if !albRegex.MatchString(info.Domain) {
		return ""
	}
	// The hostname is the only signal we have, so a match is enough
	return "ALB"
}
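The hostname patterns themselves aren’t shown above, so here is a sketch of what albRegex and an NLB equivalent might look like. They are assumptions derived purely from the two hostname shapes quoted at the start of the article (region before elb for ALBs, elb before region for NLBs); the nlbRegex and classifyNLB names and the "NLB" label are hypothetical. As far as I know, Classic Load Balancers share the ALB hostname shape, so a match on albRegex can’t really tell those two apart.

```go
package main

import "regexp"

// DomainInfo mirrors the struct from the article (repeated so this compiles alone).
type DomainInfo struct {
	Domain              string
	IPs                 []string
	HttpResponseHeaders map[string]string
}

var (
	// ALB shape: <name>.<region>.elb.amazonaws.com
	albRegex = regexp.MustCompile(`^[^.]+\.[^.]+\.elb\.amazonaws\.com$`)
	// NLB shape: <name>.elb.<region>.amazonaws.com
	nlbRegex = regexp.MustCompile(`^[^.]+\.elb\.[^.]+\.amazonaws\.com$`)
)

// classifyNLB mirrors classifyALB: the hostname is the only signal used.
func classifyNLB(info *DomainInfo) string {
	// Check if the domain matches the NLB pattern
	if !nlbRegex.MatchString(info.Domain) {
		return ""
	}
	return "NLB"
}
```

Because the two patterns differ only in where the elb label sits, neither can match the other’s hostnames, so the order of the classifiers in the list doesn’t matter for this pair.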

Now that we’re so dependent on the domain, it’s worth thinking a bit harder here. So far we’ve dealt directly with the AWS-owned domain names for each load balancer. For this to work reliably and to address my original question (“where’s this ${thing} running?”), we need to do CNAME resolution to get back to the actual domain name. A CNAME is a DNS record that tells you to go look up another DNS record, and is a typical way of “aliasing” things back to cloud infrastructure. If I set up an API Gateway at myapi.scottgerring.com, I’d CNAME it back to the .amazonaws.com hostname. We can chase back the CNAME like this:

func resolveCNAME(domain string) (string, error) {
	// Perform a CNAME lookup; this follows the chain to the final canonical name
	cname, err := net.LookupCNAME(domain)
	if err != nil {
		// If the lookup fails, fall back to the original domain
		return domain, nil
	}
	return strings.TrimSuffix(cname, "."), nil // Remove trailing dot from CNAME
}

There’s one other thing haunting me, which is that you can create alias records pointing to all of these load balancer types. Alias records are a Route 53 extension to DNS that solves a similar problem to CNAMEs, but does it on the server side: when you look up myapi.scottgerring.com, Amazon’s DNS service understands that it is pointing at an API Gateway, and passes back the IPs directly as A records rather than the CNAME pointer. Our approach will not be able to identify this.

Where did we land?

Well, we can:

  • tell the difference between API Gateways, ALBs, and NLBs in most cases ✅
  • … but relying heavily on domain names for a lot of the classification and not jazzy TLS info 🟠
  • … and NLBs are hard 🟠
  • … and we can’t handle alias records ❌
  • … and we haven’t got to Google yet ❌

You can find the complete code here. I will likely come back to this later and see if we can’t do something lower level to profile the differences - this would be particularly interesting with the NLBs - and also use it as an excuse to check out what cloud load balancers on GCP and Azure look like!

Footnotes

  1. I would be unsurprised to discover you can do something fancy with TCP fingerprinting, or at least BGP lookup. But - this is a wild diversion I’ll set aside for now.