A wide variety of HTTP-based attacks can (and should) be prevented in your application. The OWASP Top 10 is a prime example of attack techniques that are both detectable and preventable from within any application. A plethora of tools, including static and dynamic analysis as well as penetration testing, can ensure that these vulnerabilities do not pass go into production. Provided, of course, that such security testing is shifted left into the CI/CD process rather than left as the final step before flipping the switch (or bits, as it were) that makes the app accessible to users.
Even if you’ve eliminated every potential vulnerability found in dev, applications are still at risk. That’s because some attacks can’t – and by can’t I really mean can’t – be detected by the application. In fact, by the time the attack reaches the application, it’s already too late.
Of course I’m talking about application layer (HTTP) DDoS attacks. You know, the vampiric ones that exploit the HTTP protocol itself to basically suck up every last available drop of compute and memory from an application so as to render it useless for legitimate users.
There are basically two types of HTTP DDoS attacks: the fast and the slow; the flood and the drain.
An HTTP DDoS attack based on flooding exploits the fact that apps are expected to accept HTTP requests and respond to them. That’s kind of their shtick, isn’t it? So they do. And they do regardless of how fast those requests are coming in. Even if the requests arrive at a rate that will deplete the resources available to that server in minutes – or seconds – it tries to respond. See, every app (in fact every device, service, system, etc.) has an upper limit to the number of TCP connections it can hold open at any one time before it simply can’t open any more. When it hits that upper limit, any subsequent requests are simply ignored. Users experience this as the innocuous status “Trying to connect…” while their browser or app waits for the system-specified timeout to expire, then apologizes for being unable to connect.
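To make the "rate, not content" point concrete, here's a minimal sketch of the kind of sliding-window counter an upstream device might keep per client. Everything here – the function names, the window size, the threshold – is illustrative, not a real product's API or a recommended limit:

```python
import time
from collections import defaultdict, deque

# Illustrative thresholds: flag a client exceeding MAX_REQUESTS
# within WINDOW_SECONDS. Real limits depend on the app's capacity.
WINDOW_SECONDS = 1.0
MAX_REQUESTS = 100

_requests = defaultdict(deque)  # client_ip -> timestamps of recent requests


def is_flooding(client_ip, now=None):
    """Record one request and report whether the client looks like a flood source."""
    now = time.monotonic() if now is None else now
    window = _requests[client_ip]
    window.append(now)
    # Drop timestamps that have aged out of the window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    return len(window) > MAX_REQUESTS
```

Note that nothing in this check inspects the request itself – each request is perfectly legitimate on its own. Only the aggregate arrival rate, visible across all clients, reveals the flood.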
This kind of attack can come in so fast (hence the analogy of a “flash flood”) that it outstrips the system’s ability to scale to meet the demand. Not even auto-scaling can help in this scenario, as the time it takes to provision and launch a new instance of the app is greater than the time it takes for the attack to strip the existing instances of all their resources.
The opposite – a draining attack – accomplishes the same task but does so by forcing the app to keep a connection open longer than necessary. It manages this by pretending it’s on dial-up, trickling a few drops of data out of the app per second rather than at the rate it is actually capable of receiving them. Doing so means connections last longer, and if you do that with enough connections you basically end up in the same situation as the flooding attack: resource depletion.
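A drain shows up as throughput, not volume, so the corresponding check is a per-connection rate floor. The sketch below assumes a hypothetical proxy that meters bytes delivered per connection; the class name, floor, and grace period are all made up for illustration:

```python
import time

# Illustrative values: a healthy broadband client sustains far more
# than 1 KB/s, and brand-new connections get a grace period before
# being judged.
MIN_BYTES_PER_SECOND = 1024
GRACE_SECONDS = 5.0


class ConnectionMeter:
    """Tracks bytes delivered on one connection and flags slow drains."""

    def __init__(self, opened_at=None):
        self.opened_at = time.monotonic() if opened_at is None else opened_at
        self.bytes_sent = 0

    def record(self, n_bytes):
        self.bytes_sent += n_bytes

    def looks_like_drain(self, now=None):
        now = time.monotonic() if now is None else now
        elapsed = now - self.opened_at
        if elapsed < GRACE_SECONDS:
            return False  # too early to tell
        return self.bytes_sent / elapsed < MIN_BYTES_PER_SECOND
```

Again, the individual connection is well-formed HTTP; it's the sustained, artificially low transfer rate – measured against what the client should be capable of – that gives the attack away.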
Neither of these attacks is detectable from within an application. Why would they be? To the application these requests are all legitimate; they are all well-formed, HTTP-based requests for data that it likely answers thousands of times a day. There is no flag in the HTTP headers or payload that indicates the nefarious nature of the requests. The application is completely and utterly blind to the malicious intent behind these requests because it has no visibility into the network or into the broader environment – specifically, the server’s session table that maintains the master list of all open connections. I’ll spare you the technical details and the lecture on threads, processes, and the volatility of data in multi-threaded environments and just stick with “no visibility into the processing of other requests.”
Suffice it to say, the application itself has no way to determine whether any given request is part of a larger attack (flooding) or whether a client’s behavior is inconsistent with its known capabilities (draining).
What does have visibility into both is the proxy sitting upstream of the application (in front of it, in network terms). That’s because the proxy is probably doing load balancing and thus has to pay attention to how many requests are currently in process as well as how many are coming in, because it has to send them to one of the apps in the cluster (or pool, if you prefer).
It furthermore has to know where to send the response, so it knows about the client and its network connection (only the client IP address is typically sent to the app, nothing more). Unlike the application, it has the visibility necessary to detect both flooding and draining attacks – and stop them before they can sink their vampiric teeth into the app’s resources.
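That session-table visibility is the crux, so here's a bare-bones sketch of what it buys the proxy: because every connection is brokered, the proxy can cap concurrent connections per client – something no single app instance behind it can do. The class, method names, and cap below are hypothetical stand-ins for what a real load balancer implements internally:

```python
from collections import defaultdict

# Illustrative cap; real values are tuned to the app's capacity.
MAX_CONNS_PER_CLIENT = 50


class SessionTable:
    """Proxy-side view of all open connections, keyed by client IP."""

    def __init__(self, max_per_client=MAX_CONNS_PER_CLIENT):
        self.max_per_client = max_per_client
        self.open_conns = defaultdict(int)  # client_ip -> open connection count

    def try_open(self, client_ip):
        """Admit the connection only if the client is under its cap."""
        if self.open_conns[client_ip] >= self.max_per_client:
            return False
        self.open_conns[client_ip] += 1
        return True

    def close(self, client_ip):
        if self.open_conns[client_ip] > 0:
            self.open_conns[client_ip] -= 1
```

The application never sees the rejected connections at all – which is exactly the point: the attack is absorbed before it can consume app resources.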
That’s why proxy-based services like a WAF (Web Application Firewall) or even an advanced load balancer are critical players in today’s application security strategies. They have the visibility and the means to detect the anomalous behavior indicative of an attack – and repel it the way garlic repels a vampire.
And because these traditionally “network” services necessarily become part of the application architecture, it seems logical that a DevOps approach would stretch its wings and grow more inclusive of the services that naturally gravitate toward the application in the first place, like app security and scalability (load balancing).
Applications can’t stop every attack, particularly those that require a level of visibility that simply isn’t available to them. Pairing them with application-affine services like web app security and load balancing, however, provides the means by which a more comprehensive set of attacks can be detected and repulsed, ensuring fewer outages and breaches.