In the 2019 Application Protection report, F5 Labs found a majority (51.8%) of breaches in 2019 were caused by access control attacks. Our research showed these breaches resulted from stolen login credentials obtained by phishing and brute force as well as stealing credentials elsewhere and using them as part of a credential stuffing attack. Also, between 2017 and 2019, the F5 Security Incident Response Team noted increasing incidents of brute force and credential stuffing attacks (41%) in the financial sector. It is easy to see that unauthorized logins are a significant threat.
Once attackers gain access with stolen credentials, they steal sensitive data or perform an account takeover (ATO) to commit fraud. Often the credentials that attackers hijack were stolen from a completely different site than the one they're used on. This means defenders get no warning of a potential attack until they are already under siege.
In 2019, F5 Labs wrote about unwanted bots, the mayhem they cause, and how to detect them. Yet many bot attacks can evade antibot controls. This turns into an exhausting game of Whac-a-Mole. The defender blocks and then the attacker uses various evasion techniques to slip past the countermeasures. Let’s take a deeper look at how this plays out.
The Preliminary Credential Stuffing Attack
Attackers understand the scaling power of technology, so they often employ automation, using bots to launch and orchestrate credential stuffing campaigns. Many point-and-click credential attack tools exist, such as Sentry MBA, OpenBullet, BlackBullet, Snipr, STORM, and Private Keeper. Attackers also leverage basic open source operational tools like Wget, Selenium, PhantomJS, and cURL. These tools simulate a browser and run scripted web login sessions. With tools like these, attackers can create an army of bots to do their work for them.
Tapping the Vast Caches of Stolen Credentials
To perform a credential stuffing attack, the tool needs a stolen credential list to run against the targeted web login. These credential lists are simply a file of usernames (usually email addresses) and passwords. If the attacker hasn’t already obtained a batch of them through phishing, they can easily turn to the dark web. In 2017, F5 Labs noted a credential crisis in which billions of stolen login credentials were being sold, traded, or even given away in cybercriminal marketplaces. In the three years that have passed, it has only gotten worse. These lists can be loaded right into these attack tools, as shown in Figure 1.
Credential Stuffing Causes Outages
It’s not hard for attackers to find poorly defended web logins. Many sites have only a basic web application firewall (WAF), or nothing at all. Many WAFs do not detect or defend against credential stuffing attacks. In general, WAFs are designed to block application attacks, malformed requests, and web exploits, whereas each credential stuffing request looks like a legitimate web login. What can give the attack away is volume: many login attempts arrive at once, and many of them carry incorrect passwords. Spotting this assumes that the defender is watching failed login attempts and noting surges. In reality, many victims mistake a credential stuffing attack for a denial-of-service attack. The login pages become overwhelmed with failed logins, and either the site crashes or customers can’t get in through the load. There have been cases of backend infrastructure failing under the heavy load of authentication requests.
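As a rough illustration of "watching failed login attempts and noting surges," the sketch below counts failures inside a sliding time window. The class name, window length, and threshold are all hypothetical values chosen for the example, not figures from F5 Labs research.

```python
from collections import deque


class FailedLoginMonitor:
    """Counts failed logins in a sliding window and flags surges (illustrative)."""

    def __init__(self, window_seconds=60.0, surge_threshold=100):
        # Both defaults are assumptions; real values depend on normal traffic.
        self.window_seconds = window_seconds
        self.surge_threshold = surge_threshold
        self._failures = deque()  # timestamps of recent failures, oldest first

    def _evict(self, now):
        # Drop failures that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self._failures and self._failures[0] <= cutoff:
            self._failures.popleft()

    def record_failure(self, timestamp):
        self._failures.append(timestamp)
        self._evict(timestamp)

    def is_surging(self, now):
        self._evict(now)
        return len(self._failures) >= self.surge_threshold
```

In practice the threshold would be set relative to the site's usual failure rate, so that a surge of failed logins is recognized as a credential attack and not written off as a generic traffic flood.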
Preliminary Credential Stuffing Mitigation Attempts
Once the victim organization detects the attack, it looks to stem the tide. The trick is to stop the attackers from logging in but not obstruct or inconvenience users and customers. Some basic defensive measures include inspecting and blocking the web session, which some WAFs can do. If the attack tool or bot uses plain web login requests, then the user agent (used by a web browser to advertise and identify itself to a web server) may be identified as irregular and blocked.
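A minimal sketch of the user agent check described above might look like the following. The marker list is an assumption built from the default strings that tools such as cURL, Wget, and PhantomJS send when left unconfigured; the function name is hypothetical, and a real WAF rule set would be far broader.

```python
# Substrings that appear in the default User-Agent of common automation tools.
# This list is illustrative, not exhaustive.
AUTOMATION_MARKERS = (
    "curl/",
    "wget/",
    "phantomjs",
    "headlesschrome",
    "python-requests/",
)


def is_irregular_user_agent(user_agent):
    """Return True if the User-Agent is missing or matches a known tool default."""
    if not user_agent or not user_agent.strip():
        return True  # real browsers always send a User-Agent
    lowered = user_agent.lower()
    return any(marker in lowered for marker in AUTOMATION_MARKERS)
```

A check like this only catches tools that announce themselves; it says nothing about a client that sends a browser-like string.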
Another basic defense is using IP address denylists to block the known bad IP addresses. The denylist is often based on simple geographic origins, IP addresses from earlier attacks, or canned third-party reputation lists of known attackers. Another tool is rate limiting of login attempts, which unfortunately applies to both attackers and customers. This makes it hard to find the right balance.
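The two controls above can be sketched together as a small gate in front of the login handler. Everything here is illustrative: the class name, the limit of five attempts per minute, and the sample addresses (taken from documentation-reserved ranges) are assumptions.

```python
from collections import defaultdict, deque


class LoginGate:
    """IP denylist plus per-IP sliding-window rate limit (illustrative sketch)."""

    def __init__(self, denylist, max_attempts=5, window_seconds=60.0):
        self.denylist = set(denylist)
        self.max_attempts = max_attempts      # assumed limit per window
        self.window_seconds = window_seconds  # assumed window length
        self._attempts = defaultdict(deque)   # ip -> recent attempt timestamps

    def allow(self, ip, now):
        """Return True if a login attempt from `ip` at time `now` may proceed."""
        if ip in self.denylist:
            return False
        attempts = self._attempts[ip]
        cutoff = now - self.window_seconds
        while attempts and attempts[0] <= cutoff:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False  # applies equally to attackers and to real customers
        attempts.append(now)
        return True
```

The balance problem shows up in `max_attempts`: set it low and customers who mistype a password get locked out, set it high and the limiter barely slows an attacker.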
The next step beyond this is to add a CAPTCHA test to the login process, which presents users with a simple puzzle. The idea is that bots can’t solve this puzzle but humans can, thus blocking the bots. The downside is that CAPTCHAs can annoy customers. In some cases, CAPTCHAs become a significant barrier for people with disabilities.
Attackers Always Retool
The cybercrime community already knows how to work around these simple defenses. In fact, attackers already have plenty of plugins, scripts, and utilities they can configure to evade antibot defenses. Cybercriminals upgrade and enhance their tools, often cribbing (or outright appropriating) penetration testing tools. Most of the time, the real work for attackers is configuring them for the specific victim’s website and modifying the scripts.
Attacker Evasion: Fake the Bot’s Originating Network
Anyone who has tried to use IP address denylists to stop credential stuffing knows that even if it works for a while, it won’t work for very long. Rarely do attackers use a stable, known set of bots. Once those bots are reputation filtered, the attackers have plenty of other victimized computers and IoT devices for launching attacks. Bots often run on consumer Internet connections, which use dynamic IP addressing, so their addresses change continually. Blocking based on geographic origin is also ineffective, as attackers use bots from around the world, not just their current location. Most credential stuffing attack tools have configuration options to load and use new lists of proxies, as shown in Figure 2.
Because rate limiting is often based on the originating IP address, this defensive tool is also neutered by bot IP address hopping. In addition, attackers can configure their bots to stagger attacks and spread them across addresses. This means bots can come in at different times from different places to slip around rate limiters and IP address blockers.
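To see why per-address limits miss a distributed attack, consider a defender-side summary of a login log. In this sketch the event format (an IP address paired with a success flag) and the function name are assumptions made for illustration.

```python
from collections import Counter


def summarize_attempts(events):
    """Summarize login events given as (ip, succeeded) pairs."""
    per_ip = Counter(ip for ip, _succeeded in events)
    failures = sum(1 for _ip, succeeded in events if not succeeded)
    return {
        "max_per_ip": max(per_ip.values(), default=0),
        "distinct_ips": len(per_ip),
        "failure_ratio": failures / len(events) if events else 0.0,
    }


# 100 failed logins spread evenly across 50 addresses (documentation range).
events = [(f"198.51.100.{i % 50}", False) for i in range(100)]
summary = summarize_attempts(events)
```

Here no single address makes more than two attempts, so a per-IP limiter never fires, while the aggregate failure ratio across all addresses is what stands out.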
Attacker Evasion: Disguise the Bot as a Legitimate Web Browser
Since the bot is automated and not surfing the web in a normal way, it can stand out from real browsers like Google Chrome or Microsoft Edge. However, many credential stuffing bot tools have ways of imitating a real browser. A simple fakeout is to forge a user agent. Bots can also spoof a referer request header, which identifies the URL of the page that linked to the resource being requested. These headers give websites a loose way to check that a customer’s clickstream is legitimate. These basic evasions are often enough to imitate a customer’s browser and evade basic WAF blocking rules. Some of these Sentry MBA browser-spoofing configuration options appear in Figure 3.
Attacker Evasion: Impersonate a Human
The final basic antibot defense is the CAPTCHA, which, as we mentioned earlier, creates a hurdle for legitimate customers and users. Naturally, attackers have worked out ways around them. Many attack tools have optional plugins to match and supply answers for thousands of known CAPTCHA puzzles, as shown in Figure 4. F5 Labs researchers wrote a detailed analysis of the CAPTCHA solver market and how CAPTCHAs are a Whac-a-Mole response rather than a definitive solution to the problem.
Attacker Evasion: Impersonate Human Mouse Movement
Some bot detection tools watch user activity, looking for scripted mouse movements or keystrokes. These too can be spoofed with a wide variety of tools. BezMouse is just one example of an open source tool for this. It simulates humanlike mouse movements with Bézier curves to evade antibot defenses. Figure 5, from the BezMouse GitHub page, shows a pseudorandom pattern simulating a human selecting number keys with a mouse.
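For reference, a cubic Bézier curve with start point P_0, end point P_3, and control points P_1 and P_2 is defined as follows. Whether BezMouse uses cubic curves specifically is an assumption here; the text above says only that it uses Bézier curves.

```latex
\[
\mathbf{B}(t) = (1-t)^3\,\mathbf{P}_0
              + 3(1-t)^2 t\,\mathbf{P}_1
              + 3(1-t)\,t^2\,\mathbf{P}_2
              + t^3\,\mathbf{P}_3,
\qquad 0 \le t \le 1 .
\]
```

Because the path bends smoothly toward the control points instead of running in a straight line between clicks, it looks more like a hand-guided movement than simple linear interpolation does.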
Look for Smarter Antibot Tools
In the end, the best defenses against credential stuffing bot attacks need to be sophisticated. It begins with gathering a combination of factors on the web user. These factors are then scored and weighted using machine learning to weed out bots. Intelligent antibot systems can also spot the predictability of pseudorandom mouse and keyboard actions. They also do things like interrogate the user’s browser during the web session. This interrogation looks for the characteristics of a real browser on an actual computer (such as the ability to run JavaScript). Even the login and password combinations can be examined in real time to check if they are part of known leaked credential databases.
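The idea of scoring and weighting several factors can be sketched as a simple logistic score. This is a toy under stated assumptions: the signal names, weights, and bias are hypothetical and hand-set, whereas a real system would learn them from labeled traffic, and the leaked-credential lookup assumes a local set of SHA-256 fingerprints.

```python
import hashlib
import math

# Hypothetical signals and hand-picked weights, for illustration only.
WEIGHTS = {
    "no_javascript": 2.5,            # browser interrogation failed
    "irregular_user_agent": 1.5,
    "uniform_input_timing": 2.0,     # mouse/keyboard actions look too predictable
    "credential_in_leak_list": 1.0,
}
BIAS = -3.0  # assumed baseline: most sessions are human


def credential_fingerprint(username, password):
    """Hash a login pair so it can be compared against a leaked-credential set."""
    material = f"{username.lower()}:{password}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def bot_score(signals):
    """Map a dict of signal name -> bool to a score between 0 and 1."""
    z = BIAS + sum(WEIGHTS.get(name, 0.0) for name, present in signals.items() if present)
    return 1.0 / (1.0 + math.exp(-z))
```

No single signal decides the outcome in this design; a session is flagged only when several weak indicators line up, which is harder for an attacker to evade than any one check on its own.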
Bot-driven credential stuffing attacks can be relentless, especially as attackers adapt and evade weak defenses. Simple defenses can be skirted, but smarter ones can raise the cost for attackers so they look elsewhere for easier prey.