Intro
In Episode 2 of our 2019 Application Report, we found that two attack methods dominated the reports of successful attacks resulting in breaches.1 We unpacked the details of one of those methods, web injection, in Episode 3. Here we explore the other attack method that is responsible for the bulk of breaches: attacks against the access tier of the application. But first, we need some context on how access attacks fit into the bigger picture.
In day-to-day speech, we still tend to describe the Internet as being composed of “sites.” However, web applications, not sites, are really the engines that drive Internet traffic. Webmail, ecommerce, social media, online banking, eLearning, web search, and media streaming all happen through web applications. Unlike static web sites (if any still exist), web applications do not just process requests for existing html files on a server and deliver them; they also accept user input, process it, and return data. Because they transmit and process data, they are also the single most attacked asset on the Internet, and account for a wide range of compromises. Our earlier research showed that applications were the initial targets in 53% of breaches in the past decade.
In examining both the mandatory breach reports from the offices of the U.S. State Attorneys General and incident reports that we gather from F5 customers, we found that in 2018, applications were most often attacked at the access tier (47%). Access tier attacks are any that seek to circumvent the legitimate processes of authentication and authorization that we use to control who gets to use an application, and how they can use it. The result of this kind of attack is a malicious actor gaining entry to a system while impersonating a legitimate user. They then use the legitimate user’s authorization to accomplish a malicious goal, usually data exfiltration. Attacks of this nature can be hard to spot because as far as the system is concerned, the attacker appears to be the rightful user.
Access Attacks Recap
Reports of confirmed data breaches published by the U.S. State Attorneys General in 2018 showed that access-related breaches constituted the bulk of successful breaches, as shown in Figure 1.
We can further break down the broader category of access attacks into subtypes as shown below in Figure 2.
However, as we noted in Episode 2, because these data come from legal documents that often lack detail, many of these categories have significant or total overlap, and it is difficult to be sure what part of an attack chain these access attacks occupied. In Figure 3, you can see a sample of the text of the breach letters we used to sort into these categories.
As we noted in Episode 2, we also found a relationship between the targets’ industries and business models and the type of attack they were likely to experience. Organizations in industries that do not rely heavily on ecommerce—such as finance, healthcare, education, non-profits, and accounting—were more likely to experience a breach through access attacks than any other attack type. By contrast, other industries like retail and the service industries were more likely to be compromised through web hacking techniques such as injection.
So, we know that according to the data available to us, access attacks represent the most likely vector into an application. This is particularly true in industries whose business model requires storing valuable information far from their perimeters. To understand why, we need to dive into the various forms of access attacks we saw in 2018, how they work, and what kind of threat actors use them.
Phishing
Phishing is a form of social engineering in which attackers use email or another form of electronic communication to impersonate an entity whom the victim trusts. The goal is to induce the victim to perform an action that allows the attacker to gain something valuable to them. Usually attackers want to capture the victim’s credentials for another system, or to drop malware on their system as part of a broader attack, such as a ransomware campaign. Phishing is frequently the first aggressive action an attacker takes as part of a longer effort.
Despite the fact that phishing is a low-tech attack type that is already fairly well known in the mainstream, we can see from the data that it is still gaining steam. This is partly because it is a reliable strategy for attackers across the spectrum of training and sophistication. It works whether the goal is to steal financial information, intellectual property, or military intelligence.
Phishing used to be a fairly random, unsophisticated attack, and we used to laugh over the obvious grammatical errors and desultory attempts to spoof a known entity. However, it has grown significantly in sophistication, particularly with regard to targeting. Highly targeted phishing using detailed intelligence to craft a highly personalized message is known as spear phishing. Spear phishing was formerly a tactic reserved for state-sponsored actors and the more sophisticated end of the cybercrime spectrum, but it has become easier for attackers of all levels due to the explosion in personal information on the web. The result is phishing attacks of increasingly high fidelity, and with greater specificity to victims’ actual lives.
Phishing Kits on Github
In the same way that tools enabling automated DDoS attacks have brought DDoS “downmarket” into the realm of script kiddies, phishing kits that do much of the legwork have become common.2 This partly explains why phishing remains such a prevalent tactic, not only for well-resourced and knowledgeable adversaries but also low-level cybercriminals and amateurs. It is increasingly important to train everybody on phishing techniques because it is no longer just executives or owners of sensitive intelligence who are being phished.
Other than inducing a victim to enter credentials into a spoofed authentication portal, the most common objective of a phishing campaign is to install malware on the victim’s computer. From here an attacker can capture credentials, escalate privileges, look for other valuable forms of information, pursue a cryptojacking or ransomware campaign, or use the target machine as part of a botnet.
Although phishing campaigns can be used to deliver malware, malware is its own field in and of itself. We’ll be diving into the details of what malware looked like in 2018, and how to mitigate the risk, in a future episode. For now, suffice it to say that phishing is, among other things, a simple and reliable vector for malware installation, which is why it is frequently used at the start of complicated, multi-part attacks.
Credential Stuffing and Brute Force Attacks
In addition to phishing, we are seeing access attacks take less surgical forms. In these cases, attackers either try known passwords from stolen databases of credentials, or enter passwords that are known to be common. This might look like a rudimentary tactic, but the prevalence of password reuse, and the large amount of personal information already breached and available for sale on the Internet, make these tactics valuable in spite of their simplicity.
Credential Stuffing
Credential stuffing is the practice of trying passwords that attackers already know the victim uses elsewhere, with the expectation that the victim uses the same password in multiple places. These attacks are surprisingly successful as a result of two linked trends.
The first is that as more organizations move to the Internet, the number of credential pairs we need to remember has grown tremendously. In 2015, the password management organization Dashlane found that their customers had an average of more than 90 online accounts, and projected that by 2020 the average user would have more than 200.3 The result is that we can’t possibly keep track of that many unique credential pairs, which is why we all reuse passwords quite a lot. According to a 2017 survey by another password manager, Keeper Security, 87% of respondents ages 18-30 and 81% of respondents ages 31 and older reuse passwords.4 This pattern is what makes credential stuffing possible.
The other related trend is that after all of the data breaches over the last few years, it is highly likely that any given individual in the U.S. or Western Europe has had at least one set of credentials leaked online. The result is that we’ve given attackers both the motive and the tools necessary to try to stuff known creds into authentication portals.
Because attackers recognize that many organizations have monitoring in place to detect many rapid authentication attempts, credential stuffing attacks tend to fall into one of two patterns. Attackers either try to log in to a large number of accounts using a small number of known passwords, or a small number of accounts using a large number of passwords.
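The two patterns can be told apart in authentication logs by comparing how many distinct accounts and how many distinct passwords each source tried. The following is a minimal sketch, not a production detector. It assumes failed-login events are available as hypothetical `(source, username, password_fingerprint)` tuples, where the fingerprint is a keyed hash of the attempted password (many systems deliberately do not record this), and the `ratio` threshold is an illustrative value.

```python
from collections import defaultdict


def summarize_failed_logins(events):
    """Count distinct accounts and distinct passwords tried by each source.

    `events` is an iterable of (source, username, password_fingerprint)
    tuples. This record layout is an assumption made for illustration.
    """
    accounts = defaultdict(set)
    passwords = defaultdict(set)
    for source, username, fingerprint in events:
        accounts[source].add(username)
        passwords[source].add(fingerprint)
    return {s: (len(accounts[s]), len(passwords[s])) for s in accounts}


def classify_pattern(n_accounts, n_passwords, ratio=5):
    """Label the shape of a source's attempts; `ratio` is an arbitrary cutoff."""
    if n_accounts >= ratio * n_passwords:
        return "many accounts, few passwords"
    if n_passwords >= ratio * n_accounts:
        return "few accounts, many passwords"
    return "unclear"


# Example with made-up data: one source sprays a single password across
# twenty accounts, another tries twenty passwords against one account.
events = [("10.0.0.1", f"user{i}", "fp1") for i in range(20)]
events += [("10.0.0.2", "alice", f"fp{i}") for i in range(20)]
summary = summarize_failed_logins(events)
labels = {s: classify_pattern(*counts) for s, counts in summary.items()}
```

Grouping by source address alone is a simplification, since attackers often spread attempts across many addresses.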
Over the years, most organizations have implemented some kind of password rotation policy with the intent of reducing the risk that a leaked password will still be valid. This has had some unintended consequences, with many people using the same pattern of characters and something memorable. For instance, if policy requires a password change every 90 days, chances are that someone in the target organization is going to use something cued to the seasons, such as Spring19 or Summer19.5
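To make the seasonal pattern concrete, here is a minimal sketch of a check that flags passwords built from a season name followed by a two- or four-digit year. The regular expression and the function name `looks_seasonal` are illustrative assumptions, not part of any policy or product described in this report.

```python
import re

# A season name, then a 2- or 4-digit year, then optional trailing symbols.
SEASONAL = re.compile(
    r"^(spring|summer|autumn|fall|winter)(\d{2}|\d{4})[!@#$%^&*.]*$",
    re.IGNORECASE,
)


def looks_seasonal(password):
    """Return True if the password follows a season-plus-year pattern."""
    return SEASONAL.match(password) is not None


for candidate in ["Spring19", "Summer19", "winter2019!", "Tr0ub4dor&3"]:
    print(candidate, looks_seasonal(candidate))
```

A check like this only covers one predictable habit; attackers guess months, company names, and sports teams in the same way.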
Another form of access attack that deserves mention here is the attack against APIs. As APIs become more common, they increasingly represent a vector into an organization’s environment whose mitigation presents challenges in terms of visibility, availability and impact. However, because API security is a complex and rapidly changing issue, we’ll be addressing it in a forthcoming episode devoted solely to this topic.
Brute Force
We typically define brute force attacks as either ten or more successive failed attempts to log in in less than a minute, or 100 or more failed attempts in one 24-hour period. However, attackers realize that these kinds of behaviors are easily monitored and so have begun to alter behavior.
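The two thresholds in that definition can be expressed as a small per-account detector. This is a sketch under stated assumptions: events arrive in time order, "successive" means a run of failures with no successful login in between, and the class name `BruteForceDetector` is hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta

MINUTE = timedelta(minutes=1)
DAY = timedelta(hours=24)


class BruteForceDetector:
    """Tracks login results for one account, fed in chronological order."""

    def __init__(self):
        self.recent_failures = deque()  # failures within the last 24 hours
        self.streak = deque()  # successive failures within the last minute

    def record(self, timestamp, success):
        """Record one login attempt; return True if either threshold is met."""
        if success:
            self.streak.clear()  # a success breaks the run of failures
            return False
        self.recent_failures.append(timestamp)
        self.streak.append(timestamp)
        # Drop failures that have aged out of each window.
        while timestamp - self.recent_failures[0] >= DAY:
            self.recent_failures.popleft()
        while timestamp - self.streak[0] >= MINUTE:
            self.streak.popleft()
        return len(self.streak) >= 10 or len(self.recent_failures) >= 100


# Ten failures one second apart trip the first threshold on the tenth attempt.
start = datetime(2018, 1, 1)
detector = BruteForceDetector()
fast = [detector.record(start + timedelta(seconds=i), False) for i in range(10)]
```

Because the thresholds are fixed, an attacker who paces attempts just below them goes unflagged, which is the change in behavior the paragraph above describes.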
One of the biggest threat intelligence sources we have for brute force attacks comes from our own F5 Security Incident Response Team (SIRT). The SIRT reported that in 2018, brute force attacks against F5 customers were the second most frequent type that they encountered, and they constituted 19% of the incidents they addressed. Before we explore the SIRT’s intelligence, however, we need to break down the different targets and techniques we see in brute force attacks.
Types of Brute Force Attacks
Any application that requires authentication is a potential venue for a brute force attack, but we tend to see attacks against a few specific surfaces in particular:
- HTTP form-based authentication brute force–these are attacks against web authentication forms in the browser. Most of the traditional logins that we see on the web take this form.
- Office 365/ADFS brute force–these are attacks against authentication protocols for Exchange servers, Microsoft Active Directory, and Active Directory Federation Services (ADFS). Since these services are not accessed through a browser, users authenticate to them through separate prompts. Because of the single sign-on capabilities of AD and federation, successful access attacks against these protocols can yield access not just to mail but often to entire intranets and significant amounts of sensitive information.
- SSH/SFTP brute force–SSH and SFTP access attacks are some of the most prevalent attacks of any type we see, partly because successful SSH authentication is often a quick path to administrator privileges. Because so many systems are left with default credentials in place for ease of use, brute forcing SSH is, in the words of our own Sara Boddy, “the one big bang-for-the-buck win” for attackers.
- FTP brute force–Similar to SSH, FTP brute force is a dangerous form of attack because it is a method to drop malware, which presents a wide range of options, including escalation of privilege, keylogging or other forms of surveillance, network traversal, and more.
Depending on how robust your monitoring capabilities are, brute force attacks can appear innocuous, like a legitimate login with correct username and password. Detecting these attacks hinges on both capturing and examining detailed logs. The number of attempts, the location of the login, and the time can be clues to anomalous behavior even if the attacker happened upon the right creds early on. We’ll dive into monitoring more in the mitigation section below.
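As an illustration of how those three clues might be pulled from a log record, the sketch below checks a single successful login against a user's usual pattern. The `Login` record, the baseline arguments, and the `max_prior_failures` cutoff are all hypothetical; real baselines would come from your own log history.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Login:
    username: str
    country: str  # coarse location, e.g. derived from the source IP
    timestamp: datetime
    prior_failures: int  # failed attempts that preceded this success


def anomaly_clues(login, usual_countries, usual_hours, max_prior_failures=3):
    """Return the reasons, if any, why a successful login looks unusual."""
    clues = []
    if login.prior_failures > max_prior_failures:
        clues.append("many failed attempts before success")
    if login.country not in usual_countries:
        clues.append("unusual location")
    if login.timestamp.hour not in usual_hours:
        clues.append("unusual time")
    return clues


# Made-up example: a 3 a.m. login from an unfamiliar country after 12 failures.
suspect = Login("alice", "RO", datetime(2018, 6, 1, 3, 15), prior_failures=12)
clues = anomaly_clues(suspect, usual_countries={"US"}, usual_hours=range(7, 20))
```

Any single clue is weak evidence on its own, so a check like this is better suited to prioritizing log review than to blocking logins outright.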
Web brute force tools have been around nearly as long as the web. Figure 6a is a screen shot of Brutus AET2, released just after Y2K. As you can imagine, these tools have improved a bit in the past couple of decades. Figure 6b shows the leading tool in 2019 for brute force and credential stuffing, Sentry MBA.6 This new brute forcer adds features like “low and slow” timing, OCR support, and keyword matching.
Brute Force Data
F5 SIRT noted a large number of brute force attacks in 2018. As we noted above, brute force attacks accounted for 19% of the incidents the SIRT addressed, or nearly a fifth of the total. While the SIRT also noted a low success rate, even failed brute force attacks can affect an organization’s environment. On six separate occasions, the SIRT found that brute force attacks had actually caused the target’s entire authentication infrastructure to go down. Even when the servers stayed up, legitimate users were locked out or authentication bogged down, resulting in an indirect denial-of-service attack.
Unfortunately, these kinds of secondary effects were sometimes the only way that organizations even knew they were under attack. Many organizations go weeks, months or years between looking at log data, which is where the context necessary to identify this kind of attack resides. We regret to say that this is all too common in the security industry, with the widespread implementation of (low-cost) basics taking a backseat in favor of high-tech, expensive solutions that are slick but narrow in scope.
For those interested in more detail on those brute force attacks, Figures 8 through 10 break them down by month, industry, and region. Keep in mind that these data points represent only current F5 customers who called our SIRT for response help. They do not represent the industry as a whole, but given the size and scope of F5’s installed base, they do give a clue as to general brute force attack trends.
Email Hacks
We mentioned above that 20% of the confirmed breaches in 2018 started by targeting email access. As we discussed in Episode 2, for organizations that do not rely heavily on ecommerce, the most valuable assets are often stored far from the perimeter, behind multiple layers of controls. When attackers go after organizations like this, email is often a useful staging ground that can provide valuable information in itself, as well as the tools to get to valuable information elsewhere.
The breach data also featured email as a primary target. Email was involved in the top two subcategories of access breaches, representing 39% of access breaches and 34.6% of all breach causes. Think about that: email is directly attributed as a factor in over a third of all breach reports. A typical breach notification letter goes something like “Unauthorized persons used stolen credentials to gain access to emails containing confidential records…” By accident or design oversight, organizations are still storing unencrypted medical and financial data in weakly protected email boxes. This has been a problem for decades and looks like it will persist for some time.
While the transport layer for email is usually not secured, and it is feasible to sniff email in transit, the chances of finding anything valuable in a single message are low. The more fruitful attack against email is focused on where the mail lands, because mail storage often persists in perpetuity. Email boxes are filled with gigabytes of easily searchable information. In addition to sensitive information that might be sitting there, contact lists provide more targets for future attacks. Users often forward or redistribute mails as well, so it is difficult to say for sure where any sensitive information is once it has been transmitted through email.
We will explore options for mitigating these risks below, but for now, we’ll just say that mailboxes are not a good long-term storage option for private information. Large-volume, unencrypted mailboxes can be an unexpected magnet for lawsuits, as they often contain information that is equal or greater in value to assets that are stored under much greater control, such as databases of customer information.
Attackers and Targeting
The contours of the risks that access attacks pose are complex specifically because these attacks are launched by a range of actors with different techniques, goals and motivations. Understanding these differences, and which ones apply to your organization, is key to managing these risks in a way that is driven by intelligence and not by fear.
Criminals
Criminals attacking the access tier usually target organizations in the finance, healthcare, education, service, non-profit, or accounting industries. We found that four percent of the email hacking cases explicitly noted that attackers used a stolen mailbox to phish others within the organization. This highlights the open-ended potential of these kinds of attacks. In the event that a specific attempt doesn’t yield any directly valuable data, attackers can keep expanding their focus until they get the credentials they need to find the value they are seeking.
Advanced Attackers
Like criminal actors, state-sponsored actors or APTs often initiate their illicit access campaigns with spear phishing. However, advanced actors have more time and resources on their hands, and can fashion something of value even from apparently useless data. Large caches of innocuous information, such as email addresses, can be used to look for access elsewhere, since we use email addresses as usernames for other accounts. The main difference in this case between cybercriminals and state-sponsored actors is the sophistication of the intelligence analysis programs behind the actual attacks. Collecting as much information as possible allows traditional intelligence organizations to understand their targets with granularity, supporting physical espionage operations as well as digital exploits.
For example, the 2015 Office of Personnel Management breach provided detailed personal histories, including psychological profiles and biometric information such as fingerprints.7 The intelligence value of these records is enormous, because the combination of the depth and the breadth of the dataset allows adversaries to make connections and draw conclusions about what steps to take next. For instance, between the OPM and the Equifax datasets, an adversary could get a very clear picture not only of whom to target but how–whether to use blackmail, financial incentives, ideological incentives, or other techniques.
Mitigation
So, how do you reduce the risk that access attacks pose? We’d love to say “just MFA it” and drop the mic, but we realize that multi-factor authentication can be hard to implement and not always feasible in the time frames we’d like. As much as passwords are flimsy protection, we found in 2018’s report that 75% of organizations still used simple username/password credentials for critical web applications, so we can’t just pretend that they don’t exist anymore.
To start, make sure your system can at least detect brute force attacks. Setting up alarms is a good start, but it’s better to slow down the session by throttling or CAPTCHA, or even denylist the IP. However, one of the things that makes access such a tricky tier to work on is that confidentiality and integrity can sometimes find themselves at odds with availability. Locking the account in perpetuity is good for preventing unauthorized access, but also results in denial of service for the user. If you’re going to lock someone out, make sure you can fail gracefully, and look out for the bane of the false positive. Set up reset mechanisms that work for both you and your users and get the legitimate traffic back online as quickly as possible.
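One way to slow an attacker without locking a legitimate user out in perpetuity is to impose a delay that grows with each failure, is capped, and clears on a successful login or reset. The sketch below illustrates that idea only; the class name `LoginThrottle` and every default value are assumptions chosen for the example, not recommendations from this report.

```python
class LoginThrottle:
    """Per-account delay that doubles with each failure, up to a cap."""

    def __init__(self, free_attempts=3, base_delay=1.0, max_delay=300.0):
        self.free_attempts = free_attempts  # failures allowed with no delay
        self.base_delay = base_delay  # seconds of delay once throttling starts
        self.max_delay = max_delay  # cap, so a lockout is never permanent
        self.failures = {}

    def delay_for(self, username):
        """Seconds the next attempt for this account must wait."""
        count = self.failures.get(username, 0)
        if count < self.free_attempts:
            return 0.0
        exponent = count - self.free_attempts
        return min(self.base_delay * 2 ** exponent, self.max_delay)

    def record_failure(self, username):
        self.failures[username] = self.failures.get(username, 0) + 1

    def reset(self, username):
        """Call on successful login or after a verified password reset."""
        self.failures.pop(username, None)


throttle = LoginThrottle()
for _ in range(5):
    throttle.record_failure("alice")
print(throttle.delay_for("alice"))  # 4.0 seconds after five failures
```

The cap is the design choice that matters here: it bounds how long a false positive can keep a legitimate user waiting, at the cost of letting a patient attacker keep trying slowly.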
In other words, it’s not enough to set up some firewall alarms on brute force attempts and take a nap. You have to test these monitoring and response controls, run incident response scenario tests, and develop incident response playbooks so that you can react quickly and reliably.
The NIST Digital Authentication Guidelines offer principles that represent a good baseline and get away from some well-intentioned but obsolete ideas about access control:8
- Make your password policies user friendly
- Check passwords against a dictionary of default, stolen, and well-known passwords, both when users choose a password, and on a recurring basis
- Password reset should never use hints
- Use long passwords
- Avoid arbitrary 30/45/60/90-day password rotations
- Lock or remove unnecessary credentials
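Two of these principles, checking against a list of known passwords and favoring length, lend themselves to a short illustration. The sketch below is hedged accordingly: the function name, the sample `denylist`, and the `min_length` of 12 are assumptions made for the example, and a real deployment would load a much larger list of default, stolen, and well-known passwords.

```python
def check_new_password(password, denylist, min_length=12):
    """Return a list of problems with a candidate password (empty if none).

    `denylist` is a set of lowercase known-bad passwords. There are
    deliberately no composition rules and no hint or rotation logic.
    """
    problems = []
    if len(password) < min_length:
        problems.append(f"shorter than {min_length} characters")
    if password.lower() in denylist:
        problems.append("appears in the list of known passwords")
    return problems


denylist = {"password123456", "summer19", "admin"}
for candidate in ["Summer19", "Password123456", "correct horse battery staple"]:
    print(candidate, check_new_password(candidate, denylist))
```

The same function can be run on a recurring basis against newly published breach lists, provided the comparison is done in a way that never requires storing user passwords in clear text.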
At a more advanced level, authentication can turn into a continuous practice instead of a one-time check. We don’t want to make users re-enter a password every time they act on a system, such as accessing or changing data. Such a thing would be about as user-unfriendly as we can get. However, there are backend authentication tools, like cookies and session tokens, that can be used to reduce the attack surface, prevent escalation of privilege and network traversal, and effectively function as a sort of digital quarantine.
Some cloud providers have suspicious activity alert capability for their customer accounts. Specifically, Microsoft Azure has a mechanism to flag and block the use of known bad passwords in AD cloud deployments.9
The same accidental denial of service issues we outlined above apply especially to email, so controlling risk around email attacks is tricky. Make sure you monitor load on your authentication infrastructure using threshold alarms.
As part of an assume breach approach, plan for an attacker to gain access to email, and gear your forensics accordingly. Assume that attackers will set up email forwarding and account delegation on a stolen mailbox, and the user may not even know it. Write up procedures on how to review this and make them part of the incident response plan.
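A review procedure for forwarding and delegation could start from something as simple as the sketch below, which lists every forwarding target or delegate outside the organization's own domains. The input format is hypothetical: it assumes mailbox settings have already been exported into dictionaries with `mailbox`, `forward_to`, and `delegates` keys, which is not the native format of any particular mail system.

```python
def external_mail_access(mailboxes, internal_domains):
    """Return (mailbox, address) pairs where mail is forwarded or delegated
    to an address outside `internal_domains`."""
    findings = []
    for box in mailboxes:
        targets = [box["forward_to"]] if box.get("forward_to") else []
        targets += box.get("delegates", [])
        for address in targets:
            domain = address.rsplit("@", 1)[-1].lower()
            if domain not in internal_domains:
                findings.append((box["mailbox"], address))
    return findings


# Made-up export for illustration.
mailboxes = [
    {"mailbox": "alice@example.com", "forward_to": "drop@attacker.test",
     "delegates": ["bob@example.com"]},
    {"mailbox": "bob@example.com", "forward_to": None, "delegates": []},
    {"mailbox": "carol@example.com", "delegates": ["helper@Other.test"]},
]
findings = external_mail_access(mailboxes, internal_domains={"example.com"})
```

Internal forwarding and delegation can also be malicious if another internal account is already compromised, so an external-only filter is a starting point rather than a complete review.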
When setting up logging, check what level of detail your email system provides. Can you recreate an entire email session with log data? Could you tell what settings the attacker might have changed? Can you tell exactly what they downloaded or forwarded? This will figure prominently in your breach reporting. Configure the log settings, then test them by logging in and checking what actually appears in the logs. In the event of an incident, these logs may be your lifeline, so plan and test accordingly.
Incident response should include a streamlined and guiltless method for users to report suspected phishing. Users should feel no shame in asking about or reporting a phish so you can catch and/or contain them quickly.
Web mail authentication sessions can remain active for hours after changing a password. Test the timing and verify the procedures for this to add to your incident response procedures.
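Where you control the session layer, one common way to close this window is to treat any session issued before the most recent password change as invalid. The sketch below shows only that comparison; the `Session` record and the `password_changed_at` lookup are hypothetical, and hosted mail services may not expose an equivalent control.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Session:
    username: str
    issued_at: datetime


def session_is_valid(session, password_changed_at):
    """Reject sessions issued before the user's last password change.

    `password_changed_at` maps username to the datetime of the most recent
    password change; users with no recorded change are left unaffected.
    """
    changed = password_changed_at.get(session.username)
    return changed is None or session.issued_at >= changed


# A password reset at 10:00 invalidates a session issued at 09:00,
# while a session issued after the reset remains valid.
changes = {"alice": datetime(2018, 6, 1, 10, 0)}
old_session = Session("alice", datetime(2018, 6, 1, 9, 0))
new_session = Session("alice", datetime(2018, 6, 1, 10, 5))
```

Running this check on every request, and not only at login, is what removes the gap between a password reset and the attacker losing access.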
Putting It All in Perspective
All of the data we’ve seen have shown that access tier attacks are one of the two most prominent types of attacks, both in terms of number of attempts and successful breaches. They present defenders with unique challenges in terms of visibility and graceful failure. The wide disparity in awareness and lack of suspicion within the user population means that there will always be a viable human to target, so while the tactics of access attacks will certainly change with technologies and defenses, the core principles will remain significant for the foreseeable future. However, protecting applications against access attacks is especially problematic for a bigger reason. As we noted above, these tactics often place availability in direct conflict with confidentiality and integrity; the same processes whose weaknesses attackers are exploiting are the ones with which users directly interact. In other words, they drive to the heart of the conflict between the application protection team and the users, who now realize that they cannot trust one another and do not necessarily have the same goals. The tools of protection, that is, MFA and service lockout, are antithetical to what the users want: free and easy access.
This trend, then, forces us security professionals to confront the question of whom a security program serves. Convoluted access controls, failing closed, and multiple forms of verification might serve the business’ security goals, but at the cost of the value of the application to its audience. This means that, at a high level, access tier attacks threaten the value proposition of the Internet as a marketplace. From the standpoint of an individual security practitioner, there is much that you can do to control the current manifestation of this risk. We hope that the mitigation section above provides a strong platform for that. However, the underlying questions that access attacks pose are fundamental to the relationship between the digital world and the real one. How this specific leg of the arms race evolves will determine much in terms of how we interact with internetworked information systems on the broadest possible level.