From the broadest perspective, zero trust principles can be applied to the entire application development lifecycle, including the design of the system, the hardware platforms used, and procurement procedures.2 However, this paper discusses the operational aspects of implementing zero trust to defend applications and data at runtime.
Broadly speaking, zero trust security uses technologies to achieve one of three distinct goals:
The following graphic depicts this overall zero trust security transactional model, with the following sections diving deeper into each class of technologies.
The first two technologies—authentication and access control—are closely related and are directly motivated by the principles of “explicitly verify” and “least privilege,” since these technologies are at the core of enforcing “Who can do What.” More sophisticated implementations of authentication watch the ongoing behavior of an actor, capturing the mindset of “continuously assess.”
Authentication technologies are all about building confidence in an attested identity: Who is acting in a transaction. The authentication process has three components: the attestation of identity made by the actor (the claimed Who), the proof offered to support that attestation, and the verdict, in which the system decides how much confidence to place in the attested identity.
The most basic form of attestation is often referred to as a “user”: a human, or an agent acting on behalf of a human, that wishes to perform a transaction. However, when zero trust is used within an application, an actor might be a workload (such as a process, service, or container), so the generalized concept of identity should include such actors. In other cases, the notion of Who includes not just the human or workload but additional dimensions of identity, such as the device or platform of the user/workload, the ecosystem being used for the interaction, or the location of the agent. For example, a user “Alice” may be on a PC tagged as “ABC-0001” using a specific, fingerprinted browser instance, sourced from IPv4 address 10.11.12.13.
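To make the notion of multidimensional identity concrete, the following minimal sketch (in Python; the type and field names are illustrative, not drawn from any specific product) models an attested identity that carries the device, client, and network dimensions from the example above.

```python
# A sketch of a multidimensional identity record; all names are hypothetical.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class AttestedIdentity:
    actor: str                                  # human user or workload name
    actor_kind: str                             # "user" or "workload"
    device_id: Optional[str] = None             # e.g., asset tag of the PC
    browser_fingerprint: Optional[str] = None   # fingerprinted client instance
    source_ip: Optional[str] = None             # network location of the agent

# The example from the text: user "Alice" on PC "ABC-0001" with a specific
# browser fingerprint, sourced from IPv4 address 10.11.12.13.
alice = AttestedIdentity(
    actor="Alice",
    actor_kind="user",
    device_id="ABC-0001",
    browser_fingerprint="fp_7f3a",  # placeholder fingerprint value
    source_ip="10.11.12.13",
)
```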
Some systems allow unauthenticated users, sometimes referred to as “guests” or “anonymous” users, to perform a limited set of transactions. For such systems, the additional steps of proving identity and rendering a verdict are not relevant. However, for any specific attested identity, the following methods are commonly used to support that attestation:
Often, if a high degree of confidence is required, multiple methods are used. This is evidenced in the Google BeyondCorp model,3 which requires multi-factor authentication (MFA) before allowing higher value transactions. The more sophisticated authentication solutions associate a “confidence” with each identity and specify a minimum confidence level for each type of transaction, based on the value and risk of the transaction.
Finally, note that some of these methods are not static, one-shot actions but can and should be ongoing as per the principle of “continuously assess.” In such cases, the confidence score assigned to the identity attestation can change up or down over time. For example, the browser fingerprint or IP address may change within a single user session, which could be viewed as suspicious, reducing confidence; or as more data is collected on the actor’s behavior in a session, the confidence score may either increase or decrease depending on how the current behavior compares to past observations.
Dynamic authentication can work hand in hand with access control in more advanced systems. As the first level of this interaction, the access control policy can specify a minimum confidence score for different classes of transactions, as mentioned earlier. The next level of the interaction allows the access control subsystem to provide feedback to the authentication subsystem, typically asking for additional authentication to increase the confidence score to the minimum threshold.
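A minimal sketch of this interaction might look as follows, assuming the access control policy stores a minimum confidence per transaction class and can ask the authentication subsystem to step up. The transaction classes, thresholds, and deny floor are all hypothetical.

```python
# Per-transaction-class minimum confidence thresholds (illustrative values).
MIN_CONFIDENCE = {
    "read_balance": 0.5,
    "transfer_funds": 0.9,
}

def authorize(transaction_class: str, identity_confidence: float) -> str:
    """Return "allow", "step_up", or "deny" for an attempted transaction."""
    required = MIN_CONFIDENCE.get(transaction_class, 1.0)  # unknown class: fail closed
    if identity_confidence >= required:
        return "allow"
    if identity_confidence < 0.2:  # illustrative floor: too low to bother challenging
        return "deny"
    # Feedback to the authentication subsystem: request additional proof
    # (e.g., MFA) to raise confidence past the required threshold.
    return "step_up"

# A session authenticated by password alone (confidence ~0.6, illustrative)
# can read a balance but is challenged before transferring funds.
assert authorize("read_balance", 0.6) == "allow"
assert authorize("transfer_funds", 0.6) == "step_up"
```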
After using authentication techniques to ascertain Who is acting in a transaction, the next questions are: What is that actor allowed to do? And to Whom? This is the purview of access control technologies.
To take a physical security analogy, imagine you wanted to visit a military base. After the guards confidently determine whether you are a civilian, politician, or soldier, they would use that determination to decide which buildings you could enter and whether you could bring a camera into each building that you might be allowed to enter. The policy governing those choices might be very coarse and apply to all buildings (for example, “politicians can enter any building”) or might be more fine-grained (such as “politicians can only enter building <A> and <B> but can only bring cameras into <A>”).
Applied to the cybersecurity context, access control techniques should embody the zero trust principle of “least privilege.” In other words, the optimal access control policy would only allow exactly those privileges that the actor requires and disallow all other privileges. Additionally, an ideal robust policy would be conditional on a specific minimum level of confidence in the authenticity of the actor’s identity, with the confidence threshold specified at the granularity of each allowed privilege.
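As one illustration of these two ideals, the sketch below encodes a default-deny policy in which each role is granted exactly an enumerated set of privileges, each conditioned on its own minimum confidence. The roles, privileges, and thresholds are hypothetical.

```python
# Least-privilege policy: each role maps to exactly the privileges it needs,
# and each privilege carries its own minimum identity-confidence threshold.
POLICY = {
    "teller":  {"read_balance": 0.5, "transfer_funds": 0.9},
    "auditor": {"read_balance": 0.5},  # no transfer privilege at all
}

def is_allowed(role: str, privilege: str, confidence: float) -> bool:
    """Default deny: a privilege absent from the role's entry is refused."""
    threshold = POLICY.get(role, {}).get(privilege)
    return threshold is not None and confidence >= threshold

assert is_allowed("teller", "transfer_funds", 0.95)
assert not is_allowed("auditor", "transfer_funds", 0.95)  # never granted
```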
Therefore, the value of an access control solution can be judged by how closely it aligns to these ideals. Specifically, a zero trust security solution must include access control, and the access control technology should be evaluated along the dimensions depicted below and described thereafter.
Noting the principle of “continuously assess (and reassess),” any belief in the authenticity of the actor should be adjusted over time. In a simple solution, the adjustment may be just a timeout; in more sophisticated systems, the confidence could vary based on observations of the actor’s behavior over time.
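A simple way to model this adjustment is to decay confidence as time passes since the last verification, as in the sketch below. The exponential decay and its half-life are illustrative choices; a real system might instead adjust confidence based on observed behavior.

```python
# Confidence decays exponentially with time since the last verification.
import math

HALF_LIFE_SECONDS = 15 * 60  # illustrative: confidence halves every 15 minutes

def current_confidence(confidence_at_auth: float, seconds_since_auth: float) -> float:
    decay = math.exp(-math.log(2) * seconds_since_auth / HALF_LIFE_SECONDS)
    return confidence_at_auth * decay

# 0.9 at login drops to ~0.45 after 15 minutes, which would trigger
# re-verification once it falls below a transaction's required threshold.
print(current_confidence(0.9, 15 * 60))  # ~0.45
```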
If authentication and access control are implementations of the “explicitly verify” and “least privilege” mindset, then visibility and contextual analysis are foundational to the “continuously assess” and “assume breach” principles.
Visibility is the necessary precursor to analysis: a system cannot mitigate what it cannot see. Thus, the efficacy of a zero trust security solution is directly proportional to the depth and breadth of telemetry that can be gathered from system operations and outside context. However, a modern visibility infrastructure can provide far more potentially useful data, metadata, and context than any unassisted human can deal with in a timely manner. Because operators want both more data and the ability to distill that data into insights more quickly, a key requirement is machine assistance for the human operators.
This assistance is typically implemented using automated algorithms that span the spectrum from rule-based analysis to statistical methods to advanced machine learning algorithms. These algorithms are responsible for translating the fire hose of raw data into consumable and operationalized situational awareness that can be used by the human operators to assess and, if necessary, to remediate. For this reason, ML-assisted analysis goes hand in hand with visibility.
The generalized pipeline from raw data (visibility) to action (remediation) is shown below:
Visibility is the implementation—the “how”—of the “continuously assess” zero trust principle. It includes keeping an inventory of available data inputs (Catalog) and real-time telemetry plus historical data retention (Collect).
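A minimal sketch of these two functions might pair an inventory of data sources (Catalog) with an append-only ingestion path (Collect). The source names and in-memory store are illustrative; a production system would use a durable datastore.

```python
# Catalog: an inventory of available telemetry inputs (names illustrative).
import time

CATALOG = {
    "app_access_log":   "per-transaction application events",
    "network_flow":     "L3/L4 flow records",
    "client_telemetry": "browser/device signals",
}

# Collect: real-time ingestion plus historical retention.
HISTORY: list[dict] = []  # stand-in for a durable telemetry store

def collect(source: str, event: dict) -> None:
    """Ingest one telemetry event from a cataloged source."""
    if source not in CATALOG:
        raise ValueError(f"uncataloged source: {source}")  # inventory gap
    HISTORY.append({"ts": time.time(), "source": source, **event})

collect("app_access_log", {"session_id": "s1", "action": "login"})
```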
The maturity of a zero trust visibility implementation should consider four factors:
The latency of collection and analysis places a lower bound on how quickly a potential threat can be addressed. A zero trust solution’s latency should be measured in seconds or less; otherwise, it is quite likely that any analysis, no matter how accurate, will be too late to prevent the impact of the exploit, such as data exfiltration/encryption or unavailability due to resource exhaustion. More sophisticated systems may allow both synchronous and asynchronous mitigations. Synchronous mitigation inhibits completion of the transaction until full visibility and analysis are complete. Because synchronous mitigation is likely to add latency to the transaction, this mode of operation would be reserved for particularly anomalous or risky transactions, while all other transactions send telemetry and are analyzed asynchronously.
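The split between the two paths can be sketched as follows, assuming an inline risk estimate decides whether a transaction is held for synchronous analysis or allowed to proceed while its telemetry is analyzed out of band. The scoring function, threshold, and queue are placeholders, not a real API.

```python
# Synchronous vs. asynchronous mitigation paths (all names hypothetical).
import queue
from dataclasses import dataclass

@dataclass
class Transaction:
    kind: str
    amount: float

telemetry_queue: "queue.Queue[Transaction]" = queue.Queue()
RISK_THRESHOLD = 0.8  # illustrative cutoff for inline inspection

def score_risk(tx: Transaction) -> float:
    # Placeholder fast, inline estimate: large transfers are riskier.
    return 0.9 if tx.kind == "transfer_funds" and tx.amount > 10_000 else 0.1

def full_analysis_is_benign(tx: Transaction) -> bool:
    # Placeholder for deep, synchronous inspection.
    return True

def handle(tx: Transaction) -> bool:
    """Return True if the transaction may proceed."""
    if score_risk(tx) >= RISK_THRESHOLD:
        # Synchronous: hold the transaction until analysis completes,
        # accepting the added latency for risky transactions only.
        return full_analysis_is_benign(tx)
    # Asynchronous: proceed immediately; analyze telemetry out of band.
    telemetry_queue.put(tx)
    return True
```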
This concern is relevant if data arrives from multiple sources or types of data sensors, which is a common scenario. This factor typically breaks down into two sub-concerns.
One key value derived from a high-quality visibility solution is the ability to discover suspicious activities as an indicator of possible breach. To do so effectively the solution must receive telemetry across all the relevant “layers” of application delivery: the application itself, of course, but also the application infrastructure, the network infrastructure, any services applied to or used by the application, and even the events on the client device. For example, identifying a user coming in from a new device, never seen before, may be slightly suspicious on its own; but when combined with network information (such as GeoIP mapping from a foreign country) the suspicion level goes up higher. This suspicion level is manifested as a lower confidence score in the identity of the user. In the context of a zero trust security policy, when this actor attempts a high-value transaction (such as transfer of funds to a foreign account), the access control solution can choose to block the transaction, based on the low confidence.
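One simple way to model this compounding suspicion is to apply a multiplicative penalty to the identity confidence for each suspicious signal observed across the layers, as in the sketch below; the signals and penalty values are illustrative only.

```python
# Cross-layer signals each apply a penalty to identity confidence.
PENALTIES = {
    "new_device": 0.8,     # client layer: device never seen before
    "foreign_geoip": 0.6,  # network layer: unexpected GeoIP country
    "odd_hours": 0.9,      # behavioral: activity outside usual hours
}

def adjusted_confidence(base: float, signals: list[str]) -> float:
    for s in signals:
        base *= PENALTIES.get(s, 1.0)
    return base

# A new device alone is mildly suspicious (0.9 -> 0.72); combined with a
# foreign GeoIP it drops further (-> ~0.43), low enough for the access
# control policy to block a high-value funds transfer.
print(adjusted_confidence(0.9, ["new_device"]))                   # 0.72
print(adjusted_confidence(0.9, ["new_device", "foreign_geoip"]))  # ~0.43
```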
As it relates to the zero trust mindset, the deeper and more complete the visibility solution, the more effective the system can be in appropriately limiting transactions and detecting breaches.
Finally, any collection of data must comply with statutory and licensing requirements governing the security, retention, and use of that data, so a robust visibility solution must address each of these needs. The constraints that such governance places on data use must be factored into a zero trust visibility solution. For example, if an IP address is considered Personally Identifiable Information (PII), then any use and long-term retention of IP addresses for analysis must be limited to their permissible uses.
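As a sketch of governance-aware collection under that example, the code below retains a keyed hash of the IP address for long-term analysis while the raw address is held only as permitted. The keyed-hash approach and key-rotation note are illustrative design choices, not a compliance recommendation.

```python
# Pseudonymize PII (here, IP addresses) before long-term retention.
import hashlib
import hmac

HASH_KEY = b"rotate-me-regularly"  # secret key; rotation limits linkability

def pseudonymize_ip(ip: str) -> str:
    """Stable pseudonym for an IP, usable for long-term correlation."""
    return hmac.new(HASH_KEY, ip.encode(), hashlib.sha256).hexdigest()[:16]

record = {
    "ip_pseudonym": pseudonymize_ip("203.0.113.7"),  # retained long-term
    "ip_raw": "203.0.113.7",  # retained only per permitted use, then purged
}
```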
In addition to visibility, the other machinery required to implement “continuously assess” is the analytical tooling required to perform meaningful assessment; that is, to have assessment that can be operationalized by a zero trust solution.
One consideration for analysis is the scope and breadth of the input data. The inputs to the analysis algorithms can be limited to a single stream of data from a single source, or can look across multiple streams, including from various data sources and all layers of the infrastructure and application.
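The difference between those two scopes can be sketched by joining events from several sources on a shared key, assuming each layer’s telemetry carries a common session identifier; the sources and fields are illustrative.

```python
# Widen analysis from a single stream to a cross-stream, per-session view.
from collections import defaultdict

def correlate(streams: dict[str, list[dict]]) -> dict:
    """Group events from all sources by session_id for joint analysis."""
    by_session: dict = defaultdict(lambda: defaultdict(list))
    for source, events in streams.items():
        for event in events:
            by_session[event["session_id"]][source].append(event)
    return by_session

sessions = correlate({
    "app_log":      [{"session_id": "s1", "action": "transfer_funds"}],
    "network_flow": [{"session_id": "s1", "geoip": "XZ"}],  # foreign GeoIP
})
# A single-stream analyzer sees only one of these facts; the joined view
# lets the analysis connect the risky action with the unusual network origin.
```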
A second particularly relevant aspect of analysis in the zero trust framework is dealing with the volume and rate of data ingested, which will exceed the capability of any human to digest. Therefore, some form of machine assistance to distill the data into human-digestible insights is required. Once again, the sophistication of the assistance can be described as a progression.
As with the rules-based approach, ML assistance can be for detection only or it can be tied to automatic remediation. Additionally, ML assistance can be used in conjunction with a rules-based system, where the ML “verdict” (or opinion or confidence) can be used as an input into a rule, such as “do action <X> if <ML evaluator [bot_detector_A] reports bot with confidence greater than 90%>.”
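That pattern might be sketched as follows, with a stub standing in for the ML evaluator [bot_detector_A] and hypothetical actions as the rule outcomes.

```python
# An ML verdict used as an input to a rules-based action.
def bot_detector_a(request: dict) -> float:
    """Stub ML evaluator returning P(request is a bot), 0.0..1.0."""
    return 0.95 if request.get("headless_browser") else 0.05

def apply_rules(request: dict) -> str:
    verdict = bot_detector_a(request)  # ML opinion consumed by the rule
    if verdict > 0.90:
        return "block"        # action <X> from the example rule
    if verdict > 0.50:
        return "challenge"    # e.g., CAPTCHA or step-up authentication
    return "allow"

print(apply_rules({"headless_browser": True}))  # block
```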
The final tenet of the zero trust mindset is to “assume breach.” To be clear and provide perspective, properly implemented authentication and access control methods are effective at preventing the overwhelming majority of malicious transactions. However, one should, out of an abundance of paranoia, assume that the enforcement mechanisms of authentication and access control will be defeated by some sufficiently motivated or lucky adversary. Detecting breaches, which is necessary for responding to these escapes in a timely manner, requires visibility and machine-assisted analysis. Therefore, precisely because the other enforcement mechanisms will occasionally be defeated, visibility feeding ML-assisted contextual analysis is critical to the zero trust backstop of risk-based remediation.
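A minimal sketch of such a risk-based backstop appears below. The inputs used here, the confidence of the maliciousness verdict and a normalized transaction value, are hypothetical stand-ins for illustration, not the specific factors enumerated below.

```python
# A risk-based remediation backstop: harsher actions only at higher risk.
def choose_remediation(malicious_confidence: float, transaction_value: float) -> str:
    """Pick a remediation; transaction_value is normalized to 0..1."""
    risk = malicious_confidence * transaction_value  # crude risk estimate
    if risk >= 0.5:
        return "block"
    if risk >= 0.1:
        return "step_up_auth"
    return "allow_and_monitor"

# Low confidence on a low-value transaction is merely monitored; the same
# confidence on a high-value transaction triggers step-up authentication.
print(choose_remediation(0.2, 0.1))  # allow_and_monitor
print(choose_remediation(0.2, 1.0))  # step_up_auth
print(choose_remediation(0.9, 0.8))  # block
```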
For the “false negative” cases in which a malicious transaction does defeat authentication and access control, the mechanism of automated risk-based remediation should be used as a backstop. But because this technology is applied against transactions that passed the earlier enforcement checks, there is a higher concern about turning what was, in truth, a “true negative” (a valid, desirable transaction) into a “false positive” (a transaction incorrectly flagged as malicious). To mitigate this concern, any remediation action triggered by a belief of possible maliciousness that was not caught by authentication or access control should be based on the following three factors:4
Zero trust security is a more modern take on prior approaches to security, such as defense in depth, extending the prior art by taking a transaction-centric view of security: Who is attempting to do What to Whom. This approach secures not only external access to an application but also the application’s internals.5 Given this foundational transactional view, zero trust security is rooted in a set of core principles that are used to defend applications within today’s more complex and challenging environment, with the principles then mapped to a set of subsystem-level solutions, or methods, that embody those principles. The core principles and how they map to solution methods are summarized below.
These tools—the methods of authentication, access control, visibility, contextual analysis, and risk-aware remediation—are necessary and sufficient to prevent a wide variety of attack types.
1 https://www.f5.com/services/resources/white-papers/why-zero-trust-matters-for-more-than-just-access
2 Zero trust can, and should, be applied even “to the left” of the CI/CD pipeline. Tools such as vulnerability assessment tools, static analysis, CVE databases, open-source code reputation databases, and supply chain integrity monitoring systems are consistent with the zero trust mindset.
3 https://cloud.google.com/beyondcorp-enterprise/docs/quickstart
4 Note that the line between contextual, risk-aware access control and the general topic of risk-aware remediation is a fuzzy one, and some overlap does exist.
5 Often referred to as “East-West” intra-app protection, as opposed to “North-South” to-the-app protection.